is memory parity necessary?



retrogear
December 26th, 2015, 03:00 PM
So I'm digging around in stuff I acquired and found a Z-205 256k memory card for my Zenith Z-100.
I installed it and after running the Zenith disk based memory diagnostics and replacing some memory and reseating others,
it passes all the memory tests except parity:
[screenshot of the diagnostic output, attachment 28531]
It gives a list of chips as possibilities starting with U88 which is memory. Replacing that didn't help. I tested the socket with an ohmmeter too.
My question would be: Is parity necessary for operation?
Isn't it just an odd/even checksum to verify the memory is good?
If I know the memory is good, who cares about parity?
Just wondering if I need to chase it further ...

Larry G

PS - I'm probably going to anyway because it will bug the crap out of me otherwise :)

Marty
December 26th, 2015, 03:41 PM
Hi All;

Larry, it all depends on what Your system has..

"" Isn't it just an odd/even checksum to verify the memory is good? ""

That is One system, But, depending on what Your System has in it,
It could also be Parity Checking and Correction..
Then it has the ability to check and change memory If a Bit is Bad.. In some Systems two bits, not more than that..
But, it can still report that at that Location (address) the Memory is BAD !!!!

THANK YOU Marty

Timo W.
December 27th, 2015, 12:44 AM
> If I know the memory is good, who cares about parity?

First, if the memory were good, you wouldn't get any parity errors.

Parity is quite important in older machines as memory chips back then weren't very reliable. Parity ensures that a memory error can be detected in real-time and the system stopped to prevent data damage/loss.

As for your question itself: Can you turn off parity checking on the card? If not, you have no other way than replacing all chips listed.

g4ugm
December 27th, 2015, 02:58 AM
It depends on system settings and what happens when a Parity Error is detected. If it causes the system to halt you need to fix it...

daver2
December 27th, 2015, 03:19 AM
You can get both hard and soft errors.

Hard errors are when a memory chip goes faulty - these will probably result in the machine not working at all.

Soft errors are when a memory chip (or some supporting logic) starts to become marginal. A memory test will not always detect these as they 'come and go' depending upon ambient conditions (temperature, electrical interference etc.). Also a passing gamma ray or some other form of radioactive decay can affect a memory cell (flipping it from a '1' to a '0' or vice-versa).

Without parity checking - these soft errors would go undiagnosed - but the CPU would still act upon them (e.g. a wrong instruction or addressing mode on an instruction fetch or the wrong data on an operand fetch).

Adding parity detection logic causes a single bit flip in the word to be detected (as is any odd number of flipped bits) but not an even number of bit flips (which is rarer).
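As a rough illustration of that point, here is a minimal Python sketch of a single parity bit over one byte. Even parity and the sample byte are assumptions chosen for the example; the thread does not establish which sense the Z-205 uses.

```python
def even_parity_bit(byte: int) -> int:
    """Return the parity bit that makes the total count of 1s (data + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def parity_ok(byte: int, stored_parity: int) -> bool:
    """Recompute parity on read and compare it with the stored parity bit."""
    return even_parity_bit(byte) == stored_parity


data = 0b10110010                 # arbitrary sample byte (assumption)
stored = even_parity_bit(data)    # parity bit written alongside the data

one_flip = data ^ 0b00000100      # one bit flipped in memory
two_flips = data ^ 0b00010100     # two bits flipped in memory

print(parity_ok(data, stored))       # True: nothing changed
print(parity_ok(one_flip, stored))   # False: single flip is detected
print(parity_ok(two_flips, stored))  # True: double flip slips through
```

The double flip leaves the count of 1s with the same odd/even sense, so the check cannot see it.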

To detect multiple bit errors - you can implement EDC (error detection and correction) using Hamming codes. These will generally detect and correct all single-bit errors and are guaranteed to detect (but not correct) double bit errors - and can detect more than two bit errors - but with a reducing probability.

The down side is that parity detection requires (usually) an extra RAM chip per bank, whereas EDC usually requires an additional 5 RAM chips per bank.
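To make the chip count concrete, here is a hedged Python sketch of an extended Hamming code (single-error correct, double-error detect) over one byte. It is a textbook construction, not the scheme of any particular memory card: 4 Hamming check bits plus 1 overall parity bit give the 5 extra bits per 8 data bits mentioned above. The bit layout and function names are assumptions for illustration.

```python
DATA_POSITIONS = [3, 5, 6, 7, 9, 10, 11, 12]   # non-power-of-two slots hold data
CHECK_POSITIONS = [1, 2, 4, 8]                 # power-of-two slots hold check bits


def encode(byte):
    """Return 13 bits: index 0 is overall parity, 1..12 is the Hamming code word."""
    bits = [0] * 13
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (byte >> i) & 1
    for c in CHECK_POSITIONS:
        # Check bit c covers every position whose index has bit c set.
        bits[c] = sum(bits[pos] for pos in range(1, 13) if pos & c) & 1
    bits[0] = sum(bits[1:]) & 1                # make the total number of 1s even
    return bits


def decode(bits):
    """Return (byte or None, status) after checking and, if possible, correcting."""
    bits = list(bits)
    syndrome = 0
    for pos in range(1, 13):
        if bits[pos]:
            syndrome ^= pos                    # XOR of the positions holding a 1
    overall_ok = (sum(bits) & 1) == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # Odd number of flips: treat it as a single error and correct it.
        if syndrome > 12:
            return None, "uncorrectable"       # points outside the code word
        bits[syndrome] ^= 1                    # syndrome 0 means the overall bit itself
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detected, not correctable.
        return None, "double error"
    byte = sum(bits[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return byte, status


word = encode(0xA5)
word[6] ^= 1                                   # one soft error
print(decode(word))                            # (165, 'corrected')
word[11] ^= 1                                  # a second error in the same word
print(decode(word))                            # (None, 'double error')
```

Three or more flips in one word can be miscorrected or missed, which matches the "reducing probability" caveat above.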

If it's there I would keep it. I have seen these happen in reality (which is why we use EDC memory at work exclusively for the equipment we control). I sometimes see reported single-bit soft errors that are corrected for.

Dave

retrogear
December 27th, 2015, 05:50 AM
Thanks for the input, guys. Now I know what answers to look for.
I RTFM on Monahan's site (many thanks), and here's what I found:
At power up, parity is disabled. Parity is enabled by software.
Hmmm - so what software would enable it? If it got enabled, then if parity error occurs, I assume it would halt? Otherwise, why enable it?
My system boots zdos or cp/m and runs just fine, so I'm assuming it is disabled.
Now looking at the schematic, parity is an extra ram chip per bank and U88 is the parity chip itself which is what I suspected.
Hmmm - so how would self diagnostics point to the parity circuit itself as the fault?
Looking at my screenshot:
The diagnostic step is called parity generator/checker.
Does it do its own parity calculation and compare it to what U88 should have?
It indicated DH is parity used which = 02 so even parity?
Hmmm - so does this mean the parity circuit is working for odd parity?

Sorry, I've had my morning coffee so my brain is running on all cylinders as they say ... :p
(I don't see a smiley for someone who is highly caffeinated)
Fun stuff !!

Larry G

daver2
December 27th, 2015, 06:17 AM
Larry,

I will try and answer your 'caffeine-fuelled' questions in some sort of order...

At power up parity is disabled. This is a statement and semi-sensible as the DRAM will contain rubbish on a power-up. Normally, software shouldn't read memory before it has been initialised - but (if it did) this would almost invariably provoke a PARITY error if this feature was enabled.

The 'bootstrap' should initialise the memory to a known good state (possibly by performing a memory test) and then enable parity checking after the test. It should then perform another test with the parity checker enabled...

The board contains a CPU-readable port to identify if a parity error has occurred or not - and the ability to reset the flag. The memory card also drives S100 bus pin 98 (ERROR). What happens in your system when the S100 ERROR pin is asserted is down to the designer of the CPU board of course. It could HALT (but this is not sensible) or (better still) initiate some form of Non Maskable Interrupt (NMI) assuming the bootstrap has set-up an NMI handler and it is uncorrupt...

The parity generation/checking is done by a standard TTL chip (an SN74LS280), incidentally misidentified on the schematics as a 74LS80. Notice that the parity memory can be written to or not independently of the main memory bytes. This provides us with some 'options' when it comes to testing out the memory/parity... If we disable the parity generation/checking logic - we can read or write bytes/words to/from the memory without the parity logic hindering us. If this works fine - the problem doesn't lie with the memory itself. If we then enable the parity generation/checking (and we get errors) then the problem may be the parity RAMs or the parity generation/checking chip.

We could also force incorrect parity into the RAM/PARITY_RAM by enabling parity generation/checking and writing (say) 00 into the memory. The parity generation would have then updated the parity RAM to suit. If we now disable parity generation/checking and write 01 to the memory - the parity RAM won't be updated. We then enable parity generation/checking and perform a read. This time we should expect an error (because we forced one to be there)...
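The forced-error procedure can be sketched in Python against a toy model of a parity-protected RAM. Everything here is hypothetical: the ParityRam class, its attributes and the even-parity choice are illustration only and do not reflect the Z-205's actual control port or register layout.

```python
class ParityRam:
    """Toy model: data RAM plus a parity RAM that is only updated when enabled."""

    def __init__(self, size):
        self.data = [0] * size
        self.parity = [0] * size
        self.parity_enabled = False   # disabled at power up, as on the real card
        self.error_flag = False       # stands in for the readable error status bit

    @staticmethod
    def _parity(value):
        # Even parity over one byte (an assumption for this sketch).
        return bin(value & 0xFF).count("1") & 1

    def write(self, addr, value):
        self.data[addr] = value & 0xFF
        if self.parity_enabled:
            self.parity[addr] = self._parity(value)

    def read(self, addr):
        value = self.data[addr]
        if self.parity_enabled and self.parity[addr] != self._parity(value):
            self.error_flag = True
        return value


def forced_error_test(ram, addr=0):
    """Return True if a clean read passes and a deliberately bad one is flagged."""
    ram.parity_enabled = True
    ram.write(addr, 0x00)             # parity RAM now matches the data
    ram.error_flag = False
    ram.read(addr)
    clean_read_ok = not ram.error_flag

    ram.parity_enabled = False
    ram.write(addr, 0x01)             # data changes, parity RAM is left stale
    ram.parity_enabled = True
    ram.read(addr)                    # checker should now raise the flag
    return clean_read_ok and ram.error_flag


print(forced_error_test(ParityRam(16)))   # True for a working checker model
```

On real hardware the same sequence would be done with port writes to the card's parity control port and a read of its status bit; if the forced error is not reported, the checker path is suspect rather than the data RAM.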

Page 12 of http://www.s100computers.com/Hardware%20Manuals/HeathZenith/Z-205%20RAM%20Card.pdf describes what parity bit is used.

In fact, the manual is very well written - describing pretty well how things work. Perhaps you need the caffeine effects to wear off before reading the manual again?

Dave

retrogear
December 27th, 2015, 11:49 AM
Well, after a game of musical chips, I've substituted everything in that failure list except U118 74LS367. I have a hard time believing statistically that after trying 6 of the 7 chips, that it could be the 7th.
Now I wonder if it's worth pursuing it further ...
Besides, the caffeine has worn off :(

Larry G

Chuck(G)
December 27th, 2015, 12:09 PM
To quote (at least apocryphally) Seymour Cray, "Parity is for farmers". This reportedly came from someone asking him why his 6600 did not use (core) memory parity and is widely misunderstood. He said what he said because at least with core, it's possible to run exhaustive diagnostics when starting the system and have it run perfectly well after that. On his later 7600, however, there was parity, probably because core there was being run within an inch of its life and heavily stressed. I believe that the Cray I used ECC in memory.

A good diagnostic should be able to pinpoint any failures.

daver2
December 27th, 2015, 12:11 PM
Looking at the schematic there are potentially other devices that could cause a failure not identified on the diagnostic screen (for example, U11 gate B and U150).

I suspect they have presented the 'obvious' areas of error?

Dave

retrogear
December 27th, 2015, 05:40 PM
Yea, I found a sub for U118, U11 and U150. Still no joy ...

daver2
December 28th, 2015, 02:04 AM
Looking at the schematic again in more detail I found the following IC's in the parity circuitry:

U43/U88/U79
U42/U61/U70

U12C
U40B
U23
U17B
U150
U11A
U118

It may be worth exchanging these (well, the ones you haven't tried already that is).

Can you get the machine working with this particular memory board installed? If so, we could write our own memory test software to test the card a bit at a time - and this time we can control what we are testing.

Do you have some test equipment (e.g. an oscilloscope and/or logic analyser)?

I did notice that this card has some fuse-programmable PALs. These beasts are known to regrow their fuses over time - thus changing the logic equations and making a card fail. I have seen this effect in reality so it does exist (plus MMI took out a patent on a new programmable device technology and cited fuse regrowth as a problem with their older line).

Dave

retrogear
December 28th, 2015, 02:47 AM
Thanks, I'll keep swapping maybe. I do have a 200MHz dual trace scope, Fluke multimeter and a S100 extender card. I wouldn't mind scoping around. I have extra Z100 boards (and others) including a spare Z100 motherboard which I've been swapping IC's from. I have been exchanging IC's unless they are verified bad.

I've heard of PAL's but know nothing about them. My background is mostly analog circuit troubleshooting. I had some training on digital video, etc but never got into discrete digital troubleshooting much. However, I lived by scope for many years.

Yes, as far as I can tell the memory board is functioning just fine. I wouldn't even know it had a parity problem if I hadn't run diagnostics chasing down memory failures. I verified a couple bad memory sockets with an ohmmeter and managed to bend the leads of the IC's to make contact. The board passes all the bit banger tests, just not parity. I realize the tests are only as good as the programmer who designed them. I had a few exchanges with Barry Watzmann, God rest his soul.

I have worked with assembly more 8 bit than 16 bit but I am not a programmer by trade. My official title is Senior Workstation Specialist but in reality we are the grunts that go do field service on IT hardware / software. I am carrying an oncall pager as I type this.

Larry G

daver2
December 29th, 2015, 01:49 AM
I have tracked down and had a cursory read of the Z-100 technical manuals so can I ask you a few questions first to establish how the system you have is configured before we go off at a complete tangent?

Can you identify the link and switch settings on the memory board in question please.

If you remove the memory board from your Z-100 - do all the diagnostics pass? I just want to rule out a faulty Z-100 before we start...

With the memory board installed - does the system still run and (if so) what are you using (e.g. the 8085 or 8088, CP/M or Z-DOS). Or does the whole shebang work?

What computer languages and/or debug monitors do you have available on the beast so that we can write and run some bespoke programs?

My 8085 is somewhat rusty (I was a Z80 assembler programmer back in the day and then moved on to the 80x86 - I am still playing with the 8086 and 80286 at work to this day) - but it's good to see that you have the right collection of tools for the job.

Dave

retrogear
December 29th, 2015, 06:58 PM
Bingo !!

I was googling for Z100 parity error and came across a z100 newsgroup post that said parity errors occur when the parity control switch is set to the wrong port - huh? SW1 is the setting for the parity control port. I had set it to the typical configuration specified on page 8 of the Z205 manual. Well, reading ahead to page 11, it specifies the port to set it to based on the total memory which for me is 448k so should be port 98h !!! Dang, setting to that value caused the parity generator to pass and a couple parity bit errors to show up on certain chips. Replacing those chips, now it's passing all the diagnostics. I am running a burn-in now with the cover on to heat it up. So far so good. Thanks Dave for offering to help and making me start over from scratch.

PS - I ran out of 4264-15 ram chips so substituted a couple 4164-15 and they seem to be passing diagnostics just fine. I saw on this forum some talk about this. Anyone know why this is not a good idea?

Larry G

Chuck(G)
December 29th, 2015, 07:43 PM
Just be careful--the 4264s should be from Micron (MT4264). If that's the case, yes, you can use a 4164. There are also 4264s with a different organization, 16K x 4 bits.

daver2
December 30th, 2015, 01:23 AM
Larry,

Bingo I like!!!

That was where I was first coming from - the motherboard and each expansion memory card should have a unique I/O address for the parity control and status port. If not (or it was at the wrong I/O address) then things wouldn't work as expected (as you have found out).

It is the same with our EDC memory cards at work. Each card needs its own unique I/O address setting via switches on the card. In our case, however, we initially configure our software to match the hardware settings. If the maintenance people come along to change a memory card - they have to set-up everything correctly to match the card they are exchanging (otherwise the software complains).

Glad to hear it now works though.

Have a Happy New Year.

Dave

retrogear
December 30th, 2015, 03:08 AM
>Just be careful--the 4264s should be from Micron (MT4264).

Interesting, yes these are MT prefixed, thanks. When I ordered chips for the video ram, instead of 4 chips, I got 4 sets of 4, so ended up with plenty of 4164's so paid off in the end. Thank goodness they were cheap.

Now I just received my gizzy's to try and get this to display in color vga. My next task ...

Larry G