Question About Parity Check 1 Error

PCFreek

Experienced Member
Joined
Apr 5, 2013
Messages
322
Location
Florida USA
Oh boy do I feel like a noob asking a question about such a well documented error... I just want to make sure that I am not missing something obvious.

I have another early 16KB-64KB motherboard from an IBM 5150 that was dead. The Supersoft Diagnostic ROM diagnosed a faulty 8237, which I replaced. It now powers up. It will run for a minute or 2 before it gets a Parity Check 1 Error (the fact that it runs a couple of minutes indicates a thermal failure?). The Supersoft ROM gives the RAM error address as 5040. I have replaced socketed RAM until I am blue in the face and it did not help. I have since removed all socketed RAM and have only Bank 0 populated. I have set SW1 and SW2 accordingly (SW1: 3,4 on... SW2: 1,2,3,4 all on) and I still get the Parity Check 1 error with a 5700671 ROM installed. Using SuperSoft Diagnostics to locate RAM issues is useless without all 4 banks populated, so I have not gone back to that. The 5150 gives no error code prior to Parity Check 1, as the error occurs while it is running. I can even boot to DOS before the error pops up.

Any ideas? I now have a Pace soldering/desoldering station and can switch out chips with ease. I honestly don't know how I ever lived without it! I will sooner or later be revisiting some of my other threads that I never finished now that I can desolder like a pro. I pulled the 8237 chip in less than 5 minutes and it popped right out without sticking when I was done. Amazing!

I have plenty of RAM chips and can swap out Bank 0 if necessary. Unfortunately, my replacement RAM is not a proper date code for this board so I would like to avoid that if the error may lie somewhere else. I have piggy-backed all bank 0 RAM with no success. I understand that piggy-backing will not correct all RAM failures. I have replacements for all RAM related chips (LS245, LS280N, etc...) if it could likely be a problem elsewhere.

thanks!
 
Is the failing address always the same? Can you boot from floppy?

The failing address using the Supersoft ROM switched between 5040 and 8040, but has been only 5040 recently. As stated earlier, I have found it to be unreliable for identifying RAM error addresses in some situations. I am now only running Bank 0. POST does not give me a memory error address. It will boot from floppy and even begin to perform basic commands such as "diskcopy", but will always give a Parity Check 1 error or will just freeze up. It sometimes freezes up before the floppy fully boots and will just hang with the floppy drive spinning.
 
Don't you just love intermittent faults.

> The Supersoft ROM gives the RAM error address as 5040.
Last month, on a different 16K-64K motherboard of yours, the Supersoft ROM was also reporting 5040. Another SuperSoft red herring?

> It will run for a minute or 2 before it gets a Parity Check 1 Error (the fact that it runs a couple of minutes indicates a thermal failure?).
I would think so if it was the case that:
* From the motherboard in cold state, "It will run for a minute or 2".
* When the motherboard is in warm state, between resets, the most the board will run for is something significantly less than "a minute or 2".

Keep in mind that to get the motherboard from a warm state to a cold state, it may need to be left unpowered for half an hour or so.

If the symptom is temperature sensitive, then obviously you are in a position to try area based heating/cooling techniques to narrow down the area of the motherboard containing the cause.

> Any ideas?
By your, "for all RAM related chips (LS245, LS280N, etc...)", I see that you recognise that the failure is of the RAM subsystem, not specifically that of RAM chips.

It would be really good if you could identify whether the failing address/bit is constant, random, or follows a pattern.

Assuming random addresses and bits:

The trigger can be external. A few years ago, there was someone on these forums who eventually observed that their random PARITY CHECK errors coincided with their air conditioning system kicking in. So, unstable power. Have you tried the motherboard using a known good power supply?

The capacitors near the RAM chips are critical to reliable RAM operation. A good read on that subject is on pages 51 and 52 of the document [here]. So perhaps (because it is relatively easy to do) replace the tantalum capacitors filtering the voltage rails to the RAM chips (the ones near each parity RAM chip).
 
> By your, "for all RAM related chips (LS245, LS280N, etc...)", I see that you recognise that the failure is of the RAM subsystem, not specifically that of RAM chips.
>
> It would be really good if you could identify whether the failing address/bit is constant, random, or follows a pattern.

I am not that good at this yet... is this error not related to a RAM chip but something else in the subsystem?

I ran the superSoft with only Bank 0 populated and get an error at 5040 bit 4. I then populated Banks 1-3 with brand new RAM and I still get 5040 bit 4.

I have used two separate power supplies (one black, one silver) with this and the error does not change.
 
While cool, it booted right up to BASIC with the 5700671 installed and ran for a minute or 2. It eventually froze up (one cassette relay click and no error displayed). I let it sit and heat up a bit and then power cycled it several times to see if it would give a memory error address prior to the Parity Check error (since I was booting it in a warm state). It does not display a memory error address with the 5700671 installed. It either freezes, or begins to boot and displays Parity Check 1 during the warm boot. On one occasion, it gave the beep code for no video card. In the pic below, it garbled the words displayed on the screen during another power-on attempt. I am absolutely willing to piggy-back or swap chips if anyone has suggestions. With my desoldering station, chip swaps are no longer a problem for non-socketed chips.
PA160020.jpg

Most recent SuperSoft pass below. I took this as it sounded out the 16K Critical Memory beep code and on the 2nd pass seconds before it registered the 2nd error on this test. This is why it says 5 total errors, but only 4 are shown. When it tests System Memory to 10000, the failure address given is 4000 and that is when the parity error at 0C000 pops up in the bottom right of the screen. I have replaced the Bank 3 parity chip, but it doesn't make a difference... the error appears on every pass. I have replaced all socketed RAM with good RAM in banks 1-3. I have also tried freeze spray on the bank 0 RAM, the LS245, and the S280N... Now it goes straight to Parity Check 1 without booting up. It began this immediately after I cooled the LS245... hmmmmmm.....
PA160023.jpg

UPDATE: It no longer boots at all with the 5700671. As you can tell from the SuperSoft photo above, my switches were still set to 16kb. With all 4 banks populated, I have since switched back to 64kb settings (SW1: 3,4 are now off, with SW2 unchanged). It shows zero activity with a 5700671 or a 1501476. The Supersoft now shows an error address of 5046 bit 5 in the 16K Critical RAM test since I changed the switch settings. Of interest, the 8237 chip gets very warm compared to the other chips, much warmer than the processor.
 
Hallo PCFreek,

The cause can be other hardware than the RAM. Get a schematic and study it a bit. I myself would disable the 74LS280 by disconnecting pin 6 and connecting the freed line to +5V. The 280 checks the parity and generates the NMI that reports the error. If you still get a parity error, then something else is wrong. As there is nothing else IMHO that can generate a NMI, the circuit itself or the CPU is the error. This can be checked by disabling the NMI pin at the 8088.
If you don't get an error anymore, the 280 is the error.
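For anyone following along who has not looked at how byte parity works, here is a minimal sketch of the idea. It is not a model of the 5150 circuit: the function names are made up for illustration, and whether the board stores odd or even parity is an assumption here (check the schematic). It only shows why either a bad data bit or a bad parity bit produces a mismatch on read.

```python
def parity_bit(byte: int, odd: bool = True) -> int:
    """Return the ninth bit stored alongside a data byte.

    With odd=True the nine bits together contain an odd number of ones.
    Which convention the real board uses is an assumption in this sketch.
    """
    ones = bin(byte & 0xFF).count("1")
    bit_for_even = ones % 2              # bit that makes the total count even
    return bit_for_even ^ 1 if odd else bit_for_even


def parity_ok(byte: int, stored_bit: int, odd: bool = True) -> bool:
    """Recompute parity on read; a mismatch is what would raise the NMI."""
    return parity_bit(byte, odd) == stored_bit


data = 0x5A
p = parity_bit(data)
print(parity_ok(data, p))                # clean read
print(parity_ok(data ^ 0x10, p))         # one data bit flipped
print(parity_ok(data, p ^ 1))            # data intact, parity bit flipped
```

The last case is worth keeping in mind: a parity error can be reported even when all eight data bits read back correctly.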

I hope this helps a bit.
 
This is unlikely, but did you swap out the CPU too?

One idea to locate the issue:
Deactivate the on-board RAM and use an expansion card instead.

1. You will need an expansion card that can map its memory to address 0. The IBM 64kB expansion seems capable of this (switches 1-5 all closed).
2. Deactivate the on-board RAM. On the 64kB mainboard, remove U48 and bridge Pin 15 and 16.

In case it's still unstable many memory related chips can be excluded as cause.

Update: For temperature dependent issues you can try cooler spray. System may start working again when you hit the correct spot.


This gives an idea for an experimental hardware tool to capture memory errors (XT, don't know about AT).
The card contains an SRAM, a data comparator and latches with display digits.
- When writing to memory in the main memory range, data is written to the SRAM.
- When reading, data from the SRAM is compared to what's at the bus. If they differ, address, correct and incorrect data are latched and displayed (i.e. 7-segment).
- It can be configured to continuously retrigger at fault, or stop at the first error. In the latter case it can be reset.
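The behaviour of that card can be sketched in software to make the idea concrete. This is only a simulation of the logic described above, not a hardware design; the class and method names (`ShadowMonitor`, `on_write`, `on_read`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fault:
    address: int
    expected: int    # value held in the shadow SRAM
    observed: int    # value actually seen on the bus


class ShadowMonitor:
    """Shadow every write, compare every read, latch mismatches."""

    def __init__(self, stop_at_first: bool = True) -> None:
        self.shadow: dict[int, int] = {}
        self.stop_at_first = stop_at_first
        self.latched: Optional[Fault] = None

    def on_write(self, address: int, data: int) -> None:
        self.shadow[address] = data & 0xFF

    def on_read(self, address: int, bus_data: int) -> None:
        expected = self.shadow.get(address)
        if expected is None or expected == (bus_data & 0xFF):
            return                       # nothing to compare, or data matches
        if self.latched is not None and self.stop_at_first:
            return                       # hold the first fault until reset
        self.latched = Fault(address, expected, bus_data & 0xFF)

    def reset(self) -> None:
        self.latched = None


monitor = ShadowMonitor(stop_at_first=True)
monitor.on_write(0x5040, 0xFF)
monitor.on_read(0x5040, 0xEF)            # bit 4 reads back low
monitor.on_read(0x4000, 0x00)            # never written, so it is ignored
print(monitor.latched)
```

With `stop_at_first=False` the latch is simply overwritten by each new mismatch, which corresponds to the "continuously retrigger" mode.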
 
As per my last post, it no longer POSTS after using the freeze spray on the LS245... I swapped both the S280N and the 8088 and the SuperSoft gives the same errors that I saw earlier:
16K Critical RAM: error address 5040 bit 4
System memory to 10000: error address 4000, Parity error at 0c000

I swapped the 48k chip and now it is reporting:
16K Critical RAM: error address 7E40 bit 2
System memory to 10000: error address 4040, Parity error at 0c000

The RAM address errors have changed around before... but have been steady since I started this thread.
I have repeatedly swapped the 16k and 32k chips (for the 7E40 and 4040) and the error addresses do not change during several subsequent SuperSoft runs. I seriously think that the SuperSoft is on the wrong track.
 
I recently swapped the CPU (as per last post) and used cooler spray earlier too... system hasn't booted since I sprayed the LS245 which I am going to change next.

I have never even seen a 64KB card, but have a 64-256KB card... would that work? If so, what would be the SW1, SW2, and card SW settings?

Thanks!
 
That card should work.
On the card, SW1-4 should be on (closed), configuring the address to 0.
SW5-8 depend on how much RAM it has or how much you would like to give to the mainboard. Only one is set to ON at a time.
SW5 on: 64k, SW6 on: 128k, SW7 on: 192k, SW8 on: 256k.

For SW1/2 I'd try the 64k setting first.
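As a quick reference, the card settings described above can be written as a small lookup. This is just a sketch of the rule as stated in this post (SW1-4 on for address 0, exactly one of SW5-8 on for the size); the function name `card_switches` is made up for illustration.

```python
# Which of SW5-8 selects each memory size, per the list above.
SIZE_SWITCH = {64: 5, 128: 6, 192: 7, 256: 8}


def card_switches(size_kb: int) -> dict[int, bool]:
    """Return {switch number: True if ON} for a card mapped at address 0."""
    if size_kb not in SIZE_SWITCH:
        raise ValueError(f"size must be one of {sorted(SIZE_SWITCH)}")
    settings = {n: n <= 4 for n in range(1, 9)}   # SW1-4 on, SW5-8 off
    settings[SIZE_SWITCH[size_kb]] = True         # exactly one size switch on
    return settings


for size in (64, 128, 192, 256):
    on = [n for n, is_on in card_switches(size).items() if is_on]
    print(f"{size}k: switches ON = {on}")
```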
 
Now that is COOL!! It runs with ZERO memory errors, but 3 repeating Parity Errors.

A little before and after for comparison... before I set up your expansion card RAM trick, I fired up the SuperSoft and let it cycle 10 times without interruption. I previously power cycled the computer after each Supersoft run. In 10 uninterrupted runs (again, this is before the expansion card RAM trick), it gave the following memory error addresses:
16K Critical: 07C40 x6 , 06B32, 05212, and 07C12 x2
System memory to 10000: 04140 x6, 04440, 04240, 0404D, and 0C060
I did not pay much attention to Parity Errors, but they were generally 04000, 08000, 0C000
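One way to look for a pattern in those addresses is to sort them by bank. The sketch below assumes two things that are not confirmed in this thread: that the SuperSoft addresses are linear hexadecimal addresses, and that each of the four motherboard banks covers 16KB starting at 0. If either assumption is wrong, the grouping means nothing.

```python
from collections import Counter

BANK_SIZE = 0x4000   # assumed: 16KB per bank, bank 0 starting at address 0


def bank_of(address: int) -> int:
    """Return the bank number an address would fall in under that assumption."""
    return address // BANK_SIZE


# Addresses from the ten uninterrupted runs listed above.
critical = [0x07C40] * 6 + [0x06B32, 0x05212] + [0x07C12] * 2
system = [0x04140] * 6 + [0x04440, 0x04240, 0x0404D, 0x0C060]
parity = [0x04000, 0x08000, 0x0C000]

for name, addresses in (("16K Critical", critical),
                        ("System memory", system),
                        ("Parity", parity)):
    per_bank = Counter(bank_of(a) for a in addresses)
    print(name, dict(sorted(per_bank.items())))
```

Under those assumptions, none of the reported memory error addresses fall in bank 0, and the parity addresses sit exactly on the start of banks 1, 2, and 3.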

After removing U48, jumpering pins 15-16, and installing the card (with SW1/SW2 at 64K, card SW6 on), I ran the SuperSoft uninterrupted for 10 cycles. It PASSED 16K Critical RAM on the first pass, but then failed it 9 times in a row. It would not boot from a 1501476 either. SuperSoft errors for the 10 runs:
16K Critical: no memory errors or failing bits in 10 runs, Parity Error at 5040 x9 (no error the first pass)
System Memory to 10000: no memory errors or failing bits in 10 runs, Parity Error at 04000 and 0C000 all 10 times

I have changed the chips at those locations many times with no help and now it gives the same parity error locations on the expansion so the problem is not in the RAM chips.

this means something, but I don't know what... I wish I understood this stuff like the rest of you!

and FYI: I have not yet swapped the LS245

UPDATE: I still think this is a thermal failure... After sitting, it passed 16K on the first pass again... I have no idea which chip to cool.
 
OK... I'm confused, but that's nothing new with this one... As stated earlier in this thread, the 8237 feels hot. I decided to use the cooling spray on it. I ran the Supersoft through 4 uninterrupted passes with constant cooling of the 8237, and the results are:
PA170301.jpg

When I stopped cooling the 8237, it went another pass and a half then the failures reappeared:
PA170302.jpg

The 8237 reached a temp of 127F (53C) after a few minutes of running which I would not think is that significant, but it is by far the hottest chip on the board.

The 8237 originally failed SuperSoft diagnostics and was swapped out. I installed a chip socket and have swapped in 5 different 8237's and they all fail the same way. I am going to desolder and resolder the chip socket.

UPDATE: Desoldered and re-soldered the 8237 and the exact same errors listed in #13 are generated once the chip gets hot (after 1 successful PASS run). A different 8237 was also used with the same results.

ANOTHER UPDATE: I repeated the cooling test of the 8237 and ran the SuperSoft 10 times before I stopped cooling the 8237. It ran 13 passes with only "Memory Refresh" errors before it began to fail. See photos below. For cooling, I applied a plastic film to the 8237 and slowly sprayed it with an inverted can of compressed air. As you can see in the photo, the hot spot on the 8237 is almost dead center. Using the film pretty much limited the cooling effect to just the surface of the 8237 helping to isolate exactly what was cooled. I do not believe that the cooling spilled over very much to adjacent chips or the board itself, but I suppose it is possible. Once I stopped cooling the 8237, I removed the film and monitored the temperature until it failed a SuperSoft test. Failure occurred at approx 115F (46C). Once it fails, it still displays the same errors as those listed in post #13. The number of failures counted for each test becomes an "F" on the 10th fail. The number of total passes is in the lower right of the screen.
PA170307.jpg
PA170305.jpg
PA170308.jpg
 
(Are you sure the Pin 15-16 connection is reliable?)

One assumption: the initial 8237 died because of some other defect. The replacement 8237s are not dead yet, but they are suffering.
When overheated, the refresh stops, causing memory/parity errors.
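To put rough numbers on why a stalled refresh shows up as memory errors so quickly, here is a back-of-the-envelope sketch. The figures are assumptions taken from general 5150 and 16K DRAM documentation, not from this thread: a 1.193182 MHz timer clock, a refresh divisor of 18, and DRAM that needs 128 rows refreshed within 2 ms.

```python
PIT_CLOCK_HZ = 1_193_182    # assumed timer input clock
REFRESH_DIVISOR = 18        # assumed divisor for the refresh timer channel
ROWS = 128                  # assumed rows per 16K DRAM chip
MAX_REFRESH_MS = 2.0        # assumed maximum time between refreshes of a row


def refresh_interval_us() -> float:
    """Time between two refresh requests, in microseconds."""
    return REFRESH_DIVISOR / PIT_CLOCK_HZ * 1e6


def full_sweep_ms() -> float:
    """Time to refresh every row once, in milliseconds."""
    return refresh_interval_us() * ROWS / 1000


print(f"refresh request every {refresh_interval_us():.2f} us")
print(f"all {ROWS} rows refreshed in {full_sweep_ms():.2f} ms "
      f"(limit {MAX_REFRESH_MS} ms)")
```

If those figures hold, the margin is small, so a DMA controller that misses or delays refresh cycles when hot would start corrupting RAM within milliseconds.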

How to kill a 8237? My ideas:
1. clock too high (pc speaker tones same as on other boards?)
2. shorting outputs.
- shorts inside the ISA slots?
- Logic probe (also compare board in working and overheated condition):
Pin 10 (HRQ) positive pulses
Pin 36 (-EOP) probably negative pulses
Pin 25 (-DACK0) negative pulses
Pin 24 (-DACK1) high
Pin 14 (-DACK2) high
Pin 15 (-DACK3) high
Pin 8 (ADSTB) probably positive pulses

- resistance tests of the above pins against 5V and GND, looking for shorts

3. (maybe) bus data conflicts: Will it not heat up as much when you keep the board in reset by grounding 'power good'?
 
What temperature is reached on the 8237 in your good motherboards?

I took the temperature in both a working board and this board sporadically for 30 minutes. I admit it was under slightly different conditions and I hope that doesn't make a difference. The good board booted to BASIC and sat there. The bad board will not boot, so I ran it uninterrupted using a SuperSoft.

Good board:
5 minutes: 115F
15 minutes: 117F
30 minutes: 121F
Temperature rose so slowly that I didn't take a lot of interim temps

Bad board:
1 minute: 111F (began to Fail tests)
3 minutes: 115F
7 minutes: 122F
15 minutes: 127F
30 minutes: 128F
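For easier comparison with the Celsius figures quoted earlier in the thread, the readings above can be converted and the early rate of rise estimated. This is a small sketch using only the numbers listed here; the helper names are made up, and "early rate" simply means the slope between the first two readings of each board, which were taken at different times.

```python
def f_to_c(deg_f: float) -> float:
    """Convert Fahrenheit to Celsius."""
    return (deg_f - 32) * 5 / 9


# (minutes since power-on, temperature in F), as listed above
good = [(5, 115), (15, 117), (30, 121)]
bad = [(1, 111), (3, 115), (7, 122), (15, 127), (30, 128)]


def early_rate(readings: list[tuple[int, int]]) -> float:
    """Degrees F per minute between the first two readings."""
    (t0, f0), (t1, f1) = readings[0], readings[1]
    return (f1 - f0) / (t1 - t0)


for name, readings in (("good", good), ("bad", bad)):
    for minutes, deg_f in readings:
        print(f"{name} board, {minutes:>2} min: {deg_f}F = {f_to_c(deg_f):.1f}C")
    print(f"{name} board early rise: {early_rate(readings):.1f} F/min")
```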

The overall temperatures are not really that different, but the temperature of the bad board rose much more quickly. After running the bad board for 30 minutes and 30+ consecutive failed SuperSoft tests, I cooled the chip with my earlier method and it immediately passed the memory tests as shown in the photo below. The memory address error and failing bit shown in the lower right is residual data from the prior failed test and should be disregarded.
PA180309.jpg
 
> (Are you sure the Pin 15-16 connection is reliable?)

The 15-16 connection is soldered from the bottom... I realize it looks odd from the top. I agree with your assumption that something likely killed the 8237, but similar heat levels do not affect the good board. I assume a bad chip or shorted condition downstream from the 8237???

The speaker sounds absolutely normal when giving error code beeps for the supersoft. I will have to perform the other tests and get back to you.
 
I ran several SuperSoft cycles on a good board and this bad board while using a logic probe on the pins listed above. The good board would go into a lengthy floppy test/read, so I removed bit 0 from bank 1 to replicate the 16K Critical memory error and skip these tests. Even before I did this, the results were consistent; they just took longer to get to.

Both 8237's on the 2 different boards behaved the same (Logic Probe on TTL setting):
pin 8: Low Pulsing (green/low light lit, yellow data light pulsing)
pin 10: both green/low and red/high light lit, yellow data light pulsing
pin 14: high
pin 15: high
pin 24: high
pin 25: high pulsing
pin 36: high

Resistance readings on the bad board are below... very similar readings on the good board:
pin 8: infinity
pin 10: 1.9K to +5, 2.1K to ground
pin 14: infinity
pin 15: infinity
pin 24: infinity
pin 25: 4.7K to +5, 4.9K to ground
pin 36: 4.7K to +5, 4.9K to ground
 
> Both 8237's on the 2 different boards behaved the same (Logic Probe on TTL setting):

Is this with the bad board heated up enough to show the error?
 