Dead IBM 5170 clone - Protected Mode / 8237 troubleshooting



diskettenfett
July 4th, 2016, 09:57 AM
Hello,

I recently obtained a clone of a 5170 Rev. 1 (very similar board layout, 2 banks of RAM, although 6/10 MHz). Sadly, it is mostly dead.
I have so far tested the board with two known-good PSUs and Hercules cards, swapped the RAM banks, cleaned the board (minor NiCd battery damage, no dead traces to be found).

On powering it up with the original ROMs, there's a 50/50 chance. It sometimes gives only a black screen (no cursor, no beeps).
Sometimes, it comes up with a message "8237 error" and hangs.

So, I gave the Landmark diagnostic ROMs a try. See what happens in this video:


https://vimeo.com/173373947

On startup, the beep codes are: 6 HiLo, 1 Beep (Can't init video?!), then 6 HiLo, 2 Beeps (not defined in the manual?!). Then the screen comes up.
The protected mode test seems to fail (Error count goes up to 1), then it displays "failed" in the first line of the screen, the error count increases to two and then it seems to simply reset.

Seems like I am not the only one with this problem, as I read two threads (sorry, can't find them right now?) where the problem was similar.
I'd love to diagnose the problem further. Yes, it's only a clone motherboard, but I hate to trash it...

A 286 in the same package isn't available to me, so I tried replacing the 8042 keyboard controller with one from a 486 mainboard, since I read it also deals with leaving Protected Mode.
That made no difference. I then also read that the 386 and later CPUs leave protected mode differently, so the 486 keyboard controller might not implement the procedure a 286 needs - who knows? Any ideas?

Any suggestions on how to continue? Any other diagnostic roms or tests I could do?

Thanks a lot :)

Chuck(G)
July 4th, 2016, 10:20 AM
The process on a 286 getting back into real mode is a bit arcane on the 5170, probably because the engineers at Intel couldn't imagine why anyone would want to do that. But basically it amounts to this:

The protected-mode BIOS service writes a "reason for reset" into CMOS location 0Fh.
The keyboard controller is then instructed to pulse the 80286 reset line.
The 80286, now restarted at its usual reset vector at FFFF0h, runs some minimal BIOS device initialization, and the BIOS then looks at CMOS location 0Fh.
If it sees that the reason for shutdown was a "switch from protected mode", it jumps through a double-word pointer that was stored prior to the switch to protected mode.

You can see how the code works in the 5170 BIOS routines for INT 15H, Function 87H, "move block".
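
The steps above can be sketched as a toy simulation. This is only an illustration of the handshake, not the IBM BIOS code: the `Machine` class and its methods are hypothetical, nothing touches real hardware, and the shutdown value 0Ah and the clearing of the shutdown byte are assumptions based on how AT-class BIOSes are commonly described. The 8042 command FEh is the usual "pulse reset line" command.

```python
# Toy model of the 5170-style "reset to leave protected mode" handshake.
# Nothing here touches hardware; all names are illustrative stand-ins.

CMOS_SHUTDOWN_INDEX = 0x0F   # CMOS location named in the post
KBC_PULSE_RESET = 0xFE       # 8042 command: pulse the CPU reset line
SHUTDOWN_NORMAL_POST = 0x00  # assumed value: no special reason, run the full POST
SHUTDOWN_JMP_VIA_PTR = 0x0A  # assumed value: resume through the saved far pointer


class Machine:
    def __init__(self, kbc_working=True):
        self.kbc_working = kbc_working
        self.cmos = {CMOS_SHUTDOWN_INDEX: SHUTDOWN_NORMAL_POST}
        self.resume_ptr = None       # (segment, offset) saved before the mode switch
        self.protected_mode = False
        self.log = []

    def enter_protected_mode(self, resume_ptr):
        # The caller stores where real-mode execution should continue later.
        self.resume_ptr = resume_ptr
        self.protected_mode = True
        self.log.append("entered protected mode")

    def leave_protected_mode(self):
        # Step 1: record why the CPU is about to be reset.
        self.cmos[CMOS_SHUTDOWN_INDEX] = SHUTDOWN_JMP_VIA_PTR
        # Step 2: ask the keyboard controller to pulse the CPU reset line.
        self.keyboard_controller_command(KBC_PULSE_RESET)

    def keyboard_controller_command(self, command):
        # A dead or incompatible controller simply never produces the pulse.
        if self.kbc_working and command == KBC_PULSE_RESET:
            self.cpu_reset()

    def cpu_reset(self):
        # A reset always drops the 80286 back into real mode.
        self.protected_mode = False
        self.log.append("CPU reset")
        self.bios_reset_entry()

    def bios_reset_entry(self):
        # Steps 3 and 4: early BIOS code reads the shutdown byte and decides.
        reason = self.cmos[CMOS_SHUTDOWN_INDEX]
        # Assumption: the byte is cleared so the next real reset runs a full POST.
        self.cmos[CMOS_SHUTDOWN_INDEX] = SHUTDOWN_NORMAL_POST
        if reason == SHUTDOWN_JMP_VIA_PTR and self.resume_ptr is not None:
            seg, off = self.resume_ptr
            self.log.append(f"resume at {seg:04X}:{off:04X}")
        else:
            self.log.append("full POST")
```

Constructing the model with `Machine(kbc_working=False)` shows the failure case: the shutdown byte is written, but the reset never arrives and the simulated CPU stays in protected mode.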

The 80386, fortunately, has a way to get back from protected mode, so its BIOS doesn't use the 286 scheme. Neither does its keyboard controller, so you can't swap one into a 5170 and expect it to work. The other aspect is that the CMOS RTC chip has to be functional.

Later 80286 code avoids the huge latency in this process by employing a then-undocumented instruction, called "LOADALL", that enables a program to set all 80286 registers without restarting the CPU.

modem7
July 4th, 2016, 04:53 PM
On powering it up with the original ROMs, there's a 50/50 chance. It sometimes gives only a black screen (no cursor, no beeps).
Sometimes, it comes up with a message "8237 error" and hangs.
Instability like that is bad. You did not report the same instability with the Supersoft/Landmark ROMs fitted, so perhaps the original ROMs had a poor connection to their sockets, or perhaps the original ROMs are intermittent.


So, I gave the Landmark diagnostic ROMs a try...then it displays "failed" in the first line of the screen
Note that "FAILED" not appearing on the same line as the associated test has been seen by a few people.


Seems like I am not the only one with this problem, as I read two threads (sorry, can't find them right now?) where the problem was similar.
One thread is [here (http://www.vcfed.org/forum/showthread.php?46489)]. As with your motherboard, the PROTECTED MODE CPU test failed, although on romanon's motherboard the Supersoft/Landmark ROMs did not display "FAILED". Romanon then put back the IBM ROMs and used a POST card (http://www.minuszerodegrees.net/misc/post_cards.htm). The POST card stopped at 0C, which for the IBM 5170 ROMs pointed to the keyboard controller. The keyboard controller was found to be faulty.

The other thread is [here (http://www.vcfed.org/forum/showthread.php?43880-I-obtained-a-very-sad-5170&p=345067#post345067)]. The Supersoft/Landmark screen output was slightly different from yours - the tests continued.


Any suggestions on how to continue? Any other diagnostic roms or tests I could do?
Maybe you should try what romanon did; use a POST card (http://www.minuszerodegrees.net/misc/post_cards.htm) with the original ROMs installed on the motherboard. One problem with that is that the authors of your non-IBM BIOS may have used different POST codes from IBM's, in which case you would need access to the POST code list for your non-IBM BIOS.

Or try a set of IBM 5170 ROMs together with a POST card, and the POST code list at [here (http://www.minuszerodegrees.net/5170/post_errors/5170_post_codes.htm)]. It would be interesting to compare that result to what the Supersoft/Landmark ROMs are indicating.

yuhong
July 4th, 2016, 05:15 PM
Later 80286 code avoids the huge latency in this process by employing a then-undocumented instruction, called "LOADALL", that enables a program to set all 80286 registers without restarting the CPU.

LOADALL cannot switch to real mode. Triple fault was used instead.

Chuck(G)
July 4th, 2016, 05:18 PM
Eh, yeah--it's been too long. You can use LOADALL, IIRC, to access memory above 1MB.

diskettenfett
July 11th, 2016, 10:18 AM
Hello,

thanks for your helpful comments. I've been able to narrow down the problem a little further:

- I remembered that I have a book-size 286 PC in my collection. I opened it up, found a socketed 8042-based floppy controller, swapped them between the boards and - yeah, no cigar, both work fine.
- Sadly, the CPU is soldered in on the book-size motherboard. I do have the tools to desolder it, but I'd rather try to get hold of a single CPU instead of maybe frying the book-size PC just for troubleshooting.

I then experimented with the BIOSes available for the 5170 on the minuszerodegrees site and at least managed to get a screen out of the Quadtel BIOS. In about 1 out of 50 cases, I get a repeated Quadtel message in the top left of the screen, then it goes blank again and starts all over. Twice I only got the message "!12" in the top left corner of the screen (the PC then froze), which is kind of funny, since 12 (hex) is the POST code in the Quadtel BIOS for a faulty 8237 DMA controller - the same error the original BIOS on the motherboard gave me (although in clear text).
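
Since POST codes are specific to each BIOS vendor, a code only means something when looked up in the right list. A minimal sketch of such a lookup follows; the table holds only the two codes mentioned in this thread, and the names `POST_CODES` and `describe_post_code` are made up for illustration. A real list would have to come from the vendor's documentation.

```python
# Per-BIOS POST code tables. Only the two entries reported in this thread
# are included; everything else is deliberately left out rather than guessed.
POST_CODES = {
    "IBM 5170": {0x0C: "keyboard controller"},
    "Quadtel": {0x12: "8237 DMA controller"},
}


def describe_post_code(bios: str, code: int) -> str:
    """Look up a POST code in the table for one specific BIOS."""
    table = POST_CODES.get(bios)
    if table is None:
        return f"no POST code list for BIOS '{bios}'"
    meaning = table.get(code)
    if meaning is None:
        return f"{code:02X}h not in the {bios} list"
    return f"{code:02X}h: {meaning}"
```

Looking up the same number under a different BIOS name returns "not in the list" instead of borrowing another vendor's meaning, which is the point of keeping the tables separate.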

So it is still flaky and doesn't give the same result with every power-on. Most of the time there's just a black screen with no sounds at all, even with the Landmark ROM. There seems to be a very low level problem.
I then noticed that the reset pins don't give a reset at all. Does the reset have to be handled by software? I thought it would just hard-reset the CPU and peripherals.

Well, there's an ISA POST card on the way to me, so then I'll experiment again with different BIOSes and maybe will try to swap the 8237 first. Luckily, I still have two in my IC box.
Let's say it is a broken 8237... Any hints on how to increase the odds of unsoldering the bad one first?

Funny. Maybe 10 years ago I gave a big box full of maybe 40 mainboards with 8088/286/386 to the recycler because nobody wanted them and they weren't worth anything - only taking away space. Even rare ones with onboard SCSI and so on. Now I'm spending hours to (maybe) resurrect a cheap-ass 286 clone motherboard. D'oh!

Thanks for your help, I will keep you updated as soon as the POST card arrives. :)

modem7
July 13th, 2016, 03:11 AM
So it is still flaky and doesn't give the same result with every power-on.
Hopefully you are not dealing with multiple problems.

Your earlier, "minor NiCd battery damage", is concerning. One of my 5170 motherboards is very unstable, and I believe that the instability was caused by a leaking battery.


Let's say it is a broken 8237... Any hints on how to increase the odds of unsoldering the bad one first?
One technique is [here (http://minuszerodegrees.net/soldering/ic_removal/ic_removal.htm)]. I suggest that you put in an IC socket afterwards.


I then noticed that the reset pins don't give a reset at all. Does the reset have to be handled by software? I thought it would just hard-reset the CPU and peripherals.
Your IBM 5170 clone is probably very close to the 5170. Refer to the reset diagram at [here (http://minuszerodegrees.net/5170/motherboard/5170_reset_flow.jpg)]. At the instant of power-on, the 5170 motherboard reset is active, i.e. motherboard in a reset state. Then, the power supply's POWER GOOD line going active takes the motherboard out of the reset state.

Later, as part of the power-on self test (POST), the POST will put the CPU into protected mode, and later take the CPU out of protected mode (see [here (http://minuszerodegrees.net/5170/post_errors/5170_post_codes.htm)]). It takes the CPU out of protected mode by sending a particular command to the keyboard controller (as shown in the earlier referenced reset diagram). That command results in the CPU getting reset - you will see the reset line to the CPU go momentarily active. No other reset lines on the motherboard change state.

Of course, depending on the BIOS you have fitted, you may not see the CPU's reset line momentarily go active during the POST, because the POST may be failing (and stopping) before a POST checkpoint/test that takes the CPU in and out of protected mode.
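
The reset behaviour described in this post can be summarised as a small logic sketch. It is a simplification, not a transcription of the reset diagram: it assumes just two inputs (POWER GOOD and the keyboard controller's reset pulse) and assumes the CPU reset is simply the OR of the board-wide reset and that pulse. The function name `reset_lines` is hypothetical.

```python
def reset_lines(power_good: bool, kbc_reset_pulse: bool) -> dict:
    """Return which reset lines are active in this simplified model.

    Assumptions: only two inputs exist, and the CPU reset is the OR of the
    board-wide reset and the keyboard controller's reset pulse.
    """
    # The whole board is held in reset until POWER GOOD goes active.
    system_reset = not power_good
    # The CPU is reset with the board, or on its own by the keyboard controller.
    cpu_reset = system_reset or kbc_reset_pulse
    return {"system_reset": system_reset, "cpu_reset": cpu_reset}
```

With POWER GOOD active and the keyboard controller pulsing, only `cpu_reset` is true, which corresponds to what a logic probe on the CPU's reset pin should show when the POST leaves protected mode.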