Bad CQD-220TM



fraveydank
February 25th, 2011, 01:41 PM
I acquired a CMD CQD-220TM (QBUS SCSI card) a while back that was working when I got it. Now, however, it's gone flaky; about 4 times out of 5, when I start my 11/23+ up, the thing doesn't respond properly to its CSRs. It's working in a sense, in that I can get the diagnostic utility to copy over to RAM when it does work properly, but it goes off after that at some point (when I bring up the "other utilities" section, it hangs when trying to read the serial number, and stepping through the code using ODT I can see it's stuck polling the CSRs and never leaving because it's always getting 0 back).

The infuriating part about this is that it was working, and it stopped behaving just when I finally had the time (and SCSI disks) to do something with it. The CPU appears to be an 8088 (or 8086, but there are only two '373s on the bus interface and an 8086 usually has 3; I don't really want to peel back the label on the CPU because I'll never get it back on). I'm worried that what's gone wrong is one of the million or so PLDs on the board (this thing is more than half PAL/GAL/PEEL); if that's the case, it'll never work unless someone happens to have bitfiles for the devices lying around somewhere. I could get an 8088 to replace this pretty easily, and if anyone has known-good ROM dumps, I can at least verify the ROMs (and I have a UV EPROM eraser and a burner, but that would require peeling the labels back again).

Anyone have experience fixing/troubleshooting these? I'm a little bummed about this.

RSX11M+
February 25th, 2011, 04:35 PM
No experience with that particular module, but if no one else can give you better help - I'll take a crack at it with ya.

Is this the one (http://sites.google.com/site/glensvintagecomputerinfo/cmd-cqd-200-tm)?

If so, it looks like someone has done some of the work for us, decoding / reverse engineering some of the PALs.

I still have a lot of 808X data books here - we have to be able to get somewhere with it. I'll look to see if there's any more documentation out there for it.


Found a User's Manual (http://www.bitsavers.org/pdf/cmd/CQD-220_CQD-223_Nov90.pdf)
Somebody's Switch settings and Notes (http://www.bitsavers.org/pdf/cmd/CQD-220A_SwitchSettings.txt)


Apparently one of our members is familiar with this unit. You might try PMing him (http://www.vintage-computer.com/vcforum/member.php?14620-gslick).

I'm guessing this is you (http://groups.google.com/group/comp.sys.dec/browse_thread/thread/229fced1940b8268)?

Looks like a nice module... probably worth fixing.

gslick
February 25th, 2011, 04:58 PM
I have EPROM versions F220Y1A8 / F220Y2A8 on my CMD CQD-220/TM. I could dump them and upload images of them.

I don't have source code for the CSR decode PAL, but I reverse engineered a functional equivalent so I could turn a CQD-223/M version and a CQD-223/T version into CQD-223/TM functional versions. I used 22V10's for the replacement although the logic is purely combinatorial and there are no registered outputs or feedback terms.

https://sites.google.com/site/glensvintagecomputerinfo/cmd-cqd-220-scsi

RSX11M+
February 25th, 2011, 05:10 PM
Glad to see you gslick... I linked to you hoping he'd PM ya. Let's see what happens. There's one of these on eBay for $799 right now. LOL

fraveydank
February 25th, 2011, 06:00 PM
Thanks, that should be really helpful; I happen to have some GAL22v10s lying around, but it appears my new desktop machine doesn't have a parallel port, so I'll have to wait for one before I can burn the GAL with my burner. If that doesn't fix the problem, I'll ping you about the ROM images (can't very well use those without the burner either). Thanks!

RSX11M+
February 25th, 2011, 06:17 PM
gslick, could you dump / post your EPROM contents someplace? It would be good to have them. Have you tested the reverse engineered PAL code? If so you might post that too, if you're willing.

I've got no GALs, but I have a couple programmers here, so if this effort gets stuck, there should be a way. [I kept my Win98 machine for exactly this reason] I've got both an EMP20 and a PROTEUS, so I'm sure they can be programmed. Even got a couple erasure chambers here somewhere.

I used the EMP20 about 2 years ago to re-flash my mother's bricked hp laptop. Boy, that was a mess. Had to surgically remove the old PLCC [soldered to the motherboard] and install a socket. Bought the exact replacement part at DigiKey. Whole project took a month end-to-end.

fraveydank
February 25th, 2011, 06:53 PM
Amusingly enough, I got this burner to fix a server I had bricked by applying a BIOS patch that didn't work. Handy things! No longer in support, but I know it does GALs and any kind of EPROM from before about 2002 just fine. I've ordered a parallel port card from NewEgg which should come in a few days; we'll find out then.

gslick
February 26th, 2011, 09:21 PM
I dumped the EPROMs of all of the firmware versions I have and uploaded the contents here:
https://sites.google.com/site/glensvintagecomputerinfo/

MTI QTS-30 REV A, TMSCP
-----------------------
F200T1C5.BIN, F200T2C5.BIN

MTI QTS-30 REV B, TMSCP
-----------------------
F200T1C6B.BIN, F200T2C6B.BIN

CMD CQD-200/TM, MSCP and TMSCP
------------------------------
F200Y1A8C.BIN, F200Y2A8C.BIN


CMD CQD-220/M, MSCP
-------------------
F22001E7A.BIN, F22002E7A.BIN

CMD CQD-220/T, TMSCP
--------------------
F220T1F1.BIN, F220T2F1.BIN

CMD CQD-220/TM, MSCP and TMSCP
------------------------------
F220Y1A8.BIN, F220Y2A8.BIN


CMD CQD-220A/M/T, MSCP or TMSCP
-------------------------------
F220AE1F2L.BIN, F220AE2F2L.BIN


What I don't have and what I would like to find is a copy of this CQD-220A firmware:
CQD-220A/TM F220AY1B2L, F220AY2B2L

RSX11M+
February 26th, 2011, 11:34 PM
Saw your board docs... Very cool! Thank you.

I particularly liked the HP-10276A module. I heard of this, but never saw one. It could be used by the hp-1631 logic analyzer [brochure here (http://www.mrtestequipment.com/getfile.php%3Fs%3DAgilent%2BHP%2B1631A%2BLogic%2BAnalyzer%2BData%2BSheet.pdf) and here (http://www.tucker.com/webimages/productpdf/00000086.pdf)]

I have hp-1615 and hp-1611 logic analyzers here, with a general purpose "roll your own" pod for the 1615, but not the original Z80 I/F unit that came with the 1611.

Your card is still listed on some of the test equipment websites.

fraveydank
March 8th, 2011, 03:49 PM
Got the burner going again, finally (apparently Xeltek's software won't support anything past XP, so I had to run it under VMWare). Results:

- Burning a new GAL22V10 (10 ns vs the PEEL's 30 ns) with your code had about the same effect as removing and firmly reseating the existing PEEL. The CSR response seems much more stable now, but the same problems crop up. I can issue a format command, but it finishes more quickly than it seems it should; qualify doesn't seem to work at all; pulling up "additional utilities" still crashes when it's reading the serial number. Changing the LUN offset seems to work, and the change persists through several power cycles, so it looks like the little serial EEPROM soldered on the board works. I don't know where the serial number is stored; I might try disassembling the ROMs in a bit.

- Replacing the firmware ROMs with what you've provided has interesting effects. My board comes with rev A7A (you seem to have A8), so I burned your A8 onto a pair of spare 27(C)256 chips and tried them out. The board responds to the CSRs OK, and I can copy the utility into RAM, but when I try to start it and provide it with the CSR, it says "NOT A SCSI CONTROLLER". If I try to specify a CSR that it's not at, it says "NO CONTROLLER PRESENT", which indicates that at least it sees something there. Not sure what it's looking for; I'd have to dump the program and disassemble it to see what's going on. I can also provide you with the A7A firmware if it would be useful.

- I still can't seem to see anything on the serial port. The manual indicates that the serial pinout is pretty much the same as DEC's; if I hook up one of my bodged-up PC serial cables resoldered to match the DEC pinout, I get nothing and the card seems to show an error on the diagnostic LEDs. Is there a trick to the serial port? Perhaps it's 5V instead of RS232 levels? I don't see any level converters on the card, though I haven't tried to map out a schematic to see if they've done a custom one. Hoping I haven't done any damage, though it at least seems to be diode-protected against reverse polarization.


Anyone got next step ideas beyond disassembling the code? I'm working on that (have a copy of IDA Pro which is doing as admirable a job as it can, but I'm having a bit of trouble with the 8086's disgusting segment register concept).

fraveydank
March 8th, 2011, 04:03 PM
Fun fact: The ROM images actually form a DOS executable. I thought IDA was nuts when it said "hey, this looks like a DOS EXE", but I guess they must have used an XT as the prototyping system or something and just tagged the reset vector on at the end? Dunno. Anyway, that does seem to make disassembling a whole lot easier.

RSX11M+
March 8th, 2011, 06:44 PM
That is interesting. I've used Datalight ROM DOS (http://www.datalight.com/products/rom-dos) when I designed stuff like that.

Very clean. The BIOS is distributed in source so it can be customized to the platform.

A little hint... debugging was via Borland Turbo Debug using "remote mode".

Might help you figure things out.

gslick
March 8th, 2011, 09:22 PM
- I still can't seem to see anything on the serial port. The manual indicates that the serial pinout is pretty much the same as DEC's; if I hook up one of my bodged-up PC serial cables resoldered to match the DEC pinout, I get nothing and the card seems to show an error on the diagnostic LEDs. Is there a trick to the serial port?

I have used the serial ports on all of my CMD controllers without any problems. Just a three wire connection: GND, TX, RX to my PC running a terminal emulator. You don't need the jumper (7-9?) on the 10-pin connector that you need on standard DEC 10-pin serial connectors.

-Glen

fraveydank
March 9th, 2011, 07:05 AM
Is there any trick to it? The manual isn't too clear; says the system should be in halt or something like that. I've tried every combination of halt, reset, etc. with no luck. Perhaps that's indicative of the problem... maybe I need to trace out the UART circuit and see if I can't see something closer up the chain on the scope.

fraveydank
March 9th, 2011, 07:11 AM
That is interesting. I've used Datalight ROM DOS (http://www.datalight.com/products/rom-dos) when I designed stuff like that.

Very clean. The BIOS is distributed in source so it can be customized to the platform.

A little hint... debugging was via Borland Turbo Debug using "remote mode".

Might help you figure things out.

Doesn't look to be that; the site indicates that ROM DOS came out in 1989 (the board is from '88, though I guess the firmware might not be) and claims to require a 186 or higher (this is a straight up 8086 in a 40-pin DIP). Moreover, the bottom of the ROM (the first 0x200 bytes) is an MZ-style DOS executable header, which indicates that they intended this to be run on a PC actually running DOS. The reset vector at offset 0xFFF0 starts execution at the program entry point (0xF000:0200, indicating that the ROM is mapped at 0xF0000 on the board).

Good guess, though... do you know of any PC-based embedded prototyping frameworks from around then?
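
For anyone who wants to check the same layout on their own dumps, here is a rough Python sketch. It assumes the two EPROMs hold alternate bytes of the 16-bit bus (low/even bytes in one chip, high/odd bytes in the other), which is the usual 8086 arrangement but has not been verified for this board; the function names are made up for illustration.

```python
import struct

def interleave(even: bytes, odd: bytes) -> bytes:
    """Merge two byte-wide ROM dumps into one 16-bit-wide image.

    Assumes 'even' holds the even-address (low) bytes and 'odd' the
    odd-address (high) bytes. Swap the arguments if the result looks wrong.
    """
    if len(even) != len(odd):
        raise ValueError("ROM halves differ in size")
    image = bytearray(len(even) * 2)
    image[0::2] = even
    image[1::2] = odd
    return bytes(image)

def has_mz_header(image: bytes) -> bool:
    """True if the image starts with the DOS 'MZ' executable signature."""
    return image[:2] == b"MZ"

def reset_target(image: bytes, vector_offset: int = 0xFFF0):
    """Decode a far JMP (opcode 0xEA) at the reset vector.

    Returns (segment, offset), or None if the byte there is not 0xEA.
    """
    if image[vector_offset] != 0xEA:
        return None
    offset, segment = struct.unpack_from("<HH", image, vector_offset + 1)
    return segment, offset
```

If the merged image does not start with "MZ", the most likely explanation is that the two halves are in the wrong order.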

RSX11M+
March 9th, 2011, 08:28 AM
Doesn't look to be that; the site indicates that ROM DOS came out in 1989 (the board is from '88, though I guess the firmware might not be) and claims to require a 186 or higher (this is a straight up 8086 in a 40-pin DIP). Moreover, the bottom of the ROM (the first 0x200 bytes) is an MZ-style DOS executable header, which indicates that they intended this to be run on a PC actually running DOS. The reset vector at offset 0xFFF0 starts execution at the program entry point (0xF000:0200, indicating that the ROM is mapped at 0xF0000 on the board).

Good guess, though... do you know of any PC-based embedded prototyping frameworks from around then?

I understand what you're saying about the EXE in the ROM. I guess there could be another ROM with the BIOS, but more likely it's in the same devices and partitioned by an addressing trick.

Have a look at Borland Turbo C (http://en.wikipedia.org/wiki/Turbo_C) [there were other languages too] for DOS and later, Windows.


Myself and others used this as a cross-development environment for many years, even on embedded targets. Once I'd get a board up and working as a prototype, I'd port a BIOS to it and include TD's Remote Debugging driver in it. Allows the Turbo Debugger [Windows application] to run on my local machine and talk to the target system over a serial port to debug software on it remotely. [even the BIOS] Extremely small footprint on the target [~2k as I recall].

My last such design was a 386EX based board intended to be a "special purpose" protocol translator with up to 18 serial ports or other devices. Fun project. Too bad we don't do manufacturing in the U.S. any more... I'd like to still be doing that kind of work.

fraveydank
March 9th, 2011, 09:32 AM
Hm, curious. I wouldn't be terribly surprised if they just did the initial development with an ISA card mockup running under DOS, then tagged a boot vector onto the binary and stuck it in ROM; the DOS header has the code loading into a different region of memory (0x10000) than where the ROM seems to be mapped (0xF0000). I wouldn't be surprised if it was done using Borland, though; the debug stubs might be present.

In any case, I think the fact that I don't see anything at all on the UART may be a sign of something else failing and messing with the firmware execution; I'm going to trace that circuit out and see if I can figure out where it goes and whether I can fix it.

RSX11M+
March 9th, 2011, 10:05 AM
If you know the serial port should be active that's a good strategy. Look to see if the processor reset line gets toggled a lot too.

You could be right, and the EXE header is just a vestigial artifact of the compilation process.
If the EXE contains code to relocate the ROM address after startup, it seems more likely that it has the entire code for the board. After all, it's real mode [186?] so everything is possible, and there may be no need for a BIOS if the author knows enough.
I can't imagine debugging on anything easier than TD, but if they didn't have those tools, they may have done it another way. Of course, the debugging isn't needed in the final product and may not be present, or it's deactivated by some trick too.
Note: One thing about TD... it needs the target EXE to be in RAM to really be fully functional. Look for a copy-ROM-to-RAM phase in the startup somewhere. If it's not there, TD can't be active in this edition except to step.
Special hardware versions for development were possible in those days too, so anything could be.

Should be fun for you to puzzle out. I'd look through the ROM contents to see what else you can recognize.

Got any idea what the running memory map looks like yet?

RSX11M+
March 9th, 2011, 10:33 AM
Incidentally... DataLight is just the ROM DOS I used later on [because they included the BIOS code for your use]. There were other ROM DOSes going way back to the first laptops and portables.

'86 is when I saw the first preliminary Borland Turbo products, although not fully fleshed out.

Anyway, let's focus on what you have there... could you post a hi-res photo of the board someplace?

I gotta get one of these things if I'm gonna help, even if it's bad. Anyone got anything to send me?

fraveydank
March 9th, 2011, 11:19 AM
I'm pretty sure this is just bare metal; there's only 16k of RAM on this board (not counting the hardware FIFOs attached to the SCSI controller), so I'm willing to bet this thing runs straight out of ROM. The code is actually pretty much all position independent (within the code segment, anyway), so it doesn't actually matter where it's located in memory; the DOS header says it's located at one location, but the actual execution from the reset vector indicates that the real board has it located in another.

My guess is that the memory map at least has the ROM in 0xF0000-0xFFFFF, since the 8086 boot vector is 0xFFFF0 and it directs execution to 0xF000:0x0200 (or 0xF0200 in real-person terms). The initial parts of the boot code zero both DS and SS, so I'm assuming the RAM starts at 0x0000 and goes to 0x3FFF (16k).
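
The segment arithmetic behind those addresses is just "segment times 16 plus offset", wrapped to the 8086's 20 address bits. A tiny Python sketch (the function name is made up for illustration):

```python
def linear(segment: int, offset: int) -> int:
    """8086 real-mode physical address: (segment << 4) + offset, 20 bits wide."""
    return ((segment << 4) + offset) & 0xFFFFF

# The 8086 comes out of reset with CS=0xFFFF, IP=0x0000.
print(hex(linear(0xFFFF, 0x0000)))  # 0xffff0, the boot vector
# Entry point at offset 0x200 in segment 0xF000.
print(hex(linear(0xF000, 0x0200)))  # 0xf0200
# Top of the presumed 16k RAM with DS = SS = 0.
print(hex(linear(0x0000, 0x3FFF)))  # 0x3fff
```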

There's at least a bunch of string tables I can see that look a lot like what the PDP-11 side of things spits out for the online tool; I'm assuming the same strings get copied out to the PDP-11 with the online tool code, but I haven't looked hard enough to find where that happens. Right now, I'm concerned that some of the internal memory decoding may be screwed up, which would indicate why the serial number can't get read and why the UART doesn't work; the interface between QBUS and the 8086 seems to be more or less OK, and it does appear to at least operate normally a lot of the time until it gets stuck. Curiously, while reading the serial number seems to fail, reading the LUN offset (also stored in non-volatile memory) does not, which indicates that they may be in different non-volatile locations. I only see one 8-pin serial EEPROM.

As far as a high-res picture, the ones on gslick's site are pretty good; they should probably give you a good idea of what's on there. Unfortunately, a lot of the routing goes underneath the chips, so I'll probably need to buzz some things out with my meter.

RSX11M+
March 9th, 2011, 07:07 PM
On the photos - he's got several versions [his have an 8086]. Could you please pick out one that's right?

16K (8Kx16) Ram, and 64K (32Kx16) PROM

I did a 188 design a few years back... I'll pull the docs and refamiliarize myself with the 186 programming details.

Did you look to see if the reset keeps getting hit?

fraveydank
March 10th, 2011, 04:28 AM
OK, so a bit of tracing later...

The UART does indeed seem to be implemented in software, though it's a hard-to-trace system; the Tx comes off of a 74LS273 (8-bit edge-triggered flip-flop) which takes its data (at least for that bit) off one of the four FIFOs. The Rx goes straight into another one of the FIFOs through a 74LS244 (just a buffer with output enable). I haven't traced where the other data bits in the FIFOs go (would be quite a bit of work), but their read/write strobes come from a PAL. Once I get some more time today, I'll scope out those bits; I wouldn't be surprised if the strobe happens to be coming at 9600 Hz.
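
To know what to look for on the scope, here is a rough Python sketch of what a software UART has to put on the Tx line. It assumes 9600 baud and 8N1 framing (one start bit, eight data bits LSB first, one stop bit); neither is confirmed for this board, and the names are made up for illustration.

```python
BAUD = 9600  # assumed line rate, not confirmed for this board

def bit_time_us(baud: int = BAUD) -> float:
    """Duration of one bit cell in microseconds."""
    return 1_000_000 / baud

def frame_8n1(byte: int) -> list:
    """Logic levels for one character: start (0), 8 data bits LSB first, stop (1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

print(round(bit_time_us(), 2))  # 104.17
print(frame_8n1(ord("A")))      # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]
```

At 9600 baud each bit cell lasts about 104 microseconds, so a strobe that paces the bits would show up on the scope with roughly that period.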

I haven't checked to see if the reset line is toggling, but I'm fairly confident that the CPU is running more or less OK; it doesn't respond to the CSRs with any intelligible data if I pull the ROMs or the CPU, so I'd be willing to say that if it were getting constantly reset, I wouldn't be able to download the on-line program into the host RAM.

fraveydank
March 10th, 2011, 04:33 AM
On the photos - he's got several versions [his have an 8086]. Could you please pick out one that's right?

16K (8Kx16) Ram, and 64K (32Kx16) PROM

Oh, and I only see one version on the site (not the 220A, just the original 220). That looks to be the correct size for RAM and PROM. There's a little SPI-style (Microwire) EEPROM (NMC9306N) on there that gets bit-banged via a crude GPIO setup using the same '273 the UART does; I'd be curious to see if that hooks up to the same FIFO.

It might be useful to cook up a schematic. I have a number of schematic packages at my disposal; I'll see what I can do with a limited time budget.

fraveydank
April 21st, 2011, 05:41 PM
I've finally had a bit of time to sit down and disassemble this, both in the 8086 and PDP-11 domain. IDA Pro, as it turns out, has some issues with doing the disassembly of the 8086 because all of the string tables are kept in ROM (and are thus referenced by the cs register) but whenever an address is loaded into a register (e.g. for printing a string), it assumes it's an address relative to ds. Oh well. Duplicated the data at the ds address and all is mostly well.

For anyone curious about the structure of the ROM: the PDP-11 utility portion starts at address 0xE000 and goes until nearly the end. On the PDP-11 side, it gets copied into address 05000 (octal), so if you're disassembling it, make sure it points there (there is an absolute location at the start and all the string references are pretty much absolute).
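
A small Python helper for translating between the two views while disassembling, assuming the utility is copied byte for byte from ROM offset 0xE000 to octal 05000 with nothing skipped or inserted (the post implies this but does not state it); the names are made up for illustration.

```python
ROM_UTILITY_START = 0xE000  # ROM offset where the PDP-11 utility begins (from the post)
PDP11_LOAD_ADDR = 0o5000    # address it is copied to on the PDP-11 (from the post)

def rom_to_pdp11(rom_offset: int) -> int:
    """Map an offset in the 64K ROM image to the PDP-11 address it lands at."""
    if rom_offset < ROM_UTILITY_START:
        raise ValueError("offset is below the PDP-11 utility region")
    return PDP11_LOAD_ADDR + (rom_offset - ROM_UTILITY_START)

def pdp11_to_rom(address: int) -> int:
    """Map a PDP-11 address inside the loaded utility back to a ROM offset."""
    if address < PDP11_LOAD_ADDR:
        raise ValueError("address is below the load address")
    return ROM_UTILITY_START + (address - PDP11_LOAD_ADDR)

print(oct(rom_to_pdp11(0xE000)))  # 0o5000
print(oct(rom_to_pdp11(0xE100)))  # 0o5400
```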

Fun facts:

- The "additional utilities" section actually communicates its text directly with the 8086 (which makes a certain amount of sense). The actual routine is just a loop polling CSR+2 for text data with appropriate bits set and printing them out if it gets them, then polling the UART for incoming bytes and sending them back to CSR+2 with appropriate bits set. This explains why I couldn't find the relative strings in the massive string table in the PDP-11 section. It also means the 8086 is definitely where the thing is crashing, because it prints out "S/N =" and then stops.

- The 8086 does its serial via bit-banging. I haven't quite figured out how it does its timing (I'm not particularly good with 8086 bus cycles; give me a 68k any day), but there seems to be a general-purpose set of I/O registers at 0x80xx (curiously, the actual QBUS data transaction registers appear to be at 0x7Fxx). Assuming all the bit-banging is done via one register (not out of the question, since the same latches/bus drivers seem to go to both the UART pins and the SPI EEPROM), I think I may be getting close to the problem. The bad news: It's probably a PLD that I'll have to desolder and guess the function of by tracing it out on the board, which is not a task I relish.
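
The host-side loop described in the first point can be modelled roughly as follows in Python. The flag bit and character mask are placeholders, since the thread does not give the real CSR+2 bit layout, and the register is simulated; only the controller-to-console direction is shown.

```python
from collections import deque

VALID = 0x8000      # hypothetical "character valid" flag; real bit layout unknown
CHAR_MASK = 0x00FF  # hypothetical position of the character within the word

class FakeCsr2:
    """Simulated CSR+2: hands out queued characters, then reads as 0."""
    def __init__(self, text: str):
        self._pending = deque(text.encode("ascii"))

    def read(self) -> int:
        if self._pending:
            return VALID | self._pending.popleft()
        return 0

def poll_text(reg, max_polls: int) -> str:
    """Poll the register and collect every word that has the valid flag set."""
    out = bytearray()
    for _ in range(max_polls):
        word = reg.read()
        if word & VALID:
            out.append(word & CHAR_MASK)
    return out.decode("ascii")

print(poll_text(FakeCsr2("S/N ="), 20))  # S/N =
```

The real loop has no poll limit, so once the 8086 stops supplying data the host reads 0 back indefinitely, which matches the hang seen from ODT.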

I wish I had a logic analyzer that would clip onto the pins of the 8086. That would help things immensely. I have an FPGA board that I could run a cable over to and capture the data, but prying the 8086 loose and plugging square pins into its sockets for a capture board is another task I don't relish.

If anyone has IDA Pro, I'll be glad to send my disassembly database so far. It's certainly an interesting example of an embedded computer, if nothing else.

fraveydank
October 4th, 2012, 08:53 PM
If anyone has any interest in this (a year and a half later), I found out what the problem was. If you somehow accidentally set the number of tape and disk drives to zero (which is not impossible to do), the firmware for the 8086 crashes in a routine that sets up the device IDs because it uses a pre-decremented counter to loop through the devices; when it pre-decrements 0, it becomes 255 and the thing runs amok through the memory. Bad news, especially because it turns out to be not very difficult to set both to zero.
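
The wraparound can be reproduced with a rough Python model of a loop that decrements an 8-bit counter first and tests for zero afterwards. This is a guess at the loop's shape from the description above, not a transcription of the firmware.

```python
def device_indices(count: int) -> list:
    """Indices visited by a 'decrement first, stop when zero' loop on an 8-bit counter."""
    counter = count & 0xFF
    visited = []
    while True:
        counter = (counter - 1) & 0xFF  # pre-decrement; 0 wraps to 255
        visited.append(counter)         # the loop body would touch this slot
        if counter == 0:
            break
    return visited

print(device_indices(3))          # [2, 1, 0]
print(len(device_indices(0)))     # 256
print(device_indices(0)[:3])      # [255, 254, 253]
```

With a count of zero the loop walks 256 slots instead of none, which on a board with 16k of RAM is more than enough to trample something important.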

There are two solutions. The first is to replace the little 9306 EEPROM on board, which will cause the board to identify it as erased/corrupt and initialize from defaults. Unfortunately, the damn thing is soldered in, so you'll have to desolder it (I put in a socket for future replacements). Also, you'll lose your serial number unless you copy it over; I don't quite recall what byte positions held that, but I can look it up if anyone's curious. If you take out the 9306 and have a ROM burner that supports it, you can also just erase it instead; my ROM burner theoretically supports it, but it turns out to have a bug in the 9306 algorithm and doesn't work. I repurposed an FPGA board to be my read-write machine.

The second solution involves fixing the ROM. As luck would have it, there is a useless five bytes of instructions in precisely the spot where it's needed (a register is loaded with what it's already loaded with, so the five bytes of instruction space are unneeded). I was able to put a quick check to determine that the number of devices was zero and if so, branch to loading the defaults. It's not perfect, but it keeps the 8086 from crashing, which essentially bricks the card, so I'll take it.
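
In higher-level terms, the guard amounts to something like the Python sketch below. The field names and default values are placeholders, since the thread does not say what defaults the firmware actually loads.

```python
from dataclasses import dataclass

@dataclass
class Config:
    disks: int
    tapes: int

# Placeholder values only; the firmware's real defaults are not given in the thread.
DEFAULTS = Config(disks=4, tapes=0)

def sanitize(cfg: Config) -> Config:
    """Fall back to defaults when the stored configuration has no devices at all."""
    if cfg.disks + cfg.tapes == 0:
        return DEFAULTS
    return cfg

print(sanitize(Config(disks=0, tapes=0)))  # falls back to DEFAULTS
print(sanitize(Config(disks=2, tapes=1)))  # returned unchanged
```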

I'll try to get the patched ROMs posted somewhere; the legality is a little fuzzy, I think, but CMD was long ago absorbed by Silicon Image, and I doubt they'd be all that picky.

patscc
October 4th, 2012, 09:43 PM
I'm always available for a PM of fuzzy.
I don't have any of the hardware involved, but I just had to post because the debugging job is phenomenal. I'm seriously impressed. That probably doesn't mean much, but it's sincere.
patscc

RSX11M+
October 7th, 2012, 10:08 AM
If anyone has any interest in this (a year and a half later), I found out what the problem was....
Sorry for not posting a response sooner, but this is the first time I've had a chance to re-read the thread enough to comment.

After a year+ I'm glad I was still around to see your solution. I think your code fix was definitely the way to go. Good Job! So many of these types of threads go unresolved, it's nice to see the loop closed on one for a change.

Personally, I'd also like to see your intermediate products along the way... disassembly of source code, marked up listings, what you learned about the memory map, etc...

I've done this kind of reverse engineering too, and it's always been gratifying to complete. Anything you could share about the experience would make entertaining reading for me. [maybe a couple other nutz here at the forums too]

The legality shouldn't be a problem. As long as you reverse engineered it without signing any agreements with the original manufacturer, your work product would be "clean" from a legal standpoint. Your options would be to publish [post] a complete revised ROM image, or instructions for modifying the original ROM contents to fix the problem. [Be sure to include the "before" and "after"]

Anyway, great going. It's useful knowledge for this community to be aware of in any case.