FDC error retry



Mike_Z
December 5th, 2016, 08:58 AM
I want to add a read/write retry to the FDC software on my 8080 CP/M 2.2 machine. I'm using an 8272 FDC chip. The read command will return these errors should they occur:
1. Sector not found, Status Register 1 Bit 2
2. CRC error in ID field, Status Register 1 Bit 5
3. CRC error in DATA field Status Register 2 Bit 5
I figure that these errors are 'soft' errors and are worthy of a retry. Is my thinking right here? The other errors that the 8272 finds seem to be related to equipment or other stuff. Comments? Thanks Mike

Chuck(G)
December 5th, 2016, 09:29 AM
It depends on how you're doing the retry. All of the errors that you describe can be caused by positioning failures. The usual way of handling errors on floppies is this:

1. Retry the read without moving the heads for x times (x=3 is good enough)
2. Recalibrate (seek/restore to track 0), seek to track and retry 1
3. Recalibrate, seek to target track +1, step back 1 track, and retry 1

The idea behind (3) is to attempt to compensate for any "slop" in the positioning mechanism. (2) serves a dual purpose in seeking directly to the desired track from track 0 and also knocking any loose detritus from the target track. (1) should be obvious.
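
The three stages above could be sketched roughly as follows. This is only an illustration of the escalation order, not driver code: `read_sector`, `recalibrate` and `seek` are hypothetical callables standing in for whatever the real FDC routines are, and `tries_per_stage` is the "x" from step 1.

```python
from typing import Callable


def read_with_retries(
    read_sector: Callable[[], bool],   # hypothetical: True when the read succeeded
    recalibrate: Callable[[], None],   # hypothetical: restore heads to track 0
    seek: Callable[[int], None],       # hypothetical: seek to an absolute track
    track: int,
    tries_per_stage: int = 3,
) -> bool:
    """Return True as soon as a read succeeds, False if every stage fails."""

    def attempt() -> bool:
        # Re-read up to tries_per_stage times without moving the heads.
        # any() stops at the first successful read.
        return any(read_sector() for _ in range(tries_per_stage))

    # (1) Retry in place.
    if attempt():
        return True

    # (2) Recalibrate, seek directly to the target track, retry.
    recalibrate()
    seek(track)
    if attempt():
        return True

    # (3) Recalibrate, overshoot by one track, step back, retry.
    recalibrate()
    seek(track + 1)
    seek(track)
    return attempt()
```

A real driver would also have to handle the last track on the disk, where stage 3 cannot overshoot by one.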

Mike_Z
December 5th, 2016, 11:39 AM
OK, I was thinking that the errors I suggested were due to reading problems: a read that misread the sector number, or a read that misread something and caused the CRC error. Since the retry methods you suggest change the head position, I take it that these errors are good ones to check for before attempting a retry. Which method should I use, or all of them? My thinking is that #1 does not move the head, so if it was indeed a positional problem, only ambient vibration would affect head position and the retry would probably be a waste of time. #2 seems to provide two benefits, but takes more time. And #3 is kind of a subset of #2, but quicker. I like #2; I can wait.

One other question regarding disk status change. I know we talked about this some time ago, but I am not satisfied with how my software is handling it now. When a drive status change occurs, I display a message that Drive X: is not ready (or ready) and do a warm reboot. Most times this occurs when I want to change disks to copy a library file from a different disk. On the warm boot, I can lose what I was working on. So instead of rebooting on Not Ready, wouldn't it be more useful to just wait until the drive is ready again and not boot at all? Then if it is a disaster, I can reboot manually and take the consequences. Thanks Mike

Chuck(G)
December 5th, 2016, 11:56 AM
Floppy-resident operating systems have been a problem ever since there have been floppies. Where you really get into trouble is changing floppies that are similar enough that a simple directory check doesn't show any change, or when a change isn't expected. If CP/M sees a change in floppy between warm boots, it marks the volume as "read only" and allows the program to crash (after alerting you to the change to R/O status). Often, this isn't satisfactory--ending partially-completed applications that write to floppy can result in very ugly outcomes.

At Durango, we solved this by noting the drive status change in the driver, posting a message to return the subject disk to the drive, and then waiting (stalling) until it actually happened. This reduced our "disk change" errors to nearly zero. Similarly, for a simple "Drive not ready", we posted a message and stalled until the drive was made ready.

We were using 5.25" drives and wanted to catch the floppy change as soon as it happened, not opting to wait until a disk operation was next done. So we polled the status of the write-protect notch and put that on a half-second periodic check. Note that, on 5.25" floppies, the notch is on the side of the disk, so if you remove a floppy, you'll always see a status change. It's impossible for a human to swap floppies in less than a second, so we caught every attempt with our "PUT THAT BACK" message, accompanied by an audible alarm.
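
The "post a message and stall" idea might look something like the sketch below. It is not the Durango code; `drive_ready` and `notify` are hypothetical hooks for the real status check and console output, and the half-second default simply mirrors the polling period mentioned above.

```python
import time
from typing import Callable


def stall_until_ready(
    drive_ready: Callable[[], bool],   # hypothetical: samples drive/notch status
    notify: Callable[[str], None],     # hypothetical: prints a console message
    drive: str = "A",
    poll_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until drive_ready() is true; return how many polls saw 'not ready'."""
    misses = 0
    while not drive_ready():
        if misses == 0:
            # Post the message once, then just keep waiting.
            notify(f"Drive {drive}: not ready - reinsert the disk")
        misses += 1
        sleep(poll_interval)
    return misses
```

Because the routine only returns once the drive is ready again, the caller never has to warm boot; a true disaster is still recoverable with a manual reset.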

If you really intend a cold reboot, the operator/user can always initiate one manually by pushing a button, no?

Chuck(G)
December 5th, 2016, 12:05 PM
OK, I was thinking that the errors I suggested were due to reading problems: a read that misread the sector number, or a read that misread something and caused the CRC error. Since the retry methods you suggest change the head position, I take it that these errors are good ones to check for before attempting a retry. Which method should I use, or all of them? My thinking is that #1 does not move the head, so if it was indeed a positional problem, only ambient vibration would affect head position and the retry would probably be a waste of time. #2 seems to provide two benefits, but takes more time. And #3 is kind of a subset of #2, but quicker. I like #2; I can wait.

You can't "misread" a sector number without also affecting the associated header CRC; if you get a bad IDAM CRC diagnosed, it's not because the sector number didn't match--there's something else wrong.

Mike_Z
December 5th, 2016, 12:48 PM
Are you saying that if a 'sector not found' error occurred there would be a CRC error also, maybe in the ID field? So should I only check for the CRC errors?

Hmm...I never thought about the directory being screwed up with a different disk in the drive. I have two drives and generally I have WordStar on one (which takes up most of the disk) and my working files in the other drive. But sometimes I need something else from a different disk. So I suppose the best thing to do in that case is to save my work, stop the machine, put in the different disk and copy what I need and then start over with the original disks. Or.... maybe I need a third larger disk. Thanks Mike

Chuck(G)
December 5th, 2016, 02:40 PM
A "sector not found" error will occur if there is no sector that matches the CHRN of the FDC command. One can still match the CHRN but get an IDAM CRC error; i.e. the sector is found, but I believe that most FDCs abort the operation at that point, as the IDAM information cannot be considered reliable.

Mike_Z
December 5th, 2016, 03:22 PM
So.... a 'sector not found' is more of a hard, non-recoverable error. Are the CRC ID or DATA errors what are referred to as soft errors, the kind that can be recovered with a retry? Mike

Chuck(G)
December 5th, 2016, 04:04 PM
Yes, CRC errors often yield to retries. However:

"Sector not found" isn't a precise error; that is, it doesn't describe a single condition.

You can get a "sector not found" if there is no match when the FDC is searching for a matching IDAM. In other words, a non-existent sector number will match no IDAM.

But--that's only half the picture. After the IDAM matches, the FDC next looks for the DAM. No DAM will also generate a "Sector not Found". Either one can yield to retries, so it's best to retry the errors.
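
As a rough sketch of "retry all three", a check on the result bytes could look like this. The bit positions are the ones listed in the first post (Status Register 1 bits 2 and 5, Status Register 2 bit 5); the constant and function names are made up for illustration.

```python
# Bit positions as given in the first post of this thread.
ST1_SECTOR_NOT_FOUND = 1 << 2   # Status Register 1, bit 2
ST1_CRC_ERROR = 1 << 5          # Status Register 1, bit 5
ST2_DATA_CRC_ERROR = 1 << 5     # Status Register 2, bit 5


def should_retry(st1: int, st2: int) -> bool:
    """True if any of the three 'worth a retry' error bits is set."""
    soft_st1 = st1 & (ST1_SECTOR_NOT_FOUND | ST1_CRC_ERROR)
    soft_st2 = st2 & ST2_DATA_CRC_ERROR
    return bool(soft_st1 or soft_st2)
```

Any other error bit falls through to `False` here, which matches the original idea of treating the remaining errors as equipment problems.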

archeocomp
December 5th, 2016, 09:47 PM
How are you handling motor on/off? I am using a PC8477B and it has built-in motor on/off timers. In what they call Mode 2, the timer can be set to over 10 s. But I do not understand how it works, because I still have to start/stop the motors via the corresponding bits 4-7 in the DOR register and work out the motor-off time myself.

Mike_Z
December 6th, 2016, 07:13 AM
My 8" drive motors come on when the power is turned on and off when it is shut down. No control at all. Mike

Chuck(G)
December 6th, 2016, 08:57 AM
How are you handling motor on/off? I am using a PC8477B and it has built-in motor on/off timers. In what they call Mode 2, the timer can be set to over 10 s. But I do not understand how it works, because I still have to start/stop the motors via the corresponding bits 4-7 in the DOR register and work out the motor-off time myself.

You have to understand the genesis of the NSC PC8477 and why it is what it is.

It's based on the NEC uPD765/Intel 8272 chip (same chip, different numbers--Intel and NEC enjoyed a cross-licensing relationship back then). The chip was originally a design to support MFM encoding on 8" drives (the Intel 8271 supports only FM) and most 8" drives of that time used line-operated always-running AC motors. The head was loaded or unloaded from the media using a simple solenoid. It was a very straightforward arrangement, as the drive ready/not ready status was always available and the time to load and settle the head was rarely more than 30 msec. So the 765 has drive polling built in, where the status of up to 4 drives is always known.

But there came 5.25" drives and the IBM PC. Since 5.25" drives generally (there are exceptions) did not have a head-load mechanism and used (initially) brushed DC motors, one couldn't simply leave them to run all of the time--the media and head wear issues were too great. So, spindle motor control was used in place of the head-load mechanism. There are two downsides--one is that you don't know the status of a drive if the motor is off, and the other is that the time between motor turn-on and coming to a stable speed is long--usually on the order of seconds.

So IBM in their design of the 5150, used the 765 and bypassed the more interesting features. The chip-select outputs are NC and motor control and drive select are taken over by simple external latch. The LSI chips that followed the 765, such as the WD37C65 and the PC8477 and the 82077 all keep register compatibility with the original 5150 and 5170.

archeocomp
December 6th, 2016, 10:13 AM
My 8" drive motors come on when the power is turned on and off when it is shut down. No control at all. Mike
Same situation here, except that my HEAD LOAD solenoid is driven by the Motor On signal. I am also using 5.25" and 3.5" drives; right now it is one 5.25" and one 8".


You have to understand the genesis of the NSC PC8477 and why it is what it is.

It's based on the NEC uPD765/Intel 8272 chip (same chip, different numbers--Intel and NEC enjoyed a cross-licensing relationship back then). The chip was originally a design to support MFM encoding on 8" drives (the Intel 8271 supports only FM) and most 8" drives of that time used line-operated always-running AC motors. The head was loaded or unloaded from the media using a simple solenoid. It was a very straightforward arrangement, as the drive ready/not ready status was always available and the time to load and settle the head was rarely more than 30 msec. So the 765 has drive polling built in, where the status of up to 4 drives is always known.

Thanks, that makes "polling" from manual more understandable.



But there came 5.25" drives and the IBM PC. Since 5.25" drives generally (there are exceptions) did not have a head-load mechanism and used (initially) brushed DC motors, one couldn't simply leave them to run all of the time--the media and head wear issues were too great. So, spindle motor control was used in place of the head-load mechanism. There are two downsides--one is that you don't know the status of a drive if the motor is off, and the other is that the time between motor turn-on and coming to a stable speed is long--usually on the order of seconds.

So IBM in their design of the 5150, used the 765 and bypassed the more interesting features. The chip-select outputs are NC and motor control and drive select are taken over by simple external latch. The LSI chips that followed the 765, such as the WD37C65 and the PC8477 and the 82077 all keep register compatibility with the original 5150 and 5170.


The Specify command sets the initial values for three internal timers. The function of these Specify parameters is described below. The parameters of this command are undefined after power up, and are unaffected by any reset. Thus, software should always issue a Specify command as part of an initialization routine. This command does not generate an interrupt.

The Motor Off and Motor On timers are artifacts of the µPD765. These timers determine the delay from selecting a drive motor until a read or write operation is started, and the delay of deselecting the drive motor after the command is completed. Since the PC8477B enables the drive and motor select line directly through the DOR, these timers only provide some delay from the initiation of a command until it is actually started.
I tried setting the motor-off timer to 10 s, but nothing happened; the motor stayed on. Now I understand. Thanks Chuck.
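
Since the motor has to be switched through the DOR by software, the bookkeeping might be sketched like this. Only the motor bits 4-7 come from this thread; the layout of bits 0-3 (drive select, reset, DMA gate) is assumed to follow the usual PC-compatible DOR, and the function names and 10 s default are purely illustrative.

```python
from typing import Iterable


def dor_value(drive: int, motors_on: Iterable[int], dma_enable: bool = True) -> int:
    """Build a DOR byte: motor enables in bits 4-7, drive select in bits 0-1."""
    if not 0 <= drive <= 3:
        raise ValueError("drive must be 0..3")
    value = drive                 # bits 0-1: drive select (assumed PC layout)
    value |= 0x04                 # bit 2: 1 = controller not in reset (assumed)
    if dma_enable:
        value |= 0x08             # bit 3: DMA/interrupt gate (assumed)
    for d in motors_on:
        if not 0 <= d <= 3:
            raise ValueError("motor number must be 0..3")
        value |= 1 << (4 + d)     # bits 4-7: motor enable for drives 0-3
    return value


def motor_off_due(last_access: float, now: float, timeout_s: float = 10.0) -> bool:
    """Software motor-off timer: True once the drive has been idle long enough."""
    return now - last_access >= timeout_s
```

For example, `dor_value(0, [0])` selects drive 0 with its motor running, and the driver would write `dor_value(0, [])` once `motor_off_due(...)` reports that the idle time has expired.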

Chuck(G)
December 6th, 2016, 10:23 AM
One fumble in my above post. " The chip-select outputs..." should read "The drive select outputs..."

Mike_Z
December 7th, 2016, 10:40 AM
The Intel 8272 also has Drive Status Polling for 4 drives. Mike

Chuck(G)
December 7th, 2016, 11:26 AM
The Intel 8272 also has Drive Status Polling for 4 drives. Mike

The 8272 and the NEC uPD765 are essentially (but for the label) the same chip. They even track revisions; i.e. i8272A=uPD765A.

Same for the Zilog and other OEM versions. It's all NEC.