View Full Version : How does MS-DOS distinguish a file from device using handles?



cr1901
February 24th, 2014, 02:19 AM
I learned tonight that MS-DOS automatically allocates 5 file handles for various devices when a program is loaded. I had assumed that MS-DOS would simply "fake" accessing the device as if it were a file and call the device driver code directly as soon as an int 21h call that operated on that handle occurred. According to this page (http://vitaly_filatov.tripod.com/ng/asm/asm_014.html), it is possible to change the behavior of multiple file handles (from ASCII to BINARY mode) pointing to the same device using an IOCTL call, so that reinforces my belief that they are tied directly to the device driver with no glue code/data. However, this doesn't explain redirection: there is no driver explicitly for STDIN/STDOUT services whose sole purpose is to redirect its input/output to another driver or file, so under my current assumptions there is no way to redirect I/O without glue logic just after the int 21h call. But somehow MS-DOS can figure out where to send its I/O anyway.

So there must be some glue code or data structures which can properly handle the above cases before MSDOS writes to a device or file or STDIN/STDOUT (I'm guessing that's treated specially). Assuming that it didn't change dramatically between versions, was there a standardized way that MS-DOS distinguished files from devices when it was just passed the file handle?

RBIL, int 21h subfunction 45h mentions this:

moving file pointer for either handle will also move it for the other,
because both will refer to the same system file table

Chuck(G)
February 24th, 2014, 09:56 AM
I'm not exactly certain what your question is. The first few SFT entries are initialized to point to certain devices. There's no crime in having several SFT entries (handles are just ordinals into the SFT) point to the same device (see, for example, the DUP and DUP2 system calls).

If you want to dig into this more, realize that the MSDOS 6.0 source has been available for several years now. Reading through it, you'll get to understand what a chewing-gum and baling wire affair later DOS really was.

cr1901
February 24th, 2014, 12:17 PM
I'm not exactly certain what your question is. The first few SFT entries are initialized to point to certain devices. There's no crime in having several SFT entries (handles are just ordinals into the SFT) point to the same device (see, for example, the DUP and DUP2 system calls).

If you want to dig into this more, realize that the MSDOS 6.0 source has been available for several years now. Reading through it, you'll get to understand what a chewing-gum and baling wire affair later DOS really was.

Okay, ignoring my 6:00 AM ramble, my question was: Is the file handle enough for MS-DOS to determine "This is a device driver, don't bother looking at the relevant file data structures which hold file information/buffers, and use the driver's routine instead"? Or will MS-DOS still look at the data structures held for files (which I now know is called the SFT) before recognizing that the handle is either a device driver or a file? The bold in your quote was the piece of information I was looking for- and indicates to me that MS-DOS will look at this so-called SFT before it knows whether a driver or file is to be operated upon using file I/O.

Now, I'm wondering what the data structure for an SFT entry looks like, whether it stayed consistent from DOS 2.0 to 6.0, and whether it's documented on the Internet or in other literature without looking at the MS-DOS source code (to me, that's "cheating", and if it's not backwards compatible with other versions, there's no real point to looking at the source- I can't remember 5 different versions of the same table, let alone 1 version of a table :P). Apparently, there is also something called the Job File Table (JFT) in the Program Segment Prefix (PSP), which points to the SFT, and at least one Usenet post (https://groups.google.com/forum/#!topic/alt.msdos.programmer/kH5ucZu8wlg) indicates that the SFT was a well-known DOS concept back in the late 80's/90's (presumably because the default JFT is in the PSP- which was documented).
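To make the JFT-to-SFT indirection concrete, here is a minimal Python sketch of the idea, not of the real DOS layout. The class and field names (`SftEntry`, `Process`, `ref_count`) are made up for illustration, and the SFT indices used for AUX, CON and PRN are only an assumption about a typical default table. The point it shows is the one from the RBIL note on function 45h: a duplicated handle is just a second JFT slot naming the same SFT entry, so both handles share one file pointer.

```python
from dataclasses import dataclass

UNUSED = 0xFF  # JFT byte that marks a closed handle


@dataclass
class SftEntry:
    name: str             # device or file name
    is_char_device: bool  # True for CON/AUX/PRN, False for a disk file
    position: int = 0     # file pointer (not meaningful for devices)
    ref_count: int = 0    # number of JFT slots that refer to this entry


class Process:
    """Hypothetical model of one program's handle table (the JFT)."""

    def __init__(self, sft, jft_size=20):
        self.sft = sft                     # shared, system-wide table
        self.jft = [UNUSED] * jft_size     # per-process table of SFT indices

    def attach(self, handle, sft_index):
        self.jft[handle] = sft_index
        self.sft[sft_index].ref_count += 1

    def entry(self, handle):
        # Every handle operation goes through the JFT first, then the SFT.
        index = self.jft[handle]
        if index == UNUSED:
            raise ValueError("invalid handle")
        return self.sft[index]

    def dup(self, handle):
        self.entry(handle)                    # reject closed handles
        new_handle = self.jft.index(UNUSED)   # first free JFT slot
        self.attach(new_handle, self.jft[handle])
        return new_handle

    def seek(self, handle, offset):
        self.entry(handle).position = offset


sft = [
    SftEntry("AUX", True),
    SftEntry("CON", True),
    SftEntry("PRN", True),
    SftEntry("DATA.TXT", False),
]
proc = Process(sft)
# Handles 0-4: stdin, stdout, stderr, stdaux, stdprn (indices are assumed).
for handle, index in enumerate([1, 1, 1, 0, 2]):
    proc.attach(handle, index)

proc.attach(5, 3)       # pretend DATA.TXT was opened as handle 5
copy = proc.dup(5)      # second handle, same SFT entry
proc.seek(5, 100)       # moves the pointer seen through both handles
```

In this model the device-or-file decision lives in the SFT entry, not in the handle number, which is why three different handles can all end up at CON.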

If the SFT is a version-dependent data structure, then it's no use trying to figure it out. However, I'm going to guess there's backwards compatibility with SFTs in previous versions of DOS, since that was paramount back in the DOS days, and experienced programmers liked to use parts of the BIOS/DOS they weren't supposed to for speed/space reasons. Like most of DOS, I assume the SFT is officially undocumented, but I feel like there should be some literature on the structure of the SFT, since most of DOS got documented by other people outside Microsoft, and that Usenet post makes me think the existence of the SFT was well-known to DOS programmers back then.

Why go through all this trouble? Just curiosity, is all.

Chuck(G)
February 24th, 2014, 12:29 PM
Since handles were introduced into MS-DOS in 2.0, I think it's a lead-pipe cinch that there have been numerous revisions to the structure.

Certain SFT ordinals (i.e. "handles") have hard-coded handling in DOS (e.g. SFT 0 and 1). But you can find all of this out yourself, right?

I thought it interesting that even though file access through FCBs is the older method, FCB access in later DOS is supported by translating to the handle interface.

gslick
February 24th, 2014, 02:01 PM
If you're curious enough to spend $6 and some change reading about the internal details Andrew Schulman's Undocumented DOS is pretty much the classic reference.

http://www.amazon.com/Undocumented-DOS-Programmers-Structures-Programming/dp/020163287X

Mike Chambers
February 24th, 2014, 02:36 PM
If you're curious enough to spend $6 and some change reading about the internal details Andrew Schulman's Undocumented DOS is pretty much the classic reference.

http://www.amazon.com/Undocumented-DOS-Programmers-Structures-Programming/dp/020163287X

I second this. Great book.

Chuck(G)
February 24th, 2014, 02:49 PM
I've got Schulman's book, but going through the code is far more instructive. You learn about all of the special cases and hacks that Schulman doesn't go into.

You might also go through the source for FreeDOS or DRDOS if you don't like hacks.

cr1901
February 24th, 2014, 03:07 PM
Just found an article that has the System File Table for up to DOS 3.x... http://www.unz.org/Pub/ProgrammersJournal-1989mar-00032
EDIT: DOS 2.x and 3.x SFTs differ.

The short answer to my question is that every file in MSDOS is associated with a device driver, and that the redirect-able file handles have their device drivers chosen by COMMAND.COM when the program launches.

Each entry in the SFT has a pointer to either the character device header associated with said handle or the Device Control Block for block devices. This still doesn't tell me how MS-DOS can distinguish character or block devices* however- just that regardless of whether the target file is a device or file, a handle will point to a consistent data structure that contains some information about how to access the "target" (i.e. a file or device) using a device driver.
EDIT: At least one byte of the SFT entries in DOS 2.x and DOS 3.x has the low byte of the device driver information word- bit 7 is used to distinguish character device and file/block device. Found on: http://vxheavens.com/lib/vda14.html
Interestingly, there are also flags for CLOCK, STDOUT, and STDIN, even though COMMAND.COM sets these... I know DOS uses the NUL flag to prevent replacing the NUL device, but maybe device drivers can use the information word to change how they process data when they are STDIN/STDOUT devices?
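As a rough illustration of those flags, here is a small Python decoder for the device information word that int 21h, AX=4400h returns in DX. The bit positions follow the RBIL listing for that call; the function name and the dictionary keys are my own, and the sample values are illustrative inputs rather than readings from a real system.

```python
def decode_device_info(info):
    """Decode a device information word (as returned by INT 21h, AX=4400h)."""
    if info & 0x80:                      # bit 7 set: character device
        return {
            "kind": "character device",
            "stdin": bool(info & 0x01),        # bit 0: console input device
            "stdout": bool(info & 0x02),       # bit 1: console output device
            "nul": bool(info & 0x04),          # bit 2: NUL device
            "clock": bool(info & 0x08),        # bit 3: CLOCK device
            "binary_mode": bool(info & 0x20),  # bit 5: raw (binary) mode
        }
    return {                             # bit 7 clear: disk file
        "kind": "file",
        "drive": info & 0x3F,            # bits 0-5: drive number, 0 = A:
        "written": not (info & 0x40),    # bit 6 set: not yet written to
    }


console_like = decode_device_info(0x80D3)  # sample word with STDIN+STDOUT set
file_on_c = decode_device_info(0x0002)     # sample word for a file on drive 2
```

The single test on bit 7 is the whole file-versus-device decision; everything else in the low byte changes meaning depending on which way that test goes.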

Some entries in the SFT don't even make sense for character devices- i.e. a file handle for the COM port won't have a meaningful beginning cluster!... I don't think.

*Whether block devices are synonymous with file access, or that access to a real file and access to a block device are two different things, I'm not sure.
EDIT: Appears to be the former, in light of my previous edit.

cr1901
February 24th, 2014, 05:02 PM
Alright, I think my curiosity has been more-or-less satisfied, except for two questions...


Why does DOS 3.0+ differentiate between a Device Attribute Word (stored in the device driver header- bit 15 distinguishes character and block devices), and a Device Information Word (used in int 21h, AX=4400h, IOCTL get device information- bit 7 distinguishes character and block devices)? In DOS 2.0, the equivalent field in the SFT is only one byte long, and appears to be the high byte of the Device Attribute Word, but I can't tell for sure.

RBIL (int 21h, AH=52h) is vague about this, especially since the high byte of the Device Attribute Word == the high byte of the Device Information Word for character devices (and "reserved" for block devices- which means the corresponding DOS 2.0 field is useless if only the high byte of the Information Word is returned). I'll probably write a small program to check tonight. Incidentally, the DOS 2.0 SFT uses a separate field to check for a block or character device, rather than checking the device information word.


And secondly, how does DOS 3.0 find the current directory for a given file? For DOS 2.0, the directory name is stored in ASCII in the Drive Parameter Block (int 21h, subfunction 32h), but for 3.0, the data structure removes this field, and I'm not sure how to find the directory from the remaining data. Any ideas?

Chuck(G)
February 24th, 2014, 06:27 PM
DOS 2.0 (and I have the original README for the Microsoft release to OEMs--it's very faded) has always differentiated between block and character devices by the high bit. It's always been that way. However, not all bits were used in the device header, so it makes sense that DOS would conserve storage by compressing the relevant information to one byte. The IOCTL function has expanded greatly since DOS 2.0 and now simply passes the call data to the driver itself for resolution for many calls.
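A short Python sketch of that one-byte compression idea follows. It is a guess at the shape of the operation, not code taken from DOS: it assumes the low five flag bits of the header's attribute word (STDIN, STDOUT, NUL, CLOCK, special) carry over unchanged and that bit 7 of the packed byte stands in for bit 15 of the header word. The function and constant names are hypothetical.

```python
ATTR_CHAR_DEVICE = 0x8000  # bit 15 of the attribute word in the driver header
INFO_CHAR_DEVICE = 0x0080  # bit 7 of the device information word
LOW_FLAG_MASK = 0x001F     # assumed: bits 0-4 mean the same in both words


def is_char_device_header(attr):
    """Character-versus-block test on the header attribute word."""
    return bool(attr & ATTR_CHAR_DEVICE)


def pack_info_low_byte(attr):
    """Hypothetical packing of a header attribute word into one byte."""
    if not is_char_device_header(attr):
        # For block devices the low byte is used for a drive number instead.
        raise ValueError("not a character device")
    return (attr & LOW_FLAG_MASK) | INFO_CHAR_DEVICE


# Illustrative header word: character device with STDIN, STDOUT, special set.
packed = pack_info_low_byte(0x8013)
```

Under these assumptions the high bit of either representation answers the same question, which is consistent with the bit 15 and bit 7 tests describing the same property.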

As to the absolute canonical path information, consult the MFT.

To borrow from MFT.INC:


;
;   A pictorial diagram for the linkages is as follows:
;
;            +---sptr------+
;            V             |
;          +---+<----------|---sptr------+------------+
;          |SFT+----+      |             |            |
;          +-+-+    |    +-+-+        +--+-+       +--+-+
;            V      +--->|MFT+-lptr->-|LOCK+-next->|LOCK+->0
;          +---+    |    +---+        +----+       +----+
;          |SFT+----+      ^
;          +-+-+           |
;            |             |
;            +-------------+
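One way to read the diagram is as a set of linked records, sketched below in Python. Only the link names `sptr`, `lptr` and `next` come from the diagram; the class names, the `label` and `name` fields, the helper functions, and my reading of which arrow belongs to which record are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Lock:
    sptr: "Sft"                    # the SFT entry that owns this lock
    next: Optional["Lock"] = None  # next lock on the same file, None at end


@dataclass(eq=False)
class Mft:
    name: str                      # file this record describes
    sptr: Optional["Sft"] = None   # head of the chain of SFT entries
    lptr: Optional[Lock] = None    # head of the chain of locks


@dataclass(eq=False)
class Sft:
    label: str
    mft: Mft                       # every SFT entry points at its MFT record
    next: Optional["Sft"] = None   # next SFT entry for the same file


def open_file(mft, label):
    """Link a new SFT entry at the head of the MFT's SFT chain."""
    entry = Sft(label, mft, next=mft.sptr)
    mft.sptr = entry
    return entry


def add_lock(mft, owner):
    """Link a new lock at the head of the MFT's lock chain."""
    mft.lptr = Lock(owner, next=mft.lptr)


def sft_labels(mft):
    node, labels = mft.sptr, []
    while node is not None:
        labels.append(node.label)
        node = node.next
    return labels


def lock_owners(mft):
    node, owners = mft.lptr, []
    while node is not None:
        owners.append(node.sptr.label)
        node = node.next
    return owners


mft = Mft("DATA.TXT")
a = open_file(mft, "A")
b = open_file(mft, "B")
add_lock(mft, a)
add_lock(mft, b)
```

Walking from any SFT entry to its MFT record and then along `lptr` reaches every lock on that file, which is the property the diagram seems to be drawing.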

cr1901
March 1st, 2014, 02:15 PM
I'm sorry, what does "MFT" stand for... Master File Table? I can't seem to find that acronym in my limited documentation on the FAT data structures.

yuhong
March 8th, 2014, 04:56 PM
Since handles were introduced into MS-DOS in 2.0, I think it's a lead-pipe cinch that there have been numerous revisions to the structure.

Certain SFT ordinals (i.e. "handles") have hard-coded handling in DOS (e.g. SFT 0 and 1). But you can find all of this out yourself, right?

I thought it interesting that even though file access through FCBs is the older method, FCB access in later DOS is supported by translating to the handle interface.
It was more complex than that. If you look at FCBIO2.ASM in the leaked sources and http://www.ctyme.com/intr/rb-2574.htm from the RBIL, you will see why DOS 4.0 tried to require SHARE.EXE to be loaded on >32M volumes to prevent data corruption and how it was properly fixed in DOS 5.0, and also why FCB access was disabled for FAT32 volumes in DOS 7.1.