
How are files undeleted in a FAT filesystem?

alank2

Veteran Member
Joined
Aug 3, 2016
Messages
2,264
Location
USA
I get that the directory entry is just marked deleted and presumably the entry points to the first cluster in the file, but wouldn't the FAT chain in the FAT have had to be set to "free clusters"?
 
Yup. Without undelete tracking, you basically do it by hand, collecting clusters starting with the one pointed to. Can get to be pretty nasty.
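
For anyone following along, this is roughly what's left behind: the 32-byte directory entry keeps its starting cluster and file size, and only the first byte of the name is overwritten with 0xE5. A minimal C sketch of the standard FAT12/16 layout (struct and names are mine):

#include <stdint.h>

/* Classic 32-byte FAT12/16 directory entry, as laid out on disk. */
struct fat_dirent {
    uint8_t  name[11];       /* 8.3 name; name[0] == 0xE5 marks the entry deleted */
    uint8_t  attr;           /* attribute bits */
    uint8_t  reserved[14];   /* times, dates, etc. (not needed here) */
    uint16_t first_cluster;  /* still points at the file's first cluster */
    uint32_t file_size;      /* size in bytes, also left intact */
};

#define DELETED_MARK 0xE5

static int is_deleted_entry(const struct fat_dirent *e)
{
    return e->name[0] == DELETED_MARK;
}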
 
I always thought there was a cluster chain for it to still follow somehow, but I guess this isn't the case.

Basically did it check the starting cluster and the size of the file against the FAT and make an assumption that the file was contiguous? So if you delete a fragmented file you could really forget about getting it back?
 
That's about it. That's also why writing to the medium after a delete really screws things up.
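
In other words, the old undelete amounted to something like this (a rough sketch; fat_get/fat_set are hypothetical helpers over a FAT already read into memory, and 0xFFF is the FAT12 end-of-chain value):

#include <stdint.h>

#define FAT_FREE 0x000
#define FAT_EOC  0xFFF   /* FAT12 end-of-chain marker */

/* Hypothetical accessors for a FAT table held in memory. */
uint16_t fat_get(uint16_t cluster);
void     fat_set(uint16_t cluster, uint16_t value);

/*
 * Assume the file occupied consecutive clusters starting at first_cluster,
 * check that they are all still free, then re-chain them.
 * Returns 0 on success, -1 if any needed cluster has been reused.
 */
int undelete_contiguous(uint16_t first_cluster, uint32_t file_size,
                        uint32_t bytes_per_cluster)
{
    uint32_t nclusters = (file_size + bytes_per_cluster - 1) / bytes_per_cluster;

    if (nclusters == 0)
        return 0;
    for (uint32_t i = 0; i < nclusters; i++)      /* every cluster must still be free */
        if (fat_get(first_cluster + i) != FAT_FREE)
            return -1;
    for (uint32_t i = 0; i + 1 < nclusters; i++)  /* link them back together */
        fat_set(first_cluster + i, first_cluster + i + 1);
    fat_set(first_cluster + nclusters - 1, FAT_EOC);
    return 0;   /* caller still restores the first byte of the directory name */
}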

I've done file recovery on messed-up floppies (those that have almost nothing but fragmented files). My approach was to copy the image to a file and work with the image. That way, you still have the original intact. The recovered files were always something where you could connect the pieces--that is, text files. You know; at the end of a cluster you see "The quick brown fox ju" and you look for a cluster starting with "mps over the lazy dog". It could be done with binary executables, but that's a headache I can do without.
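
That matching step can be semi-automated. Something like this, working on the copied image (purely a sketch, names made up): scan every cluster for ones that start with the continuation you're hunting for and print them as candidates.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan an in-memory floppy image for clusters whose first bytes match a
 * known continuation fragment, e.g. "mps over the lazy dog", and report
 * them as candidates for the next piece of a broken text file.
 */
void find_continuations(const uint8_t *image, size_t image_size,
                        size_t data_offset, size_t cluster_size,
                        const char *fragment)
{
    size_t flen = strlen(fragment);
    size_t cluster = 2;   /* the data area traditionally starts at cluster 2 */

    for (size_t off = data_offset; off + cluster_size <= image_size;
         off += cluster_size, cluster++) {
        if (flen <= cluster_size && memcmp(image + off, fragment, flen) == 0)
            printf("candidate: cluster %zu (image offset %zu)\n", cluster, off);
    }
}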
 
I'm just blown away by the idea that there was no cluster chain to look at. My "luck" with recovering files was always pretty good considering how much of a gamble it is.
 
As I remember it (and that's fairly unreliable) the "current FAT" has the deleted clusters marked as unused, but the "other/backup FAT" may still have the cluster chains until the next FAT switch, so reliable undelete is only possible until the next FAT switch.
 
That is interesting. Does anyone know about the technique that was used to determine which FAT is the active one, how it went back and forth when writing, etc.?
 
I don't think that's how it worked. The second FAT was intended as a backup, since it holds the file structure. You can lose the root directory and still recover the contents if the FAT is intact, but not the other way around.

DOS 5 and later did undelete tracking using a special file, PCTRACKR.DEL, which can hold a certain (configurable) number of entries. DR-DOS took a similar approach, but also included a FAT snapshot utility called DISKMAP.
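
Since the two FAT copies sit back to back right after the reserved sectors, finding the backup copy is just arithmetic on a few BPB fields. A small sketch (struct and names are mine):

#include <stdint.h>

/* The BPB fields needed to locate the FAT copies on a FAT12/16 volume. */
struct bpb_info {
    uint16_t bytes_per_sector;
    uint16_t reserved_sectors;   /* boot sector plus any reserved area */
    uint8_t  num_fats;           /* normally 2 */
    uint16_t sectors_per_fat;
};

/* Byte offset of FAT copy n (0 = primary, 1 = backup) from the start of the volume. */
static uint32_t fat_copy_offset(const struct bpb_info *b, unsigned n)
{
    return (uint32_t)(b->reserved_sectors + n * b->sectors_per_fat)
           * b->bytes_per_sector;
}

A recovery tool can read both copies, compare them, and fall back to the second one if the first looks damaged.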
 
PC-DOS 6.1 and later had versions of the Central Point PC Tools utilities included, which incorporated another facility. I'm not sure of the details.
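
Just to illustrate the delete-tracking idea in general (this is not the actual PCTRACKR.DEL layout, only a made-up record): the tracker saves, at delete time, exactly what a later undelete would otherwise have to guess.

#include <stdint.h>

/*
 * Purely illustrative record a deletion tracker could append when a file
 * is deleted, so that undelete no longer has to assume contiguity.
 */
struct delete_record {
    char     name[13];        /* original 8.3 name, NUL-terminated */
    uint32_t file_size;
    uint16_t chain_length;
    uint16_t clusters[64];    /* the cluster chain captured before the FAT is cleared */
};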
 
Yup, that was a big drawback of using FAT vs. CP/M's "flat" scheme.

At first I was not a fan of CP/M's scheme, but the more I deal with FAT (I've been writing 8-bit FAT code for a simple AVR project), the more I can see how simple and clean it is.
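
For what it's worth, the fiddliest part on the AVR has been the packed 12-bit FAT entries. Roughly how I read them and walk a chain (assuming the FAT is sitting in a RAM buffer):

#include <stdint.h>

/* Read the 12-bit FAT entry for 'cluster' from a FAT12 table in memory. */
static uint16_t fat12_get(const uint8_t *fat, uint16_t cluster)
{
    uint16_t i = cluster + (cluster / 2);               /* byte offset = cluster * 1.5 */
    uint16_t v = fat[i] | ((uint16_t)fat[i + 1] << 8);
    return (cluster & 1) ? (v >> 4) : (v & 0x0FFF);
}

/* Walk a chain from first_cluster and return the number of clusters in it. */
static uint16_t fat12_chain_length(const uint8_t *fat, uint16_t first_cluster)
{
    uint16_t n = 0, c = first_cluster;

    while (c >= 0x002 && c < 0xFF8 && c != 0xFF7) {     /* stop on free, bad, or end-of-chain */
        n++;
        c = fat12_get(fat, c);
    }
    return n;
}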
 
You have to recall that CP/M was originally intended as a floppy-based system, so the implementation makes sense. With the need for large hard disks and subdirectories, that gets very cumbersome.
 
Although for speed, running a defrag is useful on a FAT setup. If there is any problem with the FAT, though, defrag can make a real mess.
PCOS (for the M20) had a separate command to free up small pieces of unused space, as files needed to be contiguous. When creating a new file, you had to tell it what size you wanted. That was mostly to ensure you had enough space before actually using it; if you used less, that was OK.
It was a little awkward for people who had no idea what a file was.
Dwight
 
That was very common on several OS implementations. Many had schemes where limited amounts of additional space could be added. There was no defragmenting, which made recovery of deleted files pretty simple, but reclaiming deleted space often took the form of a copy.

Another floppy-based system, used on the Zilog MCZ development system, extended floppy sectors by 4 bytes (to 132 bytes) to hold a forward and a backward link. The OS maintained only a directory, with each entry pointing to the first sector of a file, and a used-block bitmap. The benefit was that there was no need to go back to a table for the next block when reading a file sequentially. The disadvantage was that seeking to a random position in a file meant reading through the intervening sectors to find the correct one.
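
Something like this, in rough C (names and details made up, just the 128+4 layout): sequential reads only ever chase the forward link, but a random seek has to read every intervening sector just for its link.

#include <stdint.h>

/* 132-byte sector: 128 bytes of data plus forward and backward sector links. */
struct linked_sector {
    uint8_t  data[128];
    uint16_t next;   /* forward link; 0 here means end of file */
    uint16_t prev;   /* backward link */
};

/* Hypothetical device read: fills *s with the sector at 'addr'. */
int read_sector(uint16_t addr, struct linked_sector *s);

/* Reach the nth sector of a file by walking links from its first sector. */
int seek_nth(uint16_t first_sector, uint32_t n, struct linked_sector *out)
{
    uint16_t cur = first_sector;

    while (n--) {
        if (read_sector(cur, out) != 0)
            return -1;           /* I/O error */
        if (out->next == 0)
            return -1;           /* ran off the end of the file */
        cur = out->next;
    }
    return read_sector(cur, out);
}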

Unix uses a sort of tree structure in its allocation, which appears to work well. Defragmenting is not much of an issue, as pretty much all files are fragmented. Intel ISIS uses a similar scheme.

I've worked with filesystems where allocation information is stored on the same cylinder that's used for data. The idea was that you only had to keep track of the number of used sectors for each cylinder, and that allocating more storage on the same cylinder didn't involve moving the heads.
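
Roughly the bookkeeping that buys you (a generic sketch, not any particular filesystem): one used-sector count per cylinder, and the allocator stays on the current cylinder until it fills up.

#define NUM_CYLINDERS   77
#define SECTORS_PER_CYL 52

/* One counter per cylinder is the only global allocation state kept. */
static unsigned used_sectors[NUM_CYLINDERS];

/*
 * Pick a cylinder for the next allocation: stay on the current one if it
 * still has room (no head movement), otherwise take the nearest one that does.
 */
int pick_cylinder(int current)
{
    if (used_sectors[current] < SECTORS_PER_CYL)
        return current;
    for (int d = 1; d < NUM_CYLINDERS; d++) {
        int lo = current - d, hi = current + d;
        if (lo >= 0 && used_sectors[lo] < SECTORS_PER_CYL)
            return lo;
        if (hi < NUM_CYLINDERS && used_sectors[hi] < SECTORS_PER_CYL)
            return hi;
    }
    return -1;   /* volume full */
}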

Then there's the question of in what order to allocate storage to a file. Do you start at the beginning of the disk and reuse any freed sectors, or do you start in the "never been allocated" clusters toward the end of the disk? Clearly, if the latter scheme is selected, the odds of a successful undelete are increased. It also has some benefit for SSDs, as it tends to level wear. I've worked on one (large) filesystem where the next cluster to be allocated was determined by a rather involved scheme involving Fibonacci numbers.
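
The two policies side by side, as a generic sketch (not any particular DOS): first-fit from the front of the data area reuses freed clusters right away, while handing out never-used clusters first leaves deleted data recoverable for longer.

#define NUM_CLUSTERS 2880
#define CL_FREE   0
#define CL_IN_USE 1

static unsigned char cluster_state[NUM_CLUSTERS];  /* simplified FAT: free vs. in use */
static int next_fresh = 100;  /* hypothetical: first cluster never yet allocated */

/* Policy A: first free cluster from the front -- freed space is reused quickly. */
int alloc_first_fit(void)
{
    for (int c = 2; c < NUM_CLUSTERS; c++)          /* data area starts at cluster 2 */
        if (cluster_state[c] == CL_FREE) {
            cluster_state[c] = CL_IN_USE;
            return c;
        }
    return -1;
}

/* Policy B: prefer never-allocated clusters; fall back to reuse when they run out. */
int alloc_fresh_first(void)
{
    if (next_fresh < NUM_CLUSTERS) {
        cluster_state[next_fresh] = CL_IN_USE;
        return next_fresh++;
    }
    return alloc_first_fit();
}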

What with SSDs coming into wide use, all of the schemes that try to minimize head movement are moving into the area of obsolescence.
 