File Allocation Table (FAT) System. (VFAT, NTFS, HPFS)

FAT : quick definition

The File System used by DOS and 16 bit versions of Wins to manage files stored on HDD, FDD and other disk media; in effect, a road map to the files stored. The on-disk data structure is known as the FAT, which records where the individual portions of each file are located on the disk and so keeps track of the allocation of space to files. This is the underlying method of organising the disk itself (achieved during the format process) and of subsequently writing files to it. FAT also refers to a Table of references to sectors (512 byte units) or, more likely, collections of sectors called allocation units (or clusters). The Table also identifies each cluster as either free, belonging to a file, or bad. Single sided FDDs also make use of the FAT, but each allocation unit is 512 bytes, the equivalent of one sector (on double sided disks one cluster = two sectors or 1024 bytes; for HDDs see the table below). A file can be spread across a multitude of clusters scattered over the platters of the disk. Called fragmentation, this is a by-product of the way that the FAT works.
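The 'road map' idea can be illustrated with a small Python sketch. It is a toy model only: the marker values FREE, BAD and END_OF_CHAIN, the helper names and the 10-entry table are assumptions made for illustration, and real FAT volumes encode these states as reserved bit patterns. Each Table entry simply names the next cluster of the same file, so a file is read by following the chain.

```python
# Hypothetical marker values for this sketch; real FAT variants use
# reserved bit patterns rather than these Python constants.
FREE = 0
BAD = -2
END_OF_CHAIN = -1

def cluster_chain(fat, first_cluster):
    """Return the clusters that make up one file, in reading order."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        # A loop, or a link into a free/bad cluster, means a damaged Table.
        if cluster in seen or fat[cluster] in (FREE, BAD):
            raise ValueError(f"corrupt chain at cluster {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

def count_fragments(chain):
    """Count runs of consecutive clusters; 1 means the file is contiguous."""
    if not chain:
        return 0
    return 1 + sum(1 for a, b in zip(chain, chain[1:]) if b != a + 1)

# A toy 10-entry Table: the file starts at cluster 2 and uses 2, 3, 7, 8.
fat = [FREE, FREE, 3, 7, BAD, FREE, FREE, 8, END_OF_CHAIN, FREE]
chain = cluster_chain(fat, 2)
print(chain, count_fragments(chain))
```

In this toy Table the file occupies two separate runs of clusters, which is exactly the fragmentation described above.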

History

When FDDs were first introduced, they were a great improvement on the cassette tapes which preceded them: it was now possible to move to a given spot on the storage medium immediately. The origins of the FAT lie in the need to keep track of files on the 180K floppy disks used by the original IBM PC. FAT is an abbreviation for the Table itself and also the name of the File System (FS) that implements it; it was the means devised to manage the new possibilities that disks brought to the PC. As the 'heart' of DOS, the FAT system has been used successfully by millions over the last 20 years and, for 'modest' sized storage media, remains a perfectly adequate disk housekeeping tool. However, it was designed in the days when FDDs were the primary storage device, and as HDDs appeared and grew steadily larger in capacity, some of the design decisions made in those early days became less appropriate.

The 'old' file system.

The volumes on any HDD or FDD are divided into four areas: the Boot Sector, the File Allocation Table(s), the root directory (at the start of the disk) and the file or data area (where the file data is stored; space is allocated in units called clusters, which are one or more contiguous sectors). The Table is fixed in size, and because each entry is limited to 16 bits it is possible to address a maximum of 65,536 separate allocation units. Most disks contain two identical copies of the Table so that files can still be retrieved if a sector in the first copy becomes unreadable. Initially, each of these units was 512 bytes (the same as the sector size), which worked well for years until disks began to 'outgrow' the Table limit, the maximum allowable disk size being 32 MB (64k x 512). Larger disks had to be divided into separate partitions, each partition having its own drive letter and Table as though it were physically a different disk. After DOS 3.0, the FAT system was enhanced to work with larger disks (above 32 MB). The number of units was still restricted to 65,536 (16 bits), but the units themselves could 'grow' and spread to cover the larger disk areas. This technique is still in use today. The table below shows how units grow to make up the size of the disks.

HDD size       Allocation unit       Sectors per unit
up to 128MB     2048 bytes (2k)        4
up to 256MB     4096 bytes (4k)        8
up to 512MB     8192 bytes (8k)       16
up to 1GB      16384 bytes (16k)      32
up to 2GB      32768 bytes (32k)      64
up to 4GB      65536 bytes (64k)     128

n.b. 'clusters' are addressable allocation units of one or more contiguous sectors (4 to 128 sectors in size in the table above).
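The arithmetic behind the table can be checked with a short Python sketch. It assumes, as the text does, that all 65,536 values of a 16 bit entry are usable and that unit sizes double from the 512 byte sector; real volumes reserve a few entry values, so the true limits are slightly lower. The name min_cluster_bytes is made up for this example.

```python
SECTOR_BYTES = 512
MAX_UNITS = 2 ** 16  # 65,536 units addressable with a 16 bit entry
MB = 1024 * 1024

def min_cluster_bytes(partition_bytes):
    """Smallest doubling of the sector size whose 65,536 units cover the partition."""
    cluster = SECTOR_BYTES
    while cluster * MAX_UNITS < partition_bytes:
        cluster *= 2
    return cluster

for size_mb in (32, 128, 256, 512, 1024, 2048, 4096):
    c = min_cluster_bytes(size_mb * MB)
    print(f"{size_mb:>5} MB -> {c:>6} bytes ({c // SECTOR_BYTES} sectors)")
```

The 32 MB row reproduces the original limit (64k x 512), and the remaining rows match the table above.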

Unfortunately the larger disk sizes mean a 'hidden' waste of disk space and system inefficiency. The problem, by definition, is the allocation unit: the minimum unit of space that can be allocated to any single file. 'Leftovers' (bits of files remaining after a number of whole units have been used for storage) are placed in a fresh unit, leaving much of the space in it unoccupied (remember that different files cannot be mixed in the same allocation unit). It can be seen from the table above that more space will be wasted with the larger sized allocation units. Also, the need to search the Table continually while the FS is doing its job (the FAT being consulted and/or updated) exacts an even greater performance penalty on larger disks, and eventually, no matter what the size of the drive, disk and file fragmentation will slow the FAT system down. Given the difficulties that FAT constraints put on the accommodation of large drives, together with its liability to fragment files and disk space (the free space on a disk is not likely to be made up of consecutive clusters), performance is likely to degrade rapidly. This is more so with large volumes of smaller files on large drives, large (compound) files on single volumes and with large drives generally. These are the problems that the New FS seek to address.
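The 'hidden' waste can be put into numbers with a minimal Python sketch. The file sizes used here (1,000 files of 1,500 bytes each) are invented purely for illustration, and slack_bytes is a hypothetical helper name; the calculation itself is just the unused remainder of each file's last allocation unit.

```python
def slack_bytes(file_bytes, cluster_bytes):
    """Bytes left unused in the last allocation unit of one file."""
    remainder = file_bytes % cluster_bytes
    return 0 if remainder == 0 else cluster_bytes - remainder

# Hypothetical workload: 1,000 small files of 1,500 bytes each.
files = [1500] * 1000
for cluster in (2048, 8192, 32768):
    wasted = sum(slack_bytes(f, cluster) for f in files)
    print(f"{cluster:>6} byte units: {wasted:>10,} bytes wasted")
```

With 2k units each of these files wastes 548 bytes; with 32k units each wastes 31,268 bytes, so the same data ties up roughly 31 MB of otherwise empty space.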

Fragmentation.

As well as disks having smaller capacities when the FAT system was designed, there were also relatively fewer files on the disks; as the number of files on disks increased, they became more susceptible to fragmentation. Since this forces the HDD to work harder, fragmentation becomes a drag on performance and shortens the life of the HDD. One indicator of the problem is the number of defrag utilities on the market, as well as the 'Defrag' utility included with DOS since version 6.0.

HPFS & NTFS (abbr. New FS) Improvements.

With OS/2 ver 1.2, MS and IBM introduced their first Installable File System (IFS) for a PC O/S in the form of the HPFS. It initially appeared in a 16-bit version but was soon followed by the 32-bit 'HPFS386', included on the then predominantly server-class 386 systems. Using the High Performance File System (HPFS) under OS/2, or the NT File System (NTFS) under Wins NT, you can use 512 byte allocation units regardless of the size of the partition.

In the FAT system, filenames must be in the '8.3' format, i.e. up to 8 characters followed by an extension of up to 3 characters. This format provided ample latitude for distinguishing between relatively small numbers of files, but when the file count is in the thousands, obscure names have to be found to keep the names unique. Using subdirs does allow the same name to be reused under different parent dirs., but this is no solution. New FS allow names of up to 254 characters, including spaces, for much greater readability. To access a file, the FAT system must first locate the right entry in the dir. and, because it assumes the list of files to be short, it starts at the top and reads through the entries until it finds the desired file. Obviously the more files you have, the longer it takes to canvass the list, especially on large partitions.
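The cost of reading a dir. from the top can be seen in a short Python sketch. The dir. is modelled here as a plain list of names, which is an assumption for illustration (a real dir. entry holds more than the name), and find_entry is a hypothetical helper.

```python
def find_entry(directory, name):
    """Scan from the top; return (index, entries_examined), index None if absent."""
    target = name.upper()  # FAT names are not case sensitive
    for examined, entry in enumerate(directory, start=1):
        if entry.upper() == target:
            return examined - 1, examined
    return None, len(directory)

# A hypothetical dir. holding 5,000 files.
directory = [f"FILE{i:04d}.TXT" for i in range(5000)]
print(find_entry(directory, "file0003.txt"))  # found after 4 entries
print(find_entry(directory, "FILE4999.TXT"))  # found only after all 5,000
print(find_entry(directory, "MISSING.TXT"))   # whole list read, nothing found
```

The work grows in step with the number of entries, which is why long dirs. are slow to canvass.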

Preventing data loss is another goal of the New FS. Under FAT, if a prog 'locks up' it is easy for the Table to become corrupted and no longer match the contents of the disk, resulting in data loss. 'CHKDSK' can identify problems and fix the Table, but cannot put the data back as it was; instead it creates 'FILExxx.chk' files containing the missing data. New FS have beefed up 'CHKDSK' to take advantage of safer storage techniques which can detect and correct problems as they occur. New FS can also detect 'bad' sectors dynamically and, on finding one, mark the sector as 'unusable' and transfer its data to a different, safe sector.

OS/2 will allow running DOS progs to access HPFS partitions and Wins NT will let DOS progs access both HPFS and NTFS drives, but DOS progs cannot read dirs. and files that do not follow the '8.3' convention. This does not prevent changing FS, but the '8.3' convention must be followed in dirs. that need to be read by DOS progs. You can run a mix of prog types, and as long as you run DOS progs only under OS/2 or Wins NT you can move to the New FS. OS/2 and Wins NT allow DOS to be kept on one partition so that either O/S can be booted, an easier way of migrating to a new O/S. However, DOS itself cannot see the New FS partitions, regardless of the name lengths used, and it is wise to keep a FAT partition for 'trusted' DOS progs (so you can boot DOS) and data until everything runs properly on OS/2 or NT. A lot of the extra info that OS/2 and NT keep about files is stored as Extended Attributes in a special area. DOS progs are not aware of these and will drop them. It is therefore important to use a backup prog specifically designed for OS/2 or Wins NT.

Wins'95 VFAT.

Wins'95 incorporates the 'VFAT' (Virtual File Allocation Table) system to replace the 'old' FAT system, but it does not offer many of the advanced features of NTFS and HPFS. It is a compromise designed to give compatibility with existing FAT formatted disks (100%, because volumes are structured exactly the same; a volume is a logical entity that the O/S represents as a drive letter) and an easy migration path from Wins 3.x, while eliminating the FAT system's most bothersome limitation, short filenames. VFAT automatically generates '8.3' aliases for long file names so that a file can be accessed by any app. Commercially, MS made compatibility their top criterion; the downside is that VFAT inherits many of FAT's deficiencies, e.g. large cluster sizes.
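The idea of an '8.3' alias can be sketched in Python. This is a deliberately simplified illustration and not the exact rules Wins'95 applies: the function name short_alias, the choice to keep only letters and digits, and the fixed '~1' style suffix are all assumptions, and no check is made for names that already fit the '8.3' format or for clashes with existing aliases.

```python
def short_alias(long_name, index=1):
    """Very simplified '8.3' alias; NOT the exact algorithm used by Wins'95."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension present in the long name
        stem, ext = long_name, ""

    def keep(text):
        # Assumption: drop spaces and punctuation, fold to upper case.
        return "".join(ch for ch in text.upper() if ch.isalnum())

    base = keep(stem)[:6] + f"~{index}"  # 6 characters plus '~n' makes 8
    ext = keep(ext)[:3]                  # extension cut to 3 characters
    return f"{base}.{ext}" if ext else base

print(short_alias("Annual report 1995.document"))  # ANNUAL~1.DOC
print(short_alias("Annual report 1996.document", index=2))  # ANNUAL~2.DOC
```

The index argument stands in for whatever numbering is needed to keep two aliases in the same dir. distinct.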