Hierarchical File System (HFS) is a file system developed by Apple
Inc. for use in computer systems running Mac OS. Originally designed
for use on floppy and hard disks, it can also be found on read-only
media such as CD-ROMs. HFS is also referred to as Mac OS Standard (or,
erroneously, “HFS Standard”), while its successor, HFS Plus, is also
called Mac OS Extended (or, erroneously, “HFS Extended”). With the
introduction of OS X 10.6, Apple dropped support for formatting or
writing HFS disks and images, which are now supported only as
read-only volumes.
History
HFS was introduced by Apple in September 1985 specifically to support
Apple's first hard disk drive for the Macintosh, replacing the
Macintosh File System (MFS), the original file system, which had been
introduced over a year and a half earlier with the first Macintosh
computer. HFS drew heavily upon SOS, Apple's first hierarchical
operating system, written for the failed Apple III, which also served
as the basis for hierarchical filing systems on the Apple IIe and
Lisa. It was developed by Patrick Dirks and Bill Bruffey, and it
shared a number of design features with MFS that were not available in
other file systems of the time (such as DOS's FAT). Files could have
multiple forks (normally a data fork and a resource fork), which
allowed program code to be stored separately from resources such as
icons that might need to be localised. Files were referenced with
unique file IDs rather than file names, and file names could be 255
characters long (although the Finder only supported a maximum of 31
characters).
However, MFS was optimised for very small and slow media, namely
floppy disks, so HFS was introduced to overcome some of the
performance problems that arrived with larger media, notably hard
drives. The main concern was the time needed to display the contents
of a folder. Under MFS, all of the file and directory listing
information was stored in a single file, which the system had to
search to build a list of the files stored in a particular folder.
This worked well on a system with a few hundred kilobytes of storage
and perhaps a hundred files, but as systems grew into megabytes and
thousands of files, performance degraded rapidly.
The solution was to replace MFS's directory structure with one more
suitable to larger file systems. HFS replaced the flat table
structure with the Catalog File, which uses a B-tree structure that
can be searched very quickly regardless of size. HFS also redesigned
various structures to hold larger numbers, with 16-bit integers being
replaced by 32-bit integers almost universally. Oddly, one of the few
places this "upsizing" did not take place was the file directory
itself, which limits HFS to a total of 65,535 files.
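The difference between the two directory designs can be illustrated
with a minimal Python sketch. It does not reproduce the HFS on-disk
format: a sorted list searched with binary search stands in for the
B-tree, and the function names and the choice of keying each entry by
a (parent folder ID, name) pair are assumptions made for illustration.

```python
from bisect import bisect_left

def list_folder_flat(entries, folder_id):
    """Flat table: every entry on the volume is examined."""
    return [name for parent, name in entries if parent == folder_id]

def list_folder_sorted(sorted_entries, folder_id):
    """Sorted keys: binary-search to the folder's first entry,
    then read only the neighbouring entries that belong to it."""
    i = bisect_left(sorted_entries, (folder_id, ""))
    names = []
    while i < len(sorted_entries) and sorted_entries[i][0] == folder_id:
        names.append(sorted_entries[i][1])
        i += 1
    return names

# Hypothetical volume contents as (parent folder ID, name) pairs.
entries = [(2, "System"), (1, "Apps"), (2, "Finder"), (1, "Docs"), (3, "Notes")]
catalog = sorted(entries)  # stand-in for the ordered keys of a B-tree

print(list_folder_flat(entries, 2))
print(list_folder_sorted(catalog, 2))
```

The flat scan touches every entry no matter which folder is requested,
while the sorted lookup does work proportional to the logarithm of the
total entry count plus the size of the folder being listed.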
While HFS is a proprietary file system format, it is well documented,
so there are usually solutions available to access HFS-formatted disks
from most modern operating systems.
Although Apple introduced HFS out of necessity with its first 20 MB
hard disk offering for the Macintosh in September 1985, HFS was not
widely introduced until System 3.0, which debuted with the Macintosh
Plus in January 1986 along with the larger 800K floppy disk drive for
the Macintosh, which also required HFS support. More importantly, HFS
was hard-coded into the new Plus's 128K ROM, freeing space not only on
the system software disk but also in RAM. However, RAM-based HFS
support was also implemented for the earlier Macintosh 512K's 64K ROM
through the addition of an INIT file on the System Disk. The
introduction of HFS was the first advancement by Apple to leave a
Macintosh model behind: the original 128K Macintosh, which lacked
sufficient memory to load the HFS code and was promptly discontinued.
In 1998, Apple introduced HFS Plus to address inefficient allocation
of disk space in HFS and to add other improvements. HFS is still
supported by current versions of Mac OS, but starting with Mac OS X an
HFS volume cannot be used for booting, and beginning with OS X 10.6
(Snow Leopard), HFS volumes are read-only and cannot be created or
updated.
Design
The Hierarchical File System divides a volume into logical blocks of
512 bytes. These logical blocks are then grouped together into
allocation blocks, which can contain one or more logical blocks
depending on the total size of the volume. HFS uses a 16-bit value to
address allocation blocks, limiting the number of allocation blocks to
65,535.
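As a rough illustration of how the allocation block size follows from
the volume size, the sketch below picks the smallest multiple of 512
bytes that keeps the block count within the 16-bit limit. It assumes a
limit of 65,535 allocation blocks and ignores the space taken by the
volume's own structures, so real formatting tools may choose slightly
different values; the function name is hypothetical.

```python
LOGICAL_BLOCK_SIZE = 512        # bytes per logical block
MAX_ALLOCATION_BLOCKS = 65_535  # assumed limit from the 16-bit address

def allocation_block_size(volume_bytes):
    """Smallest multiple of 512 bytes that covers the volume
    with at most MAX_ALLOCATION_BLOCKS allocation blocks."""
    # Ceiling division, written with integer arithmetic only.
    logical_blocks = -(-volume_bytes // LOGICAL_BLOCK_SIZE)
    logical_per_allocation = -(-logical_blocks // MAX_ALLOCATION_BLOCKS)
    return logical_per_allocation * LOGICAL_BLOCK_SIZE

MIB = 1024 * 1024
print(allocation_block_size(20 * MIB))
print(allocation_block_size(1000 * MIB))
```

Under these assumptions a 20 MiB volume gets 512-byte allocation
blocks, while a 1,000 MiB volume needs 16 KiB allocation blocks.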
There are five structures that make up an HFS volume:
- Logical blocks 0 and 1 of the volume are the Boot Blocks, which
contain system startup information, for example the names of the
System and Shell (usually the Finder) files that are loaded at
startup.
- Logical block 2 contains the Master Directory Block (MDB). This
defines a wide variety of data about the volume itself, for example
the date and time stamps for when the volume was created, the location
of the other volume structures such as the Volume Bitmap, and the size
of logical structures such as allocation blocks. There is also a
duplicate of the MDB called the Alternate Master Directory Block
(Alternate MDB), located at the opposite end of the volume in the
second-to-last logical block. This is intended mainly for use by disk
utilities and is only updated when either the Catalog File or the
Extents Overflow File grows in size.
- Logical block 3 is the starting block of the Volume
Bitmap, which keeps track of which allocation blocks are
in use and which are free. Each allocation block on the volume is
represented by a bit in the map: if the bit is set then the block
is in use; if it is clear then the block is free to be used. Since
the Volume Bitmap must have a bit to represent each allocation
block, its size is determined by the size of the volume
itself.
- The Extents Overflow File is a B-tree that contains extra extents
that record which allocation blocks are allocated to which files, once
the initial three extents in the Catalog File are used up. Later
versions also added the ability for the Extents Overflow File to store
extents that record bad blocks, to prevent the file system from trying
to allocate a bad block to a file.
- The Catalog File is another B-tree that contains records for all the
files and directories stored in the volume. It stores four types of
records. Each file is represented by a File Thread Record and a File
Record, while each directory is represented by a Directory Thread
Record and a Directory Record. Files and directories in the Catalog
File are located by their unique Catalog Node ID (CNID).
  - A File Thread Record stores just the name of the file and the CNID
    of its parent directory.
  - A File Record stores a variety of metadata about the file,
    including its CNID, the size of the file, three timestamps (when
    the file was created, last modified and last backed up), the first
    file extents of the data and resource forks, and pointers to the
    file's first data and resource extent records in the Extents
    Overflow File. The File Record also stores two 16-byte fields that
    are used by the Finder to store attributes of the file, such as
    its creator code, its type code, the window the file should appear
    in and its location within the window.
  - A Directory Thread Record stores just the name of the directory
    and the CNID of its parent directory.
  - A Directory Record stores data such as the number of files stored
    within the directory, the CNID of the directory, and three
    timestamps (when the directory was created, last modified and last
    backed up). Like the File Record, the Directory Record also stores
    two 16-byte fields for use by the Finder. These store things like
    the width, height and x and y co-ordinates of the window used to
    display the contents of the directory, the display mode (icon
    view, list view, etc.) of the window and the position of the
    window's scroll bar.
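The fixed positions of the first structures and the behaviour of the
Volume Bitmap can be sketched in a few lines of Python. This is an
illustrative model rather than a parser for real volumes: the function
names are hypothetical, and the bit order within each byte (most
significant bit first) is an assumption of the sketch.

```python
LOGICAL_BLOCK_SIZE = 512

def structure_offsets(total_logical_blocks):
    """Byte offsets of the fixed-position structures in the list above."""
    return {
        "boot_blocks": 0 * LOGICAL_BLOCK_SIZE,             # logical blocks 0-1
        "master_directory_block": 2 * LOGICAL_BLOCK_SIZE,  # logical block 2
        "volume_bitmap": 3 * LOGICAL_BLOCK_SIZE,           # starts at block 3
        # Second-to-last logical block of the volume.
        "alternate_mdb": (total_logical_blocks - 2) * LOGICAL_BLOCK_SIZE,
    }

def bitmap_size_bytes(allocation_blocks):
    """One bit per allocation block, rounded up to whole bytes."""
    return (allocation_blocks + 7) // 8

def is_allocated(bitmap, block):
    """True if the bit for this allocation block is set (block in use)."""
    byte_index, bit_index = divmod(block, 8)
    # Assumed bit order: most significant bit is the lowest block number.
    return bool(bitmap[byte_index] & (0x80 >> bit_index))

def mark_allocated(bitmap, block):
    """Set the bit for this allocation block."""
    byte_index, bit_index = divmod(block, 8)
    bitmap[byte_index] |= 0x80 >> bit_index

def first_free_block(bitmap, allocation_blocks):
    """Lowest-numbered clear bit, or None if the volume is full."""
    for block in range(allocation_blocks):
        if not is_allocated(bitmap, block):
            return block
    return None

# A hypothetical volume with 20 allocation blocks, two of them in use.
bitmap = bytearray(bitmap_size_bytes(20))
mark_allocated(bitmap, 0)
mark_allocated(bitmap, 1)
print(first_free_block(bitmap, 20))
print(structure_offsets(40960))
```

Because the bitmap needs one bit per allocation block, a volume at the
65,535-block limit needs 8,192 bytes of bitmap, or 16 logical blocks.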
Problems
The Catalog File, which stores all the file and directory records in a
single data structure, causes performance problems when the system
allows multitasking, as only one program can write to this structure
at a time, meaning that many programs may be waiting in a queue
because one program is "hogging" the system. It is also a serious
reliability concern, as damage to this file can destroy the entire
file system. This contrasts with other file systems that store file
and directory records in separate structures (such as DOS's FAT file
system or the Unix File System), where having the structure
distributed across the disk means that damage to a single directory is
generally non-fatal and the data can possibly be reconstructed from
the undamaged portions.
Additionally, the limit of 65,535 allocation blocks resulted in files
having a "minimum" size equivalent to 1/65,535th of the size of the
disk. Thus, any given volume, no matter its size, could only store a
maximum of 65,535 files. Moreover, any file would be allocated more
space than it actually needed, up to the allocation block size. When
disks were small, this was of little consequence, because the
individual allocation block size was trivial, but as disks started to
approach the 1 GB mark, the smallest amount of space that any file
could occupy (a single allocation block) became excessively large,
wasting significant amounts of disk space. For example, on a 1 GB
disk, the allocation block size under HFS is 16 KB, so even a 1-byte
file would take up 16 KB of disk space. This situation was less of a
problem for users with large files (such as pictures, databases or
audio), because these larger files wasted less space as a percentage
of their file size. Users with many small files, on the other hand,
could lose a large amount of space to the large allocation block size.
This made partitioning disks into smaller logical volumes very
appealing for Mac users, because small documents stored on a smaller
volume would take up much less space than if they resided on a large
partition. The same problem existed in the FAT16 file system.
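The wasted space described above can be estimated with a short Python
sketch. It models only the rounding of each file up to whole
allocation blocks, treating a file as a single stream of bytes and
ignoring forks and catalog overhead; the function names are
hypothetical.

```python
def allocated_size(file_bytes, allocation_block_size):
    """Space a file occupies once rounded up to whole allocation blocks."""
    blocks = -(-file_bytes // allocation_block_size)  # ceiling division
    return blocks * allocation_block_size

def total_allocated(file_sizes, allocation_block_size):
    """Total space occupied by a collection of files."""
    return sum(allocated_size(size, allocation_block_size) for size in file_sizes)

KB = 1024
small_files = [1 * KB] * 1000  # a thousand hypothetical 1 KB documents

print(allocated_size(1, 16 * KB))             # a 1-byte file, 16 KB blocks
print(total_allocated(small_files, 16 * KB))  # large volume
print(total_allocated(small_files, 512))      # small volume
```

With 16 KB allocation blocks the thousand 1 KB files occupy
16,384,000 bytes, against 1,024,000 bytes with 512-byte blocks, which
is why splitting a large disk into smaller volumes saved space.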