Filesystem exploration: the almost-FAT12 floppy disk

My parents have a (pretty old at this point) Yamaha Clavinova digital piano with a floppy drive that can be used to record or play MIDI files.

It came with a floppy disk named “Disk Orchestra Collection” that contains various sample songs. I randomly decided to image it. It copied fine without errors, yay! But I couldn’t mount it: “No mountable filesystems”.

I was pretty sure this was some sort of “copy protection”, and not an issue with the imaging process. Given the age of the system, the protection was not going to be very complex, though. Let’s try to get it to mount!

Note: For accessibility purposes, I’ve put a summary of all hexdumps in this post in figcaption tags. I don’t know if it’s the best way to do it and am open to suggestions.


Looking through the image with hexdump revealed some interesting things:

00000e00  4d 55 53 49 43 20 20 20  44 49 52 20 00 00 00 00  |MUSIC   DIR ....|
00000e10  00 00 00 00 00 00 29 78  94 1e 02 00 e0 0b 00 00  |......)x........|
00000e20  4e 41 4d 45 20 20 20 20  4d 44 41 27 00 00 00 00  |NAME    MDA'....|
00000e30  00 00 00 00 00 00 2a 78  94 1e 05 00 e0 0b 00 00  |......*x........|
At offset e00, an ASCII string "MUSIC DIR"; at e20, "NAME MDA"

This looks like a directory listing on a FAT filesystem! On FAT, filenames are limited to an 8-character name and a 3-character extension, all uppercase ASCII.
The filesystem stores the 8 characters of the name and the 3 characters of the extension one after the other, padding them with spaces if they are shorter than that. For instance, the name "A.X" will be stored as "A<7 spaces>X<2 spaces>". So here, we can see two names "MUSIC.DIR" and "NAME.MDA". We can also see that the start of each filename is 0x20 bytes after the previous one, and FAT directory entries happen to be 0x20 bytes long.
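Here’s a minimal Python sketch of that padding rule. The function names `decode_83` and `encode_83` are made up for this example, and it ignores special cases such as deleted entries or long file names.

```python
def decode_83(raw: bytes) -> str:
    """Turn the 11 raw name bytes of a FAT directory entry into NAME.EXT."""
    if len(raw) != 11:
        raise ValueError("an 8.3 name field is exactly 11 bytes long")
    name = raw[:8].decode("ascii").rstrip(" ")
    ext = raw[8:].decode("ascii").rstrip(" ")
    return f"{name}.{ext}" if ext else name


def encode_83(filename: str) -> bytes:
    """Inverse operation: pad the name to 8 bytes and the extension to 3."""
    name, _, ext = filename.upper().partition(".")
    if len(name) > 8 or len(ext) > 3:
        raise ValueError("name does not fit in 8.3")
    return name.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")


# First 11 bytes of the two entries seen at offsets 0xe00 and 0xe20.
print(decode_83(bytes.fromhex("4d 55 53 49 43 20 20 20 44 49 52")))  # MUSIC.DIR
print(decode_83(bytes.fromhex("4e 41 4d 45 20 20 20 20 4d 44 41")))  # NAME.MDA
```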

Since this is a floppy disk, it is probably FAT12 formatted (FAT16 and FAT32 are for larger drives). After reading through some Wikipedia pages, here’s what I understand about this filesystem: (note: I may have got some details wrong)

  • The first sector of the disk (sector 0) is the “boot sector”. It contains information about the layout of the filesystem, and some code that is executed to boot the computer. It always contains boot code even if the disk does not contain an OS; in that case the boot code is a small program that writes “Non-system disk or disk error, press any key to reboot” (or a variant of it) on the screen. (Different disk format tools put different messages so you may have seen different messages, or even localized ones, when leaving a floppy in your computer’s disk drive)
  • The boot sector can be followed by a number of “reserved sectors”. I guess that may be useful for some OSs to put additional boot code.
  • After the boot and reserved sectors comes the File Allocation Table (FAT). All the disk space that is used to store file data (and subdirectory contents) is divided into “clusters” (which are a certain number of sectors long). The FAT contains a series of values that indicate the state of each cluster:
    • The cluster is free space (not used)
    • The cluster is damaged and should not be used
    • The cluster is “reserved”. I guess this is used to indicate that the filesystem driver should not mess with this cluster, for whatever reason.
    • The cluster contains data for a file. The value in the FAT either indicates the index of the next cluster that contains data for this file, or a special value that indicates that it’s the last cluster that holds data for this file.
  • There can be multiple copies of the FAT, for redundancy purposes.
  • After the FATs comes the “root directory”, which lists the files and directories at the root of the drive. On FAT, directories are basically special files that contain a list of filenames, with attributes and the start cluster of each one. The “root directory” is special because it’s not part of the storage data, so it’s not divided into clusters; it has a fixed size. Want to store more files at the root of the drive? Sorry, you’ll have to reformat it.
  • Finally, the file clusters come afterwards, until the end of the disk.
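The layout above boils down to a few additions. Here’s a rough sketch; the `fat_layout` helper is hypothetical, and the example values are the ones usually found on a 720K DOS floppy, which is only an assumption at this point in the investigation.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    fat_start: int   # byte offset of the first FAT
    root_start: int  # byte offset of the root directory
    data_start: int  # byte offset of the first data cluster
    end: int         # byte offset of the end of the disk


def fat_layout(bytes_per_sector, reserved_sectors, fat_count,
               sectors_per_fat, root_entries, total_sectors) -> Layout:
    fat_start = reserved_sectors * bytes_per_sector
    root_start = fat_start + fat_count * sectors_per_fat * bytes_per_sector
    # Each directory entry is 32 bytes; round the root directory up to
    # a whole number of sectors.
    root_sectors = -(-root_entries * 32 // bytes_per_sector)
    data_start = root_start + root_sectors * bytes_per_sector
    return Layout(fat_start, root_start, data_start,
                  total_sectors * bytes_per_sector)


# Assumed values for a typical 720K floppy.
example = fat_layout(512, 1, 2, 3, 112, 1440)
print([hex(v) for v in example])  # ['0x200', '0xe00', '0x1c00', '0xb4000']
```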

Let’s see how a file is read on FAT. First, you need to locate it. Let’s say the file path is “\X\Y\Z.TXT”:

  1. The first name in the path is “X”, so we look for an entry named “X<7 spaces><3 spaces>” in the root directory. This gives us the start cluster of the contents of directory \X.
  2. From the start cluster, we can read the contents of directory \X (the same way we would read a file; see below) and find subdirectory Y. This gives us the start cluster of the contents of directory \X\Y.
  3. We read the contents of directory \X\Y until we find the entry “Z<7 spaces>TXT”. This gives us the start cluster of this file.

The directory entries also give us other information, such as the file (or directory) creation date/time, last modification date/time, file size, and attributes. The attributes indicate whether what we’ve found is a file, a directory, or some other weird thing (like long file names).
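As an illustration, here’s a sketch that unpacks one 32-byte entry with Python’s `struct` module. The `parse_entry` name and the dictionary keys are made up for this example; the field offsets follow the usual FAT directory entry layout as I understand it, and the creation date/time fields are skipped. The sample bytes are the MUSIC.DIR entry from the hexdump at 0xe00.

```python
import struct
from datetime import datetime

ATTR_DIRECTORY = 0x10
ATTR_LONG_NAME = 0x0F  # read-only + hidden + system + volume label


def parse_entry(raw: bytes) -> dict:
    # 11 name bytes, 1 attribute byte, 10 bytes skipped here, then the
    # modification time, modification date, start cluster and file size.
    name, attr, mtime, mdate, cluster, size = struct.unpack("<11sB10xHHHI", raw)
    # Date: 7 bits year since 1980, 4 bits month, 5 bits day.
    # Time: 5 bits hours, 6 bits minutes, 5 bits seconds divided by two.
    # A zeroed date field would make datetime() raise; not handled here.
    modified = datetime(
        1980 + (mdate >> 9), (mdate >> 5) & 0x0F, mdate & 0x1F,
        mtime >> 11, (mtime >> 5) & 0x3F, (mtime & 0x1F) * 2,
    )
    return {
        "raw_name": name,
        "is_long_name": attr & ATTR_LONG_NAME == ATTR_LONG_NAME,
        "is_directory": bool(attr & ATTR_DIRECTORY),
        "modified": modified,
        "start_cluster": cluster,
        "size": size,
    }


MUSIC_DIR_ENTRY = bytes.fromhex(
    "4d 55 53 49 43 20 20 20 44 49 52 20 00 00 00 00"
    "00 00 00 00 00 00 29 78 94 1e 02 00 e0 0b 00 00"
)
info = parse_entry(MUSIC_DIR_ENTRY)
print(info["modified"], info["start_cluster"], info["size"])
# 1995-04-20 15:01:18 2 3040
```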

Now that we know the start cluster of the file, we can read the value in the FAT corresponding to that cluster, and get the next file cluster. We follow this linked list to the end and hopefully get the list of all clusters that store this file’s data, so we can read it.
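Following the linked list could look like the sketch below. It assumes the FAT has already been decoded into a list of integers, and it uses the FAT12 value ranges as I understand them (0x002 to 0xFEF means “next cluster”, 0xFF8 to 0xFFF means “end of chain”). The `cluster_chain` name and the small table are purely illustrative.

```python
def cluster_chain(fat: list, start: int) -> list:
    """Return the list of clusters of a file, starting at `start`."""
    chain = []
    seen = set()
    cluster = start
    while 0x002 <= cluster <= 0xFEF:
        if cluster in seen:
            raise ValueError(f"loop in the FAT at cluster {cluster}")
        if cluster >= len(fat):
            raise ValueError(f"cluster {cluster} is outside the FAT")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    if cluster < 0xFF8:
        # Free (0x000), reserved or bad cluster in the middle of a file.
        raise ValueError(f"chain ends on unexpected value {cluster:#05x}")
    return chain


# Toy table: entries 0 and 1 are special, then two three-cluster files.
toy_fat = [0xFF9, 0xFFF, 0x003, 0x004, 0xFFF, 0x006, 0x007, 0xFFF]
print(cluster_chain(toy_fat, 2))  # [2, 3, 4]
print(cluster_chain(toy_fat, 5))  # [5, 6, 7]
```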

Okay, so why does our mystery disk refuse to be mounted? Let’s look at sector 0:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
hexdump showing that sector 0 is filled with zeroes

It’s empty! That’s why the disk cannot be mounted. The information about the
filesystem layout is missing.

So let’s try re-creating sector 0; maybe it will be sufficient to mount the
disk.

The first two cluster values of the FAT are special. From what I understand, the value for cluster 0 is always 0xFFx, with x = 0 or between 8 and F inclusive, and the value for cluster 1 is always 0xFFF. On FAT12, these are 12-bit values, so each value takes 1.5 bytes and two values are packed into three bytes in little-endian order (which is a bit weird because of the half-byte).

That means we can expect a FAT12 FAT to start with “Fx FF FF”. At sector 1, we can see:

00000200  f9 ff ff 03 40 00 ff 6f  00 07 f0 ff 09 a0 00 0b  |....@..o........|
00000210
hexdump showing bytes f9 ff ff at the start of sector 1, then some other bytes

That’s probably the first FAT. Let’s look a bit further:

00000500  01 22 20 03 42 20 05 62  20 07 82 20 ff 0f 00 00  |." .B .b .. ....|
00000510  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000800  f9 ff ff 03 40 00 ff 6f  00 07 f0 ff 09 a0 00 0b  |....@..o........|
00000810
hexdump showing zeroes starting at offset 0x50e, then f9 ff ff at offset 0x800, followed by the same bytes that were at offset 0x200

So the second FAT starts at offset 0x800. That means the FAT is 0x600 bytes, or three sectors long. So the second FAT ends at 0xe00, which is exactly the offset of the MUSIC.DIR directory entry, so the root directory is indeed right after the second FAT.
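To make the 12-bit packing more concrete, here’s a small sketch that unpacks the bytes seen at offset 0x200. The `fat12_entries` name is made up; the bit shuffling is the standard FAT12 packing as I understand it (two values per group of three bytes).

```python
def fat12_entries(fat_bytes: bytes) -> list:
    """Unpack 12-bit FAT values; every 3 bytes hold 2 values."""
    entries = []
    # Only walk complete 3-byte groups and ignore any trailing bytes.
    for i in range(0, len(fat_bytes) - len(fat_bytes) % 3, 3):
        b0, b1, b2 = fat_bytes[i:i + 3]
        entries.append(b0 | ((b1 & 0x0F) << 8))  # low byte + low nibble
        entries.append((b1 >> 4) | (b2 << 4))    # high nibble + high byte
    return entries


head = bytes.fromhex("f9 ff ff 03 40 00 ff 6f 00 07 f0 ff 09 a0 00 0b")
print([f"{v:03x}" for v in fat12_entries(head)])
# ['ff9', 'fff', '003', '004', 'fff', '006', '007', 'fff', '009', '00a']
```

The first two values are the special 0xFF9 and 0xFFF. After that, cluster 2 points to 3, 3 points to 4, and 4 holds an end-of-chain marker, which is consistent with the start cluster 0x0002 stored in the MUSIC.DIR entry.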

Now we can determine the information we need to put into sector 0 in order to make the disk readable:

Bytes per sector: 512 (standard for a 720K PC floppy disk)
Number of reserved sectors: 1 (sector 0 counts as a reserved sector, and the FAT immediately follows it).
Number of FATs: 2 (as seen above)
Total sector count: 1440 (720KB disk * 1024 bytes per KB / 512 bytes per sector)
Sectors per FAT: 3 (as seen above)
Sectors per track: 9 (standard for a 720K PC floppy disk)
Number of heads: 2 (standard for a 720K PC floppy disk)
Maximum number of root directory entries:
Let’s try to locate the end of the directory. After a long list of filenames
starting at 0xe00, we find:

000015a0  4d 44 52 5f 35 39 20 20  45 56 54 27 00 00 00 00  |MDR_59  EVT'....|
000015b0  00 00 00 00 00 00 0a a9  95 1e fd 01 f4 2c 00 00  |.............,..|
000015c0  00 e5 e5 e5 e5 e5 e5 e5  e5 e5 e5 e5 e5 e5 e5 e5  |................|
hexdump showing a "MDR_59 EVT" entry at offset 0x15a0; the next entry at 0x15c0 starts with 00 then e5 repeated

A directory entry starting with 0x00 indicates the end of the directory. After that, there is a bunch of 0xe5 bytes, with a 0x00 every 32 bytes. This ends at offset 0x1c00, where something that looks like file data begins.

So let’s assume the root directory is located between 0xe00 and 0x1c00. This is 3584 bytes, which is 7 sectors or 112 directory entries. Comparing this to other disk images I have, this seems common.

Sectors per cluster:
I’m a bit confused by this value. The disk can store 1426 sectors of cluster
data, and the FAT has 1164 non-zero cluster values, so I assumed that there was
one sector per cluster. This was wrong; I could mount the disk but not read
any files. Other disks that I have seem to all have two sectors per cluster,
so I used this value instead, and it seems to work. Maybe I’m missing something
about the FAT layout.
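One possible explanation, offered as a guess rather than a confirmed diagnosis: a 3-sector FAT only has room for 1024 12-bit values, which is fewer than the 1426 clusters needed with one sector per cluster. The sketch below just redoes that arithmetic with the numbers from this post. The chain length of 3 for MUSIC.DIR is an extra assumption that comes from decoding the FAT bytes shown at offset 0x200 by hand (clusters 2, 3, 4, then end of chain).

```python
import math

SECTOR = 512
TOTAL_SECTORS = 1440
RESERVED_SECTORS = 1
FAT_COUNT = 2
SECTORS_PER_FAT = 3
ROOT_SECTORS = 7

data_sectors = (TOTAL_SECTORS - RESERVED_SECTORS
                - FAT_COUNT * SECTORS_PER_FAT - ROOT_SECTORS)
fat_capacity = SECTORS_PER_FAT * SECTOR * 8 // 12  # number of 12-bit values


def fits_in_fat(sectors_per_cluster: int) -> bool:
    # FAT entries 0 and 1 are special, so data clusters start at index 2.
    return data_sectors // sectors_per_cluster + 2 <= fat_capacity


def clusters_needed(file_size: int, sectors_per_cluster: int) -> int:
    return math.ceil(file_size / (sectors_per_cluster * SECTOR))


MUSIC_DIR_SIZE = 3040   # from the directory entry at 0xe00
MUSIC_DIR_CHAIN = 3     # clusters 2, 3, 4 (decoded by hand, see above)

for spc in (1, 2, 4):
    print(spc, fits_in_fat(spc),
          clusters_needed(MUSIC_DIR_SIZE, spc) == MUSIC_DIR_CHAIN)
```

Only two sectors per cluster passes both checks, which also matches the 1 KiB clusters that fsck_msdos reports further down.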

The other values in sector 0 are not directly linked to the filesystem
structure. I put some reasonable values. I had a bit of trouble with the very
first value on the disk (the three-byte “boot jump”); this is an x86 jump
instruction that skips over the layout information to reach the boot code.
Putting zero here did not work, so I copied the value from another disk image
(bytes eb 3c 90, or 0x903ceb when read as a little-endian number).
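For illustration, here’s a sketch of how such a sector could be assembled with Python’s `struct` module. This is not the script used in this post; the field order follows the usual DOS 4.0-style boot sector layout as I understand it, and the OEM name, media descriptor (0xF9), volume serial, volume label and the trailing 55 aa signature are all placeholder assumptions.

```python
import struct

SECTOR_SIZE = 512


def build_sector_zero() -> bytes:
    header = struct.pack(
        "<3s8sHBHBHHBHHHIIBBBI11s8s",
        bytes.fromhex("eb 3c 90"),  # jmp short to offset 0x3e, then nop
        b"MSDOS5.0",                # OEM name (placeholder)
        512,                        # bytes per sector
        2,                          # sectors per cluster
        1,                          # reserved sectors
        2,                          # number of FATs
        112,                        # root directory entries
        1440,                       # total sector count
        0xF9,                       # media descriptor (assumption)
        3,                          # sectors per FAT
        9,                          # sectors per track
        2,                          # number of heads
        0,                          # hidden sectors
        0,                          # 32-bit total sector count (unused here)
        0x00,                       # drive number
        0,                          # reserved
        0x29,                       # extended boot signature
        0,                          # volume serial (placeholder)
        b"NO NAME    ",             # volume label, 11 bytes
        b"FAT12   ",                # filesystem type, 8 bytes
    )
    # No boot code: pad with zeroes and end with the 55 aa signature.
    return header.ljust(SECTOR_SIZE - 2, b"\x00") + b"\x55\xaa"


def with_new_sector_zero(image: bytes) -> bytes:
    """Return a copy of the image with sector 0 replaced."""
    return build_sector_zero() + image[SECTOR_SIZE:]
```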

Okay, let’s see what fsck_msdos thinks about our reconstructed sector 0:

% ./regen_sector_zero.py clavinova_disk_orchestra.img clavinova_disk_orchestra_fixed.img
% fsck_msdos -n clavinova_disk_orchestra_fixed.img 
** clavinova_disk_orchestra_fixed.img
** Phase 1 - Preparing FAT
FAT[0] is incorrect (is 0xFF9; should be 0xF01)
Correct? no
** Phase 2 - Checking Directories
** Phase 3 - Checking for Orphan Clusters
62 files, 194 KiB free (194 clusters)

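The filesystem checker does not like the FAT cluster 0 value 0xFF9, probably because its low byte (0xF9, the usual media descriptor for a 720K floppy) does not match the media descriptor byte I put in sector 0, but the rest of the file system seems OK! Let’s mount it: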

% hdiutil attach -readonly clavinova_disk_orchestra_fixed.img        
/dev/disk6          	                               	/Volumes/NO NAME
% ls -nT /Volumes/NO\ NAME
total 1038
-rwxrwxrwx  1 501  20  12042 Jun  7 23:29:58 1995 MDR_00.EVT
-rwxrwxrwx  1 501  20  32787 Apr 21 20:59:22 1995 MDR_01.EVT
[…] 
-rwxrwxrwx  1 501  20  11508 Apr 21 21:08:20 1995 MDR_59.EVT
-rwxrwxrwx  1 501  20   3040 Apr 20 15:01:18 1995 MUSIC.DIR
-rwxrwxrwx  1 501  20   3040 Apr 20 15:01:20 1995 NAME.MDA
The disk mounts successfully and shows 60 files named MDR_XX.EVT, one file named MUSIC.DIR and another named NAME.MDA. File dates are in 1995

Success! All the files read fine. There are no subdirectories on the disk. It looks like “NAME.MDA” contains the titles of the songs on the disk, in a fixed-size format, and “MUSIC.DIR” contains the corresponding filenames “MDR_XX.EVT”. The EVT files probably contain MIDI data in a proprietary format; I could not easily find information about them or a way to convert them.

And that’s it! I had quite a bit of fun exploring this disk image, and it was satisfying to mount it successfully.
In the future, I think I’ll investigate a weird HFS+ corruption bug that affected some specific versions of Mac OS 8.