In case you were wondering, fileXray (make sure you have a recent version) does support newer HFS+ features such as:
- The “date-added” information stored in Finder Info, which is not the same as the older “creation-time” field in the stat structure. fileXray will determine if a file or folder has this information set, and if so, will display it. For example:
    ...
    # Extended Finder Info
    reserved1 = 0
    date_added = Sat Sep 3 11:21:13 2011
    extended_flags = 0000000000000000
    reserved2 = 0
    reserved3 = 0
    # Data Fork
    logicalSize = 9303170 bytes (9.3 MB)
    ...
- The per-file content protection extended attributes that are used in the iOS version of HFS+. fileXray will detect these attributes and display the corresponding raw data as well as the interpreted data. For example:
    ...
    # Attributes
    # Attribute Key
    keyLength = 62
    pad = 0
    fileID = 4966082
    startBlock = 0
    attrNameLen = 25
    attrName = com.apple.system.cprotect
    # Attribute Data Record (Inline)
    # Record 5 in node 2075 beginning at 512-byte sector 0x65ef0
    recordType = 0x10
    reserved = 0
    reserved = 0
    attrSize = 56 bytes
    attrData = 02 00 00 00 00 00 00 00 02 00 00 00 28 00 00 00   (
               85 a4 03 d0 5f 6d f3 c9 c8 7f cc 97 99 7c a4 aa   _ m |
               b6 83 6c b9 d2 2c 78 a0 5c 3b 04 ab 87 0a 07 3f   l , x \ ; ?
               3c 53 6f dc a3 16 83 9a                           < S o
    # File Content Protection Information
    major version = 2
    minor version = 0
    flags = 00000000000000000000000000000000 (00000000)
    persistent class = 2 (PROTECTION_CLASS_B)
    key size = 40
    persistent key = 85a403d05f6df3c9c87fcc97997ca4aab6836cb9
                     d22c78a05c3b04ab870a073f3c536fdca316839a
    ...
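The raw and interpreted views in the content protection example agree if the 56 bytes of attrData are read as a small little-endian header followed by the key. The Python sketch below decodes exactly those bytes. The field layout (two 16-bit version fields, then 32-bit flags, class, and key size) is an assumption inferred from the sample output, not something taken from fileXray, and `parse_cprotect` is a hypothetical helper name.

```python
import struct

# attrData bytes copied from the sample output above (56 bytes in total).
ATTR_DATA = bytes.fromhex(
    "02000000" "00000000" "02000000" "28000000"
    "85a403d05f6df3c9c87fcc97997ca4aa"
    "b6836cb9d22c78a05c3b04ab870a073f"
    "3c536fdca316839a"
)

# Assumed header layout, little-endian and unpadded:
# major (u16), minor (u16), flags (u32), persistent class (u32), key size (u32).
HEADER = struct.Struct("<HHIII")


def parse_cprotect(data):
    """Split a cprotect attribute value into its header fields and key."""
    major, minor, flags, persistent_class, key_size = HEADER.unpack_from(data, 0)
    key = data[HEADER.size:HEADER.size + key_size]
    if len(key) != key_size:
        raise ValueError("attribute data is shorter than the declared key size")
    return {
        "major_version": major,
        "minor_version": minor,
        "flags": flags,
        "persistent_class": persistent_class,
        "key_size": key_size,
        "persistent_key": key,
    }


info = parse_cprotect(ATTR_DATA)
print(info["major_version"], info["persistent_class"], info["key_size"])
print(info["persistent_key"].hex())
```

The decoded values can be compared by eye with the “File Content Protection Information” lines that fileXray prints for the same record.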
- The primary answer is that it is not meaningful to compare them. Dramatically speaking, hfsdebug is the tip of the iceberg that is fileXray. It would be contrived to say that a bicycle is similar to a fighter jet because they both have wheels and can both be used to get from point A to point B. Similarly, it would not be useful to do a point-by-point differentiation between fileXray and hfsdebug even though they both “do things with HFS+ volumes.” hfsdebug’s functionality is a small, strict subset of fileXray’s functionality. Still, to list a few things for the sake of this post, fileXray has things like the following that hfsdebug does not:
- Comprehensive forensics features
- Ability to detect and parse GPT, APM, and MBR partition types
- Large number (28 at the time of this writing) of built-in filters meant for power users, security folks, and forensics professionals
- Built-in virtual file systems for performing advanced analyses
- Checksumming of various types of volume data
- Disk usage analytics
- Volume content retrieval
- Decompression of HFS+ compressed content
- Support for fork-based extended attributes
- Support for external journals
- Ability to harvest recently used file system object names from the journal
- Ability to trawl for volume content via arbitrarily extensible signature matching
- Ability to scavenge for deleted content
- Real-time file activity monitoring
- Ability to reverse map bytes to files
- In terms of code-base sizes, hfsdebug is about 12,000 lines of source code, whereas fileXray is about 105,000 lines. (For those prone to misreading numbers, fileXray has a code-base that is nearly 9 times larger.)
- In case of the few things that both hfsdebug and fileXray can do, fileXray can be up to 20 times faster than hfsdebug. In other words, what could take up to several minutes with hfsdebug may be done by fileXray in just a few seconds.
- fileXray comes as a Universal Binary consisting of 64-bit Intel, 32-bit Intel, and 32-bit PowerPC versions, whereas hfsdebug is 32-bit PowerPC only and must be run under Rosetta on Intel Macs.
- fileXray works on Mac OS X versions 10.5 (Leopard) and newer, including 10.7 (Lion). hfsdebug works on Mac OS X versions 10.4 (Tiger) through 10.6 (Snow Leopard), assuming you have Rosetta installed. Not all features of hfsdebug work on all supported versions of Mac OS X, however.
- Some features of fileXray are simply too powerful and too cool to not be reiterated. For example, you must check out the Scavenger File System and the Arbitrary File System. The former lets you mount scavengable (deleted but recoverable) content as a volume, allowing you to access such content through applications of your choice! The latter lets you access arbitrary byte ranges on a volume through a convenient interface.
- fileXray is a current product, whereas hfsdebug has now been retired because of flaws that cause it to display incorrect results under certain circumstances.
To sum up, fileXray is not a better hfsdebug, but a different beast altogether.
fileXray’s goal is to help you determine “anything” and “everything” about an HFS+ volume.
Someone recently asked if it is possible to locate all orphaned files on a volume. Indeed, it is possible.
HFS+ allows you to unlink a file or a folder while it is busy. Normally, such objects are permanently removed at unmount time. However, if the volume was not cleanly unmounted, such unlinked-but-not-removed objects can persist. In such cases, when the volume is subsequently mounted in read/write mode, the file system implementation will look for orphans and remove any that are found.
Orphaned objects have the temp prefix in their names and they reside in the HFS+ Private Data folder. You can enumerate orphans and view the space consumed by them as follows. We use the root volume in this example.
First, we determine the Catalog Node ID (CNID) of the HFS+ Private Data Folder.
    $ sudo fileXray --list /
    ...
    18 d--------- ---u root wheel \x0000\x0000\x0000\x0000HFS+ Private Data/
Next, we use the CNID to list the contents of this special folder and filter by the temp prefix.
    $ sudo fileXray --cnid 18 --list | grep temp
    CNID    mode       @+SUZ user  group data  rsrc name
    ...
    8433559 -rw------- @---  singh staff 51 KB 0    temp8433559
    8433588 -rw-r--r-- @---  singh staff 54 KB 0    temp8433588
    ...
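To turn such a listing into a rough total of the space held by orphans, the data and rsrc columns can be summed. The Python sketch below does that for the two sample lines shown in the listing. The column layout is read off that sample, the decimal unit multipliers are an assumption, and `orphan_sizes` is a hypothetical helper; because the listing shows rounded sizes, the total is only approximate.

```python
import re

# Two lines copied from the sample listing above.
LISTING = """\
8433559 -rw------- @--- singh staff 51 KB 0 temp8433559
8433588 -rw-r--r-- @--- singh staff 54 KB 0 temp8433588
"""

# Assumption: sizes use decimal multipliers (KB = 1000 bytes).
UNITS = {None: 1, "bytes": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

# A size is a number optionally followed by a unit, e.g. "51 KB" or "0".
SIZE = r"(\d+(?:\.\d+)?)(?:\s+(bytes|KB|MB|GB|TB))?"
LINE = re.compile(
    r"^(\d+)\s+\S+\s+\S+\s+\S+\s+\S+\s+" + SIZE + r"\s+" + SIZE + r"\s+(temp\d+)$"
)


def orphan_sizes(listing):
    """Map each orphan name to its approximate size in bytes (data + rsrc)."""
    sizes = {}
    for line in listing.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # header lines, ellipses, and anything unexpected
        _cnid, data_n, data_u, rsrc_n, rsrc_u, name = match.groups()
        data = round(float(data_n) * UNITS[data_u])
        rsrc = round(float(rsrc_n) * UNITS[rsrc_u])
        sizes[name] = data + rsrc
    return sizes


sizes = orphan_sizes(LISTING)
total = sum(sizes.values())
print(sizes, total)
```

Lines that do not look like orphan entries are skipped rather than reported, so the header row and the elided portions of the output do no harm.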
Besides its other capabilities, fileXray has an extensive feature set geared for HFS+ file system forensics. This is a quick overview of the relevant features—details can be found in the fileXray User Guide and Reference ebook.
- To begin with, the --disallow_mounting option provides a convenient solution to an often-cited problem: that of preventing volumes on external devices from being automatically mounted when the devices are connected to the computer. The --disallow_mounting option lets you temporarily disable such automatic mounting without having to remove or rename any configuration files and without having to stop any system daemons.
- The --journal_names option dissects the volume’s journal and harvests file system object names. When displaying the harvested names, it annotates the output with the type of file system activity that’s likely to have occurred involving each name: for example, whether an object with that name was deleted, moved, renamed, and so on. When run with an additional option, --journal_names “diffs” the journal and volume copies of the blocks recorded in the journal and indicates which parts of the metadata, if any, have changed.
- The --trawl option scans the volume looking for blocks that match “magic” patterns (signatures). This option uses the same magic mechanism that underlies the file command on Mac OS X. The set of signatures is easily extensible by the user.
- The --scavenge option scans the volume looking for deleted files and folders. The result of the scavenge operation is a list of potentially recoverable files, along with their metadata details, including which of the deleted blocks are likely to have been overwritten. It also allows you to “undelete” such scavenged files, if possible. The Scavenger File System provides a virtual file system view of the results of scavenging a volume, so you can actually inspect scavengable data using tools of your choice!
- The Free Space File System provides a convenient way to identify and search through the free extents of a volume. The analog for used extents is the Used Space File System.
- The Arbitrary File System provides a novel and powerful way of accessing arbitrary byte ranges on a given storage device.
- fileXray filters can be used to search a volume for objects with specific attributes. In particular, the bmactime family of filters can be used to search for objects one or more of whose timestamps fall within a given range. The result of the bmactime filter can provide a “timeline” view of past file system activity.
- The --checksum option can be used to compute hashes of one or more on-disk components of file system objects.
- fileXray provides several ways to read content—both metadata and data—from an HFS+ volume regardless of whether the volume is online or offline.
fileXray contains over two dozen built-in “filters” that allow you to locate file system objects on an HFS+ volume using a variety of criteria. A filter is a piece of code that gets executed by fileXray for each file system object as fileXray rapidly runs through the entire file system hierarchy of an HFS+ volume. One of these is the Mach-O filter, which lists all Mach-O files on a given volume.
The Mach-O filter differs from most fileXray filters in that it examines file content along with file metadata. After identifying a file as a Mach-O file, the filter further examines it to check if it is a multi-architecture (“fat”) file. If so, the filter calculates the logical size of each architecture-specific thin subfile contained within the fat file. The filter enumerates the following architectures: i386, x86_64, ppc, ppc64, and arm. Any other architecture found in a Mach-O file (say, a NeXT binary, if you have some lying around on the volume) is categorized as “other”. You can optionally specify a list of specific architectures as a filter argument, in which case the filter would only show files containing at least one of the specified architectures.
Using the Mach-O filter, you can answer questions such as the following.
- Does this HFS+ volume, which is meant for storing media files, contain any Mach-O executables?
- Are there any ARM executables or libraries on this volume?
- What is the total logical space being consumed by PowerPC executables and libraries on this volume?
The following is an example of using the Mach-O filter.
    $ sudo fileXray --filter builtin:macho --volume /Volumes/MacHD
    i386  x86_64  ppc   ppc64  arm   other  path
    …
    592   504     668   0      584   0      MacHD:/usr/lib/bundle1.o
    888   936     972   0      1072  0      MacHD:/usr/lib/crt1.10.5.o
    688   816     696   0      880   0      MacHD:/usr/lib/crt1.10.6.o
    2112  1840    3480  0      1620  0      MacHD:/usr/lib/crt1.o
    ...
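The per-architecture sizes in such a listing correspond to the entries of a fat file’s header. As a rough illustration of the idea, and not of fileXray’s own implementation, the Python sketch below reads the standard big-endian fat header and sums the thin-file sizes per architecture. `thin_sizes` is a hypothetical helper, and the demo header is synthetic, built with the sizes shown for bundle1.o.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # big-endian magic of a multi-architecture file
# (Java class files share this magic; a real tool would check further.)

# CPU type values from the Mach-O headers.
CPU_NAMES = {
    7: "i386",
    0x01000007: "x86_64",
    18: "ppc",
    0x01000012: "ppc64",
    12: "arm",
}


def thin_sizes(blob):
    """Return {architecture: total thin size} for a fat header, else {}."""
    if len(blob) < 8:
        return {}
    magic, nfat_arch = struct.unpack_from(">II", blob, 0)
    if magic != FAT_MAGIC:
        return {}
    if len(blob) < 8 + 20 * nfat_arch:
        raise ValueError("fat header is truncated")
    sizes = {}
    for index in range(nfat_arch):
        # Each entry: cputype, cpusubtype, offset, size, align (5 x 32-bit).
        cputype, _subtype, _offset, size, _align = struct.unpack_from(
            ">IIIII", blob, 8 + 20 * index
        )
        name = CPU_NAMES.get(cputype, "other")
        sizes[name] = sizes.get(name, 0) + size
    return sizes


# Synthetic header only (no file content), using the bundle1.o sizes above.
entries = [(7, 592), (0x01000007, 504), (18, 668), (12, 584)]
demo = struct.pack(">II", FAT_MAGIC, len(entries)) + b"".join(
    struct.pack(">IIIII", cputype, 0, 0, size, 0) for cputype, size in entries
)
print(thin_sizes(demo))
```

Anything whose CPU type is not in the table lands in “other”, mirroring the catch-all column in the filter’s output.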
There are times when you really must be able to just manually “go through” the free (unallocated) space in a volume. Perhaps you are an end user who wants to look for lost data using some unusual technique. Perhaps you are a forensics or security professional who wants a convenient and easy mechanism to isolate the free extents of an HFS+ volume, and then be able to examine those extents using tools of your choice. The Free Space File System (FreespaceFS), one of fileXray’s built-in virtual file systems, provides just that mechanism.
Simply put, FreespaceFS contains virtual files that represent the free extents of a given HFS+ volume. The idea is to isolate free space in easy-to-read contiguous chunks, exposing each chunk as a virtual file that can be normally read, which makes searching through free space much more convenient and faster in most cases.
When you mount an HFS+ volume through FreespaceFS, a top-level virtual directory called freespace in the resultant volume contains one or more virtual subdirectories whose names are of the format X_Y, where X is a monotonically increasing decimal number starting at 0 and Y represents a block number in hexadecimal, which is the starting block number of the first extent within that directory. Consider the following example.
    # Create a mount point.
    $ mkdir /Volumes/freespace
    # Use the Free Space File System to mount the root volume.
    $ sudo fileXray --userfs_type freespace --userfs_mount /Volumes/freespace
    $ ls -las /Volumes/freespace/freespace/
    total 0
    0 drwxr-xr-x    38 root wheel 0 Nov 2 20:58 .
    0 drwxr-xr-x     3 root wheel 0 Nov 2 20:58 ..
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000000_00014d87
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000001_003e4772
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000002_004e8ae8
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000003_00550783
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000004_005b8023
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000005_00bda9bd
    …
    0 dr-xr-xr-x  1026 root wheel 0 Nov 2 20:58 00000034_02d66310
    0 dr-xr-xr-x   389 root wheel 0 Nov 2 20:58 00000035_02ec061a
Inside each such directory named X_Y, there are at most 1024 virtual files; a new directory is created after the previous one is populated with 1024 files. Each file represents a free extent, that is, a range of contiguous free blocks. Each file’s name is of the form U-V, where U is the extent’s starting block number and V is the block number at which the extent ends (in the listing below, each file’s size works out to V − U + 1 blocks of 4 KB). Both U and V are represented in hexadecimal. As noted earlier, the value of U for the first extent contained within the X_Y directory is the same as the value of Y. Reading from such a file will return data from the volume blocks the file represents. The following excerpt shows the last few contents of the last X_Y directory in the above example.
    $ ls -asl /Volumes/freespace/freespace/00000035_02ec061a
    …
         848 -rw-r--r-- 1 root wheel 424K Nov 2 21:18 02f0db4e-02f0dbb7
          88 -rw-r--r-- 1 root wheel  44K Nov 2 21:18 02f0dc69-02f0dc73
        2272 -rw-r--r-- 1 root wheel 1.1M Nov 2 21:18 02f0dc77-02f0dd92
      114176 -rw-r--r-- 1 root wheel  56M Nov 2 21:18 02f0dd94-02f11553
      530184 -rw-r--r-- 1 root wheel 259M Nov 2 21:18 02f11556-02f21836
       16608 -rw-r--r-- 1 root wheel 8.1M Nov 2 21:18 02f21840-02f2205b
    92345904 -rw-r--r-- 1 root wheel  44G Nov 2 21:18 02f2205d-03a24322
    $
Note that the last file, 02f2205d-03a24322, contains about 44 GB of free space.
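The file names in this listing can be turned back into extent sizes with a few lines of Python. Judging by the sample output, each file’s size matches (V − U + 1) allocation blocks of 4,096 bytes, so this sketch treats V as the extent’s last block. Both that inclusive-end reading and the block size are assumptions inferred from the listing, and the helper names are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed allocation block size, inferred from the listing


def parse_extent_name(name):
    """Split a 'U-V' file name into (start_block, block_count)."""
    start_hex, end_hex = name.split("-")
    start = int(start_hex, 16)
    end = int(end_hex, 16)
    if end < start:
        raise ValueError(f"extent {name!r} ends before it starts")
    # Assumption: V is the last block of the extent, inclusive.
    return start, end - start + 1


def extent_bytes(name, block_size=BLOCK_SIZE):
    """Return the size in bytes of the free extent a file name describes."""
    _start, count = parse_extent_name(name)
    return count * block_size


for name in ("02f0db4e-02f0dbb7", "02f0dc69-02f0dc73", "02f2205d-03a24322"):
    start, count = parse_extent_name(name)
    print(name, hex(start), count, extent_bytes(name))
```

The byte counts can be cross-checked against the first column of the ls output, which reports each file’s size in 512-byte units.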
In case you are wondering if there is an analog for used extents, the answer is yes: fileXray also provides a “Used Space File System” that exposes the in-use (allocated) extents of an HFS+ volume as virtual files.
By default, the Disk Arbitration mechanism in Mac OS X probes newly discovered storage devices for mountable volumes. Mounting an HFS+ volume in read-write mode, which is the default, will modify the volume in question because both low-level and high-level file system activity can occur at mount time. For example, timestamps and counters can get updated, the journal can get replayed, file system objects can get created or deleted, and so on. This is highly undesirable if you wish to attach and access the storage device for recovery or forensic purposes or otherwise wish to keep it unmodified.
With fileXray, you can not only fully analyze an unmounted (offline) volume, you can also prevent the volume of interest from being automatically mounted as you attach the corresponding storage device to the computer. When the --disallow_mounting option is used, fileXray will wait for a specified number of seconds, during which time any new volumes that appear will not be allowed to automatically mount. However, the devices underlying these volumes will be allowed to attach, which in turn means that you can use fileXray on the devices. As a device attaches, fileXray will print the corresponding block device name(s) and, if possible, the corresponding file system type(s) and volume name(s).
In the following example, we use fileXray to disallow automatic mounting for 60 seconds. While mounting is disabled, we attach an external disk drive containing a GUID Partition Table with four volumes on it. We see that fileXray prints information about each volume as that volume’s mounting is attempted by the system. Since the volumes are attached, we can now use fileXray on the corresponding device names.
    $ fileXray --disallow_mounting 60
    Disallowing mounting for 60 seconds.
    ...
    # Now attach an external device
    ...
    disk1s2 hfs   Untitled 2
    disk1s4 hfs   Untitled 4
    disk1s3 hfs   Untitled 3
    disk1s1 msdos UNTITLED 1
One of fileXray’s features is that it uses virtual file systems to provide access to certain types of volume information.
The Trawling for Data blog post contained a mention of ArbitraryFS, which is one of the several such file systems built into fileXray. Let us look at ArbitraryFS in a little more detail.
ArbitraryFS contains no visible files by default. The only files that will show up in an ArbitraryFS volume are those that you ask for specifically by name. The name is special in that it must encode a starting byte offset and a size. When you do access a file with such an encoded name, the corresponding content will be transparently made available through that file. In the examples here, the name consists of the starting byte offset, a comma, and the size, optionally followed by an extension.
For example, if you open a file called 0x10000,65536.pdf, you will “see” a file whose content comes from the HFS+ volume’s on-disk byte range that starts at byte 0x10000 and is 65536 bytes in size. The extension is irrelevant to ArbitraryFS, but may be of use to the application you use to access the file in question. (For example, you can directly open the aforementioned PDF “file” using the Mac OS X Preview application.)
The starting byte offset (START_BYTE) can be negative, in which case the starting offset is relative to the end of the volume. You can also specify multiple extents using the colon character as the separator, so that a single name provides content from several byte ranges.
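The naming rules can be made concrete with a small Python sketch that parses such a name and reads the corresponding bytes from an ordinary file. This only emulates the idea and is not how fileXray is implemented: `parse_ranges` and `read_ranges` are hypothetical helpers, and returning multiple extents concatenated in the order given is an assumption.

```python
import os
import tempfile


def parse_ranges(name, total_size):
    """Turn 'START,SIZE[:START,SIZE...][.ext]' into [(offset, size), ...]."""
    stem = name.split(".", 1)[0]  # the extension carries no meaning here
    ranges = []
    for part in stem.split(":"):
        start_text, size_text = part.split(",")
        start = int(start_text, 0)  # base 0 accepts decimal and 0x-prefixed hex
        size = int(size_text, 0)
        if start < 0:
            start += total_size  # negative offsets count from the end
        if start < 0 or size < 0 or start + size > total_size:
            raise ValueError(f"range {part!r} falls outside the volume")
        ranges.append((start, size))
    return ranges


def read_ranges(path, name):
    """Read the byte ranges that `name` encodes from the file at `path`."""
    total_size = os.path.getsize(path)
    chunks = []
    with open(path, "rb") as source:
        for start, size in parse_ranges(name, total_size):
            source.seek(start)
            chunks.append(source.read(size))
    return b"".join(chunks)  # assumption: extents are simply concatenated


# Demo on a throwaway 4096-byte file standing in for a volume.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 16)
    demo_path = tmp.name

first = read_ranges(demo_path, "0x10,4.bin")
tail = read_ranges(demo_path, "-4,4")
both = read_ranges(demo_path, "0,2:-2,2")
os.remove(demo_path)
print(first.hex(), tail.hex(), both.hex())
```

Using base 0 for the integer conversion is what lets the same parser accept both 0x10000 and 65536 in one name.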
Consider the following example. We can use ArbitraryFS to mount an HFS+ volume on, say, /Volumes/arbitrary. We know that the HFS+ volume header is 512 bytes in size and resides at an offset of 1024 bytes from the beginning of the volume. The Alternate Volume Header, which is a reasonably-in-sync copy of the Volume Header, resides at an offset of 1024 bytes from the end of the volume. Given that ArbitraryFS lets us access arbitrary byte ranges with either positive or negative offsets, we can read a few bytes from these two data structures and see if we get expected values for the volume signature and the last mounted version signature fields.
    # Create a mount point.
    $ mkdir /Volumes/arbitrary
    # Use the Arbitrary File System to mount the root volume.
    $ sudo fileXray --userfs_type arbitrary --userfs_mount /Volumes/arbitrary
    $ ls -las /Volumes/arbitrary
    total 0
    0 drwxr-xr-x   2 root wheel    0 Nov 2 17:35 .
    0 drwxrwxrwt  33 root wheel 1122 Nov 2 18:22 ..
    $
    # Look at the first 12 bytes of the Volume Header, which is 512 bytes in size
    # and resides at an offset of 1024 bytes from the beginning of the volume.
    $ hexdump -n 12 -xc /Volumes/arbitrary/1024,512
    0000000    2b48    0400    0080    0020    4648    4a53
    0000000   H   +  \0 004 200  \0      \0   H   F   S   J
    000000c
    # Look at the first 12 bytes of the Alternate Volume Header, which resides
    # at an offset of 1024 bytes from the end of the volume.
    $ hexdump -n 12 -xc /Volumes/arbitrary/-1024,512
    0000000    2b48    0400    0080    0020    4648    4a53
    0000000   H   +  \0 004 200  \0      \0   H   F   S   J
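To see what those 12 bytes mean, they can be unpacked according to the start of the HFS+ volume header: a 2-byte signature, a 16-bit version, a 32-bit attributes field, and a 4-byte last mounted version, all big-endian. The Python sketch below decodes the bytes transcribed from the hexdump above; `decode_header_prefix` is a hypothetical helper name.

```python
import struct

# The 12 bytes shown by hexdump, in on-disk order. hexdump -x prints 16-bit
# words in host byte order, so "2b48" on a little-endian machine is 48 2b.
HEADER_PREFIX = bytes.fromhex("482b" "0004" "80002000" "4846534a")

JOURNALED_MASK = 1 << 13  # volume-journaled bit in the attributes field


def decode_header_prefix(data):
    """Decode the first four fields of an HFS+ volume header."""
    signature, version, attributes, last_mounted = struct.unpack(
        ">2sHI4s", data[:12]
    )
    return {
        "signature": signature.decode("ascii"),
        "version": version,
        "attributes": attributes,
        "journaled": bool(attributes & JOURNALED_MASK),
        "last_mounted_version": last_mounted.decode("ascii"),
    }


fields = decode_header_prefix(HEADER_PREFIX)
print(fields)
```

The same function applies unchanged to the bytes read from the Alternate Volume Header, since the hexdump shows identical values at both locations.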
The idea behind ArbitraryFS is simple yet powerful: make arbitrary byte ranges on a volume accessible easily and conveniently regardless of which tools you use to access them. You can use something like hexdump to examine content in place; you can use cp to copy content out; you can use a symbolic link as a mnemonic to an ArbitraryFS virtual file. The last technique is useful when you use ArbitraryFS to examine matches found by fileXray’s trawling process.
Given the general utility of ArbitraryFS, fileXray allows you to mount non-HFS+ entities such as other volume types and even regular files. The following example shows “mounting” a Mach-O executable using ArbitraryFS. Once mounted, the well-defined constituents of the file can be directly accessed by addressing them through file names consisting of the constituents’ offsets and sizes.
    # Create a mount point.
    $ mkdir /Volumes/arbitrary
    # Use the Arbitrary File System to mount a file instead of an HFS+ volume.
    # The --force option will enable such mounting despite the warning.
    $ sudo fileXray --userfs_type arbitrary --userfs_mount /Volumes/arbitrary \
        --device /mach_kernel --force
    No flavor of HFS+ found.
    $
    $ ls -l /mach_kernel
    -rw-r--r--@ 1 root wheel 18672224 Jul 31 22:49 /mach_kernel
    $ df -h /Volumes/arbitrary
    Filesystem      Size  Used  Avail Capacity Mounted on
    fileXray@fuse0  18Mi  18Mi  0Bi   100%     /private/tmp/arbitrary
    $
    # Look for something interesting in the file. For example, the __cstring
    # section in the text segment of this Mach-O file.
    $ otool -l /mach_kernel
    …
    sectname __cstring
    segname __TEXT
    addr 0x005832c8
    size 0x00057def
    offset 3683016
    …
    # Access the __cstring section “directly” as an on-the-fly file through
    # the Arbitrary File System.
    $ hexedit /Volumes/arbitrary/3683016,0x57def
    00000000 68 28 25 73 25 64 29 20 69 66 6E 65 74 5F 64 65 h(%s%d) ifnet_de
    00000010 74 61 63 68 5F 70 72 6F 74 6F 63 6F 6C 20 66 61 tach_protocol fa
    00000020 69 6C 65 64 2C 20 25 64 0A 00 25 73 25 64 3A 20 iled, %d..%s%d:
    00000030 25 73 20 77 61 6B 65 75 70 0A 00 00 00 69 66 5F %s wakeup....if_
    …
Note that ArbitraryFS does not cache its content. Therefore, reading its virtual files will always retrieve the “latest” content.