One of fileXray’s features is that it uses virtual file systems to provide access to certain types of volume information.
The Trawling for Data blog post mentioned ArbitraryFS, one of several such file systems built into fileXray. Let us look at ArbitraryFS in a little more detail.
ArbitraryFS contains no visible files by default. The only files that will show up in an ArbitraryFS volume are those that you ask for specifically by name. The name is special in that it must encode a starting byte offset and a size. When you do access a file with such an encoded name, the corresponding content will be transparently made available through that file. The specific naming format is:
[-]START_BYTE,SIZE_IN_BYTES[.extension]
For example, if you open a file called 0x10000,65536.pdf, you will “see” a file whose content comes from the HFS+ volume’s on-disk byte range that starts at byte 0x10000 and is 65536 bytes in size. The extension is irrelevant to ArbitraryFS, but may be of use to the application you use to access the file in question. (For example, you can directly open the aforementioned PDF “file” using the Mac OS X Preview application.)
Optionally, START_BYTE can be negative, in which case the starting offset is relative to the end of the volume. You can also specify multiple extents using the colon character as the separator. For example, the following name will provide content from three byte ranges:
0x5000,4096:0x9000,4096:0xc000,4096.txt
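Multi-extent names are easy to mistype, so it can help to build them with a small shell function. The following is a minimal sketch: arbfs_name is a hypothetical helper, not part of fileXray, and it only joins the pieces according to the format described above without validating them.

```shell
# Hypothetical helper (not part of fileXray): build an ArbitraryFS file name.
# Usage: arbfs_name EXTENSION START,SIZE [START,SIZE ...]
# Pass an empty string as EXTENSION to omit the extension.
arbfs_name() {
    local ext="$1"
    shift
    local IFS=:            # "$*" joins the remaining arguments with ':'
    local name="$*"
    if [ -n "$ext" ]; then
        name="$name.$ext"
    fi
    printf '%s\n' "$name"
}

# Example:
#   arbfs_name txt 0x5000,4096 0x9000,4096 0xc000,4096
# prints:
#   0x5000,4096:0x9000,4096:0xc000,4096.txt
```

The resulting string can then be appended to the mount point path and handed to any tool that opens files by name.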
Consider the following example. We can use ArbitraryFS to mount an HFS+ volume on, say, /Volumes/arbitrary. We know that the HFS+ volume header is 512 bytes in size and resides at an offset of 1024 bytes from the beginning of the volume. The Alternate Volume Header, which is a reasonably-in-sync copy of the Volume Header, resides at an offset of 1024 bytes from the end of the volume. Given that ArbitraryFS lets us access arbitrary byte ranges with either positive or negative offsets, we can read a few bytes from these two data structures and see if we get expected values for the volume signature and the last mounted version signature fields.
# Create a mount point.
$ mkdir /Volumes/arbitrary
# Use the Arbitrary File System to mount the root volume.
$ sudo fileXray --userfs_type arbitrary --userfs_mount /Volumes/arbitrary
$ ls -las /Volumes/arbitrary
total 0
0 drwxr-xr-x 2 root wheel 0 Nov 2 17:35 .
0 drwxrwxrwt 33 root wheel 1122 Nov 2 18:22 ..
$
# Look at the first 12 bytes of the Volume Header, which is 512 bytes in size
# and resides at an offset of 1024 bytes from the beginning of the volume.
$ hexdump -n 12 -xc /Volumes/arbitrary/1024,512
0000000 2b48 0400 0080 0020 4648 4a53
0000000 H + \0 004 200 \0 \0 H F S J
000000c
# Look at the first 12 bytes of the Alternate Volume Header, which resides
# at an offset of 1024 bytes from the end of the volume.
$ hexdump -n 12 -xc /Volumes/arbitrary/-1024,512
0000000 2b48 0400 0080 0020 4648 4a53
0000000 H + \0 004 200 \0 \0 H F S J
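Rather than reading the hexdump output by eye, the two fields can be pulled out with dd. The following is a sketch under stated assumptions: vh_fields is a hypothetical helper, not part of fileXray, and it relies on the HFS+ volume header layout in which the 2-byte signature sits at offset 0 and the 4-byte last mounted version sits at offset 8, which matches the dumps above.

```shell
# Hypothetical helper (not part of fileXray): print two fields of an HFS+
# volume header read from the file given as $1.
#   signature           2 bytes at offset 0 ("H+" for HFS+, "HX" for HFSX)
#   lastMountedVersion  4 bytes at offset 8 ("HFSJ" in the dumps above)
vh_fields() {
    local sig ver
    sig=$(dd if="$1" bs=1 count=2 2>/dev/null)
    ver=$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)
    printf 'signature=%s lastMountedVersion=%s\n' "$sig" "$ver"
}

# Example, assuming the ArbitraryFS mount from the transcript is still active:
#   vh_fields /Volumes/arbitrary/1024,512
#   vh_fields /Volumes/arbitrary/-1024,512
```

If the Volume Header and the Alternate Volume Header are in sync for these fields, both invocations should print the same line.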
The idea behind ArbitraryFS is simple yet powerful: make arbitrary byte ranges on a volume easily and conveniently accessible, regardless of which tools you use to access them. You can use something like cat or hexdump to examine content in place; you can use cp to copy content out; you can use a symbolic link as a mnemonic for an ArbitraryFS virtual file. The symbolic link technique is useful when you use ArbitraryFS to examine matches found by fileXray’s trawling process.
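As an illustration of the symbolic link idea, here is a minimal sketch. arbfs_bookmark is a hypothetical helper, not part of fileXray; it only wraps ln -s, and the mount point and byte range in the usage comments are the ones from the Volume Header example.

```shell
# Hypothetical helper (not part of fileXray): give a byte range a readable name
# by creating a symbolic link that points at the ArbitraryFS virtual file.
# Usage: arbfs_bookmark MOUNT_POINT RANGE_NAME LINK_PATH
arbfs_bookmark() {
    local mount="$1" range="$2" link="$3"
    ln -s "$mount/$range" "$link"
}

# Example, assuming ArbitraryFS is mounted on /Volumes/arbitrary:
#   arbfs_bookmark /Volumes/arbitrary 1024,512 ~/volume_header
#   hexdump -n 12 -xc ~/volume_header
#
# Copying content out needs no helper at all:
#   cp /Volumes/arbitrary/1024,512 ~/volume_header.bin
```

The link itself stores only the path, so it can be created before the volume is mounted and stays valid across remounts on the same mount point.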
Given the general utility of ArbitraryFS, fileXray allows you to mount non-HFS+ entities such as other volume types and even regular files. The following example shows “mounting” a Mach-O executable using ArbitraryFS. Once mounted, the well-defined constituents of the file can be directly accessed by addressing them through file names consisting of the constituents’ offsets and sizes.
# Create a mount point.
$ mkdir /Volumes/arbitrary
# Use the Arbitrary File System to mount a file instead of an HFS+ volume.
# The --force option will enable such mounting despite the warning.
$ sudo fileXray --userfs_type arbitrary --userfs_mount /Volumes/arbitrary \
--device /mach_kernel --force
No flavor of HFS+ found.
$
$ ls -l /mach_kernel
-rw-r--r--@ 1 root wheel 18672224 Jul 31 22:49 /mach_kernel
$ df -h /Volumes/arbitrary
Filesystem Size Used Avail Capacity Mounted on
fileXray@fuse0 18Mi 18Mi 0Bi 100% /private/tmp/arbitrary
$
# Look for something interesting in the file. For example, the __cstring
# section in the text segment of this Mach-O file.
$ otool -l /mach_kernel
…
sectname __cstring
segname __TEXT
addr 0x005832c8
size 0x00057def
offset 3683016
…
# Access the __cstring section “directly” as an on-the-fly file through
# the Arbitrary File System.
$ hexedit /Volumes/arbitrary/3683016,0x57def
00000000 68 28 25 73 25 64 29 20 69 66 6E 65 74 5F 64 65 h(%s%d) ifnet_de
00000010 74 61 63 68 5F 70 72 6F 74 6F 63 6F 6C 20 66 61 tach_protocol fa
00000020 69 6C 65 64 2C 20 25 64 0A 00 25 73 25 64 3A 20 iled, %d..%s%d:
00000030 25 73 20 77 61 6B 65 75 70 0A 00 00 00 69 66 5F %s wakeup....if_
…
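The file name in the transcript mixes a decimal offset with a hexadecimal size, exactly as otool reports them. Note that it is built from the section’s file offset, not from its addr field. If you prefer a uniform decimal name, shell arithmetic can do the conversion; the following sketch uses a hypothetical helper, arbfs_range, that is not part of fileXray.

```shell
# Hypothetical helper (not part of fileXray): turn a file offset and a size,
# each given in decimal or 0x-prefixed hex, into a decimal "START,SIZE" name.
# Caution: shell arithmetic treats a decimal number with a leading 0 as octal.
arbfs_range() {
    printf '%s,%s\n' "$(( $1 ))" "$(( $2 ))"
}

# Example with the __cstring values from the otool output:
#   arbfs_range 3683016 0x00057def
# prints:
#   3683016,359919
```

The section ends at byte 3683016 + 359919 = 4042935, which is well within the 18672224-byte /mach_kernel file shown by ls.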
Note that ArbitraryFS does not cache its content. Therefore, reading its virtual files will always retrieve the “latest” content.