fileXray Example: Trawling for Data

fileXray provides several ways of looking for elusive or missing data on an HFS+ volume. One of them is trawling: fileXray scans a volume for blocks that match “magic” patterns (signatures) contained in a given query file. You don’t usually need to come up with the patterns yourself, because fileXray understands the same “magic” mechanism that underlies the file command in Mac OS X. The /usr/share/file/magic/ system directory contains numerous magic pattern files.

By default, fileXray will scan the free extents of a volume using magic pattern(s) from a given input query file to match against each block. (Optionally, you can tell fileXray to look at every block—free or not—of a volume.) This way, you can trawl the volume looking for, say, PDF documents or JPEG images. You can use any of the pattern files found in /usr/share/file/magic/, which cumulatively contain thousands of patterns to identify file types. You can also concatenate two or more pattern files to provide a larger pattern set. Moreover, you can create your own patterns using the format described in the magic(5) man page. In the following example, the match indicates that byte offset 0x3ad000 on the volume PreciousHD marks the beginning of a PDF document.

# Trawl free extents on PreciousHD looking for PDF documents.
$ fileXray --volume /Volumes/PreciousHD --trawl /usr/share/file/magic/pdf
...
0x3ad000              PDF document, version 1.6
...
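
To make the idea of block-by-block matching concrete, here is a minimal shell sketch that works on a disk image file. It is not how fileXray is implemented: it checks one hard-coded signature (the %PDF- bytes that open a PDF file) at the start of every block, whereas fileXray evaluates full magic(5) patterns and, by default, visits only free extents. The function name scan_blocks and the 4096-byte default block size are assumptions made for illustration.

```shell
# scan_blocks IMAGE [BLOCK_SIZE]
# Hypothetical helper: print the byte offset of every block in IMAGE
# that begins with the PDF signature "%PDF-" (hex 25 50 44 46 2d).
scan_blocks() {
    image=$1
    bs=${2:-4096}                       # assumed block size
    size=$(( $(wc -c < "$image") ))     # image size in bytes
    offset=0
    while [ "$offset" -lt "$size" ]; do
        # Read the first 5 bytes of the block and render them as hex,
        # which avoids putting raw binary data into a shell variable.
        sig=$(dd if="$image" bs=1 skip="$offset" count=5 2>/dev/null |
              od -An -tx1 | tr -d ' \t\n')
        if [ "$sig" = "255044462d" ]; then
            printf '0x%x\tPDF document (signature match)\n' "$offset"
        fi
        offset=$(( offset + bs ))
    done
}
```

Calling scan_blocks on an image file prints one line per block that starts with the signature, in the same offset-then-description layout as the trawl output above.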

Suppose we wish to look for pictures in some common image file formats (GIF, JPEG, PNG, TIFF, etc.) within the free blocks of a volume. The standard pattern file /usr/share/file/magic/images contains several predefined patterns that suit our needs. We can combine that file with another standard file, /usr/share/file/magic/jpeg, to get a larger pattern set and do something like the following.

# Create a larger pattern set for image file types.
$ cat /usr/share/file/magic/images /usr/share/file/magic/jpeg \
        > /tmp/mypatterns
$ fileXray --volume /Volumes/PreciousHD --trawl /tmp/mypatterns
...
0x27d000         PCX ver. 2.5 image data
0x3ad000         JPEG image data, JFIF standard 1.02
0x4ad000         GIF image data, version 89a, 1800 x 1800
0x5a8000         TIFF image data, big-endian
0x841000         PNG image, 1024 x 768, 8-bit/color RGBA, non-interlaced
0x102f000        Targa image data - RGB
0x1917000        PCX ver. 2.5 image data
...
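
A query file can also hold hand-written patterns in the magic(5) format. As a sketch, the commands below write a one-pattern file that matches the %PDF- signature at offset 0. The path /tmp/mypdf.magic and the message text are arbitrary choices made for illustration; the pattern line follows the offset, type, test, message layout documented in magic(5).

```shell
# Write a one-pattern query file (hypothetical path).
# Fields: offset, type, test value, message to print on a match.
cat > /tmp/mypdf.magic <<'EOF'
# Match the "%PDF-" signature at the very start of the data.
0       string  %PDF-   PDF document (custom pattern)
EOF
```

Such a file could then be given to --trawl in place of /tmp/mypatterns, or concatenated with the standard pattern files. One way to sanity-check a pattern beforehand is the file command's -m option, which tests a file against the magic file you name.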

Once fileXray finds potentially interesting or useful data through trawling, how do you access that data? The most convenient way is through fileXray itself—use the Arbitrary File System (ArbitraryFS).

One of the several virtual file systems built into fileXray, ArbitraryFS allows you to access arbitrary byte ranges on the volume as on-the-fly files! You simply tell fileXray to make an HFS+ volume’s storage available through ArbitraryFS. By default, the resultant volume contains no visible files. However, when you attempt to access a file whose name encodes a starting byte offset and a size, the corresponding content on the volume becomes transparently available through that file name. For example, if you attempt to open a file called 0x5000,4096.txt, you will, for the duration of the access, “see” a file whose content comes from the HFS+ volume’s on-disk byte range that starts at offset 0x5000 and is 4096 bytes in size. Optionally, the offset can be negative, in which case the starting offset is relative to the end of the volume. You can even specify multiple byte ranges (extents) using the colon character as the separator.
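
The mapping from a name to a byte range can be illustrated outside fileXray with a short shell sketch that works on a disk image file. This is an approximation, not fileXray's implementation: the helper name extract_range is hypothetical, only a single offset,size range is handled, and the colon-separated multi-extent form is left out.

```shell
# extract_range IMAGE NAME OUT
# Hypothetical helper: decode a name such as "0x5000,4096.txt" into a
# starting offset and a size, then copy that byte range of IMAGE to OUT.
extract_range() {
    image=$1
    name=$2
    out=$3
    base=${name%.*}                 # drop the extension: "0x5000,4096"
    offset=$(( ${base%%,*} ))       # text before the comma, e.g. 0x5000
    size=$(( ${base#*,} ))          # text after the comma, e.g. 4096
    if [ "$offset" -lt 0 ]; then
        # A negative offset counts back from the end of the image.
        total=$(( $(wc -c < "$image") ))
        offset=$(( total + offset ))
    fi
    # bs=1 keeps the arithmetic simple; it is slow for large ranges.
    dd if="$image" of="$out" bs=1 skip="$offset" count="$size" 2>/dev/null
}
```

With this helper, extract_range disk.img 0x5000,4096.txt out.txt would copy the 4096 bytes starting at offset 0x5000 of a hypothetical disk.img, which is the same range the ArbitraryFS file name describes.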

In the case of the PDF document we found in the earlier example, we could use ArbitraryFS to access a “file” named, say, 0x3ad000,65536.pdf, either by copying it out or by opening it in place in a PDF viewer.

$ mkdir /Volumes/arbitrary
$ fileXray --userfs_type arbitrary --userfs_mount /Volumes/arbitrary \
         --volume /Volumes/PreciousHD
$ ls -als /Volumes/arbitrary
total 0
0 drwxr-xr-x  2 root wheel    0 Nov 2 17:35 .
0 drwxrwxrwt 33 root wheel 1122 Nov 2 18:22 ..
$ open /Volumes/arbitrary/0x3ad000,65536.pdf
...

The Arbitrary File System has other uses too. You can read more about it and about the trawling mechanism in the fileXray ebook.


All contents of this site, unless otherwise noted, are ©1994-2014 Amit Singh. All Rights Reserved.