fileXray Example: Trawling for Data

November 5th, 2010

fileXray provides several ways of looking for elusive or missing data on an HFS+ volume. One of these ways is fileXray’s trawling mechanism, wherein it will scan a volume looking for blocks that match “magic” patterns (signatures) contained in a given query file. You don’t usually need to come up with the patterns—fileXray understands the same “magic” mechanism that underlies the file command in Mac OS X. The /usr/share/file/magic/ system directory contains numerous magic pattern files.

By default, fileXray will scan the free extents of a volume using magic pattern(s) from a given input query file to match against each block. (Optionally, you can tell fileXray to look at every block—free or not—of a volume.) This way, you can trawl the volume looking for, say, PDF documents or JPEG images. You can use any of the pattern files found in /usr/share/file/magic/, which cumulatively contain thousands of patterns to identify file types. You can also concatenate two or more pattern files to provide a larger pattern set. Moreover, you can create your own patterns using the format described in the magic(5) man page. In the following example, the match indicates that byte offset 0x3ad000 on the volume PreciousHD marks the beginning of a PDF document.

# Trawl free extents on PreciousHD looking for PDF documents.
$ fileXray --volume /Volumes/PreciousHD --trawl /usr/share/file/magic/pdf
...
0x3ad000              PDF document, version 1.6
...
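
To make the idea of block-by-block signature matching concrete, here is a minimal Python sketch. It is not how fileXray is implemented: the signature table, the function name trawl, and the 4096-byte block size are assumptions made for illustration, and real magic(5) patterns are far richer than the fixed prefixes used here.

```python
# Hypothetical sketch: test the start of every block against a few fixed signatures.
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"\xff\xd8\xff": "JPEG image data",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"GIF89a": "GIF image data, version 89a",
}

def trawl(data, block_size=4096):
    """Return (offset, description) for each block that starts with a known signature."""
    matches = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        for magic, description in SIGNATURES.items():
            if block.startswith(magic):
                matches.append((offset, description))
                break
    return matches

# A fake three-block "volume" with a PDF header at the start of the second block.
volume = bytes(4096) + b"%PDF-1.6" + bytes(4088) + bytes(4096)
for offset, description in trawl(volume):
    print(f"{offset:#x}  {description}")
```

Matching only at block boundaries mirrors the idea that a file's first byte sits at the start of an allocation block.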

Suppose we wish to look for pictures in some common image file formats (GIF, JPEG, PNG, TIFF, and so on) within the free blocks of a volume. The standard pattern file /usr/share/file/magic/images contains several predefined patterns that suit our needs. We can combine that file with another standard file, /usr/share/file/magic/jpeg, to get a larger pattern set and do something like the following.

# Create a larger pattern set for image file types.
$ cat /usr/share/file/magic/images /usr/share/file/magic/jpeg \
        > /tmp/mypatterns
$ fileXray --volume /Volumes/PreciousHD --trawl /tmp/mypatterns
...
0x27d000         PCX ver. 2.5 image data
0x3ad000         JPEG image data, JFIF standard 1.02
0x4ad000         GIF image data, version 89a, 1800 x 1800
0x5a8000         TIFF image data, big-endian
0x841000         PNG image, 1024 x 768, 8-bit/color RGBA, non-interlaced
0x102f000        Targa image data - RGB
0x1917000        PCX ver. 2.5 image data
...

Once fileXray finds potentially interesting or useful data through trawling, how do you access that data? The most convenient way is through fileXray itself—use the Arbitrary File System (ArbitraryFS).

One of the several virtual file systems built into fileXray, ArbitraryFS allows you to access arbitrary byte ranges on the volume as on-the-fly files! You simply tell fileXray to make an HFS+ volume’s storage available through ArbitraryFS. By default, the resultant volume contains no visible files. However, when you attempt to access a file whose name encodes a starting byte offset and a size, the corresponding content on the volume becomes transparently available through that file name. For example, if you attempt to open a file called 0x5000,4096.txt, you will, for the duration of the access, “see” a file whose content comes from the HFS+ volume’s on-disk byte range that starts at offset 0x5000 and is 4096 bytes in size. Optionally, the offset can be negative, in which case the starting offset is relative to the end of the volume. You can even specify multiple byte ranges (extents) using the colon character as the separator.
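
The naming scheme can be illustrated with a small Python sketch that turns such a file name into byte ranges. This is only a guess at the rules described above, not fileXray's actual parser: the function name parse_range_name is hypothetical, and the exact multi-extent syntax (one "offset,size" pair per colon-separated part) is an assumption.

```python
def parse_range_name(name, volume_size):
    """Turn a name such as '0x5000,4096.txt' into a list of (start, size) byte ranges."""
    stem = name.rsplit(".", 1)[0] if "." in name else name
    ranges = []
    for part in stem.split(":"):          # assumed syntax: one 'offset,size' pair per extent
        offset_text, size_text = part.split(",")
        offset = int(offset_text, 0)      # base 0 accepts both 0x... and decimal
        size = int(size_text, 0)
        if offset < 0:                    # negative offsets count back from the end
            offset += volume_size
        if offset < 0 or size < 0 or offset + size > volume_size:
            raise ValueError(f"range out of bounds: {part}")
        ranges.append((offset, size))
    return ranges

print(parse_range_name("0x5000,4096.txt", volume_size=1 << 20))
```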

In the case of the PDF document we found in the aforementioned example, we could use ArbitraryFS to access the “file” named 0x3ad000,65536.pdf, say, by copying it out or opening it in place in a PDF viewer.

$ mkdir /Volumes/arbitrary
$ fileXray --userfs_type arbitrary --userfs_mount /Volumes/arbitrary \
         --volume /Volumes/PreciousHD
$ ls -als /Volumes/arbitrary
total 0
0 drwxr-xr-x  2 root wheel    0 Nov 2 17:35 .
0 drwxrwxrwt 33 root wheel 1122 Nov 2 18:22 ..
$ open /Volumes/arbitrary/0x3ad000,65536.pdf
...

The Arbitrary File System has other uses too. You can read more about it and about the trawling mechanism in the fileXray ebook.

fileXray Example: Who Owns This Byte?

November 4th, 2010

Suppose you want to know which file or folder (if any) “owns” a given byte on an HFS+ volume. If no regular file or folder owns the byte, is the byte part of a free block, or is it allocated to some internal file system data structure, such as the HFS+ Catalog B-Tree, etc.?

There could be several investigative reasons why you might want to know this, but it’s an interesting question by itself. fileXray can provide the answer.

The following examples show how to use the --who_owns_byte feature of fileXray. Byte offset 0 represents the beginning of the volume.

$ sudo fileXray --who_owns_byte 1023
Reserved

$ sudo fileXray --who_owns_byte 1024
Volume Header

$ sudo fileXray /mach_kernel
...
  extents              =   startBlock   blockCount      % of file

                             0x1b71e6       0x11cf       100.00 %
...

$ sudo fileXray --who_owns_byte 0x1b71e6001
MacHD:/mach_kernel

Note that in the case of /mach_kernel, we look at the file’s extents to know where the file begins on the volume. Since its first block is number 0x1b71e6 and the allocation block size for this volume is 4096 bytes, the file’s first byte is byte number 0x1b71e6000 on the volume.
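
The block-to-byte arithmetic is easy to check with a few lines of Python. The helper name extent_byte_range is made up for this illustration; the block numbers and the 4096-byte allocation block size are the ones quoted above.

```python
def extent_byte_range(start_block, block_count, block_size=4096):
    """Return the first and last byte offsets covered by an extent."""
    first = start_block * block_size
    last = (start_block + block_count) * block_size - 1
    return first, last

# The single extent of /mach_kernel from the output above.
first, last = extent_byte_range(0x1b71e6, 0x11cf)
print(hex(first), hex(last))
```

The byte queried in the example, 0x1b71e6001, is therefore the second byte of the file.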

fileXray Example: The Time Filters

November 3rd, 2010

fileXray has the ability to rapidly run all file system objects on an HFS+ volume through a piece of code called a “filter”. A filter can examine a file system object and use arbitrary criteria to either accept or reject it. fileXray comes with more than two dozen built-in filters and you can even write your own dynamically loadable filters. Examples of built-in filters are:

  • bmactime—List objects with timestamps in a given range.
  • compressed—List HFS+ compressed files.
  • creatorcode—List files that have the given creator code.
  • device—List block or character special files.
  • dirhardlink—List directory hard links.
  • empty—List files that have no extended attributes and whose data and resource forks are both empty.
  • emptyforks—List files whose data and resource forks are both empty.
  • fifo—List named pipes.
  • hardlink—List file hard links.
  • immutable—List immutable file system objects.
  • lsR—List all paths.
  • macho—List Mach-O files along with their per architecture sizes.
  • name—List objects whose name matches a given name (case sensitive).
  • namei—List objects whose name matches a given name (case insensitive).
  • nameprefix—List objects whose name has a given prefix (case sensitive).
  • nameprefixi—List objects whose name has a given prefix (case insensitive).
  • namesuffix—List objects whose name has a given suffix (case sensitive).
  • namesuffixi—List objects whose name has a given suffix (case insensitive).
  • nodename—List the parent node IDs and node names of all objects.
  • null—Do nothing; useful for establishing baselines in benchmarks.
  • socket—List Unix Domain socket files.
  • subname—List objects whose name contains a given string (case sensitive).
  • subnamei—List objects whose name contains a given string (case insensitive).
  • sxid—List setuid and setgid files and folders.
  • symlink—List symbolic links.
  • typecode—List files that have the given file type code.
  • xattrname—List all unique extended attribute names in use.
  • xattr—List objects that have a given extended attribute.

Suppose you have a heavily populated HFS+ volume and want to see exactly which files and folders were modified in the last 60 seconds. (Perhaps you wish to know what modifications an application you just ran made to your volume.) The bmactime built-in filter can quickly show you the answer.

The bmactime filter is a family of filters—a meta filter if you will—whose names all end with the suffix “time”. The prefix can be a permutation of one or more of the characters b, m, a, and c. For example, atime, btime, ctime, mtime, bmactime, and cmtime are all valid filter names. The prefix characters represent HFS+ timestamps as follows.

b Time of creation (birthtime).
m Time of last content (data) modification.
a Time of last access.
c Time of last attribute (metadata) modification.
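
As a quick illustration of the naming rule, the following Python sketch checks whether a string is a valid member of the filter family. It assumes that "permutation" means each of the four characters may appear at most once; the function name is hypothetical and is not part of fileXray.

```python
def is_time_filter_name(name):
    """True if name is one or more distinct characters from 'bmac' followed by 'time'."""
    if not name.endswith("time"):
        return False
    prefix = name[:-len("time")]
    return (len(prefix) >= 1
            and set(prefix) <= set("bmac")
            and len(set(prefix)) == len(prefix))

for candidate in ("atime", "bmactime", "cmtime", "time", "mmtime", "xtime"):
    print(candidate, is_time_filter_name(candidate))
```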

The bmactime filter requires as an argument a time range consisting of a beginning time and an ending time. These times can be specified either as number of seconds or as human-readable date strings, for example, “Nov 2 12:00:00 PDT 2009”. Refer to the fileXray ebook for more details on the format. In particular, you can use negative values to refer to the last so many seconds.

In the following example, we look for file system objects that were modified within the last 60 seconds. We pipe the output through the sort program to get a timeline view. (Note that fileXray output has been cropped in the following excerpt.)

$ sudo fileXray --filter builtin:mctime --filter_args -60, | sort -n
...
1256751021 Mon Nov 2 10:30:21 2009 1529932 -rw------- .m.c MacHD:/.Spotlight-V100/
1256751021 Mon Nov 2 10:30:21 2009 205344  -rw-r--r-- .m.c MacHD:/private/var/log/
1256751021 Mon Nov 2 10:30:21 2009 205858  -rw------- .m.c MacHD:/private/var/db/s
1256751021 Mon Nov 2 10:30:21 2009 302227  drwx------ .m.c MacHD:/private/var/db/s
1256751021 Mon Nov 2 10:30:21 2009 3089114 -r--r----- .m.c MacHD:/private/var/audi
1256751021 Mon Nov 2 10:30:21 2009 3169431 -rw-r----- .m.c MacHD:/private/var/log/
1256751021 Mon Nov 2 10:30:21 2009 3169532 -rw-r--r-- .m.c MacHD:/private/var/log/
...
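
For comparison, here is a rough Python sketch that answers a similar question by walking a directory tree with ordinary system calls. It is not what fileXray does (fileXray reads the volume's own data structures), it will be far slower on a heavily populated volume, and the function name and the two-character flags string are inventions for this example.

```python
import os
import time

def recently_changed(root, seconds=60, now=None):
    """Return sorted (timestamp, flags, path) tuples for objects whose mtime or ctime is recent."""
    now = time.time() if now is None else now
    cutoff = now - seconds
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # object vanished or is unreadable; skip it
            flags = ("m" if st.st_mtime >= cutoff else ".") + \
                    ("c" if st.st_ctime >= cutoff else ".")
            if flags != "..":
                hits.append((max(st.st_mtime, st.st_ctime), flags, path))
    return sorted(hits)
```

Calling recently_changed("/some/folder") returns the matches oldest first, which gives the same kind of timeline view as piping through sort.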

Talking of time, how long does fileXray need to work this out? Well, the specific time taken will depend upon the volume and the hardware in question. In the above example, fileXray took 8 seconds for a volume residing on a rotational (non-SSD) disk drive and containing nearly 1.3 million files and folders.

fileXray

November 1st, 2010

Does the idea of wielding power—a lot of power—intrigue you? Check out fileXray.

Start with the ebook. If you are one of the target audiences, it will be worth your time.

Is Your Machine Good Enough for Snow Leopard K64?

August 31st, 2009

“K64” is what Apple calls the 64-bit version of the kernel, beginning with Snow Leopard. As an end user, you really should not worry about the bitness of the kernel. If your Apple computer is not booting into K64 by default, you don’t need it, unless, of course, you know that you need it. (Say, because you are a kernel developer or some other system-level developer and want to test something against a 64-bit kernel.) In particular, the 32-bit kernel, which is the default on most existing x86-based Apple computers, runs 64-bit applications just fine. Therefore, as long as you have a 64-bit processor, your Snow Leopard installation is 64-bit from the typical end-user standpoint.

An easy way to tell if you are running a K64 kernel is to use the uname command-line program. The "x86_64" in the excerpt below means that we are running a 64-bit kernel. If the output showed "i386" instead, that would mean a 32-bit kernel.

$ uname -a
Darwin... root:xnu-1456.1.25~1/RELEASE_X86_64 x86_64

If you are averse to using command-line programs (Do you really care about a K64 kernel in that case?), you could instead launch the System Profiler application, either directly, or by clicking on "More Info…" in the "About This Mac" panel. In System Profiler, you can click on the "Software" section in the sidebar. There will be something about the presence or absence of 64-bit Kernel and Extensions. You could also launch the Activity Monitor application and look for kernel_task. The "Kind" column will say if the kernel task (and consequently the kernel) is 64-bit.

As alluded to earlier, a 64-bit processor is required to run a K64 kernel. To boot into K64, you could do one of several things:

  • Press the 6 and the 4 keys simultaneously at power-on time. This indicates to the EFI boot loader (boot.efi) that you wish to boot a 64-bit kernel.
  • Set the boot-args firmware variable, say, through the nvram command-line program. To boot K64, the specific command line would be:
    $ sudo nvram boot-args="arch=x86_64"

  • Edit /Library/Preferences/SystemConfiguration/com.apple.Boot.plist and add arch=x86_64 to the value of the Kernel Flags key. By default, this value is an empty string.

    $ cat /Library/Preferences/SystemConfiguration/com.apple.Boot.plist
    ...
    <dict>
    	<key>Kernel</key>
    	<string>mach_kernel</string>
    	<key>Kernel Flags</key>
    	<string>arch=x86_64</string>
    </dict>
    ...
    

Another way is to use the -setkernelbootarchitecture argument of the systemsetup(8) command-line program.
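
The Kernel Flags edit described above can also be scripted. The following Python sketch uses the standard plistlib module on an in-memory plist; the helper name add_kernel_flag is hypothetical, and the sample dictionary only mimics the two keys shown in the excerpt rather than a complete com.apple.Boot.plist.

```python
import plistlib

def add_kernel_flag(plist_bytes, flag="arch=x86_64"):
    """Return new plist bytes with `flag` present in the 'Kernel Flags' string."""
    boot = plistlib.loads(plist_bytes)
    flags = boot.get("Kernel Flags", "").split()
    if flag not in flags:            # do not add the flag twice
        flags.append(flag)
    boot["Kernel Flags"] = " ".join(flags)
    return plistlib.dumps(boot)

# Stand-in for the file contents; by default the flags value is an empty string.
original = plistlib.dumps({"Kernel": "mach_kernel", "Kernel Flags": ""})
updated = plistlib.loads(add_kernel_flag(original))
print(updated["Kernel Flags"])
```

Working on bytes keeps the sketch free of side effects; writing the result back to the real file would need root privileges and a backup copy.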

Additionally, you could tell the kernel to boot verbosely if you are interested in catching a 64-bit boot early on. Note that one of the early kernel messages is "64 bit mode enabled". This does not mean K64—it just means the kernel has identified the processor to be 64-bit and is going to use certain 64-bit features. In the case of a K64 boot, the message to look for is "Kernel is LP64".

Not so fast though.

Unfortunately, a 64-bit processor alone doesn’t suffice. Out of the box, even if you have a 64-bit processor and explicitly request K64, boot.efi will not boot K64 if at least one of the following is true.

  1. The machine has 32-bit EFI.
  2. The machine’s model is prohibited from booting K64 through a hardcoded list within the boot loader. (A cursory look suggests that the list excludes "non-Pro" machines.)

Both of these "limitations" are technically artificial, albeit to different degrees.

The first limitation actually does have merit and is arguably not all that artificial. Although a 32-bit EFI could launch a 64-bit kernel, the kernel, when running, would not be able to use firmware services. In particular, you wouldn’t have NVRAM. For kernel developers merely wanting to run a 64-bit kernel for testing and debugging, this may not be an issue, but it’s understandable why the limitation is in place.

The second limitation is annoying. As a developer, if you knowingly wish to boot into K64 to test something, you can’t on certain machines even though they are technically perfectly capable. I ran into this on a Unibody MacBook, which has 64-bit EFI but is not a "Pro" machine. Also, it’s ironic that you can, in fact, boot Snow Leopard into K64 on the very same computer when you run it as a guest operating system in a virtual machine.

If you really need to boot into K64 on such a machine with 64-bit EFI, you can—at your own peril—"fix" things within boot.efi by setting the appropriate bits in the hardcoded list of models. To ensure that we’re talking about the same multiarchitecture version of boot.efi, compare the SHA-1 checksum of that file.

$ shasum boot.efi
2fb9fc10e5b4bb06f62c38b01bd9836a433897f8    boot.efi

Then, change the single byte at the position listed for your model in the following table from its old value to the new value. Rather than overwriting the original boot.efi, we will copy the original to a new file, say, boot-k64.efi, and edit the latter.

Model (with 64-bit EFI)   Byte Position in boot.efi   Old Value   New Value
Mac mini                  0x266D8                     0x00        0x04
MacBook                   0x266E8                     0x00        0x04
MacBook Air               0x266F8                     0x00        0x04
iMac                      0x26718                     0x08        0x0c

For the specific case of the MacBook, which is the only one I've actually tried, the before and after bytes will look like the following:

0x266E0:
38 47 01 00  00 00 00 00  00 00 00 00  00 00 00 00

0x266E0:
38 47 01 00  00 00 00 00  04 00 00 00  00 00 00 00
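
If you would rather not use a hex editor, the copy-and-patch step can be sketched in Python. The helper name patch_copy is hypothetical. It refuses to proceed if the checksum or the old byte value does not match, which guards against patching a different version of boot.efi.

```python
import hashlib
import shutil

def patch_copy(src, dst, offset, old, new, expected_sha1=None):
    """Copy src to dst and change one byte at `offset` from `old` to `new`."""
    with open(src, "rb") as f:
        data = f.read()
    if expected_sha1 is not None and hashlib.sha1(data).hexdigest() != expected_sha1:
        raise ValueError("checksum mismatch: this is not the file the offsets were derived from")
    if data[offset] != old:
        raise ValueError(f"byte at {offset:#x} is {data[offset]:#04x}, expected {old:#04x}")
    shutil.copyfile(src, dst)        # the original file is left untouched
    with open(dst, "r+b") as f:
        f.seek(offset)
        f.write(bytes([new]))
```

For the MacBook row of the table, the call would be patch_copy("boot.efi", "boot-k64.efi", 0x266E8, 0x00, 0x04, expected_sha1="2fb9fc10e5b4bb06f62c38b01bd9836a433897f8").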

We'll place the boot-k64.efi file somewhere on the root volume—/System/Library/CoreServices/ is fine. Then, we need to reset volume bootability through the bless command-line program. Optionally, we can also set the ownership and user immutable flag on the file to "proper" values.

$ sudo cp boot-k64.efi /System/Library/CoreServices/
$ cd /System/Library/CoreServices/
$ sudo chown root:wheel boot-k64.efi
$ sudo chflags uchg boot-k64.efi
$ sudo bless --folder /System/Library/CoreServices \
  --file /System/Library/CoreServices/boot-k64.efi

Your mileage may vary depending on whether your installation has 64-bit versions of all necessary drivers for the model of your specific machine. Since I have not tried any other "excluded" machine besides a 64-bit MacBook, I don't know about other models. (Unavailability or instability of certain 64-bit drivers could be a plausible reason for these models to be excluded in the first place.)

If you do render your system unbootable, you can simply run bless again to restore volume bootability as it was before. That is, you can tell bless to use the original boot.efi. Of course, to do that, you'll need to either boot from a different volume (a system install disc would be fine), or be able to access and write to the unbootable volume from another computer.

$ sudo bless --folder /Volumes/BrokenMac/System/Library/CoreServices \
  --file /Volumes/BrokenMac/System/Library/CoreServices/boot.efi

Crafting a Tiny Mach-O Executable

March 15th, 2009

The other day I came across this web page in which the author describes his experiment to create a tiny ELF executable that will run on Linux. The result: a 45-byte ELF executable that executes and returns a value. The executable is functionally equivalent to the one generated from compiling the following C program.

  /* tiny.c */
  int main(void) { return 42; }

Apparently, recent Linux kernels do stricter checks on ELF executables, so the aforementioned 45-byte executable no longer works. A slightly larger, 64-byte version still works at the time of this writing.

Anyway, as far as tiny executables go, ELF on Linux is taken care of. It would be interesting to repeat a similar experiment for Mach-O executables on Mac OS X.

Let us first see how large the executable generated from the C program is on Mac OS X.

$ cat tiny.c
main() { return 42; }
$ sw_vers
...
ProductVersion: 10.5.6
...
$ gcc -Oz -o tiny tiny.c
$ strip tiny
$ ls -las tiny
32 -rwxr-xr-x  1 singh  wheel  12348 Mar 15 17:26 tiny

The following assembly language program can be compiled to generate a 165-byte Mach-O executable that runs on Mac OS X and returns the wisely chosen value 42.

; tiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o tiny tiny.asm

BITS 32
        org   0x1000

        db    0xce, 0xfa, 0xed, 0xfe       ; magic
        dd    7                            ; cputype (CPU_TYPE_X86)
        dd    3                            ; cpusubtype (CPU_SUBTYPE_I386_ALL)
        dd    2                            ; filetype (MH_EXECUTE)
        dd    2                            ; ncmds
        dd    _start - _cmds               ; sizeofcmds
        dd    0                            ; flags
_cmds:
        dd    1                            ; cmd (LC_SEGMENT)
        dd    44                           ; cmdsize
        db    "__TEXT"                     ; segname
        db    0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
        dd    0x1000                       ; vmaddr
        dd    0x1000                       ; vmsize
        dd    0                            ; fileoff
        dd    filesize                     ; filesize
        dd    7                            ; maxprot

        dd    5                            ; cmd (LC_UNIXTHREAD)
        dd    80                           ; cmdsize
        dd    1                            ; flavor (i386_THREAD_STATE)
        dd    16                           ; count (i386_THREAD_STATE_COUNT)
        dd    0, 0, 0, 0, 0, 0, 0, 0       ; state
        dd    0, 0, _start, 0, 0, 0, 0, 0  ; state
_start:
        xor   eax,eax
        inc   eax
        push  byte 42
        sub   esp, 4
        int   0x80                         ; _exit(42)

filesize equ  $ - $$

We can compile and run this program as follows, verifying that its return value is indeed 42.

$ nasm -f bin -o tiny tiny.asm
$ ls -las tiny
8 -rw-r--r--  1 singh  admin  165 Mar 15 12:21 tiny
$ chmod 755 tiny
$ ./tiny
$ echo $?
42

Some points to note:

  • The executable—Mach-O header, load commands, the text segment—is manually crafted in assembly using the nasm 80x86 assembler. The C compiler toolchain is not involved.
  • The executable is unusual for Mac OS X in that the dynamic link editor (dyld) is not involved in running it. No dynamic libraries are involved either.
  • The program makes a "direct" system call through the int 0x80 interface. This is a big no-no on Mac OS X—production code should not be bypassing the C library for making system calls, but then hopefully you won't be writing production code using such techniques. A specific caveat is that system call implementation may do things differently—in terms of arguments, return values, and such—from the user-callable interface. The implementation may also change across system revisions, so such code may break with a system update.
  • The int 0x80 system call path is a legacy path on Mac OS X. It may be removed from Mac OS X some day, in which case the program would need to be modified to use newer alternatives such as sysenter.
  • The executable is not a "correct" Mach-O file, even though the kernel can parse and run it in our case. It is incorrect because its Mach-O load commands have been deliberately made to overlap each other to save some bytes. The otool object-file introspection command will be quite unhappy with this executable, as the inconsistencies it reports in the following output show.

$ otool -l tiny
tiny:
Load command 0
      cmd LC_SEGMENT
  cmdsize 44 Inconsistent size
  segname __TEXT
   vmaddr 0x00001000
   vmsize 0x00001000
  fileoff 0
 filesize 165
  maxprot 0x00000007
 initprot 0x00000005
   nsects 80
    flags 0x1
Section
  sectname
   segname  (does not match segment)
      addr 0x00000000
      size 0x00000000
    offset 0
     align 2^4248 (16777216)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
section structure command extends past end of load commands
Section
  sectname
   segname  (does not match segment)
      addr 0x00000000
      size 0x00000000
    offset 0
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
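
The odd initprot, nsects, and flags values in this output follow directly from the overlap. The Python sketch below rebuilds the 44 bytes that tiny.asm emits for LC_SEGMENT, appends the first three fields of LC_UNIXTHREAD, and then reads the result as a full 56-byte 32-bit segment command. The struct format strings are my own transcription of the field layout, so treat them as an illustration rather than a reference.

```python
import struct

# The 44 bytes tiny.asm emits for LC_SEGMENT: cmd, cmdsize, segname, vmaddr,
# vmsize, fileoff, filesize, maxprot (little-endian, 16-byte segment name).
truncated_segment = struct.pack("<2I16s5I", 1, 44, b"__TEXT", 0x1000, 0x1000, 0, 165, 7)

# The first three fields of the LC_UNIXTHREAD command that follows: cmd, cmdsize, flavor.
thread_prefix = struct.pack("<3I", 5, 80, 1)

# A complete segment command has three more fields: initprot, nsects, flags.
# A parser that reads all 56 bytes picks them up from the next command.
fields = struct.unpack("<2I16s8I", (truncated_segment + thread_prefix)[:56])
initprot, nsects, flags = fields[-3:]
print(hex(initprot), nsects, hex(flags))
```

The three values line up with the initprot, nsects, and flags lines that otool prints for load command 0.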

At the cost of roughly 80 additional bytes, we can create a "more correct" Mach-O executable that otool will be happy with.

; nicertiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o nicertiny nicertiny.asm

BITS 32
        org   0x1000

        db    0xce, 0xfa, 0xed, 0xfe       ; magic
        dd    7                            ; cputype (CPU_TYPE_X86)
        dd    3                            ; cpusubtype (CPU_SUBTYPE_I386_ALL)
        dd    2                            ; filetype (MH_EXECUTE)
        dd    2                            ; ncmds
        dd    _start - _cmds               ; sizeofcmds
        dd    0                            ; flags
_cmds:
        dd    1                            ; cmd (LC_SEGMENT)
        dd    124                          ; cmdsize
        db    "__TEXT"                     ; segname
        db    0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
        dd    0x1000                       ; vmaddr
        dd    0x1000                       ; vmsize
        dd    0                            ; fileoff
        dd    filesize                     ; filesize
        dd    7                            ; maxprot
        dd    5                            ; initprot
        dd    1                            ; nsects
        dd    0                            ; flags
        db    "__text"                     ; sectname
        db    0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; sectname
        db    "__TEXT"                     ; segname
        db    0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
        dd    _start                       ; addr
        dd    _end - _start                ; size
        dd    _start - 0x1000              ; offset
        dd    2                            ; align
        dd    0                            ; reloff
        dd    0                            ; nreloc
        dd    0                            ; flags
        dd    0                            ; reserved1
        dd    0                            ; reserved2

        dd    5                            ; cmd (LC_UNIXTHREAD)
        dd    80                           ; cmdsize
        dd    1                            ; flavor (i386_THREAD_STATE)
        dd    16                           ; count (i386_THREAD_STATE_COUNT)
        dd    0, 0, 0, 0, 0, 0, 0, 0       ; state
        dd    0, 0, _start, 0, 0, 0, 0, 0  ; state
_start:
        xor   eax, eax
        inc   eax
        push  dword 42
        sub   esp, 4
        int   0x80                         ; _exit(42)
_end:
filesize equ  $ - $$

Let us compile and run this version, and see what otool has to say.

$ nasm -f bin -o nicertiny nicertiny.asm
$ ls -las nicertiny
8 -rw-r--r--  1 singh  admin  248 Mar 15 14:49 nicertiny
$ chmod 755 nicertiny
$ ./nicertiny
$ echo $?
42
$ otool -l nicertiny
nicertiny:
Load command 0
      cmd LC_SEGMENT
  cmdsize 124
  segname __TEXT
   vmaddr 0x00001000
   vmsize 0x00001000
  fileoff 0
 filesize 248
  maxprot 0x00000007
 initprot 0x00000005
   nsects 1
    flags 0x0
Section
  sectname __text
   segname __TEXT
      addr 0x000010e8
      size 0x00000010
    offset 232
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Load command 1
        cmd LC_UNIXTHREAD
    cmdsize 80
     flavor i386_THREAD_STATE
      count i386_THREAD_STATE_COUNT
	    eax 0x00000000 ebx    0x00000000 ecx 0x00000000 edx 0x00000000
	    edi 0x00000000 esi    0x00000000 ebp 0x00000000 esp 0x00000000
	    ss  0x00000000 eflags 0x00000000 eip 0x000010e8 cs  0x00000000
	    ds  0x00000000 es     0x00000000 fs  0x00000000 gs  0x00000000
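
The sizes and offsets that otool reports can be cross-checked with a little arithmetic. This Python sketch recomputes them from the field layout used in nicertiny.asm; the format strings are my transcription of that layout, not something taken from a system header.

```python
import struct

MACH_HEADER = struct.calcsize("<7I")           # magic through flags
SEGMENT_COMMAND = struct.calcsize("<2I16s8I")  # cmd through flags
SECTION = struct.calcsize("<16s16s9I")         # sectname through reserved2
THREAD_COMMAND = struct.calcsize("<4I16I")     # cmd, cmdsize, flavor, count, 16 registers

segment_cmdsize = SEGMENT_COMMAND + SECTION    # one section inside the segment command
text_offset = MACH_HEADER + segment_cmdsize + THREAD_COMMAND
text_addr = 0x1000 + text_offset               # the file is mapped at vmaddr 0x1000
print(segment_cmdsize, text_offset, hex(text_addr))
```

These agree with the cmdsize, offset, addr, and eip values above, and adding the 0x10 bytes of code to the offset gives the 248-byte file size.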

It would be a nice exercise for the reader to try to shrink tiny.asm and nicertiny.asm even further, while retaining the high-level behavior of the corresponding executables. There are plenty of zeros lurking in there.

A TPM for Everyone

March 8th, 2009

Suppose you have a Macintosh without a TPM. This, of course, is highly likely because only the first few x86-based Macintosh models had TPMs. Now suppose you really want to experiment with Trusted Computing or features of the TPM in general. Your needs could be development-related or they could be purely academic. Well, you can do the next best thing to having a real TPM…

Why MacFUSE Installation Recommends a Reboot

March 2nd, 2009

I often hear users, and even developers for that matter, grumbling about the fact that they are "required" to reboot their systems after installing or upgrading MacFUSE. I’ve even heard explanations that because MacFUSE "does something with the kernel," a reboot is necessary. Well, this whole rebooting-required thing is a myth. Let’s clear up some misconceptions.

When you install the official MacFUSE package, the Installer displays a notice recommending that you restart your system.



To begin with, it is recommended (not required) that you restart your system. Unlike in the case of Mac OS X software installations that do necessitate a restart, you can simply close the Installer window after you are done installing MacFUSE.

Now, why is it even recommended that you restart? Let us get some context.

If you are installing MacFUSE for the first time (that is, you are not upgrading a MacFUSE installation), restarting is entirely unnecessary.

If you are upgrading a MacFUSE installation, restarting is a heavy-handed way for users to avoid the confusing situation described below. Even then, a restart is not required if you know how to avoid that situation and are willing to take the extra steps needed.

MacFUSE consists of both kernel-resident and user-space-resident software components. The MacFUSE kernel extension is both dynamically loadable and dynamically unloadable. That is, you can simply unload it if it is not in use and reload a new version.

There is one important thing to know: the kernel-space and user-space components of MacFUSE are in lockstep. You cannot use an older user-space MacFUSE component with a newer MacFUSE kernel extension and vice versa. The aforementioned confusing situation arises when you upgrade MacFUSE while you have a MacFUSE-based file system mounted. The busy kernel extension cannot be unloaded, and the upgraded user-space components will refuse to work with the old kernel extension.

The user-space MacFUSE components actually ensure that the loaded kernel extension is the matching one. In fact, if you upgrade an installation while the MacFUSE kernel extension is loaded, the user-space components will try to unload the old extension and then try to load the new extension. This works, except in the case when the already loaded kernel extension is "busy," which would be the case if you do have a mounted MacFUSE-based file system.

If you first ensure that no MacFUSE-based file systems are mounted, you can upgrade MacFUSE, not restart, and once upgraded, start using the new MacFUSE. The lsvfs command-line tool can be used to determine if a MacFUSE-based file system is mounted.

$ lsvfs
Filesystem                        Refs Flags
-------------------------------- ----- ---------------
ufs                                  0 local
nfs                                  1
fdesc                                1
cd9660                               0 local
unionfs                              0
hfs                                  2 local, dovolfs
devfs                                1
autofs                               2
msdos                                0 local
fusefs                               0

If there is no entry for fusefs or if the corresponding value in the Refs column is 0, you shouldn’t have to do anything. If the value is non-zero, you need to unmount that many MacFUSE-based volumes.
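
This check is simple to automate. The Python sketch below parses lsvfs-style text and reports the reference count for fusefs; the function name is hypothetical, and the sample text is an abbreviated copy of the output above rather than the result of running the tool.

```python
def fusefs_refs(lsvfs_output):
    """Return the Refs count for fusefs, or None if fusefs is not listed."""
    for line in lsvfs_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "fusefs":
            return int(fields[1])
    return None

sample = """\
Filesystem                        Refs Flags
-------------------------------- ----- ---------------
hfs                                  2 local, dovolfs
fusefs                               0
"""
refs = fusefs_refs(sample)
print("safe to upgrade without restarting" if not refs else f"unmount {refs} volume(s) first")
```

On a real system, the text could come from running lsvfs through the subprocess module.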

To sum up, a restart is recommended so that end users don’t have to understand these details. Even then, the only time a restart will actually matter is the case when you upgrade MacFUSE while a MacFUSE-based file system is mounted and then you attempt to mount yet another MacFUSE-based file system using the upgraded version of MacFUSE.

Retrieving x86 Processor Information

March 2nd, 2009

The other day I needed to know within one of my experimental programs if the host x86 processor supports certain features. In many cases, the operating system provides interfaces that can answer such questions. Sometimes, the interfaces may not have the answer, or you may wish to avoid them for other reasons. (Say, you don’t wish to depend on anything operating-system-specific.) Of course, one can turn to the x86 processor’s venerable CPUID instruction. I took some code from the xnu kernel and made it into a user-space program that displays processor information. For academic needs, it is always good to have a program at hand that does things from first principles.
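
As a taste of the decoding involved, here is a small Python sketch that splits a processor signature (the value CPUID leaf 1 returns in EAX) into its fields. It is not taken from the cpuinfo_x86 source, and the function and key names are my own. The sample value 67237 (0x106A5) is a signature from a Nehalem-class processor.

```python
def decode_signature(eax):
    """Split a CPUID leaf 1 EAX value into stepping, model, and family fields."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # The extended fields only contribute for certain family codes.
    display_model = model + (ext_model << 4) if family in (6, 15) else model
    display_family = family + ext_family if family == 15 else family
    return {
        "stepping": stepping,
        "family": family,
        "extended_model": ext_model,
        "extended_family": ext_family,
        "display_model": display_model,
        "display_family": display_family,
    }

info = decode_signature(67237)
print(info)
```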

Here is the source for cpuinfo_x86.

The following is an example of the program in action. You must compile it as a 32-bit program. Besides Mac OS X, the program should also work on Linux, FreeBSD, and perhaps some other operating systems.

$ gcc -march=i386 -m32 -o cpuinfo_x86 cpuinfo_x86.c
$ ./cpuinfo_x86
# Identification
Vendor                : GenuineIntel
Brand String          : Intel(R) Xeon(R) CPU           X5550  @ 2.67GHz
Model Number          : 26 (Nehalem)
Family Code           : 6
Extended Model        : 1
Extended Family       : 0
Stepping ID           : 5
Signature             : 67237

# Address Bits
Physical Addressing   : 40
Virtual Addressing    : 48

# Multi-Core Information
Logical Processors (Threads) per Physical Processor : 16
Cores per Physical Package                          : 8

# Caches
## L1 Instruction Cache
Size                  : 32K
Line Size             : 64B
Sharing               : shared between 2 processor threads
Sets                  : 128
Partitions            : 1
Associativity         : 4

## L1 Data Cache
Size                  : 32K
Line Size             : 64B
Sharing               : shared between 2 processor threads
Sets                  : 64
Partitions            : 1
Associativity         : 8

## L2 Unified Cache
Size                  : 256K
Line Size             : 64B
Sharing               : shared between 2 processor threads
Sets                  : 512
Partitions            : 1
Associativity         : 8

## L3 Unified Cache
Size                  : 8M
Line Size             : 64B
Sharing               : shared between 16 processor threads
Sets                  : 8192
Partitions            : 1
Associativity         : 16

# Translation Lookaside Buffers
Instruction TLBs      : 7 large, 0 small
Data TLBs             : 32 large, 64 small

# Features
ACPI                  : Thermal Monitor and Software Controlled Clock
APIC                  : On-Chip APIC Hardware
CLFSH                 : CLFLUSH Instruction
CMOV                  : Conditional Move Instruction
CX16                  : CMPXCHG16B Instruction
CX8                   : CMPXCHG8 Instruction
DCA                   : Direct Cache Access
DE                    : Debugging Extension
DS                    : Debug Store
DS-CPL                : CPL Qualified Debug Store
DTES64                : 64-Bit Debug Store
EST                   : Enhanced Intel SpeedStep Technology
FPU                   : Floating-Point Unit On-Chip
FXSR                  : FXSAVE and FXSTOR Instructions
HTT                   : HyperThreading
MCA                   : Machine-Check Architecture
MCE                   : Machine-Check Exception
MMX                   : MMX Technology
MONITOR               : MONITOR/MWAIT Instructions
MSR                   : Model Specific Registers
MTRR                  : Memory Type Range Registers
PAE                   : Physical Address Extension
PAT                   : Page Attribute Table
PBE                   : Pending Break Enable
PDCM                  : Perfmon and Debug Capability
PGE                   : Page Global Enable
POPCNT                : POPCNT Instruction
PSE                   : Page Size Extension
PSE-36                : 36-Bit Page Size Extension
SEP                   : Fast System Call
SS                    : Self-Snoop
SSE                   : Streaming SIMD Extensions
SSE2                  : Streaming SIMD Extensions 2
SSE3                  : Streaming SIMD Extensions 3
SSSE3                 : Supplemental Streaming SIMD Extensions 3
SSE4.1                : Streaming SIMD Extensions 4.1
SSE4.2                : Streaming SIMD Extensions 4.2
TM                    : Thermal Monitor
TM2                   : Thermal Monitor 2
TSC                   : Time Stamp Counter
VME                   : Virtual Mode Extension
VMX                   : Virtual Machine Extensions
xTPR                  : xTPR Update Control

# Extended Features
EM64T                 : Intel Extended Memory 64 Technology
XD                    : Execution Disable

Despite the detailed information cpuinfo_x86 shows, it still does not retrieve all possible information you can get through CPUID. Moreover, because I wrote it for the x86 version of Mac OS X, the program assumes that you have a relatively recent x86 processor and will likely not behave too well on old processors. Writing an exhaustive and legacy-aware CPUID program is a more rigorous programming exercise.

Displaying the Physical Memory Map

February 25th, 2009

The Apple Kernel Debug Kit comes with a kernel gdb macros file (kgmacros) that contains numerous macros useful during low-level development and analysis. One of the macros is showbootermemorymap, which dumps the physical memory map from EFI. The information in this map is very useful for certain types of development.

Since I am often mobile, without ready access to a two-machine debugging setup, I can’t use real kernel debugging much of the time. For introspection-only kernel "debugging", one can use the darwin-kernel gdb target on Mac OS X. It does require that you have a /dev/kmem device available. /dev/kmem is not enabled by default on Mac OS X, but you can enable it at boot time by including kmem=1 in the kernel’s boot arguments.

$ sudo nvram boot-args="-v kmem=1 ..."
$ sudo reboot
/* system reboots */
$ cd /path/to/kernel/debug/kit/for/current/system/
$ sudo gdb ./mach_kernel
...
(gdb) source kgmacros
Loading Kernel GDB Macros package.  Type "help kgm" for more info.
(gdb) target darwin-kernel
(gdb) attach
Connected.
(gdb) showbootermemorymap
Type       Physical Start   Number of Pages
available  0000000000000000 000000000000008f
ACPI_NVS   000000000008f000 0000000000000001
available  0000000000090000 0000000000000010
...
RT_data    0000000000a8c000 000000000000002b
RT_data    0000000000ab7000 0000000000000001
(gdb)

A more convenient alternative is to simply use DTrace to print this information. The following is a DTrace script that mimics the behavior of the showbootermemorymap kernel debug macro.

#! /usr/sbin/dtrace -s

/*
 * showbootermemorymap - DTrace script that prints out the physical memory
 *                       map from EFI. Mimics the output format of the
 *                       kernel debugging macro of the same name.
 *
 * Amit Singh
 * http://osxbook.com
 */

#pragma D option quiet

BEGIN
{
    self->inited = 1;

    self->kgm_boot_args = ((struct boot_args*)(`PE_state).bootArgs);
    self->kgm_msize = self->kgm_boot_args->MemoryMapDescriptorSize;
    self->kgm_mcount = self->kgm_boot_args->MemoryMapSize / self->kgm_msize;

    printf("Type       Physical Start   Number of Pages\n");

    self->kgm_i = 0;
}

fbt:::entry
/self->inited && self->kgm_i < self->kgm_mcount/
{
    this->kgm_mptr = (struct EfiMemoryRange*)
    ((unsigned long)self->kgm_boot_args->MemoryMap +
                    self->kgm_i * self->kgm_msize);

    self->kgm_i++;

    printf("%s", (this->kgm_mptr->Type == 0)  ? "reserved  " :
                 (this->kgm_mptr->Type == 1)  ? "LoaderCode" :
                 (this->kgm_mptr->Type == 2)  ? "LoaderData" :
                 (this->kgm_mptr->Type == 3)  ? "BS_code   " :
                 (this->kgm_mptr->Type == 4)  ? "BS_data   " :
                 (this->kgm_mptr->Type == 5)  ? "RT_code   " :
                 (this->kgm_mptr->Type == 6)  ? "RT_data   " :
                 (this->kgm_mptr->Type == 7)  ? "available " :
                 (this->kgm_mptr->Type == 8)  ? "Unusable  " :
                 (this->kgm_mptr->Type == 9)  ? "ACPI_recl " :
                 (this->kgm_mptr->Type == 10) ? "ACPI_NVS  " :
                 (this->kgm_mptr->Type == 11) ? "MemMapIO  " :
                 (this->kgm_mptr->Type == 12) ? "MemPortIO " :
                 (this->kgm_mptr->Type == 13) ? "PAL_code  " :
                                                "UNKNOWN   ");
    printf(" %016llx %016llx\n",
           this->kgm_mptr->PhysicalStart, this->kgm_mptr->NumberOfPages);
}

fbt:::return
/self->inited && self->kgm_i >= self->kgm_mcount/
{
    exit(0);
}


All contents of this site, unless otherwise noted, are ©1994-2014 Amit Singh. All Rights Reserved.