A File System Change Logger

© Amit Singh. All Rights Reserved. Written in May 2005

Background

Spotlight was a highly anticipated feature of Mac OS X 10.4 "Tiger". From a technical standpoint, you could think of Spotlight as roughly encompassing the following:

This discussion involves accessing and demonstrating the lowest-level building block, the change notification mechanism.

Although this is a wild guess, and I could be entirely wrong, the Spotlight project might have been an 18-month to 2-year project from conception to release.

I was once (mid 2003-2004) involved in an endeavor that was conceptually very similar to Spotlight, though less consequential, the details of which I am unable to discuss. Consequently, I have been very interested in Spotlight ever since I first heard about it. Even though it may be easy to explain or understand what Spotlight is, implementing the technology, especially in a commercial end-user system, is a monumental task in my opinion. I think Apple engineers have done a very admirable job with Spotlight, making pragmatic choices that work well.

fslogger

fslogger is a user-space program that subscribes to the same file system event notification mechanism as the Spotlight metadata server.

Note that fslogger does not use the Spotlight APIs. It uses the mechanism that underlies Spotlight.

fslogger's mode of operation is very simple. The following points are particularly noteworthy:


Once active, fslogger will wait for change notifications to arrive from the file system layer in the kernel. The various file system operations that are communicated to fslogger (and other subscribers such as Spotlight, specifically the metadata server) include:

The "exchange" operation on HFS Plus is used to exchange fork data of two files by simply swapping certain information in the Catalog file, thus preserving the file ID when updating an existing file.

fslogger receives and displays the aforementioned events practically instantly, courtesy of the kernel's support. An event notification contains details of the file system object(s) on which the event happened. fslogger processes this information, enhancing it marginally (for example, by determining human-friendly names corresponding to process, user, and group identifiers). The information displayed by fslogger for a typical file system event includes:

In its current version, fslogger uses a relatively small queue for holding change notifications. Under heavy file system activity, the queue may become full, and the kernel may have to drop an event, an action which itself is an event, and is reported as such by fslogger.
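The overflow behavior described above can be modeled with a toy bounded queue. This is only an illustrative sketch, not fslogger's or the kernel's actual code: the class name, the capacity, and the "EVENTS DROPPED" label are all hypothetical.

```python
from collections import deque


class BoundedEventQueue:
    """Toy model: a fixed-capacity queue that reports overflow as an event."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = False  # set once at least one event has been discarded

    def post(self, event):
        """Producer side: never blocks; discards the event if the queue is full."""
        if len(self.events) >= self.capacity:
            self.dropped = True
            return
        self.events.append(event)

    def drain(self):
        """Consumer side: return queued events, followed by a drop marker if needed."""
        out = list(self.events)
        self.events.clear()
        if self.dropped:
            out.append("EVENTS DROPPED")  # hypothetical label for the drop event
            self.dropped = False
        return out


queue = BoundedEventQueue(capacity=3)
for i in range(5):
    queue.post(f"CREATE FILE {i}")

drained = queue.drain()
print(drained)
```

The consumer never learns which events were lost, only that some were, which is why a subscriber that sees the marker has to rediscover the changes by other means.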

Microsoft Windows

It must be noted that certain versions of Microsoft Windows already provide analogous features, with some differing aspects. The NTFS Change Journal provides a persistent file system change log. When file system objects are added, deleted, or modified, the change is recorded in a per-volume journal.

The Change Journal is useful for file system replicators and indexers, incremental backup applications, virus and security-related scanners, and so on. In particular, Microsoft uses this mechanism for implementing the Indexing Service feature of Windows XP Professional.

Windows provides another related feature, the ReadDirectoryChangesW function, which retrieves directory-specific change information.

A sample output from fslogger is shown below.

$ sudo ./fslogger
fslogger ready
=> received 164 bytes
# Event
  type      = CREATE FILE
  pid       = 286 (zsh)
  # Details
    # type    len  data
    VNODE      28  path = /Users/amit/Desktop/foo.txt
    DEVICE      4  dev = 0xe000002 (major 14, minor 2)
    INODE       4  ino = 808517
    MODE        4  mode = -rw-r--r-- (0x0081a4, vnode type VREG)
    UID         4  uid = 501 (amit)
    GID         4  gid = 501 (amit)
    DONE (0xb33f)
=> received 84 bytes
# Event
  type      = CONTENT MODIFIED
  pid       = 79 (Finder)
  # Details
    # type    len  data
    VNODE      30  path = /Users/amit/Desktop/.DS_Store
    DEVICE      4  dev = 0xe000002 (major 14, minor 2)
    INODE       4  ino = 235865
    MODE        4  mode = -rw------- (0x008180, vnode type VREG)
    UID         4  uid = 501 (amit)
    GID         4  gid = 501 (amit)
    DONE (0xb33f)
=> received 88 bytes
# Event
  type      = CONTENT MODIFIED
  pid       = 3571 (mdimport)
  # Details
    # type    len  data
    VNODE      34  path = /private/tmp/objc_sharing_ppc_501
    DEVICE      4  dev = 0xe000002 (major 14, minor 2)
    INODE       4  ino = 801404
    MODE        4  mode = -rw------- (0x008180, vnode type VREG)
    UID         4  uid = 501 (amit)
    GID         4  gid = 0 (wheel)
    DONE (0xb33f)
...
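The DEVICE and MODE fields in the sample output can be decoded by hand. The sketch below reproduces the values shown there. It assumes the Darwin device-number layout (major number in the top 8 bits, minor number in the low 24 bits); the helper names are made up for illustration and are not taken from fslogger.

```python
import stat


def split_dev(dev):
    """Split a device number into (major, minor), assuming the Darwin layout."""
    return (dev >> 24) & 0xFF, dev & 0xFFFFFF


def describe_mode(mode):
    """Return the ls-style permission string and a vnode-type label for a mode."""
    if stat.S_ISREG(mode):
        kind = "VREG"
    elif stat.S_ISDIR(mode):
        kind = "VDIR"
    else:
        kind = "other"
    return stat.filemode(mode), kind


dev_parts = split_dev(0xE000002)
created = describe_mode(0x0081A4)
modified = describe_mode(0x008180)

print(dev_parts)
print(created)
print(modified)
```

Both mode values have the regular-file bit set and differ only in the permission bits, which matches the VREG label printed for each event.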

Uses

Apple's existing use of this notification support in Spotlight is a good demonstration of its utility and power. fslogger aims to further demonstrate the flexibility of this feature. Consider some scenarios in which fslogger (or more appropriately, a custom program that uses this feature) could be useful. Note that I am merely thinking aloud here, and that these scenarios may vary greatly in feasibility or utility.

Caveat


The interface that fslogger uses is private to Apple. Currently, there is a caveat regarding the use of this interface by third parties (including fslogger). While the change notification interface supports multiple clients, there is a single kernel buffer for holding events that are to be delivered to one or more subscribers, with the primary subscriber being Spotlight. The kernel must hold each event until it has notified all subscribers that are interested in it. Since there is a single buffer, a slow subscriber can cause it to overflow. If this happens, events will be dropped for all subscribers, including Spotlight. Consequently, Spotlight may need to look at the entire volume to determine "what changed".
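To make the caveat concrete, here is a toy model of a single buffer shared by several subscribers, in which a slot is freed only after every subscriber has read the event in it. It is a sketch under stated assumptions, not the kernel's implementation: the class, the capacity of 2, the subscriber labels, and the "EVENTS DROPPED" marker are all hypothetical.

```python
class SharedEventBuffer:
    """Toy model: one buffer, many readers; a slot frees up only when all have read it."""

    def __init__(self, capacity, subscribers):
        self.capacity = capacity
        self.subscribers = set(subscribers)
        self.pending = []         # list of (event, set of subscribers yet to read it)
        self.dropped_for = set()  # subscribers that still need to be told about a drop

    def post(self, event):
        """Add an event; if the buffer is full, the event is lost for everyone."""
        if len(self.pending) >= self.capacity:
            self.dropped_for = set(self.subscribers)
            return False
        self.pending.append((event, set(self.subscribers)))
        return True

    def read(self, name):
        """Deliver all unread events to one subscriber, plus a drop marker if any."""
        delivered = []
        for event, waiting in self.pending:
            if name in waiting:
                waiting.discard(name)
                delivered.append(event)
        # Keep only the events that some subscriber has not read yet.
        self.pending = [(e, w) for e, w in self.pending if w]
        if name in self.dropped_for:
            self.dropped_for.discard(name)
            delivered.append("EVENTS DROPPED")
        return delivered


buf = SharedEventBuffer(capacity=2, subscribers=["spotlight", "fslogger"])

buf.post("e1")
fast_first = buf.read("spotlight")   # the fast subscriber keeps up
buf.post("e2")
fast_second = buf.read("spotlight")
accepted = buf.post("e3")            # buffer is still full: the slow reader has read nothing

fast_third = buf.read("spotlight")
slow_all = buf.read("fslogger")
print(fast_first, fast_second, accepted, fast_third, slow_all)
```

Even though the fast subscriber read every event promptly, it still misses the third one, because the slow subscriber kept both slots occupied.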

fslogger is meant to be a learning tool. If you use it, you must understand the aforementioned caveat. If you cause heavy enough file system activity (what's "heavy" will vary greatly, depending on your system and its currently available resources), both fslogger and Spotlight may miss events, causing Spotlight to spend some extra time looking at your volume. Note that Spotlight will not reindex the entire volume; it will only look for the changes that it missed.

An example of typically heavy file system activity (which may quite possibly cause events to be dropped) is unpacking a giant tarball. Finally, if events are missed, fslogger will report that fact, since dropping events is itself an event.

Download


FSLogger-1.1.dmg (only for Mac OS X 10.4.x "Tiger")

FSLogger-2.1.dmg (only for Mac OS X 10.5.x "Leopard")

FSLogger Source Code