Accessing Kernel Memory on the x86 Version of Mac OS X

© Amit Singh. All Rights Reserved. Written in May 2006


Beginning with the x86 version of Mac OS X, Apple removed the /dev/mem and /dev/kmem devices from the system, for reasons that might include interface cleanliness, portability across system versions, security, and perhaps even a bit of obscurity. Consequently, programs that use these devices for directly accessing kernel memory will not work. Similarly, libraries (most notably libkvm) that use these devices will also not work.

Memory devices are sometimes convenient while debugging or analyzing the system, especially in the academic context. My book Mac OS X Internals: A Systems Approach has a small number of examples that involve directly reading kernel memory from user space to illustrate certain concepts, say, by showing the contents of certain kernel data structures. In some examples, it is especially convenient to use the dd command to read from /dev/kmem. I promised in the book (in the footnote on page 560, Chapter 6, The xnu Kernel, and also in Appendix A) that the accompanying web site would provide information about implementing your own /dev/kmem. Although there are other ways besides these devices to read kernel memory, they are often more involved and less convenient. In this discussion, we will see how to implement your own /dev/kmem.

As a trivial alternative to the kernel extension described in this document, you can try using the kmem=1 boot-time argument. If your kernel supports this argument (the Apple kernels at the time of this writing do), setting it will reenable the kernel memory device.

The source code in this discussion is useful for several reasons such as the following:

The Memory Devices: Looking Back

In the earliest versions of UNIX, the mem device mapped the computer's physical memory in the /dev/mem special file, which provided byte-oriented read/write access to the memory—that is, byte offsets in /dev/mem corresponded to physical memory addresses. A common use of /dev/mem was to debug the system. In particular, the running kernel could be patched through /dev/mem using a debugger. Accessing invalid offsets (memory addresses that did not exist) resulted in a kernel panic.

    # ls -l /unix    # 5th Edition UNIX
    -rwxr-xr-x 1 bin 25802 Mar 21 12:07 /unix    # ~25KB!
    # ls -l /dev
    total 0
    cr--r--r-- 1 bin 1, 0 Nov 26 18:13 mem
    crw-rw-rw- 1 bin 1, 2 Nov 26 18:13 null
    crw--w--w- 1 root 0, 0 Mar 21 12:08 tty8
    #

Since there was no virtual memory in early UNIX, there was no /dev/kmem.

Most Unix systems implement the /dev/mem and /dev/kmem character devices. As in early UNIX, mem maps the system's physical memory into a file, whereas kmem maps the kernel's virtual memory into a file. There also exist variations on this theme: for example, the /dev/port device on Linux maps the I/O ports into a file. Moreover, some systems allow physical memory region attributes (such as cacheability and write-protection) to be manipulated through ioctl() calls on /dev/mem.

The KVM Library

Although /dev/kmem makes it rather easy to read kernel virtual memory, it is excruciating to read structured information this way, especially if the data structures in question involve pointers. Besides, the programmer has to somehow determine which addresses to read from, so there is the additional issue of symbol lookup. The kvm library, first introduced in SunOS, provides a uniform interface for accessing kernel memory corresponding to live or dead (crashed) kernels. It also allows symbols to be looked up by name. There also exist less generic, system-specific kvm library functions such as kvm_getprocs(), kvm_getargv(), kvm_getenvv(), and kvm_getloadavg(). The kvm library needs the memory devices for its operation. Therefore, it is obsoleted in the x86 version of Mac OS X. Note, however, that the library itself is implemented within the system library, and is still present.

On some Unix systems, the kvm library is used to implement debuggers and other programs that retrieve statistics from the kernel. The ps command is an example of such a program. On recent versions of Mac OS X, ps uses special sysctls (for example, the CTL_KERN -> KERN_PROC -> KERN_PROC_ALL MIB).

The Disappearance of Memory Devices in Mac OS X

As we already implied, it is a bad idea in general for a program to directly access or examine kernel data structures. In cases where certain data structures exist in 32-bit and 64-bit versions, the use of memory devices can be even more cumbersome. Moreover, since access to kernel memory must not be permitted for all programs, it is important (and fraught with vulnerabilities) to manage the security of programs that can. Apple greatly discourages third parties from using anything other than published/supported interfaces in Mac OS X. Even though the memory devices constituted a "published" interface, they allowed access to anything and everything. The removal of these devices means that only certain kernel information is accessible through Apple-endorsed interfaces. Examples of such interfaces include BSD-style sysctl's and the I/O Kit user library. In particular, the latter allows access and manipulation of the I/O Registry.

Nevertheless, as also noted earlier, it is nice to be able to conveniently access arbitrary kernel memory when necessary, although this should almost always be for experimentation or debugging. We will mention two ways (and discuss one) of doing so on Mac OS X in the absence of system-provided memory devices.

Alternative 1: Using Mach VM Interface Calls

As the Mac OS X Internals book shows, Mac OS X has several interesting aspects because of its Mach underpinnings. The book has programming examples that show how one task can access and manipulate another task's virtual memory. In particular, using the same interfaces, a user task can access the kernel task's memory, which is equivalent to the functionality of /dev/kmem.

Alternative 2: Implementing Your Own Memory Devices

Another alternative is to create a kernel extension that implements /dev/kmem and /dev/mem. This would also allow the kvm library to work. For our experiments, we only need /dev/kmem. However, the kvm library wants /dev/mem to exist, so we will implement a dummy version of /dev/mem. The full source code for the kernel extension is at the end of this page. Let us look at some code excerpts to understand how it works. We will call our kernel extension KernelMemoryAccess.

Figure 1 shows the important part of the Info.plist file for our kernel extension bundle. The OSBundleLibraries key specifies the dependencies of the extension. Note that for the most part, we depend on the kernel programming interfaces (KPIs) that were introduced in Tiger. However, we also need an older dependency because one of the symbols our extension requires, pmap_find_phys(), is not available through the newer KPIs. We use pmap_find_phys() to determine if a given range of kernel virtual memory is valid. If we were implementing a functional /dev/mem, we would have additional dependencies.

    <!-- Info.plist -->
    ...
    <key>OSBundleLibraries</key>
    <dict>
        <key></key>
        <string>8.0.0</string>
        <!-- We need the following for pmap_find_phys() -->
        <key></key>
        <string>7.9.9</string>
    </dict>
    ...

Figure 1. The Info.plist file for the KernelMemoryAccess kernel extension

Mac OS X has broadly two types of kernel extensions: generic and I/O Kit-based. The latter must be implemented in a subset of C++. Our KernelMemoryAccess extension is quite simple, and falls in the generic category. Figure 2 shows the skeletons for our extension's start and stop functions (or "entry points"). Such functions are typical of loadable kernel extensions on many platforms. We normally use the start and stop functions to set up and clean up, respectively, any state or resources we need for the lifetime of the extension. Note that in the case of an I/O Kit-based kernel extension, the start and stop entry points are used by the system and are not available to the programmer (see Chapter 10 for details of why this is so).

    // KernelMemoryAccess.c
    ...
    kern_return_t
    KernelMemoryAccess_start(kmod_info_t *ki, void *d)
    {
        // set up
        ...
    }

    kern_return_t
    KernelMemoryAccess_stop(kmod_info_t *ki, void *d)
    {
        // clean up
        ...
    }

Figure 2. Start and stop functions for the KernelMemoryAccess kernel extension

We will perform the following steps in our start function:

If we were to implement a real /dev/mem, we would also want to determine the physical memory size so that we could perform sanity checking on user-requested offsets into the device. We could use an in-kernel sysctl call for this purpose (Figure 3 shows an excerpt).

    int      ret;
    size_t   oldlen;
    uint64_t physmem_size;
    ...
    oldlen = sizeof(physmem_size);
    ret = sysctlbyname("hw.memsize", &physmem_size, &oldlen, NULL, 0);
    ...

Figure 3. Determining physical memory size in the kernel
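The sanity check itself is not shown in the excerpt. As a rough sketch of what it could look like, the following helper tests whether a requested range falls within physical memory. The function name mem_request_in_range() is hypothetical and is not part of the KernelMemoryAccess source. It is written as plain C so that it can be tried in user space.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper (not part of the KernelMemoryAccess source):
// returns true only if [offset, offset + len) lies entirely below
// physmem_size. The comparison is arranged so that offset + len is
// never computed, which avoids integer overflow.
static bool
mem_request_in_range(uint64_t offset, uint64_t len, uint64_t physmem_size)
{
    if (offset >= physmem_size) {
        return false;
    }
    return len <= physmem_size - offset;
}
```

A /dev/mem read handler could call such a function with the offset and length taken from the uio and return EFAULT when it fails.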

Analogously, we will perform the following cleaning up in our stop function:

In simple terms, a device switch structure is a function pointer table that maps operations (such as read and write) on the device to the corresponding handlers. Figure 4 shows the character device switch structure in our implementation of the memory devices. For simplicity, we will only allow /dev/kmem to be read from. Consequently, most of the work will be done in the read function.

    // Our character device switch structure.
    //
    static struct cdevsw my_mm_cdevsw = {
        nullopen,       // open_close_fcn_t *d_open;
        nullclose,      // open_close_fcn_t *d_close;
        my_mmread,      // read_write_fcn_t *d_read;
        nullwrite,      // read_write_fcn_t *d_write;
        my_mmioctl,     // ioctl_fcn_t      *d_ioctl;
        nullstop,       // stop_fcn_t       *d_stop;
        nullreset,      // reset_fcn_t      *d_reset;
        0,              // struct tty      **d_ttys;
        my_mmselect,    // select_fcn_t     *d_select;
        eno_mmap,       // mmap_fcn_t       *d_mmap;
        eno_strat,      // strategy_fcn_t   *d_strategy;
        eno_getc,       // getc_fcn_t       *d_getc;
        eno_putc,       // putc_fcn_t       *d_putc;
        D_TTY,          // int               d_type;
    };

Figure 4. The character device switch structure for our memory devices
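To make the idea of a function pointer table concrete, here is a toy user-space model of a device switch. It is only a sketch: the toy_ types and functions are invented for illustration and do not correspond to kernel interfaces. It mirrors the design choice above of a device that can be read from but not written to.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

// A toy, user-space stand-in for a device switch entry. The type and
// function names here are illustrative, not the kernel's.
typedef int (*toy_read_fn)(void *buf, size_t len);
typedef int (*toy_write_fn)(const void *buf, size_t len);

struct toy_devsw {
    toy_read_fn  d_read;
    toy_write_fn d_write;
};

// Read handler: fills the buffer with zeros and reports success.
static int
toy_zero_read(void *buf, size_t len)
{
    memset(buf, 0, len);
    return 0;
}

// Write handler for a read-only device: always refuses.
static int
toy_no_write(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return ENOTSUP;
}

// The "switch": one table entry per major number.
static struct toy_devsw toy_table[] = {
    { toy_zero_read, toy_no_write },    // major 0
};

#define TOY_NDEVS (sizeof(toy_table) / sizeof(toy_table[0]))

// Dispatch a read on the given major number through the table.
static int
toy_dev_read(unsigned major, void *buf, size_t len)
{
    if (major >= TOY_NDEVS) {
        return ENXIO;
    }
    return toy_table[major].d_read(buf, len);
}

// Dispatch a write on the given major number through the table.
static int
toy_dev_write(unsigned major, const void *buf, size_t len)
{
    if (major >= TOY_NDEVS) {
        return ENXIO;
    }
    return toy_table[major].d_write(buf, len);
}
```

The caller never names a handler directly. It only supplies a major number, and the table decides which function runs.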

Figure 5 shows part of the implementation of the read function (or rather, the backend handler that could potentially serve both read and write frontends). Note that in the case of minor number 0, that is, /dev/mem, we simply return EFAULT. Consequently, a program would see a "bad address" error whenever it attempts to read physical memory from this device.

    static int
    my_mmrw(dev_t dev, struct uio *uio, enum uio_rw rw)
    {
        ...
        int error = 0;

        if (rw != UIO_READ) {
            return ENOTSUP;
        }

        while (uio_resid(uio) > 0 && error == 0) {
            ...
            switch (minor(dev)) {

            // /dev/mem
            case 0:
                goto fault; // say it's a bad address

            // /dev/kmem
            case 1:
                c = uio_iov_len(uio);
                if (!verify_access(uio->uio_offset, c)) {
                    goto fault;
                }
                error = uiomove((caddr_t)(uintptr_t)uio->uio_offset, (int)c, uio);
                continue;
            ...
            }
            ...
            // Do uio bookkeeping.
            ...
        }
        ...
        return error;

    fault:
        return EFAULT;
    }

Figure 5. Implementing read functionality for /dev/kmem

Figure 5 also shows that we call verify_access() before returning virtual memory to the caller. Figure 6 shows how we implement verify_access(). Given a virtual memory range, this function iterates over each page in the range, calling pmap_find_phys() on the page, with the kernel's pmap (physical map) specified as an argument. Chapter 8 discusses the Mach pmap interface. pmap_find_phys() finds the physical address for a kernel virtual address.

    boolean_t
    verify_access(off_t start, size_t len)
    {
        off_t base = trunc_page(start);
        off_t end = start + len;

        while (base < end) {
            if (pmap_find_phys(kernel_pmap, (addr64_t)base) == (ppnum_t)0) {
                return FALSE;
            }
            base += page_size;
        }

        return TRUE;
    }

Figure 6. Sanity checking requested kernel virtual memory
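The page walk in Figure 6 can be tried outside the kernel by substituting a callback for pmap_find_phys(). The following is a user-space model only. The names range_is_mapped(), page_mapped_fn, and demo_is_mapped() are made up for this sketch, and the "mapped" region in the demo callback is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

// Callback that reports whether the page starting at page_addr is
// mapped. In the kernel extension this role is played by
// pmap_find_phys(); here it is a caller-supplied stand-in.
typedef bool (*page_mapped_fn)(uint64_t page_addr);

// User-space model of the page walk. page_size must be a power of two.
// Like the original, this does not guard against start + len overflowing.
static bool
range_is_mapped(uint64_t start, uint64_t len, uint64_t page_size,
                page_mapped_fn is_mapped)
{
    uint64_t base = start & ~(page_size - 1);  // round down, like trunc_page()
    uint64_t end = start + len;

    while (base < end) {
        if (!is_mapped(base)) {
            return false;
        }
        base += page_size;
    }
    return true;
}

// Example stand-in: pretend only pages in [0x1000, 0x3000) are mapped.
static bool
demo_is_mapped(uint64_t page_addr)
{
    return page_addr >= 0x1000 && page_addr < 0x3000;
}
```

Note that a request of only a few bytes can still touch two pages if it straddles a page boundary, which is why the walk starts at the truncated base rather than at the requested address.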

Using Your Own /dev/kmem

Let us now see an example of using our kernel extension for peeking into kernel memory. Assuming that the source code tarball is called KernelMemoryAccess-<version>.tar.gz, Figure 7 shows a brief overview of compiling and loading the kernel extension. Needless to say, you need Apple Developer Tools to be installed.

    # We don't have a /dev/kmem on the x86 version of Mac OS X
    $ ls -l /dev/kmem
    ls: /dev/kmem: No such file or directory

    # Unpack the tarball somewhere, say, in /tmp
    $ tar -C /tmp -xzvf KernelMemoryAccess-<version>.tar.gz
    $ cd /tmp/KernelMemoryAccess-<version>

    # Build the kernel extension
    $ xcodebuild -configuration Release

    # Copy the compiled kernel extension bundle to /tmp
    $ cp -pr build/Release/KernelMemoryAccess.kext /tmp

    # Set ownership to make extension management happy
    $ sudo chown -R root:wheel /tmp/KernelMemoryAccess.kext

    # Load the kernel extension
    $ sudo kextload -v /tmp/KernelMemoryAccess.kext
    kextload: extension /tmp/KernelMemoryAccess.kext appears to be valid
    kextload: loading extension /tmp/KernelMemoryAccess.kext
    kextload: sending 1 personality to the kernel
    kextload: /tmp/KernelMemoryAccess.kext loaded successfully

    # Check with kextstat as well
    $ kextstat
    ...
    107 0 0x3bd01000 0x2000 0x1000 com.osxbook.kext.KernelMemoryAccess
    ...

    # Check if we have a /dev/kmem now
    $ ls -l /dev/kmem
    crw-r----- 1 root kmem 11, 1 Jan 1 02:43 /dev/kmem

Figure 7. Compiling and loading the KernelMemoryAccess kernel extension

Figure 7 shows that on this machine, the kernel virtual address where our extension has been loaded is 0x3bd01000. This is the address where the extension's Mach-O header would reside. The executable code should start at the next page. For a page size of 0x1000 bytes, the executable code will be at address 0x3bd02000. Furthermore, the major number allotted to our device on this system is 11, which means that our device's character device switch structure is at slot number 11 (the 12th entry) in the kernel's global array of such structures. The array's name is cdevsw. Let us do a contrived exercise that uses /dev/kmem to verify our understanding of things.

Looking at Figure 4 again, we see that the character device switch structure contains 13 pointers and 1 integer. Since we have a 32-bit kernel, this structure occupies 56 bytes (14×4 bytes) with no padding. Since there are 11 structures before ours in the cdevsw array, the start of our structure will be 616 bytes (56×11 bytes) into cdevsw. Now, let us choose one of the function pointers in the device switch structure: say, d_read. In our case, this pointer will contain the address of my_mmread(). Since d_read is the 3rd pointer in the structure, it will be at an offset of 8 bytes into the structure, or 624 bytes (616 + 8 bytes) into cdevsw.
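The same arithmetic can be checked mechanically. The sketch below models the 32-bit layout by representing each pointer as a 4-byte integer, so the numbers come out the same on any host. The struct name cdevsw32 and the helper d_read_offset() are hypothetical; the field names follow the comments in Figure 4.

```c
#include <stddef.h>
#include <stdint.h>

// Model of the 32-bit layout described above: every pointer is
// represented as a 4-byte integer so the arithmetic holds on any host.
struct cdevsw32 {
    uint32_t d_open;
    uint32_t d_close;
    uint32_t d_read;
    uint32_t d_write;
    uint32_t d_ioctl;
    uint32_t d_stop;
    uint32_t d_reset;
    uint32_t d_ttys;
    uint32_t d_select;
    uint32_t d_mmap;
    uint32_t d_strategy;
    uint32_t d_getc;
    uint32_t d_putc;
    int32_t  d_type;
};

// Byte offset of the d_read slot for a given major number, relative to
// the start of the cdevsw array.
static size_t
d_read_offset(unsigned major)
{
    return (size_t)major * sizeof(struct cdevsw32)
           + offsetof(struct cdevsw32, d_read);
}
```

With this model, sizeof(struct cdevsw32) is 56, offsetof(struct cdevsw32, d_read) is 8, and d_read_offset(11) is 624, matching the figures worked out by hand.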

Let us first compute the address of my_mmread() in the kernel's virtual memory. Next, we will retrieve the contents of kernel virtual memory at an offset of 624 bytes from cdevsw. These two values should be equal.

We can use the nm command to display the value of the text section symbol _my_mmread in the Mach-O executable of our kernel extension bundle.

    $ nm /tmp/KernelMemoryAccess.kext/Contents/MacOS/KernelMemoryAccess
    ...
    0000014f t _my_mmread
    ...

Figure 8. Using the nm command

Recalling our discussion related to Figure 7, and given the information in Figure 8, we see that the address of my_mmread() in the kernel should be 0x3bd02000 + 0x14f, which is 0x3bd0214f. Thus:

The kernel virtual address of my_mmread() is 0x3bd0214f ... (1)
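Statement (1) rests on two assumptions made earlier: that the Mach-O header occupies exactly one page, and that the value nm reports is the symbol's offset from the start of the executable code. Under those assumptions, the computation can be written as a small helper. The name kext_symbol_address() is hypothetical and used only for this illustration.

```c
#include <stdint.h>

// Hypothetical helper: kernel address of a text symbol, assuming (as in
// the discussion above) that the Mach-O header occupies exactly one page
// and that nm reports the symbol's offset from the start of the code.
static uint32_t
kext_symbol_address(uint32_t load_addr, uint32_t page_size,
                    uint32_t symbol_offset)
{
    return load_addr + page_size + symbol_offset;
}
```

For the values on this machine, kext_symbol_address(0x3bd01000, 0x1000, 0x14f) evaluates to 0x3bd0214f. The load address will differ from one load of the extension to the next, so it has to be taken from kextstat each time.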

The address of the cdevsw array in the kernel can be looked up by using nm on the /mach file.

    $ nm /mach | grep cdevsw
    00441720 A _cdevsw
    ...

Figure 9. Determining the kernel virtual address of the cdevsw array

We can now retrieve 4 bytes from the kernel virtual address 0x441720 + 624, which is 0x441990 (or 4462992 decimal).

    # Read 4 bytes from the address 0x441990 (4462992 decimal)
    $ sudo dd if=/dev/kmem of=/dev/stdout bs=1 iseek=4462992 count=4 | od -X
    0000000 3bd0214f
    0000004

Figure 10. Reading from /dev/kmem
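The dd invocation in Figure 10 can also be expressed as a short C function that opens a file and reads one 32-bit word at a given offset with pread(). This is a sketch rather than part of the accompanying source. The name read_word_at() is hypothetical, and the function works on any readable file, which makes it easy to try without a memory device.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

// Read one 32-bit word at the given byte offset of a file or device.
// Returns 0 on success and -1 on failure (including a short read).
// The word is returned in host byte order.
static int
read_word_at(const char *path, off_t offset, uint32_t *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    ssize_t n = pread(fd, out, sizeof(*out), offset);
    close(fd);

    return (n == (ssize_t)sizeof(*out)) ? 0 : -1;
}
```

Assuming the extension is loaded and the program runs as root, calling read_word_at("/dev/kmem", 0x441720 + 624, &word) should retrieve the same word that dd does, although the addresses involved are specific to the kernel and machine used in this example.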

The value stored in cdevsw[11].d_read is 0x3bd0214f ... (2)

Statements (1) and (2) together illustrate what we wanted to achieve.


The source code is available under the terms of the Apple Public Source License (APSL).