Crafting a Tiny Mach-O Executable
The other day I came across this web page in which the author describes his experiment to create a tiny ELF executable that will run on Linux. The result: a 45-byte ELF executable that executes and returns a value. The executable is functionally equivalent to the one generated from compiling the following C program.
/* tiny.c */
int main(void) { return 42; }
Apparently, recent Linux kernels perform stricter checks on ELF executables, so the aforementioned 45-byte executable no longer works. A slightly larger, 64-byte version still works at the time of this writing.
Anyway, as far as tiny executables go, ELF on Linux is taken care of. It would be interesting to repeat a similar experiment for Mach-O executables on Mac OS X.
Let us first see how large the executable generated from the C program is on Mac OS X.
$ cat tiny.c
main() { return 42; }
$ sw_vers
...
ProductVersion: 10.5.6
...
$ gcc -Oz -o tiny tiny.c
$ strip tiny
$ ls -las tiny
32 -rwxr-xr-x 1 singh wheel 12348 Mar 15 17:26 tiny
The following assembly language program can be compiled to generate a 165-byte Mach-O executable that runs on Mac OS X and returns the wisely chosen value 42.
; tiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o tiny tiny.asm
BITS 32
org 0x1000
db 0xce, 0xfa, 0xed, 0xfe ; magic
dd 7 ; cputype (CPU_TYPE_X86)
dd 3 ; cpusubtype (CPU_SUBTYPE_I386_ALL)
dd 2 ; filetype (MH_EXECUTE)
dd 2 ; ncmds
dd _start - _cmds ; sizeofcmds
dd 0 ; flags
_cmds:
dd 1 ; cmd (LC_SEGMENT)
dd 44 ; cmdsize
db "__TEXT" ; segname
db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
dd 0x1000 ; vmaddr
dd 0x1000 ; vmsize
dd 0 ; fileoff
dd filesize ; filesize
dd 7 ; maxprot
dd 5 ; cmd (LC_UNIXTHREAD)
dd 80 ; cmdsize
dd 1 ; flavor (i386_THREAD_STATE)
dd 16 ; count (i386_THREAD_STATE_COUNT)
dd 0, 0, 0, 0, 0, 0, 0, 0 ; state
dd 0, 0, _start, 0, 0, 0, 0, 0 ; state
_start:
xor eax,eax
inc eax
push byte 42
sub esp, 4
int 0x80 ; _exit(42)
filesize equ $ - $$
We can compile and run this program as follows, verifying that its return value is indeed 42.
$ nasm -f bin -o tiny tiny.asm
$ ls -las tiny
8 -rw-r--r-- 1 singh admin 165 Mar 15 12:21 tiny
$ chmod 755 tiny
$ ./tiny
$ echo $?
42
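To make the byte accounting behind the 165-byte figure explicit, here is a minimal Python sketch that assembles the same image with the struct module. The names build_tiny and CODE are assumptions made for illustration and do not appear in the original listing. In particular, the 6-byte encoding of sub esp, 4 is inferred from the reported file size; an assembler that picks the shorter encoding would likely produce a smaller file.

```python
import struct

VMADDR = 0x1000  # matches "org 0x1000" in tiny.asm

# Assumed encoding of the five instructions at _start (13 bytes):
# xor eax,eax / inc eax / push byte 42 / sub esp,4 (long form) / int 0x80
CODE = bytes.fromhex("31c0" "40" "6a2a" "81ec04000000" "cd80")

HEADER_SIZE = struct.calcsize("<7I")        # mach_header: 28 bytes
SEGMENT_SIZE = struct.calcsize("<2I16s5I")  # LC_SEGMENT cut off after maxprot: 44 bytes
THREAD_SIZE = struct.calcsize("<4I16I")     # LC_UNIXTHREAD with 16 registers: 80 bytes


def build_tiny() -> bytes:
    """Return the bytes that tiny.asm describes, field by field."""
    sizeofcmds = SEGMENT_SIZE + THREAD_SIZE    # _start - _cmds
    entry = VMADDR + HEADER_SIZE + sizeofcmds  # address of _start
    filesize = HEADER_SIZE + sizeofcmds + len(CODE)

    # magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags
    header = struct.pack("<7I", 0xFEEDFACE, 7, 3, 2, 2, sizeofcmds, 0)
    # cmd, cmdsize, segname, vmaddr, vmsize, fileoff, filesize, maxprot
    segment = struct.pack("<2I16s5I", 1, 44, b"__TEXT",
                          VMADDR, 0x1000, 0, filesize, 7)
    # cmd, cmdsize, flavor, count, then 16 registers with eip in slot 10
    registers = [0] * 16
    registers[10] = entry
    thread = struct.pack("<4I16I", 5, 80, 1, 16, *registers)
    return header + segment + thread + CODE


image = build_tiny()
print(HEADER_SIZE, SEGMENT_SIZE, THREAD_SIZE, len(CODE), len(image))
# prints: 28 44 80 13 165
```

Writing image to a file opened in binary mode should give the same bytes as the nasm output, provided the assumption about the instruction encoding holds.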
Some points to note:
- The executable (Mach-O header, load commands, and the text segment) is manually crafted in assembly using the nasm 80x86 assembler. The C compiler toolchain is not involved.
- The executable is unusual for Mac OS X in that the dynamic link editor (dyld) is not involved in running it. No dynamic libraries are involved either.
- The program makes a "direct" system call through the int 0x80 interface. This is a big no-no on Mac OS X: production code should not bypass the C library to make system calls, but then hopefully you won't be writing production code using such techniques. A specific caveat is that a system call's implementation may do things differently (in terms of arguments, return values, and such) from the user-callable interface. The implementation may also change across system revisions, so such code may break with a system update.
- The int 0x80 system call path is a legacy path on Mac OS X. It may be removed from Mac OS X some day, in which case the program would need to be modified to use newer alternatives such as sysenter.
- The executable is not a "correct" Mach-O file, even though the kernel can parse and run it in our case. It is incorrect because Mach-O load commands have been deliberately made to overlap each other to save some bytes. The otool object-file introspection command will be quite unhappy with this executable. The following output shows the inconsistencies it reports.
$ otool -l tiny
tiny:
Load command 0
cmd LC_SEGMENT
cmdsize 44 Inconsistent size
segname __TEXT
vmaddr 0x00001000
vmsize 0x00001000
fileoff 0
filesize 165
maxprot 0x00000007
initprot 0x00000005
nsects 80
flags 0x1
Section
sectname
segname (does not match segment)
addr 0x00000000
size 0x00000000
offset 0
align 2^4248 (16777216)
reloff 0
nreloc 0
flags 0x00000000
reserved1 0
reserved2 0
section structure command extends past end of load commands
Section
sectname
segname (does not match segment)
addr 0x00000000
size 0x00000000
offset 0
align 2^0 (1)
reloff 0
nreloc 0
flags 0x00000000
reserved1 0
reserved2 0
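The odd initprot, nsects, and flags values in this output follow directly from the overlap. As a rough illustration, the Python sketch below rebuilds the 124 bytes between _cmds and _start and decodes them as a full 56-byte segment command, the way a tool that trusts the structure layout would. The variable names are assumptions for illustration, and the "expected" size computed at the end is only a guess at the check otool performs.

```python
import struct

# The region between _cmds and _start in tiny.asm: a 44-byte LC_SEGMENT
# (which stops after maxprot) followed directly by the 80-byte LC_UNIXTHREAD.
registers = [0] * 16
registers[10] = 0x1098  # eip = _start = 0x1000 + 28 + 124
commands = (
    struct.pack("<2I16s5I", 1, 44, b"__TEXT", 0x1000, 0x1000, 0, 165, 7)
    + struct.pack("<4I16I", 5, 80, 1, 16, *registers)
)

# Decode the start of the region as a complete 32-bit segment command.
FIELDS = ("cmd", "cmdsize", "segname", "vmaddr", "vmsize", "fileoff",
          "filesize", "maxprot", "initprot", "nsects", "flags")
segment = dict(zip(FIELDS, struct.unpack_from("<2I16s8I", commands, 0)))

# The last three fields are really the cmd, cmdsize, and flavor words of
# LC_UNIXTHREAD, which begins at byte 44 of the region.
print(segment["initprot"], segment["nsects"], segment["flags"])
# prints: 5 80 1

# Assumption: a consistent cmdsize would cover the 56-byte command plus
# one 68-byte section structure per section.
expected_cmdsize = 56 + segment["nsects"] * 68
print(segment["cmdsize"], expected_cmdsize)
# prints: 44 5496
```

So initprot doubles as the LC_UNIXTHREAD command number (5), nsects as its cmdsize (80), and flags as the thread state flavor (1), which is exactly what otool prints.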
At the cost of roughly 80 additional bytes, we can create a "more correct" Mach-O executable that otool will be happy with.
; nicertiny.asm for Mac OS X (Mach-O Object File Format)
; nasm -f bin -o nicertiny nicertiny.asm
BITS 32
org 0x1000
db 0xce, 0xfa, 0xed, 0xfe ; magic
dd 7 ; cputype (CPU_TYPE_X86)
dd 3 ; cpusubtype (CPU_SUBTYPE_I386_ALL)
dd 2 ; filetype (MH_EXECUTE)
dd 2 ; ncmds
dd _start - _cmds ; sizeofcmds
dd 0 ; flags
_cmds:
dd 1 ; cmd (LC_SEGMENT)
dd 124 ; cmdsize
db "__TEXT" ; segname
db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
dd 0x1000 ; vmaddr
dd 0x1000 ; vmsize
dd 0 ; fileoff
dd filesize ; filesize
dd 7 ; maxprot
dd 5 ; initprot
dd 1 ; nsects
dd 0 ; flags
db "__text" ; sectname
db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; sectname
db "__TEXT" ; segname
db 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ; segname
dd _start ; addr
dd _end - _start ; size
dd _start - 0x1000 ; offset
dd 2 ; align
dd 0 ; reloff
dd 0 ; nreloc
dd 0 ; flags
dd 0 ; reserved1
dd 0 ; reserved2
dd 5 ; cmd (LC_UNIXTHREAD)
dd 80 ; cmdsize
dd 1 ; flavor (i386_THREAD_STATE)
dd 16 ; count (i386_THREAD_STATE_COUNT)
dd 0, 0, 0, 0, 0, 0, 0, 0 ; state
dd 0, 0, _start, 0, 0, 0, 0, 0 ; state
_start:
xor eax, eax
inc eax
push dword 42
sub esp, 4
int 0x80 ; _exit(42)
_end:
filesize equ $ - $$
Let us compile and run this version, and see what otool has to say.
$ nasm -f bin -o nicertiny nicertiny.asm
$ ls -las nicertiny
8 -rw-r--r-- 1 singh admin 248 Mar 15 14:49 nicertiny
$ chmod 755 nicertiny
$ ./nicertiny
$ echo $?
42
$ otool -l nicertiny
nicertiny:
Load command 0
cmd LC_SEGMENT
cmdsize 124
segname __TEXT
vmaddr 0x00001000
vmsize 0x00001000
fileoff 0
filesize 248
maxprot 0x00000007
initprot 0x00000005
nsects 1
flags 0x0
Section
sectname __text
segname __TEXT
addr 0x000010e8
size 0x00000010
offset 232
align 2^2 (4)
reloff 0
nreloc 0
flags 0x00000000
reserved1 0
reserved2 0
Load command 1
cmd LC_UNIXTHREAD
cmdsize 80
flavor i386_THREAD_STATE
count i386_THREAD_STATE_COUNT
eax 0x00000000 ebx 0x00000000 ecx 0x00000000 edx 0x00000000
edi 0x00000000 esi 0x00000000 ebp 0x00000000 esp 0x00000000
ss 0x00000000 eflags 0x00000000 eip 0x000010e8 cs 0x00000000
ds 0x00000000 es 0x00000000 fs 0x00000000 gs 0x00000000
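The numbers otool reports here can be cross-checked with a little arithmetic. The following Python sketch is only an illustration: the constant names are made up, and the format strings are my reading of the 32-bit structure layouts used in nicertiny.asm.

```python
import struct

HEADER = struct.calcsize("<7I")         # mach_header
SEGMENT = struct.calcsize("<2I16s8I")   # segment command, including initprot/nsects/flags
SECTION = struct.calcsize("<16s16s9I")  # one section structure
THREAD = struct.calcsize("<4I16I")      # LC_UNIXTHREAD with 16 registers

FILE_SIZE = 248  # as reported by ls for nicertiny

segment_cmdsize = SEGMENT + 1 * SECTION            # one section: __text
start_offset = HEADER + segment_cmdsize + THREAD   # file offset of _start
start_addr = 0x1000 + start_offset                 # org 0x1000
code_size = FILE_SIZE - start_offset               # _end - _start

print(segment_cmdsize, start_offset, hex(start_addr), hex(code_size))
# prints: 124 232 0x10e8 0x10
```

These match the cmdsize, offset, addr (and eip), and size fields in the otool output above.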
It would be a nice exercise for the reader to try to shrink tiny.asm and nicertiny.asm even further, while retaining the high-level behavior of the corresponding executables. There are plenty of zeros lurking in there.