Fuzz User Guide

Tavis Ormandy
20070608

1. About fuzz testing
- 1.1. About fuzz
2. How to use fuzz
- 2.1. Configuration
- 2.2. Choosing an input file
3. Tips and tricks
4. Appendix

1. About fuzz testing

Fuzz testing is an auditing and QA process for finding bugs in applications.

The basic principle of fuzz testing is to corrupt a valid input file in subtle ways in order to trigger bugs in the application. The trick is to be subtle enough that the application can still identify and read the file, while corrupting it enough to test the application's resilience to bad input. If an application is not resilient to bad input, fuzz testing may uncover bugs or security issues that require further investigation.

1.1. About fuzz

Fuzz is a hybrid bash script designed to perform fuzz testing. bash was originally chosen to implement fuzz because it is anticipated that some applications may require modifications to fuzz in order to be fuzz tested. This should be simple and painless in bash, as almost everyone is familiar and comfortable with the syntax.

Executing programs and checking return codes is what shells do best, making bash a good choice for this application. The performance bottleneck encountered when fuzz testing is usually the application itself, rather than fuzz, so the only consideration is minimising any overhead. As certain shell constructs are inherently slow, such as substitution and pipelines which require invoking a subshell, some code has been implemented in a DSO which is loaded by fuzz at startup. These (very simple) routines are written in C, and designed to eliminate the overhead of common functions fuzz has to perform, such as reading or writing the value of a byte at a file offset.

To minimise overhead, builtins are preferred whenever possible, and substitution and pipelines have been deliberately avoided in any tight loops.
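
To illustrate the overhead in question, here is a minimal sketch of reading and writing a single byte at a file offset using only standard utilities. The function names peek_byte and poke_byte are hypothetical and are not part of fuzz. Every call forks od or dd, which is the kind of cost the C routines are meant to avoid.

```bash
# Hypothetical helpers (not part of fuzz): read or write one byte at an
# offset using external tools. Each call forks a process, so this is slow
# inside a tight loop.

# peek_byte FILE OFFSET -> prints the byte value in decimal (0-255)
peek_byte() {
    local file=$1 offset=$2 value
    value=$(od -An -v -tu1 -j "$offset" -N 1 "$file" 2>/dev/null) || return 1
    value=${value//[[:space:]]/}    # od pads the number with blanks
    [[ -n $value ]] || return 1     # offset is at or past the end of the file
    printf '%s\n' "$value"
}

# poke_byte FILE OFFSET VALUE -> overwrites one byte in place
poke_byte() {
    local file=$1 offset=$2 value=$3 octal
    printf -v octal '%03o' "$value"
    # conv=notrunc keeps the rest of the file intact
    printf "\\$octal" | dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}
```

For instance, poke_byte input.png 8 255 would set the ninth byte of input.png to 0xff.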

2. How to use fuzz

To use fuzz, you simply need to construct a command line that will start your program, process a file, then exit. You then provide fuzz with a valid input file to begin corrupting. The syntax to start fuzz is as follows:

fuzz [OPTION]... [PROGRAM] [FILE]

Enclose the commandline in quotes, replacing any reference to the input file with the string __FILE__. For example, to fuzz test the cat command, you would prepare a text file and then use:

$ fuzz "cat __FILE__" input.txt

fuzz will modify input.txt and run cat, replacing the string __FILE__ with input.txt.

You can use __FILE__ anywhere on the commandline. If you wanted to fuzz test the convert utility from the ImageMagick suite, you might prepare a small image file and construct a commandline like this:

$ fuzz "convert png:__FILE__ png:/dev/null" input.png

By default, fuzz attaches the input file to your program's stdin. You can override this by specifying an alternate redirection in the program's commandline.
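
The placeholder substitution and the default stdin attachment can be sketched as follows. This is an illustration only, not fuzz's actual code: run_program is a hypothetical name, and the use of eval is an assumption based on the statement in section 2.1.1 that the program string is evaluated by the shell.

```bash
# Hypothetical sketch (not fuzz's own code): expand __FILE__ in a program
# string, then run the result with the input file attached to stdin.
run_program() {
    local template=$1 file=$2 cmdline
    # Replace every occurrence of the placeholder. The file name is pasted
    # in unquoted, so this sketch assumes it contains no spaces.
    cmdline=${template//__FILE__/"$file"}
    # The string is handed to the shell, so pipelines and redirections in
    # it work. A redirection inside the string overrides the default stdin.
    eval "$cmdline" < "$file"
}
```

With this sketch, run_program "cat __FILE__" input.txt and run_program "cat" input.txt both print the file: the first through the file name, the second through stdin.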

2.1. Configuration

2.1.1. Return codes

fuzz uses your program's return code to detect errors. By default, fuzz assumes the program follows the standard UNIX convention for return codes used by most applications, which is as follows:

Code	Status
0	Success.
1-127	Error.
128-255	Abnormal Termination.

There are three special cases that fuzz interprets differently:

Code	Status
142	Timeout.
127	Program Not Found.
128	Program Not Executed.
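
The two tables can be read together as a single classification, with the special cases checked first. The sketch below is not taken from fuzz; classify_status is a hypothetical name. It also shows where the number 142 plausibly comes from: shells report a process killed by a signal as 128 plus the signal number, and on Linux SIGALRM is signal 14.

```bash
# Hypothetical sketch (not fuzz's own code): classify a return code using
# the default convention, checking the three special cases first.
classify_status() {
    local code=$1 signal
    case $code in
        142) echo "Timeout" ;;               # 128 + 14 (SIGALRM on Linux)
        127) echo "Program Not Found" ;;
        128) echo "Program Not Executed" ;;
        0)   echo "Success" ;;
        *)
            if (( code >= 1 && code <= 127 )); then
                echo "Error"
            elif (( code >= 129 && code <= 255 )); then
                # bash's kill -l maps such an exit status to a signal name
                signal=$(kill -l "$code" 2>/dev/null) || signal="unknown"
                echo "Abnormal Termination (signal: $signal)"
            else
                echo "Out Of Range"
                return 1
            fi
            ;;
    esac
}

classify_status 1      # prints "Error"
classify_status 142    # prints "Timeout"
```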

However, these conventions are not rules and your program is free to intercept signals and return any code. If you need to change how fuzz interprets return codes, you should use the -a, -e, -f, -s and -t commandline options. For example, the bzip2 utility intercepts signals and returns 3, so you would have to prepare an input bzip2 file to test and use:

$ fuzz -a3 "bzip2 -dc < __FILE__" input.bz2

The return code 3 is changed to indicate an abnormal termination in the lookup table. You can use each option as many times as you require to remap any number of return codes. Please also note that you can use redirection and other shell constructs within the quoted program string, as this string is evaluated by the shell before being executed.

Option	Meaning	Description
-a	Aborted	Indicates program was terminated abnormally.
-e	Error	The program indicated it encountered a handled error.
-f	Failure	The program could not be executed for some reason.
-s	Success	The program indicates it succeeded processing the input file.
-t	Timeout	The program exceeded the timeout.
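
One way to picture the lookup table and the effect of these options is a 256-entry array that is filled with the defaults and then patched. The sketch below is an assumption about the general shape, not fuzz's implementation. The names status_table, init_status_table and remap_status are hypothetical, and treating codes 127 and 128 as "failure" is a guess based on the description of -f.

```bash
# Hypothetical sketch (not fuzz's own code): a lookup table of statuses,
# filled with the default convention and then patched by remap options.
declare -a status_table

init_status_table() {
    local code
    status_table[0]="success"
    for (( code = 1; code <= 127; code++ )); do status_table[code]="error"; done
    for (( code = 128; code <= 255; code++ )); do status_table[code]="aborted"; done
    status_table[127]="failure"    # program not found (assumed mapping)
    status_table[128]="failure"    # program not executed (assumed mapping)
    status_table[142]="timeout"
}

# remap_status OPTION CODE: apply one -a/-e/-f/-s/-t style override.
remap_status() {
    case $1 in
        -a) status_table[$2]="aborted" ;;
        -e) status_table[$2]="error" ;;
        -f) status_table[$2]="failure" ;;
        -s) status_table[$2]="success" ;;
        -t) status_table[$2]="timeout" ;;
        *)  return 1 ;;
    esac
}

init_status_table
remap_status -a 3            # same intent as "fuzz -a3 ..."
echo "${status_table[3]}"    # prints "aborted"
```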

2.1.2. Saving output

By default, fuzz discards the stdout and stderr of the program. If you would like fuzz to save any output your program sends to stderr, you can use the -l commandline option to save it to a logfile for later perusal.

To save the stderr from a fuzz testing session of the gzip utility, you might prepare a gzipped file and use:

$ fuzz -l gzip-output "gzip -dc < __FILE__" input.gz

You can use the standard tail utility to monitor this logfile as fuzz runs. For example, you can open another terminal and enter:

$ tail -f gzip-output

**Be aware that fuzz attempts to run the application as many times as possible as quickly as possible, potentially generating lots of output.**

2.1.3. Timeouts

Abnormal termination is not the only undesirable result of reading malformed input. Another common outcome is entering an infinite loop or consuming too much CPU time. For this reason, fuzz times how long the application takes to return and terminates it after a set period, assuming it will never return. By default, fuzz waits 60 seconds, but you can modify this with the -T (timeout) commandline option.

For example, if you are fuzz testing the grep utility and want to stop it executing after 10 seconds, you might use this:

$ fuzz -T 10 "grep -q foobar __FILE__" input.txt
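
To reproduce a time limit outside fuzz, the following sketch uses the timeout utility from GNU coreutils. This is an assumption made for illustration; this guide does not say how fuzz enforces its limit, and run_limited is a hypothetical name. Sending SIGALRM and preserving the status yields return code 142 (128 + 14) on Linux, which matches the Timeout code listed in section 2.1.1.

```bash
# Sketch only: assumes GNU coreutils `timeout` is installed. It is not
# necessarily how fuzz itself enforces the -T limit.
run_limited() {
    local seconds=$1
    shift
    # -s ALRM: send SIGALRM when the limit expires.
    # --preserve-status: report the command's own status (128 + signal
    # number when it is killed) instead of timeout's default of 124.
    timeout --preserve-status -s ALRM "$seconds" "$@"
}

# sleep would run for 5 seconds but is stopped after 1.
run_limited 1 sleep 5 || echo "return code: $?"
```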

Input files that cause timeouts are discarded by default, but they may be interesting to you. If so, you can tell fuzz to keep them using the -k (keep timeouts) commandline option.

For example, using the wc utility:

$ fuzz -T10 -k "cat __FILE__ | wc -c" input.txt

A useless use of cat, but this demonstrates that pipelines can be used inside the program string.

2.1.4. Aborts

If fuzz does its job correctly, it may find an input file that crashes your application. If this should happen, fuzz will copy the input file and save it with a unique number appended. fuzz also gzips the file to conserve space.

If you would like fuzz to store the files in a different directory, you can use the -d (directory) commandline option. For example, using the stat utility:

$ fuzz -k -T10 -d /tmp/statcrashes "stat __FILE__" input.dat
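
The copy, number and gzip steps described in this section could look something like the sketch below. It is not fuzz's own code: save_crash is a hypothetical name and the ".N.gz" naming scheme is an assumption, since this guide only says that a unique number is appended.

```bash
# Hypothetical sketch (not fuzz's own code): keep a gzipped copy of an
# input that caused an abort, with a unique number appended to its name.
save_crash() {
    local input=$1 directory=${2:-.}
    local base number=0 target
    base=$(basename "$input")
    mkdir -p "$directory" || return 1
    # Find the first unused number (the naming scheme is an assumption).
    while [[ -e $directory/$base.$number.gz ]]; do
        number=$(( number + 1 ))
    done
    target=$directory/$base.$number
    # gzip replaces the copy with a compressed file ending in .gz
    cp "$input" "$target" && gzip "$target" && printf '%s\n' "$target.gz"
}
```

Calling save_crash input.dat /tmp/statcrashes twice would leave input.dat.0.gz and input.dat.1.gz in that directory.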

2.1.5. Distributed fuzz testing

To split the fuzz testing workload over multiple machines, you can tell each instance of fuzz to restrict testing to a specific arena.

For example, to split the load between eight machines, each machine would be assigned one portion of an eight-part arena. The syntax is:

<part>:<arena>

The first step would be to copy the same input file to all eight machines, then start fuzz on the first machine, typing:

$ fuzz -m 1:8 -k -T10 "strings -a __FILE__" a.out

On the second machine:

$ fuzz -m 2:8 -k -T10 "strings -a __FILE__" a.out

On the third:

$ fuzz -m 3:8 -k -T10 "strings -a __FILE__" a.out

And so on. Each instance will restrict the fuzz testing to one portion of the arena, as specified on the commandline. Once fuzz has exhausted the predefined decay strategies, each instance will continue to search at random until interrupted.
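
When many machines are involved, the per-machine command lines can be generated rather than typed. The sketch below prints them using the <part>:<arena> syntax. It also shows one possible way an arena could map onto file offsets, which is purely an assumption, since this guide does not describe how fuzz divides the work. Both function names are hypothetical.

```bash
# Print the command line for each of PARTS machines, using the
# <part>:<arena> syntax of the -m option.
print_commands() {
    local parts=$1 part
    for (( part = 1; part <= parts; part++ )); do
        printf 'fuzz -m %d:%d -k -T10 "strings -a __FILE__" a.out\n' "$part" "$parts"
    done
}

# ASSUMPTION: one simple scheme gives each part a contiguous range of file
# offsets. fuzz may divide the arena differently.
# arena_range PART PARTS SIZE -> prints "FIRST LAST" (inclusive offsets)
arena_range() {
    local part=$1 parts=$2 size=$3
    local first=$(( (part - 1) * size / parts ))
    local last=$(( part * size / parts - 1 ))
    printf '%d %d\n' "$first" "$last"
}

print_commands 3
arena_range 2 8 1000    # prints "125 249"
```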

2.1.6. Process

fuzz knows various tricks to try to locate bugs quickly. These are called //decay strategies//. fuzz has several predefined strategies, or you can write your own.

These strategies change the input file in a predetermined way, known to confuse many applications.

However, it is possible that your application will handle all of these strategies easily. If this happens, fuzz resorts to a brute force stress test, where the file is corrupted randomly forever. Ideally, you should leave fuzz running for as long as possible, interrupting it using ^C when you are satisfied your application is resilient enough to malformed input.

2.1.7. Progress

As fuzz is running, it prints statistics to the screen to indicate its progress. The layout is as follows:

Mexican Wave, 65p/s, S:%002 E:%097 [######### ] C:4, T:0
[1]           [2]    [3]    [4]    [5]          [6]  [7]

[1] This is the name of the current decay strategy being attempted.
[2] This is an estimate of the average rate of execution, in executions per second.
[3] Percentage of executions returning Success.
[4] Percentage of executions returning Error.
[5] A progress meter, indicating how much of the current stage has been completed.
[6] The number of abnormal terminations encountered so far.
[7] The number of timeouts.

Note: bash only supports integer division, so rounding errors may cause minor anomalies in the display, such as displaying 102% or similar. This could be corrected, but as it is a minor aesthetic problem it has simply been ignored to reduce overhead.
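
The general effect of integer-only arithmetic on displayed percentages can be seen in a few lines. The exact formula fuzz uses is not shown in this guide, so the percent function below is a hypothetical stand-in that only demonstrates how discarding the fractional part makes the figures drift away from an exact total.

```bash
# Sketch (not fuzz's own code): a percentage computed with bash's integer
# division. The fractional part is discarded, so a set of percentages
# need not sum to exactly 100.
percent() {
    local count=$1 total=$2
    (( total > 0 )) || { echo 0; return; }
    echo $(( count * 100 / total ))
}

percent 1 3    # prints 33
percent 2 3    # prints 66 (33 + 66 = 99, not 100)
```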

2.1.8. Decay Strategies

2.1.8.1. Dword Wrap

Many file formats store sizes, counts, dimensions, etc. in a dword (4 consecutive bytes). Some programs have been known to read this value unchecked, then attempt to allocate num*sizeof (struct foo) bytes to write into. If we can set num to (0xffffffff+1)/sizeof (struct foo), an arithmetic overflow may result in allocating 0 bytes, potentially resulting in a heap overflow.

Dword Wrap sets each consecutive 4 bytes to a value that, when multiplied by any number between 2 and 2048 (allowing for lots of different sizeof (struct foo) values), will result in a dword arithmetic overflow. See also Word Wrap.
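
The overflow can be checked with a short worked example. bash arithmetic is wider than 32 bits (typically 64), so the sketch masks the product to 32 bits to mimic a C unsigned 32-bit multiplication. The function names are hypothetical, and the structure sizes 16 and 12 are arbitrary values chosen for illustration.

```bash
# Worked example of the 32-bit wrap. The mask imitates what a C program
# would see when the multiplication is done in 32-bit unsigned arithmetic.
alloc_size() {
    local num=$1 size=$2
    echo $(( (num * size) & 0xffffffff ))
}

# Smallest count whose product with SIZE no longer fits in 32 bits.
wrap_count() {
    local size=$1
    echo $(( (0xffffffff / size) + 1 ))
}

n=$(wrap_count 16)
printf 'num=0x%x -> allocation of %d bytes\n' "$n" "$(alloc_size "$n" 16)"
# prints: num=0x10000000 -> allocation of 0 bytes

n=$(wrap_count 12)
printf 'num=%d -> allocation of %d bytes\n' "$n" "$(alloc_size "$n" 12)"
# prints: num=357913942 -> allocation of 8 bytes
```

When the structure size is a power of two the product wraps to exactly 0. For other sizes it wraps to a small non-zero value, which is still far smaller than the amount of data the program goes on to write.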

2.1.8.2. Increment

Every byte is incremented by n, making offsets incorrect. See also Rotate.

2.1.8.3. Insert

The file is elongated, and a new byte is inserted at every position. See also Shrink.

2.1.8.4. Populate

The file is overwritten with a byte, then restored from offset 0x00 towards the end.

The restoration starts from 0x00 to ensure any magic bytes or header data is restored first, as this data is usually stored towards the start of the file.

2.1.8.5. Random

Each byte is replaced with a random byte at a random offset, until completely corrupted. The file is then restored from offset 0x00 towards the end; see Populate for the rationale.

This strategy is a last resort, designed to run continuously for long periods of time to indicate how resilient the application is to bad input.

2.1.8.6. Rotate

The file is rotated by n positions.

2.1.8.7. Shrink

The byte at each offset is removed, then restored. See also Insert.

2.1.8.8. Swap

Each pair of bytes is swapped.

Experience suggests this strategy can be very effective.

2.1.8.9. Mexican Wave

Each byte is overwritten with every possible value in turn.

Experience suggests this simple strategy is highly effective.
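
A minimal sketch of this strategy is shown below. It is not fuzz's own implementation: the function name, the callback interface, and the choice to restore each byte before moving to the next offset are all assumptions. The file is held as an array of decimal byte values, and a callback receives every variant in turn.

```bash
# Hypothetical sketch (not fuzz's own code). CALLBACK is called once per
# variant with the offset, the new value, and then every byte of the
# mutated file as decimal numbers.
mexican_wave() {
    local file=$1 callback=$2
    local -a original variant
    local offset value
    # od -v prints every byte, -An drops addresses, -tu1 prints decimals.
    original=( $(od -An -v -tu1 "$file") )
    for (( offset = 0; offset < ${#original[@]}; offset++ )); do
        variant=( "${original[@]}" )    # ASSUMPTION: earlier bytes are restored
        for (( value = 0; value < 256; value++ )); do
            variant[offset]=$value
            "$callback" "$offset" "$value" "${variant[@]}"
        done
    done
}

# Example callback that only counts variants. A real harness would write
# the bytes to a file and run the program under test.
variants=0
count_variant() { variants=$(( variants + 1 )); }
```

For a file of n bytes the callback is invoked 256 * n times, which is one reason a small input file matters.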

2.1.8.10. Word Wrap

See Dword Wrap.

2.2. Choosing an input file

In order to begin fuzz testing, fuzz needs some valid input to modify. Fuzz testing can be a lengthy, time-consuming process, so the smaller the file, the better.

If you can create an input file under 500 bytes, that would be perfect. fuzz imposes no limitations on file size, so nothing will stop you using an 8M file, but be aware that the larger the file, the lower the probability of finding any bugs within a short period of time.

3. Tips and tricks

- If the application wants to modify the input file, configure it to use stdin and use standard shell redirection in the program string.
- Keep the input file as small as possible.
- Use screen or splitvt to tail logs and run fuzz in the same terminal.
- If you have a multiprocessor system, you can start multiple instances of fuzz using the -m ARENA option to distribute the workload over multiple CPUs.

4. Appendix

Please report bugs to <taviso@gentoo.org>.