PA1: Fuzz Testing

Goal

Fuzz testing generates random inputs to test a program. In this assignment, you will implement a fuzzer that generates inputs for a given program with the goal of exploring as many statements in that program as possible.

You can use any language for this assignment, but your program must run on the CSE machine (where I will evaluate your submission), so do not use external libraries or any extra tools. You are strongly encouraged to use Python, as it makes parsing and various string manipulation tasks easy. Knowing Python is a requirement for this class, but I am being a bit flexible in this assignment. Note, however, that if you use another language, it is unlikely that I can help if you face technical difficulties.

For this assignment, you must work alone. You are allowed to talk to others (e.g., discussing problems, working on ideas together), but the implementation and submission must be done individually. If you are stuck, you can post your question as a discussion on Canvas or approach me.

Specification

High level algorithm: Your fuzzer will implement a simple algorithm based on AFL that we discussed in class. The algorithm works roughly as follows:

  1. Use one or multiple valid URL strings as seed inputs
  2. Randomly mutate the seed inputs to obtain various mutated input strings
  3. Evaluate those input strings and obtain their line coverage
  4. Select input strings that cover many new, previously uncovered lines as the next seed inputs.
  5. Repeat until no new coverage is obtained for some number of consecutive iterations.

You can be as creative as you want with the seeds (e.g., you can start with multiple valid URLs as well as "weird" ones or whatever you want, e.g., the empty string), but your fuzzer must not run for too long (e.g., at most 5 minutes) and must not take up too much space (e.g., do not generate over 100MB of files). A minimal sketch of the overall loop appears below.
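For illustration, the loop might look like the following sketch in Python. This is not a definitive implementation: mutate and get_coverage are placeholders for functions you will write yourself (both are sketched later in this page), and max_stale is an arbitrary stopping threshold.

import random

def fuzz(test_exe, seeds, max_stale=50):
    # AFL-style loop: mutate seeds, keep inputs that reach new lines
    covered = set()             # all statement IDs seen so far
    population = list(seeds)    # current seed pool
    stale = 0                   # consecutive iterations with no new coverage
    while stale < max_stale:
        seed = random.choice(population)
        candidate = mutate(seed)                        # step 2
        new_lines = get_coverage(test_exe, candidate)   # step 3: run + gcov
        if new_lines - covered:                         # step 4: new coverage?
            covered |= new_lines
            population.append(candidate)
            stale = 0
        else:
            stale += 1
    return covered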

The Fuzzer: Your fuzzer will take as input an executable program (given as a string indicating the full path of the file; see the example below) and output 2 lines: the first is a nonnegative number indicating the number of explored statements, and the second is a sorted, comma-separated list of the unique IDs of the statements that were covered.

$ ./fuzzer.exe /path/to/test.exe
1000
test.c:1, test.c:2,  ...  test.c:1000

Your fuzzer also takes as input an optional argument -show, which makes it output whatever you think the program is doing that is interesting (e.g., at iteration X, the fuzzer generated Y inputs and covered Z unique lines).

$ ./fuzzer.exe /path/to/test.exe -show
....  # some progress
1000
test.c:1, test.c:2,  ...  test.c:1000
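Producing the required two-line output (and checking for -show) might look like the following sketch in Python; report and covered are illustrative names, and the sort assumes all IDs come from the single file test.c.

import sys

def report(covered):
    # covered is a set of IDs like "test.c:12"; sort by line number
    ids = sorted(covered, key=lambda s: int(s.split(":")[1]))
    print(len(ids))
    print(", ".join(ids))

show = "-show" in sys.argv   # optional progress flag
if show:
    print("iteration 10: generated 200 inputs, covered 57 unique lines")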

Extra credit:

  1. Compare with a different metric: e.g., using "unusual" path coverage (see the Fuzzingbook GreyboxFuzzer chapter for an example formula)
  2. At the end, your fuzzer will produce a graph showing the progress of your fuzzer on all the metrics used (so that we can compare them). The y-axis shows the number of evaluated inputs and the coverage, and the x-axis shows the number of iterations. The fuzzer will output a line at the end simply saying graph.ext (where ext is jpg or png).
$ ./fuzzer.exe /path/to/test.exe
....  # some progress
1000
test.c:1, test.c:2,  ...  test.c:1000
graph.png
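For the graph, one possibility, assuming matplotlib is available on the CSE machine (check this first, since the assignment otherwise asks you to avoid external libraries), is something like the following sketch; plot_progress, inputs_per_iter, and coverage_per_iter are illustrative names.

import matplotlib
matplotlib.use("Agg")            # render without a display
import matplotlib.pyplot as plt

def plot_progress(inputs_per_iter, coverage_per_iter):
    # one data point per iteration for each metric
    plt.plot(inputs_per_iter, label="evaluated inputs")
    plt.plot(coverage_per_iter, label="covered lines")
    plt.xlabel("iteration")
    plt.legend()
    plt.savefig("graph.png")
    print("graph.png")           # last line of the fuzzer's output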

The test program: The input executable test program (e.g., test.exe) is a CGI decoder (discussed in class), which takes as input a CGI-encoded string (e.g., a URL or web address) and converts it back to the original string from before the encoding. The C source code for this CGI program is here, and you can find more information and examples about this program in the class notes.

$ ./test.exe "Hello+World"
Hello World

Coverage Information: You will use gcov to obtain coverage information when running test.exe on an input string. Essentially, this involves compiling test.c with coverage support, running test.exe on an input string, then running gcov and analyzing the resulting file. Thus, your fuzzer will repeatedly invoke test.exe and gcov to obtain coverage information.

$ gcc --coverage -o test.exe test.c
$ ./test.exe 'Send+mail+to+me%40unl.edu'
$ gcov test.exe.c   # generate the .gcov file
$ less test.exe.c.gcov    # your fuzzer will parse this file and analyze its contents
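A minimal sketch of this step in Python, based on the commands above (get_coverage is an illustrative name; double-check the exact .gcov and .gcda file names your compiler produces):

import glob, os, subprocess

def get_coverage(test_exe, input_string):
    # Run test_exe on one input and return the set of covered statement IDs.
    for f in glob.glob("*.gcda"):    # reset counts so coverage is per-input
        os.remove(f)
    subprocess.run([test_exe, input_string], stdout=subprocess.DEVNULL)
    subprocess.run(["gcov", "test.exe.c"], stdout=subprocess.DEVNULL)
    covered = set()
    with open("test.exe.c.gcov") as f:
        for line in f:
            # gcov lines look like "    5:   12:  code"; the first field is
            # the execution count ("-" = not executable, "#####" = never run)
            count, lineno = line.split(":", 2)[:2]
            count, lineno = count.strip(), lineno.strip()
            if count not in ("-", "#####") and lineno.isdigit() and lineno != "0":
                covered.add("test.c:" + lineno)
    return covered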

Mutating Input Strings: Your fuzzer will have a mutation function that takes as input a string, randomly mutates it, and returns the mutants. You can use the techniques discussed in class, which include the following (a minimal sketch appears after the list):

  1. Delete: delete a random char in the string
  2. Insert: insert a random char into the string
  3. Swap: swap two random chars in the string
  4. Replace: replace a random char in a string with another random char (equiv to delete + insert)
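One way to sketch such a function in Python (how many mutations to apply, and which to favor, is up to you):

import random, string

def mutate(s):
    # Apply one random mutation to s and return the result.
    op = random.choice(["delete", "insert", "swap", "replace"])
    pos = random.randrange(len(s)) if s else 0
    c = random.choice(string.printable)
    if op == "delete" and s:
        return s[:pos] + s[pos + 1:]
    if op == "insert":
        return s[:pos] + c + s[pos:]
    if op == "swap" and len(s) > 1:
        i, j = random.sample(range(len(s)), 2)
        lst = list(s)
        lst[i], lst[j] = lst[j], lst[i]
        return "".join(lst)
    if op == "replace" and s:
        return s[:pos] + c + s[pos + 1:]
    return s + c    # fall back to insert for very short strings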

What to Turn In

You must turn in a zip file containing the following files. Use the following names:

  1. The main source file fuzzer.ext, where ext is the extension of the source file of the language of your choice (e.g., .py, .c)
  2. The test source code (i.e., the provided CGI decoder) test.c
  3. a buildscript file buildscript.sh
  4. a readme.txt file
  5. Nothing else; do not include .exe files, binaries, or anything else.

The buildscript.sh script is used to build your program. It takes no input and must produce an executable fuzzer.exe file and a test.exe file. The grader will simply run your buildscript as ./buildscript.sh and test fuzzer.exe on test.exe.

  • For example, if you use C/C++, your buildscript can just call gcc on your fuzzer.c to produce a fuzzer.exe file. If you use Python, then your fuzzer.py can simply be renamed to fuzzer.exe. The buildscript also calls chmod +x fuzzer.exe and chmod +x test.exe to make these files executable.

  • Be sure to test your buildscript.sh extensively on the CSE machine. If your code does not build, you will simply get a 0 for the assignment.

The readme.txt file should be a plain ASCII text file (not a Word file, not an RTF file, not an HTML file) describing how to run your program and your design decisions. One or two English paragraphs should suffice. Spelling, grammar, capitalization, and punctuation all count. Note: if you did the extra credit, also explicitly state that here, and give some details on how it was done (e.g., what did you use to create the graph?).

Grading Rubric

PA1 Grading (out of 20 points):

  • 15 points — for a complete fuzzer as described (points will be deducted if you achieve low coverage, i.e., your fuzzer is not effective)
  • 5 points — for a clear description in your README:
      • 5 — thorough discussion of your implementation; a few paragraphs of coherent English sentences should be fine
      • 2 — vague or hard to understand; omits important details
      • 0 — little to no effort, or submitted an RTF/DOC/PDF file instead of plain TXT
  • Extra credit: 5 points
      • Additional metrics: 3 points
      • Graph: 2 points