PA3: Delta Debugging
Goal
In this assignment, you will implement delta debugging (DD) to reduce to an input but still preserves its interestingness as specified by an oracle (e.g., it crashes, or does something interesting). For example, your DD program can take as input a C program (e.g., myfile.c
) and an oracle in a form of a test script that defines interestingness (e.g., a test.sh
that takes as input a C file, compiles and runs it, and returns 0
if it is interesting and 1
if it is not). Your program will apply DD to obtain a minimal version of myfile.c
that is still interesting.
As with the previous assignment, you can use any language for this assignment, but your program must run on the CSE machine (where I will evaluate your submission). So do not use external libraries or any extra tools etc. You are also strongly encouraged to use Python as it makes various string manipulation tasks easy.
For this assignment, you must work alone. However, you are allowed to talk to others (e.g., discussing problems, work on the ideas together), but the implementation and submission must be done individually. If you are stuck, you can post your question as a discussion on Canvas or approach me.
Specification
High level algorithm: You will implement a DD algorithm that we discussed in class. Recall that a DD works something like below
- Set the original input program as the current input
- check that the current input is interesting (using the testscript)
- set
n = 2
- split the current input into
n
parts - check each part \(p_i\)
- if any \(p_i\) is interesting (i.e., it is a reduced input and is still interesting)
- set \(p_i\) as the current input.
- Repeat step 3
- if none of these parts are interesting
- set
n = n*2
- repeat step 4
- set
You can be as creative as you want, but your DD must not run for too long and, most importantly, must show progress (e.g., after each splitting, either you decide that an individual part does help (and further split that part) or no part helps and further split) and must not take up too much space (e.g., do not generate over 50MB of files).
Example 1
- The below C program
hello.c
modifies the integera
and we consider itinteresting
if at the end the value ofa
is>= 3
. Clearly, this program contains a lot of unnecessary statements and we want to use DD to safely remove them so that we have a new program that is smaller/easier to understand but still preserves the conditiona >= 3
.
// hello1.c: program to be reduced
#include <stdio.h>
int main(int argc, char **argv) {
int a = 0;
printf("Hello, World!\n");
a++;
a--;
a+=2;
a++;
a--;
a++;
if (a >= 3){
printf("correct\n");
}
return 0;
}
- This is a simple oracle or test scripts
hello1.sh
that takes as input a C program, compiles, and runs it. Then the test script checks if the result contains the word"correct"
and if so returns0
to indicate interesting (and returns1
otherwise).
# hello1.test
# oracle/testscript defining interestingness
#!/bin/bash
# -*-sh-*-
if gcc -o a.out $1 &> cmp_out; then
if ./a.out &> run_out; then
if grep "correct" run_out; then
echo "0" # interesting
exit 0;
fi
fi
fi
echo "1" # not interesting
exit 1;
# running DD
$ > python vdd.py hello1.c -test hello1.test
preprocessing .. rm 1 idxs: [16]
can delete 2 chunks containing 2 idxs: [0, 1]
can delete 1 chunks containing 1 idxs: [2]
can delete 1 chunks containing 1 idxs: [5]
can delete 2 chunks containing 2 idxs: [6, 7]
can delete 2 chunks containing 2 idxs: [10, 11]
can delete 1 chunks containing 1 idxs: [15]
can delete 6 chunks containing 9 idxs: [0, 1, 2, 5, 6, 7, 10, 11, 15]
can delete 1 chunks containing 9 idxs: [0, 1, 2, 5, 6, 7, 10, 11, 15]
done: '/Users/tnguyen/git/projects/vdd/src/min.c' (lines: total 8 rem 9, 29 created variants)
// min.c: reduced program
int main(int argc, char **argv) {
int a = 0;
a+=2;
a++;
if (a >= 3){
printf("correct\n");
}
}
Example 2
-
DD is useful for various tasks. We just need to provide an oracle to decide if the input is interesting or not. Below is another example where interestingness is defined if the input text file contains a line that has the word
redo
-
Input: file.txt
-
Oracle:
#!/bin/bash
# -*-sh-*-
if grep "redo" $1; then
echo "0" # Success.
exit 0;
fi
echo "1" # Failure.
exit 1;
> python vdd.py file.txt -test test.sh
preprocessing .. rm 21 idxs: [6 18 23 25 26 28 59 61 65 67 71 73 88 90 104 106 108 112 113 114 118]
can delete 25 chunks containing 25 idxs: [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 29, 30]
can delete 12 chunks containing 12 idxs: [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42]
can delete 6 chunks containing 6 idxs: [43, 44, 45, 46, 47, 48]
can delete 3 chunks containing 3 idxs: [49, 50, 51]
can delete 2 chunks containing 2 idxs: [52, 53]
can delete 1 chunks containing 1 idxs: [54]
can delete 50 chunks containing 50 idxs: [56, 57, 58, 60, 62, 63, 64, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 105, 107, 109, 110, 111, 115, 116, 117, 119, 120]
can delete 7 chunks containing 99 idxs: [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 60, 62, 63, 64, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 105, 107, 109, 110, 111, 115, 116, 117, 119, 120]
can delete 1 chunks containing 99 idxs: [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 60, 62, 63, 64, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 105, 107, 109, 110, 111, 115, 116, 117, 119, 120]
done: '/Users/tnguyen/git/projects/vdd/src/min.txt' (lines: total 1 rem 99, 17 created variants)
Output
As shown in the examples, at the end, the program will output
done:
, follows by- the location of the resulting file
rfile
- total number of lines of
rfile
(note, it doesn't have to be exact) - the lines removed (i.e., how many lines were you able to remove from the original file)
- any additional information you find interesting (but don't output too much)
- the total runtime in seconds
Also, the program will output some information whenever it makes progress (i.e., able to reduce/delete parts of code, how much was deleted, etc). Examples are given above, but you do not need to print the exact information shown in the examples (e.g., chunks
, idxs
etc)
As usual, use the README
to tell me exactly how to compile/run your program, e.g., python3 dd.exe file.C test.sh
or gcc dd.c -o dd.exe
and ./dd.exe file.c test.sh
. Again, make sure that your code works on the CSE machine and do not hard code anything specific to your account.
Hints
- try the
test.sh
by hand so that you're familiar with it - e.g., run
test.sh file.C
to see how it works, ie..,test.sh
should return1
, iffile.c
is interesting and0
otherwise. - run
test.sh
on various files, even those that do not compile (test.sh
should return0
as these are not interesting) - do DD by hand on several examples (e.g., the given C program) to make sure you are familiar with DD
- start the implementation
- test it on the given examples as well as other examples by creating the appropriate oracles. Be as creative as you want. (if you did something interest, you can use them for extra credits)
Extra Credits
- create 2 additional examples showing other kind of oracles you came up with (something interesting and different than the given oracles).
What to Turn In
You must turn in a zip file containing the following files. Use the following names:
- the zip file must be named
pa3-yourlast-firstname.zip
- One single main source file
dd.ext
, whereext
is the extension of the source file of the language of your choice (e.g.,.py
,.c
). As usual, indicate in the README file on how I can run it. - a
readme.txt
file - 2 screen shots showing how (1) you start your program (e.g., the commands used to compile and run your program) and (2) the end of the run (i.e., the output described above)
- Extra credits:
ec1.ext
file andec1.sh
test andec2.ext
file andec2.sh
test -
Nothing else, do not include
exe
,binary file
, or anything else. -
Be sure to test your code extensively the CSE machine. If your code does not build, you will simply get a 0 for the assignment.
The readme.txt
file should be a plain ASCII text file (not a Word file, not an RTF file, not an HTML file) consisting of the following:
- brief description how to run your program, e.g.,
$ gcc dd.c -o dd.exe
$ ./dd.exe myfile.c mytest.sh
....
-
Answer the following questions about your DD
-
Briefly describe your DD algorithm (if you use the one from the book or anywhere, briefly summarize)
- What modifications (if any) did you made to make your DD work?
- Construct a C program (you can use the given one as example) to show the WORST CASE scenario for DD (i.e., it takes \(O(n^2)\)).
- include the program and test script in the submission (name it
worstcase.c
andworstcase.sh
) - Indicate how to run this using your DD (
./dd.exe worstcase.c worstcase.sh
?) - show the output when running DD on it
- Explain this program and give reason why it is the worst case
- include the program and test script in the submission (name it
- Like 3, but the BEST CASE scenario.
- Additional observations/thoughts
Grading Rubric
Grading (out of 20 points):
- 12 points — for a complete DD as described (your points will be deducted if your DD does not obtain the minimal input)
- 10 points if your DD at least got the minimal result for the 2 given examples above
- 2 points if you DD works on my own example (not provided). This means you should try to generalize your DD and test it on various scenarios.
- 8 points — for the answers in the README
- 8 — clear explanations on running the program and thorough answers for the given questions
- If you did the extra credits, describe them and say why they are interesting (e.g., different than the given examples)
- 2 — vague or hard to understand; omits important details
- 0 — little to no effort, or submitted an RTF/DOC/PDF file instead of plain TXT
- 3 pts extra credit