Skip to content

PA3: Delta Debugging

Goal

In this assignment, you will implement delta debugging (DD) to reduce to an input but still preserves its interestingness as specified by an oracle (e.g., it crashes, or does something interesting). For example, your DD program can take as input a C program (e.g., myfile.c) and an oracle in a form of a test script that defines interestingness (e.g., a test.sh that takes as input a C file, compiles and runs it, and returns 0 if it is interesting and 1 if it is not). Your program will apply DD to obtain a minimal version of myfile.c that is still interesting.

As with the previous assignment, you can use any language for this assignment, but your program must run on the CSE machine (where I will evaluate your submission). So do not use external libraries or any extra tools etc. You are also strongly encouraged to use Python as it makes various string manipulation tasks easy.

For this assignment, you must work alone. However, you are allowed to talk to others (e.g., discussing problems, work on the ideas together), but the implementation and submission must be done individually. If you are stuck, you can post your question as a discussion on Canvas or approach me.

Specification

High level algorithm: You will implement a DD algorithm that we discussed in class. Recall that a DD works something like below

  1. Set the original input program as the current input
  2. check that the current input is interesting (using the testscript)
  3. set n = 2
  4. split the current input into n parts
  5. check each part \(p_i\)
  6. if any \(p_i\) is interesting (i.e., it is a reduced input and is still interesting)
    1. set \(p_i\) as the current input.
    2. Repeat step 3
  7. if none of these parts are interesting
    1. set n = n*2
    2. repeat step 4

You can be as creative as you want, but your DD must not run for too long and, most importantly, must show progress (e.g., after each splitting, either you decide that an individual part does help (and further split that part) or no part helps and further split) and must not take up too much space (e.g., do not generate over 50MB of files).

Example 1

  • The below C program hello.c modifies the integer a and we consider it interesting if at the end the value of a is >= 3. Clearly, this program contains a lot of unnecessary statements and we want to use DD to safely remove them so that we have a new program that is smaller/easier to understand but still preserves the condition a >= 3.
// hello1.c: program to be reduced
#include <stdio.h>
int main(int argc, char **argv) {
    int a = 0;
    printf("Hello, World!\n");
    a++;
    a--;
    a+=2;
    a++;
    a--;
    a++;
    if (a >= 3){
      printf("correct\n");
    }
    return 0;

}
  • This is a simple oracle or test scripts hello1.sh that takes as input a C program, compiles, and runs it. Then the test script checks if the result contains the word "correct" and if so returns 0 to indicate interesting (and returns 1 otherwise).
# hello1.test
# oracle/testscript defining interestingness

#!/bin/bash
# -*-sh-*-
if gcc -o a.out $1 &> cmp_out; then
    if ./a.out &> run_out; then
        if grep "correct" run_out; then
            echo "0"                 # interesting
            exit 0;
        fi
    fi
fi
echo "1"                         # not interesting
exit 1;
# running DD
$ > python vdd.py  hello1.c -test hello1.test
preprocessing .. rm 1 idxs: [16]
can delete 2 chunks containing 2 idxs: [0, 1]
can delete 1 chunks containing 1 idxs: [2]
can delete 1 chunks containing 1 idxs: [5]
can delete 2 chunks containing 2 idxs: [6, 7]
can delete 2 chunks containing 2 idxs: [10, 11]
can delete 1 chunks containing 1 idxs: [15]
can delete 6 chunks containing 9 idxs: [0, 1, 2, 5, 6, 7, 10, 11, 15]
can delete 1 chunks containing 9 idxs: [0, 1, 2, 5, 6, 7, 10, 11, 15]
done: '/Users/tnguyen/git/projects/vdd/src/min.c' (lines: total 8 rem 9, 29 created variants)
// min.c: reduced program
int main(int argc, char **argv) {
    int a = 0;
    a+=2;
    a++;
    if (a >= 3){
      printf("correct\n");
    }
}

Example 2

  • DD is useful for various tasks. We just need to provide an oracle to decide if the input is interesting or not. Below is another example where interestingness is defined if the input text file contains a line that has the word redo

  • Input: file.txt

  • Oracle:

#!/bin/bash
# -*-sh-*-
if grep "redo" $1; then
    echo "0"                 # Success.
    exit 0;
fi
echo "1"                      # Failure.
exit 1;
> python vdd.py  file.txt -test test.sh
preprocessing .. rm 21 idxs: [6 18 23 25 26 28 59 61 65 67 71 73 88 90 104 106 108 112 113 114 118]
can delete 25 chunks containing 25 idxs: [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 29, 30]
can delete 12 chunks containing 12 idxs: [31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42]
can delete 6 chunks containing 6 idxs: [43, 44, 45, 46, 47, 48]
can delete 3 chunks containing 3 idxs: [49, 50, 51]
can delete 2 chunks containing 2 idxs: [52, 53]
can delete 1 chunks containing 1 idxs: [54]
can delete 50 chunks containing 50 idxs: [56, 57, 58, 60, 62, 63, 64, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 105, 107, 109, 110, 111, 115, 116, 117, 119, 120]
can delete 7 chunks containing 99 idxs: [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 60, 62, 63, 64, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 105, 107, 109, 110, 111, 115, 116, 117, 119, 120]
can delete 1 chunks containing 99 idxs: [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 56, 57, 58, 60, 62, 63, 64, 66, 68, 69, 70, 72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 89, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 105, 107, 109, 110, 111, 115, 116, 117, 119, 120]
done: '/Users/tnguyen/git/projects/vdd/src/min.txt' (lines: total 1 rem 99, 17 created variants)

Output

As shown in the examples, at the end, the program will output

  1. done:, follows by
  2. the location of the resulting file rfile
  3. total number of lines of rfile (note, it doesn't have to be exact)
  4. the lines removed (i.e., how many lines were you able to remove from the original file)
  5. any additional information you find interesting (but don't output too much)
  6. the total runtime in seconds

Also, the program will output some information whenever it makes progress (i.e., able to reduce/delete parts of code, how much was deleted, etc). Examples are given above, but you do not need to print the exact information shown in the examples (e.g., chunks, idxs etc)

As usual, use the README to tell me exactly how to compile/run your program, e.g., python3 dd.exe file.C test.sh or gcc dd.c -o dd.exe and ./dd.exe file.c test.sh. Again, make sure that your code works on the CSE machine and do not hard code anything specific to your account.

Hints

  1. try the test.sh by hand so that you're familiar with it
  2. e.g., run test.sh file.C to see how it works, ie.., test.sh should return 1, if file.c is interesting and 0 otherwise.
  3. run test.sh on various files, even those that do not compile (test.sh should return 0 as these are not interesting)
  4. do DD by hand on several examples (e.g., the given C program) to make sure you are familiar with DD
  5. start the implementation
  6. test it on the given examples as well as other examples by creating the appropriate oracles. Be as creative as you want. (if you did something interest, you can use them for extra credits)

Extra Credits

  • create 2 additional examples showing other kind of oracles you came up with (something interesting and different than the given oracles).

What to Turn In

You must turn in a zip file containing the following files. Use the following names:

  1. the zip file must be named pa3-yourlast-firstname.zip
  2. One single main source file dd.ext, where ext is the extension of the source file of the language of your choice (e.g., .py, .c). As usual, indicate in the README file on how I can run it.
  3. a readme.txt file
  4. 2 screen shots showing how (1) you start your program (e.g., the commands used to compile and run your program) and (2) the end of the run (i.e., the output described above)
  5. Extra credits: ec1.ext file and ec1.sh test and ec2.ext file and ec2.sh test
  6. Nothing else, do not include exe, binary file, or anything else.

  7. Be sure to test your code extensively the CSE machine. If your code does not build, you will simply get a 0 for the assignment.

The readme.txt file should be a plain ASCII text file (not a Word file, not an RTF file, not an HTML file) consisting of the following:

  1. brief description how to run your program, e.g.,
$ gcc dd.c -o dd.exe
$ ./dd.exe myfile.c mytest.sh
....
  1. Answer the following questions about your DD

  2. Briefly describe your DD algorithm (if you use the one from the book or anywhere, briefly summarize)

  3. What modifications (if any) did you made to make your DD work?
  4. Construct a C program (you can use the given one as example) to show the WORST CASE scenario for DD (i.e., it takes \(O(n^2)\)).
    1. include the program and test script in the submission (name it worstcase.c and worstcase.sh)
    2. Indicate how to run this using your DD (./dd.exe worstcase.c worstcase.sh ?)
    3. show the output when running DD on it
    4. Explain this program and give reason why it is the worst case
  5. Like 3, but the BEST CASE scenario.
  6. Additional observations/thoughts

Grading Rubric

Grading (out of 20 points):

  • 12 points — for a complete DD as described (your points will be deducted if your DD does not obtain the minimal input)
  • 10 points if your DD at least got the minimal result for the 2 given examples above
  • 2 points if you DD works on my own example (not provided). This means you should try to generalize your DD and test it on various scenarios.
  • 8 points — for the answers in the README
  • 8 — clear explanations on running the program and thorough answers for the given questions
    • If you did the extra credits, describe them and say why they are interesting (e.g., different than the given examples)
  • 2 — vague or hard to understand; omits important details
  • 0 — little to no effort, or submitted an RTF/DOC/PDF file instead of plain TXT
  • 3 pts extra credit