Skip to content

Midterm 1

Instructions

  • This is a take-home, open book exam.
  • You must do this exam by yourself (i.e., no collaboration with anyone).
  • You must type all solutions.
  • If you used any resources (e.g., online, books, lectures), you need to explicitly say so.
  • Solutions will be graded on correctness and clarity. Each problem has a relatively simple and straightforward solution. We might deduct points if your solution is far more complicated than necessary. Partial solutions will be graded for partial credit.

Short questions (12 pts)

Several main phases of a modern compilers are lexing, parsing, semantic analysis (which includes tasks such as type checking and evaluation), optimization. For each phase:

  1. explain its purpose
  2. explain its input and output (be careful here, e.g., the inputs to Lexer are not just the program src)
  3. (lexing and parsing only) explain how it is achieved, e.g., underlying algorithms, and how we implement that phase, e.g., tool support
  4. explain what would happen if we did not have that phase e.g., anything catastrophic?

Comparing RE, DFA, NFA, and CFG (4 pts)

For the following, answer true or false and give concrete example when asked

  1. There are some languages that cannot be expressed using an RE, but can be expressed using an NFA (if answer T, give an example)
  2. There are some languages that cannot be expressed using a DFA, but can be expressed using an NFA (if answer T, give an example)
  3. There are some languages that cannot be expressed using a NFA, but can be expressed using a CFG (if answer T, give an example)
  4. For every language that RE, DFA, and NFA recognizes, there exists a CFG that recognize that exact language (if answer N, give an example)

DFA and CFG (15 pts)

For each of the following languages over the alphabet \({a,b}\), determine if it can be recognized by a DFA. If yes, draw a DFA for it. If not, determine if it can be recognized by a CFG. If yes, write the CFG for it. If not, say No.

  1. (3 pts) languages consisting of strings having exactly two a's

  2. (3 pts) language consisting of strings having n a's followed by n b's, e.g., e, aabb, aaaabbbb.

  3. (3 pts) language over the alpha \(\{a,b,c\}\) consisting of strings having \(n\) a's followed by \(n\) b's followed by \(n\) c's, e.g., e, aabbcc, aaaabbbbcccc.

  4. (3 pts) language over the alphabet \({(, )}\) consists of all strings of properly nested parentheses, e.g., e, ()(), ((())), (()()).

  5. (3 pts) language consisting of strings that are palindrome

Parsing (15 pts)

  • (5 pts) Consider the grammar G

    S-> E
    E -> int
    E -> E + E | E - E
    E -> (E)
    

    1. (1 pts) Show that this grammar is ambiguous using the string int-int-int

    2. (4 pts) This grammar is left recursive. Rewrite it to eliminate left recursion (but is still equivalent to the original grammar). That is, provide a grammar G' such that L(G) = L(G') but G' admits no derivation X --->* Xa

  • (6 pts) The following grammar is left recursive.

    E -> E + T | T
    T -> id | (E)
    
    For each of the following, explain whether it eliminates left recursions and is equivalent to the original grammar. If it is NOT equivalent to the original grammar, give an example showing the difference.

    1. G1: E -> E + id | E + (E) | id | (E)
    2. G2
      E -> TE’
      E’ -> +TE' | e
      T  -> id | (E)
      
    3. G3
      E -> E’ + T | T 
      E’-> id | (E) 
      T -> id | (E)
      
    4. G4
      E -> id + E | E + T | T 
      T -> id | (E)
      
  • (4 pts) Consider the string s = id + id * (id) and the grammar:
    E -> T  | T + E
    T -> id | id * T | ( E )
    
    1. (2 pts) Draw the parse tree for s
    2. (2 pts) Draw the AST tree using any of the drawn parse tree

LL(1) Grammar (4 pts)

Consider the grammar

E -> T + E | T 
T -> int | int * T | ( E ) 

  • (1) Explain why Recursive Descent algorithm would have backtracking problems with the above grammar.
  • (3) Rewrite this grammar to an LL(1) grammar to avoid the problem.