Computer Science 60
Principles of Computer Science
Fall 2007


Assignment 13: Language reflection and algorithm efficiency...

This assignment is due by Wednesday, December 12 at 11:59 pm.

IMPORTANT: Revised intructions for HW13 submission:

Do not submit HW13 through your dropbox on Sakai.  Instead, go to the Assignments section of the Sakai site.  There you will find one assignmnet: Homwork 13.  Click on that assignmnet and follow the instructions for submission.  You must attach a single file, which is a zip file called hw13.zip containing all of the files listed below.

In the first question of this assignment there is no programming - for those you should write up your answers (proofs or analyses) in a plain-text, pdf or Microsoft Word file and submit it as part1.txt, part1.pdf or part1.doc. For the second problem, you will submit 4 files: 3 code files (align.scm, Align.java and align.pl) and 1 text file (part2.txt).  For the third problem, you should submit Wally.java.  Again, zip all of these files into a single file called hw13.zip.

Watch out because Sakai only allows you to submit your assignment ONCE.  So, IF YOU ARE TAKING A LATE DAY, you must let me know so that I can enable resubmission after you submit the Wally problem on 12/12.

Late days for this assignment...

You may use your remaining late days on this assignment, except for problem #3, the Wally problem. Since we will talk about that program on Thursday, December 13 in class, there are no late days available for that one.

  1. Fast Multiplication! [40 Points]

    In this problem we consider the problem of multiplying two n-bit numbers, x and y. For simplicity, we will assume that n is a power of 2. This is generally the case in any real-world computer architecture, but it's also true that relaxing this assumption does not change the results of the analysis.

  2. Language Comparison [35 points]
  3. In this problem, you will implement a program that does the same thing in three different languages: Scheme, Prolog and Java.  The problem itself is fairly simple (it's part of a larger, more complex problem that you will not have to implement), but it will allow you to directly compare different languages for solving the same problem.  

     The problem you will address is the problem we looked at on the first day of class: the problem of DNA sequence alignment (taken from a homework assignment by Sedgewick and Wayne: http://www.cs.princeton.edu/courses/archive/fall07/cos126/assignments/sequence.html).  Follow the link and read the assignment up through the part where it talks about how to measure the distance between to sequences.  Don't worry about the dynamic programming part (although you might want to read it as you might find it helpful in general... with other computational problems).  We will not implement an algorithm for finding the minimum edit-distance alignment here.  We will only implement the edit-distance function for two given lists of DNA sequences.  

    Part 2.1: Implementation (15 points)

    Implement the following:

    1. In Scheme, in a file called align.scm, implement:
      (edit-dist L1 L2) 
       which is a function that returns the edit distance between the sequences defined by L1 and L2, two lists of genetic information.  Blanks will be represented with the symbol 'blank, and genes will be represented by alphabetic symbols.  For example:
      >>(edit-dist '(A A C A G T T  A C C) '(T A blank A G G T blank C A)) 
      7
    2. In Java implement a class called Align (in a file with the same name .java) with one method:
      public int editDist(char[] seq1, char[] seq2);
      This function should again return the edit distance between seq1 and seq2.  Blanks will be represented with a #.  E.g., 
      editDist( {'A', 'A', 'C', 'A', 'G', 'T', 'T', 'A', 'C', 'C'} 
                {'T', 'A', '#', 'A', 'G', 'G', 'T', '#', 'C', 'A'} );
      Should return 7.
    3. In Prolog, in a file called align.pl, write a predicate editDist(L1, L2, X) that is true when X represents the edit distance between L1 and L2.  X should be able to be a variable, but L1 and L2 will always be bound.  Blanks will again be represented with the literal symbol blank. E.g., 
      editDist([a, a, c, a, g, t, t, a, c, c], [t, a, blank, a, g, g, t, blank, c, a], X).
      X = 7 ;
      No.
    In writing your code, using the following guidelines/assumptions for all parts:

    Part 2.2: Relection (20 points)

    After you have written your code, imagine that you wanted to implement a complete solution to the sequence alignment problem (i.e., find the minimum edit-distance alignment).  You don't actually have to implement anything, just think about how you would implement it.  Also, you don't have to implement a fast algorithm--brute force is fine.  

    Now, choose the language that you feel would be best suited for this implementation, and write a short essay (2-3 paragraphs) explaining and defending your choice of language.  You should address your argument to your classmates--that is, students currently in CS60.  Try to convince them that your choice of language is the best choice based on your experience implementing the functions above and your thoughts about how you will implement the rest of the program.  Be clear about your evidence.  We will grade you on how convincing your argument is, taking into account how clear your examples are, and how clean your writing is.  

    In your argument, you may wish to consider any or all of the following:

    You specifically should not worry about running time in this problem.  Turn in this writeup in a file called part2.txt.

    Extra Credit on this part: If you actually do implement the full optimal alignment program you will receive 5 points extra credit.  Place the extra credit code in the same file as the code for the required section, and include a comment at the top of the file and in your writeup file.

  4. Wally Wends Westward... [50 Points]

    This problem provides a chance to exercise your own algorithm-design skills, to estimate the big-O complexity of your algorithm, and then to try out your code to see how quickly it runs in practice on problems of increasing size.

    The problem

    Wally is unicycling back to Mudd (@ the southwest corner of the map) after spending the summer in New England (@ the northeast corner). A square grid has been overlaid on the map, with each cell containing a positive number -- namely, the number of strawberry-filled DonutMan donuts that Wall will need to consume in order to have enough energy to pass through that cell.

    Your task is to maximize donut conservation. And for this problem specifically, you should write

    The rest is up to you... the algorithm, the software structure (other classes, other meethods), and the testing you do.

    Example

    For example, here is an image of a 5x5 donut-consumption grid and the correct minimum answer. Note that the start cell will be in the final spot in the first row, or map[0][N-1], and that the destination cell is in the final spot of the first column, or map[N-1][0], where N is the length of a side of the square map.

    Below it is the path that produced that answer:


    The write-up

    You should comment your code in order to explain how your algorithm works, as usual. However, you should also include a comment at the top of your file that includes


    You may cite any results from class to justify your claim to a particular big-O running time, but you may also need to supplement those results with additional analysis of your own.

    For uniformity, we ask that everyone define the problem size, N, to be the length of ONE SIDE of the square donut-consumption grid.

    Submission

    Submit Wally.java as instructed above, under assignment #13.

    Scoring

    Your solution will receive full correctness credit if it runs in "reasonable" time (under ten seconds on a 6x6 grid) and it works correctly, i.e., it computes the true minimum donut cost. Slow correct solutions will receive almost all of the points (unless they are bogosort-slow). Incorrect solutions, fast or slow, will receive almost no points.

    The other half of the credit will be for your algorithm deisgn (not speed, just thoughtfulness), explanation, program design, and commenting.

    Extra Credit Options



ANOTHER OPTIONAL BONUS PROBLEM (20 Points)

Consider two regular expressions. Each of these RE's defines a language of strings. Alternatively, one might say that the REs "generate" or "accept" their respective languages. In every case, such a language must be regular.

For this problem, show how a "Regular-expression autograder" CAN be built. This regular-expression autograder should be a program that takes as input two RE's and determines whether or not the two RE's define the same language of strings.

We showed that such an autograder is not possible for Turing Machines, but amazingly, it is possible for RE's! You may cite any results from class to help with your construction.