CS 105

Problem 1: Debugging Optimized Code

Let's first look at the code in problem1.c:

#include <stdio.h>
#include <stdlib.h>

int loop_while(int a, int b)
{
    int i = 0;
    int result = a;
    while (i < 256) {
        result += a;
        a -= b;
        i += b;
    }
    return result;
}

int main(int argc, char *argv[])
{
    printf("%d\n", loop_while(atoi(argv[1]), 16));
    return 0;
}

The program has a function, loop_while, with a small while loop, and a simple main function that calls loop_while. Look over the loop_while function to get an idea about how it works (you don't need to fully decode it; just get a clue about what's going on).

atoi is pronounced “ay to eye”, as in “ASCII to integer” rather than as “a toy”.

Notice the atoi function. You can run man atoi in a terminal window to find out how it works (i.e., what it does, what arguments it takes, what it returns, etc.).

Also look at the call to printf, which is the usual way of producing output in C (as opposed to the iostream approach in C++ that you learned about in CS 70).

You can read more about printf in Kernighan & Ritchie or online (the advantage of reading in K&R is that the description there is less complex; recent versions of printf have tons of extensions that aren't particularly useful in this course).

The printf function supports a lot of complicated functionality, but for now we'll just say that it prints answers, and "%d\n" means print the argument as a decimal with a newline character at the end.

Compile the Program

Compile the program with the -g switch and no optimization; that is,

gcc -g -o problem1 problem1.c

You should have an executable called problem1.

Run gdb on the Program

Start gdb with

gdb problem1

and set a breakpoint in main by typing b main. The breakpoint tells the debugger to stop the program when it reaches the specified function or line, allowing you to type more debugger commands; for example, for examining variable values at that point.

Now run the program by typing r or run. The program will stop in main. (Ignore any warnings; they're meaningful but we'll work around them.)

Next Steps and Questions

Take careful note of the commands we ask you to run. In particular, think about what they mean, and why one command might be a better choice than another one that seems superficially similar.

For example, remember that r is the quickest way to run a program under gdb and that if you use r or run alone GDB remembers the arguments you used last time and uses them again (see Step 5 below).

Note: To help you keep track of what you're supposed to be doing, at the beginning of each step we've listed the breakpoints you should have already set in italics---except when they don't matter. Also, when possible we have listed the state you should be in.

  1. Existing breakpoint at main.

    Type c (or continue) to continue past the breakpoint. What happens?

  2. Existing breakpoint at main; after the program terminates.

    Type bt (or backtrace). That will print a “trace” of which function called which to get to where the program died. Take note of the numbers in the left column; they identify the stack frames of the calls that led to the point of failure. The point of failure is #0, and main is the last function listed.

    Type frame n, where n is one of those numbers, to get to main's stack frame so that you can look at main's variables.

    What file and line number are you on?

  3. Existing breakpoint at main; after the program terminates.

    Usually when bad things appear to happen in the library (here, several variants of strtol) it's actually your fault, not the library's. In this case, the problem is that main passed a bad argument to atoi.

    Rerun the program by typing r (you'll have to confirm that you really want to do so) and let it stop at the breakpoint. Note that in Step 2, atoi was called with the argument argv[1]. You can find out the value that was passed to atoi with the command "print argv[1]".

    What is printed?

  4. Existing breakpoint at main; after rerunning the program and stopping at the breakpoint.

    If you took CS 70, you will recognize that number as the value of a NULL pointer. Like many library functions, atoi doesn't like NULL pointers.

    Rerun the program with an argument of 5 by typing r 5. When it reaches the breakpoint, continue (type c).

    What does the program print?

  5. Existing breakpoint at main; after the program terminates.

    Without restarting gdb, type r (without any further parameters) to run the program yet again. (If you restarted gdb, you must first repeat Step 4.)

    When you get to the breakpoint, examine the variables argc and argv by using the print command. For example, type print argv[0].

    Also try print argv[0]@argc, which is gdb's notation for “print elements of the argv array starting at element 0 and continuing for argc elements”.

    What is the value of argc?

    What are the elements of the argv array?

    Where did they come from, given that you didn't add anything to the run command?

  6. Existing breakpoint at main; at main.

    The step or s command is a useful way to follow a program's execution one line at a time.

    Type s. Where do you wind up?

    If you end up on a line inside atoi.c, type finish to get out of atoi and then type s again so that you end at a line inside problem1.c.

    Where have you wound up?

  7. Existing breakpoint at main; at main plus one step.

    gdb always shows you the line that is about to be executed. Sometimes it's useful to see some context. Type list and the Enter (or Return) key.

    What lines do you see?

    Now hit the Enter key again.

    What do you see now?

  8. Existing breakpoint at main; at main and stepped once as described in Step 6.

    Type s (and Enter) to step to the next line. Then hit the Enter key three more times.

    What do you think the Enter key does?

  9. Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.

    What are the values of result, a, and b?

  10. Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.

    Disassemble the main function by typing disassem main (or disas main). Look at what functions are called by main. You should be able to see call instructions to atoi and to loop_while.

    We haven't covered this material in class yet, but functions expect their first two parameters to be stored in the registers %rdi and %rsi (also known, for this problem, as %edi and %esi—go figure), respectively; and functions return results in %rax (also known as %eax). So after the call to atoi, the result of atoi will be in %eax.

    Describe what the instructions between the calls to atoi and loop_while are doing.

  11. Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.

    Type quit to exit gdb. (You'll have to tell it to kill the “inferior process”, which is the program you are debugging. Insulting!)

    Recompile the program, but this time optimize the code more by adding -O2 after the -g:

    sh gcc -g -O2 -o problem1 problem1.c

    Note that the O is the letter, not a zero. (Also note that the lowercase "-o" is still necessary!)

    Start gdb with problem1 again. This time, set a breakpoint at loop_while (not main!), and run it with an argument of 20 (not 5!). Where does the program stop?

  12. Existing breakpoint at loop_while; after running with argument of 20.

    Hmm…. That's kind of odd. Disassemble the main function by typing disassem main (or disas main).

    What is the address of the instruction that calls printf?

    Are there any other call instructions?

  13. Existing breakpoint at loop_while; after running with argument of 20.

    It turns out that the call to atoi was replaced with a call to strtol. But what's up with loop_while? Where's the call to it?

  14. Existing breakpoint at loop_while; after running with argument of 20.

    A handy feature of print is that you can use it to convert between bases. For example, what happens when you type "print/x 42"? How about "p 0x2f"?

  15. Existing breakpoint at loop_while; after running with argument of 20.

    After the call to strtol there are a few lea instructions (lea is gdb's version of leal and leaq, depending on the destination). The lea you want to look at contains a small hexadecimal constant. Can you find a matching decimal value (positive or negative) in problem1.c?

  16. Some of the instructions following the call to strtol are of secondary importance, and the rest are pretty hard for novices to decode. But if the value in %eax is called \( a \), the important instructions (which are the ones at +32, +35, +37, +40, and +42) calculate \( 15(a-16) + 2a - 1680. \) (0x690 is 1680 in decimal.)

    (After you've developed a bit more facility with the x86_64 instruction set, it will be worth your time to return to these instructions and analyze them.)

    That's weird. Or not—it turns out that the compiler is so smart that it figured out the underlying math of loop_while and replaced it with a straight-line calculation. Wow!

    No answer is needed here.

To Complete This Part of the Assignment…

You'll know you're done with this part of the assignment when you've done all of the following:

(When logged in, completion status appears here.)