CS 105

Problem 1: Debugging Optimized Code

Let's first look at the code in problem1.c:

#include <stdio.h>
#include <stdlib.h>

int loop_while(int a, int b)
{
    int i = 0;
    int result = a;
    while (i < 256) {
        result += a;
        a -= b;
        i += b;
    }
    return result;
}

int main(int argc, char *argv[])
{
    printf("%d\n", loop_while(atoi(argv[1]), 16));
    return 0;
}

The program has a function, loop_while, with a small while loop, and a simple main function that calls loop_while. Look over the loop_while function to get an idea about how it works (you don't need to fully decode it; just get a clue about what's going on).

atoi is pronounced “ay to eye”, as in “ASCII to integer” rather than as “a toy”.

Notice the atoi function. You can run man atoi in a terminal window to find out how it works (i.e., what it does, what arguments it takes, what it returns, etc.).

Also look at the call to printf, which is the usual way of producing output in C (as opposed to the iostream approach in C++ that you learned about in CS 70).

You can read more about printf in Kernighan & Ritchie or online (the advantage of reading in K&R is that the description there is less complex; recent versions of printf have tons of extensions that aren't particularly useful in this course).

The printf function supports a lot of complicated functionality, but for now we'll just say that it prints answers, and "%d\n" means print the argument as a decimal integer followed by a newline character.

Compile the Program

Compile the program with the -g switch and no optimization; that is,

gcc -g -o problem1 problem1.c

You should have an executable called problem1.

Try Running the Program

Try running the program with no arguments:

./problem1

You should see the program crash out with a “segmentation fault” (or “segfault” for short). This is a common type of error that happens when a program tries to access memory that it shouldn't. You might be able to figure out why this happens just by looking at the code, but we'll use a tool called gdb to analyze the program and find out exactly what's going on. (Don't worry if you don't understand everything right away; it'll make more sense as you work through the steps below.)

Run gdb on the Program

Start gdb with

gdb problem1

and set a breakpoint in main by typing b main. The breakpoint tells the debugger to stop the program when it reaches the specified function or line, so that you can type more debugger commands (for example, to examine variable values at that point).

Now run the program by typing run (which you can abbreviate as just r). The program will stop in main.

Your session up to this point should look something like this:

% gdb problem1
GNU gdb (Gentoo 16.3 vanilla) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from problem1...
(gdb) b main
Breakpoint 1 at 0x40117a: file problem1.c, line 18.
(gdb) run
Starting program: /mnt/home/oneill/cs105/lab2-debugging/problem1 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, main (argc=1, argv=0x7fffffffe328) at problem1.c:18
18      printf("%d\n", loop_while(atoi(argv[1]), 16));
(gdb)

At this point, we've started the program, and then paused execution at the beginning of main. The program is waiting for us to type more commands in gdb to tell it what to do next.

Open lab02.txt to Record Your Answers

You may want to open up a second terminal session to the server so that you can edit the file lab02.txt to record your answers in one window while you use gdb in another. In any case, make sure to record your answers to the questions in the instructions as you go along.

Next Steps and Questions

Take careful note of the commands we ask you to run. In particular, think about what they mean, and why one command might be a better choice than another one that seems superficially similar.

For example, remember that r is the quickest way to run a program under gdb, and that if you use r or run alone, gdb remembers the arguments you used last time and uses them again (see Step 5 below).

Note: To help you keep track of what you're supposed to be doing, at the beginning of each step we've listed the breakpoints you should have already set in italics (except when they don't matter). Also, when possible, we have listed the state you should be in.

  1. Existing breakpoint at main.

    Type c (or continue) to continue past the breakpoint. What happens?

  2. Existing breakpoint at main; the program is in the process of crashing out.

    Type bt (or backtrace). That will print a “trace” of which function called which to get to where the program died. Take note of the numbers in the left column; they identify the stack frames of the calls that led to the point of failure. The point of failure is #0, and main is the last function listed.

    A stack frame is a record of a function call, including the function's name, the values of its parameters, and variables local to the function. You might remember drawing stack diagrams in CS 70 and showing the stack frame for each function call. The stack frames in the backtrace are the same thing. GDB can inspect any stack frame and (if the code was compiled with debugging information) show you the values of the variables in that frame.

    The backtrace also shows the code address for the functions being executed. You might notice that the address for main is quite different from the other functions in the trace.

    Type frame n, where n is the number next to main, so that you can look at main's variables.

    What file and line number are you on?

    Also, why do you think the other functions in the backtrace have such different addresses from main? (It's okay to guess!)

  3. Existing breakpoint at main; the program is in the process of crashing out.

    Looking at the code, we can see that we were calling atoi(argv[1]) when the program crashed deep inside the implementation of atoi. Usually when bad things appear to happen in the library (here, several variants of strtol) it's actually your fault, not the library's. In this case, the problem is that main passed a bad argument to atoi. Let's look at the argument that was passed to atoi by running these three commands:

    print argv[1]
    whatis argv[1]
    explore argv[1]
    

    and see what it is.

    What did it tell you about the argument passed to atoi?

  4. Existing breakpoint at main; the program is in the process of crashing out.

    From CS 70, you might remember null pointers, which are pointers that don't point to anything. On x86_64 Linux, null pointers are represented by the value 0 (or 0x0 in hexadecimal). The library function atoi wants a pointer to an actual C-style string, and so isn't at all happy being given a null pointer.

    First, let's let the program finish crashing out by typing c (or continue). It should end up saying that the program terminated and no longer exists.

    Now, let's see how it behaves when we actually give it a valid argument. We'll pass in 5 as an argument to the program.

    Type

    r 5
    

    to run the program with an argument of 5. (Note, BTW, that if we hadn't terminated the original program, GDB would have reminded us that we still had an instance of the program running and asked us if we were okay with starting over.)

    We still have our original breakpoint at main, so the program will stop there again. Before we continue, double-check the value of argv[1] by typing

    print argv[1]
    

    to make sure it is what we expect—you should see that it shows it's a pointer to the string "5". Now, type c to continue past the breakpoint.

    What does the program print? Is the program still running? Try typing c again, or p argv[1] again.

  5. Existing breakpoint at main; after the program terminates.

    Without restarting gdb, type r (without any further parameters) to run the program yet again. (If you restarted gdb, you must first repeat Step 4.)

    When you get to the breakpoint, examine the variables argc and argv by using the print command (you can use whatis too, if you like). For example, type print argv[0].

    Also try print argv[0] @ argc, which is gdb's notation for “print elements of the argv array starting at element 0 and continuing for argc elements”.

    What is the value of argc?

    What are the elements of the argv array?

    Where did they come from, given that you didn't add anything to the run command? (In other words, what do you surmise r does when you don't give it any arguments?)

  6. Existing breakpoint at main; stopped entering main.

    The step or s command is a useful way to follow a program's execution one line at a time.

    Type s. Where do you wind up?

    If you end up on a line inside atoi.c, type finish to get out of atoi and then type s again so that you end at a line inside problem1.c.

    Where have you wound up?

  7. Existing breakpoint at main; at main plus one step.

    gdb always shows you the line that is about to be executed. Sometimes it's useful to see some context. Type list and press the Enter (or Return) key.

    What lines do you see?

    Now hit the Enter key again.

    What do you see now?

  8. Existing breakpoint at main; at main and stepped once as described in Step 6.

    Type s (and Enter) to step to the next line. Then hit the Enter key three more times.

    What do you think the Enter key does?

  9. Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.

    What are the values of result, a, and b?

  10. Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.

    Disassemble the main function by typing disassem main (or disas main). Look at what functions are called by main. You should be able to see call instructions to atoi and to loop_while.

    Remember what we figured out in class: functions expect their first two parameters to be stored in the registers %rdi and %rsi (or in this case, the 32-bit versions %edi and %esi), respectively, and functions return results in %rax (or %eax). So after the call to atoi, the result of atoi will be in %eax. Look at the instructions following the call to atoi and see how the arguments for loop_while are set up.

    Describe what the instructions between the calls to atoi and loop_while are doing.

  11. Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.

    Type quit to exit gdb. (You'll have to tell it to kill the “inferior process”, which is the program you are debugging.)

    Recompile the program, but this time optimize the code more by adding -O2 after the -g:

    gcc -g -O2 -o problem1 problem1.c
    

    Note that the O is the letter, not a zero. (Also note that the lowercase "-o" is still necessary!)

    Start gdb with problem1 again. This time, set a breakpoint at loop_while (not main!), and run it with an argument of 20 (not 5!). Where does the program stop?

  12. Existing breakpoint at loop_while; after running with argument of 20.

    Hmm…. That's kind of odd. Disassemble the main function by typing disassem main (or disas main).

    What is the address of the instruction that calls printf?

    Are there any other call instructions?

  13. Existing breakpoint at loop_while; after running with argument of 20.

    It turns out that the call to atoi was replaced with a call to strtol. But what's up with loop_while? Where's the call to it?

  14. Existing breakpoint at loop_while; after running with argument of 20.

    A handy feature of print is that you can use it to convert between bases. For example, what happens when you type "print/x 42"? How about "p 0x2f"?

  15. Existing breakpoint at loop_while; after running with argument of 20.

    After the call to strtol there are a few lea instructions (lea is gdb's version of leal and leaq, depending on the destination). The lea you want to look at contains a small hexadecimal constant. Can you find a matching decimal value (positive or negative) in problem1.c?

  16. Some of the instructions following the call to strtol are of secondary importance, and the rest are pretty hard for novices to decode. But if the value in %eax is called \( a \), the important instructions (which are the ones at +32, +35, +37, +40, and +42) calculate \( 15(a-16) + 2a - 1680 \). (0x690 is 1680 in decimal.)

    (After you've developed a bit more facility with the x86_64 instruction set, it will be worth your time to return to these instructions and analyze them.)

    That's weird. Or not—it turns out that the compiler is so smart that it figured out the underlying math of loop_while and replaced it with a straight-line calculation. Wow!

    No answer is needed here.
