Problem 1: Debugging Optimized Code
Let's first look at the code in problem1.c:
#include <stdio.h>
#include <stdlib.h>

int loop_while(int a, int b)
{
    int i = 0;
    int result = a;
    while (i < 256) {
        result += a;
        a -= b;
        i += b;
    }
    return result;
}

int main(int argc, char *argv[])
{
    printf("%d\n", loop_while(atoi(argv[1]), 16));
    return 0;
}
The program has a function, loop_while, with a small while loop, and a
simple main function that calls loop_while. Look over the
loop_while function to get an idea about how it works (you don't need to
fully decode it; just get a clue about what's going on).
Notice the atoi function (pronounced “ay to eye”, as in “ASCII to integer”, rather than “a toy”). You can run man atoi in a terminal
window to find out how it works (i.e., what it does, what arguments it takes, what it returns, etc.).
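To make the man page description a little more concrete, here is a minimal sketch of how atoi treats a couple of inputs. The helper name show_atoi is hypothetical and is not part of problem1.c; it only wraps the standard atoi call so the result can be printed and returned.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of problem1.c): convert a string with
 * atoi, print what came back, and return it. */
int show_atoi(const char *text)
{
    int value = atoi(text);   /* parses the leading decimal digits of text */
    printf("atoi(\"%s\") = %d\n", text, value);
    return value;
}
```

Calling show_atoi("20") returns 20, and show_atoi("12abc") returns 12, because the conversion stops at the first character that cannot be part of a decimal number. Note that atoi has no way to report an error to its caller.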
Also look at the call to printf, which is the usual way of producing output in C (as opposed to the iostream approach in C++ that you learned about in CS 70).
You can read more about printf in Kernighan & Ritchie or
online (the advantage of reading in K&R is that the description
there is less complex; recent versions of printf have tons of
extensions that aren't particularly useful in this course).
The printf function supports a lot of complicated functionality, but for now we'll just
say that it prints answers, and "%d\n" means print the argument as a decimal integer followed by a newline character.
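As a small illustration of the "%d\n" format, the sketch below formats an int into a buffer with snprintf, which accepts the same format strings as printf. The helper name format_result is an assumption made for this example; problem1.c itself just calls printf directly.

```c
#include <stdio.h>

/* Hypothetical helper: format value the way problem1.c's printf call
 * would print it, but into a caller-supplied buffer so the resulting
 * characters can be inspected. Returns the number of characters written
 * (not counting the terminating '\0'). */
int format_result(char *buffer, size_t size, int value)
{
    /* %d -> the int argument in decimal; \n -> a trailing newline */
    return snprintf(buffer, size, "%d\n", value);
}
```

For example, formatting -42 into a 16-byte buffer produces the four characters '-', '4', '2', and a newline.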
Compile the Program
Compile the program with the -g switch and no optimization; that is,
gcc -g -o problem1 problem1.c
You should have an executable called problem1.
Try Running the Program
Try running the program with no arguments:
./problem1
You should see the program crash out with a “segmentation fault” (or “segfault” for short). This is a common type of error that happens when a program tries to access memory that it shouldn't. You might be able to figure out why this happens just by looking at the code, but we'll use a tool called gdb to analyze the program and find out exactly what's going on. (Don't worry if you don't understand everything right away; it'll make more sense as you work through the steps below.)
Run gdb on the Program
Start gdb with
gdb problem1
and set a breakpoint in main by typing b main. The breakpoint
tells the debugger to stop the program when it reaches the specified
function or line, allowing you to type more debugger commands; for
example, for examining variable values at that point.
Now run the program by typing run (which you can abbreviate as just r). The program will stop in
main.
Your session up to this point should look something like this:
% gdb problem1
GNU gdb (Gentoo 16.3 vanilla) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from problem1...
(gdb) b main
Breakpoint 1 at 0x40117a: file problem1.c, line 18.
(gdb) run
Starting program: /mnt/home/oneill/cs105/lab2-debugging/problem1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, main (argc=1, argv=0x7fffffffe328) at problem1.c:18
18 printf("%d\n", loop_while(atoi(argv[1]), 16));
(gdb)
At this point, we've started the program, and then paused execution at the beginning of main. The program is waiting for us to type more commands in gdb to tell it what to do next.
Open lab02.txt to Record Your Answers
You may want to open up a second terminal session to the server so that you can edit the file lab02.txt to record your answers in one window while you use gdb in another. In any case, make sure to record your answers to the questions in the instructions as you go along.
Next Steps and Questions
Take careful note of the commands we ask you to run. In particular, think about what they mean, and why one command might be a better choice than another one that seems superficially similar.
For example, remember that r is the quickest way to run a program
under gdb, and that if you use r or run alone, GDB remembers the
arguments you used last time and uses them again (see Step 5 below).
Note: To help you keep track of what you're supposed to be doing, at the beginning of each step we've listed the breakpoints you should have already set (except when they don't matter). Also, when possible, we have listed the state you should be in.
-
Existing breakpoint at main.
Type c (or continue) to continue past the breakpoint. What happens?
-
Existing breakpoint at main; the program is in the process of crashing out.
Type bt (or backtrace). That will print a “trace” of which function called which to get to where the program died. Take note of the numbers in the left column; they identify the stack frames of the calls that led to the point of failure. The point of failure is #0, and main is the last function listed.
A stack frame is a record of a function call, including the function's name, the values of its parameters, and variables local to the function. You might remember drawing stack diagrams in CS 70 and showing the stack frame for each function call. The stack frames in the backtrace are the same thing. GDB can inspect any stack frame and (if the code was compiled with debugging information) show you the values of the variables in that frame.
The backtrace also shows the code address for the functions being executed. You might notice that the address for main is quite different from the other functions in the trace.
Type frame n, where n is the number next to main, so that you can look at main's variables.
What file and line number are you on?
Also, why do you think the other functions in the backtrace have such different addresses from main? (It's okay to guess!)
-
Existing breakpoint at main; the program is in the process of crashing out.
Looking at the code, we can see that we were calling atoi(argv[1]) when the program crashed deep inside the implementation of atoi. Usually when bad things appear to happen in the library (here, several variants of strtol) it's actually your fault, not the library's. In this case, the problem is that main passed a bad argument to atoi. Let's look at the argument that was passed to atoi by running these three commands:
print argv[1]
whatis argv[1]
explore argv[1]
and see what it is.
What did it tell you about the argument passed to atoi?
-
Existing breakpoint at main; the program is in the process of crashing out.
From CS 70, you might remember null pointers, which are pointers that don't point to anything. On x86_64 Linux, null pointers are represented by the value 0 (or 0x0 in hexadecimal). The library function atoi wants a pointer to an actual C-style string, and so isn't at all happy being given a null pointer.
First, let's let the program finish crashing out by typing c (or continue). It should end up saying that the program terminated and no longer exists.
Now, let's see how it behaves when we actually give it a valid argument. We'll pass in 5 as an argument to the program.
Type
r 5
to run the program with an argument of 5. (Note, BTW, that if we hadn't terminated the original program, GDB would have reminded us that we still had an instance of the program running and asked us if we were okay with starting over.)
We still have our original breakpoint at main, so the program will stop there again. Before we continue, double-check the value of argv[1] by typing
print argv[1]
to make sure it is what we expect; you should see that it is a pointer to the string "5". Now, type c to continue past the breakpoint.
What does the program print? Is the program still running? Try typing c again, or p argv[1] again.
-
Existing breakpoint at main; after the program terminates.
Without restarting gdb, type r (without any further parameters) to run the program yet again. (If you restarted gdb, you must first repeat Step 4.)
When you get to the breakpoint, examine the variables argc and argv by using the print command (you can use whatis too, if you like). For example, type print argv[0].
Also try print argv[0] @ argc, which is gdb's notation for “print elements of the argv array starting at element 0 and continuing for argc elements”.
What is the value of argc?
What are the elements of the argv array?
Where did they come from, given that you didn't add anything to the run command? (In other words, what do you surmise r does when you don't give it any arguments?)
-
Existing breakpoint at main; stopped entering main.
The step or s command is a useful way to follow a program's execution one line at a time.
Type s. Where do you wind up?
If you end up on a line inside atoi.c, type finish to get out of atoi and then type s again so that you end at a line inside problem1.c.
Where have you wound up?
-
Existing breakpoint at main; at main plus one step.
gdb always shows you the line that is about to be executed. Sometimes it's useful to see some context. Type list and press the Enter (or Return) key.
What lines do you see?
Now hit the Enter key again.
What do you see now?
-
Existing breakpoint at main; at main and stepped once as described in Step 6.
Type s (and Enter) to step to the next line. Then hit the Enter key three more times.
What do you think the Enter key does?
-
Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.
What are the values of result, a, and b?
-
Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.
Disassemble the main function by typing disassem main (or disas main). Look at what functions are called by main. You should be able to see call instructions to atoi and to loop_while.
Remember what we figured out in class: functions expect their first two parameters to be stored in the registers %rdi and %rsi (or in this case, the 32-bit versions %edi and %esi), respectively; and functions return results in %rax (or %eax). So after the call to atoi, the result of atoi will be in %eax. Look at the instructions following the call to atoi and see how the arguments for loop_while are set up.
Describe what the instructions between the calls to atoi and loop_while are doing.
-
Existing breakpoint at main; after stepping once as described in Step 6 and then stepping four more times.
Type quit to exit gdb. (You'll have to tell it to kill the “inferior process”, which is the program you are debugging.)
Recompile the program, but this time optimize the code more by adding -O2 after the -g:
gcc -g -O2 -o problem1 problem1.c
Note that the O is the letter, not a zero. (Also note that the lowercase "-o" is still necessary!)
Start gdb with problem1 again. This time, set a breakpoint at loop_while (not main!), and run it with an argument of 20 (not 5!). Where does the program stop?
-
Existing breakpoint at loop_while; after running with argument of 20.
Hmm… That's kind of odd. Disassemble the main function by typing disassem main (or disas main).
What is the address of the instruction that calls printf?
Are there any other call instructions?
-
Existing breakpoint at loop_while; after running with argument of 20.
It turns out that the call to atoi was replaced with a call to strtol. But what's up with loop_while? Where's the call to it?
-
Existing breakpoint at loop_while; after running with argument of 20.
A handy feature of print is that you can use it to convert between bases. For example, what happens when you type "print/x 42"? How about "p 0x2f"?
-
Existing breakpoint at loop_while; after running with argument of 20.
After the call to strtol there are a few lea instructions (lea is gdb's version of leal and leaq, depending on the destination). The lea you want to look at contains a small hexadecimal constant. Can you find a matching decimal value (positive or negative) in problem1.c?
-
Some of the instructions following the call to strtol are of secondary importance, and the rest are pretty hard for novices to decode. But if the value in %eax is called \( a \), the important instructions (which are the ones at +32, +35, +37, +40, and +42) calculate \( 15(a-16) + 2a - 1680 \). (0x690 is 1680 in decimal.)
(After you've developed a bit more facility with the x86_64 instruction set, it will be worth your time to return to these instructions and analyze them.)
That's weird. Or not: it turns out that the compiler is so smart that it figured out the underlying math of loop_while and replaced it with a straight-line calculation. Wow!
No answer is needed here.
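One way to convince yourself that the straight-line expression really matches the loop is to compare the two directly. With b = 16 the loop body runs 16 times (i = 0, 16, ..., 240), so result is a plus the sum of a - 16k for k = 0 to 15, which is 17a - 1920; that is the same value as 15(a-16) + 2a - 1680. The sketch below copies loop_while from problem1.c; the names closed_form and forms_agree are hypothetical helpers added only for this check.

```c
#include <stdio.h>

/* Copied from problem1.c. */
int loop_while(int a, int b)
{
    int i = 0;
    int result = a;
    while (i < 256) {
        result += a;
        a -= b;
        i += b;
    }
    return result;
}

/* Hypothetical helper: the straight-line expression quoted above,
 * valid for b == 16 only. */
int closed_form(int a)
{
    return 15 * (a - 16) + 2 * a - 1680;   /* simplifies to 17*a - 1920 */
}

/* Hypothetical helper: returns 1 if the loop and the expression agree
 * for every a in [lo, hi], and 0 (after reporting the first mismatch)
 * otherwise. */
int forms_agree(int lo, int hi)
{
    for (int a = lo; a <= hi; a++) {
        if (loop_while(a, 16) != closed_form(a)) {
            printf("mismatch at a = %d\n", a);
            return 0;
        }
    }
    return 1;
}
```

Calling forms_agree(-1000, 1000) from a main function of your own should return 1. The range is kept small so that none of the intermediate values overflow an int.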
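Looking back at the crash traced in the early steps: it came from handing argv[1] to atoi when no argument had been given. A more defensive program might check argc before converting. The sketch below is not part of the lab's problem1.c; the helper name parse_first_arg and its return convention (0 on success, -1 on failure) are assumptions made for illustration, and it uses strtol rather than atoi so that bad input can be detected.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse argv[1] as a decimal int.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_first_arg(int argc, char *argv[], int *out)
{
    if (argc < 2 || argv[1] == NULL)
        return -1;                  /* no argument: argv[1] is a null pointer */

    char *end;
    errno = 0;
    long value = strtol(argv[1], &end, 10);

    if (end == argv[1] || *end != '\0')
        return -1;                  /* no digits at all, or trailing junk */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;                  /* value does not fit in an int */

    *out = (int)value;
    return 0;
}
```

A main function could call parse_first_arg(argc, argv, &a) and print a usage message when it returns -1, instead of crashing inside the C library.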