Problem 3: Assembly-Level Debugging
Thus far, we've mostly been taking advantage of the fact that gdb
understands your program at the source level: it knows about
strings, source lines, call chains, and even complicated C++ data
structures. But sometimes it's necessary to dive into the assembly
code.
-
If you get to this point before we've done the lecture on “flow control”, it might be a good time to take a break and work on some other class, although you can get partway into this part before we cover the relevant material.
-
When you are working with assembly code, it can be very helpful to issue the gdb command set disassemble-next-line on. That will tell gdb that whenever the program stops, it should disassemble and display the next instruction that is to be executed. We suggest that you issue this command whenever you start GDB.
Steps and Questions
To be sure we're all on the same page,
- Quit gdb.
- Run gdb against the version of problem2 you built with the -Og -g compiler arguments:
gdb problem2
which should print a bunch of boilerplate and then give you a (gdb) prompt.
- Run the program inside GDB with
run 1 42 2 47 3
You should see something like
(gdb) run 1 42 2 47 3
Starting program: /mnt/home/cmc/cs105/lab02/problem2 1 42 2 47 3
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
1 47 2 42 3
[Inferior 1 (process 17222) exited normally]
(gdb)
Now we can start debugging (as before, we're including the state you should be in at the start of each step):
-
No breakpoints; after running problem2.
What is the output?
-
No breakpoints.
Set a breakpoint in main. Run the program again (use r alone so GDB uses the same arguments).
What line does it stop at?
-
Existing breakpoint at main; after running the program.
Type list and then Enter to see what's nearby, then type b 35 and c.
Where does it stop now?
-
Existing breakpoints at main lines 29 and 35; after running and continuing.
Since that's the start of the loop, typing c will take you to the next iteration, right?
-
Existing breakpoints at main lines 29 and 35; after running and continuing twice.
Start over by just typing r, then continue past that first breakpoint to the second one, which is the one we care about.
Since we're in the for statement, why didn't the program stop there the second time?
Type info b (or info breakpoints for the terminally verbose) and take a look at the “Address” column. Take note of the address given for breakpoint 2, and then type disassem main. You'll note that there's a helpful little arrow (=>) on the left margin pointing right at breakpoint 2's address, since that's the instruction we're about to execute.
Looking back at the corresponding source code, what part of the for statement does this assembly code correspond to?
-
Existing breakpoints at main lines 29 and 35; after running and continuing once.
The instructions after <+28> comprise the rest of the for loop, at lines 35–37 of the source. See if you can find the end of the loop body, where it jumps back to <main+35>. You should recognize the for loop pattern we covered in class.
We've successfully set a breakpoint at the point where the loop is initialized. But we'd like to have a breakpoint inside the for loop, so we can stop on every iteration. In fact, sometimes we want to break at a given assembly instruction, even if it's in the middle of a source line, and GDB lets us do that.
Type b *(main+35) or b *0x4011b9 (assuming that's the address of main+35). The asterisk (*) tells GDB to interpret the rest of the command as an address in memory, rather than as a line number in the source code.
What does info b tell you about the line number you chose?
-
Existing breakpoints at main lines 29 and 35, and at instruction main+35; after running and continuing twice before hitting the third breakpoint.
We can look at the current value of the array by typing p array[0]@argc or p array[0]@6. But the current value isn't very interesting. Let's continue a few times and see what it looks like then. Typing c over and over is tedious (especially if you need to do it 10,000 times!) so let's use a single continue or c to get to breakpoint 3 and then try c 4.
What are the full contents of array?
-
Existing breakpoints at main lines 29 and 35, and at instruction main+35; after continuing until breakpoint 3 has been hit and then typing c 4.
Maybe we should have done c 3 instead of c 4. We could rerun the program, but we really don't need all the breakpoints we set, as we're only working with breakpoint 3.
Type info b to find out what's going on right now. Then use d 1 or delete 1 to completely get rid of breakpoint 1. But maybe breakpoint 2 will be useful in the future, so type disable 2. Use info b to verify that it's no longer enabled (in the “Enb” column).
Now run the program again. Where do we stop?
-
No previous state.
Sometimes, instead of stepping through a program line by line, we want to see what the individual instructions do. Of course, instructions manipulate registers. Quit gdb and restart it, setting a breakpoint in fix_array. (Remember to issue set disassemble-next-line on.)
Run the program with run 1 42 2 47 3.
Type info registers (or info r) to see all the processor registers and the values stored in them in both hex and decimal.
Which registers have not been covered in class?
-
Existing breakpoint at fix_array; after running and hitting the breakpoint.
Most of the registers we didn't cover aren't all that interesting, except for eflags, which holds the condition codes… and some other things. Notice that instead of being shown in decimal, its contents are shown symbolically (for example, CF, ZF, and so on).
Of the flags we have discussed in class, which ones are set right now?
What preceding instruction caused those flags to be set?
NOTE: If you haven't been through the “x86 control flow” lecture, you will have to return to this step after that.
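To make the symbolic eflags display less mysterious, here is a minimal Python sketch of how a raw eflags value maps to flag names. The bit positions are the standard x86-64 ones; the sample value 0x246 is the one that appears in the register listing near the end of this handout, not necessarily what you will see at this breakpoint. The helper name decode_eflags is made up for illustration.

```python
# Bit positions of the x86-64 flags that GDB names when it prints eflags.
FLAG_BITS = [
    (0, "CF"), (2, "PF"), (4, "AF"), (6, "ZF"), (7, "SF"),
    (8, "TF"), (9, "IF"), (10, "DF"), (11, "OF"),
]

def decode_eflags(value):
    """Return the names of the flags whose bits are set in value."""
    return [name for bit, name in FLAG_BITS if value & (1 << bit)]

# 0x246 is the eflags value in the handout's sample register listing.
print(decode_eflags(0x246))  # ['PF', 'ZF', 'IF']
```

Bit 1 of eflags is reserved and always reads as 1, which is why 0x246 has one more bit set than the three names shown.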
-
Existing breakpoint at fix_array; after running and hitting the breakpoint.
Looking at all the registers is often more information than we need; sometimes we just want to see one. GDB lets us do just that. Type p $rdi.
What is the value?
Is p/x $rdi more meaningful?
-
Existing breakpoint at fix_array; after running and hitting the breakpoint.
You may recall that earlier we mentioned a fondness for x/16i. There's nothing special about the number 16; we just like powers of 2, and 16 gives you enough instructions to be useful. But we really like x/16i $rip.
Type that command into GDB, and compare its results to what you get from disassem fix_array.
Explain your observations.
-
Existing breakpoint at fix_array; after running and hitting the breakpoint.
Finally, we mentioned stepping by instructions. That's done with stepi (“step one instruction”). Type that now, and note that gdb gives a new instruction address but says that you're at the left curly brace ({).
With set disassemble-next-line on (from Step 9), gdb will also tell you what instruction you are on. A shorter alternative to disassemble-next-line that only shows one line is to use display/i $rip. Combining these two techniques can be confusing, so we recommend that you pick one and stick with it.
What instruction are we on?
-
Keep hitting Enter to step one instruction at a time until you reach a call instruction.
What function is about to be called?
-
Existing breakpoint at fix_array; after hitting the breakpoint and then stepping by instruction until a call is about to be executed.
As with source-level debugging, at the assembly level it's often useful to skip over function calls. At this point you have a choice of typing stepi or nexti.
If you type stepi, what do you expect the next instruction to be (hexadecimal address)?
What about nexti?
(By now, your debugging skills should be strong enough that you can try one, restart the program, and try the other, so there's little excuse for getting this one wrong!)
-
Existing breakpoint at fix_array; after experimenting with stepi and nexti.
Stepping one instruction at a time can be tedious. You can always use stepi n (where n is an instruction count) to zip past a bunch of them, but when you're dealing with loops and conditionals it can be hard to decide whether it's going to be 1,042 or 47,093 instructions before you reach the next interesting point in your program. You could set a breakpoint at the next suspect line. But sometimes the next “interesting” bit is inside a line.
Imagine that you're interested in how the retq (AKA ret) instruction works. You might want to do disassem fix_array to see the address where this instruction occurs. You can set a breakpoint there by typing b *0x401195 (assuming that is its address, as it was when we wrote these instructions). Do so, and then continue.
What source line is listed?
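A raw address such as 0x401195 and a symbol-plus-offset form such as main+35 name the same kind of thing, and converting between them is plain arithmetic. The sketch below is only an illustration: the addresses are the ones quoted in this handout (including the <fix_array+42> and <main+83> labels from the sample register listing) and will likely differ on your build, and the helper name function_bases is invented here.

```python
# Addresses quoted in this handout; they will likely differ on other builds.
KNOWN_LOCATIONS = {
    ("main", 35): 0x4011B9,
    ("main", 83): 0x4011E9,
    ("fix_array", 42): 0x401195,
}

def function_bases(locations):
    """Recover each function's start address from its <symbol+offset> entries."""
    bases = {}
    for (symbol, offset), address in locations.items():
        base = address - offset
        # Every entry for the same symbol must agree on where it starts.
        assert bases.setdefault(symbol, base) == base
    return bases

bases = function_bases(KNOWN_LOCATIONS)
for symbol, base in sorted(bases.items()):
    print(f"{symbol} starts at {hex(base)}")
```

With the handout's numbers, both main entries agree on the same start address, which is a quick sanity check that an address you typed by hand really belongs to the function you think it does.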
-
Existing breakpoints at fix_array and retq; stopped at retq instruction.
The retq/ret instruction manipulates registers in some way. Start by looking at what %rsp points to. You can find out the address with p/x $rsp and then use the x command, or you could just try x/x $rsp. Or you could get wild and use C-style typecasting: p/x *(long *)$rsp (try it!).
What is the value?
-
Existing breakpoints at fix_array and retq; stopped at retq instruction.
Use info reg to see what all the registers hold. Then use stepi to step past the retq instruction, and look at all the registers again.
Which registers have changed, and what are their old and new values?
That's it—you're done!
One way of seeing which registers have changed is doing it by
eye—just looking back and forth between the two lists. You could,
however, take advantage of Unix tools by copying each register list
and saving it into a file (e.g., reg1 for the before-ret
register list and reg2 for the after-ret register list), then
use the diff command to compare them; for example,
diff -u reg1 reg2
which gives you something like
--- reg1 2026-02-05 11:57:37.000000000 -0800
+++ reg2 2026-02-05 11:58:55.000000000 -0800
@@ -5,7 +5,7 @@
rsi 0x5 5
rdi 0x2f 47
rbp 0x6 0x6
-rsp 0x7fffffffd8e8 0x7fffffffd8e8
+rsp 0x7fffffffd8f0 0x7fffffffd8f0
r8 0x1999999999999999 1844674407370955161
r9 0x0 0
r10 0x7ffff7f4ffe0 140737353416672
@@ -14,7 +14,7 @@
r13 0x4052b4 4215476
r14 0x7fffffffda38 140737488345656
r15 0x403dc0 4210112
-rip 0x401195 0x401195 <fix_array+42>
+rip 0x4011e9 0x4011e9 <main+83>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
@@ -32,3 +32,4 @@
k7 0x0 0
fs_base 0x7ffff7ddf740 140737351907136
gs_base 0x0 0
The -u flag tells diff to give you three lines of context on
either side of a change (-U n gives you n lines of context),
and to mark the changed lines with a - (old) or + (changed/new) on
the left margin.
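If you would rather stay in a script than use the diff command, Python's standard difflib module can produce the same unified format. This is a minimal sketch, not part of the lab: the two lists are a shortened, three-line stand-in for the reg1 and reg2 files, so the changed lines sit next to each other instead of in separate hunks.

```python
import difflib

# Shortened stand-ins for the before-ret and after-ret register lists.
before = [
    "rsp 0x7fffffffd8e8 0x7fffffffd8e8",
    "rip 0x401195 0x401195 <fix_array+42>",
    "eflags 0x246 [ PF ZF IF ]",
]
after = [
    "rsp 0x7fffffffd8f0 0x7fffffffd8f0",
    "rip 0x4011e9 0x4011e9 <main+83>",
    "eflags 0x246 [ PF ZF IF ]",
]

# n=3 asks for three lines of context, matching diff -u.
diff_lines = list(difflib.unified_diff(
    before, after, fromfile="reg1", tofile="reg2", n=3, lineterm=""))

# Keep only the changed lines, skipping the ---/+++ file header.
changed = [
    line for line in diff_lines
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("\n".join(diff_lines))
```

As with diff -u, removed lines are prefixed with - and added lines with +, while unchanged context lines start with a space.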
You can also just run diff without -u, but the results are a bit
harder to read:
8c8
< rsp 0x7fffffffd8e8 0x7fffffffd8e8
---
> rsp 0x7fffffffd8f0 0x7fffffffd8f0
17c17
< rip 0x401195 0x401195 <fix_array+42>
---
> rip 0x4011e9 0x4011e9 <main+83>
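The two changed registers fit a simple model of what ret does: it pops the 8-byte return address off the stack into %rip, which moves %rsp up by 8. The following Python sketch is a toy model under that assumption, using the values from the sample listings above; the function name ret and the dictionary-as-memory representation are illustrative choices, not anything GDB provides.

```python
import struct

def ret(rsp, memory):
    """Model ret: pop the 8-byte return address at rsp into rip."""
    raw = bytes(memory[rsp + i] for i in range(8))
    (rip,) = struct.unpack("<Q", raw)  # x86-64 stores integers little-endian
    return rip, rsp + 8

# Values taken from the sample register listings in this handout.
rsp_before = 0x7FFFFFFFD8E8
return_address = 0x4011E9  # shown by GDB as <main+83>

# Toy memory: one byte per address, holding only the saved return address.
memory = {
    rsp_before + i: byte
    for i, byte in enumerate(struct.pack("<Q", return_address))
}

rip_after, rsp_after = ret(rsp_before, memory)
print(hex(rip_after), hex(rsp_after))  # 0x4011e9 0x7fffffffd8f0
```

This is also why examining the memory at $rsp just before the ret shows you where execution will resume.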