CS 105

Problem 3: Assembly-Level Debugging

Thus far, we've mostly been taking advantage of the fact that gdb understands your program at the source level: it knows about strings, source lines, call chains, and even complicated C++ data structures. But sometimes it's necessary to dive into the assembly code.

  • If you get to this point before we've done the lecture on “flow control”, it might be a good time to take a break and work on some other class, although you can get partway into this part before we cover the relevant material.

  • When you are working with assembly code, it can be very helpful to issue the gdb command set disassemble-next-line on. That will tell gdb that whenever the program stops, it should disassemble and display the next instruction that is to be executed. We suggest that you issue this command whenever you start GDB.
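
If retyping that command in every session gets old, one option is to keep it in a small command file and hand that file to gdb with its -x option. The sketch below only writes the file. The name cs105-startup.gdb is made up for this example, so use any name you like.

```shell
# Hypothetical file name; gdb's -x option accepts any path.
cmdfile=cs105-startup.gdb

# Write the one setting this lab recommends into the command file.
printf '%s\n' 'set disassemble-next-line on' > "$cmdfile"

# Show what gdb will read.
cat "$cmdfile"

# Afterwards, start the debugger with:
#   gdb -x cs105-startup.gdb problem2
```

gdb also reads ~/.gdbinit at startup, so the same line can live there if you want it applied to every session.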

Steps and Questions

To be sure we're all on the same page,

  • Quit gdb
  • Run gdb against the version of problem2 you built with the -Og -g compiler arguments:
    gdb problem2
    

which should print a bunch of boilerplate and then give you a (gdb) prompt.

  • Run the program inside GDB with run 1 42 2 47 3

You should see something like

(gdb) run 1 42 2 47 3
Starting program: /mnt/home/cmc/cs105/lab02/problem2 1 42 2 47 3
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
1 47 2 42 3
[Inferior 1 (process 17222) exited normally]
(gdb)

Now we can start debugging (as before, we're including the state you should be in at the start of each step):

  1. No breakpoints; after running problem2.

    What is the output?

  2. No breakpoints.

    Set a breakpoint in main. Run the program again (use r alone so GDB uses the same arguments).

    What line does it stop at?

  3. Existing breakpoint at main; after running the program.

    Type list and then Enter to see what's nearby, then type b 35 and c.

    Where does it stop now?

  4. Existing breakpoints at main lines 29 and 35; after running and continuing.

    Since that's the start of the loop, typing c will take you to the next iteration, right?

  5. Existing breakpoints at main lines 29 and 35; after running and continuing twice.

    Start over by just typing r, then continue past that first breakpoint to the second one, which is the one we care about.

    Since we're in the for statement, why didn't the program stop there the second time?

    Type info b (or info breakpoints for the terminally verbose) and take a look at the “Address” column. Take note of the address given for breakpoint 2, and then type disassem main. You'll note that there's a helpful little arrow (=>) on the left margin pointing right at breakpoint 2's address, since that's the instruction we're about to execute. Looking back at the corresponding source code, what part of the for statement does this assembly code correspond to?

  6. Existing breakpoints at main lines 29 and 35; after running and continuing once.

    The instructions after <+28> comprise the rest of the for loop, at lines 35-37 of the source. See if you can find the end of the loop body, where it jumps back to <main+35>. You should recognize the for loop pattern we covered in class.

    We've successfully set a breakpoint at the point that the loop is initialized. But we'd like to have a breakpoint inside the for loop, so we can stop on every iteration. In fact, sometimes we want to break at a given line of assembly, even if it's in the middle of a source line, and GDB lets us do that.

    Type b *(main+35) or b *0x4011b9 (assuming that's the address of main+35).

    The asterisk (*) tells GDB to interpret the rest of the command as an address in memory, rather than as a line number in the source code.

    What does info b tell you about the line number you chose?
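
    The two spellings in this step name the same place because main+35 simply means "35 bytes past the start of main". Here is a quick shell check of that arithmetic, assuming the address 0x4011b9 quoted in this step (your build may place main somewhere else):

```shell
# Address of main+35 as quoted in this handout; an assumption about your build.
main_plus_35=0x4011b9

# Subtracting the offset recovers the address where main itself starts.
main_start=$(printf '0x%x' $(( $main_plus_35 - 35 )))
echo "main starts at $main_start"

# Adding the offset back reproduces the breakpoint address.
printf 'main+35 is at 0x%x\n' $(( $main_start + 35 ))
```

    The same reasoning applies to any symbol+offset label that gdb prints next to an address.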

  7. Existing breakpoints at main lines 29 and 35, and at instruction main+35; after running and continuing twice before hitting the third breakpoint.

    We can look at the current value of the array by typing p array[0]@argc or p array[0]@6. But the current value isn't very interesting. Let's continue a few times and see what it looks like then. Typing c over and over is tedious (especially if you need to do it 10,000 times!) so let's use a single continue or c to get to breakpoint 3 and then try c 4.

    What are the full contents of array?

  8. Existing breakpoints at main lines 29 and 35, and at instruction main+35; after continuing until breakpoint 3 has been hit and then typing c 4.

    Maybe we should have done c 3 instead of c 4. We could rerun the program, but we really don't need all the breakpoints we set, as we're only working with breakpoint 3.

    Type info b to find out what's going on right now. Then use d 1 or delete 1 to completely get rid of breakpoint 1. But maybe breakpoint 2 will be useful in the future, so type disable 2. Use info b to verify that it's no longer enabled (in the “Enb” column).

    Now run the program again. Where do we stop?

  9. No previous state.

    Sometimes, instead of stepping through a program line by line, we want to see what the individual instructions do. Of course, instructions manipulate registers. Quit gdb and restart it, setting a breakpoint in fix_array. (Remember to issue set disassemble-next-line on.)

    Run the program with run 1 42 2 47 3.

    Type info registers (or info r) to see all the processor registers and the values stored in them in both hex and decimal.

    Which registers have not been covered in class?

  10. Existing breakpoint at fix_array; after running and hitting the breakpoint.

    Most of the registers we didn't cover aren't all that interesting, except for eflags, which holds the condition codes… and some other things. Notice that instead of being shown in decimal, its contents are shown symbolically: for example, CF, ZF, and so on.

    Of the flags we have discussed in class, which ones are set right now?

    What preceding instruction caused those flags to be set?

    NOTE: If you haven't been through the “x86 control flow” lecture, you will have to return to this step after that.
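
    The symbolic names gdb prints are labels for individual bits of eflags. As a rough illustration, the shell function below decodes a raw value using the standard x86 bit positions for six flags (CF is bit 0, PF bit 2, ZF bit 6, SF bit 7, IF bit 9, OF bit 11). The function name decode_eflags is made up for this sketch, and it ignores the other bits gdb knows about. The sample value 0x246 comes from the register listing near the end of this handout.

```shell
# decode_eflags VALUE: print the names of the flags set in VALUE.
# Only six flags are checked; this is a sketch, not a full decoder.
decode_eflags() {
    value=$(( $1 ))
    names=""
    # Each pair is "bit-position:name", in ascending bit order.
    for pair in 0:CF 2:PF 6:ZF 7:SF 9:IF 11:OF; do
        bit=${pair%%:*}
        name=${pair##*:}
        # Shift the wanted bit down to position 0 and test it.
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            names="$names $name"
        fi
    done
    echo "[$names ]"
}

decode_eflags 0x246    # prints: [ PF ZF IF ]
```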

  11. Existing breakpoint at fix_array; after running and hitting the breakpoint.

    Looking at all the registers often gives us more information than we need. GDB also lets us print just one register. Type p $rdi.

    What is the value?

    Is p/x $rdi more meaningful?

  12. Existing breakpoint at fix_array; after running and hitting the breakpoint.

    You may recall that earlier we mentioned a fondness for x/16i.

    There's nothing special about the number 16; we just like powers of 2, and 16 gives you enough instructions to be useful.
    But we really like x/16i $rip.

    Type that command into GDB, and compare its results to what you get from disassem fix_array.

    Explain your observations.

  13. Existing breakpoint at fix_array; after running and hitting the breakpoint.

    Finally, we mentioned stepping by instructions. That's done with stepi (“step one instruction”). Type that now, and note that gdb gives a new instruction address but reports that you're on the line containing the left curly brace ({).

    With set disassemble-next-line on (from Step 9), gdb will also tell you what instruction you are on.

    A shorter alternative to disassemble-next-line that only shows one line is to use display/i $rip. Combining these two techniques can be confusing, so we recommend that you pick one and stick with it.

    What instruction are we on?

  14. Keep hitting Enter to step one instruction at a time until you reach a call instruction.

    What function is about to be called?

  15. Existing breakpoint at fix_array; after hitting the breakpoint and then stepping by instruction until a call is about to be executed.

    As with source-level debugging, at the assembly level it's often useful to skip over function calls. At this point you have a choice of typing stepi or nexti.

    If you type stepi, what do you expect the next instruction to be (hexadecimal address)?

    What about nexti?

    (By now, your debugging skills should be strong enough that you can try one, restart the program, and try the other, so there's little excuse for getting this one wrong!)

  16. Existing breakpoint at fix_array; after experimenting with stepi and nexti.

    Stepping one instruction at a time can be tedious. You can always use stepi n to zip past a bunch of them, but when you're dealing with loops and conditionals it can be hard to decide whether it's going to be 1,042 or 47,093 instructions before you reach the next interesting point in your program.

    You could set a breakpoint at the next suspect line. But sometimes the next “interesting” bit is inside a line.

    Imagine that you're interested in how the retq (AKA ret) instruction works. You might want to do disassem fix_array to see the address where this instruction occurs. You can set a breakpoint there by typing b *0x401195 (assuming that is its address, as it was when we wrote these instructions). Do so, and then continue.

    What source line is listed?

  17. Existing breakpoints at fix_array and retq; stopped at retq instruction.

    The retq/ret instruction manipulates registers in some way. Start by looking at what %rsp points to. You can find out the address with p/x $rsp and then use the x command, or you could just try x/x $rsp. Or you could get wild and use C-style typecasting: p/x *(long *)$rsp (try it!).

    What is the value?

  18. Existing breakpoints at fix_array and retq; stopped at retq instruction.

    Use info reg to find out what all the registers are. Then use stepi to step past the retq instruction, and look at all the registers again.

    Which registers have changed, and what are their old and new values?

That's it—you're done!

One way of seeing which registers have changed is doing it by eye—just looking back and forth between the two lists. You could, however, take advantage of Unix tools by copying each register list and saving it into a file (e.g., reg1 for the before-ret register list and reg2 for the after-ret register list), then use the diff command to compare them; for example,

diff -u reg1 reg2

which gives you something like

--- reg1    2026-02-05 11:57:37.000000000 -0800
+++ reg2    2026-02-05 11:58:55.000000000 -0800
@@ -5,7 +5,7 @@
 rsi            0x5                 5
 rdi            0x2f                47
 rbp            0x6                 0x6
-rsp            0x7fffffffd8e8      0x7fffffffd8e8
+rsp            0x7fffffffd8f0      0x7fffffffd8f0
 r8             0x1999999999999999  1844674407370955161
 r9             0x0                 0
 r10            0x7ffff7f4ffe0      140737353416672
@@ -14,7 +14,7 @@
 r13            0x4052b4            4215476
 r14            0x7fffffffda38      140737488345656
 r15            0x403dc0            4210112
-rip            0x401195            0x401195 <fix_array+42>
+rip            0x4011e9            0x4011e9 <main+83>
 eflags         0x246               [ PF ZF IF ]
 cs             0x33                51
 ss             0x2b                43
@@ -32,3 +32,4 @@
 k7             0x0                 0
 fs_base        0x7ffff7ddf740      140737351907136
 gs_base        0x0                 0

The -u flag tells diff to show three lines of context on either side of a change (use -U n instead to get n lines of context) and to mark the changed lines with a - (old) or + (new) on the left margin.
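
As a small illustration of the context setting, the sketch below builds two three-line stand-in files from values in the sample output (abbreviated to register name and hex value) and compares them with -U 0, so no context lines appear. The .sample file names are made up to avoid overwriting your real reg1 and reg2.

```shell
# Two tiny stand-ins for the saved register lists, abbreviated from the handout.
printf '%s\n' 'rbp 0x6' 'rsp 0x7fffffffd8e8' 'r8 0x1999999999999999' > reg1.sample
printf '%s\n' 'rbp 0x6' 'rsp 0x7fffffffd8f0' 'r8 0x1999999999999999' > reg2.sample

# -U 0 asks for zero context lines, so only the changed rsp line is shown
# (once with a leading - and once with a leading +).
# diff exits with status 1 when the files differ, which is expected here.
diff -U 0 reg1.sample reg2.sample || true
```

Try changing the 0 to 1 to see the neighboring rbp and r8 lines come back as context.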

You can also just run diff without -u, but the results are a bit harder to read:

8c8
< rsp            0x7fffffffd8e8      0x7fffffffd8e8
---
> rsp            0x7fffffffd8f0      0x7fffffffd8f0
17c17
< rip            0x401195            0x401195 <fix_array+42>
---
> rip            0x4011e9            0x4011e9 <main+83>

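To put a number on a change like the one to rsp, shell arithmetic handles hexadecimal directly. This sketch uses the before and after values from the sample output above; the values on your machine will probably differ.

```shell
# rsp before and after the ret, copied from the sample diff output.
rsp_before=0x7fffffffd8e8
rsp_after=0x7fffffffd8f0

# How many bytes did the stack pointer move?
rsp_delta=$(( $rsp_after - $rsp_before ))
echo "rsp moved by $rsp_delta bytes"
```

Compare that number with the size of the value you examined at the top of the stack in step 17.
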