Machine language and debugging
Contents
Machine language and debugging#
1. Intel x86 processors#
Dominate laptop/desktop/server market
Evolutionary design
Backwards compatible up until 8086, introduced in 1978
Added more features as time goes on
x86 is a Complex Instruction Set Computer (CISC)
Many different instructions with many different formats
But, only small subset encountered with Linux programs
Compare: Reduced Instruction Set Computer (RISC)
RISC: very few instructions, with very few modes for each
RISC can be quite fast (but Intel still wins on speed!)
Current RISC renaissance (e.g., ARM, RISC V), especially for low-power
2. Intel x86 processors: machine evolution#
Name |
Date |
Transistor Counts |
---|---|---|
386 |
1985 |
0.3M |
Pentium |
1993 |
3.1M |
Pentium/MMX |
1997 |
4.5M |
Pentium Pro |
1995 |
6.5M |
Pentium III |
1999 |
8.2M |
Pentium 4 |
2000 |
42M |
Core 2 Duo |
2006 |
291M |
Core i7 |
2008 |
731M |
Core i7 Skylake |
2015 |
1.75B |
Added features
Instructions to support multimedia operations
Instructions to enable more efficient conditional operations (!)
Transition from 32 bits to 64 bits
More cores
3. x86 clones: Advanced Micro Devices (AMD)#
Historically
AMD has followed just behind Intel
A little bit slower, a lot cheaper
Then
Recruited top circuit designers from Digital Equipment Corp. and other downward trending companies
Built Opteron: tough competitor to Pentium 4
Developed x86-64, their own extension to 64 bits
Recent Years
Intel got its act together
1995-2011: Lead semiconductor “fab” in world
2018: #2 largest by $$ (#1 is Samsung)
2019: reclaimed #1
AMD fell behind
Relies on external semiconductor manufacturer GlobalFoundaries
ca. 2019 CPUs (e.g., Ryzen) are competitive again
2020 Epyc


4. Machine programming: levels of abstraction#

5. Assembly/Machine code view#
Machine code (Assembly code) differs greatly from the original C code.
Parts of processor state that are not visible/accessible from C programs are now visible.
PC: Program counter
Contains address of next instruction
Called
%rip
(instruction pointer register)
Register file
contains 16 named locations (registers), each can store 64-bit values.
These registers can hold addresses (~ C pointers) or integer data.
Condition codes
Store status information about most recent arithmetic or logical operation
Used for conditional branching (
if
/while
)
Vector registers to hold one or more integers or floating-point values.
Memory
Is seen as a byte-addressable array
Contains code and user data
Stack to support procedures

6. Hands on: assembly/machine code example#
Inside your
csc231
, create another directory called04-machine
and change into this directory.Create a file named
mstore.c
with the following contents:
Run the following commands It is capital o, not number 0
$ gcc -Og -S mstore.c
$ cat mstore.s
$ gcc -Og -c mstore.c
$ objdump -d mstore.o

7. Data format#
C data type |
Intel data type |
Assembly-code suffix |
Size |
---|---|---|---|
char |
Byte |
b |
1 |
short |
Word |
w |
2 |
int |
Double word |
l |
4 |
long |
Quad word |
q |
8 |
char * |
Quad word |
q |
8 |
float |
Single precision |
s |
4 |
double |
Double precision |
l |
8 |
8. Integer registers#
x86_64 CPU contains a set of 16
general purpose registers
storing 64-bit values.Original 8086 design has eight 16-bit registers,
%ax
through%sp
.Origin (mostly obsolete)
%ax
: accumulate%cx
: counter%dx
: data%bx
: base%si
: source index%di
: destination index%sp
: stack pointer%bp
: base pointer
After IA32 extension, these registers grew to 32 bits, labeled
%eax
through%esp
.After x86_64 extension, these registers were expanded to 64 bits, labeled
%rax
through%rsp
. Eight new registered were added:%r8
through%r15
.Instructions can operate on data of different sizes stored in low-order bytes of the 16 registers.

9. Assembly characteristics: operations#
Transfer data between memory and register
Load data from memory into register
Store register data into memory
Perform arithmetic function on register or memory data
Transfer control
Unconditional jumps to/from procedures
Conditional branches
Indirect branches
10. Data movement#
Example:
movq Source, Dest
Note: This is ATT notation. Intel uses
mov Dest, Source
Operand Types:
Immediate (Imm): Constant integer data.
$0x400
,$-533
.Like C constant, but prefixed with
$
.Encoded with 1, 2, or 4 bytes.
Register (Reg): One of 16 integer registers
Example:
%rax
,%r13
%rsp
reserved for special use.Others have special uses in particular instructions.
Memory (Mem): 8 (
q
inmovq
) consecutive bytes of memory at address given by register.Example:
(%rax)
Various other addressing mode (See textbook page 181, Figure 3.3).
Other
mov
:movb
: move bytemovw
: move wordmovl
: move double wordmovq
: move quad wordmoveabsq
: move absolute quad word
11. movq Operand Combinations#
|
Source |
Dest |
Src, Dest |
C Analog |
---|---|---|---|---|
Imm |
Reg |
|
tmp = 0x4; |
|
Imm |
Mem |
|
*p = -147; |
|
Reg |
Reg |
|
tmp2 = tmp1; |
|
Reg |
Mem |
|
*p = tmp; |
|
Mem |
Reg |
|
tmp = *p; |
12. Simple memory addressing mode#
Normal: (R) Mem[Reg[R]]
Register R specifies memory address
Aha! Pointer dereferencing in C
movq (%rcx),%rax
Displacement D(R) Mem[Reg[R]+D]
Register R specifies start of memory region
Constant displacement D specifies offset
movq 8(%rbp),%rdx
Midterm#
Monday October 25, 2021
12-hour windows range: 9:00AM - 9:00PM October 25, 2021.
50 minutes duration.
20-25 questions (similar in format to the quizzes).
Everything (including source codes) up to and including the episode on Representing and manipulating information.
No class on Monday October 25, 2021.
13. x86_64 Cheatsheet#
14. Hands on: data movement#
Create a file named
swap.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c swap.c
$ objdump -d swap.o

15. Hands on: data movement#
Create a file named
swap_dsp.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c swap_dsp.c
$ objdump -d swap_dsp.o

16. Complete memory addressing mode#
Most General Form
D(Rb,Ri,S)
:Mem[Reg[Rb]+S*Reg[Ri]+ D]
D: Constant displacement 1, 2, or 4 bytes
Rb: Base register: Any of 16 integer registers
Ri: Index register: Any, except for
%rsp
S: Scale: 1, 2, 4, or 8
Special Cases
(Rb,Ri)
:Mem[Reg[Rb]+Reg[Ri]]
D(Rb,Ri)
:Mem[Reg[Rb]+Reg[Ri]+D]
(Rb,Ri,S)
:Mem[Reg[Rb]+S*Reg[Ri]]
(,Ri,S)
:Mem[S*Reg[Ri]]
D(,Ri,S)
:Mem[S*Reg[Ri] + D]
17. Arithmetic and logical operations: lea#
lea
: load effective addressA form of
movq
intsructionlea S, D
: Write&S
toD
.can be used to generate pointers
can also be used to describe common arithmetic operations.
18. Hands on: lea#
Create a file named
m12.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c m12.c
$ objdump -d m12.o

19. Other arithmetic operations#
Omitting suffixes comparing to the book.
Format |
Computation |
Description |
---|---|---|
|
D <- D + S |
add |
|
D <- D - S |
subtract |
|
D <- D * S |
multiply |
————— |
———– |
—————— |
|
D <- D << S |
shift left |
|
D <- D >S |
arith. shift right |
|
D <- D >S |
shift right |
|
D <- D << S |
arith. shift left |
————— |
———– |
—————— |
|
D <- D ^ S |
exclusive or |
|
D <- D & S |
and |
|
D <- D | S |
or |
————— |
———– |
—————— |
|
D <- D + 1 |
increment |
|
D <- D - 1 |
decrement |
|
D <- -D |
negate |
|
D <- -D |
complement |
Watch out for argument order (ATT versus Intel)
No distinction between signed and unsigned int. Why is that?
20. Challenge: lea#
Create a file named
scale.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c scale.c
$ objdump -d scale.o

21. Hands on: long arithmetic#
Create a file named
arith.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c arith.c
$ objdump -d arith.o
Understand how the Assembly code represents the actual arithmetic operation in the C code.

22. Quick review: processor state#
Information about currently executing program
temporary data (
%rax
,…)location of runtime stack (
%rsp
)location of current code control point (
%rip
,…)status of recent tests (
CF
,ZF
,SF
,OF
in%EFLAGS
)

23. Condition codes (implicit setting)#
Single-bit registers
CF
: the most recent operation generated a carry out of the most significant bit.ZF
: the most recent operation yielded zero.SF
: the most recent operation yielded negative.OF
: the most recent operation caused a two’s-complement overflow.
Implicitly set (as side effect) of arithmetic operations.
24. Condition codes (explicit setting)#
Exlicit setting by Compare instruction
cmpq Src2, Src1
cmpq b, a
like computinga - b
without setting destination
CF
set if carry/borrow out from most significant bit (unsigned comparisons)ZF
set ifa == b
SF
set if(a - b) < 0
OF
set if two’s complement (signed) overflow(a>0 && b<0 && (a-b)<0) || (a<0 && b>0 && (a-b)>0)
25. Condition branches (jX)#
Jump to different part of code depending on condition codes
Implicit reading of condition codes
jX |
Condition |
Description |
---|---|---|
|
1 |
direct jump |
|
ZF |
equal/zero |
|
~ZF |
not equal/not zero |
|
SF |
negative |
|
~SF |
non-negative |
|
~(SF^OF) & ~ZF |
greater |
|
~(SF^OF) |
greater or equal to |
|
SF^OF |
lesser |
|
SF^OF | ZF |
lesser or equal to |
|
~CF & ~ZF |
above |
|
CF |
below |
26. Hands on: a simple jump#
Create a file named
jump.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c jump.c
$ objdump -d jump.o
Understand how the Assembly code enables jump across instructions to support conditional workflow.
In the next video, we will look at how
cmp
andjle
ofabsdiff
really behave in an actual execution.
27. Hands on: loop#
Create a file named
factorial.c
in04-machine
with the following contents:
Run the following commands
$ gcc -Og -c factorial.c
$ objdump -d factorial.o
Understand how the Assembly code enables jump across instructions to support loop.
Create
factorial_2.c
andfactorial_3.c
fromfactorial.c
.Modify
factorial_2.c
so that the factorial is implemented with awhile
loop. Study the resulting Assembly code.Modify
factorial_3.c
so that the factorial is implemented with afor
loop. Study the resulting Assembly code.Behavior of
factorial
Assembly instructions inside GDB
28. Mechanisms in procedures#
Function = procedure (book terminology)
Support procedure
P
calls procedureQ
.Passing control
To beginning of procedure code
starting instruction of
Q
Back to return point
next instruction in
P
afterQ
Passing data
Procedure arguments
P
passes one or more parameters toQ
.Q
returns a value back toP
.
Return value
Memory management
Allocate during procedure execution and de-allocate upon return
Q
needs to allocate space for local variables and free that storage once finishes.
Mechanisms all implemented with machine instructions
x86-64 implementation of a procedure uses only those mechanisms required
Machine instructions implement the mechanisms, but the choices are determined by designers. These choices make up the Application Binary Interface (ABI).
29. x86-64 stack#
Region of memory managed with stack discipline
Memory viewed as array of bytes.
Different regions have different purposes.
(Like ABI, a policy decision)
Grows toward lower addresses
Register
%rsp
contains lowest stack address.address of “top” element

30. Stack push and pop#
pushq Src
Fetch operand at
Src
Decrement
%rsp
by 8Write operand at address given by
%rsp
popq Dest
Read value at address given by
%rsp
Increment
%rsp
by 8Store value at Dest (usually a register)
31. What really happens in memory/registers at the beginning and the end of a function#
The
-Og
flag often combines/reduces these steps.The memory stack architecture for a function has a base pointer (
$rbp
) and a stack pointer ($rsp
).Base pointer: the bottom of the stack (higher memory address)
Stack pointer: the top of the stack (lower memory address)
Function prologue
Push the current base pointer onto the memory stack (to be restored later).
Assign the value of the base pointer (set the
$rbp
to that value) to the current address pointed to by the stack pointer.Move the stack pointer down further (push new memory in) a distance that would accommodate local variables of the function.
Function prologue (Assembly), ATT notation, assume rbp/ebp and rsp/esp
push $rbp
mov $rsp, $rbp
sub N, $rsp
Function epilogue
Drop the stack pointer to the current base pointer, so room reserved in the prologue for local variables is freed.
Pops the base pointer off the stack, so it is restored to its value before the prologue.
Returns to the calling function, by popping the previous frame’s program counter off the stack and jumping to it.
Function prologue (Assembly), ATT notation, assume rbp/ebp and rsp/esp
mov $rbp, $rsp
pop $rbp
ret
Video lecture on the slide
32. Hands on: function calls#
Create a file named
mult.c
in04-machine
with the following contents:
Description of C code:
Compile with
-g
flag and rungdb
on the resulting executable.
$ gcc -g -o mult mult.c
$ gdb mult
Setup gdb with a breakpoint at
main
and start running.A new GDB command is
si
: executing the next instruction (machine or code instruction).It will execute the highlighted (greened and arrowed) instruction in the
code
section.If the Assembly instruction is calling another function, we need to use
ni
if we don’t want to step into that instruction.
Be careful, Intel notation in the code segment of GDB
endbr64
is a new instruction to help enforce Control Flow Technology to prevent potential stitching of malicious Assembly codes.
33. Data alignment#
Intel recommends data to be aligned to improve memory system performance.
K-alignment rule: Any primitive object of
K
bytes must have an address that is multiple ofK
: 1 forchar
, 2 forshort
, 4 forint
andfloat
, and 8 forlong
,double
, andchar *
.
{% include links.md %}