In the previous post, we learned the CPU’s instructions. With instructions ready in memory, let’s now see how the CPU actually executes them.
What does a CPU need to do to process an instruction? It fetches the instruction from memory, decodes its meaning, and executes what it says. Endlessly repeating these three steps is everything a CPU does.
Multiple instructions sit in memory. Where does the CPU start reading?
Just as you track your place in a book with your finger, the CPU has a device that points to “where I’m currently reading.” It’s a register called the Program Counter (PC). The CPU reads the instruction at the address PC points to, then advances PC to the next address.
Normally PC increments one step at a time, but the JUMP and JZ instructions we saw in the previous post change PC directly, enabling loops and conditional branching.
| Addr | Binary | Assembly |
|---|---|---|
| 0 | 01100001 | LDI R0, 1 |
| 1 | 01100110 | LDI R1, 2 |
| 2 | 00010001 | ADD R0, R1 |
| 3 | 10010101 | STORE 5 |
| 4 | 11000010 | JUMP 2 |
| 5 | 00000000 | Data |
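To see how PC walks through this table, here is a minimal Python sketch of the fetch step alone. It assumes execution starts at address 0 and takes the JUMP opcode (1100) from the table's encoding; the names `MEMORY`, `JUMP_OPCODE`, and `fetch_trace` are made up for illustration and are not part of the series' simulator.

```python
# Memory image copied from the table above: one byte per address.
MEMORY = [
    0b01100001,  # 0: LDI R0, 1
    0b01100110,  # 1: LDI R1, 2
    0b00010001,  # 2: ADD R0, R1
    0b10010101,  # 3: STORE 5
    0b11000010,  # 4: JUMP 2
    0b00000000,  # 5: data
]

JUMP_OPCODE = 0b1100  # read off the table's encoding of JUMP 2


def fetch_trace(memory, steps):
    """Return the addresses PC visits over `steps` fetches (assumes PC starts at 0)."""
    pc = 0
    visited = []
    for _ in range(steps):
        instruction = memory[pc]       # fetch the byte PC points at
        visited.append(pc)
        pc += 1                        # default: advance to the next address
        if instruction >> 4 == JUMP_OPCODE:
            pc = instruction & 0x0F    # JUMP overwrites PC with its operand
    return visited


print(fetch_trace(MEMORY, 8))  # [0, 1, 2, 3, 4, 2, 3, 4]
```

After the first pass, JUMP 2 keeps sending PC back to address 2, so addresses 2 through 4 form a loop.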
The fetched instruction is just a binary number like 01100011. How do we interpret it?
In this series’ 8-bit ISA, each instruction is split into two parts. The first 4 bits are the opcode: the command, or “what to do.” The remaining 4 bits are the operand: the target, or “to what / where.” For example, 0110 0011 has opcode 0110 (LDI) with operand 0011 (register and value), meaning store a value into a register. Read the same way as the LDI rows in the table above, the operand's first two bits (00) name R0 and its last two bits (11) are the value 3.
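The split can be sketched with a shift and a mask. This is only an illustration; the function name `split_instruction` is hypothetical.

```python
def split_instruction(byte):
    """Split an 8-bit instruction into (opcode, operand), 4 bits each."""
    opcode = (byte >> 4) & 0x0F   # upper 4 bits: what to do
    operand = byte & 0x0F         # lower 4 bits: to what / where
    return opcode, operand


opcode, operand = split_instruction(0b01100011)
print(f"{opcode:04b} {operand:04b}")  # 0110 0011
```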
The component responsible for this interpretation is the control unit. At its core are the decoder and MUX we learned about earlier.
When the 4-bit opcode enters the decoder, only the control signal for that instruction is activated. For example, if the opcode is 0110 (LDI), the “write a value to a register” signal turns on; if it’s 0001 (ADD), the “perform addition with the ALU” signal turns on. All other signals stay off.
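In software terms, the decoder behaves like a one-hot lookup: of its 16 output lines, only the one matching the opcode is 1. The sketch below models just the two signals named above. The signal names `reg_write` and `alu_add` are invented for this example, and a real control unit would typically drive several signals per instruction.

```python
def decode_opcode(opcode):
    """4-to-16 decoder: 16 output lines, exactly one of which is 1."""
    return [1 if line == opcode else 0 for line in range(16)]


# Hypothetical mapping from decoder output lines to control signals.
CONTROL_LINES = {
    0b0110: "reg_write",  # LDI: write a value to a register
    0b0001: "alu_add",    # ADD: perform addition with the ALU
}


def active_signals(opcode):
    """Names of the control signals switched on for this opcode."""
    lines = decode_opcode(opcode)
    return [name for code, name in CONTROL_LINES.items() if lines[code]]


print(active_signals(0b0110))  # ['reg_write']
print(active_signals(0b0001))  # ['alu_add']
```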
The operand side works similarly. When a register number (2 bits) is included in the instruction, that number becomes the MUX’s select signal, picking one of four registers to read from or write to. For LDI, the selected register is the destination Rd, which receives the immediate value.
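Register selection can be sketched as a 4-to-1 MUX indexed by a 2-bit field. The example assumes the operand layout implied by the table (first two bits for the first register, last two bits for the second register or value); `mux4` and the register contents are made up for illustration.

```python
def mux4(inputs, select):
    """4-to-1 multiplexer: a 2-bit select value picks one of four inputs."""
    if len(inputs) != 4 or not 0 <= select <= 3:
        raise ValueError("mux4 needs four inputs and a 2-bit select")
    return inputs[select]


registers = [10, 20, 30, 40]   # illustrative contents of R0..R3

operand = 0b0001               # operand of ADD R0, R1 from the table
rd = (operand >> 2) & 0b11     # first two bits  -> 00 -> R0
rs = operand & 0b11            # last two bits   -> 01 -> R1

print(mux4(registers, rd), mux4(registers, rs))  # 10 20
```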
Once the control unit sends its signals, the actual operation is carried out. For example, with ADD R0, R1, the control unit sends an addition signal to the ALU while the MUX selects the values of R0 and R1. The ALU adds the two values, the result is written back to R0, and the Zero Flag is updated. Arithmetic instructions (ADD, SUB, etc.) from the previous post run in the ALU, while data-movement instructions (LOAD, STORE, etc.) move data between registers and memory. After execution, PC advances to the next address (or, for a taken JUMP or JZ, holds the jump target), and we return to Fetch.
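Putting the three steps together, the following is a rough Python sketch that runs the program from the table. It covers only the four opcodes that appear there, and it rests on assumptions not stated in this post: execution starts at address 0, STORE writes the value of R0, and results wrap at 8 bits. The names `PROGRAM` and `run` are hypothetical.

```python
LDI, ADD, STORE, JUMP = 0b0110, 0b0001, 0b1001, 0b1100  # opcodes from the table

PROGRAM = [
    0b01100001,  # 0: LDI R0, 1
    0b01100110,  # 1: LDI R1, 2
    0b00010001,  # 2: ADD R0, R1
    0b10010101,  # 3: STORE 5
    0b11000010,  # 4: JUMP 2
    0b00000000,  # 5: data
]


def run(memory, steps):
    """Run `steps` fetch-decode-execute cycles; return (registers, pc, zero_flag)."""
    regs = [0, 0, 0, 0]
    pc = 0                    # assumption: execution starts at address 0
    zero = False
    for _ in range(steps):
        instruction = memory[pc]                      # fetch
        pc += 1
        opcode = instruction >> 4                     # decode
        operand = instruction & 0x0F
        first, second = operand >> 2, operand & 0b11
        if opcode == LDI:                             # execute
            regs[first] = second
        elif opcode == ADD:
            regs[first] = (regs[first] + regs[second]) & 0xFF
            zero = regs[first] == 0
        elif opcode == STORE:
            memory[operand] = regs[0]   # assumption: STORE writes R0
        elif opcode == JUMP:
            pc = operand
        else:
            raise ValueError(f"opcode {opcode:04b} is not modeled in this sketch")
    return regs, pc, zero


memory = list(PROGRAM)
regs, pc, zero = run(memory, 8)
print(regs[0], memory[5], pc)  # 5 5 2
```

Under these assumptions, each trip around the loop adds R1 (2) to R0 and stores the new total at address 5, so after eight cycles the stored value has gone from 3 to 5.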
Try running the Fetch-Decode-Execute cycle step by step in the Von Neumann Simulator, which assembles all these components together.
Starting from zeros and ones, we’ve passed through logic gates, adders, latches, Turing machines, and the von Neumann architecture to assemble a working computer.
But there’s one problem. Modern CPUs can process billions of instructions per second, yet fetching data from main memory can take hundreds of times longer than executing a single instruction. Since the Fetch stage must access memory every cycle, even the fastest CPU is bottlenecked by waiting for memory. In the next post, we’ll see how this speed gap is resolved.