The Hack Registers Explained: A Deep Dive into Memory and Control
- Authors

- Name
- Yinhuan Yuan
Let me explain registers from the ground up, building a complete understanding of how the Hack computer stores and manages data.
What is a Register? (The Foundation)
The Basic Concept
Think of a register as a sticky note for the computer:
- It remembers a number
- It holds that number until you tell it to change
- It can give you back that number instantly
- It's extremely fast (nanoseconds)
Analogy: Your Calculator's "Memory" Button
When you press [MS] (memory store) on a calculator:
├─ Calculator STORES current number
├─ Number STAYS there (even if you clear screen)
├─ Press [MR] to RECALL that number
└─ Press [MS] again to REPLACE it
A register does exactly this - but in hardware!
Why Do We Need Registers?
The Problem:
Imagine trying to compute: (5 + 3) × 2
Without registers:
1. Add 5 + 3 = 8
2. Uhh... where did the 8 go?
3. Can't multiply it - we lost it!
With registers:
1. Add 5 + 3 = 8
2. STORE 8 in register
3. Load 8 from register
4. Multiply 8 × 2 = 16 ✓
Registers are the working memory of the CPU. Without them, you can only do one operation and then lose the result!
The Speed Hierarchy
Speed vs Capacity tradeoff:
Registers │ ████████████ Fastest (< 1ns) │ Tiny (a few numbers)
│ │
Cache │ ██████████ Very Fast (1-10ns) │ Small (KB-MB)
│ │
RAM │ ████ Fast (50-100ns) │ Medium (GB)
│ │
SSD │ ██ Slow (microseconds) │ Large (TB)
│ │
Hard Drive │ █ Very Slow (milliseconds) │ Huge (TB)
Registers are at the TOP of the pyramid!
Why registers are fast:
- Right next to the ALU (no distance to travel)
- Direct electrical connection
- No addressing needed
- Always powered and ready
The Three Registers in Hack Computer
The Hack computer has exactly three registers:
Overview
┌────────────────────────────────────────────────────┐
│ │
│ A Register (16-bit) │
│ ├─ Address register │
│ ├─ Also used for data │
│ └─ Dual purpose! │
│ │
│ D Register (16-bit) │
│ ├─ Data register │
│ ├─ Only stores data │
│ └─ ALU input │
│ │
│ PC - Program Counter (15-bit) │
│ ├─ Points to next instruction │
│ ├─ Auto-increments │
│ └─ Controls program flow │
│ │
└────────────────────────────────────────────────────┘
Why Only Three Registers?
Design philosophy of Hack computer:
- Simplicity over speed
- Easier to understand with fewer registers
- Matches the Nand2Tetris educational goals
- Still Turing-complete (can compute anything!)
Comparison:
Architecture Registers Philosophy
────────────────────────────────────────────────
Hack 3 Educational simplicity
8080 7 Early microprocessor
x86-64 16 Performance
ARM 31 RISC efficiency
GPU 1000s Massive parallelism
Now let's dive deep into each register!
The A Register (Address/Data Register)
The Dual Purpose Design
The A register is special - it has two jobs:
Job 1: Address Register
├─ Holds memory addresses
├─ Points to RAM locations
├─ Used for: @commands in Hack assembly
└─ Example: @100 means "A = 100"
Job 2: Data Register
├─ Holds regular data values
├─ Can be used in ALU operations
├─ Alternative to D register
└─ Example: A = A + 1
Why dual purpose?
- Saves a register!
- Many instructions need an address anyway
- Flexible programming model
Real-World Use Examples
Example 1: Loading a constant
──────────────────────────────
Assembly: @17
D=A
What happens:
1. A Register ← 17 (constant loaded)
2. D Register ← A (copy to D)
Result: D = 17
Example 2: Accessing memory
───────────────────────────
Assembly: @100
M=D
What happens:
1. A Register ← 100 (address)
2. RAM[100] ← D (store D at address 100)
Result: RAM[100] now contains value from D
Example 3: Jump to address
──────────────────────────
Assembly: @LOOP
0;JMP
What happens:
1. A Register ← address of LOOP label
2. PC ← A (program counter jumps to A)
Result: Program continues from LOOP
Hardware Implementation: 74HC574
The A register uses the 74HC574 integrated circuit.
What is a 74HC574?
74HC574: Octal D-Type Flip-Flop with 3-State Output
"Octal" = 8 bits
"D-Type Flip-Flop" = basic memory element
"3-State Output" = can be enabled/disabled
One IC handles 8 bits, so we need two 74HC574s for 16 bits:
- U1: Handles bits [7:0] (low byte)
- U2: Handles bits [15:8] (high byte)
Inside the 74HC574
Let me explain what's inside this chip:
Simplified view of ONE bit (multiply by 8):
Data In
│
▼
┌────────┐
│ D │
CLK─►│ Q ──┼──► Data Out
│ │
└────────┘
D Flip-Flop
(The memory!)
When CLK goes LOW → HIGH (rising edge):
├─ Whatever is on D gets CAPTURED
├─ Stored in the flip-flop
├─ Appears on Q
└─ STAYS there until next clock edge
The data is "latched" - it remembers!
The D Flip-Flop: How Bits Remember
This is fundamental - let me explain carefully:
The Master-Slave D Flip-Flop
Data In (D)
│
▼
┌──────────┐ ┌──────────┐
│ MASTER │ │ SLAVE │
│ LATCH │───────►│ LATCH │──► Q (Output)
└──────────┘ └──────────┘
▲ ▲
│ │
CLK (edge triggered)
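Operation (rising-edge triggered, as in the 74HC574):
1. While CLK is LOW: Master follows D, Slave holds the old value
2. Rising edge: Master freezes its value, Slave passes it to Q
3. Output Q shows saved value
4. Stays there UNTIL the next rising edge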
This is how computers REMEMBER!
The gate-level truth:
A flip-flop is made of cross-coupled gates that form a feedback loop:
Simplified SR Latch (Set-Reset), built from two NOR gates:
R ─────┐
NOR ──┬─── Q
┌───┘ │
│ │
└─────────┤
NOR ──── !Q
S ─────┐ │
└────┘
Q = NOR(R, !Q) and !Q = NOR(S, Q)
When S=1, R=0: Q=1 (Set)
When R=1, S=0: Q=0 (Reset)
When both 0, Q REMEMBERS!
S=1 and R=1 together is not allowed (both outputs are forced to 0)
The feedback loop creates the memory effect!
Why this works:
- The output feeds back to the input
- Creates a bistable state (stable at 0 or 1)
- Requires continuous power to maintain
- This is volatile memory (loses data when power off)
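The feedback behavior can be sketched in a few lines of Python. This is only a behavioral illustration, assuming the usual NOR-latch wiring (Q = NOR(R, !Q), !Q = NOR(S, Q)); the helper names `nor` and `sr_latch_step` are made up for this example.

```python
from typing import Tuple


def nor(a: int, b: int) -> int:
    """2-input NOR gate on 0/1 values."""
    return 0 if (a or b) else 1


def sr_latch_step(s: int, r: int, q: int, q_bar: int) -> Tuple[int, int]:
    """Settle a cross-coupled NOR latch for the given inputs.

    Assumed wiring: Q = NOR(R, !Q) and !Q = NOR(S, Q).
    The gates are re-evaluated until the outputs stop changing.
    """
    for _ in range(4):  # a few passes are enough for valid inputs
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


state = (0, 1)                        # start in the reset state (Q=0)
after_set = sr_latch_step(1, 0, *state)       # pulse Set
held_after_set = sr_latch_step(0, 0, *after_set)    # both inputs low: hold
after_reset = sr_latch_step(0, 1, *held_after_set)  # pulse Reset
held_after_reset = sr_latch_step(0, 0, *after_reset)
print(after_set, held_after_set, held_after_reset)
```

With both inputs at 0 the function returns whatever state it was given, which is the "remembering" the feedback loop provides.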
74HC574 Pinout and Connections
74HC574 (DIP-20 package)
┌──────┴──────┐
/OE │1 20│ VCC (+5V)
D0 │2 19│ Q0 ◄── Data outputs (pins 19-12)
D1 │3 18│ Q1
D2 │4 17│ Q2
D3 │5 16│ Q3
D4 │6 15│ Q4
D5 │7 14│ Q5
D6 │8 13│ Q6
D7 │9 12│ Q7 (data inputs are pins 2-9)
GND │10 11│ CLK ◄── Clock input
└─────────────┘
Pin Functions:
─────────────
D0-D7: Data inputs (what you want to store)
Q0-Q7: Data outputs (what's currently stored)
CLK: Clock - captures data on LOW→HIGH edge
/OE: Output Enable (active LOW)
└─ When LOW: outputs enabled
└─ When HIGH: outputs go hi-Z (disconnected)
VCC: Power (+5V)
GND: Ground (0V)
How the A Register Stores Data
Let's trace through a complete storage operation:
SCENARIO: Store 0xABCD in A register
Initial state:
├─ A register contains: 0x0000
├─ Data bus has: 0xABCD
└─ LOAD_A signal is about to pulse
Step 1: Setup (before clock edge)
──────────────────────────────────
Data Bus[15:8] = 0xAB ──► U2.D7-D0
Data Bus[7:0] = 0xCD ──► U1.D7-D0
0xAB 0xCD
│ │
┌──▼──────────┐ ┌───▼─────────┐
│ U2 (Hi) │ │ U1 (Lo) │
│ 74HC574 │ │ 74HC574 │
│ │ │ │
│ D7..D0 Q7..Q0│ │ D7..D0 Q7..Q0│
│ 10101011│ │ │ 11001101│ │
│ (waiting)│ │ │ (waiting)│ │
└───────────┘ └─────────────┘
LOAD_A signal: ═════════╝
(low, about to rise)
Step 2: Clock edge (rising edge)
──────────────────────────────────
LOAD_A signal: ╔═══════
════════╝ ◄── Rising edge HERE!
The moment the clock rises:
├─ U1 captures 0xCD into its flip-flops
├─ U2 captures 0xAB into its flip-flops
├─ Data is now STORED inside the ICs
└─ Outputs update after the clock-to-Q delay (~20ns)
Step 3: After clock (data retained)
────────────────────────────────────
┌────────────┐ ┌─────────────┐
│ U2 (Hi) │ │ U1 (Lo) │
│ 74HC574 │ │ 74HC574 │
│ │ │ │
│ Stored: AB │ │ Stored: CD │
│ Q7..Q0 │ │ Q7..Q0 │
│ 10101011───┼──► │ 11001101────┼──►
└────────────┘ └─────────────┘
│ │
└──────┬──────────────┘
│
A_Out[15:0] = 0xABCD ✓
Data bus can now change - doesn't matter!
The A register REMEMBERS 0xABCD
Step 4: Later - reading the value
──────────────────────────────────
No clock needed!
Output is ALWAYS available:
├─ A_Out[15:0] continuously shows 0xABCD
├─ Can be read by ALU any time
├─ Can be used as address any time
├─ Stays there until next LOAD_A pulse
└─ Even if data bus changes!
The output is continuously driven: no read strobe is needed
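The storage sequence above can be mimicked with a small behavioral model. This is a sketch, not a gate-accurate simulation: the class names `OctalFlipFlop` and `Register16` are invented for this example, and the active edge is modeled as a LOW-to-HIGH transition, which is what the 74HC574 datasheet specifies.

```python
class OctalFlipFlop:
    """Behavioral model of one 8-bit edge-triggered register chip."""

    def __init__(self) -> None:
        self.q = 0      # stored byte, always visible on the outputs
        self._clk = 0   # last clock level seen, used for edge detection

    def tick(self, clk: int, d: int) -> None:
        """Present clock level `clk` and data byte `d` to the chip."""
        if self._clk == 0 and clk == 1:  # LOW-to-HIGH transition
            self.q = d & 0xFF
        self._clk = clk


class Register16:
    """Two 8-bit chips sharing one clock: u1 = low byte, u2 = high byte."""

    def __init__(self) -> None:
        self.u1 = OctalFlipFlop()
        self.u2 = OctalFlipFlop()

    def tick(self, load: int, data_bus: int) -> None:
        self.u1.tick(load, data_bus & 0xFF)
        self.u2.tick(load, (data_bus >> 8) & 0xFF)

    @property
    def out(self) -> int:
        return (self.u2.q << 8) | self.u1.q


a_reg = Register16()
a_reg.tick(0, 0xABCD)   # data present, clock still low: nothing captured
before = a_reg.out
a_reg.tick(1, 0xABCD)   # active edge: both chips capture their byte
captured = a_reg.out
a_reg.tick(1, 0x5678)   # bus changes while clock stays high: ignored
a_reg.tick(0, 0x5678)   # returning the clock to low captures nothing
retained = a_reg.out
print(hex(before), hex(captured), hex(retained))
```

The last two calls show the key property: once captured, the value survives any change on the data bus until the next active edge.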
Timing Diagram (Critical Understanding)
Time: │←─ 1 clock cycle ─→│←─ 1 clock cycle ─→│
│ │ │
CLOCK: ───╗ ╔═══╗ ╔═══╗ ╔═══
╚═════╝ ╚═════╝ ╚═════╝
▲ ▲ ▲
Rising edge (captures data)
Data_In: ─── 0x1234 ────── 0x5678 ────── 0xABCD ───
LOAD_A: ──────────╗ ╔══════════
╚═══════════════════╝
│ │
└─ Enabled ─────────┘
A_Out: ─── 0x0000 ─────── 0x1234 ────── 0x5678 ───
▲ ▲
│ │
Captured here Captured here
Critical Timing Parameters (74HC574, approximate; check the datasheet for your supply voltage):
─────────────────────────────────────
Setup time: 12ns (data must be stable BEFORE clock edge)
Hold time: 3ns (data must be stable AFTER clock edge)
Clock-to-Q: ~20ns typical, ~25ns worst case (propagation delay from CLK edge to Q output change)
If timing violated → DATA CORRUPTION!
Why We Use 74HC574 (Design Choice)
Advantages:
✓ Simple interface (just clock it)
✓ Continuously driven outputs (always readable)
✓ 3-state outputs (can disconnect from bus)
✓ Edge-triggered (clean data capture)
✓ Fast (25ns max)
✓ Low power (CMOS)
✓ Cheap (~$0.50 per IC)
✓ Available everywhere
Alternatives considered:
Option 1: 74HC374 (similar but different pinout)
├─ Functionally identical
├─ Different pin arrangement
└─ Either works fine
Option 2: 74HC273 (no 3-state)
├─ Cannot disconnect from bus
├─ Less flexible
└─ Not chosen
Option 3: 74HC377 (with enable)
├─ Has separate enable pin
├─ More control, more complex
└─ Overkill for Hack
Option 4: SRAM (6264 or similar)
├─ Overkill (we only need 16 bits!)
├─ Requires addressing logic
├─ Slower
└─ Not practical for registers
The Output Enable Feature
The 74HC574 has a special feature: 3-state output
What is 3-state (tri-state)?
Normal output:
├─ HIGH (1, ~5V)
└─ LOW (0, ~0V)
3-state output adds:
└─ Hi-Z (high impedance, "floating")
Pin 1 (/OE - Output Enable, active LOW):
─────────────────────────────────────────
/OE = 0 → Outputs ENABLED (normal operation)
/OE = 1 → Outputs DISABLED (hi-Z)
Why this matters:
With /OE feature:
┌─────────┐ ┌─────────┐
│ Reg A │─────┬───│ Reg D │
│ /OE=1 │ │ │ /OE=0 │
└─────────┘ │ └─────────┘
(off) │ (on)
│
Shared Bus
Only D drives it!
Without /OE (collision!):
┌─────────┐ ┌─────────┐
│ Reg A │─────┬───│ Reg D │
│ outputs │ │ │ outputs │
└─────────┘ │ └─────────┘
║ │ ║
╚═══════════╪═══════╝
FIGHT!
(both trying to drive bus)
(can damage chips!)
In the Hack computer:
- For A and D registers, /OE is tied to GND
- This means outputs are always enabled
- This is fine because each register has dedicated output wires
- No bus sharing for these registers
When you WOULD use /OE:
- Multiple devices on shared bus
- Memory banks (only one enabled at a time)
- I/O port multiplexing
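The shared-bus rule can be expressed as a small check. This is a hedged sketch: `resolve_bus` is a hypothetical helper, and it treats any two enabled drivers as contention, even if they happen to drive the same value.

```python
from typing import Optional, Sequence, Tuple


def resolve_bus(drivers: Sequence[Tuple[int, int]]) -> Optional[int]:
    """Resolve a shared bus from (oe_n, value) pairs.

    oe_n == 0 means the output is enabled (active LOW, like /OE).
    Returns the bus value, or None if every driver is in hi-Z.
    Raises ValueError if more than one driver is enabled.
    """
    active = [value for oe_n, value in drivers if oe_n == 0]
    if not active:
        return None  # bus is floating
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0]


# Reg A disabled (/OE=1), Reg D enabled (/OE=0): only D drives the bus.
bus_value = resolve_bus([(1, 0x1111), (0, 0x2222)])
print(hex(bus_value))
```

In real hardware nothing raises an exception; the two outputs simply fight, which is why the enable logic must guarantee a single driver.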
The D Register (Data Register)
Purpose and Design
The D register is the pure data storage register:
Job: Store intermediate calculation results
├─ NOT used for addressing
├─ Only used for data
├─ Primary ALU input
└─ Simplest register
Why separate from A?
├─ A is busy being an address
├─ Need somewhere to save ALU results
├─ Allows more complex calculations
└─ Standard CPU architecture
Hardware Implementation
Exactly the same as A register!
- 2× 74HC574 chips (U3, U4)
- Same connections
- Same operation
- Different control signal (LOAD_D instead of LOAD_A)
D Register Schematic
Data_In[7:0] ────────┐
│
┌────▼─────┐
LOAD_D ────────►│ U3 │
│ 74HC574 │
GND ────────────►│ /OE │
│ │──► D_Out[7:0]
└──────────┘
Data_In[15:8] ───────┐
│
┌────▼─────┐
LOAD_D ────────►│ U4 │
│ 74HC574 │
GND ────────────►│ /OE │
│ │──► D_Out[15:8]
└──────────┘
Power & Ground omitted for clarity
Usage Patterns
Typical D Register Operations:
Pattern 1: Store ALU result
────────────────────────────
D = D + 1
How it works:
1. D_Out → ALU input
2. ALU computes D + 1
3. ALU_Out → Data_In
4. LOAD_D pulse
5. Result stored in D
Timeline:
t=0ns: D_Out = 5
t=100ns: ALU_Out = 6
t=110ns: LOAD_D pulses
t=130ns: D_Out = 6 ✓ (after the ~20ns clock-to-Q delay)
Pattern 2: Save calculation
───────────────────────────
D = A + D
1. D_Out → ALU X input
2. A_Out → ALU Y input
3. ALU computes A + D
4. Result → Data_In
5. LOAD_D pulse
6. Sum saved in D
Pattern 3: Load from memory
───────────────────────────
D = M
1. A has address
2. RAM[A] → Data_In
3. LOAD_D pulse
4. D now contains RAM value
Why Not Just Use A for Everything?
Good question! Here's why we need D:
Scenario: Compute (RAM[5] + RAM[6]) and store in RAM[7]
With A and D (actual design):
────────────────────────────
@5 // A = 5
D = M // D = RAM[5] (saved!)
@6 // A = 6
D = D + M // D = RAM[5] + RAM[6]
@7 // A = 7
M = D // RAM[7] = result ✓
Total: 6 instructions
Without D register (hypothetical):
──────────────────────────────────
@5 // A = 5
A = M // A = RAM[5]... but wait!
@6 // A = 6, LOST previous value!
// Can't do it! Need somewhere to save RAM[5]
IMPOSSIBLE without D register!
The D register provides "scratch space" for calculations.
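As a rough illustration, the six-instruction program above can be run on a tiny Python model of the A and D registers. `run_hack` is a hypothetical helper written for this example; it understands only `@value` and a handful of `dest=comp` forms, and it does not model jumps.

```python
from typing import Dict, List, Tuple


def run_hack(program: List[str], ram: Dict[int, int]) -> Tuple[int, int]:
    """Execute a straight-line subset of Hack assembly; return (A, D)."""
    a = d = 0
    for line in program:
        line = line.split("//")[0].replace(" ", "")
        if not line:
            continue
        if line.startswith("@"):          # A-instruction: load a constant
            a = int(line[1:])
            continue
        dest, comp = line.split("=")      # C-instruction without a jump
        m = ram.get(a, 0)
        table = {"A": a, "D": d, "M": m,
                 "D+1": d + 1, "M+1": m + 1,
                 "D+A": d + a, "D+M": d + m}
        result = table[comp] & 0xFFFF     # registers are 16 bits wide
        if "M" in dest:
            ram[a] = result               # uses A from before this instruction
        if "D" in dest:
            d = result
        if "A" in dest:
            a = result
    return a, d


ram = {5: 10, 6: 32}
final_a, final_d = run_hack(
    ["@5", "D = M", "@6", "D = D + M", "@7", "M = D"], ram)
print(ram[7])
```

The running sum lives in `d` while `a` is reused three times as an address, which is exactly the division of labor the two registers provide.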
The PC Register (Program Counter)
This is the most complex and most interesting register!
What is a Program Counter?
The Program Counter (PC) is a special register that:
├─ Points to the NEXT instruction to execute
├─ Automatically increments after each instruction
├─ Can JUMP to different addresses
└─ Controls program flow
Analogy: Reading a Book
────────────────────────
PC is like your finger pointing at:
├─ Current line you're reading
├─ Moves down one line after reading (auto-increment)
├─ Can jump to a different page (jump instruction)
└─ Bookmarks your place
Why PC is Special
Unlike A and D, the PC must:
- Auto-increment (add 1 automatically)
- Load new values (for jumps)
- Reset to zero (at startup)
This needs different hardware than simple storage!
Hardware Implementation: 74HC161
The PC uses the 74HC161 integrated circuit.
What is a 74HC161?
74HC161: 4-bit Synchronous Binary Counter
"4-bit" = counts 0-15 in one IC
"Synchronous" = all bits change together (not ripple)
"Binary Counter" = increments by 1 each clock
"Parallel Load" = can load any value directly
We need FOUR of these for 15 bits:
- U6: Bits [3:0]
- U7: Bits [7:4]
- U8: Bits [11:8]
- U9: Bits [14:12] + one unused bit
Why 15 bits, not 16?
- Hack addressing is 0-32767 (32K)
- 2^15 = 32768 addresses
- Don't need 16th bit
- Saves hardware!
Inside the 74HC161
Simplified 4-bit counter logic:
Data Inputs (A, B, C, D)
│ │ │ │
▼ ▼ ▼ ▼
┌──────────────────┐
│ 4 D Flip-Flops │ ◄── Storage
│ (Like 74HC574) │
└────┬──┬──┬──┬────┘
│ │ │ │
┌──▼──▼──▼──▼──┐
│ Increment │ ◄── +1 Logic
│ Logic │
└───────────────┘
│ │ │ │
▼ ▼ ▼ ▼
QA QB QC QD (Outputs)
│
┌──▼──┐
│ RCO │ ◄── Ripple Carry Out (to next stage)
└─────┘
Control Signals:
├─ CLK: When to count
├─ ENP, ENT: Enable counting (both must be HIGH)
├─ /LOAD: Load parallel data (active LOW)
└─ /CLR: Clear to zero (active LOW)
The Increment Logic
How does it add 1? Here's the actual logic:
4-bit Binary Counter Logic:
Current → Next (increment by 1):
Q3 Q2 Q1 Q0 → Q3 Q2 Q1 Q0
─────────────────────────────
0 0 0 0 → 0 0 0 1 (0→1)
0 0 0 1 → 0 0 1 0 (1→2)
0 0 1 0 → 0 0 1 1 (2→3)
0 0 1 1 → 0 1 0 0 (3→4)
...
1 1 1 1 → 0 0 0 0 (15→0, overflow!)
▲
└─ RCO (Ripple Carry Out) = 1
Logic equations:
────────────────
Q0_next = !Q0 (toggle every time)
Q1_next = Q1 XOR Q0 (toggle when Q0=1)
Q2_next = Q2 XOR (Q1 AND Q0) (toggle when Q1,Q0=1)
Q3_next = Q3 XOR (Q2 AND Q1 AND Q0)
RCO = ENT AND Q3 AND Q2 AND Q1 AND Q0 (all bits high while ENT is HIGH?)
This is implemented with:
├─ XOR gates (for toggle)
├─ AND gates (for carry propagation)
└─ D flip-flops (for storage)
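The toggle equations can be checked exhaustively against ordinary addition. The sketch below is a direct transcription of the four next-state equations (with counting enabled); the function name `next_state` is an assumption of this example.

```python
def next_state(q3: int, q2: int, q1: int, q0: int):
    """Apply the counter's next-state equations to one 4-bit state."""
    n0 = 1 - q0                  # Q0 toggles every time
    n1 = q1 ^ q0                 # Q1 toggles when Q0 = 1
    n2 = q2 ^ (q1 & q0)          # Q2 toggles when Q1 = Q0 = 1
    n3 = q3 ^ (q2 & q1 & q0)     # Q3 toggles when Q2 = Q1 = Q0 = 1
    return n3, n2, n1, n0


mismatches = []
for count in range(16):
    bits = [(count >> i) & 1 for i in (3, 2, 1, 0)]
    n3, n2, n1, n0 = next_state(*bits)
    nxt = (n3 << 3) | (n2 << 2) | (n1 << 1) | n0
    if nxt != (count + 1) % 16:
        mismatches.append(count)

print("mismatches:", mismatches)
```

An empty mismatch list confirms that XOR-with-carry is the same thing as adding 1 modulo 16.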
74HC161 Pinout and Connections
74HC161 (DIP-16 package)
┌──────┴──────┐
/CLR │1 16│ VCC (+5V)
CLK │2 15│ RCO (Ripple Carry Out)
A │3 14│ QA ◄── Counter outputs
B │4 13│ QB
C │5 12│ QC
D │6 11│ QD
ENP │7 10│ ENT
GND │8 9│ /LOAD
└─────────────┘
Pin Functions:
──────────────
A, B, C, D: Parallel load inputs (data to load)
QA-QD: Counter outputs (current count)
CLK: Clock input (count on rising edge)
ENP, ENT: Enable inputs (both must be HIGH to count)
/LOAD: Load enable (active LOW - loads A,B,C,D)
/CLR: Clear (active LOW - resets to 0000)
RCO: Ripple Carry Out (goes HIGH when count = 1111)
Chaining Multiple 74HC161s for 15-bit Counter
This is the critical part - connecting four counters:
U6 [3:0] U7 [7:4] U8 [11:8] U9 [14:12]
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
A[0]─┤3 14├─PC[0]│3 14├─PC[4]│3 14├─PC[8]│3 14├─PC[12]
A[1]─┤4 13├─PC[1]│4 13├─PC[5]│4 13├─PC[9]│4 13├─PC[13]
A[2]─┤5 12├─PC[2]│5 12├─PC[6]│5 12├─PC[10]│5 12├─PC[14]
A[3]─┤6 11├─PC[3]│6 11├─PC[7]│6 11├─PC[11]│6 11├─(unused)
│ │ │ │ │ │ │ │
CLK─►│2 15│ │2 15│ │2 15│ │2 15│
│ RCO├────►│7,10 RCO├────►│7,10 RCO├────►│7,10 RCO├──(unused)
│ │ │ │ │ │ │ │ │ │ │
│7,10 │ │ │ │ │ │ │ │ │ │
└────────┘ │ └────────┘ │ └────────┘ │ └────────┘
│ │ │
Carry chain: each stage counts only when every lower stage is at 1111 (about to roll over)
Control signals (shared by all):
────────────────────────────────
CLK → All counters clock together
/LOAD → All counters load together (for jumps)
/CLR → All counters clear together (reset)
ENP, ENT on U6 → Controls counting (PC_INC signal)
ENP, ENT on U7,U8,U9 → Connected to previous RCO
The Carry Chain Explained
How the 15-bit counter works:
Stage 1 (U6, bits 0-3):
────────────────────────
Counts: 0, 1, 2, 3, ..., 14, 15, 0, 1, 2...
RCO is HIGH when count = 15 (1111)
Stage 2 (U7, bits 4-7):
────────────────────────
Only increments when U6.RCO = HIGH
This happens every 16 counts
So U7 counts: 0, 0, 0, ..., 1, 1, 1, ...
(Changes every 16 clock cycles)
Stage 3 (U8, bits 8-11):
────────────────────────
Only increments when U7.RCO = HIGH
This happens every 256 counts
Counts the "pages" of memory
Stage 4 (U9, bits 12-14):
─────────────────────────
Only increments when U8.RCO = HIGH
This happens every 4096 counts
Provides the highest address bits
Maximum count:
──────────────
Binary: 111111111111111 (15 ones)
Decimal: 32767
Hex: 0x7FFF
Then wraps to 0 (overflow)
Visual representation of counting:
Count │ U9[14:12] │ U8[11:8] │ U7[7:4] │ U6[3:0] │
│ (4096s) │ (256s) │ (16s) │ (1s) │
─────────┼───────────┼──────────┼─────────┼─────────┤
0 │ 000 │ 0000 │ 0000 │ 0000 │
1 │ 000 │ 0000 │ 0000 │ 0001 │
... │ ... │ .... │ .... │ .... │
15 │ 000 │ 0000 │ 0000 │ 1111 │◄─U6 about to overflow
16 │ 000 │ 0000 │ 0001 │ 0000 │◄─U7 increments!
... │ ... │ .... │ .... │ .... │
255 │ 000 │ 0000 │ 1111 │ 1111 │
256 │ 000 │ 0001 │ 0000 │ 0000 │◄─U8 increments!
... │ ... │ .... │ .... │ .... │
32767 │ 111 │ 1111 │ 1111 │ 1111 │◄─Maximum
32768 │ 000 │ 0000 │ 0000 │ 0000 │◄─Wraps around!
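The cascade can be modeled behaviorally to check the rollover points in the table. This is a simplified sketch under stated assumptions: `Counter161` and `ProgramCounter15` are invented names, RCO is modeled as ENT AND (count = 1111), each stage's ENP and ENT are both driven by the previous stage's RCO, and timing is ignored.

```python
class Counter161:
    """Behavioral sketch of a 4-bit synchronous counter with parallel load."""

    def __init__(self) -> None:
        self.q = 0

    def rco(self, ent: int) -> int:
        """Carry out: high only when enabled and the count is 1111."""
        return 1 if (ent and self.q == 0xF) else 0

    def clock(self, enp: int, ent: int, load_n: int, data: int) -> None:
        if load_n == 0:                  # load has priority over counting
            self.q = data & 0xF
        elif enp and ent:
            self.q = (self.q + 1) & 0xF

    def clear(self) -> None:             # asynchronous clear
        self.q = 0


class ProgramCounter15:
    """Four chained 4-bit stages; only the low 15 bits are used."""

    def __init__(self) -> None:
        self.stages = [Counter161() for _ in range(4)]  # U6, U7, U8, U9

    @property
    def value(self) -> int:
        v = 0
        for i, stage in enumerate(self.stages):
            v |= stage.q << (4 * i)
        return v & 0x7FFF                # top bit of U9 is unused

    def clock(self, pc_inc: int = 1, load_n: int = 1, a: int = 0) -> None:
        enables, en = [], pc_inc
        for stage in self.stages:        # enables come from pre-edge state
            enables.append(en)
            en = stage.rco(en)
        for i, (stage, en) in enumerate(zip(self.stages, enables)):
            stage.clock(en, en, load_n, (a >> (4 * i)) & 0xF)

    def reset(self) -> None:
        for stage in self.stages:
            stage.clear()


pc = ProgramCounter15()
pc.clock(load_n=0, a=0x00FF)   # parallel load, as a jump would do
pc.clock()                     # one increment ripples through two stages
print(hex(pc.value))
```

Loading 0x7FFF and clocking once makes the visible 15-bit value wrap to 0, matching the last row of the table.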
PC Control Logic
The PC has three modes of operation:
Mode 1: INCREMENT (normal operation)
─────────────────────────────────────
Condition: No jump instruction
Action: PC = PC + 1
Control: PC_INC = HIGH, /LOAD = HIGH
Mode 2: JUMP (load new address)
────────────────────────────────
Condition: Jump instruction taken
Action: PC = A (load from A register)
Control: PC_INC = LOW, /LOAD = LOW
Mode 3: RESET (startup)
───────────────────────
Condition: System reset
Action: PC = 0
Control: /CLR = LOW
The Control Circuit
We need logic to decide: increment or load?
Using 74HC00 (NAND gates):
Inputs:
├─ JUMP (from control unit, active HIGH)
├─ /RESET (from reset button, active LOW)
└─ CLK (system clock)
Outputs:
├─ PC_INC (enable counting)
├─ PC_LOAD (/LOAD signal, active LOW)
└─ PC_CLR (/CLR signal, active LOW)
Logic:
──────
PC_LOAD = NOT JUMP (/LOAD goes LOW when jumping, so the new address is loaded)
PC_INC = NAND(JUMP, /RESET) (equals NOT JUMP in normal operation, when /RESET is HIGH)
PC_CLR = /RESET (clear while reset is asserted; /CLR overrides counting)
Circuit with 74HC00:
────────────────────
┌────┐
JUMP──►│1 3├──► PC_LOAD (= NOT JUMP, drives /LOAD)
JUMP──►│2 │
└────┘
74HC00 (NAND gate used as NOT)
┌────┐
JUMP──►│4 6├──┐
/RESET─►│5 │ │
└────┘ │
▼
PC_INC (HIGH whenever JUMP is LOW)
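A truth-table sketch of the two NAND gates makes the control outputs easy to verify. It assumes the reset input is active LOW (named `reset_n` here) and follows the gate wiring in the circuit drawing; `pc_controls` is a hypothetical helper name.

```python
def nand(a: int, b: int) -> int:
    """2-input NAND gate on 0/1 values."""
    return 0 if (a and b) else 1


def pc_controls(jump: int, reset_n: int):
    """Return (load_n, pc_inc, clr_n) for the program counter.

    load_n: NAND wired as an inverter, LOW when a jump is taken.
    pc_inc: NAND of JUMP and the active-LOW reset line.
    clr_n:  the reset line passed straight to /CLR.
    """
    load_n = nand(jump, jump)
    pc_inc = nand(jump, reset_n)
    clr_n = reset_n
    return load_n, pc_inc, clr_n


for jump in (0, 1):
    for reset_n in (1, 0):
        print(jump, reset_n, pc_controls(jump, reset_n))
```

During reset the count enable may still be HIGH, but that is harmless because the asynchronous clear holds the counters at zero.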
PC Operation Examples
Let me trace through complete operations:
Example 1: Normal Increment (Sequential Execution)
Initial state:
├─ PC = 0x0100 (256)
├─ Next instruction is at address 0x0101
└─ No jump condition
Step-by-step:
─────────────
t=0ns: Fetch instruction from ROM[0x0100]
├─ PC_Out = 0x0100
├─ ROM receives address
└─ Instruction retrieved
t=50ns: Execute instruction (say, D=D+1)
├─ ALU does computation
├─ Result stored in D
└─ No jump (JUMP=0)
t=100ns: Clock rises (increment PC)
├─ JUMP=0, so PC_INC=1
├─ ENP=1, ENT=1 on U6
├─ Counter increments
└─ PC becomes 0x0101
t=110ns: New PC value stable
├─ PC = 0x0101 ✓
├─ Points to next instruction
└─ Ready for next cycle
Timeline:
────────
CLK: ────╗ ╔════╗ ╔════
╚═════╝ ╚═════╝
PC_INC: ═══════════════════════════ (HIGH, counting enabled)
PC_Out: ──── 0x0100 ────── 0x0101 ──
▲
Incremented here
Example 2: Jump Instruction (Non-Sequential)
Scenario: Execute "0;JMP" (unconditional jump)
Initial state:
├─ PC = 0x0100
├─ A = 0x0200 (jump target)
└─ Instruction says: always jump
Step-by-step:
─────────────
t=0ns: Fetch JMP instruction
├─ ROM[0x0100] contains jump instruction
├─ Control unit decodes it
└─ Determines: JUMP=1
t=50ns: Control signals asserted
├─ JUMP=1 (jump taken)
├─ PC_INC=0 (don't increment)
├─ /LOAD=0 (load mode)
└─ PC_CLR=1 (not clearing)
t=100ns: Clock rises
├─ /LOAD is LOW → parallel load enabled
├─ A[14:0] → Counter inputs
├─ All flip-flops load: PC ← A
└─ PC becomes 0x0200
t=110ns: Jump complete
├─ PC = 0x0200 ✓
├─ Now pointing at jump target
└─ Next instruction fetched from 0x0200
Timeline:
────────
CLK: ────╗ ╔════╗ ╔════
╚═════╝ ╚═════╝
▲
Jump loads here
JUMP: ════════════════════════ (HIGH during jump)
/LOAD: ───────╗ ╔══════════ (LOW to load)
╚════╝
A[14:0]: ───── 0x0200 ────────── (jump target)
PC_Out: ──── 0x0100 ────── 0x0200
▲
Jumped! (not incremented)
Example 3: Conditional Jump Not Taken
Scenario: "D;JGT" (jump if D > 0), but D=0
Initial state:
├─ PC = 0x0100
├─ A = 0x0200
├─ D = 0
└─ Instruction: jump if D > 0
Step-by-step:
─────────────
t=0ns: Check condition
├─ ALU computes: is D > 0?
├─ ZR flag = 1 (zero)
├─ Condition FALSE
└─ Jump NOT taken
t=50ns: Control signals
├─ JUMP=0 (don't jump)
├─ PC_INC=1 (increment instead)
├─ /LOAD=1 (don't load)
└─ Normal increment mode
t=100ns: Clock rises
├─ Counters increment
├─ PC = PC + 1
└─ PC becomes 0x0101
Result:
├─ PC = 0x0101 (incremented)
├─ Jump was ignored
└─ Execution continues sequentially
This shows conditional control flow!
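The jump decision itself is a small piece of combinational logic. The sketch below follows the Hack machine language convention, where the three jump bits select "less than", "equal", and "greater than" and are compared against the ALU's zero (ZR) and negative (NG) flags; the function name `jump_taken` is an assumption of this example.

```python
def jump_taken(jump_bits: int, zr: int, ng: int) -> bool:
    """Decide whether a C-instruction's jump condition is met.

    jump_bits: 3-bit field j1 j2 j3 (e.g. 0b001 = JGT, 0b111 = JMP).
    zr: 1 if the ALU output is zero; ng: 1 if it is negative.
    """
    j1 = (jump_bits >> 2) & 1          # jump if output < 0
    j2 = (jump_bits >> 1) & 1          # jump if output == 0
    j3 = jump_bits & 1                 # jump if output > 0
    positive = (not zr) and (not ng)
    return bool((j1 and ng) or (j2 and zr) or (j3 and positive))


# D;JGT with D = 0: the ALU reports zero, so the jump is not taken.
print(jump_taken(0b001, zr=1, ng=0))
```

The result of this function is what the text calls JUMP: it selects between loading A into the PC and incrementing.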
Example 4: System Reset
Scenario: Reset button pressed
Current state:
├─ PC = 0x1234 (arbitrary value)
├─ Some program is running
└─ Need to restart from beginning
Step-by-step:
─────────────
t=0ns: Reset button pressed
├─ RESET signal goes LOW
├─ PC_CLR=0 (active)
└─ Immediately affects counters
t=5ns: Counters clear
├─ /CLR pin goes LOW on all 74HC161s
├─ Asynchronous clear (doesn't wait for clock!)
├─ All flip-flops reset to 0
└─ PC becomes 0x0000
t=10ns: Reset released
├─ RESET signal goes HIGH
├─ PC_CLR=1 (inactive)
└─ System ready
t=100ns: First instruction fetch
├─ PC = 0x0000
├─ Fetches ROM[0]
└─ Program starts from beginning
Timeline:
────────
RESET: ────╗ ╔════════
╚═════════╝
◄──10ns──►
PC_Out: ──── 0x1234 ─── 0x0000 ──
▲
Cleared instantly
(no clock needed!)
This is why /CLR is "asynchronous"
Why 74HC161 Instead of 74HC574?
Could we build PC with 74HC574 + adder?
Hypothetical PC with 74HC574:
PC_Out → [+1 Adder] → [74HC574] → PC_Out
▲
CLK
Problems:
├─ Need separate 16-bit adder (4× 74HC283)
├─ Need multiplexer to select increment vs load
├─ More ICs (8+ instead of 4)
├─ More complex control logic
└─ Slower (more gate delays)
74HC161 advantages:
├─ Built-in counter (no external adder)
├─ Built-in load (no external mux)
├─ Fewer ICs (4 instead of 8+)
├─ Simpler control (just 3 signals)
└─ Faster (optimized internally)
The 74HC161 is PERFECT for PC!
Register Timing and Synchronization
The Clock Signal (Master Timekeeper)
All registers operate from the same clock:
555 Timer (Clock Generator)
│
[Buffer: 74HC04]
│
┌─────┴─────┬──────────┬────────┐
│ │ │ │
▼ ▼ ▼ ▼
LOAD_A LOAD_D PC_CLK (ALU)
│ │ │
▼ ▼ ▼
A Register D Register PC Counter
All synchronized to same clock!
Why synchronization matters:
Bad (no clock):
───────────────
Data changes randomly → chaos!
Register captures wrong data
ALU sees partial results
Program jumps to random addresses
Good (with clock):
──────────────────
Everything changes at same instant
Data stable before capture
Predictable behavior
Deterministic operation
Critical Timing Paths
Path 1: ALU → Register → ALU (feedback loop)
────────────────────────────────────────────
Longest path through system:
t=0ns: Register outputs → ALU inputs
t=10ns: ALU preprocessing complete
t=50ns: ALU carry chain settles
t=100ns: ALU output stable
t=110ns: Register setup time met
t=110ns: CLOCK EDGE (safe to capture)
Minimum clock period: 110ns
Maximum frequency: ~9 MHz
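The frequency figure follows directly from the path delay. A minimal sketch of the arithmetic, using the 100ns ALU settling time and 10ns setup margin from the list above (the helper name `max_clock_mhz` is made up for this example):

```python
def max_clock_mhz(stage_delays_ns):
    """Return (minimum period in ns, maximum frequency in MHz) for one path."""
    period_ns = sum(stage_delays_ns)
    return period_ns, 1000.0 / period_ns


# ALU output stable after 100 ns, plus 10 ns before the capturing edge.
period, f_max = max_clock_mhz([100.0, 10.0])
print(f"{period:.0f} ns -> {f_max:.2f} MHz")
```

The slowest path in the whole machine sets the clock, so the same calculation should be repeated for the memory paths and the largest period used.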
Path 2: Memory → Register
──────────────────────────
ROM[PC] → Instruction Register → Control signals
t=0ns: PC outputs address
t=50ns: ROM access time
t=80ns: Data valid
t=100ns: Setup time met
t=110ns: CLOCK EDGE
Memory is slower than ALU!
Usually limits clock speed
Path 3: Register → Memory
──────────────────────────
A Register → RAM address → Data read
t=0ns: A outputs address
t=10ns: Address decode
t=60ns: RAM access time
t=70ns: Data valid at output
Can complete in one cycle if the clock is slow enough
Setup and Hold Times (Critical!)
Every register has timing requirements:
┌── Setup Time ──┐ ┌─ Hold Time ─┐
│ │ │ │
Data: ───┴─────────────────┴─┴─────────────┴───
│ │ │ │
│ Must be stable │ │Must not │
│ before edge │ │change yet │
│ │ │ │
CLK: ───────────╗ ╚═══╗ ────
╚═════════════╝
▲
Clock edge
74HC574 Timing (typical):
─────────────────────────
Setup time: 12ns
Hold time: 3ns
Clock-to-Q: 20ns
74HC161 Timing (typical):
─────────────────────────
Setup time: 15ns
Hold time: 5ns
Clock-to-Q: 25ns
If violated → data corruption!
Real-world example of timing violation:
Bad Design:
───────────
Data changes: ───────╗ ╔════
╚═══╝
│ │
│ 5ns (too short!)
│ │
CLK edge: ────────────╗
╚═══
Result: Glitch captured! Random 1 or 0
Good Design:
────────────
Data stable: ═══════════════════
│ │
20ns 5ns
│ │
CLK edge: ──────╗ ╚═══
╚══════════
Result: Clean capture! ✓
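The two cases above can be checked with a small helper. This is only a sketch: `capture_is_safe` is a hypothetical function, and the default limits are the 74HC574 figures quoted in this section (12ns setup, 3ns hold).

```python
def capture_is_safe(last_change_ns: float, next_change_ns: float,
                    edge_ns: float, setup_ns: float = 12.0,
                    hold_ns: float = 3.0) -> bool:
    """Check that data was stable long enough around a clock edge.

    last_change_ns: time of the last data change before the edge.
    next_change_ns: time of the first data change after the edge.
    """
    setup_ok = (edge_ns - last_change_ns) >= setup_ns
    hold_ok = (next_change_ns - edge_ns) >= hold_ns
    return setup_ok and hold_ok


# Good design: stable 20 ns before and 5 ns after an edge at t = 100 ns.
good = capture_is_safe(last_change_ns=80, next_change_ns=105, edge_ns=100)
# Bad design: data changed only 5 ns before the edge.
bad = capture_is_safe(last_change_ns=95, next_change_ns=120, edge_ns=100)
print(good, bad)
```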
The Complete Register System
How They Work Together
Instruction Execution Cycle:
Step 1: FETCH
─────────────
PC → ROM address
ROM → Instruction
Instruction → Control Unit
Step 2: DECODE
──────────────
Control Unit → Signals
├─ ALU control
├─ Register control
└─ Memory control
Step 3: EXECUTE
───────────────
A, D → ALU
ALU → Result
Result → Register (if needed)
Step 4: UPDATE PC
─────────────────
If jump: PC ← A
Else: PC ← PC + 1
All synchronized by clock!
Example: Complete Instruction
Let's trace D = M + 1 through the hardware:
Assembly: @100 // First, load address
D=M+1 // Then, this instruction
Focus on: D=M+1
Initial State:
├─ PC = 0x0050 (pointing to this instruction)
├─ A = 0x0064 (100 in hex)
├─ D = 0x0000 (current D value)
└─ RAM[100] = 0x0042 (66 decimal)
Clock Cycle 1: Fetch & Decode
──────────────────────────────
t=0ns: PC outputs 0x0050
ROM address = 0x0050
t=50ns: ROM outputs instruction
Instruction = 1111110111010000
(binary for "D=M+1")
t=60ns: Control unit decodes
├─ ALU_CTRL = 110111 (increment the Y input)
├─ A_OR_M = 1 (use M, not A)
├─ LOAD_D = 1 (will store to D)
└─ JUMP = 0 (no jump)
Clock Cycle 2: Execute
──────────────────────
t=0ns: A register outputs address
A_Out = 0x0064 (100)
RAM address = 100
t=50ns: RAM outputs data
RAM[100] = 0x0042 (66)
M data → ALU Y input
t=60ns: ALU computes M+1
Y = 0x0042
Control = increment Y
Result = 0x0043 (67)
t=100ns: ALU output stable
ALU_Out = 0x0043
t=110ns: Clock edge!
LOAD_D pulses
D ← 0x0043
t=120ns: D register updated
D_Out = 0x0043 ✓
Clock Cycle 3: Update PC
────────────────────────
t=0ns: JUMP=0, so increment
t=110ns: Clock edge
PC increments
PC = 0x0051
Result:
├─ D = 0x0043 (67 = 66 + 1) ✓
├─ PC = 0x0051 (next instruction) ✓
└─ Took 3 clock cycles total
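The decode step can be illustrated by slicing the instruction word into its fields. The field layout below (three leading 1 bits, then a, six comp bits, three dest bits, three jump bits) is the standard Hack C-instruction format; the function name and dictionary keys are assumptions of this sketch.

```python
def decode_c_instruction(word: int) -> dict:
    """Split a 16-bit Hack C-instruction into its control fields."""
    if (word >> 13) != 0b111:
        raise ValueError("not a C-instruction (top three bits must be 111)")
    return {
        "a": (word >> 12) & 1,             # 1 = use M, 0 = use A
        "comp": (word >> 6) & 0b111111,    # ALU control bits
        "dest": (word >> 3) & 0b111,       # A, D, M destination bits
        "jump": word & 0b111,              # jump condition bits
    }


fields = decode_c_instruction(0b1111110111010000)   # D=M+1
print({name: bin(value) for name, value in fields.items()})
```

For `D=M+1` this yields a = 1 (read M), comp = 110111 (increment), dest = 010 (store to D), and jump = 000 (no jump).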
Register Interaction Patterns
Pattern 1: Register to Register
────────────────────────────────
D = A
Flow:
A_Out → Data_Bus → D_In
LOAD_D pulse
Done in 1 cycle!
Pattern 2: ALU Feedback
───────────────────────
D = D + 1
Flow:
D_Out → ALU X input
ALU computes D + 1
ALU_Out → D_In
LOAD_D pulse
D updated in 1 cycle
Pattern 3: Memory Access
────────────────────────
D = M
Flow:
A_Out → RAM address
RAM → Data
Data → D_In
LOAD_D pulse
Need 2-3 cycles (memory slower)
Pattern 4: Complex Calculation
───────────────────────────────
D = (A + D) & M
Flow:
The Hack ALU performs one operation per instruction, so this takes two instructions:
Instruction 1: D=D+A (compute A + D, load the sum into D)
Instruction 2: D=D&M (compute D & M, load the result into D)
Multi-instruction operation!
Design Decisions Explained
Why These Specific ICs?
74HC574 for A and D:
Pros:
✓ Simple edge-triggered operation
✓ Separate input/output pins
✓ 3-state capability (future expansion)
✓ Very common and cheap
✓ Well understood
Cons:
✗ No built-in increment (but don't need it)
✗ Requires external adder for arithmetic
Verdict: Perfect for simple storage registers
74HC161 for PC:
Pros:
✓ Built-in counter (auto-increment)
✓ Synchronous operation (no glitches)
✓ Parallel load capability (for jumps)
✓ Clear function (for reset)
✓ Carry chain support (for multi-byte)
Cons:
✗ Cannot use for general storage
✗ More complex than 74HC574
Verdict: Perfect for program counter
Alternative Designs Considered
Option 1: Use SRAM for all registers
─────────────────────────────────────
Pros: One chip could hold every register
Cons: Slower, needs addressing logic
Verdict: Overkill for 3 registers
Option 2: Use shift registers
─────────────────────────────
Pros: Serial data loading
Cons: Too slow (16 clocks to load)
Verdict: Wrong tool for the job
Option 3: Use up/down counters (74HC193)
───────────────────────────────────────
Pros: Up/down counting
Cons: Don't need down counting
Verdict: Unnecessary complexity
Option 4: Use 74HC377 (with enable)
───────────────────────────────────
Pros: Extra enable control
Cons: No 3-state outputs; its clock-enable pin is not needed because LOAD_A and LOAD_D are already separate pulses
Verdict: No advantage
Final Choice: 74HC574 + 74HC161
────────────────────────────────
├─ Minimum IC count
├─ Standard, available parts
├─ Well-documented behavior
├─ Easy to understand
└─ Perfect match for requirements
Power Consumption
Per IC power consumption (typical):
74HC574 (idle): ~10µA
74HC574 (active): ~5mA at 1MHz
74HC161 (idle): ~10µA
74HC161 (active): ~6mA at 1MHz
Register module total:
├─ 4× 74HC574 = 20mA
├─ 4× 74HC161 = 24mA
└─ Total: ~44mA at 1MHz
Very low power!
Can run from batteries
Much better than old 74LS series
Testing Registers
Testing A and D Registers
TEST 1: Basic Storage
─────────────────────
1. Apply data: 0xAAAA to Data_In
2. Pulse LOAD_A
3. Check: A_Out should be 0xAAAA
4. Change Data_In to 0x5555
5. Don't pulse LOAD_A
6. Check: A_Out still 0xAAAA (retained)
Pass criteria: Data retained until next load
TEST 2: All Bits
────────────────
Test patterns:
├─ 0x0000 (all zeros)
├─ 0xFFFF (all ones)
├─ 0xAAAA (alternating: 1010...)
├─ 0x5555 (alternating: 0101...)
├─ Walking 1s: 0x0001, 0x0002, 0x0004...
└─ Walking 0s: 0xFFFE, 0xFFFD, 0xFFFB...
Verifies: All flip-flops work
TEST 3: Timing
──────────────
1. Set Data_In = 0x1234
2. Pulse LOAD_A (20ns wide)
3. Measure Clock-to-Q delay
4. Should be < 25ns
Verifies: IC speed within spec
TEST 4: Multiple Registers
───────────────────────────
1. Load A = 0xABCD
2. Load D = 0x1234
3. Verify both retained independently
4. No crosstalk between registers
Verifies: Isolation between registers
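The TEST 2 pattern list is easy to generate programmatically, for example to drive a test jig or a simulation. A minimal sketch, with `register_test_patterns` as a hypothetical helper name:

```python
def register_test_patterns(width: int = 16):
    """Build the pattern list: fixed patterns, walking 1s, walking 0s."""
    mask = (1 << width) - 1
    patterns = [0x0000, mask, 0xAAAA & mask, 0x5555 & mask]
    patterns += [1 << i for i in range(width)]            # walking 1s
    patterns += [mask ^ (1 << i) for i in range(width)]   # walking 0s
    return patterns


patterns = register_test_patterns()
print(len(patterns), [hex(p) for p in patterns[:7]])
```

Each pattern would be loaded into the register and read back; a stuck or shorted bit shows up as a mismatch on one of the walking patterns.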
Testing Program Counter
TEST 1: Increment
─────────────────
1. Clear PC (PC = 0)
2. Set PC_INC = HIGH
3. Apply 10 clock pulses
4. Check: PC should be 10
Pass: Counts correctly
TEST 2: Parallel Load
─────────────────────
1. Set A = 0x1234
2. Assert /LOAD (LOW)
3. Clock pulse
4. Check: PC = 0x1234
Pass: Load function works
TEST 3: Carry Chain
───────────────────
1. Set PC = 0x000F (15)
2. Increment once
3. Check: PC = 0x0010 (16)
(Verify bit 4 changes, not just bit 0)
Pass: Carry propagates correctly
TEST 4: Maximum Count
─────────────────────
1. Set PC = 0x7FFF (32767)
2. Increment once
3. Check: PC = 0x0000 (wraps)
Pass: Overflow handling correct
TEST 5: Reset
─────────────
1. Set PC = random value
2. Assert /CLR
3. Check: PC = 0x0000 immediately
(no clock needed)
Pass: Asynchronous clear works
Common Problems and Solutions
Problem 1: Register always outputs 0
────────────────────────────────────
Likely causes:
├─ No power to IC
├─ /OE pin not grounded (hi-Z mode)
├─ Clock not reaching IC
└─ Damaged IC
Tests:
├─ Check VCC at pin 20 (74HC574) or pin 16 (74HC161)
├─ Check GND at pin 10 (74HC574) or pin 8 (74HC161)
├─ Verify /OE = 0V
└─ Replace IC if needed
Problem 2: Register doesn't update
───────────────────────────────────
Likely causes:
├─ No clock signal
├─ Clock pulse too short
├─ Setup time violated
└─ Timing issue
Tests:
├─ Probe CLK pin with scope
├─ Verify pulse width > 15ns
├─ Check data stable before clock
└─ Slow down clock if needed
Problem 3: Random values stored
────────────────────────────────
Likely causes:
├─ Missing decoupling cap
├─ Setup/hold time violation
├─ Noise on data bus
└─ Clock edge too fast
Solutions:
├─ Add 0.1µF cap next to IC
├─ Add buffer if loading heavy
├─ Slow down clock transitions
└─ Check PCB routing
Problem 4: PC counts wrong
──────────────────────────
Likely causes:
├─ Carry chain broken
├─ ENP/ENT not connected
├─ RCO not reaching next stage
└─ One IC not counting
Tests:
├─ Check each counter individually
├─ Verify RCO connections
├─ Probe each stage with scope
└─ Replace bad counter IC
Problem 5: PC doesn't jump
──────────────────────────
Likely causes:
├─ /LOAD not reaching ICs
├─ Parallel load data wrong
├─ /LOAD pulse too short
└─ Control logic error
Tests:
├─ Probe /LOAD signal
├─ Check A register value
├─ Verify timing of /LOAD
└─ Check control circuit
Understanding Through Analogy
The Register as a Mailbox
A and D Registers = Post Office Boxes
──────────────────────────────────────
├─ Box #A and Box #D
├─ You can PUT mail in (write)
├─ You can GET mail out (read)
├─ Mail STAYS until replaced
└─ Need key to change (clock signal)
Program Counter = Street Address
─────────────────────────────────
├─ Walking down street (incrementing)
├─ Each building is instruction (ROM)
├─ Sometimes skip buildings (jump)
├─ Automated walk (counter)
└─ Can teleport (parallel load)
The Clock = Postal Worker's Route
──────────────────────────────────
├─ Comes at same time each day
├─ Everything synchronized to schedule
├─ Reliable, predictable timing
└─ System works because of routine
The Complete Picture
┌───────────────────────────┐
│ Instruction Memory │
│ (ROM) │
└──────────┬────────────────┘
│
│ Instruction
▼
┌──────────────────────────┐
│ Control Unit │
│ (Decodes instruction) │
└─────┬────────────────────┘
│
┌──────────┼──────────┬─────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌───────┐ ┌──────┐
│ A │ │ D │ │ PC │ │ ALU │
│ Reg │ │ Reg │ │Counter│ │ │
└────┬───┘ └────┬───┘ └───┬───┘ └──┬───┘
│ │ │ │
└──────────┴─────────┴────────┘
Data flows between all
Summary: The Elegant Register Design
Key Insights:
Registers are memory - but fast, local, and simple
Two types needed:
- Storage (A, D) - just remember
- Counter (PC) - remember AND increment
Perfect IC match:
- 74HC574: optimal for storage
- 74HC161: optimal for counting
Synchronization critical:
- All on same clock
- Predictable timing
- No race conditions
Simple yet complete:
- Only 3 registers
- Enough for full computer
- Turing-complete!
The register design shows that simplicity works. You don't need dozens of registers like modern CPUs. Three carefully designed registers, with the right hardware support, are enough to build a working computer!