diff --git a/README.md b/README.md
index 4100a17..95b70e4 100644
--- a/README.md
+++ b/README.md
@@ -12,8 +12,8 @@ Boots IRIX 6.5 and 5.3. Has networking. Has a framebuffer.
 ## Q&A
 
 **Q: What is it?**
-An SGI Indy (MIPS R4400) emulator. Emulates enough of the hardware that IRIX
-actually boots to a usable system — shell, networking, X11, the works.
+An SGI Indy (MIPS R4400) emulator. Emulates enough hardware that IRIX
+boots to a usable system: shell, networking, X11, the works.
 
 **Q: But why?**
 Wanted to see how far vibe coding could go, and to learn some Rust along the way.
@@ -38,7 +38,11 @@ Yes.
 
 - IRIX 6.5 boots to multiuser, networking works (ping, telnet, ftp)
 - IRIX 5.3 works too
-- X11 / Newport (REX3) graphics works
+- X11 / Newport (REX3) graphics works, with mouse and keyboard input
+- Cranelift JIT compiler for MIPS to x86_64 translation (optional)
+- Copy-on-write disk overlay. Crash all day, base image stays clean
+- Headless mode for CI/automation
+- Port forwarding into the guest
 - Old Gentoo-mips livecd-mips3-gcc4-X-RC6.img dies somewhere in kernel
 - NetBSD shows a white screen and probably goes into the weeds
 
@@ -47,18 +51,88 @@ Yes.
 
 You need:
 - `scsi1.raw` — raw hard disk image with IRIX 6.5.22 for Indy
-for a quick start get the mame irix image from https://mirror.rqsall.com/sgi-mame/ and convert to raw using chdman extractraw
+  (for a quick start get the MAME IRIX image from https://mirror.rqsall.com/sgi-mame/ and convert to raw using `chdman extractraw`)
 - `070-9101-011.bin` — Indy PROM image (optional; a default is embedded)
 
 ```
 cargo run --release
-you can add --features lightning for a little more speed
 ```
 
-See [HELP.md](HELP.md) for the full rundown — serial ports, monitor console,
+Build variants:
+```
+cargo run --release --features lightning    # disable breakpoints for ~10% more speed
+cargo run --release --features jit          # enable Cranelift JIT compiler
+```
+
+See [HELP.md](HELP.md) for the full rundown: serial ports, monitor console,
 NVRAM/MAC address setup, disk image prep, and more.
 
 
+## JIT compiler
+
+Optional Cranelift-based JIT. Compiles hot MIPS basic blocks to native x86_64.
+Enable with `--features jit` at build time and `IRIS_JIT=1` at runtime.
+
+Three tiers: blocks start ALU-only (registers + branches), promote to
+Loads (+ memory reads), then Full (+ stores) based on stable execution. Probe
+interval is adaptive. Hot block profiles persist across sessions.
+
+```
+IRIS_JIT=1 cargo run --release --features jit
+```
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `IRIS_JIT` | 0 | Enable JIT (1) or interpreter-only (0) |
+| `IRIS_JIT_MAX_TIER` | 2 | Cap tier: 0=ALU, 1=Loads, 2=Full |
+| `IRIS_JIT_VERIFY` | 0 | Run each block through interpreter and compare (debug) |
+| `IRIS_JIT_PROBE` | 200 | Base probe interval (steps between cache checks) |
+
+
+## Copy-on-write disk overlay
+
+Protects disk images from corruption during development and testing. The base
+`.raw` file is opened read-only and writes go to a sparse overlay file. Kill
+the emulator whenever you want. Delete the overlay to reset to the clean base.
+
+Enable in `iris.toml`:
+```toml
+[scsi.1]
+path = "scsi1.raw"
+cdrom = false
+overlay = true
+```
+
+Writes go to `scsi1.raw.overlay`. Monitor commands:
+- `cow status` - show dirty sector count
+- `cow commit` - merge overlay into base image (permanent)
+- `cow reset` - discard all overlay writes
+
+
+## Input
+
+Click the window to grab mouse and keyboard. Right Ctrl releases the grab.
+Mouse and keyboard use standard PS/2 emulation through the IOC.
+
+**Note:** Alt-tabbing away from the window can garble keyboard input in IRIX
+terminal apps. Use `telnet 127.0.0.1 2323` (with port forwarding configured)
+for a clean terminal instead.
+
+
+## Rules
+
+The `rules/` directory contains hard-won lessons from debugging the JIT and
+getting IRIX running. These are meant for both humans and AI assistants working
+on the codebase.
+
+- `rules/jit/` - dispatch architecture, store compilation, sync, verify mode, probe tuning
+- `rules/irix/` - networking config, keyboard quirks
+- `rules/testing/` - disk image handling, avoiding filesystem corruption
+
+If you're about to touch the JIT dispatch loop, read `rules/jit/dispatch-architecture.md`
+first. It'll save you a few days.
+
+
 ## License
 
 BSD 3-Clause
diff --git a/rules/irix/keyboard-issues.md b/rules/irix/keyboard-issues.md
new file mode 100644
index 0000000..9ff5cf2
--- /dev/null
+++ b/rules/irix/keyboard-issues.md
@@ -0,0 +1,20 @@
+# IRIX Keyboard Issues
+
+## Alt-tab corrupts X11 keyboard input
+
+After alt-tabbing away from the Rex window and returning, IRIX X11 terminal
+apps (Console, Terminal, xterm) show escape codes instead of typed characters.
+The IRIX login dialog still works (different input path).
+
+**Cause:** The Alt key release event from alt-tab confuses IRIX's X keyboard
+state machine. The PS/2 scancode for LAlt (0x19 in set 3) is delivered as a
+release without a matching press.
+
+**Workarounds:**
+1. Don't alt-tab while interacting with IRIX GUI — use Right Ctrl to ungrab mouse
+2. Use telnet via port forwarding (host 2323 -> guest 23) for terminal access
+3. Mount the disk image directly to edit files from the host
+
+**Status:** Pre-existing emulator issue, not introduced by any recent changes.
+Proper fix would require filtering or suppressing stale modifier key events
+in the UI event handler when focus is regained.
diff --git a/rules/irix/networking.md b/rules/irix/networking.md
new file mode 100644
index 0000000..a2eb906
--- /dev/null
+++ b/rules/irix/networking.md
@@ -0,0 +1,60 @@
+# IRIX 6.5 Networking Configuration
+
+## Required files
+
+| File | Contents | Example |
+|------|----------|---------|
+| /etc/sys_id | Hostname | `IRIS` |
+| /etc/hosts | IP-to-hostname mapping | `192.168.0.2 IRIS` |
+| /etc/config/ifconfig-ec0.options | IP + netmask (hex) | `192.168.0.2 netmask 0xffffff00` |
+| /etc/config/static-route.options | Default gateway | `$ROUTE $QUIET add net default 192.168.0.1` |
+| /etc/config/network | Enable networking | `on` |
+
+## Common mistakes
+
+- **Wrong filename:** Use `ifconfig-ec0.options`, NOT `ifconfig-1.options`.
+  IRIX names config files after the interface device name.
+
+- **Missing IP in options:** The IP address goes IN `ifconfig-ec0.options`
+  along with the netmask. It's not just options — it's the full ifconfig args.
+
+- **Wrong gateway file:** Use `/etc/config/static-route.options`, NOT
+  `/etc/defaultrouter`. The format uses shell variables: `$ROUTE $QUIET add net default <ip>`.
+
+- **Netmask format:** IRIX uses hex notation: `0xffffff00` for 255.255.255.0.
+
+## NVRAM MAC address (one-time setup)
+
+The Seeq Ethernet controller reads its MAC from NVRAM. A fresh install has
+no MAC set, which prevents networking.
+
+1. Boot to PROM monitor (press Escape during countdown)
+2. `>> setenv -f eaddr 08:00:69:de:ad:01` (any SGI OUI `08:00:69` MAC)
+3. From iris monitor (telnet 127.0.0.1 8888): `rtc save`
+
+## iris emulator network configuration
+
+The emulator provides a NAT gateway with built-in DHCP:
+- Gateway: 192.168.0.1 (hardcoded in GatewayConfig)
+- Guest: 192.168.0.2 (assigned via DHCP or static)
+- Netmask: 255.255.255.0
+- DNS: forwarded to host's resolver
+
+Port forwarding configured in iris.toml:
+```toml
+[[port_forward]]
+proto = "tcp"
+host_port = 2323
+guest_port = 23
+bind = "localhost"
+```
+
+## Keyboard workaround
+
+Alt-tabbing away from the Rex window corrupts IRIX X11 keyboard input
+(terminal apps show escape codes). Once networking is up, use:
+```bash
+telnet 127.0.0.1 2323
+```
+This connects via the port forward to IRIX's telnet daemon with a clean
+terminal — no keyboard corruption issues.
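The hex netmask form required by `ifconfig-ec0.options` can be derived mechanically from the dotted-quad form. A minimal sketch, assuming you generate the file contents on the host; `netmask_to_hex` and `ifconfig_options_line` are hypothetical helper names, not part of the emulator:

```rust
use std::net::Ipv4Addr;

/// Format a dotted-quad netmask in the hex notation IRIX expects,
/// e.g. 255.255.255.0 becomes "0xffffff00".
fn netmask_to_hex(mask: Ipv4Addr) -> String {
    // u32::from(Ipv4Addr) yields the address in network (big-endian) order.
    format!("0x{:08x}", u32::from(mask))
}

/// Build the single line that goes into /etc/config/ifconfig-ec0.options:
/// the IP address followed by the netmask, i.e. the full ifconfig arguments.
fn ifconfig_options_line(ip: Ipv4Addr, mask: Ipv4Addr) -> String {
    format!("{} netmask {}", ip, netmask_to_hex(mask))
}
```

Called with `192.168.0.2` and `255.255.255.0`, `ifconfig_options_line` produces the example line from the required-files table.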
diff --git a/rules/jit/dispatch-architecture.md b/rules/jit/dispatch-architecture.md
new file mode 100644
index 0000000..5313ec6
--- /dev/null
+++ b/rules/jit/dispatch-architecture.md
@@ -0,0 +1,40 @@
+# JIT Dispatch Architecture Rules
+
+## Interpreter-first, never JIT-first
+
+The dispatch loop must run the interpreter in sustained bursts (hundreds of steps)
+between JIT block executions. The interpreter's step() does critical per-instruction
+bookkeeping: cp0_count advancement, interrupt checking, cp0_compare crossover,
+delay slot state machine.
+
+**NEVER** check the JIT cache every iteration (JIT-first). Even one exec.step()
+between JIT blocks is insufficient. Tested at 58% JIT ratio — kernel panicked.
+
+**Minimum probe interval: 100.** Below this, the system approaches JIT-first
+behavior and crashes. The adaptive ProbeController enforces this via IRIS_JIT_PROBE_MIN.
+
+## No block chaining
+
+**NEVER** execute multiple JIT blocks consecutively without returning to the
+interpreter. Manual interrupt checks between chained blocks are insufficient —
+they miss CP0 timing, soft reset, software interrupts. Tested with up to 16
+chained blocks — kernel panic at 0x880097ac.
+
+## Post-block bookkeeping is mandatory
+
+After every JIT block execution (normal exit path):
+1. Advance cp0_count by `block_len * count_step`
+2. Check cp0_compare crossover for timer interrupt (CAUSE_IP7)
+3. Credit `local_cycles += block_len`
+4. Check for pending interrupts via atomic load
+5. Merge external IP bits into cp0_cause
+6. If unmasked interrupt pending, call exec.step() to service it
+
+On the exception path:
+1. Advance cp0_count for instructions executed BEFORE the fault:
+   `instrs_before_fault = (ctx.pc - block_start_pc) / 4`
+
+Omitting post-block cp0_count advancement causes timer drift and kernel panics.
+The advancement was present in the original JIT but was accidentally dropped in
+the rewrite. The bug was masked by short straight-line ALU blocks but manifested
+immediately with branch compilation (longer, more frequent blocks).
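The count and compare steps of the post-block bookkeeping can be sketched as below. This is an illustration under assumptions, not the emulator's code: `TimerState` is a hypothetical, flattened stand-in for the executor fields named in the rules, and the wraparound-safe crossover test is one possible way to implement step 2.

```rust
/// Hypothetical, simplified timing state. Field names follow the rules
/// above, but the layout is an assumption of this sketch.
struct TimerState {
    cp0_count: u32,
    cp0_compare: u32,
    cp0_cause: u32,
    count_step: u32,
    local_cycles: u64,
}

/// Timer interrupt pending bit in Cause (IP7 is bit 15 on MIPS).
const CAUSE_IP7: u32 = 1 << 15;

impl TimerState {
    /// Account for `instrs` guest instructions executed inside a JIT block.
    /// Returns true if Count crossed Compare during the block.
    fn credit_block(&mut self, instrs: u32) -> bool {
        let delta = instrs.wrapping_mul(self.count_step);
        let old = self.cp0_count;
        self.cp0_count = old.wrapping_add(delta);
        self.local_cycles += u64::from(instrs);

        // Crossover test that survives 32-bit wraparound: Compare was hit
        // if it lies in the half-open interval (old, old + delta].
        let dist = self.cp0_compare.wrapping_sub(old);
        let crossed = dist != 0 && dist <= delta;
        if crossed {
            self.cp0_cause |= CAUSE_IP7;
        }
        crossed
    }
}

/// Instructions completed before a fault, per the exception-path rule.
fn instrs_before_fault(fault_pc: u64, block_start_pc: u64) -> u32 {
    (fault_pc.wrapping_sub(block_start_pc) / 4) as u32
}
```

On the exception path, the same `credit_block` call would be made with `instrs_before_fault(ctx.pc, block_start_pc)` instead of the full block length.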
diff --git a/rules/jit/probe-tuning.md b/rules/jit/probe-tuning.md
new file mode 100644
index 0000000..0e55873
--- /dev/null
+++ b/rules/jit/probe-tuning.md
@@ -0,0 +1,36 @@
+# JIT Probe Tuning Rules
+
+## Minimum probe interval: 100
+
+Never allow the probe interval to drop below ~100 interpreter steps. Below
+this threshold, the system approaches JIT-first behavior and the interpreter
+doesn't get enough sustained runs for kernel timing stability.
+
+An earlier adaptive formula `200_000 / cache_size` gave a value of 9 with
+21K blocks, effectively making probe=32. This crashed.
+
+## Use sqrt-based cache pressure
+
+Cache size pressure formula: `1.0 / (cache_size / 100.0).sqrt()`
+
+This shrinks the interval as the cache grows:
+- 100 blocks: factor 1.0 (no change)
+- 1000 blocks: factor 0.32
+- 10000 blocks: factor 0.10
+- 50000 blocks: factor 0.045
+
+Combined with min_interval=100, the effective probe never drops dangerously low.
+
+## Asymmetric EWMA response
+
+Hits pull the interval down aggressively (~3% per hit, factor 31/32).
+Misses push the interval up gently (~1% per miss, factor 33/32).
+This exploits hot code quickly without overreacting to cold regions.
+
+## Environment variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| IRIS_JIT_PROBE | 200 | Base probe interval |
+| IRIS_JIT_PROBE_MIN | 100 | Minimum (critical floor) |
+| IRIS_JIT_PROBE_MAX | 2000 | Maximum |
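A minimal sketch of how the sqrt-based pressure factor and the floor interact. It assumes the factor simply multiplies the base interval and that caches of 100 blocks or fewer get factor 1.0; the text does not say how the real ProbeController composes these, and `cache_pressure` and `effective_probe` are hypothetical names.

```rust
/// Cache-pressure factor: 1 / sqrt(cache_size / 100).
/// Clamping small caches to 1.0 is an assumption of this sketch.
fn cache_pressure(cache_size: usize) -> f64 {
    if cache_size <= 100 {
        return 1.0;
    }
    1.0 / (cache_size as f64 / 100.0).sqrt()
}

/// Effective probe interval: the base interval scaled by cache pressure,
/// then clamped to [min, max] (IRIS_JIT_PROBE_MIN / IRIS_JIT_PROBE_MAX).
fn effective_probe(base: u32, cache_size: usize, min: u32, max: u32) -> u32 {
    let scaled = (f64::from(base) * cache_pressure(cache_size)).round() as u32;
    scaled.clamp(min, max)
}
```

With the documented defaults (base 200, min 100, max 2000), a 21K-block cache scales the base well below 100, and the clamp is what keeps the interval at the floor.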
diff --git a/rules/jit/store-compilation.md b/rules/jit/store-compilation.md
new file mode 100644
index 0000000..8dd9bf6
--- /dev/null
+++ b/rules/jit/store-compilation.md
@@ -0,0 +1,62 @@
+# JIT Store Compilation Rules
+
+## Full-tier blocks must be non-speculative
+
+Set `speculative: tier != BlockTier::Full` in the compiler.
+
+**Why:** Snapshot rollback restores CPU+TLB but NOT memory. If a store block
+does read-modify-write (LW, ADDIU, SW) and then hits an exception, rollback
+rewinds CPU to pre-block state but memory has the modified value. The
+interpreter re-runs from block entry, reads the modified value, modifies it
+again. Counters become N+2 instead of N+1. This corrupts kernel data structures.
+
+## Full-tier blocks must terminate at the first store
+
+In trace_block, break after pushing the first store instruction at Full tier:
+```rust
+if tier == BlockTier::Full && is_compilable_store(&d) {
+    break;
+}
+```
+
+**Why:** Long blocks with multiple load/store helper calls create complex CFG
+(ok_block/exc_block diamond patterns per helper). This triggers Cranelift
+regalloc2 codegen issues on x86_64 — rare but fatal corruption that manifests
+after millions of block executions. Short blocks (~3-10 instructions) work
+perfectly. Confirmed empirically: short blocks = stable with 5K+ Full
+promotions; long blocks = crash at 780M instructions.
+
+## Write helpers must use status != EXEC_COMPLETE
+
+```rust
+if status != EXEC_COMPLETE { ctx.exit_reason = EXIT_EXCEPTION; ... }
+```
+
+**NEVER** use `status & EXEC_IS_EXCEPTION != 0`. BUS_BUSY (0x100) does not
+have the EXEC_IS_EXCEPTION bit (bit 27) set, so it would be treated as
+success. But BUS_BUSY means the write was NOT performed. This silently drops
+uncached writes (MMIO stores to device registers), causing slow corruption.
+
+## Verify mode cannot validate stores
+
+Verify mode snapshots CPU/TLB but NOT memory. After a JIT block with stores
+modifies memory, the interpreter re-run reads the JIT-modified values.
+Read-modify-write sequences get double-applied. Verify mode is only valid
+for the ALU and Loads tiers.
+
+## Delay-slot stores should be excluded from compilation
+
+In trace_block, when checking the delay slot instruction for a branch, exclude
+stores:
+```rust
+if is_compilable_for_tier(&delay_d, tier) && !is_compilable_store(&delay_d) {
+    instrs.push((delay_raw, delay_d));
+    delay_ok = true;
+}
+```
+
+**Why:** If a delay-slot store faults, sync_to_executor clears in_delay_slot.
+exec.step() re-executes the store as a non-delay-slot instruction.
+handle_exception sets cp0_epc to the store PC (not the branch PC) and doesn't
+set the BD bit. On ERET, the branch is permanently skipped, corrupting control
+flow. This is defensive — the block length fix is the primary fix for stores.
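The write-helper status rule can be checked in isolation. A minimal sketch: the `BUS_BUSY` value and the exception bit position are taken from the rule above, while `EXEC_COMPLETE = 0` and both function names are assumptions made for illustration.

```rust
const EXEC_COMPLETE: u32 = 0; // assumed value, not confirmed by the text
const BUS_BUSY: u32 = 0x100; // from the rule above
const EXEC_IS_EXCEPTION: u32 = 1 << 27; // bit 27, from the rule above

/// Correct check: anything other than completion means the store did not land.
fn store_failed(status: u32) -> bool {
    status != EXEC_COMPLETE
}

/// Buggy check: only looks at the exception bit, so BUS_BUSY passes as success
/// and the write is silently dropped.
fn store_failed_bitmask(status: u32) -> bool {
    status & EXEC_IS_EXCEPTION != 0
}
```

The two functions agree on completed and faulting stores and differ only on `BUS_BUSY`, which is exactly the case that loses uncached MMIO writes.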
diff --git a/rules/jit/sync-architecture.md b/rules/jit/sync-architecture.md
new file mode 100644
index 0000000..b5a373d
--- /dev/null
+++ b/rules/jit/sync-architecture.md
@@ -0,0 +1,29 @@
+# JIT Sync Architecture Rules
+
+## sync_to_executor: minimal writeback only
+
+sync_to_executor must ONLY write back:
+- GPRs (core.gpr)
+- PC (core.pc)
+- hi, lo
+
+It must NOT write back:
+- cp0_status, cp0_cause, cp0_epc, cp0_badvaddr
+- cp0_count, cp0_compare, count_step
+- nanotlb (all 3 entries)
+- fpr, fpu_fcsr
+- local_cycles, cached_pending
+
+**Why:** JIT memory helpers (read/write) call exec methods directly, which
+modify these fields on the executor in-place. The JitContext copy is stale
+for these fields after helpers run. Writing them back would clobber changes
+made by exception handlers and TLB fill operations.
+
+## sync_to_executor must clear delay slot state
+
+Always set:
+- exec.in_delay_slot = false
+- exec.delay_slot_target = 0
+
+JIT blocks handle delay slots internally. Clearing prevents the interpreter
+from jumping to a stale target on the next step().
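The two sync rules above can be sketched together as follows. `JitContext` and `Executor` here are hypothetical, trimmed-down stand-ins (the real structs have many more fields and nest the registers under `core`); `cp0_status` stands in for every field that must not be written back.

```rust
/// Hypothetical stand-in for the JIT's register snapshot.
#[derive(Default)]
struct JitContext {
    gpr: [u64; 32],
    pc: u64,
    hi: u64,
    lo: u64,
    cp0_status: u32, // stale after helpers run; must not be written back
}

/// Hypothetical stand-in for the interpreter state.
#[derive(Default)]
struct Executor {
    gpr: [u64; 32],
    pc: u64,
    hi: u64,
    lo: u64,
    cp0_status: u32,
    in_delay_slot: bool,
    delay_slot_target: u64,
}

fn sync_to_executor(ctx: &JitContext, exec: &mut Executor) {
    // Minimal writeback: GPRs, PC, hi, lo only.
    exec.gpr = ctx.gpr;
    exec.pc = ctx.pc;
    exec.hi = ctx.hi;
    exec.lo = ctx.lo;
    // cp0_status is deliberately not copied: helpers may have updated the
    // executor's copy in place, and ctx.cp0_status would clobber it.

    // JIT blocks handle delay slots internally, so clear interpreter state.
    exec.in_delay_slot = false;
    exec.delay_slot_target = 0;
}
```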
diff --git a/rules/jit/verify-mode.md b/rules/jit/verify-mode.md
new file mode 100644
index 0000000..17fac32
--- /dev/null
+++ b/rules/jit/verify-mode.md
@@ -0,0 +1,17 @@
+# JIT Verify Mode Rules
+
+## Timing false positive detection
+
+Verify mode re-runs each block through the interpreter at a different wall-clock
+time. The interpreter may see different external interrupt state via the atomic
+and take an exception the JIT didn't see (or vice versa).
+
+Detection: if the interpreter PC is in the exception vector region (0x80000000-0x80000400,
+which covers the general vector at 0x80000180) but the JIT PC is not, it's a timing
+false positive. Keep the block, don't invalidate. Use the interpreter's result as authoritative.
+
+## Verify mode is invalid for store blocks
+
+See store-compilation.md. Memory is not part of the snapshot, so verify mode
+double-applies read-modify-write sequences for blocks containing stores.
+Only use verify mode for ALU and Loads tiers.
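The false-positive test described above reduces to a range check on the two PCs. A minimal sketch, assuming the comparison is done on the low 32 bits of the PC and that the upper bound is inclusive; both function names are hypothetical.

```rust
/// Exception vector window from the rule above.
const VEC_START: u32 = 0x8000_0000;
const VEC_END: u32 = 0x8000_0400; // treated as inclusive (assumption)

/// True if the PC's low 32 bits fall inside the exception vector window.
/// Truncation lets sign-extended 64-bit kseg0 addresses match as well.
fn in_exception_vectors(pc: u64) -> bool {
    let pc32 = pc as u32;
    (VEC_START..=VEC_END).contains(&pc32)
}

/// A mismatch is a timing false positive when only the interpreter
/// ended up in an exception vector.
fn is_timing_false_positive(interp_pc: u64, jit_pc: u64) -> bool {
    in_exception_vectors(interp_pc) && !in_exception_vectors(jit_pc)
}
```

When this returns true, the rule is to keep the block and continue from the interpreter's state.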
diff --git a/rules/testing/disk-image-hygiene.md b/rules/testing/disk-image-hygiene.md
new file mode 100644
index 0000000..b425326
--- /dev/null
+++ b/rules/testing/disk-image-hygiene.md
@@ -0,0 +1,43 @@
+# Disk Image Hygiene Rules
+
+## Always use COW overlay or fresh disk for JIT testing
+
+Each emulator crash (kernel panic) corrupts the IRIX XFS filesystem on
+scsi1.raw. Subsequent boots from the corrupted image produce TLBMISS panics
+that look IDENTICAL to JIT bugs. This confounded at least 5 rounds of store
+debugging — crashes were attributed to JIT codegen when they were actually
+filesystem corruption from earlier test runs.
+
+**Solutions (use one):**
+1. Enable `overlay = true` in iris.toml — base image never modified
+2. Re-extract scsi1.raw from the CHD archive before each test
+3. Keep a known-good copy: `cp scsi1.raw scsi1.raw.clean`
+
+## Mounting IRIX disk images on Linux
+
+The disk image has an SGI DVH (disk volume header) with partitions. The XFS
+root partition is NOT at offset 0.
+
+To find the partition offset:
+```python
+# Parse SGI DVH partition table at offset 312
+# 16 entries of 12 bytes: (nblks: u32be, first_lbn: u32be, type: u32be)
+# Type 10 = XFS
+```
+
+For a standard IRIX 6.5 install, the root XFS partition is typically at
+offset 136314880 (LBA 266240 * 512).
+
+```bash
+sudo losetup -o 136314880 /dev/loopN scsi1.raw
+sudo mount /dev/loopN /mnt/irix
+```
+
+If the filesystem is dirty from a crash:
+```bash
+sudo xfs_repair -L /dev/loopN   # zeros the dirty log
+sudo mount /dev/loopN /mnt/irix
+```
+
+The `-L` flag is required — IRIX big-endian XFS dirty logs can't be replayed
+by Linux's little-endian XFS driver.
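The partition lookup described in the mounting section can be sketched in Rust as below. It follows the layout given above (table at byte 312, 16 entries of 12 bytes, big-endian, type 10 = XFS). The type and function names are hypothetical, and the sketch does not validate the volume header magic or checksum.

```rust
/// One entry of the SGI volume header partition table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DvhPartition {
    nblks: u32,
    first_lbn: u32,
    ptype: u32,
}

const PT_OFFSET: usize = 312;
const PT_ENTRIES: usize = 16;
const PT_ENTRY_SIZE: usize = 12;
const PTYPE_XFS: u32 = 10;
const SECTOR_SIZE: u64 = 512;

fn be32(b: &[u8]) -> u32 {
    u32::from_be_bytes([b[0], b[1], b[2], b[3]])
}

/// Parse the partition table from the first sector of the image.
/// Returns None if the buffer is too short to contain the table.
fn parse_partitions(header: &[u8]) -> Option<Vec<DvhPartition>> {
    let table = header.get(PT_OFFSET..PT_OFFSET + PT_ENTRIES * PT_ENTRY_SIZE)?;
    Some(
        table
            .chunks_exact(PT_ENTRY_SIZE)
            .map(|e| DvhPartition {
                nblks: be32(&e[0..4]),
                first_lbn: be32(&e[4..8]),
                ptype: be32(&e[8..12]),
            })
            .collect(),
    )
}

/// Byte offset of the first non-empty XFS partition, suitable for `losetup -o`.
fn first_xfs_offset(header: &[u8]) -> Option<u64> {
    parse_partitions(header)?
        .into_iter()
        .find(|p| p.ptype == PTYPE_XFS && p.nblks > 0)
        .map(|p| u64::from(p.first_lbn) * SECTOR_SIZE)
}
```

To use it, read the first 512 bytes of `scsi1.raw` into a buffer and pass that to `first_xfs_offset`.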
diff --git a/src/config.rs b/src/config.rs
index 34d221e..36a54c2 100644
--- a/src/config.rs
+++ b/src/config.rs
@@ -14,6 +14,10 @@ pub struct ScsiDeviceConfig {
     pub discs: Vec<String>,
     /// true = CD-ROM, false = hard disk.
     pub cdrom: bool,
+    /// Enable copy-on-write overlay. Base image is never modified; writes go to
+    /// `{path}.overlay`. Delete the overlay file to reset to clean state.
+    #[serde(default)]
+    pub overlay: bool,
 }
 
 /// Protocol for port forwarding.
@@ -125,11 +129,13 @@ fn default_scsi() -> std::collections::HashMap<u8, ScsiDeviceConfig> {
         path: "scsi1.raw".to_string(),
         discs: vec![],
         cdrom: false,
+        overlay: false,
     });
     map.insert(4, ScsiDeviceConfig {
         path: "cdrom4.iso".to_string(),
         discs: vec![],
         cdrom: true,
+        overlay: false,
     });
     map
 }
@@ -312,6 +318,7 @@ impl Cli {
                 path: String::new(),
                 discs: vec![],
                 cdrom,
+                overlay: false,
             });
             entry.path = path;
             entry.cdrom = cdrom;
diff --git a/src/cow_disk.rs b/src/cow_disk.rs
new file mode 100644
index 0000000..1a2e89c
--- /dev/null
+++ b/src/cow_disk.rs
@@ -0,0 +1,162 @@
+//! Copy-on-write disk overlay for SCSI disk images.
+//!
+//! Protects the base disk image from writes by redirecting them to a sparse
+//! overlay file. Reads check the overlay first, falling back to the base image
+//! for clean sectors. Deleting the overlay file resets the disk to its original state.
+
+use std::collections::HashSet;
+use std::fs::{File, OpenOptions};
+use std::io::{self, Read, Seek, SeekFrom, Write};
+
+const SECTOR_SIZE: u64 = 512;
+
+pub struct CowDisk {
+    base: File,
+    overlay: File,
+    dirty: HashSet<u64>,
+    base_size: u64,
+    overlay_path: String,
+}
+
+impl CowDisk {
+    /// Open a COW disk with the given base image (read-only) and overlay file (read-write).
+    /// An existing overlay file is reopened, but its dirty sectors are not reconstructed:
+    /// the dirty set starts empty. If the overlay doesn't exist, a new empty one is created.
+    pub fn new(base_path: &str, overlay_path: &str) -> io::Result<Self> {
+        let base = File::open(base_path)?;
+        let base_size = base.metadata()?.len();
+
+        let overlay = OpenOptions::new()
+            .read(true)
+            .write(true)
+            .create(true)
+            .open(overlay_path)?;
+
+        // The dirty set is NOT rebuilt from an existing overlay file.
+        // The overlay is a sparse file with the same layout as the base, but
+        // sparse holes can't easily be detected portably, so dirty sectors are
+        // tracked in memory only. After a restart the set starts empty and
+        // reads fall back to the base image until sectors are written again
+        // (the overlay is reset on state load anyway).
+        let dirty = HashSet::new();
+
+        eprintln!("iris: COW overlay active (base: {}, overlay: {})", base_path, overlay_path);
+        eprintln!("iris: to reset disk to clean state, delete {}", overlay_path);
+
+        Ok(Self {
+            base,
+            overlay,
+            dirty,
+            base_size,
+            overlay_path: overlay_path.to_string(),
+        })
+    }
+
+    /// Read `count` sectors starting at `lba`.
+    /// Dirty sectors are read from the overlay, clean sectors from the base.
+    pub fn read_sectors(&mut self, lba: u64, count: usize) -> io::Result<Vec<u8>> {
+        let total = count * SECTOR_SIZE as usize;
+        let mut data = vec![0u8; total];
+
+        // Batch consecutive sectors from the same source to minimize seeks.
+        let mut pos = 0usize;
+        let mut sector = lba;
+        while pos < total {
+            // Determine run length from the same source.
+            let is_dirty = self.dirty.contains(&sector);
+            let mut run = 1usize;
+            while pos + run * SECTOR_SIZE as usize <= total {
+                let next = sector + run as u64;
+                if self.dirty.contains(&next) != is_dirty {
+                    break;
+                }
+                run += 1;
+            }
+            // Don't overshoot.
+            let run_sectors = run.min((total - pos) / SECTOR_SIZE as usize);
+            let run_bytes = run_sectors * SECTOR_SIZE as usize;
+
+            let file = if is_dirty { &mut self.overlay } else { &mut self.base };
+            file.seek(SeekFrom::Start(sector * SECTOR_SIZE))?;
+            file.read_exact(&mut data[pos..pos + run_bytes])?;
+
+            pos += run_bytes;
+            sector += run_sectors as u64;
+        }
+
+        Ok(data)
+    }
+
+    /// Write sectors starting at `lba`. Data length must be a multiple of 512.
+    /// Writes go to the overlay file only; the base image is never modified.
+    pub fn write_sectors(&mut self, lba: u64, data: &[u8]) -> io::Result<()> {
+        debug_assert!(data.len() % SECTOR_SIZE as usize == 0);
+        let count = data.len() / SECTOR_SIZE as usize;
+
+        self.overlay.seek(SeekFrom::Start(lba * SECTOR_SIZE))?;
+        self.overlay.write_all(data)?;
+
+        for i in 0..count as u64 {
+            self.dirty.insert(lba + i);
+        }
+
+        Ok(())
+    }
+
+    /// Base image size in bytes.
+    pub fn size(&self) -> u64 {
+        self.base_size
+    }
+
+    /// Merge all dirty overlay sectors into the base image, then truncate overlay.
+    pub fn commit(&mut self) -> io::Result<usize> {
+        // Reopen base as read-write for the commit.
+        // (We can't just change the mode of self.base, so we open a second handle.)
+        let base_path = {
+            // The base path is not stored on the struct, so it is derived from
+            // the naming convention used when the overlay was opened:
+            // base path = overlay path without the ".overlay" suffix.
+            if self.overlay_path.ends_with(".overlay") {
+                self.overlay_path[..self.overlay_path.len() - 8].to_string()
+            } else {
+                return Err(io::Error::new(io::ErrorKind::Other,
+                    "cannot determine base path from overlay path"));
+            }
+        };
+
+        let mut base_rw = OpenOptions::new().read(true).write(true).open(&base_path)?;
+        let mut buf = vec![0u8; SECTOR_SIZE as usize];
+        let mut committed = 0usize;
+
+        for &lba in &self.dirty {
+            self.overlay.seek(SeekFrom::Start(lba * SECTOR_SIZE))?;
+            self.overlay.read_exact(&mut buf)?;
+            base_rw.seek(SeekFrom::Start(lba * SECTOR_SIZE))?;
+            base_rw.write_all(&buf)?;
+            committed += 1;
+        }
+
+        base_rw.sync_all()?;
+        self.dirty.clear();
+        self.overlay.set_len(0)?;
+
+        // Reopen base read-only to pick up committed data.
+        self.base = File::open(&base_path)?;
+
+        eprintln!("iris: COW committed {} sectors to {}", committed, base_path);
+        Ok(committed)
+    }
+
+    /// Truncate the overlay file and clear the dirty set, discarding all overlay writes (for state load).
+    pub fn reset_overlay(&mut self) -> io::Result<()> {
+        self.dirty.clear();
+        self.overlay.set_len(0)?;
+        self.overlay.seek(SeekFrom::Start(0))?;
+        Ok(())
+    }
+
+    /// Number of dirty sectors in the overlay.
+    pub fn dirty_count(&self) -> usize {
+        self.dirty.len()
+    }
+}
diff --git a/src/hpc3.rs b/src/hpc3.rs
index 7eba141..5e8bc69 100644
--- a/src/hpc3.rs
+++ b/src/hpc3.rs
@@ -1066,8 +1066,8 @@ impl Hpc3 {
         self.seeq.set_phys(mem);
     }
 
-    pub fn add_scsi_device(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>) -> std::io::Result<()> {
-        self.scsi_dev.add_device(id, path, is_cdrom, discs)
+    pub fn add_scsi_device(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>, overlay: bool) -> std::io::Result<()> {
+        self.scsi_dev.add_device(id, path, is_cdrom, discs, overlay)
     }
 
     pub fn ioc(&self) -> &Ioc {
@@ -1226,7 +1226,7 @@ impl Device for Hpc3 {
         if cmd == "seeq" || cmd == "net" {
              return self.seeq.execute_command(cmd, args, writer);
         }
-        if cmd == "scsi" {
+        if cmd == "scsi" || cmd == "cow" {
              return self.scsi_dev.execute_command(cmd, args, writer);
         }
         if cmd == "hal2" {
diff --git a/src/lib.rs b/src/lib.rs
index 712efa4..15004e6 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -28,6 +28,7 @@ pub mod locks;
 pub mod pit8254;
 pub mod net;
 pub mod seeq8003;
+pub mod cow_disk;
 pub mod scsi;
 pub mod wd33c93a;
 pub mod hal2;
diff --git a/src/machine.rs b/src/machine.rs
index 2153e01..6dae5e4 100644
--- a/src/machine.rs
+++ b/src/machine.rs
@@ -105,7 +105,7 @@ impl Machine {
             } else {
                 (dev.path.clone(), vec![])
             };
-            if let Err(e) = hpc3.add_scsi_device(id as usize, &path, dev.cdrom, discs) {
+            if let Err(e) = hpc3.add_scsi_device(id as usize, &path, dev.cdrom, discs, dev.overlay) {
                 println!("Note: Could not attach {} to SCSI ID {}: {}", path, id, e);
             }
         }
diff --git a/src/scsi.rs b/src/scsi.rs
index ab84fe7..90f93c6 100644
--- a/src/scsi.rs
+++ b/src/scsi.rs
@@ -1,5 +1,7 @@
 use std::fs::{File, OpenOptions};
-use std::io::{Read, Seek, SeekFrom, Write};
+use std::io::{self, Read, Seek, SeekFrom, Write};
+
+use crate::cow_disk::CowDisk;
 
 /// Get the standard CDB length based on the opcode's group code
 pub fn get_cdb_length(opcode: u8) -> usize {
@@ -59,8 +61,51 @@ pub struct ScsiResponse {
     pub data: Vec<u8>,   // Response data
 }
 
+/// Disk I/O backend: either direct file access or copy-on-write overlay.
+pub enum DiskBackend {
+    /// Direct access to a single file, opened read-only for CD-ROMs (the default backend).
+    Direct(File),
+    /// Copy-on-write: base image is read-only, writes go to overlay file.
+    Cow(CowDisk),
+}
+
+impl DiskBackend {
+    fn read_sectors(&mut self, lba: u64, count: usize) -> io::Result<Vec<u8>> {
+        match self {
+            DiskBackend::Direct(file) => {
+                let offset = lba * 512;
+                let total = count * 512;
+                file.seek(SeekFrom::Start(offset))?;
+                let mut data = vec![0u8; total];
+                file.read_exact(&mut data)?;
+                Ok(data)
+            }
+            DiskBackend::Cow(cow) => cow.read_sectors(lba, count),
+        }
+    }
+
+    fn write_sectors(&mut self, lba: u64, data: &[u8]) -> io::Result<()> {
+        match self {
+            DiskBackend::Direct(file) => {
+                let offset = lba * 512;
+                file.seek(SeekFrom::Start(offset))?;
+                file.write_all(data)?;
+                Ok(())
+            }
+            DiskBackend::Cow(cow) => cow.write_sectors(lba, data),
+        }
+    }
+
+    fn size(&self) -> u64 {
+        match self {
+            DiskBackend::Direct(file) => file.metadata().map(|m| m.len()).unwrap_or(0),
+            DiskBackend::Cow(cow) => cow.size(),
+        }
+    }
+}
+
 pub struct ScsiDevice {
-    file: File,
+    backend: DiskBackend,
     size: u64,
     is_cdrom: bool,
     /// Path of the currently mounted image.
@@ -77,9 +122,9 @@ pub struct ScsiDevice {
 const SCSI_BUFFER_SIZE: usize = 0x4000; // 16KB (16384 bytes)
 
 impl ScsiDevice {
-    pub fn new(file: File, size: u64, is_cdrom: bool, filename: String, discs: Vec<String>) -> Self {
+    pub fn new(backend: DiskBackend, size: u64, is_cdrom: bool, filename: String, discs: Vec<String>) -> Self {
         Self {
-            file,
+            backend,
             size,
             is_cdrom,
             filename,
@@ -90,6 +135,36 @@ impl ScsiDevice {
         }
     }
 
+    /// Commit the COW overlay to the base image. No-op if not using COW.
+    /// Returns the number of sectors committed, or 0 if direct mode.
+    pub fn cow_commit(&mut self) -> io::Result<usize> {
+        match &mut self.backend {
+            DiskBackend::Cow(cow) => cow.commit(),
+            DiskBackend::Direct(_) => Ok(0),
+        }
+    }
+
+    /// Reset the COW overlay (discard all writes). No-op if not using COW.
+    pub fn cow_reset(&mut self) -> io::Result<()> {
+        match &mut self.backend {
+            DiskBackend::Cow(cow) => cow.reset_overlay(),
+            DiskBackend::Direct(_) => Ok(()),
+        }
+    }
+
+    /// Number of dirty sectors in the COW overlay, or 0 if direct mode.
+    pub fn cow_dirty_count(&self) -> usize {
+        match &self.backend {
+            DiskBackend::Cow(cow) => cow.dirty_count(),
+            DiskBackend::Direct(_) => 0,
+        }
+    }
+
+    /// Whether this device is using COW overlay mode.
+    pub fn is_cow(&self) -> bool {
+        matches!(&self.backend, DiskBackend::Cow(_))
+    }
+
     /// Advance to the next disc in the list (wraps around).
     /// Returns the new active disc path, or None if this is not a CD-ROM
     /// or there is only one disc.
@@ -105,7 +180,7 @@ impl ScsiDevice {
         match OpenOptions::new().read(true).open(&next_path) {
             Ok(f) => {
                 let size = f.metadata().map(|m| m.len()).unwrap_or(0);
-                self.file = f;
+                self.backend = DiskBackend::Direct(f);
                 self.size = size;
                 self.filename = next_path.clone();
                 self.unit_attention = true; // signal medium change on next command
@@ -284,13 +359,7 @@ impl ScsiDevice {
     }
 
     fn perform_read(&mut self, lba: u64, count: usize) -> Result<ScsiResponse, std::io::Error> {
-        let offset = lba * 512;
-        let total = count * 512;
-
-        self.file.seek(SeekFrom::Start(offset))?;
-        let mut data = vec![0u8; total];
-        self.file.read_exact(&mut data)?;
-
+        let data = self.backend.read_sectors(lba, count)?;
         Ok(ScsiResponse {
             status: 0x00,
             data,
@@ -332,9 +401,7 @@ impl ScsiDevice {
             });
         }
 
-        let offset = lba * 512;
-        self.file.seek(SeekFrom::Start(offset))?;
-        self.file.write_all(data)?;
+        self.backend.write_sectors(lba, data)?;
 
         Ok(ScsiResponse {
             status: 0x00,
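Reviewer note: the hunks above route all sector I/O through `self.backend`, so reads and writes no longer touch the file directly. The sketch below is a rough in-memory model of the copy-on-write idea behind that change, not the PR's `CowDisk`. The `MemCow` type, its fields and its method bodies are assumptions made for illustration; the real overlay is file-backed and may differ in layout and error handling.

```rust
use std::collections::HashMap;

const SECTOR: usize = 512;

/// Hypothetical in-memory stand-in for a COW disk (illustration only).
struct MemCow {
    base: Vec<u8>,                       // pristine image; writes never touch it until commit
    overlay: HashMap<u64, [u8; SECTOR]>, // dirty sectors keyed by LBA
}

impl MemCow {
    fn new(base: Vec<u8>) -> Self {
        Self { base, overlay: HashMap::new() }
    }

    /// Read `count` sectors, preferring overlay data over the base image.
    /// Panics if the range runs past the end of `base` (the sketch skips error handling).
    fn read_sectors(&self, lba: u64, count: usize) -> Vec<u8> {
        let mut out = Vec::with_capacity(count * SECTOR);
        for i in 0..count as u64 {
            let cur = lba + i;
            match self.overlay.get(&cur) {
                Some(sector) => out.extend_from_slice(sector),
                None => {
                    let off = cur as usize * SECTOR;
                    out.extend_from_slice(&self.base[off..off + SECTOR]);
                }
            }
        }
        out
    }

    /// Store whole sectors in the overlay; a trailing partial sector is ignored.
    fn write_sectors(&mut self, lba: u64, data: &[u8]) {
        for (i, chunk) in data.chunks_exact(SECTOR).enumerate() {
            let mut sector = [0u8; SECTOR];
            sector.copy_from_slice(chunk);
            self.overlay.insert(lba + i as u64, sector);
        }
    }

    /// Number of sectors currently held in the overlay.
    fn dirty_count(&self) -> usize {
        self.overlay.len()
    }

    /// Discard every pending write.
    fn reset(&mut self) {
        self.overlay.clear();
    }

    /// Fold the overlay into the base image and return how many sectors moved.
    fn commit(&mut self) -> usize {
        let n = self.overlay.len();
        for (lba, sector) in self.overlay.drain() {
            let off = lba as usize * SECTOR;
            self.base[off..off + SECTOR].copy_from_slice(&sector);
        }
        n
    }
}
```

In this model, `reset` corresponds to the behaviour `cow_reset` documents (all writes discarded), and `dirty_count` to what `cow_dirty_count` reports for the monitor's `cow status` command.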
diff --git a/src/wd33c93a.rs b/src/wd33c93a.rs
index 259e575..c1f27be 100644
--- a/src/wd33c93a.rs
+++ b/src/wd33c93a.rs
@@ -234,19 +234,29 @@ impl Wd33c93a {
     /// For CD-ROMs, `discs` is the full ordered list of ISO paths; the first
     /// entry is mounted immediately.  For HDDs `discs` is ignored — only
     /// `path` is used.
-    pub fn add_device(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>) -> std::io::Result<()> {
-        let file = OpenOptions::new()
-            .read(true)
-            .write(!is_cdrom)
-            .open(path)?;
-        let metadata = file.metadata()?;
-        let size = metadata.len();
+    pub fn add_device(&self, id: usize, path: &str, is_cdrom: bool, discs: Vec<String>, overlay: bool) -> std::io::Result<()> {
+        use crate::cow_disk::CowDisk;
+        use crate::scsi::DiskBackend;
+
+        let (backend, size) = if overlay && !is_cdrom {
+            let overlay_path = format!("{}.overlay", path);
+            let cow = CowDisk::new(path, &overlay_path)?;
+            let sz = cow.size();
+            (DiskBackend::Cow(cow), sz)
+        } else {
+            let file = std::fs::OpenOptions::new()
+                .read(true)
+                .write(!is_cdrom)
+                .open(path)?;
+            let sz = file.metadata()?.len();
+            (DiskBackend::Direct(file), sz)
+        };
 
         let disc_list = if is_cdrom { discs } else { vec![] };
 
         let mut state = self.state.lock();
         if id < 8 {
-            state.devices[id] = Some(ScsiDevice::new(file, size, is_cdrom, path.to_string(), disc_list));
+            state.devices[id] = Some(ScsiDevice::new(backend, size, is_cdrom, path.to_string(), disc_list));
         }
         Ok(())
     }
@@ -486,6 +496,7 @@ impl Device for Wd33c93a {
     fn register_commands(&self) -> Vec<(String, String)> {
         vec![
             ("scsi".to_string(), "SCSI commands: scsi status | scsi eject <id> | scsi debug <on|off> [DEV]".to_string()),
+            ("cow".to_string(), "COW overlay: cow status | cow commit [id] | cow reset [id]".to_string()),
         ]
     }
 
@@ -529,6 +540,57 @@ impl Device for Wd33c93a {
                 _ => return Err("Usage: scsi debug <on|off> | scsi status | scsi eject <id>".to_string()),
             }
         }
+        if cmd == "cow" {
+            let mut state = self.state.lock();
+            match args.first().copied() {
+                Some("status") => {
+                    for (id, dev) in state.devices.iter().enumerate() {
+                        if let Some(d) = dev {
+                            if d.is_cow() {
+                                writeln!(writer, "SCSI {}: COW overlay, {} dirty sectors", id, d.cow_dirty_count()).unwrap();
+                            } else {
+                                writeln!(writer, "SCSI {}: direct (no overlay)", id).unwrap();
+                            }
+                        }
+                    }
+                    return Ok(());
+                }
+                Some("commit") => {
+                    let ids: Vec<usize> = if let Some(id_str) = args.get(1) {
+                        vec![id_str.parse::<usize>().ok().filter(|&i| i < 8).ok_or_else(|| "invalid SCSI ID".to_string())?]
+                    } else {
+                        (0..8).filter(|&i| state.devices[i].as_ref().map(|d| d.is_cow()).unwrap_or(false)).collect()
+                    };
+                    for id in ids {
+                        if let Some(dev) = &mut state.devices[id] {
+                            match dev.cow_commit() {
+                                Ok(n) if n > 0 => writeln!(writer, "SCSI {}: committed {} sectors to base image", id, n).unwrap(),
+                                Ok(_) => writeln!(writer, "SCSI {}: nothing to commit", id).unwrap(),
+                                Err(e) => writeln!(writer, "SCSI {}: commit failed: {}", id, e).unwrap(),
+                            }
+                        }
+                    }
+                    return Ok(());
+                }
+                Some("reset") => {
+                    let ids: Vec<usize> = if let Some(id_str) = args.get(1) {
+                        vec![id_str.parse::<usize>().ok().filter(|&i| i < 8).ok_or_else(|| "invalid SCSI ID".to_string())?]
+                    } else {
+                        (0..8).filter(|&i| state.devices[i].as_ref().map(|d| d.is_cow()).unwrap_or(false)).collect()
+                    };
+                    for id in ids {
+                        if let Some(dev) = &mut state.devices[id] {
+                            match dev.cow_reset() {
+                                Ok(()) => writeln!(writer, "SCSI {}: overlay reset (all writes discarded)", id).unwrap(),
+                                Err(e) => writeln!(writer, "SCSI {}: reset failed: {}", id, e).unwrap(),
+                            }
+                        }
+                    }
+                    return Ok(());
+                }
+                _ => return Err("Usage: cow status | cow commit [id] | cow reset [id]".to_string()),
+            }
+        }
         Err("Command not found".to_string())
     }
 }
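Reviewer note: `cow commit [id]` and `cow reset [id]` both take a user-supplied SCSI ID and use it to index `state.devices`, so the ID has to stay inside the device table. A small shared helper could centralize the parsing and the range check. The sketch below is only a suggestion: `parse_scsi_id` and `MAX_SCSI_IDS` are hypothetical names, and the limit of 8 is inferred from the `id < 8` check and the `0..8` ranges in this diff.

```rust
/// Number of SCSI target slots (assumed from `id < 8` and `0..8` in the diff).
const MAX_SCSI_IDS: usize = 8;

/// Hypothetical helper: parse a monitor-supplied SCSI ID and reject
/// anything that is not a number in 0..MAX_SCSI_IDS.
fn parse_scsi_id(s: &str) -> Result<usize, String> {
    let id: usize = s
        .trim()
        .parse()
        .map_err(|_| "invalid SCSI ID".to_string())?;
    if id < MAX_SCSI_IDS {
        Ok(id)
    } else {
        Err(format!("SCSI ID out of range (0-{})", MAX_SCSI_IDS - 1))
    }
}
```

Both handlers could then build their ID list with `vec![parse_scsi_id(id_str)?]`, which also removes the duplicated parsing expression.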