diff --git a/doc/tpm.md b/doc/tpm.md
index 90f7ec064..c9b3fc308 100644
--- a/doc/tpm.md
+++ b/doc/tpm.md
@@ -10,8 +10,35 @@ See also: [architecture.md](architecture.md), [boot-process.md](boot-process.md)
 ## tpmr — unified TPM abstraction
 
 `initrd/bin/tpmr.sh` is a shell script wrapper that presents a single interface
-over both TPM 1.2 (`tpm` / `trousers`) and TPM 2.0 (`tpm2-tools`). All Heads
-scripts call `tpmr.sh` rather than invoking `tpm` or `tpm2` directly.
+over both TPM 1.2 and TPM 2.0. All Heads scripts call `tpmr.sh` rather than
+invoking TPM tools directly.
+
+### Boot chain and TPM tool selection
+
+```text
+initrd/init (PID 1)
+  └─ CONFIG_BOOTSCRIPT → /bin/gui-init.sh          [board config]
+       ├─ source /etc/functions.sh                 [shared TPM helpers]
+       ├─ source /etc/gui_functions.sh             [whiptail wrappers]
+       └─ calls initrd/bin/tpmr.sh                 [TPM abstraction]
+            ├─ TPM1: calls `tpm` (tpmtotp util/tpm)               [CONFIG_TPM2_TOOLS != y]
+            │    modules/tpmtotp → output: totp hotp qrenc util/tpm
+            │
+            └─ TPM2: calls `tpm2` (single binary, subcommands)    [CONFIG_TPM2_TOOLS=y]
+                 modules/tpm2-tss + modules/tpm2-tools
+```
+
+TPM1 support comes exclusively from the `tpmtotp` module (`modules/tpmtotp`),
+which builds `util/tpm` as part of its outputs. This binary is installed to
+the initrd as `tpm` and supports subcommands such as `physicalpresence`,
+`forceclear`, `takeown -pwdo`, `counter_create`, `counter_increment`, etc.
+
+TPM2 support comes from `modules/tpm2-tss` (TSS software stack) and
+`modules/tpm2-tools` (`tpm2` binary with subcommands like `getcap`,
+`nvdefine`, `nvincrement`).
+
+Both TPM1 and TPM2 boards may also enable `CONFIG_TPMTOTP=y` for the
+`totp` and `hotp` utilities, which are independent of the TPM version.
 
 ### PCR sizes
 
@@ -398,3 +425,183 @@ To verify that a new board's coreboot config matches the expected RoT:
 | Auth sessions | Not used | Required for policy-based unseal |
 | `kexec_finalize` | No-op | Extends PCRs, then `tpm2 shutdown` |
 | `startsession` | No-op | Creates encryption session |
+
+### TPM1 auth retry and error detection
+
+`_tpm_auth_retry()` in `initrd/bin/tpmr.sh` provides shared retry logic for
+both TPM1 and TPM2 operations that need authorization. On auth failure
+(wrong passphrase), the passphrase cache is shredded and the user is
+re-prompted up to 3 times before giving up.
+
+Auth failure is detected by grepping the command output for known error
+patterns. TPM1 (tpmtotp) errors go to stdout via `printf()` with
+`TPM_GetErrMsg()` strings. TPM2 (tpm2-tools) errors go to stderr via
+`LOG_ERR()` and may include raw TPM response codes.
+
+| Pattern | Type | TPM version | Example error |
+| --- | --- | --- | --- |
+| `authorization\|auth\|bad\|permission` | English words | TPM1+TPM2 | `TPM_AUTHFAIL`, `bad passphrase` |
+| `defend` | English word | TPM1 | `Defend lock running` |
+| `0x98e\|0x149` | Hex codes | TPM2 | `TPM2_RC_AUTH_FAIL`, `TPM2_RC_NV_AUTHORIZATION` |
+
+### TPM1 reset defend lock
+
+`TPM_DEFEND_LOCK_RUNNING` (`tpm_error.h`: `TPM_BASE + TPM_NON_FATAL + 3`)
+is a standard TPM 1.2 error raised when the TPM's dictionary-attack
+protection is active. After too many failed authorization attempts, the
+TPM enters a time-out period and refuses all authorization operations --
+including `tpm takeown` even after a successful `tpm forceclear`
+(forceclear clears the owner but not the dictionary attack counter on
+some implementations, particularly Infineon TPMs).
+
+tpmtotp's `tpm takeown` outputs:
+```
+Error Defend lock running from TPM_TakeOwnership
+```
+
+`tpm1_reset()` in `initrd/bin/tpmr.sh` detects "defend lock" in the
+`takeown` output and attempts one recovery: cycling physical presence
+(`physicaldisable` / `physicalenable` / `physicalpresence` /
+`physicalsetdeactivated`) to re-assert PP before retrying `takeown`.
+This works on some chipsets where software presence was not properly
+honoured by the first `forceclear`.
+
+If PP cycling also fails, no software-based recovery is available.
+Further attempts (second forceclear, `TPM_ResetLockValue` with empty
+auth, sleep+retry) will not help. Use `tpmr.sh da_state` from the
+recovery shell to check the current DA state:
+
+- **TPM1**: `actionDependValue` reports remaining lockout seconds.
+- **TPM2**: the human-readable summary shows estimated unlock time
+  based on `recoveryTime` (seconds before one failure is forgotten).
+
+Alternatively, reset the TPM to clear the DA state entirely:
+`tpm-reset.sh` from the recovery shell, or GUI menu `Options ->
+TPM/TOTP/HOTP Options -> Reset the TPM` for full reprovision.
+
+#### DA lockout duration escalation
+
+TPM 1.2 dictionary attack timeouts escalate with the failure count
+(approximate; varies by vendor and TPM firmware version per Dell and
+Microsoft documentation):
+
+| Failures accumulated | Typical lockout time |
+|---------------------|---------------------|
+| 1-2 | None (counter only) |
+| 3-5 | 10 seconds |
+| 6-9 | 1 hour |
+| 10-12 | Several hours |
+| 13+ | Up to 24 hours |
+
+Each time the TPM fully locks out and the timer expires, the DA counter
+resets. If failures continue to accumulate across boots without
+waiting for the timer to expire, the escalation can reach 24 hours.
+This is what happened with the counter auth regression (3 failures per
+boot x many boots): the DA counter reached the maximum threshold.
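The escalation table above can be read as a simple lookup. The following is only an illustrative sketch: `da_lockout_estimate` is a hypothetical helper, not part of Heads, and the durations are the approximate values from the table, which vary by vendor and firmware.

```shell
#!/bin/sh
# Hypothetical helper (not part of Heads): maps an accumulated failure
# count to the typical lockout time listed in the table above.
# Real durations depend on the TPM vendor and firmware version.
da_lockout_estimate() {
    failures="$1"
    # Reject empty or non-numeric input.
    case "$failures" in
    '' | *[!0-9]*)
        echo "unknown"
        return 1
        ;;
    esac
    if [ "$failures" -le 2 ]; then
        echo "none (counter only)"
    elif [ "$failures" -le 5 ]; then
        echo "10 seconds"
    elif [ "$failures" -le 9 ]; then
        echo "1 hour"
    elif [ "$failures" -le 12 ]; then
        echo "several hours"
    else
        echo "up to 24 hours"
    fi
}

# The regression produced 3 failures per boot, so without waiting out the
# timer the accumulated count after N boots is 3 * N.
boots=5
da_lockout_estimate $((3 * boots))
```

With 3 failures per boot, the sketch reaches the top tier after five boots in a row, which matches the escalation described for the counter auth regression.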
+
+#### Diagnosing DA state
+
+Use `tpmr.sh da_state` from the recovery shell to query the current DA
+state. Available for both TPM1 and TPM2:
+
+| Information | TPM1 | TPM2 |
+|-------------|------|------|
+| Locked? | `state`: 0=inactive, 1=locked | `TPM2_PT_LOCKOUT_COUNTER` > `TPM2_PT_MAX_AUTH_FAIL` |
+| Current failures | `currentCount` | `TPM2_PT_LOCKOUT_COUNTER` |
+| Lockout threshold | `thresholdCount` | `TPM2_PT_MAX_AUTH_FAIL` |
+| Lockout interval | -- | `TPM2_PT_LOCKOUT_INTERVAL` |
+| Time remaining | `actionDependValue` (seconds) | Estimate from `LOCKOUT_COUNTER` vs `MAX_AUTH_FAIL` times `LOCKOUT_INTERVAL` |
+
+The recovery shell can run `tpmr.sh da_state` at any time to check
+whether the TPM is locked and how much lockout time remains.
+
+Note: TPM1 DA state query relies on `TPM_CAP_DA_LOGIC` (0x19), a late
+TPM 1.2 spec addition (rev 103). Older Infineon TPMs (e.g. X230-era
+SLB9635/9645) and some Atmel chips do not support this capability.
+On such hardware `da_state` returns `unavailable` and the preflight
+guard silently skips; TPM2 is unaffected.
+
+#### DA parameter configurability
+
+TPM2 DA parameters are configured during `tpm2_reset()` (called by
+`tpm-reset.sh` and the GUI `reset_tpm()`). Heads sets:
+- `maxTries=10`: auth failures before lockout
+- `recoveryTime=3600`: seconds before one failure is forgotten (counter
+  decrements by 1 per interval)
+- `lockoutRecovery=0`: seconds lockout auth is blocked after a failure
+
+TPM1 has no software-accessible command to configure DA parameters
+(tpmtotp's `setcapability` does not expose DA threshold or timeout
+sub-capabilities). The DA policy is determined by the TPM firmware
+and cannot be changed through software on TPM1.
+
+#### Testing DA lockout
+
+Use `tpmr.sh bad_auth` from the recovery shell to test dictionary attack
+lockout behavior by deliberately triggering an auth failure:
+
+- **TPM1**: increments the rollback counter with a wrong password via
+  `tpm counter_increment -pwdc `.
+  Each call increments the DA
+  counter by 1 until lockout is triggered at `currentCount >= thresholdCount`.
+- **TPM2**: increments an existing NV counter with a wrong password via
+  `tpm2 nvincrement -P `. Requires a counter created by a prior
+  `reset_tpm()` GUI flow. Each call increments `TPM2_PT_LOCKOUT_COUNTER`
+  by 1 until lockout is triggered.
+
+Both show DA state before and after the attempt, and the `DA:` machine
+line is logged by the preflight guard in `increment_tpm_counter`.
+
+#### Preventing future lockouts
+
+Heads' counter auth regression caused 3 TPM auth failures per boot by
+passing the owner passphrase as the counter auth while the counter was
+created with empty auth. Restoring empty counter auth for both creation
+and increment (as per TCG spec) prevents auth failures from counter
+operations. All TPM1 boards that ran the regression code are affected
+identically; this is not platform-specific.
+
+### TPM1 physical presence
+
+TPM1.2 forceclear requires physical presence to be asserted. The
+`tpm1_reset()` function does this with `tpm physicalpresence -s` (software
+presence). On some platforms (e.g., Dell OptiPlex, some Infineon TPMs),
+software physical presence may not work: the TPM firmware only accepts
+hardware-asserted presence (GPIO set by BIOS). In that case, `forceclear`
+returns success but may not fully reset the TPM, or `takeown` may fail
+with unexpected errors.
+
+When software physical presence fails, the LOG shows:
+```
+tpm1_reset: unable to set physical presence
+```
+
+This is logged but not fatal: `tpm forceclear` is still attempted.
+If the TPM firmware ignores software physical presence, the reset fails
+and the user must use the platform's hardware TPM reset mechanism
+(typically a BIOS option or jumper).
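The "logged but not fatal" behaviour described above can be sketched as follows. This is a reduced illustration, not the real `tpm1_reset()`: `reset_sketch` is a hypothetical name, and the `tpm` function is a stub that simulates a platform rejecting software physical presence so the example runs without hardware.

```shell
#!/bin/sh
# Stub standing in for the tpmtotp `tpm` binary (assumption for this
# sketch only). It simulates firmware that rejects software physical
# presence while still reporting success for forceclear.
tpm() {
    case "$1" in
    physicalpresence) return 1 ;;
    *) return 0 ;;
    esac
}

# Hypothetical reduction of the flow described above; the real
# tpm1_reset() in initrd/bin/tpmr.sh does considerably more.
reset_sketch() {
    if ! tpm physicalpresence -s; then
        # Logged, but deliberately not treated as fatal.
        echo "tpm1_reset: unable to set physical presence" >&2
    fi
    # forceclear is attempted regardless of the presence result.
    tpm forceclear
}

reset_sketch && echo "forceclear attempted"
```

A successful return from `forceclear` here says nothing about whether the TPM was actually reset, which is why the text above points to the platform's hardware reset mechanism as the fallback.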
+
+### TPM reset methods
+
+Heads has two TPM reset methods with different scope:
+
+**`tpm-reset.sh`** (CLI, recovery shell):
+- Prompts for new owner passphrase, calls `tpmr.sh reset`
+- TPM clear + re-ownership only
+- No counter creation, no /boot signing, no TOTP/HOTP generation
+- Intended for headless recovery or clearing a defend lock before running
+  the full GUI flow
+
+**`reset_tpm()`** (GUI, via Options -> TPM/TOTP/HOTP -> Reset the TPM in
+`initrd/bin/gui-init.sh`):
+- Prompts for new owner passphrase, calls `tpmr.sh reset`
+- Removes stale `/boot/kexec_rollback.txt` and `/boot/kexec_primhdl_hash.txt`
+- Creates new TPM rollback counter via `check_tpm_counter()`
+- Increments the new counter
+- Re-signs /boot with the GPG signing key
+- Generates new TOTP/HOTP secrets
+- Reseals TPM Disk Unlock Key (DUK) to LUKS
+- Regenerates TPM2 encrypted sessions
+
+After `tpm-reset.sh`, the TPM is cleared but the system is not fully
+provisioned; the user must complete the GUI `reset_tpm()` or OEM Factory
+Reset to restore counter, signing, and secrets.
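For the pattern table in "TPM1 auth retry and error detection", a minimal sketch of the classification step may help reviewers. The helper names `is_auth_failure` and `classify` are hypothetical; the regular expression is the one used by the `grep -qiE` call in `tpm2_counter_inc` after this change, and the real retry loop also shreds the passphrase cache and re-prompts.

```shell
#!/bin/sh
# Pattern taken from the grep in tpm2_counter_inc after this change.
AUTH_FAIL_PATTERN='authorization|auth|bad|permission|defend|0x98e|0x149'

# Hypothetical helper: succeeds when the captured tool output looks like
# an authorization failure (case-insensitive, extended regex).
is_auth_failure() {
    printf '%s\n' "$1" | grep -qiE "$AUTH_FAIL_PATTERN"
}

# Hypothetical wrapper that only reports which branch would be taken.
classify() {
    if is_auth_failure "$1"; then
        echo "auth failure: shred cache and re-prompt"
    else
        echo "other error: give up"
    fi
}

classify "Error Defend lock running from TPM_TakeOwnership"
classify "No such device"
```

Because the match is a plain substring search, short alternatives such as `bad` or `auth` can also match unrelated messages; that trade-off is inherent in the grep-based approach rather than something this sketch adds.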
diff --git a/initrd/bin/oem-factory-reset.sh b/initrd/bin/oem-factory-reset.sh
index ece2f8f59..5dc315965 100755
--- a/initrd/bin/oem-factory-reset.sh
+++ b/initrd/bin/oem-factory-reset.sh
@@ -868,7 +868,7 @@ generate_checksums() {
 	if [ "$CONFIG_TPM" = "y" ]; then
 		if [ "$CONFIG_IGNORE_ROLLBACK" != "y" ]; then
 			tpmr.sh counter_create \
-				-pwdc "${TPM_PASS:-}" \
+				-pwdc '' \
 				-la -3135106223 |
 				tee /tmp/counter >/dev/null 2>&1 ||
 				whiptail_error_die "Unable to create TPM counter"
diff --git a/initrd/bin/tpm-reset.sh b/initrd/bin/tpm-reset.sh
index 047d49ef0..426012863 100755
--- a/initrd/bin/tpm-reset.sh
+++ b/initrd/bin/tpm-reset.sh
@@ -6,3 +6,12 @@ NOTE "This will erase all keys and secrets from the TPM"
 prompt_new_owner_password
 
 tpmr.sh reset "$tpm_owner_passphrase"
+
+# TODO: move the TPM reset + full reprovision flow (counter creation, /boot
+# signing, TOTP/HOTP generation, DUK reseal) from gui-init.sh's reset_tpm()
+# into a reusable function in functions.sh. Then tpm-reset.sh and the GUI
+# reset_tpm() can both call the same code, eliminating the inconsistency
+# between CLI and GUI reset paths.
+
+NOTE "TPM cleared. The TPM rollback counter was destroyed. /boot/kexec_rollback.txt still references the old counter."
+NOTE "Restore full functionality from the GUI: Options -> TPM/TOTP/HOTP Options -> Reset the TPM"
diff --git a/initrd/bin/tpmr.sh b/initrd/bin/tpmr.sh
index 46f4581d8..c4aa98805 100755
--- a/initrd/bin/tpmr.sh
+++ b/initrd/bin/tpmr.sh
@@ -354,7 +354,7 @@ tpm2_counter_inc() {
 		rm -f "$tmp_err_file"
 		shred -n 10 -z -u /tmp/secret/tpm_owner_passphrase 2>/dev/null || true
 		DEBUG "tpm2_counter_inc attempt $attempt failed. Stderr: $tmp_err_content"
-		if ! echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|0x98e|0x149'; then
+		if ! echo "$tmp_err_content" | grep -qiE 'authorization|auth|bad|permission|defend|0x98e|0x149'; then
 			DIE "Can't increment TPM counter for $index, access denied."
 		fi
 		WARN "Authentication failed, retrying..."
@@ -370,16 +370,26 @@ tpm2_counter_inc() {
 # Caching: prompt_tpm_owner_password reuses cached passphrase if available.
 # On auth failure the cache is shredded; next prompt will ask the user.
 #
+# Error stream selection:
+#   TPM1 (tpmtotp): errors go to stdout via printf(); capture stdout+stderr
+#   TPM2 (tpm2-tools): errors go to stderr via LOG_ERR(); capture stderr only
+#
+# Auth detection grep patterns:
+#   English words - TPM1 (TPM_GetErrMsg returns "Authentication failed...")
+#                 - TPM2 (tpm2-tools LOG_ERR returns "TPM2_RC_AUTH_FAIL...")
+#   defend        - TPM1 "Defend lock running" (TPM_DEFEND_LOCK_RUNNING)
+#   0x98e, 0x149  - TPM2 raw hex codes (TPM2_RC_AUTH_FAIL, TPM2_RC_NV_AUTHORIZATION)
+#
 # Usage: _tpm_auth_retry