Summary
On Copilot CLI 1.0.21 (Linux), a long-lived or frequently resumed session can accumulate a very large ~/.copilot/session-state/[session-id]/events.jsonl file. Once the file is large enough (~120 MiB / ~37k lines in my case), subsequent event writes begin failing permanently with:
Failed to flush events for [session-id]:
Error: Failed to append to JSONL file
~/.copilot/session-state/[session-id]/events.jsonl:
Error: timeout while waiting for mutex to become available
Multiple stale inuse.[pid].lock files also remain in the session directory after the owning processes have exited, suggesting the cleanup path does not run in every exit scenario.
Environment
| Detail | Value |
| --- | --- |
| Copilot CLI version | 1.0.21 |
| OS | Linux (Ubuntu) |
| Shell | Bash inside tmux |
Steps to Reproduce
- Start a Copilot CLI session and keep resuming / reusing the same session over an extended period (days to weeks).
- Allow ~/.copilot/session-state/[session-id]/events.jsonl to grow large (observed at ~123 MiB).
- Open or resume the same session from additional Copilot CLI processes.
- Eventually, event flush calls begin failing with the mutex-timeout error above.
Note: I have not isolated a minimal deterministic repro yet; the issue emerged organically during normal extended use.
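To put numbers on the file growth in step 2, a small script along these lines can report the size and line count without loading the whole file into memory. This is a sketch for diagnosis only, not part of Copilot CLI; `jsonl_stats` is a hypothetical helper name, and the caller supplies the path.

```python
from pathlib import Path


def jsonl_stats(path: Path) -> tuple[int, int]:
    """Return (size_in_bytes, newline_count) for a JSONL file.

    The file is read in 1 MiB chunks so a very large file does not
    need to fit in memory. A final record without a trailing newline
    is not counted.
    """
    size = path.stat().st_size
    lines = 0
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            lines += chunk.count(b"\n")
    return size, lines
```

Calling `jsonl_stats(Path.home() / ".copilot" / "session-state" / session_id / "events.jsonl")`, with `session_id` set to the affected session, gives the two figures quoted in the summary.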
Observed Behavior
- Repeated mutex-timeout errors in multiple ~/.copilot/logs/process-*.log files, from at least two different PIDs, so this is not a single transient failure.
- Stale lock files remain after the processes exit:
~/.copilot/session-state/[session-id]/inuse.[pid-a].lock
~/.copilot/session-state/[session-id]/inuse.[pid-b].lock
~/.copilot/session-state/[session-id]/inuse.[pid-c].lock
Each file contains only its PID. ps -p [pid] confirms none of these processes are still running.
- The session's first events.jsonl record shows "alreadyInUse": false; a later session.resume record shows "alreadyInUse": true, confirming overlapping access.
- No matching errors appeared in ~/.copilot/logs/copilot.log; they show up only in the per-process logs.
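The manual `ps -p [pid]` check on the lock files can be automated roughly as follows. This sketch assumes only what was observed above: each inuse.[pid].lock file contains nothing but its PID. The names `pid_is_alive` and `find_stale_locks` are hypothetical, and the liveness check relies on POSIX signal 0, so it applies to Linux and macOS only.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Best-effort liveness check on POSIX using signal 0."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_stale_locks(session_dir: Path) -> list[Path]:
    """Return lock files whose recorded PID no longer exists."""
    stale = []
    for lock in sorted(session_dir.glob("inuse.*.lock")):
        try:
            pid = int(lock.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or unexpected content: leave it alone
        if pid <= 0:
            continue  # never signal process groups by accident
        if not pid_is_alive(pid):
            stale.append(lock)
    return stale
```

The function only reports candidates and does not delete anything. PIDs can be reused by unrelated processes, so a lock reported as live may still be stale.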
Expected Behavior
- Event appends should not permanently fail due to mutex contention on a large file.
- Stale inuse.*.lock files should be cleaned up when their owning process exits (gracefully or via signal).
- Ideally, events.jsonl should be rotated, truncated, or otherwise bounded to prevent unbounded growth.
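As an illustration of what "bounded" could mean, here is a minimal size-based rotation sketch. It is not how Copilot CLI writes events, and I do not know whether resume could work across a rotated file. The function name `append_with_rotation`, the `max_bytes` parameter, and the `.1` suffix are all assumptions made for the example.

```python
import os
from pathlib import Path


def append_with_rotation(path: Path, line: str, max_bytes: int) -> None:
    """Append one JSONL record, rotating first if the file would exceed max_bytes."""
    data = (line.rstrip("\n") + "\n").encode("utf-8")
    try:
        current = path.stat().st_size
    except FileNotFoundError:
        current = 0
    if current > 0 and current + len(data) > max_bytes:
        # Keep a single previous generation. os.replace overwrites any
        # existing ".1" file and is atomic on POSIX filesystems.
        os.replace(path, path.with_name(path.name + ".1"))
    with path.open("ab") as fh:
        fh.write(data)
```

This sketch does no locking of its own; with several writers, the size check and rename would need to happen under the same lock as the append.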
Diagnostic Clues for Maintainers
- The failure is on the write path (appending events), not the read/resume path.
- The mutex implementation appears to use a file-based lock with a fixed timeout; on a large file the append may exceed the timeout window, or a previously-crashed process's lock may never be released.
- Checking whether the lock acquisition timeout scales with file size, and whether stale-lock detection runs before acquisition, would likely pinpoint the root cause.
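To make the last two clues concrete, the following sketch shows one way a file-based lock can check for a dead owner before it waits out its timeout. It is a guess at the general pattern, not the CLI's actual mutex: `acquire_lock`, `_owner_alive`, the PID-in-file convention, and the default timeout are all assumptions, and the code is POSIX-only.

```python
import os
import time
from pathlib import Path


def _owner_alive(lock: Path) -> bool:
    """Return False only when the lock names a PID that no longer exists."""
    try:
        pid = int(lock.read_text().strip())
    except (OSError, ValueError):
        return True  # unreadable or half-written: assume it is held
    if pid <= 0:
        return True
    try:
        os.kill(pid, 0)  # signal 0 checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def acquire_lock(lock: Path, timeout: float = 5.0, poll: float = 0.05) -> None:
    """Create the lock file exclusively, reclaiming it if its owner is dead."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            if not _owner_alive(lock):
                # Stale lock: remove it and retry immediately.
                # Simplification: two waiters can race here, so a real
                # implementation would need an atomic reclaim step.
                lock.unlink(missing_ok=True)
                continue
            if time.monotonic() >= deadline:
                raise TimeoutError(f"timed out waiting for {lock}")
            time.sleep(poll)
        else:
            with os.fdopen(fd, "w") as fh:
                fh.write(str(os.getpid()))
            return
```

With a check like this, a lock left behind by a crashed process is reclaimed on the first attempt instead of causing every later writer to time out. If the real implementation lacks such a check, that alone could explain the permanent failures regardless of file size.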
Related Issues
| Issue | Relevance |
| --- | --- |
| #2209 | Large events.jsonl / long-lived session corruption (read-path focus) |
| #2490 | Session corruption with large event files (read-path focus) |
| #1790 | Feature request: clean up stale inuse.*.lock files |
| #2323 | Long-lived / sub-agent session-state corruption |
| #2543 | Session-state corruption in sub-agent scenarios |
| #2217 | Crash resilience / events corruption |
This issue differs from all of the above because the primary symptom is write-path mutex contention (not read-path corruption or resume failures), combined with stale lock files that are never reclaimed.
Suggested Labels
bug, session-state, events
Filed from sanitized local evidence. No private repository names, absolute home paths beyond ~/.copilot/..., or credentials are included.