Fix CPU spike while streaming tool-call arguments#455

Open
itkonen wants to merge 1 commit into editor-code-assistant:master from itkonen:fix/tool-call-stream-cpu
Conversation

Contributor

itkonen commented May 9, 2026

I noticed that ECA could become very CPU-heavy while the model was streaming tool calls, especially when writing or editing files. In those cases the ECA process could reach around 100% CPU and the workflow became noticeably slower.

The issue seemed specific to streamed tool-call arguments. Normal assistant text streaming did not show the same behavior; the slowdown was most visible when the model was gradually building a tool call such as write_file or edit_file, where the arguments grow chunk by chunk.

I asked Codex/ECA to benchmark this path and identify the hotspot. After applying this change, I tested the same workflow locally and the CPU load is now negligible. Tool-call streaming now appears to keep up with the LLM instead of the CPU being the bottleneck.


AI-generated technical summary

Problem

While streaming tool-call arguments, :on-prepare-tool-call recomputed the full tool list for every streamed argument delta:

(f.tools/all-tools chat-id agent @db* config)

This is expensive because f.tools/all-tools rebuilds native tools, MCP tools, schemas, dynamic descriptions, disabled-tool filtering, approval filtering, and subagent filtering.

For large tool calls such as write_file, providers may stream the file content as tool-call argument JSON. That means this callback can run hundreds or thousands of times for a single tool call.
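A minimal sketch of the hot path described above, assuming the callback shape from this PR; the binding names (`chat-id`, `agent`, `db*`, `config`) come from the snippet shown earlier, and `handle-prepare!` is a hypothetical stand-in for the real prepare handling:

```clojure
;; Before: every streamed argument delta rebuilds the full tool list.
;; f.tools/all-tools walks native tools, MCP tools, schemas, and all
;; the filtering layers, so this is ~10 ms per invocation.
{:on-prepare-tool-call
 (fn [{:keys [id name arguments-text]}]
   (let [all-tools (f.tools/all-tools chat-id agent @db* config)]
     (handle-prepare! id name arguments-text all-tools)))}
```

With a large `write_file` call streamed as hundreds of argument chunks, this callback fires once per chunk, multiplying that cost.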

Root cause

The prompt flow already computes all-tools once before sending the request to the provider:

:tools all-tools

The model generates tool calls based on that prompt-turn tool snapshot. However, the streamed prepare callback was rebuilding the tool list again for every argument chunk.

Git history suggests this was likely introduced during a refactor rather than being intentional. Earlier code resolved streamed tool-call prepare events against the existing prompt-turn all-tools binding.

Fix

Reuse the prompt’s existing all-tools value inside :on-prepare-tool-call instead of recomputing it per streamed delta.

This makes streamed tool-call preparation resolve against the same tool list that was sent to the model for that prompt turn.
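The shape of the fix, sketched under the same assumptions (`send-prompt!` and `handle-prepare!` are hypothetical names for the surrounding prompt machinery):

```clojure
;; After: all-tools is computed once per prompt turn (it is already
;; needed for the request's :tools field) and the prepare callback
;; closes over that snapshot instead of recomputing it per delta.
(let [all-tools (f.tools/all-tools chat-id agent @db* config)]
  (send-prompt!
   {:tools all-tools
    :on-prepare-tool-call
    (fn [{:keys [id name arguments-text]}]
      ;; Reuse the prompt-turn snapshot; no per-chunk recompute.
      (handle-prepare! id name arguments-text all-tools))}))
```

Closing over the prompt-turn value also guarantees every chunk of a single streamed tool call resolves against the same tool list.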

Functional impact

This should preserve intended behavior.

Tool-call prepare events now use the prompt-turn tool snapshot, which is consistent with provider semantics: the model can only call tools that were included in the request at prompt start.

The only theoretical behavior change is that tool-list changes made during an active streaming response are not reflected in toolCallPrepare metadata until the next prompt turn. That seems preferable to resolving different chunks of the same streamed tool call against potentially different tool lists.

Benchmark evidence

A dev benchmark was added to simulate streamed tool-call argument chunks without involving a live LLM or editor.

Command:

clojure -M:dev -m eca.perf.stream-tool-calls 500 128

Representative results:

transition-prepare, noop messenger:
  500 chunks in ~6.59 ms

transition-prepare, JSON serialization:
  500 chunks in ~25.45 ms

on-prepare-like, recomputing all-tools:
  500 chunks in ~5371.49 ms
  ~10.74 ms/chunk

on-prepare-like, cached tools:
  500 chunks in ~3.22 ms

on-prepare-like, recomputing all-tools + JSON serialization:
  500 chunks in ~5044.93 ms

on-prepare-like, cached tools + JSON serialization:
  500 chunks in ~5.65 ms

all-tools alone:
  500 calls in ~5022.95 ms
  ~10.05 ms/call

A larger cached-path run processed 5000 streamed chunks with JSON serialization in ~106 ms.
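The rough shape of such a benchmark loop, as a sketch only (the real implementation lives in `eca.perf.stream-tool-calls`; the function and key names here are illustrative):

```clojure
;; Feed n synthetic argument deltas of chunk-size chars through a
;; prepare-style callback and return elapsed wall time in ms.
(defn bench-prepare [n chunk-size prepare-fn]
  (let [chunk (apply str (repeat chunk-size "x"))
        start (System/nanoTime)]
    (dotimes [_ n]
      (prepare-fn {:id "tool-1"
                   :name "write_file"
                   :arguments-text chunk}))
    (/ (- (System/nanoTime) start) 1e6)))
```

Comparing a `prepare-fn` that recomputes `all-tools` against one that closes over a cached value is what produces the ~5371 ms vs ~3 ms split above.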

Test plan

clojure -M:test --focus eca.features.chat-test
clojure -M:test --focus eca.features.chat-tool-call-state-test
clojure -M:dev -m eca.perf.stream-tool-calls 500 128

Manual validation:

  • Before: ECA CPU could reach around 100% while streaming file-writing/editing tool calls.
  • After: CPU load became negligible in the same workflow.

Reuse the prompt-turn tool list while streaming tool-call arguments
instead of recomputing all available tools for every streamed delta.

Add a dev benchmark for reproducing the streamed tool-call prepare path.

🤖 Generated with [eca](https://eca.dev)

Co-Authored-By: eca-agent <git@eca.dev>
Contributor Author

itkonen commented May 9, 2026

The test failure appears to be unrelated to this PR's code; possibly a transient server problem.

@itkonen itkonen marked this pull request as ready for review May 9, 2026 16:41