A small coding agent that runs in your terminal, powered by Ollama for local LLM hosting.
smol-agent gives a local language model the tools it needs to read, write, and edit code, run shell commands, search across a codebase, and ask you for clarification when it gets stuck — all wrapped in a colorful pi-tui and chalk-based terminal UI.
- What it looks like
- Prerequisites
- Install
- Usage
- Tools
- Progressive Tool Discovery
- Code Execution Tool
- Server-Side Programmatic Tool Calling (Anthropic)
- Context Management
- Context Injection
- Persistent Memory
- Skills
- Architecture
- Cross-Agent Communication
- Advanced Features
- Contributing
- Security
- License
```
smol-agent (model: qwen2.5-coder:32b)

> add error handling to the /users endpoint

⎿ (project context gathered)
⎿ [tool] list_files(pattern: src/**)
⎿ [tool] read_file(filePath: src/routes/users.js)
⎿ [tool] replace_in_file(filePath: src/routes/users.js, oldText: ..., newText: ...)
⏋ thinking...
```
When the agent finishes, it streams a response:
```
▸ I've added error handling to the `/users` endpoint. The changes include:

- Input validation for request body
- Try-catch around database operations
- Proper error responses with status codes

See src/routes/users.js for the implementation.
```
When the agent needs clarification:
```
? Which /users endpoint should I modify? [answer]
⎿ (answer) the public one
> continue...
```
Errors are shown in red:
```
✗ connect ECONNREFUSED 127.0.0.1:11434
```
Agent responses are rendered with rich markdown formatting in the terminal:
- Headers (`#`, `##`, `###`) are displayed in different colors and weights
- Bold text (`**text**`) appears in bold
- Italic text (`*text*` or `_text_`) appears in italics
- Inline code (`` `code` ``) appears in cyan
- Code blocks are displayed with gray text
- Blockquotes (`> quote`) appear in gray and italic
- Lists (`- item` or `1. item`) are displayed with bullet points or numbers
- Links (`[text](url)`) appear in blue with underlines, showing both text and URL
- Strikethrough (`~~text~~`) appears dimmed
This makes agent responses much easier to read and understand at a glance.
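As a toy illustration of the rendering approach (this is a dependency-free sketch, not smol-agent's actual chalk-based renderer; raw ANSI escape codes stand in for chalk calls):

```javascript
// Sketch: translate inline markdown into ANSI terminal styling.
// Raw escape codes are used here instead of chalk to stay dependency-free.
const BOLD = (s) => `\x1b[1m${s}\x1b[22m`;
const CYAN = (s) => `\x1b[36m${s}\x1b[39m`;

function renderInline(text) {
  return text
    .replace(/\*\*(.+?)\*\*/g, (_, t) => BOLD(t))   // **bold** -> ANSI bold
    .replace(/`([^`]+)`/g, (_, t) => CYAN(t));      // `code` -> cyan
}

console.log(renderInline("Edit the **main** function in `src/api.js`"));
```

A real renderer would handle headers, lists, links, and code blocks the same way: match the markdown token, then wrap the captured text in the corresponding terminal style.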
- Node.js >= 20.0.0
- Ollama running locally (default `http://127.0.0.1:11434`) OR API access to OpenAI, Anthropic, Grok, Groq, or Gemini
If using Ollama (the default provider):
- Install Ollama from ollama.com
- Pull a model: `ollama pull qwen2.5-coder:32b`

  Note: For machines with limited RAM, `qwen2.5-coder:7b` or `qwen3-coder:14b` work well too.
- (Optional) Get an Ollama API key for the `web_search` and `web_fetch` tools: `export OLLAMA_API_KEY=your-key-here`
To use OpenAI's models (GPT-4o, GPT-4, etc.):
- Get an API key from OpenAI's platform
- Set the environment variable: `export OPENAI_API_KEY=your-key-here`
- Run with the OpenAI provider: `smol-agent -p openai "your prompt here"`
Default model: gpt-4o
To use Anthropic's Claude models:
- Get an API key from Anthropic's console
- Set the environment variable: `export ANTHROPIC_API_KEY=your-key-here`
- Run with the Anthropic provider: `smol-agent -p anthropic "your prompt here"`
Default model: claude-sonnet-4-20250514
To use xAI's Grok models:
- Get an API key from xAI's console
- Set the environment variable: `export XAI_API_KEY=your-key-here`
- Run with the Grok provider: `smol-agent -p grok "your prompt here"`
Default model: grok-4-latest
To use Groq's fast inference:
- Get an API key from Groq's console
- Set the environment variable: `export GROQ_API_KEY=your-key-here`
- Run with the Groq provider: `smol-agent -p groq "your prompt here"`
Default model: openai/gpt-oss-120b
To use Google's Gemini models:
- Get an API key from Google AI Studio
- Set the environment variable: `export GEMINI_API_KEY=your-key-here`
- Run with the Gemini provider: `smol-agent -p gemini "your prompt here"`
Default model: gemini-2.5-pro
To use Codex directly through the local Codex app-server:
- Install the `codex` CLI and make sure it is on your `PATH`
- Authenticate it: `codex login`
- Run with the Codex provider: `smol-agent -p codex "your prompt here"`
Default model: gpt-5.4
Notes:
- This uses `codex app-server` under the hood, not the OpenAI-compatible HTTP API
- Codex executes work directly inside your project directory; it is not routed through smol-agent's normal tool-calling loop
- The provider starts Codex in the same working directory you pass with `-d, --directory`
You can use any OpenAI-compatible API by passing the base URL as the provider:
```sh
smol-agent -p https://your-api.example.com/v1 "your prompt here"
```

This works with self-hosted models (vLLM, LocalAI, etc.) and other OpenAI-compatible services.
smol-agent can use tree-sitter for enhanced code analysis:
- Repository Map: Builds a "table of contents" of your codebase showing key symbols (functions, classes, types) with their file locations. This gives the agent structural understanding without requiring multiple grep/read calls.
- Syntax Validation: After file edits, validates syntax to catch obvious errors before the agent proceeds.
Requirements: tree-sitter requires Node.js 18-22 (it does not build on Node 23+ due to C++20 requirements). To enable:
```sh
npm install tree-sitter tree-sitter-javascript tree-sitter-python tree-sitter-typescript tree-sitter-go
```

Note: If installation fails, smol-agent will still work but without these enhanced features.
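The graceful-degradation behavior described above can be sketched as an optional require (a minimal illustration under assumed module layout, not smol-agent's exact code):

```javascript
// Sketch: load tree-sitter if available, otherwise keep enhanced
// analysis features disabled without crashing.
let Parser = null;
try {
  Parser = require("tree-sitter"); // optional dependency
} catch {
  // Not installed (or failed to build); core features still work.
}

function syntaxFeaturesAvailable() {
  return Parser !== null;
}

console.log("tree-sitter features:", syntaxFeaturesAvailable() ? "enabled" : "disabled");
```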
```sh
curl -fsSL https://raw.githubusercontent.com/streed/smol-agent/main/install.sh | sh
```

This will:

- Check Node.js, npm, and git are installed
- Clone smol-agent to `~/.local/share/smol-agent`
- Install npm dependencies
- Link `smol-agent` globally
```sh
git clone https://github.com/streed/smol-agent.git
cd smol-agent
npm install
npm link  # makes `smol-agent` available globally
```

Or install from npm:

```sh
npm install -g smol-agent
```

Releases are published automatically when PRs with the release label are merged to main.
If you installed via the one-liner, update to the latest version with:
```sh
smol-agent --self-update
```

This pulls the latest changes and reinstalls dependencies automatically.
If you cloned manually:
```sh
cd smol-agent
git pull
npm install
```

If installed via `curl | sh`:

```sh
npm unlink -g smol-agent
rm -rf ~/.local/share/smol-agent
rm -rf ~/.config/smol-agent
```

If installed via git clone:

```sh
npm unlink -g smol-agent
rm -rf smol-agent
```

```
smol-agent [options] [prompt]
```
Interactive mode — launch with no arguments to get a REPL:
smol-agent
One-shot mode — pass a prompt directly:
smol-agent "add input validation to src/api.js"
| Flag | Description |
|---|---|
| `-m, --model <name>` | Model to use (default depends on provider) |
| `-p, --provider <name>` | LLM provider: ollama, openai, anthropic, grok, groq, gemini, codex (default: ollama) |
| `-H, --host <url>` | Provider host/base URL (default: provider-specific) |
| `--api-key <key>` | API key for cloud providers (or use env vars) |
| `-d, --directory <path>` | Set working directory and jail boundary (default: cwd) |
| `--auto-approve` | Skip approval prompts for write/command tools (alias: `--yolo`) |
| `--acp` | Run as ACP (Agent Client Protocol) server over stdio |
| `--self-update` | Update smol-agent to the latest version |
| `--help` | Show help message |
| Flag | Description |
|---|---|
| `-s, --session <id>` | Resume a saved session by ID or name |
| `-c, --continue` | Resume the most recent session |
| `--session-name <name>` | Name for the new session |
| `--list-sessions` | List all saved sessions |
| `--sessions` | Alias for `--list-sessions` |
| Command | Description |
|---|---|
| `/clear` | Clear conversation history and start a new session |
| `/sessions` | List saved sessions |
| `/session save [name]` | Save the current session (with optional name) |
| `/session load <id>` | Load a saved session by ID |
| `/session delete <id>` | Delete a saved session by ID |
| `/session rename <id> <name>` | Rename a saved session |
| `/inspect` | Dump current context to CONTEXT.md |
| `/reload-skills` | Reload skills from global and local directories |
| `/skills` | List available skills |
| `/reflect` | Analyze recent logs for skill opportunities and update file documentation |
| `/document` | Run a full codebase documentation pass on all source files >100 lines |
| `exit` / `quit` | Exit the agent (`/quit` also triggers end-of-session reflection) |
| Ctrl-C | Cancel current operation (double-tap to exit) |
The agent has access to the following tools:
| Tool | Description |
|---|---|
| `read_file` | Read file contents with optional line offset/limit |
| `write_file` | Write content to a file (creates or overwrites) |
| `replace_in_file` | Find and replace text in a file |
| `list_files` | Glob-based file and directory listing |
| `grep` | Regex search across files with line numbers |
| `run_command` | Execute shell commands (builds, tests, git, etc.) |
| `git` | Git commands with safety restrictions (blocks push, `--force`) |
| `ask_user` | Ask the user a clarifying question and wait for a response |
| Tool | Description |
|---|---|
| `web_search` | Search the web via Ollama's web search API |
| `web_fetch` | Fetch a URL and return its content via Ollama's web fetch API |
| `save_plan` | Save a plan to a markdown file for tracking |
| `load_plan_progress` | Load current plan progress and state |
| `get_current_plan` | Get the content of the currently active plan |
| `complete_plan_step` | Mark a plan step as completed |
| `update_plan_status` | Update plan status (in-progress, completed, paused, abandoned) |
| `reflect` | Summarize work done, what went well, and areas for improvement |
| `remember` | Save a fact/pattern/preference to persistent memory across sessions |
| `recall` | Retrieve memories from persistent storage |
| `save_context` | Save a dense summary of a directory/code area for future sessions |
| `delegate` | Spawn a sub-agent for focused research tasks |
smol-agent uses a progressive tool discovery system to improve context efficiency. Instead of loading all 45+ tools into the context window at once, tools are organized into groups and unlocked on demand.
- Starter groups are always active: `explore` (read_file, list_files, grep, ask_user), `edit` (write_file, replace_in_file), and `execute` (run_command, git, code_execution) — 9 core tools
- Additional groups are activated when needed, either automatically or via the `discover_tools` meta-tool
- Tools refresh each iteration — once a group is activated, its tools are immediately available
| Group | Tools | Description |
|---|---|---|
| `explore` | read_file, list_files, grep, ask_user | Read files, list directories, search code |
| `edit` | write_file, replace_in_file | Create and edit files |
| `execute` | run_command, git, code_execution | Shell commands, git operations, code execution |
| `plan` | save_plan, load_plan_progress, complete_plan_step, update_plan_status, get_current_plan, reflect | Planning and progress tracking |
| `memory` | remember, recall, memory_bank_read, memory_bank_write, memory_bank_init, save_context | Persistent memory and cross-session knowledge |
| `web` | web_search, web_fetch | Search the web and fetch URLs |
| `multi_agent` | delegate, send_letter, check_reply, read_inbox, read_outbox, reply_to_letter, list_agents, link_repos, set_snippet, find_agent_for_task | Sub-agents and cross-agent messaging |
Automatic — The agent detects context signals in user prompts and auto-activates relevant groups. For example, mentioning "plan" or "step by step" activates the plan group; mentioning "remember" or "previous session" activates memory.
Explicit — The agent calls the discover_tools meta-tool:
```javascript
discover_tools({ groups: ["plan", "memory"] })  // activate groups
discover_tools({ groups: [], list: true })      // list all available groups
```
Note — All models now use progressive discovery by default.
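The automatic activation described above can be sketched as simple keyword matching against the user prompt (the patterns here are illustrative guesses, not smol-agent's actual signal list):

```javascript
// Sketch: map context signals (regexes) to tool groups; any match
// auto-activates that group for the session.
const SIGNALS = {
  plan: [/\bplan\b/i, /step by step/i],
  memory: [/\bremember\b/i, /previous session/i],
};

function detectGroups(prompt) {
  return Object.entries(SIGNALS)
    .filter(([, patterns]) => patterns.some((re) => re.test(prompt)))
    .map(([group]) => group);
}

console.log(detectGroups("please plan this refactor step by step"));
```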
Progressive discovery reduces context bloat by ~60-70% for typical sessions. Most tasks only need the starter tools. By loading additional tools lazily, the agent preserves context window capacity for actual work — file contents, code analysis, and conversation history.
The code_execution tool allows the agent to run JavaScript code that calls other tools programmatically. This enables batch operations, loops, and result processing — all in a single turn without multiple round-trips to the LLM.
```
Agent writes JS code → Runs in sandboxed VM → Tools execute outside sandbox
                                                        ↓
                                           Results returned to sandbox
                                                        ↓
                                    console.log() output sent back to agent
```
The sandbox is isolated (no direct filesystem or network access), but all registered tools are available as async functions:
```javascript
// Example: Batch read multiple files and count lines
const files = await list_files({ pattern: "src/**/*.js" });
for (const f of files.slice(0, 5)) {
  const content = await read_file({ filePath: f });
  console.log(f, content.split('\n').length);
}
```

```javascript
// Example: Search and aggregate
const results = await grep({ pattern: "TODO", path: "src/" });
const todos = results.split('\n').length;
console.log(`Found ${todos} TODOs`);
```

- Multi-tool workflows — Call multiple tools in one turn
- Loops and logic — Iterate over results, filter, aggregate
- Token efficient — Only final `console.log()` output returns to the model (not intermediate tool results)
- Works with all providers — Ollama, OpenAI, Anthropic, Grok, Groq, Gemini
- 2-minute timeout — Long-running operations are capped
Available globals: console, JSON, Math, Date, Array, Object, Map, Set, RegExp, Error, Promise, setTimeout, clearTimeout
All tools are callable as async functions: read_file(), write_file(), grep(), run_command(), etc.
When using the Anthropic provider with supported Claude models, smol-agent can also enable server-side programmatic tool calling. This lets Claude execute Python code on Anthropic's servers and invoke smol-agent's tools from within that code execution sandbox.
Supported models:

- claude-opus-4-6
- claude-sonnet-4-6
- claude-sonnet-4-5-20250929
- claude-opus-4-5-20251101
```
User prompt → Claude writes Python code → Code runs on Anthropic servers
                                                     ↓
                              Calls smol-agent tools via allowed_callers
                                                     ↓
                              Results flow back into Claude's reasoning
```
When enabled:
- The Anthropic `code_execution_20260120` tool is prepended to the tool list
- All other tools get `allowed_callers: ["code_execution_20260120"]` — making them callable from within the code execution sandbox
- The client-side `code_execution` tool is replaced by the server-side version
- A container ID is tracked across turns for sandbox reuse
```sh
# Via CLI
smol-agent -p anthropic -m claude-sonnet-4-6 --programmatic-tool-calling "your prompt"
```

```javascript
// Programmatically
const agent = new Agent({
  provider: "anthropic",
  model: "claude-sonnet-4-6",
  programmaticToolCalling: true,
});
```

Programmatic tool calling is useful when Claude needs to:
- Orchestrate multiple tool calls in a single reasoning step
- Process tool results with Python code before deciding next steps
- Perform calculations or data transformations on tool outputs
The agent manages context window limits automatically:
- Token tracking: Monitors token usage throughout the conversation
- Intelligent pruning: When approaching limits, removes less important messages first
- Context summarization: Summarizes old conversation turns to compress context
- Result truncation: Truncates large tool results while preserving key information
This allows the agent to work on large codebases without running into context window errors.
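The pruning step above can be illustrated with a small sketch (assumed message shapes and a rough characters-per-token heuristic; not smol-agent's actual code):

```javascript
// Sketch: when the estimated token count exceeds a budget, drop the
// oldest non-system messages first until the conversation fits again.
function pruneMessages(messages, maxTokens) {
  // Rough heuristic: ~4 characters per token.
  const cost = (m) => Math.ceil(m.content.length / 4);
  const kept = [...messages];
  let total = kept.reduce((n, m) => n + cost(m), 0);
  while (total > maxTokens) {
    const i = kept.findIndex((m) => m.role !== "system");
    if (i === -1) break; // only system messages remain; stop pruning
    total -= cost(kept[i]);
    kept.splice(i, 1);
  }
  return kept;
}
```

A production version would also summarize the dropped turns and truncate oversized tool results, as described above, rather than discarding them outright.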
smol-agent can inject project-specific context into the system prompt:
- AGENT.md: Place a file named `AGENT.md` in your project root. It will be included in the context sent to the LLM.
- Skills: Markdown files in `.smol-agent/skills/` or `~/.config/smol-agent/skills/` are loaded as skills.
The agent can remember facts across sessions using the remember and recall tools:
- remember: Save a fact, pattern, or preference to persistent storage
- recall: Retrieve memories (optionally filtered by key or category)
Memories are stored in ~/.config/smol-agent/memories.json and persist across sessions.
Skills are markdown files that define reusable prompts for common tasks:
- Location: `.smol-agent/skills/` (project) or `~/.config/smol-agent/skills/` (global)
- Format: Markdown with `# Skill Name` and `Description: ...` in the header
- Loading: Skills are loaded on startup and injected into the system prompt
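A loader for that header format might look roughly like this (a sketch; the field handling is an assumption, not smol-agent's actual parser):

```javascript
// Sketch: parse a skill markdown file into { name, description, body }.
function parseSkill(markdown) {
  const lines = markdown.split("\n");
  const name = lines.find((l) => l.startsWith("# "))?.slice(2).trim();
  const description = lines
    .find((l) => l.startsWith("Description:"))
    ?.slice("Description:".length)
    .trim();
  const body = lines
    .filter((l) => !l.startsWith("# ") && !l.startsWith("Description:"))
    .join("\n")
    .trim();
  return { name, description, body };
}
```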
Example skill:
```markdown
# Fix Lint Errors
Description: Fix linting errors in the codebase

Find and fix all linting errors in the project. Run the linter first to identify issues, then fix each one systematically.
```

User prompt → Agent.run() → LLM Provider API or Codex app-server → tool calls → execute tools → feed results back → repeat until text response
The agent is an EventEmitter that drives a loop: send messages to the configured provider, check for tool calls, execute them, push results back, and repeat (max 25 iterations). The pi-tui UI subscribes to events (tool_call, tool_result, response, error) to render progress.
Most providers use smol-agent's normal tool-calling loop. The codex provider is different: it streams output from codex app-server, and Codex performs its own file and command execution directly in the working tree.
Run as an Agent Client Protocol server for IDE/editor integration:
```sh
smol-agent --acp
```

This communicates via JSON-RPC over stdio and works with any ACP-compatible editor.
smol-agent instances can communicate across repositories using the inbox/letter protocol. This allows a frontend agent to request backend changes, a main agent to delegate to a documentation agent, etc.
```
Agent A                               Agent B
   |                                     |
   | 1. find_agent_for_task()            |
   | 2. send_letter() ---------> inbox/.letter.md
   |                                     | 3. watchInbox detects letter
   |                                     | 4. Spawns agent, does work
   |                                     | 5. reply_to_letter()
   | inbox/.response.md <-------         |
   | 6. Auto-notified via watcher        |
   |    (injected into conversation)     |
```
Agents self-register in a global registry (~/.config/smol-agent/agents.json) on startup. Use these tools to find and communicate with other agents:
| Tool | Description |
|---|---|
| `list_agents` | List all registered agents |
| `find_agent_for_task` | Find the best agent for a task (keyword matching against snippets) |
| `send_letter` | Send a work request to another agent (supports wait_for_reply) |
| `check_reply` | Poll for a response to a sent letter |
| `reply_to_letter` | Send a response back after completing work |
| `link_repos` | Create relationships between repos (depends-on, serves, etc.) |
Responses are delivered through three complementary mechanisms:
- Auto-notification (default) -- A file watcher detects incoming `.response.md` files and injects the reply into the running conversation automatically.
- Blocking wait -- `send_letter(wait_for_reply: true)` blocks until the reply arrives (up to 5 minutes).
- Manual poll -- `check_reply(letter_id)` for explicit polling.
If a spawned agent exits without calling reply_to_letter, the system auto-generates a completed/failed response as a safety net.
Run a persistent watcher that processes incoming letters:
```sh
smol-agent --watch-inbox
```

See docs/cross-agent-communication.md for the full protocol specification with Mermaid diagrams.
See CONTRIBUTING.md for guidelines.
smol-agent operates within a "jail" directory:
- All file operations are restricted to the jail directory
- Commands like `rm -rf /` are blocked
- Git push and `--force` are blocked
However, you are responsible for reviewing changes before approving tool calls. The agent will ask for approval before:
- Writing files
- Running shell commands
Use --auto-approve (or --yolo) to skip approvals (use with caution).