smol-agent

A small coding agent that runs in your terminal, powered by Ollama for local LLM hosting.

smol-agent gives a local language model the tools it needs to read, write, and edit code, run shell commands, search across a codebase, and ask you for clarification when it gets stuck — all wrapped in a colorful pi-tui and chalk-based terminal UI.

What it looks like

smol-agent (model: qwen2.5-coder:32b)

 > add error handling to the /users endpoint

    ⎿  (project context gathered)
    ⎿  [tool] list_files(pattern: src/**)
    ⎿  [tool] read_file(filePath: src/routes/users.js)
    ⎿  [tool] replace_in_file(filePath: src/routes/users.js, oldText: ..., newText: ...)
 ⏋ thinking...

When the agent finishes, it streams a response:

 ▸  I've added error handling to the `/users` endpoint. The changes include:

    - Input validation for request body
    - Try-catch around database operations  
    - Proper error responses with status codes

    See src/routes/users.js for the implementation.

When the agent needs clarification:

 ?  Which /users endpoint should I modify? [answer]
    ⎿  (answer) the public one

 > continue...

Errors are shown in red:

 ✗ connect ECONNREFUSED 127.0.0.1:11434

Benchmark

Rich Markdown Rendering

Agent responses are rendered with rich markdown formatting in the terminal:

Headers (#, ##, ###) are displayed in different colors and weights
Bold text (**text**) appears in bold
Italic text (*text* or _text_) appears in italics
Inline code (`code`) appears in cyan
Code blocks (fenced with triple backticks) are displayed in gray text
Blockquotes (> quote) appear in gray and italic
Lists (- item or 1. item) are displayed with bullet points or numbers
Links ([text](url)) appear in blue with underlines, showing both text and URL
Strikethrough (~~text~~) appears dimmed

This makes agent responses much easier to read and understand at a glance.
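To make the inline rules above concrete, here is a minimal sketch that maps three of them (inline code, bold, strikethrough) onto raw ANSI escape codes. It is not smol-agent's renderer, which is described as chalk-based; the helper names `wrap` and `renderInline` and the regular expressions are assumptions made for this illustration.

```javascript
// Hypothetical helpers: not part of smol-agent's API.
const ANSI = {
  bold: ["\x1b[1m", "\x1b[22m"],
  cyan: ["\x1b[36m", "\x1b[39m"],
  dim: ["\x1b[2m", "\x1b[22m"],
};

// Wrap text in an ANSI open/close pair.
function wrap(style, text) {
  const [open, close] = ANSI[style];
  return open + text + close;
}

// Apply a small subset of the inline rules listed above.
function renderInline(line) {
  return line
    .replace(/`([^`]+)`/g, (_, code) => wrap("cyan", code))
    .replace(/\*\*([^*]+)\*\*/g, (_, text) => wrap("bold", text))
    .replace(/~~([^~]+)~~/g, (_, text) => wrap("dim", text));
}

console.log(renderInline("Run `npm test` **before** merging ~~maybe~~"));
```

A real renderer would also need to handle nesting and block-level elements (headers, lists, blockquotes), which this sketch ignores.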

Prerequisites

Node.js >= 20.0.0
Ollama running locally (default http://127.0.0.1:11434) OR API access to OpenAI, Anthropic, Grok, Groq, or Gemini

Setting up Ollama

If using Ollama (the default provider):

Install Ollama from ollama.com
Pull a model:
```
ollama pull qwen2.5-coder:32b
```
Note: For machines with limited RAM, qwen2.5-coder:7b or qwen3-coder:14b work well too.
(Optional) Get an Ollama API key for web_search and web_fetch tools:
```
export OLLAMA_API_KEY=your-key-here
```

Setting up OpenAI

To use OpenAI's models (GPT-4o, GPT-4, etc.):

Get an API key from OpenAI's platform
Set the environment variable:
```
export OPENAI_API_KEY=your-key-here
```
Run with the OpenAI provider:
```
smol-agent -p openai "your prompt here"
```

Default model: gpt-4o

Setting up Anthropic

To use Anthropic's Claude models:

Get an API key from Anthropic's console
Set the environment variable:
```
export ANTHROPIC_API_KEY=your-key-here
```
Run with the Anthropic provider:
```
smol-agent -p anthropic "your prompt here"
```

Default model: claude-sonnet-4-20250514

Setting up Grok (xAI)

To use xAI's Grok models:

Get an API key from xAI's console
Set the environment variable:
```
export XAI_API_KEY=your-key-here
```
Run with the Grok provider:
```
smol-agent -p grok "your prompt here"
```

Default model: grok-4-latest

Setting up Groq

To use Groq's fast inference:

Get an API key from Groq's console
Set the environment variable:
```
export GROQ_API_KEY=your-key-here
```
Run with the Groq provider:
```
smol-agent -p groq "your prompt here"
```

Default model: openai/gpt-oss-120b

Setting up Gemini (Google)

To use Google's Gemini models:

Get an API key from Google AI Studio
Set the environment variable:
```
export GEMINI_API_KEY=your-key-here
```
Run with the Gemini provider:
```
smol-agent -p gemini "your prompt here"
```

Default model: gemini-2.5-pro

Setting up Codex CLI

To use Codex directly through the local Codex app-server:

Install the codex CLI and make sure it is on your PATH
Authenticate it:
```
codex login
```
Run with the Codex provider:
```
smol-agent -p codex "your prompt here"
```

Default model: gpt-5.4

Notes:

This uses codex app-server under the hood, not the OpenAI-compatible HTTP API
Codex executes work directly inside your project directory; it is not routed through smol-agent's normal tool-calling loop
The provider starts Codex in the same working directory you pass with -d, --directory

Custom OpenAI-Compatible Endpoints

You can use any OpenAI-compatible API by passing the base URL as the provider:

smol-agent -p https://your-api.example.com/v1 "your prompt here"

This works with self-hosted models (vLLM, LocalAI, etc.) and other OpenAI-compatible services.
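One way to picture how a `-p` value could be interpreted is sketched below: a URL is treated as a custom OpenAI-compatible endpoint, and a known name maps to the default model listed in the provider sections above. The function `resolveProvider` and its return shape are assumptions for illustration, not smol-agent's actual code.

```javascript
// Default models as listed in the provider sections above.
const DEFAULT_MODELS = {
  openai: "gpt-4o",
  anthropic: "claude-sonnet-4-20250514",
  grok: "grok-4-latest",
  groq: "openai/gpt-oss-120b",
  gemini: "gemini-2.5-pro",
  codex: "gpt-5.4",
};

// Hypothetical resolver: the name and return shape are assumptions.
function resolveProvider(value = "ollama") {
  if (/^https?:\/\//.test(value)) {
    // A URL is treated as a custom OpenAI-compatible endpoint.
    return { kind: "openai-compatible", baseUrl: value, defaultModel: null };
  }
  if (value === "ollama") {
    // The default host is documented above; no default model is stated.
    return { kind: "ollama", baseUrl: "http://127.0.0.1:11434", defaultModel: null };
  }
  if (!Object.hasOwn(DEFAULT_MODELS, value)) {
    throw new Error(`Unknown provider: ${value}`);
  }
  return { kind: value, baseUrl: null, defaultModel: DEFAULT_MODELS[value] };
}

console.log(resolveProvider("https://your-api.example.com/v1").kind); // openai-compatible
console.log(resolveProvider("groq").defaultModel); // openai/gpt-oss-120b
```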

Optional: tree-sitter (for enhanced code analysis)

smol-agent can use tree-sitter for enhanced code analysis:

Repository Map: Builds a "table of contents" of your codebase showing key symbols (functions, classes, types) with their file locations. This gives the agent structural understanding without requiring multiple grep/read calls.
Syntax Validation: After file edits, validates syntax to catch obvious errors before the agent proceeds.

Requirements: tree-sitter requires Node.js 18-22 (it does not build on Node 23+ due to C++20 requirements). To enable:

npm install tree-sitter tree-sitter-javascript tree-sitter-python tree-sitter-typescript tree-sitter-go

Note: If installation fails, smol-agent will still work but without these enhanced features.

Install

Quick Install (recommended)

curl -fsSL https://raw.githubusercontent.com/streed/smol-agent/main/install.sh | sh

This will:

Check Node.js, npm, and git are installed
Clone smol-agent to ~/.local/share/smol-agent
Install npm dependencies
Link smol-agent globally

Manual Install

git clone https://github.com/streed/smol-agent.git
cd smol-agent
npm install
npm link   # makes `smol-agent` available globally

Via npm

npm install -g smol-agent

Releases are published automatically when PRs with the release label are merged to main.

Update

Self-update (installed via curl | sh)

If you installed via the one-liner, update to the latest version with:

smol-agent --self-update

This pulls the latest changes and reinstalls dependencies automatically.

Manual update (git clone)

If you cloned manually:

cd smol-agent
git pull
npm install

Uninstall

If installed via curl | sh:

npm unlink -g smol-agent
rm -rf ~/.local/share/smol-agent
rm -rf ~/.config/smol-agent

If installed via git clone:

npm unlink -g smol-agent
rm -rf smol-agent

Usage

smol-agent [options] [prompt]

Interactive mode — launch with no arguments to get a REPL:

smol-agent

One-shot mode — pass a prompt directly:

smol-agent "add input validation to src/api.js"

Options

| Flag | Description |
| --- | --- |
| `-m, --model <name>` | Model to use (default depends on provider) |
| `-p, --provider <name>` | LLM provider: `ollama`, `openai`, `anthropic`, `grok`, `groq`, `gemini`, `codex` (default: `ollama`) |
| `-H, --host <url>` | Provider host/base URL (default: provider-specific) |
| `--api-key <key>` | API key for cloud providers (or use env vars) |
| `-d, --directory <path>` | Set working directory and jail boundary (default: cwd) |
| `--auto-approve` | Skip approval prompts for write/command tools (alias: `--yolo`) |
| `--acp` | Run as ACP (Agent Client Protocol) server over stdio |
| `--self-update` | Update smol-agent to the latest version |
| `--help` | Show help message |

Session Management

| Flag | Description |
| --- | --- |
| `-s, --session <id>` | Resume a saved session by ID or name |
| `-c, --continue` | Resume the most recent session |
| `--session-name <name>` | Name for the new session |
| `--list-sessions` | List all saved sessions |
| `--sessions` | Alias for `--list-sessions` |

Commands (interactive mode)

| Command | Description |
| --- | --- |
| `/clear` | Clear conversation history and start a new session |
| `/sessions` | List saved sessions |
| `/session save [name]` | Save the current session (with optional name) |
| `/session load <id>` | Load a saved session by ID |
| `/session delete <id>` | Delete a saved session by ID |
| `/session rename <id> <name>` | Rename a saved session |
| `/inspect` | Dump current context to CONTEXT.md |
| `/reload-skills` | Reload skills from global and local directories |
| `/skills` | List available skills |
| `/reflect` | Analyze recent logs for skill opportunities and update file documentation |
| `/document` | Run full codebase documentation pass on all source files >100 lines |
| `exit` / `quit` | Exit the agent (`/quit` also triggers end-of-session reflection) |
| `Ctrl-C` | Cancel current operation (double-tap to exit) |

Tools

The agent has access to the following tools:

Core Tools (always available)

| Tool | Description |
| --- | --- |
| `read_file` | Read file contents with optional line offset/limit |
| `write_file` | Write content to a file (creates or overwrites) |
| `replace_in_file` | Find and replace text in a file |
| `list_files` | Glob-based file and directory listing |
| `grep` | Regex search across files with line numbers |
| `run_command` | Execute shell commands (builds, tests, git, etc.) |
| `git` | Git commands with safety restrictions (blocks push, --force) |
| `ask_user` | Ask the user a clarifying question and wait for a response |

Extended Tools (available when needed)

| Tool | Description |
| --- | --- |
| `web_search` | Search the web via Ollama's web search API |
| `web_fetch` | Fetch a URL and return its content via Ollama's web fetch API |
| `save_plan` | Save a plan to a markdown file for tracking |
| `load_plan_progress` | Load current plan progress and state |
| `get_current_plan` | Get the content of the currently active plan |
| `complete_plan_step` | Mark a plan step as completed |
| `update_plan_status` | Update plan status (in-progress, completed, paused, abandoned) |
| `reflect` | Summarize work done, what went well, and areas for improvement |
| `remember` | Save a fact/pattern/preference to persistent memory across sessions |
| `recall` | Retrieve memories from persistent storage |
| `save_context` | Save a dense summary of a directory/code area for future sessions |
| `delegate` | Spawn a sub-agent for focused research tasks |

Progressive Tool Discovery

smol-agent uses a progressive tool discovery system to improve context efficiency. Instead of loading all 45+ tools into the context window at once, tools are organized into groups and unlocked on demand.

How It Works

Starter groups are always active: explore (read_file, list_files, grep, ask_user), edit (write_file, replace_in_file), and execute (run_command, git, code_execution) — 9 core tools
Additional groups are activated when needed, either automatically or via the discover_tools meta-tool
Tools refresh each iteration — once a group is activated, its tools are immediately available
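The three points above can be sketched as a small registry: starter groups are active from the start, other groups are activated by name, and the active tool list is recomputed on each call. The class `ToolRegistry` and its method names are assumptions for illustration; only the group contents come from this README.

```javascript
// Group contents copied from this README (starter groups plus two on-demand groups).
const TOOL_GROUPS = {
  explore: ["read_file", "list_files", "grep", "ask_user"],
  edit: ["write_file", "replace_in_file"],
  execute: ["run_command", "git", "code_execution"],
  plan: [
    "save_plan", "load_plan_progress", "complete_plan_step",
    "update_plan_status", "get_current_plan", "reflect",
  ],
  web: ["web_search", "web_fetch"],
};
const STARTER_GROUPS = ["explore", "edit", "execute"];

// Hypothetical registry: class and method names are assumptions.
class ToolRegistry {
  constructor() {
    this.active = new Set(STARTER_GROUPS);
  }

  // Activate groups by name; returns the names that were newly activated.
  activate(groups) {
    const added = [];
    for (const name of groups) {
      if (!Object.hasOwn(TOOL_GROUPS, name)) throw new Error(`Unknown group: ${name}`);
      if (!this.active.has(name)) {
        this.active.add(name);
        added.push(name);
      }
    }
    return added;
  }

  // Recomputed on every call, so newly activated tools appear immediately.
  activeTools() {
    return [...this.active].flatMap((name) => TOOL_GROUPS[name]);
  }
}

const registry = new ToolRegistry();
console.log(registry.activeTools().length); // 9 starter tools
registry.activate(["web"]);
console.log(registry.activeTools().includes("web_fetch")); // true
```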

Tool Groups

| Group | Tools | Description |
| --- | --- | --- |
| `explore` | read_file, list_files, grep, ask_user | Read files, list directories, search code |
| `edit` | write_file, replace_in_file | Create and edit files |
| `execute` | run_command, git, code_execution | Shell commands, git operations, code execution |
| `plan` | save_plan, load_plan_progress, complete_plan_step, update_plan_status, get_current_plan, reflect | Planning and progress tracking |
| `memory` | remember, recall, memory_bank_read, memory_bank_write, memory_bank_init, save_context | Persistent memory and cross-session knowledge |
| `web` | web_search, web_fetch | Search the web and fetch URLs |
| `multi_agent` | delegate, send_letter, check_reply, read_inbox, read_outbox, reply_to_letter, list_agents, link_repos, set_snippet, find_agent_for_task | Sub-agents and cross-agent messaging |

Activation Methods

Automatic — The agent detects context signals in user prompts and auto-activates relevant groups. For example, mentioning "plan" or "step by step" activates the plan group; mentioning "remember" or "previous session" activates memory.

Explicit — The agent calls the discover_tools meta-tool:

discover_tools({ groups: ["plan", "memory"] })       // activate groups
discover_tools({ groups: [], list: true })            // list all available groups

Note — All models now use progressive discovery by default.
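Automatic activation could look roughly like the following keyword check. The signal phrases are the ones quoted above; the function `detectGroups` and the plain substring matching are assumptions, and the real detection logic may well differ.

```javascript
// Signals taken from the examples above; the real list is presumably longer.
const GROUP_SIGNALS = {
  plan: ["plan", "step by step"],
  memory: ["remember", "previous session"],
};

// Hypothetical detector: returns the groups whose signals appear in the prompt.
function detectGroups(prompt) {
  const text = prompt.toLowerCase();
  return Object.entries(GROUP_SIGNALS)
    .filter(([, signals]) => signals.some((signal) => text.includes(signal)))
    .map(([group]) => group);
}

console.log(detectGroups("Remember what we did, then make a plan")); // [ 'plan', 'memory' ]
```

Plain substring matching is cheap but imprecise: a prompt containing "explain" would also match "plan", so a production detector would likely use word boundaries or a richer heuristic.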

Why This Matters

Progressive discovery reduces context bloat by ~60-70% for typical sessions. Most tasks only need the starter tools. By loading additional tools lazily, the agent preserves context window capacity for actual work — file contents, code analysis, and conversation history.

Code Execution Tool

The code_execution tool allows the agent to run JavaScript code that calls other tools programmatically. This enables batch operations, loops, and result processing — all in a single turn without multiple round-trips to the LLM.

How It Works

Agent writes JS code → Runs in sandboxed VM → Tools execute outside sandbox
                                                ↓
                                          Results returned to sandbox
                                                ↓
                                          console.log() output sent back to agent

The sandbox is isolated (no direct filesystem or network access), but all registered tools are available as async functions:

// Example: Batch read multiple files and count lines
const files = await list_files({ pattern: "src/**/*.js" });
for (const f of files.slice(0, 5)) {
  const content = await read_file({ filePath: f });
  console.log(f, content.split('\n').length);
}

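// Example: Search and aggregate
const results = await grep({ pattern: "TODO", path: "src/" });
// Count matching lines; filter(Boolean) drops the empty string left by an empty result or trailing newline
const todos = results.split('\n').filter(Boolean).length;
console.log(`Found ${todos} lines containing TODO`);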

Key Features

Multi-tool workflows — Call multiple tools in one turn
Loops and logic — Iterate over results, filter, aggregate
Token efficient — Only final console.log() output returns to the model (not intermediate tool results)
Works with all providers — Ollama, OpenAI, Anthropic, Grok, Groq, Gemini
2-minute timeout — Long-running operations are capped

Sandboxed Environment

Available globals: console, JSON, Math, Date, Array, Object, Map, Set, RegExp, Error, Promise, setTimeout, clearTimeout

All tools are callable as async functions: read_file(), write_file(), grep(), run_command(), etc.
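The data flow described in this section (tools injected as async functions, with only `console.log()` output returned) can be sketched as follows. This is an illustration only: it uses the `AsyncFunction` constructor, which provides no isolation at all, whereas smol-agent is described as running the code in a sandboxed VM. The names `runSnippet` and `fakeTools`, and the tools' return shapes, are assumptions.

```javascript
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Hypothetical runner. NOTE: AsyncFunction is NOT a sandbox; it only
// illustrates how tools go in and how captured output comes out.
async function runSnippet(code, tools) {
  const output = [];
  const capturedConsole = {
    log: (...args) => output.push(args.map(String).join(" ")),
  };
  const names = Object.keys(tools);
  // Each tool becomes a named parameter, so the snippet can call it directly.
  const fn = new AsyncFunction("console", ...names, code);
  await fn(capturedConsole, ...names.map((name) => tools[name]));
  // Only the captured console.log lines are returned to the caller.
  return output.join("\n");
}

// Stand-in tools with canned results (assumed return shapes).
const fakeTools = {
  list_files: async () => ["src/a.js", "src/b.js"],
  read_file: async ({ filePath }) => (filePath === "src/a.js" ? "x\ny" : "z"),
};

const snippet = `
  const files = await list_files({ pattern: "src/**/*.js" });
  for (const f of files) {
    const content = await read_file({ filePath: f });
    console.log(f, content.split("\\n").length);
  }
`;

runSnippet(snippet, fakeTools).then((out) => console.log(out));
```

Intermediate tool results never leave `runSnippet`; the caller sees two short lines instead of the full file contents, which is the token-saving property described under Key Features.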

Server-Side Programmatic Tool Calling (Anthropic)

When using the Anthropic provider with supported Claude models, smol-agent can also enable server-side programmatic tool calling. This lets Claude execute Python code on Anthropic's servers and invoke smol-agent's tools from within that code execution sandbox.

Supported Models

claude-opus-4-6
claude-sonnet-4-6
claude-sonnet-4-5-20250929
claude-opus-4-5-20251101

How It Works

User prompt → Claude writes Python code → Code runs on Anthropic servers
                                            ↓
                                        Calls smol-agent tools via allowed_callers
                                            ↓
                                        Results flow back into Claude's reasoning

When enabled:

The Anthropic code_execution_20260120 tool is prepended to the tool list
All other tools get allowed_callers: ["code_execution_20260120"] — making them callable from within the code execution sandbox
The client-side code_execution tool is replaced by the server-side version
A container ID is tracked across turns for sandbox reuse
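The first three steps above amount to a transformation of the tool list, sketched below. The function name `withProgrammaticToolCalling` and the tool object shape (`name`, `description`, `type`) are assumptions for illustration; consult Anthropic's API documentation for the exact request format.

```javascript
const SERVER_TOOL = "code_execution_20260120";

// Hypothetical transform; the tool object shape is assumed.
function withProgrammaticToolCalling(tools) {
  const callable = tools
    // The client-side code_execution tool is replaced by the server-side one.
    .filter((tool) => tool.name !== "code_execution")
    // Every remaining tool becomes callable from the code execution sandbox.
    .map((tool) => ({ ...tool, allowed_callers: [SERVER_TOOL] }));
  // The server-side tool is prepended to the list.
  return [{ type: SERVER_TOOL, name: "code_execution" }, ...callable];
}

const result = withProgrammaticToolCalling([
  { name: "read_file", description: "Read file contents" },
  { name: "code_execution", description: "Client-side JS execution" },
]);
console.log(result.map((tool) => tool.name)); // [ 'code_execution', 'read_file' ]
```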

Enabling

Via the CLI:

smol-agent -p anthropic -m claude-sonnet-4-6 --programmatic-tool-calling "your prompt"

Programmatically:

const agent = new Agent({
  provider: "anthropic",
  model: "claude-sonnet-4-6",
  programmaticToolCalling: true,
});

When To Use

Programmatic tool calling is useful when Claude needs to:

Orchestrate multiple tool calls in a single reasoning step
Process tool results with Python code before deciding next steps
Perform calculations or data transformations on tool outputs

Context Management

The agent manages context window limits automatically:

Token tracking: Monitors token usage throughout the conversation
Intelligent pruning: When approaching limits, removes less important messages first
Context summarization: Summarizes old conversation turns to compress context
Result truncation: Truncates large tool results while preserving key information

This allows the agent to work on large codebases without running into context window errors.
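Two of the techniques above, result truncation and pruning, can be sketched in a few lines. This is a simplified illustration under stated assumptions: tokens are estimated at roughly four characters each rather than with a real tokenizer, "less important" is approximated as "oldest non-system message", and all function names are hypothetical.

```javascript
// Rough token estimate: ~4 characters per token (an assumption, not a tokenizer).
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Keep the head and tail of an oversized tool result.
function truncateResult(text, maxChars = 2000) {
  if (text.length <= maxChars) return text;
  const half = Math.max(1, Math.floor(maxChars / 2));
  const omitted = text.length - 2 * half;
  return `${text.slice(0, half)}\n[... ${omitted} characters omitted ...]\n${text.slice(-half)}`;
}

// Drop the oldest non-system messages until the estimate fits the budget.
function pruneMessages(messages, maxTokens) {
  const kept = [...messages];
  const total = () => kept.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (total() > maxTokens) {
    const index = kept.findIndex((m) => m.role !== "system");
    if (index === -1) break; // only system messages are left
    kept.splice(index, 1);
  }
  return kept;
}

const history = [
  { role: "system", content: "You are a coding agent." },
  { role: "user", content: "x".repeat(400) },
  { role: "assistant", content: "Done." },
];
console.log(pruneMessages(history, 20).map((m) => m.role)); // [ 'system', 'assistant' ]
```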

Context Injection

smol-agent can inject project-specific context into the system prompt:

AGENT.md: Place a file named AGENT.md in your project root. It will be included in the context sent to the LLM.
Skills: Markdown files in .smol-agent/skills/ or ~/.config/smol-agent/skills/ are loaded as skills.

Persistent Memory

The agent can remember facts across sessions using the remember and recall tools:

remember: Save a fact, pattern, or preference to persistent storage
recall: Retrieve memories (optionally filtered by key or category)

Memories are stored in ~/.config/smol-agent/memories.json and persist across sessions.

Skills

Skills are markdown files that define reusable prompts for common tasks:

Location: .smol-agent/skills/ (project) or ~/.config/smol-agent/skills/ (global)
Format: Markdown with # Skill Name and Description: ... in the header
Loading: Skills are loaded on startup and injected into the system prompt

Example skill:

# Fix Lint Errors
Description: Fix linting errors in the codebase

Find and fix all linting errors in the project. Run the linter first to identify issues, then fix each one systematically.
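Given the format described above, a skill file could be parsed roughly as follows. The function `parseSkill` and the returned field names are assumptions for illustration; smol-agent's own loader may handle the header differently.

```javascript
// Hypothetical parser for the skill layout described above.
function parseSkill(markdown) {
  const lines = markdown.split("\n");
  const nameIndex = lines.findIndex((line) => line.startsWith("# "));
  if (nameIndex === -1) throw new Error("Skill is missing a '# Skill Name' heading");
  const descIndex = lines.findIndex((line) => line.startsWith("Description:"));
  const description =
    descIndex === -1 ? "" : lines[descIndex].slice("Description:".length).trim();
  // Everything after the header lines is treated as the prompt body.
  const bodyStart = Math.max(nameIndex, descIndex) + 1;
  return {
    name: lines[nameIndex].slice(2).trim(),
    description,
    prompt: lines.slice(bodyStart).join("\n").trim(),
  };
}

const skill = parseSkill(
  [
    "# Fix Lint Errors",
    "Description: Fix linting errors in the codebase",
    "",
    "Find and fix all linting errors in the project.",
  ].join("\n")
);
console.log(skill.name); // Fix Lint Errors
```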

Architecture

User prompt → Agent.run() → LLM Provider API or Codex app-server → tool calls → execute tools → feed results back → repeat until text response

The agent is an EventEmitter that drives a loop: send messages to the configured provider, check for tool calls, execute them, push results back, and repeat (max 25 iterations). The pi-tui UI subscribes to events (tool_call, tool_result, response, error) to render progress.

Most providers use smol-agent's normal tool-calling loop. The codex provider is different: it streams output from codex app-server, and Codex performs its own file and command execution directly in the working tree.
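For the normal tool-calling path, the loop described above can be sketched like this. Everything about the interfaces is assumed for illustration: `provider.chat`, the message and tool-call shapes, and the `emit` callback standing in for the EventEmitter. What happens at the iteration cap is not documented here, so throwing an error is also an assumption.

```javascript
// Hypothetical loop; provider.chat, the message shapes, and emit are assumptions.
async function runAgentLoop({ provider, tools, prompt, emit, maxIterations = 25 }) {
  const messages = [{ role: "user", content: prompt }];
  for (let i = 0; i < maxIterations; i++) {
    const reply = await provider.chat(messages);
    messages.push({ role: "assistant", content: reply.content, toolCalls: reply.toolCalls });
    if (!reply.toolCalls || reply.toolCalls.length === 0) {
      emit("response", reply.content); // a plain text reply ends the loop
      return reply.content;
    }
    for (const call of reply.toolCalls) {
      emit("tool_call", call);
      const tool = tools[call.name];
      const result = tool ? await tool(call.args) : `Unknown tool: ${call.name}`;
      emit("tool_result", { name: call.name, result });
      // Feed the result back so the next provider call can see it.
      messages.push({ role: "tool", name: call.name, content: String(result) });
    }
  }
  const error = new Error(`Stopped after ${maxIterations} iterations`);
  emit("error", error);
  throw error;
}

// A scripted stand-in provider: one tool call, then a final text reply.
const script = [
  { content: "", toolCalls: [{ name: "read_file", args: { filePath: "a.js" } }] },
  { content: "Done.", toolCalls: [] },
];
const provider = { chat: async () => script.shift() };
const tools = { read_file: async () => "console.log(1)" };
const events = [];

runAgentLoop({ provider, tools, prompt: "hi", emit: (name) => events.push(name) })
  .then((text) => console.log(text, events));
```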

Advanced Features

ACP Server Mode

Run as an Agent Client Protocol server for IDE/editor integration:

smol-agent --acp

In this mode, smol-agent communicates via JSON-RPC over stdio and works with any editor that supports ACP.

Cross-Agent Communication

smol-agent instances can communicate across repositories using the inbox/letter protocol. This allows a frontend agent to request backend changes, a main agent to delegate to a documentation agent, etc.

How It Works

Agent A                          Agent B
  |                                |
  |  1. find_agent_for_task()      |
  |  2. send_letter() --------->  inbox/.letter.md
  |                                |  3. watchInbox detects letter
  |                                |  4. Spawns agent, does work
  |                                |  5. reply_to_letter()
  |  inbox/.response.md  <------- |
  |  6. Auto-notified via watcher  |
  |  (injected into conversation)  |

Agent Discovery

Agents self-register in a global registry (~/.config/smol-agent/agents.json) on startup. Use these tools to find and communicate with other agents:

| Tool | Description |
| --- | --- |
| `list_agents` | List all registered agents |
| `find_agent_for_task` | Find the best agent for a task (keyword matching against snippets) |
| `send_letter` | Send a work request to another agent (supports `wait_for_reply`) |
| `check_reply` | Poll for a response to a sent letter |
| `reply_to_letter` | Send a response back after completing work |
| `link_repos` | Create relationships between repos (depends-on, serves, etc.) |
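The "keyword matching against snippets" behind `find_agent_for_task` might work along these lines. This is a guess at the general idea, not the actual algorithm: the function `findAgentForTask`, the registry entry shape (`name`, `snippet`), and the scoring rule (count of shared words) are all assumptions.

```javascript
// Split text into a set of lowercase alphanumeric words.
function wordsOf(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Hypothetical scorer: counts task words that also occur in an agent's snippet.
function findAgentForTask(task, agents) {
  const taskWords = wordsOf(task);
  let best = null;
  let bestScore = 0;
  for (const agent of agents) {
    const snippetWords = wordsOf(agent.snippet);
    let score = 0;
    for (const word of taskWords) {
      if (snippetWords.has(word)) score++;
    }
    if (score > bestScore) {
      best = agent;
      bestScore = score;
    }
  }
  return best; // null when no snippet shares a word with the task
}

const agents = [
  { name: "frontend", snippet: "React UI components and styling" },
  { name: "backend", snippet: "REST API endpoints and database migrations" },
];
console.log(findAgentForTask("add a database index for the API", agents).name); // backend
```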

Response Delivery

Responses are delivered through three complementary mechanisms:

Auto-notification (default) -- A file watcher detects incoming .response.md files and injects the reply into the running conversation automatically.
Blocking wait -- send_letter(wait_for_reply: true) blocks until the reply arrives (up to 5 minutes).
Manual poll -- check_reply(letter_id) for explicit polling.

If a spawned agent exits without calling reply_to_letter, the system auto-generates a completed/failed response as a safety net.

Inbox Watcher Mode

Run a persistent watcher that processes incoming letters:

smol-agent --watch-inbox

See docs/cross-agent-communication.md for the full protocol specification with Mermaid diagrams.

Contributing

See CONTRIBUTING.md for guidelines.

Security

smol-agent operates within a "jail" directory:

All file operations are restricted to the jail directory
Commands like rm -rf / are blocked
Git push and --force are blocked

However, you are responsible for reviewing changes before approving tool calls. The agent will ask for approval before:

Writing files
Running shell commands

Use --auto-approve (or --yolo) to skip approvals (use with caution).
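The git restrictions listed above could be enforced with a check like the one below. It is a deliberately simple sketch: the function `checkGitArgs` is hypothetical, it treats the first argument as the subcommand (so global options placed before the subcommand are not handled), and it blocks only what this README names, `push` and `--force`.

```javascript
// Hypothetical guard over an argument array such as ["push", "origin", "main"].
function checkGitArgs(args) {
  if (args[0] === "push") {
    return { allowed: false, reason: "git push is blocked" };
  }
  if (args.includes("--force")) {
    return { allowed: false, reason: "--force is blocked" };
  }
  return { allowed: true, reason: null };
}

console.log(checkGitArgs(["status"]).allowed); // true
console.log(checkGitArgs(["push", "origin", "main"]).reason); // git push is blocked
```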

License

MIT
