# vasted

vasted is a CLI that launches on-demand Vast.ai GPU workers for llama.cpp GGUF inference and exposes a stable OpenAI-compatible `/v1` endpoint.
Built by deeflect.com · Follow on X: x.com/deeflectcom
## Features

- Stable client endpoint while worker URLs rotate.
- Setup wizard for local machine and VPS deployments.
- Non-interactive automation mode for agents/CI.
- OpenAI-compatible proxy for tools that expect `/v1` APIs.
- Session usage and cost tracking.
- Optional Telegram bot control commands.
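The stable-endpoint idea can be sketched conceptually: the proxy keeps one client-facing base URL and rewrites each request path onto whichever worker URL is currently live. This is an illustrative sketch only, not vasted's actual `app/proxy.py` implementation; all names here are made up.

```python
# Conceptual sketch of stable-endpoint routing: clients always hit the same
# base URL, while the upstream worker URL may change between launches.
# NOT vasted's real proxy code; names and structure are illustrative.

class StableEndpoint:
    def __init__(self) -> None:
        self.worker_base: str | None = None  # rotates as workers come and go

    def set_worker(self, url: str) -> None:
        self.worker_base = url.rstrip("/")

    def upstream_url(self, path: str) -> str:
        """Map a client request path (e.g. /v1/chat/completions) to the
        current worker; raise if no worker is live."""
        if self.worker_base is None:
            raise RuntimeError("no worker is currently running")
        return self.worker_base + path

endpoint = StableEndpoint()
endpoint.set_worker("http://203.0.113.7:41234/")   # first Vast.ai worker
first = endpoint.upstream_url("/v1/chat/completions")
endpoint.set_worker("http://198.51.100.9:40022/")  # replacement worker
second = endpoint.upstream_url("/v1/chat/completions")
# The client-facing path never changed; only the upstream base did.
```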
## Requirements

- Python 3.12+
- uv
- Vast.ai account + API key
- Optional: Telegram bot token (`telegram` extra)
## Install

Install as a uv tool:

```sh
uv tool install vasted
vasted --version
```

Upgrade:

```sh
uv tool upgrade vasted
```

From source:

```sh
git clone https://github.com/deeflect/vasted.git
cd vasted
uv sync --extra dev
```

Run CLI commands from the repo:

```sh
uv run vasted --help
```

Or install straight from git:

```sh
uv tool install "git+https://github.com/deeflect/vasted.git"
```

## Quick start

If installed as a tool:
```sh
vasted setup
vasted up
vasted status --verbose
```

From source checkout:

```sh
uv run vasted setup
uv run vasted up
uv run vasted status --verbose
```

Client connection values after setup:
- Base URL: `http://<host>:<port>/v1`
- Auth header: `Authorization: Bearer <token>`

When `proxy_host` is `0.0.0.0`, use your real machine/VPS IP or domain in clients.
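Any OpenAI-compatible client can use these values. As a minimal stdlib sketch that builds (but does not send) a chat-completion request, where the host, port, token, and model name are placeholders for the values printed by setup:

```python
import json
import urllib.request

def build_chat_request(base_url: str, token: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build an OpenAI-compatible /chat/completions request for the proxy.
    Nothing is sent until urllib.request.urlopen() is called."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "http://localhost:8000/v1",   # assumed host/port; use your setup values
    "example-token",              # bearer token from `vasted token show`
    "qwen3-coder-30b",
    [{"role": "user", "content": "hello"}],
)
# urllib.request.urlopen(req) would send it once a worker is up
```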
## Non-interactive mode

Use non-interactive commands to avoid prompts:

```sh
uv run vasted setup --non-interactive \
  --vast-api-key "$VASTED_API_KEY" \
  --bearer-token "$VASTED_BEARER_TOKEN" \
  --client openclaw \
  --deployment-mode local_pc \
  --model qwen3-coder-30b \
  --quality balanced \
  --gpu-mode auto
uv run vasted up --non-interactive --yes --jinja \
  --model qwen3-coder-30b --quality balanced --gpu-mode auto --no-serve
uv run vasted status --verbose
uv run vasted usage
uv run vasted down --force
```

Environment variables accepted by `setup --non-interactive`:
- `VASTED_API_KEY`
- `VASTED_BEARER_TOKEN`
- `VASTED_CLIENT` (`openclaw`, `opencode`, `custom`)
- `VASTED_LLAMA_JINJA` (`true`/`false`)
- `VASTED_MODEL`, `VASTED_QUALITY`, `VASTED_GPU_MODE`, `VASTED_GPU_PRESET`
- `VASTED_DEPLOYMENT_MODE`, `VASTED_PROXY_HOST`, `VASTED_PROXY_PORT`, `VASTED_PUBLIC_HOST`
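As a hypothetical illustration of how such variables might be consumed (the variable names come from this README, but the fallback defaults below are assumptions for the sketch, not vasted's documented defaults):

```python
import os

# Hypothetical helper mirroring how a non-interactive setup might read its
# environment. Variable names are from the README; the defaults ("custom",
# "auto", "false") are assumptions, not vasted's documented behavior.
def read_setup_env(env=os.environ) -> dict:
    return {
        "api_key": env.get("VASTED_API_KEY", ""),
        "bearer_token": env.get("VASTED_BEARER_TOKEN", ""),
        "client": env.get("VASTED_CLIENT", "custom"),
        "jinja": env.get("VASTED_LLAMA_JINJA", "false").lower() == "true",
        "model": env.get("VASTED_MODEL", ""),
        "gpu_mode": env.get("VASTED_GPU_MODE", "auto"),
    }

cfg = read_setup_env({"VASTED_CLIENT": "openclaw", "VASTED_LLAMA_JINJA": "true"})
```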
## Client presets

`setup` supports client presets that define the default llama.cpp `--jinja` behavior:

- `--client openclaw`: jinja on by default
- `--client opencode`: jinja off by default
- `--client custom`: keep/manual behavior

A per-launch override is still available:

```sh
uv run vasted up --jinja
uv run vasted up --no-jinja
```

## Command reference

```sh
vasted setup [--non-interactive] [--manual] [--client openclaw|opencode|custom]
vasted up [--model ...] [--quality ...] [--gpu-mode auto|manual] [--gpu-preset ...] [--profile ...] [--max-price ...] [--jinja|--no-jinja] [--yes] [--non-interactive] [--serve|--no-serve]
vasted down [--force]
vasted status [--verbose]
vasted logs [--instance-id N] [--tail N]
vasted usage
vasted token show [--full]
vasted token rotate
vasted rotate-token
vasted config show
vasted profile list|add|use|remove
vasted completions <bash|zsh|fish>
```

## Telegram bot

Install the `telegram` extra and run:
```sh
uv sync --extra telegram
uv run python bot.py
```

## Development

Lint, typecheck, and test:

```sh
uv run ruff check .
uv run mypy app tests bot.py
uv run pytest -q
```

## Project layout

- `app/commands/*`: CLI command handlers
- `app/service.py`: worker lifecycle + launch policy
- `app/proxy.py`: OpenAI-compatible reverse proxy
- `app/vast.py`: Vast API integration + startup script generation
- `app/usage.py`: token/time/cost accounting
- `app/user_config.py`: persistent config + keyring integration
- `app/state.py`: runtime state persistence
- `bot.py`: optional Telegram control plane
## Security

- Keep Vast API keys and bearer tokens private.
- Prefer localhost binds unless remote access is required.
- See SECURITY.md for the disclosure policy.
## Contributing

See CONTRIBUTING.md and run the validation commands before opening a PR.
## License

MIT — see LICENSE.
