High-performance proxy router for LLM APIs with automatic load balancing, rate limiting, and fail2ban protection. Routes requests to OpenAI, Vertex AI, Gemini AI Studio, Anthropic, and other Auto AI Router instances.
- Multi-provider support — OpenAI, Vertex AI, Gemini, Anthropic, Proxy chains
- Round-robin load balancing — across multiple credentials per model
- Rate limiting — per-credential and per-model RPM/TPM controls
- Fail2ban — automatic provider banning on repeated errors
- Prometheus metrics — request counts, latency, credential status
- LiteLLM DB integration — spend logging and API key authentication
- Streaming — full SSE support for all providers
- Environment variables — secure credential management via `os.environ/VAR_NAME` references in config
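The round-robin balancing listed above can be sketched roughly as follows. This is a minimal illustration only; `credentialPool` and its fields are hypothetical names for this example, not the router's actual types:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// credentialPool cycles through multiple credentials for one model.
// An atomic counter makes selection safe under concurrent requests.
type credentialPool struct {
	keys []string
	next atomic.Uint64
}

// pick returns the next credential in round-robin order.
func (p *credentialPool) pick() string {
	i := p.next.Add(1) - 1
	return p.keys[i%uint64(len(p.keys))]
}

func main() {
	pool := &credentialPool{keys: []string{"key-a", "key-b", "key-c"}}
	for i := 0; i < 5; i++ {
		fmt.Println(pool.pick()) // key-a, key-b, key-c, key-a, key-b
	}
}
```

In the real router, a picked credential that is rate-limited or banned by fail2ban would presumably be skipped in favor of the next one in the cycle.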
```bash
# Build from source
git clone https://github.com/MiXaiLL76/auto_ai_router.git
cd auto_ai_router
go build -o auto_ai_router ./cmd/server/
```
```bash
# Run
./auto_ai_router -config config.yaml
```

Or with Docker:

```bash
docker pull ghcr.io/mixaill76/auto_ai_router:latest
docker run -p 8080:8080 -v $(pwd)/config.yaml:/app/config.yaml ghcr.io/mixaill76/auto_ai_router:latest
```

Full documentation is available at mixaill76.github.io/auto_ai_router.
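The mounted `config.yaml` might look roughly like the sketch below. This is an illustrative guess at the shape of such a config, not the project's actual schema (consult the documentation linked above); the model name, provider fields, and environment variable names are all placeholders:

```yaml
# Illustrative sketch only — see the project docs for the real schema.
models:
  gpt-4o:
    providers:
      - type: openai
        api_key: os.environ/OPENAI_KEY_1   # resolved from the environment at startup
        rpm: 500                           # per-credential requests-per-minute limit
      - type: openai
        api_key: os.environ/OPENAI_KEY_2   # round-robin partner for key 1
```

Referencing credentials with the `os.environ/VAR_NAME` syntax keeps API keys out of the config file itself, so the file can be committed or mounted without leaking secrets.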
Apache License 2.0 — see LICENSE file.