Commit 21dba4e

examples-bot committed
feat(examples): add 530 — Voice Agent Multi-Provider Chat Completions Proxy (Python)
1 parent fe9e6da commit 21dba4e

9 files changed

Lines changed: 591 additions & 0 deletions

examples/530-voice-agent-multi-provider-proxy-python/.env.example

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
# Deepgram — https://console.deepgram.com/
DEEPGRAM_API_KEY=

# Active LLM provider: "openai" or "bedrock"
LLM_PROVIDER=openai

# OpenAI — https://platform.openai.com/api-keys
OPENAI_API_KEY=

# Amazon Bedrock — https://console.aws.amazon.com/bedrock/
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_REGION=us-east-1
examples/530-voice-agent-multi-provider-proxy-python/README.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
# Multi-Provider Chat Completions Proxy for Deepgram Voice Agent

An OpenAI-compatible proxy server that sits between the Deepgram Voice Agent API and multiple LLM backends. Swap between OpenAI and Amazon Bedrock by changing one environment variable — no code changes needed.

## What you'll build

A FastAPI server exposing `/v1/chat/completions` that the Deepgram Voice Agent uses as its "think" endpoint. The proxy translates requests to whichever LLM backend you configure (`openai` or `bedrock`), so you can switch providers without modifying the agent code.
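
To make the wiring concrete, this is the slice of the agent's `Settings` message that points the think step at the proxy (the full settings are built in `src/agent.py` later in this diff):

```json
{
  "type": "Settings",
  "agent": {
    "think": {
      "provider": { "type": "open_ai", "model": "proxy" },
      "endpoint": {
        "url": "http://localhost:8080/v1/chat/completions",
        "headers": {}
      }
    }
  }
}
```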

## Prerequisites

- Python 3.10+
- Deepgram account — [get a free API key](https://console.deepgram.com/)
- OpenAI account — [get an API key](https://platform.openai.com/api-keys) (for the OpenAI provider)
- AWS account with Bedrock access — [enable models](https://console.aws.amazon.com/bedrock/) (for the Bedrock provider)

## Environment variables

| Variable | Where to find it |
|----------|------------------|
| `DEEPGRAM_API_KEY` | [Deepgram console](https://console.deepgram.com/) |
| `LLM_PROVIDER` | Set to `openai` or `bedrock` |
| `OPENAI_API_KEY` | [OpenAI dashboard](https://platform.openai.com/api-keys) |
| `AWS_ACCESS_KEY_ID` | [AWS IAM console](https://console.aws.amazon.com/iam/) |
| `AWS_SECRET_ACCESS_KEY` | [AWS IAM console](https://console.aws.amazon.com/iam/) |
| `AWS_REGION` | Your Bedrock-enabled region (default: `us-east-1`) |

## Install and run

```bash
cp .env.example .env
# Fill in your credentials in .env

pip install -r requirements.txt

# Start the proxy server
uvicorn src.proxy:app --port 8080

# In another terminal — run the Voice Agent client
python -m src.agent
```
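
Once the proxy is running, you can smoke-test it without the agent in the loop. A sketch assuming the proxy accepts a standard chat-completions body (the `model` value here is illustrative, since routing is decided by `LLM_PROVIDER`):

```bash
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "proxy", "messages": [{"role": "user", "content": "Say hello"}]}'
```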

## Key parameters

| Parameter | Value | Description |
|-----------|-------|-------------|
| `LLM_PROVIDER` | `openai` / `bedrock` | Which LLM backend the proxy routes to |
| `think.endpoint.url` | `http://localhost:8080/v1/chat/completions` | Voice Agent sends LLM requests here instead of to OpenAI |
| `listen.provider.model` | `nova-3` | Deepgram STT model for speech recognition |
| `speak.provider.model` | `aura-2-thalia-en` | Deepgram TTS model for voice output |

## How it works

1. **Start the proxy** — the FastAPI server exposes an OpenAI-compatible `/v1/chat/completions` endpoint
2. **Configure the Voice Agent** — the agent's `think.endpoint.url` points at the proxy instead of directly at OpenAI's API
3. **Agent connects to Deepgram** — the Voice Agent WebSocket opens and sends settings with the custom think endpoint
4. **User speaks** — Deepgram STT transcribes audio via nova-3, then the agent sends a chat-completions request to the proxy
5. **Proxy dispatches** — based on `LLM_PROVIDER`, the proxy forwards to OpenAI or translates to Bedrock's Converse API (see the sketch below)
6. **Response flows back** — the proxy returns an OpenAI-format response, and the agent speaks it via Deepgram TTS
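
`src/proxy.py` is part of this commit but not reproduced in this excerpt, so here is a minimal sketch of the dispatch flow described above, assuming a plain FastAPI app. The `_forward_*` and `_DISPATCH` names follow the "Adding a new provider" notes below; the request handling and Converse translation are illustrative, not the committed implementation.

```python
# Illustrative sketch; not the committed src/proxy.py.
import os

import boto3
import httpx
from fastapi import FastAPI, Request

app = FastAPI()


async def _forward_openai(body: dict) -> dict:
    # OpenAI already speaks the chat-completions format: pass the request
    # through, overriding the model with the configured one.
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.openai.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={**body, "model": os.environ.get("OPENAI_MODEL", "gpt-4o-mini")},
            timeout=30.0,
        )
        return resp.json()


async def _forward_bedrock(body: dict) -> dict:
    # Translate OpenAI-style messages into Bedrock's Converse API shape,
    # then wrap the reply back into an OpenAI-format completion.
    client = boto3.client(
        "bedrock-runtime", region_name=os.environ.get("AWS_REGION", "us-east-1")
    )
    system = [{"text": m["content"]} for m in body["messages"] if m["role"] == "system"]
    messages = [
        {"role": m["role"], "content": [{"text": m["content"]}]}
        for m in body["messages"]
        if m["role"] != "system"
    ]
    kwargs = {
        "modelId": os.environ.get(
            "BEDROCK_MODEL_ID", "anthropic.claude-3-haiku-20240307-v1:0"
        ),
        "messages": messages,
    }
    if system:
        kwargs["system"] = system
    result = client.converse(**kwargs)
    text = result["output"]["message"]["content"][0]["text"]
    return {
        "object": "chat.completion",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }


_DISPATCH = {"openai": _forward_openai, "bedrock": _forward_bedrock}


@app.post("/v1/chat/completions")
async def chat_completions(request: Request) -> dict:
    body = await request.json()
    forward = _DISPATCH[os.environ.get("LLM_PROVIDER", "openai")]
    return await forward(body)
```

This sketch covers only the non-streaming path; a real think endpoint may also need to honor `stream: true` requests.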

### Adding a new provider

1. Add the provider's env vars to `.env.example`
2. Add a new branch in `src/config.py:load_provider_config()`
3. Add a `_forward_{provider}()` function in `src/proxy.py`
4. Register it in `_DISPATCH`

## Starter templates

[deepgram-starters](https://github.com/orgs/deepgram-starters/repositories)
examples/530-voice-agent-multi-provider-proxy-python/requirements.txt

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
deepgram-sdk==6.1.1
fastapi==0.135.3
uvicorn[standard]==0.44.0
httpx==0.28.1
python-dotenv==1.2.2
boto3==1.42.85

examples/530-voice-agent-multi-provider-proxy-python/src/__init__.py

Whitespace-only changes.
examples/530-voice-agent-multi-provider-proxy-python/src/agent.py

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
"""Deepgram Voice Agent client that routes its LLM calls through the local proxy.

Connects to the Deepgram Voice Agent API via the Python SDK and configures
a custom "think" endpoint pointing at the proxy server. This lets you swap
LLM backends (OpenAI, Bedrock, etc.) without touching the agent code — just
change LLM_PROVIDER in your .env file.

Usage:
    # First start the proxy:
    uvicorn src.proxy:app --port 8080

    # Then run the agent:
    python -m src.agent
"""

import os
import signal
import sys
import threading

from dotenv import load_dotenv

load_dotenv()

from deepgram import DeepgramClient
from deepgram.agent.v1.types.agent_v1settings import AgentV1Settings
from deepgram.agent.v1.types.agent_v1conversation_text import AgentV1ConversationText
from deepgram.agent.v1.types.agent_v1settings_applied import AgentV1SettingsApplied
from deepgram.agent.v1.types.agent_v1welcome import AgentV1Welcome
from deepgram.agent.v1.types.agent_v1agent_thinking import AgentV1AgentThinking
from deepgram.agent.v1.types.agent_v1agent_audio_done import AgentV1AgentAudioDone
from deepgram.agent.v1.types.agent_v1error import AgentV1Error
from deepgram.agent.v1.types.agent_v1function_call_request import AgentV1FunctionCallRequest
from deepgram.agent.v1.types.agent_v1send_function_call_response import AgentV1SendFunctionCallResponse

PROXY_URL = os.environ.get("PROXY_URL", "http://localhost:8080")


def build_agent_settings(proxy_url: str = PROXY_URL) -> AgentV1Settings:
    """Build the Voice Agent settings that point the think endpoint at the proxy.

    The key insight: setting think.endpoint.url to our proxy means every LLM
    call the agent makes goes through the proxy, which routes to whichever
    backend LLM_PROVIDER selects. Changing providers requires zero code changes
    in this file — just update .env.
    """
    return AgentV1Settings(
        type="Settings",
        # ← tags identify this example's traffic in the Deepgram console
        tags=["deepgram-examples"],
        audio={
            "input": {"encoding": "linear16", "sample_rate": 16000},
            "output": {"encoding": "linear16", "sample_rate": 16000},
        },
        agent={
            "listen": {
                "provider": {"type": "deepgram", "model": "nova-3"},
            },
            "think": {
                "provider": {"type": "open_ai", "model": "proxy"},
                # ← THIS enables custom LLM routing: the Voice Agent sends
                # chat-completions requests to our proxy instead of OpenAI
                "endpoint": {
                    "url": f"{proxy_url}/v1/chat/completions",
                    "headers": {},
                },
                "prompt": (
                    "You are a helpful voice assistant. Keep responses brief "
                    "and conversational — the user is speaking to you, not reading."
                ),
            },
            "speak": {
                "provider": {"type": "deepgram", "model": "aura-2-thalia-en"},
            },
            "greeting": "Hello! I'm your voice assistant powered by Deepgram. How can I help?",
        },
    )


def run_agent(proxy_url: str = PROXY_URL) -> None:
    """Connect to the Deepgram Voice Agent API and handle agent events.

    This is a demonstration entry point with no audio capture wired in.
    In production you'd pipe audio from a phone call, browser WebSocket,
    or other source such as a microphone.
    """
    if not os.environ.get("DEEPGRAM_API_KEY"):
        print("Error: DEEPGRAM_API_KEY not set", file=sys.stderr)
        sys.exit(1)

    client = DeepgramClient()
    settings = build_agent_settings(proxy_url)

    print(f"[agent] Connecting to Deepgram Voice Agent (proxy at {proxy_url})...")

    with client.agent.v1.connect() as connection:
        connection.send_settings(settings)

        stop_event = threading.Event()

        def on_recv():
            while not stop_event.is_set():
                try:
                    msg = connection.recv()
                except Exception:
                    break

                if isinstance(msg, AgentV1Welcome):
                    print(f"[agent] Connected — request_id: {msg.request_id}")
                elif isinstance(msg, AgentV1SettingsApplied):
                    print("[agent] Settings applied — proxy endpoint active")
                elif isinstance(msg, AgentV1ConversationText):
                    print(f"[{msg.role}] {msg.content}")
                elif isinstance(msg, AgentV1AgentThinking):
                    print("[agent] Thinking...")
                elif isinstance(msg, AgentV1AgentAudioDone):
                    print("[agent] Audio done")
                elif isinstance(msg, AgentV1Error):
                    print(f"[agent] Error: {msg.description}")
                elif isinstance(msg, bytes):
                    pass  # raw TTS audio; a real client would play or forward it
                elif isinstance(msg, AgentV1FunctionCallRequest):
                    for fn in msg.functions or []:
                        connection.send_function_call_response(
                            AgentV1SendFunctionCallResponse(
                                type="FunctionCallResponse",
                                id=fn.id,
                                output='{"error": "no functions registered"}',
                            )
                        )

        recv_thread = threading.Thread(target=on_recv, daemon=True)
        recv_thread.start()

        print("[agent] Agent is running. Press Ctrl+C to stop.")
        print("[agent] (No microphone input in this demo — connect a real audio source)")

        def handle_signal(sig, frame):
            stop_event.set()

        signal.signal(signal.SIGINT, handle_signal)

        stop_event.wait()
        print("\n[agent] Shutting down...")


if __name__ == "__main__":
    run_agent()
examples/530-voice-agent-multi-provider-proxy-python/src/config.py

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
"""Provider configuration for the multi-provider chat completions proxy.

Reads LLM_PROVIDER from the environment and returns the corresponding
backend configuration. Adding a new provider means adding one more
branch here and its env vars to .env.example.
"""

import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    model: str
    api_base: str | None
    api_key: str | None
    extra_headers: dict[str, str]


def load_provider_config() -> ProviderConfig:
    provider = os.environ.get("LLM_PROVIDER", "openai").lower()

    if provider == "openai":
        return ProviderConfig(
            name="openai",
            model=os.environ.get("OPENAI_MODEL", "gpt-4o-mini"),
            api_base="https://api.openai.com/v1",
            api_key=os.environ.get("OPENAI_API_KEY"),
            extra_headers={},
        )

    if provider == "bedrock":
        # Amazon Bedrock doesn't natively expose an OpenAI-compatible API.
        # This proxy bridges that gap — but you still need valid AWS creds
        # so the proxy can call Bedrock's Converse API.
        return ProviderConfig(
            name="bedrock",
            model=os.environ.get(
                "BEDROCK_MODEL_ID",
                "anthropic.claude-3-haiku-20240307-v1:0",
            ),
            api_base=None,
            api_key=None,
            extra_headers={},
        )

    raise ValueError(
        f"Unknown LLM_PROVIDER '{provider}'. Supported: openai, bedrock"
    )
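
A quick, illustrative way to sanity-check provider selection from a Python REPL (not part of the commit):

```python
import os

# Pick the backend before loading the config.
os.environ["LLM_PROVIDER"] = "bedrock"

from src.config import load_provider_config

cfg = load_provider_config()
print(cfg.name, cfg.model)
# -> bedrock anthropic.claude-3-haiku-20240307-v1:0
```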
