OASIS

Open Agentic Survey Interview System

Self-hosted platform for AI-powered conversational interviews.
Voice and text. Any model provider. Your infrastructure, your data.


Website · Docs · FAQ · About · License


What is OASIS?

OASIS lets you run conversational AI interviews from your own infrastructure. Define a study, configure an agent with a system prompt and model, share a link with participants. Transcripts are stored in your database. You control the data, the models, and the pipeline.

Built because existing tools for conversational AI aren't designed for research. Things like follow-up probing, semi-structured interview guides, participant tracking, and study-level organization are afterthoughts in commercial platforms. OASIS puts them front and center.

OASIS launched in March 2026. It works, but it's a young project. If something breaks, open an issue.

Demo

1. Set Up a Study

demo1-fast.mp4

2. Create and Configure an Agent

demo2-fast.mp4

3. Collect and Manage Data

demo3-fast.mp4

Features

  • Voice + text interviews. Real-time speech (STT → LLM → TTS) or clean text chat with customizable avatars.
  • Voice-to-voice. Stream audio directly to multimodal models (OpenAI Realtime, Gemini Live) for lower latency.
  • Semi-structured mode. Define question guides with follow-up probes and transition logic. The agent follows a structured backbone while keeping conversation natural.
  • Multi-provider. OpenAI, Google Gemini, Scaleway, Azure, GCP Vertex, or any LiteLLM-compatible provider. Custom model IDs supported.
  • Self-hosted STT/TTS. OpenAI Whisper, Deepgram, ElevenLabs, Cartesia, Scaleway, or bring your own OpenAI-compatible server.
  • Knowledge base (RAG). Upload documents, OASIS chunks and embeds them with pgvector. Agents can retrieve relevant context during interviews. Embeddings work with OpenAI or any self-hosted server.
  • Research-first. Study-level organization, participant identifiers (random/predefined/self-reported), diarized transcripts, session analytics, data export.
  • Phone interviews (beta). Twilio Media Streams for incoming calls.
  • Self-hosted. Everything in Docker. No data leaves your infra unless you point at external APIs. Optional dashboard auth.
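
The knowledge base's actual chunking and embedding lives in `backend/app/knowledge/`. Purely as an illustration of the general technique (not OASIS's exact splitting strategy), a fixed-size chunker with overlap, the kind of preprocessing done before embedding with pgvector, might look like:

```python
# Illustrative sketch only: fixed-size chunking with overlap, the typical
# preprocessing step before embedding documents for RAG retrieval.
# OASIS's real implementation may use a different splitting strategy.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "OASIS stores each uploaded document as overlapping chunks. " * 20
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks), "chunks, first has", len(chunks[0]), "chars")
```

The overlap means a sentence cut at a chunk boundary still appears whole in the neighboring chunk, which improves retrieval recall at the cost of some storage.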

See the FAQ for questions about self-hosting, HPC clusters, European cloud providers, and running with fully open-source models.

Architecture

Five Docker containers on one internal network:

| Container  | Stack                     | Role                                              |
|------------|---------------------------|---------------------------------------------------|
| Caddy      | Caddy 2                   | Reverse proxy, auto HTTPS                         |
| Frontend   | React, Tailwind, Nginx    | Dashboard + interview widget                      |
| Backend    | FastAPI, Pipecat, LiteLLM | WebSocket transport, AI pipeline, REST API        |
| PostgreSQL | pgvector/pg16             | Configs, transcripts, participant data, embeddings |
| Redis      | Redis 7                   | Sessions, real-time pub/sub, API key overrides    |
┌────────────────────────────────────────────────┐
│              Caddy (ports 80/443)              │
│           ┌──────────┬───────────┐             │
│           │ Frontend │  Backend  │             │
│           │ (React)  │ (FastAPI) │             │
│           └────┬─────┴─────┬─────┘             │
│                │           │                   │
│        ┌───────┴────┐  ┌───┴───┐               │
│        │ PostgreSQL │  │ Redis │               │
│        └────────────┘  └───────┘               │
└────────────────────────────────────────────────┘

Quick Start

You need Docker and an OpenAI API key. That's the minimum to run a complete text or voice interview end-to-end. Other providers are optional and can be added later.

1. Clone and configure

git clone https://github.com/oasis-surveys/oasis-platform.git
cd oasis-platform
cp .env.example .env

Open .env and set two values:

OPENAI_API_KEY=sk-...
SECRET_KEY=some-random-secret

That's enough to start. With just an OpenAI key you get text chat, voice (Whisper STT + GPT-4o-mini-tts), and voice-to-voice (gpt-realtime). Add DEEPGRAM_API_KEY, ELEVENLABS_API_KEY, GOOGLE_API_KEY, etc. later if you want those providers.
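
`SECRET_KEY` just needs to be a long random string. One convenient way to generate one (an illustration, any source of randomness works) is Python's standard `secrets` module:

```python
# Generate a value suitable for SECRET_KEY in .env.
import secrets

# token_hex(32) yields 32 random bytes as 64 hex characters.
print(f"SECRET_KEY={secrets.token_hex(32)}")
```

Paste the printed line into `.env` in place of the placeholder.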

2. Start it

docker compose up -d

Open http://localhost. The dashboard is there.

3. Run your first study (under a minute)

  1. Click New Study, give it a name.
  2. Click From Template and pick one of the four research templates (semi-structured qualitative, cognitive interview pretest, open-ended survey follow-up, or telephone survey).
  3. The agent is Active by default. Copy the share link from the agent page.
  4. Open the link in a new tab to take the interview yourself, or send it to participants.

Transcripts appear live under the session in the dashboard. Export to CSV/JSON when you're done.
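
The export schema is defined by OASIS; as a post-processing sketch only (the field names below are hypothetical, so check them against your actual export), flattening a JSON session export into per-turn rows could look like:

```python
import json

# Hypothetical shape of a session export; the real field names may differ.
raw = """
{
  "session_id": "abc123",
  "participant": "P-042",
  "turns": [
    {"speaker": "agent", "text": "How did you hear about the program?"},
    {"speaker": "participant", "text": "Through a friend at work."}
  ]
}
"""

session = json.loads(raw)
# One row per turn: (session, participant, speaker, text) -- a convenient
# long format for loading into pandas, R, or a spreadsheet.
rows = [
    (session["session_id"], session["participant"], t["speaker"], t["text"])
    for t in session["turns"]
]
for row in rows:
    print(row)
```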

Going further

See .env.example for every option (Google, Scaleway, Azure, GCP Vertex, self-hosted STT/TTS, RAG embeddings, Twilio, dashboard auth) and the FAQ for self-hosting on HPC and European cloud guidance.

Project Structure

oasis/
├── backend/
│   ├── app/
│   │   ├── api/          # REST + WebSocket endpoints
│   │   ├── models/       # SQLAlchemy ORM models
│   │   ├── schemas/      # Pydantic request/response schemas
│   │   ├── pipeline/     # Pipecat pipeline runner
│   │   ├── knowledge/    # RAG: chunking, embedding, retrieval
│   │   ├── config.py     # Environment settings
│   │   └── main.py       # FastAPI entry point
│   ├── alembic/          # DB migrations
│   ├── tests/
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── pages/        # Dashboard + interview widget
│   │   ├── components/   # Shared UI
│   │   ├── contexts/     # React contexts
│   │   └── lib/          # API client, utils
│   └── Dockerfile
├── docker/
│   └── Caddyfile
├── docker-compose.yml
└── .env.example

Testing

All external calls are mocked. No API keys needed to run tests.

# Backend
cd backend && pip install -r requirements.txt && pytest tests/ -v --cov=app

# Frontend
cd frontend && npm install && npx vitest run

CI runs on every push, plus a weekly scheduled run.

Why

Commercial conversational AI platforms are built for support and sales. Research needs different things:

  • Methodological control. Semi-structured guides, probing logic, participant tracking built in, not bolted on.
  • Transparency. Reviewers should know what system, models, and data storage you used.
  • Affordability. Academic budgets aren't enterprise budgets. Self-hosting with pay-as-you-go API keys is often the only option.
  • Data sovereignty. Especially in Europe, running your own infrastructure is often a compliance requirement.

Contributing

Contributions welcome. Read CONTRIBUTING.md first.

License

OASIS is licensed under the GNU Affero General Public License v3 (AGPL-3.0).

You're free to use, modify, and distribute OASIS for any purpose, including funded academic research. If you deploy a modified version as a network service, you must make your changes available under the same license. That's the deal.

See the full LICENSE file for details and the FAQ for a plain-English explanation. If you use OASIS in your research, a citation would be appreciated (see below).

Citation

@software{lang2026oasis,
  author       = {Lang, Max M.},
  title        = {{OASIS}: Open Agentic Survey Interview System},
  year         = {2026},
  url          = {https://github.com/oasis-surveys/oasis-platform},
  note         = {Self-hosted platform for AI-powered conversational interviews},
  doi          = {10.5281/zenodo.19041570}
}

About

Self-hosted open-source platform for AI-powered survey interviews. Voice-to-voice and text chat agents with semi-structured interview guides, multi-provider LLM support (OpenAI, Gemini, Scaleway), RAG knowledge base, and Twilio telephony. Built for qualitative researchers. Docker, one command, full control.
