Oransim

Causal Digital Twin for Marketing at Scale


🇬🇧 English · 🇨🇳 中文

Reason. Simulate. Intervene.
Predict any marketing decision before you spend a dollar.


Oransim hero · 60-second prediction with counterfactual reasoning over an agent-based society

What it does

Oransim is the open-source reference implementation of the OranAI causal digital-twin stack for social-media ad campaigns — same architecture as the production OranAI Enterprise models, shipped in full so researchers and engineers can read, extend, and pressure-test every layer. Give it a creative, a budget, and a KOL list — in ~60 seconds you get:

  • 📈 Predicted impressions / clicks / conversions / ROI with P35/P50/P65 bands
  • 🔄 Counterfactuals — do(creative=B) / do(budget=x) / do(kol=…) in one forward pass
  • 🗣️ 100k LLM personas reading the actual creative, returning click / skip / comment reactions
  • 📊 14-day Hawkes diffusion curve with mid-campaign intervention rollouts, e.g. do(mute_at_day=3)
  • 🧭 Ranked next actions

v0.2 ships a synthetic demo corpus (2.3 MB — 200 KOLs, 2k scenarios, 100 event streams) and a pretrained LightGBM baseline (R² 0.69–0.89 on synthetic eval). Clone, install, set an LLM API key, run.

Causal Transformer + Causal Neural Hawkes ship architecture + training loop + inference code only (pip install 'oransim[ml]'). Pretrained weights land with OrancBench v0.5 — the current synthetic corpus sits inside the LightGBM baseline's hypothesis class, so CT/NH factual R² wouldn't beat it. The v0.5 causal-native tasks (confounded treatment · CATE heterogeneity · temporal intervention) are where CT/NH structurally win, and that's when weights go out.

Why this isn't one model

Ad prediction looks like regression, but it's actually five distinct hard subproblems stacked on top of each other. Skip any one and the others don't hold.

1. Treatment ≠ observation · historical data is selected, not sampled. High-budget campaigns almost always go to tier-1 KOLs. When you ask do(budget=50k, kol=mid-tier), that combination appears zero times in training. Naive regression credits the ROI to budget when the real driver was KOL quality. You need a loss that decouples the learned representation from treatment assignment — that's why HSIC / adversarial-IPTW / BCAUSS exist, not academic flavor.

2. Budget curves are saturated and fatigue-driven. Doubling spend doesn't double impressions; a user's CTR drops to 40% on their 3rd exposure to the same creative. Linear models extrapolate wrongly to budgets they never saw. Hill saturation is the functional form validated by 30 years of MMM industry practice; without it you're inventing your own curve (Dubé-Manchanda 2005 + Naik-Raman 2003).

3. Diffusion is a self-exciting point process, not an independent time series. A 14-day engagement curve typically shows a second burst from reposts. RNN/Transformer can fit observed curves but can't answer "what if we stopped boosting on day 3" — that needs intervention rollout inside the temporal model. Hawkes intensity is the only family with native do()-over-time support (Mei-Eisner 2017 + Zuo 2020 + Geng 2022).

4. MMM answers totals; decisions need "A vs B". Robyn / Meta LightweightMMM give you the total revenue curve. But the actual marketing question is usually "should I swap KOL A for B" — that's a per-arm counterfactual, not total attribution. You need a multi-head structure that emits all treatment-arm outcomes in a single forward pass (TARNet / Dragonnet are built for exactly this).

5. Creatives are multi-modal; the rest of the stack shouldn't care. Short videos have frames + BGM + subtitles + KOL faces; product pages have images + 3D models. If the embedder layer doesn't project every modality into the same vector space, the budget curve / Hawkes / SCM each need to refit for each new modality. UEB's job is to make downstream code modality-blind — you add a new input, zero downstream changes (text shipped; CLIP / SigLIP / I-JEPA / Whisper on v0.5).

Get all 5 right and you have Oransim. Each layer isn't there to sound good — each one is the only answer to one specific question the others can't handle.

Architecture + research lineage per layer (click to expand)
  • Problem 1 · Causal Transformer World Model (6-layer · code in backend/oransim/world_model/transformer.py)
    • balancing loss: HSIC (Gretton 2005) · adversarial-IPTW · BCAUSS · CaT (Melnychuk ICML 2022)
    • per-arm counterfactual heads: TARNet (Shalit ICML 2017) · Dragonnet (Shi NeurIPS 2019)
    • in-context amortization: CInA (Arik & Pfister NeurIPS 2023)
  • Problem 2 · budget curves (world_model/budget.py): Hill saturation (Dubé & Manchanda 2005) + frequency fatigue (Naik & Raman 2003)
  • Problem 3 · Causal Neural Hawkes Process (CNHP · code in backend/oransim/diffusion/neural_hawkes.py)
    • continuous-time neural intensity: Neural Hawkes Process (Mei & Eisner NeurIPS 2017)
    • Transformer encoder: Transformer Hawkes (Zuo ICML 2020)
    • counterfactual rollout: counterfactual TPP (Geng NeurIPS 2022)
    • sampling + training: Intensity-free (Shchur ICLR 2020) · MC compensator (Chen ICLR 2021) · Ogata 1981 thinning
  • Problem 4 · per-arm counterfactual head (shares multi-head structure with the CT in Problem 1): TARNet / Dragonnet emit all treatment arms in a single forward pass
  • Problem 5 · Universal Embedding Bus (UEB): modality-generic registry; text via OpenAI-compat today, multi-modal (CLIP / Qwen-VL / SigLIP / I-JEPA / Whisper / CLAP) on v0.5
  • SCM (causal/scm.py · causal/counterfactual.py): Pearl 3-step (abduction → action → prediction), 64 nodes / 117 edges, discourse + cascade mediators (Sunstein 2017 · Bikhchandani 1992)
  • Agent population (data/population.py · data/synthesizers/): IPF / Deming-Stephan 1940 baseline; Bayesian-network / CTGAN / TabDDPM variants on roadmap
  • LightGBM Quantile baseline (world_model/lightgbm_quantile.py): P35/P50/P65 quantile regressors, sub-ms inference, ablation counterpart to CT/NH

🏢 OranAI Enterprise Edition — this OSS release is a reference implementation on synthetic data. The commercial Enterprise Edition extends it with a real-panel KOL/notes index, hosted inference with SLA, on-premise deployment, and vertical-specific calibration (beauty / fashion / 3C / food-and-beverage / luxury / automotive). Contact cto@orannai.com for pilot. See §Enterprise below for the capability matrix and fair-use boundary.


🚀 Quickstart (60 seconds)

# 1. Clone and install
git clone https://github.com/OranAi-Ltd/oransim.git
cd oransim
pip install -e '.[dev]'

# 2. Run backend (mock mode — no API key required)
LLM_MODE=mock python -m uvicorn oransim.api:app --port 8001 &

# 3. Run frontend
python -m http.server 8090 --directory frontend

# 4. Open http://localhost:8090 → click "⚡ 极速" (Turbo) → "🚀 Predict"

Mock mode returns deterministic stubs — good for CI / first look, but every LLM-driven feature (soul personas, group-chat, comment-section discourse, LLM calibration of KPIs) falls back to templates. To unlock the real pipeline, switch to api mode:

LLM_MODE=api \
LLM_API_KEY=sk-xxxxx \
LLM_MODEL=gpt-5.4 \
python -m uvicorn oransim.api:app --port 8001 &

Pick the native request format with LLM_PROVIDER — defaults to openai (also covers DeepSeek / vLLM / any OpenAI-compat gateway):

Per-provider recommended config:

| LLM_PROVIDER | LLM_BASE_URL | LLM_MODEL example | Key env |
|---|---|---|---|
| openai (default) | https://api.openai.com/v1 | gpt-5.4 · gpt-4o-mini | OPENAI_API_KEY or LLM_API_KEY |
| openai (DeepSeek) | https://api.deepseek.com/v1 | deepseek-chat | LLM_API_KEY |
| openai (vLLM local) | http://localhost:8000/v1 | any served model | LLM_API_KEY=local |
| anthropic | https://api.anthropic.com (default) | claude-sonnet-4-6 | ANTHROPIC_API_KEY or LLM_API_KEY |
| gemini | Google default | gemini-2.5-pro · gemini-2.5-flash | GEMINI_API_KEY / GOOGLE_API_KEY / LLM_API_KEY |
| qwen | https://dashscope.aliyuncs.com/api/v1 (default) | qwen-plus · qwen-turbo | DASHSCOPE_API_KEY / QWEN_API_KEY / LLM_API_KEY |

Full reference in .env.example; extended retry / fallback-chain options in docs/en/quickstart.md.

The frontend shows a yellow banner at the top whenever the backend is still in mock (or has no key set) — click ✕ to dismiss for the session.

Running right now · what's real vs aspirational

  • ✅ Working today — full backend (POST /api/predict · /api/adapters · /api/sandbox/*, split across api_routers/ since the api.py 1730-line god-file refactor) · full frontend (hero · 9 tabs · cascade animation · modular js/*.js) · LightGBM quantile baseline pkl shipped · 5 platform adapters (XHS v1 legacy + TikTok agent-level w/ FYP RL + IG / YouTube Shorts / Douyin MVP) · learned amortized abduction (pure-numpy MLP q(U|O)) · multi-LLM providers (OpenAI-compat · Anthropic · Gemini · Qwen).
  • 🟡 Code-complete, weights pending — Causal Transformer world model + Causal Neural Hawkes diffusion — architecture + training loop + inference + thinning sampler all shipped; pretrained weights land with OrancBench v0.5.
  • 📋 Roadmap-only — Twitter / Bilibili / LinkedIn adapters · multi-modal embedders (image/video/audio stubs only today) · Ray cluster · hosted demo.

🎬 See It In Action

Three-panel working UI — left: creative + budget + sliders · center: KPI / Agent pool / AI group-chat tabs (+「更多 ›」("More") dropdown for deep analysis) · right: per-persona LLM reactions.

Three-panel prediction UI

Opinion propagation through an agent-based society — drop in an ad copy, watch color-coded opinion waves (green=click / purple=high intent / red=skip / blue=curious) ripple outward from KOL seeds, cascading to their followers in real time.

Opinion propagation over the agent population

✨ Why Oransim

| Capability | Traditional Analytics | AutoML / Black-Box Predictors | Oransim |
|---|---|---|---|
| Answers "why did the prediction change?" | Partial (rule trace) | ❌ Opaque (SHAP at best) | ✅ Every prediction traces back through the causal graph, per-agent reasoning, and attention paths |
| Answers "what if I'd done X instead?" | ❌ Re-run from scratch | ❌ Model doesn't know | ✅ Native counterfactual heads: ask do(creative=B) in one forward pass |
| Sees individual user reactions | Aggregates only | Aggregates only | ✅ Scalable simulated consumers + 10k LLM personas reading your actual copy |
| Predicts 14-day diffusion + intervention | Linear decay | Generic time-series | ✅ Self-exciting point process that handles "what if we stopped boosting on day 3" |
| Realistic budget curves | ❌ Linear: 2× budget = 2× results | ❌ Same | ✅ Diminishing returns + frequency fatigue (real-world marketing economics) |
| Removes spurious correlations | | | ✅ Representation balancing loss decorrelates learned features from treatment assignment |
| Transfers to a new campaign without retraining | ❌ Redo the analysis | ❌ Per-problem retrain | ✅ In-context amortization: model conditions on your prior campaigns at inference time |
| Multiple platforms | Single platform | Single platform | ✅ 5 adapters shipped (XHS / TikTok / IG / YouTube / Douyin), 2-axis extensible |
| Cost | Per-seat licensing | API tokens per call | ✅ Apache-2.0 · self-hosted · free |
Technical references for each row
  • Why explanation: causal-graph path tracing (64 nodes, 117 edges, cyclic with long-term feedback loops — see §causal graph for why it's not a strict DAG) + per-head attention maps + agent reasoning traces
  • Counterfactual heads: TARNet (Shalit ICML 2017), Dragonnet (Shi NeurIPS 2019); Pearl 3-step abduction → action → prediction
  • LLM personas: top-K salient agents (SOUL_POOL_N) upgraded to LLM-backed personas for qualitative rationalization (commentary-style, click decision stays in the statistical layer — see §Soul Agents for the honest positioning). Park et al. 2023-style LLM-decides variant is on the v0.5+ roadmap
  • 14-day diffusion: Causal Neural Hawkes (Mei & Eisner 2017 + Zuo ICML 2020 + Geng NeurIPS 2022 counterfactual TPP)
  • Budget curves: Hill saturation (Dubé & Manchanda 2005) + frequency fatigue (Naik & Raman 2003)
  • Balancing loss: HSIC (Gretton 2005) or adversarial-IPTW · BCAUSS · CaT (Melnychuk ICML 2022)
  • In-context amortization: CInA (Arik & Pfister NeurIPS 2023)

🏗️ Architecture

Oransim architecture diagram

A typical prediction request flows: Creative + Budget → PlatformAdapter (pulls data via pluggable DataProvider) → World Model (factual + counterfactual predictions) + Agent Layer (POP_SIZE-scalable IPF + LLM personas) → Causal Engine (64-node causal graph + do() counterfactuals) → Diffusion (14-day intervention-aware rollout) → Prediction JSON (14–19 schemas).

What runs where:

| Surface | Default (ships today) | Research-grade (opt-in) |
|---|---|---|
| World model | LightGBM quantile baseline (data/models/world_model_demo.pkl) + hand-coded structural formula | CausalTransformerWorldModel (CaT / TARNet / Dragonnet / CInA): train locally, or swap in via POST /api/v2/world_model/predict?model=causal_transformer |
| Diffusion | Parametric exponential-kernel Hawkes (Hawkes 1971) | CausalNeuralHawkesProcess (Mei & Eisner + Zuo et al. + Geng et al.), same opt-in pattern: POST /api/v2/diffusion/forecast?model=causal_neural_hawkes |
| Agents | StatisticalAgents (vectorised, CPU) | SoulAgentPool LLM personas (enable via use_llm=true on /api/predict) |

Sandbox: the budget-only slider uses a Hill-saturation + frequency-fatigue closed form (mode: "fast_approx" in the response) so the slider is responsive; non-budget edits (creative / alloc / KOL) trigger a real model re-run (mode: "counterfactual" or "full_rerun").

The registry is the extension point. Default /api/predict uses the baseline stack because it's what ships with weights today; /api/v2/* is how you A/B swap in the research stack once you've trained it. Both routes share the same SCM / agent / Hawkes plumbing.

Two-axis extensibility:

  • Platform axis — XHS (legacy, v1 live) + TikTok / Instagram / YouTube Shorts / Douyin (MVP on synthetic); Twitter / Bilibili / LinkedIn on roadmap
  • Data Provider axis — pluggable per platform (Synthetic / CSV / JSON / OpenAPI / your own)

See docs/en/architecture.md for the full design.


🌐 Platform Adapter Matrix

| Platform | Region | Status | Data Provider | World Model | Milestone |
|---|---|---|---|---|---|
| 🔴 XHS / RedNote | Greater China | ✅ v1 | Synthetic / CSV / JSON / OpenAPI | Causal Transformer + LightGBM baseline | |
| ⚫ TikTok | Global | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| 🟣 Instagram Reels | Global | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| 🔴 YouTube Shorts | Global | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| 🔵 Douyin | Greater China | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| ⚪ Twitter / X | Global | 📋 planned | | | v0.5 |
| 📺 Bilibili | Greater China | 📋 planned | | | v1.0 |
| ✒️ LinkedIn | Global | 📋 planned | | | v1.0 |

What "MVP" actually means here: XHS is the canonical v1 adapter with real data-provider paths (CSV / JSON / OpenAPI). TikTok / IG / YouTube Shorts / Douyin ship as config-differentiated wrappers over the same PlatformAdapter interface (each has distinct CPM / CTR / CVR / duration priors — see backend/oransim/platforms/{platform}/adapter.py), all driven by the synthetic LightGBM baseline. They pass shape tests end-to-end but don't yet have platform-specific DataProviders hooked up; that's what "v0.5 (real panels)" means in the milestone column.

Want another platform? Open an Adapter Request — we prioritize based on community demand.


📊 What You Get — 14 to 19 Schemas

A single /api/predict call returns structured outputs across these schemas:

  1. total_kpis — aggregate impressions / clicks / conversions / cost / revenue / CTR / CVR / ROI with P35/P50/P65 bands
  2. per_platform — KPIs broken down per platform adapter
  3. per_kol — KOL-level attribution
  4. diffusion_curve — 14-day daily impression/engagement forecast (Causal Neural Hawkes; parametric Hawkes as baseline)
  5. cate — Conditional Average Treatment Effect across agent demographics
  6. counterfactual — "What if" branching: alternative creative / budget / KOL
  7. soul_feedback — 10 LLM persona reactions in natural language
  8. group_chat — simulated group conversation dynamics (Sunstein 2017 polarization)
  9. discourse — second-wave mediator impact estimation
  10. final_report — LLM-generated executive summary
  11. verdict — top-line recommendation (greenlight / optimize / kill)
  12. kol_optimizer — optimal KOL mix given objective
  13. kol_content_match — creative × KOL compatibility scoring
  14. tag_lift — incremental performance from tag/targeting choices
  15. mediator_impact — path analysis from discourse/group_chat to funnel
  16. brand_memory — longitudinal brand preference updates
  17. sandbox_snapshot — serialized session state for "undo / redo"
  18. audit_trace — explainability — which agents, which paths, which weights
  19. benchmark — performance against OrancBench

See docs/en/schemas/ for JSON schema definitions.


🧠 Under the Hood

Causal Graph — 64 nodes, 117 edges

Hand-designed by domain experts covering the marketing funnel: impression → awareness → consideration → conversion → repeat purchase → brand memory, with mediators for group discourse (Sunstein 2017) and information cascades (Bikhchandani et al. 1992).

The graph includes long-term feedback loops (e.g. repeat_purchase → brand_equity → ecpm_bid → next-cycle impression_dist). This is intentional — it reflects real marketing physics, not a modeling artifact. Strict Pearl-style abduction on cycles is undefined; our do() evaluation uses the cyclic-SCM generalization of Bongers et al. 2021 (Foundations of Structural Causal Models with Cycles and Latent Variables), treating the 25-node feedback SCC as a fixed-point solve rather than a topological forward pass.

The 3-step evaluation in code:

  1. Abduction — at the agent layer, re-use the sampled noise from baseline; at the graph layer, per-node residuals are frozen
  2. Action — apply do() intervention (supported nodes listed in /api/dag's intervenable: true set)
  3. Prediction — topologically sort the acyclic condensation, solve each SCC by numerical iteration (2–3 passes empirically converge on the shipped graph)
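
The three steps above can be illustrated with a toy fixed-point evaluator over a two-node feedback loop. Everything here is invented for the sketch (equation forms, coefficients, and the helper name); it is not the shipped solver, just the shape of the SCC iteration:

```python
def solve_cyclic_scm(equations, exogenous, interventions=None, n_iter=10):
    """Fixed-point evaluation of a cyclic SCM, Bongers-style semantics.

    equations: {node: fn(values_dict) -> float}
    interventions: {node: clamped_value} implements do().
    Illustrative helper only, not the oransim implementation.
    """
    interventions = interventions or {}
    values = dict(exogenous)  # abduction: exogenous noise / residuals are frozen
    for _ in range(n_iter):   # prediction: iterate the SCC to a fixed point
        for node, fn in equations.items():
            values[node] = interventions.get(node, fn(values))  # action: do() clamps
    return values

# Toy feedback SCC: brand_equity <-> repeat_purchase
eqs = {
    "brand_equity":    lambda v: 0.5 * v.get("repeat_purchase", 0.0) + v["u_brand"],
    "repeat_purchase": lambda v: 0.4 * v.get("brand_equity", 0.0) + v["u_repeat"],
}
baseline = solve_cyclic_scm(eqs, {"u_brand": 1.0, "u_repeat": 0.5})
treated  = solve_cyclic_scm(eqs, {"u_brand": 1.0, "u_repeat": 0.5},
                            interventions={"repeat_purchase": 2.0})
```

With these coefficients the loop is a contraction (factor 0.2 per sweep), which is why a couple of passes converge, mirroring the "2–3 passes empirically converge" note above.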

A time-unrolled DAG projection IS available in the OSS release via oransim.causal.scm.dag_dict_unrolled(n_steps=K) — each original node becomes N_t0, N_t1, ..., N_t{K-1}; feedback edges cross time (src_ti → dst_t{i+1}), non-feedback edges replicate within each slice. At n_steps=2 the shipped graph's 64 nodes + 117 edges (cyclic) unroll to 128 nodes + 220 edges (strict DAG, 14 feedback edges detected automatically via DFS back-edge analysis). Downstream modules that need strict acyclicity (CausalDAG-Transformer attention on a true DAG, textbook Pearl three-step abduction) can consume the unrolled view. The cyclic native graph + SCC condensation remains the default because it keeps the node count small and matches the shipped Transformer's 7-token input layout.
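
The unrolling rule is mechanical enough to sketch. The helper below is an illustrative re-implementation of the idea (not the shipped dag_dict_unrolled()): feedback edges cross slices, everything else replicates per slice:

```python
def unroll(nodes, edges, feedback_edges, n_steps=2):
    """Time-unroll a cyclic graph into a strict DAG.

    Feedback edges become cross-slice edges (src_ti -> dst_t{i+1});
    all other edges replicate within each slice. Names are illustrative.
    """
    fb = set(feedback_edges)
    out_nodes = [f"{n}_t{i}" for i in range(n_steps) for n in nodes]
    out_edges = []
    for src, dst in edges:
        if (src, dst) in fb:
            for i in range(n_steps - 1):          # cross-slice feedback edge
                out_edges.append((f"{src}_t{i}", f"{dst}_t{i+1}"))
        else:
            for i in range(n_steps):              # within-slice edge
                out_edges.append((f"{src}_t{i}", f"{dst}_t{i}"))
    return out_nodes, out_edges

# 3-node toy loop: A -> B -> C -> A, with C -> A marked as the feedback edge
nodes, edges = unroll(["A", "B", "C"],
                      [("A", "B"), ("B", "C"), ("C", "A")],
                      feedback_edges=[("C", "A")], n_steps=2)
```

The same arithmetic reproduces the shipped counts: (117 - 14) within-slice edges × 2 slices + 14 feedback edges × 1 crossing = 220 edges, and 64 × 2 = 128 nodes.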

A full equilibrium-solver with fixed-point guarantees for the cyclic native graph is an Enterprise Edition upgrade; the OSS release offers the unrolled-DAG path as the acyclic alternative.

Agent Population — POP_SIZE-scalable IPF-calibrated virtual consumers

Generated via Iterative Proportional Fitting (IPF / Deming-Stephan 1940) against real Chinese demographic distributions (age × gender × region × income × platform). Each agent carries:

  • Demographics + psychographics
  • Platform-specific engagement priors
  • Niche/category affinity vectors
  • Time-of-day activity curves
  • Social graph embeddings
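
For intuition on the calibration step, here is a minimal 2-D IPF in numpy. The shipped synthesizer in data/population.py handles more axes and real marginals; the seed, marginals, and helper below are purely illustrative:

```python
import numpy as np

def ipf_2d(seed, row_marginals, col_marginals, n_iter=100, tol=1e-9):
    """Iterative Proportional Fitting (Deming-Stephan 1940): rescale a seed
    contingency table until its row/column sums match target marginals,
    e.g. gender x region counts. Minimal 2-D sketch."""
    table = seed.astype(float).copy()
    for _ in range(n_iter):
        table *= (row_marginals / table.sum(axis=1))[:, None]   # fit rows
        table *= (col_marginals / table.sum(axis=0))[None, :]   # fit columns
        if np.allclose(table.sum(axis=1), row_marginals, atol=tol):
            break
    return table

seed = np.ones((2, 3))                      # uninformative seed table
rows = np.array([600.0, 400.0])             # e.g. gender counts
cols = np.array([300.0, 500.0, 200.0])      # e.g. region counts
pop = ipf_2d(seed, rows, cols)              # agents per (gender, region) cell
```

With a uniform seed IPF converges to the independent table in one sweep; a non-uniform seed preserves its interaction structure while matching the marginals, which is the point of using it for population synthesis.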
Soul Agents — LLM personas for qualitative feedback

The top-K most salient agents for a scenario are upgraded to LLM-backed personas (SOUL_POOL_N configurable; default 100 for demo, scalable via Ray in the Enterprise Edition). Default model: gpt-5.4. Each persona:

  • Generates a persona card from its demographic vector
  • Evaluates the creative (reaction / emotional response / intent)
  • Optionally participates in simulated group chats (Sunstein 2017 group polarization)
  • Feeds second-wave mediators back into the causal graph

Two modes, explicit trade-off:

  • Template mode (use_llm=False, default) — click decision is a Bernoulli draw against the statistical click_prob (+40% niche-match lift); the persona picks a consistent template reason / comment / feel. Zero LLM cost, deterministic given seed, used for CATE / ROI numerical reproducibility.
  • LLM-decider mode (use_llm=True, Park et al. 2023 Generative Agents style) — a real LLM gets the full persona card + creative + KOL context and returns a structured JSON (will_click, reason, comment, feel, purchase_intent_7d). The LLM's will_click is the agent's decision (not overridden by Bernoulli); the statistical click_prob is available as a prior in the prompt. Response tagged source: "llm". Trade-off: adds non-determinism per persona; for strict reproducibility stay in template mode or pin LLM_TEMPERATURE=0.
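
Template mode's decision rule is simple enough to sketch. The helper below is hypothetical (parameter names and the lift wiring are assumptions); it only illustrates the seeded Bernoulli draw and the +40% niche-match lift described above:

```python
import random

def template_click(click_prob, niche_match, seed):
    """Template-mode decision sketch: seeded Bernoulli draw against the
    statistical click_prob, with a +40% lift when the agent's niche matches
    the creative. Hypothetical helper, not the shipped code path."""
    p = min(1.0, click_prob * (1.4 if niche_match else 1.0))  # niche-match lift
    rng = random.Random(seed)                                 # deterministic per seed
    return rng.random() < p

# Same seed -> same decision, which is what makes CATE / ROI reproducible
a = template_click(0.05, niche_match=True, seed=42)
b = template_click(0.05, niche_match=True, seed=42)
```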

Cost controlled via:

  • In-flight request coalescing (leader/follower dedup pattern)
  • Persona card caching
  • Configurable SOUL_POOL_N
Causal Transformer World Model — primary (research-grade)

A 6-layer × 256-dim causal Transformer that ingests heterogeneous campaign features and predicts three quantile levels (P35/P50/P65) for each funnel KPI. Architecture lifts ideas from the recent causal-Transformer literature:

  • Token-type factorization (CaT, Melnychuk et al. ICML 2022) — inputs split into Covariate (platform, demographic, time), Treatment (creative embedding, budget, KOL), and Outcome (KPIs) tokens with distinct type embeddings
  • DAG-aware attention (CausalDAG-Transformer) — attention mask derived from the 64-node causal graph restricts each token to attend to topological ancestors; per-head learnable gate on the bias. Because the shipped graph is cyclic (see §Causal Graph), ancestry is defined on the graph's SCC condensation: within a feedback SCC all nodes are mutually ancestral, across SCCs the standard DAG ancestor relation applies (Bongers 2021 §3.2). Reference implementation shipped in CausalTransformerWorldModel.set_dag_from_edges() and toggleable via dag_attention_bias=True. The OSS release defaults to the LightGBM baseline path; pretrained CT checkpoints with DAG attention enabled ship with the Enterprise Edition (see §OranAI Enterprise Edition).
  • Per-arm counterfactual heads (TARNet, Shalit et al. ICML 2017 / Dragonnet, Shi et al. NeurIPS 2019) — one quantile head per discrete treatment arm enables predict_factual vs predict_counterfactual(do(T=t')) with a single forward pass
  • Representation balancing (BCAUSS + CaT) — HSIC (Gretton et al. 2005) or adversarial-IPTW loss decorrelates the learned representation from treatment assignment, reducing bias in counterfactual predictions
  • In-context amortization (CInA, Arik & Pfister NeurIPS 2023, optional) — model can condition on a context set of prior campaigns for amortized zero-shot causal inference

Core component: oransim.world_model.CausalTransformerWorldModel. Training loop, counterfactual rollout, and save/load are shipped today; pretrained weights land with OrancBench v0.5.

from oransim.world_model import get_world_model, CausalTransformerWMConfig

wm = get_world_model("causal_transformer", config=CausalTransformerWMConfig(
    dag_attention_bias=True,
    balancing_loss="hsic",
    use_counterfactual_head=True,
))
pred = wm.predict(features)                         # factual
cf = wm.counterfactual(features, arm_idx=2)         # do(T = arm 2)

Requires pip install 'oransim[ml]' (brings in PyTorch). Falls back gracefully to LightGBM if torch is unavailable.
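
For intuition on the balancing loss, here is a minimal biased HSIC estimator (Gretton et al. 2005) in numpy. This is an illustrative sketch, not the training-loop implementation; bandwidth and data are invented:

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """RBF kernel Gram matrix over rows of x."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * x @ x.T
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased HSIC estimator: trace(K H L H) / (n-1)^2.
    Near zero when x and y are independent; as a penalty it pushes the
    learned representation to be independent of treatment assignment."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    k, l = rbf_gram(x, sigma), rbf_gram(y, sigma)
    return np.trace(k @ h @ l @ h) / (n - 1) ** 2

rng = np.random.default_rng(0)
rep = rng.normal(size=(200, 4))                       # stand-in "representation"
t_ind = rng.normal(size=(200, 1))                     # independent "treatment"
t_dep = rep[:, :1] + 0.1 * rng.normal(size=(200, 1))  # confounded "treatment"
```

Minimizing hsic(rep, treatment) alongside the prediction loss is the decoupling described in Problem 1: the representation stops encoding who got which budget.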

Universal Embedding Bus (UEB) — text-only today, multi-modal hooks for v0.5

Every data source (creative copy, KOL bio, user comment, fan-profile tabular record, platform event stream) flows through a shared Embedder ABC that produces a fixed-dim vector. Downstream modules (world_model / agent / causal) never see modality-specific code — the registry is modality-generic.

Shipped today (v0.2):

  • RealTextEmbedder — OpenAI-compatible text-embedding-3-small via the same gateway as soul_llm (one key for everything). Falls back to a deterministic hash embedder if the API is unavailable.
  • TabularEmbedder, CategoricalEmbedder, TimeSeriesEmbedder, GeoEmbedder, EventEmbedder — non-learned baselines.

Stubs for v0.5 (raise NotImplementedError pointing to ROADMAP.md#v05 if called):

  • ImageEmbedderStub — planned backends: CLIP / Qwen-VL / SigLIP / ImageBind
  • VideoEmbedderStub — planned backends: I-JEPA v2 / TimeSformer / VideoMAE v2 / Qwen-VL video
  • AudioEmbedderStub — planned backends: Whisper-v3 encoder / CLAP / AudioMAE

Dropping a real implementation in is a ~50-line Embedder subclass with no downstream changes. See backend/oransim/runtime/embedding_bus.py.
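
The deterministic offline fallback mentioned above can be sketched as a SHA-256 hash embedder. This is a hypothetical stand-in (the real Embedder ABC in embedding_bus.py may differ in interface and dimensionality):

```python
import hashlib
import numpy as np

class HashTextEmbedder:
    """Deterministic SHA-256 text embedder, in the spirit of the offline
    fallback described above. Hypothetical class; interface is assumed."""

    def __init__(self, dim=64):
        self.dim = dim

    def embed(self, text: str) -> np.ndarray:
        # Concatenate counter-salted digests until we have dim bytes,
        # then map each byte to [-1, 1].
        buf, counter = b"", 0
        while len(buf) < self.dim:
            buf += hashlib.sha256(f"{counter}:{text}".encode()).digest()
            counter += 1
        vec = np.frombuffer(buf[: self.dim], dtype=np.uint8).astype(np.float32)
        return vec / 127.5 - 1.0

emb = HashTextEmbedder(dim=64)
v1, v2 = emb.embed("春季新品种草"), emb.embed("春季新品种草")
```

No semantics, but fully reproducible without a key, which is all the offline path needs.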

LightGBM Quantile World Model — fast baseline

Three quantile regressors (P35, P50, P65) per KPI. Sub-millisecond inference, zero GPU requirement. Refs: Ke et al. 2017 (LightGBM), Koenker 2005 (Quantile Regression).

Shipped pkl (data/models/world_model_demo.pkl, feature_version: demo_v2, ~3 MB) consumes 23 features: 7 tabular (platform_id, niche_idx, budget, budget_bucket, kol_tier_idx, kol_fan_count, kol_engagement_rate) + 16 PCA-reduced text-embedding dimensions.

The embedding input is a deterministic caption per scenario ("春季 {niche} 新品种草 · {tier} KOL · {budget_bucket}") passed through RealTextEmbedder — same embedder the rest of the stack uses (UEB, soul-agent persona matching, kol_content_match, search_elasticity). When OPENAI_API_KEY is set, it hits text-embedding-3-small; without a key, it falls back to the deterministic SHA-256 hash embedder so training / inference is still reproducible offline.

PCA components ship inside the pkl and are applied at inference time via POST /api/v2/world_model/predict?model=lightgbm_quantile. R² on the 200 held-out from 2,000 synthetic scenarios: impressions 0.88 · clicks 0.79 · conversions 0.71 · revenue 0.75.

The Causal Transformer path consumes the full-dim creative embedding natively (without PCA) once weights land with OrancBench v0.5; the demo LightGBM pkl is the CPU-only fallback until then.

wm = get_world_model("lightgbm_quantile")
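
Each of the three heads minimizes pinball (quantile) loss at q = 0.35 / 0.50 / 0.65. A minimal numpy version, to show why the P65 head learns to sit above the median; values are invented for illustration:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Pinball loss: q * err if under-predicting, (1 - q) * |err| if over.
    The loss each quantile regressor (P35 / P50 / P65) minimizes."""
    err = y_true - y_pred
    return np.mean(np.maximum(q * err, (q - 1) * err))

y = np.array([100.0, 120.0, 80.0])
# For q = 0.65, under-prediction costs more than over-prediction,
# which pushes the P65 head's predictions above the median.
lo = pinball_loss(y, y - 10, q=0.65)   # predictions 10 below truth
hi = pinball_loss(y, y + 10, q=0.65)   # predictions 10 above truth
```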
Budget Model — Hill saturation + frequency fatigue

Instead of naive linear budget scaling:

$$\mathrm{effective\_impr\_ratio}(x) = \frac{(1+K)\,x}{K + x}$$

Michaelis-Menten / Hill saturation (Dubé & Manchanda 2005), combined with frequency fatigue (Naik & Raman 2003) on CTR/CVR:

$$\mathrm{ctr\_decay}(r) = \max\left(0.5,\; 1.0 - 0.08 \cdot \max(0, \log_2 r)\right)$$

This captures diminishing returns, an optimal budget point, and realistic campaign dynamics.
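
Translated directly into code (K here is an illustrative constant, not the shipped default):

```python
import math

def effective_impr_ratio(x, K=0.6):
    """Hill / Michaelis-Menten saturation: (1+K)x / (K+x).
    x is spend normalized to the reference budget, so x=1 gives ratio 1.0
    and the ratio caps at 1+K as x grows. K=0.6 is illustrative only."""
    return (1 + K) * x / (K + x)

def ctr_decay(r):
    """Frequency-fatigue CTR multiplier from the formula above:
    max(0.5, 1 - 0.08 * max(0, log2 r)), r = exposure count."""
    return max(0.5, 1.0 - 0.08 * max(0.0, math.log2(r)))
```

At x=1 the ratio is exactly 1; doubling to x=2 yields well under 2× impressions, and repeated exposures decay CTR down to the 0.5 floor.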

Causal Neural Hawkes Process — primary diffusion forecaster

Transformer-parameterized neural temporal point process for 14-day cascading engagement forecasting, with first-class support for counterfactual rollouts under do() interventions.

Architectural references:

  • Mei & Eisner (NeurIPS 2017)The Neural Hawkes Process — continuous-time neural intensity function, foundation of the field
  • Zuo et al. (ICML 2020)Transformer Hawkes Process — self-attention encoder replacing the original CT-LSTM; directly the backbone of this implementation
  • Shchur et al. (ICLR 2020)Intensity-Free Learning of TPPs — closed-form inter-event-time head for fast sampling
  • Chen et al. (ICLR 2021)Neural Spatio-Temporal Point Processes — Monte Carlo estimator for the log-likelihood compensator
  • Geng et al. (NeurIPS 2022)Counterfactual Temporal Point Processes — the intervention semantics for marked point processes
  • Noorbakhsh & Rodriguez (2022)Counterfactual Temporal Point Processes — formalizes do() queries on event streams

Explicit treatment/control event typing (organic vs paid_boost) and an intervention-aware intensity decoder enable queries like "what if we had stopped boosting on day 3" via a counterfactual rollout loop.

Core component: oransim.diffusion.CausalNeuralHawkesProcess. Architecture, training loop (NLL with MC compensator), forecast sampler (Ogata thinning), and counterfactual rollout are shipped today; pretrained weights land with OrancBench v0.5.

from oransim.diffusion import get_diffusion_model

nh = get_diffusion_model("causal_neural_hawkes")
seed_events = [(0, "impression"), (12, "like")]  # shared by both calls
factual = nh.forecast(seed_events=seed_events)
cf = nh.counterfactual_forecast(
    seed_events,
    intervention={"mute_at_min": 4320}  # stop boosting 3 days in (3 × 1440 min)
)

Requires pip install 'oransim[ml]'.

Parametric Hawkes — classical baseline

Exponential-kernel multivariate Hawkes process (Hawkes 1971). Closed-form intensity and log-likelihood; Ogata (1981) thinning sampler. Zero-dependency fallback and the baseline against which the Causal Neural Hawkes is evaluated on OrancBench.

ph = get_diffusion_model("parametric_hawkes")
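
For reference, the exponential-kernel intensity and Ogata thinning fit in ~20 lines. This is a univariate sketch with illustrative parameters, not the shipped multivariate implementation:

```python
import math
import random

def intensity(t, events, mu=0.2, alpha=0.5, beta=1.0):
    """Exponential-kernel Hawkes intensity (Hawkes 1971):
    lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)).
    Parameter values are illustrative, not the shipped defaults."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

def ogata_thinning(horizon, mu=0.2, alpha=0.5, beta=1.0, seed=0):
    """Ogata (1981) thinning sampler; stable while alpha < beta.
    Between events the intensity only decays, so lambda just after the last
    accepted event is a valid upper bound for the next candidate draw."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while t < horizon:
        lam_bar = intensity(t, events, mu, alpha, beta) + alpha  # upper bound
        t += rng.expovariate(lam_bar)                            # candidate time
        if t < horizon and rng.random() * lam_bar <= intensity(t, events, mu, alpha, beta):
            events.append(t)                                     # accept
    return events

ev = ogata_thinning(horizon=14.0)  # one 14-day event stream
```

In spirit, a "what if we stopped boosting on day 3" query re-runs this rollout with the paid-boost excitation zeroed after t = 3; the CNHP does the same with abducted noise instead of fresh randomness.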
Sandbox — incremental recomputation for "what if"

Scenario sessions persist state so users can iterate: "change budget from 100k to 150k, how does ROI move?" Incremental recomputation avoids redoing the full agent simulation when only budget changes. The agent pool is cached; counterfactual evaluation uses union-semantics CATE over reached vs. unreached populations.


📈 Benchmarks

Phase 1 benchmarks are based on the shipped synthetic corpus (2,000 scenarios + 100 event streams + 50 OrancBench tasks — reproducible from the files under data/synthetic/ and data/benchmarks/). See data/models/data_card.md for the data-generating process. The R² numbers below were run on 10% held-out of those 2k scenarios; larger-corpus numbers land with OrancBench v0.5.

| Metric | R² (synthetic) | Baseline (linear) | Notes |
|---|---|---|---|
| second_wave_click | 0.30 | 0.18 | PRS quantile median |
| first_wave_conversion | 0.33 | 0.21 | PRS quantile median |
| cascade_lift | 0.39 | 0.25 | Second-wave mediator |
| roi_point_estimate | 0.33 | 0.19 | Single-shot regression |
| retention_7d | 0.29 | 0.17 | Longitudinal |

⚠️ Honest reproducibility framing — this is a closed-loop evaluation: the same synthetic data generator (backend/scripts/gen_synthetic_data.py) produces both training and held-out splits, and we evaluate our own model on our own generative process. This measures "does the model fit our generative assumptions", not external validity. For real marketing-decision accuracy you need either (a) an independent real-panel benchmark (Enterprise Edition uses proprietary real-world data) or (b) a public benchmark with out-of-distribution campaigns — the OrancBench v0.5 plan (see ROADMAP.md) is our attempt at the latter.

See docs/en/benchmarks/ for the full protocol.


🗺️ Roadmap — Highlights

See ROADMAP.md for the full 3-horizon × 8-theme plan. Teasers:

v0.2 (Q3 2026) — shipping pretrained weights

  • 📦 Trained Causal Transformer + Causal Neural Hawkes checkpoints on an expanded synthetic corpus (targeting ~100k scenarios for OrancBench v0.5)
  • TikTok + Douyin adapter MVPs
  • Docker Compose · MkDocs · CI

v0.5 (Q4 2026 – Q1 2027)

  • 🎯 Cross-platform transfer learning — pretrain on XHS, fine-tune on TikTok
  • Multi-LLM-format adapters — native Anthropic Messages, Gemini, Qwen DashScope shipped in v0.2; Bedrock Converse + native streaming roadmap item
  • 🎯 10k soul agents on Ray cluster
  • ✅ Instagram / YouTube Shorts / Douyin adapters MVP

v1.0+ (2027)

  • 🎯 Causal Foundation Model — pretrain on 10M+ campaigns
  • 🎯 Closed-loop AI media buying — real-time optimization with safety constraints
  • 🎯 Differential privacy + Federated learning — for brand-proprietary training
  • 15+ platforms, multi-modal creative understanding, vertical sub-benchmarks

🏢 OranAI Enterprise Edition

Oransim OSS ships on synthetic data for transparency and reproducibility. OranAI Enterprise Edition provides:

  • 📊 Real-world training data — continuously updated 1M+ labeled campaigns across beauty, fashion, 3C, F&B, luxury, auto
  • SLA-backed hosted inference — 99.9% uptime, sub-second response
  • 🎯 Vertical world models — beauty / fashion / electronics / F&B specialized calibration
  • 🤝 White-glove onboarding — custom adapter development, integration support, training
  • 🔒 On-premise deployment — with SOC 2 / ISO 27001 / GDPR compliance path
  • 🎓 Managed model updates — no downtime model refresh as platforms evolve

Contact: cto@orannai.com · Book a demo


🤝 Contributing

We love contributions — platform adapters, world-model improvements, docs, benchmarks, translations, bug fixes.

By contributing, you agree your contribution is licensed under Apache-2.0. No CLA required.


📚 Citation

If you use Oransim in research, please cite:

@software{oransim2026,
  author       = {Yin, Fakong and {Oransim contributors}},
  title        = {Oransim: Causal Digital Twin for Marketing at Scale},
  version      = {0.2.0-alpha},
  date         = {2026-04-18},
  url          = {https://github.com/OranAi-Ltd/oransim},
  organization = {OranAI Ltd.}
}

See CITATION.cff for cffconvert-compatible metadata.


📜 License

Apache License 2.0 — see LICENSE and NOTICE.

Copyright (c) 2026 OranAI Ltd. (橙果视界(深圳)科技有限公司) and Oransim contributors.

Third-party dependencies retain their original licenses. We are not affiliated with Xiaohongshu, ByteDance, Meta, Google, or any other platform mentioned in this repository.


💫 Team

Oransim is built by OranAI Ltd. (橙果视界(深圳)科技有限公司).

Core Maintainers

Open roles — we're hiring researchers (Causal ML, RL, Agent-based Simulation) and engineers (Platform, Infra). Reach out at cto@orannai.com.

Contributors appear on CONTRIBUTORS.md (auto-generated).


⭐ Star History

Star History Chart
Built with ☕ in Shenzhen by OranAI. If Oransim helps your work, please ⭐ star the repo — it powers our open-source commitment.
