Oransim

Causal Digital Twin for Marketing at Scale


🇬🇧 English · 🇨🇳 中文

Reason. Simulate. Intervene.
Predict any marketing decision before you spend a dollar.


Oransim hero · 60-second prediction with counterfactual reasoning over an agent-based society

What it does

Oransim is the open-source reference implementation of the OranAI causal digital-twin stack for social-media ad campaigns — same architecture as the production OranAI Enterprise models, shipped in full so researchers and engineers can read, extend, and pressure-test every layer. Give it a creative, a budget, and a KOL list — in ~60 seconds you get:

  • 📈 Predicted impressions / clicks / conversions / ROI with P35/P50/P65 bands
  • 🔄 Counterfactuals — do(creative=B) / do(budget=x) / do(kol=…) in one forward pass
  • 🗣️ 100k LLM personas reading the actual creative, returning click / skip / comment reactions
  • 📊 14-day Hawkes diffusion curve with mid-campaign intervention rollouts, e.g. do(mute_at_day=3)
  • 🧭 Ranked next actions

v0.2 ships a synthetic demo corpus (2.3 MB — 200 KOLs, 2k scenarios, 100 event streams) and a pretrained LightGBM baseline (R² 0.69–0.89 on synthetic eval). Clone, install, set an LLM API key, run.

Causal Transformer + Causal Neural Hawkes ship architecture + training loop + inference code only (pip install 'oransim[ml]'). Pretrained weights land with OrancBench v0.5 — the current synthetic corpus sits inside the LightGBM baseline's hypothesis class, so CT/NH factual R² wouldn't beat it. The v0.5 causal-native tasks (confounded treatment · CATE heterogeneity · temporal intervention) are where CT/NH structurally win, and that's when weights go out.

Why this isn't one model

Ad prediction looks like regression, but it's actually five distinct hard subproblems stacked on top of each other. Skip any one and the others don't hold.

1. Treatment ≠ observation · historical data is selected, not sampled. High-budget campaigns almost always go to tier-1 KOLs. When you ask do(budget=50k, kol=mid-tier), that combination appears zero times in training. Naive regression credits the ROI to budget when the real driver was KOL quality. You need a loss that decouples the learned representation from treatment assignment — that's why HSIC / adversarial-IPTW / BCAUSS exist, not academic flavor.

2. Budget curves are saturated and fatigue-driven. Doubling spend doesn't double impressions; a user's CTR drops to 40% on their 3rd exposure to the same creative. Linear models extrapolate wrongly to budgets they never saw. Hill saturation is the functional form validated by 30 years of MMM industry practice; without it you're inventing your own curve (Dubé-Manchanda 2005 + Naik-Raman 2003).

3. Diffusion is a self-exciting point process, not an independent time series. A 14-day engagement curve typically shows a second burst from reposts. RNN/Transformer can fit observed curves but can't answer "what if we stopped boosting on day 3" — that needs intervention rollout inside the temporal model. Hawkes intensity is the only family with native do()-over-time support (Mei-Eisner 2017 + Zuo 2020 + Geng 2022).

4. MMM answers totals; decisions need "A vs B". Robyn / Meta LightweightMMM give you the total revenue curve. But the actual marketing question is usually "should I swap KOL A for B" — that's a per-arm counterfactual, not total attribution. You need a multi-head structure that emits all treatment-arm outcomes in a single forward pass (TARNet / Dragonnet are built for exactly this).

5. Creatives are multi-modal; the rest of the stack shouldn't care. Short videos have frames + BGM + subtitles + KOL faces; product pages have images + 3D models. If the embedder layer doesn't project every modality into the same vector space, the budget curve / Hawkes / SCM each need to refit for each new modality. UEB's job is to make downstream code modality-blind — you add a new input, zero downstream changes (text shipped; CLIP / SigLIP / I-JEPA / Whisper on v0.5).

Get all 5 right and you have Oransim. Each layer isn't there to sound good — each one is the only answer to one specific question the others can't handle.

Architecture + research lineage per layer (click to expand)
  • Problem 1 · Causal Transformer World Model (6-layer · code in backend/oransim/world_model/transformer.py)
    • balancing loss: HSIC (Gretton 2005) · adversarial-IPTW · BCAUSS · CaT (Melnychuk ICML 2022)
    • per-arm counterfactual heads: TARNet (Shalit ICML 2017) · Dragonnet (Shi NeurIPS 2019)
    • in-context amortization: CInA (Arik & Pfister NeurIPS 2023)
  • Problem 2 · budget curves (world_model/budget.py): Hill saturation (Dubé & Manchanda 2005) + frequency fatigue (Naik & Raman 2003)
  • Problem 3 · Causal Neural Hawkes Process (CNHP · code in backend/oransim/diffusion/neural_hawkes.py)
    • continuous-time neural intensity: Neural Hawkes Process (Mei & Eisner NeurIPS 2017)
    • Transformer encoder: Transformer Hawkes (Zuo ICML 2020)
    • counterfactual rollout: counterfactual TPP (Geng NeurIPS 2022)
    • sampling + training: Intensity-free (Shchur ICLR 2020) · MC compensator (Chen ICLR 2021) · Ogata 1981 thinning
  • Problem 4 · per-arm counterfactual head (shares multi-head structure with the CT in Problem 1): TARNet / Dragonnet emit all treatment arms in a single forward pass
  • Problem 5 · Universal Embedding Bus (UEB): modality-generic registry; text via OpenAI-compat today, multi-modal (CLIP / Qwen-VL / SigLIP / I-JEPA / Whisper / CLAP) on v0.5
  • SCM (causal/scm.py · causal/counterfactual.py): Pearl 3-step (abduction → action → prediction), 64 nodes / 117 edges, discourse + cascade mediators (Sunstein 2017 · Bikhchandani 1992)
  • Agent population (data/population.py · data/synthesizers/): IPF / Deming-Stephan 1940 baseline; Bayesian-network / CTGAN / TabDDPM variants on roadmap
  • LightGBM Quantile baseline (world_model/lightgbm_quantile.py): P35/P50/P65 quantile regressors, sub-ms inference, ablation counterpart to CT/NH

🏢 OranAI Enterprise Edition — this OSS release is a reference implementation on synthetic data. The commercial Enterprise Edition extends it with a real-panel KOL/notes index, hosted inference with SLA, on-premise deployment, and vertical-specific calibration (beauty / fashion / 3C / food-and-beverage / luxury / automotive). Contact cto@orannai.com for pilot. See §Enterprise below for the capability matrix and fair-use boundary.


🚀 Quickstart (60 seconds)

# 1. Clone and install
git clone https://github.com/OranAi-Ltd/oransim.git
cd oransim
pip install -e '.[dev]'

# 2. Run backend (mock mode — no API key required)
LLM_MODE=mock python -m uvicorn oransim.api:app --port 8001 &

# 3. Run frontend
python -m http.server 8090 --directory frontend

# 4. Open http://localhost:8090 → click "⚡ 极速" (Turbo) → "🚀 Predict"

Mock mode returns deterministic stubs — good for CI / first look, but every LLM-driven feature (soul personas, group-chat, comment-section discourse, LLM calibration of KPIs) falls back to templates. To unlock the real pipeline, switch to api mode:

LLM_MODE=api \
LLM_API_KEY=sk-xxxxx \
LLM_MODEL=gpt-5.4 \
python -m uvicorn oransim.api:app --port 8001 &

Pick the native request format with LLM_PROVIDER — defaults to openai (also covers DeepSeek / vLLM / any OpenAI-compat gateway):

Per-provider recommended config:

| LLM_PROVIDER | LLM_BASE_URL | LLM_MODEL example | Key env |
|---|---|---|---|
| openai (default) | https://api.openai.com/v1 | gpt-5.4 · gpt-4o-mini | OPENAI_API_KEY or LLM_API_KEY |
| openai (DeepSeek) | https://api.deepseek.com/v1 | deepseek-chat | LLM_API_KEY |
| openai (vLLM local) | http://localhost:8000/v1 | any served model | LLM_API_KEY=local |
| anthropic | https://api.anthropic.com (default) | claude-sonnet-4-6 | ANTHROPIC_API_KEY or LLM_API_KEY |
| gemini | Google default | gemini-2.5-pro · gemini-2.5-flash | GEMINI_API_KEY / GOOGLE_API_KEY / LLM_API_KEY |
| qwen | https://dashscope.aliyuncs.com/api/v1 (default) | qwen-plus · qwen-turbo | DASHSCOPE_API_KEY / QWEN_API_KEY / LLM_API_KEY |

Full reference in .env.example; extended retry / fallback-chain options in docs/en/quickstart.md.

The frontend shows a yellow banner at the top whenever the backend is still in mock (or has no key set) — click ✕ to dismiss for the session.

Running right now · what's real vs aspirational

  • ✅ Working today — full backend (POST /api/predict · /api/adapters · /api/sandbox/*, split across api_routers/ since the api.py 1730-line god-file refactor) · full frontend (hero · 9 tabs · cascade animation · modular js/*.js) · LightGBM quantile baseline pkl shipped · 5 platform adapters (XHS v1 legacy + TikTok agent-level w/ FYP RL + IG / YouTube Shorts / Douyin MVP) · learned amortized abduction (pure-numpy MLP q(U|O)) · multi-LLM providers (OpenAI-compat · Anthropic · Gemini · Qwen).
  • 🟡 Code-complete, weights pending — Causal Transformer world model + Causal Neural Hawkes diffusion — architecture + training loop + inference + thinning sampler all shipped; pretrained weights land with OrancBench v0.5.
  • 📋 Roadmap-only — Twitter / Bilibili / LinkedIn adapters · multi-modal embedders (image/video/audio stubs only today) · Ray cluster · hosted demo.

🎬 See It In Action

Three-panel working UI — left: creative + budget + sliders · center: KPI / Agent pool / AI group-chat tabs (+「更多 ›」("More") dropdown for deep analysis) · right: per-persona LLM reactions.

Three-panel prediction UI

Opinion propagation through an agent-based society — drop in an ad copy, watch color-coded opinion waves (green=click / purple=high intent / red=skip / blue=curious) ripple outward from KOL seeds, cascading to their followers in real time.

Opinion propagation over the agent population

✨ Why Oransim

| Capability | Traditional Analytics | AutoML / Black-Box Predictors | Oransim |
|---|---|---|---|
| Answers "why did the prediction change?" | Partial (rule trace) | ❌ Opaque (SHAP at best) | ✅ Every prediction traces back through the causal graph, per-agent reasoning, and attention paths |
| Answers "what if I'd done X instead?" | ❌ Re-run from scratch | ❌ Model doesn't know | ✅ Native counterfactual heads: ask do(creative=B) in one forward pass |
| Sees individual user reactions | Aggregates only | Aggregates only | ✅ Scalable simulated consumers + 10k LLM personas reading your actual copy |
| Predicts 14-day diffusion + intervention | Linear decay | Generic time-series | ✅ Self-exciting point process that handles "what if we stopped boosting on day 3" |
| Realistic budget curves | ❌ Linear: 2× budget = 2× results | ❌ Same | ✅ Diminishing returns + frequency fatigue (real-world marketing economics) |
| Removes spurious correlations | | | ✅ Representation balancing loss decorrelates learned features from treatment assignment |
| Transfers to a new campaign without retraining | ❌ Redo the analysis | ❌ Per-problem retrain | ✅ In-context amortization: model conditions on your prior campaigns at inference time |
| Multiple platforms | Single platform | Single platform | ✅ 5 adapters shipped (XHS / TikTok / IG / YouTube / Douyin), 2-axis extensible |
| Cost | Per-seat licensing | API tokens per call | ✅ Apache-2.0 · self-hosted · free |
Technical references for each row
  • Why explanation: causal-graph path tracing (64 nodes, 117 edges, cyclic with long-term feedback loops — see §causal graph for why it's not a strict DAG) + per-head attention maps + agent reasoning traces
  • Counterfactual heads: TARNet (Shalit ICML 2017), Dragonnet (Shi NeurIPS 2019); Pearl 3-step abduction → action → prediction
  • LLM personas: top-K salient agents (SOUL_POOL_N) upgraded to LLM-backed personas for qualitative rationalization (commentary-style, click decision stays in the statistical layer — see §Soul Agents for the honest positioning). Park et al. 2023-style LLM-decides variant is on the v0.5+ roadmap
  • 14-day diffusion: Causal Neural Hawkes (Mei & Eisner 2017 + Zuo ICML 2020 + Geng NeurIPS 2022 counterfactual TPP)
  • Budget curves: Hill saturation (Dubé & Manchanda 2005) + frequency fatigue (Naik & Raman 2003)
  • Balancing loss: HSIC (Gretton 2005) or adversarial-IPTW · BCAUSS · CaT (Melnychuk ICML 2022)
  • In-context amortization: CInA (Arik & Pfister NeurIPS 2023)

🏗️ Architecture

Oransim architecture diagram

A typical prediction request flows: Creative + Budget → PlatformAdapter (pulls data via pluggable DataProvider) → World Model (factual + counterfactual predictions) + Agent Layer (POP_SIZE-scalable IPF + LLM personas) → Causal Engine (64-node causal graph + do() counterfactuals) → Diffusion (14-day intervention-aware rollout) → Prediction JSON (14–19 schemas).

What runs where:

| Surface | Default (ships today) | Research-grade (opt-in) |
|---|---|---|
| World model | LightGBM quantile baseline (data/models/world_model_demo.pkl) + hand-coded structural formula | CausalTransformerWorldModel (CaT / TARNet / Dragonnet / CInA): train locally, or swap in via POST /api/v2/world_model/predict?model=causal_transformer |
| Diffusion | Parametric exponential-kernel Hawkes (Hawkes 1971) | CausalNeuralHawkesProcess (Mei & Eisner + Zuo et al. + Geng et al.), same opt-in pattern: POST /api/v2/diffusion/forecast?model=causal_neural_hawkes |
| Agents | StatisticalAgents (vectorised, CPU) | SoulAgentPool LLM personas (enable via use_llm=true on /api/predict) |

Sandbox: the budget-only slider uses a Hill-saturation + frequency-fatigue closed form (mode: "fast_approx" in the response) so the slider is responsive; non-budget edits (creative / alloc / KOL) trigger a real model re-run (mode: "counterfactual" or "full_rerun").

The registry is the extension point. Default /api/predict uses the baseline stack because it's what ships with weights today; /api/v2/* is how you A/B swap in the research stack once you've trained it. Both routes share the same SCM / agent / Hawkes plumbing.

Two-axis extensibility:

  • Platform axis — XHS (legacy, v1 live) + TikTok / Instagram / YouTube Shorts / Douyin (MVP on synthetic); Twitter / Bilibili / LinkedIn on roadmap
  • Data Provider axis — pluggable per platform (Synthetic / CSV / JSON / OpenAPI / your own)

See docs/en/architecture.md for the full design.


🌐 Platform Adapter Matrix

| Platform | Region | Status | Data Provider | World Model | Milestone |
|---|---|---|---|---|---|
| 🔴 XHS / RedNote | Greater China | ✅ v1 | Synthetic / CSV / JSON / OpenAPI | Causal Transformer + LightGBM baseline | |
| ⚫ TikTok | Global | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| 🟣 Instagram Reels | Global | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| 🔴 YouTube Shorts | Global | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| 🔵 Douyin | Greater China | 🟢 MVP | Synthetic | LightGBM baseline | v0.5 (real panels) |
| ⚪ Twitter / X | Global | 📋 planned | | | v0.5 |
| 📺 Bilibili | Greater China | 📋 planned | | | v1.0 |
| ✒️ LinkedIn | Global | 📋 planned | | | v1.0 |

What "MVP" actually means here: XHS is the canonical v1 adapter with real data-provider paths (CSV / JSON / OpenAPI). TikTok / IG / YouTube Shorts / Douyin ship as config-differentiated wrappers over the same PlatformAdapter interface (each has distinct CPM / CTR / CVR / duration priors — see backend/oransim/platforms/{platform}/adapter.py), all driven by the synthetic LightGBM baseline. They pass shape tests end-to-end but don't yet have platform-specific DataProviders hooked up; that's what "v0.5 (real panels)" means in the milestone column.

Want another platform? Open an Adapter Request — we prioritize based on community demand.


📊 What You Get — 14 to 19 Schemas

A single /api/predict call returns structured outputs across these schemas:

  1. total_kpis — aggregate impressions / clicks / conversions / cost / revenue / CTR / CVR / ROI with P35/P50/P65 bands
  2. per_platform — KPIs broken down per platform adapter
  3. per_kol — KOL-level attribution
  4. diffusion_curve — 14-day daily impression/engagement forecast (Causal Neural Hawkes; parametric Hawkes as baseline)
  5. cate — Conditional Average Treatment Effect across agent demographics
  6. counterfactual — "What if" branching: alternative creative / budget / KOL
  7. soul_feedback — 10 LLM persona reactions in natural language
  8. group_chat — simulated group conversation dynamics (Sunstein 2017 polarization)
  9. discourse — second-wave mediator impact estimation
  10. final_report — LLM-generated executive summary
  11. verdict — top-line recommendation (greenlight / optimize / kill)
  12. kol_optimizer — optimal KOL mix given objective
  13. kol_content_match — creative × KOL compatibility scoring
  14. tag_lift — incremental performance from tag/targeting choices
  15. mediator_impact — path analysis from discourse/group_chat to funnel
  16. brand_memory — longitudinal brand preference updates
  17. sandbox_snapshot — serialized session state for "undo / redo"
  18. audit_trace — explainability — which agents, which paths, which weights
  19. benchmark — performance against OrancBench

See docs/en/schemas/ for JSON schema definitions.


🧠 Under the Hood

Causal Graph — 64 nodes, 117 edges

Hand-designed by domain experts covering the marketing funnel: impression → awareness → consideration → conversion → repeat purchase → brand memory, with mediators for group discourse (Sunstein 2017) and information cascades (Bikhchandani et al. 1992).

The graph includes long-term feedback loops (e.g. repeat_purchase → brand_equity → ecpm_bid → next-cycle impression_dist). This is intentional — it reflects real marketing physics, not a modeling artifact. Strict Pearl-style abduction on cycles is undefined; our do() evaluation uses the cyclic-SCM generalization of Bongers et al. 2021 (Foundations of Structural Causal Models with Cycles and Latent Variables), treating the 25-node feedback SCC as a fixed-point solve rather than a topological forward pass.

The 3-step evaluation in code:

  1. Abduction — at the agent layer, re-use the sampled noise from baseline; at the graph layer, per-node residuals are frozen
  2. Action — apply do() intervention (supported nodes listed in /api/dag's intervenable: true set)
  3. Prediction — topologically sort the acyclic condensation, solve each SCC by numerical iteration (2–3 passes empirically converge on the shipped graph)
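
The three steps above can be illustrated with a toy fixed-point evaluator over a two-node feedback loop. Everything here is invented for the sketch (equation forms, coefficients, and the helper name); it is not the shipped solver, just the shape of the SCC iteration:

```python
def solve_cyclic_scm(equations, exogenous, interventions=None, n_iter=10):
    """Fixed-point evaluation of a cyclic SCM, Bongers-style semantics.

    equations: {node: fn(values_dict) -> float}
    interventions: {node: clamped_value} implements do().
    Illustrative helper only, not the oransim implementation.
    """
    interventions = interventions or {}
    values = dict(exogenous)  # abduction: exogenous noise / residuals are frozen
    for _ in range(n_iter):   # prediction: iterate the SCC to a fixed point
        for node, fn in equations.items():
            values[node] = interventions.get(node, fn(values))  # action: do() clamps
    return values

# Toy feedback SCC: brand_equity <-> repeat_purchase
eqs = {
    "brand_equity":    lambda v: 0.5 * v.get("repeat_purchase", 0.0) + v["u_brand"],
    "repeat_purchase": lambda v: 0.4 * v.get("brand_equity", 0.0) + v["u_repeat"],
}
baseline = solve_cyclic_scm(eqs, {"u_brand": 1.0, "u_repeat": 0.5})
treated  = solve_cyclic_scm(eqs, {"u_brand": 1.0, "u_repeat": 0.5},
                            interventions={"repeat_purchase": 2.0})
```

With these coefficients the loop is a contraction (factor 0.2 per sweep), which is why a couple of passes converge, mirroring the "2–3 passes empirically converge" note above.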

A time-unrolled DAG projection IS available in the OSS release via oransim.causal.scm.dag_dict_unrolled(n_steps=K) — each original node becomes N_t0, N_t1, ..., N_t{K-1}; feedback edges cross time (src_ti → dst_t{i+1}), non-feedback edges replicate within each slice. At n_steps=2 the shipped graph's 64 nodes + 117 edges (cyclic) unroll to 128 nodes + 220 edges (strict DAG, 14 feedback edges detected automatically via DFS back-edge analysis). Downstream modules that need strict acyclicity (CausalDAG-Transformer attention on a true DAG, textbook Pearl three-step abduction) can consume the unrolled view. The cyclic native graph + SCC condensation remains the default because it keeps the node count small and matches the shipped Transformer's 7-token input layout.
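
The unrolling rule is mechanical enough to sketch. The helper below is an illustrative re-implementation of the idea (not the shipped dag_dict_unrolled()): feedback edges cross slices, everything else replicates per slice:

```python
def unroll(nodes, edges, feedback_edges, n_steps=2):
    """Time-unroll a cyclic graph into a strict DAG.

    Feedback edges become cross-slice edges (src_ti -> dst_t{i+1});
    all other edges replicate within each slice. Names are illustrative.
    """
    fb = set(feedback_edges)
    out_nodes = [f"{n}_t{i}" for i in range(n_steps) for n in nodes]
    out_edges = []
    for src, dst in edges:
        if (src, dst) in fb:
            for i in range(n_steps - 1):          # cross-slice feedback edge
                out_edges.append((f"{src}_t{i}", f"{dst}_t{i+1}"))
        else:
            for i in range(n_steps):              # within-slice edge
                out_edges.append((f"{src}_t{i}", f"{dst}_t{i}"))
    return out_nodes, out_edges

# 3-node toy loop: A -> B -> C -> A, with C -> A marked as the feedback edge
nodes, edges = unroll(["A", "B", "C"],
                      [("A", "B"), ("B", "C"), ("C", "A")],
                      feedback_edges=[("C", "A")], n_steps=2)
```

The same arithmetic reproduces the shipped counts: (117 - 14) within-slice edges × 2 slices + 14 feedback edges × 1 crossing = 220 edges, and 64 × 2 = 128 nodes.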

A full equilibrium-solver with fixed-point guarantees for the cyclic native graph is an Enterprise Edition upgrade; the OSS release offers the unrolled-DAG path as the acyclic alternative.

Agent Population — POP_SIZE-scalable IPF-calibrated virtual consumers

Generated via Iterative Proportional Fitting (IPF / Deming-Stephan 1940) against real Chinese demographic distributions (age × gender × region × income × platform). Each agent carries:

  • Demographics + psychographics
  • Platform-specific engagement priors
  • Niche/category affinity vectors
  • Time-of-day activity curves
  • Social graph embeddings
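
For intuition on the calibration step, here is a minimal 2-D IPF in numpy. The shipped synthesizer in data/population.py handles more axes and real marginals; the seed, marginals, and helper below are purely illustrative:

```python
import numpy as np

def ipf_2d(seed, row_marginals, col_marginals, n_iter=100, tol=1e-9):
    """Iterative Proportional Fitting (Deming-Stephan 1940): rescale a seed
    contingency table until its row/column sums match target marginals,
    e.g. gender x region counts. Minimal 2-D sketch."""
    table = seed.astype(float).copy()
    for _ in range(n_iter):
        table *= (row_marginals / table.sum(axis=1))[:, None]   # fit rows
        table *= (col_marginals / table.sum(axis=0))[None, :]   # fit columns
        if np.allclose(table.sum(axis=1), row_marginals, atol=tol):
            break
    return table

seed = np.ones((2, 3))                      # uninformative seed table
rows = np.array([600.0, 400.0])             # e.g. gender counts
cols = np.array([300.0, 500.0, 200.0])      # e.g. region counts
pop = ipf_2d(seed, rows, cols)              # agents per (gender, region) cell
```

With a uniform seed IPF converges to the independent table in one sweep; a non-uniform seed preserves its interaction structure while matching the marginals, which is the point of using it for population synthesis.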
Soul Agents — LLM personas for qualitative feedback

The top-K most salient agents for a scenario are upgraded to LLM-backed personas (SOUL_POOL_N configurable; default 100 for demo, scalable via Ray in the Enterprise Edition). Default model: gpt-5.4. Each persona:

  • Generates a persona card from its demographic vector
  • Evaluates the creative (reaction / emotional response / intent)
  • Optionally participates in simulated group chats (Sunstein 2017 group polarization)
  • Feeds second-wave mediators back into the causal graph

Two modes, explicit trade-off:

  • Template mode (use_llm=False, default) — click decision is a Bernoulli draw against the statistical click_prob (+40% niche-match lift); the persona picks a consistent template reason / comment / feel. Zero LLM cost, deterministic given seed, used for CATE / ROI numerical reproducibility.
  • LLM-decider mode (use_llm=True, Park et al. 2023 Generative Agents style) — a real LLM gets the full persona card + creative + KOL context and returns a structured JSON (will_click, reason, comment, feel, purchase_intent_7d). The LLM's will_click is the agent's decision (not overridden by Bernoulli); the statistical click_prob is available as a prior in the prompt. Response tagged source: "llm". Trade-off: adds non-determinism per persona; for strict reproducibility stay in template mode or pin LLM_TEMPERATURE=0.
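
Template mode's decision rule is simple enough to sketch. The helper below is hypothetical (parameter names and the lift wiring are assumptions); it only illustrates the seeded Bernoulli draw and the +40% niche-match lift described above:

```python
import random

def template_click(click_prob, niche_match, seed):
    """Template-mode decision sketch: seeded Bernoulli draw against the
    statistical click_prob, with a +40% lift when the agent's niche matches
    the creative. Hypothetical helper, not the shipped code path."""
    p = min(1.0, click_prob * (1.4 if niche_match else 1.0))  # niche-match lift
    rng = random.Random(seed)                                 # deterministic per seed
    return rng.random() < p

# Same seed -> same decision, which is what makes CATE / ROI reproducible
a = template_click(0.05, niche_match=True, seed=42)
b = template_click(0.05, niche_match=True, seed=42)
```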

Cost controlled via:

  • In-flight request coalescing (leader/follower dedup pattern)
  • Persona card caching
  • Configurable SOUL_POOL_N
Causal Transformer World Model — primary (research-grade)

A 6-layer × 256-dim causal Transformer that ingests heterogeneous campaign features and predicts three quantile levels (P35/P50/P65) for each funnel KPI. Architecture lifts ideas from the recent causal-Transformer literature:

  • Token-type factorization (CaT, Melnychuk et al. ICML 2022) — inputs split into Covariate (platform, demographic, time), Treatment (creative embedding, budget, KOL), and Outcome (KPIs) tokens with distinct type embeddings
  • DAG-aware attention (CausalDAG-Transformer) — attention mask derived from the 64-node causal graph restricts each token to attend to topological ancestors; per-head learnable gate on the bias. Because the shipped graph is cyclic (see §Causal Graph), ancestry is defined on the graph's SCC condensation: within a feedback SCC all nodes are mutually ancestral, across SCCs the standard DAG ancestor relation applies (Bongers 2021 §3.2). Reference implementation shipped in CausalTransformerWorldModel.set_dag_from_edges() and toggleable via dag_attention_bias=True. The OSS release defaults to the LightGBM baseline path; pretrained CT checkpoints with DAG attention enabled ship with the Enterprise Edition (see §OranAI Enterprise Edition).
  • Per-arm counterfactual heads (TARNet, Shalit et al. ICML 2017 / Dragonnet, Shi et al. NeurIPS 2019) — one quantile head per discrete treatment arm enables predict_factual vs predict_counterfactual(do(T=t')) with a single forward pass
  • Representation balancing (BCAUSS + CaT) — HSIC (Gretton et al. 2005) or adversarial-IPTW loss decorrelates the learned representation from treatment assignment, reducing bias in counterfactual predictions
  • In-context amortization (CInA, Arik & Pfister NeurIPS 2023, optional) — model can condition on a context set of prior campaigns for amortized zero-shot causal inference

Core component: oransim.world_model.CausalTransformerWorldModel. Training loop, counterfactual rollout, and save/load are shipped today; pretrained weights land with OrancBench v0.5.

from oransim.world_model import get_world_model, CausalTransformerWMConfig

wm = get_world_model("causal_transformer", config=CausalTransformerWMConfig(
    dag_attention_bias=True,
    balancing_loss="hsic",
    use_counterfactual_head=True,
))
pred = wm.predict(features)                         # factual
cf = wm.counterfactual(features, arm_idx=2)         # do(T = arm 2)

Requires pip install 'oransim[ml]' (brings in PyTorch). Falls back gracefully to LightGBM if torch is unavailable.
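
For intuition on the balancing loss, here is a minimal biased HSIC estimator (Gretton et al. 2005) in numpy. This is an illustrative sketch, not the training-loop implementation; bandwidth and data are invented:

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """RBF kernel Gram matrix over rows of x."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * x @ x.T
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased HSIC estimator: trace(K H L H) / (n-1)^2.
    Near zero when x and y are independent; as a penalty it pushes the
    learned representation to be independent of treatment assignment."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    k, l = rbf_gram(x, sigma), rbf_gram(y, sigma)
    return np.trace(k @ h @ l @ h) / (n - 1) ** 2

rng = np.random.default_rng(0)
rep = rng.normal(size=(200, 4))                       # stand-in "representation"
t_ind = rng.normal(size=(200, 1))                     # independent "treatment"
t_dep = rep[:, :1] + 0.1 * rng.normal(size=(200, 1))  # confounded "treatment"
```

Minimizing hsic(rep, treatment) alongside the prediction loss is the decoupling described in Problem 1: the representation stops encoding who got which budget.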

Universal Embedding Bus (UEB) — text-only today, multi-modal hooks for v0.5

Every data source (creative copy, KOL bio, user comment, fan-profile tabular record, platform event stream) flows through a shared Embedder ABC that produces a fixed-dim vector. Downstream modules (world_model / agent / causal) never see modality-specific code — the registry is modality-generic.

Shipped today (v0.2):

  • RealTextEmbedder — OpenAI-compatible text-embedding-3-small via the same gateway as soul_llm (one key for everything). Falls back to a deterministic hash embedder if the API is unavailable.
  • TabularEmbedder, CategoricalEmbedder, TimeSeriesEmbedder, GeoEmbedder, EventEmbedder — non-learned baselines.

Stubs for v0.5 (raise NotImplementedError pointing to ROADMAP.md#v05 if called):

  • ImageEmbedderStub — planned backends: CLIP / Qwen-VL / SigLIP / ImageBind
  • VideoEmbedderStub — planned backends: I-JEPA v2 / TimeSformer / VideoMAE v2 / Qwen-VL video
  • AudioEmbedderStub — planned backends: Whisper-v3 encoder / CLAP / AudioMAE

Dropping a real implementation in is a ~50-line Embedder subclass with no downstream changes. See backend/oransim/runtime/embedding_bus.py.
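
The deterministic offline fallback mentioned above can be sketched as a SHA-256 hash embedder. This is a hypothetical stand-in (the real Embedder ABC in embedding_bus.py may differ in interface and dimensionality):

```python
import hashlib
import numpy as np

class HashTextEmbedder:
    """Deterministic SHA-256 text embedder, in the spirit of the offline
    fallback described above. Hypothetical class; interface is assumed."""

    def __init__(self, dim=64):
        self.dim = dim

    def embed(self, text: str) -> np.ndarray:
        # Concatenate counter-salted digests until we have dim bytes,
        # then map each byte to [-1, 1].
        buf, counter = b"", 0
        while len(buf) < self.dim:
            buf += hashlib.sha256(f"{counter}:{text}".encode()).digest()
            counter += 1
        vec = np.frombuffer(buf[: self.dim], dtype=np.uint8).astype(np.float32)
        return vec / 127.5 - 1.0

emb = HashTextEmbedder(dim=64)
v1, v2 = emb.embed("春季新品种草"), emb.embed("春季新品种草")
```

No semantics, but fully reproducible without a key, which is all the offline path needs.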

LightGBM Quantile World Model — fast baseline

Three quantile regressors (P35, P50, P65) per KPI. Sub-millisecond inference, zero GPU requirement. Refs: Ke et al. 2017 (LightGBM), Koenker 2005 (Quantile Regression).

Shipped pkl (data/models/world_model_demo.pkl, feature_version: demo_v2, ~3 MB) consumes 23 features: 7 tabular (platform_id, niche_idx, budget, budget_bucket, kol_tier_idx, kol_fan_count, kol_engagement_rate) + 16 PCA-reduced text-embedding dimensions.

The embedding input is a deterministic caption per scenario ("春季 {niche} 新品种草 · {tier} KOL · {budget_bucket}") passed through RealTextEmbedder — same embedder the rest of the stack uses (UEB, soul-agent persona matching, kol_content_match, search_elasticity). When OPENAI_API_KEY is set, it hits text-embedding-3-small; without a key, it falls back to the deterministic SHA-256 hash embedder so training / inference is still reproducible offline.

PCA components ship inside the pkl and are applied at inference time via POST /api/v2/world_model/predict?model=lightgbm_quantile. R² on the 200 held-out from 2,000 synthetic scenarios: impressions 0.88 · clicks 0.79 · conversions 0.71 · revenue 0.75.

The Causal Transformer path consumes the full-dim creative embedding natively (without PCA) once weights land with OrancBench v0.5; the demo LightGBM pkl is the CPU-only fallback until then.

wm = get_world_model("lightgbm_quantile")
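
Each of the three heads minimizes pinball (quantile) loss at q = 0.35 / 0.50 / 0.65. A minimal numpy version, to show why the P65 head learns to sit above the median; values are invented for illustration:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Pinball loss: q * err if under-predicting, (1 - q) * |err| if over.
    The loss each quantile regressor (P35 / P50 / P65) minimizes."""
    err = y_true - y_pred
    return np.mean(np.maximum(q * err, (q - 1) * err))

y = np.array([100.0, 120.0, 80.0])
# For q = 0.65, under-prediction costs more than over-prediction,
# which pushes the P65 head's predictions above the median.
lo = pinball_loss(y, y - 10, q=0.65)   # predictions 10 below truth
hi = pinball_loss(y, y + 10, q=0.65)   # predictions 10 above truth
```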
Budget Model — Hill saturation + frequency fatigue

Instead of naive linear budget scaling:

$$\mathrm{effective\_impr\_ratio}(x) = \frac{(1+K)\,x}{K + x}$$

Michaelis-Menten / Hill saturation (Dubé & Manchanda 2005), combined with frequency fatigue (Naik & Raman 2003) on CTR/CVR:

$$\mathrm{ctr\_decay}(r) = \max\left(0.5,\; 1.0 - 0.08 \cdot \max(0, \log_2 r)\right)$$

This captures diminishing returns, an optimal budget point, and realistic campaign dynamics.
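
Translated directly into code (K here is an illustrative constant, not the shipped default):

```python
import math

def effective_impr_ratio(x, K=0.6):
    """Hill / Michaelis-Menten saturation: (1+K)x / (K+x).
    x is spend normalized to the reference budget, so x=1 gives ratio 1.0
    and the ratio caps at 1+K as x grows. K=0.6 is illustrative only."""
    return (1 + K) * x / (K + x)

def ctr_decay(r):
    """Frequency-fatigue CTR multiplier from the formula above:
    max(0.5, 1 - 0.08 * max(0, log2 r)), r = exposure count."""
    return max(0.5, 1.0 - 0.08 * max(0.0, math.log2(r)))
```

At x=1 the ratio is exactly 1; doubling to x=2 yields well under 2× impressions, and repeated exposures decay CTR down to the 0.5 floor.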

Causal Neural Hawkes Process — primary diffusion forecaster

Transformer-parameterized neural temporal point process for 14-day cascading engagement forecasting, with first-class support for counterfactual rollouts under do() interventions.

Architectural references:

  • Mei & Eisner (NeurIPS 2017)The Neural Hawkes Process — continuous-time neural intensity function, foundation of the field
  • Zuo et al. (ICML 2020)Transformer Hawkes Process — self-attention encoder replacing the original CT-LSTM; directly the backbone of this implementation
  • Shchur et al. (ICLR 2020)Intensity-Free Learning of TPPs — closed-form inter-event-time head for fast sampling
  • Chen et al. (ICLR 2021)Neural Spatio-Temporal Point Processes — Monte Carlo estimator for the log-likelihood compensator
  • Geng et al. (NeurIPS 2022)Counterfactual Temporal Point Processes — the intervention semantics for marked point processes
  • Noorbakhsh & Rodriguez (2022)Counterfactual Temporal Point Processes — formalizes do() queries on event streams

Explicit treatment/control event typing (organic vs paid_boost) and an intervention-aware intensity decoder enable queries like "what if we had stopped boosting on day 3" via a counterfactual rollout loop.

Core component: oransim.diffusion.CausalNeuralHawkesProcess. Architecture, training loop (NLL with MC compensator), forecast sampler (Ogata thinning), and counterfactual rollout are shipped today; pretrained weights land with OrancBench v0.5.

from oransim.diffusion import get_diffusion_model

nh = get_diffusion_model("causal_neural_hawkes")
seed_events = [(0, "impression"), (12, "like")]  # shared by both calls
factual = nh.forecast(seed_events=seed_events)
cf = nh.counterfactual_forecast(
    seed_events,
    intervention={"mute_at_min": 4320}  # stop boosting 3 days in (3 × 1440 min)
)

Requires pip install 'oransim[ml]'.

Parametric Hawkes — classical baseline

Exponential-kernel multivariate Hawkes process (Hawkes 1971). Closed-form intensity and log-likelihood; Ogata (1981) thinning sampler. Zero-dependency fallback and the baseline against which the Causal Neural Hawkes is evaluated on OrancBench.

ph = get_diffusion_model("parametric_hawkes")
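
For reference, the exponential-kernel intensity and Ogata thinning fit in ~20 lines. This is a univariate sketch with illustrative parameters, not the shipped multivariate implementation:

```python
import math
import random

def intensity(t, events, mu=0.2, alpha=0.5, beta=1.0):
    """Exponential-kernel Hawkes intensity (Hawkes 1971):
    lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)).
    Parameter values are illustrative, not the shipped defaults."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

def ogata_thinning(horizon, mu=0.2, alpha=0.5, beta=1.0, seed=0):
    """Ogata (1981) thinning sampler; stable while alpha < beta.
    Between events the intensity only decays, so lambda just after the last
    accepted event is a valid upper bound for the next candidate draw."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while t < horizon:
        lam_bar = intensity(t, events, mu, alpha, beta) + alpha  # upper bound
        t += rng.expovariate(lam_bar)                            # candidate time
        if t < horizon and rng.random() * lam_bar <= intensity(t, events, mu, alpha, beta):
            events.append(t)                                     # accept
    return events

ev = ogata_thinning(horizon=14.0)  # one 14-day event stream
```

In spirit, a "what if we stopped boosting on day 3" query re-runs this rollout with the paid-boost excitation zeroed after t = 3; the CNHP does the same with abducted noise instead of fresh randomness.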
Sandbox — incremental recomputation for "what if"

Scenario sessions persist state so users can iterate: "change budget from 100k to 150k, how does ROI move?" Incremental recomputation avoids redoing the full agent simulation when only budget changes. The agent pool is cached; counterfactual evaluation uses union-semantics CATE over reached vs. unreached populations.


📈 Benchmarks

Phase 1 benchmarks are based on the shipped synthetic corpus (2,000 scenarios + 100 event streams + 50 OrancBench tasks — reproducible from the files under data/synthetic/ and data/benchmarks/). See data/models/data_card.md for the data-generating process. The R² numbers below were run on 10% held-out of those 2k scenarios; larger-corpus numbers land with OrancBench v0.5.

| Metric | R² (synthetic) | Baseline (linear) | Notes |
|---|---|---|---|
| second_wave_click | 0.30 | 0.18 | PRS quantile median |
| first_wave_conversion | 0.33 | 0.21 | PRS quantile median |
| cascade_lift | 0.39 | 0.25 | Second-wave mediator |
| roi_point_estimate | 0.33 | 0.19 | Single-shot regression |
| retention_7d | 0.29 | 0.17 | Longitudinal |

⚠️ Honest reproducibility framing — this is a closed-loop evaluation: the same synthetic data generator (backend/scripts/gen_synthetic_data.py) produces both training and held-out splits, and we evaluate our own model on our own generative process. This measures "does the model fit our generative assumptions", not external validity. For real marketing-decision accuracy you need either (a) an independent real-panel benchmark (Enterprise Edition uses proprietary real-world data) or (b) a public benchmark with out-of-distribution campaigns — the OrancBench v0.5 plan (see ROADMAP.md) is our attempt at the latter.

See docs/en/benchmarks/ for the full protocol.


🗺️ Roadmap — Highlights

See ROADMAP.md for the full 3-horizon × 8-theme plan. Teasers:

v0.2 (Q3 2026) — shipping pretrained weights

  • 📦 Trained Causal Transformer + Causal Neural Hawkes checkpoints on an expanded synthetic corpus (targeting ~100k scenarios for OrancBench v0.5)
  • TikTok + Douyin adapter MVPs
  • Docker Compose · MkDocs · CI

v0.5 (Q4 2026 – Q1 2027)

  • 🎯 Cross-platform transfer learning — pretrain on XHS, fine-tune on TikTok
  • Multi-LLM-format adapters — native Anthropic Messages, Gemini, Qwen DashScope shipped in v0.2; Bedrock Converse + native streaming roadmap item
  • 🎯 10k soul agents on Ray cluster
  • ✅ Instagram / YouTube Shorts / Douyin adapters MVP

v1.0+ (2027)

  • 🎯 Causal Foundation Model — pretrain on 10M+ campaigns
  • 🎯 Closed-loop AI media buying — real-time optimization with safety constraints
  • 🎯 Differential privacy + Federated learning — for brand-proprietary training
  • 15+ platforms, multi-modal creative understanding, vertical sub-benchmarks

🏢 OranAI Enterprise Edition

Oransim OSS ships on synthetic data for transparency and reproducibility. OranAI Enterprise Edition provides:

  • 📊 Real-world training data — continuously updated 1M+ labeled campaigns across beauty, fashion, 3C, F&B, luxury, auto
  • SLA-backed hosted inference — 99.9% uptime, sub-second response
  • 🎯 Vertical world models — beauty / fashion / electronics / F&B specialized calibration
  • 🤝 White-glove onboarding — custom adapter development, integration support, training
  • 🔒 On-premise deployment — with SOC 2 / ISO 27001 / GDPR compliance path
  • 🎓 Managed model updates — no downtime model refresh as platforms evolve

Contact: cto@orannai.com · Book a demo


🤝 Contributing

We love contributions — platform adapters, world-model improvements, docs, benchmarks, translations, bug fixes.

By contributing, you agree your contribution is licensed under Apache-2.0. No CLA required.


📚 Citation

If you use Oransim in research, please cite:

@software{oransim2026,
  author       = {Yin, Fakong and {Oransim contributors}},
  title        = {Oransim: Causal Digital Twin for Marketing at Scale},
  version      = {0.2.0-alpha},
  date         = {2026-04-18},
  url          = {https://github.com/OranAi-Ltd/oransim},
  organization = {OranAI Ltd.}
}

See CITATION.cff for cffconvert-compatible metadata.


📜 License

Apache License 2.0 — see LICENSE and NOTICE.

Copyright (c) 2026 OranAI Ltd. (橙果视界(深圳)科技有限公司) and Oransim contributors.

Third-party dependencies retain their original licenses. We are not affiliated with Xiaohongshu, ByteDance, Meta, Google, or any other platform mentioned in this repository.


💫 Team

Oransim is built by OranAI Ltd. (橙果视界(深圳)科技有限公司).

Core Maintainers

Open roles — we're hiring researchers (Causal ML, RL, Agent-based Simulation) and engineers (Platform, Infra). Reach out at cto@orannai.com.

Contributors appear on CONTRIBUTORS.md (auto-generated).


⭐ Star History

Star History Chart
Built with ☕ in Shenzhen by OranAI. If Oransim helps your work, please ⭐ star the repo — it powers our open-source commitment.
