Commit 6bc4e6f

docs: self hosting nats workers
1 parent 8f3273a commit 6bc4e6f

6 files changed

Lines changed: 456 additions & 128 deletions

docs.json

Lines changed: 17 additions & 3 deletions
@@ -295,8 +295,14 @@
         "self-host/enterprise-on-prem",
         "self-host/self-host-lightdash-docker-compose",
         "self-host/update-lightdash",
-        "self-host/pre-aggregates",
-        "self-host/nats-workers",
+        {
+          "group": "NATS workers",
+          "pages": [
+            "self-host/nats-workers/overview",
+            "self-host/nats-workers/warehouse-workers",
+            "self-host/nats-workers/pre-aggregate-workers"
+          ]
+        },
         {
           "group": "Customize deployment",
           "pages": [
@@ -423,6 +429,14 @@
       "source": "/references/pre-aggregates",
       "destination": "/references/pre-aggregates/overview"
     },
+    {
+      "source": "/self-host/pre-aggregates",
+      "destination": "/self-host/nats-workers/pre-aggregate-workers"
+    },
+    {
+      "source": "/self-host/nats-workers",
+      "destination": "/self-host/nats-workers/warehouse-workers"
+    },
     {
       "source": "/guides/ai-analyst",
       "destination": "/guides/ai-agents"
@@ -677,4 +691,4 @@
       "display": "simple"
     }
   }
-}
+}

self-host/nats-workers.mdx

Lines changed: 0 additions & 41 deletions
This file was deleted.
Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
---
title: "NATS workers"
sidebarTitle: "Overview"
description: "Scale Lightdash query processing with dedicated NATS worker pods using the Helm chart."
---

<Badge color="blue" size="md" shape="pill">Helm chart</Badge>

<Callout icon="wrench" color="#6B7280">
  This page is for engineering teams self-hosting their own Lightdash instance.
</Callout>

By default, Lightdash processes all queries on the main API server. NATS workers move query execution onto dedicated pods, improving responsiveness under load and letting you scale query capacity independently.

Lightdash uses [NATS](https://nats.io/), a lightweight, high-performance messaging system, together with [JetStream](https://docs.nats.io/nats-concepts/jetstream), its built-in persistent streaming layer, to distribute work between the API server and worker pods.

NATS powers two opt-in features in Lightdash:

<CardGroup cols={2}>
  <Card title="Warehouse workers" icon="database" horizontal href="/self-host/nats-workers/warehouse-workers">
    Process interactive and background warehouse queries on dedicated pods.
  </Card>
  <Card title="Pre-aggregate workers" icon="layer-group" horizontal href="/self-host/nats-workers/pre-aggregate-workers">
    Materialize pre-aggregates and serve queries from DuckDB.
  </Card>
</CardGroup>

## Requirements

- **Helm chart** version **2.7.2** or later
- **Lightdash** version [**0.2675.0**](https://hub.docker.com/r/lightdash/lightdash/tags) or later. Older images will fail with `MODULE_NOT_FOUND`.

<Note>
  Upgrading the Helm chart alone does not change how Lightdash works. NATS features are entirely opt-in: your existing deployment behaves exactly as before until you explicitly enable the new Helm values described below.
</Note>

## Architecture

```mermaid
flowchart LR
    API[Lightdash API] -->|publish job| NATS[NATS JetStream]
    NATS -->|deliver message| Worker[Worker pod<br/>concurrency: 100]
    Worker -->|return result| API
```

The Lightdash API publishes jobs to NATS JetStream. Worker pods consume messages from their stream and process them concurrently (default 100 concurrent jobs per pod).

## Enabling NATS

```yaml
nats:
  enabled: true
  config:
    cluster:
      enabled: false
    jetstream:
      enabled: true
      fileStore:
        enabled: false
      memoryStore:
        enabled: true
        maxSize: 1Gi
```

The JetStream settings shown above are the defaults that apply when `nats.enabled` is set to `true`. Enabling NATS deploys a NATS StatefulSet in your namespace. No queries are routed through it until you also enable a worker; see [Warehouse workers](/self-host/nats-workers/warehouse-workers) or [Pre-aggregate workers](/self-host/nats-workers/pre-aggregate-workers).

## Auto-configured environment variables

The chart automatically sets these environment variables in the shared ConfigMap, so you do not need to set them manually:

| Variable | Set when | Value |
| --- | --- | --- |
| `NATS_ENABLED` | `nats.enabled: true` | `"true"` |
| `NATS_URL` | `nats.enabled: true` | `nats://<release>-nats:4222` |

Additional environment variables are auto-configured per worker deployment. See [Warehouse workers](/self-host/nats-workers/warehouse-workers) and [Pre-aggregate workers](/self-host/nats-workers/pre-aggregate-workers) for details.
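
As a rough illustration, the two shared variables would end up in a ConfigMap along these lines. This is a sketch, not the chart's actual rendered output: the release name `lightdash` and the ConfigMap name are assumptions made for the example, and only the two keys and their values come from the table above.

```yaml
# Hypothetical rendered ConfigMap for a release named "lightdash".
# metadata.name is an assumption; the chart may name it differently.
apiVersion: v1
kind: ConfigMap
metadata:
  name: lightdash
data:
  NATS_ENABLED: "true"
  NATS_URL: "nats://lightdash-nats:4222"   # nats://<release>-nats:4222
```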

## NATS JetStream configuration

JetStream supports two [storage backends](https://docs.nats.io/nats-concepts/jetstream/streams#storagetype). We recommend memory store, but file store may suit some deployments better.

### Memory store vs file store

| | Memory store (recommended) | File store |
| --- | --- | --- |
| **How it works** | Messages are held in RAM | Messages are persisted to disk |
| **Performance** | Faster: no disk I/O overhead | Slower: writes go through disk |
| **Persistence** | Messages are lost if NATS restarts | Messages survive NATS restarts |
| **Infrastructure** | No PersistentVolumeClaim needed | Requires a PersistentVolumeClaim |
| **When to use** | Most deployments. Lightdash messages are small (just a query UUID) and are deleted once processed. | High message volume exceeding available RAM, or if you need messages to survive NATS pod restarts. |

For more details, see the NATS documentation on [JetStream](https://docs.nats.io/nats-concepts/jetstream) and [stream storage types](https://docs.nats.io/nats-concepts/jetstream/streams#storagetype).
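
If you do need messages to survive a NATS restart, a file-store variant of the values might look like the sketch below. The `fileStore.enabled` and `memoryStore.enabled` keys appear on this page; the `pvc` block is assumed to pass through to the upstream NATS Helm chart, and the `10Gi` size is a placeholder, so check both against your chart version before applying.

```yaml
nats:
  enabled: true
  config:
    cluster:
      enabled: false
    jetstream:
      enabled: true
      memoryStore:
        enabled: false
      fileStore:
        enabled: true      # persist messages to disk
        pvc:               # assumption: key from the upstream NATS chart
          enabled: true
          size: 10Gi       # placeholder; size for your message volume
```

File store trades some latency for durability, so it is worth enabling only when one of the "When to use" conditions in the table applies.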

### Recommended configuration

```yaml
nats:
  enabled: true
  config:
    cluster:
      enabled: false        # single-node NATS, no clustering
    jetstream:
      enabled: true
      fileStore:
        enabled: false      # no disk persistence
      memoryStore:
        enabled: true
        maxSize: 1Gi        # max memory for message storage
```

| Setting | Recommended | Description |
| --- | --- | --- |
| `nats.config.jetstream.memoryStore.enabled` | `true` | Enable memory-backed storage |
| `nats.config.jetstream.memoryStore.maxSize` | `1Gi` | Maximum memory for JetStream message storage |
| `nats.config.jetstream.fileStore.enabled` | `false` | Enable disk-backed storage |
| `nats.config.cluster.enabled` | `false` | Single-node NATS (no clustering) |

### Pod disruption

NATS is a stateful component: if the NATS pod restarts, in-flight messages are lost and users will need to retry the affected queries. The chart guards against evictions, such as node drains and cluster-autoscaler scale-downs, with:

- `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` annotation
- `PodDisruptionBudget` with `maxUnavailable: 0`
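
For reference, these two protections correspond to standard Kubernetes configuration roughly like the following. This is a sketch, not the chart's rendered output: the resource name and the selector label are assumptions chosen for illustration.

```yaml
# PodDisruptionBudget that blocks voluntary evictions of the NATS pod.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: lightdash-nats                # hypothetical name
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app.kubernetes.io/name: nats    # hypothetical label
---
# Fragment of the NATS pod template metadata: tells the cluster
# autoscaler not to evict this pod when removing a node.
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```

With `maxUnavailable: 0`, a `kubectl drain` of the node hosting NATS will wait on the eviction until the budget is relaxed or the pod is deleted manually, so plan node maintenance accordingly.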
