feat: optional ClickHouse dual-sink for analytics #58

@pthmas

Summary

Add optional ClickHouse support as a parallel write sink alongside PostgreSQL. PG continues to handle point lookups (get block, get tx by hash); ClickHouse handles analytical queries (aggregations, time-range scans, charts, top-N rankings).

Motivation

Atlas's analytics endpoints (daily tx charts, gas price history, token holder rankings) all hit PostgreSQL today. As chain history grows into hundreds of millions of rows, these queries will degrade — ClickHouse is purpose-built for this workload and can be orders of magnitude faster for full-column scans and aggregations.

Design

Dual-Sink Writer

  1. Add a ClickHouseSink that writes blocks, txs, logs, and receipts using the clickhouse crate (RowBinary format + LZ4 compression)
  2. Wrap PG + CH sinks in a SinkSet abstraction that fans out writes via tokio::try_join! (parallel writes, both must succeed)
  3. CH is optional — enabled via ENABLE_CLICKHOUSE=true + CLICKHOUSE_URL
  4. CH sink should chunk large batches (e.g., 10K rows per INSERT) to bound memory
  5. Use ReplacingMergeTree engine so re-inserts after crashes are safe (duplicates merged away)
  6. CH sink gets its own retry logic (3 attempts, exponential backoff)
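
The chunking (item 4) and retry (item 6) behavior could look roughly like the following sketch. Names like `retry_with_backoff` and `insert_chunked` are illustrative, not a final API, and the real sink would be async (fanning out to both sinks with `tokio::try_join!`); this synchronous version only shows the control flow.

```rust
use std::thread::sleep;
use std::time::Duration;

const MAX_ATTEMPTS: u32 = 3;
const CHUNK_SIZE: usize = 10_000; // would come from CH_INSERT_CHUNK_SIZE

/// Retry a fallible operation up to MAX_ATTEMPTS times with exponential backoff.
fn retry_with_backoff<T, E>(mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut delay = Duration::from_millis(100);
    let mut attempt = 1;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= MAX_ATTEMPTS => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2; // 100ms, 200ms, ...
                attempt += 1;
            }
        }
    }
}

/// Insert rows in bounded chunks so one huge batch never becomes one huge INSERT;
/// each chunk gets its own retry budget.
fn insert_chunked<R>(
    rows: &[R],
    mut insert: impl FnMut(&[R]) -> Result<(), String>,
) -> Result<(), String> {
    for chunk in rows.chunks(CHUNK_SIZE) {
        retry_with_backoff(|| insert(chunk))?;
    }
    Ok(())
}
```

Because ReplacingMergeTree deduplicates on the sort key (item 5), a chunk that succeeded before a crash can safely be re-inserted on restart.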

ClickHouse Schema

  • Mirror the core PG tables: blocks, transactions, event_logs, receipts
  • Partition by month: toYYYYMM(timestamp) for efficient time-range pruning
  • Bloom filter indexes on event_logs(topic0, address) for fast event filtering
  • Order keys optimized for analytical access patterns (e.g., logs ordered by (address, topic0, block_number))
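
Putting the points above together, the `event_logs` DDL might look like this sketch (column names and types are assumptions, not the final schema):

```sql
CREATE TABLE event_logs
(
    block_number UInt64,
    log_index    UInt32,
    address      FixedString(20),
    topic0       FixedString(32),
    data         String,
    timestamp    DateTime,
    INDEX idx_topic0  topic0  TYPE bloom_filter GRANULARITY 4,
    INDEX idx_address address TYPE bloom_filter GRANULARITY 4
)
ENGINE = ReplacingMergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (address, topic0, block_number, log_index);
```

Including `log_index` in the sort key keeps one row per log while still letting ReplacingMergeTree collapse exact duplicates from re-inserts.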

Query Routing

  • Analytics API endpoints (charts, top holders, aggregations) route to ClickHouse when available
  • Point lookups (get block by number, get tx by hash) stay on PostgreSQL
  • Could expose an ?engine=clickhouse query param for explicit routing
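
The routing decision is small enough to sketch; `QueryKind`, `Engine`, and `route` are hypothetical names, with `forced` modeling the optional `?engine=` override:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Engine { Postgres, ClickHouse }

#[derive(Clone, Copy)]
enum QueryKind { PointLookup, Analytics }

/// Analytics goes to ClickHouse when it is enabled and reachable; point
/// lookups (and everything else whenever CH is unavailable) stay on PG.
fn route(kind: QueryKind, ch_available: bool, forced: Option<Engine>) -> Engine {
    if let Some(engine) = forced {
        // Honor an explicit override only if that engine can actually serve.
        if engine == Engine::Postgres || ch_available {
            return engine;
        }
    }
    match kind {
        QueryKind::Analytics if ch_available => Engine::ClickHouse,
        _ => Engine::Postgres,
    }
}
```

Falling back to PG when CH is down means enabling the dual-sink never makes previously working endpoints fail.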

PG → CH Backfill

  • One-time migration tool that reads historical data from PG and bulk-loads into CH
  • Allows adding ClickHouse to an existing Atlas deployment without re-indexing from the chain
  • Should run as a background task with configurable concurrency and throttling
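
One way the backfill could bound its work: split history into fixed-size block windows, hand windows to a worker pool, and sleep between windows for throttling. The `windows` helper below is hypothetical and only shows the partitioning step:

```rust
/// Split the inclusive block range [from, to] into windows of at most `size`
/// blocks each, returned as inclusive (start, end) pairs. Each window would be
/// read from PG and bulk-inserted into CH by one worker.
fn windows(from: u64, to: u64, size: u64) -> Vec<(u64, u64)> {
    assert!(size > 0 && from <= to);
    let mut out = Vec::new();
    let mut start = from;
    while start <= to {
        let end = (start + size - 1).min(to);
        out.push((start, end));
        start = end + 1;
    }
    out
}
```

Because ReplacingMergeTree absorbs duplicates, a crashed backfill can simply restart from its last checkpointed window without cleanup.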

Configuration

Var                   Default  Description
ENABLE_CLICKHOUSE     false    Enable ClickHouse dual-sink
CLICKHOUSE_URL        (none)   ClickHouse HTTP endpoint (required if enabled)
CLICKHOUSE_DATABASE   atlas    Database name
CLICKHOUSE_USER       default  Auth user
CLICKHOUSE_PASSWORD   (none)   Auth password
CH_INSERT_CHUNK_SIZE  10000    Rows per INSERT batch
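
A sketch of how these variables could be validated at startup; `ChConfig` and `ch_config` are hypothetical names, and parsing from a map rather than `std::env` directly keeps the sketch testable:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct ChConfig {
    url: String,
    database: String,
    user: String,
    password: String,
    chunk_size: usize,
}

/// Returns Ok(None) when ClickHouse is disabled, and an error when it is
/// enabled but CLICKHOUSE_URL is missing or a value fails to parse.
fn ch_config(env: &HashMap<String, String>) -> Result<Option<ChConfig>, String> {
    if env.get("ENABLE_CLICKHOUSE").map(String::as_str) != Some("true") {
        return Ok(None);
    }
    let url = env
        .get("CLICKHOUSE_URL")
        .cloned()
        .ok_or_else(|| "CLICKHOUSE_URL is required when ENABLE_CLICKHOUSE=true".to_string())?;
    let chunk_size = env
        .get("CH_INSERT_CHUNK_SIZE")
        .map(|s| s.parse::<usize>().map_err(|_| "invalid CH_INSERT_CHUNK_SIZE".to_string()))
        .transpose()?
        .unwrap_or(10_000);
    Ok(Some(ChConfig {
        url,
        database: env.get("CLICKHOUSE_DATABASE").cloned().unwrap_or_else(|| "atlas".into()),
        user: env.get("CLICKHOUSE_USER").cloned().unwrap_or_else(|| "default".into()),
        password: env.get("CLICKHOUSE_PASSWORD").cloned().unwrap_or_default(),
        chunk_size,
    }))
}
```

Failing fast on a missing URL keeps misconfiguration visible at boot instead of surfacing as silent write errors later.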

Docker Compose

  • Add optional ClickHouse service (clickhouse/clickhouse-server)
  • Only started when the corresponding compose profile is active, so deployments with ENABLE_CLICKHOUSE=false are unaffected
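
A possible compose fragment using profiles (service name, profile name, and volume are assumptions):

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    profiles: ["clickhouse"]   # started only with --profile clickhouse (or COMPOSE_PROFILES=clickhouse)
    ports:
      - "8123:8123"            # HTTP interface that CLICKHOUSE_URL points at
    volumes:
      - clickhouse-data:/var/lib/clickhouse
volumes:
  clickhouse-data:
```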

References

  • tidx implementation: sync/sink.rs (SinkSet fan-out), sync/ch_sink.rs (ClickHouse writer with chunked inserts + retry), clickhouse.rs (read engine with failover)
  • tidx uses ReplacingMergeTree + monthly partitioning + bloom filter indexes on logs
