Add COLUMNAR_MAP index for per-key columnar storage for dense key and JSON storage for sparse key for MAP datatype #17896
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #17896 +/- ##
=============================================
- Coverage 63.31% 34.93% -28.39%
+ Complexity 1627 789 -838
=============================================
Files 3229 3255 +26
Lines 196705 198612 +1907
Branches 30408 30770 +362
=============================================
- Hits 124544 69380 -55164
- Misses 62183 123101 +60918
+ Partials 9978 6131 -3847
|
|
@tarun11Mavani Did you explore adding sparse map handling as part of the existing MAP (ComplexFieldSpec)? What were the challenges? I was wondering whether you could plug a SparseIndexCreator/reader (maybe via a config) into the MAP data type rather than creating a new data type? |
Actually, I started with this before deciding to move towards having a dedicated data type. The short answer is that MAP and SPARSE_MAP have different enough semantics that extending MAP cleanly is harder than it appears. What MAP currently does: a MAP column has a single forward index storing the raw map blob per document. Key access is done by deserializing the full map on the fly via Why plugging
That said, a hybrid path is difficult but possible. Happy to explore that direction if the community prefers not adding a new DataType. |
|
Thanks for the detailed write-up and the work put into this. My main question before this moves forward: do we need a new DataType here, or can Looking at the existing SPI, MapIndexReader already has exactly the right Map<IndexType, R> getKeyIndexes(String key); And MapDataSource.getKeyDataSource(key) already returns a per-key DataSource — The per-key typing gap (ComplexFieldSpec has one homogeneous value type) seems { Existing schemas are unaffected; keyTypes absent → current behavior. If we go this route, the core storage work (OnHeapSparseMapIndexCreator, The query layer (PR 2) should also work without new operator types since the |
Thanks for the review! I have addressed this. The latest changes remove SPARSE_MAP from |
16e23f0 to d66fb54
|
@tarun11Mavani Do we need a new index type? I think the storage format can be a flag in the forward index itself. |
I had two options in mind for columnar storage of the MAP type.

Option A (current): Explicit IndexType.COLUMNAR_MAP in fieldConfigList

I went with A because it follows the same pattern as the TEXT, JSON, and RANGE indexes: every storage/index optimization in Pinot is table-config-driven via fieldConfigList. The schema defines the logical model (what the data is); the table config defines the physical storage (how to store it). This separation means:

1. Same schema works with blob or columnar storage: you can toggle without a schema change

Option B couples type semantics (keyTypes) with storage layout, and still needs table config for properties anyway, so it doesn't actually eliminate the second config location; it just removes the activation flag.

The dispatch in ImmutableSegmentImpl is clean: if columnarMapReader != null → ColumnarMapDataSource, else → ImmutableMapDataSource (blob path). Both implement MapDataSource, so the query layer uses the same MAP index APIs regardless of storage mode.

Happy to rename COLUMNAR_MAP to just MAP in the IndexType enum if that reads better. The key point is keeping activation in table config rather than schema. |
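For readers following the thread, the dispatch described in the comment above could look roughly like the sketch below. The three types are hypothetical stand-ins that only borrow the names from the comment; the real Pinot classes carry far more state, and `storageMode()` and `create(...)` are invented here purely for illustration.

```java
// Hypothetical stand-in for the shared interface named in the comment.
// storageMode() is an illustrative method, not part of Pinot.
interface MapDataSource {
  String storageMode();
}

// Blob path: the whole map is stored as one value per document.
final class ImmutableMapDataSource implements MapDataSource {
  @Override
  public String storageMode() {
    return "blob";
  }
}

// Columnar path: backed by a per-key reader (modelled as a plain Object here).
final class ColumnarMapDataSource implements MapDataSource {
  private final Object _columnarMapReader;

  ColumnarMapDataSource(Object columnarMapReader) {
    _columnarMapReader = columnarMapReader;
  }

  @Override
  public String storageMode() {
    return "columnar";
  }
}

final class MapDataSourceDispatch {
  private MapDataSourceDispatch() {
  }

  // Picks the columnar data source only when a columnar reader was loaded
  // for the column; otherwise falls back to the blob path.
  static MapDataSource create(Object columnarMapReader) {
    if (columnarMapReader != null) {
      return new ColumnarMapDataSource(columnarMapReader);
    }
    return new ImmutableMapDataSource();
  }

  public static void main(String[] args) {
    System.out.println(create(null).storageMode());
    System.out.println(create(new Object()).storageMode());
  }
}
```

Because both branches return the same interface, callers in this sketch never need to know which storage mode is active, which is the property the comment relies on.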
b58b73f to a276e48
|
Do we have a conclusion here? Did we decide on index type or field type? |
For now, I have implemented this as a new indexType for MAP datatype. I am connecting with @raghavyadav01 and @Jackie-Jiang tomorrow to discuss this further. |
|
Here is the RFC based on the offline discussion with @Jackie-Jiang and @raghavyadav01. I will split the work into smaller PRs once the design looks good. |
3b26bc2 to d581105
40651ac to e054bb0
…immutable read path

Introduces the COLUMNAR_MAP index type for MAP columns with per-key columnar storage. Includes ComplexFieldSpec enhancements, SPMX v3 binary format with dense/sparse two-tier storage, dictionary encoding, forward index reader with co-iterator, per-key inverted index, and index plugin/type/handler wiring.

Format details:
- 56-byte header (magic + version + numKeys + numDocs + numDenseKeys + numSparseKeys + 4 section offsets)
- 70-byte key metadata (tier flag + storedType + numDocs + 4 offset/length pairs for nullBitmap/forward/inverted/dictIdForward)
- Dense tier: full forward index per key with run-optimized null bitmap
- Sparse tier: JSON sidecar file with per-key SPMX entries reduced to type metadata (per-key presence bitmap added in PR-2 query layer)

Quality fixes (from self-review):
- sortValues() uses type-aware comparator matching ColumnarMapKeyDictionary, preventing wrong range query results and GROUP BY ordering for numeric keys
- Sparse sidecar JSON serialization uses Jackson ObjectMapper to handle control characters per RFC 8259
- Class-level Javadoc accurately documents the 56-byte header and 70-byte key metadata layout
- StandardIndexes.columnarMap() returns parameterized IndexType<> matching other accessor methods
- Preconditions.checkState guards bufferSize long-to-int cast
- ColumnarMapIndexHandler.updateIndices declares throws Exception
- DataOutputStream wrapped in try-with-resources
- WARN log when sparse sidecar missing but SPMX has sparse keys

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
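As a sanity check on the sizes quoted in the commit message, the sketch below shows one set of field widths that adds up to 56 and 70 bytes. The widths are an assumption: the commit message lists the fields but not their sizes. The class name, the magic value, and `writeHeader` are hypothetical and are not taken from the PR.

```java
import java.nio.ByteBuffer;

final class SpmxLayoutSketch {
  // Assumed values: the source names a magic and a version field ("SPMX v3")
  // but does not state how they are encoded.
  static final int MAGIC = 0x53504D58; // ASCII "SPMX", hypothetical
  static final int VERSION = 3;

  // Assumed widths: six 4-byte ints (magic, version, numKeys, numDocs,
  // numDenseKeys, numSparseKeys) followed by four 8-byte section offsets.
  static final int HEADER_SIZE = 6 * Integer.BYTES + 4 * Long.BYTES;

  // Assumed widths: 1-byte tier flag, 1-byte storedType, 4-byte numDocs, then
  // four (offset, length) pairs of 8-byte longs for
  // nullBitmap / forward / inverted / dictIdForward.
  static final int KEY_METADATA_SIZE = 2 * Byte.BYTES + Integer.BYTES + 4 * 2 * Long.BYTES;

  private SpmxLayoutSketch() {
  }

  // Serializes a header under the assumed layout and returns a buffer
  // positioned at 0 and limited to the bytes written.
  static ByteBuffer writeHeader(int numKeys, int numDocs, int numDenseKeys, int numSparseKeys,
      long[] sectionOffsets) {
    if (sectionOffsets.length != 4) {
      throw new IllegalArgumentException("expected exactly 4 section offsets");
    }
    ByteBuffer buffer = ByteBuffer.allocate(HEADER_SIZE);
    buffer.putInt(MAGIC);
    buffer.putInt(VERSION);
    buffer.putInt(numKeys);
    buffer.putInt(numDocs);
    buffer.putInt(numDenseKeys);
    buffer.putInt(numSparseKeys);
    for (long offset : sectionOffsets) {
      buffer.putLong(offset);
    }
    buffer.flip();
    return buffer;
  }

  public static void main(String[] args) {
    System.out.println(HEADER_SIZE);       // 56
    System.out.println(KEY_METADATA_SIZE); // 70
  }
}
```

Other width combinations could produce the same totals, so the class-level Javadoc mentioned under the quality fixes remains the authority on the actual layout.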
e054bb0 to 545d24e
Summary
This is PR 1 of 2 introducing columnar MAP storage — an opt-in index for MAP columns in Apache Pinot that stores each key in its own columnar format with two-tier (dense/sparse) storage. This PR covers the full storage layer: SPI interfaces, binary format, segment creation, immutable reader, dictionary encoding, and lifecycle wiring.
Stack
Motivation
Many Pinot use cases need to store and query semi-structured map data (e.g., user metrics, event properties, feature stores) where the set of keys varies across records. Today users must flatten maps into individual columns at ingestion time or store them as opaque JSON blobs without query pushdown. The columnar MAP index solves this by storing maps in a columnar per-key format that supports typed access, filtering, and GROUP BY at query time — without requiring schema changes when new keys appear.
The benefits are discussed in detail in #17894.
RFC: https://docs.google.com/document/d/14kPmjDTKbO8l0ql4rrN7I5Yki5pqMw6GeGmxxc9grsU/edit?tab=t.0
What's in this PR
Schema & Config (pinot-spi)
- DataType.MAP from ComplexFieldSpec. Columnar storage is an opt-in index via indexTypes: ["COLUMNAR_MAP"] in table config
- ComplexFieldSpec.MapFieldSpec: extended with optional keyTypes (per-key type declarations) and defaultValueType for undeclared keys
- ColumnarMapIndexConfig: controls denseKeyThreshold, explicit denseKeys, maxKeys, inverted index enablement, and noDictionaryKeys
- FieldConfig.IndexType.COLUMNAR_MAP: explicit index type enum for opt-in

SPI Interfaces (pinot-segment-spi)
- ColumnarMapIndexCreator: segment-build-time interface; add(Map<String, Object>) per doc, seal() to flush
- ColumnarMapIndexReader: query-time interface; typed getters (getInt, getString, ...), presence bitmaps, inverted index lookups, per-key DataSource access
- V1Constants and StandardIndexes for index type registration
- ColumnMetadataImpl: extended to persist per-key type declarations in segment metadata

Index Implementation (pinot-segment-local)
Binary format (.columnarmap.idx, SPMX v3) with two-tier storage:

Sparse sidecar file (.columnarmap.sparse):

Two-tier storage design:
- denseKeyThreshold (default 0.5) or explicitly listed in denseKeys. One forward index entry per segment document, O(1) access by docId, null bitmap tracks absent documents

Key storage features:
- noDictionaryKeys forces raw encoding
- rank() call), with optimized null bitmaps for absent documents

Segment Creation Wiring
- BaseSegmentCreator: skips forward index creation for MAP columns with COLUMNAR_MAP enabled; persists per-key type metadata in segment properties
- ColumnarMapColumnPreIndexStatsCollector: lightweight stats collector (doc count only, no min/max/cardinality)
- StatsCollectorUtil: routes MAP columns to the columnar map stats collector
- ColumnMinMaxValueGenerator: skips min/max generation for MAP columns
- SegmentGeneratorConfig: exposes getColumnarMapColumnNames() for metadata persistence

How to Use
1. Schema definition (complexFieldSpecs)

    {
      "schemaName": "myTable",
      "complexFieldSpecs": [
        {
          "name": "metrics",
          "dataType": "MAP",
          "keyTypes": {
            "clicks": "LONG",
            "spend": "DOUBLE",
            "country": "STRING"
          },
          "defaultValueType": "STRING"
        }
      ]
    }

- keyTypes (optional): declares known keys and their data types for type coercion
- defaultValueType (optional): type for undeclared/dynamic keys (defaults to STRING)

2. Table config (fieldConfigList with COLUMNAR_MAP index)

    {
      "fieldConfigList": [
        {
          "name": "metrics",
          "indexTypes": ["COLUMNAR_MAP"],
          "properties": {
            "maxKeys": "1000",
            "denseKeyThreshold": "0.5",
            "denseKeys": "country,sessions",
            "enableInvertedIndexForAll": "false",
            "invertedIndexKeys": "country,sessions"
          }
        }
      ]
    }

What's NOT in this PR (comes in PR 2)
- ColumnarMapDataSource: per-key query routing and DataSource construction
- MutableColumnarMapIndexImpl: consuming segment support with O(1) lock-free reads
- MapFilterOperator: per-key inverted index filter strategy, IS NULL/IS NOT NULL
- ItemTransformFunction: null bitmap propagation for item() expressions
- ImmutableSegmentImpl / MutableSegmentImpl: segment loading wiring

Test plan
- ColumnarMapDataTypeTest: 11 tests (schema serialization, ComplexFieldSpec round-trip, keyTypes/defaultValueType)
- ColumnarMapIndexConfigTest: 8 tests (config deserialization, denseKeys, properties round-trip)
- ColumnarMapSegmentCreationTest: 1 test (end-to-end segment build pipeline with COLUMNAR_MAP index)
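
To illustrate the dense/sparse split governed by denseKeyThreshold and denseKeys, here is a minimal sketch of how keys might be assigned to the dense tier. It assumes a key is dense when the fraction of documents containing it is at least denseKeyThreshold, or when it is listed explicitly in denseKeys. The comparison direction, the class name, and the method signature are assumptions for illustration and are not taken from the PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DenseKeyClassifierSketch {
  private DenseKeyClassifierSketch() {
  }

  // Returns the keys assigned to the dense tier; every other key is sparse.
  // docCountPerKey maps each key to the number of documents that contain it.
  static Set<String> denseKeys(Map<String, Integer> docCountPerKey, int totalDocs,
      double denseKeyThreshold, Set<String> explicitDenseKeys) {
    Set<String> dense = new TreeSet<>();
    for (Map.Entry<String, Integer> entry : docCountPerKey.entrySet()) {
      double fillRate = totalDocs == 0 ? 0.0 : (double) entry.getValue() / totalDocs;
      // Assumed rule: explicit listing wins, otherwise compare the fill rate
      // against the threshold.
      if (explicitDenseKeys.contains(entry.getKey()) || fillRate >= denseKeyThreshold) {
        dense.add(entry.getKey());
      }
    }
    return dense;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    counts.put("clicks", 90);  // present in 90% of documents
    counts.put("spend", 10);   // present in 10% of documents
    counts.put("country", 5);  // rare, but listed explicitly below
    Set<String> explicit = new HashSet<>(Arrays.asList("country"));
    System.out.println(denseKeys(counts, 100, 0.5, explicit)); // [clicks, country]
  }
}
```

In this sketch, "spend" falls below the 0.5 threshold and is not listed explicitly, so it would go to the sparse tier.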