HIVE-25948: Iceberg: Enable cost-based selection between Fanout and Clustered writers using column stats NDV #6389

Open
deniskuzZ wants to merge 1 commit into apache:master from deniskuzZ:HIVE-25948

deniskuzZ (Member) commented Mar 24, 2026

What changes were proposed in this pull request?

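Choose between the Fanout and Clustered writers for Iceberg tables with a cost-based decision driven by the NDV (number of distinct values) from column stats.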

Why are the changes needed?

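Performance optimization: skip the sort that the Clustered writer needs when the NDV is low enough for the Fanout writer.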

Does this PR introduce any user-facing change?

No

How was this patch tested?

mvn test -Dtest=TestIcebergCliDriver -Dqfile=dynamic_partition_writes.q

┌───────────────────────┬───────────────────────────┐
│       Scenario        │         Expected          │
├───────────────────────┼───────────────────────────┤
│ threshold=0 (default) │ no sort (NDV<MAX_WRITERS) │
├───────────────────────┼───────────────────────────┤
│ threshold=-1          │ no sort                   │
├───────────────────────┼───────────────────────────┤
│ threshold=1           │ sort                      │
├───────────────────────┼───────────────────────────┤
│ threshold=2           │ sort (NDV>2)              │
├───────────────────────┼───────────────────────────┤
│ threshold=100         │ no sort (NDV<=100)        │
├───────────────────────┼───────────────────────────┤
│ fanout=false          │ sort                      │
└───────────────────────┴───────────────────────────┘
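The scenarios in the table can be read as a small decision function. The sketch below is only an illustration of that reading, not the code in this patch: the class, method, and parameter names (`WriterSelectionSketch`, `choose`, `fanoutEnabled`, `maxWriters`) are hypothetical, "sort" is assumed to mean the Clustered writer path and "no sort" the Fanout path, and the behavior at exactly NDV == MAX_WRITERS is a guess because the table does not cover it.

```java
// Hypothetical sketch of the decision table above; not the patch's actual code.
class WriterSelectionSketch {
  enum WriterKind { FANOUT, CLUSTERED }

  // threshold < 0  : never sort (Fanout)
  // threshold == 0 : cost-based, compare NDV against maxWriters
  // threshold == 1 : always sort (Clustered)
  // threshold > 1  : sort only when NDV exceeds the threshold
  static WriterKind choose(boolean fanoutEnabled, int threshold, long ndv, long maxWriters) {
    if (!fanoutEnabled) {
      return WriterKind.CLUSTERED; // fanout=false row: always sort
    }
    if (threshold < 0) {
      return WriterKind.FANOUT;
    }
    if (threshold == 1) {
      return WriterKind.CLUSTERED;
    }
    // Assumption: the boundary case NDV == limit stays on the Fanout path.
    long limit = (threshold == 0) ? maxWriters : threshold;
    return ndv > limit ? WriterKind.CLUSTERED : WriterKind.FANOUT;
  }

  public static void main(String[] args) {
    long ndv = 3;          // assumed NDV of the partition column
    long maxWriters = 100; // assumed value; the real MAX_WRITERS is not stated here
    for (int threshold : new int[] {0, -1, 1, 2, 100}) {
      System.out.println("threshold=" + threshold + " -> " + choose(true, threshold, ndv, maxWriters));
    }
    System.out.println("fanout=false -> " + choose(false, 0, ndv, maxWriters));
  }
}
```

With the assumed values (NDV of 3, 100 max writers), each call in `main` lines up with one row of the table.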

deniskuzZ changed the title from "HIVE-25948: Enable cost-based selection between FanoutWriter and ClusteredWriter for Iceberg tables based on column stats NDV" to "HIVE-25948: Iceberg: Enable cost-based selection between FanoutWriter and ClusteredWriter based on column stats NDV" Mar 24, 2026
deniskuzZ changed the title from "HIVE-25948: Iceberg: Enable cost-based selection between FanoutWriter and ClusteredWriter based on column stats NDV" to "HIVE-25948: Iceberg: Enable cost-based selection between Fanout and Clustered writers using column stats NDV" Mar 24, 2026