(improvement) (python code path only): cache namedtuple class in named_tuple_factory to avoid … by mykaul · Pull Request #740 · scylladb/python-driver

mykaul · 2026-03-13T10:08:20Z

Cache the Row namedtuple class keyed on tuple(colnames) so Python's namedtuple() (which internally calls exec()) is only invoked once per unique column schema. For prepared statements the column names never change, eliminating redundant class creation on every result set.

Motivation

named_tuple_factory is the default row_factory in the driver. Every call to namedtuple('Row', columns) internally calls exec() to generate a new class -- this is surprisingly expensive. For prepared statements executing the same query repeatedly, the column names never change, yet we pay the namedtuple() + exec() cost on every result set.

Benchmark results

Benchmarks compare the original code (Before) against the new cached implementation (After). All timings in us (microseconds).

10 columns, 1 row (isolates class creation overhead):

| Variant | Min (us) | Mean (us) | Median (us) | Ops/sec | Speedup |
|---|---|---|---|---|---|
| Before (original) | 43.49 | 59.98 | 47.65 | 16,700 | n/a |
| After (with cache) | 0.24 | 0.45 | 0.35 | 2,210,000 | ~133x |

5 columns, 100 rows:

| Variant | Min (us) | Mean (us) | Median (us) | Ops/sec | Speedup |
|---|---|---|---|---|---|
| Before (original) | 57.4 | 91.2 | 65.8 | 10,969 | n/a |
| After (with cache) | 19.3 | 25.3 | 24.0 | 39,594 | ~3.6x |

10 columns, 100 rows:

| Variant | Min (us) | Mean (us) | Median (us) | Ops/sec | Speedup |
|---|---|---|---|---|---|
| Before (original) | 56.7 | 101.9 | 75.6 | 9,813 | n/a |
| After (with cache) | 18.1 | 21.4 | 20.4 | 46,825 | ~4.8x |
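
For readers who want to reproduce the general effect locally, here is a rough micro-benchmark sketch. It is not the benchmark used for the tables above: the column names, the `uncached`/`cached` helpers, and the iteration count are all illustrative assumptions, and it only measures class creation plus one row construction, not the driver's full row factory.

```python
import timeit
from collections import namedtuple

# Hypothetical 10-column schema and a single row, loosely mirroring the
# "10 columns, 1 row" case; names and values are made up for illustration.
columns = tuple(f"col{i}" for i in range(10))
row = tuple(range(10))
cache = {}  # illustrative cache: tuple of column names -> Row class


def uncached():
    # Builds a fresh namedtuple class on every call.
    Row = namedtuple("Row", columns)
    return Row(*row)


def cached():
    # Builds the class once, then reuses it on later calls.
    Row = cache.get(columns)
    if Row is None:
        Row = cache[columns] = namedtuple("Row", columns)
    return Row(*row)


n = 2000
t_uncached = timeit.timeit(uncached, number=n) / n * 1e6
t_cached = timeit.timeit(cached, number=n) / n * 1e6
print(f"uncached: {t_uncached:.2f} us/call, cached: {t_cached:.2f} us/call")
```

Absolute numbers will vary with the Python version and machine, so treat the output as a sanity check rather than a replacement for the figures above.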

Design notes

- Cache is a plain dict keyed on tuple(colnames) (raw column names before cleaning).
- Error handling paths (SyntaxError, Exception) are preserved unchanged.
- Cache holds one entry per distinct column-name tuple, so it is naturally bounded by the number of distinct queries.

Tests

All existing unit tests pass (46 passed).

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
I added relevant tests for new features and bug fixes.
All commits compile, pass static checks and pass test.
PR description sums up the changes and reasons why they should be introduced.
I have provided docstrings for the public items that I want to introduce.
I have adjusted the documentation in ./docs/source/.
I added appropriate Fixes: annotations to PR description.

…repeated exec() calls Cache the Row namedtuple class keyed on tuple(colnames) so Python's namedtuple() (which internally calls exec()) is only invoked once per unique column schema. For prepared statements the column names never change, eliminating redundant class creation on every result set. Cache is a plain dict keyed on tuple(colnames) (raw column names before cleaning). Error handling paths (SyntaxError, Exception) preserved unchanged. Cache is naturally bounded by the number of distinct queries.

mykaul changed the title ~~(improvement) cache namedtuple class in named_tuple_factory to avoid …~~ (improvement) (python code path only): cache namedtuple class in named_tuple_factory to avoid … Mar 13, 2026

mykaul marked this pull request as draft March 13, 2026 10:13

This was referenced Mar 14, 2026

Tracking: Vector search (VectorType) performance improvement PRs #746 (Open)

Tracking: General (non-vector) performance improvement PRs #747 (Open)

(improvement) LWT prepared statement performance: analysis and improvement plan #751 (Open)

mykaul force-pushed the perf/cache-named-tuple-factory branch from 9ea1ed1 to 8398d42 Compare April 2, 2026 17:08

mykaul force-pushed the perf/cache-named-tuple-factory branch from 8398d42 to 9a76016 Compare April 3, 2026 18:15

mykaul wants to merge 1 commit into scylladb:master from mykaul:perf/cache-named-tuple-factory
