(improvement) unit tests for benchmarking query planning. by mykaul · Pull Request #653 · scylladb/python-driver

mykaul · 2026-01-24T11:33:54Z

Not a very scientific one, but reasonable to get some measurements in terms of how different optimizations work. Example run (on #650 branch):


  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
============================================================================================================ test session starts ============================================================================================================= platform linux -- Python 3.14.2, pytest-8.3.5, pluggy-1.6.0 rootdir: /home/ykaul/github/python-driver
configfile: pyproject.toml
plugins: asyncio-1.1.0, anyio-4.12.1
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function collected 4 items

tests/unit/test_policy_performance.py Pinned to CPU 0 ....

=== Performance Benchmarks ===
Policy                         | Ops        | Time (s)   | Kops/s
----------------------------------------------------------------------
DCAware                        | 100000     | 0.2328     | 429
RackAware                      | 100000     | 0.3637     | 274
TokenAware(DCAware)            | 100000     | 1.5884     | 62
TokenAware(RackAware)          | 100000     | 1.6816     | 59
----------------------------------------------------------------------

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
I added relevant tests for new features and bug fixes.
All commits compile, pass static checks and pass test.
PR description sums up the changes and reasons why they should be introduced.
I have provided docstrings for the public items that I want to introduce.
I have adjusted the documentation in ./docs/source/.
I added appropriate Fixes: annotations to PR description.

mykaul · 2026-01-25T15:31:53Z

Latest commit add Host policy. Now it looks like this:

ykaul@ykaul:~/github/python-driver$ pytest -s tests/unit/test_policy_performance.py 
/usr/lib/python3.14/site-packages/pytest_asyncio/plugin.py:211: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
==================================================================================================================================================== test session starts =====================================================================================================================================================
platform linux -- Python 3.14.2, pytest-8.3.5, pluggy-1.6.0
rootdir: /home/ykaul/github/python-driver
configfile: pyproject.toml
plugins: asyncio-1.1.0, anyio-4.12.1
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 6 items                                                                                                                                                                                                                                                                                                            

tests/unit/test_policy_performance.py Pinned to CPU 0
......

=== Performance Benchmarks ===
Policy                         | Ops        | Time (s)   | Kops/s    
----------------------------------------------------------------------
DCAware                        | 100000     | 0.0989     | 1010      
Default(DCAware)               | 100000     | 0.1532     | 652       
HostFilter(DCAware)            | 100000     | 0.3303     | 302       
RackAware                      | 100000     | 0.1149     | 870       
TokenAware(DCAware)            | 100000     | 0.2112     | 473       
TokenAware(RackAware)          | 100000     | 0.2249     | 444       
----------------------------------------------------------------------

dkropachev · 2026-01-26T11:31:01Z

@mykaul , thanks looks great, can you implement them in pytest and mark with @pytest.mark.benchmark and register such a flag in pytest.ini

mykaul · 2026-02-06T09:03:58Z

@mykaul , thanks looks great, can you implement them in pytest and mark with @pytest.mark.benchmark and register such a flag in pytest.ini

There is no pytest.ini, so I've added one. The tests were already in pytest, now they are in the right dir, in a follow-up commit.

Lorak-mmk · 2026-02-06T09:12:49Z

There is benchmarks folder. Maybe its a better place for such tests?

dkropachev

I love it, have some comments though.
Also can you please take a look at pytest-benchmark, it looks like exactly what we need.

dkropachev · 2026-02-06T14:04:47Z

tests/unit/test_policy_performance.py

+"A micro-bechmark for performance of policies"
+


Is it a real commit ? there is real change here.

I'm not sure I understand the comment. It's a micro-benchmark, in the sense that it is using mocks and doesn't run against a real server?

Ah, OK. Will fix.

dkropachev · 2026-02-06T14:12:32Z

tests/performance/test_policy_performance.py

+        replicas = []
+        for r in range(3):
+            idx = (i + r) % len(hosts)
+            replicas.append(hosts[idx])
+        replicas_map[key] = replicas


I don't think it is good idea to connect number of token ranges with number of queries, you may want to run 1k queries on setup with 100 token ranges.
Also this way you would split mock cluster initialization from query initialization, which would make fixture api more readable and flexiable at the same time, it would look like this:

def test_dc_aware(policy, vnode_cluster_1000, select_10000): policy.populate(vnode_cluster_1000, vnode_cluster_1000.metadata.all_hosts()) for query in select_10000: ...

dkropachev · 2026-02-06T14:23:47Z

tests/performance/test_policy_performance.py

@@ -0,0 +1,242 @@
+import time


Can you please merge these two commits, it looks you you just moved code to another place

tests/integration/standard/column_encryption/test_encrypted_policies.py

dkropachev · 2026-02-06T14:25:26Z

tests/integration/standard/test_cluster.py

        connections = [c for holders in cluster.get_connection_holders() for c in holders.get_connections()]

        # make sure requests were sent on all connections
+        # Note: connections can be replaced in shard-aware environments, so we skip


Could you please create a separate pr for this change.

I was going to, first wanted to see that it solves the issue - but it doesn't:

FAILED tests/integration/standard/test_cluster.py::ClusterTests::test_idle_heartbeat - assert not True + where True = any(<generator object ClusterTests.test_idle_heartbeat.<locals>.<genexpr> at 0x7f63c0292d40>)

I feel like I should git bisect it - if I can reproduce this locally first.

Of course, it passes locally. Over and over, consistently. Not a single failure.

since other PR doesn't break like that, would assume it's the new benchmark tests which are running together with all of the other unittests

They all do - #686 as an example. So either I managed to sneak in something that fails everywhere, or something else is going on. I tend to blame me, but I have no proof yet, and it's very annoying it doesn't reproduce on my laptop.

Add micro-benchmarks measuring query plan generation throughput for DCAwareRoundRobinPolicy, RackAwareRoundRobinPolicy, TokenAwarePolicy, DefaultLoadBalancingPolicy, and HostFilterPolicy. Uses pytest-benchmark for accurate timing and statistical reporting with a simulated 45-node cluster topology (5 DCs x 3 racks x 3 nodes) and 100,000 deterministic queries. Also rename tests/integration/standard/column_encryption/test_policies.py to test_encrypted_policies.py to avoid module name conflicts when running the full test suite. Run with: pytest -m benchmark tests/performance/ Benchmark results comparing master vs PR scylladb#651 optimizations (Python 3.14.3, pytest-benchmark 5.2.3, GC disabled, median): Policy | master (Kops/s) | PR#651 (Kops/s) | Speedup --------------------------|-----------------|-----------------|-------- DCAware | 833 | 1898 | 2.3x RackAware | 542 | 1589 | 2.9x TokenAware(DCAware) | 135 | 572 | 4.2x TokenAware(RackAware) | 123 | 539 | 4.4x Default(DCAware) | 674 | 1257 | 1.9x HostFilter(DCAware) | 394 | 579 | 1.5x

mykaul marked this pull request as draft January 24, 2026 11:34

mykaul mentioned this pull request Jan 24, 2026

(improvement)Optimize DCAware/RackAware/TokenAware/HostFilter policies with host distance caching and overall perf. improvements #651

Open

8 tasks

mykaul marked this pull request as ready for review January 24, 2026 15:23

mykaul force-pushed the query_plan_opt_perf branch from b9a8faa to a797262 Compare January 25, 2026 15:25

mykaul force-pushed the query_plan_opt_perf branch from a797262 to d850f81 Compare February 6, 2026 09:03

dkropachev reviewed Feb 6, 2026

View reviewed changes

dkropachev mentioned this pull request Feb 6, 2026

Expand bench query planning coverage #691

Open

mykaul mentioned this pull request Feb 9, 2026

Add __slots__ to multiple classes for memory optimization #659

Closed

8 tasks

mykaul mentioned this pull request Mar 14, 2026

Tracking: General (non-vector) performance improvement PRs #747

Open

mykaul force-pushed the query_plan_opt_perf branch from 5dd6376 to 8c75060 Compare March 25, 2026 15:30

mykaul force-pushed the query_plan_opt_perf branch from 8c75060 to 76b79c3 Compare March 25, 2026 15:42

Conversation

mykaul commented Jan 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pre-review checklist

Uh oh!

mykaul commented Jan 25, 2026

Uh oh!

dkropachev commented Jan 26, 2026

Uh oh!

mykaul commented Feb 6, 2026

Uh oh!

Lorak-mmk commented Feb 6, 2026

Uh oh!

dkropachev left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mykaul commented Jan 24, 2026 •

edited

Loading