tests/benchmarks: Add VectorType deserialization benchmarks and expand test coverage #733

Draft
mykaul wants to merge 3 commits into scylladb:master from mykaul:vector-tests-benchmarks

mykaul commented Mar 7, 2026

Summary

  • Add VectorType deserialization benchmark harness testing 4 strategies across multiple vector sizes and types
  • Expand benchmark configurations to include larger vector sizes and more type combinations
  • Enable vector integration tests on Scylla 2025.4+
  • Add unit test coverage for variable-size VectorType Cython fallback and numpy large vector deserialization

Commits (4)

1. benchmarks: Add VectorType deserialization performance benchmark

New benchmarks/vector_deserialize.py (320 lines) testing:

  • 4 strategies: VectorType.deserialize(), raw struct.unpack, numpy.frombuffer().tolist(), Cython DesVectorType
  • Vector sizes: 3, 4, 128, 384, 768, 1536 (float); 128 (double, int)
  • Iteration counts scaled by vector size for stable measurements
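
As a rough illustration of the harness shape described above, here is a minimal stdlib-only sketch. It is not the code in benchmarks/vector_deserialize.py: the function names, the `budget` parameter, and the per-element comparison strategy are assumptions made for this example, and it covers only float vectors, assuming big-endian 4-byte floats on the wire.

```python
import struct
import timeit


def make_payload(n):
    """Build a big-endian float[n] buffer, standing in for a serialized vector."""
    values = [float(i) for i in range(n)]
    return struct.pack(f">{n}f", *values)


def unpack_batch(buf, n):
    """Decode all n floats with a single struct.unpack call."""
    return list(struct.unpack(f">{n}f", buf))


def unpack_each(buf, n):
    """Decode one float at a time (a deliberately slow baseline for comparison)."""
    return [struct.unpack_from(">f", buf, 4 * i)[0] for i in range(n)]


def iterations_for(n, budget=200_000):
    """Scale iterations down as vectors grow so each case does similar total work."""
    return max(100, budget // n)


def run(sizes=(3, 4, 128, 384, 768, 1536)):
    """Return {(strategy, size): nanoseconds per call}."""
    results = {}
    for n in sizes:
        buf = make_payload(n)
        iters = iterations_for(n)
        for name, fn in (("batch", unpack_batch), ("each", unpack_each)):
            total = timeit.timeit(lambda: fn(buf, n), number=iters)
            results[(name, n)] = total / iters * 1e9
    return results


if __name__ == "__main__":
    ordered = sorted(run().items(), key=lambda kv: (kv[0][1], kv[0][0]))
    for (name, n), ns in ordered:
        print(f"float[{n:>4}] {name:<5} {ns:10.0f} ns/call")
```

Scaling the iteration count inversely with vector size keeps the wall-clock time per configuration roughly comparable, which is one plausible way to get stable measurements for both 3-element and 1536-element vectors.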

2. benchmarks: expand vector sizes

Add double[768], double[1536], int32[64] configurations.

3. tests: enable vector integration tests on Scylla 2025.4+

Re-enable vector integration tests that were previously skipped for Scylla. Tested against Scylla 2025.4.2 and 2026.1.

4. tests: add coverage for variable-size VectorType Cython fallback and numpy large vector deserialization

  • Test that DesVectorType raises ValueError for variable-size subtypes (UTF8Type) while pure Python handles them
  • Exercise the numpy deserialization path for 64-element vectors across float, double, int32, int64
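
The second check can be pictured with a small standalone sketch that decodes a 64-element buffer two ways and compares the results. It does not call the driver: `FORMATS` and `roundtrip` are hypothetical names for this example, the data is assumed to be big-endian, and NumPy is treated as optional so the sketch still runs without it.

```python
import struct

try:
    import numpy as np
except ImportError:  # NumPy is optional in this sketch
    np = None

# type name -> (struct format code, NumPy big-endian dtype string)
FORMATS = {
    "float": ("f", ">f4"),
    "double": ("d", ">f8"),
    "int32": ("i", ">i4"),
    "int64": ("q", ">i8"),
}


def roundtrip(type_name, n=64):
    """Pack n values, then decode them with struct and (if available) NumPy."""
    code, dtype = FORMATS[type_name]
    # Small integers are exactly representable in float32 and float64.
    values = [float(i) if code in "fd" else i for i in range(n)]
    buf = struct.pack(f">{n}{code}", *values)

    via_struct = list(struct.unpack(f">{n}{code}", buf))
    if np is None:
        return via_struct, None
    via_numpy = np.frombuffer(buf, dtype=dtype, count=n).tolist()
    return via_struct, via_numpy


if __name__ == "__main__":
    for name in FORMATS:
        expected, actual = roundtrip(name)
        status = "skipped (no numpy)" if actual is None else actual == expected
        print(f"{name:<6} 64 elements: {status}")
```

Using values that are exactly representable in every tested type lets the comparison use plain equality instead of a tolerance.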

No production code changes — benchmark and test files only.

mykaul marked this pull request as draft March 7, 2026 10:23
mykaul requested a review from Copilot March 8, 2026 20:36

Copilot AI left a comment

Pull request overview

Adds new benchmark and test coverage around VectorType deserialization, and refreshes integration test formatting to support vector-related testing scenarios.

Changes:

  • Add a new benchmarks/vector_deserialize.py harness comparing multiple vector deserialization strategies across sizes/types.
  • Add unit tests for VectorType large-vector deserialization and intended Cython fallback behavior.
  • Reformat/clean up tests/integration/standard/test_types.py (imports/string literals/line wrapping) and keep vector test class enabled via @requires_vector_type.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/unit/test_types.py | Adds new unit tests for vector deserialization behavior (including a Cython-deserializer expectation). |
| tests/integration/standard/test_types.py | Largely formatting/refactoring; keeps/organizes vector integration tests under @requires_vector_type. |
| benchmarks/vector_deserialize.py | New benchmark script to measure vector deserialization performance across approaches and configurations. |

mykaul added 3 commits April 4, 2026 00:48
…nce benchmarks

Add benchmark scripts for measuring VectorType serialization and
deserialization performance across various vector sizes and numeric types
(float, double, int32, int64, short).

vector_deserialize.py compares Python struct.unpack baseline, Cython
DesVectorType deserializer, and numpy-accelerated path.

vector_serialize.py compares current VectorType.serialize() baseline,
Python struct.pack with batch format string, and BoundStatement.bind()
end-to-end.

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
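
To make the "batch format string" idea concrete, the sketch below contrasts packing one element at a time with packing the whole vector in a single call. It is illustrative only and does not reproduce vector_serialize.py: the function names are invented for this example, and the cached `struct.Struct` variant is an extra assumption rather than something the commit message mentions.

```python
import struct
from functools import lru_cache


def pack_each(values):
    """Pack each float separately and join the pieces."""
    return b"".join(struct.pack(">f", v) for v in values)


def pack_batch(values):
    """Pack the whole vector with one format string such as '>768f'."""
    return struct.pack(f">{len(values)}f", *values)


@lru_cache(maxsize=None)
def _packer(n):
    # Precompile the format once per vector dimension.
    return struct.Struct(f">{n}f")


def pack_cached(values):
    """Same output as pack_batch, reusing a compiled Struct per dimension."""
    return _packer(len(values)).pack(*values)


if __name__ == "__main__":
    vec = [0.5, 1.5, -2.0]
    assert pack_each(vec) == pack_batch(vec) == pack_cached(vec)
    print(pack_batch(vec).hex())
```

All three functions produce identical bytes; they differ only in how many Python-level calls and format parses are needed per vector.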
…y large vector deserialization

Add test_vector_cython_deserializer_variable_size_subtype to verify that
DesVectorType correctly raises ValueError for variable-size subtypes
(e.g. UTF8Type) and that the pure Python path handles them.

Add test_vector_numpy_large_deserialization to exercise the numpy
deserialization path for vectors with >= 32 elements across all supported
numeric types (float, double, int32, int64).

Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
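
The fallback behavior exercised by the first test can be pictured as a fast path that rejects variable-size subtypes with ValueError and a slower path that accepts them. The sketch below is a stand-in, not the driver's code: `FIXED_SIZES`, all three function names, the try/except dispatch, and the one-byte length prefix used for text elements are assumptions invented for this example and do not describe the real wire format.

```python
import struct

# Hypothetical registry of fixed-size subtypes: name -> (format code, byte size)
FIXED_SIZES = {"float": ("f", 4), "int": ("i", 4)}


def fast_deserialize(buf, subtype, dimension):
    """Stand-in for a fast path that only supports fixed-size elements."""
    if subtype not in FIXED_SIZES:
        raise ValueError(f"variable-size subtype not supported: {subtype}")
    code, _size = FIXED_SIZES[subtype]
    return list(struct.unpack(f">{dimension}{code}", buf))


def slow_deserialize(buf, subtype, dimension):
    """Stand-in for a pure Python path that also handles variable-size elements."""
    if subtype in FIXED_SIZES:
        code, size = FIXED_SIZES[subtype]
        return [struct.unpack_from(f">{code}", buf, i * size)[0]
                for i in range(dimension)]
    # Made-up encoding for this sketch: 1-byte length, then UTF-8 bytes.
    out, pos = [], 0
    for _ in range(dimension):
        length = buf[pos]
        pos += 1
        out.append(buf[pos:pos + length].decode("utf-8"))
        pos += length
    return out


def deserialize(buf, subtype, dimension):
    """Try the fast path first and fall back when it refuses the subtype."""
    try:
        return fast_deserialize(buf, subtype, dimension)
    except ValueError:
        return slow_deserialize(buf, subtype, dimension)
```

A test in this style asserts two things separately: that the fast path raises ValueError for a text subtype, and that the overall result is still correct through the fallback.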
NotImplemented is a special singleton used for binary operator fallback,
not an exception class. Using 'raise NotImplemented(...)' would raise
TypeError instead of the intended error. Replace with NotImplementedError.
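
The difference is easy to reproduce in isolation. In this small sketch (the function names are placeholders, not code from the repository), calling `NotImplemented(...)` fails with TypeError before `raise` ever runs, because the singleton is not callable:

```python
def broken():
    # NotImplemented is a singleton value, not a class, so calling it fails.
    raise NotImplemented("operation not supported")


def fixed():
    # NotImplementedError is the exception class intended for this purpose.
    raise NotImplementedError("operation not supported")


def error_name(fn):
    """Return the name of the exception type that fn raises."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print("broken():", error_name(broken))
    print("fixed(): ", error_name(fixed))
```

Callers that catch NotImplementedError would miss the error from `broken()` entirely, which is why the replacement matters even though both versions raise.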
mykaul force-pushed the vector-tests-benchmarks branch from 0ae255c to 7cbc15c April 3, 2026 21:55

2 participants