Add xtensor support #1328
Hi Vladimir,
Hi @hpkfft, this PR is written from scratch; I suppose it would be a successor of xtensor-python (which is based on pybind11). I don't have a benchmark that compares both implementations yet, but I plan to write one. xtensor-python is extremely optimized and quite fast; the target of this PR is to achieve the same performance, but I believe it could be even faster since nanobind function calls have less overhead. I have a benchmark for the nanobind implementation only, and it shows promising results.

It does not actually require C++20: xtensor 0.26.0 (which is installed in tests) requires C++17, so it should work. xtensor 0.27.0 requires C++20, but that depends on which xtensor version the user installs.

Currently this PR is in progress; I'm still updating the library interface and testing the behavior. I've opened this draft for public testing and benchmarking.
This looks cool. I would be excited to have nice xtensor bindings in nanobind 👍
xtensor is a header-only C++ library for multi-dimensional arrays and tensors with lazy evaluation semantics. It provides NumPy-like syntax for element-wise operations, broadcasting, and mathematical functions.
This PR adds native casters for xtensor types, enabling seamless interop between NumPy arrays and xtensor containers.
Owning containers
`xt::xarray<T>` and `xt::xtensor<T, N>` are owning containers backed by `std::vector`-like storage. Since they must own their data, the caster copies from the NumPy buffer on input.

Zero-copy views
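A minimal sketch of binding a function that takes an owning container. The caster names come from this PR, but the exact header paths under `include/nanobind/xtensor/` are an assumption, as is the xtensor include (valid for xtensor 0.26):

```cpp
#include <nanobind/nanobind.h>
#include <nanobind/xtensor/xarray.h>  // assumed header name

#include <xtensor/xarray.hpp>         // xtensor 0.26-style include

namespace nb = nanobind;

// The caster copies the incoming NumPy buffer into an owning xt::xarray,
// so mutations inside this function never affect the caller's array.
double total(xt::xarray<double> a) {
    double sum = 0.0;
    for (double v : a)
        sum += v;
    return sum;
}

NB_MODULE(demo, m) {
    m.def("total", &total);
}
```

The copy makes the function safe to call with any array layout, at the cost of one allocation per call.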
For performance-critical paths, `nb::xarray_view<T>` and `nb::xtensor_view<T, N>` wrap the NumPy buffer directly without copying.

Views are layout-aware: they default to `layout_type::row_major`, which gives xtensor a compile-time contiguity guarantee and enables flat-pointer iteration instead of per-element stride arithmetic. Views with `layout_type::dynamic` are also supported for non-contiguous arrays, at the cost of slower iteration. Since views point directly into the NumPy buffer, they can mutate the source array.
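A sketch of an in-place operation on a zero-copy view. The `nb::xarray_view` name is from this PR; the header path and the layout template parameter spelling are assumptions:

```cpp
#include <nanobind/nanobind.h>
#include <nanobind/xtensor/xarray_view.h>  // assumed header name

#include <xtensor/xlayout.hpp>             // xt::layout_type

namespace nb = nanobind;

// Default layout is row_major: the view requires a C-contiguous input,
// and iteration degenerates to a flat pointer walk. Writes go straight
// into the caller's NumPy buffer.
void scale(nb::xarray_view<double> a, double s) {
    for (double &v : a)
        v *= s;
}

// A dynamic-layout view (template argument spelling assumed) also accepts
// non-contiguous inputs such as slices, with per-element stride arithmetic.
void scale_any(nb::xarray_view<double, xt::layout_type::dynamic> a, double s) {
    for (double &v : a)
        v *= s;
}

NB_MODULE(demo, m) {
    m.def("scale", &scale);
    m.def("scale_any", &scale_any);
}
```

After `demo.scale(arr, 2.0)`, the NumPy array `arr` itself is modified, since no copy was made.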
Expression returns
Returning an xtensor expression (e.g., `xt::sin(a) * s + t`) produces an `xfunction`, a lazy expression tree. The caster evaluates it directly into a freshly allocated NumPy buffer, avoiding intermediate `xarray` materialization.

Vectorization
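A sketch of a lazy expression return, mirroring the example expression above; the header path is an assumption. The lambda's deduced return type is an `xt::xfunction` tree, which the caster streams into a fresh NumPy array:

```cpp
#include <nanobind/nanobind.h>
#include <nanobind/xtensor/xarray_view.h>  // assumed header name

#include <xtensor/xmath.hpp>               // xt::sin

namespace nb = nanobind;

NB_MODULE(demo, m) {
    // The argument is taken by reference so the returned expression,
    // which holds references to its operands, remains valid while the
    // caster evaluates it into the output buffer.
    m.def("f", [](nb::xarray_view<double> &a, double s, double t) {
        return xt::sin(a) * s + t;  // xfunction: nothing computed yet
    });
}
```

Nothing is computed at the `return`; each output element is produced exactly once, directly in its final location.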
`nb::xvectorize` wraps a scalar C++ function for element-wise application over arrays, analogous to `np.vectorize` but compiled.

Scope
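A sketch of wrapping a scalar function. The `nb::xvectorize` name is from this PR; the call syntax (by analogy with pybind11's `py::vectorize`) and the header path are assumptions:

```cpp
#include <nanobind/nanobind.h>
#include <nanobind/xtensor/xvectorize.h>  // assumed header name

#include <cmath>

namespace nb = nanobind;

// A plain scalar function...
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

NB_MODULE(demo, m) {
    // ...exposed as a compiled element-wise operation over whole arrays,
    // rather than a per-element Python round trip as with np.vectorize.
    m.def("sigmoid", nb::xvectorize(sigmoid));  // call syntax assumed
}
```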
The entire implementation is ~330 lines (excluding tests), split across 5 headers under `include/nanobind/xtensor/`.

Benchmark
A standalone benchmark suite is available at keltecc/xtensor-nanobind-benchmark.
For API details, see the documentation in
`xtensor.rst` and the test cases.