Releases: lance-format/lance
v6.0.0-rc.1
What's Changed
Breaking Changes 🛠
New Features 🎉
- feat: support segmented inverted index build and search by @Xuanwo in #6305
- feat(dictionary-namespace): support table related operation by @zhangyue19921010 in #6308
- feat: clean up transaction files on failed commits by @wjones127 in #6319
- feat: add planned blob reads with source-level coalescing by @Xuanwo in #6352
- refactor: use exact base-scoped store bindings by @Xuanwo in #6422
- feat: wire batch_size_bytes to Python and public Rust API by @westonpace in #6428
- feat(vector): add partition search parallelism by @BubbleCal in #6475
- feat(index): support float16 and float64 in IVF_FLAT by @BubbleCal in #6476
- feat: batch chopping fallback for filtered read by @westonpace in #6482
- feat(java): add Dataset.sample() API by @beinan in #6500
- feat: add ANN proto codecs and extract table_identifier module by @LuQQiu in #6503
- feat: add configurable blob v2 pack file size by @hamersaw in #6508
- feat: expose has_stable_row_ids property on LanceDataset by @pengw0048 in #6531
- feat: expose base scoped store bindings to python by @zhangyue19921010 in #6547
- feat: support zonemap index segments by @beinan in #6593
- feat: update lance-namespace to 0.7.2 and align namespace declared table lifecycle by @jackye1995 in #6608
- feat: generalize dynamic object store credentials by @jackye1995 in #6609
- feat: add ANNIvfPartitionExecProto by @LuQQiu in #6612
- feat: add prefilter_type to ANNIvfSubIndexExecProto by @LuQQiu in #6613
- feat: replace Azure SDK and google-cloud-auth with direct reqwest for credential vending by @jackye1995 in #6617
- feat(io): bypass backpressure for io_buffer_size=0 and 2.0 indirect I/O by @westonpace in #6627
Bug Fixes 🐛
- fix: warn and clamp LANCE_INITIAL_UPLOAD_SIZE instead of panicking by @LuciferYang in #6389
- fix: keep delete-by-source fast path with scalar indexes by @Xuanwo in #6435
- fix: include column_metadatas and column_infos in CachedFileMetadata::DeepSizeOf by @jiaoew1991 in #6480
- fix(index): preserve fts prewarm position codec by @BubbleCal in #6485
- fix: handle FlatBin quantization in optimize_vector_indices_v2 by @jackye1995 in #6488
- fix: use logical OR instead of bitwise OR in conflict resolver by @dentiny in #6492
- fix: add dir_listing_to_manifest_migration_enabled flag to avoid extra object store calls by @jackye1995 in #6507
- fix: prevent arithmetic overflow in U64Segment encoding selection for sparse/extreme row id ranges by @ivscheianu in #6516
- fix: bump jieba-rs to 0.9.0 to fix build-no-lock CI by @westonpace in #6518
- fix: blob projection schema compatibility by @Xuanwo in #6521
- fix(namespace): serialize manifest mutations by @Xuanwo in #6525
- fix: missing bumpversion entry for lance-tokenizer by @Xuanwo in #6526
- fix: scale default memory pool size by partition count by @westonpace in #6562
- fix: apply fragment bitmap allow-list to index search results by @westonpace in #6563
- fix: hard cap batch size in merge_insert to prevent sort failures by @westonpace in #6564
- fix: index type try_from miss RTree and BLOOMFILTER by @wojiaodoubao in #6568
- fix(namespace): align error handling with namespace spec by @jackye1995 in #6575
- fix: reject Rewrite vs CreateIndex when FRI groups straddle bitmap by @wjones127 in #6610
- fix(json): detect float64-stored numbers in json type extraction by @dentiny in #6622
- fix: respect LANCE_DEFAULT_IO_BUFFER_SIZE if it has been set by @westonpace in #6636
Documentation 📚
- docs: tighten python environment workflow guidance by @Xuanwo in #6520
- docs: fix broken intra-doc link in DatasetPreFilter by @LuciferYang in #6579
- docs: correct repetition level example in encoding docs by @BubbleCal in #6585
Performance Improvements 🚀
- perf: intern DataFile fields/column_indices to reduce manifest memory by @beinan in #6477
- perf: intern RowDatasetVersionMeta inline bytes to reduce manifest memory by @beinan in #6499
- perf: add SIMD-accelerated u8 dot product for SQ distance by @justinrmiller in #6506
- perf: add SIMD kernels for bf16 distance functions by @justinrmiller in #6510
- perf: submit I/O requests eagerly in FullZipScheduler by @hushengquan in #6513
- perf: add SIMD-accelerated u8 L2 and cosine distance kernels by @justinrmiller in #6517
- perf: speed up RaBitQ 4-bit LUT distance on ARM by 16x by @justinrmiller in #6537
- perf: add explicit SIMD types and distance kernels for f64 by @justinrmiller in #6540
- perf: don't spawn the scheduling on a separate thread for small reads by @westonpace in #6637
Full Changelog: release-root/6.0.0-beta.N...v6.0.0-rc.1
v6.0.0-beta.7
What's Changed
New Features 🎉
- feat: support segmented inverted index build and search by @Xuanwo in #6305
- feat(vector): add partition search parallelism by @BubbleCal in #6475
Performance Improvements 🚀
- perf: don't spawn the scheduling on a separate thread for small reads by @westonpace in #6637
Full Changelog: v6.0.0-beta.6...v6.0.0-beta.7
v6.0.0-beta.6
What's Changed
Bug Fixes 🐛
- fix: respect LANCE_DEFAULT_IO_BUFFER_SIZE if it has been set by @westonpace in #6636
Full Changelog: v6.0.0-beta.5...v6.0.0-beta.6
v6.0.0-beta.5
What's Changed
New Features 🎉
- feat(io): bypass backpressure for io_buffer_size=0 and 2.0 indirect I/O by @westonpace in #6627
Full Changelog: v6.0.0-beta.4...v6.0.0-beta.5
v6.0.0-beta.4
What's Changed
New Features 🎉
- feat: expose base scoped store bindings to python by @zhangyue19921010 in #6547
- feat: update lance-namespace to 0.7.2 and align namespace declared table lifecycle by @jackye1995 in #6608
- feat: generalize dynamic object store credentials by @jackye1995 in #6609
- feat: add prefilter_type to ANNIvfSubIndexExecProto by @LuQQiu in #6613
- feat: replace Azure SDK and google-cloud-auth with direct reqwest for credential vending by @jackye1995 in #6617
Bug Fixes 🐛
- fix: reject Rewrite vs CreateIndex when FRI groups straddle bitmap by @wjones127 in #6610
Full Changelog: v6.0.0-beta.3...v6.0.0-beta.4
v6.0.0-beta.3
What's Changed
New Features 🎉
- feat(java): add Dataset.sample() API by @beinan in #6500
- feat: expose has_stable_row_ids property on LanceDataset by @pengw0048 in #6531
- feat: support zonemap index segments by @beinan in #6593
- feat: add ANNIvfPartitionExecProto by @LuQQiu in #6612
Bug Fixes 🐛
- fix: prevent arithmetic overflow in U64Segment encoding selection for sparse/extreme row id ranges by @ivscheianu in #6516
- fix: index type try_from miss RTree and BLOOMFILTER by @wojiaodoubao in #6568
Documentation 📚
- docs: correct repetition level example in encoding docs by @BubbleCal in #6585
Full Changelog: v6.0.0-beta.2...v6.0.0-beta.3
v4.0.1
What's Changed
Full Changelog: v4.0.0...v4.0.1
v4.0.1-rc.1
What's Changed
Full Changelog: v4.0.0...v4.0.1-rc.1
v6.0.0-beta.2
What's Changed
New Features 🎉
- fix: add dir_listing_to_manifest_migration_enabled flag to avoid extra object store calls by @jackye1995 in #6507
- fix: scale default memory pool size by partition count by @westonpace in #6562
- fix: apply fragment bitmap allow-list to index search results by @westonpace in #6563
- fix: hard cap batch size in merge_insert to prevent sort failures by @westonpace in #6564
- fix(namespace): align error handling with namespace spec by @jackye1995 in #6575
Documentation 📚
- docs: fix broken intra-doc link in DatasetPreFilter by @LuciferYang in #6579
Performance Improvements 🚀
- perf: add SIMD-accelerated u8 dot product for SQ distance by @justinrmiller in #6506
- perf: add SIMD kernels for bf16 distance functions by @justinrmiller in #6510
- perf: speed up RaBitQ 4-bit LUT distance on ARM by 16x by @justinrmiller in #6537
- perf: add explicit SIMD types and distance kernels for f64 by @justinrmiller in #6540
Full Changelog: v6.0.0-beta.1...v6.0.0-beta.2
v6.0.0-beta.1
What's Changed
Breaking Changes 🛠
New Features 🎉
- feat(dictionary-namespace): support table related operation by @zhangyue19921010 in #6308
- feat: clean up transaction files on failed commits by @wjones127 in #6319
- refactor: use exact base-scoped store bindings by @Xuanwo in #6422
- feat: wire batch_size_bytes to Python and public Rust API by @westonpace in #6428
- feat(index): support float16 and float64 in IVF_FLAT by @BubbleCal in #6476
- feat: batch chopping fallback for filtered read by @westonpace in #6482
- feat: add ANN proto codecs and extract table_identifier module by @LuQQiu in #6503
- feat: add configurable blob v2 pack file size by @hamersaw in #6508
Bug Fixes 🐛
- fix: warn and clamp LANCE_INITIAL_UPLOAD_SIZE instead of panicking by @LuciferYang in #6389
- fix: keep delete-by-source fast path with scalar indexes by @Xuanwo in #6435
- fix: include column_metadatas and column_infos in CachedFileMetadata::DeepSizeOf by @jiaoew1991 in #6480
- fix(index): preserve fts prewarm position codec by @BubbleCal in #6485
- fix: handle FlatBin quantization in optimize_vector_indices_v2 by @jackye1995 in #6488
- fix: use logical OR instead of bitwise OR in conflict resolver by @dentiny in #6492
- fix: bump jieba-rs to 0.9.0 to fix build-no-lock CI by @westonpace in #6518
- fix: blob projection schema compatibility by @Xuanwo in #6521
- fix(namespace): serialize manifest mutations by @Xuanwo in #6525
- fix: missing bumpversion entry for lance-tokenizer by @Xuanwo in #6526
Performance Improvements 🚀
- perf: intern DataFile fields/column_indices to reduce manifest memory by @beinan in #6477
- perf: intern RowDatasetVersionMeta inline bytes to reduce manifest memory by @beinan in #6499
- perf: submit I/O requests eagerly in FullZipScheduler by @hushengquan in #6513
- perf: add SIMD-accelerated u8 L2 and cosine distance kernels by @justinrmiller in #6517
Full Changelog: release-root/6.0.0-beta.N...v6.0.0-beta.1