take/sparse-indexing + b2view by FrancescAlted · Pull Request #640 · Blosc/python-blosc2

FrancescAlted · 2026-05-28T10:26:31Z

This PR merges two related feature branches into one:

take / sparse-indexing / fancy-indexing

blosc2.take() / NDArray.take(): generalized to use b2nd_get_sparse_cbuffer() from C-Blosc2, giving a unified fast path for gather operations on both NDArrays and CTable columns
CTable.take() and Column.take(): new gather API for row-position-based column access
Fancy indexing (arr[[1,3,5]]): routes through the same sparse-cbuffer backend
Sparse boolean mask fast path: auto-detects highly-selective boolean masks and routes them through take() instead of dense-materialization paths, avoiding unnecessary full decompression
where(cond, x) and where(cond, x, y): now compiled through miniexpr for JIT-accelerated evaluation
Boolean mask materialization: lazy-index patterns like a[a < 5][:] now use a compressed transient mask (LZ4) and a hot cache, dramatically reducing memory for repeated queries
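
To make the gather and sparse-mask items above concrete, here is a minimal pure-Python sketch of the idea: group the requested positions by chunk so that only the chunks holding a requested element are decompressed, and route a boolean mask through that path when few elements are selected. It uses zlib from the standard library as a stand-in for Blosc2 compression; the names compress_chunks, sparse_take and select_with_mask, the chunk length, and the 1% selectivity threshold are all hypothetical and are not taken from python-blosc2 or C-Blosc2.

```python
import zlib
from array import array

CHUNK_LEN = 1000  # items per chunk (hypothetical value)


def compress_chunks(values, chunk_len=CHUNK_LEN):
    """Split a list of ints into independently compressed chunks."""
    chunks = []
    for start in range(0, len(values), chunk_len):
        raw = array("q", values[start:start + chunk_len]).tobytes()
        chunks.append(zlib.compress(raw))
    return chunks


def _decompress(blob):
    block = array("q")
    block.frombytes(zlib.decompress(blob))
    return block


def sparse_take(chunks, indices, chunk_len=CHUNK_LEN):
    """Gather values at `indices`, decompressing each needed chunk once.

    Returns the gathered values and the number of chunks touched.
    """
    # Group requested positions by the chunk that holds them.
    by_chunk = {}
    for out_pos, idx in enumerate(indices):
        by_chunk.setdefault(idx // chunk_len, []).append((out_pos, idx % chunk_len))
    out = [None] * len(indices)
    for chunk_id, wanted in by_chunk.items():
        block = _decompress(chunks[chunk_id])
        for out_pos, offset in wanted:
            out[out_pos] = block[offset]
    return out, len(by_chunk)


def select_with_mask(chunks, mask, max_fraction=0.01, chunk_len=CHUNK_LEN):
    """Apply a boolean mask, using the sparse path when few items are selected.

    `max_fraction` is an assumed threshold chosen for illustration only.
    """
    hits = [i for i, keep in enumerate(mask) if keep]
    if mask and len(hits) <= max_fraction * len(mask):
        values, _ = sparse_take(chunks, hits, chunk_len)
        return values, "sparse"
    # Dense path: decompress every chunk and filter.
    out = []
    for chunk_id, blob in enumerate(chunks):
        base = chunk_id * chunk_len
        out.extend(v for off, v in enumerate(_decompress(blob)) if mask[base + off])
    return out, "dense"
```

With 10,000 values in chunks of 1,000, gathering positions 1, 3, 5 and 9,999 touches two chunks rather than ten, which is where the saving from skipping full decompression comes from.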

Cross-cutting improvements

Context manager support: all blosc2.open() return types (SChunk, NDArray, C2Array, Proxy, LazyArray, CTable, DictStore, TreeStore, etc.) now support with blosc2.open(...) as obj:
CTable.where(): always uses expr_result.compute() for the boolean filter, keeping the mask compressed by default
On-disk query cache: side-effect correctness and mode-consistency fixes; race-condition fix for miniexpr chunk caches on Apple Silicon
CMake: bumped bundled C-Blosc2 version
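
As a rough illustration of the context manager item above, the sketch below shows the protocol an object needs in order to work in a with statement. ClosingHandle is a hypothetical stand-in, not a python-blosc2 class, and the assumption that leaving the block closes the object is mine; the PR text only states that the with form is supported.

```python
class ClosingHandle:
    """Hypothetical stand-in for an object returned by an open() call."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        # Assumed cleanup step; a real object would release its resources here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions raised inside the block


with ClosingHandle("example.b2nd") as obj:
    open_inside_block = not obj.closed
```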

b2view — Interactive Browser for Blosc2 Data

A new CLI viewer (blosc2 view) for interactively exploring Blosc2 objects in the terminal. Built with rich and textual, it supports browsing:

CTables/CStore and nested column hierarchies
NDArrays with multi-dimensional navigation
vlmeta inspection
Panel-based workflow with flexible dimension mode

Instead of a fixed max_sparse_refine_candidates cutoff, estimate refinement cost from candidate count × operand count vs scan cost from total rows. Avoids both premature fallback for large but selective queries and pathological refinement of near-full-table predicates. Constants calibrated from profiling with sparse-gather optimisations.
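
The decision described above can be sketched as a comparison of two estimated costs. This is an illustration under stated assumptions: the function name should_refine and both constants are hypothetical, since the commit says its constants were calibrated from profiling but does not list them.

```python
# Hypothetical unit costs, chosen only to make the example concrete.
REFINE_COST_PER_CANDIDATE_OPERAND = 50.0
SCAN_COST_PER_ROW = 1.0


def should_refine(n_candidates, n_operands, total_rows,
                  refine_unit=REFINE_COST_PER_CANDIDATE_OPERAND,
                  scan_unit=SCAN_COST_PER_ROW):
    """Return True when refining the candidates is estimated cheaper than a scan."""
    refine_cost = n_candidates * n_operands * refine_unit
    scan_cost = total_rows * scan_unit
    return refine_cost < scan_cost
```

Under these made-up constants, 100,000 candidates over two operands in a billion-row table are refined (a large but selective query), while 900,000 candidates in a million-row table fall back to a scan (a near-full-table predicate). A fixed candidate cutoff cannot distinguish those two cases.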

The on-disk miniexpr prefilter used a shared b2nd_array_t.chunk_cache buffer that was read on the fast path without holding the lock, while a different worker could concurrently free and replace that same buffer on a cache miss. This led to sporadic SIGSEGV crashes on Apple Silicon, where the weaker memory model and timing made the race visible much more often. In-memory arrays were unaffected because they bypass this path.

A previous workaround used per-thread caches, which avoided the crash but made every worker fetch/copy the same on-disk chunk independently. That fixed correctness at the cost of much higher sys time, memory use, and overall runtime.

Replace the shared mutable b2nd_array_t.chunk_cache use in miniexpr with a per-input shared cache owned by me_udata. Each cache entry has a small state machine (EMPTY, LOADING, READY, ERROR) plus a lock. The first worker reaching a chunk marks it LOADING, fetches and copies the chunk once, then publishes it as READY; the remaining workers wait briefly and reuse the same immutable chunk buffer. This preserves safe lifetime and restores chunk sharing without duplicated I/O.

Also free SChunk with the GIL held again so threadpool teardown cannot race with active miniexpr workers during deallocation. Add a persisted regression test covering repeated a[a < 5][:] on a disk-backed array under multi-threaded execution.
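
The per-entry state machine described above can be sketched in Python as follows. This is not the C implementation in miniexpr: SharedChunkCache, ChunkEntry and the loader callback are hypothetical names, and a threading.Condition stands in for whatever lock and wait primitive the C code uses.

```python
import threading
from enum import Enum


class State(Enum):
    EMPTY = 0
    LOADING = 1
    READY = 2
    ERROR = 3


class ChunkEntry:
    def __init__(self):
        self.state = State.EMPTY
        self.data = None
        self.error = None
        self.cond = threading.Condition()


class SharedChunkCache:
    """Cache in which each chunk is loaded once and then shared read-only."""

    def __init__(self, loader):
        self._loader = loader  # callable: chunk_id -> bytes-like
        self._entries = {}
        self._entries_lock = threading.Lock()

    def get(self, chunk_id):
        with self._entries_lock:
            entry = self._entries.get(chunk_id)
            if entry is None:
                entry = self._entries[chunk_id] = ChunkEntry()
        with entry.cond:
            if entry.state is State.EMPTY:
                # First worker to arrive claims the load.
                entry.state = State.LOADING
                is_loader = True
            else:
                is_loader = False
                while entry.state is State.LOADING:
                    entry.cond.wait()  # blocking wait, no busy loop
        if is_loader:
            try:
                data, err, new_state = bytes(self._loader(chunk_id)), None, State.READY
            except Exception as exc:
                data, err, new_state = None, exc, State.ERROR
            with entry.cond:
                # Publish the immutable buffer (or the error) and wake the waiters.
                entry.data, entry.error, entry.state = data, err, new_state
                entry.cond.notify_all()
        if entry.state is State.ERROR:
            raise RuntimeError(f"chunk {chunk_id} failed to load") from entry.error
        return entry.data
```

The load itself runs outside the entry lock, so workers asking for other chunks are never blocked by it, and once an entry is READY its buffer is not replaced, which is what makes lock-free reads of it safe in this sketch.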

- keep the miniexpr shared chunk-cache fix and replace yield-based waiting with a blocking lock handoff for safer contention behavior
- pass the requested open mode into reopened NDArray wrappers
- make vlmeta derive access state from its parent SChunk instead of keeping an independent mode snapshot
- break the new vlmeta->SChunk reference cycle with a weak reference
- make query-result caching hot-cache-only and stop persisting query cache catalogs or __query_cache__ sidecars in any open mode
- document the no-hidden-writes rule in blosc2.open
- preserve _from_schunk mode/storage state in EmbedStore
- stop upgrading reopened Proxy caches/sources to append mode implicitly
- keep read-only Proxy opens observational by falling back to source reads when a missing chunk would otherwise require mutating the cache
- update tests for open-mode propagation, read-only metadata behavior, hot-cache-only query reuse, and read-only proxy reopening
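
The two vlmeta items above (deriving access state from the parent and breaking the reference cycle) follow a common Python pattern, sketched here with hypothetical Container and Meta classes. These are illustrative stand-ins and do not reproduce the real SChunk or vlmeta code.

```python
import weakref


class Meta:
    """Hypothetical metadata view that reads its access mode from its parent."""

    def __init__(self, parent):
        # A weak reference avoids a parent -> meta -> parent cycle.
        self._parent_ref = weakref.ref(parent)
        self._items = {}

    @property
    def mode(self):
        parent = self._parent_ref()
        if parent is None:
            raise ReferenceError("parent container no longer exists")
        return parent.mode  # always the current mode, never a stale snapshot

    def __setitem__(self, key, value):
        if self.mode == "r":
            raise PermissionError("container opened read-only")
        self._items[key] = value

    def __getitem__(self, key):
        return self._items[key]


class Container:
    """Hypothetical stand-in for the parent object that owns the metadata."""

    def __init__(self, mode):
        self.mode = mode
        self.vlmeta = Meta(self)
```

Because Meta looks the mode up on every access, changing the parent's mode takes effect immediately, and because it holds only a weak reference, the parent can be freed by reference counting alone without waiting for the cycle collector.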

FrancescAlted added 30 commits May 22, 2026 07:19

First version using new blosc2_schunk_get_sparse()

c6e532a

Code simplification: dict codes are always int32

e8b69df

Reuse already prefetched data from index

d146aed

Raise sort materialize limit to 50M

994b0bf

Allow to materialize masks below _SMALL_NROWS_LIMIT; also, trim table capacity after arrow import

31158e5

More contained growth for large tables; also, trim table capacity on close

074cef7

Use the new b2nd_get_sparse_cbuffer() instead of blosc2_schunk_get_sparse()

d8e4b85

Better dealing with missing L3 cache in apple silicon

4a02351

Better caches index catalog during queries

f3d6522

Optimization for building mask for index query result

2193502

New BLOSC_ME_JIT and BLOSC_ME_JIT_TRACE envvar for controlling JIT in miniexpr

9369fa3

Store CTable.nrows persistently for faster operation after re-opening

f766585

Embed store handling inside a dict store is bug-prone, so disabling it by default

bd11d40

Make the CTable index catalog cache aware of storage-side catalog updates

715fbc4

CTable indexing code move into its own module

6836fd4

New .take for take/gather APIs has been implemented for CTable/Column. NDArray.take has a new fast path for 1d now.

0c94889

Initial version of the b2view CLI viewer

1edf888

Improved CTable data navigation

76e08ce

Better navigation for 2d arrays

cee7848

New bench for creating/opening/reading a TreeStore

f74f04e

Reduce the limit for considering a table 'small'

a1d180f

Use t and b keystrokes for go to top and bottom respectively

4354579

Make 1d browsing similar to 2d but with 1 single column

7ce7dce

Allow full browsing of arrays with more than 2 dim

f1eb02f

New CTable.vlmeta. Also a new vlmeta pane for b2view.

a67e19e

New dim mode: allow to navigate in all dimension more flexibly

876d993

Add a new --panel option for going straight to the desired panel

04623a5

Use latest c-blosc2 release

4cd3058

Generalize take() and fancy indexing to use new b2nd_get_sparse_cbuffer()

534e66a

FrancescAlted added 23 commits May 26, 2026 04:56

Skip tests if rich and textual are not installed

5e7165a

For ndim > 1 axis-based take, use orthogonal selection, as it is faster

2ab27da

New benchmark for take() functionality

5bf716a

Check boolean array key early to avoid expensive process_key / nonzero

d4ab437

Check where condition to avoid unnecessary numexpr setup overhead per chunk

3842d78

Use miniexpr when where(cond, x, y), i.e. two args flavor

22319e2

Fast path for sparse boolean masks with high selectivity auto-detection

23e155d

Optimization: miniexpr can be used for where(cond, x) (1-arg) with a boolean condition

ad86bb5

Add a new bench/profiles for lazy indexes like a[a < 5][:]

8722af2

Avoid a fully materialized numpy mask for optimal operation

f790643

Use iterchunks_info() for faster iteration

7f9138a

Make query hot cache compressed (LZ4) by default

cb4ee4e

Make transient mask in queries be compressed with LZ4 for a bit of speed

0e8d63f

On macOS, using the full L2 as a floor for chunksize has shown better overall behavior

cab0f8e

Add context manager support for all blosc2.open() objects

9a71938

Update to latest c-blosc2

213edb0

Based on experiments, prefer a compressed boolean mask for LazyExpr filters

2f2f1c1

Fix an issue on 32-bit platforms (i.e. wasm32)

34779c3

Fix some issues that showed up in heavy tests for reduce operations

1be61d6

Fix remaining issues in heavy suite

4019c1d

Yet another fix for tests

f4a49b5

FrancescAlted merged commit 7c7dccd into main May 28, 2026
17 checks passed

FrancescAlted deleted the b2view branch May 28, 2026 12:55

FrancescAlted merged 53 commits into main from b2view
