
feat: consolidate ONNX Runtime × Sherpa-ONNX across all platforms #479

Open
sanchitmonga22 wants to merge 11 commits into main from onnx-sherpa-consolidation

Conversation

Contributor

sanchitmonga22 commented Apr 15, 2026

Summary

Clean up the dual-stack ONNX Runtime / Sherpa-ONNX situation that had accumulated across the web, Android, iOS, React Native, and Flutter surfaces. Both technologies stay (sherpa-onnx for STT/TTS/VAD, raw ORT for wake-word + RAG embeddings — sherpa has no equivalents for the latter two), but sherpa-onnx's expected ORT version becomes the single source of truth, and the dead code that had grown up around the old assumption is stripped.

Also delivers the new wake-word and RAG-embeddings web implementations that are on the roadmap — the full openWakeWord 3-stage pipeline plus a HuggingFace-compatible WordPiece tokenizer and BERT encoder, running via the new ORTRuntimeBridge wrapper over onnxruntime-web.

Full rationale (including why we can't drop either side) is in the plan doc at `thoughts/shared/plans/cross-platform-onnx-sherpa-consolidation.md`.

What's in each commit

1. `ae7660d3e`: Web cleanup. Delete the vestigial `--onnx` / `RAC_WASM_ONNX` path from `wasm/CMakeLists.txt`, `wasm/scripts/build.sh`, `wasm_exports.cpp`, and 2 READMEs. The embedded path never worked (no WASM ORT build exists). `--build-sherpa` remains as the only correct sherpa-onnx path on the web.
2. `8d0b51f65`: Android cleanup. `download-sherpa-onnx.sh` strips `libsherpa-onnx-jni.so` + `libsherpa-onnx-cxx-api.so` per ABI — neither is referenced by any consumer (we have our own JNI at `librac_backend_onnx_jni.so` and we link sherpa's C API, not C++ API). ~13.8 MB saved per AAR across the 3 ABIs. Belt-and-braces `packagingOptions.jniLibs.excludes` added to `runanywhere-core-onnx/build.gradle.kts`. Propagated through Kotlin / React-Native / Flutter build scripts, the commons-release workflow, and publishing docs.
3. `aa51b05ca`: Delete intellij-plugin-demo. Retires `examples/intellij-plugin-demo/` and the gradle composite build, CI job, run configurations, and docs references. 23 files / ~1900 LOC gone.
4. `13199d7e4`: Single ORT version pin. Collapses `ONNX_VERSION_{IOS,ANDROID,MACOS,LINUX,WINDOWS}` to the sherpa-expected `1.17.1` (was drifting to `1.23.2` on macOS / Linux / Windows, a silent bug that would have hit as a missing-symbol runtime crash). `load-versions.sh` + `LoadVersions.cmake` now hard-error if the five pins are ever out of sync. Adds top-of-file docstrings to `FetchONNXRuntime.cmake` and `Package.swift` explaining why `RABackendONNX.xcframework` + `onnxruntime-{ios,macos}.xcframework` must ship together.
5. `c5a771b72`: Shared `Ort::Env` singleton. New `src/backends/onnx/shared/rac_ort_env.{h,cpp}`: one lazy, leaked, thread-safe `Ort::Env` accessible via both C API (`shared_ort_api()` + `shared_ort_env()`) and C++ API (`shared_cxx_env()`). Migrates the three independent `CreateEnv` call sites (`onnx_backend.cpp`, `wakeword_onnx.cpp`, `onnx_embedding_provider.cpp`) to consume it; clears pointers on cleanup without `ReleaseEnv`. Zero remaining `CreateEnv` / `ReleaseEnv` under `src/`.
6. `16a79d75c`: Web wake-word + embeddings scaffolding. Adds `onnxruntime-web` dep, `Foundation/ORTRuntimeBridge.ts`, stubbed `WakeWord*` / `Embeddings*` extensions.
7. `2cb9dfa8d`: Web wake-word + embeddings implementation. Replaces the stubs with real pipelines: full openWakeWord port (1280-sample framing, 480-sample context overlap, `(v/10)+2` post-mel transform, 76-frame stride-8 embedding windowing, per-classifier threshold + cooldown) and a ~230-LOC HuggingFace-compatible BERT WordPiece tokenizer (BasicTokenizer + greedy WordPiece, NFD normalize, accent stripping, `[CLS]`/`[SEP]`/`[UNK]`/`[PAD]`). Encoder inference wiring auto-maps BERT input aliases, attention-weighted mean pool, optional L2 normalize. Typecheck green across web/{core,llamacpp,onnx}.
8. `30890a60b`: Reorganize `backends/onnx/` by capability. Creates `shared/`, `wakeword/`, `embeddings/` subfolders; git-renames `wakeword_onnx.cpp` + the three embedding files (from `src/features/rag/`) into their new homes. Eliminates the RAG_DIR cross-directory compile hack in `src/backends/onnx/CMakeLists.txt`.

Test plan

  • C++ build matrix (CI):
    • iOS (`build-ios.sh` + xcframework packaging)
    • Android `arm64-v8a` / `armeabi-v7a` / `x86_64`
    • macOS (confirm ORT 1.17.1 downloads — this is the version change)
    • Linux / Windows (same — ORT 1.17.1)
  • Web SDK build + typecheck:
    • `cd sdk/runanywhere-web && npm run typecheck` ← green locally
    • `./scripts/build-web.sh --build-wasm --all-backends` ← confirm the `--onnx` flag is now rejected
    • `./scripts/build-web.sh --build-sherpa` ← sherpa path still works
  • Android AAR shape:
    • `unzip -l build/outputs/aar/runanywhere-core-onnx-*-release.aar | grep sherpa` should list only `libsherpa-onnx-c-api.so`
    • Smoke-test STT / TTS / VAD in the Android example app (no `UnsatisfiedLinkError` from the stripped libs)
  • Flutter + React Native example apps build (inherit the Android AAR trimming via their download-task allowlists)
  • Commons-release workflow dry-run (verifies the trimmed sherpa-onnx download check)
  • Not in this PR, open follow-ups: real-model smoke tests for `WakeWord` + `Embeddings` in the web example app; `onnxruntime-web` WASM-asset serving recipes in the onnx package README; SentencePiece tokenizer for multilingual RAG models.

Size impact summary

  • Android `runanywhere-core-onnx` AAR: ~13.8 MB smaller (3 ABIs × ~4.6 MB of unused sherpa libs stripped).
  • macOS / Linux / Windows: ORT downloads drop from 1.23.2 → 1.17.1, closing a silent symbol-drift risk.
  • Web: `racommons-llamacpp.wasm` unchanged; the new `onnxruntime-web` WASM (~2 MB) is lazy-loaded only when wake-word / embeddings are actually called.
  • Monorepo LOC: +1,704 / −2,108 net (the `intellij-plugin-demo` deletion offsets the new TS pipeline code).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Web SDK: added wake-word detection and text embeddings support.
  • Removals

    • IntelliJ Plugin demo and IDE integration discontinued.
    • Android packaging no longer ships redundant Sherpa-ONNX JNI/C++ variants; ONNX WASM moved to a separate module.
  • Chores

    • Enforced single ONNX Runtime version across platforms.
    • Improved native-library packaging, build scripts, and download/version idempotency.

Greptile Summary

This PR consolidates the ONNX Runtime / Sherpa-ONNX dual-stack across all platforms: collapses the five per-platform ORT version pins to a single 1.17.1 (sherpa-onnx's requirement), strips unused sherpa .so files from the Android AAR (~13.8 MB saved), deletes the IntelliJ plugin demo, introduces a shared Ort::Env singleton to eliminate redundant ORT env creation, and adds full web implementations for wake-word detection (openWakeWord 3-stage pipeline) and RAG embeddings (BERT encoder + HuggingFace-compatible WordPiece tokenizer). All findings are P2; safe to merge.

Confidence Score: 5/5

Safe to merge; all findings are P2 style/resilience suggestions with no impact on correctness or data integrity.

No P0 or P1 issues found. The ORT version consolidation, shared-env singleton, Android AAR trim, and TypeScript web pipelines are all logically correct. The three P2 findings are: a rejected-promise retry gap in ORTRuntimeBridge, a header doc mismatch on the exception type, and a GC-inefficient audio buffer in WakeWordService — none affect current runtime behavior.

sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts (retry-on-failure gap)

Important Files Changed

  • `sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts`: New singleton bridge for lazy-loading onnxruntime-web; `_loadPromise` is cached permanently on rejection, preventing retry on transient failures.
  • `sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts`: Full openWakeWord 3-stage pipeline port; correct streaming state management; `audioBuffer: number[]` with per-element push/splice is inefficient for real-time audio.
  • `sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+Embeddings.ts`: BERT encoder pipeline with attention-weighted mean pooling and L2 normalization; clean input-alias resolution; well-guarded null checks throughout.
  • `sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts`: HuggingFace-compatible BERT WordPiece tokenizer; correct greedy WordPiece with surrogate-pair and BOM handling.
  • `sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp`: Thread-safe shared Ort::Env singleton via std::call_once; leaked intentionally for process-lifetime correctness.
  • `sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.h`: Good API surface; doc mismatch on exception type for `shared_cxx_env()` — no runtime impact.
  • `sdk/runanywhere-commons/VERSIONS`: All five ONNX_VERSION_* pins consolidated to 1.17.1; macOS/Linux/Windows 1.23.2 drift fixed; lock-step enforcement added.
  • `sdk/runanywhere-commons/cmake/LoadVersions.cmake`: New FATAL_ERROR guard ensures all ONNX_VERSION_* pins match at CMake configure time.
  • `sdk/runanywhere-commons/scripts/android/download-sherpa-onnx.sh`: `strip_unused_sherpa_libs()` correctly removes unused .so files; detection sentinel updated to libonnxruntime.so.
  • `sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts`: Belt-and-braces `packagingOptions.jniLibs.excludes` for stripped sherpa libs; onnxLibs allowlist updated.

Sequence Diagram

sequenceDiagram
    participant App
    participant WakeWordService
    participant ORTRuntimeBridge
    participant ORT as onnxruntime-web
    participant MelspecModel as melspectrogram.onnx
    participant EmbeddingModel as embedding_model.onnx
    participant Classifier as classifier.onnx

    App->>WakeWordService: load(config)
    WakeWordService->>ORTRuntimeBridge: initialize()
    ORTRuntimeBridge->>ORT: import('onnxruntime-web') [lazy]
    ORT-->>ORTRuntimeBridge: module
    ORTRuntimeBridge-->>WakeWordService: ort module
    WakeWordService->>ORT: createSession(melspectrogramModel)
    WakeWordService->>ORT: createSession(embeddingModel)
    WakeWordService->>ORT: createSession(classifierModel[])
    WakeWordService-->>App: ready

    loop Per 1280-sample frame (80 ms @ 16 kHz)
        App->>WakeWordService: feed(samples)
        WakeWordService->>WakeWordService: buffer → align to 1280-sample frames
        WakeWordService->>MelspecModel: run([context+frame]) → mel frames [Fx32]
        WakeWordService->>WakeWordService: apply (v/10)+2 transform
        WakeWordService->>EmbeddingModel: run([76-frame window]) → embedding [96]
        WakeWordService->>Classifier: run([16 embeddings]) → score [0,1]
        alt score >= threshold AND cooldown elapsed
            WakeWordService-->>App: callback(WakeWordDetection)
        end
    end

    App->>WakeWordService: unload()
    WakeWordService->>ORT: session.release() x (2 + N classifiers)
This is a comment left during a code review.
Path: sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts
Line: 63-95

Comment:
**Rejected load promise cached permanently**

If `import('onnxruntime-web')` rejects (WASM asset missing, CDN unreachable, misconfigured `wasmPaths`), `_loadPromise` is left pointing at a rejected promise. Every subsequent call to `initialize()` hits `if (this._loadPromise) return this._loadPromise` and returns the same rejection without ever retrying, making the bridge permanently unusable — the only escape is the test-only `_resetForTests()`.

Clear `_loadPromise` in a catch so callers can retry after fixing the configuration.

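A minimal sketch of the suggested fix: attach a catch that clears the cached promise on rejection, so the next `initialize()` call retries the import. The class and injectable `loadOrt` loader below are stand-ins for illustration, not the real ORTRuntimeBridge code.

```typescript
// Sketch: clear the cached load promise on rejection so callers can retry.
// `loadOrt` stands in for import('onnxruntime-web'); names are illustrative.
class OrtBridgeSketch<T> {
  private _loadPromise: Promise<T> | null = null;

  constructor(private readonly loadOrt: () => Promise<T>) {}

  initialize(): Promise<T> {
    if (this._loadPromise) return this._loadPromise;
    this._loadPromise = this.loadOrt().catch((err) => {
      this._loadPromise = null; // allow retry after a transient failure
      throw err;               // still surface the error to this caller
    });
    return this._loadPromise;
  }
}
```

Successful loads stay cached forever (singleton behavior unchanged); only failed loads become retryable.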

---

This is a comment left during a code review.
Path: sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.h
Line: 38-41

Comment:
**Doc says `Ort::Exception`; implementation throws `std::runtime_error`**

The header documents `shared_cxx_env()` as "Throws `Ort::Exception` if ORT could not create the env", but `rac_ort_env.cpp` throws `std::runtime_error`. All current call sites already catch both types, so there's no runtime issue — but the doc will mislead future callers who only guard against `Ort::Exception`.


---

This is a comment left during a code review.
Path: sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts
Line: 167-186

Comment:
**`audioBuffer: number[]` causes GC pressure at real-time audio rates**

Each `feed()` call pushes samples one-by-one into a plain `number[]`, then `splice(0, 1280)` removes the front chunk — an O(n) shift at ~12 frames/second. A `Float32Array` with a write-head cursor would eliminate the boxing and the O(n) splice.

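One way to implement the suggested `Float32Array`-with-cursor buffer. The class name and API are hypothetical; only the 1280-sample frame size comes from the pipeline above. It appends chunks in bulk, drains whole frames, and compacts the remainder once per call instead of an O(n) splice per frame.

```typescript
// Sketch: typed-array frame buffer with a write cursor, no number[] boxing.
class FrameBuffer {
  private buf: Float32Array;
  private len = 0; // number of valid samples currently buffered

  constructor(capacity: number, private readonly frameSize: number = 1280) {
    this.buf = new Float32Array(capacity);
  }

  push(samples: Float32Array): Float32Array[] {
    // grow geometrically if the incoming chunk would overflow
    if (this.len + samples.length > this.buf.length) {
      const grown = new Float32Array(
        Math.max(this.buf.length * 2, this.len + samples.length)
      );
      grown.set(this.buf.subarray(0, this.len));
      this.buf = grown;
    }
    this.buf.set(samples, this.len); // bulk append, no per-element push
    this.len += samples.length;

    // drain every complete frame, then compact the remainder once
    const frames: Float32Array[] = [];
    let offset = 0;
    while (this.len - offset >= this.frameSize) {
      frames.push(this.buf.slice(offset, offset + this.frameSize));
      offset += this.frameSize;
    }
    if (offset > 0) {
      this.buf.copyWithin(0, offset, this.len);
      this.len -= offset;
    }
    return frames;
  }
}
```

A true ring buffer (wrapping read/write cursors) would avoid even the `copyWithin`, but at ~12 frames/second the single bulk move per `feed()` is already far cheaper than per-element splice.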

Reviews (1): Last reviewed commit: "commons/onnx: reshape backends/onnx/ by ..."

Greptile also left 3 inline comments on this PR.

sanchitmonga22 and others added 8 commits April 15, 2026 00:04
The embedded sherpa-onnx path in the web SDK's WASM build has never been
functional on the web: rac_backend_onnx requires ONNX Runtime headers
(FetchONNXRuntime.cmake) that have no WebAssembly distribution. The
build only survived because -sERROR_ON_UNDEFINED_SYMBOLS=0 silently
produced a binary with null rac_{tts,vad}_onnx_* exports.

--all-backends already excluded --onnx with a comment documenting this,
but the dead option and its 8 conditional blocks were still wired up in
wasm/CMakeLists.txt, wasm/scripts/build.sh, wasm_exports.cpp, and the
README examples.

STT / TTS / VAD on the web are served by the separate sherpa-onnx.wasm
module built by wasm/scripts/build-sherpa-onnx.sh and loaded lazily by
packages/onnx/src/Foundation/SherpaONNXBridge.ts. That path is unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…i.so

The Sherpa-ONNX Android prebuilt (v1.12.20) contains four .so files per
ABI: libonnxruntime.so + libsherpa-onnx-c-api.so + libsherpa-onnx-jni.so
+ libsherpa-onnx-cxx-api.so. We only need the first two:

  * librac_backend_onnx.so links against the C API (libsherpa-onnx-c-api.so).
  * Our own JNI bridge is librac_backend_onnx_jni.so, loaded from Kotlin.
    Sherpa-ONNX's libsherpa-onnx-jni.so is never referenced via
    System.loadLibrary on the JVM side.
  * libsherpa-onnx-cxx-api.so is sherpa's C++ wrapper — our backend
    consumes the C API, not the C++ API.

Changes:

  * scripts/android/download-sherpa-onnx.sh: strip_unused_sherpa_libs()
    runs after download/re-extract, deletes the two unused .so per ABI.
    Also fix the "already exists" early-return to check libonnxruntime.so
    (the real always-required artifact) instead of the now-stripped
    libsherpa-onnx-jni.so, so a trimmed tree doesn't force re-download.
  * modules/runanywhere-core-onnx/build.gradle.kts: narrow the download
    filter to the 4 libs we ship + add packagingOptions.jniLibs.excludes
    as a safety net against stale local copies.
  * Propagate the exclusion through the Kotlin / React Native / Flutter
    build scripts + the RN onnx CMakeLists (no imports/links of the two
    unused sherpa targets).
  * Commons-release workflow: verify libsherpa-onnx-c-api.so +
    libonnxruntime.so presence instead of the stripped jni.
  * Update docs (onnx module README, maven central publishing guide, build
    tree summary) to reflect the shipped set.

Savings: ~4.6 MB per ABI × 3 ABIs (arm64-v8a, armeabi-v7a, x86_64) =
~13.8 MB removed from the runanywhere-onnx-android AAR. Zero runtime
impact — neither library was resolved by any consumer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Retires the IntelliJ/Android Studio plugin demo. Not part of the active
roadmap; keeping it around imposes a CI job, a gradle composite build,
and two run configurations with no ongoing benefit.

Removed:

  * examples/intellij-plugin-demo/ (full directory, ~1900 LOC Kotlin +
    gradle wrapper)
  * settings.gradle.kts includeBuild entry
  * build.gradle.kts buildIntellijPlugin + runIntellijPlugin tasks and
    the branches of buildAll / cleanAll that drove them
  * .github/workflows/build-all-test.yml intellij-plugin job, its
    workflow_dispatch input, and the summary row
  * .idea/runConfigurations/09_Build_IntelliJ_Plugin.xml +
    10_Run_IntelliJ_Plugin.xml
  * docs/building.md IntelliJ Plugin section + output-table row
  * CLAUDE.md project description
  * .gitignore entry for plugin .idea/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rift

Sherpa-ONNX is our only ORT-consuming backend; its prebuilt artifacts
assume a specific ONNX Runtime version. On iOS the sherpa-onnx.xcframework
leaves ORT symbols undefined and expects them to be resolved against the
separate pod-archive-onnxruntime-c-<version>.zip. On Android, macOS, Linux,
and Windows sherpa's distribution bundles a compatible libonnxruntime
alongside its own libs, so we consume ORT from sherpa's tree.

Prior state had ONNX_VERSION_IOS=ONNX_VERSION_ANDROID=1.17.1 (sherpa's
actual version) but ONNX_VERSION_MACOS=ONNX_VERSION_LINUX=
ONNX_VERSION_WINDOWS=1.23.2 — a silent drift. On macOS/Linux/Windows the
separate ORT fetch would land at 1.23.2 while sherpa expected 1.17.1, and
nothing flagged it. First Ort:: call after sherpa loaded would hit a
missing symbol if the ABI had drifted.

Changes:

  * VERSIONS: all ONNX_VERSION_* pinned to 1.17.1 (what sherpa
    1.12.18/1.12.20/1.12.23 expects), with documentation explaining that
    sherpa is the single source of truth.
  * load-versions.sh + LoadVersions.cmake: hard-error when the five
    ONNX_VERSION_* pins don't all match. Makes drift a loud build failure
    rather than a silent runtime crash.
  * FetchONNXRuntime.cmake: top-of-file docstring stating the per-platform
    sourcing strategy (Android → from sherpa; WASM → interface; iOS →
    separate download pinned to sherpa version; macOS/Linux/Windows →
    separate fetch, also pinned).
  * Package.swift: document why RABackendONNX.xcframework and
    onnxruntime-{ios,macos}.xcframework must ship together (sherpa's
    undefined ORT symbols + our raw-ORT wake-word and embeddings code).

Zero runtime change on iOS and Android (versions were already aligned).
macOS/Linux/Windows now fetch ORT 1.17.1 instead of 1.23.2, matching
sherpa — closing a latent version-drift bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ddings

Previously three call sites each constructed their own OrtEnv:

  * src/backends/onnx/onnx_backend.cpp        — "runanywhere"
  * src/backends/onnx/wakeword_onnx.cpp       — "WakeWord"
  * src/features/rag/onnx_embedding_provider.cpp — "RAGEmbedding"

ORT allows multiple envs but each spins up its own logger, thread-pool
scaffolding, and arena allocator — pure overhead when every consumer
runs in the same process. This adds shared/rac_ort_env.{h,cpp} and
migrates the three call sites to consume it.

Design:

  * One lazily-initialized Ort::Env (via std::call_once) with log name
    "RunAnywhere" and WARNING-level ORT logging.
  * The env is heap-allocated and intentionally never destroyed — it
    outlives every rac_backend_onnx / wakeword / embedding instance, so
    the dangling-ref-at-shutdown pitfall of static-duration env
    destructors is avoided.
  * Three accessors: shared_ort_api() → const OrtApi*, shared_ort_env()
    → OrtEnv*, shared_cxx_env() → Ort::Env&. C-API call sites and C++
    call sites both reach the same underlying env without double-wrap.

Call-site migrations:

  * onnx_backend.cpp: initialize_ort() pulls api + env from the shared
    singleton; cleanup() clears pointers but does NOT ReleaseEnv.
  * wakeword_onnx.cpp: drops the unique_ptr<Ort::Env> member from the
    backend struct, touches shared_cxx_env() during create() so we fail
    early if the singleton can't initialize, and passes the shared env
    reference to every Ort::Session ctor. Adds std::runtime_error catch
    for the initialization-failure path.
  * onnx_embedding_provider.cpp: pulls api + env from the shared
    singleton; cleanup() clears pointers but does NOT ReleaseEnv.

Zero remaining CreateEnv / ReleaseEnv calls under src/. Guarded by
RAC_HAS_ONNX so WASM (which doesn't compile rac_backend_onnx) is
unaffected.

Note: the further directory reshape under src/backends/onnx/ (sherpa/,
wakeword/, embeddings/ subfolders; moving onnx_embedding_provider out
of src/features/rag/) is deferred — pure cosmetic move that needs
validation across 5 native platforms' CMake include paths, better left
to a follow-up once CI can cycle per platform.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces a second WASM runtime alongside sherpa-onnx.wasm — Microsoft's
onnxruntime-web (~2 MB) — to run arbitrary ONNX models that sherpa's C API
doesn't expose. Wake-word (openWakeWord 3-stage pipeline) and RAG
embeddings (BERT-style encoders) both fall in that bucket because sherpa
is a speech-specific library, not a generic ONNX runtime.

Design mirrors the native side:

  sherpa-onnx.wasm       → STT / TTS / VAD  (unchanged)
  onnxruntime-web        → wake-word, embeddings, future direct-ORT
                           features

Both WASM modules are lazy-loaded, so apps that only use one feature set
don't pay the download cost for the other.

What this commit adds:

  Foundation/ORTRuntimeBridge.ts
    Singleton wrapper over onnxruntime-web. Initializes ort.env once
    (thread count, wasmPaths, log severity), exposes createSession().

  Extensions/WakeWordTypes.ts + RunAnywhere+WakeWord.ts
    Types match native rac_wakeword_onnx_config_t so the same
    openWakeWord .onnx models work cross-platform. load() loads
    Stage 1 (melspec) + Stage 2 (embedding) + N classifier sessions
    in parallel. feed() is stubbed with a descriptive error — the
    feed-forward port from native's process_audio_frame() is a
    separate ML-engineering PR.

  Extensions/EmbeddingsTypes.ts + RunAnywhere+Embeddings.ts
    Matches native onnx_embedding_provider's layout (input_ids /
    attention_mask / token_type_ids; last_hidden_state → mean-pool).
    load() wires up the session; embed()/embedBatch() stubbed pending
    WordPiece / SentencePiece tokenizer integration (follow-up PR).

  package.json
    Adds onnxruntime-web ^1.17.1 as a runtime dependency (version
    aligned with the native ORT pin in commons/VERSIONS so the same
    ORT release is used everywhere).

  index.ts
    Exports WakeWord, WakeWordService, Embeddings, EmbeddingsService,
    ORTRuntimeBridge, plus the new config / result types.

The API surface is the goal of this commit — the inference math lands
when product work on wake-word / RAG kicks off. Until then, calling
feed() or embed() throws a descriptive "not yet implemented" error, not
a cryptic runtime crash.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the "not yet implemented" stubs with full TypeScript ports of
the native logic.

WAKE-WORD (Extensions/RunAnywhere+WakeWord.ts)
----------------------------------------------

Full openWakeWord 3-stage feed-forward pipeline, mirrored from
sdk/runanywhere-commons/src/backends/onnx/wakeword_onnx.cpp:

  * 1280-sample audio framing (80 ms @ 16 kHz), with a 480-sample
    context overlap fed alongside each new frame so mel-spec boundary
    frames match Python openWakeWord exactly.
  * Stage 1: melspectrogram.onnx → (v / 10) + 2 post-transform applied.
  * Stage 2: sliding 76-frame windows, stride 8, through the embedding
    model. 96-dim output, zero-padded if the model emits fewer dims.
  * Stage 3: per-classifier `[1, N, 96]` input, N auto-discovered from
    the model's input metadata (fallback 16). Per-classifier threshold
    + optional cooldown-frames gating to prevent repeated fires on a
    single utterance.
  * Buffers pre-filled with 76 ones(32) mel frames at load() / reset()
    so the first embedding fires immediately without ~1 s warm-up,
    matching the Python reference.
  * feed() accepts Float32Array or Int16Array — converts int16 → float32
    inline [-1, 1].
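Two of the scalar steps above can be shown in isolation. Note an assumption: the commit says int16 converts to float32 in [-1, 1] but doesn't give the divisor; 32768 (the common PCM convention) is used here. The `(v/10)+2` transform is quoted from the commit.

```typescript
// openWakeWord post-mel transform: out[i] = (v / 10) + 2, per the commit text.
function melPostTransform(mel: Float32Array): Float32Array {
  const out = new Float32Array(mel.length);
  for (let i = 0; i < mel.length; i++) out[i] = mel[i] / 10 + 2;
  return out;
}

// int16 PCM -> float32 in [-1, 1]; the 32768 divisor is an assumption
// (standard PCM scaling), not stated in the commit message.
function int16ToFloat32(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) out[i] = pcm[i] / 32768;
  return out;
}
```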

EMBEDDINGS (Extensions/RunAnywhere+Embeddings.ts + Foundation/WordPieceTokenizer.ts)
------------------------------------------------------------------------------------

Full HuggingFace BERT-compatible WordPiece tokenizer (~230 LOC, matches
`transformers.BertTokenizer(do_lower_case=True, strip_accents=True)`) +
encoder inference pipeline:

  * WordPieceTokenizer loads either `vocab.txt` (one token per line, id
    = line index) or a HuggingFace `tokenizer.json` blob, resolving the
    four special tokens ([CLS] [SEP] [UNK] [PAD]) up-front.
  * BasicTokenizer: control-char strip, NFD normalize, optional
    lowercase + accent strip, whitespace + punctuation splitting.
  * WordPiece: greedy longest-prefix-first with `##` continuation; emits
    [UNK] when the first prefix doesn't resolve.
  * Encoder wiring introspects the model's input names and maps
    BERT-standard aliases (input_ids / attention_mask / token_type_ids)
    so the same code handles HF-exported and ONNX-optimized encoders.
  * Mean-pool along the sequence dimension weighted by attention_mask,
    optional L2 normalize (default ON) for cosine-similarity search.
  * `embed()` handles the [1, N, D] or [N, D] output shape variants.
  * Dependency: onnxruntime-web 1.24+ (latest stable — ORT-web is a
    separate runtime from sherpa's bundled ORT, so no version-pin
    coupling to SHERPA_ONNX_VERSION_*).
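The greedy longest-prefix-first WordPiece step described above can be sketched compactly. This is a simplified illustration: the vocab is a plain `Set`, and the BasicTokenizer pre-pass, special-token plumbing, and surrogate-pair handling of the real tokenizer are omitted.

```typescript
// Sketch of greedy WordPiece: longest matching prefix first, '##' marks
// continuation pieces, whole word collapses to [UNK] if any step fails.
function wordPiece(
  word: string,
  vocab: Set<string>,
  unk = '[UNK]',
  maxChars = 100
): string[] {
  if (word.length > maxChars) return [unk];
  const pieces: string[] = [];
  let start = 0;
  while (start < word.length) {
    let end = word.length;
    let match: string | null = null;
    while (start < end) {
      // continuation pieces carry the '##' prefix in the vocab
      const sub = (start > 0 ? '##' : '') + word.slice(start, end);
      if (vocab.has(sub)) {
        match = sub;
        break;
      }
      end -= 1; // shrink the candidate and try again
    }
    if (match === null) return [unk]; // prefix didn't resolve
    pieces.push(match);
    start = end;
  }
  return pieces;
}
```

With the classic HuggingFace example vocab, `"unaffable"` splits into `un / ##aff / ##able`.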

Everything typechecks green under `npm run typecheck` across the three
web packages (core / llamacpp / onnx).
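The attention-weighted mean pool and optional L2 normalize described in the encoder wiring above look roughly like this. A sketch only: the function name, the flattened `[seqLen * dim]` layout, and the `number[]` mask type are illustrative choices, not the SDK's actual signatures.

```typescript
// Sketch: mean-pool hidden states over the sequence dimension, weighted by
// attention_mask (padded positions contribute nothing), then L2 normalize.
function meanPool(
  hidden: Float32Array, // row-major [seqLen * dim]
  attentionMask: number[],
  seqLen: number,
  dim: number,
  l2Normalize = true
): Float32Array {
  const out = new Float32Array(dim);
  let kept = 0;
  for (let t = 0; t < seqLen; t++) {
    if (!attentionMask[t]) continue; // skip [PAD] positions
    kept += 1;
    for (let d = 0; d < dim; d++) out[d] += hidden[t * dim + d];
  }
  if (kept > 0) for (let d = 0; d < dim; d++) out[d] /= kept;
  if (l2Normalize) {
    let norm = 0;
    for (let d = 0; d < dim; d++) norm += out[d] * out[d];
    norm = Math.sqrt(norm) || 1; // guard against all-pad input
    for (let d = 0; d < dim; d++) out[d] /= norm;
  }
  return out;
}
```

L2-normalized outputs make cosine similarity a plain dot product, which is why it defaults ON for similarity search.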

Apps that want to start using wake-word or embeddings today:

  import { WakeWord, Embeddings } from '@runanywhere/web-onnx';

  await WakeWord.load({
    shared: { melspectrogramModel: '...', embeddingModel: '...' },
    classifiers: [{ modelId: 'hey_jarvis', wakeWord: 'Hey Jarvis',
                    classifierModel: '...' }],
  });
  WakeWord.setCallback(({ wakeWord, score }) => ...);
  await WakeWord.feed(pcm16kMonoFloat32Samples);

  await Embeddings.load({
    model: 'https://.../model.onnx',
    tokenizer: 'https://.../vocab.txt',  // or tokenizer.json
  });
  const { vector } = await Embeddings.embed('search query text');

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Groups the ONNX backend code by what it does, not by history:

  src/backends/onnx/
  ├── shared/          ← shared/rac_ort_env.{h,cpp}     (one OrtEnv for all)
  ├── wakeword/        ← wakeword/wakeword_onnx.cpp     (raw Ort::Session, openWakeWord)
  ├── embeddings/      ← onnx_embedding_provider.{h,cpp}
  │                      rac_onnx_embeddings_register.cpp  (raw ORT C API, BERT for RAG)
  ├── onnx_backend.{h,cpp}                              (sherpa-backed STT/TTS/VAD)
  ├── rac_onnx.cpp
  └── rac_backend_onnx_register.cpp

Moves:

  src/backends/onnx/wakeword_onnx.cpp
    → src/backends/onnx/wakeword/wakeword_onnx.cpp

  src/features/rag/onnx_embedding_provider.{cpp,h}
  src/features/rag/rac_onnx_embeddings_register.cpp
    → src/backends/onnx/embeddings/

The embedding-provider move resolves the cross-directory compile hack in
src/backends/onnx/CMakeLists.txt that previously reached into
src/features/rag/ to pull the .cpp files (for the
rac_commons → rac_backend_rag → rac_backend_onnx → rac_commons cycle
workaround). The workaround is still necessary architecturally, but now
the files live where they logically belong; CMakeLists uses a simple
`if(RAC_BACKEND_RAG)` list-append against local paths.

CMakeLists updates:

  * src/backends/onnx/CMakeLists.txt
    - wakeword_onnx.cpp           → wakeword/wakeword_onnx.cpp
    - RAG_DIR cross-dir references removed
    - embeddings/*.cpp + embeddings/onnx_embedding_provider.h listed
      locally under RAC_BACKEND_RAG gate
    - target_include_directories(... PRIVATE embeddings/) so
      rac_onnx_embeddings_register.cpp's flat
      `#include "onnx_embedding_provider.h"` still resolves

  * src/features/rag/CMakeLists.txt
    - cross-compile comment updated to point at the new location

Include-path fixups inside the moved files:

  * embeddings/onnx_embedding_provider.cpp
    - `../../backends/onnx/onnx_backend.h`      → `../onnx_backend.h`
    - `../../backends/onnx/shared/rac_ort_env.h` → `../shared/rac_ort_env.h`

  * wakeword/wakeword_onnx.cpp
    - `shared/rac_ort_env.h`                    → `../shared/rac_ort_env.h`

Doc references updated:
  * include/rac/core/rac_platform_compat.h      (call-site list)
  * tests/simple_tokenizer_test.cpp             (comment)
  * web/onnx TS file headers                    (native-path citations)

Git tracks all four file moves as renames, so blame / history is
preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 15, 2026

📝 Walkthrough

This pull request removes the IntelliJ plugin demo and its CI/build artifacts, centralizes ONNX Runtime into a process-wide shared environment, strips unused Sherpa-ONNX JNI/C++ libraries from mobile packaging, adds wake-word detection and text-embeddings to the Web SDK using onnxruntime-web, decouples Sherpa-ONNX from the main WASM artifact, and enforces identical ONNX version pins across platforms.

Changes

  • IntelliJ plugin demo removal
    Files: .github/workflows/build-all-test.yml, .idea/runConfigurations/*, build.gradle.kts, settings.gradle.kts, examples/intellij-plugin-demo/plugin/*, .gitignore
    Deleted the IntelliJ plugin demo project, its Gradle tasks/run configurations, wrapper scripts, plugin sources/manifests, and removed the workflow job and summary row.

  • ONNX runtime sharing (native)
    Files: sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.h, .../shared/rac_ort_env.cpp, .../onnx_backend.cpp, .../embeddings/onnx_embedding_provider.cpp, .../wakeword/wakeword_onnx.cpp
    Introduced process-lifetime shared ONNX Runtime API/C++ Env and updated backends/providers to consume the shared singleton (removed per-backend env creation/release).

  • Sherpa-ONNX stripping & packaging changes (mobile/Android)
    Files: sdk/runanywhere-commons/scripts/android/download-sherpa-onnx.sh, sdk/runanywhere-commons/scripts/build-android.sh, sdk/runanywhere-kotlin/scripts/build-kotlin.sh, sdk/runanywhere-kotlin/modules/.../build.gradle.kts, sdk/runanywhere-flutter/scripts/build-flutter.sh, sdk/runanywhere-react-native/*, sdk/runanywhere-react-native/packages/onnx/android/CMakeLists.txt
    Narrowed shipped Sherpa libs to libsherpa-onnx-c-api.so + libonnxruntime.so; added stripping function and packaging tasks to remove libsherpa-onnx-jni.so/libsherpa-onnx-cxx-api.so before AAR/JNI merge.

  • ONNX version pinning & validation
    Files: sdk/runanywhere-commons/VERSIONS, sdk/runanywhere-commons/cmake/LoadVersions.cmake, sdk/runanywhere-commons/cmake/FetchONNXRuntime.cmake, sdk/runanywhere-commons/scripts/load-versions.sh
    Standardized ONNX pins to 1.17.1 across platforms and added invariant checks that abort configuration when per-platform ONNX version pins diverge.

  • Web SDK: wake-word & embeddings (new features)
    Files: sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts, .../WakeWordTypes.ts, .../RunAnywhere+Embeddings.ts, .../EmbeddingsTypes.ts
    Added WakeWordService and EmbeddingsService implementations, types, and public exports for streaming wake-word detection and BERT-style embeddings using onnxruntime-web.

  • Web SDK runtime & tokenizer foundation
    Files: sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts, .../WordPieceTokenizer.ts, .../index.ts, .../package.json
    Added ORTRuntimeBridge singleton (lazy-loads onnxruntime-web), WordPieceTokenizer, and updated package exports and dependencies to expose new ONNX/web features.

  • WASM build decoupling (sherpa-onnx)
    Files: sdk/runanywhere-web/wasm/CMakeLists.txt, sdk/runanywhere-web/wasm/scripts/build.sh, sdk/runanywhere-web/wasm/src/wasm_exports.cpp
    Removed RAC_WASM_ONNX option and --onnx flag; stopped embedding sherpa-onnx into main WASM, documenting and supporting separate sherpa-onnx.wasm module.

  • Docs, packaging, scripts, and small fixes
    Files: CLAUDE.md, docs/building.md, Package.swift, sdk/runanywhere-kotlin/docs/*, sdk/runanywhere-commons/scripts/*, sdk/runanywhere-commons/tests/*, sdk/runanywhere-commons/src/core/capabilities/lifecycle_manager.cpp, etc.
    Updated docs and scripts to reflect IntelliJ removal, ONNX sharing, stripped Sherpa libs, new Web features, version sentinel behavior, and packaging/task wiring changes.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Browser as Client (Web)
    participant WakeWordSvc as WakeWordService
    participant ORTBridge as ORTRuntimeBridge
    participant ORTRuntime as onnxruntime-web
    participant Classifier as Classifier Session

    Browser->>WakeWordSvc: feed(audio)
    WakeWordSvc->>WakeWordSvc: frame/align & melspec
    WakeWordSvc->>ORTBridge: ensure initialized
    ORTBridge->>ORTRuntime: import & configure
    ORTRuntime-->>ORTBridge: ready
    ORTBridge-->>WakeWordSvc: ort API/session factory

    WakeWordSvc->>ORTRuntime: run melspec session
    ORTRuntime-->>WakeWordSvc: mel output
    WakeWordSvc->>ORTRuntime: run embedding session
    ORTRuntime-->>WakeWordSvc: embedding vector

    loop per classifier
      WakeWordSvc->>Classifier: run classifier session
      Classifier-->>WakeWordSvc: score
      alt score > threshold
        WakeWordSvc-->>Browser: emit WakeWordDetection
      end
    end
```
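The per-classifier scoring loop in the diagram above can be sketched in TypeScript. The `Classifier`/`Detection` shapes and the refractory-window constant below are illustrative assumptions, not the SDK's actual types:

```typescript
// Illustrative sketch of the score-threshold-and-debounce step (assumed types).
interface Classifier {
  wakeWord: string;
  threshold: number;
  lastDetectionFrame: number; // frame index of the last emitted detection
}

interface Detection {
  wakeWord: string;
  score: number;
}

// Assumed debounce window: suppress re-detections for this many frames.
const REFRACTORY_FRAMES = 15;

function evaluateClassifiers(
  classifiers: Classifier[],
  scores: number[],
  frameIndex: number,
): Detection[] {
  const detections: Detection[] = [];
  classifiers.forEach((c, i) => {
    const score = scores[i] ?? 0;
    // Emit only when the score clears the threshold and we are past the
    // refractory window since this classifier's last detection.
    if (score > c.threshold && frameIndex - c.lastDetectionFrame >= REFRACTORY_FRAMES) {
      c.lastDetectionFrame = frameIndex;
      detections.push({ wakeWord: c.wakeWord, score });
    }
  });
  return detections;
}
```

The per-classifier `lastDetectionFrame` keeps one noisy classifier from spamming detections without suppressing the others.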
```mermaid
sequenceDiagram
    participant App as Client
    participant EmbSvc as EmbeddingsService
    participant Tokenizer as WordPieceTokenizer
    participant ORTBridge as ORTRuntimeBridge
    participant Encoder as Encoder Session

    App->>EmbSvc: embed(text)
    EmbSvc->>Tokenizer: encode(text)
    Tokenizer-->>EmbSvc: inputIds, attentionMask
    EmbSvc->>ORTBridge: ensure initialized & createSession(encoder)
    ORTBridge->>Encoder: create session
    Encoder-->>EmbSvc: session created

    EmbSvc->>Encoder: session.run(inputs)
    Encoder-->>EmbSvc: last_hidden_state
    EmbSvc->>EmbSvc: mean-pool & optional L2 normalize
    EmbSvc-->>App: EmbeddingResult(vector, dim, tokenCount)
```
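The mean-pool and optional L2-normalize step in the diagram can be sketched with plain arrays. The real service operates on `ort.Tensor` data; the function and parameter names below are illustrative:

```typescript
// Attention-masked mean pooling over a row-major [seqLen * dim] hidden state,
// followed by optional L2 normalization (illustrative standalone helper).
function meanPoolNormalize(
  hidden: Float32Array,    // flattened last_hidden_state, row-major
  attentionMask: number[], // 1 for real tokens, 0 for padding
  dim: number,
  normalize = true,
): Float32Array {
  const out = new Float32Array(dim);
  let count = 0;
  attentionMask.forEach((m, t) => {
    if (m === 0) return; // skip padding positions
    count++;
    for (let d = 0; d < dim; d++) out[d] += hidden[t * dim + d];
  });
  for (let d = 0; d < dim; d++) out[d] /= Math.max(count, 1);
  if (normalize) {
    const norm = Math.hypot(...out) || 1; // guard against the zero vector
    for (let d = 0; d < dim; d++) out[d] /= norm;
  }
  return out;
}
```

Masking before pooling matters: averaging over padded positions would dilute the embedding for short inputs.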

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

react-native-sdk, kotlin-sdk, ios-sdk, documentation

Suggested reviewers

  • Siddhesh2377

Poem

🐰 Hop hop, the plugin hopped away,
ONNX now shares its sunlit day,
Sherpa bits pruned, the web gains ears,
Wake-word listens, embeddings cheer,
A little rabbit dances — change is here!

🚥 Pre-merge checks: 2 passed, 1 warning

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 45.83%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2)
  • Title check: ✅ Passed. The title clearly summarizes the main change: consolidating ONNX Runtime and Sherpa-ONNX across all platforms, which directly reflects the core objective of standardizing runtime usage.
  • Description check: ✅ Passed. The description is comprehensive, covering the rationale, detailed commit structure, test plan, and size impact; it aligns well with the template sections for Type of Change (Refactoring) and Testing (multiple platforms), though platform-specific checkboxes are not explicitly marked.



Comment on lines +63 to +95
```typescript
static async initialize(options: ORTRuntimeInitOptions = {}): Promise<typeof ort> {
  if (this._ort) return this._ort;
  if (this._loadPromise) return this._loadPromise;

  this._loadPromise = (async () => {
    const mod = await import('onnxruntime-web');

    // Configure the shared ort.env before the first session is created.
    if (options.wasmPaths !== undefined) {
      mod.env.wasm.wasmPaths = options.wasmPaths as
        | string
        | Record<string, string>;
    }

    const threads = options.numThreads ?? Math.min(
      typeof navigator !== 'undefined' ? navigator.hardwareConcurrency ?? 1 : 1,
      4,
    );
    mod.env.wasm.numThreads = threads;

    if (options.logSeverityLevel !== undefined) {
      mod.env.logLevel = (
        ['verbose', 'info', 'warning', 'error', 'fatal'] as const
      )[options.logSeverityLevel];
    }

    this._ort = mod;
    this._initialized = true;
    return mod;
  })();

  return this._loadPromise;
}
```

P2 Rejected load promise cached permanently

If import('onnxruntime-web') rejects (WASM asset missing, CDN unreachable, misconfigured wasmPaths), _loadPromise is left pointing at a rejected promise. Every subsequent call to initialize() hits if (this._loadPromise) return this._loadPromise and returns the same rejection without ever retrying, making the bridge permanently unusable — the only escape is the test-only _resetForTests().

Clear _loadPromise in a catch so callers can retry after fixing the configuration.
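A minimal sketch of that fix, with a generic `LazyLoader` standing in for the bridge and `loadModule` standing in for the dynamic `import('onnxruntime-web')` (names are hypothetical):

```typescript
// Cache the in-flight promise so concurrent callers share one load, but
// clear it on rejection so a later initialize() call can retry.
class LazyLoader<T> {
  private cached: T | null = null;
  private loading: Promise<T> | null = null;

  constructor(private loadModule: () => Promise<T>) {}

  async initialize(): Promise<T> {
    if (this.cached) return this.cached;
    if (this.loading) return this.loading;
    this.loading = this.loadModule()
      .then((mod) => {
        this.cached = mod;
        return mod;
      })
      .catch((err) => {
        // Drop the rejected promise so the next initialize() retries.
        this.loading = null;
        throw err;
      });
    return this.loading;
  }
}
```

Callers that awaited the same in-flight promise all see the rejection, but the next call after the failure starts a fresh load instead of replaying the cached error.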


Comment on lines +38 to +41
```cpp
/// Returns the shared Ort::Env (C++ API). Thread-safe; lazy-initialized.
/// Throws Ort::Exception if ORT could not create the env.
Ort::Env& shared_cxx_env();
```

P2 Doc says Ort::Exception; implementation throws std::runtime_error

The header documents shared_cxx_env() as "Throws Ort::Exception if ORT could not create the env", but rac_ort_env.cpp throws std::runtime_error. All current call sites already catch both types, so there's no runtime issue — but the doc will mislead future callers who only guard against Ort::Exception.


Comment on lines +167 to +186
```typescript
async feed(samples: Float32Array | Int16Array): Promise<void> {
  if (!this._isReady) {
    throw new Error('WakeWordService.load() must complete before feed().');
  }

  // Normalize int16 → float32 in-place into the buffer.
  if (samples instanceof Int16Array) {
    for (let i = 0; i < samples.length; i++) {
      this.audioBuffer.push(samples[i]! / 32768);
    }
  } else {
    for (let i = 0; i < samples.length; i++) {
      this.audioBuffer.push(samples[i]!);
    }
  }

  while (this.audioBuffer.length >= FRAME_SIZE) {
    const frame = Float32Array.from(this.audioBuffer.splice(0, FRAME_SIZE));
    await this.processFrame(frame);
  }
```

P2 audioBuffer: number[] causes GC pressure at real-time audio rates

Each feed() call pushes samples one-by-one into a plain number[], then splice(0, 1280) removes the front chunk — an O(n) shift at ~12 frames/second. A Float32Array with a write-head cursor would eliminate the boxing and the O(n) splice.
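A sketch of the suggested replacement, assuming a ring buffer whose capacity is a whole number of frames; the class and method names are hypothetical, not the SDK's API:

```typescript
// Fixed-capacity ring buffer for audio samples: push() normalizes int16
// input and overwrites the oldest samples on overflow; popFrame() copies
// out one frame without shifting the remaining data.
class AudioFrameBuffer {
  private buf: Float32Array;
  private length = 0;  // samples currently buffered
  private readPos = 0; // index of the oldest buffered sample

  constructor(private frameSize: number, capacityFrames = 64) {
    this.buf = new Float32Array(frameSize * capacityFrames);
  }

  push(samples: Float32Array | Int16Array): void {
    for (let i = 0; i < samples.length; i++) {
      const v = samples instanceof Int16Array ? samples[i] / 32768 : samples[i];
      const writePos = (this.readPos + this.length) % this.buf.length;
      this.buf[writePos] = v;
      if (this.length < this.buf.length) {
        this.length++;
      } else {
        // Buffer full: the oldest sample was overwritten, advance the read head.
        this.readPos = (this.readPos + 1) % this.buf.length;
      }
    }
  }

  /** Pop one frame, or null if fewer than frameSize samples are buffered. */
  popFrame(): Float32Array | null {
    if (this.length < this.frameSize) return null;
    const frame = new Float32Array(this.frameSize);
    for (let i = 0; i < this.frameSize; i++) {
      frame[i] = this.buf[(this.readPos + i) % this.buf.length];
    }
    this.readPos = (this.readPos + this.frameSize) % this.buf.length;
    this.length -= this.frameSize;
    return frame;
  }
}
```

Each `popFrame()` allocates only the frame it returns; nothing is shifted, so the cost per frame is O(frameSize) regardless of how much audio is buffered.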


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/commons-release.yml (1)

331-337: ⚠️ Potential issue | 🟠 Major

Fail the workflow when required .so files are missing.

This verification currently only logs missing files; it should exit non-zero to prevent publishing incomplete artifacts.

🔧 Proposed fix

```diff
-          for lib in libsherpa-onnx-c-api.so libonnxruntime.so; do
+          missing=0
+          for lib in libsherpa-onnx-c-api.so libonnxruntime.so; do
             if [ -f "third_party/sherpa-onnx-android/jniLibs/${{ matrix.abi }}/$lib" ]; then
               echo "✅ Found: $lib"
             else
               echo "❌ Missing: $lib"
+              missing=1
             fi
           done
+          if [ "$missing" -ne 0 ]; then
+            echo "Required Sherpa/ORT artifacts are missing for ABI: ${{ matrix.abi }}"
+            exit 1
+          fi
```
🧹 Nitpick comments (7)
sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts (1)

191-200: Update check_libs_exist() in build-kotlin.sh to validate all ONNX dependencies.

The onnxLibs set in build.gradle.kts includes libonnxruntime.so and libsherpa-onnx-c-api.so, but check_libs_exist() only validates librac_backend_onnx_jni.so. Add checks for the missing libraries to catch incomplete dependency downloads during the pre-build validation step.

sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp (1)

59-61: Make shared_ort_env() return const OrtEnv* to enforce read-only access.

The ORT C API CreateSession expects const OrtEnv*, and the process-wide singleton is never released. Returning a mutable OrtEnv* creates unnecessary risk of accidental ReleaseEnv() calls. Tightening the return type to const OrtEnv* will encode the non-owning, read-only contract in the type system and match ORT's downstream expectations.

Also consider const-qualifying get_ort_env() in ONNXBackendNew (line 209 of onnx_backend.h) for consistency.

sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+Embeddings.ts (2)

101-103: Add defensive check for output tensor existence.

The output[this.outputName] access could return undefined if the model's output name doesn't match expectations. While this.outputName is validated during load(), a defensive check would prevent runtime errors if the model behaves unexpectedly.

🛡️ Suggested improvement

```diff
     const output = await this.session.run(feeds);
-    const lastHidden = output[this.outputName]!;
+    const lastHidden = output[this.outputName];
+    if (!lastHidden) {
+      throw new Error(`EmbeddingsService: expected output "${this.outputName}" not found in model response`);
+    }
     const pooled = this.meanPool(lastHidden, encoded.attentionMask);
```

220-222: Consider validating tensor data type before casting.

The type assertion hidden.data as Float32Array assumes the encoder outputs float32, which is typical but not guaranteed. If a model outputs a different type, this could lead to incorrect embedding values.

🛡️ Optional type check

```diff
+    if (hidden.type !== 'float32') {
+      throw new Error(`EmbeddingsService: expected float32 output, got ${hidden.type}`);
+    }
     const data = hidden.data as Float32Array;
```
sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts (1)

63-95: Document that options only apply on first initialization.

The initialize() method correctly implements idempotency, but callers may not realize that passing different options on subsequent calls has no effect since the first initialization wins. Consider adding a warning log or updating the JSDoc.

📝 Suggested documentation improvement

```diff
   /**
    * Ensure the onnxruntime-web module is loaded and its global env is
-   * configured. Idempotent — subsequent calls resolve with the same module.
+   * configured. Idempotent — subsequent calls resolve with the same module
+   * and ignore any new options (first call's options are authoritative).
    */
   static async initialize(options: ORTRuntimeInitOptions = {}): Promise<typeof ort> {
-    if (this._ort) return this._ort;
+    if (this._ort) {
+      if (Object.keys(options).length > 0) {
+        console.warn('[ORTRuntimeBridge] Already initialized; ignoring options');
+      }
+      return this._ort;
+    }
     if (this._loadPromise) return this._loadPromise;
```
sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts (2)

127-131: Add validation for session input/output names.

The non-null assertions on session.inputNames[0] and session.outputNames[0] assume the classifier model has at least one input and output. If a malformed model is loaded, this would throw an unclear error.

🛡️ Suggested validation

```diff
         return {
           modelId: c.modelId,
           wakeWord: c.wakeWord,
           threshold: c.threshold ?? config.globalThreshold ?? 0.5,
           numEmbeddings: this.resolveClassifierWindow(session),
           session,
-          inputName: session.inputNames[0]!,
-          outputName: session.outputNames[0]!,
+          inputName: session.inputNames[0] ?? (() => {
+            throw new Error(`Classifier ${c.modelId} has no input tensors`);
+          })(),
+          outputName: session.outputNames[0] ?? (() => {
+            throw new Error(`Classifier ${c.modelId} has no output tensors`);
+          })(),
           lastDetectionFrame: -Infinity,
         } satisfies LoadedClassifier;
```

183-186: Consider using a circular buffer for better memory efficiency.

The current approach using splice(0, FRAME_SIZE) creates a new array on each frame extraction and shifts all remaining elements. For continuous wake-word detection, this could cause memory churn. A circular buffer or typed array with index tracking would be more efficient.

However, at 16kHz with 80ms frames (~12.5 frames/second), this is unlikely to cause noticeable issues in practice.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sdk/runanywhere-commons/cmake/LoadVersions.cmake`:
- Around line 61-71: The ONNX pin equality loop (_ONNX_PINS / _ONNX_CANONICAL)
currently allows all-empty pins to pass; update the invariant to first ensure
the canonical pin (RAC_ONNX_VERSION_IOS) and every entry in _ONNX_PINS are
non-empty and then enforce equality. Specifically, check that
RAC_ONNX_VERSION_IOS (used to set _ONNX_CANONICAL) is not empty and add a
non-empty guard for each _pin in the foreach before comparing, and emit a clear
message(FATAL_ERROR) referencing the symbol names (_ONNX_CANONICAL, _ONNX_PINS,
RAC_ONNX_VERSION_IOS/ANDROID/MACOS/LINUX/WINDOWS) when any pin is empty or
mismatched.

In `@sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp`:
- Around line 54-70: Rename the three externally visible helper functions to use
the rac_ prefix and update their declarations and all call sites: change
shared_ort_api -> rac_shared_ort_api, shared_ort_env -> rac_shared_ort_env, and
shared_cxx_env -> rac_shared_cxx_env; keep the internal init logic (g_init_flag,
init_once, g_api, g_cxx_env) intact, update any header prototypes and exported
symbols, and ensure any external linkage/exports (e.g., extern "C" or symbol
export macros) reference the new rac_ names so the public ABI follows the
Commons prefix convention.

In `@sdk/runanywhere-kotlin/docs/KOTLIN_MAVEN_CENTRAL_PUBLISHING.md`:
- Around line 141-143: Update earlier documentation entries that still claim
wildcard Sherpa variants and "6 libs per ABI" to match the later statement "ONNX
module: backend + ORT + sherpa C API only" and the mkdir line for
modules/runanywhere-core-onnx/src/androidMain/jniLibs/$ABI; specifically remove
or replace references to wildcard sherpa artifacts and the six-per-ABI count in
any tables or summary counts so they reflect only the C API artifacts actually
shipped for modules/runanywhere-core-onnx.

In `@sdk/runanywhere-kotlin/scripts/build-kotlin.sh`:
- Around line 402-419: The current directory-level branching prevents
per-library fallbacks: change the logic in the lib copy loop so each lib
(libonnxruntime.so and libsherpa-onnx-c-api.so) is checked first in
"${COMMONS_DIST}/onnx/${ABI}/${lib}" and if missing then checked in
"${SHERPA_ONNX_LIBS}/${ABI}/${lib}" before skipping; use the same copy + log
sequence to copy to "${ONNX_JNILIBS_DIR}/${ABI}/" when either source exists,
keeping the existing variable names (COMMONS_DIST, SHERPA_ONNX_LIBS,
ONNX_JNILIBS_DIR, ABI) and loop variables (lib or lib_name) to locate and modify
the code in build-kotlin.sh.

In `@sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts`:
- Around line 92-96: The tokenizer's vocab build loop in WordPieceTokenizer
currently uses vocab.size to assign IDs, which shifts IDs when the vocab.txt
contains empty lines; instead use the file line index as the token ID. In the
loop that iterates lines (the block that references tok and vocab), replace the
vocab.set(tok, vocab.size) behavior with vocab.set(tok, i) so each token gets
the original line-number ID (skip adding when tok is undefined or empty but
still consume the index i), ensuring the Map<string, number> preserves
HuggingFace line-number-based IDs.
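The line-number-ID behavior described above can be sketched as a hypothetical standalone helper (not the SDK's tokenizer):

```typescript
// Build a WordPiece vocab map where each token's ID is its line index in
// vocab.txt. A blank line consumes an ID but inserts nothing, so IDs of
// later tokens stay aligned with HuggingFace's line-number convention.
function buildVocab(vocabText: string): Map<string, number> {
  const vocab = new Map<string, number>();
  vocabText.split('\n').forEach((line, i) => {
    const tok = line.trim();
    if (tok.length === 0) return; // skip the token, but keep index i consumed
    vocab.set(tok, i);
  });
  return vocab;
}
```

Using `vocab.size` as the ID instead would silently shift every token after the first blank line, producing wrong input IDs for the encoder.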


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b303d4c9-b72a-412a-8740-9e4a0514ddae

📥 Commits

Reviewing files that changed from the base of the PR and between bc7db9b and 30890a6.

⛔ Files ignored due to path filters (1)
  • examples/intellij-plugin-demo/plugin/gradle/wrapper/gradle-wrapper.jar is excluded by !**/*.jar
📒 Files selected for processing (61)
  • .github/workflows/build-all-test.yml
  • .github/workflows/commons-release.yml
  • .gitignore
  • .idea/runConfigurations/09_Build_IntelliJ_Plugin.xml
  • .idea/runConfigurations/10_Run_IntelliJ_Plugin.xml
  • CLAUDE.md
  • Package.swift
  • build.gradle.kts
  • docs/building.md
  • examples/intellij-plugin-demo/plugin/build.gradle.kts
  • examples/intellij-plugin-demo/plugin/gradle/wrapper/gradle-wrapper.properties
  • examples/intellij-plugin-demo/plugin/gradlew
  • examples/intellij-plugin-demo/plugin/gradlew.bat
  • examples/intellij-plugin-demo/plugin/settings.gradle.kts
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/RunAnywherePlugin.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/ModelManagerAction.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceCommandAction.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceDictationAction.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/services/VoiceService.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/toolwindow/STTToolWindow.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/ModelManagerDialog.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/WaveformVisualization.kt
  • examples/intellij-plugin-demo/plugin/src/main/resources/META-INF/plugin.xml
  • sdk/runanywhere-commons/VERSIONS
  • sdk/runanywhere-commons/cmake/FetchONNXRuntime.cmake
  • sdk/runanywhere-commons/cmake/LoadVersions.cmake
  • sdk/runanywhere-commons/include/rac/core/rac_platform_compat.h
  • sdk/runanywhere-commons/scripts/android/download-sherpa-onnx.sh
  • sdk/runanywhere-commons/scripts/build-android.sh
  • sdk/runanywhere-commons/scripts/load-versions.sh
  • sdk/runanywhere-commons/src/backends/onnx/CMakeLists.txt
  • sdk/runanywhere-commons/src/backends/onnx/embeddings/onnx_embedding_provider.cpp
  • sdk/runanywhere-commons/src/backends/onnx/embeddings/onnx_embedding_provider.h
  • sdk/runanywhere-commons/src/backends/onnx/embeddings/rac_onnx_embeddings_register.cpp
  • sdk/runanywhere-commons/src/backends/onnx/onnx_backend.cpp
  • sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp
  • sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.h
  • sdk/runanywhere-commons/src/backends/onnx/wakeword/wakeword_onnx.cpp
  • sdk/runanywhere-commons/src/features/rag/CMakeLists.txt
  • sdk/runanywhere-commons/tests/simple_tokenizer_test.cpp
  • sdk/runanywhere-flutter/scripts/build-flutter.sh
  • sdk/runanywhere-kotlin/docs/KOTLIN_MAVEN_CENTRAL_PUBLISHING.md
  • sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/README.md
  • sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts
  • sdk/runanywhere-kotlin/scripts/build-kotlin.sh
  • sdk/runanywhere-react-native/packages/onnx/android/CMakeLists.txt
  • sdk/runanywhere-react-native/scripts/build-react-native.sh
  • sdk/runanywhere-web/README.md
  • sdk/runanywhere-web/packages/core/README.md
  • sdk/runanywhere-web/packages/onnx/package.json
  • sdk/runanywhere-web/packages/onnx/src/Extensions/EmbeddingsTypes.ts
  • sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+Embeddings.ts
  • sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts
  • sdk/runanywhere-web/packages/onnx/src/Extensions/WakeWordTypes.ts
  • sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts
  • sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts
  • sdk/runanywhere-web/packages/onnx/src/index.ts
  • sdk/runanywhere-web/wasm/CMakeLists.txt
  • sdk/runanywhere-web/wasm/scripts/build.sh
  • sdk/runanywhere-web/wasm/src/wasm_exports.cpp
  • settings.gradle.kts
💤 Files with no reviewable changes (20)
  • examples/intellij-plugin-demo/plugin/gradle/wrapper/gradle-wrapper.properties
  • .idea/runConfigurations/10_Run_IntelliJ_Plugin.xml
  • settings.gradle.kts
  • docs/building.md
  • .gitignore
  • .idea/runConfigurations/09_Build_IntelliJ_Plugin.xml
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceDictationAction.kt
  • examples/intellij-plugin-demo/plugin/gradlew.bat
  • examples/intellij-plugin-demo/plugin/settings.gradle.kts
  • examples/intellij-plugin-demo/plugin/build.gradle.kts
  • .github/workflows/build-all-test.yml
  • examples/intellij-plugin-demo/plugin/gradlew
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/WaveformVisualization.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/ModelManagerAction.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/services/VoiceService.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/RunAnywherePlugin.kt
  • examples/intellij-plugin-demo/plugin/src/main/resources/META-INF/plugin.xml
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/toolwindow/STTToolWindow.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceCommandAction.kt
  • examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/ModelManagerDialog.kt

Comment on lines +61 to +71
set(_ONNX_CANONICAL "${RAC_ONNX_VERSION_IOS}")
foreach(_pin IN LISTS _ONNX_PINS)
    if(NOT "${_pin}" STREQUAL "${_ONNX_CANONICAL}")
        message(FATAL_ERROR
            "ONNX_VERSION_* pins in VERSIONS must all match. "
            "Got: iOS=${RAC_ONNX_VERSION_IOS}, Android=${RAC_ONNX_VERSION_ANDROID}, "
            "macOS=${RAC_ONNX_VERSION_MACOS}, Linux=${RAC_ONNX_VERSION_LINUX}, "
            "Windows=${RAC_ONNX_VERSION_WINDOWS}. "
            "Sherpa-ONNX is the single ORT source of truth — bump in lock-step.")
    endif()
endforeach()
⚠️ Potential issue | 🟠 Major

Require non-empty ONNX pins in the invariant check.

At lines 61-64, the check only validates equality. If all RAC_ONNX_VERSION_* values are empty, the loop still passes and configuration continues with an invalid version state.

Suggested fix
 set(_ONNX_CANONICAL "${RAC_ONNX_VERSION_IOS}")
+if("${_ONNX_CANONICAL}" STREQUAL "")
+    message(FATAL_ERROR "ONNX_VERSION_IOS is missing/empty in VERSIONS")
+endif()
 foreach(_pin IN LISTS _ONNX_PINS)
-    if(NOT "${_pin}" STREQUAL "${_ONNX_CANONICAL}")
+    if("${_pin}" STREQUAL "" OR NOT "${_pin}" STREQUAL "${_ONNX_CANONICAL}")
         message(FATAL_ERROR
             "ONNX_VERSION_* pins in VERSIONS must all match. "
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/cmake/LoadVersions.cmake` around lines 61 - 71, The
ONNX pin equality loop (_ONNX_PINS / _ONNX_CANONICAL) currently allows all-empty
pins to pass; update the invariant to first ensure the canonical pin
(RAC_ONNX_VERSION_IOS) and every entry in _ONNX_PINS are non-empty and then
enforce equality. Specifically, check that RAC_ONNX_VERSION_IOS (used to set
_ONNX_CANONICAL) is not empty and add a non-empty guard for each _pin in the
foreach before comparing, and emit a clear message(FATAL_ERROR) referencing the
symbol names (_ONNX_CANONICAL, _ONNX_PINS,
RAC_ONNX_VERSION_IOS/ANDROID/MACOS/LINUX/WINDOWS) when any pin is empty or
mismatched.

Comment on lines +54 to +70
const OrtApi* shared_ort_api() {
    std::call_once(g_init_flag, init_once);
    return g_api;
}

OrtEnv* shared_ort_env() {
    std::call_once(g_init_flag, init_once);
    return g_cxx_env ? static_cast<OrtEnv*>(*g_cxx_env) : nullptr;
}

Ort::Env& shared_cxx_env() {
    std::call_once(g_init_flag, init_once);
    if (!g_cxx_env) {
        throw std::runtime_error(
            "rac::onnx::shared_cxx_env() failed to initialize Ort::Env");
    }
    return *g_cxx_env;
🛠️ Refactor suggestion | 🟠 Major

Prefix these exported helpers with rac_.

shared_ort_api, shared_ort_env, and shared_cxx_env are externally linked helpers in sdk/runanywhere-commons, so they should follow the Commons symbol-prefix convention before this becomes part of the backend ABI surface.

As per coding guidelines, "All public symbols must be prefixed with rac_ (RunAnywhere Commons)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp` around
lines 54 - 70, Rename the three externally visible helper functions to use the
rac_ prefix and update their declarations and all call sites: change
shared_ort_api -> rac_shared_ort_api, shared_ort_env -> rac_shared_ort_env, and
shared_cxx_env -> rac_shared_cxx_env; keep the internal init logic (g_init_flag,
init_once, g_api, g_cxx_env) intact, update any header prototypes and exported
symbols, and ensure any external linkage/exports (e.g., extern "C" or symbol
export macros) reference the new rac_ names so the public ABI follows the
Commons prefix convention.

Comment on lines +141 to 143
# ONNX module: backend + ORT + sherpa C API only (sherpa-jni / sherpa-cxx-api
# are intentionally not shipped — we use our own JNI and link the C API).
mkdir -p modules/runanywhere-core-onnx/src/androidMain/jniLibs/$ABI
⚠️ Potential issue | 🟡 Minor

Update earlier ONNX counts/descriptions to match this c-api-only statement.

This section is now correct, but it conflicts with earlier table/count entries that still imply wildcard Sherpa variants and 6 libs per ABI.

📝 Suggested doc alignment
-| `io.github.sanchitmonga22:runanywhere-onnx-android` | 6 per ABI | STT/TTS/VAD: `librac_backend_onnx*.so`, `libonnxruntime.so`, `libsherpa-onnx-*.so` |
+| `io.github.sanchitmonga22:runanywhere-onnx-android` | 4 per ABI | STT/TTS/VAD: `librac_backend_onnx.so`, `librac_backend_onnx_jni.so`, `libonnxruntime.so`, `libsherpa-onnx-c-api.so` |

-With 3 ABIs (arm64-v8a, armeabi-v7a, x86_64): SDK=12, LlamaCPP=6, ONNX=18 = **36 total .so files**.
+With 3 ABIs (arm64-v8a, armeabi-v7a, x86_64): SDK=12, LlamaCPP=6, ONNX=12 = **30 total .so files**.

-| ONNX | `RABackendONNX-android-{abi}-v{ver}.zip` | `librac_backend_onnx*.so`, `libonnxruntime.so`, `libsherpa-onnx-*.so` |
+| ONNX | `RABackendONNX-android-{abi}-v{ver}.zip` | `librac_backend_onnx.so`, `librac_backend_onnx_jni.so`, `libonnxruntime.so`, `libsherpa-onnx-c-api.so` |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-kotlin/docs/KOTLIN_MAVEN_CENTRAL_PUBLISHING.md` around lines
141 - 143, Update earlier documentation entries that still claim wildcard Sherpa
variants and "6 libs per ABI" to match the later statement "ONNX module: backend
+ ORT + sherpa C API only" and the mkdir line for
modules/runanywhere-core-onnx/src/androidMain/jniLibs/$ABI; specifically remove
or replace references to wildcard sherpa artifacts and the six-per-ABI count in
any tables or summary counts so they reflect only the C API artifacts actually
shipped for modules/runanywhere-core-onnx.

Comment on lines 402 to 419
         if [ -d "${COMMONS_DIST}/onnx/${ABI}" ]; then
-            for lib in libonnxruntime.so libsherpa-onnx-c-api.so libsherpa-onnx-cxx-api.so libsherpa-onnx-jni.so; do
+            for lib in libonnxruntime.so libsherpa-onnx-c-api.so; do
                 if [ -f "${COMMONS_DIST}/onnx/${ABI}/${lib}" ]; then
                     cp "${COMMONS_DIST}/onnx/${ABI}/${lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
                     log_info "ONNX: ${lib}"
                 fi
             done
         elif [ -d "${SHERPA_ONNX_LIBS}/${ABI}" ]; then
-            for lib in "${SHERPA_ONNX_LIBS}/${ABI}"/*.so; do
-                if [ -f "$lib" ]; then
-                    cp "$lib" "${ONNX_JNILIBS_DIR}/${ABI}/"
-                    log_info "ONNX: $(basename $lib)"
+            # Whitelist which sherpa-bundled .so files to copy (skip sherpa-jni
+            # and sherpa-cxx-api, which we don't use).
+            for lib_name in libonnxruntime.so libsherpa-onnx-c-api.so; do
+                local src_lib="${SHERPA_ONNX_LIBS}/${ABI}/${lib_name}"
+                if [ -f "${src_lib}" ]; then
+                    cp "${src_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
+                    log_info "ONNX: ${lib_name}"
                 fi
             done
         fi
⚠️ Potential issue | 🟠 Major

Per-library fallback is broken by directory-level branching.

At lines 402-419, once ${COMMONS_DIST}/onnx/${ABI} exists, the elif fallback to ${SHERPA_ONNX_LIBS}/${ABI} is never reached, even when a specific lib is missing in dist. That can leave libonnxruntime.so or libsherpa-onnx-c-api.so out of jniLibs.

Suggested fix
-        if [ -d "${COMMONS_DIST}/onnx/${ABI}" ]; then
-            for lib in libonnxruntime.so libsherpa-onnx-c-api.so; do
-                if [ -f "${COMMONS_DIST}/onnx/${ABI}/${lib}" ]; then
-                    cp "${COMMONS_DIST}/onnx/${ABI}/${lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
-                    log_info "ONNX: ${lib}"
-                fi
-            done
-        elif [ -d "${SHERPA_ONNX_LIBS}/${ABI}" ]; then
-            # Whitelist which sherpa-bundled .so files to copy (skip sherpa-jni
-            # and sherpa-cxx-api, which we don't use).
-            for lib_name in libonnxruntime.so libsherpa-onnx-c-api.so; do
-                local src_lib="${SHERPA_ONNX_LIBS}/${ABI}/${lib_name}"
-                if [ -f "${src_lib}" ]; then
-                    cp "${src_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
-                    log_info "ONNX: ${lib_name}"
-                fi
-            done
-        fi
+        for lib_name in libonnxruntime.so libsherpa-onnx-c-api.so; do
+            local dist_lib="${COMMONS_DIST}/onnx/${ABI}/${lib_name}"
+            local sherpa_lib="${SHERPA_ONNX_LIBS}/${ABI}/${lib_name}"
+            if [ -f "${dist_lib}" ]; then
+                cp "${dist_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
+                log_info "ONNX: ${lib_name}"
+            elif [ -f "${sherpa_lib}" ]; then
+                cp "${sherpa_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
+                log_info "ONNX: ${lib_name} (from Sherpa-ONNX)"
+            else
+                log_warn "ONNX: ${lib_name} NOT FOUND for ${ABI}"
+            fi
+        done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-kotlin/scripts/build-kotlin.sh` around lines 402 - 419, The
current directory-level branching prevents per-library fallbacks: change the
logic in the lib copy loop so each lib (libonnxruntime.so and
libsherpa-onnx-c-api.so) is checked first in
"${COMMONS_DIST}/onnx/${ABI}/${lib}" and if missing then checked in
"${SHERPA_ONNX_LIBS}/${ABI}/${lib}" before skipping; use the same copy + log
sequence to copy to "${ONNX_JNILIBS_DIR}/${ABI}/" when either source exists,
keeping the existing variable names (COMMONS_DIST, SHERPA_ONNX_LIBS,
ONNX_JNILIBS_DIR, ABI) and loop variables (lib or lib_name) to locate and modify
the code in build-kotlin.sh.

Comment on lines +92 to +96
  for (let i = 0; i < lines.length; i++) {
    const tok = lines[i];
    if (tok === undefined || tok === '') continue;
    if (!vocab.has(tok)) vocab.set(tok, vocab.size);
  }
⚠️ Potential issue | 🟠 Major

vocab.txt parsing does not preserve line-number-based IDs.

Standard HuggingFace vocab.txt files use line number as token ID (line 0 → ID 0, line 1 → ID 1). This implementation assigns IDs sequentially via vocab.size, which breaks if the file contains empty lines in the middle—tokens after an empty line will have incorrect IDs.

🐛 Proposed fix to use line index as token ID
-  for (let i = 0; i < lines.length; i++) {
-    const tok = lines[i];
-    if (tok === undefined || tok === '') continue;
-    if (!vocab.has(tok)) vocab.set(tok, vocab.size);
+  for (let i = 0; i < lines.length; i++) {
+    const tok = lines[i];
+    if (tok === undefined || tok === '') continue;
+    if (!vocab.has(tok)) vocab.set(tok, i);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts`
around lines 92 - 96, The tokenizer's vocab build loop in WordPieceTokenizer
currently uses vocab.size to assign IDs, which shifts IDs when the vocab.txt
contains empty lines; instead use the file line index as the token ID. In the
loop that iterates lines (the block that references tok and vocab), replace the
vocab.set(tok, vocab.size) behavior with vocab.set(tok, i) so each token gets
the original line-number ID (skip adding when tok is undefined or empty but
still consume the index i), ensuring the Map<string, number> preserves
HuggingFace line-number-based IDs.
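The effect of the line-number fix can be seen on a tiny vocab with an empty middle line. This is a standalone sketch, not the SDK's WordPieceTokenizer; parseVocab is a hypothetical helper that applies the suggested `vocab.set(tok, i)` behavior.

```typescript
// Standalone sketch: parse a vocab.txt body using line numbers as token IDs.
// Illustrates why `vocab.set(tok, i)` differs from `vocab.set(tok, vocab.size)`
// once empty lines appear mid-file.
function parseVocab(text: string): Map<string, number> {
  const vocab = new Map<string, number>();
  const lines = text.split('\n');
  for (let i = 0; i < lines.length; i++) {
    const tok = lines[i];
    if (tok === undefined || tok === '') continue; // skip, but still consume index i
    if (!vocab.has(tok)) vocab.set(tok, i);        // line number = HuggingFace token ID
  }
  return vocab;
}

const v = parseVocab('[PAD]\n\n##ing\nrun');
// '##ing' sits on line 2, so its ID stays 2 even though line 1 is empty;
// the vocab.size variant would have mislabeled it as 1.
```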

sanchitmonga22 and others added 3 commits April 15, 2026 11:55
The first pass of the Android dead-weight removal used
packaging.jniLibs.excludes to strip libsherpa-onnx-jni.so and
libsherpa-onnx-cxx-api.so. Turns out that DSL only takes effect when a
downstream APP packages the AAR — it does NOT strip the .so files from
the library AAR itself during assembleRelease. I verified this by
unzipping runanywhere-core-onnx-release.aar and seeing both unwanted
libs still present in every jni/<abi>/ directory.

Fix: a `stripUnshippedSherpaLibs` Delete task that wipes the two files
from src/androidMain/jniLibs/ BEFORE preBuild / mergeReleaseJniLibFolders
run. Wired via:

  * tasks.matching { name == "downloadJniLibs" } finalizedBy strip
    (so freshly-downloaded files also get filtered, not just pre-
     existing ones from a stale testLocal=true build).
  * tasks.matching { name.contains("merge") &&
                     name.contains("JniLibFolders") } dependsOn strip
  * tasks.matching { name == "preBuild" } dependsOn strip

Verified locally: assembleRelease produces a runanywhere-core-onnx-
release.aar that contains exactly the 4 expected libs per ABI —
libonnxruntime.so + librac_backend_onnx.so + librac_backend_onnx_jni.so
+ libsherpa-onnx-c-api.so. The sherpa jni + cxx-api variants are gone.
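The wiring described above can be sketched in Gradle Kotlin DSL. The task-matching strings ("downloadJniLibs", "preBuild", merge*JniLibFolders) are taken from this commit message; treat the exact names and the jniLibs path as assumptions to verify against the actual module.

```kotlin
// Sketch of the strip task described above (build.gradle.kts fragment).
// Task names and paths come from the commit message, not from inspecting
// the real module; verify before reuse.
val stripUnshippedSherpaLibs by tasks.registering(Delete::class) {
    delete(
        fileTree("src/androidMain/jniLibs") {
            include("**/libsherpa-onnx-jni.so")
            include("**/libsherpa-onnx-cxx-api.so")
        }
    )
}

// Freshly downloaded files get filtered too, not just pre-existing ones
// from a stale testLocal=true build.
tasks.matching { it.name == "downloadJniLibs" }.configureEach {
    finalizedBy(stripUnshippedSherpaLibs)
}
// Run before the .so files are merged into the AAR.
tasks.matching { it.name.contains("merge") && it.name.contains("JniLibFolders") }
    .configureEach { dependsOn(stripUnshippedSherpaLibs) }
tasks.matching { it.name == "preBuild" }.configureEach {
    dependsOn(stripUnshippedSherpaLibs)
}
```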

Also fix a latent version-drift bug in the ONNX download helper scripts
that this PR's validation run surfaced:

  * scripts/macos/download-onnx.sh
  * scripts/ios/download-onnx.sh
  * scripts/linux/download-sherpa-onnx.sh

Previously all three skipped re-download based only on presence of the
binary artifact — so bumping ONNX_VERSION_* or SHERPA_ONNX_VERSION_*
would leave stale third_party/onnxruntime-{ios,macos}/ trees in place
forever, exactly the kind of silent drift the LoadVersions guard was
added to prevent. Each script now stamps a `.version` sentinel after a
successful download and re-downloads on mismatch. Verified on macOS:
nuking onnxruntime-macos and re-running download-onnx.sh replaced the
stale 1.23.2 dylib with 1.17.1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
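The sentinel pattern described above, as a generic shell sketch. DEST_DIR, VERSION, and the download step are placeholders, not the repo's actual script contents.

```shell
#!/usr/bin/env bash
# Sketch of the .version sentinel pattern described above. DEST_DIR, VERSION,
# and download() are stand-ins for the real script's values and curl/tar steps.
set -euo pipefail

VERSION="1.17.1"
DEST_DIR="${1:-third_party/onnxruntime-demo}"
SENTINEL="${DEST_DIR}/.version"

download() {            # stand-in for the real download + extract
    mkdir -p "${DEST_DIR}"
    echo "artifact for ${VERSION}" > "${DEST_DIR}/lib.stamp"
}

if [ -f "${SENTINEL}" ] && [ "$(cat "${SENTINEL}")" = "${VERSION}" ]; then
    echo "up-to-date (${VERSION}), skipping download"
else
    echo "version mismatch or first run: downloading ${VERSION}"
    rm -rf "${DEST_DIR}"                # clear the stale tree so nothing drifts
    download
    echo "${VERSION}" > "${SENTINEL}"   # stamp only after a successful download
fi
```

The key property: the sentinel is written only after success, so a failed download leaves no stamp and the next run retries.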
lifecycle_manager.cpp uses std::condition_variable but only indirectly
includes it via <mutex> on libc++ (AppleClang on iOS / macOS and Android
NDK). GCC 13 with libstdc++ on Linux doesn't transitively include it,
so the Linux commons build fails with:

    error: 'condition_variable' in namespace 'std' does not name a type
       65 |     std::condition_variable service_cv{};

Surfaced while running build-linux.sh inside an ubuntu:24.04 container
as part of PR #479 validation. Pre-existing bug, not regression from
this PR — but Linux is an explicit target per VERSIONS, so fixing here
alongside the ONNX consolidation work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two pre-existing build-ios.sh bugs surfaced while validating PR #479 by
actually building the iOS example app against a fresh commons build:

BUG 1 — simulator xcframework slice missing our backend symbols
---------------------------------------------------------------

The xcframework's ios-arm64_x86_64-simulator libRABackendONNX.a had
rac_backend_onnx_* symbols present only under x86_64, not arm64, which
caused the example app to fail at link time:

    Undefined symbols for architecture arm64:
      "_rac_backend_onnx_unregister", referenced from:
          static ONNXRuntime.ONNX.unregister() -> ()

Root cause: sherpa-onnx.xcframework ships a UNIVERSAL (x86_64+arm64)
libsherpa-onnx.a for the simulator slice. When libtool merges our
x86_64-only librac_backend_onnx.a with sherpa's universal lib inside
the SIMULATOR platform build, the resulting framework binary ends up
reporting both archs to `lipo -archs` — even though its arm64 slice
contains ONLY sherpa's arm64 objects, not our rac_backend_* objects
(which exist only in the x86_64 slice).

The old `if [[ "$SIM_ARCHS" != *"arm64"* ]]` check then short-circuits
the final lipo-create step, leaving the SIMULATORARM64-built arm64
objects (which DO have our symbols) out of the shipped binary.

Fix: always extract the canonical thin arch from each platform's
framework binary (SIMULATORARM64 → arm64-thin, SIMULATOR → x86_64-thin)
and lipo-create a fresh fat binary. Applied in both create_xcframework
(commons) and create_backend_xcframework (backends). After fix, per-arch
nm counts match on all three slices:

    ios-arm64                      : 4 `T _rac_backend_onnx_*`
    ios-arm64_x86_64-simulator/arm64: 4 `T _rac_backend_onnx_*`
    ios-arm64_x86_64-simulator/x86_64: 4 `T _rac_backend_onnx_*`

BUG 2 — Package.swift requires RABackendMetalRT.xcframework to exist
---------------------------------------------------------------------

Package.swift unconditionally declared `RABackendMetalRTBinary` with a
local path under useLocalBinaries=true, so any dev doing
`build-ios.sh --backend onnx` (no MetalRT build) hit:

    local binary target 'RABackendMetalRTBinary' at '...
    RABackendMetalRT.xcframework' does not contain a binary artifact.

Fix: add `metalrtLocalBinaryExists` probe (FileManager.fileExists on the
xcframework path relative to Package.swift) and only declare the binary
target when the framework is actually present. `includeMetalRT` is
AND-gated on the existence check so the product/targets also disappear
— SPM graph resolves cleanly whether or not MetalRT is built locally.

Verified end-to-end: ** BUILD SUCCEEDED ** for the iOS example app
(examples/ios/RunAnywhereAI, xcodebuild -scheme RunAnywhereAI
-destination 'generic/platform=iOS Simulator') against fresh commons
xcframeworks synced via build-swift.sh --local.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
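The existence probe can be sketched as a Package.swift fragment. The binaries/ path and useLocalBinaries handling are assumptions; only the metalrtLocalBinaryExists / includeMetalRT / RABackendMetalRTBinary names come from the commit message.

```swift
// swift-tools-version: 5.9
// Sketch of the conditional binary-target pattern described above; the real
// manifest differs in layout and flag handling.
import Foundation
import PackageDescription

let useLocalBinaries = true

// Probe relative to this manifest: only declare the local binary target
// when the xcframework was actually built (path is illustrative).
let metalrtLocalBinaryExists = FileManager.default.fileExists(
    atPath: "binaries/RABackendMetalRT.xcframework"
)
let includeMetalRT = useLocalBinaries && metalrtLocalBinaryExists

var targets: [Target] = []
if includeMetalRT {
    targets.append(.binaryTarget(
        name: "RABackendMetalRTBinary",
        path: "binaries/RABackendMetalRT.xcframework"
    ))
}

let package = Package(
    name: "RunAnywhere",
    targets: targets
)
```

AND-gating includeMetalRT on the probe means the product and targets disappear together, so SPM resolves whether or not MetalRT was built locally.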
@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh (1)

95-105: ⚠️ Potential issue | 🟠 Major

Fix Sherpa-ONNX URL format: remove -cpu suffix to match v1.12.23+ releases.

The URLs on lines 96 and 99 use sherpa-onnx-v${VERSION}-linux-x64-shared-cpu.tar.bz2, but should use sherpa-onnx-v${VERSION}-linux-x64-shared.tar.bz2 (without the -cpu suffix). Update both the x86_64 and aarch64 URLs to match the current release filename format.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh` around lines
95 - 105, The URL strings built for ARCH values aarch64 and x86_64 include a
trailing "-cpu" that no longer exists in recent Sherpa-ONNX releases; update the
URL assignments for the aarch64 and x86_64 branches (the URL variables set when
ARCH == "aarch64" and ARCH == "x86_64") to use filenames without the "-cpu"
suffix (e.g., change "sherpa-onnx-v${VERSION}-linux-*-shared-cpu.tar.bz2" to
"sherpa-onnx-v${VERSION}-linux-*-shared.tar.bz2"), and verify the corresponding
ARCHIVE_NAME values match the new filename pattern so downloads succeed; keep
the print_error and exit handling as-is.
Package.swift (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Update swift-tools-version to 6.0 to comply with coding guidelines.

The coding guidelines explicitly state "Use the latest Swift 6 APIs always" for Swift files, which applies to Package.swift. Currently, the file specifies swift-tools-version: 5.9. Swift 6.0 is the latest version and should be adopted unless there are specific compatibility constraints in your CI/CD environment or dependency chain that prevent it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Package.swift` at line 1, Update the Package.swift header to declare Swift
tools version 6.0 by changing the swift-tools-version line from 5.9 to 6.0 (the
file's top-line directive "swift-tools-version: 5.9"); after updating, run swift
package resolve and swift build / CI tests to ensure compatibility and adjust
any APIs if the Swift 6 toolchain surfaces warnings or errors in package
manifests or targets.
🧹 Nitpick comments (2)
sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh (1)

76-89: Version mismatch detection proceeds to download without clearing the stale directory.

When a version mismatch is detected (lines 86-89), the script logs a message about "Clearing stale cache and re-downloading…" but doesn't actually clear ${DEST_DIR} here. The clearing happens later at lines 123-126 only if the directory exists, which it does. This works but is misleading — the message implies immediate clearing.

For consistency with the macOS script (which explicitly calls rm -rf "${ONNX_DIR}" at line 54 within the mismatch block), consider restructuring:

♻️ Clearer flow suggestion
     print_step "Sherpa-ONNX version mismatch at ${DEST_DIR}"
     echo "   Found: ${EXISTING:-unknown}, want: ${VERSION}"
     echo "   Clearing stale cache and re-downloading…"
+    rm -rf "${DEST_DIR}"
 fi
-
-# =============================================================================
-# Determine Download URL
-# =============================================================================
-...
-
-# Clean existing directory
-if [ -d "${DEST_DIR}" ]; then
-    print_step "Removing existing Sherpa-ONNX directory..."
-    rm -rf "${DEST_DIR}"
-fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh` around lines
76 - 89, The version-mismatch branch logs "Clearing stale cache…" but doesn't
remove the existing directory immediately; update the block that checks
VERSION_SENTINEL/EXISTING (the if that tests [ -d "${DEST_DIR}/lib" ] && [
"$FORCE_DOWNLOAD" = false ]) to remove the stale files right there by calling rm
-rf on ${DEST_DIR} (or the equivalent ONNX_DIR subpath) before continuing with
download so the message matches the action; ensure you reference DEST_DIR,
VERSION_SENTINEL, VERSION and preserve the FORCE_DOWNLOAD logic and existing
print_step/print_success calls.
sdk/runanywhere-commons/scripts/build-ios.sh (1)

549-569: Consider extracting the fat binary creation logic into a shared function.

The logic for creating fat simulator binaries (lines 549-569) is duplicated nearly verbatim in create_backend_xcframework (lines 803-839). Extracting this into a helper function would reduce duplication and ensure consistent behavior.

Additionally, the temporary thin files (${FRAMEWORK_NAME}-thin-arm64.a, ${FRAMEWORK_NAME}-thin-x86_64.a) are created but never removed. Consider cleaning them up after lipo -create.

♻️ Proposed helper function
# Add this helper function before create_xcframework():
create_fat_simulator_binary() {
    local FRAMEWORK_NAME=$1
    local BUILD_DIR=$2

    local SIM_FAT="${BUILD_DIR}/SIMULATOR"
    local SIM_ARM64_BIN="${BUILD_DIR}/SIMULATORARM64/${FRAMEWORK_NAME}.framework/${FRAMEWORK_NAME}"
    local SIM_X86_BIN="${SIM_FAT}/${FRAMEWORK_NAME}.framework/${FRAMEWORK_NAME}"

    if [[ -f "${SIM_ARM64_BIN}" && -f "${SIM_X86_BIN}" ]]; then
        log_step "Creating fat simulator binary (arm64 + x86_64)..."
        local SIM_ARM64_THIN="${BUILD_DIR}/SIMULATORARM64/${FRAMEWORK_NAME}-thin-arm64.a"
        local SIM_X86_THIN="${SIM_FAT}/${FRAMEWORK_NAME}-thin-x86_64.a"

        if ! lipo -thin arm64 "${SIM_ARM64_BIN}" -output "${SIM_ARM64_THIN}" 2>/dev/null; then
            cp "${SIM_ARM64_BIN}" "${SIM_ARM64_THIN}"
        fi
        if ! lipo -thin x86_64 "${SIM_X86_BIN}" -output "${SIM_X86_THIN}" 2>/dev/null; then
            cp "${SIM_X86_BIN}" "${SIM_X86_THIN}"
        fi

        lipo -create "${SIM_ARM64_THIN}" "${SIM_X86_THIN}" \
            -output "${SIM_FAT}/${FRAMEWORK_NAME}.framework/${FRAMEWORK_NAME}"

        # Clean up temp files
        rm -f "${SIM_ARM64_THIN}" "${SIM_X86_THIN}"
    fi
}

Then replace both duplicate blocks with:

create_fat_simulator_binary "${FRAMEWORK_NAME}" "${BUILD_DIR}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sdk/runanywhere-commons/scripts/build-ios.sh` around lines 549 - 569, The fat
simulator binary creation logic is duplicated; extract it into a helper function
(e.g. create_fat_simulator_binary) that accepts FRAMEWORK_NAME and BUILD_DIR,
moves the existing block that creates SIM_ARM64_BIN/SIM_X86_BIN, runs lipo -thin
for arm64 and x86_64 into SIM_ARM64_THIN and SIM_X86_THIN (falling back to cp on
failure), calls lipo -create to write the combined framework binary, and then
removes the temporary thin files (rm -f "${SIM_ARM64_THIN}" "${SIM_X86_THIN}");
replace the duplicated blocks in the current function and
create_backend_xcframework with a call to create_fat_simulator_binary
"${FRAMEWORK_NAME}" "${BUILD_DIR}" to ensure consistent behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9afa493d-4b6c-48b5-b617-be8f7e348ced

📥 Commits

Reviewing files that changed from the base of the PR and between 30890a6 and 9c83367.

📒 Files selected for processing (7)
  • Package.swift
  • sdk/runanywhere-commons/scripts/build-ios.sh
  • sdk/runanywhere-commons/scripts/ios/download-onnx.sh
  • sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh
  • sdk/runanywhere-commons/scripts/macos/download-onnx.sh
  • sdk/runanywhere-commons/src/core/capabilities/lifecycle_manager.cpp
  • sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts
✅ Files skipped from review due to trivial changes (1)
  • sdk/runanywhere-commons/src/core/capabilities/lifecycle_manager.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts
