feat: consolidate ONNX Runtime × Sherpa-ONNX across all platforms #479

sanchitmonga22 wants to merge 11 commits into main from
Conversation
The embedded sherpa-onnx path in the web SDK's WASM build has never been
functional on the web: rac_backend_onnx requires ONNX Runtime headers
(FetchONNXRuntime.cmake) that have no WebAssembly distribution. The
build only survived because -sERROR_ON_UNDEFINED_SYMBOLS=0 silently
produced a binary with null rac_{tts,vad}_onnx_* exports.
--all-backends already excluded --onnx with a comment documenting this,
but the dead option and its 8 conditional blocks were still wired up in
wasm/CMakeLists.txt, wasm/scripts/build.sh, wasm_exports.cpp, and the
README examples.
STT / TTS / VAD on the web are served by the separate sherpa-onnx.wasm
module built by wasm/scripts/build-sherpa-onnx.sh and loaded lazily by
packages/onnx/src/Foundation/SherpaONNXBridge.ts. That path is unchanged.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…i.so
The Sherpa-ONNX Android prebuilt (v1.12.20) contains four .so files per
ABI: libonnxruntime.so + libsherpa-onnx-c-api.so + libsherpa-onnx-jni.so
+ libsherpa-onnx-cxx-api.so. We only need the first two:
* librac_backend_onnx.so links against the C API (libsherpa-onnx-c-api.so).
* Our own JNI bridge is librac_backend_onnx_jni.so, loaded from Kotlin.
Sherpa-ONNX's libsherpa-onnx-jni.so is never referenced via
System.loadLibrary on the JVM side.
* libsherpa-onnx-cxx-api.so is sherpa's C++ wrapper — our backend
consumes the C API, not the C++ API.
Changes:
* scripts/android/download-sherpa-onnx.sh: strip_unused_sherpa_libs()
runs after download/re-extract, deletes the two unused .so per ABI.
Also fix the "already exists" early-return to check libonnxruntime.so
(the real always-required artifact) instead of the now-stripped
libsherpa-onnx-jni.so, so a trimmed tree doesn't force re-download.
* modules/runanywhere-core-onnx/build.gradle.kts: narrow the download
filter to the 4 libs we ship + add packagingOptions.jniLibs.excludes
as a safety net against stale local copies.
* Propagate the exclusion through the Kotlin / React Native / Flutter
build scripts + the RN onnx CMakeLists (no imports/links of the two
unused sherpa targets).
* Commons-release workflow: verify libsherpa-onnx-c-api.so +
libonnxruntime.so presence instead of the stripped jni.
* Update docs (onnx module README, maven central publishing guide, build
tree summary) to reflect the shipped set.
Savings: ~4.6 MB per ABI × 3 ABIs (arm64-v8a, armeabi-v7a, x86_64) =
~13.8 MB removed from the runanywhere-onnx-android AAR. Zero runtime
impact — neither library was resolved by any consumer.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Retires the IntelliJ/Android Studio plugin demo. Not part of the active
roadmap; keeping it around imposes a CI job, a gradle composite build,
and two run configurations with no ongoing benefit.
Removed:
* examples/intellij-plugin-demo/ (full directory, ~1900 LOC Kotlin +
gradle wrapper)
* settings.gradle.kts includeBuild entry
* build.gradle.kts buildIntellijPlugin + runIntellijPlugin tasks and
the branches of buildAll / cleanAll that drove them
* .github/workflows/build-all-test.yml intellij-plugin job, its
workflow_dispatch input, and the summary row
* .idea/runConfigurations/09_Build_IntelliJ_Plugin.xml +
10_Run_IntelliJ_Plugin.xml
* docs/building.md IntelliJ Plugin section + output-table row
* CLAUDE.md project description
* .gitignore entry for plugin .idea/
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rift
Sherpa-ONNX is our only ORT-consuming backend; its prebuilt artifacts
assume a specific ONNX Runtime version. On iOS the sherpa-onnx.xcframework
leaves ORT symbols undefined and expects them to be resolved against the
separate pod-archive-onnxruntime-c-<version>.zip. On Android, macOS, Linux,
and Windows sherpa's distribution bundles a compatible libonnxruntime
alongside its own libs, so we consume ORT from sherpa's tree.
Prior state had ONNX_VERSION_IOS=ONNX_VERSION_ANDROID=1.17.1 (sherpa's
actual version) but ONNX_VERSION_MACOS=ONNX_VERSION_LINUX=
ONNX_VERSION_WINDOWS=1.23.2 — a silent drift. On macOS/Linux/Windows the
separate ORT fetch would land at 1.23.2 while sherpa expected 1.17.1, and
nothing flagged it. First Ort:: call after sherpa loaded would hit a
missing symbol if the ABI had drifted.
Changes:
* VERSIONS: all ONNX_VERSION_* pinned to 1.17.1 (what sherpa
1.12.18/1.12.20/1.12.23 expects), with documentation explaining that
sherpa is the single source of truth.
* load-versions.sh + LoadVersions.cmake: hard-error when the five
ONNX_VERSION_* pins don't all match. Makes drift a loud build failure
rather than a silent runtime crash.
* FetchONNXRuntime.cmake: top-of-file docstring stating the per-platform
sourcing strategy (Android → from sherpa; WASM → interface; iOS →
separate download pinned to sherpa version; macOS/Linux/Windows →
separate fetch, also pinned).
* Package.swift: document why RABackendONNX.xcframework and
onnxruntime-{ios,macos}.xcframework must ship together (sherpa's
undefined ORT symbols + our raw-ORT wake-word and embeddings code).
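The drift check added in load-versions.sh and LoadVersions.cmake reduces to a single invariant: every ONNX_VERSION_* pin must be identical. A minimal TypeScript sketch of that logic (illustrative only; the function name is an assumption, and the real checks are shell/CMake):

```typescript
// Illustrative re-statement of the pin-alignment invariant enforced by
// load-versions.sh / LoadVersions.cmake. Returns the single agreed version,
// or throws loudly when any platform pin has drifted.
function assertOnnxPinsAligned(pins: Record<string, string>): string {
  const versions = new Set(Object.values(pins));
  if (versions.size !== 1) {
    throw new Error(
      `ONNX_VERSION_* pins drifted: ${JSON.stringify(pins)} - ` +
      'all platforms must match the version sherpa-onnx bundles',
    );
  }
  return [...versions][0]!;
}
```

This turns the previous silent 1.17.1-vs-1.23.2 divergence into a hard build failure.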
Zero runtime change on iOS and Android (versions were already aligned).
macOS/Linux/Windows now fetch ORT 1.17.1 instead of 1.23.2, matching
sherpa — closing a latent version-drift bug.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ddings
Previously three call sites each constructed their own OrtEnv:
* src/backends/onnx/onnx_backend.cpp — "runanywhere"
* src/backends/onnx/wakeword_onnx.cpp — "WakeWord"
* src/features/rag/onnx_embedding_provider.cpp — "RAGEmbedding"
ORT allows multiple envs but each spins up its own logger, thread-pool
scaffolding, and arena allocator — pure overhead when every consumer
runs in the same process. This adds shared/rac_ort_env.{h,cpp} and
migrates the three call sites to consume it.
Design:
* One lazily-initialized Ort::Env (via std::call_once) with log name
"RunAnywhere" and WARNING-level ORT logging.
* The env is heap-allocated and intentionally never destroyed — it
outlives every rac_backend_onnx / wakeword / embedding instance, so
the dangling-ref-at-shutdown pitfall of static-duration env
destructors is avoided.
* Three accessors: shared_ort_api() → const OrtApi*, shared_ort_env()
→ OrtEnv*, shared_cxx_env() → Ort::Env&. C-API call sites and C++
call sites both reach the same underlying env without double-wrap.
Call-site migrations:
* onnx_backend.cpp: initialize_ort() pulls api + env from the shared
singleton; cleanup() clears pointers but does NOT ReleaseEnv.
* wakeword_onnx.cpp: drops the unique_ptr<Ort::Env> member from the
backend struct, touches shared_cxx_env() during create() so we fail
early if the singleton can't initialize, and passes the shared env
reference to every Ort::Session ctor. Adds std::runtime_error catch
for the initialization-failure path.
* onnx_embedding_provider.cpp: pulls api + env from the shared
singleton; cleanup() clears pointers but does NOT ReleaseEnv.
Zero remaining CreateEnv / ReleaseEnv calls under src/. Guarded by
RAC_HAS_ONNX so WASM (which doesn't compile rac_backend_onnx) is
unaffected.
Note: the further directory reshape under src/backends/onnx/ (sherpa/,
wakeword/, embeddings/ subfolders; moving onnx_embedding_provider out
of src/features/rag/) is deferred — pure cosmetic move that needs
validation across 5 native platforms' CMake include paths, better left
to a follow-up once CI can cycle per platform.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces a second WASM runtime alongside sherpa-onnx.wasm — Microsoft's
onnxruntime-web (~2 MB) — to run arbitrary ONNX models that sherpa's C API
doesn't expose. Wake-word (openWakeWord 3-stage pipeline) and RAG
embeddings (BERT-style encoders) both fall in that bucket because sherpa
is a speech-specific library, not a generic ONNX runtime.
Design mirrors the native side:
sherpa-onnx.wasm → STT / TTS / VAD (unchanged)
onnxruntime-web → wake-word, embeddings, future direct-ORT
features
Both WASM modules are lazy-loaded, so apps that only use one feature set
don't pay the download cost for the other.
What this commit adds:
Foundation/ORTRuntimeBridge.ts
Singleton wrapper over onnxruntime-web. Initializes ort.env once
(thread count, wasmPaths, log severity), exposes createSession().
Extensions/WakeWordTypes.ts + RunAnywhere+WakeWord.ts
Types match native rac_wakeword_onnx_config_t so the same
openWakeWord .onnx models work cross-platform. load() loads
Stage 1 (melspec) + Stage 2 (embedding) + N classifier sessions
in parallel. feed() is stubbed with a descriptive error — the
feed-forward port from native's process_audio_frame() is a
separate ML-engineering PR.
Extensions/EmbeddingsTypes.ts + RunAnywhere+Embeddings.ts
Matches native onnx_embedding_provider's layout (input_ids /
attention_mask / token_type_ids; last_hidden_state → mean-pool).
load() wires up the session; embed()/embedBatch() stubbed pending
WordPiece / SentencePiece tokenizer integration (follow-up PR).
package.json
Adds onnxruntime-web ^1.17.1 as a runtime dependency (version
aligned with the native ORT pin in commons/VERSIONS so the same
ORT release is used everywhere).
index.ts
Exports WakeWord, WakeWordService, Embeddings, EmbeddingsService,
ORTRuntimeBridge, plus the new config / result types.
The API surface is the goal of this commit — the inference math lands
when product work on wake-word / RAG kicks off. Until then, calling
feed() or embed() throws a descriptive "not yet implemented" error, not
a cryptic runtime crash.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the "not yet implemented" stubs with full TypeScript ports of
the native logic.
WAKE-WORD (Extensions/RunAnywhere+WakeWord.ts)
----------------------------------------------
Full openWakeWord 3-stage feed-forward pipeline, mirrored from
sdk/runanywhere-commons/src/backends/onnx/wakeword_onnx.cpp:
* 1280-sample audio framing (80 ms @ 16 kHz), with a 480-sample
context overlap fed alongside each new frame so mel-spec boundary
frames match Python openWakeWord exactly.
* Stage 1: melspectrogram.onnx → (v / 10) + 2 post-transform applied.
* Stage 2: sliding 76-frame windows, stride 8, through the embedding
model. 96-dim output, zero-padded if the model emits fewer dims.
* Stage 3: per-classifier `[1, N, 96]` input, N auto-discovered from
the model's input metadata (fallback 16). Per-classifier threshold
+ optional cooldown-frames gating to prevent repeated fires on a
single utterance.
* Buffers pre-filled with 76 ones(32) mel frames at load() / reset()
so the first embedding fires immediately without ~1 s warm-up,
matching the Python reference.
* feed() accepts Float32Array or Int16Array — converts int16 → float32
inline [-1, 1].
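The framing above can be sketched as a standalone generator (illustrative, not the SDK source; the constants match the commit message: 1280-sample frames, 480-sample carried context, zero context before the first frame):

```typescript
// Sketch of 80 ms framing with boundary context, as described above.
// Each yielded window is [480 context samples | 1280 new samples], so the
// melspec model sees overlapping audio at frame boundaries.
const FRAME_SIZE = 1280;   // 80 ms at 16 kHz
const CONTEXT_SIZE = 480;  // overlap carried from the previous frame

function* frames(samples: Float32Array): Generator<Float32Array> {
  const context = new Float32Array(CONTEXT_SIZE); // zeros before first frame
  for (let off = 0; off + FRAME_SIZE <= samples.length; off += FRAME_SIZE) {
    const windowed = new Float32Array(CONTEXT_SIZE + FRAME_SIZE);
    windowed.set(context, 0);
    windowed.set(samples.subarray(off, off + FRAME_SIZE), CONTEXT_SIZE);
    // Carry the tail of this frame as context for the next one.
    context.set(samples.subarray(off + FRAME_SIZE - CONTEXT_SIZE, off + FRAME_SIZE));
    yield windowed;
  }
}
```

In the real pipeline each yielded window would be fed to the Stage 1 melspectrogram session.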
EMBEDDINGS (Extensions/RunAnywhere+Embeddings.ts + Foundation/WordPieceTokenizer.ts)
------------------------------------------------------------------------------------
Full HuggingFace BERT-compatible WordPiece tokenizer (~230 LOC, matches
`transformers.BertTokenizer(do_lower_case=True, strip_accents=True)`) +
encoder inference pipeline:
* WordPieceTokenizer loads either `vocab.txt` (one token per line, id
= line index) or a HuggingFace `tokenizer.json` blob, resolving the
four special tokens ([CLS] [SEP] [UNK] [PAD]) up-front.
* BasicTokenizer: control-char strip, NFD normalize, optional
lowercase + accent strip, whitespace + punctuation splitting.
* WordPiece: greedy longest-prefix-first with `##` continuation; emits
[UNK] when the first prefix doesn't resolve.
* Encoder wiring introspects the model's input names and maps
BERT-standard aliases (input_ids / attention_mask / token_type_ids)
so the same code handles HF-exported and ONNX-optimized encoders.
* Mean-pool along the sequence dimension weighted by attention_mask,
optional L2 normalize (default ON) for cosine-similarity search.
* `embed()` handles the [1, N, D] or [N, D] output shape variants.
* Dependency: onnxruntime-web 1.24+ (latest stable — ORT-web is a
separate runtime from sherpa's bundled ORT, so no version-pin
coupling to SHERPA_ONNX_VERSION_*).
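The two core routines above — greedy longest-prefix WordPiece and attention-masked mean-pooling with L2 normalization — can be sketched roughly as follows. These are hypothetical standalone functions, not the shipped code; the real WordPieceTokenizer also handles vocab loading, special tokens, and basic tokenization:

```typescript
// Greedy longest-prefix-first WordPiece with `##` continuation; emits [UNK]
// for the whole word when any position fails to resolve.
function wordPiece(word: string, vocab: Set<string>, unk = '[UNK]'): string[] {
  const pieces: string[] = [];
  let start = 0;
  while (start < word.length) {
    let end = word.length;
    let piece: string | null = null;
    while (start < end) {
      let candidate = word.slice(start, end);
      if (start > 0) candidate = '##' + candidate; // continuation marker
      if (vocab.has(candidate)) { piece = candidate; break; }
      end -= 1;
    }
    if (piece === null) return [unk]; // prefix didn't resolve
    pieces.push(piece);
    start = end;
  }
  return pieces;
}

// Mean-pool hidden states along the sequence, weighted by attention_mask,
// then L2-normalize for cosine-similarity search.
function meanPool(hidden: Float32Array, mask: number[], seqLen: number, dim: number): Float32Array {
  const out = new Float32Array(dim);
  let count = 0;
  for (let t = 0; t < seqLen; t++) {
    if (!mask[t]) continue;
    count++;
    for (let d = 0; d < dim; d++) out[d] += hidden[t * dim + d];
  }
  for (let d = 0; d < dim; d++) out[d] /= Math.max(count, 1);
  let norm = 0;
  for (let d = 0; d < dim; d++) norm += out[d] * out[d];
  norm = Math.sqrt(norm) || 1;
  for (let d = 0; d < dim; d++) out[d] /= norm;
  return out;
}
```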
Everything typechecks green under `npm run typecheck` across the three
web packages (core / llamacpp / onnx).
Apps that want to start using wake-word or embeddings today:

```ts
import { WakeWord, Embeddings } from '@runanywhere/web-onnx';

await WakeWord.load({
  shared: { melspectrogramModel: '...', embeddingModel: '...' },
  classifiers: [{ modelId: 'hey_jarvis', wakeWord: 'Hey Jarvis',
                  classifierModel: '...' }],
});
WakeWord.setCallback(({ wakeWord, score }) => ...);
await WakeWord.feed(pcm16kMonoFloat32Samples);

await Embeddings.load({
  model: 'https://.../model.onnx',
  tokenizer: 'https://.../vocab.txt', // or tokenizer.json
});
const { vector } = await Embeddings.embed('search query text');
```
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Groups the ONNX backend code by what it does, not by history:
src/backends/onnx/
├── shared/ ← shared/rac_ort_env.{h,cpp} (one OrtEnv for all)
├── wakeword/ ← wakeword/wakeword_onnx.cpp (raw Ort::Session, openWakeWord)
├── embeddings/ ← onnx_embedding_provider.{h,cpp}
│ rac_onnx_embeddings_register.cpp (raw ORT C API, BERT for RAG)
├── onnx_backend.{h,cpp} (sherpa-backed STT/TTS/VAD)
├── rac_onnx.cpp
└── rac_backend_onnx_register.cpp
Moves:
src/backends/onnx/wakeword_onnx.cpp
→ src/backends/onnx/wakeword/wakeword_onnx.cpp
src/features/rag/onnx_embedding_provider.{cpp,h}
src/features/rag/rac_onnx_embeddings_register.cpp
→ src/backends/onnx/embeddings/
The embedding-provider move resolves the cross-directory compile hack in
src/backends/onnx/CMakeLists.txt that previously reached into
src/features/rag/ to pull the .cpp files (for the
rac_commons → rac_backend_rag → rac_backend_onnx → rac_commons cycle
workaround). The workaround is still necessary architecturally, but now
the files live where they logically belong; CMakeLists uses a simple
`if(RAC_BACKEND_RAG)` list-append against local paths.
CMakeLists updates:
* src/backends/onnx/CMakeLists.txt
- wakeword_onnx.cpp → wakeword/wakeword_onnx.cpp
- RAG_DIR cross-dir references removed
- embeddings/*.cpp + embeddings/onnx_embedding_provider.h listed
locally under RAC_BACKEND_RAG gate
- target_include_directories(... PRIVATE embeddings/) so
rac_onnx_embeddings_register.cpp's flat
`#include "onnx_embedding_provider.h"` still resolves
* src/features/rag/CMakeLists.txt
- cross-compile comment updated to point at the new location
Include-path fixups inside the moved files:
* embeddings/onnx_embedding_provider.cpp
- `../../backends/onnx/onnx_backend.h` → `../onnx_backend.h`
- `../../backends/onnx/shared/rac_ort_env.h` → `../shared/rac_ort_env.h`
* wakeword/wakeword_onnx.cpp
- `shared/rac_ort_env.h` → `../shared/rac_ort_env.h`
Doc references updated:
* include/rac/core/rac_platform_compat.h (call-site list)
* tests/simple_tokenizer_test.cpp (comment)
* web/onnx TS file headers (native-path citations)
Git tracks all four file moves as renames, so blame / history is
preserved.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough

This pull request removes the IntelliJ plugin demo and its CI/build artifacts, centralizes ONNX Runtime into a process-wide shared environment, strips unused Sherpa-ONNX JNI/C++ libraries from mobile packaging, adds wake-word detection and text embeddings to the Web SDK using onnxruntime-web, decouples Sherpa-ONNX from the main WASM artifact, and enforces identical ONNX version pins across platforms.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Browser as Client (Web)
    participant WakeWordSvc as WakeWordService
    participant ORTBridge as ORTRuntimeBridge
    participant ORTRuntime as onnxruntime-web
    participant Classifier as Classifier Session
    Browser->>WakeWordSvc: feed(audio)
    WakeWordSvc->>WakeWordSvc: frame/align & melspec
    WakeWordSvc->>ORTBridge: ensure initialized
    ORTBridge->>ORTRuntime: import & configure
    ORTRuntime-->>ORTBridge: ready
    ORTBridge-->>WakeWordSvc: ort API/session factory
    WakeWordSvc->>ORTRuntime: run melspec session
    ORTRuntime-->>WakeWordSvc: mel output
    WakeWordSvc->>ORTRuntime: run embedding session
    ORTRuntime-->>WakeWordSvc: embedding vector
    loop per classifier
        WakeWordSvc->>Classifier: run classifier session
        Classifier-->>WakeWordSvc: score
        alt score > threshold
            WakeWordSvc-->>Browser: emit WakeWordDetection
        end
    end
```
```mermaid
sequenceDiagram
    participant App as Client
    participant EmbSvc as EmbeddingsService
    participant Tokenizer as WordPieceTokenizer
    participant ORTBridge as ORTRuntimeBridge
    participant Encoder as Encoder Session
    App->>EmbSvc: embed(text)
    EmbSvc->>Tokenizer: encode(text)
    Tokenizer-->>EmbSvc: inputIds, attentionMask
    EmbSvc->>ORTBridge: ensure initialized & createSession(encoder)
    ORTBridge->>Encoder: create session
    Encoder-->>EmbSvc: session created
    EmbSvc->>Encoder: session.run(inputs)
    Encoder-->>EmbSvc: last_hidden_state
    EmbSvc->>EmbSvc: mean-pool & optional L2 normalize
    EmbSvc-->>App: EmbeddingResult(vector, dim, tokenCount)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (warning)
```ts
static async initialize(options: ORTRuntimeInitOptions = {}): Promise<typeof ort> {
  if (this._ort) return this._ort;
  if (this._loadPromise) return this._loadPromise;

  this._loadPromise = (async () => {
    const mod = await import('onnxruntime-web');

    // Configure the shared ort.env before the first session is created.
    if (options.wasmPaths !== undefined) {
      mod.env.wasm.wasmPaths = options.wasmPaths as
        | string
        | Record<string, string>;
    }

    const threads = options.numThreads ?? Math.min(
      typeof navigator !== 'undefined' ? navigator.hardwareConcurrency ?? 1 : 1,
      4,
    );
    mod.env.wasm.numThreads = threads;

    if (options.logSeverityLevel !== undefined) {
      mod.env.logLevel = (
        ['verbose', 'info', 'warning', 'error', 'fatal'] as const
      )[options.logSeverityLevel];
    }

    this._ort = mod;
    this._initialized = true;
    return mod;
  })();

  return this._loadPromise;
}
```
**Rejected load promise cached permanently** (`sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts`, lines 63–95)
If import('onnxruntime-web') rejects (WASM asset missing, CDN unreachable, misconfigured wasmPaths), _loadPromise is left pointing at a rejected promise. Every subsequent call to initialize() hits if (this._loadPromise) return this._loadPromise and returns the same rejection without ever retrying, making the bridge permanently unusable — the only escape is the test-only _resetForTests().
Clear _loadPromise in a catch so callers can retry after fixing the configuration.
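A minimal sketch of the suggested fix, using a hypothetical generic loader rather than the actual ORTRuntimeBridge: the catch clears the cached promise, so a later initialize() retries the import instead of replaying the old rejection.

```typescript
// Retry-safe lazy initialization: a failed load is not cached.
class LazyLoader<T> {
  private cached: Promise<T> | null = null;

  constructor(private readonly load: () => Promise<T>) {}

  initialize(): Promise<T> {
    if (!this.cached) {
      this.cached = this.load().catch((err) => {
        this.cached = null; // allow retry after a failed import
        throw err;
      });
    }
    return this.cached;
  }
}
```

Successful loads are still memoized, so concurrent callers share one in-flight import.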
```cpp
/// Returns the shared Ort::Env (C++ API). Thread-safe; lazy-initialized.
/// Throws Ort::Exception if ORT could not create the env.
Ort::Env& shared_cxx_env();
```
**Doc says `Ort::Exception`; implementation throws `std::runtime_error`** (`sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.h`, lines 38–41)

The header documents `shared_cxx_env()` as "Throws `Ort::Exception` if ORT could not create the env", but `rac_ort_env.cpp` throws `std::runtime_error`. All current call sites already catch both types, so there's no runtime issue — but the doc will mislead future callers who only guard against `Ort::Exception`. Align the two: update the doc comment to name `std::runtime_error`, or throw `Ort::Exception` from the implementation.
```ts
async feed(samples: Float32Array | Int16Array): Promise<void> {
  if (!this._isReady) {
    throw new Error('WakeWordService.load() must complete before feed().');
  }

  // Normalize int16 → float32 into the buffer.
  if (samples instanceof Int16Array) {
    for (let i = 0; i < samples.length; i++) {
      this.audioBuffer.push(samples[i]! / 32768);
    }
  } else {
    for (let i = 0; i < samples.length; i++) {
      this.audioBuffer.push(samples[i]!);
    }
  }

  while (this.audioBuffer.length >= FRAME_SIZE) {
    const frame = Float32Array.from(this.audioBuffer.splice(0, FRAME_SIZE));
    await this.processFrame(frame);
  }
```
**`audioBuffer: number[]` causes GC pressure at real-time audio rates** (`sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts`, lines 167–186)

Each `feed()` call pushes samples one-by-one into a plain `number[]`, then `splice(0, 1280)` removes the front chunk — an O(n) shift at ~12 frames/second. A `Float32Array` with a write-head cursor would eliminate the boxing and the O(n) splice.
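One possible shape for that fix (a hypothetical SampleFifo, not code from this PR): a growable Float32Array with read/write cursors, compacted or regrown only when the write head reaches the end, so draining a frame never shifts the whole buffer.

```typescript
// Float32Array-backed FIFO with a read cursor: push() converts int16 on the
// fly, shift() slices a frame and advances the cursor in O(frame) time.
class SampleFifo {
  private buf = new Float32Array(16384);
  private start = 0; // read cursor
  private end = 0;   // write cursor

  get length(): number { return this.end - this.start; }

  push(samples: Float32Array | Int16Array): void {
    if (this.end + samples.length > this.buf.length) {
      // Compact live data to the front, growing the backing store if needed.
      const live = this.buf.subarray(this.start, this.end);
      if (live.length + samples.length > this.buf.length) {
        const grown = new Float32Array(2 * (live.length + samples.length));
        grown.set(live);
        this.buf = grown;
      } else {
        this.buf.copyWithin(0, this.start, this.end);
      }
      this.end = live.length;
      this.start = 0;
    }
    if (samples instanceof Int16Array) {
      for (let i = 0; i < samples.length; i++) {
        this.buf[this.end + i] = samples[i] / 32768; // int16 → [-1, 1]
      }
    } else {
      this.buf.set(samples, this.end);
    }
    this.end += samples.length;
  }

  shift(n: number): Float32Array | null {
    if (this.length < n) return null;
    const frame = this.buf.slice(this.start, this.start + n);
    this.start += n;
    return frame;
  }
}
```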
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/commons-release.yml (1)
Lines 331–337: ⚠️ Potential issue | 🟠 Major

**Fail the workflow when required `.so` files are missing.**

This verification currently only logs missing files; it should exit non-zero to prevent publishing incomplete artifacts.

🔧 Proposed fix

```diff
-for lib in libsherpa-onnx-c-api.so libonnxruntime.so; do
+missing=0
+for lib in libsherpa-onnx-c-api.so libonnxruntime.so; do
   if [ -f "third_party/sherpa-onnx-android/jniLibs/${{ matrix.abi }}/$lib" ]; then
     echo "✅ Found: $lib"
   else
     echo "❌ Missing: $lib"
+    missing=1
   fi
 done
+if [ "$missing" -ne 0 ]; then
+  echo "Required Sherpa/ORT artifacts are missing for ABI: ${{ matrix.abi }}"
+  exit 1
+fi
```
🧹 Nitpick comments (7)
sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts (1)
Lines 191–200: **Update `check_libs_exist()` in build-kotlin.sh to validate all ONNX dependencies.**

The `onnxLibs` set in build.gradle.kts includes `libonnxruntime.so` and `libsherpa-onnx-c-api.so`, but `check_libs_exist()` only validates `librac_backend_onnx_jni.so`. Add checks for the missing libraries to catch incomplete dependency downloads during the pre-build validation step.

sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp (1)
Lines 59–61: **Make `shared_ort_env()` return `const OrtEnv*` to enforce read-only access.**

The ORT C API `CreateSession` expects `const OrtEnv*`, and the process-wide singleton is never released. Returning a mutable `OrtEnv*` creates unnecessary risk of accidental `ReleaseEnv()` calls. Tightening the return type to `const OrtEnv*` encodes the non-owning, read-only contract in the type system and matches ORT's downstream expectations. Also consider const-qualifying `get_ort_env()` in `ONNXBackendNew` (line 209 of onnx_backend.h) for consistency.

sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+Embeddings.ts (2)
Lines 101–103: **Add a defensive check for output-tensor existence.**

The `output[this.outputName]` access could return undefined if the model's output name doesn't match expectations. While `this.outputName` is validated during `load()`, a defensive check would prevent runtime errors if the model behaves unexpectedly.

🛡️ Suggested improvement

```diff
 const output = await this.session.run(feeds);
-const lastHidden = output[this.outputName]!;
+const lastHidden = output[this.outputName];
+if (!lastHidden) {
+  throw new Error(`EmbeddingsService: expected output "${this.outputName}" not found in model response`);
+}
 const pooled = this.meanPool(lastHidden, encoded.attentionMask);
```
Lines 220–222: **Consider validating the tensor data type before casting.**

The type assertion `hidden.data as Float32Array` assumes the encoder outputs float32, which is typical but not guaranteed. If a model outputs a different type, this could lead to incorrect embedding values.

🛡️ Optional type check

```diff
+if (hidden.type !== 'float32') {
+  throw new Error(`EmbeddingsService: expected float32 output, got ${hidden.type}`);
+}
 const data = hidden.data as Float32Array;
```

sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts (1)
Lines 63–95: **Document that options only apply on first initialization.**

The `initialize()` method correctly implements idempotency, but callers may not realize that passing different `options` on subsequent calls has no effect, since the first initialization wins. Consider adding a warning log or updating the JSDoc.

📝 Suggested documentation improvement

```diff
 /**
  * Ensure the onnxruntime-web module is loaded and its global env is
- * configured. Idempotent — subsequent calls resolve with the same module.
+ * configured. Idempotent — subsequent calls resolve with the same module
+ * and ignore any new options (first call's options are authoritative).
  */
 static async initialize(options: ORTRuntimeInitOptions = {}): Promise<typeof ort> {
-  if (this._ort) return this._ort;
+  if (this._ort) {
+    if (Object.keys(options).length > 0) {
+      console.warn('[ORTRuntimeBridge] Already initialized; ignoring options');
+    }
+    return this._ort;
+  }
   if (this._loadPromise) return this._loadPromise;
```

sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts (2)
127-131: Add validation for session input/output names.The non-null assertions on
session.inputNames[0]andsession.outputNames[0]assume the classifier model has at least one input and output. If a malformed model is loaded, this would throw an unclear error.🛡️ Suggested validation
return { modelId: c.modelId, wakeWord: c.wakeWord, threshold: c.threshold ?? config.globalThreshold ?? 0.5, numEmbeddings: this.resolveClassifierWindow(session), session, - inputName: session.inputNames[0]!, - outputName: session.outputNames[0]!, + inputName: session.inputNames[0] ?? (() => { + throw new Error(`Classifier ${c.modelId} has no input tensors`); + })(), + outputName: session.outputNames[0] ?? (() => { + throw new Error(`Classifier ${c.modelId} has no output tensors`); + })(), lastDetectionFrame: -Infinity, } satisfies LoadedClassifier;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts` around lines 127 - 131, The code uses non-null assertions for session.inputNames[0] and session.outputNames[0] when creating a LoadedClassifier, which will throw unclear errors for malformed models; update the code that constructs the LoadedClassifier (the object with keys session, inputName, outputName, lastDetectionFrame) to explicitly validate that session.inputNames and session.outputNames are arrays with length > 0 before accessing index 0, and if validation fails throw or return a clear, descriptive error (e.g., "model missing inputNames" / "model missing outputNames") or handle the fallback case accordingly so the LoadedClassifier creation is safe.
183-186: Consider using a circular buffer for better memory efficiency.
The current approach using `splice(0, FRAME_SIZE)` creates a new array on each frame extraction and shifts all remaining elements. For continuous wake-word detection, this could cause memory churn. A circular buffer or typed array with index tracking would be more efficient.
However, at 16kHz with 80ms frames (~12.5 frames/second), this is unlikely to cause noticeable issues in practice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts` around lines 183 - 186, The current loop in RunAnywhere+WakeWord.ts that extracts frames using this.audioBuffer.splice(0, FRAME_SIZE) is inefficient; replace the growing JS array with a fixed-size circular buffer (e.g., a Float32Array) and maintain read/write indices so frames are read by slicing from the typed array with wrap-around logic instead of splicing; update the producer to write into the buffer and the consumer loop (the code that calls await this.processFrame(frame)) to assemble a FRAME_SIZE Float32Array from the circular buffer using the indices, advance the read index by FRAME_SIZE, and handle buffer-full/overflow conditions to avoid shifting or allocations on each frame.
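The splice-free approach the comment describes can be sketched as a small ring buffer. This is an illustrative sketch under the comment's assumptions, not the SDK's actual implementation — `AudioRingBuffer`, its capacity, and the frame handling are hypothetical names:

```typescript
// Minimal fixed-capacity ring buffer for PCM samples (illustrative sketch).
class AudioRingBuffer {
  private buf: Float32Array;
  private readIdx = 0;
  private writeIdx = 0;
  private count = 0;

  constructor(capacity: number) {
    this.buf = new Float32Array(capacity);
  }

  // Append samples, overwriting the oldest data on overflow instead of growing.
  write(samples: Float32Array): void {
    for (const s of samples) {
      this.buf[this.writeIdx] = s;
      this.writeIdx = (this.writeIdx + 1) % this.buf.length;
      if (this.count === this.buf.length) {
        // Buffer was full: the oldest sample was just overwritten.
        this.readIdx = (this.readIdx + 1) % this.buf.length;
      } else {
        this.count++;
      }
    }
  }

  // Copy out one frame with wrap-around, or null if not enough samples yet.
  readFrame(frameSize: number): Float32Array | null {
    if (this.count < frameSize) return null;
    const frame = new Float32Array(frameSize);
    for (let i = 0; i < frameSize; i++) {
      frame[i] = this.buf[(this.readIdx + i) % this.buf.length];
    }
    this.readIdx = (this.readIdx + frameSize) % this.buf.length;
    this.count -= frameSize;
    return frame;
  }
}
```

The consumer loop would then call `readFrame(FRAME_SIZE)` until it returns null, with no per-frame shifting of the backing storage.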
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@sdk/runanywhere-commons/cmake/LoadVersions.cmake`:
- Around line 61-71: The ONNX pin equality loop (_ONNX_PINS / _ONNX_CANONICAL)
currently allows all-empty pins to pass; update the invariant to first ensure
the canonical pin (RAC_ONNX_VERSION_IOS) and every entry in _ONNX_PINS are
non-empty and then enforce equality. Specifically, check that
RAC_ONNX_VERSION_IOS (used to set _ONNX_CANONICAL) is not empty and add a
non-empty guard for each _pin in the foreach before comparing, and emit a clear
message(FATAL_ERROR) referencing the symbol names (_ONNX_CANONICAL, _ONNX_PINS,
RAC_ONNX_VERSION_IOS/ANDROID/MACOS/LINUX/WINDOWS) when any pin is empty or
mismatched.
In `@sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp`:
- Around line 54-70: Rename the three externally visible helper functions to use
the rac_ prefix and update their declarations and all call sites: change
shared_ort_api -> rac_shared_ort_api, shared_ort_env -> rac_shared_ort_env, and
shared_cxx_env -> rac_shared_cxx_env; keep the internal init logic (g_init_flag,
init_once, g_api, g_cxx_env) intact, update any header prototypes and exported
symbols, and ensure any external linkage/exports (e.g., extern "C" or symbol
export macros) reference the new rac_ names so the public ABI follows the
Commons prefix convention.
In `@sdk/runanywhere-kotlin/docs/KOTLIN_MAVEN_CENTRAL_PUBLISHING.md`:
- Around line 141-143: Update earlier documentation entries that still claim
wildcard Sherpa variants and "6 libs per ABI" to match the later statement "ONNX
module: backend + ORT + sherpa C API only" and the mkdir line for
modules/runanywhere-core-onnx/src/androidMain/jniLibs/$ABI; specifically remove
or replace references to wildcard sherpa artifacts and the six-per-ABI count in
any tables or summary counts so they reflect only the C API artifacts actually
shipped for modules/runanywhere-core-onnx.
In `@sdk/runanywhere-kotlin/scripts/build-kotlin.sh`:
- Around line 402-419: The current directory-level branching prevents
per-library fallbacks: change the logic in the lib copy loop so each lib
(libonnxruntime.so and libsherpa-onnx-c-api.so) is checked first in
"${COMMONS_DIST}/onnx/${ABI}/${lib}" and if missing then checked in
"${SHERPA_ONNX_LIBS}/${ABI}/${lib}" before skipping; use the same copy + log
sequence to copy to "${ONNX_JNILIBS_DIR}/${ABI}/" when either source exists,
keeping the existing variable names (COMMONS_DIST, SHERPA_ONNX_LIBS,
ONNX_JNILIBS_DIR, ABI) and loop variables (lib or lib_name) to locate and modify
the code in build-kotlin.sh.
In `@sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts`:
- Around line 92-96: The tokenizer's vocab build loop in WordPieceTokenizer
currently uses vocab.size to assign IDs, which shifts IDs when the vocab.txt
contains empty lines; instead use the file line index as the token ID. In the
loop that iterates lines (the block that references tok and vocab), replace the
vocab.set(tok, vocab.size) behavior with vocab.set(tok, i) so each token gets
the original line-number ID (skip adding when tok is undefined or empty but
still consume the index i), ensuring the Map<string, number> preserves
HuggingFace line-number-based IDs.
---
Outside diff comments:
In @.github/workflows/commons-release.yml:
- Around line 331-337: The current shell loop only echoes missing .so files but
doesn't fail the job; modify the loop that iterates over libsherpa-onnx-c-api.so
and libonnxruntime.so (the for ... in ...; do ... done block) to track whether
any file was missing (e.g., set a variable like "missing=1" when the else branch
runs) and after the loop exit with a non-zero status if any missing files were
detected (exit 1) so the workflow fails when required .so files are absent.
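The fail-fast pattern this prompt describes can be sketched as a small shell function. The .so names mirror the workflow step, but `check_required_libs` itself is a hypothetical helper, not the workflow's actual code:

```shell
# Sketch: track any missing .so and return non-zero so the CI step fails,
# instead of only echoing and continuing.
check_required_libs() {
  local lib_dir="$1"
  local missing=0
  for so in libsherpa-onnx-c-api.so libonnxruntime.so; do
    if [ -f "${lib_dir}/${so}" ]; then
      echo "found: ${so}"
    else
      echo "MISSING: ${so}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

A workflow step would then end with something like `check_required_libs "$LIB_DIR" || exit 1` so the job goes red when an artifact is absent.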
---
Nitpick comments:
In `@sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp`:
- Around line 59-61: Change shared_ort_env() to return a const OrtEnv* (instead
of OrtEnv*) to enforce read-only, non-owning access; update its return
expression to return g_cxx_env ? static_cast<const OrtEnv*>(*g_cxx_env) :
nullptr and ensure callers use const pointers. Also const-qualify get_ort_env()
in ONNXBackendNew to return const OrtEnv* for consistency with ORT C API and to
prevent accidental ReleaseEnv() usage. Verify any downstream usages that expect
mutable OrtEnv* are adjusted to accept const OrtEnv*.
In `@sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts`:
- Around line 191-200: check_libs_exist() in build-kotlin.sh currently only
verifies librac_backend_onnx_jni.so while build.gradle.kts defines the onnxLibs
set (librac_backend_onnx.so, librac_backend_onnx_jni.so, libonnxruntime.so,
libsherpa-onnx-c-api.so); update check_libs_exist() to iterate/validate all
those filenames from the onnxLibs list (or explicitly check each:
librac_backend_onnx.so, librac_backend_onnx_jni.so, libonnxruntime.so,
libsherpa-onnx-c-api.so) and fail early with a clear error if any are missing so
incomplete ONNX dependency downloads are caught during pre-build validation.
In `@sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+Embeddings.ts`:
- Around line 101-103: Add a defensive check after running the session to ensure
the model returned the expected output tensor: after const output = await
this.session.run(feeds), verify that output[this.outputName] is defined before
using it (the current code assigns to lastHidden and calls this.meanPool). If
it's undefined, throw or return a clear error that mentions this.outputName and
the session outputs to aid debugging; keep the existing validation in load() but
add this runtime guard around the use of output[this.outputName] (affecting the
variables lastHidden and pooled and the call to this.meanPool).
- Around line 220-222: The code unconditionally casts hidden.data to
Float32Array (hidden.data as Float32Array) which can be invalid for models that
output other dtypes; update the logic around hidden, hidden.data and hiddenDim
to validate the tensor dtype (e.g., check hidden.type or use instanceof on
hidden.data) before casting, convert supported numeric types to Float32Array
(e.g., create a Float32Array from Int32Array/Float64Array/Uint8Array) and only
then populate out, and throw or return a clear error for unsupported dtypes to
avoid silent corruption of embeddings.
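Both Embeddings findings reduce to small defensive helpers. The sketch below is illustrative only — `requireOutput` and `toFloat32` are hypothetical names, not the SDK's actual API:

```typescript
// Guard 1: fail loudly when the session did not return the expected output
// name, instead of letting a later property access fail obscurely.
function requireOutput<T>(outputs: Record<string, T>, name: string): T {
  const tensor = outputs[name];
  if (tensor === undefined) {
    throw new Error(
      `Model did not return output "${name}"; got: ${Object.keys(outputs).join(', ')}`
    );
  }
  return tensor;
}

type NumericTypedArray =
  | Float32Array | Float64Array
  | Int8Array | Int16Array | Int32Array
  | Uint8Array | Uint16Array | Uint32Array;

// Guard 2: normalize a numeric tensor buffer to Float32Array before pooling,
// copy-converting supported dtypes instead of blindly casting.
function toFloat32(data: unknown): Float32Array {
  if (data instanceof Float32Array) return data; // already the expected dtype
  if (
    data instanceof Float64Array || data instanceof Int32Array ||
    data instanceof Int16Array || data instanceof Int8Array ||
    data instanceof Uint8Array || data instanceof Uint16Array ||
    data instanceof Uint32Array
  ) {
    return Float32Array.from(data as NumericTypedArray);
  }
  throw new Error('Unsupported tensor dtype for embedding output');
}
```

Wrapping the two risky expressions (`output[this.outputName]` and `hidden.data as Float32Array`) in helpers like these turns silent corruption into a clear, debuggable error.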
In `@sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts`:
- Around line 127-131: The code uses non-null assertions for
session.inputNames[0] and session.outputNames[0] when creating a
LoadedClassifier, which will throw unclear errors for malformed models; update
the code that constructs the LoadedClassifier (the object with keys session,
inputName, outputName, lastDetectionFrame) to explicitly validate that
session.inputNames and session.outputNames are arrays with length > 0 before
accessing index 0, and if validation fails throw or return a clear, descriptive
error (e.g., "model missing inputNames" / "model missing outputNames") or handle
the fallback case accordingly so the LoadedClassifier creation is safe.
- Around line 183-186: The current loop in RunAnywhere+WakeWord.ts that extracts
frames using this.audioBuffer.splice(0, FRAME_SIZE) is inefficient; replace the
growing JS array with a fixed-size circular buffer (e.g., a Float32Array) and
maintain read/write indices so frames are read by slicing from the typed array
with wrap-around logic instead of splicing; update the producer to write into
the buffer and the consumer loop (the code that calls await
this.processFrame(frame)) to assemble a FRAME_SIZE Float32Array from the
circular buffer using the indices, advance the read index by FRAME_SIZE, and
handle buffer-full/overflow conditions to avoid shifting or allocations on each
frame.
In `@sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts`:
- Around line 63-95: The initialize() method of ORTRuntimeBridge is idempotent
and ignores options after the first call, so update the
ORTRuntimeBridge.initialize implementation and docs to make that explicit: add a
JSDoc note on ORTRuntimeBridge.initialize stating "options only apply on first
initialization" and, inside initialize(), check this._initialized or
this._loadPromise and if options are provided on subsequent calls emit a warning
(use your logger or console.warn) indicating that passed options will be ignored
because the runtime is already initialized; reference the initialize method and
the _initialized/_loadPromise fields to locate the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b303d4c9-b72a-412a-8740-9e4a0514ddae
⛔ Files ignored due to path filters (1)
examples/intellij-plugin-demo/plugin/gradle/wrapper/gradle-wrapper.jar is excluded by !**/*.jar
📒 Files selected for processing (61)
- .github/workflows/build-all-test.yml
- .github/workflows/commons-release.yml
- .gitignore
- .idea/runConfigurations/09_Build_IntelliJ_Plugin.xml
- .idea/runConfigurations/10_Run_IntelliJ_Plugin.xml
- CLAUDE.md
- Package.swift
- build.gradle.kts
- docs/building.md
- examples/intellij-plugin-demo/plugin/build.gradle.kts
- examples/intellij-plugin-demo/plugin/gradle/wrapper/gradle-wrapper.properties
- examples/intellij-plugin-demo/plugin/gradlew
- examples/intellij-plugin-demo/plugin/gradlew.bat
- examples/intellij-plugin-demo/plugin/settings.gradle.kts
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/RunAnywherePlugin.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/ModelManagerAction.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceCommandAction.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceDictationAction.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/services/VoiceService.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/toolwindow/STTToolWindow.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/ModelManagerDialog.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/WaveformVisualization.kt
- examples/intellij-plugin-demo/plugin/src/main/resources/META-INF/plugin.xml
- sdk/runanywhere-commons/VERSIONS
- sdk/runanywhere-commons/cmake/FetchONNXRuntime.cmake
- sdk/runanywhere-commons/cmake/LoadVersions.cmake
- sdk/runanywhere-commons/include/rac/core/rac_platform_compat.h
- sdk/runanywhere-commons/scripts/android/download-sherpa-onnx.sh
- sdk/runanywhere-commons/scripts/build-android.sh
- sdk/runanywhere-commons/scripts/load-versions.sh
- sdk/runanywhere-commons/src/backends/onnx/CMakeLists.txt
- sdk/runanywhere-commons/src/backends/onnx/embeddings/onnx_embedding_provider.cpp
- sdk/runanywhere-commons/src/backends/onnx/embeddings/onnx_embedding_provider.h
- sdk/runanywhere-commons/src/backends/onnx/embeddings/rac_onnx_embeddings_register.cpp
- sdk/runanywhere-commons/src/backends/onnx/onnx_backend.cpp
- sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp
- sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.h
- sdk/runanywhere-commons/src/backends/onnx/wakeword/wakeword_onnx.cpp
- sdk/runanywhere-commons/src/features/rag/CMakeLists.txt
- sdk/runanywhere-commons/tests/simple_tokenizer_test.cpp
- sdk/runanywhere-flutter/scripts/build-flutter.sh
- sdk/runanywhere-kotlin/docs/KOTLIN_MAVEN_CENTRAL_PUBLISHING.md
- sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/README.md
- sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts
- sdk/runanywhere-kotlin/scripts/build-kotlin.sh
- sdk/runanywhere-react-native/packages/onnx/android/CMakeLists.txt
- sdk/runanywhere-react-native/scripts/build-react-native.sh
- sdk/runanywhere-web/README.md
- sdk/runanywhere-web/packages/core/README.md
- sdk/runanywhere-web/packages/onnx/package.json
- sdk/runanywhere-web/packages/onnx/src/Extensions/EmbeddingsTypes.ts
- sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+Embeddings.ts
- sdk/runanywhere-web/packages/onnx/src/Extensions/RunAnywhere+WakeWord.ts
- sdk/runanywhere-web/packages/onnx/src/Extensions/WakeWordTypes.ts
- sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts
- sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts
- sdk/runanywhere-web/packages/onnx/src/index.ts
- sdk/runanywhere-web/wasm/CMakeLists.txt
- sdk/runanywhere-web/wasm/scripts/build.sh
- sdk/runanywhere-web/wasm/src/wasm_exports.cpp
- settings.gradle.kts
💤 Files with no reviewable changes (20)
- examples/intellij-plugin-demo/plugin/gradle/wrapper/gradle-wrapper.properties
- .idea/runConfigurations/10_Run_IntelliJ_Plugin.xml
- settings.gradle.kts
- docs/building.md
- .gitignore
- .idea/runConfigurations/09_Build_IntelliJ_Plugin.xml
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceDictationAction.kt
- examples/intellij-plugin-demo/plugin/gradlew.bat
- examples/intellij-plugin-demo/plugin/settings.gradle.kts
- examples/intellij-plugin-demo/plugin/build.gradle.kts
- .github/workflows/build-all-test.yml
- examples/intellij-plugin-demo/plugin/gradlew
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/WaveformVisualization.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/ModelManagerAction.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/services/VoiceService.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/RunAnywherePlugin.kt
- examples/intellij-plugin-demo/plugin/src/main/resources/META-INF/plugin.xml
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/toolwindow/STTToolWindow.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/actions/VoiceCommandAction.kt
- examples/intellij-plugin-demo/plugin/src/main/kotlin/com/runanywhere/plugin/ui/ModelManagerDialog.kt
set(_ONNX_CANONICAL "${RAC_ONNX_VERSION_IOS}")
foreach(_pin IN LISTS _ONNX_PINS)
  if(NOT "${_pin}" STREQUAL "${_ONNX_CANONICAL}")
    message(FATAL_ERROR
      "ONNX_VERSION_* pins in VERSIONS must all match. "
      "Got: iOS=${RAC_ONNX_VERSION_IOS}, Android=${RAC_ONNX_VERSION_ANDROID}, "
      "macOS=${RAC_ONNX_VERSION_MACOS}, Linux=${RAC_ONNX_VERSION_LINUX}, "
      "Windows=${RAC_ONNX_VERSION_WINDOWS}. "
      "Sherpa-ONNX is the single ORT source of truth — bump in lock-step.")
  endif()
endforeach()
Require non-empty ONNX pins in the invariant check.
At Line 61-Line 64, the check only validates equality. If all RAC_ONNX_VERSION_* values are empty, the loop still passes and configuration continues with invalid version state.
Suggested fix
set(_ONNX_CANONICAL "${RAC_ONNX_VERSION_IOS}")
+if("${_ONNX_CANONICAL}" STREQUAL "")
+ message(FATAL_ERROR "ONNX_VERSION_IOS is missing/empty in VERSIONS")
+endif()
foreach(_pin IN LISTS _ONNX_PINS)
- if(NOT "${_pin}" STREQUAL "${_ONNX_CANONICAL}")
+ if("${_pin}" STREQUAL "" OR NOT "${_pin}" STREQUAL "${_ONNX_CANONICAL}")
message(FATAL_ERROR
      "ONNX_VERSION_* pins in VERSIONS must all match. "
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/cmake/LoadVersions.cmake` around lines 61 - 71, The
ONNX pin equality loop (_ONNX_PINS / _ONNX_CANONICAL) currently allows all-empty
pins to pass; update the invariant to first ensure the canonical pin
(RAC_ONNX_VERSION_IOS) and every entry in _ONNX_PINS are non-empty and then
enforce equality. Specifically, check that RAC_ONNX_VERSION_IOS (used to set
_ONNX_CANONICAL) is not empty and add a non-empty guard for each _pin in the
foreach before comparing, and emit a clear message(FATAL_ERROR) referencing the
symbol names (_ONNX_CANONICAL, _ONNX_PINS,
RAC_ONNX_VERSION_IOS/ANDROID/MACOS/LINUX/WINDOWS) when any pin is empty or
mismatched.
const OrtApi* shared_ort_api() {
  std::call_once(g_init_flag, init_once);
  return g_api;
}

OrtEnv* shared_ort_env() {
  std::call_once(g_init_flag, init_once);
  return g_cxx_env ? static_cast<OrtEnv*>(*g_cxx_env) : nullptr;
}

Ort::Env& shared_cxx_env() {
  std::call_once(g_init_flag, init_once);
  if (!g_cxx_env) {
    throw std::runtime_error(
        "rac::onnx::shared_cxx_env() failed to initialize Ort::Env");
  }
  return *g_cxx_env;
🛠️ Refactor suggestion | 🟠 Major
Prefix these exported helpers with rac_.
shared_ort_api, shared_ort_env, and shared_cxx_env are externally linked helpers in sdk/runanywhere-commons, so they should follow the Commons symbol-prefix convention before this becomes part of the backend ABI surface.
As per coding guidelines, "All public symbols must be prefixed with rac_ (RunAnywhere Commons)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-commons/src/backends/onnx/shared/rac_ort_env.cpp` around
lines 54 - 70, Rename the three externally visible helper functions to use the
rac_ prefix and update their declarations and all call sites: change
shared_ort_api -> rac_shared_ort_api, shared_ort_env -> rac_shared_ort_env, and
shared_cxx_env -> rac_shared_cxx_env; keep the internal init logic (g_init_flag,
init_once, g_api, g_cxx_env) intact, update any header prototypes and exported
symbols, and ensure any external linkage/exports (e.g., extern "C" or symbol
export macros) reference the new rac_ names so the public ABI follows the
Commons prefix convention.
# ONNX module: backend + ORT + sherpa C API only (sherpa-jni / sherpa-cxx-api
# are intentionally not shipped — we use our own JNI and link the C API).
mkdir -p modules/runanywhere-core-onnx/src/androidMain/jniLibs/$ABI
Update earlier ONNX counts/descriptions to match this c-api-only statement.
This section is now correct, but it conflicts with earlier table/count entries that still imply wildcard Sherpa variants and 6 libs per ABI.
📝 Suggested doc alignment
-| `io.github.sanchitmonga22:runanywhere-onnx-android` | 6 per ABI | STT/TTS/VAD: `librac_backend_onnx*.so`, `libonnxruntime.so`, `libsherpa-onnx-*.so` |
+| `io.github.sanchitmonga22:runanywhere-onnx-android` | 4 per ABI | STT/TTS/VAD: `librac_backend_onnx.so`, `librac_backend_onnx_jni.so`, `libonnxruntime.so`, `libsherpa-onnx-c-api.so` |
-With 3 ABIs (arm64-v8a, armeabi-v7a, x86_64): SDK=12, LlamaCPP=6, ONNX=18 = **36 total .so files**.
+With 3 ABIs (arm64-v8a, armeabi-v7a, x86_64): SDK=12, LlamaCPP=6, ONNX=12 = **30 total .so files**.
-| ONNX | `RABackendONNX-android-{abi}-v{ver}.zip` | `librac_backend_onnx*.so`, `libonnxruntime.so`, `libsherpa-onnx-*.so` |
+| ONNX | `RABackendONNX-android-{abi}-v{ver}.zip` | `librac_backend_onnx.so`, `librac_backend_onnx_jni.so`, `libonnxruntime.so`, `libsherpa-onnx-c-api.so` |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-kotlin/docs/KOTLIN_MAVEN_CENTRAL_PUBLISHING.md` around lines
141 - 143, Update earlier documentation entries that still claim wildcard Sherpa
variants and "6 libs per ABI" to match the later statement "ONNX module: backend
+ ORT + sherpa C API only" and the mkdir line for
modules/runanywhere-core-onnx/src/androidMain/jniLibs/$ABI; specifically remove
or replace references to wildcard sherpa artifacts and the six-per-ABI count in
any tables or summary counts so they reflect only the C API artifacts actually
shipped for modules/runanywhere-core-onnx.
  if [ -d "${COMMONS_DIST}/onnx/${ABI}" ]; then
-   for lib in libonnxruntime.so libsherpa-onnx-c-api.so libsherpa-onnx-cxx-api.so libsherpa-onnx-jni.so; do
+   for lib in libonnxruntime.so libsherpa-onnx-c-api.so; do
      if [ -f "${COMMONS_DIST}/onnx/${ABI}/${lib}" ]; then
        cp "${COMMONS_DIST}/onnx/${ABI}/${lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
        log_info "ONNX: ${lib}"
      fi
    done
  elif [ -d "${SHERPA_ONNX_LIBS}/${ABI}" ]; then
-   for lib in "${SHERPA_ONNX_LIBS}/${ABI}"/*.so; do
-     if [ -f "$lib" ]; then
-       cp "$lib" "${ONNX_JNILIBS_DIR}/${ABI}/"
-       log_info "ONNX: $(basename $lib)"
+   # Whitelist which sherpa-bundled .so files to copy (skip sherpa-jni
+   # and sherpa-cxx-api, which we don't use).
+   for lib_name in libonnxruntime.so libsherpa-onnx-c-api.so; do
+     local src_lib="${SHERPA_ONNX_LIBS}/${ABI}/${lib_name}"
+     if [ -f "${src_lib}" ]; then
+       cp "${src_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
+       log_info "ONNX: ${lib_name}"
      fi
    done
  fi
Per-library fallback is broken by directory-level branching.
At Line 402-Line 419, once ${COMMONS_DIST}/onnx/${ABI} exists, the elif fallback to ${SHERPA_ONNX_LIBS}/${ABI} is never reached—even when a specific lib is missing in dist. That can leave libonnxruntime.so or libsherpa-onnx-c-api.so out of jniLibs.
Suggested fix
- if [ -d "${COMMONS_DIST}/onnx/${ABI}" ]; then
- for lib in libonnxruntime.so libsherpa-onnx-c-api.so; do
- if [ -f "${COMMONS_DIST}/onnx/${ABI}/${lib}" ]; then
- cp "${COMMONS_DIST}/onnx/${ABI}/${lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
- log_info "ONNX: ${lib}"
- fi
- done
- elif [ -d "${SHERPA_ONNX_LIBS}/${ABI}" ]; then
- # Whitelist which sherpa-bundled .so files to copy (skip sherpa-jni
- # and sherpa-cxx-api, which we don't use).
- for lib_name in libonnxruntime.so libsherpa-onnx-c-api.so; do
- local src_lib="${SHERPA_ONNX_LIBS}/${ABI}/${lib_name}"
- if [ -f "${src_lib}" ]; then
- cp "${src_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
- log_info "ONNX: ${lib_name}"
- fi
- done
- fi
+ for lib_name in libonnxruntime.so libsherpa-onnx-c-api.so; do
+ local dist_lib="${COMMONS_DIST}/onnx/${ABI}/${lib_name}"
+ local sherpa_lib="${SHERPA_ONNX_LIBS}/${ABI}/${lib_name}"
+ if [ -f "${dist_lib}" ]; then
+ cp "${dist_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
+ log_info "ONNX: ${lib_name}"
+ elif [ -f "${sherpa_lib}" ]; then
+ cp "${sherpa_lib}" "${ONNX_JNILIBS_DIR}/${ABI}/"
+ log_info "ONNX: ${lib_name} (from Sherpa-ONNX)"
+ else
+ log_warn "ONNX: ${lib_name} NOT FOUND for ${ABI}"
+ fi
+ done🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-kotlin/scripts/build-kotlin.sh` around lines 402 - 419, The
current directory-level branching prevents per-library fallbacks: change the
logic in the lib copy loop so each lib (libonnxruntime.so and
libsherpa-onnx-c-api.so) is checked first in
"${COMMONS_DIST}/onnx/${ABI}/${lib}" and if missing then checked in
"${SHERPA_ONNX_LIBS}/${ABI}/${lib}" before skipping; use the same copy + log
sequence to copy to "${ONNX_JNILIBS_DIR}/${ABI}/" when either source exists,
keeping the existing variable names (COMMONS_DIST, SHERPA_ONNX_LIBS,
ONNX_JNILIBS_DIR, ABI) and loop variables (lib or lib_name) to locate and modify
the code in build-kotlin.sh.
for (let i = 0; i < lines.length; i++) {
  const tok = lines[i];
  if (tok === undefined || tok === '') continue;
  if (!vocab.has(tok)) vocab.set(tok, vocab.size);
}
vocab.txt parsing does not preserve line-number-based IDs.
Standard HuggingFace vocab.txt files use line number as token ID (line 0 → ID 0, line 1 → ID 1). This implementation assigns IDs sequentially via vocab.size, which breaks if the file contains empty lines in the middle—tokens after an empty line will have incorrect IDs.
🐛 Proposed fix to use line index as token ID
- for (let i = 0; i < lines.length; i++) {
- const tok = lines[i];
- if (tok === undefined || tok === '') continue;
- if (!vocab.has(tok)) vocab.set(tok, vocab.size);
+ for (let i = 0; i < lines.length; i++) {
+ const tok = lines[i];
+ if (tok === undefined || tok === '') continue;
+ if (!vocab.has(tok)) vocab.set(tok, i);
  }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for (let i = 0; i < lines.length; i++) {
  const tok = lines[i];
  if (tok === undefined || tok === '') continue;
  if (!vocab.has(tok)) vocab.set(tok, i);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-web/packages/onnx/src/Foundation/WordPieceTokenizer.ts`
around lines 92 - 96, The tokenizer's vocab build loop in WordPieceTokenizer
currently uses vocab.size to assign IDs, which shifts IDs when the vocab.txt
contains empty lines; instead use the file line index as the token ID. In the
loop that iterates lines (the block that references tok and vocab), replace the
vocab.set(tok, vocab.size) behavior with vocab.set(tok, i) so each token gets
the original line-number ID (skip adding when tok is undefined or empty but
still consume the index i), ensuring the Map<string, number> preserves
HuggingFace line-number-based IDs.
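A minimal standalone version of the suggested parser shows why the line index matters — `buildVocab` is an illustrative name, not the tokenizer's actual method:

```typescript
// The token ID is the vocab.txt line index, so an empty line consumes an ID
// slot without adding a token (matching HuggingFace line-number-based IDs).
function buildVocab(text: string): Map<string, number> {
  const vocab = new Map<string, number>();
  const lines = text.split('\n');
  for (let i = 0; i < lines.length; i++) {
    const tok = lines[i];
    if (tok === undefined || tok === '') continue; // skip, but i still advances
    if (!vocab.has(tok)) vocab.set(tok, i);
  }
  return vocab;
}
```

With `vocab.size` as the ID, a token after an empty line would be shifted down by one, silently mapping every subsequent token to the wrong embedding row.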
The first pass of the Android dead-weight removal used
packaging.jniLibs.excludes to strip libsherpa-onnx-jni.so and
libsherpa-onnx-cxx-api.so. Turns out that DSL only takes effect when a
downstream APP packages the AAR — it does NOT strip the .so files from
the library AAR itself during assembleRelease. I verified this by
unzipping runanywhere-core-onnx-release.aar and seeing both unwanted
libs still present in every jni/<abi>/ directory.
Fix: a `stripUnshippedSherpaLibs` Delete task that wipes the two files
from src/androidMain/jniLibs/ BEFORE preBuild / mergeReleaseJniLibFolders
run. Wired via:
* tasks.matching { name == "downloadJniLibs" } finalizedBy strip
(so freshly-downloaded files also get filtered, not just pre-
existing ones from a stale testLocal=true build).
* tasks.matching { name.contains("merge") &&
name.contains("JniLibFolders") } dependsOn strip
* tasks.matching { name == "preBuild" } dependsOn strip
Verified locally: assembleRelease produces a runanywhere-core-onnx-
release.aar that contains exactly the 4 expected libs per ABI —
libonnxruntime.so + librac_backend_onnx.so + librac_backend_onnx_jni.so
+ libsherpa-onnx-c-api.so. The sherpa jni + cxx-api variants are gone.
Also fix a latent version-drift bug in the ONNX download helper scripts
that this PR's validation run surfaced:
* scripts/macos/download-onnx.sh
* scripts/ios/download-onnx.sh
* scripts/linux/download-sherpa-onnx.sh
Previously all three skipped re-download based only on presence of the
binary artifact — so bumping ONNX_VERSION_* or SHERPA_ONNX_VERSION_*
would leave stale third_party/onnxruntime-{ios,macos}/ trees in place
forever, exactly the kind of silent drift the LoadVersions guard was
added to prevent. Each script now stamps a `.version` sentinel after a
successful download and re-downloads on mismatch. Verified on macOS:
nuking onnxruntime-macos and re-running download-onnx.sh replaced the
stale 1.23.2 dylib with 1.17.1.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
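The `.version` sentinel scheme described in the commit above can be sketched in a few lines of shell. `needs_download` and `mark_downloaded` are hypothetical helper names; the actual scripts inline this logic:

```shell
# Download only when no stamp exists or the stamped version differs
# from the pinned one.
needs_download() {
  local dest_dir="$1" want_version="$2"
  local stamp="${dest_dir}/.version"
  if [ ! -f "${stamp}" ]; then
    return 0  # never downloaded
  fi
  if [ "$(cat "${stamp}")" = "${want_version}" ]; then
    return 1  # up to date — skip
  fi
  return 0    # stale version on disk — re-download
}

# Stamp the sentinel only after a successful download/extract.
mark_downloaded() {
  printf '%s' "$2" > "$1/.version"
}
```

A download script would wrap its fetch in `if needs_download "$DEST" "$PIN"; then … ; mark_downloaded "$DEST" "$PIN"; fi`, so bumping the pin in VERSIONS forces a re-download instead of silently keeping the stale tree.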
lifecycle_manager.cpp uses std::condition_variable but only indirectly
includes it via <mutex> on libc++ (AppleClang on iOS / macOS and Android
NDK). GCC 13 with libstdc++ on Linux doesn't transitively include it,
so the Linux commons build fails with:
error: 'condition_variable' in namespace 'std' does not name a type
65 | std::condition_variable service_cv{};
Surfaced while running build-linux.sh inside an ubuntu:24.04 container
as part of PR #479 validation. Pre-existing bug, not regression from
this PR — but Linux is an explicit target per VERSIONS, so fixing here
alongside the ONNX consolidation work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two pre-existing build-ios.sh bugs surfaced while validating PR #479 by actually building the iOS example app against a fresh commons build:

BUG 1 — simulator xcframework slice missing our backend symbols
---------------------------------------------------------------
The xcframework's ios-arm64_x86_64-simulator libRABackendONNX.a had rac_backend_onnx_* symbols present only under x86_64, not arm64, which caused the example app to fail at link time:

  Undefined symbols for architecture arm64:
    "_rac_backend_onnx_unregister", referenced from:
        static ONNXRuntime.ONNX.unregister() -> ()

Root cause: sherpa-onnx.xcframework ships a UNIVERSAL (x86_64+arm64) libsherpa-onnx.a for the simulator slice. When libtool merges our x86_64-only librac_backend_onnx.a with sherpa's universal lib inside the SIMULATOR platform build, the resulting framework binary ends up reporting both archs to `lipo -archs` — even though its arm64 slice contains ONLY sherpa's arm64 objects, not our rac_backend_* objects (which exist only in the x86_64 slice). The old `if [[ "$SIM_ARCHS" != *"arm64"* ]]` check then short-circuits the final lipo-create step, leaving the SIMULATORARM64-built arm64 objects (which DO have our symbols) out of the shipped binary.

Fix: always extract the canonical thin arch from each platform's framework binary (SIMULATORARM64 → arm64-thin, SIMULATOR → x86_64-thin) and lipo-create a fresh fat binary. Applied in both create_xcframework (commons) and create_backend_xcframework (backends).

After fix, per-arch nm counts match on all three slices:
  ios-arm64                        : 4 `T _rac_backend_onnx_*`
  ios-arm64_x86_64-simulator/arm64 : 4 `T _rac_backend_onnx_*`
  ios-arm64_x86_64-simulator/x86_64: 4 `T _rac_backend_onnx_*`

BUG 2 — Package.swift requires RABackendMetalRT.xcframework to exist
---------------------------------------------------------------------
Package.swift unconditionally declared `RABackendMetalRTBinary` with a local path under useLocalBinaries=true, so any dev doing `build-ios.sh --backend onnx` (no MetalRT build) hit:

  local binary target 'RABackendMetalRTBinary' at '... RABackendMetalRT.xcframework' does not contain a binary artifact.

Fix: add `metalrtLocalBinaryExists` probe (FileManager.fileExists on the xcframework path relative to Package.swift) and only declare the binary target when the framework is actually present. `includeMetalRT` is AND-gated on the existence check so the product/targets also disappear — SPM graph resolves cleanly whether or not MetalRT is built locally.

Verified end-to-end: ** BUILD SUCCEEDED ** for the iOS example app (examples/ios/RunAnywhereAI, xcodebuild -scheme RunAnywhereAI -destination 'generic/platform=iOS Simulator') against fresh commons xcframeworks synced via build-swift.sh --local.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh (1)
95-105: ⚠️ Potential issue | 🟠 Major

Fix Sherpa-ONNX URL format: remove the `-cpu` suffix to match v1.12.23+ releases.

The URLs on lines 96 and 99 use `sherpa-onnx-v${VERSION}-linux-x64-shared-cpu.tar.bz2`, but should use `sherpa-onnx-v${VERSION}-linux-x64-shared.tar.bz2` (without the `-cpu` suffix). Update both the x86_64 and aarch64 URLs to match the current release filename format.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh` around lines 95-105, the URL strings built for ARCH values aarch64 and x86_64 include a trailing "-cpu" that no longer exists in recent Sherpa-ONNX releases; update the URL assignments for the aarch64 and x86_64 branches (the URL variables set when ARCH == "aarch64" and ARCH == "x86_64") to use filenames without the "-cpu" suffix (e.g., change "sherpa-onnx-v${VERSION}-linux-*-shared-cpu.tar.bz2" to "sherpa-onnx-v${VERSION}-linux-*-shared.tar.bz2"), and verify the corresponding ARCHIVE_NAME values match the new filename pattern so downloads succeed; keep the print_error and exit handling as-is.
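The suggested rename is a mechanical suffix drop. A minimal sketch (the filename stems follow the review comment's claim about v1.12.23+ release naming; the function name is illustrative, not part of the script):

```shell
# strip_cpu_suffix: rewrite an old-style sherpa-onnx shared-lib archive name
# to the newer release format without the "-cpu" marker, e.g.
#   sherpa-onnx-v1.12.23-linux-x64-shared-cpu.tar.bz2
#     -> sherpa-onnx-v1.12.23-linux-x64-shared.tar.bz2
strip_cpu_suffix() {
  echo "${1/-shared-cpu.tar.bz2/-shared.tar.bz2}"
}

OLD="sherpa-onnx-v1.12.23-linux-x64-shared-cpu.tar.bz2"
NEW="$(strip_cpu_suffix "${OLD}")"
echo "${NEW}"   # sherpa-onnx-v1.12.23-linux-x64-shared.tar.bz2
```

In the real script the fix is just editing the two hard-coded URL strings; the point is that ARCHIVE_NAME must be updated in lockstep so the extracted-archive path still matches the downloaded file.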
Package.swift (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Update swift-tools-version to 6.0 to comply with coding guidelines.

The coding guidelines explicitly state "Use the latest Swift 6 APIs always" for Swift files, which applies to Package.swift. Currently, the file specifies `swift-tools-version: 5.9`. Swift 6.0 is the latest version and should be adopted unless there are specific compatibility constraints in your CI/CD environment or dependency chain that prevent it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Package.swift` at line 1, Update the Package.swift header to declare Swift tools version 6.0 by changing the swift-tools-version line from 5.9 to 6.0 (the file's top-line directive "swift-tools-version: 5.9"); after updating, run swift package resolve and swift build / CI tests to ensure compatibility and adjust any APIs if the Swift 6 toolchain surfaces warnings or errors in package manifests or targets.
🧹 Nitpick comments (2)
sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh (1)
76-89: Version mismatch detection proceeds to download without clearing the stale directory.

When a version mismatch is detected (lines 86-89), the script logs a message about "Clearing stale cache and re-downloading…" but doesn't actually clear `${DEST_DIR}` here. The clearing happens later at lines 123-126, only if the directory exists — which it does. This works but is misleading — the message implies immediate clearing.

For consistency with the macOS script (which explicitly calls `rm -rf "${ONNX_DIR}"` at line 54 within the mismatch block), consider restructuring:

♻️ Clearer flow suggestion

```diff
   print_step "Sherpa-ONNX version mismatch at ${DEST_DIR}"
   echo "  Found: ${EXISTING:-unknown}, want: ${VERSION}"
   echo "  Clearing stale cache and re-downloading…"
+  rm -rf "${DEST_DIR}"
 fi
-
-# =============================================================================
-# Determine Download URL
-# =============================================================================
-...
-
-# Clean existing directory
-if [ -d "${DEST_DIR}" ]; then
-  print_step "Removing existing Sherpa-ONNX directory..."
-  rm -rf "${DEST_DIR}"
-fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh` around lines 76-89, the version-mismatch branch logs "Clearing stale cache…" but doesn't remove the existing directory immediately; update the block that checks VERSION_SENTINEL/EXISTING (the if that tests `[ -d "${DEST_DIR}/lib" ] && [ "$FORCE_DOWNLOAD" = false ]`) to remove the stale files right there by calling rm -rf on ${DEST_DIR} (or the equivalent ONNX_DIR subpath) before continuing with download so the message matches the action; ensure you reference DEST_DIR, VERSION_SENTINEL, VERSION and preserve the FORCE_DOWNLOAD logic and existing print_step/print_success calls.
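The suggested flow can be shown as a standalone sketch — clear the stale tree inside the mismatch branch so the log message matches the action. `DEST_DIR`, `VERSION`, and the `.version` sentinel file here are illustrative stand-ins for the script's actual variables:

```shell
# Simulate a cached tree left behind by an older sherpa-onnx version.
VERSION="1.12.23"
DEST_DIR="$(mktemp -d)/sherpa-onnx"
mkdir -p "${DEST_DIR}/lib"
echo "1.12.20" > "${DEST_DIR}/.version"   # stale sentinel

EXISTING="$(cat "${DEST_DIR}/.version" 2>/dev/null || true)"
if [ -d "${DEST_DIR}/lib" ] && [ "${EXISTING}" != "${VERSION}" ]; then
  echo "Version mismatch: found ${EXISTING:-unknown}, want ${VERSION}"
  echo "Clearing stale cache and re-downloading..."
  rm -rf "${DEST_DIR}"   # clear immediately, so the message matches the action
fi

[ ! -d "${DEST_DIR}" ] && echo "stale cache removed"
```

With the removal moved into the mismatch branch, the later "clean existing directory" step becomes a no-op for this path, which is what makes the two messages consistent.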
sdk/runanywhere-commons/scripts/build-ios.sh (1)

549-569: Consider extracting the fat binary creation logic into a shared function.

The logic for creating fat simulator binaries (lines 549-569) is duplicated nearly verbatim in `create_backend_xcframework` (lines 803-839). Extracting this into a helper function would reduce duplication and ensure consistent behavior.

Additionally, the temporary thin files (`${FRAMEWORK_NAME}-thin-arm64.a`, `${FRAMEWORK_NAME}-thin-x86_64.a`) are created but never removed. Consider cleaning them up after `lipo -create`.

♻️ Proposed helper function

```bash
# Add this helper function before create_xcframework():
create_fat_simulator_binary() {
    local FRAMEWORK_NAME=$1
    local BUILD_DIR=$2
    local SIM_FAT="${BUILD_DIR}/SIMULATOR"
    local SIM_ARM64_BIN="${BUILD_DIR}/SIMULATORARM64/${FRAMEWORK_NAME}.framework/${FRAMEWORK_NAME}"
    local SIM_X86_BIN="${SIM_FAT}/${FRAMEWORK_NAME}.framework/${FRAMEWORK_NAME}"

    if [[ -f "${SIM_ARM64_BIN}" && -f "${SIM_X86_BIN}" ]]; then
        log_step "Creating fat simulator binary (arm64 + x86_64)..."
        local SIM_ARM64_THIN="${BUILD_DIR}/SIMULATORARM64/${FRAMEWORK_NAME}-thin-arm64.a"
        local SIM_X86_THIN="${SIM_FAT}/${FRAMEWORK_NAME}-thin-x86_64.a"

        if ! lipo -thin arm64 "${SIM_ARM64_BIN}" -output "${SIM_ARM64_THIN}" 2>/dev/null; then
            cp "${SIM_ARM64_BIN}" "${SIM_ARM64_THIN}"
        fi
        if ! lipo -thin x86_64 "${SIM_X86_BIN}" -output "${SIM_X86_THIN}" 2>/dev/null; then
            cp "${SIM_X86_BIN}" "${SIM_X86_THIN}"
        fi

        lipo -create "${SIM_ARM64_THIN}" "${SIM_X86_THIN}" \
            -output "${SIM_FAT}/${FRAMEWORK_NAME}.framework/${FRAMEWORK_NAME}"

        # Clean up temp files
        rm -f "${SIM_ARM64_THIN}" "${SIM_X86_THIN}"
    fi
}
```

Then replace both duplicate blocks with:

```bash
create_fat_simulator_binary "${FRAMEWORK_NAME}" "${BUILD_DIR}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/scripts/build-ios.sh` around lines 549 - 569, The fat simulator binary creation logic is duplicated; extract it into a helper function (e.g. create_fat_simulator_binary) that accepts FRAMEWORK_NAME and BUILD_DIR, moves the existing block that creates SIM_ARM64_BIN/SIM_X86_BIN, runs lipo -thin for arm64 and x86_64 into SIM_ARM64_THIN and SIM_X86_THIN (falling back to cp on failure), calls lipo -create to write the combined framework binary, and then removes the temporary thin files (rm -f "${SIM_ARM64_THIN}" "${SIM_X86_THIN}"); replace the duplicated blocks in the current function and create_backend_xcframework with a call to create_fat_simulator_binary "${FRAMEWORK_NAME}" "${BUILD_DIR}" to ensure consistent behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9afa493d-4b6c-48b5-b617-be8f7e348ced
📒 Files selected for processing (7)
- Package.swift
- sdk/runanywhere-commons/scripts/build-ios.sh
- sdk/runanywhere-commons/scripts/ios/download-onnx.sh
- sdk/runanywhere-commons/scripts/linux/download-sherpa-onnx.sh
- sdk/runanywhere-commons/scripts/macos/download-onnx.sh
- sdk/runanywhere-commons/src/core/capabilities/lifecycle_manager.cpp
- sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts
✅ Files skipped from review due to trivial changes (1)
- sdk/runanywhere-commons/src/core/capabilities/lifecycle_manager.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
- sdk/runanywhere-kotlin/modules/runanywhere-core-onnx/build.gradle.kts
Summary
Clean up the dual-stack ONNX Runtime / Sherpa-ONNX situation that had accumulated across the web, Android, iOS, React Native, and Flutter surfaces. Keep both technologies (sherpa-onnx for STT/TTS/VAD, raw ORT for wake-word + RAG embeddings — sherpa has no equivalents for the latter two) but make sherpa-onnx's expected ORT the single source of truth and strip the dead code that had accumulated around that assumption.
Also delivers the new wake-word + RAG-embeddings web implementations that the user has on the roadmap — full openWakeWord 3-stage pipeline + HuggingFace-compatible WordPiece tokenizer + BERT encoder, running via the new ORTRuntimeBridge wrapper over `onnxruntime-web`.

Full rationale (including why we can't drop either side) is in the plan doc at `thoughts/shared/plans/cross-platform-onnx-sherpa-consolidation.md`.
What's in each commit
Test plan
Size impact summary
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Removals
Chores
Greptile Summary
This PR consolidates the ONNX Runtime / Sherpa-ONNX dual-stack across all platforms: collapses the five per-platform ORT version pins to a single `1.17.1` (sherpa-onnx's requirement), strips unused sherpa `.so` files from the Android AAR (~13.8 MB saved), deletes the IntelliJ plugin demo, introduces a shared `Ort::Env` singleton to eliminate redundant ORT env creation, and adds full web implementations for wake-word detection (openWakeWord 3-stage pipeline) and RAG embeddings (BERT encoder + HuggingFace-compatible WordPiece tokenizer). All findings are P2; safe to merge.

Confidence Score: 5/5
Safe to merge; all findings are P2 style/resilience suggestions with no impact on correctness or data integrity.
No P0 or P1 issues found. The ORT version consolidation, shared-env singleton, Android AAR trim, and TypeScript web pipelines are all logically correct. The three P2 findings are: a rejected-promise retry gap in ORTRuntimeBridge, a header doc mismatch on the exception type, and a GC-inefficient audio buffer in WakeWordService — none affect current runtime behavior.
sdk/runanywhere-web/packages/onnx/src/Foundation/ORTRuntimeBridge.ts (retry-on-failure gap)
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant App
    participant WakeWordService
    participant ORTRuntimeBridge
    participant ORT as onnxruntime-web
    participant MelspecModel as melspectrogram.onnx
    participant EmbeddingModel as embedding_model.onnx
    participant Classifier as classifier.onnx

    App->>WakeWordService: load(config)
    WakeWordService->>ORTRuntimeBridge: initialize()
    ORTRuntimeBridge->>ORT: import('onnxruntime-web') [lazy]
    ORT-->>ORTRuntimeBridge: module
    ORTRuntimeBridge-->>WakeWordService: ort module
    WakeWordService->>ORT: createSession(melspectrogramModel)
    WakeWordService->>ORT: createSession(embeddingModel)
    WakeWordService->>ORT: createSession(classifierModel[])
    WakeWordService-->>App: ready

    loop Per 1280-sample frame (80 ms @ 16 kHz)
        App->>WakeWordService: feed(samples)
        WakeWordService->>WakeWordService: buffer → align to 1280-sample frames
        WakeWordService->>MelspecModel: run([context+frame]) → mel frames [Fx32]
        WakeWordService->>WakeWordService: apply (v/10)+2 transform
        WakeWordService->>EmbeddingModel: run([76-frame window]) → embedding [96]
        WakeWordService->>Classifier: run([16 embeddings]) → score [0,1]
        alt score >= threshold AND cooldown elapsed
            WakeWordService-->>App: callback(WakeWordDetection)
        end
    end

    App->>WakeWordService: unload()
    WakeWordService->>ORT: session.release() x (2 + N classifiers)
```
Reviews (1): Last reviewed commit: "commons/onnx: reshape backends/onnx/ by ..."