v0.9.3 - sandboxing by dzikowski · Pull Request #14 · jaiphlang/jaiph

dzikowski · 2026-04-20T11:26:14Z

No description provided.

Protect the host-mounted .jaiph/runs contract by asserting Docker-backed runs create and grow step .out/.err files before the workflow exits. Made-with: Cursor

Keep nested run/ensure calls explicit across validation, formatting, and runtime execution, and make Docker use the local Jaiph package with a writable workspace fallback so container behavior matches local runs. Made-with: Cursor

Enforce that nested call-like expressions inside argument positions must use an explicit `run` or `ensure` keyword. Bare call-like forms (`run foo(bar())`, `run foo(rule_bar())`, `run foo(\`echo x\`())`, `const x = bar()`) are now rejected at compile time with actionable error messages. The explicit forms (`run foo(run bar())`, `run foo(ensure rule_bar())`, `run foo(run \`echo x\`())`) execute the nested call first and pass the result as a single argument. Validator extended with inline script detection, runtime evaluates managed argument tokens before outer dispatch, and the formatter round-trips all valid nested forms. Regression tests cover all accepted and rejected patterns. Docs and grammar updated. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…nfig file Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

…runtime images Remove all auto-derivation and runtime bootstrap paths from Docker mode. The runtime no longer builds derived images via npm pack or installs jaiph into arbitrary base images at run time. Every Docker image must already contain a working jaiph CLI; missing jaiph now fails fast with an actionable error. Default docker_image switches from node:20-bookworm to the official ghcr.io/jaiphlang/jaiph-runtime image. A new CI workflow publishes that image for release tags and nightly builds. Docs, init scaffolding, and E2E tests are updated to reflect the strict contract. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Support module.name, module.version, and module.description as optional string keys in the module-level config { } block. Values are stored on WorkflowMetadata.module as descriptive metadata only — they do not affect agent, run, or runtime behavior. Workflow-level config blocks reject module.* keys with E_PARSE, consistent with the existing runtime.* guard. The formatter round-trips all three keys. Unit tests cover happy path, partial keys, coexistence, round-trip, and workflow-level rejection. Docs and grammar updated. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add mount denylist rejecting dangerous host paths (/, /proc, /sys, /dev, Docker socket) at validation time with E_VALIDATE_MOUNT. Add environment variable denylist (SSH_*, GPG_*, AWS_*, GCP_*, AZURE_*, GOOGLE_*, DOCKER_*, KUBE*, NPM_TOKEN*) preventing host credential leakage into containers. Launch containers with --cap-drop ALL --cap-add SYS_ADMIN --security-opt no-new-privileges for least-privilege capability control. Document threat model in docs/sandboxing.md covering what Docker does and does not protect against (hooks on host, network egress, agent credential forwarding, image supply chain, container escapes). Add failure-modes reference table, expanded network-mode guidance, and env denylist spec. Unit tests cover all new validation and filtering paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
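The two denylists described above can be sketched roughly as follows. This is an illustration only, not the code from the PR: the function names `filterEnv` and `validateMount`, the exact-match rule for mounts, the error text, and the `/var/run/docker.sock` path (the commit only says "Docker socket") are all assumptions.

```typescript
// Hypothetical sketch of the env and mount denylists; not Jaiph's actual code.
type Env = Record<string, string | undefined>;

// Prefixes taken from the commit message.
const DENIED_ENV_PREFIXES = [
  "SSH_", "GPG_", "AWS_", "GCP_", "AZURE_", "GOOGLE_", "DOCKER_", "KUBE", "NPM_TOKEN",
];

// "/var/run/docker.sock" is an assumed path for the "Docker socket" entry.
const DENIED_MOUNTS = ["/", "/proc", "/sys", "/dev", "/var/run/docker.sock"];

/** Return a copy of env without variables whose names start with a denied prefix. */
function filterEnv(env: Env): Env {
  const out: Env = {};
  for (const [name, value] of Object.entries(env)) {
    if (!DENIED_ENV_PREFIXES.some((prefix) => name.startsWith(prefix))) {
      out[name] = value;
    }
  }
  return out;
}

/** Throw if hostPath is on the denylist (assumption: exact match after trimming trailing slashes). */
function validateMount(hostPath: string): void {
  const normalized = hostPath.replace(/\/+$/, "") || "/";
  if (DENIED_MOUNTS.includes(normalized)) {
    throw new Error(`E_VALIDATE_MOUNT: refusing to mount host path "${hostPath}"`);
  }
}
```

Matching env names by prefix keeps the list short while still covering families such as `AWS_*`; whether the real validator also rejects subpaths of the denied mounts is not stated in the commit.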

Docker is now enabled by default when neither CI=true nor JAIPH_UNSAFE=true is set in the environment. This makes sandboxed execution the safe default for local development while keeping Docker off in CI (where it is typically unavailable or redundant) and when the user explicitly opts out via JAIPH_UNSAFE=true. Precedence: JAIPH_DOCKER_ENABLED env > in-file runtime.docker_enabled > CI/unsafe default rule. The test harness and E2E runner set JAIPH_UNSAFE=true so existing tests continue to run on host. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Docker runs enforce an immutability contract: the host workspace is bind-mounted read-only and /jaiph/workspace is a sandbox-local copy-on-write layer discarded on exit. The only persistence channel to the host is the run-artifacts directory. During teardown, the runtime now automatically exports a workspace.patch file (git diff --binary) into the run directory so sandbox edits can be reviewed or applied on the host. Patch export is best-effort, owned by the runtime (not workflow logic), and runs regardless of workflow exit status. When there are no changes, the file is omitted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…nd eliminate outdated content. Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Introduce `recover` as a first-class repair-and-retry primitive for `run` steps, distinct from the existing one-shot `catch`. When a run step fails, the recover block binds the error, executes a repair body, and retries the step in a loop until it succeeds or the retry limit (default 10, configurable via `config`) is exhausted. Covers parser, formatter, validator, runtime, e2e acceptance test, and docs-site syntax highlighting for the new keyword. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
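The repair-and-retry loop can be modeled in a few lines. The sketch below is a host-language illustration of the semantics described above, not the runtime's implementation; `runWithRecover` is a hypothetical name, and it assumes the limit counts repair rounds after the first failed attempt (the commit does not say whether the limit counts attempts or retries).

```typescript
// Hypothetical model of `run ... recover(err) { ... }`; not Jaiph's runtime code.
function runWithRecover<T>(
  step: () => T,
  repair: (err: unknown) => void,
  retryLimit: number = 10, // default taken from the commit message
): T {
  let retries = 0;
  for (;;) {
    try {
      return step(); // success ends the loop
    } catch (err) {
      // Assumption: once the limit is exhausted the last error propagates.
      if (retries >= retryLimit) throw err;
      retries += 1;
      repair(err); // the recover body sees the bound error
    }
  }
}
```

By contrast, a one-shot `catch` would run its body once and never re-run the step.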

…osition Replace the implicit end-of-workflow join with a first-class Handle<T> that run async returns immediately. Handles resolve transparently on first non-passthrough read (argument passing to run, interpolation, comparison, branching) while passthrough operations (assignment, list storage, unchanged forwarding) leave them unresolved. Workflow exit implicitly joins any remaining unresolved handles. Ship recover composition for run async in the same change: the parser now accepts recover(err) { ... } after run async ref(args), and the runtime wires up the same retry-limit semantics used by non-async recover. Includes the spec document, parser/formatter/runtime tests, updated grammar and language docs, and syntax highlighter support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
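A rough TypeScript analogy for the handle semantics described above: work starts immediately, storing or forwarding the handle does not wait, and only a read does. The `Handle` class, `runAsync`, and `joinAll` below are hypothetical stand-ins for illustration; the real runtime resolves handles transparently rather than through an explicit `read()` call.

```typescript
// Hypothetical analogy for Handle<T>; not the Jaiph runtime's data structure.
class Handle<T> {
  private readonly pending: Promise<T>;
  private reads = 0;

  constructor(pending: Promise<T>) {
    this.pending = pending;
  }

  /** Non-passthrough read (argument passing, interpolation, comparison, branching). */
  read(): Promise<T> {
    this.reads += 1;
    return this.pending;
  }

  /** True once something has read the value; passthrough operations leave it false. */
  get wasRead(): boolean {
    return this.reads > 0;
  }
}

/** Start the work right away and return a handle without waiting for it. */
function runAsync<T>(work: () => Promise<T>): Handle<T> {
  return new Handle(work());
}

/** Model of the implicit join at workflow exit for handles nobody read. */
async function joinAll(handles: Handle<unknown>[]): Promise<void> {
  await Promise.all(handles.filter((h) => !h.wasRead).map((h) => h.read()));
}
```

Assignment and list storage only copy the reference, which is why they count as passthrough in this model.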

… out of the sandbox Introduce a two-layer artifacts system for workflows running inside the Docker sandbox (or on the host). The runtime layer creates a .jaiph/runs/<run_id>/artifacts/ directory before workflow execution and exposes its path via the JAIPH_ARTIFACTS_DIR env var (resolving to /jaiph/run/artifacts in the container, the host path otherwise). The library layer ships .jaiph/libs/jaiphlang/artifacts.jh paired with artifacts.sh, providing three export workflow entries: save (copy a file into artifacts), save_patch (git diff excluding .jaiph/), and apply_patch (git apply). The library mirrors the existing queue.jh pattern. Includes runtime unit tests, an E2E test, and docs updates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Exercise the live TTY progress tree path for `run async` workflows under a real pseudo-terminal. The test spawns `jaiph run` with two concurrent async branches (branch_a, branch_b), each emitting deterministic progress events via log and script steps with sleeps. A Python pty.openpty() harness captures the raw PTY stream and asserts per-branch events render under correct subscript nodes (₁, ₂), resolved Handle<T> return values appear in the final frame, and no orphaned ANSI escape sequences survive after CSI stripping. This closes the regression gap left by the sync-only 81_tty_progress_tree.sh test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

…g guide
The actual filesystem cleanup (deleting 22+ leftover debug directories, removing tracked cruft files safe_name and QUEUE.md.tmp.4951, and adding .gitignore patterns for docker-*/, nested-*/, overlay-*/, local-*/, .tmp*/, QUEUE.md.tmp.*) was committed earlier. This commit records the bookkeeping side:
- CHANGELOG.md: add entry describing the cleanup and disposition of safe_name, lib/, and run/ (all deleted; no live consumers found).
- QUEUE.md: remove the completed task from the queue.
- docs/contributing.md: add "Workspace hygiene" section documenting the .gitignore patterns and how to override them with git add -f.
Disposition of investigated paths:
- safe_name: deleted (tracked file, no live consumers)
- lib/: deleted (empty top-level directory, no live consumers)
- run/: deleted (empty top-level directory, no live consumers)
No code changes; documentation and queue bookkeeping only.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Delete exportWorkspacePatch and findRunArtifacts from src/runtime/docker.ts, exportPatchIfDocker from node-workflow-runtime.ts, and the findRunArtifacts call in src/cli/commands/run.ts. These functions served the abandoned per-call isolated keyword and are fully replaced by the artifacts.jh library (artifacts.save_patch() for workspace patches, JAIPH_ARTIFACTS_DIR for artifact discovery). Also removes ~150 LoC of dead tests in docker.test.ts and updates docs (sandboxing.md, architecture.md, artifacts.md) to reflect the removal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…TYLine When fuse overlay is unavailable, rsync and the cp fallback no longer copy .jaiph/runs; emit a clear stderr line before the temp workspace copy. In TTY mode, stderr_line events use writeTTYLine so lines show immediately without clearing/redrawing the running status line. Add QUEUE item for agent_inbox workflow quoting noise. Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Strip backslash-escaped quotes from display formatting so workflow step labels and log lines render human-readably. Three layers changed: formatNamedParamsForDisplay and formatParamsForDisplay no longer escape inner double quotes with backslash (the surrounding key="value" delimiters are structural, not shell-safe); formatStartLine in display.ts applies the same change for prompt previews; and node-workflow-runtime strips outer quotes from interpolated channel-send payloads via stripOuterQuotes so messages flow through dispatch without literal quote wrappers. Regression tests added; E2E golden output updated. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
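To make the display change concrete, here is a minimal sketch of the two behaviors: params rendered as key="value" without escaping inner quotes, and one pair of outer quotes stripped from a payload. `stripOuterQuotes` is named in the commit, but this body is a guess at its behavior; `formatNamedParams` is a hypothetical stand-in for `formatNamedParamsForDisplay`, and the `, ` separator is an assumption.

```typescript
// Hypothetical sketches; the real helpers live in the Jaiph source and may differ.

/** Remove exactly one pair of surrounding double quotes, if present. */
function stripOuterQuotes(text: string): string {
  const quoted = text.length >= 2 && text.startsWith('"') && text.endsWith('"');
  return quoted ? text.slice(1, -1) : text;
}

/** Render params as key="value"; inner double quotes are left as typed (display only, not shell-safe). */
function formatNamedParams(params: Record<string, string>): string {
  return Object.entries(params)
    .map(([key, value]) => `${key}="${value}"`)
    .join(", "); // separator is an assumption
}
```

The output is meant for humans reading the run tree, so it deliberately does not round-trip through a shell.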

- CI: build/push ghcr.io/jaiphlang/jaiph-runtime from .jaiph/Dockerfile on nightly branch (:nightly) and version tags (:<semver>, :latest); pass JAIPH_REPO_REF for install ref.
- Runtime: resolveImage always uses configured/default image with pull + jaiph check; stop auto docker build of workspace .jaiph/Dockerfile on jaiph run (keep runtime.docker_image / JAIPH_DOCKER_*).
- Docs and E2E aligned; unit test contract updated for resolveImage.
Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

- Move published Docker recipe to runtime/Dockerfile; CI builds from runtime/.
- jaiph init: stop creating .jaiph/Dockerfile; bootstrap prompt and tests/e2e updated.
- Reference docs: describe current behavior only (sandboxing patches, inbox send, grammar/testing notes, jaiph-skill, libraries comment).
Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Replaces the in-container rsync/cp fallback (~40s on this repo) with a host-side workspace clone using cp -cR (APFS clonefile, O(1) per file) on macOS. Linux keeps fuse-overlayfs as the primary path; copy mode is selected when /dev/fuse is missing on the host or JAIPH_DOCKER_NO_OVERLAY=1. OVERLAY_SCRIPT shrinks from ~155 lines to ~22 (no in-container fallback; host owns the slow path). Copy mode drops SYS_ADMIN, /dev/fuse, and the overlay-script mount. Clone lives at <runs-root>/.sandbox-<id>/, removed on exit unless JAIPH_DOCKER_KEEP_SANDBOX=1. Tests cover both modes plus an explicit guard that the clone produces independent inodes (writes inside the container do not leak to the host). Made-with: Cursor
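The mode selection described above reduces to a small decision function. The sketch below uses hypothetical names (`selectSandbox`, `SandboxPlan`) and only encodes what the commit states: copy mode when `/dev/fuse` is missing on the host or `JAIPH_DOCKER_NO_OVERLAY=1`, and copy mode dropping SYS_ADMIN, `/dev/fuse`, and the overlay-script mount.

```typescript
// Hypothetical sketch of overlay-vs-copy selection; not the code in docker.ts.
type Env = Record<string, string | undefined>;

interface SandboxPlan {
  mode: "overlay" | "copy";
  needsSysAdmin: boolean;      // overlay only
  needsDevFuse: boolean;       // overlay only
  mountsOverlayScript: boolean; // overlay only
}

function selectSandbox(hostHasDevFuse: boolean, env: Env): SandboxPlan {
  // A macOS host has no /dev/fuse, so it lands in copy mode through the same rule.
  const copy = !hostHasDevFuse || env["JAIPH_DOCKER_NO_OVERLAY"] === "1";
  return {
    mode: copy ? "copy" : "overlay",
    needsSysAdmin: !copy,
    needsDevFuse: !copy,
    mountsOverlayScript: !copy,
  };
}
```

Keeping the slow path on the host means the container needs fewer privileges whenever copy mode is chosen.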

Add Docker Buildx and multi-platform build so GHCR tags include linux/arm64 for Apple Silicon hosts, alongside linux/amd64. Made-with: Cursor

1. landing-page.spec.ts (chromium): docs/index.html had stale expected output for the agent_inbox sample with the old escaped-quote rendering ("\"Found 3 issues\"" / "Critical issue: \"Summary: ...\""). The runtime no longer emits those escapes (see prior CHANGELOG entry on formatNamedParamsForDisplay). Updated the page sample to match the current clean rendering; the test pulls expected text from the page verbatim, so this is the source of truth.
2. macos runner: 104_run_async fanout flake. Async branches complete in non-deterministic order; the previous slow rsync sandbox happened to serialize timing, masking the race. With the fast clone path (cp -cR) the race surfaces. Fix in e2e/lib/common.sh: extend normalize_output with a perl pass that sorts contiguous "async-progress" lines (lines starting with a leading space + subscript marker ₁..₉, UTF-8 bytes E2 82 81..89). Both actual and expected get the same canonical order, so strict equality still works while the inter-branch race is normalized away. No per-test changes needed; verified against fanout, sibling_depth, circled, nested_async, async_interleave and 78_lang_redesign_constructs.
3. ubuntu runner: 72_docker_run_artifacts: fuse-overlayfs mount fails with "Permission denied" even with SYS_ADMIN + /dev/fuse. Root cause is the default Docker AppArmor profile shipped on Ubuntu 22.04+ / GitHub Actions runners, which denies fuse mounts in containers. Documented workaround: --security-opt apparmor=unconfined. Added it to overlay mode args, Linux-only (macOS Docker Desktop has no AppArmor and rejects unknown security-opts). Tests cover both that the flag is added in overlay mode on Linux and absent in copy mode. docs/sandboxing.md updated with the rationale and remaining failure modes (rootless docker, locked-down kernels) where the operator still needs JAIPH_DOCKER_NO_OVERLAY=1.
Made-with: Cursor
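The normalization in item 2 is implemented as a perl pass in e2e/lib/common.sh; the same idea, re-expressed in TypeScript purely for illustration, looks roughly like this. `sortAsyncRuns` is a hypothetical name, and the pattern assumes exactly one leading space before the subscript marker.

```typescript
// Illustrative re-expression of the perl pass; not the script used by the e2e suite.

// One leading space followed by a subscript digit ₁..₉ (U+2081..U+2089).
const ASYNC_LINE = /^ [\u2081-\u2089]/;

/** Sort each contiguous run of async-progress lines; all other lines keep their position. */
function sortAsyncRuns(lines: string[]): string[] {
  const out: string[] = [];
  let run: string[] = [];
  const flush = (): void => {
    out.push(...run.sort()); // default sort compares UTF-16 code units
    run = [];
  };
  for (const line of lines) {
    if (ASYNC_LINE.test(line)) {
      run.push(line);
    } else {
      flush();
      out.push(line);
    }
  }
  flush();
  return out;
}
```

Applying the same function to both actual and expected output keeps strict equality meaningful while removing the inter-branch ordering race.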

- Print a dim parenthetical after the .jh name: (no sandbox), local Docker detail (fusefs vs tmp dir), or (Docker sandbox, …) when CI=true/1 to avoid host-dependent snapshots.
- Resolve runtime env before the banner so sandbox mode matches spawnDockerProcess.
- E2E: normalize_output strips the parenthetical so existing expected stdout blocks stay stable.
- Docs: landing-page run samples use run-banner-meta for the gray suffix.
Made-with: Cursor

Move GHCR runtime build from a standalone workflow into CI with needs on test, e2e, docs-local, and e2e-wsl. Same triggers: nightly branch and v* tags. Remove docker-publish.yml to avoid duplicate pushes. Made-with: Cursor

…o "tmp workspace"
Two coupled changes that fix the chromium landing-page test (say-hello / failure block was getting "(no sandbox)" because CI=true silently disabled Docker, so the docs sample comparing against "(Docker sandbox, fusefs)" diverged):
1. resolveDockerConfig: CI=true no longer disables Docker. The only environment-driven escape hatch is now JAIPH_UNSAFE=true. Rationale: landing-page e2e and docs sample tests must exercise the same sandbox path users do; silently dropping the sandbox in CI hides real regressions. Explicit overrides (JAIPH_DOCKER_ENABLED env or in-file runtime.docker_enabled) still take precedence.
2. formatJaiphRunningBannerLines: copy-mode label "tmp dir" → "tmp workspace" (clearer about what it actually is: a writable per-run clone of the workspace) and dropped the CI-only "…" obfuscation. The banner now always reflects the real sandbox mode so the docs/landing-page samples can compare against literal text.
Side effects handled:
- test/signal-lifecycle.test.ts: switched from CI="true" to JAIPH_UNSAFE="true" for the "exit-within-5s" assertion (it relied on Docker being disabled).
- src/runtime/docker.test.ts: rewrote the three CI-related cases to document the new contract (CI=true keeps Docker on; in-file and env overrides still win).
- src/cli/run/display.test.ts: dropped the obfuscation test, refreshed the copy-mode test for the new label, added a parity test that the banner is identical in CI and locally.
- docs/sandboxing.md, docs/configuration.md: updated the default rule table, configuration key descriptions, and precedence text.
e2e tests are unaffected: e2e/lib/common.sh already pins JAIPH_DOCKER_ENABLED=false and JAIPH_UNSAFE=true for non-Docker tests, and Docker-specific tests already set JAIPH_DOCKER_ENABLED=true explicitly. No tooling changes needed there.
Made-with: Cursor
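The resulting precedence (env override, then in-file setting, then the default rule with JAIPH_UNSAFE as the only escape hatch) can be sketched as below. `dockerEnabled` is a hypothetical helper, not `resolveDockerConfig` itself, and treating any value other than the string "true" as "off" for `JAIPH_DOCKER_ENABLED` is an assumption.

```typescript
// Hypothetical sketch of the Docker-enabled decision after this commit.
type Env = Record<string, string | undefined>;

function dockerEnabled(env: Env, inFileDockerEnabled?: boolean): boolean {
  // 1. Explicit env override wins (assumption: only the literal "true" enables).
  const override = env["JAIPH_DOCKER_ENABLED"];
  if (override !== undefined) return override === "true";

  // 2. In-file runtime.docker_enabled comes next.
  if (inFileDockerEnabled !== undefined) return inFileDockerEnabled;

  // 3. Default rule: on unless the user opted out. CI=true is deliberately ignored.
  return env["JAIPH_UNSAFE"] !== "true";
}
```

For example, `dockerEnabled({ CI: "true" })` stays `true` in this model, which is the behavior change the commit describes.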

In overlay mode the container previously ran as the host UID, but /jaiph/workspace is owned by the image user (jaiph UID 10001), so fusermount3 refused to mount on a directory the calling user couldn't write to. AppArmor was a red herring. Fix: run the container as root (--user 0:0) so fuse-overlayfs can mount /jaiph/workspace, then have overlay-run.sh chown /jaiph/run to the host UID/GID and exec the workflow via setpriv. The workflow itself never runs as root, host-readable artifacts are preserved, and copy mode is unchanged (still --user host_uid:host_gid). Custom images that lack setpriv print a one-line warning and run the workflow as root inside the container. Made-with: Cursor

…_TIMEOUT Previously, JAIPH_DOCKER_TIMEOUT="-5" parsed to -5 and silently disabled the timeout because the > 0 guard in spawnDockerProcess treated it as "no timeout". Trailing-junk values like "300-" parsed as 300 via parseInt, masking user intent. resolveDockerConfig now validates strictly: the env value must match ^\d+$ (digits only); negative values, trailing junk, empty strings, and non-numeric input all throw E_DOCKER_TIMEOUT. The in-file runtime.docker_timeout path also rejects negatives. Zero continues to disable the timeout. Adds six new env tests and one in-file test. Updates docs/sandboxing.md failure-modes table with the new error code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
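A minimal sketch of the strict check, assuming a hypothetical helper name `parseDockerTimeout` and an invented error message; only the `^\d+$` rule, the `E_DOCKER_TIMEOUT` code, and "zero disables the timeout" come from the commit.

```typescript
// Hypothetical sketch of digits-only validation for JAIPH_DOCKER_TIMEOUT.

/** Parse the env value; anything other than ASCII digits throws E_DOCKER_TIMEOUT. */
function parseDockerTimeout(raw: string): number {
  if (!/^\d+$/.test(raw)) {
    // Message wording is an assumption; only the error code is from the commit.
    throw new Error(`E_DOCKER_TIMEOUT: JAIPH_DOCKER_TIMEOUT must be digits only, got "${raw}"`);
  }
  return Number(raw);
}

/** Zero keeps its meaning of "no timeout". */
function timeoutIsActive(seconds: number): boolean {
  return seconds > 0;
}
```

Unlike `parseInt("300-", 10)`, which yields 300, the regular expression rejects trailing junk before any number conversion happens.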

…output Extract prepareImage() in src/runtime/docker.ts that runs pullImageIfNeeded + verifyImageHasJaiph before the CLI banner renders. On cold pull, Docker's native layer progress is suppressed via --quiet; a single "pulling image <name>…" status line is written to stderr instead. Call prepareImage from runWorkflow in src/cli/commands/run.ts before writeBanner so the banner and progress tree only appear after image preparation completes. spawnDockerProcess no longer handles pull or verification. resolveImage is retained as a thin wrapper for back-compat. New e2e test (74c_docker_prepull.sh) asserts banner ordering and status line output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ellCheck CI Move the 30+ line OVERLAY_SCRIPT bash constant out of a TypeScript template literal in src/runtime/docker.ts into a standalone file at runtime/overlay-run.sh. docker.ts now reads the script at module load via readFileSync, resolving from either dist/ or source layout. Add a ShellCheck CI job to lint the shell script on every push. Include the file in the npm package via the files array in package.json and copy it into dist/src/runtime/ during build. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the mutable state object threaded through copyEntryWithCloneFallback with a non-exported WorkspaceCloner class that encapsulates clone-probe and fallback logic behind a single copy(src, dst) method. cloneWorkspaceForSandbox now reads top-to-bottom without state plumbing. The clonefile-vs-cp decision tree and all existing tests remain unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
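The shape of that refactor might look like the sketch below. It is not the class from the PR: the injected `tryClone` and `plainCopy` callbacks and the "first clone failure switches to plain copy for the rest of the run" policy are assumptions made so the example stays self-contained.

```typescript
// Hypothetical sketch of a cloner that hides probe-and-fallback state behind copy().
type CopyFn = (src: string, dst: string) => void;

class WorkspaceCloner {
  private cloneWorks = true; // flips to false after the first failed clone attempt
  private readonly tryClone: CopyFn;
  private readonly plainCopy: CopyFn;

  constructor(tryClone: CopyFn, plainCopy: CopyFn) {
    this.tryClone = tryClone;
    this.plainCopy = plainCopy;
  }

  /** Copy one entry, preferring the fast clone path while it keeps working. */
  copy(src: string, dst: string): void {
    if (this.cloneWorks) {
      try {
        this.tryClone(src, dst);
        return;
      } catch {
        this.cloneWorks = false; // assumption: do not probe again after a failure
      }
    }
    this.plainCopy(src, dst);
  }
}
```

Callers then iterate over workspace entries and call `copy` without threading a state object through every call.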

Add a blockquote warning to the "Enabling Docker" section of docs/sandboxing.md about agent credential forwarding (ANTHROPIC_*, CLAUDE_*, CURSOR_* env vars) and outbound network access, advising users to set docker_network = "none" for sensitive workflows. Remove the incorrect claim that JAIPH_DOCKER_KEEP_SANDBOX=1 prints the sandbox path to stderr — the code does not print. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Made-with: Cursor

- Bump version across package, docs, and installer copy; restructure CHANGELOG (0.9.3 + empty Unreleased)
- JAIPH_DOCKER_TIMEOUT: require digits-only (rejects -0); add test
- say_hello example: validate name with match in rule
Made-with: Cursor

…re footer
Two related correctness fixes plus an e2e test that would have caught both:
1. Validator: reject unknown leading verbs in match arm bodies. Previously `"" => error "msg"` silently parsed `error "msg"` as a string literal, so a rule meant to fail would "pass" with a truthy value. Now `validateMatchExpr` requires the leading bare-word (when followed by args or `(`) to be one of `fail` / `run` / `ensure`, and emits a hint suggesting `fail` when the user typed `error`.
2. CLI: never emit a footer that is just "Workflow execution failed." When `discoverDockerRunDir` cannot match an expected `run_id`, the user used to get zero diagnostic info. `reportResult` now falls back to printing the sandbox runs root and the expected `run_id`, so the user can always investigate.
3. e2e/tests/76_docker_failure_parity.sh: rewrite to exercise BOTH a script-step failure AND a rule match-fail (the path the original bug report hit), and compare full normalized output between Docker and no-sandbox modes instead of using `assert_contains` on a few lines.
Made-with: Cursor

Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Accept `run \`...\`(args)` as a value expression in both `return` and `log` statements. This extends the parser, validator, formatter, and runtime to handle managed inline-script calls in value positions, consistent with how named-ref managed calls already work. Bare inline-script calls without `run` remain rejected with clear errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Match arms must now be separated by newlines only. Trailing commas after arm bodies (e.g. `"" => fail "msg",`) and inline comma-separated arms are rejected at parse time with the diagnostic "commas are not allowed in match arms; use one arm per line". Parser, validator, and e2e tests updated; grammar docs and changelog reflect the new constraint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Close the hole where a bare word like `true`, `false`, or `blorp` used as a match arm body was silently treated as a string literal. `validateMatchExpr` now receives the set of in-scope variables and rejects any bare identifier that is not a known const, capture, or parameter with E_VALIDATE: `unknown identifier "…" in match arm body`. Together with the existing unknown-verb check (for words followed by arguments), all unknown-identifier cases in arm bodies are now covered. Regression tests verify rejection of `true`, `false`, and arbitrary unknown words, and acceptance of in-scope identifiers and string literals. The e2e test that relied on `_ => true` is updated to use the subject variable instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
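Taken together, the unknown-verb check and the unknown-identifier check amount to something like the sketch below. It is a simplification over raw strings, whereas the real `validateMatchExpr` works on parsed input; `validateArmBody`, the regular expressions, and the hint wording are assumptions.

```typescript
// Hypothetical string-level model of the two match-arm-body checks.
const ALLOWED_VERBS = new Set(["fail", "run", "ensure"]);

/** Return an error message for an invalid arm body, or null when it is acceptable. */
function validateArmBody(body: string, inScope: Set<string>): string | null {
  const text = body.trim();
  if (text.startsWith('"')) return null; // string literal

  // Leading bare word followed by arguments or "(" must be a known verb.
  const verb = /^([A-Za-z_]\w*)(?:\s+\S|\()/.exec(text);
  if (verb) {
    const word = verb[1] ?? "";
    if (!ALLOWED_VERBS.has(word)) {
      const hint = word === "error" ? " (did you mean `fail`?)" : "";
      return `E_VALIDATE: unknown verb "${word}" in match arm body${hint}`;
    }
    return null;
  }

  // A lone bare identifier must be a known const, capture, or parameter.
  if (/^[A-Za-z_]\w*$/.test(text) && !inScope.has(text)) {
    return `E_VALIDATE: unknown identifier "${text}" in match arm body`;
  }
  return null;
}
```

With an empty scope, `validateArmBody("true", new Set())` reports an unknown identifier, which mirrors why the e2e test that used `_ => true` had to change.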

…res, scripts) All Jaiph bindings are now immutable within their scope. The validator rejects rebinding a parameter via const, duplicate const declarations, capture name collisions, and script names that shadow existing immutable bindings. Diagnostics name the conflicting binding and its origin (file + line). Migrated examples/say_hello.jh to use distinct parameter and const names. Added regression tests covering all rejection and success paths. Updated grammar, language, skill, and landing-page docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
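One way to picture the immutability rule is a scope that refuses to bind a name twice and reports where the first binding came from. The `Scope` class, the `Origin` shape, and the message format below are hypothetical; the commit only states that diagnostics name the conflicting binding and its origin (file + line).

```typescript
// Hypothetical sketch of single-assignment scoping with origin-aware diagnostics.
interface Origin {
  kind: "param" | "const" | "capture" | "script";
  file: string;
  line: number;
}

class Scope {
  private readonly bindings = new Map<string, Origin>();

  /** Register a binding; rebinding an existing name is a validation error. */
  bind(name: string, origin: Origin): void {
    const previous = this.bindings.get(name);
    if (previous !== undefined) {
      throw new Error(
        `E_VALIDATE: "${name}" is already bound as ${previous.kind} at ${previous.file}:${previous.line}`,
      );
    }
    this.bindings.set(name, origin);
  }
}
```

Under this model a workflow that takes a parameter `name` and then declares `const name = ...` fails validation, which matches the reason the say_hello example was migrated to distinct names.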

`return response` (bare identifier) is now a first-class return form, resolved against the same scope rules as interpolation and call args. Previously it fell through to the catch-all "inline shell steps are forbidden" validator error, which was incorrect for non-shell constructs. The shell-step diagnostic is now narrowed so it only fires for actual bare shell commands. Unknown-identifier returns produce a precise unknown-identifier error naming the missing binding. All existing return forms (string, interpolation, run, ensure, match, dotted) remain unchanged. Adds parser unit tests, compiler-test cases, and an e2e test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rams
- evaluateMatch: bare in-scope identifier (e.g. `=> name_arg`) now returns the variable's value, mirroring `return val` sugar documented in docs/language.md. Previously fell through to string interpolation and silently returned the literal name.
- emitPromptStepStart: drop the declaredParamNames block so prompt step display lists only ${var} references actually appearing in the prompt body. Workflow params unrelated to the prompt no longer leak into the prompt's run-tree line.
- examples/say_hello.jh: rename inputs to demonstrate the corrected behavior (name_arg / name).
Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>
Made-with: Cursor

`return response` was being rewritten to `return "${response}"` because the parser eagerly desugared bare identifiers into interpolated strings and stored only the desugared form in the AST. The runtime needs the interpolated form, but the formatter was discarding the original spelling.
- types.ts: add optional `bareSource` to the `return` step, capturing the original `response` / `base.field` source when it came from the bare-identifier sugar path.
- parse/{steps,workflow-brace,workflows}.ts: record `bareSource` at all three return parse sites; runtime `value` stays `"${...}"`.
- format/emit.ts: when `bareSource` is set, emit `return <bareSource>`; otherwise emit `return <value>` so explicit `"${var}"` is preserved too.
- format/emit.test.ts: round-trip tests for bare, dotted, and explicit ${var} return forms.
Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>
Made-with: Cursor

…onse` in tests When the default workflow in a `jaiph run` exits successfully with a return value, the runtime now writes it to `<run_dir>/return_value.txt`, and the CLI prints it on its own line after `✓ PASS workflow default`, separated by a blank line. Workflows without a return statement produce no extra output. In the test runner, every `run <workflow>(...)` step now binds the workflow's return value (or captured failure message) to an implicit `response` variable, mirroring the convention used in `examples/say_hello.test.jh`. Explicit `const X = run ...` captures still set `X` as before; the implicit alias is in addition. This lets `expect_equal response "..."` work without forcing every test to write an explicit capture. Made-with: Cursor

The prompt step display previously substituted variables into the preview shown in the run tree, e.g. `prompt cursor "Say hello to Adam and..." (name="Adam")`. The substituted form duplicates information already exposed in the params and hides what the user actually wrote. Switch the preview source to the raw, un-interpolated prompt text so the tree now reads `prompt cursor "Say hello to ${name} and..." (name="Adam")`. Concrete values still appear alongside in the params. The `promptText` parameter to `emitPromptStepStart` was unused after this change and has been removed from the function signature and both call sites. A new artifacts test pins the behavior via the run summary. Made-with: Cursor

…scripts Replace `ensure report_exists() catch (...)` with the more direct `run check_report_exists() recover(failure) { ... }` pattern so the example matches current Jaiph idioms. Inline the report read via `return run \`cat report.txt\`()` to demonstrate inline backtick scripts as the function-call form, and define `check_report_exists` as a triple-backtick fenced script with a comment noting that the fence syntax supports other languages (`node`, `python3`, …). Mark recover_loop.test.jh executable to match the rest of the example shebangs. Made-with: Cursor

Tests can now declare a string constant once and reuse it across mocks and assertions:
    test "with name, returns greeting" {
      const expected = "Hello Alice!"
      mock prompt expected
      run hello.default("Alice")
      expect_equal response expected
    }
Three small surface additions, all opt-in:
* `const NAME = "literal"` binds a string in the enclosing test block. Only plain double-quoted literals are accepted in v1; no interpolation, no `run`, no `match`. The runner seeds these into the test-scope `vars` map during the same pre-pass that collects mocks, so order-of-declaration matters (the const must appear before any reference).
* `mock prompt <ident>` resolves the response from a previously declared `const`. The literal form `mock prompt "..."` is unchanged. Undefined references fail the test with a clear message rather than crashing the run.
* `expect_equal var <ident>`, `expect_contain var <ident>`, and `expect_not_contain var <ident>` accept a bare identifier as the second argument and resolve it against the test-scope `vars` map.
The AST keeps the existing literal fields (`response`, `expected`, `substring`) and adds optional discriminator fields (`responseVar`, `expectedVar`, `substringVar`) populated only when the var-ref form was authored. The formatter emits whichever form was used so round-trips preserve user intent.
The `examples/say_hello.test.jh` example is updated to use the new form. One pre-existing compiler-test fixture (`mock prompt not_quoted`) is now valid input under the new grammar; replaced with `mock prompt 123-not-a-string` so the "must be …" diagnostic still exercises a genuinely invalid token.
Made-with: Cursor

The test runner used to silently bind every `run …` call's return value to a magic `response` variable, letting `expect_equal response …` pass without ever capturing the value in source. That hid intent and let typos in `expect_*` LHS slip through as empty-string comparisons. Now:
- `run …` only introduces a name when written `const X = run …`.
- A new `validateTestBlocks` pass rejects with E_VALIDATE when an `expect_*` LHS, `expect_* var <ident>` RHS, or `mock prompt <ident>` references an undeclared name. Wired into the existing `validateReferences` path so `jaiph test` fails before any test runs.
- Runtime gained a fail-fast guard for the same case as a safety net for callers that bypass validation.
`examples/say_hello.test.jh` updated to use explicit `const response = run hello.default("Alice")`. Compiler-test fixtures cover the three new error cases plus a happy path.
Made-with: Cursor
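A compact model of that validation pass: walk the statements of a test block in order, record names introduced by `const`, and flag any reference to a name not yet declared. The `TestStmt` shape and `validateTestBlock` below are hypothetical simplifications of the real AST and of `validateTestBlocks`, and the message text is invented.

```typescript
// Hypothetical, simplified model of undeclared-name checking in test blocks.
type TestStmt =
  | { kind: "const"; name: string }                   // const X = "..." or const X = run ...
  | { kind: "mockPromptVar"; ref: string }            // mock prompt <ident>
  | { kind: "expect"; lhs: string; rhsVar?: string }; // expect_* lhs [var <ident>]

function validateTestBlock(stmts: TestStmt[]): string[] {
  const declared = new Set<string>();
  const errors: string[] = [];
  const check = (name: string): void => {
    if (!declared.has(name)) errors.push(`E_VALIDATE: undeclared name "${name}" in test block`);
  };

  for (const stmt of stmts) {
    switch (stmt.kind) {
      case "const":
        declared.add(stmt.name);
        break;
      case "mockPromptVar":
        check(stmt.ref);
        break;
      case "expect":
        check(stmt.lhs);
        if (stmt.rhsVar !== undefined) check(stmt.rhsVar);
        break;
    }
  }
  return errors;
}
```

A block that asserts on `response` without first writing `const response = run …` produces one error in this model, so the failure surfaces before any test runs.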

…lighter
- index.html: re-sync code samples (`say_hello.jh`, `say_hello.test.jh`, `recover_loop.jh`) and run-output blocks to match the current examples, including `match` rules, explicit `return`, prompt-placeholder previews, and the explicit `const response = run …` capture.
- index.html: rewrite the channels paragraph to lead with a concrete `findings -> analyst` example instead of the dense dispatch contract, and drop `Handle<T>` jargon from the async paragraph in favor of "resolves on the first read, or at the end of the embracing workflow."
- assets/js/main.js: extend the Jaiph syntax highlighter to color `match` / `return` / `fail` keywords, the `=>` arrow, single- and triple-backtick scripts, and bare regex literals at the start of an expression.
- examples/recover_loop.jh: trim stale top-of-file comment and shorten the recover-body logerr message; this is the canonical source the doc snippet now mirrors.
- Delete examples/recover_loop.test.jh, which referenced a `check_report` script that no longer exists in the recast example.
Made-with: Cursor

Make the timeout unit explicit in the config key (old name now produces an E_PARSE migration message). Rename DockerRunConfig.timeout to timeoutSeconds for the same reason. Tighten install URL parsing so git@host:path.git@ref and refs containing slashes work, and surface --raw, jaiph install, and jaiph compile in CLI usage. Sweep docs to match: clearer CLI vs runtime split, document JAIPH_UNSAFE, and update the renamed key everywhere (CHANGELOG entry adjusted for consistency). Drop the arg_nonempty shell helper in docs_parity.jh in favor of a plain if check. Made-with: Cursor

examples/ is the single source of truth for showcase workflows. e2e/agent_inbox.jh and e2e/async.jh were already dead (110_examples.sh copies the examples/ versions); e2e/say_hello.jh and its .test.jh were near-duplicates only used by 95_say_hello_failure_output.sh, which now copies the same files from examples/. Also drop the leftover scratch file tmp-sandbox-doc-example.jh from the repo root. Made-with: Cursor

- Normalize inline script names in e2e output (__inline_<id>).
- Update example and workflow e2e expectations; add recover_loop tests.
- Refresh compiler test lists; adjust recover_loop example.

Made-with: Cursor

The artifacts library now uses a named save_script and only exports save(). Remove artifacts.sh and the unpublished patch/apply helpers; document the workflow and trim the E2E to save-only. Add git.patch and wire engineer to copy the workspace diff into run artifacts after CI, docs, and queue cleanup. Fix engineer imports to keep sibling .jaiph modules on relative paths. Made-with: Cursor

…hour), up from 300, via `resolveDockerConfig` / `runtime.docker_timeout_seconds` when not overridden by `JAIPH_DOCKER_TIMEOUT` or in-file config. Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

- Rewrite 0.9.3 summary around the two core themes: sandboxing and compiler hardening. Tests and CLI/polish get their own summary lines.
- Backfill "All changes" with the commits that shipped since the 0.9.3 bump but weren't yet reflected: match-arm unknown-verb rejection and bare-identifier resolution, formatter round-trip for `return <ident>`, `const` literal bindings in test blocks, removal of the implicit `response` in tests (breaking), run-tree printing of the workflow return value, and prompt preview keeping authored `${var}` placeholders.
- Promote the `runtime.docker_timeout` → `runtime.docker_timeout_seconds` rename to a top-level breaking bullet.
- Delete `.github/workflows/release.yml`. The jaiph npm name is locked and we no longer publish to npm; tagged releases now only drive the GHCR runtime-image publish job in `ci.yml`.

Made-with: Cursor
Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>
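For readers migrating configs, the breaking `runtime.docker_timeout` → `runtime.docker_timeout_seconds` rename amounts to rejecting the old key with a pointer to the new one. The sketch below only illustrates that idea: `RENAMED_KEYS`, `ConfigParseError`, `assertKeyNotRenamed`, and the message wording are all hypothetical, and only the `E_PARSE` code and the two key names come from the commit messages.

```typescript
// Hypothetical table of renamed keys: old name -> new name.
const RENAMED_KEYS = new Map<string, string>([
  ["runtime.docker_timeout", "runtime.docker_timeout_seconds"],
]);

// Hypothetical error type carrying the E_PARSE code named in the commits.
class ConfigParseError extends Error {
  readonly code = "E_PARSE";
}

// Throws when a config key uses a name that has since been renamed.
// The message text is illustrative, not Jaiph's actual wording.
function assertKeyNotRenamed(key: string): void {
  const replacement = RENAMED_KEYS.get(key);
  if (replacement !== undefined) {
    throw new ConfigParseError(
      `"${key}" was renamed to "${replacement}"; update your config`,
    );
  }
}
```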

dzikowski and others added 10 commits April 17, 2026 15:55

Add Docker live run-artifact regression test.

435e7ae

Protect the host-mounted .jaiph/runs contract by asserting Docker-backed runs create and grow step .out/.err files before the workflow exits. Made-with: Cursor

Fix explicit nested managed calls in Docker runs.

81e9aa3

Keep nested run/ensure calls explicit across validation, formatting, and runtime execution, and make Docker use the local Jaiph package with a writable workspace fallback so container behavior matches local runs. Made-with: Cursor

Queue: Harden docker tasks, add version/name/description for jaiph co…

955a67b

…nfig file Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Remove target design documentation file to streamline project focus a…

348e3e4

…nd eliminate outdated content. Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

dzikowski force-pushed the nightly branch from 8040d74 to 348e3e4 Compare April 20, 2026 11:26

dzikowski and others added 19 commits April 20, 2026 13:57

Queue: Add cleanup tasks

58bbcca

Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

Attempt to fix CI

9052ad0

Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

ci(docker): publish runtime image for amd64 and arm64

7d1255b

Add Docker Buildx and multi-platform build so GHCR tags include linux/arm64 for Apple Silicon hosts, alongside linux/amd64. Made-with: Cursor

ci: publish Docker image only after full CI succeeds

37e6687

Move GHCR runtime build from a standalone workflow into CI with needs on test, e2e, docs-local, and e2e-wsl. Same triggers: nightly branch and v* tags. Remove docker-publish.yml to avoid duplicate pushes. Made-with: Cursor

dzikowski and others added 28 commits April 22, 2026 07:54

fix(docker): copy overlay-run.sh for builder stage npm run build

262cdb4

Made-with: Cursor

Queue: Harden compiler

56b8c57

Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>

test: fix e2e and compiler test fixtures

f65d11b

- Normalize inline script names in e2e output (__inline_<id>).
- Update example and workflow e2e expectations; add recover_loop tests.
- Refresh compiler test lists; adjust recover_loop example.

Made-with: Cursor
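Normalizing inline script names keeps e2e expectations stable when the generated id changes between runs. A minimal sketch of that kind of normalization follows; `normalizeInlineNames` is a hypothetical helper, and the assumption that generated ids are lowercase hex digits is mine, not something the commit states.

```typescript
// Assumption: generated names look like "__inline_" followed by hex digits.
const INLINE_NAME = /__inline_[0-9a-f]+/g;

// Replace every generated inline-script name with the stable
// placeholder "__inline_<id>" so output can be compared verbatim.
function normalizeInlineNames(output: string): string {
  return output.replace(INLINE_NAME, "__inline_<id>");
}
```

The placeholder itself does not match the pattern, so applying the function twice gives the same result as applying it once.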

Docker: Default container execution timeout is **3600** seconds (one …

efb633d

…hour), up from 300, via `resolveDockerConfig` / `runtime.docker_timeout_seconds` when not overridden by `JAIPH_DOCKER_TIMEOUT` or in-file config. Signed-off-by: Jakub Dzikowski <jakub.t.dzikowski@gmail.com>
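The fallback described in this commit (a 3600-second default unless `JAIPH_DOCKER_TIMEOUT` or in-file config overrides it) can be sketched as follows. This is not the body of `resolveDockerConfig`: `resolveTimeoutSecondsSketch` is a hypothetical name, and the commit does not say which override wins, so the sketch assumes the environment variable takes precedence over in-file config and that an unparsable value is ignored.

```typescript
// Default stated in the commit: one hour.
const DEFAULT_DOCKER_TIMEOUT_SECONDS = 3600;

// Assumed precedence (not confirmed by the source):
// JAIPH_DOCKER_TIMEOUT, then in-file config, then the default.
function resolveTimeoutSecondsSketch(
  env: Record<string, string | undefined>,
  inFileSeconds?: number,
): number {
  const raw = env["JAIPH_DOCKER_TIMEOUT"];
  if (raw !== undefined && raw.trim() !== "") {
    const parsed = Number(raw);
    // Assumption: a non-numeric or non-positive value falls through.
    if (Number.isFinite(parsed) && parsed > 0) {
      return parsed;
    }
  }
  if (inFileSeconds !== undefined) {
    return inFileSeconds;
  }
  return DEFAULT_DOCKER_TIMEOUT_SECONDS;
}
```

Passing the environment in as a plain record, rather than reading `process.env` directly, keeps the sketch easy to exercise in isolation.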

dzikowski merged commit aa033da into main Apr 24, 2026
8 checks passed

dzikowski deleted the nightly branch April 24, 2026 14:59

dzikowski merged 89 commits into main from nightly
