diff --git a/docs/00_overview/MVP2_DASHBOARD.md b/docs/00_overview/MVP2_DASHBOARD.md index 9b6ce04f..6f84335a 100644 --- a/docs/00_overview/MVP2_DASHBOARD.md +++ b/docs/00_overview/MVP2_DASHBOARD.md @@ -30,7 +30,7 @@ Plan approved; run /impl-execute to ship | Open bugs | 7 | | Legacy "Path to MVP2" | 17 items — scoped-not-done + bugs + chore-ideas only (excludes feat/infra ideas) | | Backlog ideas | 3 idea-only feat/infra (not yet scoped into MVP2) | -| In flight | 0 feature(s) actively shipping | +| In flight | 1 feature(s) actively shipping | ## Pipeline @@ -66,9 +66,11 @@ Plan approved; run /impl-execute to ship | [bug_backend_suite_nondeterministic_caplog_isolation](implemented_features/2026_06_01_bug_backend_suite_nondeterministic_caplog_isolation/idea.md) | Bug | Many backend unit tests assert on captured log records (`caplog` / a structlog capture fixture) and fail with empty-capture shapes (`assert []`, `assert 'x' in []`) when run in the full randomized sui | — | [PR #364](https://github.com/SoundMindsAI/relyloop/pull/364) merged 2026-06-01 | | [bug_contract_allowlists_outdated_after_mvp2_features](implemented_features/2026_06_01_bug_contract_allowlists_outdated_after_mvp2_features/idea.md) | Bug | Three separate contract-test allowlists were not updated as features shipped through MVP2. 
Each is a "hand-maintained canonical list of valid values" that drifts when a feature adds new entries to the | — | [PR #364](https://github.com/SoundMindsAI/relyloop/pull/364) merged 2026-06-01 | -### Implementing (0) +### Implementing (1) -_None._ +| # | Priority | Feature | Type | One-liner | Depends on | Status | +|---|---|---|---|---|---|---| +| 1 | P2 | [bug_cluster_url_ssrf_hostname_bypass](planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/feature_spec.md) | Bug | When private clusters are disallowed (`RELYLOOP_ALLOW_PRIVATE_CLUSTERS=False`), a `base_url` whose host **resolves** to a private / loopback / link-local / reserved / multicast / unspecified / carrier | — | [PR #510](https://github.com/SoundMindsAI/relyloop/pull/510) | ### Plan (3) @@ -82,7 +84,7 @@ _None._ _None._ -### Idea (17) +### Idea (16) | # | Priority | Feature | Type | One-liner | Depends on | Status | |---|---|---|---|---|---|---| @@ -90,19 +92,18 @@ _None._ | 2 | P2 | [chore_overnight_result_card_screenshot](planned_features/02_mvp2/chore_overnight_result_card_screenshot/idea.md) | Chore | The `docs/08_guides/tutorial-first-study.md` Step 12 sub-section *"In the morning — read the overnight result card"* (verified at [tutorial-first-study.md:510](../docs/08_guides/tutorial-first-study.m | — | Idea — deferred FR-9 deliverable from PR #442 | | 3 | P2 | [chore_solr_post_pipeline_followups](planned_features/02_mvp2/chore_solr_post_pipeline_followups/idea.md) | Chore | The 13-story `infra_adapter_solr` execution surfaced several follow-on items that fit neither the original spec nor any sister feature folder. 
None block the MVP2 Solr release — they're operator-exper | — | Idea — tangential observations from `infra_adapter_solr` end-to-end | | 4 | P2 | [chore_test_router_conditional_mount](planned_features/02_mvp2/chore_test_router_conditional_mount/idea.md) | Chore | The `_test` router exposes data-mutating endpoints used only for deterministic E2E (seed a completed study, demo reseed, hard-delete studies/judgment-lists/proposals). Today it is registered **uncondi | — | Idea — surfaced during a codebase-wide security review (branch `claude/codebase-security-review-6njwio`) | -| 5 | P2 | [bug_cluster_url_ssrf_hostname_bypass](planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/idea.md) | Bug | The cluster registration `base_url` validator is intended to stop SSRF into internal/cloud-metadata endpoints (it cites "spec §10 Threat 3"), but the guard only fires when the host parses as a **liter | — | Idea — surfaced during a codebase-wide security review (branch `claude/codebase-security-review-6njwio`) | -| 6 | P2 | [bug_e2e_teardown_chain_node_delete_500](planned_features/02_mvp2/bug_e2e_teardown_chain_node_delete_500/idea.md) | Bug | The E2E global-teardown deletes seeded rows in a fixed order (per `chore_e2e_test_rows_isolation` Story 1.2 cleanup registration). 
For auto-followup **chains**, the seeded nodes are `queued` studies c | — | Idea — tangential discovery during `feat_overnight_autopilot` (Story 4.2 E2E, PR forthcoming) | -| 7 | P2 | [bug_request_id_header_unvalidated_log_injection](planned_features/02_mvp2/bug_request_id_header_unvalidated_log_injection/idea.md) | Bug | `RequestIDMiddleware` adopts a client-supplied `X-Request-ID` header verbatim with no validation of length or character set: | — | Idea — surfaced during a codebase-wide security review (branch `claude/codebase-security-review-6njwio`) | -| 8 | P2 | [bug_reseed_failure_blocks_retry_arq_singleton_dedup](planned_features/02_mvp2/bug_reseed_failure_blocks_retry_arq_singleton_dedup/idea.md) | Bug | `run_demo_reseed` is enqueued with a fixed Arq job id `demo_reseed:singleton` (the singleton concurrency guard). When a run reaches a terminal state, Arq stores its **result** under `arq:result:demo_r | — | Idea — tangential discovery while verifying `fix(demo): add Solr (8983) to the reseed engine host-URL mapping` (branch `feat_demo_reseed_solr_and_steplog`) | -| 9 | P2 | [bug_studies_detail_vitest_intermittent_timeout](planned_features/02_mvp2/bug_studies_detail_vitest_intermittent_timeout/idea.md) | Bug | Under the full `pnpm test` run (`vitest run`, default worker pool), the Study-detail-page render test sometimes blocks past the 5 s `testTimeout` default — but the test itself is data-driven from mock | — | Idea — captured during `chore_template_library_expansion` post-impl tangential sweep | -| 10 | P2 | [bug_webhook_concurrent_merge_race_timing_sensitive](planned_features/02_mvp2/bug_webhook_concurrent_merge_race_timing_sensitive/idea.md) | Bug | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. | — | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. 
| -| 11 | Backlog | [infra_arq_subprocess_test](planned_features/02_mvp2/infra_arq_subprocess_test/idea.md) | Infra | Idea (deferred from `feat_study_lifecycle` Phase 2 / PR #25 final GPT-5.5 review). Still applicable as of 2026-05-14: the three in-process tests cited below still cover the resume contract correctly; | — | Idea (deferred from `feat_study_lifecycle` Phase 2 / PR #25 final GPT-5.5 review). Still applicable as of 2026-05-14: the three in-process tests cited below still cover the resume contract correctly; a subprocess test would add a narrow Arq-version-regression guard. | -| 12 | Backlog | [infra_pr_yml_split_backend_test_lanes](planned_features/02_mvp2/infra_pr_yml_split_backend_test_lanes/idea.md) | Infra | The heavy `backend (tests + coverage)` job in `.github/workflows/pr.yml` runs the full `pytest backend/tests/` matrix (unit + integration + contract) serially in one job with `--cov` gating at `fail_u | — | Idea — **deferred (defer-until-binding-constraint)**. Carved out of `chore_pr_yml_parallelize_backend_job` (now in `implemented_features/2026_06_05_*`; see "Relationship to other work" below for the link) at its 2026-06-05 descope. Pick up only when the integration layer becomes the binding CI constraint after other critical-path work lands. | -| 13 | Backlog | [infra_smoke_fork_pr_secret_skip](planned_features/02_mvp2/infra_smoke_fork_pr_secret_skip/idea.md) | Infra | `.github/workflows/pr.yml` triggers on `pull_request:` ([pr.yml:43](../.github/workflows/pr.yml)) — **not** `pull_request_target`. 
GitHub deliberately withholds repository secrets from workflows trigg | — | Idea — tangential discovery while merging PR #387 (`chore_arq_pool_aclose_deprecation`) | -| 14 | Backlog | [chore_auto_followup_parent_advisory_lock](planned_features/02_mvp2/chore_auto_followup_parent_advisory_lock/idea.md) | Chore | The shipped `feat_auto_followup_studies` worker uses a two-layer idempotency scheme: | — | Idea — captured as a standalone file to resolve broken cross-references in `feat_auto_followup_studies` D-11 + plan F2 + `bug_auto_followup_completed_parent_stop_chain_race/idea.md`. The slug was coined 2026-05-24 in D-11 but only existed as descriptive prose across other documents until now. | -| 15 | Backlog | [chore_e2e_overnight_strategy_radix_select_timing](planned_features/02_mvp2/chore_e2e_overnight_strategy_radix_select_timing/idea.md) | Chore | The Story 3.2 E2E spec walks the create-study wizard to Step 5, clicks the depth `` becomes visible. In chromium against `pnpm dev`, t | — | Idea — tangential follow-up captured during `feat_overnight_final_solution` Story 3.2 implementation | -| 16 | Backlog | [chore_ubi_hybrid_template_render](planned_features/02_mvp2/chore_ubi_hybrid_template_render/idea.md) | Chore | Idea — contract decision deferred (NOT a worker bug) | — | Idea — contract decision deferred (NOT a worker bug) | -| 17 | Backlog | [bug_chat_long_conversation_truncation](planned_features/02_mvp2/bug_chat_long_conversation_truncation/idea.md) | Bug | [`backend/app/services/agent_chat.send_user_message`](../../backend/app/services/agent_chat.py) defensively caps the OpenAI history at the most recent `HISTORY_MAX_MESSAGES = 100` messages… | — | Held for MVP2 (decided 2026-05-13). Folder renamed with `_mvp2` suffix to make the deferral visible at-a-glance in `ls docs/00_overview/planned_features/`. 
Resume work when MVP2 starts — no technical dependency on MVP2 infra (audit_log is N/A; Langfuse is convenience only); the deferral is scope discipline + zero current impact (latent bug, no operator has hit the 100-message cap). | +| 5 | P2 | [bug_e2e_teardown_chain_node_delete_500](planned_features/02_mvp2/bug_e2e_teardown_chain_node_delete_500/idea.md) | Bug | The E2E global-teardown deletes seeded rows in a fixed order (per `chore_e2e_test_rows_isolation` Story 1.2 cleanup registration). For auto-followup **chains**, the seeded nodes are `queued` studies c | — | Idea — tangential discovery during `feat_overnight_autopilot` (Story 4.2 E2E, PR forthcoming) | +| 6 | P2 | [bug_request_id_header_unvalidated_log_injection](planned_features/02_mvp2/bug_request_id_header_unvalidated_log_injection/idea.md) | Bug | `RequestIDMiddleware` adopts a client-supplied `X-Request-ID` header verbatim with no validation of length or character set: | — | Idea — surfaced during a codebase-wide security review (branch `claude/codebase-security-review-6njwio`) | +| 7 | P2 | [bug_reseed_failure_blocks_retry_arq_singleton_dedup](planned_features/02_mvp2/bug_reseed_failure_blocks_retry_arq_singleton_dedup/idea.md) | Bug | `run_demo_reseed` is enqueued with a fixed Arq job id `demo_reseed:singleton` (the singleton concurrency guard). 
When a run reaches a terminal state, Arq stores its **result** under `arq:result:demo_r | — | Idea — tangential discovery while verifying `fix(demo): add Solr (8983) to the reseed engine host-URL mapping` (branch `feat_demo_reseed_solr_and_steplog`) | +| 8 | P2 | [bug_studies_detail_vitest_intermittent_timeout](planned_features/02_mvp2/bug_studies_detail_vitest_intermittent_timeout/idea.md) | Bug | Under the full `pnpm test` run (`vitest run`, default worker pool), the Study-detail-page render test sometimes blocks past the 5 s `testTimeout` default — but the test itself is data-driven from mock | — | Idea — captured during `chore_template_library_expansion` post-impl tangential sweep | +| 9 | P2 | [bug_webhook_concurrent_merge_race_timing_sensitive](planned_features/02_mvp2/bug_webhook_concurrent_merge_race_timing_sensitive/idea.md) | Bug | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. | — | Idea — surfaced during `bug_demo_clusters_unreachable_in_healthz` PR #236 CI. | +| 10 | Backlog | [infra_arq_subprocess_test](planned_features/02_mvp2/infra_arq_subprocess_test/idea.md) | Infra | Idea (deferred from `feat_study_lifecycle` Phase 2 / PR #25 final GPT-5.5 review). Still applicable as of 2026-05-14: the three in-process tests cited below still cover the resume contract correctly; | — | Idea (deferred from `feat_study_lifecycle` Phase 2 / PR #25 final GPT-5.5 review). Still applicable as of 2026-05-14: the three in-process tests cited below still cover the resume contract correctly; a subprocess test would add a narrow Arq-version-regression guard. 
| +| 11 | Backlog | [infra_pr_yml_split_backend_test_lanes](planned_features/02_mvp2/infra_pr_yml_split_backend_test_lanes/idea.md) | Infra | The heavy `backend (tests + coverage)` job in `.github/workflows/pr.yml` runs the full `pytest backend/tests/` matrix (unit + integration + contract) serially in one job with `--cov` gating at `fail_u | — | Idea — **deferred (defer-until-binding-constraint)**. Carved out of `chore_pr_yml_parallelize_backend_job` (now in `implemented_features/2026_06_05_*`; see "Relationship to other work" below for the link) at its 2026-06-05 descope. Pick up only when the integration layer becomes the binding CI constraint after other critical-path work lands. | +| 12 | Backlog | [infra_smoke_fork_pr_secret_skip](planned_features/02_mvp2/infra_smoke_fork_pr_secret_skip/idea.md) | Infra | `.github/workflows/pr.yml` triggers on `pull_request:` ([pr.yml:43](../.github/workflows/pr.yml)) — **not** `pull_request_target`. GitHub deliberately withholds repository secrets from workflows trigg | — | Idea — tangential discovery while merging PR #387 (`chore_arq_pool_aclose_deprecation`) | +| 13 | Backlog | [chore_auto_followup_parent_advisory_lock](planned_features/02_mvp2/chore_auto_followup_parent_advisory_lock/idea.md) | Chore | The shipped `feat_auto_followup_studies` worker uses a two-layer idempotency scheme: | — | Idea — captured as a standalone file to resolve broken cross-references in `feat_auto_followup_studies` D-11 + plan F2 + `bug_auto_followup_completed_parent_stop_chain_race/idea.md`. The slug was coined 2026-05-24 in D-11 but only existed as descriptive prose across other documents until now. | +| 14 | Backlog | [chore_e2e_overnight_strategy_radix_select_timing](planned_features/02_mvp2/chore_e2e_overnight_strategy_radix_select_timing/idea.md) | Chore | The Story 3.2 E2E spec walks the create-study wizard to Step 5, clicks the depth `` becomes visible. 
In chromium against `pnpm dev`, t | — | Idea — tangential follow-up captured during `feat_overnight_final_solution` Story 3.2 implementation | +| 15 | Backlog | [chore_ubi_hybrid_template_render](planned_features/02_mvp2/chore_ubi_hybrid_template_render/idea.md) | Chore | Idea — contract decision deferred (NOT a worker bug) | — | Idea — contract decision deferred (NOT a worker bug) | +| 16 | Backlog | [bug_chat_long_conversation_truncation](planned_features/02_mvp2/bug_chat_long_conversation_truncation/idea.md) | Bug | [`backend/app/services/agent_chat.send_user_message`](../../backend/app/services/agent_chat.py) defensively caps the OpenAI history at the most recent `HISTORY_MAX_MESSAGES = 100` messages… | — | Held for MVP2 (decided 2026-05-13). Folder renamed with `_mvp2` suffix to make the deferral visible at-a-glance in `ls docs/00_overview/planned_features/`. Resume work when MVP2 starts — no technical dependency on MVP2 infra (audit_log is N/A; Langfuse is convenience only); the deferral is scope discipline + zero current impact (latent bug, no operator has hit the 100-message cap). | ## Dependency graph diff --git a/docs/00_overview/mvp2_dashboard.html b/docs/00_overview/mvp2_dashboard.html index 7b3e2887..8aade395 100644 --- a/docs/00_overview/mvp2_dashboard.html +++ b/docs/00_overview/mvp2_dashboard.html @@ -446,7 +446,7 @@

 MVP2 Progress
 In flight:
-0 feature(s) actively shipping
+1 feature(s) actively shipping
@@ -463,7 +463,7 @@
 Pipeline
-Idea 17
+Idea 16
@@ -517,19 +517,6 @@
 Idea 17
-Bug
-P2
-The cluster registration `base_url` validator is intended to stop SSRF into internal/cloud-metadata endpoints (it cites "spec §10 Threat 3"), but the guard only fires when the host parses as a **liter
@@ -736,7 +723,19 @@
 Plan 3
-Implementing 0
+Implementing 1
+Bug
+P2
+PR #510
+When private clusters are disallowed (`RELYLOOP_ALLOW_PRIVATE_CLUSTERS=False`), a `base_url` whose host **resolves** to a private / loopback / link-local / reserved / multicast / unspecified / carrier
+deferred: Phase 2
diff --git a/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/implementation_plan.md b/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/implementation_plan.md
index f96422bc..56f95f51 100644
--- a/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/implementation_plan.md
+++ b/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/implementation_plan.md
@@ -1,7 +1,7 @@
 # Implementation Plan — Cluster base_url SSRF guard (hostname-aware)
 
 **Date:** 2026-06-09
-**Status:** Ready for Execution
+**Status:** Complete — Phase 1 (PR #510, squash-merged `3cb28c7`, 2026-06-09; Phase 2 deferred → phase2_idea.md)
 **Primary spec:** [`feature_spec.md`](feature_spec.md)
 **Policy source(s):** CLAUDE.md Absolute Rules #4 (engine-adapter boundary), #10 (never log secrets); `docs/01_architecture/cluster-lifecycle.md`
diff --git a/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/pipeline_status.md b/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/pipeline_status.md
index 64e111f1..f7e30862 100644
--- a/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/pipeline_status.md
+++ b/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/pipeline_status.md
@@ -1,5 +1,7 @@
 # Pipeline Status — Cluster base_url SSRF guard (hostname-aware)
 
+**Release:** mvp2
+
 ## Idea
 - Status: Complete
 - File: idea.md
@@ -20,4 +22,8 @@
 - Phases covered: Phase 1 (Phase 2 connect-time IP pinning deferred → phase2_idea.md)
 
 ## Implementation
-- Status: Not started
+- Status: Complete — Phase 1 (PR #510, squash-merged `3cb28c7`, 2026-06-09)
+- CI: all `pr.yml` jobs green (smoke skipped — opt-in/off)
+- Stories: 3/3 complete (classifier / orchestrator+wiring / docs)
+- Review: Opus self-review (GPT-5.5 unreachable) + Gemini Code Assist 2 Medium findings accepted (bounded DNS timeout, 
malformed-port 422) +- **Folder retained in `planned_features/` — Phase 2 (connect-time IP pinning, `phase2_idea.md`) is still pending**, so it is NOT moved to `implemented_features/` per the impl-execute deferred-phase rule. diff --git a/state.md b/state.md index e9334d05..a03694a3 100644 --- a/state.md +++ b/state.md @@ -2,7 +2,7 @@ > Read this first. A one-page snapshot: current focus, the last few merges, what's in flight, what's queued, and where the project sits in the MVP1 → MVP2 → MVP3 → GA roadmap. **Historical feature-merge narrative + chained execution context lives in [`state_history.md`](state_history.md)** — new merge entries land there, not here (per `chore_state_md_size_compression`, 2026-05-29). Keep this file loadable in a single `Read` call. -**Last updated:** 2026-06-07 (**`chore_overnight_result_card_screenshot` finalized** — PR #492 reviewed via `/pr-review` and squash-merged (`4128572`); this entry is its `state.md`/`state_history.md` finalization. Ships the tutorial Step 12 morning-result-card PNG + the first-ever tutorial-image **ferry plumbing** across both doc-copy pipelines (`ui/scripts/copy-docs.mjs` `copyImageAssets`/`pruneStaleImages`; `website/scripts/build_guides.py` `copy_long_form_images()` + a back-compat-defaulted prune kwarg), both guarding the `images/` subdir from the flat `.md`/`rmtree` prune and no-op'ing when the source dir is absent. **Docs/test/script only — no code-path/API/migration** (Alembic head stays `0023`). 17 new tests (10 vitest + 7 pytest). GPT-5.5 unreachable → Opus self-review; Gemini 1 Med accepted (`02450dd`: hoist `readFileSync` to a top-level import vs inline `require`). All three freshness gates (`generated-artifacts-fresh`/`copy-docs`/`build-guides`) + 18/19 checks green (smoke skipped — opt-in/off). Full narrative in [`state_history.md`](state_history.md).) 
+**Last updated:** 2026-06-09 (**`bug_cluster_url_ssrf_hostname_bypass` Phase 1 merged** — PR #510, squash-merged `3cb28c7`; closes the cluster `base_url` SSRF hostname-bypass via a flag-gated async resolve-and-classify guard before any probe. New `domain/cluster/url_policy.py` (pure classifier) + `services/cluster_url_policy.py` (async orchestrator) + `400 CLUSTER_URL_BLOCKED`; the two `base_url` validators de-duped to one structural helper. No migration/UI (Alembic head stays `0023`). Surfaced by security review #504 (auto-closed); driven through `/pipeline --auto` (spec→plan→3 stories). 52 new tests; Gemini 2 Med accepted (bounded DNS, malformed-port 422). **Phase 2 (connect-time IP pinning) deferred — folder stays in `planned_features/02_mvp2/` with `phase2_idea.md`, NOT archived.** Previously: **`chore_overnight_result_card_screenshot` finalized** — PR #492 reviewed via `/pr-review` and squash-merged (`4128572`); this entry is its `state.md`/`state_history.md` finalization. Ships the tutorial Step 12 morning-result-card PNG + the first-ever tutorial-image **ferry plumbing** across both doc-copy pipelines (`ui/scripts/copy-docs.mjs` `copyImageAssets`/`pruneStaleImages`; `website/scripts/build_guides.py` `copy_long_form_images()` + a back-compat-defaulted prune kwarg), both guarding the `images/` subdir from the flat `.md`/`rmtree` prune and no-op'ing when the source dir is absent. **Docs/test/script only — no code-path/API/migration** (Alembic head stays `0023`). 17 new tests (10 vitest + 7 pytest). GPT-5.5 unreachable → Opus self-review; Gemini 1 Med accepted (`02450dd`: hoist `readFileSync` to a top-level import vs inline `require`). All three freshness gates (`generated-artifacts-fresh`/`copy-docs`/`build-guides`) + 18/19 checks green (smoke skipped — opt-in/off). Full narrative in [`state_history.md`](state_history.md).) 
## Where the roadmap sits @@ -16,8 +16,8 @@ MVP1 (v0.1) **shipped** — all six differentiators live (Bayesian/TPE optimizer ## Current branch / execution context -- **Branch:** `main` (`chore_overnight_result_card_screenshot` just merged — PR #492, 2026-06-07; the prior second 3-item MVP2 queue #480/#481/#482 shipped 2026-06-05). All `pr.yml` checks green (smoke skipped — opt-in/off). Heavy backend CI check is named `backend (tests + coverage)`. -- **Active feature:** None in flight — `chore_overnight_result_card_screenshot` (#492) and the prior second 3-item queue (#480/#481/#482) all shipped. **Still deferred:** `chore_demo_seeding_integration_tests_rewrite` (14-story DB-only integration choreography; blocked on a local stack for safe CI-blind validation) and `infra_pr_yml_split_backend_test_lanes` (defer-until-integration-is-the-binding-CI-constraint). GPT-5.5 unreachable in this env → Opus self-review substitution. +- **Branch:** `main` (`bug_cluster_url_ssrf_hostname_bypass` Phase 1 just merged — PR #510, 2026-06-09; before it `chore_overnight_result_card_screenshot` PR #492). All `pr.yml` checks green (smoke skipped — opt-in/off). Heavy backend CI check is named `backend (tests + coverage)`. +- **Active feature:** None in flight. **Deferred phase tracked:** `bug_cluster_url_ssrf_hostname_bypass` **Phase 2** (connect-time IP pinning for DNS rebinding) remains in `planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass/phase2_idea.md` (folder intentionally NOT archived since a phase is pending). **Still deferred:** `chore_demo_seeding_integration_tests_rewrite` (14-story DB-only integration choreography; blocked on a local stack for safe CI-blind validation) and `infra_pr_yml_split_backend_test_lanes` (defer-until-integration-is-the-binding-CI-constraint). GPT-5.5 unreachable in this env → Opus self-review substitution. 
- **Alembic head:** `0023_proposals_superseded_status` (unchanged — `feat_fts_rank_ordering` is no-migration; head last moved by `feat_overnight_final_solution_phase3` PR #457). - **Python:** 3.13. **Frontend stack:** Next 16 (App Router + Turbopack), React 19, Tailwind 4 (CSS-first), Vitest 4, ESLint 9 (flat), TypeScript 6, Playwright (chromium, single worker) for E2E. - **Coverage gates:** backend 80% (`fail_under` in pyproject), UI vitest + tsc + ESLint + Next build, plus a full-stack smoke E2E job. Live pass counts: see the latest `pr.yml` run (the historical per-feature counts moved to `state_history.md`). @@ -26,12 +26,12 @@ MVP1 (v0.1) **shipped** — all six differentiators live (Bayesian/TPE optimizer Detail + reasoning for each is in [`state_history.md`](state_history.md). +- **2026-06-09** — `bug_cluster_url_ssrf_hostname_bypass` Phase 1 (PR #510, squash-merged `3cb28c7`). **Hostname-aware SSRF guard on cluster `base_url`.** The prior validator only inspected literal IPs, so in the hardened posture (`RELYLOOP_ALLOW_PRIVATE_CLUSTERS=False`) any DNS hostname — `metadata.google.internal`, an internal name, or anything resolving to a private/loopback/link-local IP — bypassed the check and got probed. New flag-gated async `assert_base_url_allowed` (`backend/app/services/cluster_url_policy.py`) runs **before** any adapter build / probe in `register_cluster` + `test_cluster_connection`: metadata-hostname denylist → literal-IP classify → else `getaddrinfo` (bounded 5s) + classify every resolved address; raises `ClusterUrlBlocked` → **400 `CLUSTER_URL_BLOCKED`**. Pure classifier in `backend/app/domain/cluster/url_policy.py` (private/loopback/link-local/reserved/multicast/unspecified + CGNAT + IPv4-mapped unwrap); the two duplicated `base_url` Pydantic validators collapsed to one structural helper (scheme + host + malformed-port→422). **Default posture unchanged** (flag defaults `True` → strict no-op; local Docker hostnames keep working). 
**No migration, no UI** (Alembic head stays `0023`). Surfaced by the codebase security review (#504, auto-closed). Driven through the full pipeline (`/pipeline --auto`): spec → plan → 3 stories. 52 new tests (domain classifier + service orchestrator incl. enforcement-before-probe + real-DNS integration + contract envelope). GPT-5.5 unreachable → Opus self-review; Gemini 2 Medium accepted (`8c2ca37`: bounded DNS timeout + malformed-port 422). All `pr.yml` jobs green (smoke skipped). **Phase 2 (connect-time IP pinning for DNS rebinding) deferred** → folder stays in `planned_features/02_mvp2/` with `phase2_idea.md`; NOT moved to `implemented_features/`. - **2026-06-07** — `chore_overnight_result_card_screenshot` (PR #492, squash-merged `4128572`). **Tutorial Step 12 morning-result-card screenshot + image-ferry plumbing.** Closes FR-9 of `feat_overnight_final_solution_phase2` (PR #442) — the morning result-card prose shipped there but the screenshot deliverable didn't (the standard demo seed can't produce a terminated `follow_suggestions` chain with a winning proposal + digest; the plan's hard-fallback escape hatch authorized this follow-up chore). 
Ships three things: (1) the 34 KB PNG (`docs/08_guides/images/12-overnight-result-card.png`, captured 1440×960 against a deterministically-seeded anchor+follow-up chain, `selected_followup_kind='narrow'`, +0.0450 lift); (2) the Step 12 `![Overnight result card](images/12-overnight-result-card.png)` reference in `tutorial-first-study.md`; (3) the **first-ever tutorial-image ferry plumbing** — `ui/scripts/copy-docs.mjs` gains `copyImageAssets`/`pruneStaleImages` (mirrors `docs/08_guides/images/*.png` → `ui/public/docs/images/`) and `website/scripts/build_guides.py` gains `copy_long_form_images()` + a back-compat-defaulted `copied_long_form_images` prune kwarg (mirrors → `website/docs/guides/in-depth/images/`); both guard the `images/` subdir from the flat `.md`/`rmtree` prune, no-op when the source dir is absent, and are path-traversal-safe by construction. **Docs/test/script only — no code-path/API/migration** (head stays `0023`). 17 new tests (10 vitest + 7 pytest) cover copy + prune + steady-state-absent + overwrite-stale + protect-subdir-during-flat-prune + back-compat. GPT-5.5 unreachable → Opus self-review; Gemini 1 Med accepted (`02450dd`: hoist `readFileSync` to a top-level `node:fs` import vs inline `require`). All three freshness gates (`generated-artifacts-fresh`/`copy-docs`/`build-guides`) + 18/19 checks green (smoke skipped — opt-in/off). Bundled preflight idea patches (locks D-1 asset path, D-2 regen tool, D-3 plumbing-in-scope) + MVP2 dashboard regen. State finalization shipped as this separate docs PR (#492 was reviewed/merged via `/pr-review`). - **2026-06-05** — `bug_seed_meaningful_demos_silent_bulk_errors` (PR #482, squash-merged `7991147`). 
**`seed_meaningful_demos.py` fails loud on `/_bulk` errors.** The ESCI rich-scenario bulk loop read and DISCARDED the `/_bulk` response; ES bulk returns HTTP 200 even when the primary shard is INITIALIZING (error in the body), so `unavailable_shards_exception` on a cold ES — or any mapping bug — silently produced a partial/empty index while `make seed-demo` looked successful. **Script + test only — no code-path/API/migration** (head stays `0023`). Adds a **standalone urllib** retry helper (`_bulk_index_with_retry` + `_first_bulk_error`) mirroring the httpx-based `backend/app/scripts/seed_es.py` posture (`bug_smoke_seed_es_unavailable_shards_race`): parse the body, retry ONLY `unavailable_shards_exception` (3×2s), raise `RuntimeError` loud on any other error or exhausted retries. Locked **standalone** (not a shared `scripts/_es_bulk.py`) — the two scripts use different HTTP libs and `backend` already imports `SCENARIOS` *from* this script (cycle risk); `send`/`sleep` injectable for unit-testability. **GPT-5.5 unreachable → Opus self-review.** Gemini 1 Med accepted (`f24049d`): `_first_bulk_error` now type-guards the payload + returns a synthetic `unknown_bulk_error` when `errors:true` but no item error is found (a real gap in THIS caller — unlike `seed_es.py` it treats a `None` return as success), so a malformed body fails loud instead of phantom-success. 7 unit tests mirroring `test_seed_es_retry.py`; verified load-bearing via a `raise`→`return` mutation. Idea preflighted (stale `:917-935` → ~2549-2571; `scripts/seed_es.py` → `backend/app/scripts/seed_es.py`); bug_fix.md captures the locked design. All 19 `pr.yml` checks green. Finalization bundled (this batched-3 PR).
 - **2026-06-05** — `bug_relyloop_spec_ubi_section_drift` (PR #481, squash-merged `2b57848`). **Repoint 3 broken `relyloop-spec.md` UBI/Solr links to `implemented_features/`.** The §"Click-derived judgments" section linked `feat_ubi_judgments` + `infra_adapter_solr` via `planned_features/` paths, but both shipped + moved, so line 723 (`infra_adapter_solr` → `planned_features/02_mvp2/…`) and lines 2282 + 2293 (the original broken `../../00_overview/planned_features/…` pattern) 404'd; all three repointed to `implemented_features/2026_05_31_infra_adapter_solr/` and `implemented_features/2026_05_29_feat_ubi_judgments/` (verified resolve). **Docs-only — no code, no migration** (head stays `0023`). Preflight found the idea mostly OBE: the `(MVP1.5)` title + the §706 `feat_ubi_judgments` link were already fixed in earlier sweeps; the real remaining breakage was the 3 shipped-feature links. No Gemini findings; reduced docs-only check set green.
 - **2026-06-05** — `chore_demo_reseed_partial_completion_fast_test` (PR #480, squash-merged `878cd96`). **Fast unit guard for the demo-reseed partial-completion path.** The engine-tolerant behavior (an unreachable engine's scenario skips, the reseed still finishes `status="complete"` with a non-empty `scenarios_skipped` + exactly one `demo_reseed_partial_completion_engines_unreachable` WARN, AC-7) was asserted end-to-end only by the 13-19 min heavy-lane `test_demo_seeding_ubi_full.py`. **Test only — no production diff, no migration** (head stays `0023`). Drives the real `reseed_demo_state` orchestrator with every module-level I/O helper (`_post`/`_get`/`_put`/`_seed_real_study_for_scenario`/`_seed_rich_scenario`/UBI helpers) monkeypatched to canned success (locked approach **b′** — no httpx-URL mock, no seam extraction; control flow stays real, so no conflict with the deferred `chore_demo_seeding_integration_tests_rewrite`). `is_engine_reachable` reports only Solr down. 2 cases: only-Solr-skip (`scenarios_skipped == ["acme-kb-docs-solr"]`, status complete, one WARN) + AC-3 (a reachable mid-seed failure raises a generic `DemoSeedingError`, never a skip). **GPT-5.5 unreachable → Opus self-review.** Gemini 3 Med ALL accepted (`3c69188`): `_fake_get` gains `auth`, the `is_engine_reachable` mock gains `**kwargs` (real fn has keyword-only `timeout_s`), AC-3 reads `captured["progress"]` directly. Verified load-bearing via a WARN-suppression mutation; pure unit (no DB/engine/OpenAI). bug_fix.md captures the locked design. All 19 `pr.yml` checks green.
-- **2026-06-05** — `chore_pr_yml_parallelize_backend_job` (PR #478, squash-merged `ba11653`). **Drop redundant ruff/format/mypy from the heavy `backend` CI job.** **CI-config + docs only — no code, no migration** (head stays `0023`). Removed the heavy job's `ruff check` / `ruff format --check` / `mypy backend/` steps (each already covered by the always-run `static-checks-backend` job) + the now-unused mypy/ruff cache-restore step; renamed display name `backend (lint + typecheck + tests + coverage)` → `backend (tests + coverage)`. ~30-40s off the per-PR critical path; lint/type failures still go red via `static-checks-backend` (this lane just no longer self-aborts early on them). **Descoped at plan-design to lint-dedup only:** the lane-split's real win is just ~1-1.5min — the ~8min integration layer can't run under `-n auto` (FK-teardown collision `query_sets_cluster_id_fkey`, reverted on PR #291) — so the lane-split + `coverage combine` merge-gate + split-integration-by-service-container were deferred to the new [`infra_pr_yml_split_backend_test_lanes`](docs/00_overview/planned_features/02_mvp2/infra_pr_yml_split_backend_test_lanes/idea.md) idea. Shipped ad-hoc (no spec/plan — the obsolete full-pipeline `feature_spec.md` + `pipeline_status.md` from the abandoned spec→plan path were discarded).
`docs/06_vendor_docs/github-branch-protection.md` updated: added the `static-checks-*` jobs to the required-check set (lint/type lives there now, not in the heavy backend job) + the renamed check name. Gemini 1 Med accepted (`ec013dc`: repo-root relative-link depth in the idea's PREFLIGHT REFRESH block — file is 5 levels deep, links used 4 `..` → fixed to 5; both targets verified to resolve). No GPT-5.5 (unreachable; CI-config + docs only). All 18 `pr.yml` checks green (smoke skipped — opt-in/off). Finalization bundled the dashboard + public-roadmap regen (no extra PR).
-_(older entries — full narrative in [`state_history.md`](state_history.md): `chore_studies_post_arq_spy_fixture` PR #476, `chore_ubi_reader_search_after_pagination` PR #474, `feat_fts_rank_ordering` PR #472, `bug_judgment_header_omits_click_bucket` PR #470, `bug_baseline_phase_test_isolation` PR #466, `chore_cluster_detail_rung_badge` PR #464, `feat_ubi_llm_study_comparison` PR #461, `feat_query_normalization_tuning` PR #459, `feat_overnight_final_solution_phase3` PR #457, `feat_study_wizard_inline_judgment_generation` PR #453, `feat_walkthrough_video_cursor_captions` PR #451, `feat_website_walkthrough_guides` PR #448, `feat_proposal_full_param_space_view` PR #446, `feat_overnight_studies_summary_card` PR #444, `feat_overnight_final_solution_phase2` PR #442, `feat_overnight_final_solution` PR #440, `feat_studies_list_trial_convergence_columns` PR #438, `feat_list_count_columns` PR #436, `infra_generated_artifact_freshness_gate` PR #433, `chore_scorecard_pin_deps_postcss` PR #430, `bug_llm_capability_cache_no_refresh` PR #426, `infra_smoke_reseed_runtime_budget` PR #424, `feat_studies_convergence_visibility` PR #421/#422, `bug/cli-seed-ubi-missing-engine-type` PR #419, `chore_template_library_expansion` PR #416, `infra_solr_smoke_stability` PR #383, `infra_solr_ci_readiness` Phase 1 PR #367, MVP2 backlog batch PR #364, `feat_study_convergence_indicator` PR #352, `feat_overnight_autopilot` PR #343, `infra_adapter_solr` PR #336, …)_
+_(older entries — full narrative in [`state_history.md`](state_history.md): `chore_pr_yml_parallelize_backend_job` PR #478, `chore_studies_post_arq_spy_fixture` PR #476, `chore_ubi_reader_search_after_pagination` PR #474, `feat_fts_rank_ordering` PR #472, `bug_judgment_header_omits_click_bucket` PR #470, `bug_baseline_phase_test_isolation` PR #466, `chore_cluster_detail_rung_badge` PR #464, `feat_ubi_llm_study_comparison` PR #461, `feat_query_normalization_tuning` PR #459, `feat_overnight_final_solution_phase3` PR #457, `feat_study_wizard_inline_judgment_generation` PR #453, `feat_walkthrough_video_cursor_captions` PR #451, `feat_website_walkthrough_guides` PR #448, `feat_proposal_full_param_space_view` PR #446, `feat_overnight_studies_summary_card` PR #444, `feat_overnight_final_solution_phase2` PR #442, `feat_overnight_final_solution` PR #440, `feat_studies_list_trial_convergence_columns` PR #438, `feat_list_count_columns` PR #436, `infra_generated_artifact_freshness_gate` PR #433, `chore_scorecard_pin_deps_postcss` PR #430, `bug_llm_capability_cache_no_refresh` PR #426, `infra_smoke_reseed_runtime_budget` PR #424, `feat_studies_convergence_visibility` PR #421/#422, `bug/cli-seed-ubi-missing-engine-type` PR #419, `chore_template_library_expansion` PR #416, `infra_solr_smoke_stability` PR #383, `infra_solr_ci_readiness` Phase 1 PR #367, MVP2 backlog batch PR #364, `feat_study_convergence_indicator` PR #352, `feat_overnight_autopilot` PR #343, `infra_adapter_solr` PR #336, …)_
 ## In flight
diff --git a/website/docs/roadmap.md b/website/docs/roadmap.md
index daa9824b..35f3e829 100644
--- a/website/docs/roadmap.md
+++ b/website/docs/roadmap.md
@@ -208,7 +208,7 @@
     - 🟡 [PR Yml Split Backend Test Lanes](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/infra_pr_yml_split_backend_test_lanes)
     - 🟡 [Smoke Fork PR Secret Skip](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/infra_smoke_fork_pr_secret_skip)

-??? note "Maintenance & fixes (17)"
+??? note "Maintenance & fixes (21)"

     - ✅ [Backend Suite Nondeterministic Caplog Isolation](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/implemented_features/2026_06_01_bug_backend_suite_nondeterministic_caplog_isolation) · [#364](https://github.com/SoundMindsAI/relyloop/pull/364)
     - ✅ [Contract Allowlists Outdated After Mvp2 Features](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/implemented_features/2026_06_01_bug_contract_allowlists_outdated_after_mvp2_features) · [#364](https://github.com/SoundMindsAI/relyloop/pull/364)
@@ -216,15 +216,19 @@
     - ✅ [Studies Post Arq Spy Fixture](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/implemented_features/2026_06_05_chore_studies_post_arq_spy_fixture) · [#476](https://github.com/SoundMindsAI/relyloop/pull/476)
     - ✅ [Template Library Expansion](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/implemented_features/2026_06_02_chore_template_library_expansion) · [#416](https://github.com/SoundMindsAI/relyloop/pull/416)
     - ✅ [UBI Reader Search After Pagination](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/implemented_features/2026_06_05_chore_ubi_reader_search_after_pagination) · [#474](https://github.com/SoundMindsAI/relyloop/pull/474)
+    - 🟡 [Agent Confirmation Tool Name Word Boundary](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_agent_confirmation_tool_name_word_boundary)
     - 🟡 [Auto Followup Parent Advisory Lock](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_auto_followup_parent_advisory_lock)
     - 🟡 [Chat Long Conversation Truncation](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_chat_long_conversation_truncation)
+    - 🟡 [Cluster Url Ssrf Hostname Bypass](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_cluster_url_ssrf_hostname_bypass)
     - 🟡 [Demo Seeding Integration Tests Rewrite](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_demo_seeding_integration_tests_rewrite)
     - 🟡 [E2E Overnight Strategy Radix Select Timing](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_e2e_overnight_strategy_radix_select_timing)
     - 🟡 [E2E Teardown Chain Node Delete 500](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_e2e_teardown_chain_node_delete_500)
     - 🟡 [Overnight Result Card Screenshot](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_overnight_result_card_screenshot)
+    - 🟡 [Request Id Header Unvalidated Log Injection](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_request_id_header_unvalidated_log_injection)
     - 🟡 [Reseed Failure Blocks Retry Arq Singleton Dedup](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_reseed_failure_blocks_retry_arq_singleton_dedup)
     - 🟡 [Solr Post Pipeline Followups](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_solr_post_pipeline_followups)
     - 🟡 [Studies Detail Vitest Intermittent Timeout](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_studies_detail_vitest_intermittent_timeout)
+    - 🟡 [Test Router Conditional Mount](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_test_router_conditional_mount)
     - 🟡 [UBI Hybrid Template Render](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/chore_ubi_hybrid_template_render)
     - 🟡 [Webhook Concurrent Merge Race Timing Sensitive](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/02_mvp2/bug_webhook_concurrent_merge_race_timing_sensitive)

@@ -236,7 +240,11 @@

 ## GA v1 / v1.0 — Production-ready   ⬜ Planned

-⬜ **Planned** — *Production-ready*. Themed in the release matrix; individual features not yet filed.
+??? note "Maintenance & fixes (1)"
+
+    - ⬜ [Cors Credentials Origin Hardening](https://github.com/SoundMindsAI/relyloop/tree/main/docs/00_overview/planned_features/04_ga/chore_cors_credentials_origin_hardening)
+
+→ Full engineering view: [GA v1 / v1.0 dashboard](https://github.com/SoundMindsAI/relyloop/blob/main/docs/00_overview/GA_DASHBOARD.md)

 ---