
test: rewrite demo-reseed integration tests for the async POST→poll flow #514

Open
SoundMindsAI wants to merge 5 commits into main from feature/demo-seeding-integration-tests-rewrite

Conversation

@SoundMindsAI

Owner

Summary

Rewrites the 10 skipped demo-reseed integration tests for the async POST→202→poll contract. They were written against the old synchronous handler; bug_demo_reseed_fake_metric_regression made the flow asynchronous (Arq job plus Redis status poll). Both module-level skip markers are removed.

Implements chore_demo_seeding_integration_tests_rewrite.

  • Story 0.1 (D-3): Settings.relyloop_worker_api_base_url (default http://api:8000); the worker reads it instead of the hardcoded literal, so the harness redirects self-calls to the in-process uvicorn via a clean env override. No behavior change at the default.
  • Epic 1 (harness): module-scoped uvicorn + arq_ctx (real Redis handle for await run_demo_reseed(ctx)) + post_and_run_to_terminal helper; clears the Redis status key + the three Arq demo_reseed:singleton dedup keys before each test and between POSTs (the inline harness never consumes the queued job).
  • Epic 2 (11 cases): AC-1/2/3/5/12/13/14/15/16 + AC-Async (monotonic running→complete) + AC-Reg (worker-registration + enqueue guard). The worker runs inline so the advisory lock is pg_locks-observable and the cleanup gate is drivable from the same loop.
  • Epic 3 (AC-4): worker-side per-call timeout → terminal failed + cleanup (re-homed from the old 503-on-POST framing).

Test coverage

  • Integration: test_demo_seeding.py (11 cases) + test_demo_seeding_timeout.py (1 case); both skip markers removed.
  • Unit (test_demo_seeding_status.py) + contract (test_openapi_surface.py, test_test_endpoint_guard.py) unchanged.

Test plan

  • ruff format --check + ruff check + mypy backend/ (618 files, clean)
  • make test-unit (2656 passed — regression on the unchanged unit file + Story 0.1)
  • Both integration files type-check, lint, and import cleanly
  • The integration tests run in CI's heavy backend (tests + coverage) lane (Postgres + ES + OS + Redis service containers) — they skip in the sandbox (no local stack), so CI is the authoritative gate for this PR.

Notes / documented deviations

  • CI-blind: there is no local Postgres/ES/OS/Redis/uvicorn in the Claude Code remote sandbox, so the rewritten integration tests could not be executed locally; CI is the validation surface (operator-acknowledged).
  • Cross-model review: Opus self-review (GPT-5.5 unreachable); Gemini is the live cross-family gate.
  • Plan-vs-binding corrections (the worker imports its own bindings, so patching the demo_seeding originals wouldn't intercept worker calls): AC-Async spies backend.workers.demo_reseed.status_set; AC-12 patches the worker's run_demo_reseed_cleanup binding and the demo_seeding cleanup gate. Documented in code comments.
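
The binding issue behind these corrections can be reproduced with the standard library alone. In this sketch, `origin_sketch` and `worker_sketch` are hypothetical modules standing in for `demo_seeding` and the worker module; only the "patch where the name is looked up" behavior is the point.

```python
import sys
import types
from unittest import mock

# Hypothetical module that owns the original function.
origin = types.ModuleType("origin_sketch")
exec("def status_set(value):\n    return f'real:{value}'\n", origin.__dict__)
sys.modules["origin_sketch"] = origin

# Hypothetical worker module that imports its own binding of the function.
worker = types.ModuleType("worker_sketch")
exec(
    "from origin_sketch import status_set\n"
    "def run(value):\n"
    "    return status_set(value)\n",
    worker.__dict__,
)
sys.modules["worker_sketch"] = worker

# Patching the original module does not intercept the worker's call,
# because the worker resolves the name in its own namespace.
with mock.patch.object(origin, "status_set", return_value="spy"):
    patched_origin_result = worker.run("running")

# Patching the worker's binding does intercept it.
with mock.patch.object(worker, "status_set", return_value="spy"):
    patched_worker_result = worker.run("running")
```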

🤖 Generated with Claude Code



claude added 2 commits June 9, 2026 20:46
… (Story 0.1, D-3)

Replace the hardcoded base_url="http://api:8000" in run_demo_reseed with Settings.relyloop_worker_api_base_url (same default), so the demo-reseed integration harness can redirect the worker's self-calls to the in-process uvicorn (127.0.0.1:8000) via a clean env override instead of patching httpx.AsyncClient construction. No behavior change at the default. Runbook note added.
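
A rough sketch of the override mechanism, using only the standard library. `SettingsSketch` is a hypothetical stand-in (the project's real `Settings` class may be built differently); the env var name and both URLs are taken from this PR.

```python
import os
from dataclasses import dataclass, field

ENV_VAR = "RELYLOOP_WORKER_API_BASE_URL"
DEFAULT_BASE_URL = "http://api:8000"


def _base_url_from_env() -> str:
    # Fall back to the default when no override is set.
    return os.environ.get(ENV_VAR, DEFAULT_BASE_URL)


@dataclass(frozen=True)
class SettingsSketch:
    # Read at construction time, so a test can set the env var first.
    relyloop_worker_api_base_url: str = field(default_factory=_base_url_from_env)


# No override: the worker keeps calling the default host.
os.environ.pop(ENV_VAR, None)
default_url = SettingsSketch().relyloop_worker_api_base_url

# Harness override: redirect self-calls to the in-process server.
os.environ[ENV_VAR] = "http://127.0.0.1:8000"
overridden_url = SettingsSketch().relyloop_worker_api_base_url
os.environ.pop(ENV_VAR, None)  # clean up so later code sees the default
```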

https://claude.ai/code/session_012vWN7bUoy74xvuqVR7S2H8
Signed-off-by: Claude <noreply@anthropic.com>
…flow (Stories 1.1–3.1)

Rewrites the 10 skipped sync-flow tests for the async enqueue+poll contract; removes both module-level skip markers. The worker is invoked INLINE (await run_demo_reseed(ctx)) so the advisory lock is observable via pg_locks (AC-16) and the cleanup gate is drivable from the same loop (AC-12); the Arq singleton dedup keys are cleared before each test + between POSTs (the inline harness never consumes the queued job).

Coverage: AC-1 (happy path, runtime summary counts), AC-2 (disjoint cluster ids), AC-3 (409 SEED_IN_PROGRESS + 503 ARQ_POOL_UNAVAILABLE), AC-5 (mid-loop engine failure → terminal failed + cleanup log), AC-12 (cleanup-while-locked blocks concurrent reseed), AC-13 (truncate-before-self-call ordering), AC-14 (failure cleanup wipes tables), AC-15 (dual-client no role mixing), AC-16 (advisory lock pinned to one pid), AC-Async (monotonic running→complete), AC-Reg (worker registration + enqueue guard), AC-4 (worker-side per-call timeout → failed + cleanup, in test_demo_seeding_timeout.py).
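
For AC-Async, the monotonic running→complete check could look roughly like this. The spy factory and `is_monotonic_run` are hypothetical helpers written for illustration; the actual test spies `backend.workers.demo_reseed.status_set` and may assert differently.

```python
from typing import Callable, List

# Assumed ordering of the non-failure statuses.
STATUS_RANK = {"running": 0, "complete": 1}


def make_status_spy(recorded: List[str]) -> Callable[[str], None]:
    """Return a spy that records every status it is called with."""
    def spy(status: str) -> None:
        recorded.append(status)
    return spy


def is_monotonic_run(recorded: List[str]) -> bool:
    """True if statuses start at running, end at complete, and never go back."""
    if not recorded or recorded[0] != "running" or recorded[-1] != "complete":
        return False
    ranks = [STATUS_RANK[s] for s in recorded]
    return all(earlier <= later for earlier, later in zip(ranks, ranks[1:]))


observed: List[str] = []
spy = make_status_spy(observed)
for status in ("running", "running", "complete"):
    spy(status)

good = is_monotonic_run(observed)
bad = is_monotonic_run(["running", "complete", "running", "complete"])
```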

Plan-vs-binding corrections (worker imports its own bindings): AC-Async spies backend.workers.demo_reseed.status_set (not demo_seeding.status_set); AC-12 patches the worker's run_demo_reseed_cleanup binding + the demo_seeding gate. Base-URL redirect uses the Story 0.1 RELYLOOP_WORKER_API_BASE_URL env override.

CI-only verification: skips locally (no Postgres/ES/OS/Redis in the sandbox); runs in the heavy pr.yml lane.

https://claude.ai/code/session_012vWN7bUoy74xvuqVR7S2H8
Signed-off-by: Claude <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the demo-reseed integration tests to support the new asynchronous flow driven by Arq and Redis polling, replacing the previous synchronous tests. It introduces a new configuration setting, relyloop_worker_api_base_url, allowing the worker to route API self-calls correctly during tests, and documents this in the local development runbook. The review feedback correctly identifies a critical issue in the test files where httpx.ConnectError is raised without the required request keyword-only argument, which would lead to a TypeError at runtime.

Comment on lines +485 to +486
    if call_count["engine_put"] > fail_threshold:
        raise httpx.ConnectError("simulated ES unreachable")

high

In httpx, ConnectError (which inherits from RequestError) requires a request keyword-only argument. Raising it without a request object will result in a TypeError at runtime: TypeError: RequestError.__init__() missing 1 required keyword-only argument: 'request'.

Since the integration tests were not run locally due to sandbox limitations, this would fail in CI. You can resolve this by passing a dummy httpx.Request object constructed with the intercepted url.

Suggested change
-    if call_count["engine_put"] > fail_threshold:
-        raise httpx.ConnectError("simulated ES unreachable")
+    if call_count["engine_put"] > fail_threshold:
+        raise httpx.ConnectError("simulated ES unreachable", request=httpx.Request("PUT", url))
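
The failure mode the reviewer describes is the generic Python behavior for a required keyword-only argument. The sketch below shows it with stand-in classes rather than httpx itself: `StrictConnectError` and `RequestStub` are hypothetical, and the URL is made up. Whether a given httpx release actually requires `request` depends on the installed version; passing it explicitly should be safe either way.

```python
class RequestStub:
    """Hypothetical stand-in for a request object (not httpx.Request)."""

    def __init__(self, method: str, url: str) -> None:
        self.method = method
        self.url = url


class StrictConnectError(Exception):
    """Stand-in exception whose `request` argument is keyword-only and required."""

    def __init__(self, message: str, *, request: RequestStub) -> None:
        super().__init__(message)
        self.request = request


# Omitting the keyword-only argument fails while constructing the exception,
# so the test raises TypeError instead of the simulated connection error.
try:
    StrictConnectError("simulated ES unreachable")
    missing_request_error = ""
except TypeError as exc:
    missing_request_error = str(exc)

# Supplying a dummy request built from the intercepted URL avoids that.
err = StrictConnectError(
    "simulated ES unreachable",
    request=RequestStub("PUT", "http://engine.example:9200/index"),  # assumed URL
)
```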

Comment on lines +560 to +561
if call_count["engine_put"] > fail_threshold:
raise httpx.ConnectError("simulated ES unreachable", request=None)
raise httpx.ConnectError("simulated ES unreachable")

high

In httpx, ConnectError (which inherits from RequestError) requires a request keyword-only argument. Raising it without a request object will result in a TypeError at runtime: TypeError: RequestError.__init__() missing 1 required keyword-only argument: 'request'.

Since the integration tests were not run locally due to sandbox limitations, this would fail in CI. You can resolve this by passing a dummy httpx.Request object constructed with the intercepted url.

Suggested change
if call_count["engine_put"] > fail_threshold:
raise httpx.ConnectError("simulated ES unreachable", request=None)
raise httpx.ConnectError("simulated ES unreachable")
if call_count["engine_put"] > fail_threshold:
raise httpx.ConnectError("simulated ES unreachable", request=httpx.Request("PUT", url))

claude added 3 commits June 9, 2026 21:04
CI surfaced all 13 demo-reseed tests erroring at setup: the carried-forward _patch_engine_for_test_host fixture patched _resolve_engine_base_url on BOTH demo_seeding AND the _test route module, but the async refactor (PR #286) moved the engine self-calls out of _test.py — the symbol no longer exists there (AttributeError). Patch it on demo_seeding only (its sole owner now).
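
The setup error can be reproduced in isolation: `unittest.mock.patch.object` raises `AttributeError` when the target module no longer defines the attribute. The two module objects and the URLs below are hypothetical stand-ins for `demo_seeding` and the `_test` route module.

```python
import types
from unittest import mock

# Stand-in for the module that still owns the helper.
demo_seeding_sketch = types.ModuleType("demo_seeding_sketch")
demo_seeding_sketch._resolve_engine_base_url = lambda: "http://engine:9200"

# Stand-in for the route module after the symbol was moved out of it.
route_sketch = types.ModuleType("route_sketch")


def patch_engine_host(module: types.ModuleType):
    """Patch the resolver on one module to point at a local test host."""
    return mock.patch.object(
        module, "_resolve_engine_base_url", return_value="http://127.0.0.1:9200"
    )


# Patching the owner works and is undone when the block exits.
with patch_engine_host(demo_seeding_sketch):
    patched_url = demo_seeding_sketch._resolve_engine_base_url()

# Patching a module that lacks the symbol fails on entry, which is what
# turns every test using the fixture into a setup error.
try:
    with patch_engine_host(route_sketch):
        pass
    setup_error = None
except AttributeError as exc:
    setup_error = str(exc)
```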

https://claude.ai/code/session_012vWN7bUoy74xvuqVR7S2H8
Signed-off-by: Claude <noreply@anthropic.com>
…ction (Gemini)

Gemini High (AC-5 + AC-14): httpx.ConnectError requires the keyword-only `request` argument; raising it without one is a TypeError at runtime. Construct a dummy httpx.Request("PUT", url) from the intercepted URL.

https://claude.ai/code/session_012vWN7bUoy74xvuqVR7S2H8
Signed-off-by: Claude <noreply@anthropic.com>
…eption

Temporary: capture the backend.app.api.errors unhandled-exception log in the happy-path test so the real traceback behind the cluster-create HTTP 500 is visible in CI output (the generic 500 body only says "has been notified"). Will be removed once the root cause is fixed.
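
One way to surface such a traceback, sketched with the standard `logging` module only. The logger name comes from the commit message; `ListHandler` and the simulated failure are hypothetical, and the actual test may rely on a pytest fixture instead.

```python
import logging
from typing import List


class ListHandler(logging.Handler):
    """Collect ERROR-and-above records in memory for later inspection."""

    def __init__(self) -> None:
        super().__init__(level=logging.ERROR)
        self.records: List[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("backend.app.api.errors")
handler = ListHandler()
logger.addHandler(handler)
try:
    try:
        raise RuntimeError("simulated cluster-create failure")
    except RuntimeError:
        # logger.exception attaches the active traceback to the record.
        logger.exception("unhandled exception")
finally:
    logger.removeHandler(handler)  # do not leak the handler into other tests

captured = handler.records
formatted_traceback = logging.Formatter().formatException(captured[0].exc_info)
```

Printing `formatted_traceback` in the test output would expose the exception type and message that the generic 500 body hides.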

https://claude.ai/code/session_012vWN7bUoy74xvuqVR7S2H8
Signed-off-by: Claude <noreply@anthropic.com>