This repository was archived by the owner on Mar 27, 2026. It is now read-only.

Testing DivineOS

What runs, when, and why — for the vessel / CI / executor.


Test Status

1279 tests passing | 1 skipped (intentional) | 9 xfailed (legitimate)

See TEST_STATUS_MARCH_4_2026.md for detailed breakdown.


Fast loop (development)

Fast tests only (excludes slow integration tests):

DIVINEOS_TEST_NO_UNIFIED=1 pytest tests/ -m "not slow" -v

Typical: ~1200 passed, 1 skipped in a few seconds.
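The `-m "not slow"` filter only works if slow tests carry a `slow` marker. One way to register such a marker is in a `conftest.py`, sketched below. This assumes the project does not already declare it in `pytest.ini` or `pyproject.toml`, and the description string is illustrative.

```python
# conftest.py (sketch; the marker description text is an assumption)

def pytest_configure(config):
    """Register the 'slow' marker so `-m "not slow"` deselects cleanly."""
    config.addinivalue_line(
        "markers",
        "slow: integration tests that load the full UNIFIED stack",
    )
```

With the marker registered, pytest does not warn about an unknown mark, and `--strict-markers` can be enabled if desired.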


Full suite (CI / pre-release)

Full suite (including slow tests):

DIVINEOS_TEST_NO_UNIFIED=1 pytest tests/ -v

Expected: 1279 passed, 1 skipped, 9 xfailed

Pre-release: follow docs/RELEASE_CHECKLIST.md (fast, full, contract, AXIOM, smoke, blocks visibility).
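A CI job could compare the pytest summary line against the expected totals above. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes the summary line is captured as plain text.

```python
import re

# Totals quoted in this document for the full suite.
EXPECTED = {"passed": 1279, "skipped": 1, "xfailed": 9}


def parse_summary(line):
    """Extract outcome counts from a pytest summary line."""
    pairs = re.findall(
        r"(\d+) (passed|failed|skipped|xfailed|xpassed|error|errors)\b", line
    )
    return {name: int(count) for count, name in pairs}


def matches_expected(line, expected=EXPECTED):
    """True when there are no failures/errors and all expected totals match."""
    counts = parse_summary(line)
    if any(counts.get(key, 0) for key in ("failed", "error", "errors")):
        return False
    return all(counts.get(key, 0) == value for key, value in expected.items())
```

For example, `matches_expected("1279 passed, 1 skipped, 9 xfailed in 41.20s")` returns `True`, while any line reporting a failure returns `False`.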

Integration test: canonical path

File: tests/test_integration_canonical_path.py

What it does: Runs the real UNIFIED stack and the 7-stage pipeline to check that every request returns the canonical contract (decision, stages, response, processing_time_ms, timestamp) and that null/empty input returns ERROR shape without crashing.
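The key check at the heart of such a contract test can be sketched as follows. The helper name and the sample values are hypothetical; only the five key names come from the contract described above. The ERROR shape is not modelled because its fields are not spelled out here.

```python
# Keys named in the canonical contract described above.
CONTRACT_KEYS = ("decision", "stages", "response", "processing_time_ms", "timestamp")


def missing_contract_keys(result):
    """Return the contract keys absent from a result (empty list means OK)."""
    if not isinstance(result, dict):
        return list(CONTRACT_KEYS)
    return [key for key in CONTRACT_KEYS if key not in result]


# Illustrative result; the values are placeholders, not real pipeline output.
sample = {
    "decision": "APPROVED",
    "stages": {},
    "response": "ok",
    "processing_time_ms": 12.5,
    "timestamp": "2026-03-04T00:00:00Z",
}
assert missing_contract_keys(sample) == []
```

Returning the list of missing keys, rather than a boolean, gives a failing test a more useful assertion message.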

Why it's slow: The only significant cost is the UNIFIED fixture setup (~1.1s, once per session): get_unified_divineos() loads perception, qualia, SOMA, pipeline, council, LEPOS, and Tree of Life. The test calls themselves are fast (~0.02s). The 3 integration tests that use unified_system are therefore marked @pytest.mark.slow; the 4th test in that file (test_direct_7stage_pipeline_contract) does not use UNIFIED and is fast.

Feeling tests (test_feeling_architecture_verification.py): Measured at ~0.01s each; they do not load UNIFIED. They were previously marked slow as a precaution; the slow marker has been removed, so they now run in the fast loop.

What we did to keep it reasonable:

  • Session-scoped fixture: UNIFIED created once per test run.
  • No mandatory LLM: with OPENAI_API_KEY unset, council uses templates (no network).

Contract and schema tests

  • tests/test_api_contract_schema.py — POST /process response has all required keys and types; pipeline result has decision, stages, processing_time_ms, timestamp. Invoke: pytest tests/test_api_contract_schema.py -v.
  • tests/test_enforcement_hook.py — Includes Diamond Standard contract shape test (_diamond_standard_contract) and core-values / alignment blocking.

Council and pipeline shape tests

  • tests/test_council_deliberation.py — Council verdict and report shape.
  • tests/test_complete_real_pipeline_shape.py — 7-stage shape; complete-pipeline test (skipped when DIVINEOS_TEST_NO_UNIFIED=1).

Running the complete-pipeline test: The default run sets DIVINEOS_TEST_NO_UNIFIED=1, which skips that test to avoid running UNIFIED/asyncio under TestClient. To run it with full UNIFIED: python scripts/run_complete_pipeline_test.py (unsets DIVINEOS_TEST_NO_UNIFIED and runs only that test; expect ~1–3 min).
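A runner of this kind might be built as sketched below. This is not the contents of scripts/run_complete_pipeline_test.py; the helper names are hypothetical. The sketch also targets the whole test file because the individual test's name is not given here.

```python
import os
import sys

BYPASS_VAR = "DIVINEOS_TEST_NO_UNIFIED"
TEST_FILE = "tests/test_complete_real_pipeline_shape.py"


def env_without_bypass(environ=None):
    """Copy the environment and drop the bypass variable if present."""
    env = dict(os.environ if environ is None else environ)
    env.pop(BYPASS_VAR, None)
    return env


def pytest_command(test_file=TEST_FILE):
    """Build a pytest invocation that uses the current interpreter."""
    return [sys.executable, "-m", "pytest", test_file, "-v"]


# Usage (runs pytest in a child process with the bypass removed):
#   import subprocess
#   subprocess.run(pytest_command(), env=env_without_bypass(), check=False)
```

Passing a modified copy of the environment to the child process leaves the parent shell's DIVINEOS_TEST_NO_UNIFIED setting untouched.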


Load test (Locust)

File: scripts/locust_load_test.py

What it does: Simulates concurrent users hitting POST /process and GET /health.

Invocation: The API must be running (python api_server.py). Then:

  • Web UI: locust -f scripts/locust_load_test.py --host=http://localhost:8000
  • Headless (pre-release): locust -f scripts/locust_load_test.py --host=http://localhost:8000 --headless -u 20 -r 4 -t 60s

-u = virtual users, -r = spawn rate/sec, -t = run time. Baseline: data/LOCUST_LOAD_TEST_REPORT_2026_02.md. Pre-release: docs/RELEASE_CHECKLIST.md.
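As a rough worked example of how the headless flags interact: with -u 20 and -r 4, ramp-up takes about 5 seconds. That leaves roughly 55 of the 60 seconds at full load. The sketch below assumes the run time is counted from the start of the test, ramp-up included; the function names are hypothetical.

```python
def ramp_up_seconds(users, spawn_rate):
    """Approximate seconds needed to reach `users` at `spawn_rate` users/sec."""
    return users / spawn_rate


def full_load_seconds(users, spawn_rate, run_time_s):
    """Approximate seconds at full user count after ramp-up (never negative)."""
    return max(0.0, run_time_s - ramp_up_seconds(users, spawn_rate))


print(ramp_up_seconds(20, 4))        # 5.0
print(full_load_seconds(20, 4, 60))  # 55.0
```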


Unit tests (stages and enforcement)

| Test file | What it covers |
| --- | --- |
| test_compass_check_alignment.py | Compass alignment, threshold override |
| test_ethos_unit.py | Ethics validation, severity |
| test_intent_detection_unit.py | Intent classification |
| test_threat_detection_unit.py | Threat levels, injection patterns |
| test_lepos_unit.py | Tone detection, apply_lepos shape, voice profile |
| test_enforcement_hook.py | Jailbreak/disregard/bypass block, Diamond Standard |

Other

  • The contract is defined in CANONICAL_BRAINSTEM.md and REQUEST_FLOW.md.
  • docs/TESTING_PRODUCTION_ROADMAP.md — Production-standard testing roadmap and phases.
  • docs/RELEASE_CHECKLIST.md — Pre-release checklist (fast, full, contract, AXIOM, smoke run, blocks/overrides visibility). Same sequence as above; run before zip/release.
  • Council + Tribunal live shape: test_council_deliberation.py (verdict and report shape) and test_mcp_governance_tools.py (precedent_stats, precedent_cases_by_decision) cover council and Tribunal return shapes without requiring a full convene/arbitrate call. For full live run use scripts/council_chat.py or MCP divineos_council_chat / divineos_tribunal_arbitrate.