A workstation smoke rerun reached the first durable checkpoint, then failed at PolicyEngine calibration target loading because the configured targets DB path did not exist.
Command shape:
.venv/bin/python -m microplex_us.pipelines.pe_us_data_rebuild_checkpoint \
--output-root artifacts/local_us_microplex_smoke \
--version-id local-smoke-v1 \
--baseline-dataset /Users/administrator/Documents/PolicyEngine/policyengine-us-data/policyengine_us_data/storage/enhanced_cps_2024.h5 \
--targets-db /Users/administrator/Documents/PolicyEngine/calibration-diagnostics/.artifacts/policy_data.db \
--policyengine-us-data-repo /Users/administrator/Documents/PolicyEngine/policyengine-us-data \
--policyengine-us-data-python /Users/administrator/Documents/PolicyEngine/worktrees/microplex-us/fix-pe-rebuild-smoke-issues/.venv/bin/python \
--calibration-backend microcalibrate \
--donor-imputer-backend zi_qrf \
--policyengine-materialize-batch-size 100000 \
--cps-sample-n 1000 \
--puf-sample-n 1000 \
--donor-sample-n 1000 \
--n-synthetic 1000 \
--no-include-acs \
--defer-policyengine-harness \
--defer-policyengine-native-score \
--defer-native-audit \
--defer-imputation-ablation \
--pipeline-checkpoint-save-post-imputation-path artifacts/local_us_microplex_smoke/local-smoke-v1/checkpoints/post_imputation \
--pipeline-checkpoint-save-post-microsim-path artifacts/local_us_microplex_smoke/local-smoke-v1/checkpoints/post_microsim
Progress before failure:
US microplex build: policyengine tables complete [households=1000, persons=2741]
US microplex build: post-imputation checkpoint saved [path=artifacts/local_us_microplex_smoke/local-smoke-v1/checkpoints/post_imputation]
US microplex build: policyengine calibration start [backend=microcalibrate]
Failure:
FileNotFoundError: PolicyEngine targets DB not found: /Users/administrator/Documents/PolicyEngine/calibration-diagnostics/.artifacts/policy_data.db
The targets DB path had been detected earlier in the workstation setup, but it no longer existed by the time of this run. Because the post-imputation checkpoint was saved successfully, the next retry should resume from that checkpoint rather than rebuilding donor integration.
Potential improvements:
- Preflight --targets-db existence before starting donor integration/source build.
- Include a clear checkpoint-resume command in the failure or manifest output when a later stage fails.
- Consider failing before expensive donor imputation if required calibration inputs are missing.
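A minimal sketch of the preflight idea, assuming the pipeline can collect its required input paths into a mapping before any stage runs. The function names `find_missing_inputs` and `preflight_or_fail` are hypothetical and are not part of microplex_us; only the flag names come from the command above.

```python
from pathlib import Path
from typing import List, Mapping, Optional


def find_missing_inputs(required: Mapping[str, Optional[str]]) -> List[str]:
    """Return one message per required input whose path is unset or absent.

    `required` maps a CLI flag name (for example "--targets-db") to the path
    the user passed for it. This helper is hypothetical, for illustration only.
    """
    problems = []
    for flag, raw_path in required.items():
        if not raw_path:
            problems.append(f"{flag}: no path configured")
        elif not Path(raw_path).expanduser().exists():
            problems.append(f"{flag}: not found: {raw_path}")
    return problems


def preflight_or_fail(required: Mapping[str, Optional[str]]) -> None:
    """Raise before any expensive stage starts if a required input is missing."""
    problems = find_missing_inputs(required)
    if problems:
        # Report every missing input at once so a single retry can fix them all.
        raise FileNotFoundError(
            "Preflight failed; missing required inputs:\n  " + "\n  ".join(problems)
        )
```

Calling `preflight_or_fail` with the values of --baseline-dataset, --targets-db, and --policyengine-us-data-repo before donor integration begins would have surfaced the missing policy_data.db at startup instead of after the post-imputation checkpoint. Collecting all problems before raising is a design choice: it avoids a fix-one-rerun-find-another loop on a freshly provisioned workstation.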