Publish Microplex diagnostics and deployed dataset artifacts to Hugging Face #178

@PavelMakarchuk

Problem

The calibration dashboard can now read local Microplex run bundles when MICROPLEX_ARTIFACT_ROOTS points at a generated artifact root. That works locally, but it does not keep the dashboard in sync with the newest Microplex runs because generated bundles are not published to a stable artifact location.

Recent run bundles include the files the dashboard needs:

  • pe_native_target_diagnostics.json - row-level target diagnostics with target value, PE baseline aggregate/error, Microplex aggregate/error, deltas, family, geography, entity, support flags, etc.
  • policyengine_native_scores.json - small headline native-loss summary and family breakdown.
  • pe_us_data_rebuild_native_audit.json - deeper audit payload with target deltas, support audit, top improvements/regressions, and verdict hints.
  • policyengine_us.h5 - generated candidate PolicyEngine-US dataset for recomputation or microdata inspection.

Today these files exist only locally, under --output-root/<version_id>/. The dashboard can see them only if it is pointed at that local path.

Proposal

Publish Microplex run outputs to Hugging Face using two stable repos/artifact channels.

1. Diagnostics repo

A lightweight JSON-focused repo for dashboard/default analytics consumption.

Suggested layout:

runs/<run_id>/manifest.json
runs/<run_id>/policyengine_native_scores.json
runs/<run_id>/pe_native_target_diagnostics.json
runs/<run_id>/pe_us_data_rebuild_native_audit.json
latest.json
run_registry.jsonl

The dashboard should default to latest.json, list available runs, and load the selected run's JSONs. This avoids requiring deployed dashboard instances to access local /tmp paths or private generated directories.
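
As a rough illustration of the default-to-latest behavior, the sketch below resolves latest.json and then loads the selected run's JSON files. The `fetch` callable, the `load_latest_run` name, and the `run_id` field inside latest.json are assumptions made for the example; this issue does not define the latest.json schema. In a deployed dashboard, `fetch` could wrap something like `huggingface_hub.hf_hub_download`.

```python
import json
from typing import Callable, Dict

# File names taken from the suggested diagnostics layout.
RUN_FILES = (
    "manifest.json",
    "policyengine_native_scores.json",
    "pe_native_target_diagnostics.json",
    "pe_us_data_rebuild_native_audit.json",
)


def load_latest_run(fetch: Callable[[str], bytes]) -> Dict[str, object]:
    """Resolve latest.json, then load every JSON file for the run it names.

    `fetch` maps a repo-relative path to raw bytes. Assumption: latest.json
    is an object with a "run_id" key; the real schema is not yet defined.
    """
    latest = json.loads(fetch("latest.json"))
    run_id = latest["run_id"]
    payloads = {
        name: json.loads(fetch(f"runs/{run_id}/{name}")) for name in RUN_FILES
    }
    return {"run_id": run_id, "files": payloads}
```

Keeping the transport behind a single callable means the same loader can be pointed at a local directory in tests and at the HF repo in production.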

2. Deployed dataset repo

A fuller data-artifact repo, analogous to policyengine-us-data, for generated candidate datasets.

Suggested layout:

staging/<run_id>/policyengine_us.h5
staging/<run_id>/manifest.json
policyengine_us.h5                 # optional promoted/current production candidate
manifest.json                      # optional promoted/current production manifest

This repo is for recomputation, dataset inspection, and deployed candidate data. The normal dashboard should not need to download the H5 for standard rendering because the JSON diagnostics are enough.
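
The staging and promotion layout above could be sketched roughly as follows. A local directory stands in for the HF repo contents here; `stage_run` and `promote_run` are hypothetical names, not the existing policyengine-us-data helpers, and the sketch assumes the run bundle holds manifest.json next to policyengine_us.h5. The actual upload step (for example via `huggingface_hub`'s `upload_folder`) is deliberately left out.

```python
import shutil
from pathlib import Path

# Files from the suggested deployed-dataset layout.
DATASET_FILES = ("policyengine_us.h5", "manifest.json")


def stage_run(bundle_dir: Path, repo_dir: Path, run_id: str) -> Path:
    """Copy a run's dataset and manifest into staging/<run_id>/."""
    target = repo_dir / "staging" / run_id
    target.mkdir(parents=True, exist_ok=True)
    for name in DATASET_FILES:
        shutil.copy2(bundle_dir / name, target / name)
    return target


def promote_run(repo_dir: Path, run_id: str) -> None:
    """Copy staging/<run_id>/ files to the repo root as the current candidate."""
    source = repo_dir / "staging" / run_id
    # Refuse to promote a partially staged run.
    missing = [n for n in DATASET_FILES if not (source / n).is_file()]
    if missing:
        raise FileNotFoundError(f"staging/{run_id} is missing {missing}")
    for name in DATASET_FILES:
        shutil.copy2(source / name, repo_dir / name)
```

Promotion is kept as a separate step so that publishing a run never changes the current production candidate by itself.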

Relevant Microplex code

Bundle generation already exists in:

  • src/microplex_us/pipelines/pe_us_data_rebuild_checkpoint.py
    • attaches/writes PE-native scores, native audit, and target diagnostics sidecar.
  • src/microplex_us/pipelines/artifacts.py
    • allocates --output-root/<version_id> and saves versioned artifact bundles.
  • src/microplex_us/pipelines/registry.py
    • run registry/frontier tracking.
  • src/microplex_us/pipelines/stage_contracts.py
    • artifact contract keys, including policyengine_native_target_diagnostics.

The new upload/publish helper probably belongs in src/microplex_us/pipelines/, modeled after policyengine-us-data's Hugging Face upload utilities.

Relevant us-data pattern to mirror

policyengine-us-data already has the staging/promote flow we likely want to copy conceptually:

  • policyengine_us_data/utils/data_upload.py
    • upload_to_staging_hf
    • promote_staging_to_production_hf
    • cleanup_staging_hf
  • policyengine_us_data/utils/huggingface.py
    • lower-level upload/download helpers.
  • modal_app/pipeline.py
    • orchestration of staged uploads.

Dashboard integration target

The calibration dashboard currently reads local bundles through:

MICROPLEX_ARTIFACT_ROOTS=/path/to/output-root

and discovers manifest.json recursively.
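
For reference, the local discovery behavior described above might look roughly like this. It is a sketch, not the dashboard's actual code: `discover_manifests` is a hypothetical name, and splitting multiple roots on `os.pathsep` is an assumption, since the separator is not stated here.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def discover_manifests(env: Optional[Mapping[str, str]] = None) -> List[Path]:
    """Return every manifest.json found under the configured artifact roots.

    Assumption: MICROPLEX_ARTIFACT_ROOTS may hold several roots separated by
    os.pathsep (":" on POSIX). Empty or missing entries are skipped.
    """
    env = os.environ if env is None else env
    raw = env.get("MICROPLEX_ARTIFACT_ROOTS", "")
    manifests: List[Path] = []
    for root in filter(None, (part.strip() for part in raw.split(os.pathsep))):
        root_path = Path(root)
        if root_path.is_dir():
            # rglob walks the tree recursively, matching the described behavior.
            manifests.extend(sorted(root_path.rglob("manifest.json")))
    return manifests
```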

Once the HF diagnostics repo exists, dashboard code should add HF discovery in backend/routes/microplex.py and default to the newest published diagnostics run instead of relying on local roots or committed historical summary JSONs.

Acceptance criteria

  • A completed Microplex run can publish JSON diagnostics to a stable HF diagnostics repo.
  • A completed Microplex run can optionally publish policyengine_us.h5 to a stable HF deployed-data repo.
  • Published diagnostics include a latest.json or equivalent index pointing to the newest successful run.
  • The published manifest preserves artifact keys:
    • policyengine_dataset
    • policyengine_native_scores
    • policyengine_native_audit
    • policyengine_native_target_diagnostics
  • The dashboard can read the newest published diagnostics without local filesystem access.
  • The dashboard can still use local MICROPLEX_ARTIFACT_ROOTS as an override for development.
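
The artifact-key criterion could be checked before publishing with something like the sketch below. It assumes the keys live under a top-level "artifacts" mapping in manifest.json, which is a guess about the manifest shape; `missing_artifact_keys` is a hypothetical helper name.

```python
from typing import List, Mapping

# Artifact keys listed in the acceptance criteria.
REQUIRED_ARTIFACT_KEYS = (
    "policyengine_dataset",
    "policyengine_native_scores",
    "policyengine_native_audit",
    "policyengine_native_target_diagnostics",
)


def missing_artifact_keys(manifest: Mapping[str, object]) -> List[str]:
    """List required artifact keys that are absent from the manifest.

    Assumption: keys are stored under manifest["artifacts"]; adjust the
    lookup if the real manifest nests them elsewhere.
    """
    artifacts = manifest.get("artifacts", {})
    if not isinstance(artifacts, dict):
        return list(REQUIRED_ARTIFACT_KEYS)
    return [key for key in REQUIRED_ARTIFACT_KEYS if key not in artifacts]
```

A publish helper could refuse to update latest.json whenever this returns a non-empty list.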
