Publish Microplex diagnostics and deployed dataset artifacts to Hugging Face #178

## Problem

The calibration dashboard can now read local Microplex run bundles when `MICROPLEX_ARTIFACT_ROOTS` points at a generated artifact root. That works locally, but it does not keep the dashboard in sync with the newest Microplex runs because generated bundles are not published to a stable artifact location.

Recent run bundles include the files the dashboard needs:

- `pe_native_target_diagnostics.json` - row-level target diagnostics with target value, PE baseline aggregate/error, Microplex aggregate/error, deltas, family, geography, entity, support flags, etc.
- `policyengine_native_scores.json` - small headline native-loss summary and family breakdown.
- `pe_us_data_rebuild_native_audit.json` - deeper audit payload with target deltas, support audit, top improvements/regressions, and verdict hints.
- `policyengine_us.h5` - generated candidate PolicyEngine-US dataset for recomputation or microdata inspection.

Today these files exist only locally, in the `<version_id>/` directory under the path passed as `--output-root`. The dashboard can see them only if `MICROPLEX_ARTIFACT_ROOTS` points at that local path.

## Proposal

Publish Microplex run outputs to Hugging Face using two stable repos/artifact channels.

### 1. Diagnostics repo

A lightweight, JSON-focused repo that the dashboard and other analytics consumers read by default.

Suggested layout:

```text
runs/<run_id>/manifest.json
runs/<run_id>/policyengine_native_scores.json
runs/<run_id>/pe_native_target_diagnostics.json
runs/<run_id>/pe_us_data_rebuild_native_audit.json
latest.json
run_registry.jsonl
```

The dashboard should default to the run referenced by `latest.json`, list the available runs, and load the selected run's JSON files. Deployed dashboard instances then no longer need access to local `/tmp` paths or private generated directories.
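As a rough sketch of how a publish step could maintain the two index files in this layout, the snippet below writes to a local checkout of the diagnostics repo. The function names and the `{"run_id", "files"}` entry schema are assumptions for illustration only; the issue does not define the contents of `latest.json` or `run_registry.jsonl`.

```python
import json
from pathlib import Path

# File names taken from the suggested diagnostics layout.
DIAGNOSTIC_FILES = (
    "manifest.json",
    "policyengine_native_scores.json",
    "pe_native_target_diagnostics.json",
    "pe_us_data_rebuild_native_audit.json",
)


def diagnostics_paths(run_id: str) -> list[str]:
    """Repo-relative paths for one run, following runs/<run_id>/..."""
    return [f"runs/{run_id}/{name}" for name in DIAGNOSTIC_FILES]


def update_index(repo_root: Path, run_id: str) -> dict:
    """Append the run to run_registry.jsonl and point latest.json at it.

    The entry schema here is hypothetical, not an agreed format.
    """
    entry = {"run_id": run_id, "files": diagnostics_paths(run_id)}
    repo_root.mkdir(parents=True, exist_ok=True)
    # The registry is append-only: one JSON object per line.
    with (repo_root / "run_registry.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    # latest.json is overwritten so it always names the newest run.
    (repo_root / "latest.json").write_text(
        json.dumps(entry, indent=2), encoding="utf-8"
    )
    return entry
```

Keeping the registry append-only while overwriting `latest.json` lets the dashboard list every run from one file and resolve the default run from the other.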

### 2. Deployed dataset repo

A fuller data-artifact repo, analogous to `policyengine-us-data`, for generated candidate datasets.

Suggested layout:

```text
staging/<run_id>/policyengine_us.h5
staging/<run_id>/manifest.json
policyengine_us.h5                 # optional promoted/current production candidate
manifest.json                      # optional promoted/current production manifest
```

This repo is for recomputation, dataset inspection, and deployed candidate data. The dashboard should not need to download the H5 for standard rendering, because the JSON diagnostics are sufficient.
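To make the staging and promotion paths concrete, here is a minimal sketch that derives them from a run ID. Both helper names are hypothetical, and the sketch assumes promotion simply copies the staged files to the repo root, as the layout above suggests.

```python
def staging_paths(run_id: str) -> dict[str, str]:
    """Repo-relative staging paths for one run's dataset and manifest."""
    return {
        "dataset": f"staging/{run_id}/policyengine_us.h5",
        "manifest": f"staging/{run_id}/manifest.json",
    }


def promotion_plan(run_id: str) -> list[tuple[str, str]]:
    """(source, destination) pairs that promote a staged run.

    Assumes promotion keeps the file name and drops the staging prefix.
    """
    return [
        (src, src.rsplit("/", 1)[-1])
        for src in staging_paths(run_id).values()
    ]
```

Computing the plan separately from the upload keeps it easy to log or dry-run before anything in the production location is overwritten.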

## Relevant Microplex code

Bundle generation already exists in:

- `src/microplex_us/pipelines/pe_us_data_rebuild_checkpoint.py`
  - attaches/writes PE-native scores, native audit, and target diagnostics sidecar.
- `src/microplex_us/pipelines/artifacts.py`
  - allocates `--output-root/<version_id>` and saves versioned artifact bundles.
- `src/microplex_us/pipelines/registry.py`
  - run registry/frontier tracking.
- `src/microplex_us/pipelines/stage_contracts.py`
  - artifact contract keys, including `policyengine_native_target_diagnostics`.

The new upload/publish helper probably belongs in `src/microplex_us/pipelines/`, modeled after `policyengine-us-data`'s Hugging Face upload utilities.

## Relevant us-data pattern to mirror

`policyengine-us-data` already has the staging/promote flow we likely want to copy conceptually:

- `policyengine_us_data/utils/data_upload.py`
  - `upload_to_staging_hf`
  - `promote_staging_to_production_hf`
  - `cleanup_staging_hf`
- `policyengine_us_data/utils/huggingface.py`
  - lower-level upload/download helpers.
- `modal_app/pipeline.py`
  - orchestration of staged uploads.

## Dashboard integration target

The calibration dashboard currently reads local bundles through:

```bash
MICROPLEX_ARTIFACT_ROOTS=/path/to/output-root
```

and discovers `manifest.json` recursively.
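For reference, recursive discovery of this kind can be sketched as below. This is not the dashboard's actual code: `discover_manifests` is a hypothetical name, and splitting multiple roots on `os.pathsep` is an assumption, since the issue does not say how several roots are delimited.

```python
import os
from pathlib import Path


def discover_manifests(env=None) -> list[Path]:
    """Find every manifest.json under the configured artifact roots."""
    env = os.environ if env is None else env
    raw = env.get("MICROPLEX_ARTIFACT_ROOTS", "")
    manifests: list[Path] = []
    # Assumption: multiple roots are separated by os.pathsep.
    for root in filter(None, raw.split(os.pathsep)):
        root_path = Path(root)
        if root_path.is_dir():
            # rglob walks all subdirectories; sorting keeps results stable.
            manifests.extend(sorted(root_path.rglob("manifest.json")))
    return manifests
```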

Once the HF diagnostics repo exists, dashboard code should add HF discovery in `backend/routes/microplex.py` and default to the newest published diagnostics run instead of relying on local roots or committed historical summary JSONs.
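One possible shape for the source selection is sketched here: local roots win when the environment variable is set, and the published index is used otherwise. The repo ID is a placeholder, `resolve_source` is a hypothetical name, and the returned dictionary is an illustrative structure rather than an existing dashboard type.

```python
import os

# Placeholder only; the real diagnostics repo ID has not been decided.
DEFAULT_DIAGNOSTICS_REPO = "example-org/microplex-diagnostics"


def resolve_source(env=None) -> dict:
    """Pick where the dashboard should read Microplex diagnostics from."""
    env = os.environ if env is None else env
    raw = env.get("MICROPLEX_ARTIFACT_ROOTS", "")
    # Assumption: multiple roots are separated by os.pathsep.
    roots = [r for r in raw.split(os.pathsep) if r]
    if roots:
        # A configured local root acts as a development override.
        return {"kind": "local", "roots": roots}
    # Otherwise fall back to the published index of the newest run.
    return {
        "kind": "huggingface",
        "repo_id": DEFAULT_DIAGNOSTICS_REPO,
        "index": "latest.json",
    }
```

In the Hugging Face branch, the route could then fetch the index with `hf_hub_download` from `huggingface_hub`, though the exact client code is left open here.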

## Acceptance criteria

- A completed Microplex run can publish JSON diagnostics to a stable HF diagnostics repo.
- A completed Microplex run can optionally publish `policyengine_us.h5` to a stable HF deployed-data repo.
- Published diagnostics include a `latest.json` or equivalent index pointing to the newest successful run.
- The published manifest preserves artifact keys:
  - `policyengine_dataset`
  - `policyengine_native_scores`
  - `policyengine_native_audit`
  - `policyengine_native_target_diagnostics`
- The dashboard can read the newest published diagnostics without local filesystem access.
- The dashboard can still use local `MICROPLEX_ARTIFACT_ROOTS` as an override for development.
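The manifest criterion could be checked before publishing with something like the following sketch. It assumes the keys live under a top-level `artifacts` mapping in `manifest.json`, which this issue does not confirm; adjust the lookup to the real manifest schema.

```python
# Artifact keys that the published manifest must preserve.
REQUIRED_ARTIFACT_KEYS = (
    "policyengine_dataset",
    "policyengine_native_scores",
    "policyengine_native_audit",
    "policyengine_native_target_diagnostics",
)


def missing_artifact_keys(manifest: dict) -> list[str]:
    """Return the required artifact keys absent from a manifest.

    Assumes the keys sit under manifest["artifacts"] (hypothetical).
    """
    artifacts = manifest.get("artifacts", {})
    return [key for key in REQUIRED_ARTIFACT_KEYS if key not in artifacts]
```

A publish helper could refuse to update `latest.json` whenever this list is non-empty, so the index only ever points at complete runs.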

