
Local test#16

Merged
JamesKane merged 12 commits into main from local-test on Jul 2, 2026

Conversation

@JamesKane

Owner

No description provided.

JamesKane and others added 12 commits July 2, 2026 07:07
The lab transform predated migration 0038, which seeds the canonical
labs keyed by NAME (with the D8 instrument→lab lookup FK'd to them by
name). Forcing legacy ids over that seed via OVERRIDING SYSTEM VALUE +
ON CONFLICT (id) collided on the sequencing_lab_name_key unique index
during cutover (legacy id 1 "YSEQ" vs seeded id 5 "YSEQ").

Merge prod labs onto the seed by name instead — enrich metadata and add
prod-only labs. Legacy lab ids carry no downstream FK (sequence_library
resolves its lab by name string, not id), so id preservation is
unneeded. Verified end-to-end on a real prod dump.
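
The merge-by-name idea can be sketched in memory as follows. This is only an illustration, not the actual transform: the `Lab` and `ProdLab` shapes, the `website` metadata field, and the id allocation are assumptions made for the example.

```rust
/// Seeded lab row (hypothetical shape; `website` stands in for any metadata).
#[derive(Clone, Debug, PartialEq)]
pub struct Lab {
    pub id: i64,
    pub name: String,
    pub website: Option<String>,
}

/// Legacy prod lab row. Its id is deliberately ignored by the merge.
pub struct ProdLab {
    pub legacy_id: i64,
    pub name: String,
    pub website: Option<String>,
}

/// Merge prod labs onto the seed by NAME: enrich a matching seed row,
/// otherwise append a prod-only lab under a fresh id.
pub fn merge_by_name(seed: &mut Vec<Lab>, prod: Vec<ProdLab>) {
    for p in prod {
        match seed.iter().position(|l| l.name == p.name) {
            Some(i) => {
                // Only fill gaps; seeded values win over legacy ones.
                if seed[i].website.is_none() {
                    seed[i].website = p.website;
                }
            }
            None => {
                // Assumed id allocation for the sketch (a sequence in the real DB).
                let next_id = seed.iter().map(|l| l.id).max().unwrap_or(0) + 1;
                seed.push(Lab { id: next_id, name: p.name, website: p.website });
            }
        }
    }
}
```

With the seed holding id 5 "YSEQ" and prod holding legacy id 1 "YSEQ", the seeded id 5 survives and no unique-name collision can occur.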

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The tree "/full" API now emits each defining variant's per-branch
ancestral/derived alleles (link_ancestral/link_derived from
tree.haplogroup_variant) alongside the variant's global coordinates.
Descent classification must use the branch's actual ancestral→derived
direction: the coordinate blob carries a single global polarity per
variant that disagrees with the branch on recurrent SNPs, back-mutations
and reference-frame cases, flipping ~18% of backbone calls. NULL link
alleles ⇒ forward/legacy link, fall back to coordinates.
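
A minimal sketch of the polarity rule described above. The `DefiningVariant` struct and `Call` enum are hypothetical shapes, and treating "either link allele is NULL" as the fallback case is an assumption of this sketch.

```rust
/// Outcome of comparing an observed allele with a branch's direction.
#[derive(Debug, PartialEq, Eq)]
pub enum Call {
    Derived,
    Ancestral,
    Other,
}

/// Hypothetical row: global coordinate alleles plus optional per-branch
/// link alleles (link_ancestral / link_derived).
pub struct DefiningVariant<'a> {
    pub coord_ancestral: &'a str,
    pub coord_derived: &'a str,
    pub link_ancestral: Option<&'a str>,
    pub link_derived: Option<&'a str>,
}

impl<'a> DefiningVariant<'a> {
    /// Prefer the branch's own ancestral/derived pair; fall back to the
    /// global coordinate polarity when the link alleles are NULL.
    pub fn polarity(&self) -> (&'a str, &'a str) {
        match (self.link_ancestral, self.link_derived) {
            (Some(anc), Some(der)) => (anc, der),
            _ => (self.coord_ancestral, self.coord_derived),
        }
    }

    /// Classify an observed allele against the effective polarity.
    pub fn classify(&self, observed: &str) -> Call {
        let (anc, der) = self.polarity();
        if observed == der {
            Call::Derived
        } else if observed == anc {
            Call::Ancestral
        } else {
            Call::Other
        }
    }
}
```

For a back-mutation whose coordinates say A to G but whose branch link says G to A, an observed `A` classifies as derived on that branch, which is the case the global polarity gets wrong.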

Also rewrites for_dna_type_grouped from one wide join into two narrow
passes stitched in-process: a recurrent SNP fans out to many branches,
so the whole-tree join produced ~100k links over ~5k nodes and the
planner picked a nested loop random-reading the wide core.variant heap
once per link (plus a disk-spilling sort). Now: pull narrow link rows,
read each distinct variant once by id, stitch + sort in Rust.
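
The two-pass shape can be illustrated with plain collections. This is a sketch under assumptions: `Link`, `Variant`, and both function names are hypothetical, and the two database reads are represented by the slices and vectors passed in.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Narrow link row from pass 1 (hypothetical shape).
pub struct Link {
    pub haplogroup_id: i64,
    pub variant_id: i64,
}

/// Wide variant row, fetched once per distinct id in pass 2.
#[derive(Clone, Debug, PartialEq)]
pub struct Variant {
    pub id: i64,
    pub name: String,
}

/// Distinct variant ids to request in pass 2, so a recurrent SNP that
/// fans out to many branches is still read only once.
pub fn distinct_variant_ids(links: &[Link]) -> Vec<i64> {
    links
        .iter()
        .map(|l| l.variant_id)
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}

/// Stitch links to variants in-process, grouped by haplogroup and
/// sorted by variant name within each group.
pub fn stitch(links: &[Link], variants: Vec<Variant>) -> Vec<(i64, Vec<Variant>)> {
    let by_id: HashMap<i64, Variant> = variants.into_iter().map(|v| (v.id, v)).collect();
    let mut grouped: BTreeMap<i64, Vec<Variant>> = BTreeMap::new();
    for link in links {
        if let Some(v) = by_id.get(&link.variant_id) {
            grouped.entry(link.haplogroup_id).or_default().push(v.clone());
        }
    }
    grouped
        .into_iter()
        .map(|(hg, mut vs)| {
            vs.sort_by(|a, b| a.name.cmp(&b.name));
            (hg, vs)
        })
        .collect()
}
```

The sort key and grouping order here are illustrative; the point is that the fan-out is resolved by a hash lookup rather than by one heap read per link.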

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DU_DISABLE_CSRF=1 skips the double-submit CSRF enforcement (the `csrf`
cookie is still issued) so the native-form credential login/logout can
be exercised locally. The login form submits as a non-boosted native
POST and therefore never carries the X-CSRF-Token header the middleware
requires, so credential auth 403s without this toggle.

Dev/local-test only — never set DU_DISABLE_CSRF in production. The
durable fix is to make login/logout carry the token (hidden field +
form-field fallback, or HX-Redirect-boosted forms).
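
As a rough sketch of how such a toggle could gate a double-submit check (both function names, the exact flag parsing, and the bare status code are assumptions here, not the middleware's real API):

```rust
/// Enforcement is on unless the flag is exactly "1".
/// `flag` would come from something like std::env::var("DU_DISABLE_CSRF").ok().
pub fn csrf_enforced(flag: Option<&str>) -> bool {
    flag != Some("1")
}

/// Double-submit check: the X-CSRF-Token header must match the `csrf` cookie.
/// Returns Err(403) on mismatch or when either value is missing.
/// A production check would normally use a constant-time comparison.
pub fn check_csrf(enforced: bool, cookie: Option<&str>, header: Option<&str>) -> Result<(), u16> {
    if !enforced {
        return Ok(());
    }
    match (cookie, header) {
        (Some(c), Some(h)) if c == h => Ok(()),
        _ => Err(403),
    }
}
```

A native form POST arrives with the cookie but no header, so it fails the check unless enforcement is switched off.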

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Variant Naming Authority fixes surfaced during the cutover smoke test:

- Display the hs1 (T2T-CHM13) coordinate the de-novo catalog is built on,
  not GRCh38 (most de-novo variants had no GRCh38 coord → blank). Show the
  mutation state (ancestral→derived) the curator needs to name a variant.
- Dedup by SITE (locus + ancestral + derived), not position alone: two
  distinct SNPs can share a position with different alleles (Z12236 A>C vs
  Y17125 A>G) and are not duplicates.
- Reuse an established (non-DU) name via a new "adopt" action instead of
  minting — "named by definition" — guarded against the (name, branch)
  unique index so a taken name routes to merge review, not a 500.
- Scope the "needs a name" queue to variants that DEFINE a branch: a
  canonical name is (name + branch), so branch-less catalog rows are
  reference data, not naming work.
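
The dedup-by-site rule from the list above can be sketched as a set keyed on the full site. The `Site` struct and `dedup_by_site` are hypothetical names for illustration.

```rust
use std::collections::HashSet;

/// A site is the locus plus both alleles, not the position alone.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Site {
    pub contig: String,
    pub position: u64,
    pub ancestral: String,
    pub derived: String,
}

/// Keep the first occurrence of each distinct site, preserving input order.
pub fn dedup_by_site(sites: Vec<Site>) -> Vec<Site> {
    let mut seen = HashSet::new();
    sites.into_iter().filter(|s| seen.insert(s.clone())).collect()
}
```

Two rows that share a position but differ in the derived allele (as with A>C versus A>G) hash to different keys and both survive.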

The (canonical_name, defining_haplogroup_id) model — same SNP name canonical
on each branch it defines, replacing ISOGG's L270.1/.2 suffixes — was never
populated (defining_haplogroup_id NULL everywhere), collapsing the unique
index to global-name-uniqueness. scripts/backfill-defining-haplogroup.sql
populates it from the tree links (row-per-(name,branch), recurrence split).
Durable loader support (denovo::load setting it + clear_dna deleting the
recurrence copies) is the tracked follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Naming queue is Y-DNA only: mtDNA variants are identified by their
  [anc]pos[der] coordinate string and never get a DU name, so exclude
  them from "needs a name" (they were sorting first and filling the queue).
- Load bootstrap.bundle.min.js in <head> (deferred) instead of at end of
  <body>: hx-boost swaps the whole body on every nav, re-executing a
  body <script> each time and stacking duplicate Bootstrap event-delegation
  listeners (dropdowns/toggles double-fire; htmx content swaps misbehave
  until a full reload). In <head> it runs once and survives boosts.
- hx-boost="false" on the nav dropdown-toggle anchors (href="#"): htmx was
  boosting their click and navigating to "#" instead of letting Bootstrap
  open the menu.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Prev/Next buttons put mode in their hx-get URL AND inherit
hx-include="#nm-mode" from the parent #naming-table, so htmx appended
mode a second time (?mode=X&page=N&mode=X). serde rejects the duplicate
key → 400, so pagination silently failed. Drop mode from the button
URLs and let the inherited hx-include supply it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The de-novo loader marks every branch-defining variant as NAMED, using a
synthetic coordinate string as its canonical_name (e.g. "chrY:10158192
C->G"), and also copies that string into common_names. The naming queue
already surfaced these (canonical_name LIKE 'chr%:%'), but the detail
panel keyed "already named" purely off naming_status = NAMED, so the
curator saw them as named with no way to mint a DU identifier.

Add du_db::naming::is_placeholder_name and thread it through:
- assign_du_name / adopt_established_name accept a placeholder-named
  NAMED variant as still-nameable (real name blocks, placeholder doesn't)
- assign_du_name purges chr placeholders from common_names on mint and
  never preserves a placeholder as an alias
- established_name / common_names display filter skip placeholders so
  they aren't offered for adoption or shown as alias chips
- list + detail render placeholder-named rows as UNNAMED (unnamed)

Regression test covers the NAMED-placeholder mint path.
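
For illustration, a placeholder check that mirrors the queue's `LIKE 'chr%:%'` pattern might look like the sketch below. The function names are hypothetical and the real du_db::naming::is_placeholder_name may apply different rules.

```rust
/// True when the name matches the SQL pattern 'chr%:%':
/// it starts with "chr" and has a ':' somewhere after that prefix.
pub fn looks_like_placeholder(name: &str) -> bool {
    name.strip_prefix("chr").map_or(false, |rest| rest.contains(':'))
}

/// Drop placeholder strings from an alias list, as done on mint.
pub fn purge_placeholders(common_names: &mut Vec<String>) {
    common_names.retain(|n| !looks_like_placeholder(n));
}
```

A NAMED row whose canonical_name passes this check would then be treated as still nameable, while any other name blocks minting.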

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e-set tools

Same duplicate-query-param bug as the naming tool (24bab37): each of
these fragment lists sits in a container with hx-include pointing at a
filter control (search box or status select), and the Prev/Next buttons
ALSO hard-coded that same param in their hx-get URL. htmx appends the
included value on top of the URL's, producing a duplicate key
(?query=&page=2&query=) which axum's Query extractor rejects with 400.

Fix (matching the naming + haplogroups pattern): drop the hard-coded
filter param from the pagination URL and keep only `page` — the inherited
hx-include supplies the live filter value (also fixes stale-snapshot
filters when paging).

Tools fixed: regions (the reported one), variants, publications,
proposals, change-sets. Audited all curator fragment pagers:
haplogroups/naming already correct; instrument-proposals and
denovo-conflicts have no hx-include so don't collide; inbox has no
Prev/Next controls.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two follow-ups to the pagination-400 sweep.

1. Defense-in-depth: crate::extract::Query, a drop-in replacement for
   axum::extract::Query that no longer 400s on a repeated query key.
   axum's Query (serde_urlencoded) and axum-extra's (serde_html_form)
   both reject a duplicated scalar key — exactly what HTMX produces when
   a pagination link hard-codes a param that the container also sends via
   hx-include. The new extractor splits the raw query into ordered pairs
   (a sequence tolerates repeats), collapses duplicates keeping the last
   value (form semantics), then deserializes. Swapped into all 11 curator
   fragment route modules. Unit tests cover last-wins, plain, and empty.

2. Inbox pagination: the team inbox rendered a "Page x / y" label with no
   controls, so curators couldn't get past page 1. Added Prev/Next buttons
   (its container has no hx-include, so status is carried in the URL — now
   doubly safe under the new extractor).
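
The last-wins collapse in item 1 can be sketched with the standard library alone. This is not the crate::extract::Query implementation: `collapse_last_wins` is a hypothetical helper, it skips percent-decoding, and it stops before the deserialization step.

```rust
/// Split a raw query string into ordered pairs and collapse repeated
/// keys, keeping each key's first position but its last value.
pub fn collapse_last_wins(raw_query: &str) -> Vec<(String, String)> {
    let mut out: Vec<(String, String)> = Vec::new();
    for pair in raw_query.split('&').filter(|p| !p.is_empty()) {
        // A key without '=' is treated as having an empty value.
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        match out.iter().position(|entry| entry.0.as_str() == key) {
            Some(i) => out[i].1 = value.to_string(),
            None => out.push((key.to_string(), value.to_string())),
        }
    }
    out
}
```

After this step every key is unique, so a scalar-field deserializer no longer sees the duplicate that triggered the 400.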

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tions

The Sequencer-Lab tool was proposal-only — a consensus review queue that's
empty until federation flows. The 36 preseeded (mig 0038) instrument→lab
mappings were applied to sequencer_instrument.lab_id and served by the
public lookup, but had no maintenance surface.

Add a second tab that lists every instrument→lab association (search +
pagination, unassigned first) and lets a curator reassign the lab, fix the
model/manufacturer, and set the D2C flag directly — no proposal needed.
Reuses the get-or-create-lab + audited-update path from accept_proposal,
keyed by sequencer_instrument.id.

du_db::sequencer: list_established, established_detail, list_labs,
update_instrument_lab (audited REASSIGN_LAB). New routes under
/curator/instrument-labs/* (distinct prefix — no :id route collision).
Bootstrap nav-tabs on the page; each tab is its own two-panel pane.
Pagination follows the ?page-only + hx-include pattern. i18n en/es/fr.
DB test covers list/search/reassign/create-new-lab/audit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…idation

Reference samples used as de-novo tree building blocks showed "No haplogroup
call" even though they're placed in the tree. Root cause: one individual is
split across several core.biosample rows — multiple publication accessions
(SAMN… NCBI + SAMEA… EBI) plus a de-novo tree tip keyed by the panel id
(NA18530) — and the ETL gave each its own specimen_donor while leaving the
tip donor-less. The tree placement (tree.haplogroup_sample) landed on the
tip, so it couldn't surface on the catalog accessions. ~2,660 individuals
affected.

Fix uses the donor as the linking entity (the entity above biosamples):

- du_db::donor::consolidate_denovo_donors — groups biosamples by shared panel
  id (tip.accession == ref.alias), points every member at the richest donor,
  fills its gaps + canonical identifier, prunes the emptied donors. New
  run-once job `consolidate-donors` (preview unless --apply). Applied to the
  cutover DB: 2,661 groups, 2,684 biosamples repointed, 23 donors pruned.
- Sample report resolves the haplogroup at the DONOR level: a new
  TreePlacement call origin sits between FedConsensus and Original in the
  precedence, resolving tree.haplogroup_sample across the donor's biosamples.
  Shows on every accession with a "from de-novo tree placement" badge.
- De-novo loader now attaches each tip to the shared donor at creation
  (reuse an existing catalog donor by alias/accession, else create one keyed
  by the panel id) so reloads don't recreate donor-less tips.
- dedup merge plan: cover genomics.biosample_str_profile (mig 0053, was
  stale → assert_fk_coverage would reject any merge); KeepSurvivor now
  handles a table unique on the repoint column alone.
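
The consolidation step in the first bullet can be sketched in memory. Everything here is an assumption for illustration: the `Biosample` shape, a single `panel_id` field standing in for the tip.accession == ref.alias match, and a numeric `richness` score per donor. It also assumes a donor is never shared across panel ids.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Hypothetical biosample row; the de-novo tip starts donor-less.
#[derive(Clone, Debug)]
pub struct Biosample {
    pub id: i64,
    pub panel_id: String,
    pub donor_id: Option<i64>,
}

/// Point every biosample in a panel-id group at the richest donor and
/// return the donor ids left empty (candidates for pruning).
pub fn consolidate(biosamples: &mut [Biosample], richness: &HashMap<i64, u32>) -> Vec<i64> {
    // Group row indices by shared panel id.
    let mut groups: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    for (i, b) in biosamples.iter().enumerate() {
        groups.entry(b.panel_id.clone()).or_default().push(i);
    }
    let mut pruned = BTreeSet::new();
    for members in groups.values() {
        let donors: BTreeSet<i64> =
            members.iter().filter_map(|&i| biosamples[i].donor_id).collect();
        // Groups with no donor at all are left untouched in this sketch.
        let Some(&keep) = donors
            .iter()
            .max_by_key(|d| richness.get(*d).copied().unwrap_or(0))
        else {
            continue;
        };
        for &i in members {
            biosamples[i].donor_id = Some(keep);
        }
        pruned.extend(donors.into_iter().filter(|d| *d != keep));
    }
    pruned.into_iter().collect()
}
```

Returning the pruned ids rather than deleting anything matches the preview-first style of the job: a caller can print the plan and only act on it when asked to apply.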

Tests: donor_consolidation (split→consolidate→prune→report surfaces
placement on both accessions); existing dedup_merge/merge_e2e still pass.
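
The donor-level precedence for the sample report could be modelled with an ordered enum. This sketch assumes FedConsensus outranks TreePlacement, which outranks Original, and it omits any other call origins the real code may have; `CallOrigin` and `resolve` are hypothetical names.

```rust
/// Call origins in ascending precedence (derive(Ord) follows declaration order).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CallOrigin {
    Original,
    TreePlacement,
    FedConsensus,
}

/// Pick the highest-precedence call among those gathered across all of
/// a donor's biosamples. Each entry is (origin, haplogroup label).
pub fn resolve<'a>(calls: &[(CallOrigin, &'a str)]) -> Option<(CallOrigin, &'a str)> {
    calls.iter().copied().max_by_key(|(origin, _)| *origin)
}
```

Because the calls are pooled per donor, a placement recorded on the tree tip is visible from every accession that shares the donor.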

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The landing page had a single Y-DNA tree button and no explanation of what
Decoding Us is. Add:
- an mtDNA tree CTA alongside the Y-DNA one (both now icon buttons), keeping
  the variant-search button;
- a "What is Decoding Us?" section — three cards (open haplogroup trees,
  public variant catalog, private discovery & research);
- a "How to participate" section — explore / contribute your DNA / own your
  data / collaborate, with a link to the About page.

Renamed home.cta.tree → home.cta.ytree and added home.cta.mtree + the
eco.*/participate.* strings across en/es/fr (25 home.* keys, at parity).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
JamesKane merged commit 5638451 into main on Jul 2, 2026
1 check failed