Docker one-command WSI TIL inference + smoke test #3

Open
YoniSchirris wants to merge 6 commits into main from claude/docker-wsi-inference-2KL7d

YoniSchirris commented May 26, 2026

Summary

  • Adds a one-command, Dockerised pipeline to run end-to-end ECTIL TIL inference on whole-slide images: tissue mask (FESI) → DLUP tiling → RetCCL feature extraction → MeanMIL + GatedAttention → TIL score. Supports a single slide or a directory of slides, writes a timestamped run dir with per-slide artifacts (tils_score.json, tile_predictions.csv, features.h5, thumbnail/mask/heatmap PNGs) and an aggregate tils_scores.csv.
  • Fixes the Docker build: pins setuptools/wheel/packaging as a matched set with pip==23.3.2 (the editable install otherwise failed because setuptools called canonicalize_version with the strip_trailing_zero keyword, which the resolved packaging release did not accept), and relaxes python_requires from ==3.10.9 to >=3.10,<3.11 so that conda-resolved 3.10.x patch releases install.
  • Adds --overwrite-mpp so slides without an embedded spacing (many TCGA SVS) can be opened/tiled instead of raising dlup's UnsupportedSlideError. Off by default (None); when set, it supplies the slide's native base spacing so dlup can resample to the model's target --mpp 0.5 (the value used in the extraction experiments — see configs/datamodule/encoder/retccl.yaml). For TCGA 40x diagnostic slides that native spacing is 0.25.
  • Adds tools/infer/run_demo.sh: a standalone smoke test (plain bash, no external tooling) that downloads the RetCCL + ECTIL weights and 5 TCGA-BRCA slides, builds the image, runs both single-slide and directory modes on CPU, and asserts the outputs before printing SMOKE TEST PASSED/FAILED.
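The spacing arithmetic behind the --overwrite-mpp bullet can be illustrated with a small sketch. This is not code from the PR: `scaling_factor` and `native_tile_px` are hypothetical helpers, the 512 px tile size is an assumed example value, and the sketch assumes the resampling factor is the ratio of native to target spacing.

```python
def scaling_factor(native_mpp: float, target_mpp: float) -> float:
    """Factor applied to the base level to reach the target spacing (assumed: native / target)."""
    if native_mpp <= 0 or target_mpp <= 0:
        raise ValueError("spacings must be positive")
    return native_mpp / target_mpp


def native_tile_px(tile_px: int, native_mpp: float, target_mpp: float) -> int:
    """Edge length, in base-level pixels, covered by one tile at the target spacing."""
    return round(tile_px / scaling_factor(native_mpp, target_mpp))


# TCGA 40x diagnostic slide: native spacing 0.25, model target spacing 0.5.
print(scaling_factor(0.25, 0.5))        # 0.5
print(native_tile_px(512, 0.25, 0.5))   # 1024 (512 px is an assumed tile size)
```

Under these assumptions, passing the wrong native spacing (for example 0.5 for a 40x slide) would make the model see tiles at twice the intended magnification, which is why the flag should carry the slide's true base spacing.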

Verification

Smoke test passed end-to-end on CPU (both modes, all assertions green). Real TIL scores on 5 TCGA-BRCA slides:

| Slide | TIL score | Tiles |
| --- | --- | --- |
| TCGA-AC-A23G | 5.3% | 179 |
| TCGA-OL-A5RW | 10.3% | 102 |
| TCGA-OL-A5RX | 7.9% | 205 |
| TCGA-OL-A5RZ | 24.0% | 266 |
| TCGA-OL-A5S0 | 12.7% | 147 |

Test plan

This is a single command. It downloads everything, builds the container, runs both inference modes, and validates the output. No weights, no slides, and no Python deps need to be set up by hand.

1. Clone and enter the repo

git clone https://github.com/NKI-AI/ectil
cd ectil

You need Docker running, plus curl and python3 on the host. python3 is used only to bootstrap gdown and parse the result JSON; all the heavy C dependencies live inside the image. If a prerequisite is missing, the script refuses to start and tells you exactly what is missing.
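To check the host prerequisites before launching anything, a sketch along these lines may help. It is not the script's own pre-flight logic: `missing_tools` is a hypothetical helper that only looks for the executables on PATH and does not verify that the Docker daemon is actually running.

```python
import shutil


def missing_tools(required=("docker", "curl", "python3")):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


# An empty list means every prerequisite was found.
print(missing_tools())
```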

2. Run the one command

./tools/infer/run_demo.sh

That's it. Everything below happens automatically, and it's idempotent — re-running skips anything already downloaded.

3. What you should see scroll past

The script narrates 7 stages. Roughly:

==> [1/7] Pre-flight checks ...
    docker, curl present; daemon running.
==> [2/7] RetCCL encoder weights ...        # ~94 MB from Google Drive
    -> model_zoo/retccl/retccl_best_ckpt.pth
==> [3/7] ECTIL classifier weights ...      # ~4.7 MB
    -> model_zoo/ectil/tcga/fold_0/epoch_065_step_858_weights_only.ckpt
==> [4/7] TCGA-BRCA slides (5 total) ...    # downloaded to data/wsi/demo/*.svs
    have 5 slides in data/wsi/demo.
==> [5/7] Building Docker image 'ectil-inference' (cached after first build) ...
==> [6/7] Running SINGLE-SLIDE mode (...) ...   # --wsi <one .svs>  -> 1 result
==> [6/7] Running DIRECTORY mode (all 5 slides ...) ...   # --wsi data/wsi/demo -> 5 results
==> [7/7] Validating outputs ...
    [ok]   tils_scores.csv has 1 data row(s)
    [ok]   5 per-slide subdir(s)
    [ok]   score sane for TCGA-...: 0.xx
    ...

The two [6/7] runs are the point of the test: single-slide (--wsi <one .svs>, expect 1 row) and directory (--wsi data/wsi/demo, expect 5 rows) modes.

4. What "pass" looks like

It ends with a per-slide score table and:

============================================================
SMOKE TEST PASSED
TIL score (single-slide TCGA-...): 0.xxxx   |   run dir: data/inference_output/demo_single

and exits 0. On any problem it prints SMOKE TEST FAILED with the offending [FAIL] lines and exits 1.
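Because the verdict is also carried by the exit code, the demo can be wrapped by other tooling without parsing its log. The following is a minimal sketch of such a wrapper; `smoke_test_passed` is a hypothetical helper, not part of the PR.

```python
import subprocess


def smoke_test_passed(command):
    """Run a command and decide pass/fail from its exit code alone (0 means pass)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout
```

Calling `smoke_test_passed(["./tools/infer/run_demo.sh"])` from the repository root would then return `True` together with the captured log on a passing run, and `False` otherwise.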

5. Inspect the artifacts (optional)

ls data/inference_output/demo_dir/          # config.json, tils_scores.csv, 5 per-slide subdirs
cat data/inference_output/demo_dir/tils_scores.csv

Each per-slide subdir holds: tils_score.json, tile_predictions.csv, features.h5, and thumbnail.png / mask.png / mask_overlay.png / attention_heatmap.png / til_heatmap.png. Spot-check a til_heatmap.png against the slide to sanity-check the spatial predictions.

Everything downloaded (data/wsi, model_zoo/**) and written (data/inference_output) is gitignored.
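The artifact layout described above can also be checked programmatically. The sketch below is an illustration rather than the validation code in run_demo.sh: `check_run_dir` is a hypothetical helper, it assumes every subdirectory of the run dir is a per-slide directory, and it makes no assumption about the column names in tils_scores.csv.

```python
import csv
from pathlib import Path

# Per-slide artifact names as listed in the PR description.
EXPECTED_FILES = (
    "tils_score.json",
    "tile_predictions.csv",
    "features.h5",
    "thumbnail.png",
    "mask.png",
    "mask_overlay.png",
    "attention_heatmap.png",
    "til_heatmap.png",
)


def check_run_dir(run_dir, expected_slides):
    """Return a list of problems; an empty list means the run dir looks complete."""
    run_dir = Path(run_dir)
    if not run_dir.is_dir():
        return [f"{run_dir} is not a directory"]
    problems = []

    # The aggregate CSV should hold one data row per slide.
    scores_csv = run_dir / "tils_scores.csv"
    if not scores_csv.is_file():
        problems.append("missing tils_scores.csv")
    else:
        with scores_csv.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
        if len(rows) != expected_slides:
            problems.append(
                f"tils_scores.csv has {len(rows)} data rows, expected {expected_slides}"
            )

    # Assumption: every subdirectory is a per-slide output directory.
    slide_dirs = sorted(p for p in run_dir.iterdir() if p.is_dir())
    if len(slide_dirs) != expected_slides:
        problems.append(
            f"found {len(slide_dirs)} per-slide subdirs, expected {expected_slides}"
        )
    for slide_dir in slide_dirs:
        for name in EXPECTED_FILES:
            if not (slide_dir / name).is_file():
                problems.append(f"{slide_dir.name}: missing {name}")
    return problems
```

For the directory-mode run, `check_run_dir("data/inference_output/demo_dir", 5)` should return an empty list if all five slides were processed.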


Requesting @NUltee as reviewer (recent commits on main).

🤖 Generated with Claude Code

claude and others added 5 commits May 26, 2026 08:44
Provide a single end-to-end entry point (ectil/inference.py) that takes a WSI
and ECTIL classifier weights and runs mask -> tiling -> RetCCL -> ECTIL,
auto-loading RetCCL. Reuses the existing DLUP tiling/FESI mask, RetCCL encoder,
and MeanMIL + GatedAttention components so results match extract.py + eval.py.

Per slide it writes the final TIL score (tils_score.json), per-tile TIL and
attention scores (tile_predictions.csv), the generated feature dataset
(features.h5), thumbnail/mask/mask-overlay images, and attention/TIL heatmaps.

Add a Dockerfile (conda-based, mirrors README install), .dockerignore, an
example run script, and README usage. Weights are mounted at runtime.

https://claude.ai/code/session_018mX7wRvMnm23m4uf44Upq8

--wsi now accepts a directory of slides (recursively globbed by extension,
including .mrxs, whose companion data directory is never matched). Slides that
fail are skipped and recorded rather than aborting the run. RetCCL and the ECTIL
classifier are loaded once and reused across all slides.
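The directory behaviour described here can be sketched roughly as follows. This is not the code in ectil/inference.py: `find_slides` and `run_all` are hypothetical names, and the extension set is an assumed example because the commit message does not list the real one. Matching only regular files by suffix is one simple way to keep an .mrxs companion directory (and the data files inside it) out of the results.

```python
from pathlib import Path

# Assumed example extensions; the actual list in the PR is not shown.
SLIDE_EXTENSIONS = {".svs", ".tif", ".tiff", ".ndpi", ".mrxs"}


def find_slides(root):
    """Recursively collect regular files under root whose suffix is a slide extension."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in SLIDE_EXTENSIONS
    )


def run_all(slides, process):
    """Apply process to each slide, recording failures instead of aborting the batch."""
    results, failures = {}, {}
    for slide in slides:
        try:
            results[slide] = process(slide)
        except Exception as error:  # broad on purpose: one bad slide must not stop the run
            failures[slide] = repr(error)
    return results, failures
```

Loading the models once outside `run_all` and closing over them in `process` would mirror the "loaded once and reused" behaviour the commit describes.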

Each run writes a timestamped <output>/<run_name>/ directory (override with
--run-name) containing config.json, an aggregate tils_scores.csv (one row per
slide, written incrementally), and a per-slide subdir. Per-slide tils_score.json
now embeds the full run config.

https://claude.ai/code/session_018mX7wRvMnm23m4uf44Upq8

The editable install failed inside the image because pip 23.3.2 was paired
with a setuptools whose _core_metadata calls canonicalize_version with
strip_trailing_zero, a kwarg the resolved packaging lacked. Pin
setuptools/wheel/packaging as a matched set. Also relax python_requires from
==3.10.9 to >=3.10,<3.11 so conda-resolved 3.10.x patch releases install.
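The effect of relaxing python_requires can be illustrated with a simplified comparison. This sketch deliberately ignores pre-release tags and the rest of the version-specifier rules that pip applies; `parse_version`, `satisfies_old_pin`, and `satisfies_new_range` are hypothetical helpers for illustration only.

```python
def parse_version(text):
    """Turn '3.10.14' into (3, 10, 14); pre-release tags are out of scope here."""
    return tuple(int(part) for part in text.split("."))


def satisfies_old_pin(version):
    """Old constraint: ==3.10.9."""
    return parse_version(version) == (3, 10, 9)


def satisfies_new_range(version):
    """New constraint: >=3.10,<3.11."""
    return (3, 10) <= parse_version(version) < (3, 11)


for candidate in ("3.10.9", "3.10.14", "3.11.0"):
    print(candidate, satisfies_old_pin(candidate), satisfies_new_range(candidate))
# 3.10.9 True True
# 3.10.14 False True
# 3.11.0 False False
```

An interpreter such as 3.10.14, which conda may resolve, is rejected by the old pin but accepted by the new range, while 3.11 remains excluded.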

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add --overwrite-mpp so TCGA-style SVS that lack an embedded micron-per-pixel
can be opened and tiled (dlup otherwise raises UnsupportedSlideError);
forwarded to both SlideImage.from_file_path and from_standard_tiling. Decode
the slide thumbnail a single time and reuse it for the saved PNG and both
heatmap overlays, hoist csv/PIL imports, and derive --mask-function choices
from AvailableMaskFunctions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

run_demo.sh downloads the RetCCL encoder and ECTIL classifier weights and five
TCGA-BRCA slides, builds the Docker image, runs both single-slide and directory
inference on CPU, and asserts the expected per-slide outputs before printing a
SMOKE TEST PASSED/FAILED verdict. Ignore the demo's data/inference_output run dir.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

YoniSchirris requested a review from NUltee May 26, 2026 13:49

Lead with WSI inference (smoke test, Docker, direct), move the
manuscript-reproduction details into a collapsible section, and surface run_demo.sh.
Fix broken example commands (missing line-continuation backslashes in the
extract and eval snippets), a malformed markdown link, the clone URL
(YoniSchirris -> nki-ai, also in setup.py), and note --overwrite-mpp for
TCGA slides that lack an embedded spacing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>