Address review: fix accuracy, conventions, emphasize visual form-filling

ochafik · ochafik · commit 9ba70e2d1b7d · 2026-03-24T15:15:03.000Z
Accuracy:
- list_pdfs: 'remote origins' → 'local directories' (the server has no
  remote allowlist; any HTTPS URL works)
- get_screenshot: 'PNG' → 'image' (server returns JPEG)
- Supported sources: clarify only arXiv auto-converts /abs/→PDF;
  others need direct PDF URLs
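
A minimal sketch of the arXiv rule, assuming the conversion is a plain
path rewrite from /abs/ to /pdf/. The helper name arxiv_abs_to_pdf is
hypothetical and the server's actual logic may differ; every other URL
is passed through untouched, which is why landing pages from other
hosts will not resolve to a PDF.

```python
from urllib.parse import urlparse, urlunparse


def arxiv_abs_to_pdf(url: str) -> str:
    """Rewrite an arXiv abstract URL to its PDF URL; return other URLs unchanged."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host in ("arxiv.org", "www.arxiv.org") and parts.path.startswith("/abs/"):
        # Keep the paper identifier, swap only the leading path segment.
        pdf_path = "/pdf/" + parts.path[len("/abs/"):]
        return urlunparse(parts._replace(path=pdf_path))
    # Hypothetical pass-through: non-arXiv URLs must already point at a PDF.
    return url
```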

Conventions (matching legal/, productivity/ patterns):
- Add argument-hint frontmatter to all 4 commands
- Add CONNECTORS.md callout to all commands

Positioning — form filling:
- Emphasize this is VISUAL form filling (vs programmatic tools)
- Document the unnamed-field workflow: screenshot → match bounding
  boxes to visual labels → fill by name. Many real PDFs have fields
  named 'Text1', 'Field_7' with labels only on the rendered page.
- User gets live feedback and can edit directly in the viewer
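
The unnamed-field workflow above can be sketched as a nearest-label
match followed by building the fill_form payload. This is an
illustration under assumptions, not the server's code: the Box layout,
the label positions (in practice read off the screenshot), and the
helper names match_fields_to_labels and build_fill_fields are all
hypothetical. Only the fields: [{name, value}, ...] shape comes from
the docs in this commit.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """Axis-aligned box in page coordinates (assumed layout, not the server's schema)."""
    x: float
    y: float
    width: float
    height: float

    def center(self) -> tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


def match_fields_to_labels(fields: dict[str, Box], labels: dict[str, Box]) -> dict[str, str]:
    """Map each field name to the label whose center is closest to the field's center."""
    matched = {}
    for name, field_box in fields.items():
        fx, fy = field_box.center()

        def distance(label_text: str) -> float:
            lx, ly = labels[label_text].center()
            return hypot(lx - fx, ly - fy)

        matched[name] = min(labels, key=distance)
    return matched


def build_fill_fields(matched: dict[str, str], answers: dict[str, str]) -> list[dict]:
    """Turn answers keyed by visual label into fill_form's fields: [{name, value}, ...]."""
    return [
        {"name": name, "value": answers[label]}
        for name, label in matched.items()
        if label in answers  # skip fields the user has not answered yet
    ]


# Cryptic field names with boxes, plus labels as they appear on the rendered page.
fields = {"f1_1": Box(100, 50, 200, 20), "f1_2": Box(100, 90, 200, 20)}
labels = {"Name": Box(20, 50, 60, 20), "Business name": Box(20, 90, 60, 20)}
matched = match_fields_to_labels(fields, labels)
payload = build_fill_fields(matched, {"Name": "Jane Doe"})
```

Nearest-center distance is a deliberately crude heuristic; dense forms
would likely need a directional bias (labels to the left of or above a
field), and the screenshot remains the source of truth.
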
diff --git a/pdf/README.md b/pdf/README.md
@@ -45,10 +45,9 @@ the PDF server starts automatically when the plugin loads.
 ## Supported PDF Sources
 
 - Local files (file paths in your working directory)
-- [arXiv](https://arxiv.org), [bioRxiv](https://biorxiv.org),
-  [medRxiv](https://medrxiv.org), [chemRxiv](https://chemrxiv.org)
-- [Zenodo](https://zenodo.org), [OSF](https://osf.io),
-  [HAL Science](https://hal.science), [SSRN](https://ssrn.com)
+- [arXiv](https://arxiv.org) — `/abs/` URLs auto-convert to PDF
+- Any direct HTTPS PDF URL (bioRxiv, Zenodo, OSF, etc. — use the PDF
+  link, not the landing page)
 
 ## Signature Disclaimer
 
diff --git a/pdf/commands/annotate.md b/pdf/commands/annotate.md
@@ -1,7 +1,10 @@
 ---
 description: Collaboratively annotate a PDF — propose markup, review together, iterate
+argument-hint: "[path-or-url]"
 ---
 
+> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
+
 # Annotate PDF
 
 Walk through a document with the user, proposing and applying
diff --git a/pdf/commands/fill-form.md b/pdf/commands/fill-form.md
@@ -1,48 +1,73 @@
 ---
-description: Fill PDF form fields interactively
+description: Fill PDF form fields interactively with live visual feedback
+argument-hint: "[path-or-url]"
 ---
 
+> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
+
 # Fill Form
 
-Help the user complete a fillable PDF form with live preview.
+Help the user complete a fillable PDF form in the live viewer. Unlike
+programmatic form tools, this gives the user **direct visual feedback**
+on every field as it's filled, with easy undo/edit in the viewer.
+
+## Why use this instead of programmatic form filling
+
+- **Visual confirmation** — the user sees each value land in the right
+  box, not just a success message
+- **Unnamed/unlabeled fields** — many real-world PDFs have fields with
+  machine names like `Text1`, `Field_7`, or no name at all. The label
+  ("Date of Birth", "SSN") is printed **next to** the field on the
+  rendered page, not in the field metadata. Use `get_screenshot` to
+  see what each field actually is, then fill by name.
+- **Easy correction** — the user can edit or clear any field directly
+  in the viewer, or ask you to `fill_form` again with new values
 
 ## Two approaches
 
-### User-driven (recommended for simple forms)
+### User-driven (simple, well-labeled forms)
 
 Call `display_pdf` with `elicit_form_inputs: true`. The server detects
 form fields and prompts the user to enter values **before** the viewer
 opens. The filled PDF is then displayed.
 
-### AI-assisted (for complex forms or when you have context)
+### AI-assisted (complex forms, unnamed fields, or when you have context)
 
-1. `display_pdf` (without elicit) — inspect the returned `formFields`
-   array (name, type, page, bounding box)
-2. For each field, either:
+1. `display_pdf` (without elicit) — inspect returned `formFields`
+   (name, type, page, bounding box)
+2. If field names are cryptic (`Text1`, `Field_7`), use `interact` →
+   `get_screenshot` of each page with fields. Look at the visual
+   labels next to each bounding box to understand what each field is.
+3. For each field, either:
    - Infer the value from conversation context (name, date, email)
-   - Ask the user directly
-3. `interact` → `fill_form` with `fields: [{name, value}, ...]`
-4. `interact` → `get_screenshot` of each page with filled fields
-5. Ask the user to confirm before they download
+   - Ask the user, describing the field by its **visual** label
+     ("the 'Date of Birth' box on page 1")
+4. `interact` → `fill_form` with `fields: [{name, value}, ...]`
+5. `interact` → `get_screenshot` of each filled page
+6. Show the user, ask them to confirm or edit
 
 ## Example
 
 > **User:** Help me fill out this W-9
 >
-> *You:* `display_pdf` → see formFields: Name, Business name, Address,
-> TIN, Signature, Date
+> *You:* `display_pdf` → formFields: `f1_1`, `f1_2`, `f1_3`, `c1_1`, ...
+> (cryptic names)
+>
+> *You:* `interact` → `get_screenshot` page 1 → see `f1_1` is next to
+> "Name", `f1_2` is "Business name", `c1_1` is the "Individual" checkbox
 >
-> *You:* "I see 6 fields. I can fill Name and Date from what I know —
-> what's your TIN and business address?"
+> *You:* "I can see Name, Business name, Address, TIN, and tax
+> classification checkboxes. I'll fill Name and Date from what I
+> know — what's your TIN and business address?"
 >
 > *After answers:* `interact` → `fill_form` + `get_screenshot`
 >
-> *You:* "Here's the filled form. The signature field is still blank —
-> want to add your signature image with `/pdf:sign`?"
+> *You:* "Here's the filled form [screenshot]. The signature line is
+> still blank — want to add your signature with `/pdf:sign`?"
 
 ## Notes
 
-- Signature fields are usually separate from text fields — fill text
-  first, then hand off to `/pdf:sign` for the image
-- Some forms have checkboxes/radio buttons — `value` is `true`/`false`
-  or the selected option string
+- Signature fields are usually separate — fill text first, then hand
+  off to `/pdf:sign` for the image
+- Checkbox/radio values are `true`/`false` or the option string
+- The user can always drag & edit fields directly in the viewer
diff --git a/pdf/commands/open.md b/pdf/commands/open.md
@@ -1,7 +1,10 @@
 ---
 description: Open a PDF in the interactive viewer
+argument-hint: "[path-or-url]"
 ---
 
+> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
+
 # Open PDF
 
 Display a PDF document in the live viewer. Use this when the user wants
@@ -22,7 +25,7 @@ to **see** a document — not just extract its text.
 
 - Local files (paths or drag-and-drop into your working directory)
 - arXiv (`arxiv.org/abs/...` auto-converts to PDF URL)
-- bioRxiv, medRxiv, chemRxiv, Zenodo, OSF, HAL Science, SSRN
+- Any direct HTTPS PDF URL (use the PDF link, not a landing page)
 
 ## When NOT to use this
 
diff --git a/pdf/commands/sign.md b/pdf/commands/sign.md
@@ -1,7 +1,10 @@
 ---
 description: Place a signature or initials image on a PDF
+argument-hint: "[path-or-url] [signature-image-path]"
 ---
 
+> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
+
 # Sign PDF
 
 Add a visual signature or initials to a document using an image
diff --git a/pdf/skills/pdf/SKILL.md b/pdf/skills/pdf/SKILL.md
@@ -30,7 +30,7 @@ on markup — not streaming text back to you.
 ## Tools
 
 ### `list_pdfs`
-List available local PDFs and allowed remote origins. No arguments.
+List available local PDFs and allowed local directories. No arguments.
 
 ### `display_pdf`
 Open a PDF in the interactive viewer. **Call once per document.**
@@ -66,7 +66,7 @@ more commands. **Batch multiple commands in one call** via the
 **Extraction actions:**
 - `get_text` — extract text from page ranges (max 20 pages). Use for
   reading content to decide what to annotate, NOT for summarization.
-- `get_screenshot` — capture a page as PNG (verify your annotations)
+- `get_screenshot` — capture a page as an image (verify your annotations)
 
 **Form action:**
 - `fill_form` — fill named fields: `fields: [{name, value}, ...]`
@@ -105,12 +105,22 @@ images directly onto the viewer.
 6. When done, remind them they can download the annotated PDF from the
    viewer toolbar
 
-### Form filling
-1. `display_pdf` with `elicit_form_inputs: true` — the user fills
-   fields in a prompt before the viewer opens
-2. OR: `display_pdf`, inspect returned `formFields`, ask the user for
-   values, then `interact` → `fill_form`
-3. `get_screenshot` to confirm
+### Form filling (visual, not programmatic)
+Unlike headless form tools, this gives the user **live visual
+feedback** and handles forms with cryptic/unnamed fields where the
+label is printed on the page rather than in field metadata.
+
+1. `display_pdf` — inspect returned `formFields` (name, type, page,
+   bounding box)
+2. If field names are cryptic (`Text1`, `Field_7`), `get_screenshot`
+   the pages and match bounding boxes to visual labels
+3. Ask the user for values using the **visual** labels, or infer from
+   context
+4. `interact` → `fill_form`, then `get_screenshot` to show the result
+5. User confirms or edits directly in the viewer
+
+For simple well-labeled forms, `display_pdf` with
+`elicit_form_inputs: true` prompts the user upfront instead.
 
 ### Signing (visual, not certified)
 1. Ask for the signature/initials image path
@@ -126,7 +136,9 @@ certified or cryptographic digital signature.
 ## Supported Sources
 
 - Local files (paths under client MCP roots)
-- arXiv, bioRxiv, medRxiv, chemRxiv, Zenodo, OSF, HAL Science, SSRN
+- arXiv (`/abs/` URLs auto-convert to PDF)
+- Any direct HTTPS PDF URL (bioRxiv, Zenodo, OSF, etc. — use the
+  direct PDF link, not the landing page)
 
 ## Out of Scope