Commit 9ba70e2

Address review: fix accuracy, conventions, emphasize visual form-filling
Accuracy:
- list_pdfs: 'remote origins' → 'local directories' (server has no remote allowlist, any HTTPS works)
- get_screenshot: 'PNG' → 'image' (server returns JPEG)
- Supported sources: clarify only arXiv auto-converts /abs/→PDF; others need direct PDF URLs

Conventions (matching legal/, productivity/ patterns):
- Add argument-hint frontmatter to all 4 commands
- Add CONNECTORS.md callout to all commands

Positioning - form filling:
- Emphasize this is VISUAL form filling (vs programmatic tools)
- Document the unnamed-field workflow: screenshot → match bounding boxes to visual labels → fill by name. Many real PDFs have fields named 'Text1', 'Field_7' with labels only on the rendered page.
- User gets live feedback and can edit directly in viewer

1 parent e925679 commit 9ba70e2

6 files changed, +80 -35 lines changed

pdf/README.md

Lines changed: 3 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -45,10 +45,9 @@ the PDF server starts automatically when the plugin loads.
4545
## Supported PDF Sources
4646

4747
- Local files (file paths in your working directory)
48-
- [arXiv](https://arxiv.org), [bioRxiv](https://biorxiv.org),
49-
[medRxiv](https://medrxiv.org), [chemRxiv](https://chemrxiv.org)
50-
- [Zenodo](https://zenodo.org), [OSF](https://osf.io),
51-
[HAL Science](https://hal.science), [SSRN](https://ssrn.com)
48+
- [arXiv](https://arxiv.org): `/abs/` URLs auto-convert to PDF
49+
- Any direct HTTPS PDF URL (bioRxiv, Zenodo, OSF, etc. — use the PDF
50+
link, not the landing page)
5251

5352
## Signature Disclaimer
5453

pdf/commands/annotate.md

Lines changed: 3 additions & 0 deletions
@@ -1,7 +1,10 @@
11
---
22
description: Collaboratively annotate a PDF — propose markup, review together, iterate
3+
argument-hint: "[path-or-url]"
34
---
45

6+
> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
7+
58
# Annotate PDF
69

710
Walk through a document with the user, proposing and applying

pdf/commands/fill-form.md

Lines changed: 46 additions & 21 deletions
@@ -1,48 +1,73 @@
11
---
2-
description: Fill PDF form fields interactively
2+
description: Fill PDF form fields interactively with live visual feedback
3+
argument-hint: "[path-or-url]"
34
---
45

6+
> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
7+
58
# Fill Form
69

7-
Help the user complete a fillable PDF form with live preview.
10+
Help the user complete a fillable PDF form in the live viewer. Unlike
11+
programmatic form tools, this gives the user **direct visual feedback**
12+
on every field as it's filled, with easy undo/edit in the viewer.
13+
14+
## Why use this instead of programmatic form filling
15+
16+
- **Visual confirmation** — the user sees each value land in the right
17+
box, not just a success message
18+
- **Unnamed/unlabeled fields** — many real-world PDFs have fields with
19+
machine names like `Text1`, `Field_7`, or no name at all. The label
20+
("Date of Birth", "SSN") is printed **next to** the field on the
21+
rendered page, not in the field metadata. Use `get_screenshot` to
22+
see what each field actually is, then fill by name.
23+
- **Easy correction** — the user can edit or clear any field directly
24+
in the viewer, or ask you to `fill_form` again with new values
825

926
## Two approaches
1027

11-
### User-driven (recommended for simple forms)
28+
### User-driven (simple, well-labeled forms)
1229

1330
Call `display_pdf` with `elicit_form_inputs: true`. The server detects
1431
form fields and prompts the user to enter values **before** the viewer
1532
opens. The filled PDF is then displayed.
1633

17-
### AI-assisted (for complex forms or when you have context)
34+
### AI-assisted (complex forms, unnamed fields, or when you have context)
1835

19-
1. `display_pdf` (without elicit) — inspect the returned `formFields`
20-
array (name, type, page, bounding box)
21-
2. For each field, either:
36+
1. `display_pdf` (without elicit) — inspect returned `formFields`
37+
(name, type, page, bounding box)
38+
2. If field names are cryptic (`Text1`, `Field_7`), use `interact` →
39+
`get_screenshot` of each page with fields. Look at the visual
40+
labels next to each bounding box to understand what each field is.
41+
3. For each field, either:
2242
- Infer the value from conversation context (name, date, email)
23-
- Ask the user directly
24-
3. `interact` → `fill_form` with `fields: [{name, value}, ...]`
25-
4. `interact` → `get_screenshot` of each page with filled fields
26-
5. Ask the user to confirm before they download
43+
- Ask the user, describing the field by its **visual** label
44+
("the 'Date of Birth' box on page 1")
45+
4. `interact` → `fill_form` with `fields: [{name, value}, ...]`
46+
5. `interact` → `get_screenshot` of each filled page
47+
6. Show the user, ask them to confirm or edit
2748

2849
## Example
2950

3051
> **User:** Help me fill out this W-9
3152
>
32-
> *You:* `display_pdf` → see formFields: Name, Business name, Address,
33-
> TIN, Signature, Date
53+
> *You:* `display_pdf` → formFields: `f1_1`, `f1_2`, `f1_3`, `c1_1`, ...
54+
> (cryptic names)
55+
>
56+
> *You:* `interact` → `get_screenshot` page 1 → see `f1_1` is next to
57+
> "Name", `f1_2` is "Business name", `c1_1` is the "Individual" checkbox
3458
>
35-
> *You:* "I see 6 fields. I can fill Name and Date from what I know —
36-
> what's your TIN and business address?"
59+
> *You:* "I can see Name, Business name, Address, TIN, and tax
60+
> classification checkboxes. I'll fill Name and Date from what I
61+
> know — what's your TIN and business address?"
3762
>
3863
> *After answers:* `interact` → `fill_form` + `get_screenshot`
3964
>
40-
> *You:* "Here's the filled form. The signature field is still blank —
41-
> want to add your signature image with `/pdf:sign`?"
65+
> *You:* "Here's the filled form [screenshot]. The signature line is
66+
> still blank — want to add your signature with `/pdf:sign`?"
4267
4368
## Notes
4469

45-
- Signature fields are usually separate from text fields — fill text
46-
first, then hand off to `/pdf:sign` for the image
47-
- Some forms have checkboxes/radio buttons — `value` is `true`/`false`
48-
or the selected option string
70+
- Signature fields are usually separate — fill text first, then hand
71+
off to `/pdf:sign` for the image
72+
- Checkbox/radio values are `true`/`false` or the option string
73+
- The user can always drag & edit fields directly in the viewer
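
The AI-assisted steps in `fill-form.md` above (match cryptic field names to visual labels, then fill by name) can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary shapes, the sample values, and the helper `build_fill_form_fields` are hypothetical, and the label mapping is assumed to have been read off a `get_screenshot` image by hand. Only the `fields: [{name, value}, ...]` payload shape comes from the diff itself.

```python
# Hypothetical subset of the `formFields` entries returned by `display_pdf`.
# Only name/type/page are assumed here; the real entries also carry a bounding box.
form_fields = [
    {"name": "f1_1", "type": "text", "page": 1},
    {"name": "f1_2", "type": "text", "page": 1},
    {"name": "c1_1", "type": "checkbox", "page": 1},
]

# Visual labels matched to each cryptic name after looking at the page-1 screenshot.
visual_labels = {"f1_1": "Name", "f1_2": "Business name", "c1_1": "Individual"}

# Values known so far, keyed by the visual label (sample data for illustration).
answers = {"Name": "Ada Lovelace", "Individual": True}


def build_fill_form_fields(fields, labels, values):
    """Split fields into a fill_form payload and a list of labels still unanswered."""
    payload, missing = [], []
    for field in fields:
        # Fall back to the machine name when no visual label was matched.
        label = labels.get(field["name"], field["name"])
        if label in values:
            payload.append({"name": field["name"], "value": values[label]})
        else:
            missing.append(label)
    return payload, missing


payload, missing = build_fill_form_fields(form_fields, visual_labels, answers)
print(payload)  # candidate `fields` argument for `fill_form`
print(missing)  # labels to ask the user about, by their visual name
```

Keeping the unanswered labels separate makes it easy to ask the user about "the 'Business name' box" rather than `f1_2`, which is the point of the visual workflow. The checkbox value is a boolean, in line with the note that checkbox values are `true`/`false`.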

pdf/commands/open.md

Lines changed: 4 additions & 1 deletion
@@ -1,7 +1,10 @@
11
---
22
description: Open a PDF in the interactive viewer
3+
argument-hint: "[path-or-url]"
34
---
45

6+
> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
7+
58
# Open PDF
69

710
Display a PDF document in the live viewer. Use this when the user wants
@@ -22,7 +25,7 @@ to **see** a document — not just extract its text.
2225

2326
- Local files (paths or drag-and-drop into your working directory)
2427
- arXiv (`arxiv.org/abs/...` auto-converts to PDF URL)
25-
- bioRxiv, medRxiv, chemRxiv, Zenodo, OSF, HAL Science, SSRN
28+
- Any direct HTTPS PDF URL (use the PDF link, not a landing page)
2629

2730
## When NOT to use this
2831

pdf/commands/sign.md

Lines changed: 3 additions & 0 deletions
@@ -1,7 +1,10 @@
11
---
22
description: Place a signature or initials image on a PDF
3+
argument-hint: "[path-or-url] [signature-image-path]"
34
---
45

6+
> If you need to check which tools are connected, see [CONNECTORS.md](../CONNECTORS.md).
7+
58
# Sign PDF
69

710
Add a visual signature or initials to a document using an image

pdf/skills/pdf/SKILL.md

Lines changed: 21 additions & 9 deletions
@@ -30,7 +30,7 @@ on markup — not streaming text back to you.
3030
## Tools
3131

3232
### `list_pdfs`
33-
List available local PDFs and allowed remote origins. No arguments.
33+
List available local PDFs and allowed local directories. No arguments.
3434

3535
### `display_pdf`
3636
Open a PDF in the interactive viewer. **Call once per document.**
@@ -66,7 +66,7 @@ more commands. **Batch multiple commands in one call** via the
6666
**Extraction actions:**
6767
- `get_text` — extract text from page ranges (max 20 pages). Use for
6868
reading content to decide what to annotate, NOT for summarization.
69-
- `get_screenshot` — capture a page as PNG (verify your annotations)
69+
- `get_screenshot` — capture a page as an image (verify your annotations)
7070

7171
**Form action:**
7272
- `fill_form` — fill named fields: `fields: [{name, value}, ...]`
@@ -105,12 +105,22 @@ images directly onto the viewer.
105105
6. When done, remind them they can download the annotated PDF from the
106106
viewer toolbar
107107

108-
### Form filling
109-
1. `display_pdf` with `elicit_form_inputs: true` — the user fills
110-
fields in a prompt before the viewer opens
111-
2. OR: `display_pdf`, inspect returned `formFields`, ask the user for
112-
values, then `interact` → `fill_form`
113-
3. `get_screenshot` to confirm
108+
### Form filling (visual, not programmatic)
109+
Unlike headless form tools, this gives the user **live visual
110+
feedback** and handles forms with cryptic/unnamed fields where the
111+
label is printed on the page rather than in field metadata.
112+
113+
1. `display_pdf` — inspect returned `formFields` (name, type, page,
114+
bounding box)
115+
2. If field names are cryptic (`Text1`, `Field_7`), `get_screenshot`
116+
the pages and match bounding boxes to visual labels
117+
3. Ask the user for values using the **visual** labels, or infer from
118+
context
119+
4. `interact` → `fill_form`, then `get_screenshot` to show the result
120+
5. User confirms or edits directly in the viewer
121+
122+
For simple well-labeled forms, `display_pdf` with
123+
`elicit_form_inputs: true` prompts the user upfront instead.
114124

115125
### Signing (visual, not certified)
116126
1. Ask for the signature/initials image path
@@ -126,7 +136,9 @@ certified or cryptographic digital signature.
126136
## Supported Sources
127137

128138
- Local files (paths under client MCP roots)
129-
- arXiv, bioRxiv, medRxiv, chemRxiv, Zenodo, OSF, HAL Science, SSRN
139+
- arXiv (`/abs/` URLs auto-convert to PDF)
140+
- Any direct HTTPS PDF URL (bioRxiv, Zenodo, OSF, etc. — use the
141+
direct PDF link, not the landing page)
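
To make the arXiv special case concrete, here is a rough sketch of what an `/abs/` to PDF rewrite could look like. The server's actual conversion logic is not shown in this commit, so the function name `arxiv_abs_to_pdf` and its matching rules are assumptions; the only facts relied on are that arXiv serves PDFs under `/pdf/<id>` and that other hosts are expected to be given as direct PDF links.

```python
from urllib.parse import urlsplit, urlunsplit


def arxiv_abs_to_pdf(url: str) -> str:
    """Rewrite an arxiv.org /abs/ URL to its /pdf/ form; return other URLs unchanged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in {"arxiv.org", "www.arxiv.org"} and parts.path.startswith("/abs/"):
        # Keep the paper identifier (including any version suffix) as-is.
        pdf_path = "/pdf/" + parts.path[len("/abs/"):]
        # Query string and fragment are dropped in this sketch.
        return urlunsplit((parts.scheme, parts.netloc, pdf_path, "", ""))
    # Any other host is passed through untouched: landing pages are not converted.
    return url


print(arxiv_abs_to_pdf("https://arxiv.org/abs/1706.03762"))
```

A landing page on another host passes through unchanged in this sketch, which mirrors the guidance above to supply the direct PDF link for everything except arXiv.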
130142

131143
## Out of Scope
132144
