agtx is a macOS- and Windows-friendly, low-dependency native skill manager for Agentex skills.
The v1 implementation is intentionally small: one Go binary, standard library first, and no Python/NPM/Homebrew runtime dependency. The built-in OCR path is native-only and uses explicit ONNX Runtime or ncnn adapter builds instead of Python wrappers. DOCX, XLSX, and PPTX text extraction is built in through an OpenXML reader written against the Go standard library.
agtx search "summarize PDFs and Word files"
agtx install pdf --plan --json
agtx install pdf docx --yes
agtx run pdf --timeout-ms 30000 --output-limit-bytes 1048576 --json
agtx run docx --json -- ./sample.docx
agtx run security_audit --json -- --manifest ./manifest.json --policy strict
agtx uninstall pdf --plan --json
agtx list --json
agtx status
agtx doctor --json
agtx verify pdf --json
agtx config init
agtx config keys --json
agtx config set registry_url https://example.com/agtx/registry.json
agtx config set pro_api_url https://agtx-pro.example.com
agtx config set agent_name Codex
agtx config set package_max_bytes 268435456
agtx config set extracted_max_bytes 1073741824
agtx config set extracted_max_files 8192
agtx registry sources
agtx commerce packs --json
agtx commerce install-pack standard --plan --json
agtx commerce install-pack standard --yes --json
agtx commerce snapshot --pack-id standard --json
agtx pro setup
agtx pro login --open
agtx pro status --json
agtx mcp
Mutating commands require confirmation. Agent callers should pass --json --yes where appropriate and consume the fixed response shape:
{
"ok": true,
"data": {},
"warnings": [],
"trace_id": "agtx-..."
}
For agtx run, --output-limit-bytes bounds captured stdout/stderr, and --input file|- reads input from a file or, with -, from standard input in CLI agent calls. Use -- before skill arguments; any --json or --ndjson after that separator is passed through to the skill, not treated as an agtx output flag.
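As a minimal sketch, a Go caller could decode the fixed response shape shown above as follows. The Envelope type and parseEnvelope helper are hypothetical names, not part of agtx; data and warnings are kept as raw JSON because their element shapes are not specified here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope mirrors the documented response shape. Data and Warnings stay as
// raw JSON because their contents depend on the command (assumption).
type Envelope struct {
	OK       bool              `json:"ok"`
	Data     json.RawMessage   `json:"data"`
	Warnings []json.RawMessage `json:"warnings"`
	TraceID  string            `json:"trace_id"`
}

// parseEnvelope decodes one --json response (hypothetical helper).
func parseEnvelope(raw []byte) (Envelope, error) {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return Envelope{}, fmt.Errorf("decode agtx envelope: %w", err)
	}
	return env, nil
}

func main() {
	sample := []byte(`{"ok": true, "data": {}, "warnings": [], "trace_id": "agtx-..."}`)
	env, err := parseEnvelope(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("ok=%v trace_id=%s warnings=%d\n", env.OK, env.TraceID, len(env.Warnings))
}
```

A wrapper would typically branch on ok first and only then interpret data for the specific command.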
Structured errors include recovery hints for agent callers: unknown commands include supported_commands, nested command errors include supported_subcommands, missing positional arguments include expected_args, flag parse/unexpected-argument errors include supported_flags, and MCP parse/envelope/method/tool/params/argument errors include field, expected, supported_fields, supported_methods, supported_tools, supported_params, or supported_arguments. MCP required-argument, argument-shape, and confirmation errors also include the tool name plus expected argument shape or yes=true retry details.
Prebuilt CLI archives are attached to tagged releases: https://github.com/agentex-ai/agtx/releases/latest.
Maintainers can publish a release by pushing a v* tag, for example:
git tag v0.1.0
git push origin v0.1.0
Release builds should disable cgo and prefer Go-native resolver/user lookup paths:
CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 go build -tags "netgo osusergo" -trimpath -ldflags "-s -w" -o dist/agtx-darwin-arm64 ./cmd/agtx
CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 go build -tags "netgo osusergo" -trimpath -ldflags "-s -w" -o dist/agtx-darwin-amd64 ./cmd/agtx
CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go build -trimpath -ldflags "-s -w" -o dist/agtx-windows-amd64.exe ./cmd/agtx
On macOS, audit dynamic links before release:
otool -L dist/agtx-darwin-arm64
rapidocr, ppocrv6, and paddleocr resolve to the built-in ocr skill.
The default binary exposes the native OCR manifest and probe path without
linking Python or NPM. To build the optional ONNX Runtime adapter, enable cgo
and the ocr_onnxruntime tag:
CGO_ENABLED=1 go build -tags ocr_onnxruntime -o dist/agtx-ocr ./cmd/agtx
Configure the native runtime and RapidOCR model files with environment variables or skill args:
set AGTX_OCR_ONNXRUNTIME_LIBRARY=C:\path\to\onnxruntime.dll
set AGTX_OCR_MODEL_DIR=C:\path\to\ppocr-models
agtx run rapidocr --json -- --probe
agtx run rapidocr --json -- --download-runtime --dry-run
agtx run rapidocr --json -- --download-runtime
agtx run rapidocr --json -- --download-runtime --keep-archive
agtx run rapidocr --json -- --download-models --dry-run
agtx run rapidocr --json -- --download-models
agtx run rapidocr --json -- --det-model ch_PP-OCRv4_det_mobile.onnx --rec-model ch_PP-OCRv4_rec_mobile.onnx --keys ppocr_keys_v1.txt sample.png
agtx run rapidocr --json -- --det-limit-side-len 736 --det-threshold 0.3 --box-threshold 0.5 --unclip-ratio 1.6 --text-score 0.5 sample.png
OCR inference accepts raster image inputs such as PNG, JPEG, GIF, screenshots,
scans, and already-rendered PDF page images. Raw PDF files are not rasterized by
the OCR skill itself; use the built-in pdf skill for text PDFs, or render the
target page to an image before running OCR.
By default agtx looks for the ONNX Runtime shared library under
AGTX_OCR_RUNTIME_DIR, beside the executable, or in the model directory's
runtime subdirectory. --download-runtime downloads the Microsoft ONNX
Runtime CPU archive for the current platform, extracts only the shared library
into that runtime directory, and removes the archive unless --keep-archive is
set. Microsoft no longer publishes macOS Intel CPU archives for ONNX Runtime
1.26.0, so Intel Mac users should provide AGTX_OCR_ONNXRUNTIME_LIBRARY or
explicitly select a compatible older runtime.
The default model profile is rapidocr, matching RapidOCR 3.8.x defaults:
ONNX Runtime, PP-OCRv4 mobile detector, PP-OCRv4 mobile recognizer, and
ppocr_keys_v1.txt. agtx looks for ch_PP-OCRv4_det_mobile.onnx,
ch_PP-OCRv4_rec_mobile.onnx, and ppocr_keys_v1.txt under
AGTX_OCR_MODEL_DIR or the local agtx built-in OCR directory.
--download-models downloads those assets with four candidate sources per
file and verifies SHA-256 before writing: ModelScope www.modelscope.cn,
ModelScope modelscope.cn, Hugging Face SWHL/RapidOCR, and Hugging Face
pitapo/rapidocr for ONNX files; the keys file uses ModelScope, Gitee,
GitHub raw, and jsDelivr. ppocrv6 remains an explicit compatibility profile
for existing PaddlePaddle Hugging Face tiny, small, and medium exports.
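The "try several candidate sources, verify SHA-256 before writing" behavior described above can be sketched roughly like this. It is an illustration under stated assumptions, not the agtx implementation: fetchFunc, verifySHA256, and firstVerified are hypothetical names, and the download is abstracted behind a function so the example runs without network access.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// fetchFunc abstracts a download from one candidate source (hypothetical).
type fetchFunc func(source string) ([]byte, error)

// verifySHA256 reports whether data hashes to the expected hex digest.
func verifySHA256(data []byte, expectedHex string) bool {
	sum := sha256.Sum256(data)
	return strings.EqualFold(hex.EncodeToString(sum[:]), strings.TrimSpace(expectedHex))
}

// firstVerified tries each source in order and returns the first payload
// whose digest matches. Nothing should be written to disk before this passes.
func firstVerified(sources []string, fetch fetchFunc, expectedHex string) ([]byte, string, error) {
	for _, src := range sources {
		data, err := fetch(src)
		if err != nil || !verifySHA256(data, expectedHex) {
			continue // fall through to the next candidate source
		}
		return data, src, nil
	}
	return nil, "", errors.New("no candidate source produced a matching sha256")
}

func main() {
	const helloSHA256 = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
	payloads := map[string][]byte{
		"mirror-a": []byte("tampered"), // placeholder source names
		"mirror-b": []byte("hello"),
	}
	fetch := func(source string) ([]byte, error) {
		data, ok := payloads[source]
		if !ok {
			return nil, errors.New("unreachable source")
		}
		return data, nil
	}
	data, src, err := firstVerified([]string{"mirror-a", "mirror-b"}, fetch, helloSHA256)
	fmt.Println(string(data), src, err)
}
```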
The probe reports whether the adapter is linked, which native library is used,
which files are missing, and the detector/recognizer ONNX input/output metadata
when the models load. Runtime tuning follows the RapidOCR/PaddleOCR defaults:
det_limit_side_len=736, det_threshold=0.3, box_threshold=0.5,
unclip_ratio=1.6, max_candidates=1000, text_score=0.5, and
rec_max_width=1600; each can be overridden with the matching --kebab-case
skill argument or AGTX_OCR_* environment variable. Dynamic-width recognizer
models scale the crop width per detected text box up to rec_max_width, while
--rec-width forces a fixed recognizer width. OCR never falls back to Python,
Node, or npm wrappers.
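To make the naming rule above concrete, here is a small sketch that derives the --kebab-case argument for each tuning parameter. The flag spellings follow the rule stated in the text; the environment variable spelling (AGTX_OCR_ plus the upper-cased parameter name) is an assumption and should be checked against the probe output or source before relying on it.

```go
package main

import (
	"fmt"
	"strings"
)

// flagName converts a snake_case tuning parameter to its --kebab-case form.
func flagName(param string) string {
	return "--" + strings.ReplaceAll(param, "_", "-")
}

// envName guesses the AGTX_OCR_* variable for a parameter.
// ASSUMPTION: upper snake case with the AGTX_OCR_ prefix; not confirmed here.
func envName(param string) string {
	return "AGTX_OCR_" + strings.ToUpper(param)
}

func main() {
	// Defaults as listed in the paragraph above.
	defaults := []struct{ name, value string }{
		{"det_limit_side_len", "736"},
		{"det_threshold", "0.3"},
		{"box_threshold", "0.5"},
		{"unclip_ratio", "1.6"},
		{"max_candidates", "1000"},
		{"text_score", "0.5"},
		{"rec_max_width", "1600"},
	}
	for _, d := range defaults {
		fmt.Printf("%-20s default=%-5s flag=%-22s env=%s\n",
			d.name, d.value, flagName(d.name), envName(d.name))
	}
}
```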
web_search is available as a built-in no-Python runtime for lightweight web
discovery. It sends HTTPS requests to a search endpoint, parses JSON search
proxy responses or HTML result pages, and returns ranked title, URL, snippet,
and source candidates. Localhost HTTP endpoints are allowed for tests and
private search proxies.
agtx run web_search --json -- "agentex capability packs"
agtx run web_search --json -- --query "agentex capability packs" --max-results 5
agtx run web_search --json -- --base-url http://127.0.0.1:8080/search --query "private index"
web_fetch is available as a built-in no-Python runtime for known URLs. It uses
the Go HTTP stack, returns page title, metadata, links, and readable text, and
rejects remote plaintext HTTP except for localhost fixtures.
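The URL policy described above (HTTPS anywhere, plaintext HTTP only for localhost) might look like the following check. This is a sketch of the rule as stated, not the actual agtx code; allowedFetchURL is a hypothetical name, and treating any loopback IP as "localhost" is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// allowedFetchURL accepts HTTPS URLs and plaintext HTTP only for loopback hosts.
func allowedFetchURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Hostname() == "" {
		return false
	}
	switch u.Scheme {
	case "https":
		return true
	case "http":
		host := u.Hostname()
		if host == "localhost" {
			return true
		}
		// ASSUMPTION: loopback IPs such as 127.0.0.1 and ::1 count as localhost.
		ip := net.ParseIP(host)
		return ip != nil && ip.IsLoopback()
	default:
		return false
	}
}

func main() {
	for _, raw := range []string{
		"https://example.com",
		"http://example.com",
		"http://127.0.0.1:8080/search",
		"http://localhost/fixture.html",
	} {
		fmt.Println(raw, allowedFetchURL(raw))
	}
}
```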
agtx run web_fetch --json -- https://example.com
agtx run web_fetch --json -- --url https://example.com --text-only
deep_research is available as a built-in no-Python workflow. It plans search
queries, reads supplied sources or discovered URLs with the built-in web tools,
extracts relevant evidence sentences, and returns structured sources,
findings, caveats, next_actions, and a markdown report. It is an
extractive workflow, so agents should treat the cited source trail as the
ground truth and review it for high-stakes work.
agtx run deep_research --json -- --question "How should capability packs be reused?" --max-sources 5
agtx run deep_research --json --input task.json
security_audit is available as a built-in no-Python static scanner for skill
store submissions, capability pack manifests, local archives or directories,
permissions, dependency declarations, download URLs, hashes, and release notes.
It does not execute package content and does not automatically download remote
packages; URL inputs are inspected as declared metadata.
agtx run security_audit --json -- --manifest ./manifest.json --policy strict
agtx run security_audit --json -- --path ./skill-package.zip --expected-sha256 <sha256>
agtx run security_audit --json --input security-audit-task.json
audio is available as a built-in no-Python workflow for WAV inspection,
transcript normalization, and meeting notes. It reads RIFF/WAVE PCM files,
returns duration, sample rate, channels, peak/RMS, and silence ratio, and can
turn supplied transcript text or segments into summaries, decisions, action
items, questions, and keywords. Native ASR/TTS models are not bundled in this
minimal runtime; those can be added later as downloadable backends while keeping
the same capability-pack contract.
agtx run audio --json -- ./meeting.wav
agtx run audio --json --input meeting-audio-task.json
imagen is available as a built-in no-Python local media
workflow. It creates deterministic procedural PNG assets from prompts, writes a
generation manifest, and returns file paths, sizes, hashes, seeds, palette, and
request metadata for other agent frameworks to consume. Photorealistic diffusion
and video generation are intentionally left as future downloadable/provider
backends behind the same pack contract.
agtx run imagen --json -- --prompt "launch badge for capability packs" --width 1024 --height 1024
agtx run imagen --json --input media-task.json
docx, xlsx, pptx, and text-oriented pdf extraction are built in for
local document files. They run inside the agtx binary with Go's standard
library, so normal and Pro accounts do not need an extra package download for
basic text, metadata, sheet, row, slide, speaker-note, and PDF text-stream
extraction on Windows or macOS. Scanned or image-only PDFs should be sent to
ocr.
agtx run docx --json -- ./contract.docx
agtx run xlsx --json -- --path ./invoice.xlsx --max-rows 500
agtx run pptx --json -- --file ./deck.pptx
agtx run pdf --json -- ./paper.pdf
Prefer MCP stdio:
agtx agent targets
agtx agent init codex --print
agtx agent init cursor --print
agtx agent init cc --print
agtx agent init codex --json
Use agtx agent targets to discover supported agent names first. Use --print for paste-ready snippets and --json when another tool needs structured setup metadata, aliases, path hints, and ordered setup_steps with priority / blocking / verification / platforms / applies_when / writes_files / artifacts hints instead of human-formatted text.
agtx mcp implements a minimal newline-delimited JSON-RPC MCP server with tools for search, install, run, list, upgrade, rollback, status, registry inspection/validation, doctor, verify, capability-pack commerce, and Pro auth/device management.
The server also accepts Content-Length framed JSON-RPC messages used by MCP clients.
Its tools/list metadata includes strict JSON Schema with required fields, positive integer minima, and additionalProperties: false to help agent clients form valid calls without guessing.
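The two framings mentioned above can be illustrated with a tools/list request. The sketch below builds a standard JSON-RPC 2.0 request and wraps it either as one newline-delimited line or with a Content-Length header; the helper names are hypothetical, and the exact header handling agtx expects is not specified in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a minimal JSON-RPC 2.0 request without params.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
}

// encodeRequest marshals a request for the given method.
func encodeRequest(id int, method string) ([]byte, error) {
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: method})
}

// newlineFrame returns the body followed by a single newline.
func newlineFrame(body []byte) []byte {
	return append(append([]byte{}, body...), '\n')
}

// contentLengthFrame prefixes the body with a Content-Length header block.
func contentLengthFrame(body []byte) []byte {
	header := fmt.Sprintf("Content-Length: %d\r\n\r\n", len(body))
	return append([]byte(header), body...)
}

func main() {
	body, err := encodeRequest(1, "tools/list")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", newlineFrame(body))
	fmt.Printf("%q\n", contentLengthFrame(body))
}
```

Either frame would be written to the stdin of an agtx mcp process; the response is read back from its stdout.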
It also exposes agent-bootstrap helpers so an external client can call list_agent_targets and get_agent_target over MCP instead of shelling out to agtx agent ..., and those tools now advertise output schema for setup_steps, writes_files, and artifacts. Search/list/status/registry/refresh/install-plan/install/upgrade/rollback/uninstall/doctor/verify/run, capability-pack commerce, and Pro auth/device tools also publish output schema so setup UIs, confirmation flows, registry panels, local-status panels, login flows, callback-scheme registration, website commerce panels, and result viewers can be prepared from discovery metadata alone. get_pro_setup adds a no-side-effect preflight path so wrappers can inspect whether pro_api_url is configured, whether a login is pending, and which CLI/MCP actions should come next before attempting any auth mutation. Pro-related failures such as unauthorized, subscription_required, and device_limit_exceeded now also carry a structured error.details.pro_setup preview plus error.details.next_actions recovery steps, so wrappers can recover without hard-coded heuristics. Each tool also exposes errorOutputSchema so wrappers can model the shared failure envelope, including partial data on preserved-error paths such as verify/run failures.
For multi-skill tasks, agents should assemble a capability bundle before
installing or running individual skills. See docs/capability-bundles.md for
the recommended task profile, priorities, roles, stages, and execution notes.
Pro login is CLI-managed and server-authorized. The CLI stores local device auth in auth.json, while a Cloudflare Worker checks Stripe subscription state, enforces the default limit of 3 active devices, filters registry entries, and gates package downloads:
agtx config set pro_api_url https://agtx-pro.example.com
agtx pro setup
agtx pro register-scheme
agtx pro login --open
agtx pro callback "agtx://pro/callback?code=...&state=..."
agtx pro devices
agtx pro revoke <device-id> --yes
agtx pro logout
registry refresh and HTTP package downloads send Authorization: Bearer <token> only to the same origin as pro_api_url or registry_url.
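A same-origin check of the kind described above could be sketched as follows. It compares scheme, host, and port (with default ports filled in); the helper names and the example download URLs are hypothetical, and the real agtx comparison may differ in detail.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// origin normalizes a URL to scheme://host:port, filling in default ports.
func origin(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Hostname() == "" {
		return "", fmt.Errorf("not an absolute URL: %q", raw)
	}
	port := u.Port()
	if port == "" {
		switch u.Scheme {
		case "https":
			port = "443"
		case "http":
			port = "80"
		}
	}
	return u.Scheme + "://" + strings.ToLower(u.Hostname()) + ":" + port, nil
}

// shouldSendToken reports whether target shares an origin with a trusted URL
// (for example the configured pro_api_url or registry_url).
func shouldSendToken(target string, trusted ...string) bool {
	t, err := origin(target)
	if err != nil {
		return false
	}
	for _, raw := range trusted {
		if o, err := origin(raw); err == nil && o == t {
			return true
		}
	}
	return false
}

func main() {
	trusted := []string{"https://agtx-pro.example.com", "https://example.com/agtx/registry.json"}
	// The download paths below are made up for illustration.
	fmt.Println(shouldSendToken("https://agtx-pro.example.com/packages/pdf.zip", trusted...))
	fmt.Println(shouldSendToken("https://cdn.example.net/packages/pdf.zip", trusted...))
}
```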
agtx pro setup is a no-side-effect preflight check: it does not refresh tokens or call the network, and instead reports current local status plus recommended next actions for either humans or agents.
If auth.json is corrupt, agtx pro setup still returns a preview with current_status including auth_invalid and a reset_local_auth action instead of failing outright.
agtx pro status --json now mirrors those local markers and recommended_actions too, including pending_login and auth_invalid, so wrappers can branch from one status call without rereading auth.json or separately calling setup.
agtx pro register-scheme now targets both macOS and Windows; on macOS it installs a tiny local callback app bundle under the agtx config directory so browser login can return through agtx://.
Set agent_name to pass default artifact attribution into installed skills, or use agtx run --agent-name ... for a single invocation. During agtx run, the value is exposed as AGTX_AGENT_NAME, AGTX_BYLINE, and AGTX_GENERATED_BY so document-generating skills can write Office creator metadata or visible bylines such as by Codex. After a successful run, agtx also best-effort updates explicit OpenXML Office output paths (.docx, .docm, .xlsx, .xlsm, .pptx, and .pptm) with Office core metadata (creator, lastModifiedBy, and a by <agent> description line), and reports successful writes in attributed_files for JSON/MCP callers. To avoid mutating source templates, pass generated files through output-style arguments such as -o file.docx, --output file.docx, --output=file.docx, output=file.docx, outputPath=file.docx, saveAs=file.docx, exportPath=file.docx, local file:// URIs, JSON/NDJSON outputs, JSON output objects such as {"output":{"path":"file.docx"}} or {"artifact":{"uri":"file:///..."}}, text hints such as Saved to: file.docx, or action=create path=file.docx. agtx only annotates real OpenXML Office packages, so arbitrary zip files renamed to an Office extension are ignored.
agtx starts with a built-in registry so it can run offline. Optional registry overlays can be configured in config.json:
{
"schema_version": 1,
"registry_url": "https://example.com/agtx/registry.json",
"pro_api_url": "https://agtx-pro.example.com",
"agent_name": "Codex",
"registry_files": ["/path/to/local-registry.json"],
"channel": "stable",
"telemetry": "off",
"registry_max_bytes": 16777216,
"registry_download_timeout_ms": 30000,
"package_max_bytes": 268435456,
"package_download_timeout_ms": 30000,
"extracted_max_bytes": 1073741824,
"extracted_max_files": 8192
}
config.json is strict: unknown keys, null values, trailing JSON values, invalid URLs, unsupported schema versions, and non-positive limits are rejected instead of silently falling back.
Use agtx config keys --json to discover supported settings; unknown-key errors also include supported_keys for agent recovery.
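Some of the strictness rules above map directly onto Go's encoding/json. The sketch below covers unknown keys, trailing JSON values, schema version, and one non-positive limit; it is illustrative only. The Config struct lists a subset of keys, loadStrict is a hypothetical name, treating 1 as the only supported schema version is an assumption, and null and URL validation are not covered.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Config holds a subset of the documented keys (illustrative only).
type Config struct {
	SchemaVersion   int    `json:"schema_version"`
	RegistryURL     string `json:"registry_url"`
	PackageMaxBytes *int64 `json:"package_max_bytes"` // nil means "not set"
}

// loadStrict rejects unknown keys, trailing values, and bad limits.
func loadStrict(raw []byte) (Config, error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	var cfg Config
	if err := dec.Decode(&cfg); err != nil {
		return Config{}, err
	}
	// A second Decode must hit EOF; anything else is trailing data.
	if err := dec.Decode(&struct{}{}); !errors.Is(err, io.EOF) {
		return Config{}, errors.New("trailing data after config object")
	}
	// ASSUMPTION: schema_version 1 is the only supported version.
	if cfg.SchemaVersion != 1 {
		return Config{}, fmt.Errorf("unsupported schema_version %d", cfg.SchemaVersion)
	}
	if cfg.PackageMaxBytes != nil && *cfg.PackageMaxBytes <= 0 {
		return Config{}, errors.New("package_max_bytes must be positive")
	}
	return cfg, nil
}

func main() {
	_, err := loadStrict([]byte(`{"schema_version": 1, "telemetry_mode": "off"}`))
	fmt.Println("unknown key:", err)
	cfg, err := loadStrict([]byte(`{"schema_version": 1, "package_max_bytes": 268435456}`))
	fmt.Println("valid:", *cfg.PackageMaxBytes, err)
}
```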
agtx is the source-of-truth repo for Agentex capability-pack standards. The
standards under docs/standards/ define optional manifest commerce metadata,
including vendor identity, capability class, billing meters, CPA/CPS attribution,
revenue share, support links, and settlement rules. agentex.cc publishes static
copies for agents, ISVs, and the website.
After changing a standard here, publish the website copy from the sibling
agentex.cc repo:
cd ..\agentex.cc
npm run sync:agtx-standards
npm run build
The built-in first-wave skills already declare default billing meters:
- web_search: call
- web_fetch: page, call
- deep_research: task
- ocr and pdf: page
- audio: minute
- imagen: task, credit
- docx, xlsx, and pptx: task
Example ISV manifests are available under docs/standards/examples/ for
usage-metered packs and CPA/CPS outcome packs.
agtx install --plan --json and MCP plan_install expose a compact commerce
summary so agents can show vendor, capability class, billing meters, attribution
events, and support URL before asking for install confirmation.
The built-in capability-pack layer mirrors the first-wave packs shown on
agentex.cc and keeps two bundle packs for ordinary and advanced installs.
Website first-wave packs:
- web_search: web discovery and ranked source candidates.
- web_fetch: known-URL reading, article extraction, metadata, and relay fallback.
- deep_research: multi-step evidence gathering, synthesis, analysis, and UI review.
- ocr (aliases: rapidocr, ppocrv6): RapidOCR-compatible screenshots, scans, rendered PDF page images, UI images, and photo text extraction with PP-OCRv6-ready metadata.
- audio: ASR, TTS, meeting notes, and batch audio jobs.
- imagen: text-to-image, image-to-video, and media generation.
- docx, xlsx, pptx, and pdf: native document-family packs.
Compatibility and bundle packs:
- documents: registry-compatible document-family pack for docx, xlsx, pptx, and pdf.
- standard: ordinary bundle for web, document, OCR, and research workflows.
- advanced: full bundle with all built-in first-wave skills, including audio, media generation, and presentation handling.
Website integrations can query pack state, install history, and billing history through the CLI, the local HTTP API, or the local MCP server. They can also query task scenarios such as invoice processing, contract review, meeting-to-deck handoff, marketing asset generation, and support knowledge-base creation. Each scenario returns the recommended pack, missing skills, real task inputs, deliverables, workflow steps, acceptance criteria, install plan, and billing preview needed for a website purchase/install flow:
agtx commerce packs --json
agtx commerce packs --pack-id pdf --json
agtx commerce scenarios --pack-id standard --json
agtx commerce scenarios --pack-id pdf --json
agtx commerce scenarios --scenario-id invoice_processing --json
agtx commerce install-pack pdf --plan --json
agtx commerce install-pack pdf --yes --json
agtx commerce install-pack standard --plan --json
agtx commerce install-pack standard --yes --json
agtx commerce install-scenario invoice_processing --plan --json
agtx commerce install-scenario invoice_processing --yes --json
agtx commerce scenario-ledger invoice_processing --json
agtx commerce install-records --pack-id standard --json
agtx commerce install-records --pack-id pdf --json
agtx commerce billing-records --pack-id standard --json
agtx commerce billing-records --pack-id pdf --json
agtx commerce billing-records --scenario-id invoice_processing --json
agtx commerce billing-records --pack-id standard --type pack_install --currency USD --status local_only --json
agtx run <installed-skill> --scenario-id invoice_processing --json
agtx commerce snapshot --pack-id standard --json
agtx commerce snapshot --pack-id pdf --json
agtx commerce snapshot --scenario-id invoice_processing --json
agtx commerce snapshot --pack-id standard --out ./commerce-snapshot.json --json
agtx commerce serve --addr 127.0.0.1:8765 --allow-origin https://example.com
The HTTP API returns the same response envelope as CLI JSON output. Website panels can call:
- GET /commerce for the local capability-pack dashboard
- GET /v1/commerce/packs
- GET /v1/commerce/packs?pack_id=pdf
- GET /v1/commerce/scenarios?scenario_id=invoice_processing
- GET /v1/commerce/scenarios?pack_id=pdf
- GET /v1/commerce/install-plan?pack_id=pdf
- GET /v1/commerce/install-plan?pack_id=standard
- GET /v1/commerce/scenario-install-plan?scenario_id=invoice_processing
- POST /v1/commerce/install-pack with header X-AGTX-Commerce-Token and body {"pack_id":"pdf","yes":true}
- POST /v1/commerce/install-pack with header X-AGTX-Commerce-Token and body {"pack_id":"standard","yes":true}
- POST /v1/commerce/install-scenario with header X-AGTX-Commerce-Token and body {"scenario_id":"invoice_processing","yes":true}
- GET /v1/commerce/scenario-ledger?scenario_id=invoice_processing&limit=100
- GET /v1/commerce/install-records?pack_id=standard&status=installed&limit=100
- GET /v1/commerce/install-records?pack_id=pdf&status=installed&limit=100
- GET /v1/commerce/install-records?scenario_id=invoice_processing&limit=100
- GET /v1/commerce/billing-records?pack_id=standard&type=pack_install&currency=USD&status=local_only&limit=100
- GET /v1/commerce/billing-records?pack_id=pdf&type=pack_install&limit=100
- GET /v1/commerce/billing-records?scenario_id=invoice_processing&limit=100
- GET /v1/commerce/snapshot?pack_id=standard&limit=100
- GET /v1/commerce/snapshot?pack_id=pdf&limit=100
- GET /v1/commerce/snapshot?scenario_id=invoice_processing&limit=100
commerce serve prints the dashboard URL and one-session mutation token used
by the dashboard when installing packs. The server only binds loopback
addresses such as 127.0.0.1 or localhost; --allow-origin must be a
specific http:// or https:// origin and cannot be *. CORS is applied only
to /v1/commerce... API routes, not to the local dashboard HTML that embeds the
mutation token.
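As a rough illustration of the mutation call, the sketch below builds (but does not send) a POST to the install-pack route with the token header and JSON body from the endpoint list above. The helper names are hypothetical, the base URL reuses the 127.0.0.1:8765 example address, and the Content-Type header is an assumption.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// installPackBody encodes the documented request body for one pack.
func installPackBody(packID string) ([]byte, error) {
	return json.Marshal(struct {
		PackID string `json:"pack_id"`
		Yes    bool   `json:"yes"`
	}{PackID: packID, Yes: true})
}

// newInstallPackRequest prepares the POST; baseURL must have no trailing slash.
func newInstallPackRequest(baseURL, token, packID string) (*http.Request, error) {
	body, err := installPackBody(packID)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/commerce/install-pack", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// The token is the one-session value printed by commerce serve.
	req.Header.Set("X-AGTX-Commerce-Token", token)
	req.Header.Set("Content-Type", "application/json") // ASSUMPTION: not stated in the docs
	return req, nil
}

func main() {
	req, err := newInstallPackRequest("http://127.0.0.1:8765", "token-from-commerce-serve", "pdf")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the prepared request to http.DefaultClient.Do would perform the install against a running commerce serve instance.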
The matching MCP tools are list_capability_packs,
list_capability_scenarios, get_capability_scenario,
plan_capability_pack_install, install_capability_pack,
plan_capability_scenario_install, install_capability_scenario,
get_capability_scenario_ledger, list_install_records,
list_billing_records, and get_commerce_snapshot.
Local install and billing ledgers are stored as JSONL under the agtx config
directory and returned through both surfaces with stable JSON Schema discovery
metadata on MCP. Scenario-driven installs write install_scenario install
records and tag matching install and billing records with scenario_id. Skill
runs can also pass --scenario-id, and MCP run_skill accepts scenario_id;
metered usage events and local skill_usage billing records carry the
canonical scenario id for website invoices and workflow history.
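Since the ledgers are JSONL, a consumer can process them one line at a time. The sketch below filters records by scenario_id; only that field name comes from the text above. The sample records, the type field, and the filterByScenario helper are hypothetical, and the real ledger schema may contain different fields.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// filterByScenario parses JSONL content and keeps records whose scenario_id
// equals scenarioID. Blank lines are skipped; a malformed line is an error.
func filterByScenario(ledger []byte, scenarioID string) ([]map[string]any, error) {
	var out []map[string]any
	sc := bufio.NewScanner(bytes.NewReader(ledger))
	// Note: the default Scanner limit is 64 KiB per line.
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue
		}
		var rec map[string]any
		if err := json.Unmarshal(line, &rec); err != nil {
			return nil, fmt.Errorf("bad ledger line: %w", err)
		}
		if id, _ := rec["scenario_id"].(string); id == scenarioID {
			out = append(out, rec)
		}
	}
	return out, sc.Err()
}

func main() {
	// Made-up records for illustration only.
	ledger := []byte(`{"type":"pack_install","scenario_id":"invoice_processing"}
{"type":"skill_usage"}
{"type":"skill_usage","scenario_id":"invoice_processing"}
`)
	recs, err := filterByScenario(ledger, "invoice_processing")
	fmt.Println(len(recs), err)
}
```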
For website account pages, scenario-ledger and
get_capability_scenario_ledger return the scenario view, latest install,
matching install records, billing totals, and a split between pack-install and
skill-usage records in one response.
Plan before mutating, then refresh a configured remote registry:
agtx install pdf --plan --json
agtx registry validate ./registry.json --json
agtx registry refresh --json
agtx doctor --json
agtx verify pdf --json
On macOS:
- Config: ~/Library/Application Support/agtx/config.json
- Registry cache: ~/Library/Caches/agtx/registry/
- Skills: ~/Library/Application Support/agtx/skills/<name>/<version>/
- Logs: ~/Library/Logs/agtx/
Set AGTX_HOME to redirect all state into a single directory for tests or isolated agent runs.
On Windows:
- Config/auth: %APPDATA%\agtx\config.json and %APPDATA%\agtx\auth.json
- Registry cache: %LOCALAPPDATA%\agtx\Cache\registry\
- Skills: %APPDATA%\agtx\skills\<name>\<version>\
- Logs: %LOCALAPPDATA%\agtx\Logs\
Local Windows test example:
$env:AGTX_HOME="$PWD\.tmp-agtx"
go test ./...
go run ./cmd/agtx status