openclaw/docs/tools
Tyler Yust d0ac1b0195
feat: add PDF analysis tool with native provider support (#31319)

New `pdf` tool for model-powered analysis of PDF documents.

Architecture:
- Native PDF path: sends raw PDF bytes directly to providers that support
  inline document input (Anthropic via DocumentBlockParam, Google Gemini
  via inlineData with application/pdf MIME type)
- Extraction fallback: for providers without native PDF support, extracts
  text via pdfjs-dist and rasterizes pages to images via @napi-rs/canvas,
  then sends through the standard vision/text completion path
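
A minimal sketch of the two paths described above, not the code in pdf-tool.ts. The provider ids, `choosePdfPath`, and the two builder functions are hypothetical names chosen for illustration. The payload shapes follow the public Anthropic document block and Gemini `inlineData` part formats.

```typescript
// Hypothetical provider ids; the commit only names Anthropic and Google Gemini.
type PdfPath = "native" | "extraction";

const NATIVE_PDF_PROVIDERS = new Set<string>(["anthropic", "google"]);

// Decide whether raw PDF bytes can be sent as-is or must be extracted first.
function choosePdfPath(provider: string): PdfPath {
  return NATIVE_PDF_PROVIDERS.has(provider.toLowerCase()) ? "native" : "extraction";
}

// Anthropic-style document content block carrying a base64-encoded PDF.
function anthropicPdfBlock(base64Pdf: string) {
  return {
    type: "document" as const,
    source: {
      type: "base64" as const,
      media_type: "application/pdf" as const,
      data: base64Pdf,
    },
  };
}

// Gemini-style inline part carrying the same bytes with the PDF MIME type.
function geminiPdfPart(base64Pdf: string) {
  return { inlineData: { mimeType: "application/pdf", data: base64Pdf } };
}
```

Any provider outside the native set would go through text extraction and page rasterization before reaching the standard completion path.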

Key features:
- Single PDF (`pdf` param) or multiple PDFs (`pdfs` array, up to 10)
- Page range selection (`pages` param, e.g. "1-5", "1,3,7-9")
- Model override (`model` param) and file size limits (`maxBytesMb`)
- Auto-detects provider capability and falls back to text/image extraction
  when native PDF input is unavailable
- Same security patterns as image tool (SSRF guards, sandbox support,
  local path roots, workspace-only policy)
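
The `pages` syntax listed above ("1-5", "1,3,7-9") could be parsed roughly as follows. This is a sketch under assumptions: `parsePageRange` is a hypothetical name, pages are taken to be 1-based, and the cap keeps the first `maxPages` distinct pages in the order the spec lists them. The real helper in pdf-tool.helpers.ts may behave differently.

```typescript
// Parse a page spec such as "1-5" or "1,3,7-9" into a sorted list of
// distinct 1-based page numbers, keeping at most maxPages entries.
function parsePageRange(spec: string, maxPages: number): number[] {
  const pages = new Set<number>();
  for (const raw of spec.split(",")) {
    const part = raw.trim();
    if (part === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(part);
    if (!match) throw new Error(`Invalid page range: "${part}"`);
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) throw new Error(`Invalid page range: "${part}"`);
    // Stop adding once the cap is reached so huge ranges stay cheap.
    for (let p = start; p <= end && pages.size < maxPages; p++) pages.add(p);
  }
  return Array.from(pages).sort((a, b) => a - b);
}

console.log(parsePageRange("1,3,7-9", 20)); // [ 1, 3, 7, 8, 9 ]
```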

Config (agents.defaults):
- pdfModel: primary/fallbacks (defaults to imageModel, then session model)
- pdfMaxBytesMb: max PDF file size (default: 10)
- pdfMaxPages: max pages to process (default: 20)
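
The fallback order and defaults above can be sketched as below. The `AgentDefaults` and `ModelChoice` shapes and the `resolvePdfSettings` name are assumptions made for illustration, as is treating 1 MB as 1024 * 1024 bytes. Only the key names and default values come from the list above.

```typescript
// Hypothetical shape for a primary/fallbacks model setting.
interface ModelChoice {
  primary: string;
  fallbacks?: string[];
}

// Hypothetical subset of agents.defaults relevant to the pdf tool.
interface AgentDefaults {
  pdfModel?: ModelChoice;
  imageModel?: ModelChoice;
  pdfMaxBytesMb?: number;
  pdfMaxPages?: number;
}

// Resolve effective settings: pdfModel, else imageModel, else session model.
function resolvePdfSettings(defaults: AgentDefaults, sessionModel: string) {
  const model: ModelChoice =
    defaults.pdfModel ?? defaults.imageModel ?? { primary: sessionModel };
  return {
    model,
    maxBytes: (defaults.pdfMaxBytesMb ?? 10) * 1024 * 1024, // default: 10 MB
    maxPages: defaults.pdfMaxPages ?? 20, // default: 20 pages
  };
}
```

With an empty `agents.defaults`, this yields the session model, a 10 MB size limit, and a 20-page limit.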

Model catalog:
- Extended ModelInputType to include "document" alongside "text"/"image"
- Added modelSupportsDocument() capability check
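
A sketch of what such a capability check might look like. The `ModelCatalogEntry` shape and the function signature are assumptions; only the `ModelInputType` values and the name `modelSupportsDocument` come from the notes above.

```typescript
// Input kinds a model can accept, including the new "document" value.
type ModelInputType = "text" | "image" | "document";

// Hypothetical catalog entry shape; the real one lives in model-catalog.ts.
interface ModelCatalogEntry {
  id: string;
  input?: ModelInputType[];
}

// True only when the entry explicitly lists "document" as an input type.
function modelSupportsDocument(entry: ModelCatalogEntry | undefined): boolean {
  return entry?.input?.some((kind) => kind === "document") ?? false;
}
```

Treating a missing entry or a missing `input` list as unsupported keeps the extraction fallback as the safe default.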

Files:
- src/agents/tools/pdf-tool.ts - main tool factory
- src/agents/tools/pdf-tool.helpers.ts - helpers (page range, config, etc.)
- src/agents/tools/pdf-native-providers.ts - direct API calls for Anthropic/Google
- src/agents/tools/pdf-tool.test.ts - 43 tests covering all paths
- Modified: model-catalog.ts, openclaw-tools.ts, config schema/types/labels/help

* fix: prepare pdf tool for merge (#31319) (thanks @tyler6204)
2026-03-01 22:39:12 -08:00
acp-agents.md fix: align ACP permission docs defaults (#31044) (thanks @barronlroth) 2026-03-01 23:30:39 +00:00
agent-send.md Docs: add nav titles across docs (#5689) 2026-01-31 15:04:03 -06:00
apply-patch.md fix(security): default apply_patch workspace containment 2026-02-15 03:19:27 +01:00
browser-linux-troubleshooting.md revert(docs): undo markdownlint autofix churn 2026-02-06 10:00:08 -05:00
browser-login.md chore(skills): remove bird skill 2026-02-06 22:28:44 -08:00
browser.md fix(security): harden browser SSRF defaults and migrate legacy key 2026-02-24 01:52:01 +00:00
chrome-extension.md Docs: clarify Chrome extension relay port derivation (gateway + 3) 2026-02-24 04:16:08 +00:00
clawhub.md Docs: expand ClawHub overview 2026-02-02 02:26:11 -08:00
creating-skills.md docs: add missing summaries and read_when hints 2026-02-22 20:37:02 +01:00
diffs.md fix(diffs): harden viewer security and docs 2026-03-02 05:07:09 +00:00
elevated.md fix(security): tighten elevated allowFrom sender matching 2026-02-22 22:00:08 +01:00
exec-approvals.md refactor!: remove versioned system-run approval contract 2026-03-02 01:12:53 +00:00
exec.md Exec/ACP: inject OPENCLAW_SHELL into child shell env (#31271) 2026-03-01 20:31:06 -08:00
firecrawl.md Docs: add nav titles across docs (#5689) 2026-01-31 15:04:03 -06:00
index.md feat: add PDF analysis tool with native provider support (#31319) 2026-03-01 22:39:12 -08:00
llm-task.md revert(docs): undo markdownlint autofix churn 2026-02-06 10:00:08 -05:00
lobster.md refactor(lobster): remove lobsterPath overrides 2026-02-19 14:58:13 +01:00
loop-detection.md docs: add missing summary/read_when metadata 2026-02-22 20:45:09 +01:00
multi-agent-sandbox-tools.md fix(security): scope session tools and webhook secret fallback 2026-02-16 03:47:10 +01:00
plugin.md Onboarding: support plugin-owned interactive channel flows (#27191) 2026-02-26 01:14:57 -05:00
reactions.md Docs: add nav titles across docs (#5689) 2026-01-31 15:04:03 -06:00
skills-config.md docs(secrets): align provider model and add exec resolver coverage 2026-02-26 14:47:22 +00:00
skills.md docs(secrets): align provider model and add exec resolver coverage 2026-02-26 14:47:22 +00:00
slash-commands.md fix(slack): land #29032 /agentstatus alias from @maloqab 2026-02-27 19:09:38 +00:00
subagents.md fix: harden sessions_spawn delivery params and telegram account routing (#31000, #31110) 2026-03-02 02:35:48 +00:00
thinking.md docs: clarify adaptive thinking and openai websocket docs 2026-03-02 05:46:57 +00:00
web.md fix(security): block private-network web_search citation redirects 2026-03-02 01:05:20 +00:00