openclaw/docs
Tyler Yust d0ac1b0195
feat: add PDF analysis tool with native provider support (#31319)
* feat: add PDF analysis tool with native provider support

New `pdf` tool for analyzing PDF documents with model-powered analysis.

Architecture:
- Native PDF path: sends raw PDF bytes directly to providers that support
  inline document input (Anthropic via DocumentBlockParam, Google Gemini
  via inlineData with application/pdf MIME type)
- Extraction fallback: for providers without native PDF support, extracts
  text via pdfjs-dist and rasterizes pages to images via @napi-rs/canvas,
  then sends through the standard vision/text completion path
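
As a rough illustration of the native path, the two inline document shapes could be built as below. This is a sketch, not the code in pdf-native-providers.ts: the function names are hypothetical, and the PDF is assumed to be base64-encoded already. The field names follow the public Anthropic Messages API document block and the Gemini `inlineData` part.

```typescript
// Hypothetical helpers: names are illustrative, not taken from the commit.

// Anthropic Messages API: a base64 "document" content block.
function anthropicDocumentBlock(base64Pdf: string) {
  return {
    type: "document" as const,
    source: {
      type: "base64" as const,
      media_type: "application/pdf" as const,
      data: base64Pdf,
    },
  };
}

// Google Gemini: an inlineData part with the application/pdf MIME type.
function geminiInlinePdfPart(base64Pdf: string) {
  return {
    inlineData: {
      mimeType: "application/pdf" as const,
      data: base64Pdf,
    },
  };
}
```

Either object would sit next to the text prompt in the user message content (Anthropic) or the `parts` array (Gemini).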

Key features:
- Single PDF (`pdf` param) or multiple PDFs (`pdfs` array, up to 10)
- Page range selection (`pages` param, e.g. "1-5", "1,3,7-9")
- Model override (`model` param) and file size limits (`maxBytesMb`)
- Auto-detects whether the provider supports native PDF input and otherwise
  falls back to the extraction path
- Same security patterns as image tool (SSRF guards, sandbox support,
  local path roots, workspace-only policy)
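
A minimal sketch of how a `pages` value such as "1,3,7-9" might be expanded into page numbers. The function name, the clamping to the document's page count, and the error behaviour are assumptions for illustration; the real helper in pdf-tool.helpers.ts may differ.

```typescript
// Hypothetical parser: expands "1-5" or "1,3,7-9" into sorted, unique,
// 1-based page numbers, dropping pages beyond totalPages.
function parsePageRange(spec: string, totalPages: number): number[] {
  const pages = new Set<number>();
  for (const rawPart of spec.split(",")) {
    const part = rawPart.trim();
    if (part === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(part);
    if (!match) {
      throw new Error(`Invalid page range segment: "${part}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range segment: "${part}"`);
    }
    const last = Math.min(end, totalPages);
    for (let page = start; page <= last; page++) {
      pages.add(page);
    }
  }
  return Array.from(pages).sort((a, b) => a - b);
}
```

For example, `parsePageRange("1,3,7-9", 20)` yields pages 1, 3, 7, 8 and 9.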

Config (agents.defaults):
- pdfModel: primary/fallbacks (defaults to imageModel, then session model)
- pdfMaxBytesMb: max PDF file size (default: 10)
- pdfMaxPages: max pages to process (default: 20)
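
The defaulting described above could be sketched as follows. The interface shapes and the `resolvePdfSettings` name are hypothetical; only the key names and default values (10 MB, 20 pages, pdfModel then imageModel then session model) come from this commit message.

```typescript
// Assumed shapes for illustration only.
interface ModelChoice {
  primary: string;
  fallbacks?: string[];
}

interface AgentDefaults {
  pdfModel?: ModelChoice;
  imageModel?: ModelChoice;
  pdfMaxBytesMb?: number;
  pdfMaxPages?: number;
}

function resolvePdfSettings(defaults: AgentDefaults, sessionModel: string) {
  // pdfModel wins, then imageModel, then the model of the current session.
  const model: ModelChoice =
    defaults.pdfModel ?? defaults.imageModel ?? { primary: sessionModel };
  return {
    model,
    maxBytes: (defaults.pdfMaxBytesMb ?? 10) * 1024 * 1024,
    maxPages: defaults.pdfMaxPages ?? 20,
  };
}
```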

Model catalog:
- Extended ModelInputType to include "document" alongside "text"/"image"
- Added modelSupportsDocument() capability check
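
One plausible shape for the capability check, shown only as a sketch: the `ModelCatalogEntry` interface and its `input` field are assumptions, and the actual signature in model-catalog.ts is not given here.

```typescript
type ModelInputType = "text" | "image" | "document";

// Assumed catalog entry shape, for illustration.
interface ModelCatalogEntry {
  id: string;
  input?: ModelInputType[];
}

// True only when the entry explicitly lists "document" as an input type.
function modelSupportsDocument(entry: ModelCatalogEntry | undefined): boolean {
  const inputs = entry?.input ?? [];
  return inputs.some((kind) => kind === "document");
}
```

A caller could use this to pick the native path when it returns true and the extraction fallback otherwise.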

Files:
- src/agents/tools/pdf-tool.ts - main tool factory
- src/agents/tools/pdf-tool.helpers.ts - helpers (page range, config, etc.)
- src/agents/tools/pdf-native-providers.ts - direct API calls for Anthropic/Google
- src/agents/tools/pdf-tool.test.ts - 43 tests covering all paths
- Modified: model-catalog.ts, openclaw-tools.ts, config schema/types/labels/help

* fix: prepare pdf tool for merge (#31319) (thanks @tyler6204)
2026-03-01 22:39:12 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| .i18n | fix(docs): revert accidental es/pt-BR translation scaffold from #18473 | 2026-02-17 02:23:41 +01:00 |
| assets | docs: add Vercel sponsorship (#29270) | 2026-03-02 06:25:46 +00:00 |
| automation | fix(cron): add retry policy for one-shot jobs on transient errors (#24355) (openclaw#24435) thanks @hugenshen | 2026-03-01 06:58:03 -06:00 |
| channels | fix(channels): add optional defaultAccount routing | 2026-03-02 04:03:46 +00:00 |
| cli | feat(config): add `openclaw config validate` and improve startup error messages (#31220) | 2026-03-02 00:45:51 -05:00 |
| concepts | docs: sync android node docs with current pairing and capabilities | 2026-03-02 11:08:51 +05:30 |
| debug | Docs: enable markdownlint autofixables except list numbering (#10476) | 2026-02-06 10:08:59 -05:00 |
| design | feat: Add Kilo Gateway provider (#20212) | 2026-02-23 23:29:27 +00:00 |
| diagnostics | Docs: add nav titles across docs (#5689) | 2026-01-31 15:04:03 -06:00 |
| experiments | Discord: thread bindings idle + max-age lifecycle (#27845) (thanks @osolmaz) | 2026-02-27 10:02:39 +01:00 |
| gateway | feat: add PDF analysis tool with native provider support (#31319) | 2026-03-01 22:39:12 -08:00 |
| help | docs: sync android node docs with current pairing and capabilities | 2026-03-02 11:08:51 +05:30 |
| images | Docs: add screenshot showing model picker usability issue | 2026-02-17 09:15:55 +01:00 |
| install | Chore: add Dockerfile HEALTHCHECK and debug-log silent catch blocks (#11478) | 2026-03-01 20:52:14 -08:00 |
| ja-JP | Docs: add all unlisted docs routes to navigation (#31027) | 2026-03-01 15:09:35 -08:00 |
| nodes | docs: sync android node docs with current pairing and capabilities | 2026-03-02 11:08:51 +05:30 |
| platforms | docs: sync android node docs with current pairing and capabilities | 2026-03-02 11:08:51 +05:30 |
| plugins | docs: add WeChat community plugin listing | 2026-02-24 08:41:28 -06:00 |
| providers | docs: replace bare provider URLs with markdown links | 2026-03-02 06:01:29 +00:00 |
| refactor | docs: update outbound refactor test path | 2026-02-22 21:28:08 +01:00 |
| reference | test: split fast lane from channel and gateway suites | 2026-03-02 05:33:07 +00:00 |
| security | docs: add missing summary/read_when metadata | 2026-02-22 20:45:09 +01:00 |
| start | docs: consolidate grammy links to telegram | 2026-02-27 08:00:29 +05:30 |
| tools | feat: add PDF analysis tool with native provider support (#31319) | 2026-03-01 22:39:12 -08:00 |
| web | Exec/ACP: inject OPENCLAW_SHELL into child shell env (#31271) | 2026-03-01 20:31:06 -08:00 |
| zh-CN | fix(subagents): return completion message for manual session spawns | 2026-02-18 02:52:35 +01:00 |
| CNAME | | |
| brave-search.md | Docs: enable markdownlint autofixables except list numbering (#10476) | 2026-02-06 10:08:59 -05:00 |
| ci.md | docs: add missing summaries and read_when hints | 2026-02-22 20:37:02 +01:00 |
| date-time.md | Docs: add nav titles across docs (#5689) | 2026-01-31 15:04:03 -06:00 |
| docs.json | Docs: add all unlisted docs routes to navigation (#31027) | 2026-03-01 15:09:35 -08:00 |
| index.md | docs: sync android node docs with current pairing and capabilities | 2026-03-02 11:08:51 +05:30 |
| logging.md | Feat/logger support log level validation0222 (#23436) | 2026-02-22 11:15:13 +01:00 |
| nav-tabs-underline.js | docs(ui): add animated underline for nav tabs (#21912) | 2026-02-20 09:33:46 -05:00 |
| network.md | docs: canonicalize docs paths and align zh navigation (#11428) | 2026-02-07 15:40:35 -05:00 |
| perplexity.md | Docs: enable markdownlint autofixables except list numbering (#10476) | 2026-02-06 10:08:59 -05:00 |
| pi-dev.md | docs: replace removed pi test script with current commands | 2026-02-22 21:07:34 +01:00 |
| pi.md | fix(pi): stop history image reinjection token blowup | 2026-02-26 16:38:20 +01:00 |
| prose.md | docs: canonicalize docs paths and align zh navigation (#11428) | 2026-02-07 15:40:35 -05:00 |
| style.css | fix(ios): force tls for non-loopback manual gateway hosts (#21969) | 2026-02-20 16:28:47 +00:00 |
| tts.md | fix(tts): make model provider overrides opt-in | 2026-02-21 13:16:07 +01:00 |
| vps.md | CLI: add root --help fast path and lazy channel option resolution (#30975) | 2026-03-01 14:23:46 -08:00 |
| whatsapp-openclaw-ai-zh.jpg | Docs: add zh-CN landing notice + AI image | 2026-02-02 18:35:01 +01:00 |
| whatsapp-openclaw.jpg | | |