mirror of https://github.com/openclaw/openclaw.git
* feat: add PDF analysis tool with native provider support

  New `pdf` tool for analyzing PDF documents with model-powered analysis.

  Architecture:
  - Native PDF path: sends raw PDF bytes directly to providers that support inline document input (Anthropic via DocumentBlockParam, Google Gemini via inlineData with application/pdf MIME type)
  - Extraction fallback: for providers without native PDF support, extracts text via pdfjs-dist and rasterizes pages to images via @napi-rs/canvas, then sends through the standard vision/text completion path

  Key features:
  - Single PDF (`pdf` param) or multiple PDFs (`pdfs` array, up to 10)
  - Page range selection (`pages` param, e.g. "1-5", "1,3,7-9")
  - Model override (`model` param) and file size limits (`maxBytesMb`)
  - Auto-detects provider capability and falls back gracefully
  - Same security patterns as image tool (SSRF guards, sandbox support, local path roots, workspace-only policy)

  Config (agents.defaults):
  - pdfModel: primary/fallbacks (defaults to imageModel, then session model)
  - pdfMaxBytesMb: max PDF file size (default: 10)
  - pdfMaxPages: max pages to process (default: 20)

  Model catalog:
  - Extended ModelInputType to include "document" alongside "text"/"image"
  - Added modelSupportsDocument() capability check

  Files:
  - src/agents/tools/pdf-tool.ts - main tool factory
  - src/agents/tools/pdf-tool.helpers.ts - helpers (page range, config, etc.)
  - src/agents/tools/pdf-native-providers.ts - direct API calls for Anthropic/Google
  - src/agents/tools/pdf-tool.test.ts - 43 tests covering all paths
  - Modified: model-catalog.ts, openclaw-tools.ts, config schema/types/labels/help

* fix: prepare pdf tool for merge (#31319) (thanks @tyler6204)

| | | |
|---|---|---|
| .. | ||
| security | ||
| authentication.md | ||
| background-process.md | ||
| bonjour.md | ||
| bridge-protocol.md | ||
| cli-backends.md | ||
| configuration-examples.md | ||
| configuration-reference.md | ||
| configuration.md | ||
| discovery.md | ||
| doctor.md | ||
| gateway-lock.md | ||
| health.md | ||
| heartbeat.md | ||
| index.md | ||
| local-models.md | ||
| logging.md | ||
| multiple-gateways.md | ||
| network-model.md | ||
| openai-http-api.md | ||
| openresponses-http-api.md | ||
| pairing.md | ||
| protocol.md | ||
| remote-gateway-readme.md | ||
| remote.md | ||
| sandbox-vs-tool-policy-vs-elevated.md | ||
| sandboxing.md | ||
| secrets-plan-contract.md | ||
| secrets.md | ||
| tailscale.md | ||
| tools-invoke-http-api.md | ||
| troubleshooting.md | ||
| trusted-proxy-auth.md | ||
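
The commit message above mentions a `pages` parameter that accepts specs such as "1-5" or "1,3,7-9", plus a `pdfMaxPages` limit that defaults to 20. As a rough illustration of how such a spec could be expanded into page numbers, here is a minimal sketch. The function name `parsePageRange`, its signature, and the clamping behavior are assumptions made for this example; this is not the code in `pdf-tool.helpers.ts`, which may handle errors and limits differently.

```typescript
// Hypothetical helper for illustration only; not the repository's implementation.
// Expands a spec like "1,3,7-9" into sorted, de-duplicated 1-based page numbers.
// Assumptions: pages beyond `pageCount` are silently dropped, and at most
// `maxPages` pages are returned (mirroring the documented default of 20).
function parsePageRange(spec: string, pageCount: number, maxPages: number = 20): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    // Accept either a single page ("3") or an inclusive range ("7-9").
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    // Bound the loop by pageCount so a huge range cannot run away.
    for (let p = start; p <= end && p <= pageCount; p++) {
      pages.add(p);
    }
  }
  return Array.from(pages)
    .sort((a, b) => a - b)
    .slice(0, maxPages);
}
```

Under these assumptions, `parsePageRange("1,3,7-9", 10)` yields pages 1, 3, 7, 8 and 9, and a spec that reaches past the end of the document is clamped rather than rejected. Rejecting out-of-range pages instead would be an equally reasonable design.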