pdf-mcp offers three storage modes that control how generated PDFs are stored and returned. Understanding these modes is key to getting the best performance from both the REST API and MCP Server.
The Three Modes
| Mode | Stores to S3 | Response | Best For |
|---|---|---|---|
| memory | No | Binary PDF (REST) or base64 (MCP) | One-off downloads, simple integrations |
| default | Yes (pdf-mcp bucket) | JSON metadata + signed URL | AI agents, file management, audit trails |
| byob | Yes (your bucket) | JSON metadata + signed URL | Enterprise, compliance, data sovereignty |
memory — No Persistence
The PDF is generated in memory and returned directly in the response. Nothing is stored. Once the response is delivered, the file is gone.
- REST: Returns application/pdf binary; pipe it to a file or stream it to a client.
- MCP: Returns base64-encoded content as an EmbeddedResource the MCP client can save to disk, plus a short TextContent summary for the LLM.
Use this when you just need the PDF bytes and don’t need to retrieve it later.
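As a rough illustration of the REST side of memory mode, the sketch below builds a POST /htmlToPdf request with storage.mode set to memory and writes the returned bytes to disk. The endpoint URL and Bearer header mirror the curl example later in this guide; the helper names build_memory_request and save_pdf are made up for illustration, and error handling is left out.

```python
import json
import urllib.request

API_URL = "https://api.pdf-mcp.io/htmlToPdf"  # endpoint as used in this guide's curl example


def build_memory_request(html: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an HTML-to-PDF request in memory mode."""
    body = {"html": html, "storage": {"mode": "memory"}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_pdf(request: urllib.request.Request, path: str) -> int:
    """Send the request and write the response body to path; returns bytes written."""
    with urllib.request.urlopen(request) as response:
        pdf_bytes = response.read()
    with open(path, "wb") as fh:
        return fh.write(pdf_bytes)
```

A call would look like save_pdf(build_memory_request("<h1>Hello</h1>", "YOUR_API_KEY"), "hello.pdf"). Nothing is kept on the server, so the local file is the only copy.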
default — pdf-mcp Cloud Storage
The PDF is uploaded to the pdf-mcp S3 bucket and a document record is created in your account. The response includes a presigned download URL and document metadata.
- Files are accessible via the Files API (list, search, download, delete).
- Presigned URLs expire (configurable, default 1 hour, max 7 days).
- Documents can auto-expire via retention_days.
- Full-text search is available on stored document content.
byob — Bring Your Own Bucket
Same as default, but the PDF is stored in your own S3 bucket. Requires BYOB configuration in your account settings. Useful for compliance requirements or when you need full control over storage.
Channel Defaults
REST API and MCP Server use different defaults. This is intentional.
| Channel | Default Mode | Why |
|---|---|---|
| REST API | memory | Most REST callers just want the PDF bytes — download and done. |
| MCP Server | default | AI agents work best when files are stored and referenced by URL. |
Why MCP defaults to default (stored)
When an AI agent generates a PDF via MCP, the raw file bytes aren’t useful in the conversation. Here’s why storing is better:
- Token efficiency. A 100 KB PDF base64-encoded becomes ~133 KB of text, roughly 32,000 tokens. Returning this in the conversation wastes context window and costs money. With default storage, the LLM only sees a short metadata summary (~100 tokens).
- EmbeddedResource support. MCP clients that support EmbeddedResource (like Claude Desktop) can save the PDF blob directly to disk without the LLM ever “seeing” the bytes. But this only works for smart clients; storing the file provides a universal fallback.
- File management. Stored documents get a document_id that can be passed to the download_document, search_documents, list_documents, and delete_document MCP tools. The AI agent can organize, search, and retrieve files across conversations.
- Multi-step workflows. When the agent needs to merge, split, or convert a previously generated PDF, it can reference it by URL or document ID instead of holding megabytes of base64 in context.
- Reliability. Presigned URLs and the /files/{id}/download endpoint work from any environment. Base64 blobs can fail silently when an MCP client doesn’t handle EmbeddedResource.
You can always override the default by passing storage_mode: "memory" explicitly.
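The token-efficiency figure can be checked with a little arithmetic. Base64 emits 4 characters for every 3 input bytes, so the sketch below computes the encoded size of a 100 KB file and a very rough token estimate. The 4-characters-per-token ratio is an assumption for illustration only; real tokenizers vary, so treat the result as an order of magnitude.

```python
import base64
import math


def base64_length(raw_bytes: int) -> int:
    """Length of standard padded base64 output: 4 characters per 3 input bytes."""
    return 4 * math.ceil(raw_bytes / 3)


def rough_token_count(chars: int, chars_per_token: float = 4.0) -> int:
    """Very rough estimate; chars_per_token is an assumed average, not a measured value."""
    return round(chars / chars_per_token)


pdf_size = 100 * 1024  # a 100 KB file
encoded = base64_length(pdf_size)

# Cross-check the formula against the standard library encoder.
assert encoded == len(base64.b64encode(bytes(pdf_size)))

print(encoded)                     # 136536 characters, about 133 KB
print(rough_token_count(encoded))  # 34134 under the 4-chars-per-token assumption
```

Either way, the encoded file costs tens of thousands of tokens, against roughly a hundred for a metadata summary.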
How to Set the Storage Mode
REST API
Pass a storage object in the JSON request body:
{
  "html": "<h1>Hello</h1>",
  "storage": {
    "mode": "default",
    "filename": "hello.pdf",
    "expires_in": 86400,
    "retention_days": 30
  }
}
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | memory | memory, default, or byob |
| filename | string | auto | Custom filename for stored PDFs |
| expires_in | integer | 3600 | Signed URL expiration in seconds (60–604800) |
| retention_days | integer | 14 | Auto-delete after N days (1–365) |
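The field table above can be turned into a small client-side builder. This is a minimal sketch, assuming you want to catch out-of-range values before sending a request; build_storage is a hypothetical helper, and how the server itself reacts to invalid values is not covered here.

```python
from typing import Optional

VALID_MODES = {"memory", "default", "byob"}


def build_storage(
    mode: str = "memory",
    filename: Optional[str] = None,
    expires_in: int = 3600,
    retention_days: int = 14,
) -> dict:
    """Return a storage object for the request body, checking the documented ranges."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown storage mode: {mode!r}")
    if not 60 <= expires_in <= 604800:
        raise ValueError("expires_in must be between 60 and 604800 seconds")
    if not 1 <= retention_days <= 365:
        raise ValueError("retention_days must be between 1 and 365")
    storage = {
        "mode": mode,
        "expires_in": expires_in,
        "retention_days": retention_days,
    }
    if filename is not None:
        storage["filename"] = filename  # omit to keep the automatic filename
    return storage
```

For example, build_storage("default", "hello.pdf", 86400, 30) reproduces the storage object from the request body shown earlier.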
MCP Server
Pass storage_mode as a tool parameter:
html_to_pdf(html="<h1>Hello</h1>", storage_mode="memory")
The MCP interface uses a flat storage_mode string parameter instead of a nested object. MCP does not expose expires_in or retention_days — these use server defaults.
Resolution Priority
The storage mode is resolved in this order:
- Explicit request parameter: storage.mode (REST) or storage_mode (MCP)
- BYOB configuration: if you have BYOB configured in your profile and no explicit mode is set, byob is used
- Channel default: memory for REST, default for MCP
The return_binary Parameter
Sometimes you want the best of both worlds: store the file to S3 and get the binary PDF back in the same response.
The return_binary parameter does exactly this. It’s available on all PDF-producing endpoints.
| return_binary | storage.mode | What happens |
|---|---|---|
| false (default) | memory | Return binary (no storage) |
| false (default) | default / byob | Store, return JSON metadata |
| true | memory | Return binary (no change) |
| true | default / byob | Store and return binary |
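The four rows above reduce to two independent questions: is the file stored, and is the response body binary or JSON? The sketch below encodes that reading of the table; response_kind is a hypothetical helper for reasoning about the combinations, not a pdf-mcp function.

```python
from typing import Tuple


def response_kind(return_binary: bool, mode: str) -> Tuple[bool, str]:
    """Return (stored, body), where body is "binary" or "json"."""
    stored = mode in ("default", "byob")
    # Memory mode always returns binary; stored modes do so only on request.
    body = "binary" if (return_binary or not stored) else "json"
    return stored, body


for flag in (False, True):
    for mode in ("memory", "default", "byob"):
        print(flag, mode, response_kind(flag, mode))
```

In other words, the mode alone decides whether the file is stored, and return_binary only changes the response body for stored modes.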
REST API
When return_binary: true is set with default or byob storage, the response is application/pdf binary (same as memory mode) with storage metadata in custom headers:
Content-Type: application/pdf
Content-Disposition: attachment; filename="report.pdf"
X-Document-Id: 550e8400-e29b-41d4-a716-446655440000
X-Storage-Mode: default
X-Storage-Url: https://s3.eu-central-1.amazonaws.com/...
curl -X POST https://api.pdf-mcp.io/htmlToPdf \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<h1>Report</h1>",
    "storage": { "mode": "default" },
    "return_binary": true
  }' \
  --output report.pdf
The file is stored to S3 as normal. You get the PDF bytes immediately and can retrieve it later via the Files API.
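Because the response body is the PDF itself, the storage metadata has to be read from the custom headers. The sketch below pulls out the three X- headers listed above from any header object that offers a get() method, such as the headers attribute of a urllib response. The storage_info helper and the simulated headers are illustrative only.

```python
from email.message import Message


def storage_info(headers) -> dict:
    """Collect the storage metadata headers from a return_binary response."""
    return {
        "document_id": headers.get("X-Document-Id"),
        "storage_mode": headers.get("X-Storage-Mode"),
        "storage_url": headers.get("X-Storage-Url"),
    }


# Simulated response headers, reusing the example values from this section.
demo = Message()
demo["Content-Type"] = "application/pdf"
demo["X-Document-Id"] = "550e8400-e29b-41d4-a716-446655440000"
demo["X-Storage-Mode"] = "default"

info = storage_info(demo)
print(info["document_id"])  # 550e8400-e29b-41d4-a716-446655440000
print(info["storage_url"])  # None, because the demo omits X-Storage-Url
```

Keeping the document ID alongside the saved file makes it easy to fetch or delete the stored copy later through the Files API.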
MCP Server
When return_binary: true is set with default or byob, the MCP response includes the PDF as an EmbeddedResource (same format as memory mode) after the file has been stored:
html_to_pdf(html="<h1>Report</h1>", storage_mode="default", return_binary=true)
MCP Response Format
MCP tools return structured content that separates what the LLM sees from the actual file data.
Memory Mode Response
Two content blocks are returned:
- TextContent: A short summary the LLM can read:
  PDF generated: report.pdf (45 KB)
- EmbeddedResource: The PDF binary as a BlobResourceContents object. Smart MCP clients (like Claude Desktop) save this to disk automatically. The LLM does not see these bytes.
Stored Mode Response (default / byob)
- TextContent: Metadata and download instructions:
  PDF stored successfully
  Document ID: 550e8400…
  Filename: report.pdf
  Size: 45.2 KB
  Pages: 3
  Presigned URL: https://s3…
  REST Download: GET /files/{id}/download
  Or use the download_document MCP tool.
- EmbeddedResource (included automatically): The PDF binary, so smart clients can save the file to disk even though it’s also stored in the cloud.
This dual-content approach means:
- The LLM sees only the small text summary — no wasted tokens.
- The MCP client gets the binary file for local saving.
- The document is stored in the cloud for later retrieval, search, and management.
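On the client side, separating the two content blocks might look like the sketch below. It assumes the JSON shape defined by the MCP specification, where a text block has type "text" and an embedded resource has type "resource" with a resource object carrying a base64 blob field. The split_content helper, the sample URI, and the sample bytes are all hypothetical.

```python
import base64
from typing import List, Optional, Tuple


def split_content(blocks: List[dict]) -> Tuple[str, Optional[bytes]]:
    """Return (summary text, PDF bytes or None) from a list of MCP content blocks."""
    texts = []
    pdf_bytes = None
    for block in blocks:
        if block.get("type") == "text":
            texts.append(block["text"])
        elif block.get("type") == "resource":
            blob = block["resource"].get("blob")
            if blob is not None:
                pdf_bytes = base64.b64decode(blob)
    return "\n".join(texts), pdf_bytes


sample = [
    {"type": "text", "text": "PDF generated: report.pdf (45 KB)"},
    {
        "type": "resource",
        "resource": {
            "uri": "file:///report.pdf",  # hypothetical URI
            "mimeType": "application/pdf",
            "blob": base64.b64encode(b"%PDF-1.7 demo").decode("ascii"),
        },
    },
]

summary, pdf = split_content(sample)
print(summary)   # PDF generated: report.pdf (45 KB)
print(pdf[:5])   # b'%PDF-'
```

Only the summary string would be forwarded to the model; the decoded bytes go straight to disk.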
Available Endpoints
Storage mode is supported on all PDF-producing endpoints:
| Endpoint | REST | MCP Tool |
|---|---|---|
| HTML to PDF | POST /htmlToPdf | html_to_pdf |
| Text to PDF | POST /textToPdf | text_to_pdf |
| Image to PDF | POST /imageToPdf | image_to_pdf |
| Extract Pages | POST /extractPages | extract_pages |
| Merge PDFs | POST /mergePdfs | merge_pdfs |
Read-only endpoints (/extractText, /pdfToImage, /pageCount, /grepPdf) do not support storage modes.
File Management
Documents stored with default or byob mode are tracked in your account and can be managed via:
REST API — Files API
| Endpoint | Description |
|---|---|
| GET /files | List all stored documents (paginated) |
| GET /files/search?q=... | Full-text search across document contents |
| GET /files/{id} | Get document metadata + fresh presigned URL |
| GET /files/{id}/download | Download PDF binary directly |
| DELETE /files/{id} | Delete a document |
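When calling these endpoints from code, the main detail to get right is URL encoding of the search query and the document ID. The sketch below only builds URLs. It assumes the Files API is served from the same host as the /htmlToPdf example (https://api.pdf-mcp.io), which this page does not state explicitly, and the helper names are made up.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.pdf-mcp.io"  # assumed: same host as the /htmlToPdf example


def search_url(query: str) -> str:
    """URL for GET /files/search with the query safely encoded."""
    return f"{BASE_URL}/files/search?{urlencode({'q': query})}"


def download_url(document_id: str) -> str:
    """URL for GET /files/{id}/download."""
    return f"{BASE_URL}/files/{quote(document_id, safe='')}/download"


print(search_url("annual report"))
# https://api.pdf-mcp.io/files/search?q=annual+report
print(download_url("550e8400-e29b-41d4-a716-446655440000"))
# https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download
```

Requests to these URLs would carry the same Authorization header as the PDF-producing endpoints.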
MCP Tools
| Tool | Description |
|---|---|
| list_documents | List stored documents with pagination |
| search_documents | Search document contents (fulltext or grep) |
| get_document | Get document metadata and download URL |
| download_document | Download a stored PDF (returns EmbeddedResource) |
| delete_document | Delete a document from storage |
These tools allow AI agents to build complete document management workflows — generate, store, search, retrieve, and clean up — all within natural language conversations.
Related Documentation
- Quick Start — Get your API key and make your first request
- Files API — REST API for file management
- MCP Server — Set up AI agent integration
- HTML to PDF — Primary PDF generation endpoint
- Authentication — API key and OAuth setup