Storage Modes | pdf-mcp Documentation

pdf-mcp offers three storage modes that control how generated PDFs are stored and returned. The mode you pick determines the response format, whether the file can be retrieved later, and how much data flows back through the REST API or MCP Server.

The Three Modes

Mode	Stores to S3	Response	Best For
`memory`	No	Binary PDF (REST) or base64 (MCP)	One-off downloads, simple integrations
`default`	Yes (pdf-mcp bucket)	JSON metadata + signed URL	AI agents, file management, audit trails
`byob`	Yes (your bucket)	JSON metadata + signed URL	Enterprise, compliance, data sovereignty

`memory` — No Persistence

The PDF is generated in memory and returned directly in the response. Nothing is stored. Once the response is delivered, the file is gone.

REST: Returns application/pdf binary — pipe it to a file or stream it to a client.
MCP: Returns base64-encoded content as an EmbeddedResource the MCP client can save to disk, plus a short TextContent summary for the LLM.

Use this mode when you only need the PDF bytes and will not need to retrieve the file later.
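As a minimal sketch of the REST side of memory mode, the helper below writes a response body to disk after two sanity checks. It assumes you already have the raw body and the Content-Type header from whatever HTTP client you use; the function name `save_pdf_bytes` is hypothetical and not part of pdf-mcp.

```python
from pathlib import Path


def save_pdf_bytes(body: bytes, content_type: str, dest: Path) -> Path:
    """Write a memory-mode response body to dest and return the path.

    Raises ValueError if the response does not look like a PDF.
    """
    # Content-Type may carry parameters, e.g. "application/pdf; charset=binary".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/pdf":
        raise ValueError(f"expected application/pdf, got {content_type!r}")
    # PDF files begin with the "%PDF-" marker.
    if not body.startswith(b"%PDF-"):
        raise ValueError("response body does not start with %PDF-")
    dest.write_bytes(body)
    return dest
```

Checking the media type first catches the case where an error came back as JSON instead of a PDF, so a broken file is never written.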

`default` — pdf-mcp Cloud Storage

The PDF is uploaded to the pdf-mcp S3 bucket and a document record is created in your account. The response includes a presigned download URL and document metadata.

Files are accessible via the Files API (list, search, download, delete).
Presigned URLs expire (configurable, default 1 hour, max 7 days).
Documents can auto-expire via retention_days.
Full-text search is available on stored document content.

`byob` — Bring Your Own Bucket

Same as default, but the PDF is stored in your own S3 bucket. Requires BYOB configuration in your account settings. Useful for compliance requirements or when you need full control over storage.

Channel Defaults

REST API and MCP Server use different defaults. This is intentional.

Channel	Default Mode	Why
REST API	`memory`	Most REST callers just want the PDF bytes — download and done.
MCP Server	`default`	AI agents work best when files are stored and referenced by URL.

Why MCP defaults to `default` (stored)

When an AI agent generates a PDF via MCP, the raw file bytes aren’t useful in the conversation. Here’s why storing is better:

Token efficiency. A 100 KB PDF base64-encoded becomes ~133 KB of text — roughly 32,000 tokens. Returning this in the conversation wastes context window and costs money. With default storage, the LLM only sees a short metadata summary (~100 tokens).
EmbeddedResource support. MCP clients that support EmbeddedResource (like Claude Desktop) can save the PDF blob directly to disk without the LLM ever “seeing” the bytes. But this only works for smart clients — storing the file provides a universal fallback.
File management. Stored documents get a document_id that can be passed to download_document, search_documents, list_documents, and delete_document MCP tools. The AI agent can organize, search, and retrieve files across conversations.
Multi-step workflows. When the agent needs to merge, split, or convert a previously generated PDF, it can reference it by URL or document ID instead of holding megabytes of base64 in context.
Reliability. Presigned URLs and the /files/{id}/download endpoint work from any environment. Base64 blobs can fail silently when an MCP client doesn’t handle EmbeddedResource.
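The token-efficiency figure above can be checked with a few lines of arithmetic. Base64 emits 4 characters for every 3 input bytes; the 4-characters-per-token ratio used here is an assumed rule of thumb, not a property of any particular tokenizer, so treat the token count as an order-of-magnitude estimate.

```python
import base64
import math


def base64_length(num_bytes: int) -> int:
    """Character count of standard, padded base64 for num_bytes of input."""
    return 4 * math.ceil(num_bytes / 3)


def estimate_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is an assumed heuristic."""
    return round(num_chars / chars_per_token)


pdf_size = 100 * 1024  # a 100 KB PDF
encoded_chars = base64_length(pdf_size)

# Cross-check the formula against the real encoder on dummy bytes.
assert encoded_chars == len(base64.b64encode(bytes(pdf_size)))

print(f"base64 size: {encoded_chars / 1024:.1f} KB")
print(f"estimated tokens: {estimate_tokens(encoded_chars)}")
```

Under these assumptions the encoded size comes out at about 133 KB and the estimate lands in the low tens of thousands of tokens, the same order as the figure quoted above and far larger than a ~100-token metadata summary.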

You can always override the default by passing storage_mode: "memory" explicitly.

How to Set the Storage Mode

REST API

Pass a storage object in the JSON request body:

{
  "html": "<h1>Hello</h1>",
  "storage": {
    "mode": "default",
    "filename": "hello.pdf",
    "expires_in": 86400,
    "retention_days": 30
  }
}

Field	Type	Default	Description
`mode`	string	`memory`	`memory`, `default`, or `byob`
`filename`	string	auto	Custom filename for stored PDFs
`expires_in`	integer	`3600`	Signed URL expiration in seconds (60–604800)
`retention_days`	integer	`14`	Auto-delete after N days (1–365)
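The field table can be turned into a small client-side builder that applies the documented defaults and ranges before a request is sent. This is a sketch only: `build_request_body` is a hypothetical helper, and whether the server rejects or clamps out-of-range values is not stated here, so the checks simply mirror the table.

```python
import json
from typing import Optional

VALID_MODES = ("memory", "default", "byob")


def build_request_body(
    html: str,
    mode: str = "memory",
    filename: Optional[str] = None,
    expires_in: int = 3600,
    retention_days: int = 14,
) -> dict:
    """Build a JSON-serializable request body with a nested storage object."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not 60 <= expires_in <= 604800:
        raise ValueError("expires_in must be between 60 and 604800 seconds")
    if not 1 <= retention_days <= 365:
        raise ValueError("retention_days must be between 1 and 365")
    storage = {
        "mode": mode,
        "expires_in": expires_in,
        "retention_days": retention_days,
    }
    # filename is optional; the service picks one automatically when omitted.
    if filename is not None:
        storage["filename"] = filename
    return {"html": html, "storage": storage}


body = build_request_body(
    "<h1>Hello</h1>", mode="default", filename="hello.pdf",
    expires_in=86400, retention_days=30,
)
print(json.dumps(body, indent=2))
```

The call at the bottom reproduces the JSON example shown earlier in this section.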

MCP Server

Pass storage_mode as a tool parameter:

html_to_pdf(html="<h1>Hello</h1>", storage_mode="memory")

The MCP interface uses a flat storage_mode string parameter instead of a nested object. MCP does not expose expires_in or retention_days — these use server defaults.

Resolution Priority

The storage mode is resolved in this order:

Explicit request parameter — storage.mode (REST) or storage_mode (MCP)
BYOB configuration — If you have BYOB configured in your profile and no explicit mode is set, byob is used
Channel default — memory for REST, default for MCP
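The resolution order above can be expressed as a short function. This is an illustrative sketch of the documented priority, not the server's actual code; the function and parameter names are hypothetical.

```python
from typing import Optional

# Channel defaults as documented: memory for REST, default for MCP.
CHANNEL_DEFAULTS = {"rest": "memory", "mcp": "default"}


def resolve_storage_mode(
    channel: str,
    explicit_mode: Optional[str] = None,
    byob_configured: bool = False,
) -> str:
    """Return the effective storage mode for a request."""
    # 1. An explicit request parameter always wins.
    if explicit_mode is not None:
        return explicit_mode
    # 2. A BYOB configuration applies when no explicit mode is given.
    if byob_configured:
        return "byob"
    # 3. Otherwise fall back to the channel default.
    return CHANNEL_DEFAULTS[channel]
```

One consequence worth noting: an account with BYOB configured gets byob on both channels unless the request names a mode, so a REST caller who wants raw bytes without storage would pass memory explicitly.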

The `return_binary` Parameter

Sometimes you want the best of both worlds: store the file to S3 and get the binary PDF back in the same response.

The return_binary parameter does exactly this. It’s available on all PDF-producing endpoints.

`return_binary`	`storage.mode`	What happens
`false` (default)	`memory`	Return binary (no storage)
`false` (default)	`default` / `byob`	Store, return JSON metadata
`true`	`memory`	Return binary (no change)
`true`	`default` / `byob`	Store and return binary
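The four rows of the table reduce to two independent questions: is the file stored, and is the response body binary or JSON? The sketch below encodes that for the REST channel; `response_behavior` is a hypothetical name used only for illustration.

```python
def response_behavior(mode: str, return_binary: bool = False) -> dict:
    """Describe what a REST request does for a given mode and return_binary flag."""
    # Storage depends only on the mode.
    stored = mode in ("default", "byob")
    # Memory mode is always binary; stored modes are binary only on request.
    body = "binary" if (mode == "memory" or return_binary) else "json"
    return {"stored": stored, "body": body}


for mode in ("memory", "default", "byob"):
    for flag in (False, True):
        print(mode, flag, response_behavior(mode, flag))
```

Seen this way, return_binary never changes whether a file is stored; it only changes what comes back.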

REST API

When return_binary: true is set with default or byob storage, the response is application/pdf binary (same as memory mode) with storage metadata in custom headers:

Content-Type: application/pdf
Content-Disposition: attachment; filename="report.pdf"
X-Document-Id: 550e8400-e29b-41d4-a716-446655440000
X-Storage-Mode: default
X-Storage-Url: https://s3.eu-central-1.amazonaws.com/...

curl -X POST https://api.pdf-mcp.io/htmlToPdf \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<h1>Report</h1>",
    "storage": { "mode": "default" },
    "return_binary": true
  }' \
  --output report.pdf

The file is stored to S3 as normal. You get the PDF bytes immediately and can retrieve the file later via the Files API.
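When the body is binary, the storage metadata has to be read from the response headers. The sketch below pulls out the three custom headers listed above from any header mapping. It assumes the headers are absent when nothing was stored, which is an assumption rather than documented behavior, and `storage_metadata` is a hypothetical helper name.

```python
from typing import Mapping, Optional


def storage_metadata(headers: Mapping[str, str]) -> Optional[dict]:
    """Extract storage metadata from response headers, or None if not stored."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    document_id = lowered.get("x-document-id")
    if document_id is None:
        # Assumption: no X-Document-Id header means the PDF was not stored.
        return None
    return {
        "document_id": document_id,
        "storage_mode": lowered.get("x-storage-mode"),
        "url": lowered.get("x-storage-url"),
    }
```

Keeping the returned document_id alongside the saved file makes it possible to fetch a fresh presigned URL or delete the stored copy later.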

MCP Server

When return_binary: true is set with default or byob storage, the MCP response includes the PDF as an EmbeddedResource (same format as memory mode) after the file has been stored:

html_to_pdf(html="<h1>Report</h1>", storage_mode="default", return_binary=true)

MCP Response Format

MCP tools return structured content that separates what the LLM sees from the actual file data.

Memory Mode Response

Two content blocks are returned:

TextContent — A short summary the LLM can read:

PDF generated: report.pdf (45 KB)

EmbeddedResource — The PDF binary as a BlobResourceContents object. Smart MCP clients (like Claude Desktop) save this to disk automatically. The LLM does not see these bytes.

Stored Mode Response (`default` / `byob`)

TextContent — Metadata and download instructions:

PDF stored successfully

Document ID: 550e8400…
Filename: report.pdf
Size: 45.2 KB
Pages: 3

Presigned URL: https://s3…

REST Download: GET /files/{id}/download
Or use the download_document MCP tool.

EmbeddedResource (included automatically) — The PDF binary, so smart clients can save the file to disk even though it’s also stored in the cloud.

This dual-content approach means:

The LLM sees only the small text summary — no wasted tokens.
The MCP client gets the binary file for local saving.
The document is stored in the cloud for later retrieval, search, and management.
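For client authors, the dual-content approach means separating text blocks from embedded resources. The sketch below assumes the tool result has already been parsed into plain dictionaries in the MCP content shape (`type: "text"` with a `text` field, and `type: "resource"` with a `resource` object carrying a base64 `blob`). The function name and the sample URI are hypothetical.

```python
import base64
from typing import List, Tuple


def split_mcp_content(content: List[dict]) -> Tuple[str, List[bytes]]:
    """Return (text for the LLM, decoded binary blobs for the client)."""
    texts: List[str] = []
    blobs: List[bytes] = []
    for block in content:
        if block.get("type") == "text":
            texts.append(block["text"])
        elif block.get("type") == "resource":
            resource = block["resource"]
            # Binary resources carry base64 in "blob"; text resources use "text".
            if "blob" in resource:
                blobs.append(base64.b64decode(resource["blob"]))
    return "\n".join(texts), blobs


sample = [
    {"type": "text", "text": "PDF generated: report.pdf (45 KB)"},
    {
        "type": "resource",
        "resource": {
            "uri": "file:///report.pdf",  # hypothetical URI for illustration
            "mimeType": "application/pdf",
            "blob": base64.b64encode(b"%PDF-1.7 sample").decode("ascii"),
        },
    },
]
summary, files = split_mcp_content(sample)
print(summary, len(files))
```

Only `summary` would be forwarded to the model; the decoded bytes in `files` can be written to disk by the client.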

Available Endpoints

Storage mode is supported on all PDF-producing endpoints:

Endpoint	REST	MCP Tool
HTML to PDF	`POST /htmlToPdf`	`html_to_pdf`
Text to PDF	`POST /textToPdf`	`text_to_pdf`
Image to PDF	`POST /imageToPdf`	`image_to_pdf`
Extract Pages	`POST /extractPages`	`extract_pages`
Merge PDFs	`POST /mergePdfs`	`merge_pdfs`

Read-only endpoints (/extractText, /pdfToImage, /pageCount, /grepPdf) do not support storage modes.

File Management

Documents stored with default or byob mode are tracked in your account and can be managed via:

REST API — Files API

Endpoint	Description
`GET /files`	List all stored documents (paginated)
`GET /files/search?q=...`	Full-text search across document contents
`GET /files/{id}`	Get document metadata + fresh presigned URL
`GET /files/{id}/download`	Download PDF binary directly
`DELETE /files/{id}`	Delete a document
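The endpoints in this table can be wrapped in a few request builders using only the Python standard library. This sketch assumes the Files API is served from the same base URL and uses the same Bearer authorization as the curl example earlier on this page; neither is confirmed here, and all function names are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: same host as the htmlToPdf curl example.
BASE_URL = "https://api.pdf-mcp.io"


def files_request(api_key: str, path: str = "/files",
                  query: Optional[dict] = None, method: str = "GET") -> Request:
    """Build (but do not send) an authenticated Files API request."""
    url = BASE_URL + path
    if query:
        url += "?" + urlencode(query)
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method=method)


def search_request(api_key: str, q: str) -> Request:
    return files_request(api_key, "/files/search", {"q": q})


def download_request(api_key: str, document_id: str) -> Request:
    # Percent-encode the ID so it cannot alter the path structure.
    return files_request(api_key, f"/files/{quote(document_id, safe='')}/download")


def delete_request(api_key: str, document_id: str) -> Request:
    return files_request(api_key, f"/files/{quote(document_id, safe='')}", method="DELETE")
```

Each builder returns a `urllib.request.Request` that can be passed to `urllib.request.urlopen` to perform the call.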

MCP Tools

Tool	Description
`list_documents`	List stored documents with pagination
`search_documents`	Search document contents (fulltext or grep)
`get_document`	Get document metadata and download URL
`download_document`	Download a stored PDF (returns EmbeddedResource)
`delete_document`	Delete a document from storage

These tools allow AI agents to build complete document management workflows — generate, store, search, retrieve, and clean up — all within natural language conversations.

Related Documentation

Quick Start — Get your API key and make your first request
Files API — REST API for file management
MCP Server — Set up AI agent integration
HTML to PDF — Primary PDF generation endpoint
Authentication — API key and OAuth setup
