Files API | pdf-mcp Documentation

The Files API provides endpoints for managing PDF documents stored in your account. You can list all stored files, search their contents, retrieve individual documents with fresh download URLs, download PDF binaries directly, and delete files when no longer needed.

Overview

When PDFs are generated with storage: "default" or storage: "byob", they are stored and tracked in your account. The Files API lets you:

List all stored documents with pagination and filtering
Search document contents using full-text search or substring matching
Retrieve individual documents with fresh presigned download URLs
Download PDF binaries directly when presigned URLs are not accessible
Delete documents from storage

List Files

Retrieve a paginated list of all stored documents.

Endpoint

GET /files

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Query Parameters

Parameter	Type	Required	Default	Description
`limit`	integer	No	50	Maximum documents to return (max: 100)
`offset`	integer	No	0	Number of documents to skip for pagination
`storage_mode`	string	No	-	Filter by storage mode: `default` or `byob`

Example Request

# List first 50 documents
curl -X GET "https://api.pdf-mcp.io/files" \
  -H "Authorization: Bearer YOUR_API_KEY"

# List with pagination
curl -X GET "https://api.pdf-mcp.io/files?limit=20&offset=40" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Filter by storage mode
curl -X GET "https://api.pdf-mcp.io/files?storage_mode=default" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "documents": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "filename": "invoice-2024.pdf",
      "storage_mode": "default",
      "file_size_bytes": 45678,
      "page_count": 3,
      "source_endpoint": "/htmlToPdf",
      "url": "https://s3.eu-central-1.amazonaws.com/...",
      "signed_url_expires_at": "2024-01-15T12:00:00Z",
      "expires_at": "2024-02-15T10:30:00Z",
      "created_at": "2024-01-15T10:30:00Z"
    }
  ],
  "total": 150,
  "limit": 50,
  "offset": 0
}

Response Fields

Field	Type	Description
`documents`	array	Array of document objects
`total`	integer	Total number of documents matching filter
`limit`	integer	Limit used for this request
`offset`	integer	Offset used for this request

Document Object

Field	Type	Description
`id`	string	Document UUID
`filename`	string	Document filename
`storage_mode`	string	`default` or `byob`
`file_size_bytes`	integer	Size of the PDF in bytes
`page_count`	integer	Number of pages
`source_endpoint`	string	API endpoint that generated the PDF
`url`	string	Presigned URL for downloading
`signed_url_expires_at`	string	Expiration time of the signed URL (ISO 8601)
`expires_at`	string	Auto-deletion timestamp if set (ISO 8601)
`created_at`	string	Document creation timestamp (ISO 8601)
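
For typed client code, the document object can be mapped onto a small record type. The following is a minimal sketch, not an official client: the names `StoredDocument` and `parse_timestamp` are hypothetical, and it assumes that `expires_at` is either null or absent when no auto-deletion is set.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as "2024-01-15T10:30:00Z"."""
    if value is None:
        return None
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class StoredDocument:
    """Hypothetical typed view of one entry in the `documents` array."""

    id: str
    filename: str
    storage_mode: str
    file_size_bytes: int
    page_count: int
    source_endpoint: str
    url: str
    signed_url_expires_at: datetime
    expires_at: Optional[datetime]  # assumption: null or absent when not set
    created_at: datetime

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> "StoredDocument":
        """Build a StoredDocument from a decoded JSON document object."""
        return cls(
            id=data["id"],
            filename=data["filename"],
            storage_mode=data["storage_mode"],
            file_size_bytes=data["file_size_bytes"],
            page_count=data["page_count"],
            source_endpoint=data["source_endpoint"],
            url=data["url"],
            signed_url_expires_at=parse_timestamp(data["signed_url_expires_at"]),
            expires_at=parse_timestamp(data.get("expires_at")),
            created_at=parse_timestamp(data["created_at"]),
        )
```

Parsing the timestamps once at the boundary keeps later comparisons (URL freshness, document age) free of string handling.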

Status Codes

Code	Description
200	Success - Documents list returned
401	Unauthorized - Missing or invalid Authorization header
500	Internal Server Error - Failed to list documents

Credit Usage

Free (0 credits)

Search Files

Search document contents using full-text search or substring matching.

Endpoint

GET /files/search

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Query Parameters

Parameter	Type	Required	Default	Description
`q`	string	Yes	-	Search query string
`mode`	string	No	`fulltext`	Search mode: `fulltext` or `grep`
`limit`	integer	No	20	Maximum results to return (max: 100)

Search Modes

fulltext (default)

Uses PostgreSQL full-text search
Returns results ranked by relevance
Supports word stemming and language-aware matching
Faster for large document collections
Returns snippets with match context

grep

Case-insensitive substring matching
Exact phrase matching
Slower than fulltext, especially for large collections
No relevance ranking
Returns snippets with match context
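
The choice between the two modes can be made in client code. The sketch below is a hypothetical helper, not part of the API: `pick_mode`, `build_search_url`, and the identifier heuristic are assumptions for illustration. It selects `grep` for queries that look like exact identifiers and `fulltext` otherwise, then URL-encodes the documented `q`, `mode`, and `limit` parameters.

```python
import re
from urllib.parse import quote, urlencode

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples

# Assumption: a single token containing a digit or "@" (an invoice number,
# an email address) is treated as an exact identifier. Tune to your own data.
_IDENTIFIER = re.compile(r"^\S*[\d@]\S*$")


def pick_mode(query: str) -> str:
    """Return "grep" for identifier-like queries, "fulltext" otherwise."""
    return "grep" if _IDENTIFIER.match(query.strip()) else "fulltext"


def build_search_url(query: str, limit: int = 20) -> str:
    """Build a GET /files/search URL from the documented query parameters."""
    if not query.strip():
        raise ValueError("q is required")
    # The maximum of 100 is documented; the lower bound of 1 is an assumption.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"q": query, "mode": pick_mode(query), "limit": limit}
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    return f"{BASE_URL}/files/search?" + urlencode(params, quote_via=quote)
```

For example, `build_search_url("invoice total")` selects `fulltext`, while `build_search_url("INV-2024-001")` selects `grep`.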

Example Request

# Full-text search
curl -X GET "https://api.pdf-mcp.io/files/search?q=invoice%20total" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Grep mode for exact substring
curl -X GET "https://api.pdf-mcp.io/files/search?q=INV-2024-001&mode=grep" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Limit results
curl -X GET "https://api.pdf-mcp.io/files/search?q=contract&limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "results": [
    {
      "document": {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "filename": "invoice-2024.pdf",
        "storage_mode": "default",
        "file_size_bytes": 45678,
        "page_count": 3,
        "source_endpoint": "/htmlToPdf",
        "url": "https://s3.eu-central-1.amazonaws.com/...",
        "signed_url_expires_at": "2024-01-15T12:00:00Z",
        "expires_at": "2024-02-15T10:30:00Z",
        "created_at": "2024-01-15T10:30:00Z"
      },
      "rank": 0.85,
      "snippet": "...total amount due: $1,234.56. Invoice date..."
    }
  ],
  "total": 5,
  "query": "invoice total",
  "mode": "fulltext"
}

Response Fields

Field	Type	Description
`results`	array	Array of search hit objects
`total`	integer	Total number of matching documents
`query`	string	The search query used
`mode`	string	The search mode used

Search Hit Object

Field	Type	Description
`document`	object	Document metadata with presigned URL
`rank`	number	Relevance score (fulltext mode only, null for grep)
`snippet`	string	Text snippet showing match context

Status Codes

Code	Description
200	Success - Search results returned
400	Bad Request - Missing query or invalid mode
401	Unauthorized - Missing or invalid Authorization header
500	Internal Server Error - Search failed

Credit Usage

0.01 credits per search

Notes

Search runs against the Markdown representation of the PDF content, which is extracted during generation
Documents without text content (e.g., image-based PDFs) will not appear in search results
Only documents owned by the authenticated user are searched

Get File

Retrieve a single document by ID with a fresh presigned download URL.

Endpoint

GET /files/{document_id}

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Path Parameters

Parameter	Type	Required	Description
`document_id`	string	Yes	UUID of the document to retrieve

Query Parameters

Parameter	Type	Required	Default	Description
`expires_in`	integer	No	3600	Signed URL expiry in seconds (min: 60, max: 604800)
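
How the server responds to an out-of-range `expires_in` is not stated here, so a client may prefer to keep the value inside the documented range before sending it. A minimal sketch, with hypothetical helper names `clamp_expires_in` and `get_file_url`:

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples
MIN_EXPIRES_IN = 60        # documented minimum, in seconds
MAX_EXPIRES_IN = 604_800   # documented maximum (7 days), in seconds


def clamp_expires_in(seconds: int) -> int:
    """Clamp a requested URL lifetime into the documented 60..604800 range."""
    return max(MIN_EXPIRES_IN, min(MAX_EXPIRES_IN, seconds))


def get_file_url(document_id: str, expires_in: Optional[int] = None) -> str:
    """Build a GET /files/{document_id} request URL."""
    url = f"{BASE_URL}/files/{document_id}"
    if expires_in is None:
        return url  # the documented default of 3600 seconds applies
    return f"{url}?" + urlencode({"expires_in": clamp_expires_in(expires_in)})
```

Clamping silently is a design choice; raising an error instead may suit callers who need to know that a requested lifetime was shortened.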

Example Request

# Get document with default 1-hour URL
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get document with 24-hour URL
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=86400" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get document with 7-day URL (maximum)
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=604800" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "filename": "invoice-2024.pdf",
  "storage_mode": "default",
  "file_size_bytes": 45678,
  "page_count": 3,
  "source_endpoint": "/htmlToPdf",
  "url": "https://s3.eu-central-1.amazonaws.com/...",
  "signed_url_expires_at": "2024-01-15T12:00:00Z",
  "expires_at": "2024-02-15T10:30:00Z",
  "created_at": "2024-01-15T10:30:00Z"
}

Response Fields

Field	Type	Description
`id`	string	Document UUID
`filename`	string	Document filename
`storage_mode`	string	`default` or `byob`
`file_size_bytes`	integer	Size of the PDF in bytes
`page_count`	integer	Number of pages
`source_endpoint`	string	API endpoint that generated the PDF
`url`	string	Fresh presigned URL for downloading
`signed_url_expires_at`	string	Expiration time of the signed URL (ISO 8601)
`expires_at`	string	Auto-deletion timestamp if set (ISO 8601)
`created_at`	string	Document creation timestamp (ISO 8601)

Status Codes

Code	Description
200	Success - Document returned with presigned URL
401	Unauthorized - Missing or invalid Authorization header
403	Forbidden - Access denied (document belongs to another user)
404	Not Found - Document not found or was deleted
502	Bad Gateway - Failed to generate download URL

Credit Usage

Free (0 credits)

Download File

Download a document’s PDF binary directly by ID. The endpoint proxies the file from S3 and returns the raw PDF bytes rather than a JSON body containing a presigned URL.

Endpoint

GET /files/{document_id}/download

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Path Parameters

Parameter	Type	Required	Description
`document_id`	string	Yes	UUID of the document to download

Example Request

# Download PDF directly
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output document.pdf

Success Response

Returns the raw PDF binary with the following headers:

Header	Value
`Content-Type`	`application/pdf`
`Content-Disposition`	`attachment; filename="original-filename.pdf"`
`Content-Length`	File size in bytes

Status Codes

Code	Description
200	Success - PDF binary returned
401	Unauthorized - Missing or invalid Authorization header
403	Forbidden - Access denied (document belongs to another user)
404	Not Found - Document not found or was deleted
502	Bad Gateway - Failed to download file from storage

Credit Usage

Free (0 credits)

Notes

Returns the actual PDF bytes, not a JSON response with a presigned URL
Useful when your environment cannot access S3 presigned URLs directly (e.g., firewalled VMs, restricted networks)
The original filename is preserved in the Content-Disposition header
Only the document owner can download their documents

Delete File

Delete a document by ID, removing it from storage.

Endpoint

DELETE /files/{document_id}

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Path Parameters

Parameter	Type	Required	Description
`document_id`	string	Yes	UUID of the document to delete

Example Request

curl -X DELETE "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "deleted": true,
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Document 550e8400-e29b-41d4-a716-446655440000 has been deleted"
}

Response Fields

Field	Type	Description
`deleted`	boolean	True if deletion succeeded
`document_id`	string	UUID of the deleted document
`message`	string	Confirmation message

Status Codes

Code	Description
200	Success - Document deleted
401	Unauthorized - Missing or invalid Authorization header
403	Forbidden - Access denied (document belongs to another user)
404	Not Found - Document not found or was already deleted
502	Bad Gateway - Failed to delete file from storage

Credit Usage

Free (0 credits)

Notes

Deletion is a soft delete at the metadata level - the document record is kept with a deleted_at timestamp, but the document no longer appears in list queries
The actual file is removed from S3 storage
This action cannot be undone

Use Cases

List Recent Documents

Retrieve the most recent documents for display in a dashboard:

curl -X GET "https://api.pdf-mcp.io/files?limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

Paginate Through All Documents

Retrieve all documents in batches:

# First page
curl -X GET "https://api.pdf-mcp.io/files?limit=50&offset=0" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Second page
curl -X GET "https://api.pdf-mcp.io/files?limit=50&offset=50" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Continue, increasing offset by limit, until offset >= total
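
The same walk can be written as a loop. This is a minimal sketch under stated assumptions: `iter_documents` and `make_fetcher` are hypothetical names, and the page-fetching callable is injected so the loop can be exercised without network access.

```python
import json
from typing import Any, Callable, Iterator
from urllib.request import Request, urlopen

Page = dict[str, Any]


def make_fetcher(
    api_key: str, base_url: str = "https://api.pdf-mcp.io"
) -> Callable[[int, int], Page]:
    """Return a function that performs GET /files for one limit/offset pair."""

    def fetch_page(limit: int, offset: int) -> Page:
        request = Request(
            f"{base_url}/files?limit={limit}&offset={offset}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urlopen(request) as response:
            return json.load(response)

    return fetch_page


def iter_documents(
    fetch_page: Callable[[int, int], Page], limit: int = 50
) -> Iterator[dict[str, Any]]:
    """Yield every document, advancing offset until total is reached."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        documents = page["documents"]
        yield from documents
        offset += len(documents)
        # Stop at the reported total, or on an empty page so that a collection
        # shrinking mid-walk (documents deleted) cannot loop forever.
        if not documents or offset >= page["total"]:
            break
```

Usage would look like `for doc in iter_documents(make_fetcher("YOUR_API_KEY")): ...`. Because the list is offset-based, documents created or deleted during the walk may be skipped or repeated.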

Find Documents by Content

Search for invoices containing a specific customer name:

curl -X GET "https://api.pdf-mcp.io/files/search?q=Acme%20Corporation" \
  -H "Authorization: Bearer YOUR_API_KEY"

Find Exact Document Reference

Search for a specific invoice number using grep mode:

curl -X GET "https://api.pdf-mcp.io/files/search?q=INV-2024-00123&mode=grep" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get Fresh Download URL

Retrieve a document with a custom expiry time for sharing:

# Get 24-hour download link
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=86400" \
  -H "Authorization: Bearer YOUR_API_KEY"

Download PDF Directly

Download the PDF file when presigned URLs are not accessible:

curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output invoice.pdf

Clean Up Old Documents

Delete documents that are no longer needed:

curl -X DELETE "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer YOUR_API_KEY"
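
The only list filter documented above is `storage_mode`, so selecting documents by age has to happen client-side. The following sketch picks deletion candidates from a `GET /files` result by `created_at`; the helper name and the 90-day threshold are assumptions, not API behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def stale_document_ids(
    documents: list[dict[str, Any]], now: datetime, max_age_days: int = 90
) -> list[str]:
    """Return the IDs of documents created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for doc in documents:
        # Rewrite the trailing "Z" so fromisoformat() works before Python 3.11.
        created = datetime.fromisoformat(doc["created_at"].replace("Z", "+00:00"))
        if created < cutoff:
            stale.append(doc["id"])
    return stale
```

Each returned ID can then be passed to `DELETE /files/{document_id}`. Pass a timezone-aware `now`, for example `datetime.now(timezone.utc)`, so it compares cleanly with the UTC timestamps.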

Tips and Best Practices

Efficient Pagination

Use reasonable page sizes (20-50 documents) for better performance
Cache the total count to calculate total pages without re-fetching
Consider filtering by storage_mode if you have documents in both modes

Search Optimization

Use fulltext mode for natural language queries (e.g., “invoice payment due”)
Use grep mode for exact identifiers (e.g., “INV-2024-001”, email addresses)
Keep search queries focused - broader queries return more results but may be less relevant

URL Management

Presigned URLs expire - always check signed_url_expires_at before using a stored URL
Request longer expires_in values when sharing URLs externally
Maximum URL validity is 7 days (604800 seconds)
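
The expiry check can be sketched as follows. The helper name `url_is_fresh` and the 60-second safety margin are assumptions; the margin exists so a URL is not handed out moments before it stops working.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def url_is_fresh(
    doc: dict[str, Any],
    now: Optional[datetime] = None,
    margin_seconds: int = 60,
) -> bool:
    """Return True if the document's presigned URL is still valid with margin."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Rewrite the trailing "Z" so fromisoformat() works before Python 3.11.
    expires_at = datetime.fromisoformat(
        doc["signed_url_expires_at"].replace("Z", "+00:00")
    )
    return now + timedelta(seconds=margin_seconds) < expires_at
```

When this returns False, requesting the document again through `GET /files/{document_id}` provides a fresh URL.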

Storage Management

Documents with expires_at set are deleted automatically at that timestamp
Manually delete documents you no longer need to free up storage
Use search to find old or unused documents for cleanup

Related Endpoints

HTML to PDF - Generate PDFs that can be stored
Text to PDF - Generate PDFs from text content
Image to PDF - Generate PDFs from images

Credit Usage Summary

Endpoint	Credits
GET /files	Free
GET /files/search	0.01 per search
GET /files/{document_id}	Free
GET /files/{document_id}/download	Free
DELETE /files/{document_id}	Free
