The Files API provides endpoints for managing PDF documents stored in your account. You can list all stored files, search their contents, retrieve individual documents with fresh download URLs, download PDF binaries directly, and delete files when no longer needed.
Overview
When PDFs are generated with storage: "default" or storage: "byob", they are stored and tracked in your account. The Files API lets you:
- List all stored documents with pagination and filtering
- Search document contents using full-text search or substring matching
- Retrieve individual documents with fresh presigned download URLs
- Download PDF binaries directly when presigned URLs are not accessible
- Delete documents from storage
List Files
Retrieve a paginated list of all stored documents.
Endpoint
GET /files
Authentication
Requires a valid API key or OAuth token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
See Authentication for details.
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | integer | No | 50 | Maximum documents to return (max: 100) |
| offset | integer | No | 0 | Number of documents to skip for pagination |
| storage_mode | string | No | - | Filter by storage mode: default or byob |
Example Request
# List first 50 documents
curl -X GET "https://api.pdf-mcp.io/files" \
-H "Authorization: Bearer YOUR_API_KEY"
# List with pagination
curl -X GET "https://api.pdf-mcp.io/files?limit=20&offset=40" \
-H "Authorization: Bearer YOUR_API_KEY"
# Filter by storage mode
curl -X GET "https://api.pdf-mcp.io/files?storage_mode=default" \
-H "Authorization: Bearer YOUR_API_KEY"
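The query parameters above can also be assembled programmatically. The following is a minimal Python sketch rather than an official client: the helper name build_list_url is hypothetical, the base URL is taken from the curl examples, and the lower bound of 1 on limit is an assumption (the table only states the maximum).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples


def build_list_url(limit=50, offset=0, storage_mode=None):
    """Return the GET /files URL for the given query parameters."""
    # Lower bound of 1 is an assumption; the documented maximum is 100.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must not be negative")
    if storage_mode not in (None, "default", "byob"):
        raise ValueError("storage_mode must be 'default' or 'byob'")
    params = {"limit": limit, "offset": offset}
    if storage_mode is not None:
        params["storage_mode"] = storage_mode
    return f"{BASE_URL}/files?{urlencode(params)}"


print(build_list_url(limit=20, offset=40))
```

Validating on the client side avoids a round trip for requests that would presumably be rejected anyway.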
Success Response
{
"documents": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"filename": "invoice-2024.pdf",
"storage_mode": "default",
"file_size_bytes": 45678,
"page_count": 3,
"source_endpoint": "/htmlToPdf",
"url": "https://s3.eu-central-1.amazonaws.com/...",
"signed_url_expires_at": "2024-01-15T12:00:00Z",
"expires_at": "2024-02-15T10:30:00Z",
"created_at": "2024-01-15T10:30:00Z"
}
],
"total": 150,
"limit": 50,
"offset": 0
}
Response Fields
| Field | Type | Description |
|---|---|---|
| documents | array | Array of document objects |
| total | integer | Total number of documents matching filter |
| limit | integer | Limit used for this request |
| offset | integer | Offset used for this request |
Document Object
| Field | Type | Description |
|---|---|---|
| id | string | Document UUID |
| filename | string | Document filename |
| storage_mode | string | default or byob |
| file_size_bytes | integer | Size of the PDF in bytes |
| page_count | integer | Number of pages |
| source_endpoint | string | API endpoint that generated the PDF |
| url | string | Presigned URL for downloading |
| signed_url_expires_at | string | Expiration time of the signed URL (ISO 8601) |
| expires_at | string | Auto-deletion timestamp if set (ISO 8601) |
| created_at | string | Document creation timestamp (ISO 8601) |
Status Codes
| Code | Description |
|---|---|
| 200 | Success - Documents list returned |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 500 | Internal Server Error - Failed to list documents |
Credit Usage
Free (0 credits)
Search Files
Search document contents using full-text search or substring matching.
Endpoint
GET /files/search
Authentication
Requires a valid API key or OAuth token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
See Authentication for details.
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| q | string | Yes | - | Search query string |
| mode | string | No | fulltext | Search mode: fulltext or grep |
| limit | integer | No | 20 | Maximum results to return (max: 100) |
Search Modes
fulltext (default)
- Uses PostgreSQL full-text search
- Returns results ranked by relevance
- Supports word stemming and language-aware matching
- Faster for large document collections
- Returns snippets with match context
grep
- Case-insensitive substring matching
- Exact phrase matching
- Slower than fulltext, especially for large collections
- No relevance ranking
- Returns snippets with match context
Example Request
# Full-text search
curl -X GET "https://api.pdf-mcp.io/files/search?q=invoice%20total" \
-H "Authorization: Bearer YOUR_API_KEY"
# Grep mode for exact substring
curl -X GET "https://api.pdf-mcp.io/files/search?q=INV-2024-001&mode=grep" \
-H "Authorization: Bearer YOUR_API_KEY"
# Limit results
curl -X GET "https://api.pdf-mcp.io/files/search?q=contract&limit=10" \
-H "Authorization: Bearer YOUR_API_KEY"
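Search queries need URL encoding, as the %20 in the first example shows. A small Python sketch of building the search URL follows; build_search_url is a hypothetical helper, and the lower bound of 1 on limit is an assumption.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples


def build_search_url(query, mode="fulltext", limit=20):
    """Return the GET /files/search URL with the query percent-encoded."""
    if not query:
        raise ValueError("q is required")
    if mode not in ("fulltext", "grep"):
        raise ValueError("mode must be 'fulltext' or 'grep'")
    # Lower bound of 1 is an assumption; the documented maximum is 100.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"q": query, "mode": mode, "limit": limit}
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    return f"{BASE_URL}/files/search?{urlencode(params, quote_via=quote)}"


print(build_search_url("invoice total"))
print(build_search_url("INV-2024-001", mode="grep"))
```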
Success Response
{
"results": [
{
"document": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"filename": "invoice-2024.pdf",
"storage_mode": "default",
"file_size_bytes": 45678,
"page_count": 3,
"source_endpoint": "/htmlToPdf",
"url": "https://s3.eu-central-1.amazonaws.com/...",
"signed_url_expires_at": "2024-01-15T12:00:00Z",
"expires_at": "2024-02-15T10:30:00Z",
"created_at": "2024-01-15T10:30:00Z"
},
"rank": 0.85,
"snippet": "...total amount due: $1,234.56. Invoice date..."
}
],
"total": 5,
"query": "invoice total",
"mode": "fulltext"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| results | array | Array of search hit objects |
| total | integer | Total number of matching documents |
| query | string | The search query used |
| mode | string | The search mode used |
Search Hit Object
| Field | Type | Description |
|---|---|---|
| document | object | Document metadata with presigned URL |
| rank | number | Relevance score (fulltext mode only, null for grep) |
| snippet | string | Text snippet showing match context |
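Because rank is null in grep mode, client code should not assume it is always a number. A short Python sketch of rendering hits under that constraint is shown here; format_hits is a hypothetical helper and the sample payload is trimmed from the success response above.

```python
def format_hits(response):
    """Return one display line per search hit, tolerating a null rank."""
    lines = []
    for hit in response["results"]:
        rank = hit.get("rank")
        # rank is only populated in fulltext mode; grep hits carry null.
        score = f"{rank:.2f}" if rank is not None else "n/a"
        filename = hit["document"]["filename"]
        lines.append(f"{filename} [{score}] {hit['snippet']}")
    return lines


sample = {
    "results": [
        {
            "document": {"filename": "invoice-2024.pdf"},
            "rank": 0.85,
            "snippet": "...total amount due: $1,234.56. Invoice date...",
        },
        {
            "document": {"filename": "invoice-2024.pdf"},
            "rank": None,
            "snippet": "...INV-2024-001...",
        },
    ]
}

for line in format_hits(sample):
    print(line)
```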
Status Codes
| Code | Description |
|---|---|
| 200 | Success - Search results returned |
| 400 | Bad Request - Missing query or invalid mode |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 500 | Internal Server Error - Search failed |
Credit Usage
0.01 credits per search
Notes
- Search is performed on the markdown representation of PDF content extracted during generation
- Documents without text content (e.g., image-based PDFs) will not appear in search results
- Only documents owned by the authenticated user are searched
Get File
Retrieve a single document by ID with a fresh presigned download URL.
Endpoint
GET /files/{document_id}
Authentication
Requires a valid API key or OAuth token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
See Authentication for details.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of the document to retrieve |
Query Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| expires_in | integer | No | 3600 | Signed URL expiry in seconds (min: 60, max: 604800) |
Example Request
# Get document with default 1-hour URL
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
-H "Authorization: Bearer YOUR_API_KEY"
# Get document with 24-hour URL
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=86400" \
-H "Authorization: Bearer YOUR_API_KEY"
# Get document with 7-day URL (maximum)
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=604800" \
-H "Authorization: Bearer YOUR_API_KEY"
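The expires_in bounds (60 to 604800 seconds) can be enforced before the request is sent. The Python sketch below clamps out-of-range values on the client; clamping rather than rejecting is a design choice of this example, not documented server behavior, and build_file_url is a hypothetical helper.

```python
import uuid

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples
MIN_EXPIRES_IN = 60        # documented minimum, in seconds
MAX_EXPIRES_IN = 604800    # documented maximum (7 days), in seconds


def build_file_url(document_id, expires_in=None):
    """Return the GET /files/{document_id} URL, clamping expires_in to the documented range."""
    doc_id = str(uuid.UUID(document_id))  # raises ValueError if not a UUID
    url = f"{BASE_URL}/files/{doc_id}"
    if expires_in is None:
        return url  # the server default of 3600 seconds applies
    clamped = max(MIN_EXPIRES_IN, min(MAX_EXPIRES_IN, int(expires_in)))
    return f"{url}?expires_in={clamped}"


print(build_file_url("550e8400-e29b-41d4-a716-446655440000", expires_in=86400))
```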
Success Response
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"filename": "invoice-2024.pdf",
"storage_mode": "default",
"file_size_bytes": 45678,
"page_count": 3,
"source_endpoint": "/htmlToPdf",
"url": "https://s3.eu-central-1.amazonaws.com/...",
"signed_url_expires_at": "2024-01-15T12:00:00Z",
"expires_at": "2024-02-15T10:30:00Z",
"created_at": "2024-01-15T10:30:00Z"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Document UUID |
| filename | string | Document filename |
| storage_mode | string | default or byob |
| file_size_bytes | integer | Size of the PDF in bytes |
| page_count | integer | Number of pages |
| source_endpoint | string | API endpoint that generated the PDF |
| url | string | Fresh presigned URL for downloading |
| signed_url_expires_at | string | Expiration time of the signed URL (ISO 8601) |
| expires_at | string | Auto-deletion timestamp if set (ISO 8601) |
| created_at | string | Document creation timestamp (ISO 8601) |
Status Codes
| Code | Description |
|---|---|
| 200 | Success - Document returned with presigned URL |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 403 | Forbidden - Access denied (document belongs to another user) |
| 404 | Not Found - Document not found or was deleted |
| 502 | Bad Gateway - Failed to generate download URL |
Credit Usage
Free (0 credits)
Download File
Download a document’s PDF binary directly by ID. This proxies the file from S3 and returns the raw PDF, rather than a presigned URL.
Endpoint
GET /files/{document_id}/download
Authentication
Requires a valid API key or OAuth token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
See Authentication for details.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of the document to download |
Example Request
# Download PDF directly
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download" \
-H "Authorization: Bearer YOUR_API_KEY" \
--output document.pdf
Success Response
Returns the raw PDF binary with the following headers:
| Header | Value |
|---|---|
| Content-Type | application/pdf |
| Content-Disposition | attachment; filename="original-filename.pdf" |
| Content-Length | File size in bytes |
Status Codes
| Code | Description |
|---|---|
| 200 | Success - PDF binary returned |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 403 | Forbidden - Access denied (document belongs to another user) |
| 404 | Not Found - Document not found or was deleted |
| 502 | Bad Gateway - Failed to download file from storage |
Credit Usage
Free (0 credits)
Notes
- Returns the actual PDF bytes, not a JSON response with a presigned URL
- Useful when your environment cannot access S3 presigned URLs directly (e.g., firewalled VMs, restricted networks)
- The original filename is preserved in the Content-Disposition header
- Only the document owner can download their documents
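To save the download under its original name, a client can read the filename from the Content-Disposition header. The following Python sketch uses only the standard library; filename_from_header and download_pdf are hypothetical helpers, and stripping directory components from the name is a precaution added by this example.

```python
import os
import shutil
import urllib.request
from email.message import Message

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples


def filename_from_header(content_disposition, fallback="document.pdf"):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    name = msg.get_filename()
    # Drop any directory components so the file lands in the target folder.
    return os.path.basename(name) if name else fallback


def download_pdf(document_id, api_key, dest_dir="."):
    """Stream GET /files/{document_id}/download to disk and return the path."""
    request = urllib.request.Request(
        f"{BASE_URL}/files/{document_id}/download",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        header = response.headers.get("Content-Disposition", "")
        path = os.path.join(dest_dir, filename_from_header(header))
        with open(path, "wb") as handle:
            shutil.copyfileobj(response, handle)
    return path


print(filename_from_header('attachment; filename="invoice-2024.pdf"'))
```

Calling download_pdf("550e8400-e29b-41d4-a716-446655440000", "YOUR_API_KEY") would perform the actual request; it is not invoked above so the snippet runs without network access.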
Delete File
Delete a document by ID, removing it from storage.
Endpoint
DELETE /files/{document_id}
Authentication
Requires a valid API key or OAuth token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
See Authentication for details.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of the document to delete |
Example Request
curl -X DELETE "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
-H "Authorization: Bearer YOUR_API_KEY"
Success Response
{
"deleted": true,
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"message": "Document 550e8400-e29b-41d4-a716-446655440000 has been deleted"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| deleted | boolean | True if deletion succeeded |
| document_id | string | UUID of the deleted document |
| message | string | Confirmation message |
Status Codes
| Code | Description |
|---|---|
| 200 | Success - Document deleted |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 403 | Forbidden - Access denied (document belongs to another user) |
| 404 | Not Found - Document not found or was already deleted |
| 502 | Bad Gateway - Failed to delete file from storage |
Credit Usage
Free (0 credits)
Notes
- Deletion is a soft delete - the document metadata is preserved with a deleted_at timestamp but will no longer appear in list queries
- The actual file is removed from S3 storage
- This action cannot be undone
Use Cases
List Recent Documents
Retrieve the most recent documents for display in a dashboard:
curl -X GET "https://api.pdf-mcp.io/files?limit=10" \
-H "Authorization: Bearer YOUR_API_KEY"
Paginate Through All Documents
Retrieve all documents in batches:
# First page
curl -X GET "https://api.pdf-mcp.io/files?limit=50&offset=0" \
-H "Authorization: Bearer YOUR_API_KEY"
# Second page
curl -X GET "https://api.pdf-mcp.io/files?limit=50&offset=50" \
-H "Authorization: Bearer YOUR_API_KEY"
# Continue, increasing offset by limit each time, until offset reaches the total from the response
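The same loop can be written in Python. This is a minimal sketch assuming the response shape shown under List Files; fetch_page and iter_documents are hypothetical names, and the fetch function is passed in so the loop can be exercised without network access.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.pdf-mcp.io"  # host taken from the curl examples


def fetch_page(api_key, limit, offset):
    """Call GET /files once and return the decoded JSON body."""
    query = urlencode({"limit": limit, "offset": offset})
    request = urllib.request.Request(
        f"{BASE_URL}/files?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_documents(fetch, limit=50):
    """Yield every document, calling fetch(limit, offset) until total is reached."""
    offset = 0
    while True:
        page = fetch(limit, offset)
        documents = page["documents"]
        yield from documents
        offset += limit
        # Stop on an empty page as well, in case total changes mid-iteration.
        if not documents or offset >= page["total"]:
            break
```

A caller would bind the API key first, for example: iter_documents(lambda limit, offset: fetch_page("YOUR_API_KEY", limit, offset)).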
Find Documents by Content
Search for invoices containing a specific customer name:
curl -X GET "https://api.pdf-mcp.io/files/search?q=Acme%20Corporation" \
-H "Authorization: Bearer YOUR_API_KEY"
Find Exact Document Reference
Search for a specific invoice number using grep mode:
curl -X GET "https://api.pdf-mcp.io/files/search?q=INV-2024-00123&mode=grep" \
-H "Authorization: Bearer YOUR_API_KEY"
Get Fresh Download URL
Retrieve a document with a custom expiry time for sharing:
# Get 24-hour download link
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=86400" \
-H "Authorization: Bearer YOUR_API_KEY"
Download PDF Directly
Download the PDF file when presigned URLs are not accessible:
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download" \
-H "Authorization: Bearer YOUR_API_KEY" \
--output invoice.pdf
Clean Up Old Documents
Delete documents that are no longer needed:
curl -X DELETE "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
-H "Authorization: Bearer YOUR_API_KEY"
Tips and Best Practices
Efficient Pagination
- Use reasonable page sizes (20-50 documents) for better performance
- Cache the total count to calculate total pages without re-fetching
- Consider filtering by storage_mode if you have documents in both modes
Search Optimization
- Use fulltext mode for natural language queries (e.g., “invoice payment due”)
- Use grep mode for exact identifiers (e.g., “INV-2024-001”, email addresses)
- Keep search queries focused - broader queries return more results but may be less relevant
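The choice between modes can be automated with a simple heuristic. The rule below (a single token containing a digit, @, or underscore is treated as an identifier) is purely an assumption of this sketch and not part of the API; pick_mode is a hypothetical helper.

```python
import re

# Assumed heuristic: one whitespace-free token containing a digit, "@" or "_".
_IDENTIFIER = re.compile(r"^\S*[\d@_]\S*$")


def pick_mode(query):
    """Return 'grep' for identifier-like queries, otherwise 'fulltext'."""
    return "grep" if _IDENTIFIER.match(query.strip()) else "fulltext"


for q in ("INV-2024-001", "billing@example.com", "invoice payment due"):
    print(q, "->", pick_mode(q))
```

Tune the pattern to the identifiers that actually occur in your documents.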
URL Management
- Presigned URLs expire - always check signed_url_expires_at before using
- Request longer expires_in values when sharing URLs externally
- Maximum URL validity is 7 days (604800 seconds)
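Checking signed_url_expires_at can be done with the standard library. In this Python sketch, url_is_usable is a hypothetical helper and the 60-second safety margin is an arbitrary choice for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2024-01-15T12:00:00Z."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def url_is_usable(document, now=None, margin_seconds=60):
    """Return True if the presigned URL stays valid for at least margin_seconds."""
    now = now or datetime.now(timezone.utc)
    expires = parse_timestamp(document["signed_url_expires_at"])
    return now + timedelta(seconds=margin_seconds) < expires


doc = {"signed_url_expires_at": "2024-01-15T12:00:00Z"}
print(url_is_usable(doc, now=datetime(2024, 1, 15, 11, 0, tzinfo=timezone.utc)))
```

When the check fails, requesting the document again through GET /files/{document_id} returns a fresh URL.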
Storage Management
- Documents with expires_at set will be automatically deleted
- Manually delete documents you no longer need to free up storage
- Use search to find old or unused documents for cleanup
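One way to pick cleanup candidates is to filter listed documents by created_at and skip those that already have expires_at set. This Python sketch operates on document objects as returned by GET /files; cleanup_candidates is a hypothetical helper, and treating a missing or null expires_at as "not set" is an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2024-01-15T10:30:00Z."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def cleanup_candidates(documents, older_than_days, now=None):
    """Return IDs of documents older than the cutoff with no auto-deletion set."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=older_than_days)
    return [
        doc["id"]
        for doc in documents
        # Assumption: expires_at is absent or null when auto-deletion is off.
        if not doc.get("expires_at")
        and parse_timestamp(doc["created_at"]) < cutoff
    ]
```

Each returned ID can then be passed to DELETE /files/{document_id}; since deletion cannot be undone, reviewing the list first is advisable.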
Related Endpoints
- HTML to PDF - Generate PDFs that can be stored
- Text to PDF - Generate PDFs from text content
- Image to PDF - Generate PDFs from images
Credit Usage Summary
| Endpoint | Credits |
|---|---|
| GET /files | Free |
| GET /files/search | 0.01 |
| GET /files/{id} | Free |
| GET /files/{id}/download | Free |
| DELETE /files/{id} | Free |