The Files API provides endpoints for managing PDF documents stored in your account. You can list all stored files, search their contents, retrieve individual documents with fresh download URLs, download PDF binaries directly, and delete files when no longer needed.

Overview

When PDFs are generated with storage: "default" or storage: "byob", they are stored and tracked in your account. The Files API lets you:

  • List all stored documents with pagination and filtering
  • Search document contents using full-text search or substring matching
  • Retrieve individual documents with fresh presigned download URLs
  • Download PDF binaries directly when presigned URLs are not accessible
  • Delete documents from storage

List Files

Retrieve a paginated list of all stored documents.

Endpoint

GET /files

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Query Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| limit | integer | No | 50 | Maximum documents to return (max: 100) |
| offset | integer | No | 0 | Number of documents to skip for pagination |
| storage_mode | string | No | - | Filter by storage mode: default or byob |
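
As a rough client-side sketch, the query parameters can be validated before a request is sent. The helper name `build_list_url` is hypothetical, the base URL is taken from the curl examples in this document, and the lower bound of 1 for `limit` is an assumption (only the maximum is documented).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.pdf-mcp.io"   # base URL used by the curl examples
MAX_LIMIT = 100                       # documented maximum for `limit`
STORAGE_MODES = {"default", "byob"}   # documented values for `storage_mode`


def build_list_url(limit=50, offset=0, storage_mode=None):
    """Return a GET /files URL, rejecting values outside the documented ranges."""
    # Assumption: a limit below 1 is not meaningful; only the maximum is documented.
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    params = {"limit": limit, "offset": offset}
    if storage_mode is not None:
        if storage_mode not in STORAGE_MODES:
            raise ValueError("storage_mode must be 'default' or 'byob'")
        params["storage_mode"] = storage_mode
    return f"{BASE_URL}/files?{urlencode(params)}"
```

Validating locally is optional; it only avoids a round trip for values the API would presumably reject.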

Example Request

# List first 50 documents
curl -X GET "https://api.pdf-mcp.io/files" \
  -H "Authorization: Bearer YOUR_API_KEY"

# List with pagination
curl -X GET "https://api.pdf-mcp.io/files?limit=20&offset=40" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Filter by storage mode
curl -X GET "https://api.pdf-mcp.io/files?storage_mode=default" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "documents": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "filename": "invoice-2024.pdf",
      "storage_mode": "default",
      "file_size_bytes": 45678,
      "page_count": 3,
      "source_endpoint": "/htmlToPdf",
      "url": "https://s3.eu-central-1.amazonaws.com/...",
      "signed_url_expires_at": "2024-01-15T12:00:00Z",
      "expires_at": "2024-02-15T10:30:00Z",
      "created_at": "2024-01-15T10:30:00Z"
    }
  ],
  "total": 150,
  "limit": 50,
  "offset": 0
}

Response Fields

| Field | Type | Description |
|---|---|---|
| documents | array | Array of document objects |
| total | integer | Total number of documents matching filter |
| limit | integer | Limit used for this request |
| offset | integer | Offset used for this request |

Document Object

| Field | Type | Description |
|---|---|---|
| id | string | Document UUID |
| filename | string | Document filename |
| storage_mode | string | default or byob |
| file_size_bytes | integer | Size of the PDF in bytes |
| page_count | integer | Number of pages |
| source_endpoint | string | API endpoint that generated the PDF |
| url | string | Presigned URL for downloading |
| signed_url_expires_at | string | Expiration time of the signed URL (ISO 8601) |
| expires_at | string | Auto-deletion timestamp if set (ISO 8601) |
| created_at | string | Document creation timestamp (ISO 8601) |
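
In client code, the document object could be mapped onto a typed record along these lines. This is only a sketch: `Document` and `parse_timestamp` are illustrative names rather than part of the API, and treating `expires_at` and `signed_url_expires_at` as possibly absent is an assumption based on the "if set" wording.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2024-01-15T12:00:00Z; None passes through."""
    if value is None:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class Document:
    id: str
    filename: str
    storage_mode: str
    file_size_bytes: int
    page_count: int
    source_endpoint: str
    url: str
    signed_url_expires_at: Optional[datetime]
    expires_at: Optional[datetime]  # only present when auto-deletion is set
    created_at: datetime

    @classmethod
    def from_json(cls, data):
        """Build a Document from one entry of the `documents` array."""
        return cls(
            id=data["id"],
            filename=data["filename"],
            storage_mode=data["storage_mode"],
            file_size_bytes=data["file_size_bytes"],
            page_count=data["page_count"],
            source_endpoint=data["source_endpoint"],
            url=data["url"],
            signed_url_expires_at=parse_timestamp(data.get("signed_url_expires_at")),
            expires_at=parse_timestamp(data.get("expires_at")),
            created_at=parse_timestamp(data["created_at"]),
        )
```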

Status Codes

| Code | Description |
|---|---|
| 200 | Success - Documents list returned |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 500 | Internal Server Error - Failed to list documents |

Credit Usage

Free (0 credits)


Search Files

Search document contents using full-text search or substring matching.

Endpoint

GET /files/search

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Query Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| q | string | Yes | - | Search query string |
| mode | string | No | fulltext | Search mode: fulltext or grep |
| limit | integer | No | 20 | Maximum results to return (max: 100) |

Search Modes

fulltext (default)

  • Uses PostgreSQL full-text search
  • Returns results ranked by relevance
  • Supports word stemming and language-aware matching
  • Faster for large document collections
  • Returns snippets with match context

grep

  • Case-insensitive substring matching
  • Exact phrase matching
  • Slower than fulltext, especially for large collections
  • No relevance ranking
  • Returns snippets with match context
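
To illustrate how the two modes map onto a request, here is a small sketch that builds the search URL and percent-encodes the query. `build_search_url` is a hypothetical helper; the base URL comes from the curl examples, and the lower bound of 1 for `limit` is an assumption.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.pdf-mcp.io"   # base URL used by the curl examples
SEARCH_MODES = ("fulltext", "grep")   # documented values for `mode`


def build_search_url(query, mode="fulltext", limit=20):
    """Return a GET /files/search URL with the query percent-encoded."""
    if not query:
        raise ValueError("q is required")
    if mode not in SEARCH_MODES:
        raise ValueError("mode must be 'fulltext' or 'grep'")
    # Assumption: limit must be at least 1; only the maximum (100) is documented.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"q": query, "mode": mode, "limit": limit}
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    return f"{BASE_URL}/files/search?{urlencode(params, quote_via=quote)}"
```

For example, `build_search_url("INV-2024-001", mode="grep")` targets exact substring matching, while the default mode suits natural-language queries.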

Example Request

# Full-text search
curl -X GET "https://api.pdf-mcp.io/files/search?q=invoice%20total" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Grep mode for exact substring
curl -X GET "https://api.pdf-mcp.io/files/search?q=INV-2024-001&mode=grep" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Limit results
curl -X GET "https://api.pdf-mcp.io/files/search?q=contract&limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "results": [
    {
      "document": {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "filename": "invoice-2024.pdf",
        "storage_mode": "default",
        "file_size_bytes": 45678,
        "page_count": 3,
        "source_endpoint": "/htmlToPdf",
        "url": "https://s3.eu-central-1.amazonaws.com/...",
        "signed_url_expires_at": "2024-01-15T12:00:00Z",
        "expires_at": "2024-02-15T10:30:00Z",
        "created_at": "2024-01-15T10:30:00Z"
      },
      "rank": 0.85,
      "snippet": "...total amount due: $1,234.56. Invoice date..."
    }
  ],
  "total": 5,
  "query": "invoice total",
  "mode": "fulltext"
}

Response Fields

| Field | Type | Description |
|---|---|---|
| results | array | Array of search hit objects |
| total | integer | Total number of matching documents |
| query | string | The search query used |
| mode | string | The search mode used |

Search Hit Object

| Field | Type | Description |
|---|---|---|
| document | object | Document metadata with presigned URL |
| rank | number | Relevance score (fulltext mode only, null for grep) |
| snippet | string | Text snippet showing match context |
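
Because `rank` is null in grep mode, code that displays hits should not assume a number. A minimal sketch, with `describe_hit` as a hypothetical helper name:

```python
def describe_hit(hit):
    """Format one search hit as a single line, tolerating a null rank."""
    rank = hit.get("rank")
    # grep mode returns no relevance score, so show a placeholder instead.
    score = "n/a" if rank is None else f"{rank:.2f}"
    return f'{hit["document"]["filename"]} (rank {score}): {hit["snippet"]}'
```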

Status Codes

| Code | Description |
|---|---|
| 200 | Success - Search results returned |
| 400 | Bad Request - Missing query or invalid mode |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 500 | Internal Server Error - Search failed |

Credit Usage

0.01 credits per search

Notes

  • Search is performed on the markdown representation of PDF content extracted during generation
  • Documents without text content (e.g., image-based PDFs) will not appear in search results
  • Only documents owned by the authenticated user are searched

Get File

Retrieve a single document by ID with a fresh presigned download URL.

Endpoint

GET /files/{document_id}

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of the document to retrieve |

Query Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| expires_in | integer | No | 3600 | Signed URL expiry in seconds (min: 60, max: 604800) |
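
The `expires_in` bounds can be checked on the client before the request is made. The following is a sketch only; `build_file_url` is a hypothetical helper and the base URL is taken from the curl examples.

```python
BASE_URL = "https://api.pdf-mcp.io"   # base URL used by the curl examples
MIN_EXPIRES_IN = 60                   # documented minimum, in seconds
MAX_EXPIRES_IN = 604800               # documented maximum (7 days), in seconds


def build_file_url(document_id, expires_in=None):
    """Return a GET /files/{document_id} URL, optionally with expires_in."""
    url = f"{BASE_URL}/files/{document_id}"
    if expires_in is None:
        # Omitting the parameter leaves the documented default of 3600 seconds.
        return url
    if not MIN_EXPIRES_IN <= expires_in <= MAX_EXPIRES_IN:
        raise ValueError(
            f"expires_in must be between {MIN_EXPIRES_IN} and {MAX_EXPIRES_IN}"
        )
    return f"{url}?expires_in={expires_in}"
```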

Example Request

# Get document with default 1-hour URL
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get document with 24-hour URL
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=86400" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get document with 7-day URL (maximum)
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=604800" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "filename": "invoice-2024.pdf",
  "storage_mode": "default",
  "file_size_bytes": 45678,
  "page_count": 3,
  "source_endpoint": "/htmlToPdf",
  "url": "https://s3.eu-central-1.amazonaws.com/...",
  "signed_url_expires_at": "2024-01-15T12:00:00Z",
  "expires_at": "2024-02-15T10:30:00Z",
  "created_at": "2024-01-15T10:30:00Z"
}

Response Fields

| Field | Type | Description |
|---|---|---|
| id | string | Document UUID |
| filename | string | Document filename |
| storage_mode | string | default or byob |
| file_size_bytes | integer | Size of the PDF in bytes |
| page_count | integer | Number of pages |
| source_endpoint | string | API endpoint that generated the PDF |
| url | string | Fresh presigned URL for downloading |
| signed_url_expires_at | string | Expiration time of the signed URL (ISO 8601) |
| expires_at | string | Auto-deletion timestamp if set (ISO 8601) |
| created_at | string | Document creation timestamp (ISO 8601) |

Status Codes

| Code | Description |
|---|---|
| 200 | Success - Document returned with presigned URL |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 403 | Forbidden - Access denied (document belongs to another user) |
| 404 | Not Found - Document not found or was deleted |
| 502 | Bad Gateway - Failed to generate download URL |
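
One possible way to surface these status codes in client code is to map them onto a single exception type. This is a sketch under assumptions: `FilesApiError` and `raise_for_status` are hypothetical names, and the format of any error body returned by the API is not documented here, so only the status code is used.

```python
class FilesApiError(Exception):
    """Raised for any non-200 response (hypothetical helper, not part of the API)."""

    def __init__(self, status, reason):
        super().__init__(f"{status}: {reason}")
        self.status = status
        self.reason = reason


# Reasons copied from the Get File status code table.
GET_FILE_ERRORS = {
    401: "Missing or invalid Authorization header",
    403: "Access denied (document belongs to another user)",
    404: "Document not found or was deleted",
    502: "Failed to generate download URL",
}


def raise_for_status(status):
    """Return None on 200; otherwise raise FilesApiError with a readable reason."""
    if status == 200:
        return
    raise FilesApiError(status, GET_FILE_ERRORS.get(status, "Unexpected status"))
```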

Credit Usage

Free (0 credits)


Download File

Download a document’s PDF binary directly by ID. This proxies the file from S3 and returns the raw PDF, rather than a presigned URL.

Endpoint

GET /files/{document_id}/download

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of the document to download |

Example Request

# Download PDF directly
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output document.pdf

Success Response

Returns the raw PDF binary with the following headers:

| Header | Value |
|---|---|
| Content-Type | application/pdf |
| Content-Disposition | attachment; filename="original-filename.pdf" |
| Content-Length | File size in bytes |
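
A client that saves the download may want to reuse the filename from Content-Disposition. The sketch below uses Python's standard `email.message.Message` header parser; `filename_from_headers` and the fallback name are hypothetical, and taking only the base name is a defensive choice rather than something the API requires.

```python
import os
from email.message import Message


def filename_from_headers(headers, fallback="document.pdf"):
    """Extract the filename from a Content-Disposition header, if present."""
    value = headers.get("Content-Disposition")
    if not value:
        return fallback
    # Message parses parameterised headers such as: attachment; filename="x.pdf"
    message = Message()
    message["Content-Disposition"] = value
    name = message.get_filename()
    if not name:
        return fallback
    # Keep only the final path component so a header cannot point outside
    # the directory the caller chose to save into.
    return os.path.basename(name) or fallback
```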

Status Codes

| Code | Description |
|---|---|
| 200 | Success - PDF binary returned |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 403 | Forbidden - Access denied (document belongs to another user) |
| 404 | Not Found - Document not found or was deleted |
| 502 | Bad Gateway - Failed to download file from storage |

Credit Usage

Free (0 credits)

Notes

  • Returns the actual PDF bytes, not a JSON response with a presigned URL
  • Useful when your environment cannot access S3 presigned URLs directly (e.g., firewalled VMs, restricted networks)
  • The original filename is preserved in the Content-Disposition header
  • Only the document owner can download their documents

Delete File

Delete a document by ID, removing it from storage.

Endpoint

DELETE /files/{document_id}

Authentication

Requires a valid API key or OAuth token in the Authorization header:

Authorization: Bearer YOUR_API_KEY

See Authentication for details.

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of the document to delete |

Example Request

curl -X DELETE "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer YOUR_API_KEY"

Success Response

{
  "deleted": true,
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Document 550e8400-e29b-41d4-a716-446655440000 has been deleted"
}

Response Fields

| Field | Type | Description |
|---|---|---|
| deleted | boolean | True if deletion succeeded |
| document_id | string | UUID of the deleted document |
| message | string | Confirmation message |

Status Codes

| Code | Description |
|---|---|
| 200 | Success - Document deleted |
| 401 | Unauthorized - Missing or invalid Authorization header |
| 403 | Forbidden - Access denied (document belongs to another user) |
| 404 | Not Found - Document not found or was already deleted |
| 502 | Bad Gateway - Failed to delete file from storage |

Credit Usage

Free (0 credits)

Notes

  • Deletion is a soft delete: the document metadata is kept with a deleted_at timestamp, but the document no longer appears in list queries
  • The file itself is removed from S3 storage
  • Because the file is removed, this action cannot be undone

Use Cases

List Recent Documents

Retrieve the most recent documents for display in a dashboard:

curl -X GET "https://api.pdf-mcp.io/files?limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

Paginate Through All Documents

Retrieve all documents in batches:

# First page
curl -X GET "https://api.pdf-mcp.io/files?limit=50&offset=0" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Second page
curl -X GET "https://api.pdf-mcp.io/files?limit=50&offset=50" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Continue until total is reached...
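
The same loop can be sketched in Python. Here `iter_documents` and `make_http_fetcher` are hypothetical helpers, not part of any official client; the page-fetching function is passed in so the loop can be exercised without network access.

```python
import json
from urllib.request import Request, urlopen


def iter_documents(fetch_page, page_size=50):
    """Yield every document, requesting pages until `total` is reached."""
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        documents = page["documents"]
        yield from documents
        offset += len(documents)
        # Stop once everything has been seen, or on an empty page so that a
        # collection shrinking mid-iteration cannot cause an endless loop.
        if not documents or offset >= page["total"]:
            break


def make_http_fetcher(api_key, base_url="https://api.pdf-mcp.io"):
    """Return a fetch_page function that calls GET /files over HTTPS."""
    def fetch_page(limit, offset):
        request = Request(
            f"{base_url}/files?limit={limit}&offset={offset}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urlopen(request) as response:
            return json.load(response)
    return fetch_page
```

A call such as `for doc in iter_documents(make_http_fetcher("YOUR_API_KEY")): ...` would then walk the whole collection one page at a time.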

Find Documents by Content

Search for invoices containing a specific customer name:

curl -X GET "https://api.pdf-mcp.io/files/search?q=Acme%20Corporation" \
  -H "Authorization: Bearer YOUR_API_KEY"

Find Exact Document Reference

Search for a specific invoice number using grep mode:

curl -X GET "https://api.pdf-mcp.io/files/search?q=INV-2024-00123&mode=grep" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get Fresh Download URL

Retrieve a document with a custom expiry time for sharing:

# Get 24-hour download link
curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000?expires_in=86400" \
  -H "Authorization: Bearer YOUR_API_KEY"

Download PDF Directly

Download the PDF file when presigned URLs are not accessible:

curl -X GET "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000/download" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output invoice.pdf

Clean Up Old Documents

Delete documents that are no longer needed:

curl -X DELETE "https://api.pdf-mcp.io/files/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer YOUR_API_KEY"

Tips and Best Practices

Efficient Pagination

  • Use reasonable page sizes (20-50 documents) for better performance
  • Cache the total value from the first response to compute the number of pages without re-fetching
  • Consider filtering by storage_mode if you have documents in both modes

Search Optimization

  • Use fulltext mode for natural language queries (e.g., “invoice payment due”)
  • Use grep mode for exact identifiers (e.g., “INV-2024-001”, email addresses)
  • Keep search queries focused - broader queries return more results but may be less relevant

URL Management

  • Presigned URLs expire, so always check signed_url_expires_at before using one
  • Request longer expires_in values when sharing URLs externally
  • Maximum URL validity is 7 days (604800 seconds)
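
A freshness check on `signed_url_expires_at` might look like the sketch below. `url_is_usable` is a hypothetical helper, and the 30-second safety margin is an arbitrary assumption meant to allow for clock skew and transfer start-up time.

```python
from datetime import datetime, timedelta, timezone


def url_is_usable(document, now=None, safety_margin=timedelta(seconds=30)):
    """Return True if the document's presigned URL has not (nearly) expired."""
    now = now or datetime.now(timezone.utc)
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    expires_at = datetime.fromisoformat(
        document["signed_url_expires_at"].replace("Z", "+00:00")
    )
    return now + safety_margin < expires_at
```

When the check fails, requesting the document again through GET /files/{document_id} returns a fresh URL.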

Storage Management

  • Documents with expires_at set are deleted automatically at that time
  • Manually delete documents you no longer need to free up storage
  • Use search to find old or unused documents for cleanup
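
As one possible cleanup aid, documents returned by GET /files can be filtered by age on the client. This is only a sketch: `select_stale` is a hypothetical helper, and it relies solely on the documented `created_at` field. It only selects candidate IDs; since deletion cannot be undone, the actual DELETE calls are left to the caller.

```python
from datetime import datetime, timedelta, timezone


def select_stale(documents, max_age, now=None):
    """Return the IDs of documents created more than `max_age` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    stale_ids = []
    for doc in documents:
        # Older Python versions do not accept a trailing "Z" in fromisoformat().
        created = datetime.fromisoformat(doc["created_at"].replace("Z", "+00:00"))
        if created < cutoff:
            stale_ids.append(doc["id"])
    return stale_ids
```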

Credit Usage Summary

| Endpoint | Credits |
|---|---|
| GET /files | Free |
| GET /files/search | 0.01 |
| GET /files/{id} | Free |
| GET /files/{id}/download | Free |
| DELETE /files/{id} | Free |