pdf-mcp: PDF Generation Service

Generate high-quality PDF documents from HTML content via API or MCP Server.

Authentication

All requests require the header: Authorization: Bearer YOUR_API_KEY

Input Modes

All endpoints accept either a JSON body (application/json) or a multipart form upload. For PDF inputs, provide either pdf_base64 (the base64-encoded PDF) or pdf_url (a URL the service fetches the PDF from).

API Endpoints

Base URL: https://api.pdf-mcp.io

POST /htmlToPdf

Convert HTML to PDF.

{"html": "<html>...</html>", "css": "body { font-family: Arial, sans-serif; }", "filename": "doc.pdf", "base_url": "https://example.com"}

html (required): HTML content
css: Additional CSS styles
filename: Output filename (default: document.pdf)
base_url: Base URL for relative URLs
return_binary: Return raw PDF bytes instead of JSON (default: false)
storage: Storage configuration (see Storage)
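
A minimal sketch of preparing this call from Python with only the standard library. The helper name build_html_to_pdf_request is hypothetical, and only the fields documented above are used. The JSON response format is not described in this section, so the sketch builds the request and leaves sending and response handling to the caller.

```python
import json
import urllib.request

BASE_URL = "https://api.pdf-mcp.io"  # base URL given in this guide


def build_html_to_pdf_request(api_key, html, css=None, filename="document.pdf"):
    """Build (but do not send) a POST /htmlToPdf request.

    Hypothetical helper for illustration; not part of any official client.
    """
    payload = {"html": html, "filename": filename}
    if css:
        payload["css"] = css
    return urllib.request.Request(
        url=f"{BASE_URL}/htmlToPdf",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_html_to_pdf_request(
    "YOUR_API_KEY",
    "<html><body><h1>Hello</h1></body></html>",
    css="body { font-family: Arial, sans-serif; }",
)
# To send it: urllib.request.urlopen(req). How to read the result depends on
# return_binary and the storage mode, which this sketch does not assume.
```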

POST /extractText

Extract text from PDF. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pages": [1,2,3]}

Returns: {"text": "...", "pages": [{"page": 1, "text": "..."}], "total_pages": 10}
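
The sketch below shows one way to prepare the request body from raw PDF bytes and to index the documented response by page number. The helper names and the sample response values are illustrative assumptions; only the field names come from this guide.

```python
import base64


def build_extract_text_payload(pdf_bytes, pages=None):
    """Encode raw PDF bytes as pdf_base64 and optionally limit the pages."""
    payload = {"pdf_base64": base64.b64encode(pdf_bytes).decode("ascii")}
    if pages:
        payload["pages"] = list(pages)
    return payload


def text_by_page(response):
    """Map page number -> text using the documented "pages" array."""
    return {entry["page"]: entry["text"] for entry in response["pages"]}


# Placeholder bytes; in practice read them with open(path, "rb").read().
payload = build_extract_text_payload(b"%PDF-1.4 placeholder bytes", pages=[1, 2])

# Made-up response that follows the documented shape.
sample_response = {
    "text": "Hello\nWorld",
    "pages": [{"page": 1, "text": "Hello"}, {"page": 2, "text": "World"}],
    "total_pages": 10,
}
pages = text_by_page(sample_response)
```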

POST /pdfToImage

Convert PDF pages to images. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pages": [1,2], "format": "png", "dpi": 200}

POST /textToPdf

Convert plain text to PDF.

{"text": "Plain text content", "font_size": 12, "filename": "doc.pdf"}

POST /imageToPdf

Convert images to multi-page PDF.

{"images_base64": ["img1...", "img2..."], "filename": "images.pdf"}

POST /pageCount

Get PDF page count. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "..."}

Returns: {"page_count": 10}

POST /extractPages

Extract or rearrange pages. The pages array sets the output order, and a page may appear more than once. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pages": [1,3,5,2], "filename": "extracted.pdf"}
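
Because the pages array controls both selection and order, common rearrangements reduce to building a list. The sketch below assumes page numbers are 1-based, as the examples in this guide suggest. The helper names and the example.com URL are placeholders.

```python
def reversed_order(page_count):
    """Page list that reverses a document, e.g. 3 pages -> [3, 2, 1]."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return list(range(page_count, 0, -1))


def duplicate_cover(page_count, copies=2):
    """Page list that repeats page 1 `copies` times, then the remaining pages."""
    return [1] * copies + list(range(2, page_count + 1))


def build_extract_pages_payload(pdf_url, pages, filename="extracted.pdf"):
    """Request body for POST /extractPages using the documented fields."""
    return {"pdf_url": pdf_url, "pages": list(pages), "filename": filename}


# page_count would normally come from POST /pageCount.
payload = build_extract_pages_payload(
    "https://example.com/report.pdf", reversed_order(4)
)
```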

POST /mergePdfs

Merge multiple PDFs.

{"pdfs_base64": ["pdf1...", "pdf2..."], "filename": "merged.pdf"}

POST /grepPdf

Search text within a PDF. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pattern": "search term", "regex": false, "ignore_case": true, "pages": [1,2], "context": 100}

pattern (required): Search pattern (plain text or regex)
regex: Treat pattern as regex (default: false)
ignore_case: Case-insensitive search (default: true)
pages: Limit search to specific pages
context: Characters of context around matches (default: 100)
count_only: Only return match counts (default: false)

Returns: {"matches": [{"page": 1, "match_count": 3, "matches": [...]}], "total_matches": 3, "pages_with_matches": 1, "total_pages": 10}
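
A short sketch of building a search request and summarizing the documented response. The helper names, the example.com URL, and the sample response values are illustrative assumptions. The regex dialect the service uses is not stated here, so the pattern shown is only an example.

```python
def build_grep_payload(pdf_url, pattern, regex=False, ignore_case=True,
                       pages=None, context=100):
    """Request body for POST /grepPdf using the documented fields."""
    payload = {
        "pdf_url": pdf_url,
        "pattern": pattern,
        "regex": regex,
        "ignore_case": ignore_case,
        "context": context,
    }
    if pages:
        payload["pages"] = list(pages)
    return payload


def hits_per_page(response):
    """Map page number -> match_count from the outer "matches" array."""
    return {entry["page"]: entry["match_count"] for entry in response["matches"]}


payload = build_grep_payload(
    "https://example.com/contract.pdf", r"INV-\d{4}-\d{3}", regex=True
)

# Made-up response following the documented shape; the per-match entries,
# elided in this guide, are left out.
sample_response = {
    "matches": [{"page": 1, "match_count": 3}],
    "total_matches": 3,
    "pages_with_matches": 1,
    "total_pages": 10,
}
counts = hits_per_page(sample_response)
```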

Storage

All PDF-generating endpoints support a storage object:

{"storage": {"mode": "memory", "filename": "custom.pdf", "expires_in": 3600, "retention_days": 7}}

mode: "memory" (return bytes, no persistence), "default" (store in cloud), "byob" (your S3 bucket)
filename: Override output filename
expires_in: Signed URL expiry in seconds (default: 3600)
retention_days: Auto-delete after N days
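
One way to assemble the storage object is sketched below. It checks the mode locally and omits unset options so that the service defaults apply. The helper name storage_config is hypothetical, and which options the service honors for each mode is not specified here.

```python
VALID_MODES = ("memory", "default", "byob")


def storage_config(mode, filename=None, expires_in=None, retention_days=None):
    """Build a storage object, leaving out any option that was not given."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    config = {"mode": mode}
    optional = {
        "filename": filename,
        "expires_in": expires_in,
        "retention_days": retention_days,
    }
    config.update({key: value for key, value in optional.items() if value is not None})
    return config


# Example: store the generated PDF in the cloud and delete it after a week.
request_body = {
    "html": "<html><body><p>Stored report</p></body></html>",
    "storage": storage_config("default", filename="report.pdf", retention_days=7),
}
```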

Document Management

Manage stored documents (created with storage.mode = "default" or "byob").

GET /files

List stored documents.

limit: Max results (default: 50, max: 100)
offset: Pagination offset (default: 0)
storage_mode: Filter by "default" or "byob"
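
Paging through stored documents comes down to stepping offset by limit. The sketch below only builds the request URLs. The response format of GET /files is not described in this section, so total is assumed to be a count the caller already knows, and the lower bound of 1 for limit is also an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.pdf-mcp.io"  # base URL given in this guide


def files_url(limit=50, offset=0, storage_mode=None):
    """URL for GET /files with the documented query parameters."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must not be negative")
    params = {"limit": limit, "offset": offset}
    if storage_mode is not None:
        params["storage_mode"] = storage_mode
    return f"{BASE_URL}/files?{urlencode(params)}"


def page_urls(total, limit=50):
    """One URL per page needed to cover `total` documents."""
    return [files_url(limit=limit, offset=start) for start in range(0, total, limit)]


urls = page_urls(total=120, limit=50)  # offsets 0, 50, 100
```

Each request still needs the Authorization header described under Authentication.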

GET /files/search

Search documents by content.

q (required): Search query
mode: "fulltext" or "grep" (default: fulltext)
limit: Max results (default: 20, max: 100)

GET /files/{document_id}

Get document metadata and a fresh signed URL.

expires_in: Signed URL expiry in seconds (default: 3600, max: 604800)

GET /files/{document_id}/download

Download the PDF binary directly.

DELETE /files/{document_id}

Delete a stored document.

CSS Page Setup

@page {
  size: A4;                /* A4, Letter, Legal, or 210mm 297mm */
  margin: 2cm;
}

@page :first { margin-top: 5cm; }  /* First page different */
@page :left { margin-left: 3cm; }  /* Binding margins */
@page :right { margin-right: 3cm; }

Sizes: A4 (210x297mm), Letter (8.5x11in), Legal (8.5x14in), A3, A5

Page Break Control

Use both modern and legacy properties:

/* Prevent breaks inside */
.no-break {
  break-inside: avoid;
  page-break-inside: avoid;
}

/* Force break after */
.page-break {
  break-after: page;
  page-break-after: always;
}

/* Force break before */
.chapter {
  break-before: page;
  page-break-before: always;
}

/* Keep headings with content */
h1, h2, h3 {
  break-after: avoid;
  page-break-after: avoid;
}

/* CRITICAL: Reset floats - page breaks only work on block elements */
@media print { * { float: none !important; } }

Tables: Multi-Page Handling

table { width: 100%; border-collapse: collapse; }
thead { display: table-header-group; }  /* Repeat headers */
tfoot { display: table-footer-group; }
tr { break-inside: avoid; page-break-inside: avoid; }
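
When table rows come from data, the markup can be generated so that the header row always sits inside thead, which the rule above needs in order to repeat it on each page. This is a minimal sketch; render_table and the sample rows are illustrative, not part of the service.

```python
from html import escape

TABLE_CSS = """
table { width: 100%; border-collapse: collapse; }
thead { display: table-header-group; }
tfoot { display: table-footer-group; }
tr { break-inside: avoid; page-break-inside: avoid; }
"""


def render_table(headers, rows):
    """Render an HTML table with an explicit thead and escaped cell text."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


rows = [("Widgets & gears", 10, "$150.00"), ("License", 1, "$500.00")]
table_html = render_table(["Item", "Qty", "Price"], rows)
html_doc = (
    f"<html><head><style>{TABLE_CSS}</style></head>"
    f"<body>{table_html}</body></html>"
)
```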

Images

Embed as base64 or use absolute URLs:

<img src="data:image/png;base64,..." alt="...">
<img src="https://example.com/image.png" alt="...">

img { max-width: 100%; break-inside: avoid; page-break-inside: avoid; }
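
A small sketch of producing the base64 form from image bytes in Python. The helper names are hypothetical, and the bytes used here are only the PNG file signature standing in for a real image.

```python
import base64
from html import escape


def image_data_uri(image_bytes, mime_type="image/png"):
    """Encode image bytes as a data: URI suitable for an img src attribute."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(image_bytes, alt, mime_type="image/png"):
    """Build an <img> tag with an embedded image and escaped alt text."""
    src = image_data_uri(image_bytes, mime_type)
    return f'<img src="{src}" alt="{escape(alt, quote=True)}">'


# Stand-in bytes (the PNG signature only); in practice use
# pathlib.Path("logo.png").read_bytes() or similar.
png_signature = b"\x89PNG\r\n\x1a\n"
tag = img_tag(png_signature, "Company logo")
```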

Charts and Data Visualizations

No JavaScript is executed, so chart libraries such as Chart.js, D3, Plotly.js, and Highcharts will not render. Prerender charts as SVG or images before embedding them in the HTML.

Best options:

Inline SVG — vector, scalable, CSS-styleable. Embed <svg> directly in HTML.
Base64 images — <img src="data:image/png;base64,..."> for pre-exported charts.
Hosted image URLs — <img src="https://..."> (use base_url for relative paths).
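
For simple charts, inline SVG can also be written by hand without a charting library. The sketch below is one possible approach that assumes at least one positive value. The function name, dimensions, and sample data are illustrative.

```python
from html import escape


def bar_chart_svg(labels, values, width=400, height=200, color="#3498db"):
    """Return a basic bar chart as an inline SVG string."""
    pad = 20
    slot = (width - 2 * pad) / len(values)  # horizontal space per bar
    peak = max(values)                      # tallest bar fills the plot area
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}" width="{width}" height="{height}">'
    ]
    for index, (label, value) in enumerate(zip(labels, values)):
        bar_height = (height - 2 * pad) * value / peak
        x = pad + index * slot
        y = height - pad - bar_height
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot * 0.8:.1f}" '
            f'height="{bar_height:.1f}" fill="{color}"/>'
        )
        parts.append(
            f'<text x="{x + slot * 0.4:.1f}" y="{height - 5}" font-size="10" '
            f'text-anchor="middle">{escape(str(label))}</text>'
        )
    parts.append("</svg>")
    return "".join(parts)


svg = bar_chart_svg(["Q1", "Q2", "Q3", "Q4"], [120, 180, 240, 200])
chart_html = f'<div class="chart no-break">{svg}</div>'
```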

If you can run Python, use Plotly to generate charts and export as static SVG/PNG:

import plotly.graph_objects as go
import plotly.io as pio
import base64

fig = go.Figure(data=[go.Bar(x=["Q1","Q2","Q3","Q4"], y=[120,180,240,200], marker_color="#3498db")])
fig.update_layout(title="Revenue", width=600, height=350, margin=dict(l=40,r=20,t=50,b=40))

# SVG string — embed directly in HTML
svg_str = pio.to_image(fig, format="svg").decode("utf-8")

# Or base64 PNG
png_b64 = base64.b64encode(pio.to_image(fig, format="png", scale=2)).decode("utf-8")

Embed in HTML:

<div class="chart no-break">{svg_str}</div>
<!-- or -->
<div class="chart no-break">
  <img src="data:image/png;base64,{png_b64}" alt="Revenue" style="width:100%;max-width:600px;" />
</div>

The Python export above requires: pip install plotly kaleido

.chart { break-inside: avoid; page-break-inside: avoid; margin: 1cm 0; text-align: center; }
.chart svg, .chart img { max-width: 100%; height: auto; }

For the full charts guide (SVG examples, prerendering strategies, pitfalls) and advanced CSS print styling (margin boxes, headers/footers, page counters, named pages, widows/orphans), see the CSS Print Styling Guide (docs://print-guide, or https://pdf-mcp.io/raw/PRINT.md).

Example 1: Invoice

<!DOCTYPE html>
<html>
<head>
<style>
@page { size: A4; margin: 2cm; }
body { font-family: Arial, sans-serif; font-size: 10pt; }
table { width: 100%; border-collapse: collapse; margin: 1cm 0; }
th, td { border: 1px solid #ddd; padding: 10px; }
th { background: #f5f5f5; }
.amount { text-align: right; }
thead { display: table-header-group; }
tr { break-inside: avoid; page-break-inside: avoid; }
</style>
</head>
<body>
  <h1>INVOICE #INV-2024-001</h1>
  <p><strong>Date:</strong> January 15, 2024</p>

  <h3>Bill To:</h3>
  <p>Client Name<br>456 Client Ave<br>City, State 67890</p>

  <table>
    <thead>
      <tr><th>Description</th><th>Qty</th><th class="amount">Price</th><th class="amount">Amount</th></tr>
    </thead>
    <tbody>
      <tr><td>Professional Services</td><td>10</td><td class="amount">$150.00</td><td class="amount">$1,500.00</td></tr>
      <tr><td>Software License</td><td>1</td><td class="amount">$500.00</td><td class="amount">$500.00</td></tr>
      <tr style="font-weight:bold;background:#f0f0f0"><td colspan="3">Total</td><td class="amount">$2,000.00</td></tr>
    </tbody>
  </table>
</body>
</html>

Example 2: Report with Chapters

<!DOCTYPE html>
<html>
<head>
<style>
@page { size: Letter; margin: 1in; }
@page :first { margin-top: 2in; }
body { font-family: Georgia, serif; font-size: 12pt; line-height: 1.6; }
h1 { text-align: center; margin-bottom: 2cm; }
h2 { break-after: avoid; page-break-after: avoid; }
.chapter { break-before: page; page-break-before: always; }
.no-break { break-inside: avoid; page-break-inside: avoid; }
p { orphans: 3; widows: 3; }
</style>
</head>
<body>
  <h1>Annual Report 2024</h1>

  <section class="chapter">
    <h2>Executive Summary</h2>
    <p>This report provides comprehensive analysis...</p>
    <div class="no-break">
      <h3>Key Findings</h3>
      <ul>
        <li>Revenue increased 25%</li>
        <li>Customer satisfaction at 92%</li>
      </ul>
    </div>
  </section>

  <section class="chapter">
    <h2>Financial Overview</h2>
    <p>Detailed financial analysis...</p>
  </section>
</body>
</html>

Common Pitfalls

No viewport units in @page - Never use vh, vw, vmin, vmax
Reset floats - Page breaks don't work on floated elements
Apply break-inside to containers - Not just content
Use absolute URLs or base_url - Relative URLs need base_url parameter
Set max-width on images - Prevent overflow
Include legacy properties - Always use both break-* and page-break-*
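
Two of these pitfalls can be caught with a rough check before sending HTML to the API. The sketch below is a regex heuristic, not a CSS parser. It will miss @page rules that contain nested blocks, and the function name lint_html is hypothetical.

```python
import re

VIEWPORT_UNIT = re.compile(r"\d(?:vh|vw|vmin|vmax)\b")
PAGE_RULE = re.compile(r"@page[^{]*\{([^}]*)\}")
# src/href values that are not absolute http(s), data:, anchor, or mailto links
RELATIVE_URL = re.compile(r'(?:src|href)="(?!https?:|data:|#|mailto:)([^"]+)"')


def lint_html(html, has_base_url=False):
    """Return warnings for viewport units in @page and unresolved relative URLs."""
    warnings = []
    for rule_body in PAGE_RULE.findall(html):
        if VIEWPORT_UNIT.search(rule_body):
            warnings.append("viewport unit inside @page")
    if not has_base_url:
        for url in RELATIVE_URL.findall(html):
            warnings.append(f"relative URL without base_url: {url}")
    return warnings


sample = '<style>@page { margin: 10vh; }</style><img src="logo.png" alt="Logo">'
problems = lint_html(sample)
```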

MCP Server

Connect AI agents via MCP at /mcp:

https://api.pdf-mcp.io/mcp

Tools: html_to_pdf, extract_text, pdf_to_image, text_to_pdf, image_to_pdf, get_page_count, extract_pages, merge_pdfs, grep_pdf, list_documents, get_document, download_document, delete_document, search_documents

Resources (read via read_resource()):

docs://skill-guide — This document (API endpoints, examples, CSS tips)
docs://print-guide — Full CSS print styling reference

Support

Docs: https://pdf-mcp.io/docs
CSS Print Guide (raw): https://pdf-mcp.io/raw/PRINT.md
API Skill Guide (raw): https://pdf-mcp.io/raw/SKILL.md
Support: support@pdf-mcp.io