
pdf-mcp: PDF Generation Service

Generate high-quality PDF documents from HTML content via API or MCP Server.

Authentication

All requests require the header: Authorization: Bearer YOUR_API_KEY
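
As a minimal sketch, the header can be attached with Python's standard library. The build_request helper name is hypothetical; the base URL and the /pageCount endpoint are the ones listed under API Endpoints. Nothing is sent until the request object is passed to urllib.request.urlopen.

```python
import json
import urllib.request

BASE_URL = "https://api.pdf-mcp.io"  # base URL listed under API Endpoints


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{BASE_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("/pageCount", {"pdf_url": "https://example.com/doc.pdf"}, "YOUR_API_KEY")
# To send it with a real key: urllib.request.urlopen(req)
```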

Input Modes

All endpoints accept JSON (application/json) and multipart form data (multipart/form-data, for file uploads). For PDF inputs, provide either pdf_base64 (the base64-encoded PDF) or pdf_url (a URL to fetch the PDF from).
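
The two PDF input modes can be sketched as a small helper that produces the right JSON field. The pdf_input name is hypothetical, and rejecting the case where both inputs are supplied is an assumption of this sketch rather than documented service behavior.

```python
import base64
from typing import Optional


def pdf_input(pdf_bytes: Optional[bytes] = None, pdf_url: Optional[str] = None) -> dict:
    """Return the JSON field for a PDF input: pdf_base64 or pdf_url."""
    # Assumption: exactly one of the two inputs should be supplied.
    if (pdf_bytes is None) == (pdf_url is None):
        raise ValueError("provide exactly one of pdf_bytes or pdf_url")
    if pdf_url is not None:
        return {"pdf_url": pdf_url}
    return {"pdf_base64": base64.b64encode(pdf_bytes).decode("ascii")}


by_url = pdf_input(pdf_url="https://example.com/doc.pdf")
by_bytes = pdf_input(pdf_bytes=b"%PDF-1.4")  # normally the contents of a PDF file
```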

API Endpoints

Base URL: https://api.pdf-mcp.io

POST /htmlToPdf

Convert HTML to PDF.

{"html": "<html>...</html>", "css": "body{font-family:Arial}", "filename": "doc.pdf", "base_url": "https://example.com"}
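
HTML usually contains double quotes and newlines, so it is safer to build the request body with a JSON serializer than by hand. A minimal sketch using the fields from the example above; the sample HTML and CSS values are illustrative.

```python
import json

html = '<html><body><h1 class="title">Hello</h1>\n<p>World</p></body></html>'

payload = {
    "html": html,
    "css": "body { font-family: Arial, sans-serif; }",
    "filename": "doc.pdf",
    "base_url": "https://example.com",
}

# json.dumps escapes the quotes and the newline inside the HTML string.
body = json.dumps(payload)
```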

POST /extractText

Extract text from PDF. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pages": [1,2,3]}

Returns: {"text": "...", "pages": [{"page": 1, "text": "..."}], "total_pages": 10}
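
Given the response shape shown above, per-page text can be indexed by page number. A minimal sketch; the text_by_page name and the sample values are illustrative, not real service output.

```python
def text_by_page(response: dict) -> dict:
    """Map each page number to its extracted text."""
    return {entry["page"]: entry["text"] for entry in response["pages"]}


# Illustrative response following the documented shape.
sample = {
    "text": "Hello\nWorld",
    "pages": [{"page": 1, "text": "Hello"}, {"page": 2, "text": "World"}],
    "total_pages": 10,
}

pages = text_by_page(sample)
```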

POST /pdfToImage

Convert PDF pages to images. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pages": [1,2], "format": "png", "dpi": 200}

POST /textToPdf

Convert plain text to PDF.

{"text": "Plain text content", "font_size": 12, "filename": "doc.pdf"}

POST /imageToPdf

Convert images to multi-page PDF.

{"images_base64": ["img1...", "img2..."], "filename": "images.pdf"}

POST /pageCount

Get PDF page count. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "..."}

Returns: {"page_count": 10}

POST /extractPages

Extract/rearrange pages. Pages can repeat or reorder. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pages": [1,3,5,2], "filename": "extracted.pdf"}
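
Because pages can repeat or reorder, the pages list can be generated programmatically. The sketch below assumes 1-based page numbers (as the examples on this page suggest) and checks the list against a known page count before building the payload; check_pages is a hypothetical helper.

```python
from typing import List


def check_pages(pages: List[int], page_count: int) -> List[int]:
    """Return pages unchanged if every entry lies within 1..page_count."""
    bad = [p for p in pages if not 1 <= p <= page_count]
    if bad:
        raise ValueError(f"pages out of range 1..{page_count}: {bad}")
    return pages


# Reverse a 5-page document, and duplicate the first page of a 3-page one.
reverse_order = check_pages(list(range(5, 0, -1)), page_count=5)
cover_twice = check_pages([1, 1, 2, 3], page_count=3)

payload = {
    "pdf_url": "https://example.com/doc.pdf",
    "pages": reverse_order,
    "filename": "reversed.pdf",
}
```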

POST /mergePdfs

Merge multiple PDFs.

{"pdfs_base64": ["pdf1...", "pdf2..."], "filename": "merged.pdf"}

POST /grepPdf

Search text within a PDF. Accepts pdf_base64 or pdf_url.

{"pdf_base64": "...", "pattern": "search term", "regex": false, "ignore_case": true, "pages": [1,2], "context": 100}

Returns: {"matches": [{"page": 1, "match_count": 3, "matches": [...]}], "total_matches": 3, "pages_with_matches": 1, "total_pages": 10}
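
A response of this shape can be reduced to a per-page hit count. This is a sketch with illustrative data: the inner "matches" entries are elided on this page, so they are left out here, and treating total_matches as the sum of the per-page match_count values is an assumption based on the example.

```python
def summarize(response: dict) -> dict:
    """Collect match counts per page and compare them with total_matches."""
    per_page = {entry["page"]: entry["match_count"] for entry in response["matches"]}
    return {
        "per_page": per_page,
        # Assumption: total_matches equals the sum of per-page counts.
        "consistent": sum(per_page.values()) == response["total_matches"],
    }


sample = {
    "matches": [{"page": 1, "match_count": 3}],
    "total_matches": 3,
    "pages_with_matches": 1,
    "total_pages": 10,
}

summary = summarize(sample)
```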

Storage

All PDF-generating endpoints support a storage object:

{"storage": {"mode": "memory", "filename": "custom.pdf", "expires_in": 3600, "retention_days": 7}}

Document Management

Manage stored documents (created with storage.mode = "default" or "byob").

GET /files

List stored documents.

GET /files/search

Search documents by content.

GET /files/{document_id}

Get document metadata and a fresh signed URL.

GET /files/{document_id}/download

Download the PDF binary directly.

DELETE /files/{document_id}

Delete a stored document.
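
The per-document endpoints differ only in HTTP method and path suffix, which the following sketch captures. The document_request helper and the "abc123" identifier are hypothetical; the request is built but not sent.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.pdf-mcp.io"

# action -> (HTTP method, path suffix), taken from the endpoint list above
ACTIONS = {
    "metadata": ("GET", ""),
    "download": ("GET", "/download"),
    "delete": ("DELETE", ""),
}


def document_request(document_id: str, api_key: str, action: str = "metadata") -> urllib.request.Request:
    """Build an authenticated request for one stored document."""
    method, suffix = ACTIONS[action]
    url = f"{BASE_URL}/files/{quote(document_id, safe='')}{suffix}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}, method=method
    )


download = document_request("abc123", "YOUR_API_KEY", "download")
delete = document_request("abc123", "YOUR_API_KEY", "delete")
```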

CSS Page Setup

@page {
  size: A4;                /* A4, Letter, Legal, or 210mm 297mm */
  margin: 2cm;
}

@page :first { margin-top: 5cm; }  /* First page different */
@page :left { margin-left: 3cm; }  /* Binding margins */
@page :right { margin-right: 3cm; }

Sizes: A4 (210x297mm), Letter (8.5x11in), Legal (8.5x14in), A3 (297x420mm), A5 (148x210mm)

Page Break Control

Use both modern and legacy properties:

/* Prevent breaks inside */
.no-break {
  break-inside: avoid;
  page-break-inside: avoid;
}

/* Force break after */
.page-break {
  break-after: page;
  page-break-after: always;
}

/* Force break before */
.chapter {
  break-before: page;
  page-break-before: always;
}

/* Keep headings with content */
h1, h2, h3 {
  break-after: avoid;
  page-break-after: avoid;
}

/* CRITICAL: Reset floats - page breaks only work on block elements */
@media print { * { float: none !important; } }

Tables: Multi-Page Handling

table { width: 100%; border-collapse: collapse; }
thead { display: table-header-group; }  /* Repeat headers */
tfoot { display: table-footer-group; }
tr { break-inside: avoid; page-break-inside: avoid; }
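
Header repetition relies on the table having a real thead element. When table markup is generated from data, a sketch like the following keeps the header row in thead and escapes cell text; render_table is a hypothetical helper and the rows are illustrative.

```python
from html import escape
from typing import Sequence


def render_table(headers: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Render rows as an HTML table with the header row inside <thead>."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        "<table>\n"
        f"<thead><tr>{head}</tr></thead>\n"
        f"<tbody>\n{body}\n</tbody>\n"
        "</table>"
    )


table_html = render_table(["Item", "Qty"], [["Widgets & gears", 3], ["Bolts", 12]])
```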

Images

Embed as base64 or use absolute URLs:

<img src="data:image/png;base64,..." alt="...">
<img src="https://example.com/image.png" alt="...">
img { max-width: 100%; break-inside: avoid; page-break-inside: avoid; }
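
A base64 data URI can be produced from image bytes with the standard library. A minimal sketch; both helper names are hypothetical.

```python
import base64
import mimetypes
from pathlib import Path


def data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data: URI for use in an img src."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_uri(path: str) -> str:
    """Read an image file and guess its MIME type from the file extension."""
    mime_type, _ = mimetypes.guess_type(path)
    return data_uri(Path(path).read_bytes(), mime_type or "application/octet-stream")


img_tag = f'<img src="{data_uri(b"abc")}" alt="demo">'  # b"abc" stands in for real image bytes
```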

Charts and Data Visualizations

No JavaScript is executed, so client-side chart libraries (Chart.js, D3, Plotly.js, Highcharts) will not render. Prerender charts as SVG or images before embedding them in the HTML.

If you can run Python, one good option is to generate the chart with Plotly and export it as static SVG or PNG:

import plotly.graph_objects as go
import plotly.io as pio
import base64

fig = go.Figure(data=[go.Bar(x=["Q1","Q2","Q3","Q4"], y=[120,180,240,200], marker_color="#3498db")])
fig.update_layout(title="Revenue", width=600, height=350, margin=dict(l=40,r=20,t=50,b=40))

# SVG string — embed directly in HTML
svg_str = pio.to_image(fig, format="svg").decode("utf-8")

# Or base64 PNG
png_b64 = base64.b64encode(pio.to_image(fig, format="png", scale=2)).decode("utf-8")

Embed in HTML:

<div class="chart no-break">{svg_str}</div>
<!-- or -->
<div class="chart no-break">
  <img src="data:image/png;base64,{png_b64}" alt="Revenue" style="width:100%;max-width:600px;" />
</div>

The Python export requires the plotly and kaleido packages: pip install plotly kaleido

.chart { break-inside: avoid; page-break-inside: avoid; margin: 1cm 0; text-align: center; }
.chart svg, .chart img { max-width: 100%; height: auto; }
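
If installing a charting library is not an option, a simple bar chart can also be written out as SVG by hand. The following is a rough sketch using only the standard library, not a replacement for a charting library; bar_chart_svg is a hypothetical helper, it assumes non-negative values, and the layout constants are arbitrary.

```python
from html import escape
from typing import Sequence


def bar_chart_svg(labels: Sequence[str], values: Sequence[float],
                  width: int = 600, height: int = 350, color: str = "#3498db") -> str:
    """Return a static SVG bar chart as a string (non-negative values only)."""
    pad, label_space = 20, 30
    plot_h = height - 2 * pad - label_space   # vertical room for the bars
    slot = (width - 2 * pad) / len(values)    # horizontal room per bar
    peak = max(values) or 1                   # avoid dividing by zero
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}" '
        f'width="{width}" height="{height}">'
    ]
    for i, (label, value) in enumerate(zip(labels, values)):
        bar_h = plot_h * value / peak
        x = pad + i * slot + slot * 0.15
        y = pad + plot_h - bar_h
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot * 0.7:.1f}" '
            f'height="{bar_h:.1f}" fill="{color}"/>'
        )
        parts.append(
            f'<text x="{x + slot * 0.35:.1f}" y="{height - pad}" '
            f'text-anchor="middle" font-size="14">{escape(str(label))}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg_str = bar_chart_svg(["Q1", "Q2", "Q3", "Q4"], [120, 180, 240, 200])
```

The resulting string can be placed inside a .chart container in the same way as the Plotly SVG output.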

For the full charts guide (SVG examples, prerendering strategies, pitfalls) and for advanced CSS print styling (margin boxes, headers/footers, page counters, named pages, widows/orphans), see the CSS Print Styling Guide.

Example 1: Invoice

<!DOCTYPE html>
<html>
<head>
<style>
@page { size: A4; margin: 2cm; }
body { font-family: Arial, sans-serif; font-size: 10pt; }
table { width: 100%; border-collapse: collapse; margin: 1cm 0; }
th, td { border: 1px solid #ddd; padding: 10px; }
th { background: #f5f5f5; }
.amount { text-align: right; }
thead { display: table-header-group; }
tr { break-inside: avoid; page-break-inside: avoid; }
</style>
</head>
<body>
  <h1>INVOICE #INV-2024-001</h1>
  <p><strong>Date:</strong> January 15, 2024</p>

  <h3>Bill To:</h3>
  <p>Client Name<br>456 Client Ave<br>City, State 67890</p>

  <table>
    <thead>
      <tr><th>Description</th><th>Qty</th><th class="amount">Price</th><th class="amount">Amount</th></tr>
    </thead>
    <tbody>
      <tr><td>Professional Services</td><td>10</td><td class="amount">$150.00</td><td class="amount">$1,500.00</td></tr>
      <tr><td>Software License</td><td>1</td><td class="amount">$500.00</td><td class="amount">$500.00</td></tr>
      <tr style="font-weight:bold;background:#f0f0f0"><td colspan="3">Total</td><td class="amount">$2,000.00</td></tr>
    </tbody>
  </table>
</body>
</html>

Example 2: Report with Chapters

<!DOCTYPE html>
<html>
<head>
<style>
@page { size: Letter; margin: 1in; }
@page :first { margin-top: 2in; }
body { font-family: Georgia, serif; font-size: 12pt; line-height: 1.6; }
h1 { text-align: center; margin-bottom: 2cm; }
h2 { break-after: avoid; page-break-after: avoid; }
.chapter { break-before: page; page-break-before: always; }
.no-break { break-inside: avoid; page-break-inside: avoid; }
p { orphans: 3; widows: 3; }
</style>
</head>
<body>
  <h1>Annual Report 2024</h1>

  <section class="chapter">
    <h2>Executive Summary</h2>
    <p>This report provides comprehensive analysis...</p>
    <div class="no-break">
      <h3>Key Findings</h3>
      <ul>
        <li>Revenue increased 25%</li>
        <li>Customer satisfaction at 92%</li>
      </ul>
    </div>
  </section>

  <section class="chapter">
    <h2>Financial Overview</h2>
    <p>Detailed financial analysis...</p>
  </section>
</body>
</html>

Common Pitfalls

  1. No viewport units in @page - Never use vh, vw, vmin, vmax
  2. Reset floats - Page breaks don't work on floated elements
  3. Apply break-inside to containers - Not just content
  4. Use absolute URLs or base_url - Relative URLs need base_url parameter
  5. Set max-width on images - Prevent overflow
  6. Include legacy properties - Always use both break-* and page-break-*
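
Some of these pitfalls can be caught before a request is sent. The sketch below is a rough regex-based check for pitfalls 1, 4, and 6 only; lint_html is a hypothetical helper, it is not a CSS or HTML parser, and it will miss unusual formatting such as single-quoted attributes.

```python
import re
from typing import List, Optional


def lint_html(html: str, base_url: Optional[str] = None) -> List[str]:
    """Return warnings for a few of the pitfalls listed above."""
    warnings = []
    # Pitfall 1: viewport units inside @page rules.
    for body in re.findall(r"@page[^{]*\{([^}]*)\}", html):
        if re.search(r"\d(?:vh|vw|vmin|vmax)\b", body):
            warnings.append("viewport unit in @page")
    # Pitfall 6: modern break-* property without its legacy page-break-* twin.
    for body in re.findall(r"\{([^{}]*)\}", html):
        for side in ("inside", "before", "after"):
            modern = re.search(rf"(?<![\w-])break-{side}\s*:", body)
            if modern and f"page-break-{side}" not in body:
                warnings.append(f"missing page-break-{side}")
    # Pitfall 4: relative URLs when no base_url is supplied.
    if base_url is None:
        for url in re.findall(r'(?:src|href)="([^"]+)"', html):
            if not url.startswith(("http://", "https://", "data:", "#")):
                warnings.append(f"relative URL without base_url: {url}")
    return warnings


problems = lint_html(
    '<style>@page { margin: 10vh; } .box { break-inside: avoid; }</style>'
    '<img src="logo.png">'
)
```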

MCP Server

Connect AI agents via MCP at /mcp:

https://api.pdf-mcp.io/mcp

Tools: html_to_pdf, extract_text, pdf_to_image, text_to_pdf, image_to_pdf, get_page_count, extract_pages, merge_pdfs, grep_pdf, list_documents, get_document, download_document, delete_document, search_documents
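
MCP is built on JSON-RPC 2.0, and a tool is invoked with a tools/call request. The sketch below only shows the shape of that message. The argument names are an assumption (that the tools mirror the REST fields), and in practice an MCP client library performs the initialization handshake and builds this message for you.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 tools/call message for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: get_page_count accepts the same pdf_url field as POST /pageCount.
message = tool_call("get_page_count", {"pdf_url": "https://example.com/doc.pdf"})
wire = json.dumps(message)
```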

Resources (read via read_resource()):
