Tutorial · April 4, 2026

How to Generate PDFs from HTML in 2026

A practical guide to generating pixel-perfect PDFs from HTML pages — covering server-side approaches, browser-based rendering, API services, and when to use each.

The HTML-to-PDF Problem

Generating PDFs from HTML sounds straightforward until you encounter the real-world issues: web fonts not loading, CSS grid breaking, page breaks in the wrong places, images timing out, and dynamic JavaScript content missing entirely. Every approach has trade-offs, and the right tool depends on your use case, rendering quality requirements, and operational constraints.

This guide covers the main approaches in 2026 — from lightweight command-line tools to full browser-based rendering via an API — with code examples in JavaScript, Python, and curl.

Approach 1: wkhtmltopdf (Fast, Limited CSS Support)

wkhtmltopdf renders with an old Qt WebKit engine, and the project is no longer actively maintained (its GitHub repository was archived in 2023). It handles basic HTML and CSS well but struggles with modern flexbox/grid layouts, web fonts loaded via @font-face, and any JavaScript-rendered content.

# Install
apt-get install wkhtmltopdf

# Basic usage
wkhtmltopdf https://example.com output.pdf

# With options
wkhtmltopdf --page-size A4 --margin-top 1cm --margin-bottom 1cm \
  --print-media-type https://example.com output.pdf

Best for: simple documents, internal reports, pages you control and know render well in old WebKit. Not suitable for modern SPA pages or anything with complex CSS.
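When the conversion is triggered from application code rather than a shell, the same invocation can be wrapped in a small helper. The following is a minimal Python sketch, assuming wkhtmltopdf is installed and on the PATH. The function names are illustrative, and only the flags from the command above are used.

```python
import subprocess


def build_wkhtmltopdf_command(source, output, page_size="A4", margin="1cm"):
    """Return the argument list for the wkhtmltopdf call shown above."""
    return [
        "wkhtmltopdf",
        "--page-size", page_size,
        "--margin-top", margin,
        "--margin-bottom", margin,
        "--print-media-type",
        source,
        output,
    ]


def html_to_pdf(source, output, timeout=60):
    """Run wkhtmltopdf; raise if it exits non-zero or exceeds the timeout."""
    command = build_wkhtmltopdf_command(source, output)
    # Passing a list (no shell=True) keeps URLs and paths from being
    # interpreted by a shell.
    subprocess.run(command, check=True, timeout=timeout)
```

Calling html_to_pdf("https://example.com", "output.pdf") would then behave like the second command above. The timeout is a guess at a sensible default, not a recommended value.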

Approach 2: Puppeteer / Playwright (Full Browser, Self-Hosted)

Chromium-based tools like Puppeteer and Playwright give you the most accurate rendering — the same engine as Chrome. They handle JavaScript, modern CSS, web fonts, and even SPAs. The trade-off is operational complexity: you need to manage Chromium binary installation, memory leaks, crash recovery, and scaling.

// Node.js + Puppeteer (ES module, so top-level await is available)
import { writeFileSync } from "node:fs";
import puppeteer from "puppeteer";

const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
// networkidle0: wait until there are no open network connections for 500 ms
await page.goto("https://example.com", { waitUntil: "networkidle0" });

const pdf = await page.pdf({
  format: "A4",
  margin: { top: "1cm", bottom: "1cm", left: "1cm", right: "1cm" },
  printBackground: true, // include background colors/images
});

await browser.close();
writeFileSync("output.pdf", pdf);

# Python + Playwright
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com", wait_until="networkidle")
    page.pdf(
        path="output.pdf",
        format="A4",
        margin={"top": "1cm", "bottom": "1cm"},
        print_background=True,
    )
    browser.close()

This approach gives excellent quality but is painful to deploy. Chromium in a Docker container adds ~300MB to your image. Running it in AWS Lambda requires a custom layer. Memory usage per browser instance is significant (300–500MB), and crashed browser processes need monitoring and restart logic.
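The restart logic mentioned above can start out as a simple retry wrapper. This is a minimal Python sketch, not a full supervision strategy. It assumes each attempt launches a fresh browser inside the render callable, and the name render_with_retry and the backoff values are illustrative.

```python
import time


def render_with_retry(render, attempts=3, delay_seconds=1.0):
    """Call render() until it succeeds, retrying after any exception.

    `render` is a zero-argument callable that launches a fresh browser,
    produces the PDF, and closes the browser again.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return render()
        except Exception as error:  # a crashed browser surfaces as an exception
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds * attempt)  # linear backoff
    raise RuntimeError(
        f"PDF rendering failed after {attempts} attempts"
    ) from last_error
```

Wrapping the whole launch-render-close sequence, rather than just the page.pdf() call, means a crashed browser process is replaced on the next attempt instead of being reused.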

Approach 3: Headless Chrome API Services (Recommended for Most Teams)

API-based PDF services like SnapAPI handle the Chromium fleet for you. You POST a URL or raw HTML, and get back a link to the generated PDF. This is the cleanest architectural choice for most applications — you trade a small per-call cost for zero browser infrastructure overhead.

curl -X POST https://api.snapapi.pics/v1/pdf \
  -H "X-Api-Key: sk_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/invoice/12345",
    "format": "A4",
    "margin": "1cm",
    "wait_for": ".invoice-loaded",
    "print_background": true
  }'

// JavaScript
const res = await fetch("https://api.snapapi.pics/v1/pdf", {
  method: "POST",
  headers: {
    "X-Api-Key": process.env.SNAPAPI_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://yourapp.com/reports/monthly",
    format: "A4",
    wait_for: "#report-ready",
    print_background: true,
  }),
});

if (!res.ok) {
  throw new Error(`PDF request failed with status ${res.status}`);
}

const { url } = await res.json();
// url = "https://cdn.snapapi.pics/pdfs/abc123.pdf"
console.log("PDF ready:", url);
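For Python services, the same request can be made with the standard library alone. This is a sketch that mirrors the endpoint, header, and fields from the examples above. The function names are illustrative, and the response is assumed to be JSON with a url field, as in the JavaScript example.

```python
import json
import os
import urllib.request

API_URL = "https://api.snapapi.pics/v1/pdf"  # endpoint from the examples above


def build_pdf_request(page_url, api_key, wait_for=None):
    """Build (but do not send) the POST request for the PDF endpoint."""
    payload = {"url": page_url, "format": "A4", "print_background": True}
    if wait_for is not None:
        payload["wait_for"] = wait_for
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def generate_pdf(page_url, wait_for=None, timeout=60):
    """Send the request and return the PDF link from the JSON response."""
    request = build_pdf_request(page_url, os.environ["SNAPAPI_KEY"], wait_for)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["url"]
```

Separating request construction from sending keeps the payload easy to unit-test without touching the network.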

Approach 4: ReportLab / WeasyPrint (Python, Programmatic)

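If you need to generate PDFs programmatically rather than by capturing a page in a browser, Python libraries like ReportLab and WeasyPrint are excellent. ReportLab builds the PDF directly from data through its own drawing and layout API, with no HTML involved. WeasyPrint understands CSS and renders HTML to PDF using its own layout engine, though it does not execute JavaScript, so all content must already be present in the HTML.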

from weasyprint import HTML

HTML("https://example.com").write_pdf("output.pdf")

# From an HTML string
HTML(string="<h1>Hello PDF</h1><p>Content here</p>").write_pdf("output.pdf")

Choosing the Right Approach in 2026

For static HTML documents you control: WeasyPrint or wkhtmltopdf. For JavaScript-rendered pages or SPAs: Puppeteer/Playwright locally, or an API service like SnapAPI in production. For high-volume PDF generation without DevOps overhead: SnapAPI's PDF endpoint at $19–$299/month depending on volume. For programmatic PDF generation from data (invoices, reports): ReportLab in Python or PDFKit in Node.

The API approach wins whenever your team's time is more valuable than the per-PDF cost, which for most SaaS applications it is. SnapAPI's PDF endpoint handles authentication cookies, custom wait conditions, JavaScript execution, web fonts, and print-specific CSS — the same things that are painful to get right in self-hosted Chromium.
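The rules of thumb in this section can be written down as a small decision function. This is only a sketch of the guidance above. The three yes/no inputs are a simplification, and the function name is illustrative.

```python
def choose_approach(built_from_data, needs_javascript, can_run_browsers):
    """Map the rules of thumb above to a suggested tool.

    built_from_data:  the PDF is assembled directly from data, not from HTML
    needs_javascript: the page only renders correctly after JavaScript runs
    can_run_browsers: the team is willing to operate its own Chromium
    """
    if built_from_data:
        return "ReportLab (Python) or PDFKit (Node)"
    if not needs_javascript:
        return "WeasyPrint or wkhtmltopdf"
    if can_run_browsers:
        return "Puppeteer or Playwright"
    return "API service such as SnapAPI"
```

For example, a JavaScript-heavy dashboard owned by a team without browser infrastructure maps to the API service.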


CSS for Print: Getting Page Breaks Right

One of the most frustrating aspects of HTML-to-PDF generation is controlling where page breaks fall. A heading at the bottom of a page, a table split mid-row, an image broken across two pages — all of these are common pitfalls that CSS print media queries can solve.

@media print {
  /* Keep headings with the content that follows */
  h1, h2, h3 {
    page-break-after: avoid;
  }

  /* Don't split tables across pages if possible */
  table {
    page-break-inside: avoid;
  }

  /* Force a new page before each major section */
  .report-section {
    page-break-before: always;
  }

  /* Keep figure + caption together */
  figure {
    page-break-inside: avoid;
  }
}

When using SnapAPI's PDF endpoint, pass "print_media": true in your request to activate these print CSS rules. The API renders the page with Chromium's print media emulation enabled, applying your @media print styles exactly as a real browser would when printing to PDF.

Handling Web Fonts in Generated PDFs

Web fonts are a common cause of PDF generation failures. When generating PDFs server-side, the font loading request must complete before the PDF is rendered. With wkhtmltopdf, many web fonts simply fail to load. With Chromium-based tools like Puppeteer or SnapAPI, fonts load correctly as long as you wait for the page to be fully rendered before capturing.

SnapAPI uses waitUntil: "networkidle0" semantics by default — it waits until all network requests (including font loads) have settled before generating the PDF. You can extend this with a custom wait_for selector if your page has additional async loading after fonts are ready.
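On the self-hosted route, the equivalent wait can be made explicit before the PDF is captured. This is a minimal sketch, assuming the Playwright sync API from Approach 2. The helper name wait_until_render_ready and the optional selector are hypothetical.

```python
def wait_until_render_ready(page, ready_selector=None):
    """Block until fonts have loaded and, optionally, a marker element exists.

    `page` is expected to be a Playwright sync-API Page; only its
    evaluate() and wait_for_selector() methods are used.
    """
    # document.fonts.ready is a promise that resolves once font loading
    # settles; Playwright waits for promises returned from evaluate().
    page.evaluate("document.fonts.ready.then(() => true)")
    if ready_selector is not None:
        # Wait for app-specific async content that appears after fonts load.
        page.wait_for_selector(ready_selector)
```

In the Playwright example earlier, this would be called between page.goto(...) and page.pdf(...).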

Generating PDFs from Dynamic Data (Invoices, Reports)

A common architecture for invoice and report generation: render an HTML template server-side with your data, serve it at a temporary authenticated URL, then call SnapAPI's PDF endpoint to capture it. The result is a pixel-perfect PDF that matches your web design exactly — the same fonts, colors, and layout.

// Node.js / Express: generate invoice PDF
app.get("/invoices/:id/pdf", authenticateUser, async (req, res) => {
  const invoice = await db.invoices.findById(req.params.id);
  if (!invoice) {
    return res.status(404).send("Invoice not found");
  }

  // Generate a signed URL to the HTML version of the invoice
  const htmlUrl = generateSignedUrl(`/invoices/${invoice.id}/html`, {
    expiresIn: 60, // 1 minute
    token: process.env.INTERNAL_SECRET,
  });

  // Ask SnapAPI to capture it as PDF
  const response = await fetch("https://api.snapapi.pics/v1/pdf", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.SNAPAPI_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url: htmlUrl,
      format: "A4",
      margin: "15mm",
      print_background: true,
      wait_for: ".invoice-ready",
    }),
  });

  // Surface upstream failures instead of redirecting to an undefined URL
  if (!response.ok) {
    return res.status(502).send("PDF generation failed");
  }

  const { url: pdfUrl } = await response.json();

  // Redirect to the generated PDF or proxy it
  res.redirect(pdfUrl);
});
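The first step of this architecture, rendering the HTML template from data, might look like the following in Python. It is a minimal sketch using only the standard library. The template markup and field names are invented for illustration, and the invoice-ready class simply matches the wait_for selector used in the request above.

```python
from html import escape
from string import Template

INVOICE_TEMPLATE = Template(
    """<!doctype html>
<html>
  <body>
    <h1>Invoice $number</h1>
    <p>Billed to: $customer</p>
    <table>$rows</table>
    <p class="invoice-ready">Total: $total</p>
  </body>
</html>"""
)


def render_invoice_html(number, customer, items):
    """Render invoice data to HTML; items is a list of (description, amount)."""
    rows = "".join(
        f"<tr><td>{escape(description)}</td><td>{amount:.2f}</td></tr>"
        for description, amount in items
    )
    total = sum(amount for _, amount in items)
    return INVOICE_TEMPLATE.substitute(
        number=escape(str(number)),
        customer=escape(customer),  # escape user data before it reaches HTML
        rows=rows,
        total=f"{total:.2f}",
    )
```

Amounts are plain floats here to keep the sketch short. A real billing system would more likely use decimal.Decimal or integer cents.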

Cost Comparison: Self-Hosted vs API

Running your own Chromium fleet for PDF generation typically costs $50–$200/month in infrastructure even at moderate volumes, plus developer time for maintenance, monitoring, and crash recovery. SnapAPI's PDF endpoint starts at $19/month for 5,000 PDFs — for many teams that is dramatically cheaper when you account for the full cost of maintaining self-hosted browser infrastructure. The break-even point where self-hosting makes economic sense is typically above 100,000 PDFs per month.
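The comparison is easy to rerun with your own numbers. The sketch below uses the $19 for 5,000 PDFs plan and the low end of the infrastructure range quoted above. The maintenance hours and hourly rate are assumptions chosen purely for illustration.

```python
def cost_per_pdf(monthly_cost_usd, pdfs_per_month):
    """Average cost of one PDF for a given monthly bill and volume."""
    if pdfs_per_month <= 0:
        raise ValueError("pdfs_per_month must be positive")
    return monthly_cost_usd / pdfs_per_month


def self_hosted_monthly_cost(infrastructure_usd, maintenance_hours, hourly_rate_usd):
    """Infrastructure bill plus the developer time spent keeping it running."""
    return infrastructure_usd + maintenance_hours * hourly_rate_usd


# API plan quoted above: $19/month for 5,000 PDFs.
api_cost = cost_per_pdf(19, 5_000)

# Self-hosted at the same volume: $50 infrastructure (low end of the range
# above) plus an ASSUMED 4 hours of maintenance at an ASSUMED $75/hour.
self_hosted_cost = cost_per_pdf(self_hosted_monthly_cost(50, 4, 75), 5_000)

print(f"API:         ${api_cost:.4f} per PDF")
print(f"Self-hosted: ${self_hosted_cost:.4f} per PDF")
```

Most of the self-hosted figure in this example comes from the assumed developer time rather than the servers, which is why the result is so sensitive to how much maintenance the fleet actually needs.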

Conclusion: Pick the Right Tool for Your Scale

For simple documents: wkhtmltopdf or WeasyPrint. For pixel-perfect rendering of modern HTML at low volumes: Puppeteer locally. For production systems where reliability and zero maintenance matter more than per-PDF cost: SnapAPI's PDF API. The API examples above are plain HTTP POST requests, so they translate to any language with an HTTP client, and the free tier gives you 200 PDFs per month to prototype without a credit card.