PDF Generation Guide: HTML to PDF in Node.js, Python, Go, and PHP

A practical comparison of the main HTML-to-PDF approaches: wkhtmltopdf, Puppeteer and Playwright, WeasyPrint, and managed APIs, with code examples in Node.js, Python, Go, and PHP.

2026-04-04 · 11 min read

Why PDF Generation is Still Hard

Despite being a solved problem in theory, PDF generation remains one of the most frustrating tasks in backend development. The challenge is CSS rendering fidelity: modern web pages use flexbox, grid, CSS custom properties, web fonts, SVG, and JavaScript-rendered content that most PDF libraries simply cannot handle correctly. The result is PDFs that look nothing like the HTML template you designed, with broken layouts, missing fonts, and clipped content.

This guide covers every major approach and when to use each, with working code examples in Node.js, Python, Go, and PHP.

Option 1: wkhtmltopdf

wkhtmltopdf uses a patched Qt WebKit engine to render HTML and convert it to PDF. It was the standard solution for a decade but has serious limitations: the bundled WebKit is years out of date, flexbox support is partial, CSS Grid is not supported, and it requires system libraries to be installed on every server. On macOS and modern Linux distros, installation involves multiple workarounds. The project is no longer maintained (its GitHub repository was archived in 2023). Not recommended for new projects.

Option 2: Puppeteer or Playwright

Puppeteer and Playwright use a real Chromium browser, giving you modern CSS support and JavaScript execution. They are the most accurate option for HTML-to-PDF conversion and are suitable when you control the server infrastructure. The downsides: the Chromium binary is 150-400 MB, process management is complex, and memory usage per concurrent job is significant.

// Node.js with Playwright
const { chromium } = require("playwright");

async function htmlToPdf(url) {
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    // Wait for network activity to settle so late-loading assets are included
    await page.goto(url, { waitUntil: "networkidle" });
    return await page.pdf({ format: "A4", printBackground: true });
  } finally {
    await browser.close(); // always release Chromium, even if rendering throws
  }
}

This works well for low-to-medium concurrency. At high concurrency, you need a browser pool to limit open browser instances and prevent memory exhaustion.
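The simplest form of that limit is a semaphore in front of the render call. The sketch below is a minimal illustration in Python, not a full browser pool: `RenderLimiter` and `fake_render` are hypothetical names, and `fake_render` stands in for a real browser call such as the `htmlToPdf` function above.

```python
import asyncio
from typing import Awaitable, Callable


class RenderLimiter:
    """Caps how many render jobs run at the same time."""

    def __init__(self, render: Callable[[str], Awaitable[bytes]], max_concurrent: int = 4):
        self._render = render
        self._slots = asyncio.Semaphore(max_concurrent)

    async def render(self, url: str) -> bytes:
        # Jobs beyond max_concurrent wait here instead of starting more browsers.
        async with self._slots:
            return await self._render(url)


async def fake_render(url: str) -> bytes:
    """Hypothetical stand-in for a real headless-browser render."""
    await asyncio.sleep(0.01)
    return b"%PDF-" + url.encode()


async def main():
    limiter = RenderLimiter(fake_render, max_concurrent=2)
    urls = [f"https://example.com/invoice/{i}" for i in range(5)]
    return await asyncio.gather(*(limiter.render(u) for u in urls))


if __name__ == "__main__":
    pdfs = asyncio.run(main())
    print(f"rendered {len(pdfs)} documents")
```

A real pool would also reuse browser instances between jobs and recycle them after a number of renders. The semaphore only bounds how many renders are in flight at once.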

Option 3: WeasyPrint (Python)

WeasyPrint is a Python library that converts HTML and CSS to PDF without a browser. It has good CSS support for print-specific features like @page rules and page-break-* properties. It does not execute JavaScript, so it works best for server-rendered HTML templates without dynamic content. Installation requires system dependencies (Pango, plus Cairo for releases before version 53) that can be tricky on minimal Docker images.
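As a rough sketch of that workflow, the example below fills a server-side template that uses @page and page-break-* rules, then hands the result to WeasyPrint. The invoice fields and the helper names `build_invoice_html` and `render_pdf` are illustrative assumptions. The only WeasyPrint calls used are `HTML(string=...)` and `write_pdf()`.

```python
import html
from string import Template

# Print-oriented template: page size and margins come from @page,
# and the totals block is forced onto a new page.
INVOICE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
<style>
  @page { size: A4; margin: 20mm; }
  h1 { font-size: 18pt; }
  .totals { page-break-before: always; }
</style>
</head>
<body>
  <h1>Invoice $number</h1>
  <p>Billed to: $customer</p>
  <div class="totals">Total due: $total</div>
</body>
</html>""")


def build_invoice_html(number: str, customer: str, total: str) -> str:
    """Fill the template, escaping values so they cannot break the markup."""
    return INVOICE_TEMPLATE.substitute(
        number=html.escape(number),
        customer=html.escape(customer),
        total=html.escape(total),
    )


def render_pdf(document: str, output_path: str) -> None:
    """Write the HTML document to a PDF file using WeasyPrint."""
    # Imported lazily so the template code works without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=document).write_pdf(output_path)


if __name__ == "__main__":
    doc = build_invoice_html("INV-123", "Acme & Sons", "1,250.00 EUR")
    render_pdf(doc, "invoice.pdf")
```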

Option 4: Managed PDF API

A managed API like SnapAPI handles Chromium PDF rendering remotely. Your backend makes one HTTP call, receives PDF bytes, and delivers them. No binary installation, no process management, no memory overhead, no Chromium version pinning. Works on shared hosting, serverless, edge functions, and any environment that can make outbound HTTPS requests.

# Python: the same pattern works in Node.js, Go, PHP, Ruby
import requests

resp = requests.get(
    "https://api.snapapi.pics/v1/pdf",
    params={"url": "https://your-app.com/invoice/123", "format": "A4", "print_background": "true"},
    headers={"X-Api-Key": "YOUR_API_KEY"},
    timeout=60,
)
resp.raise_for_status()  # fail loudly instead of saving an error response as a PDF
with open("invoice.pdf", "wb") as f:
    f.write(resp.content)

This is the right choice when: you are on serverless or edge infrastructure, you do not want to manage Chromium, you need to support multiple languages, or you want to start quickly without infrastructure setup.

Choosing the Right Approach

Use Puppeteer or Playwright if: you already run Node.js with a Chromium binary for other purposes, you need maximum control, and you have ops capacity to manage browser processes. Use WeasyPrint if: you are Python-only, your templates are fully server-rendered, and you need print-specific CSS features. Use a managed API like SnapAPI if: you want zero infrastructure overhead, you need cross-language support, or you are on serverless.

Sign up at snapapi.pics for 200 free PDF generations per month. No credit card required. Your first PDF renders in under five minutes from any language.

PDF Generation Code Examples by Language

Node.js

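// ES module (top-level await); the built-in fetch requires Node.js 18+
import { writeFile } from "node:fs/promises";

const params = new URLSearchParams({ url: "https://your-app.com/invoice/123", format: "A4", print_background: "true" });
const res = await fetch(`https://api.snapapi.pics/v1/pdf?${params}`,
  { headers: { "X-Api-Key": process.env.SNAPAPI_KEY } });
if (!res.ok) throw new Error(`PDF request failed: ${res.status}`);
const pdf = Buffer.from(await res.arrayBuffer());
await writeFile("invoice.pdf", pdf);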

Go

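// Fragment: pageURL and outputFile come from the caller; the enclosing function returns error.
params := url.Values{"url": {pageURL}, "format": {"A4"}, "print_background": {"true"}}
req, _ := http.NewRequest("GET", "https://api.snapapi.pics/v1/pdf?"+params.Encode(), nil) // fixed method and well-formed URL
req.Header.Set("X-Api-Key", os.Getenv("SNAPAPI_KEY"))
resp, err := (&http.Client{Timeout: 60 * time.Second}).Do(req)
if err != nil {
	return err // resp is nil on failure, so return before touching resp.Body
}
defer resp.Body.Close()
_, err = io.Copy(outputFile, resp.Body)
return err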

PHP

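$ch = curl_init("https://api.snapapi.pics/v1/pdf?" . http_build_query([
    "url" => "https://your-app.com/report", "format" => "A4", "print_background" => "true"
]));
curl_setopt($ch, CURLOPT_HTTPHEADER, ["X-Api-Key: " . getenv("SNAPAPI_KEY")]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$pdf = curl_exec($ch);
// curl_exec returns false on transport errors; also reject non-200 responses
if ($pdf === false || curl_getinfo($ch, CURLINFO_HTTP_CODE) !== 200) {
    throw new RuntimeException("PDF request failed: " . curl_error($ch));
}
file_put_contents("report.pdf", $pdf);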

The same API call pattern — GET request with URL and format parameters, X-Api-Key header, binary response body — works identically across Node.js, Python, Go, PHP, Ruby, and any other language that supports HTTP. No language-specific SDK required.

Advanced PDF Features: Headers, Footers, Authentication

Production PDF generation workflows often require features beyond basic URL rendering. Here is how SnapAPI handles the most common advanced requirements.

Multi-Page Documents with Headers and Footers

Use CSS @page rules in your HTML template to define content that repeats on every page. For running page numbers, use CSS counters in @page margin boxes; these work only in recent Chromium releases (older versions ignore margin boxes), so test against your rendering engine first. Alternatively, set display_header_footer=true in your SnapAPI request and provide HTML header and footer templates via the corresponding parameters. The header and footer templates have access to special variables like pageNumber, totalPages, title, and date.
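In Chromium's own print templates, those variables are filled in by class name: an empty element with class pageNumber, totalPages, title, or date receives the matching value at print time. The sketch below assumes SnapAPI passes templates through to Chromium unchanged. The parameter names `header_template` and `footer_template` are placeholders, since only display_header_footer is named here, so check the API reference for the real ones.

```python
import html


def page_footer(prefix: str = "Page") -> str:
    """Footer markup; the empty spans are filled in by class name at print time."""
    return (
        '<div style="font-size:9px; width:100%; text-align:center;">'
        f'{html.escape(prefix)} <span class="pageNumber"></span>'
        ' of <span class="totalPages"></span></div>'
    )


def page_header() -> str:
    """Header markup showing the document title on the left and the date on the right."""
    return (
        '<div style="font-size:9px; width:100%; padding:0 10mm;">'
        '<span class="title"></span>'
        '<span style="float:right" class="date"></span></div>'
    )


def header_footer_params(page_url: str) -> dict:
    """Query parameters for a PDF request with repeating header and footer."""
    return {
        "url": page_url,
        "format": "A4",
        "display_header_footer": "true",
        "header_template": page_header(),  # hypothetical parameter name
        "footer_template": page_footer(),  # hypothetical parameter name
    }


if __name__ == "__main__":
    for key, value in header_footer_params("https://your-app.com/report").items():
        print(f"{key} = {value}")
```

The explicit font size in each template is deliberate: Chromium renders header and footer templates in isolation from the page's stylesheet, so styles have to be set inline.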

PDFs from Authenticated Pages

For invoice or report pages that require authentication, pass session cookies via the cookies parameter. Generate a short-lived session token in your application, pass it as a cookie in the SnapAPI request, and SnapAPI renders the page as if that session is logged in. Invalidate the token immediately after the PDF is generated for security.
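The application side of that flow can be sketched as a small token store with issue, resolve, and invalidate steps. This is a minimal in-memory illustration, and `RenderTokenStore` is a hypothetical name. A production version would keep tokens in a shared store such as Redis or a database so every web process can see them. How the token is packed into the cookies parameter depends on the API's cookie format, which is not covered here.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class RenderTokenStore:
    """Issues single-purpose session tokens that expire quickly and can be revoked."""

    def __init__(self, ttl_seconds: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def issue(self, user_id: str) -> str:
        """Create an unguessable token bound to one user."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user_id, self._clock() + self._ttl)
        return token

    def resolve(self, token: str) -> Optional[str]:
        """Return the user id for a live token, or None if unknown or expired."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]
            return None
        return user_id

    def invalidate(self, token: str) -> None:
        """Revoke a token; safe to call more than once."""
        self._tokens.pop(token, None)


if __name__ == "__main__":
    store = RenderTokenStore(ttl_seconds=60)
    token = store.issue("user-42")
    try:
        # Send `token` as the session cookie in the PDF request here.
        print(store.resolve(token))
    finally:
        store.invalidate(token)  # revoke even if the render fails
```

Calling `invalidate` in a `finally` block means the token is revoked whether or not the render succeeds, and the short expiry covers the case where the process dies before reaching it.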

CSS Injection for Print Optimization

Pass print-specific CSS via the inject_css parameter to clean up the PDF without modifying your web page: hide navigation bars, cookie banners, and chat widgets; force single-column layout; adjust font sizes; add page break hints before major sections. This pattern avoids maintaining a separate print stylesheet in your codebase.
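A minimal sketch of building such a request follows. The selectors (`.cookie-banner`, `.chat-widget`, and so on) are assumptions about the target page, and the CSS is assumed to be passed as a plain URL-encoded string in inject_css. Adjust both to your markup and to the API reference.

```python
from urllib.parse import urlencode

# Print-only overrides; every selector here is a placeholder for your own markup.
PRINT_CSS = """
nav, .cookie-banner, .chat-widget { display: none !important; }
main { display: block !important; column-count: 1 !important; }
body { font-size: 11pt; }
h2 { break-before: page; }
"""


def compact_css(css: str) -> str:
    """Collapse runs of whitespace so the query string stays short."""
    return " ".join(css.split())


def build_pdf_url(page_url: str, css: str = PRINT_CSS) -> str:
    """Return the request URL with the stylesheet URL-encoded into inject_css."""
    query = urlencode({
        "url": page_url,
        "format": "A4",
        "print_background": "true",
        "inject_css": compact_css(css),
    })
    return "https://api.snapapi.pics/v1/pdf?" + query


if __name__ == "__main__":
    print(build_pdf_url("https://your-app.com/report"))
```

The `!important` flags are there because injected rules compete with the page's own styles, which may be more specific.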

Performance: PDF Generation Latency

PDF generation is slower than screenshots because the browser must complete rendering before printing begins. Simple HTML templates render in 1-3 seconds. Complex dashboards with charts and multiple API calls can take 5-15 seconds. Set your HTTP client timeout to at least 60 seconds and consider moving PDF generation to a background job queue rather than the request cycle for better user experience.
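The background-job pattern can be sketched with nothing more than a queue and worker threads. `PdfJobQueue` and `slow_render` below are hypothetical names, and `slow_render` stands in for the real HTTP call with its 60-second timeout. A production setup would more likely use a dedicated job system with persistence and retries.

```python
import queue
import threading
import time
from typing import Callable, Dict


class PdfJobQueue:
    """Runs render jobs on worker threads so request handlers return immediately."""

    def __init__(self, render: Callable[[str], bytes], workers: int = 2):
        self._render = render
        self._jobs: "queue.Queue[tuple]" = queue.Queue()
        self.results: Dict[str, bytes] = {}
        self.errors: Dict[str, Exception] = {}
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, job_id: str, url: str) -> None:
        """Enqueue a job and return at once; the caller polls results later."""
        self._jobs.put((job_id, url))

    def wait(self) -> None:
        """Block until every submitted job has finished (useful in tests)."""
        self._jobs.join()

    def _work(self) -> None:
        while True:
            job_id, url = self._jobs.get()
            try:
                self.results[job_id] = self._render(url)
            except Exception as exc:  # record the failure instead of killing the worker
                self.errors[job_id] = exc
            finally:
                self._jobs.task_done()


def slow_render(url: str) -> bytes:
    """Hypothetical stand-in for a PDF request that takes several seconds."""
    time.sleep(0.05)
    return b"%PDF-" + url.encode()


if __name__ == "__main__":
    jobs = PdfJobQueue(slow_render)
    jobs.submit("invoice-123", "https://your-app.com/invoice/123")
    jobs.wait()
    print(sorted(jobs.results))
```

In a web application, `submit` would be called from the request handler, which then responds with a job id that the client can poll or be notified about.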

Use the wait_for parameter to specify a CSS selector that must be present before printing begins. This ensures dynamic content — charts, loaded data, lazy images — is fully rendered in the PDF rather than capturing a loading state.

Storing Generated PDFs

For production PDF generation, avoid writing to local disk in web server processes. Upload the PDF bytes straight to S3 instead. In Node.js with the AWS SDK v3:

import { S3Client, PutObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// Region and credentials are read from the environment
const s3 = new S3Client({});

// url and invoiceId are supplied by the calling code
const params = new URLSearchParams({ url, format: "A4" }); // encodes the target URL safely
const pdfRes = await fetch(`https://api.snapapi.pics/v1/pdf?${params}`,
  { headers: { "X-Api-Key": process.env.SNAPAPI_KEY } });
if (!pdfRes.ok) throw new Error(`PDF request failed: ${pdfRes.status}`);
const pdfBuffer = Buffer.from(await pdfRes.arrayBuffer());

const s3Key = `pdfs/${invoiceId}.pdf`;
await s3.send(new PutObjectCommand({
  Bucket: process.env.S3_BUCKET,
  Key: s3Key,
  Body: pdfBuffer,
  ContentType: "application/pdf",
}));

const downloadUrl = await getSignedUrl(s3, new GetObjectCommand({
  Bucket: process.env.S3_BUCKET, Key: s3Key
}), { expiresIn: 900 }); // 15-minute expiry
// Return downloadUrl to user or send via email

The pre-signed URL gives the user a time-limited direct download link to the PDF in S3, so your server does not have to proxy potentially large file downloads.

PDF Generation Decision Guide

Simple HTML, server-rendered, no JS: WeasyPrint (Python) or wkhtmltopdf for legacy setups.
Complex CSS, web fonts, JS-rendered content, full control of server: Puppeteer or Playwright.
Any stack, serverless, or minimal infrastructure: SnapAPI PDF endpoint via HTTP.
Need authenticated page rendering, CSS injection, or wait-for-selector: SnapAPI.
Need to generate PDFs at high concurrency without browser process management: SnapAPI.
Need to save the result to S3 and return a pre-signed download URL: SnapAPI plus AWS SDK.

Sign up at snapapi.pics for 200 free PDF generations to test the API approach before committing.
