Building a screenshot API sounds simple — spin up a browser, navigate to a URL, take a screenshot. In practice, you need to handle concurrent requests, browser crashes, memory leaks, timeouts, caching, storage, and dozens of edge cases. This tutorial walks through building a production-grade screenshot API from scratch with Node.js, Playwright, Express, BullMQ, and Redis — then shows why you might want to use SnapAPI instead.
Architecture Overview
A production screenshot API needs several components working together: an HTTP server to accept requests, a job queue for async processing, a browser pool for rendering, object storage for results, and caching to avoid redundant captures. Here's the full stack:
- Express — HTTP server with validation and rate limiting
- BullMQ + Redis — Job queue for async screenshot processing
- Playwright — Headless Chromium for page rendering
- S3-compatible storage — Store screenshot images
- Redis — Caching layer for duplicate URL requests
Step 1: Basic Express Server
Start with a simple Express server that accepts screenshot requests and validates input:
import express from 'express';
import { z } from 'zod';
const app = express();
app.use(express.json());
// Request validation schema
const screenshotSchema = z.object({
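// Accept only http(s) targets; z.string().url() alone also allows schemes such as file:
url: z.string().url().refine((u) => /^https?:\/\//i.test(u), { message: 'Only http(s) URLs are allowed' }),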
width: z.number().min(320).max(3840).default(1280),
height: z.number().min(240).max(2160).default(800),
fullPage: z.boolean().default(false),
// Playwright's page.screenshot() produces PNG or JPEG only, so 'webp' would fail at capture time
format: z.enum(['png', 'jpeg']).default('png'),
quality: z.number().min(1).max(100).optional(),
delay: z.number().min(0).max(10000).default(0),
});
app.post('/v1/screenshot', async (req, res) => {
try {
const params = screenshotSchema.parse(req.body);
// TODO: add authentication, rate limiting, queuing
const screenshot = await captureScreenshot(params);
res.json({
success: true,
url: screenshot.url,
width: params.width,
height: params.height,
});
} catch (error) {
if (error instanceof z.ZodError) {
return res.status(400).json({ error: error.issues });
}
console.error('Screenshot failed:', error);
res.status(500).json({ error: 'Screenshot capture failed' });
}
});
app.listen(3000, () => console.log('Screenshot API on port 3000'));
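Schema validation confirms that `url` is well-formed, but not that it is safe to render. A public screenshot service that loads any address can be pointed at internal hosts or cloud metadata endpoints. The sketch below shows one way to pre-filter targets. `isSafeTargetUrl` is a hypothetical helper, not part of Express or zod, and it only inspects the literal hostname. It does not resolve DNS, so treat it as a first line of defense rather than a complete guard.

```javascript
// Hypothetical helper: rejects URLs that a public screenshot service should not render.
// Assumption: checking the literal hostname is enough for a first pass (no DNS lookup).
function isSafeTargetUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // not a parseable URL
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return false; // blocks file:, data:, and other non-web schemes
  }
  const host = parsed.hostname.toLowerCase();
  if (host === 'localhost' || host.endsWith('.localhost') || host === '[::1]') {
    return false; // loopback names
  }
  const ipv4 = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (ipv4) {
    const a = Number(ipv4[1]);
    const b = Number(ipv4[2]);
    if (a === 0 || a === 10 || a === 127) return false; // "this network", private, loopback
    if (a === 169 && b === 254) return false; // link-local, including cloud metadata
    if (a === 172 && b >= 16 && b <= 31) return false; // private
    if (a === 192 && b === 168) return false; // private
  }
  return true;
}
```

One place to call it would be right after schema parsing, returning a 400 when it yields `false`.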
Step 2: Browser Pool
A single browser instance can't handle concurrent requests reliably. Build a browser pool that reuses a fixed set of browser processes, gives each request its own isolated context, and relaunches browsers that crash:
import { chromium } from 'playwright';
class BrowserPool {
constructor(options = {}) {
this.maxBrowsers = options.maxBrowsers || 3;
this.maxPagesPerBrowser = options.maxPagesPerBrowser || 5;
this.browsers = [];
this.pageCount = new Map();
}
async init() {
for (let i = 0; i < this.maxBrowsers; i++) {
await this.launchBrowser();
}
console.log(`Browser pool ready: ${this.browsers.length} browsers`);
}
async launchBrowser() {
const browser = await chromium.launch({
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-dev-shm-usage',
'--disable-gpu',
],
});
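browser.on('disconnected', () => {
this.browsers = this.browsers.filter(b => b !== browser);
this.pageCount.delete(browser);
// 'disconnected' also fires on an intentional close(), so skip the relaunch during shutdown
if (this.closing) return;
console.log('Browser disconnected, relaunching...');
this.launchBrowser().catch((err) => console.error('Relaunch failed:', err));
});
this.browsers.push(browser);
this.pageCount.set(browser, 0);
return browser;
}
async getPage() {
// Find browser with fewest pages
let bestBrowser = null;
let minPages = Infinity;
for (const browser of this.browsers) {
const count = this.pageCount.get(browser) || 0;
if (count < this.maxPagesPerBrowser && count < minPages) {
bestBrowser = browser;
minPages = count;
}
}
if (!bestBrowser) {
throw new Error('All browsers at capacity');
}
// Reserve the slot before awaiting so concurrent callers cannot overbook this browser
this.pageCount.set(bestBrowser, minPages + 1);
try {
const context = await bestBrowser.newContext();
const page = await context.newPage();
return { page, context, browser: bestBrowser };
} catch (err) {
this.releaseSlot(bestBrowser);
throw err;
}
}
releaseSlot(browser) {
// Skip browsers that crashed and were already removed from the pool
if (!this.pageCount.has(browser)) return;
this.pageCount.set(browser, Math.max(0, this.pageCount.get(browser) - 1));
}
async releasePage({ page, context, browser }) {
try {
await context.close();
} catch (e) { /* browser may have crashed */ }
this.releaseSlot(browser);
}
async close() {
this.closing = true; // stop the 'disconnected' handler from relaunching
await Promise.all(this.browsers.map(b => b.close()));
}
}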
}
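const pool = new BrowserPool({ maxBrowsers: 3, maxPagesPerBrowser: 5 });
// Launch the browsers before serving requests; getPage() throws while the pool is empty
await pool.init();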
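The pool rejects a request outright when every browser is at capacity. If you would rather have callers wait for a free slot, one option is a small counting semaphore in front of the pool. The sketch below is a minimal, dependency-free version. `Semaphore` and `withSlot` are hypothetical names introduced for illustration, not part of Playwright.

```javascript
// Minimal counting semaphore (hypothetical helper, not part of Playwright).
class Semaphore {
  constructor(limit) {
    this.available = limit;
    this.waiters = [];
  }

  async acquire() {
    if (this.available > 0) {
      this.available -= 1;
      return;
    }
    // No slot free: park until release() hands one over
    await new Promise((resolve) => this.waiters.push(resolve));
  }

  release() {
    const next = this.waiters.shift();
    if (next) {
      next(); // pass the slot directly to the oldest waiter
    } else {
      this.available += 1;
    }
  }
}

// Runs an async task while holding one slot; the slot is returned even if the task throws.
async function withSlot(semaphore, task) {
  await semaphore.acquire();
  try {
    return await task();
  } finally {
    semaphore.release();
  }
}
```

With the pool settings above, a limit of 15 (3 browsers times 5 pages each) would match total capacity, and a capture could be wrapped as `withSlot(slots, () => captureScreenshot(params))`.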
Step 3: Screenshot Capture Logic
The capture function navigates to the URL, waits for the page to load, and takes the screenshot with the requested options:
async function captureScreenshot(params) {
const { page, context, browser } = await pool.getPage();
try {
// Set viewport
await page.setViewportSize({
width: params.width,
height: params.height,
});
// Block heavy resources for speed (blocking fonts makes text fall back to system fonts)
await page.route('**/*', (route) => {
const type = route.request().resourceType();
if (['media', 'font'].includes(type)) {
return route.abort();
}
return route.continue();
});
// Navigate with timeout
await page.goto(params.url, {
waitUntil: 'networkidle',
timeout: 30000,
});
// Optional delay for animations
if (params.delay > 0) {
await page.waitForTimeout(params.delay);
}
// Take screenshot
const buffer = await page.screenshot({
fullPage: params.fullPage,
type: params.format,
quality: params.format !== 'png' ? params.quality : undefined,
});
// Upload to S3
const key = `screenshots/${Date.now()}-${crypto.randomUUID()}.${params.format}`;
const url = await uploadToS3(key, buffer, `image/${params.format}`);
return { url, size: buffer.length };
} finally {
await pool.releasePage({ page, context, browser });
}
}
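The `timeout: 30000` option bounds navigation only; the upload step, for example, has no explicit deadline in the function above. A simple way to cap the whole capture is to race it against a timer. The following is a sketch under that assumption. `withTimeout` is a hypothetical helper, and it does not cancel the underlying work, which keeps running until it settles on its own.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
// Note: the wrapped operation is not cancelled; only the caller stops waiting.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer either way so it does not keep the process alive
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A call might look like `withTimeout(captureScreenshot(params), 45000, 'capture')`, where the 45-second budget is an arbitrary illustrative value.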
Step 4: Job Queue with BullMQ
For production traffic, process screenshots asynchronously with BullMQ. This prevents browser overload and provides retry logic:
import { Queue, Worker } from 'bullmq';
import Redis from 'ioredis';
const redis = new Redis({ maxRetriesPerRequest: null });
const screenshotQueue = new Queue('screenshots', {
connection: redis,
defaultJobOptions: {
attempts: 3,
backoff: { type: 'exponential', delay: 2000 },
removeOnComplete: { count: 1000 },
removeOnFail: { count: 5000 },
},
});
// Worker processes screenshot jobs
const worker = new Worker('screenshots', async (job) => {
const { params, requestId } = job.data;
console.log(`Processing ${requestId}: ${params.url}`);
const result = await captureScreenshot(params);
// Store result in Redis for pickup
await redis.set(
`result:${requestId}`,
JSON.stringify(result),
'EX', 3600 // 1 hour TTL
);
return result;
}, {
connection: redis,
concurrency: 5, // Process 5 jobs at once
limiter: { max: 10, duration: 1000 }, // Rate limit
});
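// Updated API endpoint: enqueue the job and let the client poll for the result
app.post('/v1/screenshot', async (req, res) => {
// safeParse avoids an unhandled rejection when the body is invalid
const parsed = screenshotSchema.safeParse(req.body);
if (!parsed.success) {
return res.status(400).json({ error: parsed.error.issues });
}
const requestId = crypto.randomUUID();
// Reuse requestId as the BullMQ job ID so the polling endpoint can look the job up
await screenshotQueue.add('capture', { params: parsed.data, requestId }, { jobId: requestId });
res.status(202).json({
requestId,
status: 'processing',
pollUrl: `/v1/screenshot/${requestId}`,
});
});
app.get('/v1/screenshot/:id', async (req, res) => {
const result = await redis.get(`result:${req.params.id}`);
if (result) {
return res.json({ status: 'complete', ...JSON.parse(result) });
}
// No stored result yet: distinguish unknown IDs and failed jobs from jobs still in flight
const job = await screenshotQueue.getJob(req.params.id);
if (!job) {
return res.status(404).json({ status: 'not_found' });
}
if (await job.isFailed()) {
return res.status(500).json({ status: 'failed', error: 'Screenshot capture failed' });
}
res.json({ status: 'processing' });
});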
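On the client side, the async flow means submitting the job and then polling `pollUrl` until the status changes. Here is a minimal sketch of that loop. `waitForScreenshot` is a hypothetical helper, the interval and attempt limit are illustrative defaults, and `fetchImpl` is injectable so the function can be exercised without a running server.

```javascript
// Hypothetical client helper for the polling flow.
// Assumes the server answers with JSON of the form { status: 'processing' | 'complete', ... }.
async function waitForScreenshot(pollUrl, options = {}) {
  const { fetchImpl = fetch, intervalMs = 1000, maxAttempts = 30 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetchImpl(pollUrl);
    if (!res.ok) {
      throw new Error(`Polling failed with HTTP ${res.status}`);
    }
    const body = await res.json();
    if (body.status === 'complete') {
      return body; // carries the fields stored by the worker, such as url and size
    }
    if (body.status !== 'processing') {
      throw new Error(`Unexpected job status: ${body.status}`);
    }
    // Still in progress: wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Screenshot not ready after ${maxAttempts} attempts`);
}
```

Fixed-interval polling keeps the example short; a production client might prefer backoff between attempts.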
Step 5: Caching Layer
Avoid recapturing the same URL by caching results based on URL + parameters:
import crypto from 'crypto';
function getCacheKey(params) {
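const normalized = JSON.stringify({
url: params.url,
width: params.width,
height: params.height,
fullPage: params.fullPage,
format: params.format,
// quality and delay change the rendered image, so they belong in the key too
quality: params.quality ?? null,
delay: params.delay,
});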
return `cache:screenshot:${crypto.createHash('md5').update(normalized).digest('hex')}`;
}
async function captureWithCache(params) {
const cacheKey = getCacheKey(params);
// Check cache
const cached = await redis.get(cacheKey);
if (cached) {
console.log(`Cache hit for ${params.url}`);
return JSON.parse(cached);
}
// Capture and cache
const result = await captureScreenshot(params);
await redis.set(cacheKey, JSON.stringify(result), 'EX', 1800); // 30 min
return result;
}
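A read-through cache like this still lets two identical requests that arrive at the same moment both miss and both capture. One common remedy is to share the in-progress promise between them. The sketch below illustrates the idea. `singleFlight` and `inFlight` are hypothetical names, and the map lives in one process only, so multiple workers would need a shared lock (for example in Redis) to get the same effect.

```javascript
// Tracks work that is currently running, keyed by cache key (per process only).
const inFlight = new Map();

// Runs `work` at most once per key at a time; concurrent callers share the same promise.
function singleFlight(key, work) {
  const existing = inFlight.get(key);
  if (existing) return existing;

  const promise = Promise.resolve()
    .then(work)
    // Forget the entry once settled so later calls start fresh
    .finally(() => inFlight.delete(key));

  inFlight.set(key, promise);
  return promise;
}
```

It could wrap the cached capture as `singleFlight(getCacheKey(params), () => captureWithCache(params))`.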
Step 6: S3 Storage
Upload screenshots to S3-compatible storage for reliable delivery:
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
const s3 = new S3Client({
region: process.env.S3_REGION,
endpoint: process.env.S3_ENDPOINT,
credentials: {
accessKeyId: process.env.S3_ACCESS_KEY,
secretAccessKey: process.env.S3_SECRET_KEY,
},
});
async function uploadToS3(key, buffer, contentType) {
await s3.send(new PutObjectCommand({
Bucket: process.env.S3_BUCKET,
Key: key,
Body: buffer,
ContentType: contentType,
CacheControl: 'public, max-age=86400',
}));
return `${process.env.CDN_URL}/${key}`;
}
The Hard Parts You'll Discover
Building the basic screenshot API is the easy part. Here's what makes it a real engineering challenge in production:
- Memory leaks. Long-running Chromium processes tend to accumulate memory. You need to restart browsers periodically, track RSS usage, and kill zombie processes. Expect to write a watchdog daemon.
- Browser crashes. Certain pages crash Chromium — infinite loops, massive DOMs, WebGL contexts. Your pool needs auto-recovery, and jobs need retry logic with different browser instances.
- Anti-bot detection. Many sites detect headless Chrome via navigator properties, WebGL fingerprints, and CDP detection. You need stealth plugins, proxy rotation, and realistic browser profiles.
- Resource blocking. Ads, trackers, and cookie banners slow captures and add visual noise. You need filter lists (EasyList, EasyPrivacy) and custom dismiss logic for consent popups.
- Font rendering. System fonts differ between Linux servers and user machines. Screenshots look different without proper font installation and configuration.
- Scale. Each browser instance consumes 200-500 MB of RAM. At 100 concurrent users, you need a cluster of servers, load balancing, and health checks.
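The periodic-restart idea from the memory-leak item can be sketched as a small policy function that a watchdog timer would call for each browser. Everything here is an assumption for illustration: the `stats` record, the `shouldRecycle` name, and the thresholds are hypothetical and would need tuning against real memory measurements.

```javascript
// Illustrative thresholds only; pick values based on your own monitoring.
const DEFAULT_LIMITS = { maxPagesServed: 200, maxAgeMs: 30 * 60 * 1000 };

// Decides whether an idle browser should be closed and replaced.
// `stats` is a hypothetical per-browser record: { launchedAt, pagesServed, activePages }.
function shouldRecycle(stats, limits = DEFAULT_LIMITS, now = Date.now()) {
  if (stats.activePages > 0) return false; // never kill a browser mid-capture
  const tooManyPages = stats.pagesServed >= limits.maxPagesServed;
  const tooOld = now - stats.launchedAt >= limits.maxAgeMs;
  return tooManyPages || tooOld;
}
```

Recycling by age and page count sidesteps the need to measure Chromium's child-process memory directly, at the cost of being a blunt heuristic.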
Or Just Use SnapAPI
All of the above — browser pools, crash recovery, caching, storage, stealth mode, ad blocking, device emulation — is exactly what SnapAPI provides as a managed service. One API call replaces hundreds of lines of infrastructure code:
import SnapAPI from 'snapapi-js';
const snap = new SnapAPI('sk_live_your_key');
// This replaces everything above
const result = await snap.screenshot({
url: 'https://example.com',
full_page: true,
format: 'png',
width: 1280,
height: 800,
block_ads: true,
block_cookie_banners: true,
device: 'desktop',
});
console.log(result.url); // CDN-delivered screenshot URL
// Plus features you'd spend weeks building:
// - 30+ device presets (iPhone, Pixel, iPad, etc.)
// - Stealth mode for anti-bot sites
// - Custom CSS/JS injection
// - Wait for selectors or network idle
// - Webhook delivery on completion
// - Video recording of page load
Build vs. Buy Comparison
| Aspect | Build It Yourself | SnapAPI |
|---|---|---|
| Setup time | 2-4 weeks | 5 minutes |
| Infrastructure cost | $50-500/mo (servers) | Free tier: 200 captures/mo |
| Browser management | You handle crashes, memory, updates | Managed |
| Anti-bot bypass | Build stealth plugins | Built-in stealth mode |
| Ad/cookie blocking | Maintain filter lists | Built-in |
| Device emulation | Manual viewport config | 30+ presets |
| Scaling | Cluster management | Auto-scales |
| Additional features | Build each one | Scraping, extraction, PDF, video, AI analysis |
| Maintenance | Ongoing | Zero |
Skip the Infrastructure — Use SnapAPI
Screenshots, scraping, content extraction, PDF generation, video recording, and AI analysis. One API, zero browser management. Free tier includes 200 captures/month.