Use Case Guide · Updated February 2026

Web Archiving API: Capture & Archive Web Pages

Q: What's the maximum page size for full-page screenshots?

SnapAPI can capture pages up to 16,384 pixels tall. For longer pages, use PDF format.

Web content disappears. Pages get updated, redesigned, or taken down entirely. For compliance teams, legal departments, and researchers, having a reliable web archiving solution is critical. You need timestamped proof of what a page looked like at a specific moment — not just the HTML, but the actual visual rendering.

SnapAPI provides the building blocks for automated web archiving: screenshots for visual records, PDFs for document-quality archives, and data extraction for structured content preservation. Combine all three for comprehensive archival.

📦 Archive Any Web Page

Screenshots, PDFs, and extracted content. Timestamped, reliable, automated.

Get Free API Key →

Multi-Format Web Archiving

A robust web archive captures pages in multiple formats. Here's how to create comprehensive archives with SnapAPI:

curl — Complete Page Archive

# Visual screenshot
curl "https://api.snapapi.pics/v1/screenshot?url=https://example.com/page&fullPage=true&format=png" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o archive-2026-02-12-screenshot.png

# PDF document
curl "https://api.snapapi.pics/v1/pdf?url=https://example.com/page&printBackground=true" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o archive-2026-02-12.pdf

# Extracted content
curl "https://api.snapapi.pics/v1/extract?url=https://example.com/page" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o archive-2026-02-12-content.json

Node.js — Automated Archiving Pipeline

async function archivePage(url) {
  const timestamp = new Date().toISOString();
  const slug = new URL(url).hostname.replace(/\./g, '-');
  const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };

  const [screenshot, pdf, content] = await Promise.all([
    fetch(`https://api.snapapi.pics/v1/screenshot?url=${encodeURIComponent(url)}&fullPage=true&format=png`, { headers }),
    fetch(`https://api.snapapi.pics/v1/pdf?url=${encodeURIComponent(url)}&printBackground=true`, { headers }),
    fetch(`https://api.snapapi.pics/v1/extract?url=${encodeURIComponent(url)}`, { headers }).then(r => r.json())
  ]);

  const archive = {
    url,
    timestamp,
    content,
    screenshotPath: `archives/${slug}-${timestamp}.png`,
    pdfPath: `archives/${slug}-${timestamp}.pdf`
  };

  await fs.promises.writeFile(archive.screenshotPath, Buffer.from(await screenshot.arrayBuffer()));
  await fs.promises.writeFile(archive.pdfPath, Buffer.from(await pdf.arrayBuffer()));
  await fs.promises.writeFile(`archives/${slug}-${timestamp}.json`, JSON.stringify(archive, null, 2));

  return archive;
}

Python — Scheduled Archiving

import requests
from datetime import datetime

def archive_page(url):
    headers = {'Authorization': 'Bearer YOUR_API_KEY'}
    timestamp = datetime.now().strftime('%Y-%m-%d_%H-%M')
    slug = url.replace('https://', '').replace('/', '_')

    # Capture screenshot
    screenshot = requests.get(
        'https://api.snapapi.pics/v1/screenshot',
        params={'url': url, 'fullPage': True, 'format': 'png'},
        headers=headers
    )
    with open(f'archives/{slug}-{timestamp}.png', 'wb') as f:
        f.write(screenshot.content)

    # Capture PDF
    pdf = requests.get(
        'https://api.snapapi.pics/v1/pdf',
        params={'url': url, 'printBackground': True},
        headers=headers
    )
    with open(f'archives/{slug}-{timestamp}.pdf', 'wb') as f:
        f.write(pdf.content)

    # Extract content
    content = requests.get(
        'https://api.snapapi.pics/v1/extract',
        params={'url': url},
        headers=headers
    ).json()

    return {'url': url, 'timestamp': timestamp, 'content': content}

# Archive important pages daily
urls_to_archive = [
    'https://competitor.com/pricing',
    'https://regulator.gov/guidelines',
    'https://partner.com/terms',
]
for url in urls_to_archive:
    archive_page(url)

Web Archiving: API vs DIY

Aspect	DIY (wget / HTTrack / Puppeteer)	SnapAPI
Visual fidelity	wget/HTTrack miss JS-rendered content	Real browser rendering
JavaScript pages	wget can't render JS; Puppeteer can	Full JS execution
Multiple formats	Need separate tools	Screenshot + PDF + Extract in one API
Cookie popups	Manual dismissal	Auto-blocked
Server maintenance	Your responsibility	Fully managed
Full-page capture	Complex scroll-stitch logic	Built-in fullPage

Archiving Use Cases

⚖️ Legal Evidence

Capture timestamped screenshots for intellectual property disputes, defamation cases, and contract evidence.

📋 Regulatory Compliance

Archive financial disclosures, terms of service, and regulatory filings as required by law.

🏛️ Historical Records

Preserve web content for research, journalism, and institutional knowledge management.

🔄 Change Tracking

Monitor terms of service, pricing pages, and policy documents for changes over time.

Archiving Best Practices

Capture in multiple formats — screenshot (visual proof), PDF (printable document), and extracted text (searchable content)
Use full-page screenshots — fullPage=true captures everything, not just the viewport
Timestamp everything — include ISO timestamps in filenames and metadata
Store securely — use immutable storage (S3 Object Lock, WORM) for legal evidence
Hash for integrity — compute SHA-256 hashes of archived files to prove they haven't been tampered with
Block cookie banners — use blockCookieBanners=true for clean archives without consent overlays

Start Archiving Web Pages

Screenshots, PDFs, and content extraction. Compliance-ready archiving.

Get Free API Key →

FAQ

Can I archive JavaScript-heavy pages?

Yes. SnapAPI uses a real Chromium browser that fully executes JavaScript. Single-page apps, dynamic content, and interactive elements all render correctly.

How do I prove an archive is authentic?

Compute SHA-256 hashes of your archived files immediately after capture. Store hashes separately (e.g., in a database or blockchain). This proves the file hasn't been modified since capture.

Can I archive pages behind login?

SnapAPI supports cookie injection for authenticated sessions. Pass cookies as parameters to capture pages that require login.

What's the maximum page size for full-page screenshots?

SnapAPI can capture pages up to 16,384 pixels tall. For extremely long pages, consider using PDF format instead, which handles pagination automatically.

Related: SEO Monitoring · PDF Generation · Free Screenshot API