Data Extraction API
Web-to-JSON
Any URL into clean structured JSON in one API call.
POST a URL and get normalized JSON back. No brittle CSS selectors. Built for SPA rendering, dynamic pages, login gates, and paywall-heavy sites where simple HTML scraping fails.
Simple pricing: $0.01/page pay-as-you-go or $15/month unlimited.
Example request:
curl -X POST https://api.webtojson.dev/api/extract \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_KEY" \
-d '{"url":"https://example.com/post"}'
Response shape:
{
"title": "Page title",
"summary": "Concise factual summary",
"entities": [...],
"fields": {...},
"keyPoints": [...]
}
Why Most Scrapers Miss the Data
Solo builders need output they can trust in production, not selector whack-a-mole. Web-to-JSON focuses on rendered content quality first.
Selectors Break Every Week
Most pages ship JS-heavy UI changes constantly. Hardcoded selectors drift and your scraper silently returns junk.
SPAs Hide Real Content
Basic HTTP fetch misses the rendered content because data lands after hydration, lazy loading, or route transitions.
Auth Walls & Paywalls
The useful data often sits behind soft walls or anti-bot messaging where naive extraction gives fragments only.
What You Get In One Call
The API is intentionally narrow: send a URL in, receive practical JSON out, with enough structure to plug directly into lead enrichment, content indexing, or monitoring workflows.
Rendered Browser Capture
Puppeteer loads the page as a real browser session so SPAs, delayed scripts, and dynamic layouts are actually captured.
AI Structured Extraction
Claude or OpenAI converts noisy page text into stable JSON with key points, entities, and typed fields.
Paywall-Aware Signals
Returns paywall/auth-wall indicators so your pipeline knows when extraction quality is likely constrained.
Cookie-Based Paid Access
Paying users unlock `/tool` and `/api/extract` access via a secure cookie, keeping the monetized feature behind the wall.
Simple Pricing for Indie Builders
Start with low variable cost when you are validating. Switch to unlimited when extraction becomes core.
Pay-as-you-go
$0.01/page
Ideal for data experiments, side projects, and low-volume jobs where every call should stay cheap.
- Full rendered extraction
- Structured JSON output
- API + dashboard tool access
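To make the trade-off between the two plans concrete, here is a small sketch of the break-even arithmetic. It assumes only the two prices quoted on this page ($0.01/page and $15/month); the function name `cheaperPlan` is hypothetical, and amounts are kept in integer cents to avoid floating-point drift.

```typescript
// Prices from this page, expressed in integer cents.
const PAYG_CENTS_PER_PAGE = 1; // $0.01 per page
const UNLIMITED_CENTS_PER_MONTH = 1500; // $15 per month

type PlanChoice = {
  plan: "pay-as-you-go" | "unlimited";
  monthlyCostUsd: number;
};

// Hypothetical helper: returns the cheaper plan for a given monthly volume.
// On a tie (exactly 1,500 pages) it keeps pay-as-you-go.
function cheaperPlan(pagesPerMonth: number): PlanChoice {
  const paygCents = pagesPerMonth * PAYG_CENTS_PER_PAGE;
  if (paygCents <= UNLIMITED_CENTS_PER_MONTH) {
    return { plan: "pay-as-you-go", monthlyCostUsd: paygCents / 100 };
  }
  return { plan: "unlimited", monthlyCostUsd: UNLIMITED_CENTS_PER_MONTH / 100 };
}
```

Under these prices the two plans cost the same at 1,500 pages per month; below that volume pay-as-you-go is cheaper.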
API Quickstart
The extractor endpoint accepts a URL and returns normalized JSON from rendered page content.
POST /api/extract
Request body
{
"url": "https://example.com"
}
Response body (example)
{
"ok": true,
"data": {
"title": "Example Domain",
"summary": "...",
"keyPoints": ["..."],
"entities": [],
"fields": {
"author": null,
"publishedDate": null,
"price": null
}
}
}
If neither `OPENAI_API_KEY` nor `ANTHROPIC_API_KEY` is configured, the endpoint still returns useful heuristic JSON so your workflow remains operational.
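The fallback behaviour can be pictured with a rough sketch like the one below. It is not the service's actual implementation: the provider precedence, the function names `pickProvider` and `heuristicExtract`, and the heuristic itself (first line as title, next lines as summary and key points) are all assumptions made for illustration. Only the two environment variable names and the response field names come from this page.

```typescript
type Provider = "anthropic" | "openai" | "heuristic";

// Hypothetical: choose an extraction backend from the configured keys.
// The order (Anthropic before OpenAI) is an assumption.
function pickProvider(env: Record<string, string | undefined>): Provider {
  if (env["ANTHROPIC_API_KEY"]) return "anthropic";
  if (env["OPENAI_API_KEY"]) return "openai";
  return "heuristic";
}

// Hypothetical heuristic fallback: derives the documented fields from plain
// page text without calling any model.
function heuristicExtract(pageText: string) {
  const lines = pageText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return {
    title: lines[0] ?? "",
    summary: lines.slice(1, 3).join(" "),
    keyPoints: lines.slice(1, 4),
    entities: [],
    fields: { author: null, publishedDate: null, price: null },
  };
}
```

In a Node.js server, `pickProvider(process.env)` would be the natural call site; the environment is passed in as a parameter here so the sketch stays self-contained.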
Frequently Asked Questions
How is this different from web-access skills?
Web-access-style skills are built for conversational agents. Web-to-JSON is an API-first service designed for production pipelines and automation jobs.
Do I need to maintain selectors?
No. You send only the URL. The extractor renders the page, captures the meaningful content, and returns normalized JSON without CSS selector maintenance.
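As an illustration of the "URL in, JSON out" contract, a minimal client might look like the sketch below. It assumes the endpoint URL and Bearer header from the curl example and the `{ ok, data }` envelope from the Quickstart; the field types beyond what the example shows, and the names `extractUrl`, `isExtractEnvelope`, and `HttpPost`, are assumptions. The HTTP function is passed in so the sketch does not depend on a particular runtime.

```typescript
// Field names follow the documented example; exact types are assumptions.
interface ExtractData {
  title: string;
  summary: string;
  keyPoints: string[];
  entities: unknown[];
  fields: Record<string, unknown>;
}

interface ExtractEnvelope {
  ok: boolean;
  data: ExtractData;
}

// Minimal structural types, modeled on the standard fetch API.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type HttpPost = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<HttpResponse>;

// Runtime check that a payload looks like the documented envelope.
function isExtractEnvelope(value: unknown): value is ExtractEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { ok?: unknown; data?: unknown };
  if (typeof v.ok !== "boolean") return false;
  if (typeof v.data !== "object" || v.data === null) return false;
  const d = v.data as Record<string, unknown>;
  return typeof d["title"] === "string" && Array.isArray(d["keyPoints"]);
}

// Sends one URL to the extractor and returns the `data` object.
async function extractUrl(
  post: HttpPost,
  apiKey: string,
  targetUrl: string
): Promise<ExtractData> {
  const res = await post("https://api.webtojson.dev/api/extract", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ url: targetUrl }),
  });
  if (!res.ok) {
    throw new Error(`Extraction request failed with HTTP ${res.status}`);
  }
  const payload = await res.json();
  if (!isExtractEnvelope(payload) || !payload.ok) {
    throw new Error("Unexpected response shape from /api/extract");
  }
  return payload.data;
}
```

In a runtime with a global `fetch` (modern browsers, Node.js 18+), passing `fetch` as the first argument should be enough; no selector configuration appears anywhere in the request.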
Can it handle SPAs and delayed content?
Yes. Puppeteer executes client-side JavaScript, waits for network idle, and scrolls to trigger lazy-loaded sections before extraction.
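The render, wait, and scroll sequence described above could be sketched roughly as follows. This is an illustration under assumptions, not the service's code: `PageLike` is a hand-written interface modeled on Puppeteer's `page.goto` and `page.evaluate` (a real Puppeteer page would be passed in its place), and `captureRenderedText`, the scroll limit, and the pause length are hypothetical choices.

```typescript
// Minimal interface modeled on the parts of Puppeteer's Page used here.
interface PageLike {
  goto(
    url: string,
    options?: {
      waitUntil?: "load" | "domcontentloaded" | "networkidle0" | "networkidle2";
    }
  ): Promise<unknown>;
  // Script strings are evaluated in the page context.
  evaluate(script: string): Promise<unknown>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Loads the page, waits for the network to settle, then scrolls to the bottom
// repeatedly until the page height stops growing or maxScrolls is reached.
async function captureRenderedText(
  page: PageLike,
  url: string,
  maxScrolls = 10,
  pauseMs = 500
): Promise<string> {
  await page.goto(url, { waitUntil: "networkidle2" });

  let previousHeight = -1;
  for (let i = 0; i < maxScrolls; i++) {
    const height = Number(await page.evaluate("document.body.scrollHeight"));
    if (height === previousHeight) break; // no new content was loaded
    previousHeight = height;
    await page.evaluate("window.scrollTo(0, document.body.scrollHeight)");
    await sleep(pauseMs); // give lazy-loaded sections time to arrive
  }

  return String(await page.evaluate("document.body.innerText"));
}
```

Stopping when the height no longer changes keeps the loop short on static pages, while the `maxScrolls` cap guards against infinite-scroll feeds.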
How does paid access work with Stripe Payment Link?
After checkout, Stripe redirects the buyer to your success URL with the checkout session ID in the query string. The success page exchanges that session ID for an httpOnly cookie that unlocks the protected tool and API endpoint.
What should I configure in Stripe?
Set your payment link's success URL to `/success?session_id={CHECKOUT_SESSION_ID}` so buyers can be unlocked automatically after payment.
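The unlock step on the success page might look roughly like the sketch below. It assumes the server has already looked up the Checkout Session for the `session_id` query parameter (for example via Stripe's API) and only shows the decision and the resulting `Set-Cookie` header. The cookie name `wtj_access`, the 30-day lifetime, the opaque `token`, and the function name `buildAccessCookie` are hypothetical; `payment_status` and its values follow Stripe's Checkout Session object.

```typescript
// Subset of Stripe's Checkout Session object used by this sketch.
interface CheckoutSession {
  id: string;
  payment_status: "paid" | "unpaid" | "no_payment_required";
}

const THIRTY_DAYS_SECONDS = 60 * 60 * 24 * 30;

// Returns a Set-Cookie header value for a paid session, or null otherwise.
// `token` stands for whatever opaque access token the server issues.
function buildAccessCookie(
  session: CheckoutSession,
  token: string,
  maxAgeSeconds: number = THIRTY_DAYS_SECONDS
): string | null {
  if (session.payment_status !== "paid") return null;
  return [
    `wtj_access=${encodeURIComponent(token)}`, // hypothetical cookie name
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
    "HttpOnly", // not readable from page JavaScript
    "Secure", // sent over HTTPS only
    "SameSite=Lax",
  ].join("; ");
}
```

Issuing the cookie only after confirming `payment_status` on the server means a buyer cannot unlock access by guessing or replaying a success URL for an unpaid session.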