Latest: v1.3.0 — March 2026

API Changelog

Every update, fix, and new feature across all Papalily API releases. We ship fast and document everything.

v1.3.0 Latest March 10, 2026

Major new capability: interactive browser automation. Papalily can now execute JavaScript, fill forms, click buttons, paginate, and maintain browser sessions across multiple API calls. Includes a natural language task planner that converts plain-English goals into executable steps automatically.

New Endpoints
  • POST /interact — Execute a sequence of interactive steps on a real browser page. Accepts steps (explicit) or task (natural language, AI-planned).
  • POST /session/start — Open a persistent browser session. Context stays alive between API calls. Available on Pro and above.
  • POST /session/:id/step — Execute one step or a natural-language task on a live session.
  • GET /session/:id/state — Get current URL, title, and screenshot of a live session.
  • DELETE /session/:id — Close a session and free browser resources.
New Features
  • Natural language task planner — Pass "task": "..." to /interact or /session/:id/step. The AI snapshots the live page, generates a step plan, and executes it automatically. Plans are cached per domain+task for 1 hour.
  • CSS schema extraction — New css_schema step type extracts structured data via CSS selectors with zero AI cost. Faster and cheaper than extract when page structure is known.
  • Per-request browser contexts — Each request now gets an isolated Playwright context, preventing cookie/state leaks between users.
  • Crash guards: unhandledRejection and uncaughtException handlers prevent full process crashes on async errors.
  • PM2 memory restart — Server auto-restarts at 1GB memory to prevent OOM crashes.
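The "cached per domain+task for 1 hour" behaviour of the task planner can be pictured with a small in-memory cache. This is a sketch under assumptions, not the server's code: the class name, the injectable clock, and the way the task text is normalised are all hypothetical.

```python
import time
from typing import Callable, Optional
from urllib.parse import urlsplit

PLAN_TTL_SECONDS = 60 * 60  # plans are kept for 1 hour, per the entry above


class PlanCache:
    """Toy cache keyed by (domain, task); illustrative only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[tuple[str, str], tuple[float, list[dict]]] = {}

    @staticmethod
    def _key(url: str, task: str) -> tuple[str, str]:
        # Assumption: tasks are compared case-insensitively, ignoring outer spaces.
        return (urlsplit(url).hostname or "", task.strip().lower())

    def get(self, url: str, task: str) -> Optional[list[dict]]:
        key = self._key(url, task)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, plan = entry
        if self._clock() - stored_at >= PLAN_TTL_SECONDS:
            del self._entries[key]  # expired, force a fresh plan
            return None
        return plan

    def put(self, url: str, task: str, plan: list[dict]) -> None:
        self._entries[self._key(url, task)] = (self._clock(), plan)
```

Keying on the hostname rather than the full URL means two pages on the same site share a plan for the same task, which matches the "per domain" wording but may not match how the service actually derives its key.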
Fixes
  • Fixed rate limiter ERR_ERL_UNEXPECTED_X_FORWARDED_FOR false-alarm errors flooding the log.
  • Fixed mobile hamburger menu not working on Blog, Compare, and Resources pages.
  • Fixed mobile nav font size inconsistency across pages.
  • Standardised navbar links across all pages (Home | Docs | Pricing | Blog | Compare | Changelog | Resources).
v1.1.1 March 9, 2026

Removed the /batch endpoint to protect server stability. Use POST /scrape for each URL individually — either sequentially or in parallel from your own client code.

Breaking Changes
  • POST /batch removed — endpoint now returns 410 Gone. The batch endpoint spawned multiple concurrent Playwright browser instances, causing high memory pressure. Replace with individual calls to POST /scrape.
Migration
  • For sequential scraping: call POST /scrape in a loop
  • For parallel scraping: call POST /scrape concurrently from your client (e.g. Promise.all in JS, asyncio.gather in Python)
  • Cached results return instantly and are quota-free — repeated URLs benefit automatically
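For the parallel path, a client-side replacement for /batch might look like the sketch below, which uses asyncio.gather as suggested above. The scrape_one callable is a placeholder for whatever function performs the real POST /scrape request. Here it is replaced by a hypothetical fake_scrape so the example runs offline, and the error-dictionary shape is an assumption.

```python
import asyncio
from typing import Awaitable, Callable

ScrapeFn = Callable[[str], Awaitable[dict]]


async def scrape_many(urls: list[str], scrape_one: ScrapeFn) -> list[dict]:
    """Scrape every URL concurrently; results come back in input order."""
    outcomes = await asyncio.gather(
        *(scrape_one(url) for url in urls),
        return_exceptions=True,  # one failed URL should not sink the rest
    )
    return [
        {"url": url, "error": str(outcome)} if isinstance(outcome, Exception) else outcome
        for url, outcome in zip(urls, outcomes)
    ]


async def fake_scrape(url: str) -> dict:
    """Stand-in for a real POST /scrape call (hypothetical response shape)."""
    await asyncio.sleep(0)
    if "bad" in url:
        raise RuntimeError("scrape failed")
    return {"url": url, "data": {"ok": True}}


results = asyncio.run(scrape_many(
    ["https://example.com/a", "https://example.com/bad", "https://example.com/c"],
    fake_scrape,
))
```

Unlike the removed endpoint, each URL here is an independent request, so a failure is reported per item instead of failing the whole batch.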
v1.1.0 Major Update March 9, 2026

Major rendering reliability update. The API now uses an adaptive content-stability algorithm to detect when React/Vue/Next.js pages have finished hydrating — replacing fixed wait timers. Full-page screenshots, proxy support, and automatic non-English translation complete the release.

New Features
  • proxy_url parameter — route the browser through any HTTP/HTTPS/SOCKS5 proxy for geo-specific content (e.g. get USD pricing from a US IP)
  • Adaptive content-stability wait — polls innerText length every 600ms and exits only when the page stops changing, instead of a fixed delay
  • Lazy-load trigger — automatically scrolls the full page before capture to trigger intersection-observer lazy-loaded components
  • Auto-translation — results containing Korean, Japanese, Chinese, or Arabic are automatically translated to English via a second AI pass
  • API endpoint request logging — every call to every route is logged with status code and response time
  • Analytics dashboard endpoint stats — new table shows total calls, success rate, avg response time, and error count per endpoint
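The adaptive content-stability wait described above can be sketched as a polling loop. Only the 600ms interval and the "exit when the page stops changing" rule come from the changelog. The number of unchanged readings required, the timeout, and the function names are assumptions, and the page is simulated by a plain callable.

```python
import time
from typing import Callable


def wait_until_stable(
    read_length: Callable[[], int],
    interval_s: float = 0.6,   # 600ms poll interval, as stated above
    stable_polls: int = 2,     # assumption: unchanged readings required
    timeout_s: float = 15.0,   # assumption: hard upper bound on waiting
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll read_length() until it stops changing; True if it stabilised."""
    last = read_length()
    unchanged = 0
    waited = 0.0
    while waited < timeout_s:
        sleep(interval_s)
        waited += interval_s
        current = read_length()
        if current == last:
            unchanged += 1
            if unchanged >= stable_polls:
                return True
        else:
            unchanged = 0  # content is still growing, start counting again
            last = current
    return False


# Simulated hydration: text length grows, then settles at 900 characters.
readings = iter([10, 250, 900, 900, 900])
stable = wait_until_stable(lambda: next(readings), sleep=lambda _: None)
```

In a browser-driven setup, read_length would typically evaluate something like document.body.innerText.length on the live page. A fixed delay either wastes time on fast pages or cuts off slow ones, which is the trade-off this loop avoids.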
Performance
  • Page load: waits for the load event instead of networkidle, saving 1–3s per request on most sites
  • Resource blocking: images, fonts, media, and tracking scripts aborted during navigation
  • Browser context reuse — shared context kept alive between requests instead of recreating
  • Screenshot quality optimised: full page up to 5000px tall at quality 70 for richer AI analysis
  • HTML preprocessed before AI analysis — <script>, <style>, SVG, and comments stripped
  • 15 extra Chrome --disable-* flags for unused browser services
  • Gemini model instance reused at module level (no re-initialisation per request)
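The HTML preprocessing item above (stripping script, style, SVG, and comments before AI analysis) could be approximated as follows. This is a regex-based sketch for illustration only. The function name is hypothetical, and a real HTML parser would be more robust against malformed markup.

```python
import re

_STRIP_PATTERNS = [
    # HTML comments
    re.compile(r"<!--.*?-->", re.DOTALL),
    # Whole <script>, <style>, and <svg> elements, including their contents
    re.compile(r"<(script|style|svg)\b[^>]*>.*?</\1\s*>", re.DOTALL | re.IGNORECASE),
]


def strip_noise(html: str) -> str:
    """Drop comments and script/style/svg elements, then collapse blank runs."""
    for pattern in _STRIP_PATTERNS:
        html = pattern.sub("", html)
    return re.sub(r"\n\s*\n+", "\n", html).strip()


sample = """<html><head><style>body{color:red}</style>
<script type="text/javascript">track("x < y");</script></head>
<body><!-- promo banner --><h1>Plans</h1>
<svg viewBox="0 0 1 1"><path d="M0 0"/></svg><p>Pro: $29</p></body></html>"""

cleaned = strip_noise(sample)
```

Removing these elements shrinks the text sent to the model without touching visible content such as the heading and price in the sample.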
Bug Fixes
  • Fixed geo-targeted sites (Shopify, etc.) serving localised content to non-US server IPs — added locale: en-US, Accept-Language, and CF-IPCountry headers
  • Fixed Korean/CJK text leaking into extraction results — AI extraction prompt now enforces English; translation safety net as backup
  • Cookies cleared between requests — prevented stale geo-targeting cookies from affecting subsequent scrapes
  • Removed responseMimeType: application/json which was suppressing Gemini's language instruction-following
  • Fixed duplicate route handlers causing request conflicts
  • Analytics dashboard: bar charts now use %-based widths (was fixed 300px, broke on mobile)
  • Analytics mobile: Referrer/IP/Device ID columns hidden on mobile; all tables wrapped in overflow-x: auto
Infrastructure
  • Vision-first extraction — system prompt explicitly instructs AI to study screenshot for pricing cards, grids, and visual tables
  • Geo-redirect interception — path-based locale redirects (/ko/, /ja/, etc.) rewritten to /en/
  • Proxy requests use isolated one-time browser contexts — no shared state bleed
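The geo-redirect interception item can be illustrated with a small URL rewrite. The locale list and function name below are assumptions. The changelog only names /ko/ and /ja/ as examples and does not say how locales are detected.

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative allowlist; the real set of intercepted locales is not documented.
KNOWN_LOCALES = {"ko", "ja", "zh", "de", "fr", "es"}


def force_english_path(url: str) -> str:
    """Rewrite a leading locale path segment such as /ko/ or /ja/ to /en/."""
    parts = urlsplit(url)
    segments = parts.path.split("/")  # "/ko/pricing" -> ["", "ko", "pricing"]
    if len(segments) > 1 and segments[1].lower() in KNOWN_LOCALES:
        segments[1] = "en"
        return urlunsplit(parts._replace(path="/".join(segments)))
    return url  # no locale prefix, leave untouched
```

An allowlist is used instead of "any two-letter segment" so that ordinary paths such as /go/ or /me/ are not rewritten by accident.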
v1.0.2 Update March 8, 2026

Security hardening, analytics v2, self-hosted tracking dashboard, and mobile optimisation.

New Features
  • Self-hosted analytics dashboard at /analytics — pageviews, click events, top pages, referrers
  • Analytics v2: real IP tracking, device ID (__ppid localStorage UUID), browser fingerprint hash
  • Scheduled blog publishing system — 8 SEO posts drip-fed 2x/week via cron job
  • GEO/AI optimisation — ai-page.html served to AI crawlers (GPTBot, ClaudeBot, PerplexityBot) for ChatGPT Search and Perplexity extraction
  • Comparison pages: /compare/ hub, vs ScraperAPI, vs Apify
  • Resources page with curated developer tools and backlinks
Performance & Security
  • Nginx rate limiting zones: per-endpoint limits (scrape 10r/m, batch 3r/m, general 30r/m)
  • HSTS, CSP, gzip, static asset caching on www
  • MAX_CONCURRENT_SCRAPES = 3 cap to prevent OOM on concurrent Playwright instances
  • trust proxy 1 set for correct client IP behind Nginx
  • Mobile: orbs disabled on screens <768px (removed heavy filter:blur GPU load)
  • Mobile navigation added to all pages (was completely missing)
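A concurrency cap like MAX_CONCURRENT_SCRAPES = 3 is commonly enforced with a semaphore. The sketch below shows the idea in Python with simulated work. The actual server is not written this way as far as the changelog says, so treat the structure and names (other than the constant) as assumptions.

```python
import asyncio

MAX_CONCURRENT_SCRAPES = 3  # value from the changelog entry above


async def run_capped(jobs: int) -> int:
    """Run `jobs` fake scrapes and return the highest concurrency observed."""
    gate = asyncio.Semaphore(MAX_CONCURRENT_SCRAPES)
    active = 0
    peak = 0

    async def fake_scrape() -> None:
        nonlocal active, peak
        async with gate:  # at most three bodies run past this point at once
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stands in for browser work
            active -= 1

    await asyncio.gather(*(fake_scrape() for _ in range(jobs)))
    return peak


peak_concurrency = asyncio.run(run_capped(10))
```

Requests beyond the cap wait for a slot instead of launching another browser instance, which is what keeps memory bounded.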
Bug Fixes
  • Fixed css/ and js/ directories with 700 permissions — Nginx could not read static assets
  • Cache now never stores failed scrapes — subsequent requests always retry fresh
  • Batch: each URL in batch counts as 1 quota request; cache hits are free
  • RapidAPI plan limits correctly read from x-rapidapi-subscription header on every request
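Reading the plan from the x-rapidapi-subscription header on every request might look like the sketch below. The tier-to-quota numbers are the ones listed in the v1.0.1 plan auto-sync entry of this changelog. The function name and the fallback to BASIC for missing or unknown plans are assumptions.

```python
from typing import Mapping

# Request quotas per RapidAPI tier, as listed in the v1.0.1 notes.
PLAN_LIMITS = {"BASIC": 50, "PRO": 1000, "ULTRA": 20000, "MEGA": 100000}


def plan_limit(headers: Mapping[str, str], default_plan: str = "BASIC") -> int:
    """Look up the request quota for the plan named in x-rapidapi-subscription."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    plan = lowered.get("x-rapidapi-subscription", default_plan).strip().upper()
    # Assumption: unknown plan names fall back to the default tier.
    return PLAN_LIMITS.get(plan, PLAN_LIMITS[default_plan])
```

Resolving the limit per request, instead of storing it once per user, means an upgrade or downgrade takes effect on the next call.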
v1.0.1 Update March 7, 2026

RapidAPI integration, CORS, improved cache logic, and batch endpoint hardening.

New Features
  • RapidAPI proxy secret validation (x-rapidapi-proxy-secret header)
  • CORS headers added for browser-side API access
  • SEO/GEO meta — robots.txt, sitemap.xml, llms.txt, IndexNow key, JSON-LD schemas
  • Blog launched — first post on scraping React sites with AI
  • GitHub profile README as DA96 backlink
Improvements
  • Cache improved: max 500 entries, LRU eviction, never caches failure responses
  • Batch: pre-flight quota check before scraping; per-item quota counting
  • RapidAPI plan auto-sync: BASIC→50, PRO→1000, ULTRA→20000, MEGA→100000 requests
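The cache rules above (max 500 entries, LRU eviction, never caching failures) can be sketched with an OrderedDict. This is an illustration, not the production cache. The class name and the boolean "success" field used to detect failures are assumptions.

```python
from collections import OrderedDict
from typing import Optional

MAX_ENTRIES = 500  # "max 500 entries"


class ResultCache:
    """Small LRU cache that refuses to store failed results (illustrative only)."""

    def __init__(self, max_entries: int = MAX_ENTRIES) -> None:
        self._max = max_entries
        self._items: "OrderedDict[str, dict]" = OrderedDict()

    def get(self, key: str) -> Optional[dict]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, result: dict) -> None:
        # Assumption: a result dict carries a boolean "success" field.
        if not result.get("success"):
            return  # failures are never cached, so the next request retries
        self._items[key] = result
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict least recently used

    def __len__(self) -> int:
        return len(self._items)
```

Skipping failed results at write time is the simplest way to guarantee that a transient error is not served back to later callers.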
v1.0.0 Initial Release March 6, 2026

Papalily API launches publicly on RapidAPI. Chromium-based scraping with Gemini AI extraction.

Initial Features
  • POST /scrape — render any URL in Chromium + extract structured JSON via Gemini AI
  • POST /batch — scrape up to 5 URLs in parallel
  • GET /usage — check quota and plan limits
  • GET /health — API status and cache stats
  • 10-minute LRU result cache — repeated requests instant and quota-free
  • Playwright Chromium headless rendering — handles React, Vue, Next.js, Angular
  • Gemini 2.0 Flash AI extraction engine — screenshot + text for maximum accuracy
  • Let's Encrypt SSL on all three domains (www, bare, api)
  • PM2 process manager with systemd auto-restart

🚀 On the Roadmap

Coming Soon
  • Webhooks: fire a webhook when a scrape completes, for async workflows.
  • Scheduled Scraping: set a cron schedule to monitor prices, jobs, or any data automatically.
Planned
  • Geo Tunnel Locations: select a country for your scrape to get localised prices and content.
  • PDF & File Extraction: extract structured data from PDFs and downloadable files.
  • Schema Templates: pre-built extraction templates for e-commerce, jobs, news, and real estate.
  • Dashboard & API Keys: native dashboard for usage stats, key management, and history.