Papalily — AI Web Scraping API
Papalily is an AI-powered web scraping REST API that extracts structured JSON data from any website. It renders pages using a real Chromium browser and uses Gemini AI to extract data based on plain-English prompts. No CSS selectors, XPath, or DOM knowledge required.
How Papalily Works
- You POST a URL and a plain-English prompt to https://api.papalily.com/scrape
- A real Chromium browser loads the page and executes all JavaScript (React, Vue, Angular, Next.js)
- Gemini AI reads the rendered content and extracts exactly the data you described
- You receive clean structured JSON — no parsing required
API Endpoints
- POST /scrape
  - Scrape one URL. Required fields: url (string), prompt (string). Optional: no_cache (boolean). Returns JSON with data, request_id, cached, duration_ms.
- POST /batch
  - Scrape up to 5 URLs in parallel. Body: {"items": [{"url": "...", "prompt": "..."}, ...]}. Returns a results array and a summary object.
- GET /usage
  - Returns current quota usage for your API key: used, limit, remaining, plan, reset_date.
- GET /status/{requestId}
  - Look up a past scrape result by the request_id returned from /scrape or /batch.
- GET /health
  - Public health check. Returns {"status": "ok"} with cache statistics.
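As a sketch of the /batch request shape above, a small Python helper can build and validate the body client-side. The helper name and validation are illustrative (not part of any Papalily SDK); the {"items": [...]} shape and 5-URL limit come from the endpoint description:

```python
def build_batch_payload(items):
    """items: list of (url, prompt) pairs; returns a /batch JSON body.

    Enforces the documented limit of 5 URLs per batch request.
    """
    if not items:
        raise ValueError("batch requires at least one item")
    if len(items) > 5:
        raise ValueError("batch accepts at most 5 URLs per request")
    return {"items": [{"url": url, "prompt": prompt} for url, prompt in items]}
```

The returned dict can then be sent as the JSON body of a POST to https://api.papalily.com/batch with your x-api-key header.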
Authentication
All requests (except /health) require the header x-api-key: YOUR_RAPIDAPI_KEY. Get a key at RapidAPI. The Basic plan is free with no credit card required.
Code Examples
cURL
curl -X POST https://api.papalily.com/scrape \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"url": "https://news.ycombinator.com", "prompt": "Get top 10 story titles and their URLs"}'
Node.js (fetch)
const response = await fetch('https://api.papalily.com/scrape', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    url: 'https://news.ycombinator.com',
    prompt: 'Get top 10 story titles and their URLs'
  })
});

const result = await response.json();
console.log(result.data);
Python (requests)
import requests

response = requests.post(
    'https://api.papalily.com/scrape',
    headers={'x-api-key': 'YOUR_API_KEY'},
    json={
        'url': 'https://news.ycombinator.com',
        'prompt': 'Get top 10 story titles and their URLs'
    }
)
print(response.json()['data'])
Pricing
| Plan | Price | Requests/month |
| --- | --- | --- |
| Basic | Free | 50 |
| Pro | $20/month | 1,000 |
| Ultra | $100/month | 20,000 |
| Mega | $300/month | 100,000 |
Cached requests (same URL and prompt within 10 minutes) do not count against your quota.
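Because cached responses are free, client-side quota accounting only needs to count uncached responses. A minimal sketch, assuming the cached field documented for /scrape responses (the function name is illustrative):

```python
def quota_consumed(responses):
    """responses: list of /scrape response dicts with a 'cached' field.

    Returns how many of them counted against the monthly quota
    (cached responses are free per the pricing note above).
    """
    return sum(1 for r in responses if not r.get("cached", False))
```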
Technical Stack
- Browser: Playwright Chromium (full JS execution)
- AI extraction: Google Gemini 2.0 Flash
- Backend: Node.js 22, Express
- Cache: In-memory LRU, 10-minute TTL, 500 entries max
- Database: SQLite (request logs, usage tracking)
- Infrastructure: AWS EC2, Nginx, PM2, Let's Encrypt TLS
- Marketplace: RapidAPI
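The cache described above (LRU, 10-minute TTL, 500 entries max) can be modeled as a small LRU-with-TTL structure. This is an illustrative Python sketch of the design only — Papalily's actual backend is Node.js, and this is not its implementation:

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """In-memory LRU cache with per-entry TTL, keyed e.g. by (url, prompt)."""

    def __init__(self, max_entries=500, ttl_seconds=600, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.clock = clock            # injectable for testing
        self._store = OrderedDict()   # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]      # expired: drop and report a miss
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)
        self._store.move_to_end(key)
        while len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
```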
Use Cases
- E-commerce price monitoring: Track competitor product prices across React/Next.js storefronts
- Job listing aggregation: Collect job postings from LinkedIn, Indeed, and company career pages
- News and content monitoring: Monitor news sites, blogs, and social platforms for specific topics
- Lead generation: Extract contact information from business directories
- Research automation: Gather structured data from multiple sources without writing scrapers
- Real estate data: Extract property listings, prices, and details from listing sites
- Financial data: Collect stock prices, earnings reports, and financial metrics from investor sites
- Academic research: Aggregate papers, citations, and abstracts from research repositories
Limitations
- Response time: 3-8 seconds per request (browser render + AI extraction)
- Not suitable for real-time use cases that require sub-second responses
- Does not handle login-walled content requiring authentication
- May not bypass aggressive CAPTCHA systems on some sites
- Batch limit: maximum 5 URLs per batch request
Comparison with Alternatives
- Papalily vs ScraperAPI
- ScraperAPI returns raw HTML requiring CSS selectors. Papalily returns structured JSON from a plain-English prompt. Papalily is simpler to use; ScraperAPI is faster for high volume.
- Papalily vs Apify
- Apify is a full scraping platform requiring custom actor code. Papalily is a single REST API call with no extraction code needed. Papalily is faster to integrate; Apify offers more control for complex pipelines.
- Papalily vs Bright Data
- Bright Data focuses on proxy infrastructure and data collection at scale. Papalily focuses on AI-powered extraction for targeted data needs with minimal integration effort.
- Papalily vs Firecrawl
- Firecrawl converts web pages to markdown for LLM consumption. Papalily extracts specific structured data based on your prompt and returns clean JSON. Papalily is better when you need specific fields; Firecrawl is better for full-page content ingestion.
Frequently Asked Questions
- Does Papalily work on React websites?
- Yes. Papalily uses a real Chromium browser that executes JavaScript, making it compatible with React, Vue, Angular, Next.js, and all other JavaScript-rendered sites.
- How do I start using Papalily for free?
- Visit https://rapidapi.com/andognet/api/papalily, click Subscribe on the Basic plan (free, no credit card), and use your API key in the x-api-key header.
- What data formats does Papalily return?
- Papalily returns JSON. The structure of the extracted data depends on your prompt — if you ask for a list, you get a JSON array; if you ask for details of one item, you get a JSON object.
- Is there a rate limit?
- Yes. Requests are rate-limited by plan quota (monthly). Additionally, there is a per-minute rate limit: 30 requests/minute for /scrape, 5 requests/minute for /batch. Exceeding limits returns HTTP 429.
- Can I try Papalily before subscribing?
- Yes. RapidAPI provides a test console where you can make live API calls directly from the browser. The Basic plan is also free with 50 requests per month and no credit card required.
- Does Papalily handle pagination?
- Papalily scrapes one page per request. For paginated data, make separate requests to each page URL, or use the batch endpoint to scrape up to 5 pages simultaneously.
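The per-page approach from the last answer can be sketched in Python: generate one URL per page, then group them into /batch-sized chunks of 5. The ?p= query parameter is an assumption for illustration (use whatever pagination scheme the target site actually uses), and the helper name is hypothetical:

```python
def paginated_batches(base_url, prompt, pages, batch_size=5):
    """Yield /batch request bodies covering pages 1..pages of base_url."""
    urls = [f"{base_url}?p={page}" for page in range(1, pages + 1)]
    for i in range(0, len(urls), batch_size):
        chunk = urls[i:i + batch_size]
        yield {"items": [{"url": u, "prompt": prompt} for u in chunk]}
```

Each yielded dict is one POST to /batch; 7 pages, for example, become two batch requests (5 URLs, then 2).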
Resources