
Is Web Scraping Legal in 2026?
What Every Developer Should Know

📅 March 21, 2026 ⏱ 10 min read By Papalily Team
⚠ Disclaimer: This article is for informational purposes only. It does not constitute legal advice and should not be relied upon as such. Laws vary by jurisdiction. Consult a qualified attorney for advice specific to your situation.

Is web scraping legal? It's one of the most searched questions in developer circles, and the honest answer is: it depends. It depends on what you're scraping, why, how, and where you're located. In 2026, the legal landscape around web scraping has become somewhat clearer thanks to landmark court decisions — but significant gray areas remain. Here's what you need to know.

The Short Answer

Scraping publicly available data — information visible to any unauthenticated visitor — is generally not illegal under computer crime laws in the United States, based on current case law. However, it may still violate a website's Terms of Service, raise copyright concerns, or conflict with privacy regulations like GDPR depending on what data you collect.

"Legal" and "permitted" are different things. Something can be legal (not criminal) while still being prohibited by contract (a website's ToS) or subject to civil liability.

Key Legal Cases

hiQ Labs vs. LinkedIn (US, 2022)

hiQ Labs vs. LinkedIn Corp., 9th Circuit, 2022

hiQ scraped public LinkedIn profiles to build an HR analytics product. LinkedIn sent a cease-and-desist letter and blocked hiQ's access. hiQ sued. In its 2022 decision, the Ninth Circuit Court of Appeals upheld a preliminary injunction in hiQ's favor, reasoning that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act (CFAA), the key US federal computer crime statute, because the statute's "without authorization" concept does not fit pages that anyone can view without logging in. That was not the end of the case: the district court later found that hiQ had breached LinkedIn's User Agreement, and in late 2022 the parties settled with a consent judgment against hiQ.

This ruling is significant because it limits how aggressively platforms can invoke computer crime law against scrapers of public data. It does not mean scraping is universally legal — only that the CFAA may not apply to publicly accessible pages.

Ryanair vs. PR Aviation (EU, 2015)

Ryanair DAC vs. PR Aviation BV, European Court of Justice, 2015

Ryanair's Terms of Service prohibited scraping. PR Aviation scraped Ryanair's public flight prices for a price comparison site. The ECJ ruled that where a database is not protected by copyright or the EU database right, the Database Directive does not stop its owner from restricting use by contract. Terms of Service restrictions on scraping can therefore be enforceable under national contract law, if users (or bots acting on their behalf) are deemed to have accepted the terms.

This case illustrates that ToS violations can have legal consequences in Europe, even when the underlying data is technically "public."

Van Buren vs. United States (US, 2021)

Van Buren vs. United States, US Supreme Court, 2021

This case narrowed the interpretation of the CFAA's "exceeds authorized access" provision. The Supreme Court held that the provision covers someone who enters parts of a computer system, such as files, folders, or databases, that are off limits to them. It does not cover someone who merely violates terms of use while accessing something they are otherwise authorized to view. This weakened the CFAA's applicability to public web scraping, and the Ninth Circuit reconsidered hiQ in light of it before issuing its 2022 decision.

Terms of Service vs. Law

This distinction matters enormously:

What's Generally Considered Low-Risk

What Carries Higher Risk

GDPR and Privacy Considerations

If you're established in the EU, or you scrape personal data about people in the EU, GDPR is likely to apply. Publicly available data isn't automatically GDPR-exempt. Under GDPR, processing personal data requires a valid legal basis, such as consent, legitimate interests, or contractual necessity.

For developers: if you're scraping product prices, stock levels, or business information, you're typically in a safer position. If you're scraping names, emails, phone numbers, or social profiles of individuals, get legal advice before proceeding.
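One way to act on that distinction is to screen scraped records before storing them. The sketch below is a rough heuristic, not a legal test: the function name flag_personal_fields, the field-name list, and the regular expressions are all assumptions made for illustration, and they will miss some personal data and flag some harmless values (a long numeric ID can look like a phone number).

```python
import re

# Rough patterns for illustration only. Matching (or not matching) them
# does not decide whether a value is "personal data" in the legal sense.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Hypothetical field names that usually describe an individual.
PERSONAL_FIELD_NAMES = {"name", "full_name", "email", "phone", "profile_url"}


def flag_personal_fields(record: dict) -> list:
    """Return the keys of a scraped record that look like personal data."""
    flagged = []
    for key, value in record.items():
        text = str(value)
        if (
            key.lower() in PERSONAL_FIELD_NAMES
            or EMAIL_RE.search(text)
            or PHONE_RE.search(text)
        ):
            flagged.append(key)
    return flagged


if __name__ == "__main__":
    # Made-up record: the contact field should be held back for review.
    record = {"product": "Widget", "price": "19.99", "contact": "sales@example.com"}
    print(flag_personal_fields(record))
```

A record that comes back with flagged keys is a prompt to stop and review, or to drop those fields, rather than proof that the rest of the record is safe to keep.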

Practical Best Practices

  1. Check robots.txt first. Respecting it won't give you legal immunity, but disregarding it (especially when you know about it) can be used against you.
  2. Read the Terms of Service. Know what you're agreeing to or potentially violating. Make a conscious, informed decision.
  3. Don't circumvent technical measures. Rotating IPs and using real browsers are common practice. Defeating CAPTCHAs and bypassing explicit IP bans is riskier.
  4. Rate limit yourself. Don't hammer servers. Slow, respectful scraping is much harder to argue as malicious.
  5. Stick to public data. No login walls, no personal data of private individuals.
  6. Don't republish copyrighted text. Extracting facts is different from copying articles.
  7. Get legal advice for commercial use at scale. The stakes are higher when you're building a product on scraped data.
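
Practices 1 and 4 are the easiest to automate. The sketch below uses Python's standard urllib.robotparser module to check which URLs a robots.txt file allows, and a small delay helper to space out requests. The user agent string, the sample robots.txt text, the example.com URLs, and the names build_robots_parser, plan_crawl, and RateLimiter are all assumptions made for illustration. The sketch parses robots.txt text you already have rather than downloading it, and it does not send any HTTP requests.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical user agent; use a string that identifies your own bot.
USER_AGENT = "example-research-bot"

# Made-up robots.txt content for the demo.
SAMPLE_ROBOTS_TXT = """User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_robots_parser(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_crawl(parser, urls, user_agent=USER_AGENT, default_delay=1.0):
    """Return (allowed_urls, delay_seconds) for a list of candidate URLs."""
    allowed = [u for u in urls if parser.can_fetch(user_agent, u)]
    delay = parser.crawl_delay(user_agent)  # None if no Crawl-delay is set
    if delay is None:
        delay = default_delay
    return allowed, float(delay)


class RateLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval_seconds: float):
        self.min_interval = min_interval_seconds
        self._last_request = None  # monotonic time of the previous request

    def wait(self) -> None:
        """Sleep just long enough to keep min_interval between calls."""
        if self._last_request is not None:
            elapsed = time.monotonic() - self._last_request
            remaining = self.min_interval - elapsed
            if remaining > 0:
                time.sleep(remaining)
        self._last_request = time.monotonic()


if __name__ == "__main__":
    parser = build_robots_parser(SAMPLE_ROBOTS_TXT)
    candidates = [
        "https://example.com/products",
        "https://example.com/private/data",
    ]
    allowed, delay = plan_crawl(parser, candidates)
    limiter = RateLimiter(delay)
    for url in allowed:
        limiter.wait()
        print("would fetch", url)  # replace with a real HTTP request
```

In a real crawler you would normally call the parser's set_url() and read() methods to download the live robots.txt file instead of parsing a string. A fixed delay is the simplest form of rate limiting; backing off when the server returns errors is a reasonable next step.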

The Bottom Line

In 2026, scraping publicly accessible websites for non-personal data, at reasonable rates, without circumventing authentication, is generally not a computer crime under current US case law; other jurisdictions apply their own rules. Whether it's contractually permissible depends on the specific site's ToS and how a court would assess enforceability against automated access.

The most responsible approach: know your target, know the risks, respect rate limits, avoid personal data, and consult a lawyer if you're building something significant.

Whatever you decide to scrape — Papalily makes it easy.

Scrape Responsibly with Papalily

AI-powered extraction that respects rate limits, renders real browsers, and returns clean JSON. Built for developers who care about doing things right. Free tier — 100 requests/month.

