FindArticles

Reddit Sues Perplexity Over Alleged Data Theft

Last updated: October 25, 2025 1:28 pm
By Gregory Zuckerman

Reddit has filed a lawsuit accusing AI startup Perplexity of covertly harvesting Reddit content to train and power its “answer engine,” escalating a high-stakes fight over who controls the web’s most valuable conversations. The complaint, reviewed by multiple outlets, alleges Perplexity and several scraping firms siphoned Reddit posts without permission, even after warnings, to fuel a commercial AI product.

Perplexity disputes the claims, saying it supports open access to public knowledge and provides factual results responsibly. The case lands as platforms, publishers, and AI companies collide over data ownership, licensing, and the boundaries of “public” content in the age of generative AI.

Table of Contents
  • How Reddit Says It Caught Perplexity Scraping Content
  • The Legal Stakes Around Public Web Data Use
  • Why Reddit’s Corpus Is a Bull’s-Eye for AI Training
  • Perplexity’s Position And Industry Backdrop
  • What to Watch Next as the Lawsuit Moves Forward

How Reddit Says It Caught Perplexity Scraping Content

Central to Reddit’s case is a sting operation. The company says it planted a specially crafted Reddit post that could be discovered only via Google’s index and was otherwise unreachable on the open web. According to the complaint, Perplexity’s system surfaced the text from that hidden post within hours, suggesting the AI firm or its partners scraped Google’s results pages rather than Reddit directly.

The lawsuit names three scraping-related co-defendants—AWMProxy, Oxylabs, and SerpApi—alleging Perplexity relied on at least one of them to gather Reddit data at scale. Reddit also claims it sent a cease-and-desist notice, after which citations to Reddit content in Perplexity’s answers allegedly increased, not decreased.

The Legal Stakes Around Public Web Data Use

At the heart of the dispute is a thorny question: when content is publicly viewable, who can reuse it, and under what terms? U.S. courts have said publicly accessible data may be scraped in some contexts, as seen in the hiQ Labs v. LinkedIn saga. But platforms increasingly rely on terms of service, IP protections, and anti-bot measures to limit mass harvesting, especially for commercial AI training.

If Reddit can show that Perplexity bypassed technical restrictions, violated contractual terms, or misused intermediary services to evade blocks, claims could extend beyond simple contract breach to theories like trespass to chattels or violations of computer abuse statutes. Conversely, if Perplexity demonstrates it used only lawfully accessible sources and fair methods, the boundaries of acceptable AI data collection could expand.

Why Reddit’s Corpus Is a Bull’s-Eye for AI Training

Reddit’s data is unusually rich for training and benchmarking AI systems: sprawling topic coverage, human-to-human dialogue, and upvote signals that help approximate quality. In its public filings, Reddit has highlighted tens of millions of daily active users, more than 100,000 active communities, and a vast archive of posts and comments that map to real-world tasks, from coding help to consumer advice.


That value has translated into deals. Reddit struck a data licensing agreement with Google to enhance search and AI research and later announced a partnership with OpenAI to bring Reddit content into AI products while offering new features to moderators and users. The Perplexity suit underscores Reddit’s strategy: monetize access through licenses and push back on unlicensed extraction.

Perplexity’s Position And Industry Backdrop

Perplexity, which markets a conversational search experience, argues it is championing fair access to public information while delivering accurate AI-generated answers. Its stance echoes broader industry arguments that the open web underpins AI progress and that over-restriction could hinder innovation.

But the legal climate is shifting. Major media organizations have sued AI developers over training on news archives. Rights holders are testing theories around copyright, contracts, and database protections. Regulators in the U.S. and Europe are scrutinizing how data is sourced, labeled, and reused in commercial AI. Against that backdrop, Reddit’s targeted “trap” example could carry weight in discovery, offering a tangible narrative about collection methods.

What to Watch Next as the Lawsuit Moves Forward

Three questions loom:

  • How exactly did Perplexity obtain the contested Reddit content, and did those methods violate any terms or laws?
  • Will the court accept Reddit’s evidence as proof of systematic scraping—and if so, what remedies might follow, from injunctions to damages to deletion of data?
  • Does this case accelerate a broader shift toward paid data partnerships for AI companies and stricter technical protections by platforms and search engines?

No matter the outcome, the lawsuit is a bellwether. If Reddit prevails, expect more publishers to harden defenses and pursue licenses. If Perplexity succeeds, AI firms may feel emboldened to lean on public indexing and aggregator services to build products. Either way, the business of training data is moving from backroom deals to the courtroom—and the rules of engagement for AI are being written in real time.

Gregory Zuckerman is a veteran investigative journalist and financial writer with decades of experience covering global markets, investment strategies, and the business personalities shaping them. His writing blends deep reporting with narrative storytelling to uncover the hidden forces behind financial trends and innovations. Over the years, Gregory’s work has earned industry recognition for bringing clarity to complex financial topics, and he continues to focus on long-form journalism that explores hedge funds, private equity, and high-stakes investing.