FindArticles

Spotify confirms investigation into library scrape said to reach 300TB

Last updated: December 22, 2025 2:02 pm
By Gregory Zuckerman

Spotify has confirmed it is investigating a large-scale scrape of its catalog after the pirate preservation group known as “Anna’s Archive” claimed to have copied audio for tens of millions of songs, up to 300TB of music, from the streaming service.

The group said it scraped metadata for 256 million tracks and claimed to hold audio files for 86 million of them, a haul it says accounts for “99.6%” of listening activity. According to a statement emailed by an outside PR firm representing the company, a third party “illegally accessed” Spotify systems and used stolen or leaked login credentials to gain unauthorized access to some of the service’s songs and recordings. A similar message emailed by Spotify to Billboard also described “fraudulent” methods, including scraping publicly available data and bypassing digital rights management (DRM) protections to reach some music files.

Table of Contents
  • What data and audio files were taken in the scrape
  • How the scrape likely worked to bypass protections
  • Why a 300TB cache of popular music matters now
  • How artists and everyday listeners could be affected
  • What Spotify and rightsholders could do next

What data and audio files were taken in the scrape

Anna’s Archive says the cache includes track names, artist and album information, and popularity details for 256 million items, as well as audio for 86 million tracks. According to the group, it will release torrents in stages, ordered by popularity (as measured by Spotify’s publicly available popularity scores), and including album art but no additional metadata.

Bitrate is inconsistent throughout the collection: The group contends top songs are encoded at 160 kbps, while less popular tracks can drop as low as 75 kbps to reduce file sizes. We’ve done some quick back-of-the-envelope math for 300 TB: the average three-minute-long song at a bitrate of 160 kbps is around 3.6 MB, and when you’re talking tens of millions of files — many at lower bitrates — it’s easy to see the total move into several hundred terabytes.
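The estimate above can be reproduced in a few lines. This is a minimal sketch: the function names are illustrative, the three-minute average track length is the article's own simplifying assumption, and decimal units (1 MB = 10^6 bytes) are assumed throughout.

```python
# Back-of-the-envelope size estimate for lossy audio files.
# Assumptions: constant bitrate, decimal units, three-minute average track.

def track_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate file size in megabytes for a constant-bitrate track."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


def library_size_tb(num_tracks: int, bitrate_kbps: float, duration_s: float) -> float:
    """Approximate total size in terabytes if every track used the same bitrate."""
    return num_tracks * track_size_mb(bitrate_kbps, duration_s) / 1_000_000


THREE_MINUTES_S = 180          # assumed average track length
CLAIMED_TRACKS = 86_000_000    # audio file count claimed by the group

print(f"one track at 160 kbps: {track_size_mb(160, THREE_MINUTES_S):.1f} MB")
print(f"all tracks at 160 kbps: {library_size_tb(CLAIMED_TRACKS, 160, THREE_MINUTES_S):.1f} TB")
print(f"all tracks at 75 kbps: {library_size_tb(CLAIMED_TRACKS, 75, THREE_MINUTES_S):.1f} TB")
```

Under these assumptions, a uniform 160 kbps would put the 86 million claimed tracks at roughly 310 TB and a uniform 75 kbps at roughly 145 TB, so a mixed-bitrate collection landing at "up to 300TB" is plausible on size alone.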

How the scrape likely worked to bypass protections

Spotify says public metadata was scraped, a common occurrence on the open web, but the key claim is DRM circumvention to extract audio from protected streams. At scale, that usually involves automated requests, IP-hopping infrastructure, session theft, or abuse of gaps in access controls. Spotify doesn’t elaborate on the exact vector, but its description suggests the attackers went beyond ordinary scraping to bypass content protections, a line that would trigger anti-circumvention concerns under laws like DMCA Section 1201 in the US and analogous rules in the EU’s InfoSoc Directive (2001/29/EC).

That the scraped library seems to reflect listener popularity also implies that the operators optimized collection for cultural coverage rather than completeness: an artist missing from the data set is likely absent because of low popularity, not because of a failed or degraded download. Lowering the bitrate for less popular tracks fits the same trade-off, since smaller files let the same storage and transfer budget cover more of the catalog.

Why a 300TB cache of popular music matters now

Even at lossy bitrates, 300 TB of popular music is a significant act of both preservation and piracy. For archivists and scholars, it provides a snapshot of the modern streaming canon that can be tied to real-world consumption. From the rightsholders’ perspective, it is a huge unauthorized distribution pipeline that could undermine licensing economics if the torrents spread widely.

The scale also raises questions of competitive intelligence: popularity scores, release cadences, and catalog gaps are useful signals for labels, indie distributors, and recommendation researchers. Although Spotify’s own recommendation system was not exposed, bulk access to popularity scores and play proxies can be used to infer behavioral patterns. Historically, industry bodies such as the RIAA and IFPI have moved quickly to shut down torrent indexes offering pirate copies of major-label back catalogs, and similar action is likely here.


How artists and everyday listeners could be affected

We have no reason to believe that user accounts, personal data, or other sensitive information were breached; the incident concerns content files and publicly available metadata. For artists and labels, the near-term risk is substitution: some consumption will move to torrents, but the lower bitrates and lack of platform features (song lyrics, playlists, social sharing) blunt that impact for many listeners.

Another concern is uncontrolled reuse. For instance, large scraped audio sets can feed unlicensed machine-learning models used for tasks such as source separation, vocal cloning, and genre synthesis. That risk has been on the minds of labels this year, as a high-profile string of lawsuits against unlicensed training on music and lyrics gained attention. A widely copied torrent would be more difficult to police, and its provenance murkier.

What Spotify and rightsholders could do next

Expect a multi-faceted response: technical mitigations (closing the scraping vectors, tightening token lifetimes and key rotation, and ramping up bot detection), litigation aimed at distribution points, and coordinated takedowns by rightsholder groups.

If Anna’s Archive goes through with staggered releases, we can expect a longer game of cat and mouse over the months ahead on torrent sites and mirrors.

At its core, the problem for Spotify is how to reconcile an open discovery platform (public artist pages and charts, say) with protections strong enough to discourage industrial-scale extraction. That usually includes rate limiting, anomaly detection of popularity-weighted access patterns, and various forms of DRM hardening. For the music industry, however, the incident highlights a larger trend: streaming has funneled much of the world’s music onto a few platforms, which have become high-value targets not only for pirates but for anyone else trying to capture sweeping cultural datasets.
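To make the rate-limiting idea mentioned above concrete, here is a minimal token-bucket sketch. It is a generic illustration, not a description of how Spotify limits requests; the class name, parameters, and the explicit `now` timestamp (passed in so the behavior is deterministic) are all assumptions made for the example.

```python
# Token-bucket rate limiter: each client gets a bucket that refills at a
# steady rate and allows short bursts up to a fixed capacity.
# Hypothetical illustration only; names and parameters are assumptions.

class TokenBucket:
    def __init__(self, rate_per_s: float, capacity: float, now: float = 0.0):
        self.rate_per_s = rate_per_s   # tokens added per second
        self.capacity = capacity       # maximum burst size
        self.tokens = capacity         # start with a full bucket
        self.last = now                # timestamp of the last refill

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One request per second on average, with bursts of up to three.
bucket = TokenBucket(rate_per_s=1.0, capacity=3.0, now=0.0)
burst = [bucket.allow(now=0.0) for _ in range(5)]
print(burst)                    # the first three pass, the rest are rejected
print(bucket.allow(now=2.0))    # two seconds later, tokens have refilled
```

A limiter like this caps sustained throughput per client but does little against traffic spread across many IP addresses or accounts, which is why the paragraph above pairs it with anomaly detection.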

Spotify’s investigation is ongoing. The group responsible for the scrape, meanwhile, is framing the release as a preservation project. Whatever the framing, though, the disclosure exposes a fact of life in the era of streaming: when the catalog becomes the internet’s version of a commons, being able to protect — and profit from — those pipes is just as crucial as licensing those songs.
