
AWS outage disrupts parts of the internet for hours

By Gregory Zuckerman
Last updated: October 20, 2025 2:03 pm

An Amazon Web Services failure spread across the web, knocking popular sites, apps and smart home systems offline for several hours. The disruption centered on the US-East-1 region in Northern Virginia, a gigantic hub that many companies treat as their default. The incident was a stark reminder that a single fault inside one major cloud can ripple across the internet.

What failed inside AWS and how the issue cascaded

AWS reported increased error rates and latency for core services such as EC2, Lambda and DynamoDB after identifying the root cause as a DNS resolution issue with a DynamoDB API endpoint in US-East-1. When DNS breaks, clients can no longer translate hostnames into usable IP addresses, and every dependent system starts retrying or throttling.
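
As a rough illustration of what dependent services see, here is a minimal Python sketch of a hypothetical client that resolves a DynamoDB-style endpoint and backs off when DNS resolution fails. The retry values are illustrative and this is not AWS's actual SDK behavior:

import socket
import time

def resolve_with_backoff(hostname: str, max_attempts: int = 5) -> str:
    """Resolve a hostname, backing off on DNS failures instead of hammering the resolver."""
    delay = 0.5
    for attempt in range(1, max_attempts + 1):
        try:
            # getaddrinfo is the usual path from a hostname to a usable IP address.
            infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
            return infos[0][4][0]
        except socket.gaierror as exc:
            # This is roughly what "DNS breaks" looks like from the client side.
            print(f"attempt {attempt}: DNS resolution failed for {hostname}: {exc}")
            time.sleep(delay)
            delay *= 2  # exponential backoff keeps failed lookups from piling up
    raise RuntimeError(f"could not resolve {hostname} after {max_attempts} attempts")

print(resolve_with_backoff("dynamodb.us-east-1.amazonaws.com"))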


That interplay mattered. Many apps now keep session state, configuration and feature flags in DynamoDB. As SDKs reissued failed requests, the load mushroomed into a “retry storm,” overwhelming network gateways and control planes. The old saw proved true: it’s always DNS, until it’s not.
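
One way application teams limit their own contribution to a retry storm is to cap SDK retries and turn on client-side rate limiting. A minimal sketch with boto3 follows; the specific attempt count and timeouts are illustrative, not a recommendation:

import boto3
from botocore.config import Config

# Cap automatic retries and enable adaptive, client-side rate limiting so a
# struggling dependency is not hit by an ever-growing wave of reissued requests.
retry_config = Config(
    retries={"max_attempts": 3, "mode": "adaptive"},
    connect_timeout=2,
    read_timeout=2,
)

dynamodb = boto3.client("dynamodb", region_name="us-east-1", config=retry_config)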

Why a single region failure can disrupt many apps

US-East-1 is the busiest of AWS’s regions, and frequently where new services arrive first. It also serves as an important management hub, so incidents there have outsized impact. Even when reads are distributed worldwide, many applications keep critical write paths pinned to this region to meet latency and cost requirements.

Analysts at firms like Gartner have long cautioned about single-region and single-cloud concentration risk. If an app cannot fail over safely to another region because DNS, identity or the data layer is still tied to one location, a relatively small fault can cascade.
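
What failover can look like in practice: the sketch below tries a primary region and falls back to a replica, assuming the data is already replicated (for example via DynamoDB global tables). The table name, key and region list are placeholders:

import boto3
from botocore.exceptions import ClientError, EndpointConnectionError

REGIONS = ["us-east-1", "us-west-2"]  # primary first, replica second (illustrative)

def get_item_with_failover(table_name: str, key: dict) -> dict:
    """Try the primary region, then fall back to a replica region if the call fails."""
    last_error = None
    for region in REGIONS:
        client = boto3.client("dynamodb", region_name=region)
        try:
            return client.get_item(TableName=table_name, Key=key)
        except (ClientError, EndpointConnectionError) as exc:
            last_error = exc  # remember the failure and try the next region
    raise last_error

item = get_item_with_failover("sessions", {"session_id": {"S": "abc123"}})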

How the outage rippled across sites and services

The failures produced a visible impact for both consumers and businesses. Users reported difficulty logging into services such as Snapchat, Ring, Alexa, Roblox and Hulu, as well as financial and AI services like Coinbase, Robinhood and Perplexity. Even some of Amazon’s own retail and streaming properties saw an initial disruption.

There were also reports of degraded service at major institutions outside the United States, underscoring the global reach of US-East-1 dependencies. Smart home devices dropped offline, back-office systems ground to a halt, and customer support queues swelled as apps waited to time out instead of quickly admitting defeat.

What the data shows about the scope of the outage

The AWS Health Dashboard logged effects in 28 services during the event, an indication the blast radius was wider than a single product. Outage-tracking sites like Downdetector received more than 14,000 user reports for Amazon at the outage’s peak, and infrastructure monitoring companies observed spikes in DNS error rates and connection timeouts in North America and Europe.


AWS said it pursued several recovery paths in parallel. After fixing the DNS issue, the company recommended that some customers flush their DNS caches to clear stale records and allow fresh resolution. While most operations were restored promptly, a portion of services remained throttled while capacity stabilized.

How a DNS glitch can lead to a widespread outage

DNS issues are uniquely pernicious. Endpoints with small TTLs generate high lookup rates; if resolvers return errors, nodes back off and retry, producing a surge of traffic. In microservice architectures, a single failed dependency, such as a metadata or session store, can effectively lock up login flows, checkouts and content delivery even when compute and networking are otherwise healthy.
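
One defensive pattern here (a general technique, not something AWS prescribes) is to keep a small cache of last-known-good resolutions and serve them stale when the resolver starts erroring. A rough sketch, with an arbitrary staleness limit:

import socket
import time

# Last successful resolutions: hostname -> (ip, resolved_at)
_dns_cache: dict[str, tuple[str, float]] = {}
STALE_LIMIT = 3600  # seconds we are willing to serve a stale answer (illustrative)

def resolve_or_serve_stale(hostname: str) -> str:
    """Prefer a fresh lookup, but fall back to the last-known-good answer on resolver errors."""
    try:
        ip = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)[0][4][0]
        _dns_cache[hostname] = (ip, time.time())
        return ip
    except socket.gaierror:
        cached = _dns_cache.get(hostname)
        if cached and time.time() - cached[1] < STALE_LIMIT:
            return cached[0]  # serve stale rather than fail outright
        raise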

In cloud environments, private DNS resolvers, service discovery and control-plane APIs all sit on the same dependency tree. A bug in one layer can look like many different failures at the edge of an application, making diagnosis harder and recovery slower without protective patterns.

Lessons for engineering teams to improve resilience

Design for regional failure as a first-class concern. Active-active deployment across regions, DynamoDB global tables, or equivalent multi-region data replication can keep core user flows alive when a dependency stalls, even if noncritical features drop to read-only mode. Use circuit breakers, bulkheads and request budgets to avoid retry storms.
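
A bare-bones circuit breaker, as one illustration of the pattern (the threshold and reset window are arbitrary):

import time

class CircuitBreaker:
    """Open the circuit after repeated failures so callers fail fast instead of piling on."""

    def __init__(self, failure_threshold: int = 5, reset_after: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise
        self.failures = 0
        return result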

Harden DNS. Use multiple resolvers, verify health checks, and pre-provision backup endpoints to switch over. Prudently cache configuration and identity tokens to tolerate transient control-plane failures. Chaos engineering exercises and game days that mimic DNS or data-store failures are important tools for reducing blast radius.
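
As one way to spread lookups across independent resolvers, here is a sketch using the third-party dnspython package (version 2.x); the resolver IPs are well-known public resolvers listed purely as examples:

import dns.resolver  # third-party "dnspython" package

def resolve_with_fallback(hostname: str) -> str:
    """Ask independent resolvers in turn so one resolver outage does not stop lookups."""
    for nameserver in ("1.1.1.1", "8.8.8.8", "9.9.9.9"):
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [nameserver]
        resolver.lifetime = 2.0  # per-resolver timeout in seconds
        try:
            answer = resolver.resolve(hostname, "A")
            return answer[0].to_text()
        except Exception:
            continue  # try the next resolver in the list
    raise RuntimeError(f"all resolvers failed for {hostname}")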

The bigger picture and long-term resilience takeaways

Cloud providers have strong aggregate uptime, but dependency chain brittleness persists. Uptime Institute research has consistently indicated that software glitches and change management anomalies are the two biggest culprits in major incidents, many of which have enormous financial implications.

This outage didn’t mean that “the internet” failed; it revealed a concentration risk. A single DNS ripple connected to a key data service in a dominant region was sufficient to darken substantial swaths of the online economy. The takeaway is clear: resilience should not be a feature you bolt on after the fact; rather, it’s an architectural decision you make up front.

By Gregory Zuckerman
Gregory Zuckerman is a veteran investigative journalist and financial writer with decades of experience covering global markets, investment strategies, and the business personalities shaping them. His writing blends deep reporting with narrative storytelling to uncover the hidden forces behind financial trends and innovations. Over the years, Gregory’s work has earned industry recognition for bringing clarity to complex financial topics, and he continues to focus on long-form journalism that explores hedge funds, private equity, and high-stakes investing.