When the Cloud Fails: What the AWS Outage Teaches Us About Cyber Resilience

The AWS outage began with a small technical fault, but the impact was huge.
Apps like Snapchat, Venmo, and Reddit stopped working. Smart home devices went silent. Even Amazon’s own Alexa had issues. For several hours, millions of people and businesses around the world realised how much they depend on “the cloud” and how quickly things can fall apart when it fails.
This kind of incident might not involve a cyberattack, but it exposes the same weaknesses that hackers often exploit: a lack of preparedness, weak contingency plans, and too much reliance on one system or provider.
A Small Glitch, a Big Problem
AWS said the issue came down to a DNS resolution problem: the system that helps computers find where to send information stopped working properly.
Think of DNS like the internet’s phone book. It tells your app or device where to go when it needs to connect to another service. When that phone book goes missing, nothing can reach the right destination.
That small glitch caused a domino effect. Services that relied on AWS couldn’t connect to what they needed. Systems backed up. Websites crashed. Businesses lost access to customer data, orders, and communications, all because of a single failure in one region of the cloud.
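To make the phone book analogy concrete, here is a minimal Python sketch of what an application does when it looks up a service by name, and what it sees when that lookup fails. It is an illustration only, not a description of how AWS or any affected app is built. The `resolve` helper, the `phone_book_missing` stand-in, and the hostname `service.example.com` are hypothetical names introduced for this example; `socket.getaddrinfo` and `socket.gaierror` are part of Python's standard library.

```python
import socket
from typing import Callable, List, Optional


def resolve(hostname: str, lookup: Callable = socket.getaddrinfo) -> Optional[List[str]]:
    """Return the IP addresses for hostname, or None if the DNS lookup fails.

    `lookup` defaults to the real resolver; it is a parameter only so a
    failure can be simulated without touching the network.
    """
    try:
        results = lookup(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The "phone book" could not answer: the app has no address to call.
        return None
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({entry[4][0] for entry in results})


def phone_book_missing(*args, **kwargs):
    """Hypothetical stand-in that behaves like a resolver during an outage."""
    raise socket.gaierror("simulated DNS failure")


# Simulated outage: the service may be healthy, but nothing can find it.
print(resolve("service.example.com", lookup=phone_book_missing))  # prints None
```

The point of the sketch is that the service behind the name can be perfectly healthy and still be unreachable, which is why a single lookup failure can ripple outward so quickly.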
Not a Hack, But Still a Cyber Incident
It’s easy to shrug this off as “just an outage,” but the results looked almost identical to what would happen during a major cyberattack.
Systems were unavailable, operations stopped, and confidence took a hit. The difference was that this wasn’t caused by hackers or criminals – it was a fault in the technology itself.
That’s why this kind of event fits perfectly into conversations about cyber resilience.
Cyber resilience isn’t just about stopping attacks. It’s about keeping essential systems running, or bouncing back quickly, no matter what causes the disruption.
The Line Between Cybersecurity and Business Continuity
Traditionally, cybersecurity and business continuity have been treated as two separate things.
Cybersecurity focused on protecting systems from attack. Business continuity focused on getting back up and running after something goes wrong.
But in today’s cloud-driven world, those two areas are becoming one and the same. If your cloud provider goes down, your business goes down too.
That’s why resilience needs to be built into every layer of a company’s digital strategy, from how systems are designed to how staff respond when things break.
Some key takeaways from the AWS outage include:
- Don’t rely on a single provider or region. Spread systems across multiple regions or even different cloud companies where possible.
- Plan for failure. Build in backup systems and failover options so services can keep running at a reduced level.
- Test your response plans. Don’t just plan for cyberattacks; run drills for outages too.
- Review your contracts. Make sure you understand what your cloud provider covers and what falls on you.
- Communicate clearly. When things fail, quick and honest updates matter as much as technical fixes.
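The first two takeaways, spreading across regions and planning to run at a reduced level, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a production failover design: the function `fetch_with_failover`, the region names `primary-region` and `secondary-region`, and the simulated `flaky_fetch` call are all hypothetical and do not correspond to any real provider API.

```python
from typing import Callable, Optional, Sequence, Tuple


def fetch_with_failover(
    regions: Sequence[str],
    fetch: Callable[[str], str],
    cached: Optional[str] = None,
) -> Tuple[str, str]:
    """Try each region in order; fall back to a cached value if all fail.

    Returns (source, value), where source is the region that answered,
    or "cache" when the service is running in degraded mode.
    """
    errors = []
    for region in regions:
        try:
            return region, fetch(region)
        except ConnectionError as exc:
            # Record the failure and move on to the next region.
            errors.append(f"{region}: {exc}")
    if cached is not None:
        # Reduced service: stale data is better than no data.
        return "cache", cached
    raise RuntimeError("all regions failed: " + "; ".join(errors))


def flaky_fetch(region: str) -> str:
    """Hypothetical backend call that simulates one region being down."""
    if region == "primary-region":
        raise ConnectionError("region unavailable")
    return f"orders served from {region}"


source, value = fetch_with_failover(
    ["primary-region", "secondary-region"], flaky_fetch, cached="last known orders"
)
print(source, "->", value)
```

In a real system the order of regions, the age of cached data you are willing to serve, and how failures are detected are all design decisions to agree on and rehearse before an outage, not during one.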
A Shared Responsibility
Cloud computing runs on what’s called a shared responsibility model.
AWS, Microsoft, Google and others handle the security of the underlying infrastructure, such as the servers and networks that make everything run.
Customers, meanwhile, are responsible for their own configurations, data, and resilience planning.
The AWS outage showed that even when the cloud provider does everything right from a security standpoint, things can still go wrong. That means every organisation needs its own plan for what to do when a provider or a region goes offline.
This isn’t just a business issue; it’s becoming a national one.
Governments increasingly treat major cloud platforms as critical infrastructure. Regulators in the UK, EU, and elsewhere are starting to require operational resilience planning for exactly this reason: when the cloud stumbles, the economy feels it.
What Leaders Should Take Away
This incident isn’t just a tech story. It’s a leadership story. Here are a few lessons worth keeping in mind:
- Assume failure will happen. Build systems and teams that can recover quickly.
- Think beyond attacks. Outages, errors, and software bugs can cause just as much damage as cyber threats.
- Invest in resilience, not just defence. Firewalls and antivirus help stop attacks, but backups and redundancy keep you running when those defences fail or when the disruption isn’t an attack at all.
- Test, learn, and adapt. Every incident is a chance to strengthen your systems and improve your response.
- Make resilience everyone’s job. From IT to leadership, every team has a role to play when the cloud goes down.
From Outage to Opportunity
The AWS outage was a reminder that our digital world, for all its power, still has weak spots.
It also showed how closely cybersecurity, business continuity, and technology policy are now linked.
True cyber resilience isn’t only about preventing attacks. It’s about being ready for anything that can disrupt the systems we rely on every day.
At Solas, we help organisations prepare for exactly these kinds of challenges.
Our cyber resilience services bring together cybersecurity, business continuity planning, and crisis response, so when disruption happens – whether from a cyber threat or a cloud failure – you can recover quickly and keep your operations running.
Because in the end, resilience isn’t just about technology. It’s about confidence, continuity, and control – even when the cloud goes down.
Outages happen, but they don’t have to stop your business.
See how we help organisations build stronger, smarter resilience plans that keep you running when it matters most.
