Amazon investigating problem after S3 suffers 8-hour outage

By Tim Conneally | Published July 21, 2008, 5:40 PM

Amazon's Simple Storage Service (S3) was down for more than eight hours over the weekend, affecting many prominent sites, and the company is still investigating the cause of the problem.

Cloud-based services such as those offered by Amazon provide cost-effective computing and storage. However, the oft-cited drawback of relying on such offerings is that customers are left with little or no control if something goes wrong. The only option is to wait -- and in cases like this, to wait for more than eight hours.

Amazon's Simple Storage Service (S3), which was introduced in 2006, is part of the Amazon Web Services (AWS) suite, which also includes the Elastic Compute Cloud (EC2) and SimpleDB services.

On July 20, the S3 component of AWS was down for more than eight hours, affecting sites like SmugMug, Twitter, CenterNetworks, and many of Amazon's own sites. The AWS Service Health Dashboard shows that the Simple Storage Service and Simple Queue Service experienced a "service disruption."

In a communication with the company, GigaOM's Om Malik received a rather general explanation as to why the service was down: "As a distributed system, the different components of S3 need to be aware of the state of each other. For example, this awareness makes it possible for the system to decide which redundant physical storage server to route a request to."

"We experienced a problem with those internal system communications, leaving the components unable to interact properly, and customers unable to successfully process requests. After exploring several alternatives, the team determined it had to take the service offline to restore proper communication and then bring service online again."

"These are sophisticated systems and it generally takes a while to get to root cause in such a situation -- we will be providing our customers with more information when we've fully investigated the incident," the company added.
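Amazon has not described how S3's components track one another, so the following Python sketch is only an illustration of the general idea in the statement above: a node keeps a view of which redundant storage servers it has heard from recently and routes a request to one it believes is alive. The names `ClusterView`, `pick_replica`, `storage-a`, and `storage-b`, and the five-second timeout, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClusterView:
    """One node's (hypothetical) picture of which storage servers are alive."""
    timeout: float = 5.0  # seconds without a heartbeat before a server is suspect
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, server: str, now: float) -> None:
        # Remember when we last heard from this server.
        self.last_seen[server] = now

    def is_healthy(self, server: str, now: float) -> bool:
        seen = self.last_seen.get(server)
        return seen is not None and now - seen <= self.timeout


def pick_replica(view: ClusterView, replicas: List[str], now: float) -> Optional[str]:
    """Return the first replica the view believes is healthy, else None."""
    for server in replicas:
        if view.is_healthy(server, now):
            return server
    return None


view = ClusterView(timeout=5.0)
view.record_heartbeat("storage-a", now=100.0)
view.record_heartbeat("storage-b", now=103.0)
replicas = ["storage-a", "storage-b"]

choice_ok = pick_replica(view, replicas, now=104.0)        # both servers fresh
choice_failover = pick_replica(view, replicas, now=107.0)  # storage-a has gone stale
choice_none = pick_replica(view, replicas, now=120.0)      # no recent state at all
```

In this toy model, once state updates stop arriving, every server looks stale and no request can be routed, which is loosely analogous to the failure Amazon describes. It is not a reconstruction of what actually happened inside S3.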

Many companies use AWS, so a loss of functionality has the potential to affect a huge number of services. Both Red Hat and Sun use EC2, which has also experienced various outages. Consumer-aimed services like HP's Upline have faced numerous outages as well.

Comments


I noticed the outage as a JungleDisk user, but I've got to be honest: S3 is the fastest, cheapest, most awesome online backup solution out there right now. 8 hours out of the 2 months I've been using it wasn't a problem. For me it's a backup solution; I don't host data up there that I need to access immediately (unless of course my drives fail, and then I would :).
