Thursday, December 19, 2013

Sunny Skies Ahead? My Foray into Cloud Computing

I was recently asked to give a presentation on how we use Amazon Web Services at Church on the Move to manage our websites, and especially how we use it to handle our Christmas Train sales - demand for which has grown significantly each year. That got me thinking: I spent a lot of time going around the internet looking for how to put the various pieces together, but I don't recall a single location where the information was easily available. This blog is my attempt at providing information that will hopefully help others who are interested in AWS and would like to see how someone is actually using their services in a website.

I would like to point out that I do not consider myself to be an expert at AWS, but I have been using their services for a little over three years as of this writing, and have been around the block a few times. Hopefully my insights and experiences will help you make better decisions in regard to using AWS.

In this first entry, I am going to talk a little bit about cloud computing in general, my personal experience in getting involved in cloud computing, and an overview of the services we use at Amazon Web Services. I will go into more detail in future posts about each service and how it applies to our usage of Amazon Web Services.

As I mentioned above, I work at Church on the Move in Tulsa, OK as the Senior Web Applications Developer. That basically means that I am in charge of building systems to maintain our websites as well as building internal applications that are web-based. When I first started in this department, our website was hosted on a dedicated server in a data center somewhere - one of those black box data centers whose advertising shows beautiful rows of server racks, one of which is supposed to be yours, but which in all reality is probably rows of inexpensive desktop machines sitting on shelves. Either way, we finally decided we had outgrown that and wanted to have our own physical server that we built and could actually see and touch. We found a local data center, and built a monster server to put there. This server (and all the servers we had on site) had 1.5TB of hardware RAID-5 storage - drives were cheap and we wanted to make sure we would never run out of room.

Then one day our church produced an awesome Father's Day tribute video called Dad Life. This video went viral, which drove a ton of traffic to our website, and the monster server we built ... didn't seem so monstrous any more. I managed to limp through that season and keep our site alive, but just barely. That was the first time in my professional career that we had a server overloaded with traffic.

Fast forward about 6 months, and we are gearing up for online sales for The Christmas Train - an annual event that our church puts on at our kids' camp called Dry Gulch, U.S.A. It typically operates for 15 or so days between Thanksgiving and Christmas, and we welcome an average of 50,000 guests each season. The year prior was the first year that we offered online tickets, and it was also the first year that our numbers were very underwhelming. This year was going to be different - we were going to offer a 24-hour super sale with heavily discounted tickets. Our goal was to sell half of the tickets during this sale.

After the Dad Life issue, I knew there was a real chance that our current infrastructure would be unable to withstand the demand. I began looking at AWS, because I had read a little about how they have virtually unlimited compute capacity with auto scaling. The huge drawback was that it was fairly expensive to build a virtual server with 1.5TB of storage - at AWS or at any cloud computing service. Because this was how I was used to thinking, I had a difficult time deciding that AWS would be a long-term solution for our web hosting. For Christmas Train, however, it was perfect. We ended up selling out that first year in about 35 hours. The second year we sold out in 11 hours. This past Christmas Train we sold out in 75 minutes - and would have probably sold out in less than 10 minutes if our infrastructure could have handled it.

Before I go on, I want to make a quick comment about that last sentence. A lot of people think that AWS and auto scaling are a magic bullet - just throw your system on an auto-scaling server and you'll never run out of capacity. I even had someone who had difficulty buying tickets this year say (in a not very nice way) that we could use something like AWS and we would never have any problems. But auto scaling is not a magic potion you apply to your website to prevent you from ever having any problems. You still have to anticipate demand accurately to some degree. Last year we sold out in 11 hours. This year we expected to sell out in 5-6 hours. We prepared to sell out in 15 minutes. We would have sold out in less than 10 - probably less than 5 - if we could have. Doing that would have meant a fundamental change in our system design, because at that rate the issue was no longer server load; it was design decisions that were made with the expectation of a much slower sale rate. Accommodating the traffic we actually experienced would have meant completely changing the way the ticket sale process functioned - not just in the code design, but the whole end user experience as well.

We now have about a dozen websites served by an average of 15-20 EC2 instances at AWS. My fears about the limitations of cloud computing proved to be unfounded, but I had to have a paradigm shift. You can't treat cloud computing in the same way that you treat a dedicated server. Sure, most of the ideas are the same, but the differences are not always obvious.

