The birth, death, (and hopefully rebirth) of Ubersoft.net on Drupal 5.1 -- a site performance autopsy
When I unveiled the Drupal-powered version of my webcomic to the world, I was proud of what I'd done -- legitimately so, I think. A ten-year-old webcomic, with ten years' worth of comic archives, completely converted to a database-driven site, is nothing to sneeze at.
Unfortunately, this pride was short-lived: after three days up, my hosting provider suspended my account because I was getting too much traffic (thanks, in part, to a link from Drupal's front page) and it was killing the server it was on. They moved me to a test server for about a week, where they monitored how many resources my site was using... after that week they reported that ubersoft.net had been consuming a full 4% of the server's resources, and consequently I'd need to move up, at the very least, to a virtual host provider package.
I had neither the time nor the finances to do this, so I sadly moved back to my static pages. The Drupal version of the site is currently sitting in my test area, while I try to figure out how to redesign it so it can minimize the consumption of site resources.
Since reverting to the static pages I've been trying to figure out what, specifically, caused the site to go down. At this point, I think I can point to the following areas:
1. Inaccurate reporting of site hosting capabilities on the part of the provider, and a lack of tools that would allow customers to accurately measure site resource usage.
2. Database activity from Drupal on top of standard site resource consumption.
A more complete analysis follows.
Site Hosting: What is Advertised vs. What is Given
First, I don't want this to come off sounding like an attack on my hosting provider. A2hosting.com was very helpful during this time, their tech support was very responsive, and they allowed me to essentially sit on a dedicated server at shared hosting prices for an entire week while we monitored the site traffic. The problem, in my opinion, is that the industry-standard measures used for bandwidth metering predate the common adoption of database-driven sites, and they are no longer reliable benchmarks.
I purchased a shared hosting plan from them as a jumping-off point. It was their highest-level shared plan, with a bandwidth cap of 200gb/month. The bandwidth cap is, unfortunately, the only unit of measurement most people have when trying to determine what kind of account to lease from a hosting provider, and 200gb/month was well above my monthly site traffic. My static site ranges anywhere from 30gb to 55gb/month, so I figured an upper limit of 200gb would give me some room to grow before moving to a semi-dedicated or dedicated hosting plan.
Unfortunately, the bandwidth limit does not accurately measure every aspect of site resource consumption -- specifically, it doesn't measure how much database activity is going on behind the scenes, and that is far more important to a database-driven site.
I had access to my Webalizer stats from both my Drupal site and my older static site. I was able to find a period of spiked traffic on my static site (a few days when I had been linked by Reddit) and compare it with the period when my Drupal-powered site was linked from Drupal's front page. This comparison revealed the following information:
- while bandwidth consumption on the Drupal-powered site was higher on average (about 15-20K per page), it was never in danger of reaching the bandwidth cap -- not even close
- hits per page on the Drupal site were actually lower than hits per page on the static site
From this I determined that my site traffic and resource usage were well within the advertised limits of my account. However, the limits were based on a pre-database model for measuring site traffic and resource consumption, and a shared hosting server can have hundreds of separate accounts on it. There was another piece of the picture -- database activity -- that wasn't factored in at all.
Database Activity: the 40-ton feather that broke the camel's back
Most hosting providers give you "unlimited mysql accounts" but don't actually give you any tools you can use to measure database activity. My hosting provider was no different in this case, so I was very curious how a site that seemed well within the limits of all advertised resource caps -- disk and bandwidth usage -- could be using 4% of the system resources of a shared server. Obviously, if you're on a shared hosting plan with hundreds of other accounts on the same server, 4% is a ridiculously large slice of the server to be keeping all to yourself. But the only units of measurement I had on hand were Webalizer statistics and monthly bandwidth usage.
My webcomic averages about 8.5K unique visits on Monday through Friday (when I update) and roughly 20K-30K page views (for the readers who stroll through my comic archives). When I was linked by Reddit (on the static site), this spiked up to about 16K unique visits on a single day with 42K page views, and when I was linked by Drupal (on the drupal site), the unique visits and page views spiked in a similar manner.
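Those spike figures make it easy to sanity-check the bandwidth side with some quick arithmetic. Here's a rough sketch in Python; the 47,000 page views and the 20K-per-page weight (the upper end of the average quoted in this post) are estimates, a 30-day month is assumed, and this counts page views only -- images and other hits add more, which is why actual monthly totals ran higher:

```python
# Back-of-the-envelope bandwidth check for the spike day.
# Assumptions: ~47,000 page views, ~20 KB transferred per page view,
# and a 30-day month. Page views only; image hits are not counted.

page_views = 47_000
kb_per_view = 20

daily_gb = page_views * kb_per_view / 1_000_000   # KB -> GB (decimal)
monthly_gb = daily_gb * 30                        # if every day spiked

print(f"Spike-day transfer: ~{daily_gb:.2f} GB")
print(f"Sustained for a month: ~{monthly_gb:.1f} GB vs. a 200 GB cap")
```

Even if the spike were sustained every day of the month, page-view bandwidth alone lands at well under a fifth of the cap -- consistent with the conclusion that the cap was never the real constraint.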
The bandwidth consumption, as mentioned earlier, was a bit higher, but not high enough to exceed my 200gb/month cap, and the hits per page were actually lower, so in theory Apache was doing less work to serve each page...
... but there was a database to take into account, and this is what killed my site.
My host provider had no tools I could use to monitor my database activity, but some Drupal users suggested I install the Devel module, which allows admins to view statistics about what the database is doing and how long it takes for those things to be done. Specifically, Devel allows you to see how many database queries are performed on each page, what those queries are, and how long each query takes.
On my test bed (an exact replica of the live site, using the same versions of PHP, Apache, and MySQL) I installed the module and started browsing pages. The Devel statistics were interesting:
- the front page required anywhere from 90-150 database queries
- individual comic pages in my comic archives averaged 70-80 database queries
- my archive table of contents for each comic (lists of my comics that could be browsed by year) used anywhere from 100-300 queries, depending on the filters used
I have no idea what the average number of database queries is for a Drupal site. I do know that my site uses views quite heavily -- publishing a comic on Drupal is a bit more complicated than your standard blogroll, and publishing three comics plus general site news posts required a fair amount of tinkering on my part. I can surmise, however, that there is a significant difference in site resource consumption between a single person viewing a static page consisting of twelve hits in total, and a single person viewing a database-driven page consisting of five hits in total plus 70 to 300 individual database queries. Multiply that by 16,000 unique visitors (or, more accurately, 47,000 individual page views over the course of one day) and you get a LOT more activity going on in the background with a Drupal site than you do with a static site. And this was with both MySQL caching and Drupal site caching enabled (though it was not set to aggressive caching).
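The scale of that multiplication is easier to appreciate written out as arithmetic. A quick Python sketch, assuming 100 to 300 queries per page (a range spanning the per-page counts the Devel module reported across my different page types) and the 47,000 spike-day page views:

```python
# Daily database query load implied by the Devel per-page figures.
# Assumptions: 47,000 page views in a day, 100-300 queries per page
# (roughly the range Devel reported across different page types).

page_views = 47_000
queries_low, queries_high = 100, 300

total_low = page_views * queries_low
total_high = page_views * queries_high

print(f"Implied daily query load: {total_low:,} to {total_high:,}")
# A static page, by contrast, triggers zero database queries.
```

Every one of those millions of queries is work the database server does on top of what Apache does to serve the files -- and none of it shows up in bandwidth metering.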
That, I believe, is what killed my site.
Moving on: Making Drupal Work
So the question now becomes "how can I make Drupal work?" I was very pleased with the functionality I had put into the Drupal-powered website -- it had navigation and content searching features that static sites simply can't provide, and the taxonomy features of Drupal are head and shoulders above any other CMS I've played with... and perfectly suited for making images easier to search and navigate in a text medium. The problem, apparently, is that this functionality requires more power on the back end, and I need to figure out:
- whether I need to move to a more robust (i.e., more dedicated) hosting solution
- whether I need to optimize the Drupal site to minimize database usage
- whether I need to do some combination of the two above options.
I suspect the third option is correct, but at this point I'm not sure which is more important. My hosting provider said that based on the 4% resource consumption I'd probably do well on a semi-dedicated hosting plan... but I'm not sure if that 4% is a natural result of the site traffic or a result of bloat I introduced when I built all those views. The thought of 47,000 page views generating anywhere from 4,700,000 to 14,100,000 database queries in a single day is alarming, but is it alarming because it's ridiculously high, or is it alarming because I'm not familiar with database-driven sites? Is it normal for a single page to make 100 database queries? I don't think that's normal. On the other hand, I don't have enough experience to know if my instinct is on target.
Would I be well served to work on paring back all the features in my site in order to reduce the database load? Would doing that reduce the functionality of the site to the extent that it's just not particularly useful to my readers? The last question is particularly important to me -- I'm sure I could create an alternative Drupal design that reduced the number of database queries per page, but I'm not sure that the design would make my content more accessible and easier to navigate for my readers -- which was one of the main draws for moving over to a database-driven site.
The End
That's about all I have to say in terms of analysis. I hope it was useful to the rest of you... I spent a lot of time trying to figure out what to say. An earlier version of this post was considerably longer, a lot more confusing, and generally useless. I'm half-afraid that this version has gone in the other direction and doesn't give enough background information.
At any rate, I'm very interested in what those of you who run high-traffic sites think of this post and my conclusions in it. All comments welcome and encouraged...
Drupal version: Drupal 5.x