Case study: Crooksandliars.com converts from Wordpress to Drupal
Crooks and Liars is an American liberal blog founded by John Amato in August 2004, during the 2004 presidential election. It was the first video-style blog around, starting in the pre-YouTube era. Crooks and Liars has a team of about a dozen volunteers, including administrators, contributors and moderators. The site has grown immensely since its birth, now averaging over 230,000 unique visitors per day and over 330,000 page impressions.
Evolution
Crooks and Liars originally started out on Radio Userland, which served as its home for two years. After that we started exploring other blogging platforms. At that time we were averaging around 100,000 hits per day. We decided to move to Wordpress, which could handle our smaller team of only four at the time.
As the site continued to grow and we approached the 200,000 hits per day mark, we started experiencing a lot of downtime from server overloads. We were using the well-known wp-cache plugin for Wordpress and hosting the database on a single master and two slaves, with the HyperDB class for Wordpress handling the replication.
While investigating the problems, we realized that one cause was the high volume of comments the site receives, around 2,000 per day. Being a political blog, we also attract a lot of disruption in our comments, which meant a larger workload for our small moderation team of volunteers.
CMS Selection
With our increased downtime, we had to start looking for another CMS solution. We also had plans to expand the site beyond just being a political blog, so we needed something that could handle our future plans.
I had been running my personal blog on Drupal since the 4.7 days, having moved from Wordpress myself, so I was already aware of a noticeable performance increase plus a stronger framework on which to expand. This instantly put Drupal ahead of the rest. I then decided to confirm this through a series of tests. I set up default installations of Wordpress 2.3 and Drupal 5. I enabled only the core caching mechanisms in both setups and populated them with exactly the same data and display options. Both systems also used the default themes and features. After running a series of tests through JMeter, the results confirmed my beliefs and even exceeded them: Drupal handled about eight times as many requests per second as Wordpress, both on the front page and on the same single post view with 157 comments.
Hardware
We are running Crooks and Liars off of four servers. One server strictly handles our videos; the rest are divided up as follows:
- WEB – 2.66 GHz quad core Xeon with 8 GB of RAM, running NetBSD 4.
- MySQL – 2.4 GHz Core 2 Quad with 4 GB of RAM, running NetBSD 4.
We also have an additional server operating as a slave for MySQL right now, although we aren’t actually using replication through Drupal. It is acting more or less as a hot backup. That same server also runs our Memcached server. We also have an Apple XServe G5 which handles all our static content.
Converting Data
Converting data over for my tests was rather simple. I used the Wordpress to Drupal converter offered by Prime357, which made the work very easy. However, moving over our live data had some extra obstacles. We were using a few special plugins in Wordpress to store our video data in the posts and to handle embeds easily. Before doing our final conversion I had to write a couple of short scripts to convert our media data so that a new custom Drupal module I was working on could deal with it. I also decided to remove the embed plugin we had used in Wordpress and instead inject the embed code directly into the $node->content field for Drupal. Once that was handled, I ran the Prime357 converter and we had the data fully ready for Drupal. The conversion program took approximately an hour to run on our 600+ megabyte database containing over 30,000 posts and over 800,000 comments, which is impressive for that amount of data.
Design
Our Wordpress template was pretty much the same as the Radio Userland template we had used prior to switching. With our big move we also decided we wanted a sparkling new design. We enlisted the help of Carolyn King, a highly talented web designer I had worked with before. We wanted a template that would fit our genre as a video site and handle a variety of advertising formats. We also decided it was time to move to a fixed width format, sized for monitors at 1024x768. For the main site, we had come up with a couple of different designs for the blocks. I added a custom module so that we could designate available block styles in the template.info file and select our block style in the block configuration page.
All of our templates are sub-templates of a main skeleton. This makes adding in new sites extremely simple, mostly just changing around a few lines in the CSS file.
Contributed Modules
Most of the modules that make Crooks and Liars tick are custom, but we rely upon a few contributed modules.
CacheRouter – We are using this configured for our memcached server. Currently we are using only a single server, but plan to move it to a couple of servers based upon data type in the near future.
BUEditor – We decided to drop the WYSIWYG feature of Wordpress that was offered through TinyMCE and instead go to a code editor only. The posts on Crooks and Liars do not contain much formatting and we constantly suffered problems of people pasting in code from other webpages into TinyMCE and it breaking our front page. BUEditor’s API also offered the features we were looking for to expand it so we could incorporate our new media management system seamlessly.
We are running a few other contributed modules, such as persistent login, IMCE and IMCE Crop. Most of our expansions are actually done through custom modules.
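The plan mentioned under CacheRouter, splitting cached data across memcached servers by data type, can be sketched as a simple lookup table. This is only an illustration in Python, not CacheRouter's actual configuration format; the bin names and host names below are assumptions made up for the example.

```python
# Hypothetical cache bins and hosts, for illustration only.
DEFAULT_SERVERS = ["cache1.example.com:11211"]

BIN_SERVERS = {
    "cache_page": ["cache2.example.com:11211"],  # full rendered pages
    "cache_path": ["cache1.example.com:11211"],  # URL alias lookups
}


def servers_for_bin(bin_name, bin_servers=BIN_SERVERS, default=DEFAULT_SERVERS):
    """Return the memcached servers responsible for one cache bin.

    Bins without an explicit entry fall back to the default server list,
    so adding a new data type never requires a configuration change.
    """
    return bin_servers.get(bin_name, default)
```

With a table like this, moving one data type to its own server is a one-line change, and everything else keeps using the default server.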
Custom Modules
- CL Media – This is the workhorse of Crooks and Liars. The media system stores data for all our videos (currently over 6,000). When clicking the “Insert Media” button on BUEditor, a new window is opened, designed much the same as the IMCE browser window. From here contributors can add or edit videos and then choose to insert the media into the post with or without a screen cap. Media is inserted via a custom slug in the $node->content, which gets rendered out during the node alter phase.
- AJAXComments – Our readers were used to the luxury of Ajax comments that I had instituted on our Wordpress site, so I decided to come up with a system for Drupal. This is something I hope to turn into a contributed module in the future. Users can post comments, preview comments and load new comments without refreshing the screen. We also needed to use Drupal’s comment paging with our large number of comments, which added a small obstacle to the design. In the end I decided to reload the entire comment division, along with the pager output, whenever new comments are loaded. We also decided to use the threaded comments feature, but realized not all users like that. To overcome any complaints I added the option for users to change their comment display preferences right on the node page. There is a small button where users can select display order and type. The choice is saved in their user data so the preferences stick throughout the site.
- PostManager – We rely heavily upon the ability to schedule posts to publish at different times. I originally had us using the Scheduler module, but ended up writing my own custom solution. Users with the “publish {node_type}” capability get a box in which they can either enter the time/date or use a popup calendar to schedule the post’s publish time. The PostManager also adds new permissions for each content type. These include publish and promote. This gives us the ability to set different roles for all our subsites. We can have an admin handle scheduling and publishing of posts on the Video Café site without that person being able to change the schedule of our front page.
- MultiSite – All our subsites are their own content type. To add a new subsite, I simply have to create a small module defining it, and the data gets read through a custom hook from MultiSite. On Drupal_init, the request URL is parsed to determine if we are loading a multisite site or not. On single node views, the same check is done so that a multisite item isn’t displayed on a different site’s template. This module is very lightweight and has been working out great.
- Moderator – One of the problems we had with Wordpress was not having the ability to add a moderator-only role. The only way we could have actual comment moderators was to make them full site admins. I had done custom hacking to Wordpress to give us this permission, but it was turning into a nightmare to keep up with on every update. Drupal gives us the ability to have comment moderators out of the box. The only thing I added was a new view that shows all comments posted on the site in chronological order, including the content. I also added a simple banning system, so our moderators can ban a user, which prevents them from posting comments.
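The slug approach used by CL Media, where a short placeholder is stored in the post body and expanded into markup when the node is rendered, can be sketched as follows. This is a minimal illustration in Python rather than the module's actual code: the slug syntax `[media:ID]` and `[media:ID:nocap]`, the `MEDIA` table and the generated markup are all assumptions made up for the example.

```python
import re
from html import escape

# Hypothetical slug syntax: [media:42] (with screen cap) or [media:42:nocap].
SLUG_RE = re.compile(r"\[media:(\d+)(:nocap)?\]")

# Stand-in for the media table; a real module would query the database.
MEDIA = {
    42: {
        "title": "Sample clip",
        "url": "http://video.example.com/42.flv",
        "thumb": "http://static.example.com/42.jpg",
    },
}


def render_media(match):
    """Turn one slug match into markup, leaving unknown IDs untouched."""
    item = MEDIA.get(int(match.group(1)))
    if item is None:
        return match.group(0)
    link = '<a class="media" href="%s">%s</a>' % (
        escape(item["url"], quote=True),
        escape(item["title"]),
    )
    if match.group(2):  # ":nocap" was given, so skip the screen cap
        return link
    image = '<img src="%s" alt="%s" />' % (
        escape(item["thumb"], quote=True),
        escape(item["title"], quote=True),
    )
    return image + link


def render_slugs(content):
    """Expand every media slug in a post body at render time."""
    return SLUG_RE.sub(render_media, content)
```

Because only the slug is stored in the post, changing the player markup later means changing the render function once rather than rewriting thousands of stored posts.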
Core Modifications
We made a few minor core modifications to help handle our high traffic spikes. The first thing we did was add a mechanism for our static content. All Javascript, CSS and image files are automatically sent to our static server, which we keep in sync via rsync. We also have a mod_rewrite rule running on the static server that redirects traffic back to the main server if a file doesn’t exist. This eliminates race conditions where a file is requested before rsync has copied it.
Since we are using custom paths on all nodes, we also decided to cache drupal_lookup_path. This is cached only in memcached; if we have to disable memcached for some reason, no caching occurs. This greatly reduced our number of queries, which now stands at approximately 80 per page.
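The path lookup caching described above follows a common cache-aside pattern: check the cache first, query the database on a miss, and bypass the cache completely when it is disabled. The sketch below shows the idea in Python, not our actual core patch; the `PathLookup` class, the key prefix and the plain dict standing in for a memcached client are assumptions for illustration.

```python
class PathLookup:
    """Cache-aside wrapper around a path alias query."""

    def __init__(self, query_alias, cache=None):
        # query_alias: callable taking a path and returning its alias
        # (or None); this is the step that hits the database.
        self.query_alias = query_alias
        # cache: dict-like stand-in for a memcached client, or None when
        # memcached is disabled, in which case nothing is cached at all.
        self.cache = cache

    def lookup(self, path):
        if self.cache is None:
            return self.query_alias(path)
        key = "path:" + path
        if key in self.cache:
            return self.cache[key]
        alias = self.query_alias(path)
        # Misses (None) are stored too, so unaliased paths do not
        # trigger a query on every request.
        self.cache[key] = alias
        return alias
```

A page that links to the same nodes many times then pays for each alias query once instead of once per link, which is where the reduction in queries per page comes from.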
Training the staff
The contributing and editorial staff of Crooks and Liars come from political and journalistic backgrounds rather than technical ones, so ease of use was a big factor. One concern was the removal of a WYSIWYG editor. Since our posts are rather simple in formatting, the switch to a text-only editor caused no problems. To facilitate training on other parts of Drupal and our custom modules, I made a series of short training videos. I used Camtasia Studio for these videos and simply demonstrated different functions, such as moderating comments, scheduling posts and assigning users new roles. These proved to be very beneficial and reduced the problems the staff had with such a major platform switch.
Final Thoughts
We have been running on Drupal for over a month now with no major incidents. The transition went more smoothly than anticipated. We had no problems with the conversion, and only a couple of very minor problems following the migration, which were mostly caused by coding errors in our custom modules. Even at our highest peak times our database and web servers sit almost idle. We recently moved to registered-only comments and have seen times with over 200 registered users online and over 3,000 anonymous visitors. At those times our server loads were minimal.
There is still a lot of work to be done. We haven’t started rolling out the new features we were planning on yet, but those will be coming in the near future. Our goal was to do an incremental roll out so we could easily hammer out any problems as they popped up. This has worked out extremely well.
With the minor changes between point versions of Drupal, we have also been able to easily patch our core with our modifications. I can easily update the site to a newer point version within about ten minutes. This was a big selling point and something we couldn’t do on Wordpress with the core modifications we had to make there.
John Amato, the founder of the site, is extremely happy with the outcome. We are looking forward to a long term relationship with Drupal and hope to contribute back to the community.