Case Study of OpenTheMagazine.com
The Open Magazine is a new current affairs/entertainment weekly magazine available on magazine stands in India. The magazine was launched in the first quarter of 2009.
The publisher of Open Magazine, Open Media Pvt. Ltd., wanted a web presence for its magazine. It selected Srijan Technologies, a content management specialist company based out of New Delhi, India, to construct the website. Srijan's responsibility was to build the website from scratch and host it. Site design was done by Itu Chaudhuri Design.
Srijan Technologies used Drupal 6. Development took place over a period of approximately 4 months and the Open website was released to the public in July 2009.
Currently the website shows all the content that appears in the print version of the magazine online for free. The website also contains some exclusive web content that is not present in the print version. The website is substantially updated once a week though web exclusive content is updated more frequently to keep the site "fresh."
The purpose of this case study is to introduce all significant (mostly technical) aspects of the website from Srijan's perspective. Feedback and comments are requested.
Key features of the website
- A visually appealing website with a lot of custom theming
- An eMag feature which exactly reproduces the magazine on the web (third party Flash integration)
- Advanced Search features using the Apache Solr search engine
- Integration with ICICI bank's payment gateway for online subscription to the magazine
- Incorporates rich multimedia content (slideshows, videos and images)
We now look at some of the key features in greater detail.
Apache Solr Search
Our client was very focused on a good search experience from the very beginning. After evaluating a couple of options we decided to use Apache Solr for search because:
- Faceted Search is really cool! Once you get the hang of it, it allows you to explore content on the website easily
- Apache Solr is a capable search engine with a lot of "enterprise" class features lacking in default Drupal search (or even the faceted search module*)
- Apache Solr is scalable. As the number of nodes on your website increases, the search experience does not degrade (PHP based search is an incredible resource hog and may not scale well). Search can be offloaded to another machine, further taking load off the webservers. Apache Solr has a Java backend.
- Apache Solr is easily customizable. Here is a sample search result page. We have themed the search result intensively to give a nice user experience.
* Faceted Search is a separate Drupal Module -- not to be confused with the faceted search feature provided by Apache Solr
Some unique features on our implementation:
- We show the teaser of the article in addition to the textual matches returned by Apache Solr
- Apart from selecting facets you can also search within facets. For example you can search for "Jack" in the author "facet." This will return articles with both "Jack" and "Jack Straw" as author.
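As a rough illustration of the "search within a facet" idea above, the sketch below builds a Solr request that facets on the author and narrows results with a filter query. The parameters q, fq, facet, facet.field and wt are standard Solr request parameters; the field names "author" and "author_text" (a tokenized copy of the author field, which is what would let "Jack" also match "Jack Straw"), the base URL and the function name are assumptions made for the example, not details of our schema.

```python
from urllib.parse import urlencode


def build_facet_search_url(base_url, keywords, author_query=None):
    """Build a Solr select URL that facets on author and, optionally,
    restricts results to authors matching author_query."""
    params = [
        ("q", keywords),
        ("facet", "true"),
        ("facet.field", "author"),  # hypothetical field name
        ("wt", "json"),
    ]
    if author_query:
        # A filter query narrows the result set without changing scoring.
        # "author_text" is a hypothetical tokenized copy of the author field.
        params.append(("fq", f"author_text:{author_query}"))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr instance.
url = build_facet_search_url("http://localhost:8983/solr", "election", "Jack")
print(url)
```

Keeping the facet restriction in fq rather than in q means the keyword relevance ranking is unaffected by the facet search.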
Overlays
One of the most distinctive features of our website, we believe, is the overlay drop down menu. Our magazine is very visual and there is no reason why the dropdown menus should not be exciting!
This uses the jQuery plugin available here as a base.
E-Magazine Feature
The Open Magazine is a paper magazine. The customer wanted a way for the site visitor to see a virtual paper version. We have done this using the Issuu.com Flash plugin.
Please check our latest emag.
Theming
The site is custom themed using Zen as the base theme. The site has a lot of "bling" and most of the site development time was spent on:
- Customizing the Zen provided node, page etc. template (.tpl.php) files and attaching CSS to them
- Using the template files produced by Views (often minimally customizing them) and attaching CSS to them.
We did not use Panels anywhere on the site. The site was designed to be compatible with Google Chrome, Safari, Internet Explorer (7 and above) and Firefox. Internet Explorer caused us to tear our hair out on many occasions during the project.
Fundamental Modules
The power of Views 2 with the Content Creation Kit (CCK)
The heroes of the project were the Views 2 & CCK duo (BTW Apache Solr was also a hero!). If it weren't for these extremely powerful modules we'd easily be looking at spending 2x-3x the development time.
Just look at the complex functionality we were able to achieve just using Views (with CCK)!
A Note on Content Types
There is an inherent tension in making content types: Too few content types and you have bloat within a content type. The symptom of that is that you will have many fields in the content type (many of those fields are only going to be used rarely). On the other hand, with too many content types code reuse goes for a toss. The trick is to find a balance and we believe that we did on OPEN Magazine. After major back and forth we finally settled on two main content types: "articles" and "shorts" (there are other minor ones). The article is the long-form content type (example) and the shorts are the short articles, often grouped together (example).
Other Major Modules
Here we list some of the other major modules used on our site.
CCK/Views Family
- Imagefield & Imagecache: For all our image needs!
- Nodereference: A powerful module included with the CCK that allows content administrators to link / associate content together. On our site we use Nodereference when we want to manually suggest articles to the reader, create a Table of Contents for a particular issue, create a Package article which is nothing but a container for other articles etc.
- Nodequeue: Allows you to create a static "list" of articles and show it in a view. Provides a user friendly backend where content administrators can easily add, remove or change the order of items in a nodequeue. Our front page consists of many nodequeues. For example the 5 essentials on the frontpage is implemented as a view on a nodequeue.
- Filefield Paths: Very useful module that gives a high degree of control over the server path and file name of a filefield (Note: imagefields are also filefields). We use it to prefix the file name of all uploaded images with the node id of the associated article. So “baloons.jpg” becomes “2344.baloons.jpg”. Useful for tracing images back to their nodes by just looking at their file name.
- Filefield Insert: Allows easy insertion of inline images into your rich text editor
- Vertical Tabs: Reduces the clutter on your node edit form
- Views Bulk Operations: A fantastic module! We use it to make our custom content management interface. This content management interface allows users to filter nodes on the basis of content type, magazine issue date, magazine section etc.
- Slideshow Pro: Slideshow Pro is an image gallery Flash plugin. This module provides Drupal integration. See an example gallery.
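The Filefield Paths naming convention mentioned in the list above (node id prefixed to the uploaded file name) can be mirrored in a few lines. This is only a sketch of the convention, not of the module itself, which is configured through the field settings in Drupal; both function names are made up for the example.

```python
import os


def prefix_with_node_id(node_id, original_name):
    """Return the stored file name: '<node id>.<original file name>'."""
    return f"{node_id}.{os.path.basename(original_name)}"


def node_id_from_filename(stored_name):
    """Recover the node id from a stored file name, or None if absent."""
    prefix, _, _ = stored_name.partition(".")
    return int(prefix) if prefix.isdigit() else None


stored = prefix_with_node_id(2344, "baloons.jpg")
print(stored)                         # 2344.baloons.jpg
print(node_id_from_filename(stored))  # 2344
```

The second function is the practical payoff: given any image path from a log or a CDN report, the owning node can be found without a database lookup.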
Commenting
- Comment Notify: Notifies commenters of fresh comments on articles they have commented on
- Ajax comments: Allow comments to be posted without a page refresh
- Captcha: Simple spam prevention.
Taxonomy/Tagging
- Cumulus: Cumulus presents the tags in a visually engaging manner
- Tagadelic Views (+ Tagadelic): Sometimes you want to see tags for content only from a particular section, e.g. tags from all articles in the Business section. This allows you to get tags from a group of nodes rather than all nodes of a particular content type as is the usual case
- Hierarchical Select: Very useful module that creates a taxonomy selection widget in the node edit form. We used it mainly to force users to select the leaf nodes of a particular taxonomy hierarchy.
- Taxonomy Manager: Good for merging taxonomy terms, deleting taxonomy terms and so on
Other
- FCKeditor: Our Rich Text Editor. However we would recommend that any new site use the WYSIWYG API as the umbrella module for Rich Text Editors.
- Rules: An extremely powerful and useful module. It allows us to do all kinds of useful things like redirect articles on save. A great example of its use is the smallworld page (example). This page is a view consisting of many instances of a CCK content type called “Shorts”. After a content administrator creates a “Shorts” node and saves it, we use Rules to redirect them to the view that contains it. We also use Rules to send emails to article authors when someone comments on their articles.
- Fivestar: Used for ratings on movie reviews
- Apache Solr search: Discussed above
- SimpleCDN: The SimpleCDN module allows you to change all imagefield paths so that they point to a Content Delivery Network (CDN). The module supports any CDN that uses HTTP mirroring. In our case we use it with the SimpleCDN content delivery network.
- Webform: Site visitors can subscribe to our paper magazine by filling in a form (implemented using Webform). The Webform module tracks past submissions, validates the form and redirects the submission to the credit card gateway.
Search Engine Optimization / Advertising / Tracking
- Global Redirect: Redirects all URLs like /node/2344 to their aliased path
- Integrated Metatags: Meta tags for search engine optimization
- Pathauto: Assigns human friendly URLs to nodes
- OpenX (formerly OpenAds): We are in the process of rolling out advertising on our site. We chose OpenX as our ad manager because Google Ads requires the website to be up for at least 6 months if your site is from India/China.
- Google Analytics: A must have for your website if you want to track your visitors
Development Related
- Features Module: An Amazing module! For CCK and Views exports (See below)
- Devel: Provides a suite of tools for the site developer.
- Drush: Command Drupal from the command line!
- Drupal for Firebug: Very powerful module! Integrates with Firebug. Allows you to look at the form API representation of a page. Execute PHP code in the Drupal context and more.
- Admin Menu: Allows easy site administration using an unobtrusive menu dropdown at the top of the page.
- Backup and Migrate: We take regular backups of the Drupal database to protect against loss of data. The module allows you to dump the contents and/or structure of only the tables you need. This allows us, for instance, to dump all the tables in our Drupal database in full, except for the cache tables, where we only take a structure dump.
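The "full dump, but structure only for the cache tables" idea from the Backup and Migrate entry above can also be expressed with plain mysqldump. The sketch below only builds the two command lines and does not run them. It is not how Backup and Migrate works internally; the database name "drupal" and the function name are assumptions, while --no-data and --ignore-table are standard mysqldump options.

```python
def build_dump_commands(database, cache_tables):
    """Return (data_cmd, structure_cmd) as argument lists for mysqldump."""
    # First pass: rows and structure for everything except the cache tables.
    ignore_flags = [f"--ignore-table={database}.{t}" for t in cache_tables]
    data_cmd = ["mysqldump", *ignore_flags, database]
    # Second pass: structure only (no rows) for the cache tables.
    structure_cmd = ["mysqldump", "--no-data", database, *cache_tables]
    return data_cmd, structure_cmd


# "drupal" is a hypothetical database name; the table names are
# standard Drupal 6 cache tables.
data_cmd, structure_cmd = build_dump_commands(
    "drupal", ["cache", "cache_page", "cache_menu"]
)
print(" ".join(data_cmd))
print(" ".join(structure_cmd))
```

Restoring both dumps gives a complete schema with empty cache tables, which Drupal repopulates on its own.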
Comments on Development (And a plug for the features module!)
We used Drush extensively during the development process. It's a nifty command line tool used to accomplish all kinds of administrative tasks. All Drupal module code was checked out from drupal.org (using CVS as the default package handler in Drush). We used SVN for our internal version control.
A prime problem during Development is that a big chunk of your configuration is stored in the database. That's not good because you can't track those changes and roll back easily. The solution is to bring that configuration as much as possible to source code where it can be checked into the version control system. This allows you to diff and roll back easily.
The Features module helps you accomplish the above tasks easily.
- Easily Manage View exports
- Imagecache exports
- CCK content type definition exports
- You can bundle related views, content types and imagecache presets as "features" that can easily be enabled or disabled
- Automatically discover module dependencies in a "feature"
- "Features" live as modules in your filesystem and can therefore be version controlled
- Find what has changed between what's actually in the database and a particular "feature." Useful for discovering accidental configuration changes on your live server that may be causing problems
A lot of the above can be accomplished by using the export features of the Views and CCK modules, but it's much tougher and does not have the integrated and easy feel that Features provides.
We used the Features module on our website for all Views, CCK and Imagecache exports and we suggest you use it for your site too!
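To make the "what has changed between the database and the code" idea concrete, here is a minimal sketch of the comparison, with configuration reduced to flat dictionaries. It illustrates the concept only; it is not the Features module's implementation, and the setting names are invented for the example.

```python
MISSING = "<missing>"


def diff_config(exported, live):
    """Return {key: (value_in_code, value_in_database)} for every
    setting that differs between the exported and the live config."""
    changes = {}
    for key in sorted(set(exported) | set(live)):
        code_value = exported.get(key, MISSING)
        db_value = live.get(key, MISSING)
        if code_value != db_value:
            changes[key] = (code_value, db_value)
    return changes


# Hypothetical view settings: what is in version control vs. the live site.
exported = {"items_per_page": 10, "sort": "created DESC"}
live = {"items_per_page": 25, "sort": "created DESC", "use_ajax": True}

for key, (code_value, db_value) in diff_config(exported, live).items():
    print(f"{key}: code={code_value!r} database={db_value!r}")
```

An empty result means the live site still matches what was checked in; anything else is a candidate for either re-exporting or reverting.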
Customer Data Entry and "Staging"
We now talk about how we solved a common client requirement and then we talk about a client responsibility.
“Trial” Feature of the homepage: A "staging" type customer requirement
Before making content changes to the site, the client would usually like to see how the site would look after new content was uploaded. This is tricky. We all know that before you publish nodes you can preview them. But what if you want to preview complex pages like the home page? We have provided what we believe is a simple way for the client to preview the homepage (this is simpler than the pure staging solution requested by many clients):
- Our homepage is basically a set of nodequeues containing featured content. We created two copies of each nodequeue on the homepage. One set is the "live" homepage nodequeues and the other set is the "trial" homepage nodequeues. The trial homepage nodequeue views do NOT check if the node is published, so that unpublished content can also be previewed. The "trial" homepage is available on a separate, login-only URL.
- Once the client is satisfied with the look and feel of the trial homepage, he can choose to transfer all the nodes in the trial nodequeues to the “live” nodequeues by simply clicking a button (We do programmatic population of the “live” nodequeues.)
- We have some more small tricks here for other situations that we can talk about if people are interested
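The trial-to-live transfer described above boils down to copying the contents and order of each trial queue over its live counterpart. The sketch below models nodequeues as plain lists of node ids; the queue names and the function are hypothetical and stand in for the programmatic nodequeue population done on the real site. Publishing the promoted nodes, which the live views would presumably require, is not shown.

```python
def promote_trial_to_live(queues, pairs):
    """Overwrite each live queue with a copy of its trial queue.

    queues: dict mapping queue name -> ordered list of node ids
    pairs:  dict mapping trial queue name -> live queue name
    """
    for trial_name, live_name in pairs.items():
        # Copy the list so later edits to the trial queue
        # do not leak into the live homepage.
        queues[live_name] = list(queues[trial_name])
    return queues


# Hypothetical queue names and node ids.
queues = {
    "essentials_trial": [2344, 2350, 2361],
    "essentials_live": [2101, 2102],
}
promote_trial_to_live(queues, {"essentials_trial": "essentials_live"})
print(queues["essentials_live"])  # [2344, 2350, 2361]
```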
Test Data Entry by Client: A Client Responsibility
We got the client involved quite early on to do test data entry. There is no way to finalize your content types until the client starts doing data entry, because that is when all the edge cases come out and the content types really get tested. Please note that content types are the most important building block on your site, so treat them with respect!
We forced the client to concentrate on test data entry (clients always seem to care about look and feel more!)
If there is one suggestion we can make for this whole case study: Please do NOT delay test data entry!
Site Hosting and Performance
We will give a small overview of how we hosted and optimized www.openthemagazine.com. This information is not common in case studies but we hope that you will find it useful.
Server Architecture
Openthemagazine.com is hosted in the “cloud.” We have the following virtual machines (simplified description):
- Load balancer (Haproxy)
- 2 Apache Webservers (containing the Drupal PHP scripts)
- NFS Webserver (for the shared boost cache and the files folder). This server also runs Memcache.
- MySQL database server
- Apache Solr (for our search engine)
The Open Magazine is image intensive. To offload some of the server load and improve site responsiveness (and decrease our bandwidth bill!) we use the SimpleCDN content delivery network.
Boost Caching
We use Boost caching to reduce server load. Boost is a very powerful module that converts dynamic pages to static HTML. This way the PHP interpreter does not need to run on each request. Apache simply serves out the pre-calculated static pages. Server response times are significantly decreased because PHP scripts can typically take an order of magnitude more time to process than static pages.
Boost is similar to Squid or Varnish but because it is tightly integrated with Drupal it is able to expire the content intelligently (e.g. it will expire a page as soon as someone posts a comment on it. This would be difficult to achieve without writing custom hooks in non-integrated caching solutions).
Please note that you should only use Boost if most of your visitors are anonymous (which is the case for us).
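The behaviour described in this section (render once, serve the stored copy, expire the page when a comment is posted) can be sketched with an in-memory cache. Boost itself stores pages as files that Apache serves directly, so this is a simplified model of the idea rather than of the module; all names below are made up for the example.

```python
class StaticPageCache:
    """Keeps rendered pages until they are explicitly expired."""

    def __init__(self, render):
        self._render = render   # function: path -> HTML string
        self._pages = {}
        self.render_count = 0   # how often the slow path ran

    def get(self, path):
        if path not in self._pages:
            self._pages[path] = self._render(path)
            self.render_count += 1
        return self._pages[path]

    def expire(self, path):
        self._pages.pop(path, None)


comments = {"/article/1": []}


def render(path):
    # Stand-in for a full Drupal page build.
    return f"<h1>{path}</h1><p>{len(comments[path])} comments</p>"


cache = StaticPageCache(render)


def post_comment(path, text):
    comments[path].append(text)
    cache.expire(path)  # the integrated step: content change expires the page


cache.get("/article/1")
cache.get("/article/1")                   # served from the cache
post_comment("/article/1", "Nice piece")
page = cache.get("/article/1")            # rebuilt after the comment
print(cache.render_count)                 # 2
```

The expire call inside post_comment is the part that an external cache cannot do on its own, since it has no knowledge of which content event affects which URL.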
Memcache
We use Memcache, through Drupal's Memcache module, to serve all our cache tables and to do all our session handling. Using Memcache in this fashion gives us a big performance boost.
Sprite Creation
We created sprites for a lot of our background images for our (design intensive) website. Sprites allow multiple background images to be downloaded within an "omnibus" image. This reduces the number of separate HTTP connections that need to be initiated for each visit (and thus makes our site finish downloading faster).
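For readers new to sprites: each element shows its own slice of the omnibus image by shifting the background with a negative offset. A small sketch, assuming the images are stacked vertically in a single column; the image names and heights are invented for the example.

```python
def sprite_offsets(images):
    """Given (name, height_px) pairs stacked top to bottom, return the
    CSS background-position value that reveals each image."""
    offsets = {}
    y = 0
    for name, height in images:
        offsets[name] = f"0 -{y}px" if y else "0 0"
        y += height
    return offsets


# Hypothetical icons and their heights in pixels.
layout = [("icon_home", 32), ("icon_search", 32), ("logo_small", 48)]
for name, position in sprite_offsets(layout).items():
    print(f".{name} {{ background-position: {position}; }}")
```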
Other Drupal/PHP Optimizations
The most important optimization is to look at the HTTP expiration headers that your site generates. It's critical to have far-in-future expiry dates for all your static/rarely changing content (e.g. CSS, Javascript, images) and short expiry times for HTML. This way repeat visitors will only download the HTML and their browsers will use their local cache for CSS, Javascript and images.
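The policy can be summarized as a small lookup: long lifetimes for static assets, short ones for HTML. The sketch below expresses it as a function returning a Cache-Control value. The extension list and the two lifetimes are assumptions chosen for illustration, and on an Apache setup this kind of rule would more typically live in the server configuration (for example via mod_expires) than in application code.

```python
import os

ONE_YEAR = 365 * 24 * 60 * 60  # seconds
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif"}


def cache_control_for(path, html_max_age=300):
    """Return a Cache-Control header value for the requested path."""
    extension = os.path.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        # Static assets: let browsers keep them for a long time.
        return f"public, max-age={ONE_YEAR}"
    # HTML and everything else: short lifetime so updates show up quickly.
    return f"public, max-age={html_max_age}"


print(cache_control_for("/sites/default/files/css/site.css"))
print(cache_control_for("/business/some-article"))
```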
We also enable CSS and Javascript aggregation. This reduces the number of files to be downloaded (this helps in the same way as sprite creation). It is also important to enable this if you are going to set far-in-future expiry for the static content (Please ask me why if you're interested. Short version: Every time you change JS/CSS during the course of development the name of the aggregated file changes. This way, even though these files are set to expire far into the future, web browsers fetch the new CSS/JS file referenced by the HTML instead of using the old one in the local cache.)
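The renaming effect can be illustrated by deriving the aggregate's file name from a hash of its inputs, so that any change produces a new URL. This is a content-hash sketch of the general technique and not Drupal's actual naming scheme; the file names and contents are invented.

```python
import hashlib


def aggregate_name(sources, extension):
    """Derive a file name from the names and contents of the source files.

    sources: dict mapping source file name -> file contents (str)
    """
    digest = hashlib.md5()
    for name in sorted(sources):
        digest.update(name.encode("utf-8"))
        digest.update(sources[name].encode("utf-8"))
    return f"{extension}_{digest.hexdigest()}.{extension}"


# Hypothetical stylesheets.
css = {"layout.css": "body { margin: 0; }", "menu.css": ".menu { color: red; }"}
before = aggregate_name(css, "css")

css["menu.css"] = ".menu { color: blue; }"  # a developer edits one file
after = aggregate_name(css, "css")

print(before != after)  # True: browsers see a new URL and fetch it
```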
We also use APC opcode caching.
Miscellaneous
We used Yahoo's YSlow plugin for Firebug extensively for website performance measuring and optimization. Highly recommended!
Conclusion
We hope you enjoyed reading this case study. Visit our website at www.srijan.in to learn more about us or write to us at business[at]srijan.in for your business inquiries. We'll be glad to answer any queries about this site!
Drupal version: Drupal 6.x