Pregnancy.org relaunched on Drupal - a case study
DrupalCon DC 2009 note: Ben Jeavons and Matthew Saunders of pingVision and Mollee Bauer, founder/owner of Pregnancy.org, will be making a case study presentation of the design and development of Pregnancy.org at 6:00 pm, Wednesday, March 4th at DrupalCon. [See the DrupalCon DC 2009 schedule.] Any videos or supplementary materials from that presentation will be linked here sometime after the presentation.
Since 2001, Pregnancy.org has been an informational and community resource for women about fertility, pregnancy, labor and child care. In November 2008, Pregnancy.org was relaunched on Drupal 6, with a new design and development by pingVision.
The site previously ran on a highly modified version of PHP-Nuke, which had become so unmanageable that the site's managers were posting static pages and building navigation just to avoid dealing with Nuke's increasingly cumbersome quirks. The site was also fitted with several custom "tools" providing special features for site members who were pregnant, wanted to get pregnant, or had babies. Forums on the site ran on vBulletin.
Despite these drawbacks, the site enjoyed high member activity and very good search engine rankings. Still, with the software in that state and the design growing a bit stale, Pregnancy.org founder and CEO Mollee Bauer retained pingVision to revamp the entire site, from a new logo to the back-end software.
Goals
The goals for the redesign and development included:
- better management and control over the site content
- the ability for members to blog
- common contemporary features like RSS feeds
- consistent navigation and user experience, including tighter integration with the community forums running on vBulletin
- a fresh new look
- maintained or even improved SEO
Finally, several custom tools needed to be redeveloped to work within Drupal.
Information Architecture
The site's fundamental content and community structure is broken out into four "Paths" – Getting Pregnant, Pregnancy, Labor and Delivery, and Baby and Beyond – each of which is targeted toward users who might be in that stage. Working with the client, we decided to highlight the Paths with blocks on the front page and to focus the navigation around those Paths and the sub-terms within each Path. Sub-terms can cross Paths.
Within this overarching structure, the site had two needs. As an informational site, it needed specific and consistent organization for the variety of content available. As a community site, it needed structures that let users focus on just the areas of greatest interest to them, including easy access to the relevant site tools they may want to use.
We also considered that much of the site's content and many of its features might cross two or more Paths. In the old site, content was largely relegated to one Path and had to be manually linked to other Paths. The new architecture needed to facilitate tagging and navigation across Paths. After all, being pregnant doesn't mean you're not interested in Labor and Delivery content or in connecting with parents of newborns.
Menu Navigation
The four Paths, Getting Pregnant, Pregnancy, Labor and Delivery, and Baby and Beyond, serve as the primary menu navigation terms. Path landing pages display content related to that Path and its sub-terms.
Administrators can feature content which will appear at the top of landing pages. A listing of content, grouped by content type, follows in reverse chronological order. The secondary links focus on content types and other site features like community, chat, and tools.
Taxonomy
The four Paths also serve as top terms in the primary vocabulary of the taxonomic structure of the site. These Path terms, and related subterms, serve as the tagging mechanism for the categorization of content. Secondary vocabularies with terms that may cross the various Paths – e.g., "nutrition" – provide a means for the end-user to move laterally across the Paths.
Technical Architecture
Content Types
There are three main content types: articles, questions, and blogs.
- Articles
- Articles are informational features written by subject-matter experts. Some of this content is topical, but most of it is evergreen content.
- Questions
- Questions are posed to Experts (much like newspaper columnists), who answer within the node using a CCK field to which they have access.
- Blogs
- Blogs are blogs, a way for users to write about their own thoughts and experiences.
Articles and questions dominate the site's content. Other content types include product reviews.
We also created node types specifically for use by the custom tools, such as baby names for the baby naming tool, the days of the pregnancy calendar, and glossary terms.
Drupal Modules
Many frequently used community-contributed modules were employed in constructing the site, including mainstays such as CCK, Views, and Pathauto. Other notable modules include:
- Clickpath – Stores pages viewed by the user. It was used as a "Recently Viewed" block for authenticated users.
- Custom Error – Custom 404 and 403 pages. The module allows custom PHP to run on the 404 page, which we used to redirect requests for the old site's articles to their new URL.
- Drupal vB – Integrates vBulletin forums with Drupal. Our client wished to keep the existing forums, integrated with the new site, to avoid throwing a lot of UI changes at her site members. Drupal vB synchronizes accounts between Drupal and vBulletin, handles single sign-on, and provides blocks of recent forum activity.
- Flag – Provides customizable user-to-node relationships, which we used to let administrators feature content on the front page and in designated blocks. Authenticated users can also flag nodes to inform administrators that a node is spam or contains objectionable material.
- Profile Privacy – Privacy settings for profile fields.
- Nodecarousel – The custom module pingVision had initially developed for PopSci [case study], which leverages jQuery's jCarousel in this case to provide scrolling through nodes on Path landing pages.
- Chart API – Provides an API to Google Charts, replacing custom Flash implementations to generate line charts of temperatures in the custom Basal Body Temperature (BBT) Charting Tool.
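The legacy-URL redirect described under Custom Error can be sketched roughly as follows. This is an illustration in Python rather than the PHP that actually ran on the 404 page, and the mapping, paths, and function name are all hypothetical; the source does not say how old URLs were matched to new ones.

```python
from typing import Optional, Tuple

# Hypothetical mapping from old PHP-Nuke era article paths to new Drupal paths.
# The real site would have derived this from its own content import.
LEGACY_REDIRECTS = {
    "/articles/morning-sickness.php": "/article/morning-sickness",
    "/articles/prenatal-vitamins.php": "/article/prenatal-vitamins",
}


def resolve_404(request_path: str) -> Tuple[int, Optional[str]]:
    """Decide what a 404 handler should do with an unknown path.

    Returns (301, new_path) when the path is a known legacy URL,
    otherwise (404, None) so the custom error page is shown.
    """
    # Ignore any query string and compare case-insensitively.
    path = request_path.split("?", 1)[0].lower()
    target = LEGACY_REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, None
```

A permanent (301) redirect is the usual choice here because it tells search engines the content has moved, which fits the stated goal of maintaining the site's SEO.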
Custom Modules
pingVision also wrote several custom modules to handle certain features:
- PDS – Soon to be released, PDS is an API for retrieving data related to a page. It was developed to help display third-party advertisements. The site's primary advertisements are served from DoubleClick. To let the client sell advertising targeted to different subject areas, pages had to be mapped to a rigid set of parameters (referred to as tags) that supply the relevant arguments to the JavaScript calling up the ads from the ad servers. PDS solved the problem of needing identical arguments (or "tags") for all ads (blocks) on a page while the tags differ from page to page.
- The Custom Tools – Tools are features unique to the site, for use by visitors. Several of the tools are Views of specific content, with the exception of the Calendar and BBT Charting tools.
- Baby Name Database – A glossary View of baby name nodes.
- Calendar – A calendar of development and pregnancy information tailored to the user's conception and due dates. Content for a day is a node with a CCK integer field which stores the number of days since conception.
- BBT Charting – A resource for tracking and charting fertility cycles. The user records various data each day in a node, and the module checks for temperature changes, among other indicators. The user can browse through their data in a calendar display or see a temperature line chart generated with the Google Charts API.
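The Calendar's lookup by "days since conception" can be illustrated with a small sketch. It is written in Python for brevity and is not the module's actual code; the dictionary stands in for the calendar-day nodes and their CCK integer field, and the entries and function names are made up for the example.

```python
from datetime import date
from typing import Dict, Optional

# Hypothetical stand-in for calendar-day nodes, keyed by the integer field
# that stores the number of days since conception.
CALENDAR_DAYS: Dict[int, str] = {
    0: "Day of conception",
    14: "Two weeks after conception",
    28: "Four weeks after conception",
}


def days_since_conception(conception: date, today: date) -> int:
    """Return the whole number of days between conception and today."""
    return (today - conception).days


def calendar_entry(conception: date, today: date) -> Optional[str]:
    """Look up the content for today's day number, if any exists."""
    return CALENDAR_DAYS.get(days_since_conception(conception, today))
```

Keying the content on a relative day number rather than a calendar date means the same set of nodes serves every user, whatever their conception date.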
Graphic Design and Theme
The new design, including the logo, was created entirely in-house. pingVision worked up several design approaches, and worked with the client towards the final look. The new design contains more gender-neutral pastel colors, maintaining a soft, comfortable look while establishing a sort of clean integrity. "This is a site you can trust," was the main underlying feeling in the design.
The theme itself makes heavy use of preprocess and theme functions to alter the markup before it reaches the template files. The theme is fully functional in all modern browsers, including IE6.
Content and User Import
Because the client had resorted to working around PHP-Nuke in order to keep the site functional, the content import was largely based on consuming a series of PHP files and images that resided on the server. Scanning the directories and determining the file type was the first step for pulling the data into Drupal.
We wrote an import API (currently located in Drupal.org's contributions repository at contributions/modules/import/) to manage the process, with imports triggered in batches by cron. The majority of the content followed similar HTML layouts, so we used regular expressions to extract the title, author, and node body portions of the previous documents. Working with a content mapping from the client, we matched content with taxonomy terms for the new site.
The regular expressions didn't always work. We discovered the exceptions by running the import manually and reviewing the failures. Eventually the exceptions were resolved and a complete import could proceed.
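As a rough illustration of this kind of extraction, the sketch below pulls a title, author, and body out of one legacy page and reports a failure when any pattern does not match. It is written in Python rather than PHP, and the HTML layout, patterns, and function name are assumptions; the source does not describe the actual markup or expressions.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for an assumed legacy layout:
# <h1>title</h1>, <p class="author">By name</p>, <div class="content">body</div>
PATTERNS = {
    "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.IGNORECASE | re.DOTALL),
    "author": re.compile(
        r'<p class="author">\s*by\s+(.*?)</p>', re.IGNORECASE | re.DOTALL
    ),
    "body": re.compile(
        r'<div class="content">(.*?)</div>', re.IGNORECASE | re.DOTALL
    ),
}


def extract_article(html: str) -> Optional[Dict[str, str]]:
    """Return title, author, and body, or None if any pattern fails.

    Returning None lets the caller log the file for manual review,
    which mirrors how exceptions were found by inspecting failures.
    """
    fields: Dict[str, str] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(html)
        if match is None:
            return None
        fields[name] = match.group(1).strip()
    return fields
```

Regular expressions are fragile against nested or irregular markup (a body containing its own `<div>` would be cut short here), which is one plausible source of the exceptions described above.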
Over 270,000 users were imported from PHP-Nuke's user tables using the import API as well, mapping Nuke's user data into profile fields where possible. Users needed to be matched not only with PHP-Nuke's database, but also with the vBulletin instance, which had additional users we had to add to the Drupal user table.
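The matching of PHP-Nuke and vBulletin accounts might look something like the sketch below. It assumes accounts are matched on a case-insensitive username, which the source does not state, and the field names and function name are hypothetical.

```python
from typing import Dict, Iterable, List


def merge_users(
    nuke_users: Iterable[Dict[str, str]],
    vb_users: Iterable[Dict[str, str]],
) -> List[Dict[str, object]]:
    """Combine two user lists into one, keyed by lowercased username.

    PHP-Nuke records are taken first; vBulletin-only accounts are added
    afterwards so that every forum member also gets a site account.
    """
    merged: Dict[str, Dict[str, object]] = {}
    for user in nuke_users:
        key = user["name"].lower()
        merged[key] = {
            "name": user["name"],
            "mail": user["mail"],
            "sources": ["nuke"],
        }
    for user in vb_users:
        key = user["name"].lower()
        if key in merged:
            # Same person in both systems: record the link, keep one account.
            merged[key]["sources"].append("vbulletin")
        else:
            merged[key] = {
                "name": user["name"],
                "mail": user["mail"],
                "sources": ["vbulletin"],
            }
    return list(merged.values())
```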
Pushing the Drop a little bit forward
Because the design and development process was scheduled to take place over several months, building the new site in Drupal 6 was the first challenge. The site began as a Drupal 6.2 install early in 2008. At the time, few of the needed community modules were available for Drupal 6. We helped port, test, and review several contributed modules as part of the development, knowing that stable modules in the community repository were in both our and the client's best interest. Although developing in Drupal 6 in the spring of 2008 was a tough choice, in hindsight it was the right one.
Performance
Pregnancy.org has reasonably high traffic. In January 2009:
- 582,109 Unique Visitors from 202 Countries/territories
- 8,354,186 Page views
- 60.11% New Visits
- 23.18% Direct Traffic
- 7.70% Referring Sites
- 69.11% Search Engines
With over 8 million page views a month, we did some performance tuning to ensure site responsiveness. Drupal's built-in caching schemes are in play, and we installed memcached on the server along with the Memcache module to reduce database hits. Pregnancy.org runs on two web servers with a load balancer in front. The files directory is mounted from another machine that also houses the MySQL server.
Images on the old site sat in a number of different locations. Not all of these images were brought over or kept the same path, so a lot of 404s showed up in the logs. Bootstrapping Drupal just to serve 404s for images outside the files directory is expensive, so we added mod_rewrite rules to avoid doing so and saw an increase in the number of requests served per second.
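The decision those rewrite rules encode can be modelled in a few lines. This Python sketch only illustrates the routing logic and is not the Apache configuration used on the site; the extension list, the files directory prefix, and the function name are assumptions.

```python
from pathlib import PurePosixPath
from typing import Set

# Assumed values for illustration only.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".ico"}
FILES_PREFIX = "/sites/default/files/"


def route(path: str, existing_files: Set[str]) -> str:
    """Classify a request the way the rewrite rules would.

    "static"    - the file exists, so the web server serves it directly
    "plain-404" - a missing image outside the files directory, answered
                  by the web server without bootstrapping Drupal
    "drupal"    - everything else is handed to Drupal
    """
    if path in existing_files:
        return "static"
    is_image = PurePosixPath(path).suffix.lower() in IMAGE_EXTENSIONS
    if is_image and not path.startswith(FILES_PREFIX):
        return "plain-404"
    return "drupal"
```

Missing images inside the files directory still reach Drupal in this sketch, since the source only mentions skipping the bootstrap for images outside it.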
Drupal version: Drupal 6.x