Now on Drupal - HamptonRoads.com and PilotOnline.com
Hello all,
I wanted to introduce myself, and our now-powered-by-Drupal sites, PilotOnline.com and HamptonRoads.com. I'm Jeff Anderson, and I work as the product development manager for the interactive division of The Virginian-Pilot newspaper based in Norfolk, Virginia. The Pilot is a top 50/60 U.S. daily newspaper by circulation size, and PilotOnline.com was one of the first newspaper websites to go online in the early 1990s. Along with its sister site HamptonRoads.com, a local portal focused on entertainment, community and local guides, we recently ranked No. 6 in the U.S. in terms of the percentage of users in a local market who use a particular locally-focused site. On December 10, we re-launched both sites on a customized Drupal 5.x platform.
I wanted to share a brief outline of what we did, but first, let me just say that we don't consider the site in its current state to be a finished product... not that web sites are ever really finished, but the point is that we feel this is just the starting point for the things we're hoping to accomplish.
We have had a core team of five people working on the project over the past few months - three programmers, who each took ownership of writing/customizing different major modules and data import processes, one designer/themer, and myself as product/project manager. We also had lots of support and input from others on our operations, design and content teams. Soon after making the decision to try rebuilding our sites with Drupal, we arranged for Barry Jaspan, who had worked on the New York Observer site earlier this year, to come visit our office for a week and help our team get started.
The biggest challenge, I think, was in having our team learn to work in a new language, adopt new methods and use new processes - all under the time pressure of this being a major company priority, which forced every non-critical maintenance task and other development project to the back burner.
Our existing content management system was entirely home-grown, developed incrementally over the course of several years, and a mix of mostly Cold Fusion 5/SQL, with some newer pieces in CF 7/MySQL. Nobody on the team was unfamiliar with or concerned about working in PHP, but at the same time, the switch in application languages was one portion of the challenge. Working collaboratively for the first time using a version-control development environment was another. Learning how Drupal core and various modules work, and understanding how to customize them to fit our needs, was probably the most challenging aspect of the project.
We avoided the temptation to modify core wherever possible, but in a few areas - primarily the comment and user modules - changing core turned out to be our best solution.
Highlights of our Drupal configuration:
- Three main custom modules and templates - a "channel builder", a local events/calendar product, and a custom multi-blogger setup.
- 18 different content types - several that are devoted to node-building or node-aggregation components like channels, bloggers, writers, and event venues.
- 70+ CCK fields - the vast majority of them split between stories and events calendar.
- Five different taxonomy vocabularies - site sections, events calendar categories, community photo gallery categories, local cities, and a free tag category.
- Several different access roles, including various levels for public users and various levels for internal administration.
Data import process
The data import was fairly daunting, due to intricacies too detailed to share here. But by the numbers, here's what we pulled in:
- More than 50K nodes, between stories, blog posts from 70+ bloggers, calendar events and photos.
- More than 50K story comments
- More than 250K users
The process for doing this was generally to write scripts in Cold Fusion to export the data from SQL or MySQL, then clean up, parse or merge data as necessary, and import into Drupal's corresponding tables in a different MySQL database. Additional steps included MySQL tasks and custom PHP scripts to update the data once inside Drupal's tables as needed to make it work inside of Drupal. I realize this is a very non-technical glossing over of what some of you may be most interested in, but I'll defer to our developers to add on or answer questions about the details I'm not qualified to speak to.
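To make the "clean up, parse or merge" step a little more concrete, here is a minimal sketch of mapping one exported legacy story row to a record shaped like a Drupal node. It is written in Python purely for illustration (the real export scripts were Cold Fusion, with follow-up work in PHP), and the legacy column names, the title length limit and the UTC timestamp handling are all assumptions, not the production schema.

```python
import html
from datetime import datetime, timezone

MAX_TITLE_LENGTH = 128  # assumed limit for illustration


def clean_text(value):
    """Decode leftover HTML entities and collapse runs of whitespace."""
    return " ".join(html.unescape(value or "").split())


def legacy_story_to_node(row, uid_map):
    """Map one exported legacy row (hypothetical columns) to a node-like dict."""
    # Assumes the legacy export stores publication times as UTC strings.
    published = datetime.strptime(row["pub_date"], "%Y-%m-%d %H:%M:%S")
    created = int(published.replace(tzinfo=timezone.utc).timestamp())
    return {
        "type": "story",
        "title": clean_text(row["headline"])[:MAX_TITLE_LENGTH],
        "body": row["body"].strip(),
        # Fall back to uid 0 (anonymous) when the legacy author was not migrated.
        "uid": uid_map.get(row["author_id"], 0),
        "status": 1,
        "created": created,
        "changed": created,
    }
```

Keeping the mapping in one small function per content type makes it easy to re-run an import after fixing a cleanup rule, which matters when the source data has years of accumulated quirks.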
Key problem-solving
There were a few areas where we had some requirements that we understood would be, to some degree, "going against the grain" of Drupal - and almost the first thing we did was set out to find solutions, at least at a proof-of-concept level, in the following areas:
Enforcing standards, without stifling participation: We absolutely wanted to make a change to our previous approach to user-generated content and participation. We've had good success for years in terms of the quantity and quality of UGC, but as that has grown, under a model of having our staff moderate ALL of it, we realized that we needed to shift to a different model in order to both keep growing the participation, and hopefully free up content staff time for other duties. But at the same time, nobody was comfortable with a shift to a totally open, "Wild West, like the rest of the internet" level of discourse. So we established a layered system to handle this, primarily applying to the comment module. We plan to keep the exact details of this private so as not to encourage attempts to 'game' the system. But I can share that our approach includes a profanity filter which logs every submission that trips it, the creation of a "trusted user" role which gives the large majority who abide by the rules the opportunity to post in real-time, and continued staff moderation of content submitted by new users and those who have trouble following our guidelines. In addition, we have a 'flag this' function for users to help us spot anything that still slips through.
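For readers who want a picture of what a layered approach like this can look like, here is a generic sketch in Python. It is deliberately not the actual rule set, which stays private: the word list, the status names and the choice to hold a filtered comment for review are all assumptions made up for the example. Only the three layers themselves (a logging filter, a "trusted user" role, and a staff queue for everyone else) come from the description above.

```python
import re

# Placeholder word list, purely illustrative.
BLOCKED_WORDS = {"darn", "heck"}


def decide_comment_status(text, author_roles, filter_log):
    """Return 'held', 'published' or 'queued' for a submitted comment."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    hits = sorted(words & BLOCKED_WORDS)
    if hits:
        # Every submission that trips the filter is logged.
        filter_log.append({"text": text, "hits": hits})
        return "held"  # assumed outcome: hold for staff review
    if "trusted user" in author_roles:
        return "published"  # trusted users post in real time
    return "queued"  # everyone else waits for staff moderation
```

For example, `decide_comment_status("Nice story", {"trusted user"}, [])` returns `"published"`, while the same text from an account without that role returns `"queued"`.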
Back to the future: We have a primary need of publishing many stories in real-time, and being able to control things like home page and channel-front placement of those stories at the point of publishing the story - rather than requiring separate manual steps to update or build those pages. But, we also have a need to accommodate and match the workflow of a daily newspaper - for example, for some portions of the site we need to set up the "Sunday edition" on a Friday, and be able to preview that content presentation without the public seeing it until Sunday.
One part of our solution is the combination usage of an "effective date" CCK field and a cron task which keys on that field for automated future publishing. The other part is a "Futureviewer" module we wrote for previewing future-date states of the two home pages, so editors can carefully tailor the length of headlines, teasers, etc. for the precise desired page layout. Yes, yes, I know ... "how very newspaper" of us!
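The idea behind both pieces can be sketched in a few lines. This is a simplified Python illustration, not the actual cron hook or the Futureviewer module: the dictionary keys (`nid`, `status`, `effective_date`) and both function names are assumptions standing in for the real node and CCK field data.

```python
from datetime import datetime


def publish_due_nodes(nodes, now):
    """Cron-style pass: publish unpublished nodes whose effective date has arrived."""
    published = []
    for node in nodes:
        if node["status"] == 0 and node["effective_date"] <= now:
            node["status"] = 1
            published.append(node["nid"])
    return published


def preview_as_of(nodes, as_of):
    """Preview pass: list the nodes that would be live at a future moment."""
    return [n["nid"] for n in nodes if n["effective_date"] <= as_of]


nodes = [
    {"nid": 1, "status": 1, "effective_date": datetime(2007, 12, 13, 9, 0)},
    {"nid": 2, "status": 0, "effective_date": datetime(2007, 12, 16, 0, 0)},
]
friday = datetime(2007, 12, 14, 12, 0)
sunday = datetime(2007, 12, 16, 0, 0)
```

With this data, `preview_as_of(nodes, sunday)` includes the Sunday story while `preview_as_of(nodes, friday)` does not, and a cron run on Friday publishes nothing. The preview never changes a node's status, which is what lets editors look ahead without the public seeing anything early.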
Letting our IT manager sleep easier: Performance under a heavy load is perhaps our biggest concern about Drupal - the footprint for a site with as many modules, nodes and users as ours certainly isn't small. We are using the Boost module to protect ourselves from sudden traffic spikes, which for us are almost always caused by getting "Drudged", "Farked", "Dugg", or in our most recent major spike, "CNN-homepaged" when we had some exclusive coverage of a national story in our area - the Michael Vick dogfighting story. Since those events bring us a crush of out-of-market, unlogged users, the Boost module lets us serve those users flat pages - which we expire on a frequent interval - and save the database and application servers to provide dynamic content to authenticated users.
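The general pattern (flat pages with a short expiry for anonymous visitors, dynamic rendering for logged-in users) can be illustrated with a small in-memory sketch in Python. This is not how the Boost module is implemented; the class name, the `render` callback and the 300-second default are assumptions chosen only to show the idea.

```python
import time


class AnonymousPageCache:
    """Serve cached pages to anonymous visitors; always render for logged-in users."""

    def __init__(self, render, ttl_seconds=300, clock=time.time):
        self._render = render      # callable that builds the page for a path
        self._ttl = ttl_seconds    # how long a flat page stays fresh
        self._clock = clock        # injectable so expiry can be tested
        self._pages = {}           # path -> (stored_at, html)

    def get(self, path, is_authenticated):
        if is_authenticated:
            # Authenticated users bypass the cache and get dynamic content.
            return self._render(path)
        now = self._clock()
        cached = self._pages.get(path)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        page = self._render(path)
        self._pages[path] = (now, page)
        return page
```

During a traffic spike from an outside link, nearly every request is anonymous and for the same handful of paths, so the expensive render runs roughly once per path per expiry interval instead of once per visitor.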
Development in Drupal
A few notes on the most significant custom work we did:
Channel builder: Exceedingly useful for our content producers - We can deploy new channels and modify existing ones quickly and simply. We developed a tool and a modular template that gives the content staff a great deal of flexibility for placement of lead stories with teasers & photos atop the page, and then several "tiles" of different content - with each tile capable of displaying dynamic headlines either by taxonomy term or by RSS feed (we use Aggregator to pull in all of our Associated Press-hosted content), with a "days back" parameter and a number-of-headlines number, as well as full-HTML fields which output at the top and bottom of each of these tiles. In addition, there are set positions for optional features such as embedded video players, bloggers, photo-sharing categories, polls and event calendar listings by category. A few examples:
http://hamptonroads.com/entertainment
http://hamptonroads.com/pilot/sports
http://hamptonroads.com/life/food-and-cooking
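As a rough illustration of how a single tile's "days back" and number-of-headlines settings interact when selecting by taxonomy term, here is a short Python sketch. The story dictionaries and the `tile_headlines` function are hypothetical stand-ins, not the channel builder's actual code or data model.

```python
from datetime import date, timedelta


def tile_headlines(stories, term, days_back, limit, today):
    """Return up to `limit` titles tagged with `term` from the last `days_back` days."""
    cutoff = today - timedelta(days=days_back)
    matching = [
        s for s in stories
        if term in s["terms"] and s["published"] >= cutoff
    ]
    # Newest first, then trim to the tile's headline count.
    matching.sort(key=lambda s: s["published"], reverse=True)
    return [s["title"] for s in matching[:limit]]


stories = [
    {"title": "A", "terms": {"sports"}, "published": date(2007, 12, 9)},
    {"title": "B", "terms": {"sports"}, "published": date(2007, 12, 1)},
    {"title": "C", "terms": {"food"}, "published": date(2007, 12, 10)},
    {"title": "D", "terms": {"sports"}, "published": date(2007, 12, 10)},
]
```

Here `tile_headlines(stories, "sports", 3, 5, date(2007, 12, 10))` returns `["D", "A"]`: story B is too old for a three-day window and story C carries a different term.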
Custom homepages: We have two distinct brands/URLs with an integrated navigation and design. PilotOnline.com is the newspaper-affiliated brand with a news & utility focus, and HamptonRoads.com is the regional portal with an entertainment & community focus.
http://hamptonroads.com
http://hamptonroads.com/pilotonline
Events Calendar module: We essentially re-created the same product that we had developed in Cold Fusion a year earlier. Authenticated users can submit calendar events, which our staff reviews and publishes, marking some as "staff pick" events to be highlighted on the home page and certain channel fronts. Since most of our events come from repeat submitters such as venue personnel, bands, etc., we try to make their process easier by allowing the cloning of past event nodes submitted by the users - so a band only need update the date and venue fields, and a nightclub need only update the date, band name & description.
http://hamptonroads.com/events
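The cloning idea is simple enough to show in a few lines of Python. The field names and the decision to reset the clone to unpublished with no "staff pick" flag are assumptions made for this sketch, chosen so that a cloned event goes back through staff review like any new submission.

```python
import copy


def clone_event(past_event, **changes):
    """Copy a past event, reset its review state, then apply the submitter's changes."""
    clone = copy.deepcopy(past_event)
    clone["nid"] = None          # a clone is a new node, so it gets no existing id
    clone["status"] = 0          # unpublished until staff reviews it
    clone["staff_pick"] = False  # staff picks are decided per event
    clone.update(changes)
    return clone


last_show = {
    "nid": 101,
    "status": 1,
    "staff_pick": True,
    "title": "The Example Band",
    "venue": "Old Venue",
    "date": "2007-11-30",
}
next_show = clone_event(last_show, date="2007-12-21", venue="New Venue")
```

The band in this example only supplies the new date and venue; the title and everything else carry over, and the original event is left untouched.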
Blogs module: We had fairly simple needs, but specific enough that we didn't see a good fit among the 3rd-party modules. We have dozens of different bloggers, some of them newspaper staffers, some of them members of the community but vetted by our staff. We needed simple tools for posting blog entries, uploading photos to be used in blog entries, and comment moderation.
http://hamptonroads.com/blogs
I'm looking forward to joining the discussion within the Drupal community, and seeing what the future holds. I'll try to answer any questions I can, and hope to see some others on our team join the discussion as well. One question for this group that I'm curious about - I'm not positive if this is the case, and even if it is, I don't expect it'll be so for long... but is anyone aware of a larger U.S. daily newspaper now using Drupal for its core site publishing?
-Jeff Anderson
j.anderson@pilotonline.com
Drupal version: Drupal 5.x