Autism therapy site with autotagging and GeoIP-enabled directory
Healing Thresholds Autism Therapy provides comprehensive therapy fact sheets, daily updates of research and news, and a global directory of autism therapists. The site has become a standard and definitive source for autism treatment information. In particular, the site provides layperson-accessible summaries of essentially all child-focused research from the last 3 years, in addition to earlier seminal papers.
The two biggest challenges of the site have been to expose all of the deep content we've authored and to monetize it. We have over 2000 carefully written research and news summaries, plus user-generated content. We also have over 1000 taxonomy terms, and we've authored detailed definitions for nearly half of them.
Our redesign last month upgraded the three-year-old Drupal 4.7-based site to Drupal 6. More importantly, we used the flexibility of Drupal to completely re-architect how users find and interact with our content. Where previously we had separate sections of research and news, our new tabbed interface displays, in a search-engine-friendly fashion, all of the relevant content for each of over 1000 different therapy topics.
Prioritizing taxonomy-based pages over nodes
On the old site, we had a lot of traffic as search engines brought users to a specific article, but a high bounce rate because it was not easy to find all of the related content we have available. Also, clicking on an underlined term brought you to our glossary, which was a dead end, instead of showing you all of the great content we have on that topic.
Our solution was to re-architect the site around the taxonomy. On each topic page (or landing page, in search engine optimization parlance), we show our carefully written fact sheets (when available), and then the first several research summaries, news articles, and user-generated content. Thus, our topic pages have a large quantity of relevant content, which makes them more effective "nets" for capturing organic search traffic. This also means that the same article can be generating search engine traffic (and answering users' questions) on multiple pages simultaneously.
We constructed these landing pages using Panels 3 and Views 2, which are spectacularly versatile tools. By overriding the taxonomy pages with Panels, we can display 7 different views, pulling all the relevant information onto the page. We also made use of quicktabs to nicely format the information, but in a way that was completely transparent for SEO. Finally, we used some jQuery trickery to add the number of articles of each type to the tabs. For an example, see our Applied Behavior Analysis (ABA) page. Our new theme is based on Minnelli, with some adjustments to provide space for 5 different ad blocks.
We originally chose Drupal over Ruby on Rails because of the head start that the rich ecosystem of Drupal modules provides. Our ability to completely re-architect the site during this upgrade, all with existing Drupal modules, confirmed the wisdom of our selection. The new site includes all of the information from the old one, but displayed in a far more accessible and SEO-friendly manner. The use of the modules described here would easily have cost us more than $100,000 if we had needed to reimplement the functionality from scratch.
The other key to our taxonomy-based architecture is our use of the glossary module. Not only do the hover definitions make the site much clearer to non-expert users, but each of those dotted underlined words is also a link to that taxonomy page. So, the entire site becomes richly interlinked automatically.
We also decided to use Google site search with google_cse. This is helpful because it correctly sees our site as 1000+ therapy topics and indexes the full contents of those taxonomy pages, rather than just indexing the individual articles (nodes).
GeoIP-enabled directory
The other major change to the site is adding a 50,000-listing directory of service providers for the autism community. Here, the gmap and location modules were invaluable. Gmap provides an incredible level of functionality out of the box, in particular the gmap taxonomy markers you can see in this zoom-in of New York. Location was able to geocode all of our service providers and integrates seamlessly with views. We also made use of lm_paypal to support self-serve upgrading from free to enhanced listings. Some development was needed to match our model (when you cancel, your enhanced listing just turns back into a free one, rather than disappearing like an ad does), but the PayPal 2-way notification works great. We wanted to use PayPal so we didn't have to handle the credit cards ourselves.
Finally, we set ourselves the challenge of using GeoIP technology to automatically show relevant directory entries on every page. When you enter our directory, you are redirected to the city page closest to you, based on where GeoIP says you are. We used the ad_geoip module, although some significant new development was necessary to support displaying relevant listings at all times. One of the features we're particularly pleased with is "tag matching", where we've associated a hundred or so of our most popular topic pages (taxonomy tags) with a relevant autism service provider (therapist category). So, if you go to the drugs page, the therapists shown are pediatric psychiatrists near you.
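The closest-city redirect and the tag matching can be illustrated with a minimal Python sketch. This is not the site's Drupal code: the city slugs, coordinates, topic names, and the behavior analyst category are hypothetical, and the visitor's coordinates are assumed to have already come from a GeoIP lookup. The sketch picks the city page with the smallest great-circle (haversine) distance and looks up the therapist category for a topic.

```python
import math

# Hypothetical city pages: slug -> (latitude, longitude) in degrees.
CITY_PAGES = {
    "new-york": (40.7128, -74.0060),
    "los-angeles": (34.0522, -118.2437),
    "auckland": (-36.8485, 174.7633),
}

# Hypothetical tag matching: topic page -> therapist category to feature.
TAG_TO_CATEGORY = {
    "drugs": "pediatric psychiatrist",
    "applied-behavior-analysis": "behavior analyst",
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_city(visitor):
    """Return the slug of the city page closest to the visitor's coordinates."""
    return min(CITY_PAGES, key=lambda slug: haversine_km(visitor, CITY_PAGES[slug]))


def category_for_topic(topic, default=None):
    """Return the therapist category matched to a topic page, if any."""
    return TAG_TO_CATEGORY.get(topic, default)
```

A visitor geolocated in Manhattan would be sent to the new-york page, and on the drugs topic page the listings shown would be filtered to pediatric psychiatrists. Topics with no match fall back to the default.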
Cleaning up taxonomies and autotagging
Another big challenge of the redesign was that we had been supporting a free-tagging vocabulary for the last 3 years, but decided to standardize it so as to make each landing page as useful as possible. That meant laboriously going through every listing and merging similar ones together, using the taxonomy synonym feature. Taxonomy_manager was a huge help through this process. We decided that we wanted all of our content autotagged based on the glossary terms used. That way, if a news article or user comment mentions several autism therapy topics, the comment will show up in each of them. After looking at 5 options, we settled on autotag (but running in tag-on-save automatic mode, rather than using the jQuery drag-and-drop mode). The magic came in combining autotag with views_bulk_operations, which is an incredibly powerful module, particularly with its use of the Batch API to avoid timeouts. VBO enables us to select specific content and resave it, which gets it autotagged in the process. And, all new content is autotagged automatically.
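The idea of tagging content by the glossary terms it mentions, with synonyms folded into one canonical term, can be sketched in a few lines of Python. This is only an illustration of the concept, not how the autotag module is implemented. The glossary entries and synonyms below are hypothetical examples.

```python
import re

# Hypothetical glossary: canonical term -> list of synonyms.
GLOSSARY = {
    "Applied Behavior Analysis": ["ABA"],
    "Speech Therapy": ["speech-language therapy"],
    "Gluten-Free Casein-Free Diet": ["GFCF diet"],
}


def build_patterns(glossary):
    """Compile one case-insensitive, whole-word pattern per canonical term."""
    patterns = {}
    for term, synonyms in glossary.items():
        # Longest names first so longer phrases are tried before shorter ones.
        names = sorted([term] + list(synonyms), key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in names)
        # Lookarounds keep "ABA" from matching inside words like "abacus".
        patterns[term] = re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternation, re.IGNORECASE)
    return patterns


PATTERNS = build_patterns(GLOSSARY)


def autotag(text, patterns=PATTERNS):
    """Return the sorted canonical terms whose name or synonym appears in text."""
    return sorted(term for term, pattern in patterns.items() if pattern.search(text))
```

Running autotag on a comment such as "Our son started ABA and a GFCF diet last year." would tag it with both Applied Behavior Analysis and Gluten-Free Casein-Free Diet, so it appears on both topic pages. Resaving old content in bulk amounts to calling the same function over every existing item.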
Ads
Another change was to add capability to our advertising system, particularly in our use of affiliate advertising targeting the specific topic of the page. We settled on Google Ad Manager to serve and track our ads, since it's capable and free. The google_admanager module allowed us to create 100 different advertising blocks for our 50 key pages. By setting each block to only appear on one or a few pages, we're able to do targeted ads across the most heavily trafficked pages on our site. We also liked using views_block to keep track of the ad blocks and their visibility rules.
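The per-page visibility rules amount to a mapping from each ad block to the paths where it may appear. A minimal Python sketch of that lookup follows. The block names and path patterns are hypothetical, and wildcard matching is done with Python's fnmatch rather than Drupal's own block visibility code.

```python
from fnmatch import fnmatchcase

# Hypothetical ad blocks and the path patterns on which each one is visible.
AD_BLOCK_VISIBILITY = {
    "ad_aba_sidebar": ["applied-behavior-analysis", "applied-behavior-analysis/*"],
    "ad_speech_sidebar": ["speech-therapy", "speech-therapy/*"],
    "ad_directory_banner": ["directory/*"],
}


def blocks_for_path(path):
    """Return the sorted names of the ad blocks visible on the given path."""
    return sorted(
        block
        for block, patterns in AD_BLOCK_VISIBILITY.items()
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    )
```

With rules like these, a request for applied-behavior-analysis shows only the ABA ad block, and pages with no matching rule show no targeted block at all.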
Community
The original version of our site three years ago had a number of community features, including user blogs, forums, a wiki, and private messaging. They were not heavily used. We have, however, built up an email list nearing 20,000 members, as well as a large number of Facebook and Twitter followers. For the redesign, we decided to drop those community features and focus on user-generated content with the ubiquitous comments sections. In place of on-site community, we prominently feature AddThis buttons at the top and bottom of every page to enable viewers to use off-site community and social networking applications to communicate about Healing Thresholds.
Performance
The unoptimized performance of our site is not fantastic. Running seven views on our main topic pages can slow things down. Our solution is to use boost to serve out the vast majority of our pages statically. That way, the site is only a little slow for those who log in, and for those who view a specific page for the first time. Everyone else receives a static copy of the page, which loads almost instantly. The main downside is that new comments are not immediately visible unless you're logged in. There's a lot of further performance tweaking to be done. For example, we recently found that adding indexes on our location tables decreased the page generation time by an order of magnitude, from 30 seconds to 3. Of course, with boost cache, the second page load becomes nearly instantaneous. The devel module's query log has been essential for tracking down our issues.
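The effect of adding an index can be demonstrated with a small Python sketch. It uses Python's built-in sqlite3 as a stand-in for the site's actual database, and the table layout, column names, and index name are hypothetical rather than the location module's real schema. It compares the query plan for a lookup by city before and after the index is created.

```python
import sqlite3

# In-memory stand-in database with a hypothetical, simplified location table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location ("
    "lid INTEGER PRIMARY KEY, city TEXT, latitude REAL, longitude REAL)"
)
conn.executemany(
    "INSERT INTO location (city, latitude, longitude) VALUES (?, ?, ?)",
    [("city-%d" % (i % 500), 40.0 + i * 0.001, -74.0 + i * 0.001) for i in range(5000)],
)

QUERY = "SELECT lid FROM location WHERE city = ?"


def query_plan(connection):
    """Return SQLite's plan description for the city lookup as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("city-42",)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_location_city ON location (city)")
plan_after = query_plan(conn)   # the lookup can now use idx_location_city

print(plan_before)
print(plan_after)
```

Before the index exists, the plan reports a scan of the whole table. Afterwards it reports a search using idx_location_city. On a production database the same check is typically done with that database's own EXPLAIN output, guided by the slow queries that a query log surfaces.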
Development
The two developers (Dan Kohn and Kent Parker) are based in New York and New Zealand. We mainly use Google Spreadsheets to track current tasks. We've had good success setting up three separate instances of the site, one for production and one for each of us to develop on. These are three full installations of Drupal (rather than using multisite) so that we can test new versions of modules and core before deploying to the production site. We use backup_and_migrate to automatically back up the database of the main site, and then regularly restore it to the development servers using this trick of shared directories. Two other essential tools for dealing with staging environments are keys, which automatically sets the correct Google Maps API key based on the domain, and environment_indicator, which shows a stripe on non-production sites to remind you where you're working. Of course, admin_menu is great for quickly accessing everything you need via the browser, and drush is a ludicrously helpful tool for command line updates.
Other important modules:
biblio: We use this module to hold our layperson summaries, as it provides great indexing functionality.
Editors: WYSIWYG and TinyMCE for the fancy rich text editor, IMCE for the integrated image uploading, and imagefield_crop for the amazing picture cropping functionality, which we use for our directory listings.
SEO: seo_checklist, pathauto (rewrites the URLs to enable our taxonomy pages to be a first class citizen on the site), global_redirect, nodewords, page_title, search404, XML sitemap (which thankfully already included the capability to rank taxonomy pages higher than node pages), and google_analytics
General: logintoboggan, mollom, nice_menus, prepopulate, rotor (for our lovely rotating banners and pictures), scheduler, term_node_count
Conclusion
We'd like to give our enormous thanks to Kent Parker of Passing Phase Web Development, who has been a fantastic partner in the redesign.
If you have comments, questions, or thoughts, please contact us.
Drupal version: Drupal 6.x