Ebizon builds world's fastest growing Drupal site - TweenTribune
Introduction
TweenTribune, TeenTribune and TTEspañol provide the teen and tween audience with compelling stories kids won’t find anywhere else. Stories for TweenTribune are selected by tweens working closely with professional journalists. Tweens can submit links to stories they'd like to share, submit their own stories and photos, and comment on the stories they read.
More than 53,000 teachers across the U.S. use TweenTribune in their classrooms.
The site generates more than 5 million page views per month.
10,000 nodes are added every day.
Brief History - From WordPress to Drupal
TweenTribune and its sister site, TeenTribune, work through schoolteachers across the U.S. Registered students log onto the site and post comments on selected stories of the day, and teachers review the responses for approval before making them “live” for other students to see.
During Christmas 2008, TweenTribune's founder, Mr. Alan Jacobson, decided to move the website from WordPress to Drupal, a more capable and flexible content management system. He contacted us on December 24, 2008, and worked with us to develop an application that would let tweens aged 8 to 14 read a variety of interesting content and comment on the news for other kids to see. Teachers can easily use TweenTribune as a teaching tool: the site uses high-interest reading material to engage students with the news.
Teachers can register their classes on the site, which gives them access to special features such as custom-generated pages that show students’ comments or the stories the class has commented on. Teachers can print out reports by student; these reports let them see which articles students have read and access an individual student’s comments. In this way, teachers can easily grade or comment on students’ writing. There’s even a Faculty Lounge where teachers can interact with each other, sharing ideas and lesson plans.
Using Drupal 6 and a variety of excellent contributed modules, Tweentribune.com was launched in March 2009. Modules used include Views, CCK (both core and imagefield), and Imagecache.
Custom code was written for all of TweenTribune's custom features and integrated into the Drupal content management system in the form of Drupal modules.
TweenTribune is now a success story that has been featured in the LA Times, YPulse.com, KillerStartups, WeMedia and Good Housekeeping, and is getting:
- more than 5 million page views a month.
- more than 16 million ad impressions per month.
- more than 3,000 comments and 6,000 quizzes.
SCALING WITH CONFIDENCE
Tweentribune.com had a couple of unique challenges. Traffic peaked during U.S. school hours, when most users were logged in, which pushed the number of database connections to its maximum. The web server and the database ran on 2 separate machines on the same network (LAN).
The following further measures were taken to improve Drupal performance:
- Optimizing database queries and modules.
- Using Memcache for all database caches.
- Storing sessions, which Drupal normally keeps in the database, in Memcache as well.
- Using the Boost module to serve static HTML to anonymous users.
- Using Lighttpd to serve static files such as CSS, JS and images.
- Using APC as the PHP accelerator.
- Using the Linux shell, Munin and Nagios for monitoring.
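The caching measures in this list share one pattern: look in a fast in-memory store first and go to the database only on a miss. The following is a minimal Python sketch of that cache-aside pattern, not the Drupal Memcache module's code. A plain dictionary stands in for Memcache, and `CacheAside` and `load_from_database` are hypothetical names introduced only for illustration.

```python
import time


class CacheAside:
    """Cache-aside lookup. A dict stands in for Memcache (illustrative assumption)."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader      # hypothetical function that queries the database
        self._ttl = ttl_seconds    # how long a cached value stays valid
        self._clock = clock
        self._store = {}           # key -> (expires_at, value)
        self.misses = 0            # counts how often the database was hit

    def get(self, key):
        entry = self._store.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: no database connection needed
        self.misses += 1
        value = self._loader(key)  # cache miss: fall back to the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Drop a stale entry, for example after the underlying node is edited.
        self._store.pop(key, None)


def load_from_database(key):
    # Hypothetical stand-in for an expensive SQL query.
    return f"rendered:{key}"


cache = CacheAside(load_from_database)
cache.get("node/1")
cache.get("node/1")
print(cache.misses)
```

The second `get` is answered from memory, so the database is queried only once. With most users logged in during school hours, every hit served this way is one less database connection.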
Memcache - way better than cash
Memcache, Squid, APC and other tools were used to make Drupal scale. Memcache, APC and Squid were installed and configured on the server. Memcache was monitored, and its configuration was adjusted over time as traffic grew and the server's RAM was changed.
Lighttpd
Lighttpd is a web server that was used to serve static files (images, JavaScript, CSS) to reduce the load on the Apache web server, since Lighttpd is faster at serving static content.
Apache Solr vs DSS
Drupal Search Sucks (DSS): it doesn't cope with large amounts of content, it doesn't scale, and it gets bogged down. Drupal search is integrated, so it runs and searches on the same database as the site and slows the whole system down. Apache Solr's advantage for Drupal is that it indexes nodes, not pages. This means it has access to attributes of the node that are not readily parsable from the rendered page, and these attributes can be used to filter the results. Apache Solr also provides a faster search experience than the default Drupal search.
Varnish or Squid
But either is better than getting shellacked, and both are better than Boost.
InnoDB instead of MyISAM - who wants to get locked under a table?
- InnoDB uses row-level locking for inserts and updates, while MyISAM uses table-level locking.
- InnoDB takes care of data integrity inherently, through relationship constraints and transactions.
- InnoDB is faster on write-intensive tables (inserts, updates) because row-level locking only holds up changes to the same row that is being inserted or updated.
InnoDB buffer pool. How big is too big? We know.
The larger the buffer pool, the more InnoDB acts like an in-memory database, reading data from disk once and then accessing the data from memory during subsequent reads. The buffer pool even caches data changed by insert and update operations, so that disk writes can be grouped together for better performance.
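As a rough illustration of the sizing trade-off, the Python sketch below computes a candidate buffer pool size as a fraction of a dedicated database server's RAM. The 75% fraction is only an illustrative assumption, not the value used on this site. The right number depends on what else runs on the machine and should be checked against real memory usage.

```python
GIB = 1024 ** 3


def suggest_buffer_pool_bytes(total_ram_gib, fraction=0.75):
    """Return a candidate innodb_buffer_pool_size in bytes.

    `fraction` is an assumed rule-of-thumb share of RAM for a dedicated
    database server. The remainder is left for the OS, connections and
    other MySQL buffers.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1")
    return int(total_ram_gib * fraction * GIB)


# Example input: a database server with 12 GiB of RAM.
size = suggest_buffer_pool_bytes(12)
print(f"innodb_buffer_pool_size = {size // GIB}G")
```

Setting the pool too large is the "too big" case: if the server starts swapping, the in-memory advantage described above is lost.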
KeepAlive on or off? Contact us and we'll tell you.
THE TEAM
- Ebizon NetInfo: Ebizon built the world's fastest growing Drupal site and is the backbone of the project, bringing the expertise in performance and scalability tuning that is essential for Drupal sites with millions of nodes and users. Ebizon supports TweenTribune's rapid growth of almost 10,000 new nodes every day through multiple layers of content caching in a multi-server environment. Ebizon extends Drupal to meet the unique needs of the site and to handle traffic of more than 1 million authenticated users during peak school hours.
- BrassTacksDesign: The BrassTacksDesign team was responsible for project conceptualization and use cases. All day-to-day operations are managed and administered by them.
- Rackspace: The website is hosted on Rackspace.
HARDWARE
The underlying hardware consisted of 2 machines on the same Gigabit network.
One ran the Apache web server and Memcache, with the following configuration:
- Quad Socket Quad Core Intel Xeon E7440 2.4GHz
- 64GB Memory
- Operating System: Red Hat Enterprise Linux 5 - 64 bit
The database server had the following configuration:
- RAID 5
- 12 GB Dell RAM
- Single Socket Quad Core Intel Xeon L5520 2.26GHz
HOW THE CHALLENGES WERE MET
- Challenge: Drupal is both resource intensive and database intensive. Its strengths are ease of development, extensibility through modules and faster development time. Its downside is that it requires more CPU and RAM than other CMSs.
Solution: In our experience, a couple of Drupal contributed modules are resource intensive, and optimizing them is necessary in order to scale the system. We monitored SQL queries using the Devel module and identified the queries that consumed the most resources. We then optimized those queries and monitored their performance and the load on the system for a couple of days. The results and improvements were captured in a performance report that was published for the client’s review.
- Challenge: The Busted Page issue, which was causing pages to break. It was a much trickier issue solely because of its intermittent nature.
Solution: The Busted Page issue was THE MOST important issue, since the site had scaled to 2 million page views a month and we couldn’t risk letting the problem survive any longer. The initial attempt was to disable the Boost module, but to our surprise that did not solve the problem. After 24 hours of rigorous effort and monitoring, it looked like menu paths were being restructured during cron, which was running every hour. The best of teams in the world were thinking about it, but no one could get to the root cause. Finally, one of our best technical leads changed cron to run only once a night, at 12 am, instead of every hour. This resolved the Busted Page problem and was a GREAT success for us and Alan.
- Challenge: Implementing location-based advertisements and headers in Drupal 6.
Solution: The Drupal ad geoip module was customized so that advertisements and headers can be displayed based on the user's location.
- Challenge: Only the teachers of a classroom should be able to moderate its comments, and a comment should be published only after it has been approved.
Solution: The Drupal moderate module was customized, and an interface was designed where teachers could see all the comments in a classroom and approve or disapprove them.
- Challenge: Blocking inappropriate words that students put in their comments.
Solution: Initially the Watchlist module was recommended. It automatically flags a node or comment that contains questionable content (defined in the Watchlist settings by adding regular expressions for words that are considered bad). But it flags the word and notifies the admin AFTER the comment is posted, which is TOO LATE. Therefore the Spam module was used to resolve this problem.
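The difference between flagging after posting and blocking before posting can be sketched in a few lines of Python. This is only an illustration of the pre-save check, not the Spam or Watchlist module's code. The patterns and the `submit_comment` function are hypothetical.

```python
import re

# Hypothetical patterns, given as regular expressions. The real list would
# live in the site's module settings.
BLOCKED_PATTERNS = [r"\bstupid\b", r"\bdumb\w*"]


def find_blocked_words(comment, patterns=BLOCKED_PATTERNS):
    """Return every substring of the comment that matches a blocked pattern."""
    hits = []
    for pattern in patterns:
        hits.extend(
            match.group(0)
            for match in re.finditer(pattern, comment, flags=re.IGNORECASE)
        )
    return hits


def submit_comment(comment, saved_comments):
    """Check the comment BEFORE saving it, so a blocked comment is never stored."""
    hits = find_blocked_words(comment)
    if hits:
        return False, hits          # rejected: nothing is saved or shown
    saved_comments.append(comment)  # accepted: goes on to teacher moderation
    return True, []


saved = []
print(submit_comment("That is a STUPID idea", saved))
print(submit_comment("Great story!", saved))
```

Because the check runs before the comment is stored, a rejected comment never reaches other students, which is the property the after-the-fact flagging approach lacked.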
- Challenge: Alan needed a way for the teacher to send every student’s comments to the printer with one click, instead of sending them one at a time with one click per student.
Solution: It was not feasible to require users to have an email address in order to sign up on Tweentribune.com. The team therefore found a way to avoid asking users for an email address and instead have the system create one automatically from the user's full name. The contrib module modified for this purpose was “Localemail”; it was made to create email IDs automatically for each user and let them register directly on TweenTribune.
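A minimal Python sketch of generating an email ID from a full name follows. It is not the modified Localemail module's code. The placeholder domain, the numbering scheme for duplicates and the `make_local_email` name are all assumptions made for illustration.

```python
import re


def make_local_email(full_name, taken, domain="students.example.invalid"):
    """Build a unique placeholder email ID from a user's full name.

    `taken` is the set of IDs already in use. `domain` is a hypothetical
    placeholder domain, not the one used on the site.
    """
    # Lowercase the name and collapse anything that is not a letter or digit.
    base = re.sub(r"[^a-z0-9]+", ".", full_name.lower()).strip(".") or "student"
    candidate = f"{base}@{domain}"
    suffix = 1
    while candidate in taken:       # two students may share the same name
        suffix += 1
        candidate = f"{base}{suffix}@{domain}"
    taken.add(candidate)
    return candidate


taken_ids = set()
print(make_local_email("Mary Jones", taken_ids))
print(make_local_email("Mary Jones", taken_ids))
```

The uniqueness loop matters because the account system still expects each user to have a distinct address, even though the student never sees or uses it.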
- Challenge: A new workflow for teacher registration was required, in which teachers could register themselves without requiring Alan to personally verify each registration as in the previous workflow.
Solution: The team worked on a new workflow where:
- The teacher submits information on a webform that is almost identical to the existing webform, with very minor changes. This new form replaced the existing form.
- Drupal generates 9 classrooms for the teacher, but does NOT use the classroom taxonomy. Instead, the user profile contains the username and classrooms only. Classroom names use the teacher's school email address + taxonomy ID. Example: mary.jones@collierschools.com-151365
- Drupal generates a new username = teacher's school email address. Role = teacher_private. This role is a clone of the existing role = teacher.
- Drupal sends 2 welcome emails, with the username and password generated by Drupal, to 2 email addresses: the home email address and the school email address. The email includes a link to the "dashboard" page where the teacher can register students. See screenshot, attached. The dashboard is 600px wide, so it fits in the main content area of the current pages.
- The teacher logs in and is redirected to /teacher_landing_page, or uses the link provided in the welcome email.
- The teacher can do the following on the dashboard:
  - register students
  - see usernames and passwords of students previously registered
  - delete students
  - print out student usernames and passwords
  - change classroom name
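The account and classroom naming rules in this workflow can be sketched in Python as follows. This is an illustration only, not the site's module code. The `build_teacher_account` function is hypothetical, and the taxonomy IDs are passed in because the source does not say how they are allocated.

```python
CLASSROOMS_PER_TEACHER = 9     # from the workflow described above
TEACHER_ROLE = "teacher_private"


def classroom_names(school_email, term_ids):
    """Name each classroom as '<school email>-<taxonomy ID>'."""
    return [f"{school_email}-{term_id}" for term_id in term_ids]


def build_teacher_account(school_email, home_email, term_ids):
    """Assemble the data the workflow generates for a newly registered teacher."""
    term_ids = list(term_ids)
    if len(term_ids) != CLASSROOMS_PER_TEACHER:
        raise ValueError(f"expected {CLASSROOMS_PER_TEACHER} taxonomy IDs")
    return {
        "username": school_email,                   # username = school email
        "role": TEACHER_ROLE,
        "classrooms": classroom_names(school_email, term_ids),
        "welcome_emails": [home_email, school_email],  # 2 welcome emails
    }


# The IDs below are made up for the example, apart from the first one,
# which comes from the example classroom name above.
account = build_teacher_account(
    "mary.jones@collierschools.com",
    "mary@example.invalid",
    range(151365, 151374),
)
print(account["classrooms"][0])
```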
TWEEN TRIBUNE APPLICATION AND DATABASE ARCHITECTURE
Tweentribune.com is a news site for tweens, and the following are the core components around which it was built:
- CCK
- Views
- Webform
- Taxonomy
- Imagecache
- A custom AJAX-based drop-down select, developed as a replacement for the Hierarchical Select module (http://drupal.org/project/hierarchical_select), for selecting a classroom during registration or when posting stories.
- A custom module that allows non-email-based registration on the site, since tweens usually do not have email addresses.
- Custom functionality was also developed, such as an interface that lets the administrator easily register teachers from the requests received through webforms. Comment moderation by teachers was also integrated into the site using the Modr8 module.
Content Types
- Stories: This is the main content type around which all Tweentribune.com stories are built.
- Profile: This content type carries student and teacher profile information such as the classroom.
- Your-stories: Using this content type, teachers can post their own news to their classrooms.
- Quiz: With this content type, teachers can post quizzes on the website for their classrooms.
- Your Entry: This content type allows students to submit short stories and essays.
Taxonomy
- Topics for tween: This vocabulary defines the category of a story posted on Tweentribune.com.
- Classroom: This vocabulary allows users to be assigned to a classroom. It is a parent-child hierarchy of country, state, city, school and then classroom. Certain stories can also optionally be placed in a particular classroom or school.
- Spanish: This vocabulary is used to post stories in Spanish.
- Your town: This vocabulary is used to post stories from affiliate partners.
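The parent-child structure of the Classroom vocabulary can be sketched in Python as a table of terms that each point to a parent. This is an illustrative model, not Drupal's taxonomy API. All term names and IDs below are made up.

```python
# term ID -> (name, parent term ID or None). Hypothetical example data.
TERMS = {
    1: ("USA", None),
    2: ("Florida", 1),
    3: ("Naples", 2),
    4: ("Example Middle School", 3),
    5: ("Room 12", 4),
    6: ("Room 14", 4),
}


def lineage(term_id, terms=TERMS):
    """Walk up the hierarchy: country, state, city, school, classroom."""
    path = []
    current = term_id
    while current is not None:
        name, parent = terms[current]
        path.append(name)
        current = parent
    return list(reversed(path))


def descendants(term_id, terms=TERMS):
    """Collect every term below the given one, e.g. all classrooms in a school."""
    found = []
    frontier = [term_id]
    while frontier:
        parent = frontier.pop()
        children = [tid for tid, (_, p) in terms.items() if p == parent]
        found.extend(children)
        frontier.extend(children)
    return sorted(found)


print(" > ".join(lineage(5)))
print(descendants(4))
```

Walking up gives the full location of a classroom. Walking down is what makes it possible to place a story in a whole school rather than a single classroom.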
Drupal version: Drupal 6.x