Better, Stronger, Faster! A New Open Atrium Installer.
Installing Drupal from scratch is relatively painless for most people. However, installing a large Drupal distribution such as Open Atrium 2 has the potential to be painful. Core Drupal only contains a handful of modules that need to be installed, but Open Atrium 2 is a very feature-rich distribution with nearly 200 modules to be installed. An installation of Open Atrium typically takes 15-20 minutes and consumes significant server resources during that period. In this article, I will show how we reduced that installation time to only TWO minutes!
Overview of Drupal Installation
Have you ever looked at the Drupal core installer code, or traced through it? It’s actually a fairly robust and flexible installer framework. The installation consists of several different “steps”. Each step can be a form asking for information, such as the database information, site information, theme selection, etc. Or a step can simply be a function that performs processing, such as the step to check if the system requirements of Drupal are met. Each step can optionally be processed via the Batch API, such as the step that installs each required module.
The Drupal installer can be run interactively, or non-interactively, such as when using the “drush site-install” command. During interactive installs, separate HTTP requests are used for each step of the process, and for each “batch” of processing done by the batch steps. Batch processing helps avoid timeout errors, but a single step that takes too long can still cause a timeout. The installer tries to break batch steps into one-second requests, but that only allows fast steps to be combined into single requests. A batch operation that takes longer than one second is not interrupted; it simply forces the next operation into a new request, so a slow operation can still cause a timeout.
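The one-second budget described above can be sketched roughly as follows. This is a simplified Python illustration of the idea, not the Drupal Batch API itself: the names run_request and run_batch are hypothetical, and each "request" is just a loop iteration rather than a real HTTP request.

```python
import time


def run_request(queue, budget_seconds=1.0, clock=time.monotonic):
    """Process queued operations until the time budget is used up.

    Returns the number of operations finished in this simulated request.
    """
    start = clock()
    done = 0
    while queue:
        operation = queue.pop(0)
        operation()  # an operation is never interrupted once it has started
        done += 1
        if clock() - start >= budget_seconds:
            break  # budget exhausted: the remaining work waits for the next request
    return done


def run_batch(operations, budget_seconds=1.0, clock=time.monotonic):
    """Run every operation and report how many ran in each simulated request."""
    queue = list(operations)
    per_request = []
    while queue:
        per_request.append(run_request(queue, budget_seconds, clock))
    return per_request
```

In this sketch, four operations of a quarter second each would share one request, while a single five-second operation would occupy a request by itself and run well past the budget, which is the case that can still hit a PHP timeout.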
Caching during the installation process is also very complex. In general, the installer minimizes the amount of cache clearing to speed up the install process. However, during module installation, some modules might depend on the existence of other modules, and without some cache clearing, modules installed during the same batch request might not see each other. This becomes even more difficult if a module is actually a Features export. Features has its own caches and sometimes needs to rebuild a feature to put configuration into the database that might be needed by other modules.
Installing Open Atrium 2
What appears to be a straightforward installation process for Drupal core becomes exponentially more complicated when installing a distribution that uses hundreds of modules and Features, such as Open Atrium. An interactive installation of Atrium begins fine and quickly enables the first 30 modules. Then it becomes slower and slower. Enabling the last module takes nearly ten times as long as enabling the first. Each Feature module that is installed causes the info files of all previously enabled modules to be reloaded into cache, so the more modules are installed, the slower Features becomes. There are many issues in the Features queue, such as this one, that try to address some of these performance problems, but it is a very complex process.
Because modules take longer and longer to enable, some Open Atrium installs run into PHP timeout problems and need to increase the limit from the 30-second default to 60 seconds. This is frustrating because these timeouts often occur near the end of the installer, after 15 minutes have already passed. In general, a successful Open Atrium 2 installation takes around 15 to 20 minutes, which is far too long for most people trying to make a quick evaluation of Open Atrium functionality.
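To see why the slowdown compounds, here is a rough Python cost model. It assumes, as a simplification, that enabling a Feature module reloads exactly one info file per previously enabled module and that nothing else contributes to the cost; the function name info_file_loads is hypothetical.

```python
def info_file_loads(is_feature):
    """Count info-file reloads for modules enabled in the given order.

    is_feature is a list of booleans, one per module in install order.
    Simplifying assumption: each Feature module reloads the info file of
    every module enabled before it, and plain modules reload nothing.
    """
    total = 0
    for position, feature in enumerate(is_feature):
        if feature:
            total += position  # number of modules already enabled
    return total
```

Under this model, if all 200 modules behaved like Features, the install would perform 19,900 info-file reloads. With a purely hypothetical split of 30 plain modules followed by 170 Features, it would still perform 19,465. The total grows with the square of the module count, which matches the experience of each module taking longer than the one before it.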
There Must be a Better Way!
During the recent Drupal NYC Camp, we held an Open Atrium 2 training session. During this training, we had 20 people all installing Open Atrium at the same time. We all collectively waited 20 minutes just to get 20 identical copies of Open Atrium 2 installed on our computers. Wouldn’t it have been nice to simply install Open Atrium once and then clone that onto the other computers? That’s when I realized there was a better way to install Open Atrium…by cloning a previous installation!
The new 2.18 version of Open Atrium contains this new installer option. After entering your database information, you are prompted for the Installation Method. The Standard Drupal installation method is still available (if you really want to wait 20 minutes!). The new Quick installation option is the default. Selecting Quick installation will direct the installer to import from an existing database dump that is saved within the Open Atrium code repository.
Instead of a batch process step to install and enable each module, the Quick install step uses a batch process to import database tables from the SQL dump. This will only work if you are using MySQL. If you are using a different database engine, you won’t get prompted for the new Quick install…it will just use the Drupal default installer.
After only TWO minutes, the installer should finish importing the database tables. You will then be prompted to fill in your Site information as normal.
Technical Details
Importing an existing database from the Drupal installer turned out to be trickier than originally anticipated. The Drupal installer isn’t happy when certain database tables or variables stored in the database are ripped out from under it. Another complexity was the fact that the new database might have a different database “prefix” for table names. The new installer code actually parses the MySQL database dump line by line to split the SQL statements into batches for each table and to replace table names with the new database prefix.
To visualize the difference between the standard Drupal installer and the new Quick install, I ran both and sent the statistics into Graphite:
The first 20 minutes of this graph shows the normal Drupal installation of Open Atrium 2. The CPU is almost fully utilized during the entire install process. The resident memory used by Apache climbs as the installer adds more and more modules to the site. At around 08:27 the first Feature export module is enabled, causing a sharp increase in the amount of memory being used. The decrease in CPU at 08:30.5 (when the green CPU line goes to zero) is caused by the installer asking for Site Information. After that step, the Drupal installer clears all caches, runs cron, and does other cleanup. This increases the resident Apache memory even further.
The second set of data starting at 08:36 is for the new Quick install option. You’ll notice the actual install time is around 2-3 minutes. Once again, the decrease in CPU at 08:38.5 is when the installer prompted for Site information. Thus, the data after that represents the same cache clearing and install cleanup as in the full installer. Ultimately the same amount of memory is needed to fully clear the cache and clean up the site as the same modules have been installed and enabled at that point.
In fact, the Quick installation process itself consists of importing over 300 database tables, followed by its own cache clear. The memory increase at around 08:37.5 is caused by the cache clear that the Quick installer executes after all the database tables have been imported. During the actual database import, memory usage in Apache is flat.
Conclusion
Any other Drupal distribution that requires a large number of modules should be able to leverage the code used in Open Atrium. The code is all within the “install_from_db” subdirectory of the profile, which can be found in the Open Atrium project on drupal.org. The huge reduction in installation time should help retain new clients that are evaluating Open Atrium and Drupal for their organizations and generally improve first impressions of Drupal.