High performance in Drupal Part 2: Lightning fast code
Welcome to the second part of our series on High Performance Drupal. Here we will cover techniques for speeding up your custom modules.
Query builders
Drupal 7 adds an easy-to-use database abstraction layer that lets you quickly produce complex, database-independent queries.
Unfortunately there can be a noticeable performance penalty when using these query builders instead of straight SQL. Let's look at a few guidelines for deciding which option is best.
You should use a query builder if:
- You are creating a dynamic query and want to avoid lots of messy string concatenation
- You want to make use of query extenders such as table column sorting or a pager
- You want to allow other modules to alter your query, such as the node access system
- You need to support multiple databases with the same codebase
- Code readability is more important than performance of the query
- The query is long-running or not used often so a small amount of overhead is negligible
```php
<?php
// Create a select query for the `node` table.
$query = db_select('node', 'n')
  // Add a pager.
  ->extend('PagerDefault')
  // Add table click sorting.
  ->extend('TableSort');

// Join our example table.
$query->join('example_table', 'e', 'n.nid = e.nid');

$query
  // Specify the fields we require.
  ->fields('e', array('nid', 'example_data'))
  // We only want to load nodes which the user has access to.
  ->addTag('node_access')
  // We only want 10 results at a time. This works alongside the pager.
  ->limit(10);

if (!empty($nids)) {
  // Add an optional condition. $nids can be an int or an array.
  // The column is qualified with its table alias because both joined
  // tables have a nid column.
  $query->condition('n.nid', $nids);
}

// Run our query.
$result = $query->execute();
```
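To illustrate the point about letting other modules alter your query, here is a minimal sketch of an alter hook. It assumes the query was also given a hypothetical tag with `->addTag('example_listing')`, and `othermodule` is a made-up module name; only `hook_query_TAG_alter()` itself comes from Drupal 7.

```php
<?php
/**
 * Implements hook_query_TAG_alter() for the hypothetical 'example_listing' tag.
 */
function othermodule_query_example_listing_alter(QueryAlterableInterface $query) {
  // Restrict the listing to published nodes. 'n' is assumed to be the alias
  // the original query used for the node table.
  $query->condition('n.status', 1);
}
```

A query written with db_query() offers no such hook, which is one reason to prefer the query builder for listings that other modules may need to filter.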
You should use db_query() if:
- You have a simple or static query
- Your query will be run multiple times per page
- You will never need to switch databases or support more than one database engine
- You don't need any of the extra functionality provided by query builders
```php
<?php
// Run our simple query. $nids must be an array here; the placeholder is
// wrapped in parentheses so it expands into a valid IN (...) list.
$result = db_query("SELECT nid, example_data FROM {example_table} WHERE nid IN (:nids)", array(':nids' => $nids));
```
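db_query() hands back a statement object rather than an array, so it can help to see how the rows are read. The sketch below wraps the same kind of query in a hypothetical helper, `example_load_data()`, and uses the statement's `fetchAllKeyed()` method to build a simple lookup array.

```php
<?php
/**
 * Hypothetical helper: returns example_data values keyed by node ID.
 */
function example_load_data(array $nids) {
  // An empty IN () list is invalid SQL, so bail out early.
  if (empty($nids)) {
    return array();
  }
  $result = db_query(
    "SELECT nid, example_data FROM {example_table} WHERE nid IN (:nids)",
    array(':nids' => $nids)
  );
  // fetchAllKeyed() maps the first column (nid) to the second (example_data).
  return $result->fetchAllKeyed();
}
```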
Scaling your data
One oft-overlooked feature of Drupal's cache and entity systems is their ability to optimise loading data by doing so in bulk.
When loading entities (comments, nodes, terms, users, etc.) you can call entity_load() with an entity type and an array of IDs to load several at a time. This groups database queries together wherever possible, reducing both the number of queries per page and the page load time.
The corresponding comment_load_multiple(), node_load_multiple(), taxonomy_term_load_multiple() and user_load_multiple() provide entity-specific wrappers around entity_load() for ease of use.
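As a quick sketch of calling entity_load() directly, the hypothetical helper below collects the names of several taxonomy terms in one go. The function name is made up for illustration; `entity_load()` and the `name` property on term objects are standard Drupal 7.

```php
<?php
/**
 * Hypothetical helper: collects the names of several taxonomy terms.
 */
function example_term_names(array $tids) {
  $names = array();
  // One call loads every requested term; the result is keyed by term ID.
  foreach (entity_load('taxonomy_term', $tids) as $tid => $term) {
    $names[$tid] = $term->name;
  }
  return $names;
}
```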
Don't do this:
```php
<?php
foreach ($nids as $nid) {
  // Load each node one at a time.
  $node = node_load($nid);
  // Do something with the node.
  example_function($node);
}
```
This will be much faster, as the nodes are loaded together in far fewer queries:
```php
<?php
$nodes = node_load_multiple($nids);
foreach ($nodes as $node) {
  // Do something with the node.
  example_function($node);
}
```
The Drupal cache system offers similar functionality using cache_get_multiple() for loading several cache entries at once.
Most but not all cache backends support this functionality, so it's best to double check first. The Drupal database cache, memcache and APC all support this optimisation.
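Here is a rough sketch of how cache_get_multiple() might be used to fetch several entries and rebuild only the ones that are missing. The helper name, the `example:ID` cache ID scheme and `example_item_build()` are all assumptions made for illustration; the key detail from the Drupal 7 API is that the array of cache IDs is passed by reference and contains only the misses afterwards.

```php
<?php
/**
 * Hypothetical helper: loads several cached items, rebuilding any misses.
 *
 * example_item_build() is an assumed function that computes one item.
 */
function example_items_load(array $ids) {
  // Map cache IDs back to item IDs ("example:ID" is an assumed naming scheme).
  $cids = array();
  foreach ($ids as $id) {
    $cids['example:' . $id] = $id;
  }

  // $keys is passed by reference; afterwards it holds only the cache misses.
  $keys = array_keys($cids);
  $items = array();
  foreach (cache_get_multiple($keys) as $cid => $cache) {
    $items[$cids[$cid]] = $cache->data;
  }

  // Build and store whatever the cache did not have.
  foreach ($keys as $cid) {
    $id = $cids[$cid];
    $items[$id] = example_item_build($id);
    cache_set($cid, $items[$id]);
  }
  return $items;
}
```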
Static caching
Often you need to make use of the same information more than once during a page load. Best practice dictates that you should package the code that fetches it into a separate function for ease of maintenance.
To avoid repeating expensive calculations or database calls you can use a static cache to keep hold of the data after it has first been loaded, as the following example shows.
```php
<?php
function example_function() {
  // Get our static data.
  $example_data = &drupal_static(__FUNCTION__);

  if (!isset($example_data)) {
    // We haven't calculated this yet so load it now.
    $example_data = example_data_load();
  }

  return $example_data;
}
```
Using drupal_static() allows you to access and clear the static data elsewhere in your page load, in case you need to load a fresh copy later on.
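For instance, clearing the static data from elsewhere could look something like this. `example_data_refresh()` is a hypothetical name; `drupal_static_reset()` is the Drupal 7 function that does the work.

```php
<?php
/**
 * Hypothetical helper: call after the underlying data has changed.
 */
function example_data_refresh() {
  // The static cache is keyed by function name because example_function()
  // passed __FUNCTION__ to drupal_static().
  drupal_static_reset('example_function');
  // The next call finds the static empty and loads a fresh copy.
  return example_function();
}
```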
If your function is used heavily you should use the fast Drupal static method, as drupal_static() does have a small overhead associated with it:
```php
<?php
function example_function() {
  // Use the advanced drupal_static() pattern, since this is called very often.
  // The reference is stored in an array element because PHP does not remember
  // a reference assigned directly to a static variable between calls.
  static $drupal_static_fast;
  if (!isset($drupal_static_fast)) {
    $drupal_static_fast['example_data'] = &drupal_static(__FUNCTION__);
  }

  // Get our static data.
  $example_data = &$drupal_static_fast['example_data'];

  if (!isset($example_data)) {
    // We haven't calculated this yet so load it now.
    $example_data = example_data_load();
  }

  return $example_data;
}
```
So you've got your data until the end of the page load, but why stop there? cache_get() and cache_set() let you keep hold of that data for as long as you need it:
```php
<?php
function example_function() {
  // Use the advanced drupal_static() pattern, since this is called very often.
  // The reference is stored in an array element because PHP does not remember
  // a reference assigned directly to a static variable between calls.
  static $drupal_static_fast;
  if (!isset($drupal_static_fast)) {
    $drupal_static_fast['example_data'] = &drupal_static(__FUNCTION__);
  }

  // Get our static data.
  $example_data = &$drupal_static_fast['example_data'];

  if (!isset($example_data)) {
    if ($cache = cache_get('example_data')) {
      // Load the data straight from the cache.
      $example_data = $cache->data;
    }
    else {
      // We haven't calculated this yet so load it now.
      $example_data = example_data_load();
      // Store our calculated data in the cache.
      cache_set('example_data', $example_data);
    }
  }

  return $example_data;
}
```
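Because cache_set() stores the entry permanently by default, it stays around until it is explicitly cleared. A minimal sketch of an invalidation helper, assuming you call it whenever the underlying data changes (the function name is hypothetical):

```php
<?php
/**
 * Hypothetical helper: discards every cached copy of the example data.
 */
function example_data_invalidate() {
  // Remove the persistent entry written by cache_set('example_data', ...).
  cache_clear_all('example_data', 'cache');
  // Drop the in-request copy too, or this page load keeps the old value.
  drupal_static_reset('example_function');
}
```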
Profiling your code
XHProf is a really helpful tool for finding bottlenecks in your code.
You can see the amount of time and memory used by each function on a page-by-page basis, so you can work out where there's room for improvement.
It's always worth looking through the most frequently called and most time-consuming functions to make sure they really should be running that often and for that long. Seemingly simple, unoptimised functions often end up being called far more than you expected.
There are a couple of ways to use XHProf in your Drupal site:
- The XHProf module is the quickest to set up as it all comes bundled together
- If you're already using Devel on your site then you might find it easier to configure the built-in XHProf support
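If you just want a quick look at one piece of code, the XHProf PHP extension can also be driven by hand. The sketch below is an assumption-laden illustration rather than how either module works: both helper names are hypothetical, it needs PHP 5.3 or later for the closure, and it only relies on `xhprof_enable()`, `xhprof_disable()` and the `wt` (wall time) and `ct` (call count) values in the raw profile data.

```php
<?php
/**
 * Hypothetical helper: returns the slowest entries from raw XHProf data.
 *
 * $profile is the array returned by xhprof_disable(). Keys typically look
 * like "parent==>child" and each value includes 'ct' (number of calls) and
 * 'wt' (inclusive wall time in microseconds).
 */
function example_slowest_calls(array $profile, $limit = 10) {
  uasort($profile, function ($a, $b) {
    // Sort descending by wall time.
    return $b['wt'] - $a['wt'];
  });
  return array_slice($profile, 0, $limit, TRUE);
}

/**
 * Profiles a callable; returns NULL when the xhprof extension is missing.
 */
function example_profile($callable) {
  if (!extension_loaded('xhprof')) {
    return NULL;
  }
  xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY);
  call_user_func($callable);
  return example_slowest_calls(xhprof_disable());
}
```

Calling `example_profile('example_function')` would then return the ten most expensive entries recorded while that function ran.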
High speed hosting
Tuning your hosting can make or break your site when it comes to performance.
I will be writing a follow-up post, High speed Drupal hosting, with some handy hints and tips.
In the meantime, get in touch if you would like a hand tuning your Drupal hosting environment.