Slow Queries In Drupal Can Often Be Cured By Indexes
Recently, we were reviewing the performance of a large site that has a significant portion of its traffic from logged in users. The site was suffering from a high load average during peak times.
We enabled slow query logging on the site for a entire week, using the following in my.cnf:
log_slow_queries = 1<br>slow_query_log = 1<br>slow_query_log_file = /var/log/mysql/slow-query.log<br>log-queries-not-using-indexes = 1<br>long_query_time = 0.100
Note that the parameter long_query_time can be a fraction of a second only on more recent versions on MySQL.
You should not set this value too low, otherwise the server's disk could be tied up in logging the queries. Nor should it be too high so as to miss most slow queries.
We then analyzed the logged queries after a week.
We found that the slow queries, on aggregate, examined a total of 150,180 trillion rows, and returned 838,930 million rows.
Out of the total types of queries analyzed, the top two had a disproportionate share of the total.
So these two queries combined were 63.7% of the total slow queries! That is very high, and if we were able to improve these two queries, it would be a huge win for performance and server resources.
Voting API Slow Query
The first query had to do with Voting API and Userpoints.
It was:
SELECT votingapi_vote.*<br>FROM votingapi_vote<br>WHERE value_type = 'points' <br>AND tag = 'userpoints_karma'<br>AND uid = '75979'<br>AND value = '-1'<br>AND timestamp > '1464077478'
It hogged 45.3% of the total slow queries, and was called 367,531 times per week. It scanned over 213,000 rows every time it ran!
The query took an aggregate time for execution of 90,766, with an average of 247 milliseconds per execution.
The solution was simple: create an index on the uid column:
CREATE INDEX votingapi_vote_uid ON votingapi_vote (uid);
After that was done, the query used the index and scanned only one row, and returned instantly.
Private Messaging Slow Query
The second query had to do with Privatemsg. It is:
SELECT COUNT(pmi.recipient) AS sent_count<br>FROM pm_message pm<br>INNER JOIN pm_index pmi ON pm.mid = pmi.mid<br>WHERE pm.author = '394106' <br>AND pm.timestamp > '1463976037' <br>AND pm.author <> pmi.recipient
This query accounted for 18.4% of the total slow queries, and was called 32,318 times per week. It scanned over 1,350,000 rows on each execution!
The query took an aggregate time for execution of 36,842, with an average of 1.14 seconds (yes, seconds!) per execution.
Again, the solution was simple: create an index on the author column.
CREATE INDEX pm_message_author ON pm_message (author);
Just like the first query, after creating the index, the query used the index and scanned only 10 rows and over a million! It returned instantly.
Results After Tuning
As with any analysis, comparison of the before and after data is crucial.
After letting the tuned top two offending queries run for another week, the results were extremely pleasing:
Before
After
Total rows examined
150.18 T
34.93 T
Total rows returned
838.93 M
500.65 M
A marked improvement!
Conclusion
With performance, the 80/20 rule applies. There are often low hanging fruit that can easily be tuned.
Do not try to tune because of something you read somewhere, that may not apply to your site (including this and other articles on our site!)
Rather, you should do proper analysis, and reach a diagnosis based on facts and measurements, as to the cause(s) of the slowness. After that, tuning them will provide good results.