Jenkins + Drush + Dropbox = Easily share sanitized database projects
I recently wrote about setting up Jenkins. My next step was making it do something useful to help our team become more efficient. In most any team it's likely that you'll get some folks for whom "just use drush sql-sync" is not a reasonble solution.
My goal: get a database backup into dropbox on a regular basis and make sure no sensitive customer data is in that backup.
Make a Database backup of the live site
We're running jenkins on a non-production server (for a variety of reasons). So, we get a backup of the live database into a temporary scratch database using the drush aliases feature. That process sanitizes it a bit using the sql-sanitize feature of drush. Then we dump out that database.
- Start with an up to date checkout of your live site's Drupal code
- Use the multisite feature and create a sites/example.prod/settings.php where the $databases array has a set of read-only credentials to the production database
- A second "site" at sites/example.scratch/settings.php
- Setup a Drush alias that points to those two sites inside the Drupal - be sure to use the 'uri' element so that drush knows which set of credentials to use inside the sites/ folder:
<br>$aliases['example.prod'] = array(<br>'root' => '/var/lib/jenkins/example_scripts/example_com_checkout_for_drush/',<br>'uri' => 'example.backup',<br>);<br>$aliases['example.scratch'] = array(<br>'root' => '/var/lib/jenkins/example_scripts/example_com_checkout_for_drush/',<br>'uri' => 'example.scratch',<br>);<br>
- The example.scratch credentials should point to a "scratch" database that is used just for these purposes.
- Finally a line in the Jenkins job to copy the database from the live site to the backup.
drush sql-sync @sitename.prod @sitename.backup
- Run the "drush sql-sanitize" command on it to get drush's default sanitization. If you have the paranoia module installed it will do a few extra sanitization steps. For more on some ways to sanitize a Drupal database checkout this Drupal Scout article.
- Bonus points: Write your own sitename_drush_sql_sync_sanitize hook just like the paranoia module does - you can clear out data that is important to your site
- Once that's done, you export the sanitized database:
drush sql-dump > sitename.prod_sanitized_backup.sql
Send the result to Dropbox
There's Dropbox page for Linux that even has command line instructions. If you're not sure if you need 64 or 32 bit do a "uname -a" on your server. x86_64 is 64 bit.
I also downloaded their dropbox.py script to manage dropbox. I wanted the script to work regardless of whether the dropbox daemon has properly restarted (it feels like a likely point of failure). So, we have a Dropbox account that is connected just to the server and frequently our jobs will have a last step of moving their contents to ~/Dropbox/something and then they kick off the command ~/bin/dropbox.py start
To be honest, the Dropbox CLI mode has been a real pain. Some random TIF files don't synch to the server.
Bits and bobs
For us, this script runs every morning. After a few minutes Dropbox has synched it to all their machines and they can easily import it to a local environment using drush, mysql command line, PHPMyAdmin, or whatever their favorite tool is.
It does a few extra things:
- It bzips the output before putting it in Dropbox.
- It does an "rm -f *.sql; rm -f *.bz2" at the beginning and end to remove any leftover cruft.
- A downstream project uses the output of this step to rebuild a test site where we get the latest code from the testing branch, the latest prod database and people can test it out without having to have a local Drupal installation.
Related: Jenkins is CARD.com open source project spotlight article.
Thanks to c4rl for his help editing.