Configuring Drupal 7.x With PostgreSQL Replication

One of the new features in Drupal 7 is that it supports sending queries to a read-only slave database. Since version 9.0, PostgreSQL supports replication natively. In this post, I wanted to cover how to configure replication in PostgreSQL and have Drupal make use of a slave. I will use the master/slave terminology that is common in the MySQL world when referring to the master (primary) and slave (standby) servers in this post.

First, I installed PostgreSQL 9.1 on my master server along with Drupal 7.12 The steps taken to install and configure Drupal with PostgreSQL 9.1 on my master server are outlined in this gist. Then I installed PostgreSQL 9.1 on another server that will serve as a slave. My initial setup on the slave was quite simple and only involved a basic install and nothing else. The following commands were all I executed on the slave server to get a base PostgreSQL install:

sudo apt-get install python-software-propertiessudo add-apt-repository ppa:pitti/postgresqlsudo apt-get updatesudo apt-get install postgresql-9.1 libpq-dev postgresql-contrib-9.1

Once the basic Drupal install was up and running on the master and the slave server has a basic PostgreSQL install, I started on configuring replication. Replication in general is documented in depth in the online PostgreSQL documentation. In this post, I will be configuring streaming replication which allows a slave server to service read queries.

The steps that need to be performed to configure streaming replication are (I will cover how to perform each step):

create a replication user for slaves to connect with
enable continuous archiving on the master
configure the master to allow remote connections with the replication user
take a base backup to be used for setting up a slave
set up a file-based log-shipping slave

The first step is to create a user for replication on the master:

sudo su postgrespsqlcreate role repl replication login password 'repl';

Next, the master needs to be have continuous archiving enabled. This is achieved by editing the /etc/postgresql/9.1/main/postgresql.conf file on the master and ensuring the following parameters are set:

wal_level = hot_standbymax_wal_senders = 3 # limits number of concurrent connections from standbylisten_addresses = '0.0.0.0'archive_mode = onarchive_command = 'test ! -f /mnt/postgres/archivedir/%f && cp %p /mnt/postgres/archivedir/%f'

Now to allow remote connections for the replication user, the /etc/postgresql/9.1/main/pg_hba.conf file on the master server needs to have an entry like (this assumes the slave server I have configured has the IP address 10.39.111.10):

host replication repl 10.39.111.10/32 md5

Once the above modifications have been mode, we need to restart the PostgreSQL service:

sudo service postgresql restart

The master is now configured. Next we go to the slave server to take a base backup using pg_basebackup along with configuring the slave to use this base backup for its data directory:

sudo service postgresql stopsudo mv /var/lib/postgresql/9.1/main/ /var/lib/postgresql/9.1/orig_mainsudo su postgrespg_basebackup -D /var/lib/postgresql/9.1/main/ -P -h master_server -p 5432 -U repl -Wsudo ln -s /etc/ssl/certs/ssl-cert-snakeoil.pem /var/lib/postgresql/9.1/main/server.crtsudo ln -s /etc/ssl/private/ssl-cert-snakeoil.key /var/lib/postgresql/9.1/main/server.key

The pg_basebackup command should result in output similar to the following:

postgres@ip-10-39-111-9:/etc/postgresql/9.1/main$ pg_basebackup -D /var/lib/postgresql/9.1/main/ -P -h 10.76.241.129 -p 5432 -U repl -WPassword: WARNING: skipping special file "./server.key"WARNING: skipping special file "./server.crt"WARNING: skipping special file "./server.key"WARNING: skipping special file "./server.crt"1403786/1403786 kB (100%), 1/1 tablespaceNOTICE: pg_stop_backup complete, all required WAL segments have been archivedpostgres@ip-10-39-111-9:/etc/postgresql/9.1/main$

Next, we configure the slave to be a hot standby and to allow remote client connections (since Drupal will be connecting to the slave). This is done by editing the /etc/postgresql/9.1/main/postgresql.conf file on the slave to have the following entries:

hot_standby = onlisten_addresses = '0.0.0.0'

To allow the drupal user to connect from the master server (where apache is running), modify the /etc/postgresql/9.1/main/pg_hba.conf file on the slave (assuming 10.76.241.129 is IP address of master):

host drupal drupal 10.76.241.129/32 md5

Next, create a recovery.conf file in the PostgreSQL data directory on the slave server:

sudo touch /var/lib/postgresql/9.1/main/recovery.confsudo chown postgres:postgres /var/lib/postgresql/9.1/main/recovery.conf

The following should be placed in the recovery.conf file (assuming 10.76.241.129 is IP address of master):

standby_mode = 'on'primary_conninfo = 'host=10.76.241.129 port=5432 user=repl password=repl'

The PostgreSQL service on the slave server is now ready to be started again:

sudo service postgresql start

If everything worked correctly, log entries indicating replication is running should be present. For example, on my slave server my log file had entries like:

ubuntu@ip-10-39-111-9:/var/log/postgresql$ sudo tail -n 5 /var/log/postgresql/postgresql-9.1-main.log 2012-07-07 22:06:50 UTC LOG: streaming replication successfully connected to primary2012-07-07 22:06:50 UTC LOG: incomplete startup packet2012-07-07 22:06:50 UTC LOG: redo starts at 1/150000202012-07-07 22:06:50 UTC LOG: consistent recovery state reached at 1/160000002012-07-07 22:06:50 UTC LOG: database system is ready to accept read only connectionsubuntu@ip-10-39-111-9:/var/log/postgresql$

Now, Drupal running on the master server is ready to be configured to use a PostgreSQL slave for read-only queries! The settings.php file for the Drupal site needs to be updated to know about this slave database. My settings.php file looked like (10.39.111.10 is IP address of slave server):

$databases = array ( 'default' => array ( 'default' => array ( 'database' => 'drupal', 'username' => 'drupal', 'password' => 'drupal', 'host' => 'localhost', 'port' => '5432', 'driver' => 'pgsql', 'prefix' => '', ), 'slave' => array ( 'database' => 'drupal', 'username' => 'drupal', 'password' => 'drupal', 'host' => '10.39.111.10', 'port' => '5432', 'driver' => 'pgsql', 'prefix' => '', ), ),);

I would suggest enabling query logging on the slave server so you can see read queries being sent to the slave. Query logging can be enabled by modifying the /etc/postgresql/9.1/main/postgresql.conf file to have these entries:

logging_collector = onlog_directory = 'pg_log'log_statement = 'all'

Query log files will then be generated in the /var/lib/postgresql/9,1/main/pg_log directory.

By default, very few queries from Drupal core are sent to a slave database. The search module is probably the best module to test with to see queries being sent to the slave server. The search module can be access from your drupal site by going to http://your.ip.address/drupal/?q=search

Try searching content for a keyword. If everything is working correctly, queries should start appearing in the query log on the slave server when issuing content searches.

That’s about it for this post. Once replication is configured in PostgreSQL, having Drupal send queries to the slave is pretty straightforward.

Original Article:

Configuring Drupal 7.x With PostgreSQL Replication