Converting Drupal data to Mongo
Step by step of this painful process
Converting an existing Drupal site to MongoDB
All the cool kids lately are using MongoDB field storage to scale their Drupal sites. That’s all fine and dandy if you use it from the get-go, since that’s what it’s made for, but converting an existing Drupal site to it can be a bit of a pain.
There are basically three parts to the conversion process:
1. Initial setup
2. Convert the data
3. Convert the views
Initial setup
This is mostly covered by the Mongo module’s docs but I’ll cover it here just because.
Step 1: Install MongoDB somewhere on a server. Instructions for this can be found in a million places online. Once you have it up and running, you should be able to connect to it from a terminal.
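If you want something scriptable, here’s a minimal sketch of a connectivity check. It assumes either the legacy mongo shell or the newer mongosh is on your PATH and that MongoDB is listening on the default local port. The mongo_ping function name is made up for this example.

```shell
# Hypothetical helper: reports whether a local MongoDB answers a ping.
# Assumes the server listens on the default localhost port.
mongo_ping() {
  local shell_bin
  # Prefer mongosh, fall back to the legacy mongo shell.
  shell_bin=$(command -v mongosh || command -v mongo) || { echo "missing"; return 0; }
  if "$shell_bin" --quiet --eval 'db.runCommand({ ping: 1 })' >/dev/null 2>&1; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

mongo_status=$(mongo_ping)
echo "MongoDB is ${mongo_status}"
```

Anything other than “reachable” means you should fix the install before going further.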
Step 2: Download and enable the “mongodb” module and the “mongodb_field_storage” submodule.
Step 3: Add the MongoDB connection info to settings.php.
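Here’s a rough sketch of the kind of snippet that goes into settings.php, wrapped in a shell heredoc so you can write it to a scratch file and review it before pasting. The host and database name are placeholders, and the variable names are what I’d expect from the Drupal 7 mongodb module’s README, so treat them as assumptions and check them against the version you installed.

```shell
# Write the snippet to a scratch file for review; paste it into settings.php by hand.
snippet_file=$(mktemp)
cat > "$snippet_file" <<'PHP'
// Assumed values: MongoDB on localhost, database named "drupal".
$conf['mongodb_connections'] = array(
  'default' => array(
    'host' => 'localhost',
    'db' => 'drupal',
  ),
);
// Assumption: this makes MongoDB the default storage for newly created fields.
$conf['field_storage_default'] = 'mongodb_field_storage';
PHP
cat "$snippet_file"
```

If your MongoDB lives on another box or needs authentication, the host value is the part to change.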
Step 4: Clear all caches
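Steps 2 and 4 can be done from the command line too. This is a sketch assuming Drush on a Drupal 7 site, run from the Drupal root. The run helper and the DRY_RUN switch are conveniences invented for this example; by default the commands are only printed so you can look them over first.

```shell
# Set DRY_RUN=0 to actually execute; by default commands are only printed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run drush dl mongodb -y                          # Step 2: download the module
run drush en mongodb mongodb_field_storage -y    # Step 2: enable module + submodule
run drush cc all                                 # Step 4: clear all caches
```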
Now Mongo is set up and ready to go, but it’ll only act on new nodes in content types whose fields were created after setting up Mongo. (If you’re nerdy/curious, this is mostly because existing fields have their storage set to “field_sql_storage” in the “field_config” table.) That doesn’t do us any good because we’re converting an existing site. So on to the conversion.
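If you want to see that for yourself, something like the following should show how many fields sit on each storage backend. It’s a sketch that assumes Drush is available and that your Drupal 7 field_config table has the usual storage_type column.

```shell
# Count fields per storage backend (read-only query).
storage_query="SELECT storage_type, COUNT(*) AS fields FROM field_config GROUP BY storage_type"

if command -v drush >/dev/null 2>&1; then
  # Run from the Drupal root so Drush can find the site.
  drush sql-query "$storage_query"
else
  echo "drush not found; would run: drush sql-query \"$storage_query\""
fi
```

Before the migration you’d expect to see mostly field_sql_storage here.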
Converting the data
So you now have Mongo up and running but you need to convert your billion existing awesome nodes and fields and users to Mongo for it to do any good. Luckily, there’s a separate submodule called “mongodb_migrate” made to do just that.
Step 1: Enable the “mongodb_migrate” module
Step 2: Take a backup of the MySQL database in case things go wrong.
Step 3: Run “drush mongodb-migrate-prepare” which will create the collections you need in Mongo and build a list of entity types to migrate.
Step 4: Run “drush mongodb-migrate --timeout=5”, which will perform 5 seconds of the migration as a test. You can monitor the output, which will tell you what entity type and ID it’s working on as it goes.
Step 5: Pick one of the IDs from step 4 (i.e., a node or user or whatever that was migrated) and make sure that 1) it and its fields exist in Mongo and 2) you can edit it, save it, and view it without issues.
If you run into issues here, you’ve got some digging to do (issue queues, IRC, source code, etc.). Otherwise, move on.
Step 6: Run “drush mongodb-migrate --timeout=0” to run the entire migration with no timeout. If you have a lot of data this can take a VERY long time, so keep that in mind and maybe run it overnight or over the weekend. Note that you can also stop the migration, and when you start it again it’ll pick up where it left off.
For the record, this drush command has two options, and --timeout is the one that matters here. It defaults to 900 seconds (15 minutes), which is why --timeout=0 is needed to keep it running until it’s done.
Step 7: Once the migration is done, click around the site, edit random things, and make sure that everything is working as expected (except for the Views; those come next).
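The command-line part of those steps can be strung together roughly like this. It’s a sketch, not a tested deployment script: the run helper, the DRY_RUN and PHASE switches, and the backup file name are all invented for this example, and the backup uses Drush’s sql-dump rather than whatever backup tool you normally use.

```shell
# Set DRY_RUN=0 to actually execute; by default commands are only printed.
DRY_RUN=${DRY_RUN:-1}
# PHASE=test covers steps 1-4; PHASE=full covers step 6 (after your spot check).
PHASE=${PHASE:-test}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical backup file name with today's date.
backup_file="drupal-before-mongo-$(date +%Y%m%d).sql"

if [ "$PHASE" = "test" ]; then
  run drush en mongodb_migrate -y                   # Step 1
  run drush sql-dump --result-file="$backup_file"   # Step 2
  run drush mongodb-migrate-prepare                 # Step 3
  run drush mongodb-migrate --timeout=5             # Step 4: 5-second trial
else
  run drush mongodb-migrate --timeout=0             # Step 6: run to completion
fi
```

Run it once with the default phase, do the manual check from step 5, then run it again with PHASE=full.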
Converting the Views
You’re using Views, right? If so, the gotcha here is that Views is hardcoded to use field_sql_storage and thus doesn’t really work with Mongo. So to get around that, some smart folks built the EntityFieldQuery Views module, which is basically an alternative Views backend that uses EFQ instead of querying fields directly. The obvious benefit here is that it makes Mongo integration with Views possible, but there are a couple of drawbacks:
1. NO RELATIONSHIPS WHATSOEVER. EFQ Views do not support relationships because Mongo doesn’t support JOINs. If this is a killer feature for whatever you’re doing, then you have some work to do.
2. Some modules that offer Views integration don’t work with EFQ Views. I’m not sure exactly why this is but it’s true. If you have fields or sorts or filters that are coming from a contrib module instead of stock Views or custom-defined fields, then you might want to check with a test EFQ View first.
If you’re still game, then you have two options:
1. Rebuild all your existing Views manually as EFQ: Content views rather than plain old Content views, or:
2. Convert your views to EFQ Views using a drush command.
***NOTE***: You only have to worry about this for Views that use custom fields. If a View just uses stuff that’s in the “node” or “users” table (stuff like title, created date, author, etc.) then you should be able to leave it as a regular View because that stuff all still exists in MySQL. It’s the fields that are the problem.
Rebuilding your views
If you decide your Views are simple enough that you can rebuild them without too much effort (along with redoing your views blocks and changing your CSS to match up with the new classes etc.) then your job is easy.
For each View that uses custom-defined fields in any way, just create a new View, make sure to use “EntityFieldQuery: Content (or User or whatever)” as the View type when creating, and then just configure it like any other View.
Converting your views
If you have a ton of views spread out into a ton of blocks in a ton of different contexts then maybe rebuilding that all isn’t an option for you. In that case, you’re in luck--we had that problem too and we built a drush command to convert them for you. This is riskier than rebuilding but also much quicker, and you dig the bleeding edge, don’t you? We do too.
Grab the drush inc file from this comment and run “drush efq-views-convert --view=VIEWNAME” for each View you want to convert. If the View has a relationship it’ll tell you it can’t do it, since EFQ Views doesn’t support relationships; otherwise it’ll do its best. After it runs, edit the view and look for anything that says “Broken/missing handler”, and if you spot any, fix those manually. The script can’t catch everything.
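If you have a pile of views to get through, a small loop saves some typing. This is a sketch that assumes the efq-views-convert command from that drush inc file is installed; the view names below are placeholders, and the DRY_RUN switch is invented for this example so you can see the commands before running them.

```shell
# Set DRY_RUN=0 to actually execute; by default commands are only printed.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical machine names; replace with your own views.
views_to_convert="frontpage_articles user_directory"

processed=0
for view in $views_to_convert; do
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ drush efq-views-convert --view=$view"
  else
    # Keep going if one view fails (for example, because it has a relationship).
    drush efq-views-convert --view="$view" || echo "conversion failed for $view" >&2
  fi
  processed=$((processed + 1))
done
echo "processed $processed view(s)"
```

You still need to open each converted view afterwards and check for broken handlers by hand.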
Once all your views are converted, and provided you converted your data successfully, your site should be in pretty good shape. Hope you’ve found this helpful!