Migrate Translations from CSV, JSON or XML to Drupal 8
In my last post, I showed you how to migrate translated content from Drupal 6 to Drupal 8. But clients often don't start with their data in Drupal 6. Instead there's some other source of data that may include translations, like a CSV spreadsheet. In this article, I'll show you how to migrate multilingual content from such sources to Drupal 8.
This article would not have been possible without the help of my colleague Dave. Gracias Dave!
The problem
We have two CSV files containing some data about chemical elements in two languages. One file contains data in English and the other file, in Spanish. Our goal is to migrate these records into a Drupal 8 website, preserving the translations.
Before we start
- Since this is an advanced migration topic, it is assumed you already know the basics of migration.
- To execute the migrations in this example, you can download the migrate_example_i18n module. It should work without any trouble on a standard Drupal 8 install. See quick-start for more information.
Migrating JSON, XML and other formats
Though this example shows how to work with a CSV data source, one can easily work with other data sources. Here are some quick pointers:
- Find and install the relevant migrate source module. If you do not have a standard source module available, you can:
  - try converting your data to a supported format first.
  - write your own migration source plugin, if you're feeling adventurous.
- Modify the migration definitions to include custom parameters for the data source.
- Some useful source formats are supported by these projects:
  - JSON / XML / SOAP are supported by the migrate_plus project.
  - ODS / XLS / XLSX / CSV are supported by the migrate_spreadsheet project.
The module
To write the migrations, we create a module—in our case, it is named migrate_example_i18n. There's nothing special about the module declaration except for the dependencies:
- migrate_plus and migrate_tools provide various features for defining and executing migrations.
- migrate_source_csv lets us read migration source data from CSV files.
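As a rough sketch, the module declaration could look like the following. The three dependencies come from the list above; the human-readable name, description and package are assumptions made for illustration and may differ from the actual module.

```yaml
# migrate_example_i18n.info.yml (illustrative sketch)
# name, description and package below are assumed values.
name: 'Migrate Example i18n'
type: module
description: 'Example migrations for multilingual content from CSV files.'
core: 8.x
package: Custom
dependencies:
  # Features for defining and executing migrations.
  - migrate_plus
  - migrate_tools
  # CSV source plugin.
  - migrate_source_csv
```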
How to migrate translations
Before we start writing migrations, it is important to mention how Drupal 8 translations work. In a nutshell:
- First, we create content in its base language, say in English. For example, we could create a brand new node for the element Hydrogen, which might have a unique node ID 4.
- Now that the base node is in place, we can translate the node, say to Spanish. Unlike some previous versions of Drupal, this won't become a new node with its own node ID. Instead, the translation is saved against the same node generated above, and so will have the same node ID—just a different language setting.
Hence, the migration definition for this example includes the following:
- We migrate the base data in English using the example_element_en migration.
- We migrate the Spanish translations using the example_element_es migration, and link each translation to the original English version.
- We group the two migrations in the example_element migration group to keep things clean and organized.
Thus, we can execute the migrations of this example with the command drush migrate-import --group=example_element.
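For reference, a migration group provided by migrate_plus is itself a small piece of configuration. The sketch below assumes the group lives in the module's config/install directory; the label, description and source_type texts are made up for illustration.

```yaml
# config/install/migrate_plus.migration_group.example_element.yml (sketch)
id: example_element
# The values below are assumed, human-readable descriptions.
label: 'Element migrations'
description: 'Chemical elements in English and Spanish.'
source_type: 'CSV files'
```

Each migration then joins the group by setting migration_group: example_element in its own definition.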
Warning
Note that this plan only works because every single node we are importing has at least an English translation! If some nodes only existed in Spanish, we would not be able to link them to the (non-existent) original English version. If you encounter data like this, you'll need to handle it in a different way.
Step 1: Element base migration (English)
To migrate the English translations, we define the example_element_en migration. Here is a quick look at some important parameters used in the migration definition.
Source
    source:
      plugin: csv
      path: 'element.data.en.csv'
      header_row_count: 1
      keys:
        - Symbol
      fields:
        Name: 'Name'
        Symbol: 'Symbol'
        'Atomic Number': 'Atomic number'
        'Discovered By': 'Name of people who discovered the element'
      constants:
        lang_en: en
        node_element: 'element'
- plugin: Since we want to import data from a CSV file, we need to use the csv plugin provided by the migrate_source_csv module.
- path: Path to the CSV data source so that the source plugin can read the file. Our source files for this example actually live within our module, so we modify this path at runtime using hook_migration_plugins_alter() in migrate_example_i18n.module.
- header_row_count: Number of initial rows in the CSV file which do not contain actual data. This helps ignore column headings.
- keys: The column(s) in the CSV file which uniquely identify each record. In our example, the chemical symbol in the column Symbol is unique to each row, so we can use that as the key.
- fields: A description for every column present in the CSV file. This is used for displaying source details in the UI.
- constants: Some static values for use during the migration.
Destination
    destination:
      plugin: 'entity:node'

- plugin: Nothing fancy here. We aim to create node entities, so we set the plugin as entity:node.
- translations: Since we are importing the content in its base language, we do not specify the translations parameter. This makes Drupal create a new node for every record.
Process
    process:
      type: constants/node_element
      title: Name
      langcode: constants/lang_en
      field_element_symbol: Symbol
      field_element_discoverer:
        plugin: explode
        delimiter: ', '
        source: Discovered By

This is where we map the columns of the CSV file to properties of our target nodes. Here are some mappings which deserve special mention:
- type: We hard-code the content type for the nodes we wish to create, to type element.
- langcode: Since all source records are in English, we tell Drupal to save the destination nodes in English as well. We do this by explicitly specifying langcode as en.
- field_element_discoverer: This field is a bit tricky. Looking at the source data, we realize that every element has one or more discoverers. Multiple discoverer names are separated by commas. Thus, we use plugin: explode and delimiter: ', ' to split each value into an array of names. With the values split into arrays, Drupal understands and saves the data in this column as multiple values.
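The ', ' delimiter only works if every comma in the source is followed by exactly one space. If your data is less tidy, one possible variation (not what this example module uses) is to split on a bare comma and then trim each resulting value with the callback process plugin:

```yaml
# Sketch of an alternative mapping, assuming inconsistent spacing after commas.
field_element_discoverer:
  # Step 1: split on the comma alone. explode returns multiple values.
  - plugin: explode
    delimiter: ','
    source: Discovered By
  # Step 2: applied to each value from step 1; strips surrounding whitespace.
  - plugin: callback
    callable: trim
```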
When we run this migration with drush migrate-import example_element_en, we import all the nodes in the base language (English).
Step 2: Element translation migration (Spanish)
With the base nodes in place, we define a migration similar to the previous one with the ID example_element_es.
    source:
      plugin: csv
      path: 'element.data.es.csv'
      header_row_count: 1
      keys:
        - 'Simbolo'
      constants:
        lang_en: en
        # ...
    destination:
      plugin: 'entity:node'
      translations: true
    process:
      nid:
        plugin: migration
        source: Simbolo
        migration: example_element_en
      langcode: constants/lang_es
      content_translation_source: constants/lang_en
      # ...
    migration_dependencies:
      required:
        - example_element_en

Let us look at some major differences between the example_element_es migration and the example_element_en migration:
- source:
  - path: Since the Spanish node data is in another file, we change the path accordingly.
  - keys: The Spanish word for Symbol is Símbolo, and it is the column containing the unique ID of each record. Hence, we define it as the source data key. Unfortunately, Drupal migrate does not support keys with non-ASCII characters such as í (with its accent). So, as a workaround, I had to remove all such accented characters from the column headings and write the keys parameter as Simbolo, without the accented í.
  - fields: The field definitions had to be changed to match the Spanish column names used in the CSV.
- destination:
  - translations: Since we want Drupal to create translations for the English language nodes created during the example_element_en migration, we specify translations: true.
- process:
  - nid: We use plugin: migration to make Drupal look up the nodes which were created during the English element migration and use their IDs as the nid. This results in the Spanish translations being attached to the original nodes created in English.
  - langcode: Since all records in element.data.es.csv are in Spanish, we hard-code the langcode to es for each record of this migration. This tells Drupal that these are Spanish translations.
  - content_translation_source: Each translation of a Drupal node comes from a previous translation. For example, you might take the Spanish translation and translate it into French. In this case, we'd say that Spanish was the source language of the French translation. By adding this process step, we tell Drupal that all our Spanish translations are coming from English.
- migration_dependencies: This ensures that the base data is migrated before the translations. So to run this migration, one must run the example_element_en migration first.
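A note for readers on newer Drupal releases: later versions of Drupal 8 renamed the migration process plugin to migration_lookup, and Drupal 9 no longer ships the old name. Assuming such a version, the nid mapping above would be written roughly like this, with everything else unchanged:

```yaml
# Sketch for newer Drupal versions; only the plugin name differs.
process:
  nid:
    plugin: migration_lookup
    source: Simbolo
    migration: example_element_en
```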
Voilà! Run the Spanish migration (drush migrate-import example_element_es) and you have the Spanish translations for the elements! We can run both the English and Spanish migrations at once using the migration group we created. Here's how the output should look on the command line:

    $ drush migrate-import --group=example_element
    Processed 111 items (111 created, 0 updated, 0 failed, 0 ignored) - done with 'example_element_en'
    Processed 105 items (105 created, 0 updated, 0 failed, 0 ignored) - done with 'example_element_es'
If we had another file containing French translations, we would create another migration like we did for Spanish, and import the French data in a similar way. I couldn't find a CSV file with element data in French, so I didn't include it in this example—but go try it out on your own, and leave a comment to tell me how it went!
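To make that concrete, here is a minimal sketch of what such a French migration might look like, modeled on the Spanish one. Everything specific to French is hypothetical: the migration ID example_element_fr, the file element.data.fr.csv, the column names Symbole and Nom, and the lang_fr constant are not part of the example module.

```yaml
# Hypothetical French translation migration (sketch only).
id: example_element_fr
label: 'Elements (French)'
migration_group: example_element
source:
  plugin: csv
  # Assumed file name; no such file ships with the example.
  path: 'element.data.fr.csv'
  header_row_count: 1
  keys:
    # Assumed column heading, kept free of accented characters.
    - 'Symbole'
  constants:
    lang_en: en
    lang_fr: fr
destination:
  plugin: 'entity:node'
  # Attach the imported rows as translations of existing nodes.
  translations: true
process:
  # Reuse the node ID created by the English migration.
  nid:
    plugin: migration
    source: Symbole
    migration: example_element_en
  langcode: constants/lang_fr
  content_translation_source: constants/lang_en
  # Assumed column heading for the element name.
  title: Nom
migration_dependencies:
  required:
    - example_element_en
```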
Next steps
- Check out the code for the migrate_example_i18n module on GitHub.
- Read about migrating translated content from Drupal 6 to Drupal 8.