B'Tselem: migration from ASP to Drupal
In October 2008 I hitchhiked through the Middle East all the way to Jerusalem. It took me only 12 days, 3 of which I spent in Istanbul. Syria was still a relatively safe place to travel, but of course I was covered by long-term travel insurance.
While in Israel I got in touch with B'Tselem, a human rights organization for the occupied territories.
Project in a nutshell
- est. pages: 30,000
- technology: LAMP + Drupal
- languages: English, Hebrew, Arabic
- partner (coding): DQX Tech
- partner (release): Linnovate
- partner (design): WUWA
- running time: early 2009 - June 2011
B'Tselem has championed human rights in the Occupied Territories for over two decades, earning a reputation as the leading source of information on human rights there. In late 2008, during a stay in Israel, I became aware of their activities, which range from collecting detailed testimonies of human rights violations to hands-on work with Palestinians in the field, whom B'Tselem has equipped with video cameras. I noticed the website was built in Microsoft ASP.
Yoav Gross, now the director of B'Tselem's video department, was my first contact at B'Tselem. At first the plan was simply to build a Drupal website for the video part of the site. The project proposal quickly grew much bigger, encompassing a migration of the entire existing ASP site to Drupal.
An overview of B'Tselem's website
The site turned out to be more complex and extensive than I had thought at the beginning. Here are its different parts:
- Press releases
- News items
- Photos, an extensive collection of photographs taken in the field by both Jewish and Arab people
- Maps
- Topic pages, detailed background information
- Publications
- Testimonies
- Statistics
- Video, including the Shooting Back project
The conversion process
Several attempts at converting the ASP files
Most of the content lived in hand-edited ASP files. To publish a new press release, someone took a template ASP file, filled in the parts and uploaded the file to the production server.
Honestly, I don't really like PHP. And I was not looking forward to parsing 20,000 ASP files like that.
Our first attempt was to use Ruby with hpricot, turn all the data into .csv files and load those with node import. Node import's lack of scriptability made us abandon this idea pretty quickly.
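To give an idea of what such an extract-to-CSV step involves, here is a minimal sketch in Python rather than the Ruby and hpricot we actually tried. Everything specific in it is an assumption for illustration: the real files had many more fields than a `<title>`, and the names `extract_title` and `pages_to_csv` are made up.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Server-side ASP blocks look like <% ... %>; drop them before parsing the HTML.
ASP_BLOCK = re.compile(r"<%.*?%>", re.DOTALL)


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element (assumed field)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <TITLE> is handled too.
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(asp_source):
    """Return the page title with whitespace collapsed (hypothetical helper)."""
    parser = TitleParser()
    parser.feed(ASP_BLOCK.sub("", asp_source))
    parser.close()
    return " ".join("".join(parser.parts).split())


def pages_to_csv(pages):
    """Turn (path, source) pairs into CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["path", "title"])
    for path, source in pages:
        writer.writerow([path, extract_title(source)])
    return buf.getvalue()
```

The extraction itself is the easy part. The trouble started at the import side, where every exception in the hand-edited files needed its own handling.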
The second attempt was Python, using Drupal Services. As it turned out, Drupal Services is nice, but it's not the right tool if you want precise control over your data.
So in the end I did it in PHP. Drush made things easily scriptable, so the migration process ran every night. A run took about 1 hour on a fast server - optimization wasn't the main concern here. The hand-edited files had many peculiarities that had to be handled with ever more exceptions.
Around spring 2010 dqxtech joined the project and handled the statistics part and many other things.
Access files: mdbtools to the rescue
A large part of the site, such as the photos and statistics, was stored in Access files. Fortunately there's a free software project, mdbtools, that reads these files and converts them into more open formats. I wrote some PHP classes to facilitate the process.
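The PHP classes themselves are not shown here. As a rough illustration of the idea, this is a minimal Python sketch that wraps two mdbtools commands, `mdb-tables -1` (one table name per line) and `mdb-export` (a table as CSV with a header row). The function names are hypothetical, and the sketch assumes mdbtools is installed and emits UTF-8.

```python
import csv
import io
import subprocess


def parse_table_list(output):
    """Split the output of `mdb-tables -1` into a list of table names."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def parse_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def list_tables(mdb_path):
    """Return the table names found in an Access file."""
    result = subprocess.run(
        ["mdb-tables", "-1", mdb_path],
        check=True, capture_output=True, encoding="utf-8",  # UTF-8 is assumed
    )
    return parse_table_list(result.stdout)


def export_table(mdb_path, table):
    """Return all rows of one table as dicts keyed by column name."""
    result = subprocess.run(
        ["mdb-export", mdb_path, table],
        check=True, capture_output=True, encoding="utf-8",  # UTF-8 is assumed
    )
    return parse_csv(result.stdout)
```

With something like this, looping over `list_tables(path)` and calling `export_table(path, name)` for each table yields plain rows that a migration script can feed into Drupal nodes.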
From design to theme
At first the idea was to stick with the existing design. It was still based on HTML tables and in general felt a bit more like the 90s - but pretty good for the nineties.
So a new design was made by Ben Benhorim from whileyouwereaway visual solutions. I'm not so good with .PSD files, so the conversion process was outsourced to PSD2html. I thought their Drupal theming option was a good idea, but instead of a bare theme PSD2html delivered a complete Drupal setup - which caused extra work to merge back into the already growing alpha site. So I can recommend their work, but only up to the HTML: don't take the Drupal option.
i18n and ltr/rtl
I had already gained some experience doing i18n in Drupal, but I hadn't yet worked with right-to-left languages. This seemed easy at first, then it became hairy, but thanks to $body_classes and some tips and work from dqxtech it was pretty easy again in the end. The main trick is to put an rtl or ltr class on your body tag and use it to separate the ltr and rtl CSS rules. A fictional example:
#footer-wrapper { margin-bottom: 20px; }
.ltr #footer-wrapper { margin-left: 10px; }
.rtl #footer-wrapper { margin-right: 10px; }
We focused on getting Hebrew done right, so my Hebrew has improved considerably thanks to this project. At the end there were also some specific issues in Arabic but unfortunately I still have a hard time deciphering the characters.
SEO
The existing links were of the form English/Statistics/Index.asp, and also mixed capitals/lower case (typical for ASP sites). All of these had to remain working. I chose to first make everything lowercase, which I did through some Apache rules for forced lowercase 301s.
(I don't remember the exact details right now [June 2011] but...) For the conversion it was important that all files were also reachable under their lowercase paths, both during the conversion process and afterwards. For this I wrote a Python script that recursively creates lowercase symlinks.
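The original script is long gone, so the following is a reconstruction of the idea and not the code that ran: a minimal sketch, assuming a case-sensitive filesystem, in which `create_lowercase_symlinks` is a name made up for this post.

```python
import os
from pathlib import Path


def create_lowercase_symlinks(root):
    """Walk `root` and, next to every file or directory whose name contains
    uppercase letters, create a lowercase symlink pointing at it.

    Returns the list of symlink paths that were created.
    """
    created = []
    # os.walk does not descend into symlinked directories by default,
    # so the links created here are never walked themselves.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            lower = name.lower()
            if lower == name:
                continue  # already lowercase, nothing to do
            link = Path(dirpath) / lower
            if link.exists() or link.is_symlink():
                continue  # never overwrite an existing file or link
            # A relative target keeps the tree relocatable.
            link.symlink_to(name)
            created.append(str(link))
    return created
```

After a run, English/Statistics/Index.asp can also be reached as english/statistics/index.asp, or through any mix of the two spellings, because every path component has its own link. Running it a second time creates nothing new.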
The main modules
- Display Suite has been extremely useful for getting the proper positioning of the various content types
- Drush is not really a module, but it has been so amazingly useful that it deserves a mention
To a release
By the end of 2010 priorities had changed and we decided to bring in Israel's biggest Drupal shop, Linnovate, to finish the job and launch the site.
Conclusion
Meanwhile I have never seriously considered moving to Israel but I'm happy I worked on this website. I feel B'Tselem's work is important for the future of the Middle East.
tags: israel, drupal, hebrew, btselem, asp, conversion, arabic, human rights, drupal planet