drush grep: search raw content in Drupal with regular expressions
I was looking for a way to search the raw content of my Drupal blog (the text as stored, before input filters are applied) using regular expressions, à la grep. I googled to see what other people had come up with for the same problem and found an article about Searching the Drupal Database by Regular Expression, which also pointed to the scanner module. Both solutions have some limitations, though: ad-hoc Drupal scripting, MySQL-only support, and I didn't want a module installed just for that anyway. So I tried the Drush way, and I found it the most convenient.
Drush is a good tool to know if you are somewhat into Drupal: with it you can do almost anything you do with the Drupal admin UI, only faster and in a scriptable way from the command line.
The power of Drush is that it can easily be extended with new commands to meet your needs. A good first resource for that is the example command file sandwich.drush.inc. Oddly enough, a “How to write a new Drush command?” question was not even in the FAQ, but that's fixed now.
Anyhow, I wrote a “grep” Drush command for my problem; you can find it in the dgrep git repository. Here is an example run on my blog, checking where I used the syntaxhighlighter filter:
$ cd .../my_drupal_installation_dir
$ drush grep '/syntaxhighlighter[^}]*/'
Node: 11 Title: Web scraping with PHP and XSL
URL: blog/2009/07/26/web-scraping-php-and-xsl
Match: syntaxhighlighter brush:php
Node: 12 Title: Renaming a DOM element with XSL
URL: blog/2009/08/06/renaming-dom-element-xsl
Match: syntaxhighlighter brush:php
Node: 13 Title: Translating XML documents with XLIFF
URL: blog/2009/09/09/translating-xml-documents-xliff
Match: syntaxhighlighter brush:xml
Node: 16 Title: git-commit with date in the past
URL: blog/2009/10/30/git-commit-date-past
Match: syntaxhighlighter brush:bash
Node: 19 Title: Vim buffers: status(line) symbol
URL: blog/2009/11/12/vim-buffers-statusline-symbol
Match: syntaxhighlighter brush:plain
Node: 26 Title: Branding patches with git and vim
URL: blog/2010/01/05/branding-patches-git-and-vim
Match: syntaxhighlighter brush:bash
Node: 34 Title: On piping in shell scripts and var scoping
URL: blog/2010/03/26/piping-shell-scripts-and-var-scoping
Match: syntaxhighlighter brush:shell
Node: 37 Title: Neat compile/run cycle with git and OpenEmbedded
URL: blog/2010/05/27/neat-compilerun-cycle-git-and-openembedded
Match: syntaxhighlighter brush:bash
Node: 43 Title: AO2 runs into autorun.inf
URL: blog/2010/09/19/ao2-runs-autoruninf
Match: syntaxhighlighter class="brush: bash"
Node: 46 Title: List header files first in a patch with git
URL: blog/2010/10/13/list-header-files-first-patch-git
Match: syntaxhighlighter class="brush: cpp;" title="dinner.h"
TODO: grepping blocks is not supported yet
How cool is that?
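If you want a feel for what that pattern picks up without touching a Drupal site, here is a minimal sketch using plain grep. The sample body text and the first_match helper are made up for illustration and are not part of dgrep. Note also that grep works line by line, while a PCRE pattern such as the one passed to drush grep can match across newlines, so this is only an approximation.

```shell
#!/bin/sh
# Hypothetical raw node body, as it might be stored before input filters run.
sample_body='Some intro text.
{syntaxhighlighter brush:php}
echo "hello";
{/syntaxhighlighter}'

# first_match PATTERN: print the first substring of stdin that matches the
# extended regular expression PATTERN (illustrative helper, not from dgrep).
first_match() {
  grep -oE "$1" | head -n 1
}

# Same expression as in the run above, minus the PCRE-style / delimiters.
printf '%s\n' "$sample_body" | first_match 'syntaxhighlighter[^}]*'
# prints: syntaxhighlighter brush:php
```

The closing {/syntaxhighlighter} tag matches the pattern too (with nothing after the name), which is why the helper keeps only the first hit.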
The current dgrep code is just a prototype, but I'd like it to become useful for the whole Drupal community. So if you wanna help: go try it, comment on it, fork it, and report back with any feedback or code changes you might have, either here or in the related issue on drupal.org. Thanks!