Solr Streaming Expressions
Solr 5.1 introduced the Streaming API, and Solr 5.2 adds Streaming Expressions on top of it. If you have ever wondered how to run nested queries in Solr, or how to bring parallel computing to your searches, this could be the answer.
Streaming Expressions provide a simple query language for SolrCloud that merges search with parallel computing. Under the covers, Streaming Expressions are backed by a Java Streaming API that provides a fast map/reduce implementation for SolrCloud. Streaming Expressions are composed of functions. All functions behave like streams, which means that they process results one tuple at a time and don't hold all the data in memory at once. Read more about the basics here: https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions
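To make the "functions behave like streams" idea concrete, here is a small conceptual sketch in Python. It is not Solr code and does not call any Solr API: the helper names source, merge and unique are hypothetical Python stand-ins, and the two in-memory lists only pretend to be sorted results coming from two shards. The point is that each function wraps another stream and pulls one item at a time.

```python
import heapq
from typing import Dict, Iterable, Iterator

Doc = Dict[str, int]


def source(docs: Iterable[Doc]) -> Iterator[Doc]:
    """Hypothetical stand-in for one shard's results, assumed sorted by 'id'."""
    for doc in docs:
        yield doc


def merge(*streams: Iterator[Doc], key: str) -> Iterator[Doc]:
    """Lazily interleave already-sorted streams into one sorted stream."""
    return heapq.merge(*streams, key=lambda d: d[key])


def unique(stream: Iterator[Doc], key: str) -> Iterator[Doc]:
    """Drop consecutive duplicates; only the previous key value is remembered."""
    previous = object()  # sentinel that never equals a real key value
    for doc in stream:
        if doc[key] != previous:
            previous = doc[key]
            yield doc


# Illustrative data only: two "shards", each sorted by id.
shard_a = [{"id": 1}, {"id": 3}, {"id": 5}]
shard_b = [{"id": 2}, {"id": 3}, {"id": 6}]

# Functions nest the same way a streaming expression nests its functions.
pipeline = unique(merge(source(shard_a), source(shard_b), key="id"), key="id")
result = [doc["id"] for doc in pipeline]
print(result)  # [1, 2, 3, 5, 6]
```

Nothing is computed until the final list comprehension pulls from the pipeline, and at no point does unique or merge need the full result set in memory.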
Setup:
This assumes a Debian-based system, say Ubuntu 12.04 or 14.04. If you have not installed Solr 5.2 yet, grab the latest release (for example from http://apache.mirror1.spango.com/lucene/solr/5.2.1/) and extract it.
Set up Solr in cloud mode.
Cloud mode lets you create collections and nodes. See https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud for more details. To run the interactive cloud example, use,
bin/solr -e cloud
Enter the port and other details when prompted.
To start a single node instead, use,
bin/solr start -cloud -s example/cloud/node1/solr -p 8983
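Once the node is up, one way to confirm that it is really running in cloud mode is to ask the Collections API for the cluster status. The sketch below is a minimal Python example, assuming the node listens on localhost:8983 as in the command above; the function names cluster_status_url and fetch_cluster_status are made up for this post. Only the URL is built when the script runs; the actual HTTP call is left to you.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def cluster_status_url(host: str = "localhost", port: int = 8983) -> str:
    """Build the Collections API URL for the CLUSTERSTATUS action."""
    query = urlencode({"action": "CLUSTERSTATUS", "wt": "json"})
    return f"http://{host}:{port}/solr/admin/collections?{query}"


def fetch_cluster_status(host: str = "localhost", port: int = 8983,
                         timeout: float = 5.0) -> dict:
    """Call the node and return the parsed JSON response (needs a running node)."""
    with urlopen(cluster_status_url(host, port), timeout=timeout) as response:
        return json.load(response)


# Print the URL so it can also be pasted into a browser or curl.
print(cluster_status_url())
```

With a node running, calling fetch_cluster_status() should return a dictionary describing the cluster; if the node was not started in cloud mode, expect an error response instead.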
Streaming API:
Now comes the interesting part. The Streaming API offers the following functions: