Solr 5: First look
Installation
Setting up Solr5
Solr 5 is now a stand-alone service and it is no longer necessary to run it in a container like Tomcat. The advantage of this is that Tomcat does not have to be installed any longer, which simplifies maintaining and securing the server Solr is running on.
Installation is also much simpler:
- Download Solr from the download page
- Extract the archive (e.g. to solr5)
- Change into the directory (e.g. cd solr5)
- The bin directory contains a script, install_solr_service.sh, which must be run as root with the path to the downloaded archive as its first argument. Run it with: sudo ./bin/install_solr_service.sh <path-to-solr-5.0.0.tgz>
Installation defaults / options:
- The installation directory defaults to /opt/solr and is configurable with the '-i' option.
- The data directory defaults to /var/solr and is configurable with the '-d' option.
- The port defaults to 8983 and is configurable with the '-p' option.
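As an illustration of combining these options, the sketch below composes an installer call with all three defaults overridden and prints it for review instead of running it. The archive name, directories and port are hypothetical example values, not recommendations.

```shell
# Dry run: compose the installer call and print it instead of executing it.
# All four values are hypothetical examples.
archive="solr-5.0.0.tgz"    # path to the downloaded archive
install_dir="/opt/search"   # -i (default /opt/solr)
data_dir="/srv/solr"        # -d (default /var/solr)
port="8984"                 # -p (default 8983)

install_cmd="sudo ./bin/install_solr_service.sh $archive -i $install_dir -d $data_dir -p $port"
echo "$install_cmd"
```

Once the printed command looks right, it can be run from the extracted Solr directory.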
Notes:
- The script will install Solr 5 as a service, so that you can easily use sudo service solr start, sudo service solr stop, etc.
- The script also creates a 'solr'-user and -group which are used to run the server.
Tweaking the installation
The file solr.in.sh lives in the data directory (default /var/solr). This script is run as part of the Solr startup procedure and is the place where the most important variables for Solr are set.
For example, autocommit options that are handy for debugging and development can be set here ('Why do I have to wait 2 minutes before something is in the index...'):
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoCommit.maxTime=60000"
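Both values are in milliseconds, so the lines above ask for a soft commit after 3 seconds and a hard commit after 60 seconds. If you script this change, a small helper can keep it from being appended twice. The sketch below is only an illustration: add_solr_opt is a hypothetical helper name, and the demo writes to a scratch file rather than to the real solr.in.sh.

```shell
# add_solr_opt FILE OPTION: append a SOLR_OPTS line to FILE unless that
# exact option string is already present somewhere in the file.
add_solr_opt() {
    file="$1"
    opt="$2"
    if ! grep -qF -- "$opt" "$file"; then
        # Single quotes keep $SOLR_OPTS literal in the written line.
        printf 'SOLR_OPTS="$SOLR_OPTS %s"\n' "$opt" >> "$file"
    fi
}

# Demo on a scratch file; the second call is a no-op.
demo_file="$(mktemp)"
add_solr_opt "$demo_file" "-Dsolr.autoSoftCommit.maxTime=3000"
add_solr_opt "$demo_file" "-Dsolr.autoSoftCommit.maxTime=3000"
cat "$demo_file"
```

Against a real installation the first argument would be the solr.in.sh in the data directory, and the service needs a restart before the new values are used.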
The directory in which the new create core command acts can also be set here by editing the SOLR_HOME variable, e.g.:
SOLR_HOME=/var/solr/cores
To use this, make sure that solr.xml is present in that directory, that the solr user owns the directory, and that the service is restarted:
sudo mkdir -p /var/solr/cores
sudo cp <solr install dir>/server/solr/solr.xml /var/solr/cores/
sudo chown -R solr:solr /var/solr
sudo service solr restart
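Before restarting, it can help to check the two preconditions mentioned above. The following is a minimal sketch, assuming a POSIX shell with ls and awk available; check_solr_home is a hypothetical helper name and not part of Solr.

```shell
# check_solr_home DIR USER: succeed only if DIR contains solr.xml and the
# directory itself is owned by USER.
check_solr_home() {
    dir="$1"
    user="$2"
    if [ ! -f "$dir/solr.xml" ]; then
        echo "missing $dir/solr.xml" >&2
        return 1
    fi
    # The third column of 'ls -ld' is the owner of the directory.
    owner="$(ls -ld "$dir" | awk '{print $3}')"
    if [ "$owner" != "$user" ]; then
        echo "$dir is owned by $owner, expected $user" >&2
        return 1
    fi
    echo "$dir looks ready"
}

# Example call for the layout used above:
#   check_solr_home /var/solr/cores solr
```

Note that this only looks at the top-level directory; it does not walk the tree the way chown -R does.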
Creating cores
The solr executable in the bin directory can create cores, so copying existing cores and adding them to solr.xml is no longer necessary. Furthermore, Solr 5 uses core discovery to detect the cores, so settings like these:
<cores adminPath="/admin/cores" defaultCoreName="solrdev">
<core name="xxx" instanceDir="cores/xxx" />
<core name="yyy" instanceDir="cores/yyy" />
</cores>
in solr.xml are no longer present.
To create a new core (from the Solr 5 installation dir) use:
sudo bin/solr create -c devel1
This command will fail with an error:
Failed to create core 'devel1' due to: Error CREATEing SolrCore 'devel1': Unable to create core [devel1] Caused by: /var/solr/data/devel1/data
due to permission errors. Fix them with:
sudo chown -R solr:solr /var/solr/
and rerun the command:
sudo bin/solr create -c devel1
If you now point your browser to
http://localhost:8983/solr/#/devel1
you will see the new core.
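The same check can be done from the command line through the CoreAdmin STATUS action. The sketch below only builds and prints the URL; core_status_url is a hypothetical helper name, the base URL assumes the default port, and the commented curl call assumes curl is installed and Solr is running.

```shell
# core_status_url CORE [BASE]: print the CoreAdmin STATUS URL for one core.
# BASE defaults to a local Solr on the default port.
core_status_url() {
    core="$1"
    base="${2:-http://localhost:8983/solr}"
    printf '%s/admin/cores?action=STATUS&core=%s&wt=json\n' "$base" "$core"
}

core_status_url devel1
# With Solr running, fetch the status:
#   curl -s "$(core_status_url devel1)"
```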
Customizing your schema
Drupal schema
When using the 4.x schema from the Drupal ApacheSolr module only a few points prevent the Solr core from running:
Line 99:
<fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/>
Line 122:
<fieldType name="date" class="solr.TrieDateField" sortMissingLast="true" omitNorms="true"/>
Both 'FloatField' and 'DateField' are no longer supported in Solr 5:
The following legacy numeric and date field types, deprecated in Solr 4.8, are no longer supported: BCDIntField, BCDLongField, BCDStrField, IntField, LongField, FloatField, DoubleField, SortableIntField, SortableLongField, SortableFloatField, SortableDoubleField, and DateField. Convert these types in your schema to the corresponding Trie-based field type and then re-index. See SOLR-5936 for more information.
When both fields are changed to their Trie-based variants, the core will start, which is not to say that it runs optimally!
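The class swap can be sketched as a small sed filter. This is an illustration only: to_trie_types is a hypothetical helper name, it covers just the two classes discussed here and none of the other legacy types from the list, and it does not add tuning attributes such as precisionStep.

```shell
# to_trie_types: read a schema on stdin and write it to stdout with the two
# legacy classes replaced by their Trie-based counterparts.
to_trie_types() {
    sed -e 's/class="solr\.FloatField"/class="solr.TrieFloatField"/' \
        -e 's/class="solr\.DateField"/class="solr.TrieDateField"/'
}

# Example on the pfloat line from the schema:
printf '%s\n' '<fieldType name="pfloat" class="solr.FloatField" omitNorms="true"/>' | to_trie_types
```

For a whole file, something like to_trie_types < schema.xml > schema.solr5.xml keeps the original untouched so the result can be reviewed before it replaces the schema. As the quoted note says, the content has to be re-indexed afterwards.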
Furthermore, in solrconfig.xml solr.admin.AdminHandlers is deprecated; remove line 1044:
<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />
Also, the extraction and clustering libs are no longer in the same location (and are probably not necessary); remove lines 71 and 72:
<lib dir="${solr.contrib.dir:../../../contrib}/extraction/lib" />
<lib dir="${solr.contrib.dir:../../../contrib}/clustering/lib/" />
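Line numbers shift between versions of the Drupal configuration, so matching on content may be more robust than deleting by number. The following is a rough sketch, assuming each of the three elements sits on a single line as shown above; strip_solr4_leftovers is a hypothetical helper name, and the comment line in the demo is made up to show that unrelated lines survive.

```shell
# strip_solr4_leftovers: copy stdin to stdout, dropping the AdminHandlers
# request handler and the extraction/clustering <lib> lines.
strip_solr4_leftovers() {
    grep -v -e 'class="solr\.admin\.AdminHandlers"' \
            -e 'contrib}/extraction/lib' \
            -e 'contrib}/clustering/lib'
}

# Demo: only the made-up comment line is printed.
printf '%s\n' \
    '<lib dir="${solr.contrib.dir:../../../contrib}/extraction/lib" />' \
    '<!-- an unrelated line that should survive -->' \
    '<requestHandler name="/admin/" class="solr.admin.AdminHandlers" />' \
    | strip_solr4_leftovers
```

Writing the result to a new file and diffing it against the original solrconfig.xml is a cheap way to confirm that nothing else was removed.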
Future to-dos
Of course this is only a short first look, and we still have to look into things like security (obviously, securing the Solr server by IP via Tomcat is no longer an option) and performance. For now we are using Tika as the extraction service, but the location of the extraction libs is also something that should be fixed in the Drupal solrconfig.xml.