Filtering and sorting search results by date range with OpenSearch
How to make use of OpenSearch (and Elasticsearch) built-in data-types specifically designed for filtering and sorting date ranges.
by
lee.rowlands
/ 5 June 2024
It's fairly common to have content with a date range. E.g., events or content that is only applicable for a given date.
Sorting and filtering results by date can be complicated, especially if you have to support features such as multiple date ranges and optional end dates.
But OpenSearch (and Elasticsearch) have data types specifically designed for this use case.
The key here is to index your data using a Range field type.
Let's consider an example we recently built for NSW.gov.au. The requirements for the field were as follows:
- Index a grant content-type
- The grant content-type has a date-range field
- The field has optional end dates via the Optional End Date module
- The field is multi value
- There is also a boolean 'Grant is ongoing' field that flags the grant as 'Evergreen'
The filtering requirements include a 'Status' field, which has options for:
- Open - the current date must fall inside one of the date-range values
- Closed - the current date is after date-range values
- Opening soon - the current date falls inside a future date range
Additionally, the search results should be sorted in the following order when no keywords exist:
- Open
- Opening soon
- Closed
Defining a date range field in Drupal/Search API
The first step is to inform Search API about the date-range data type. How you do this will depend on whether you're using Elasticsearch connector or Search API OpenSearch.
If you're using Search API OpenSearch, the good news is that it is already supported in the 2.x branch.
If you're using Elasticsearch connector, you should implement hook_elasticsearch_connector_supported_data_types_alter
and add date range as a data-type - something like this
/** * Implements hook_elasticsearch_connector_supported_data_types_alter(). */function YOURMODULE_elasticsearch_connector_supported_data_types_alter(array &$data_types) { if (!in_array('date_range', $data_types)) { $data_types[] = 'date_range'; }}
If you're using Elasticsearch connector, you also need a processor plugin to put the field values into the index. You can take inspiration from how Search API OpenSearch achieves this.
Indexing Drupal data
Once you have those pieces in place, you simply need to configure your date field to use the Date Range data type in the Search API index.
Constructing the queries
With the data in your index, the only thing left is to construct your queries to make use of the date range data-type.
In these examples, we'll assume your date-range field is called 'application_dates', that the field is multi-value and that the end date is optional.
We will use 1710201396000 as the current timestamp (in epoch microseconds).
Filtering to 'open'
This one is the simplest. You need a new term
query that uses the current date
{ "from": 0, "size": 10, "query": { "bool": { "must": [ { "term": { "application_dates": 1710201396000 } } ] } }}
For a date-range field, the use of a term query will match any document where the given value is between the start and end of the range.
Filtering to 'closed'
This one is a bit more involved. We need to combine some conditions
{ "from": 0, "size": 10, "query": { "bool": { "minimum_should_match": 1, "should": { "bool": { "minimum_should_match": 1, "must_not": [ { "range": { "application_dates": { "gte": 1710201396000 } } }, { "term": { "grant_is_ongoing": true } } ], "should": [ { "range": { "application_dates": { "relation": "WITHIN", "lte": 1710201396000 } } } ] } } } }}
First, we have a should, which is an OR query in OpenSearch parlance. We're saying the grant must not have any future open dates (gte: 1710201396000) AND must not be flagged as ongoing. Then we're saying all of the application dates should be in the past (lte: 1710201396000).
Filtering to 'opening soon'
Here again, we need multiple conditions.
{ "from": 0, "size": 10, "query": { "bool": { "must": { "range": { "application_dates": { "relation": "WITHIN", "gte": 1710201396000 } } }, "must_not": [ { "term": { "application_dates": 1710201396000 } }, { "term": { "grant_is_ongoing": true } } ] } }}
First, we are saying there must be at least one date where the date range is in the future (gte: 1710201396000) and that the grant must not be ongoing OR have any date ranges that match the current date (term: 1710201396000) (i.e. are open).
Sorting
Sorting is a little more complex. We can make use of filters in our sort expressions.
To do this, we also need to index our date-range as an object so we can use nesting.
We duplicate the application_dates
field to an application_dates_sort
field and make this use the nested
data-type. The mapping will look like this
{ "application_dates_sort": { "type": "nested", "properties": { "gte": { "type": "long" }, "lte": { "type": "long" } } }}
Then, we can make use of this in our sort expressions. E.g. to have open items appear first, we can use
{ "from": 0, "sort": [ { "application_dates_sort.lte": { "order": "asc", "mode": "min", "nested": { "path": "application_dates_sort", "filter": { "bool": { "must": [ { "range": { "application_dates_sort.lte": { "gte": 1710201396000 } } }, { "range": { "application_dates_sort.gte": { "lte": 1710201396000 } } } ] } } }, "missing": "_last" } } ], "size": 10}
What we're doing here is sorting by the opening date, ascending, but using a filter to only match on records where the current date falls between the start/end date. We're telling OpenSearch to put any objects that don't match (missing
) at the end.
We can stack sort expressions like this — subsequent expressions will apply to those that were 'missing' from the first sort. Combining these lets us achieve the desired sort order of opening -> opening soon -> closed.
Wrapping up
OpenSearch is pretty powerful. But to make the most of its features, you need to ensure your data is mapped into the right data-type. Using date-range fields for storing date-ranges takes a little bit of extra planning, but it leads to much more efficient queries and a better experience for your users.
Tagged