Tuesday, January 14, 2014

Sitecore 7 and GroupBy method for Solr.


Story

If you have tried implementing Solr with Sitecore 7, you have probably run into the lack of a GroupBy method. In our project we needed to retrieve the top N content items grouped by type (template). Next to each type header we needed to display a count indicating how many items of that type we have in total. The only way to implement this using the out-of-the-box Sitecore 7 SolrProvider is to make one call to get all items, split them into groups, calculate the total for each group, and take the top N from each. For systems that have thousands of content items this is not really a feasible production solution. We are building one of these systems: the number of content items we run searches against is roughly 10,000. The primary reason for using Solr in our solution was speed. I was not willing to give it up easily, so here is the result of my battle to implement a GroupBy method for the Sitecore 7 Solr provider.

Solution

The solution comes down to two steps:

  • Sending a request to Solr with "group by" parameters;
  • Receiving the results from Solr and parsing them into an object that contains all the necessary values.

Sending the request

In the solution code on GitHub you'll find a class called SearchManager (/Managers/SearchManager.cs). This is where the whole thing starts. The GroupedSearch<TItem> method generates a query and makes the request. In the body of this method you'll see the following line:

var group = query.GroupResults(context, g => g.TemplateId, numberofItemsPerGroupForRelatedWidgets);

GroupResults is the new method that makes a request to Solr with groupby parameters and returns the results as an ExtendedSearchResults<TSource> object. ExtendedSearchResults is very similar to the SearchResults class, but since SearchResults is sealed, I couldn't inherit from it and had to create a copy with a few new properties.

The next class to inspect is SolrQueryExtensions. This is the place where I'm adding the groupby parameters to the Solr query.
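For orientation, here is roughly what a grouped request looks like at the Solr level. This is a sketch using Solr's standard Result Grouping parameters, not a dump of what SolrQueryExtensions actually sends. The field name _template (the field Sitecore normally uses for the template ID) and the limit of 5 are assumptions for illustration.

```text
q=*:*
group=true
group.field=_template
group.limit=5
group.ngroups=true
```

group.field is the field to group on, group.limit is the number of documents returned per group (the "top N"), and group.ngroups asks Solr to also report how many distinct groups matched.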

Once the query is updated to what we need, the request can be executed:

linqToSolr.Execute<ExtendedSearchResults<TSource>>(extendedQuery);

The class responsible for generating the request URL is called CustomLinqToSolrIndex. Once again, I couldn't extend the existing Sitecore LinqToSolrIndex class, because the method that performs all the response parsing logic is internal. And once again, I created a copy of the existing Sitecore class with a few changes and called it CustomLinqToSolrIndex.

Processing Response

The logic that processes the response from Solr and generates the ExtendedSearchResults object can be found in the same CustomLinqToSolrIndex class. The GetExtendedResults<TResult, TDocument> method is where all the magic happens.
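To give an idea of what has to be parsed, this is the general shape of a grouped Solr response in its default grouped format. It is shown as JSON for readability; the provider may well receive the XML equivalent. The field name _template, the IDs, and all the numbers are made-up placeholders, not results from our system.

```json
{
  "grouped": {
    "_template": {
      "matches": 100,
      "ngroups": 2,
      "groups": [
        {
          "groupValue": "template-id-a",
          "doclist": {
            "numFound": 60,
            "start": 0,
            "docs": [
              { "_name": "Example item 1" },
              { "_name": "Example item 2" }
            ]
          }
        },
        {
          "groupValue": "template-id-b",
          "doclist": {
            "numFound": 40,
            "start": 0,
            "docs": [
              { "_name": "Example item 3" }
            ]
          }
        }
      ]
    }
  }
}
```

Each entry in groups carries the group value (here the template), the top documents for that group in doclist.docs, and the total number of matching documents in doclist.numFound, which is the count we want to show next to each type header.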

Conclusion

This solution might not be perfect. I would much prefer to truly extend the Sitecore Solr provider to implement this method, but for now it seems to be working pretty well.