Stemming for better matching

Pair this blog with:

Hot Toddy
Recipe

Stemming automatically reduces words to their root form so that a search matches more variants of a word. Without it, some documents contain “frogs” and others contain “frog”, and a search for either term returns only the documents with that exact form rather than the full list. This is where stemming filters come in.

There are a few options here depending on what you want. None of them is perfect; if one were, we would all use it, right? What you need to consider is which languages you have to support and how aggressive you want your stemming algorithm to be.

All you have to do is add the appropriate filter to the fieldType in your schema.xml that you are searching against.

For instance:

    
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.KStemFilterFactory" /> <!-- Stemming -->
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.KStemFilterFactory" /> <!-- Stemming -->
      </analyzer>
    </fieldType>

You will need to reindex your documents so the stemmed terms are what is stored in the index. You also need the same filter in the query analyzer; otherwise a search for “frogs” gets no matches at all, because “frog” is now what is stored.
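
To see why the filter has to run at both index time and query time, here is a minimal sketch in plain Java. The naiveStem function is a hypothetical stand-in that only strips a trailing "s"; it is not KStem and not part of Solr, it only illustrates the mismatch between stemmed stored terms and an unstemmed query.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StemmingSketch {

    // Hypothetical toy stemmer: lowercases and strips one trailing "s".
    // Real stemmers such as KStem are far more careful than this.
    static String naiveStem(String token) {
        String lower = token.toLowerCase();
        if (lower.length() > 3 && lower.endsWith("s") && !lower.endsWith("ss")) {
            return lower.substring(0, lower.length() - 1);
        }
        return lower;
    }

    // Builds the set of terms stored for a document, stemmed at index time.
    static Set<String> indexTerms(List<String> tokens) {
        Set<String> terms = new HashSet<>();
        for (String token : tokens) {
            terms.add(naiveStem(token));
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> stored = indexTerms(List.of("Frogs", "jump"));

        // Query analyzer without stemming: the raw term is not in the index.
        System.out.println(stored.contains("frogs"));
        // Query analyzer with the same stemming: the term matches.
        System.out.println(stored.contains(naiveStem("frogs")));
    }
}
```

The first lookup misses and the second one hits, which is the same thing that happens when only the index analyzer has the stemming filter.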

Check out the official wiki to help you pick out the stemming that is most appropriate.

Posted in Solr | Leave a comment

Custom Similarity to disable coord in Solr Scoring

Pair this blog with:

Kentucky Mimosa
Recipe

Just a short blog on something I noticed while doing recency boosting. A little background first: we have a field on each document that is essentially a list of GUIDs corresponding to our content types. We wanted to give stronger weighting to different content types to help promote or demote results. The more clauses we added to our query, the less of an impact each one had, and we could see in the explain string that this was because the “coord” factor takes into account how many of the query's clauses each document matches out of how many the query contains.
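
As a rough illustration of the effect, Lucene's DefaultSimilarity computes coord as the number of matching clauses divided by the total number of clauses. The sketch below is plain Java with no Lucene dependency, and the clause counts are made-up numbers for illustration.

```java
final class CoordSketch {

    // Same formula as DefaultSimilarity.coord: matched clauses / total clauses.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // A document matching 2 clauses of a 2-clause query keeps its full score.
        System.out.println(coord(2, 2));
        // The same 2 matches in a 10-clause query are scaled down to a fifth.
        System.out.println(coord(2, 10));
    }
}
```

So every extra boosting clause that a document does not match shrinks the factor applied to its whole score.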

Custom Similarity

For us, it would be better if we could just remove this from the equation. So I did a little poking around and did not find any way to do that without creating your own Similarity.
Here’s the source code of this custom Similarity for Solr 4.5. I believe DefaultSimilarity was deprecated at some point, so in later versions of Solr you might need to extend another class.

package com.velir.lucene.similarities;

import org.apache.lucene.search.similarities.DefaultSimilarity;

public class NoCoordsDefaultSimilarity extends DefaultSimilarity {
    // Always return 1 so the coord factor no longer scales the score.
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0f;
    }
}

After you compile this Java code into a .jar, place it in your /dist folder. In my case it is called score-functions-1.0-SNAPSHOT.jar. You will then need to reference it in your solrconfig.xml, which in my case looks like this:

  <lib dir="../../../dist/" regex="score-functions-1.0-SNAPSHOT.jar" />

Then, in your schema.xml, add the following at the end of the schema, right above the closing </schema> tag.

  <similarity class="com.velir.lucene.similarities.NoCoordsDefaultSimilarity"></similarity>

You will notice now that “coord” is no longer in your explain text. There is a lot more you can do with this but here is the simplest of examples.


Sitecore, Solr, and using FunctionQuery

Pair this blog with:

Whiskey Sour

Last week I went over getting the Explain data back from Solr but I also hinted at adding “_val_” into Solr’s schema.xml while we were at it. This blog will show some of what we can do with that.

Recap

Just a recap of what was covered in the last blog: we want to add the _val_ field name to the schema.xml like so:

<schema name="example" version="1.5">
  <fields>
    <field name="_val_" type="string" />
    <field name="explain" type="string" />
    ...
  </fields>
  ...
</schema>

The reason for this is that we want to provide the field explicitly, so Sitecore does not try to do its dynamic field creation and name it something like _val_t (which will not work for us).

What’s Next?

Now that we have the field, what can we do with it?  This is a special field name in Solr that will allow you to use FunctionQuery methods to do some pretty interesting things, like recency boosting.  For example:

Let’s say we have a field content_types_sm which has an array of GUIDs representing content types of a document.  Let’s say for the GUID 772a9f3669634367b5a06e3e26228dcd we want to apply a recency boost but for any other we don’t want to impact it.  You can use something like this in Solr:

(((content_types_sm:(772a9f3669634367b5a06e3e26228dcd) AND (_val_:(recip\(ms\(NOW\/HOUR,globaldate_tdt\),3.16e\-11,1,1\)))^10) 
OR (content_types_sm:(19c9456595ed41d39a2e9cd1943e9344) AND (_val_:(recip\(ms\(NOW\/HOUR,globaldate_tdt\),3.16e\-11,1,1\)))^10)
OR (_val_:(1))))

What can I use in this _val_ field?

_val_ has to evaluate to a number, and the recip method will return just that. This is a pretty well documented method, but the idea is that it will return a number between 0 and 1 based on the value of globaldate_tdt compared to the current datetime. On that same note, you can give further arbitrary boosting based on a value by providing whatever number you want like so:

((content_types_sm:(772a9f3669634367b5a06e3e26228dcd) AND (_val_:(5))^10) OR (_val_:(1)))
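
For reference, Solr documents recip(x,m,a,b) as a / (m*x + b). Here is a small sketch in plain Java of what the recip call used earlier works out to, where x is the document age in milliseconds, m is 3.16e-11, and a and b are both 1. The ages below are illustrative assumptions, not values from a real index.

```java
final class RecipSketch {

    static final double MS_PER_DAY = 24 * 60 * 60 * 1000.0;

    // recip(x, m, a, b) = a / (m * x + b), as described in the Solr function query docs.
    static double recip(double x, double m, double a, double b) {
        return a / (m * x + b);
    }

    // Boost for a document of the given age, using the constants from the recip call.
    static double recencyBoost(double ageInDays) {
        return recip(ageInDays * MS_PER_DAY, 3.16e-11, 1, 1);
    }

    public static void main(String[] args) {
        System.out.println(recencyBoost(0));    // brand new document: 1.0
        System.out.println(recencyBoost(365));  // about one year old: close to 0.5
        System.out.println(recencyBoost(3650)); // about ten years old: close to 0.09
    }
}
```

Since 3.16e-11 is roughly one divided by the number of milliseconds in a year, a document loses about half of its boost after one year.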

The reason for the final OR is to catch anything that does NOT match the content types check I am making, otherwise it will filter out items that do not contain that content type. But this was just my situation, not necessarily going to affect anyone else. So now that you see how to do it in Solr, how do you do it in Sitecore?

A small sample:

// Recency boost driven by a function query
var boost = PredicateBuilder.True<SearchPageResult>();
boost = boost.And(x => x.ContentType.Contains(contentItem.Id));
boost = boost.And(x => x._val_.Equals("recip(ms(NOW/HOUR,globaldate_tdt),3.16e-11,1,1)").Boost(10f));

// Fixed, arbitrary boost value
var fixedBoost = PredicateBuilder.True<SearchPageResult>();
fixedBoost = fixedBoost.And(x => x.ContentType.Contains(contentItem.Id));
fixedBoost = fixedBoost.And(x => x._val_.Equals(4.3f).Boost(10f));

Where _val_ in my SearchPageResult is

[IndexField("_val_")]
public string _val_ { get; set; }

Wish I could provide more here but hopefully you get the idea.

Resources

Official FunctionQuery documentation


Sitecore, Solr, and Explain

Pair this blog with:

Jameson and Ginger

Wouldn’t it be pretty sweet if you could figure out why Solr search results appear the way they do when you don’t specify the sorting mechanism?  There is a way BUT it will take some work to get it up and running.  In the end, I am able to get an output similar to this next to each search result:

[Image: score tree]

This is a visual interpretation of the “explain” debug information you get from Solr.  This also brings up a number of questions I am sure.  “Where are you getting these values?” “How do I do this?” “Why is this useful?”  “What do all the numbers and text mean?”  Grab that recommended drink and let’s get started from the top.

Where are you getting these values?

Ever get back Solr results and wonder why the sorting ended up the way it did?  If you do not specify a sort, Solr orders the results by a relevance score it calculates for each document.  You can find explanations of the details on their site.  There is also the concept of “debugQuery” on a Solr query.  If you execute a query with that option on, you will get a “debug” element in your result which contains an “explain” node.  That node contains one explanation per document, keyed by the document's unique id.  For a search on “test”, one of them could look like this:

"sitecore://web/{bb4fe04b-c1d1-4436-b57b-cd5d4c9073d6}?lang=en&ver=1": "\n1.9663129 = (MATCH) weight(text:test in 360) [DefaultSimilarity], result of:\n 1.9663129 = fieldWeight in 360, product of:\n 2.4494898 = tf(freq=6.0), with freq of:\n 6.0 = termFreq=6.0\n 4.2813005 = idf(docFreq=810, maxDocs=21581)\n 0.1875 = fieldNorm(doc=360)\n"

How do I do this?

But this is not going to be very useful in the end.  With the default Sitecore implementation, we cannot get to the “debug” element.  If we could get the individual “explain” text in each doc response that will certainly help.  There is good news, and bad news.  The good news is you CAN.  The bad news is there is some legwork involved.  I know.

In Solr you can supply the “[explain]” field in the “fl” parameter to get the desired result.  Your “fl” parameter will look something like “*, score, [explain]”.  Or a slightly cleaner version would be “*, score, explain:[explain]” which will yield a field on the response called “explain” rather than “[explain]”.

{
  "responseHeader": {
    "status": 0,
    "QTime": 1,
    "params": {
      "q": "text:test",
      "indent": "true",
      "fl": "*,score,explain:[explain]",
      "wt": "json",
      "_": "1506107781370"
    }
  },
  "response": {
    "numFound": 808,
    "start": 0,
    "maxScore": 1.9663129,
    "docs": [
      {
        ...
        "score": 1.9663129,
        "explain": "1.9663129 = (MATCH) weight(text:test in 360) [DefaultSimilarity], result of:\n  1.9663129 = fieldWeight in 360, product of:\n    2.4494898 = tf(freq=6.0), with freq of:\n      6.0 = termFreq=6.0\n    4.2813005 = idf(docFreq=810, maxDocs=21581)\n    0.1875 = fieldNorm(doc=360)\n"
      },
      ...
    ]
  }
}
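
If you want to try this outside the Solr dashboard, the request is an ordinary select query with the fl parameter URL-encoded. The sketch below only builds the URL (Java 10 or later); the host, port, and core name are placeholders I made up, so substitute your own.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class ExplainUrlSketch {

    // Builds a select URL that asks Solr for all fields, the score,
    // and the per-document explain text aliased to "explain".
    static String buildSelectUrl(String baseUrl, String query) {
        String fl = "*,score,explain:[explain]";
        return baseUrl + "/select"
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&fl=" + URLEncoder.encode(fl, StandardCharsets.UTF_8)
                + "&wt=json";
    }

    public static void main(String[] args) {
        // Hypothetical host and core name, for illustration only.
        System.out.println(buildSelectUrl("http://localhost:8983/solr/my_core", "text:test"));
    }
}
```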

You are probably thinking to yourself, “This is exactly what I want. Time to go try it myself”.  You are right.  You can certainly play around with this in your Solr dashboard and marvel at the new field that gives you all sorts of scoring data, BUT what if you want this inside of Sitecore?  That is going to depend on your version of Sitecore and here is why.  The field list is hardcoded deep in the Sitecore.ContentSearch.SolrProvider.dll.  That’s ok, we should be able to override this list pretty easily right?  Nope 😦 but I have done a lot of the legwork so hopefully it helps out someone else.  I’ll get back to that.

Why is this useful?

If you want to understand in detail why your results come back in the order they do, this is exactly what you are looking for.  It tells you what was taken into account, what had the most impact, and whether any boosting was involved.  Boosting gives a higher weight to that part of the formula, so it has more impact on the overall score.
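
To turn the raw explain text into something you can render as a tree, one option is to parse it by indentation. This is a minimal sketch in plain Java, not the code behind my own output. It assumes every non-empty line has the form "value = description" and that nesting is expressed purely by leading spaces, as in the sample above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ExplainTreeSketch {

    // One line of the explain output: its score value, description, and children.
    static final class Node {
        final int depth;
        final float value;
        final String description;
        final List<Node> children = new ArrayList<>();

        Node(int depth, float value, String description) {
            this.depth = depth;
            this.value = value;
            this.description = description;
        }
    }

    // Parses indentation-based explain text into a tree and returns the first root.
    static Node parse(String explain) {
        Node root = null;
        Deque<Node> stack = new ArrayDeque<>();
        for (String line : explain.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            int depth = 0;
            while (depth < line.length() && line.charAt(depth) == ' ') {
                depth++;
            }
            // Split only on the first " = " so descriptions like "termFreq=6.0" stay intact.
            String[] parts = line.trim().split(" = ", 2);
            Node node = new Node(depth, Float.parseFloat(parts[0]), parts[1]);

            // Pop until the top of the stack is less indented than this line.
            while (!stack.isEmpty() && stack.peek().depth >= depth) {
                stack.pop();
            }
            if (stack.isEmpty()) {
                if (root == null) {
                    root = node;
                }
            } else {
                stack.peek().children.add(node);
            }
            stack.push(node);
        }
        return root;
    }

    // Prints the tree with two spaces of indentation per level.
    static void print(Node node, String indent) {
        System.out.println(indent + node.value + "  " + node.description);
        for (Node child : node.children) {
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) {
        String explain = "1.9663129 = (MATCH) weight(text:test in 360) [DefaultSimilarity], result of:\n"
                + "  1.9663129 = fieldWeight in 360, product of:\n"
                + "    2.4494898 = tf(freq=6.0), with freq of:\n"
                + "      6.0 = termFreq=6.0\n"
                + "    4.2813005 = idf(docFreq=810, maxDocs=21581)\n"
                + "    0.1875 = fieldNorm(doc=360)\n";
        print(parse(explain), "");
    }
}
```

Each node keeps its own value, so a UI can show which branch contributed the most to the final score.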

What do all the numbers and text mean?

I will get to this on a follow up blog if anyone is curious.

What we are all here for (the code)

There is also an added bonus here of _val_ which I can cover in a later blog.  This will allow you to do some really cool FunctionQuery methods like recency boosting on a date.  You can add it while you are putting in the explain string so you can be ready for the next blog.
Disclaimer: This was from Sitecore 7.1 and you will need to reference the DLLs for your version of Sitecore to do this. I used dotPeek to copy out the decompiled classes and this is the result.

schema.xml (solr)
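<schema name="example" version="1.5">
  <fields>
    <field name="_val_" type="string" />
    <field name="explain" type="string" />
    ...
  </fields>
  ...
</schema>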
SolrSearchResults.cs
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.Linq.Common;
using Sitecore.ContentSearch.Linq.Methods;
using Sitecore.ContentSearch.Pipelines.IndexingFilters;
using Sitecore.ContentSearch.Security;
using Sitecore.ContentSearch.SolrProvider;
using SolrNet;
using System;
using System.Collections.Generic;
using System.Linq;

namespace [Project].Library.CustomSitecore.ContentSearch.SearchProvider
{
	internal struct SolrSearchResults<TElement>
	{
		private readonly SearchContext context;
		private readonly SolrQueryResults<Dictionary<string, object>> searchResults;
		private readonly SolrIndexConfiguration solrIndexConfiguration;
		private readonly IIndexDocumentPropertyMapper<Dictionary<string, object>> mapper;
		private readonly SelectMethod selectMethod;
		private readonly IEnumerable<IFieldQueryTranslator> virtualFieldProcessors;
		private readonly int numberFound;

		public int NumberFound
		{
			get
			{
				return this.numberFound;
			}
		}

		public SolrSearchResults(SearchContext context, SolrQueryResults<Dictionary<string, object>> searchResults, SelectMethod selectMethod, IEnumerable<IFieldQueryTranslator> virtualFieldProcessors)
		{
			this.context = context;
			this.solrIndexConfiguration = (SolrIndexConfiguration)this.context.Index.Configuration;
			this.mapper = this.solrIndexConfiguration.IndexDocumentPropertyMapper;
			this.selectMethod = selectMethod;
			this.virtualFieldProcessors = virtualFieldProcessors;
			this.numberFound = searchResults.NumFound;
			this.searchResults = ApplySecurity(searchResults, context.SecurityOptions, ref this.numberFound);
		}

		private static SolrQueryResults<Dictionary<string, object>> ApplySecurity(SolrQueryResults<Dictionary<string, object>> solrQueryResults, SearchSecurityOptions options, ref int numberFound)
		{
			if (!options.HasFlag(SearchSecurityOptions.DisableSecurityCheck))
			{
				HashSet<Dictionary<string, object>> dictionarySet = new HashSet<Dictionary<string, object>>();
				foreach (Dictionary<string, object> dictionary in solrQueryResults.Where(searchResult => searchResult != null))
				{
					object obj1;
					if (dictionary.TryGetValue("_uniqueid", out obj1))
					{
						object obj2;
						dictionary.TryGetValue("_datasource", out obj2);
						if (OutboundIndexFilterPipeline.CheckItemSecurity(new OutboundIndexFilterArgs((string)obj1, (string)obj2)))
						{
							dictionarySet.Add(dictionary);
							numberFound = numberFound - 1;
						}
					}
				}
				foreach (Dictionary<string, object> dictionary in dictionarySet)
					solrQueryResults.Remove(dictionary);
			}
			return solrQueryResults;
		}

		public TElement ElementAt(int index)
		{
			if (index < 0 || index >= this.searchResults.Count)
				throw new IndexOutOfRangeException();
			return this.mapper.MapToType<TElement>(this.searchResults[index], this.selectMethod, this.virtualFieldProcessors, this.context.SecurityOptions);
		}

		public TElement ElementAtOrDefault(int index)
		{
			if (index < 0 || index >= this.searchResults.Count)
				return default(TElement);
			return this.mapper.MapToType<TElement>(this.searchResults[index], this.selectMethod, this.virtualFieldProcessors, this.context.SecurityOptions);
		}

		public bool Any()
		{
			return this.numberFound > 0;
		}

		public long Count()
		{
			return (long)this.numberFound;
		}

		public TElement First()
		{
			if (this.searchResults.Count < 1)
				throw new InvalidOperationException("Sequence contains no elements");
			return this.ElementAt(0);
		}

		public TElement FirstOrDefault()
		{
			if (this.searchResults.Count < 1)
				return default(TElement);
			return this.ElementAt(0);
		}

		public TElement Last()
		{
			if (this.searchResults.Count < 1)
				throw new InvalidOperationException("Sequence contains no elements");
			return this.ElementAt(this.searchResults.Count - 1);
		}

		public TElement LastOrDefault()
		{
			if (this.searchResults.Count < 1)
				return default(TElement);
			return this.ElementAt(this.searchResults.Count - 1);
		}

		public TElement Single()
		{
			if (this.Count() < 1L)
				throw new InvalidOperationException("Sequence contains no elements");
			if (this.Count() > 1L)
				throw new InvalidOperationException("Sequence contains more than one element");
			return this.mapper.MapToType<TElement>(this.searchResults[0], this.selectMethod, this.virtualFieldProcessors, this.context.SecurityOptions);
		}

		public TElement SingleOrDefault()
		{
			if (this.Count() == 0L)
				return default(TElement);
			if (this.Count() == 1L)
				return this.mapper.MapToType<TElement>(this.searchResults[0], this.selectMethod, this.virtualFieldProcessors, this.context.SecurityOptions);
			throw new InvalidOperationException("Sequence contains more than one element");
		}

		public IEnumerable<SearchHit<TElement>> GetSearchHits()
		{
			foreach (Dictionary<string, object> searchResult in this.searchResults)
			{
				float score = -1f;
				object scoreObj;
				if (searchResult.TryGetValue("score", out scoreObj) && scoreObj is float)
					score = (float)scoreObj;
				yield return new SearchHit<TElement>(score, this.mapper.MapToType<TElement>(searchResult, this.selectMethod, this.virtualFieldProcessors, this.context.SecurityOptions));
			}
		}

		public IEnumerable<TElement> GetSearchResults()
		{
			foreach (Dictionary<string, object> searchResult in this.searchResults)
				yield return this.mapper.MapToType<TElement>(searchResult, this.selectMethod, this.virtualFieldProcessors, this.context.SecurityOptions);
		}

		public Dictionary<string, ICollection<KeyValuePair<string, int>>> GetFacets()
		{
			IDictionary<string, ICollection<KeyValuePair<string, int>>> facetFields = this.searchResults.FacetFields;
			IDictionary<string, IList<Pivot>> facetPivots = this.searchResults.FacetPivots;
			Dictionary<string, ICollection<KeyValuePair<string, int>>> dictionary = facetFields.ToDictionary(x => x.Key, x => x.Value);
			if (facetPivots.Count > 0)
			{
				foreach (KeyValuePair<string, IList<Pivot>> keyValuePair in facetPivots)
					dictionary[keyValuePair.Key] = this.Flatten(keyValuePair.Value, string.Empty);
			}
			return dictionary;
		}

		private ICollection<KeyValuePair<string, int>> Flatten(IEnumerable<Pivot> pivots, string parentName)
		{
			HashSet<KeyValuePair<string, int>> keyValuePairSet = new HashSet<KeyValuePair<string, int>>();
			foreach (Pivot pivot in pivots)
			{
				if (parentName != string.Empty)
					keyValuePairSet.Add(new KeyValuePair<string, int>(parentName + "/" + pivot.Value, pivot.Count));
				if (pivot.HasChildPivots)
					keyValuePairSet.UnionWith(this.Flatten(pivot.ChildPivots, pivot.Value));
			}
			return keyValuePairSet;
		}

	}
}
SearchContext.cs
using Microsoft.Practices.ServiceLocation;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Diagnostics;
using Sitecore.ContentSearch.Linq.Common;
using Sitecore.ContentSearch.Security;
using Sitecore.ContentSearch.SolrProvider;
using Sitecore.Diagnostics;
using SolrNet;
using SolrNet.Commands.Parameters;
using SolrNet.Exceptions;
using SolrNet.Impl;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

namespace [Project].Library.CustomSitecore.ContentSearch.SearchProvider
{
	public class SearchContext : IProviderSearchContext
	{
		private readonly SolrSearchIndex _index;
		private readonly SearchSecurityOptions _securityOptions;

		public ISearchIndex Index
		{
			get
			{
				return (ISearchIndex)this._index;
			}
		}

		public SearchSecurityOptions SecurityOptions
		{
			get
			{
				return this._securityOptions;
			}
		}

		public SearchContext(SolrSearchIndex index, SearchSecurityOptions options = SearchSecurityOptions.EnableSecurityCheck)
		{
			Assert.ArgumentNotNull((object)index, "index");
			Assert.ArgumentNotNull((object)options, "options");
			this._index = index;
			_securityOptions = options;
		}

		public IQueryable<TItem> GetQueryable<TItem>() where TItem : new()
		{
			return this.GetQueryable<TItem>(null);
		}

		public IQueryable<TItem> GetQueryable<TItem>(IExecutionContext executionContext) where TItem : new()
		{
			CustomLinqToSolrIndex<TItem> linqToSolrIndex = new CustomLinqToSolrIndex<TItem>(this, executionContext);

			return linqToSolrIndex.GetQueryable();
		}

		public IEnumerable<SearchIndexTerm> GetTermsByFieldName(string fieldName, string filter)
		{
			SolrSearchFieldConfiguration fieldConfiguration = this._index.Configuration.FieldMap.GetFieldConfiguration(fieldName.ToLowerInvariant()) as SolrSearchFieldConfiguration;
			if (fieldConfiguration != null)
				fieldName = fieldConfiguration.FormatFieldName(fieldName, this.Index.Schema, (string)null);
			TermsParameters termsParameters = new TermsParameters(fieldName) { Sort = TermsSort.Count };
			if (!string.IsNullOrEmpty(filter))
				termsParameters.Prefix = filter;
			HashSet<SearchIndexTerm> searchIndexTermSet = new HashSet<SearchIndexTerm>();
			try
			{
				var solrOperations = ServiceLocator.Current.GetInstance<ISolrOperations<Dictionary<string, object>>>(_index.Core);
				AbstractSolrQuery all = SolrQuery.All;
				QueryOptions queryOptions = new QueryOptions();
				queryOptions.Terms = termsParameters;
				queryOptions.Rows = new int?(0);
				QueryOptions options = queryOptions;
				foreach (TermsResult termsResult in solrOperations.Query(all, options).Terms)
					foreach (KeyValuePair<string, int> keyValuePair in termsResult.Terms)
					{
						KeyValuePair<string, int> term = keyValuePair;
						searchIndexTermSet.Add(new SearchIndexTerm(term.Key, () => term.Value));
					}
				return searchIndexTermSet;
			}
			catch (Exception ex)
			{
				if (!(ex is SolrConnectionException) && !(ex is SolrNetException))
				{
					throw;
				}
				string message = ex.Message;
				if (ex.Message.StartsWith("<?xml"))
				{
					XmlDocument xmlDocument = new XmlDocument();
					xmlDocument.LoadXml(ex.Message);
					XmlNode xmlNode1 = xmlDocument.SelectSingleNode("/response/lst[@name='error'][1]/str[@name='msg'][1]");
					XmlNode xmlNode2 = xmlDocument.SelectSingleNode("/response/lst[@name='responseHeader'][1]/lst[@name='params'][1]/str[@name='q'][1]");
					if (xmlNode1 != null && xmlNode2 != null)
					{
						SearchLog.Log.Error(string.Format("Solr Error : [\"{0}\"] - Term Query attempted: [{1}]", (object)xmlNode1.InnerText, (object)xmlNode2.InnerText));
						return searchIndexTermSet;
					}
				}
				Log.Error(message, (object)this);
				return searchIndexTermSet;
			}
		}

		public void Dispose()
		{
		}
	}
}
DebugExplainSearchIndex.cs
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Maintenance;
using Sitecore.ContentSearch.Security;
using Sitecore.ContentSearch.SolrProvider;

namespace [Project].Library.CustomSitecore.ContentSearch.SearchProvider
{
	public class DebugExplainSearchIndex : SolrSearchIndex
	{
		public DebugExplainSearchIndex(string name, string core, IIndexPropertyStore propertyStore) : base(name, core, propertyStore)
		{
		}

		public override IProviderSearchContext CreateSearchContext(SearchSecurityOptions options = SearchSecurityOptions.EnableSecurityCheck)
		{
			return new SearchContext(this, options);
		}
	}
}
CustomLinqToSolrIndex.cs
using Microsoft.Practices.ServiceLocation;
using Sitecore.Configuration;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Diagnostics;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.Linq.Common;
using Sitecore.ContentSearch.Linq.Methods;
using Sitecore.ContentSearch.Linq.Nodes;
using Sitecore.ContentSearch.Linq.Solr;
using Sitecore.ContentSearch.Pipelines.GetFacets;
using Sitecore.ContentSearch.Pipelines.ProcessFacets;
using Sitecore.ContentSearch.Security;
using Sitecore.ContentSearch.SolrProvider;
using Sitecore.ContentSearch.SolrProvider.Logging;
using Sitecore.ContentSearch.Utilities;
using Sitecore.Diagnostics;
using SolrNet;
using SolrNet.Commands.Parameters;
using SolrNet.Exceptions;
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;
using System.Xml;

namespace [Project].Library.CustomSitecore.ContentSearch.SearchProvider
{
	public class CustomLinqToSolrIndex<TItem> : SolrIndex<TItem>
	{
		private readonly SearchContext context;
		private readonly string cultureCode;

		public CustomLinqToSolrIndex(SearchContext context, IExecutionContext executionContext)
			: base(new SolrIndexParameters(context.Index.Configuration.IndexFieldStorageValueFormatter, context.Index.Configuration.VirtualFieldProcessors, context.Index.FieldNameTranslator, executionContext))
		{
			Assert.ArgumentNotNull(context, "context");
			this.context = context;
			CultureExecutionContext executionContext1 = Parameters.ExecutionContext as CultureExecutionContext;
			CultureInfo culture = executionContext1 == null ? CultureInfo.GetCultureInfo(Settings.DefaultLanguage) : executionContext1.Culture;
			cultureCode = culture.TwoLetterISOLanguageName;
			((SolrFieldNameTranslator)Parameters.FieldNameTranslator).AddCultureContext(culture);
		}

		public override TResult Execute<TResult>(SolrCompositeQuery compositeQuery)
		{
			if (typeof(TResult).IsGenericType && typeof(TResult).GetGenericTypeDefinition() == typeof(SearchResults<>))
			{
				Type genericArgument = typeof(TResult).GetGenericArguments()[0];
				SolrQueryResults<Dictionary<string, object>> solrQueryResults = Execute(compositeQuery, genericArgument);
				Type type = typeof(SolrSearchResults<>).MakeGenericType(genericArgument);
				MethodInfo methodInfo = GetType().GetMethod("ApplyScalarMethods", BindingFlags.Instance | BindingFlags.NonPublic).MakeGenericMethod(typeof(TResult), genericArgument);
				SelectMethod selectMethod = GetSelectMethod(compositeQuery);
				object instance = Activator.CreateInstance(type, context as object, solrQueryResults as object, selectMethod as object, compositeQuery.VirtualFieldProcessors as object);
				return (TResult)methodInfo.Invoke(this, new object[]
				{
					compositeQuery,
					instance,
					solrQueryResults
				});
			}
			SolrQueryResults<Dictionary<string, object>> solrQueryResults1 = Execute(compositeQuery, typeof(TResult));
			SelectMethod selectMethod1 = GetSelectMethod(compositeQuery);
			SolrSearchResults<TResult> processedResults = new SolrSearchResults<TResult>(context, solrQueryResults1, selectMethod1, compositeQuery.VirtualFieldProcessors);
			return ApplyScalarMethods<TResult, TResult>(compositeQuery, processedResults, solrQueryResults1);
		}

		public override IEnumerable<TElement> FindElements<TElement>(SolrCompositeQuery compositeQuery)
		{
			SolrQueryResults<Dictionary<string, object>> searchResults = this.Execute(compositeQuery, typeof(TElement));
			List<SelectMethod> list = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Select).Select(m => (SelectMethod)m).ToList();
			SelectMethod selectMethod = list.Count == 1 ? list[0] : null;
			return new SolrSearchResults<TElement>(this.context, searchResults, selectMethod, compositeQuery.VirtualFieldProcessors).GetSearchResults();
		}

		internal SolrQueryResults<Dictionary<string, object>> Execute(SolrCompositeQuery compositeQuery, Type resultType)
		{
			QueryOptions options = new QueryOptions();
			if (compositeQuery.Methods != null)
			{
				List<SelectMethod> list1 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Select).Select(m => (SelectMethod)m).ToList();
				if (list1.Any())
				{
					foreach (string str in list1.SelectMany(selectMethod => selectMethod.FieldNames as IEnumerable<string>))
						options.Fields.Add(str.ToLowerInvariant());
					if (!this.context.SecurityOptions.HasFlag((Enum)SearchSecurityOptions.DisableSecurityCheck))
					{
						options.Fields.Add("_uniqueid");
						options.Fields.Add("_datasource");
					}
				}
				List<GetResultsMethod> list2 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.GetResults).Select(m => (GetResultsMethod)m).ToList();
				if (list2.Any())
				{
					if (options.Fields.Count > 0)
					{
						options.Fields.Add("score");
						options.Fields.Add("explain:[explain]");
					}
					else
					{
						options.Fields.Add("*");
						options.Fields.Add("score");
						options.Fields.Add("explain:[explain]");
					}
				}
				List<OrderByMethod> list3 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.OrderBy).Select(m => (OrderByMethod)m).ToList();
				if (list3.Any())
				{
					foreach (OrderByMethod orderByMethod in list3)
					{
						string field = orderByMethod.Field;
						options.AddOrder(new SortOrder(field, orderByMethod.SortDirection == SortDirection.Ascending ? Order.ASC : Order.DESC));
					}
				}
				List<SkipMethod> list4 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Skip).Select(m => (SkipMethod)m).ToList();
				if (list4.Any())
				{
					int num = list4.Sum(skipMethod => skipMethod.Count);
					options.Start = new int?(num);
				}
				List<TakeMethod> list5 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Take).Select(m => (TakeMethod)m).ToList();
				if (list5.Any())
				{
					int num = list5.Sum(takeMethod => takeMethod.Count);
					options.Rows = new int?(num);
				}
				List<CountMethod> list6 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Count).Select(m => (CountMethod)m).ToList();
				if (compositeQuery.Methods.Count == 1 && list6.Any())
					options.Rows = new int?(0);
				List<AnyMethod> list7 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Any).Select(m => (AnyMethod)m).ToList();
				if (compositeQuery.Methods.Count == 1 && list7.Any())
					options.Rows = new int?(0);
				List<GetFacetsMethod> list8 = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.GetFacets).Select(m => (GetFacetsMethod)m).ToList();
				if (compositeQuery.FacetQueries.Count > 0 && (list8.Any() || list2.Any()))
				{
					foreach (FacetQuery hash in GetFacetsPipeline.Run(new GetFacetsArgs(null, compositeQuery.FacetQueries, this.context.Index.Configuration.VirtualFieldProcessors, this.context.Index.FieldNameTranslator)).FacetQueries.ToHashSet())
					{
						if (hash.FieldNames.Any())
						{
							int? minimumResultCount = hash.MinimumResultCount;
							if (hash.FieldNames.Count() == 1)
							{
								SolrFieldNameTranslator fieldNameTranslator = this.FieldNameTranslator as SolrFieldNameTranslator;
								string str = hash.FieldNames.First();
								if (fieldNameTranslator != null && str == fieldNameTranslator.StripKnownExtensions(str) && this.context.Index.Configuration.FieldMap.GetFieldConfiguration(str) == null)
									str = fieldNameTranslator.GetIndexFieldName(str.Replace("__", "!").Replace("_", " ").Replace("!", "__"), true);
								options.AddFacets(new SolrFacetFieldQuery(str)
								{
									MinCount = minimumResultCount
								} as ISolrFacetQuery);
							}
							if (hash.FieldNames.Count() > 1)
								options.AddFacets(new SolrFacetPivotQuery()
								{
									Fields = new string[1]
									{
										string.Join(",", hash.FieldNames)
									} as ICollection<string>,
									MinCount = minimumResultCount
								} as ISolrFacetQuery);
						}
					}
					if (!list2.Any())
						options.Rows = 0;
				}
			}
			if (compositeQuery.Filter != null)
				options.AddFilterQueries(compositeQuery.Filter as ISolrQuery);
			options.AddFilterQueries(new SolrQueryByField("_indexname", context.Index.Name) as ISolrQuery);
			if (!Settings.DefaultLanguage.StartsWith(this.cultureCode))
				options.AddFilterQueries(new SolrQueryByField("_language", cultureCode + "*")
				{
					Quoted = false
				} as ISolrQuery);
			SolrLoggingSerializer loggingSerializer = new SolrLoggingSerializer();
			string q = loggingSerializer.SerializeQuery(compositeQuery.Query);
			SolrSearchIndex index = this.context.Index as SolrSearchIndex;
			try
			{
				if (!options.Rows.HasValue)
					options.Rows = ContentSearchConfigurationSettings.SearchMaxResults;
				SearchLog.Log.Info("Query - " + q);
				SearchLog.Log.Info("Serialized Query - ?q=" + q + "&" + string.Join("&", loggingSerializer.GetAllParameters(options).Select(p => string.Format("{0}={1}", p.Key, p.Value)).ToArray()));

				var solrOperations = ServiceLocator.Current.GetInstance<ISolrOperations<Dictionary<string, object>>>(index.Core);
				return solrOperations.Query(q, options);
			}
			catch (Exception ex)
			{
				if (!(ex is SolrConnectionException) && !(ex is SolrNetException))
				{
					throw;
				}
				else
				{
					string message = ex.Message;
					if (ex.Message.StartsWith("<?xml"))
					{
						XmlDocument xmlDocument = new XmlDocument();
						xmlDocument.LoadXml(ex.Message);
						XmlNode xmlNode1 = xmlDocument.SelectSingleNode("/response/lst[@name='error'][1]/str[@name='msg'][1]");
						XmlNode xmlNode2 = xmlDocument.SelectSingleNode("/response/lst[@name='responseHeader'][1]/lst[@name='params'][1]/str[@name='q'][1]");
						if (xmlNode1 != null && xmlNode2 != null)
						{
							SearchLog.Log.Error(string.Format("Solr Error : [\"{0}\"] - Query attempted: [{1}]", xmlNode1.InnerText, xmlNode2.InnerText));
							return new SolrQueryResults<Dictionary<string, object>>();
						}
					}
					Log.Error(message, (object)this);
					return new SolrQueryResults<Dictionary<string, object>>();
				}
			}
		}

		private TResult ApplyScalarMethods<TResult, TDocument>(SolrCompositeQuery compositeQuery, SolrSearchResults<TDocument> processedResults, SolrQueryResults<Dictionary<string, object>> results)
		{
			QueryMethod queryMethod = compositeQuery.Methods.First();
			object obj;
			switch (queryMethod.MethodType)
			{
				case QueryMethodType.All:
					obj = true;
					break;
				case QueryMethodType.Any:
					obj = processedResults.Any();
					break;
				case QueryMethodType.Count:
					obj = processedResults.Count();
					break;
				case QueryMethodType.ElementAt:
					obj = !((ElementAtMethod)queryMethod).AllowDefaultValue ? processedResults.ElementAt(((ElementAtMethod)queryMethod).Index) : processedResults.ElementAtOrDefault(((ElementAtMethod)queryMethod).Index) as object;
					break;
				case QueryMethodType.First:
					obj = !((FirstMethod)queryMethod).AllowDefaultValue ? processedResults.First() : processedResults.FirstOrDefault() as object;
					break;
				case QueryMethodType.Last:
					obj = !((LastMethod)queryMethod).AllowDefaultValue ? processedResults.Last() : processedResults.LastOrDefault() as object;
					break;
				case QueryMethodType.Single:
					obj = !((SingleMethod)queryMethod).AllowDefaultValue ? (object)processedResults.Single() : processedResults.SingleOrDefault();
					break;
				case QueryMethodType.GetResults:
					IEnumerable searchHits = processedResults.GetSearchHits();
					FacetResults facetResults = FormatFacetResults(processedResults.GetFacets(), compositeQuery.FacetQueries);
					obj = Activator.CreateInstance(typeof(TResult), searchHits as object, processedResults.NumberFound as object, facetResults as object);
					break;
				case QueryMethodType.GetFacets:
					obj = FormatFacetResults(processedResults.GetFacets(), compositeQuery.FacetQueries);
					break;
				default:
					throw new InvalidOperationException("Invalid query method");
			}
			return (TResult)System.Convert.ChangeType(obj, typeof(TResult));
		}

		private FacetResults FormatFacetResults(Dictionary<string, ICollection<KeyValuePair<string, int>>> facetResults, List<FacetQuery> facetQueries)
		{
			SolrFieldNameTranslator fieldNameTranslator = context.Index.FieldNameTranslator as SolrFieldNameTranslator;
			IDictionary<string, ICollection<KeyValuePair<string, int>>> dictionary = ProcessFacetsPipeline.Run(new ProcessFacetsArgs(facetResults, facetQueries, facetQueries, this.context.Index.Configuration.VirtualFieldProcessors, fieldNameTranslator));
			foreach (FacetQuery facetQuery in facetQueries)
			{
				FacetQuery originalQuery = facetQuery;
				if (originalQuery.FilterValues != null && originalQuery.FilterValues.Any() && dictionary.ContainsKey(originalQuery.CategoryName))
				{
					// Keep only the facet values the query explicitly asked for.
					ICollection<KeyValuePair<string, int>> source = dictionary[originalQuery.CategoryName];
					dictionary[originalQuery.CategoryName] = source.Where(cv => originalQuery.FilterValues.Contains(cv.Key)).ToList();
				}
			}
			FacetResults facetResults1 = new FacetResults();
			foreach (KeyValuePair<string, ICollection<KeyValuePair<string, int>>> keyValuePair in dictionary)
			{
				if (fieldNameTranslator != null)
				{
					var key = keyValuePair.Key;
					string name;
					if (key.Contains(","))
						name = fieldNameTranslator.StripKnownExtensions(key.Split(new char[1]
						{
							','
						}, StringSplitOptions.RemoveEmptyEntries));
					else
						name = fieldNameTranslator.StripKnownExtensions(key);
					var values = keyValuePair.Value.Select(v => new FacetValue(v.Key, v.Value));
					facetResults1.Categories.Add(new FacetCategory(name, values));
				}
			}
			return facetResults1;
		}

		private static SelectMethod GetSelectMethod(SolrCompositeQuery compositeQuery)
		{
			var list = compositeQuery.Methods.Where(m => m.MethodType == QueryMethodType.Select).Select(m => (SelectMethod)m).ToList();
			return list.Count != 1 ? null : list.First();
		}
	}
}
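
The catch block above pulls the error message and the attempted query out of the XML body that Solr sends back. Here is a minimal, standalone sketch of just that step, using the same two XPath expressions. The class name `SolrErrorParser` and the method `TryParse` are made up for illustration and are not part of Sitecore or SolrNet. Unlike the code above, the sketch also swallows malformed XML instead of letting `LoadXml` throw.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of Sitecore or SolrNet.
public static class SolrErrorParser
{
	// Returns (msg, q) from a Solr XML error body, or null when the body
	// is not XML, is malformed, or is missing either node.
	public static Tuple<string, string> TryParse(string body)
	{
		if (string.IsNullOrEmpty(body) || !body.StartsWith("<?xml"))
			return null;

		var doc = new XmlDocument();
		try
		{
			doc.LoadXml(body);
		}
		catch (XmlException)
		{
			// Malformed XML: the caller can fall back to logging the raw text.
			return null;
		}

		// Same XPath expressions as the catch block above.
		XmlNode msg = doc.SelectSingleNode("/response/lst[@name='error'][1]/str[@name='msg'][1]");
		XmlNode q = doc.SelectSingleNode("/response/lst[@name='responseHeader'][1]/lst[@name='params'][1]/str[@name='q'][1]");
		if (msg == null || q == null)
			return null;

		return Tuple.Create(msg.InnerText, q.InnerText);
	}
}
```

Calling `SolrErrorParser.TryParse(ex.Message)` and checking for null would replace the nested `if` blocks, at the cost of one extra type.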

Sitecore.ContentSearch.Solr.Indexes.config

Change the Solr provider type from SolrSearchIndex to your new provider.

Added bonus: a parser I wrote and use to turn the Explain result into tree-view markup.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace [Project].Library.Utilities
{
	public static class TreeViewFromExplain
	{
		private const string MainRegex =
			"(?'score'[\\d\\.]+) = (\\(MATCH\\))?(?'text'.*)";

		private const string SplitRegex = "\\n[^\\)]";

		public static string Convert(string explain)
		{
			if (string.IsNullOrEmpty(explain)) return string.Empty;

			var lines = Regex.Split(explain, SplitRegex);

			TreeViewNode treeNodes = null;
			TreeViewNode lastNode = null;

			foreach (var line in lines)
			{
				if (treeNodes == null)
				{
					treeNodes = new TreeViewNode(line);
					lastNode = treeNodes;
					continue;
				}

				var childNode = new TreeViewNode(line);

				// Same level child
				if (lastNode.Offset == childNode.Offset)
				{
					lastNode.Parent.AddChild(childNode);
				}

				// A child
				if (lastNode.Offset < childNode.Offset)
				{
					lastNode.AddChild(childNode);
				}

				// A new higher item
				if (childNode.Offset < lastNode.Offset)
				{
					var parent = lastNode.Parent;

					while (parent.Offset != childNode.Offset)
					{
						parent = parent.Parent;
					}

					parent.Parent.AddChild(childNode);
				}

				lastNode = childNode;
			}

			return ConvertTreeViewNodeListToMarkup(treeNodes);
		}

		private static string ConvertTreeViewNodeListToMarkup(TreeViewNode treeNode)
		{
			var markup = new StringBuilder();
			// NOTE: the HTML tags inside these string literals were lost when the
			// post was rendered. The ul/li tags and the class attribute below are
			// an assumed reconstruction, not the original markup.
			markup.AppendLine("<ul>");
			markup.AppendLine(treeNode.Render().ToString());
			markup.AppendLine("</ul>");
			return markup.ToString();
		}

		public class TreeViewNode
		{
			public string Text { get; set; }
			public string Operation { get; set; }
			public int Offset { get; set; }
			public string Value { get; set; }
			public TreeViewNode Parent { get; set; }
			public List<TreeViewNode> Children { get; set; }

			public TreeViewNode(string text)
			{
				var trimText = text.TrimStart();
				var results = Regex.Match(trimText, MainRegex);
				Value = results.Groups["score"].Value;
				// Leading whitespace marks how deep the line sits in the tree.
				Offset = text.Length - trimText.Length;
				string modifiedText;
				Operation = ExplainOperation.Convert(results.Groups["text"].Value, out modifiedText);
				Text = modifiedText;
				Children = new List<TreeViewNode>();
			}

			public void AddChild(TreeViewNode child)
			{
				child.Parent = this;
				Children.Add(child);
			}

			public StringBuilder Render(StringBuilder sb = null, bool isLast = false)
			{
				if (sb == null) sb = new StringBuilder();
				sb.AppendLine(String.Format("<li class=\"{2}{3}\">{0} {1}", this.Value, this.Text, this.Operation, (isLast) ? " lastChild" : ""));
				if (Children.Any())
				{
					sb.AppendLine("<ul>");
					foreach (var child in Children)
					{
						child.Render(sb, (child == Children.Last()));
					}
					sb.AppendLine("</ul>");
				}
				sb.AppendLine("</li>");
				return sb;
			}
		}

		public static class ExplainOperation
		{
			public static string Sum = "sum";
			public static string Product = "product";
			public static string Result = "result";
			public static string WithFreq = "withfreq";
			public static string None = "";

			// Strips the operation suffix from the line and returns its name.
			public static string Convert(string explainLine, out string modifiedText)
			{
				if (explainLine.Contains(" sum of:"))
				{
					modifiedText = explainLine.Replace(" sum of:", "");
					return Sum;
				}
				if (explainLine.Contains(" product of:"))
				{
					modifiedText = explainLine.Replace(" product of:", "");
					return Product;
				}
				if (explainLine.Contains(" result of:"))
				{
					modifiedText = explainLine.Replace(" result of:", "");
					return Result;
				}
				if (explainLine.Contains(" with freq of:"))
				{
					modifiedText = explainLine.Replace(" with freq of:", "");
					return WithFreq;
				}
				modifiedText = explainLine;
				return None;
			}
		}
	}
}
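
To see what the parser works from, here is a small standalone sketch of the per-line step only: it reads the indentation offset, the score, and the remaining text from each explain line, using the same pattern as `MainRegex`. The class `ExplainLineDemo` is hypothetical, and it splits on plain newlines rather than on `SplitRegex`, so it is a simplification and not a drop-in replacement for the parser above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the regex pattern is taken from the parser above.
public static class ExplainLineDemo
{
	// Same pattern as MainRegex, written as a verbatim string.
	private const string LinePattern = @"(?'score'[\d\.]+) = (\(MATCH\))?(?'text'.*)";

	// Returns one (offset, score, text) tuple per explain line that matches.
	public static List<Tuple<int, string, string>> Parse(string explain)
	{
		var rows = new List<Tuple<int, string, string>>();
		foreach (var line in explain.Split('\n'))
		{
			var trimmed = line.TrimStart();
			if (trimmed.Length == 0) continue;

			var match = Regex.Match(trimmed, LinePattern);
			if (!match.Success) continue;

			// Leading whitespace count: deeper lines belong to the line above them.
			int offset = line.Length - trimmed.Length;
			rows.Add(Tuple.Create(offset, match.Groups["score"].Value, match.Groups["text"].Value.Trim()));
		}
		return rows;
	}
}
```

Feeding it a made-up two-line explain string such as `"0.5 = (MATCH) sum of:\n  0.3 = (MATCH) weight(title:frog)\n"` gives one row at offset 0 and one at offset 2. That difference in offset is what `TreeViewFromExplain.Convert` uses to decide whether a line is a child, a sibling, or a step back up the tree.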

    DEF First Impressions

    Pair this blog with:

    A nice craft beer

    DEF (Data Exchange Framework) is a module for Sitecore 8.1+ meant for retrieving data from one endpoint, transforming it, and sending it to another. Those endpoints can be anything, in fact. I started playing around with DEF and this is what I have found so far.

    • Robust system for moving data from one location to another
    • Complicated to set up at first
    • Detailed tutorial that is easy to follow

    Structure in Sitecore

    • [Tenant Name]
      • Found at sitecore/System/Data Exchange/[Tenant Name]
      • Data Access
        • Value Accessor Sets
          • For defining the fields on the source and destination
          • Referenced field by field by a pipeline, with the mapping done through the Value Mapping Set
      • Endpoints
        • Providers
          • Used to define the source/destination of the data
          • Uses an endpoint converter
      • Pipeline Batches
        • A reference to the pipeline(s) to run
        • With a pipeline batch selected, you can execute “run pipeline batch” from the ribbon
      • Pipelines
        • Actions to perform on the data, from reading data in to transforming to updating
        • Order matters when it comes to the sub items (steps) under the pipelines
        • Pipelines can reference others
      • Queues
      • Tenant Settings
      • Value Mapping Sets
        • Maps a property from one value accessor set to a property in another
        • This can be referenced in a pipeline step
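
To make the moving parts above a bit more concrete, here is a rough sketch in plain C# of how value mappings and ordered pipeline steps fit together. None of these types come from the Data Exchange Framework. `ValueMapping`, `PipelineContext`, and `PipelineSketch` are hypothetical stand-ins, and the endpoint is just an in-memory list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one entry in a value mapping set.
public class ValueMapping
{
	public string SourceField { get; set; }   // read through a source value accessor
	public string TargetField { get; set; }   // written through a destination value accessor
}

// Hypothetical state passed from one pipeline step to the next.
public class PipelineContext
{
	public readonly List<Dictionary<string, string>> SourceRows = new List<Dictionary<string, string>>();
	public readonly List<Dictionary<string, string>> TargetRows = new List<Dictionary<string, string>>();
}

public static class PipelineSketch
{
	// Step: read rows from the source endpoint (here, an in-memory list).
	public static Action<PipelineContext> Read(IEnumerable<Dictionary<string, string>> endpoint)
	{
		return ctx => ctx.SourceRows.AddRange(endpoint);
	}

	// Step: apply the value mappings to every row read so far.
	public static Action<PipelineContext> Map(IEnumerable<ValueMapping> mappings)
	{
		return ctx =>
		{
			foreach (var row in ctx.SourceRows)
			{
				var target = new Dictionary<string, string>();
				foreach (var m in mappings)
				{
					string value;
					if (row.TryGetValue(m.SourceField, out value))
						target[m.TargetField] = value;
				}
				ctx.TargetRows.Add(target);
			}
		};
	}

	// A pipeline is an ordered list of steps, run one after another.
	public static PipelineContext Run(IEnumerable<Action<PipelineContext>> steps)
	{
		var ctx = new PipelineContext();
		foreach (var step in steps)
			step(ctx);
		return ctx;
	}
}
```

Running `Read` and then `Map` produces mapped rows. Running `Map` first produces nothing, because no rows have been read yet, which is the same reason the order of the step items under a pipeline matters.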


    Sitecore Templates

    After installing DEF, you will have a number of templates installed. They can be found at sitecore/Templates/Data Exchange. From what I have encountered, any custom templates go in their own folder structure under sitecore/Templates/Data Exchange/Providers/[Name Here]. In the tutorial it is named “File System”. There are a number of prebuilt templates that can be extended to fit your needs.

    Note:

    When following the tutorial, there are a number of places where a TemplateID needs to be referenced in another template or in standard values.

    References

    Data Exchange Framework tutorial

    What is next?

    Over the next few blog posts, I am going to put together a POC for a business need I encountered.


    About Me – Blog Beginnings

    Greetings folks,

    I have been working with Sitecore for the last few years and have decided it is time to share information I have gathered up over my time here.  But, not yet.  Today it is about me.  ME.

    Kidding.
