Wednesday, December 23, 2015

Sitecore Lucene index and DateTime fields

[Sitecore 8.1]

DateTime field in Lucene index

I was trying to create an index search for an event calendar that would give me items (of a given template, etc.) that have a date field:
  • from today onwards (today included) 
  • up until today
The field in Sitecore is a date field (so no time indication), but our query seemed to have issues with the time indications. The code to create the predicate looks like this:

private Expression<Func<EventItem, bool>> GetDatePredicate(OverviewMode mode)
{
    var predicate = PredicateBuilder.True<EventItem>();
    switch (mode)
    {
        case OverviewMode.Future:
        {
            // Lower bound: local midnight today, converted to UTC.
            var minDate = DateTime.Today.ToUniversalTime();
            predicate = predicate.And(n => n.StartDate > minDate);
            break;
        }
        case OverviewMode.Past:
        {
            // Upper bound: local midnight today, converted to UTC.
            var maxDate = DateTime.Today.ToUniversalTime();
            var minDate = DateTime.MinValue.ToUniversalTime();
            predicate = predicate.And(n => n.StartDate < maxDate).And(n => n.StartDate > minDate);
            break;
        }
        default:
        {
            return null;
        }
    }
    return predicate;
}


This did not work correctly with events "today". We had to call "AddDays(-1)" on DateTime.Today before converting it to UTC. So why?

The first reason is that Sitecore stores its DateTimes in UTC, which differed by one hour from our local time. So our dates shifted a day back: "12/12/2015" becomes "12/11/2015 23:00". This is known and should be no issue, as we also shift to UTC in our predicate.
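
To make the shift concrete, here is a minimal sketch in plain .NET (no Sitecore involved). The fixed UTC+1 zone and the class name UtcShiftDemo are assumptions made for illustration, so that the result does not depend on the machine's own time zone:

```csharp
using System;
using System.Globalization;

public static class UtcShiftDemo
{
    // Hypothetical fixed UTC+1 zone (no daylight saving), used only for this demo.
    private static readonly TimeZoneInfo DemoZone =
        TimeZoneInfo.CreateCustomTimeZone("Demo UTC+1", TimeSpan.FromHours(1), "Demo UTC+1", "Demo UTC+1");

    // Treats the given value as local time in the demo zone and converts it to UTC.
    public static DateTime ToUtc(DateTime localDate)
    {
        var unspecified = DateTime.SpecifyKind(localDate, DateTimeKind.Unspecified);
        return TimeZoneInfo.ConvertTimeToUtc(unspecified, DemoZone);
    }

    public static void Main()
    {
        // A date-only value is midnight local time.
        var utc = ToUtc(new DateTime(2015, 12, 12));
        Console.WriteLine(utc.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture));
        // prints "2015-12-11 23:00"
    }
}
```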

But still, we did not get the correct results.

The logs

So we looked at the logs. Sitecore logs all requests in the Search log file. We saw that our predicate was translated into something like this:
"+(+date_from:[* TO 20151111t230000000z} +date_from:{00010101t000000000z TO *])"

Looks fine, but note that the "t" in the dates is lowercase. In my index, however, they are all uppercase. If I try the query in Luke, it does indeed give me the wrong results. When I alter the query in Luke to use an uppercase T, it works correctly.
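
The effect of the case difference can be reproduced outside Sitecore. The sketch below assumes that the range bounds are compared to the indexed terms as plain ordinal strings, and that the term layout is the one seen in the log (the format string and all names here are illustrative, not taken from Sitecore):

```csharp
using System;
using System.Globalization;

public static class TermCaseDemo
{
    // Assumed term layout, derived from the logged query (e.g. 20151111t230000000z).
    private const string TermFormat = "yyyyMMdd'T'HHmmssfff'Z'";

    // The value as it sits in the index: uppercase T and Z.
    public static string ToIndexedTerm(DateTime utc)
    {
        return utc.ToString(TermFormat, CultureInfo.InvariantCulture);
    }

    // The value as it appears in the generated query: lowercased.
    public static string ToQueryTerm(DateTime utc)
    {
        return ToIndexedTerm(utc).ToLowerInvariant();
    }

    // True when the indexed term is at or above the lower bound (ordinal comparison).
    public static bool IsAtOrAboveLowerBound(string indexedTerm, string lowerBound)
    {
        return string.CompareOrdinal(indexedTerm, lowerBound) >= 0;
    }

    public static void Main()
    {
        var todayUtc = new DateTime(2015, 12, 11, 23, 0, 0, DateTimeKind.Utc);
        var yesterdayUtc = todayUtc.AddDays(-1);

        // Today's event against today's bound: 'T' sorts before 't', so it is missed.
        Console.WriteLine(IsAtOrAboveLowerBound(ToIndexedTerm(todayUtc), ToQueryTerm(todayUtc)));          // False
        // Today's event against yesterday's bound: found.
        Console.WriteLine(IsAtOrAboveLowerBound(ToIndexedTerm(todayUtc), ToQueryTerm(yesterdayUtc)));      // True
        // Yesterday's event against yesterday's bound: still excluded.
        Console.WriteLine(IsAtOrAboveLowerBound(ToIndexedTerm(yesterdayUtc), ToQueryTerm(yesterdayUtc)));  // False
    }
}
```

Under these assumptions, moving the bound back one day includes today's events without letting yesterday's in, which would explain why the "AddDays(-1)" workaround appeared to work.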

Support, here we come!


Solution(s)

Support gave us 2 possible solutions, next to the one we already had (skipping a day).

1. Format

We could alter our index to use a format attribute:
<field fieldName="datefrom" storageType="YES" indexType="UNTOKENIZED" vectorType="NO" boost="1f" 
format="yyyyMMdd" type="System.DateTime" 
settingType="Sitecore.ContentSearch.LuceneProvider.LuceneSearchFieldConfiguration, Sitecore.ContentSearch.LuceneProvider"/>

After rebuilding our index, the "DateFrom" field values stored in the index will contain only dates (like "20151209"), so searching by date should return the expected results (since there are no "T" and "Z" symbols).
This works if you really don't need the times.
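
As a quick check of why the date-only format sidesteps the problem, this small sketch (plain .NET, illustrative names) formats a date with the same "yyyyMMdd" pattern and shows that lowercasing leaves the term unchanged:

```csharp
using System;
using System.Globalization;

public static class DateOnlyTermDemo
{
    // Formats a date with the same pattern as the format attribute above.
    public static string ToDateOnlyTerm(DateTime date)
    {
        return date.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        string term = ToDateOnlyTerm(new DateTime(2015, 12, 9));
        Console.WriteLine(term);                          // 20151209
        // Digits have no case, so a lowercased query term is identical.
        Console.WriteLine(term == term.ToLowerInvariant()); // True
    }
}
```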

2. Custom Converter

Another solution is to override the "Sitecore.ContentSearch.Converters.IndexFieldUtcDateTimeValueConverter" class so that dates are stored in lower case in the index.

Add your converter to the index config:
<converters hint="raw:AddConverter">
  ...
  <converter handlesType="System.DateTime" 
         typeConverter="YourNamespace.LowerCaseIndexFieldUtcDateTimeValueConverter, YourAssembly" />
  ...
</converters>

As a result, all dates should be stored in the index in lower case. As the search query is in lower case, all expected results should be found.
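
Support did not hand us a ready-made class in this post, so the following is only a rough sketch of the idea. To keep it runnable without Sitecore, it uses a stand-in base class (StandInUtcDateTimeConverter, hypothetical) instead of the real IndexFieldUtcDateTimeValueConverter, and it assumes the real class can be extended through the standard System.ComponentModel.TypeConverter "ConvertTo" override. Check that against your Sitecore version before relying on it:

```csharp
using System;
using System.ComponentModel;
using System.Globalization;

// Stand-in for Sitecore's IndexFieldUtcDateTimeValueConverter (hypothetical, illustration only).
// It produces an uppercase term in the layout seen in the index.
public class StandInUtcDateTimeConverter : TypeConverter
{
    public override object ConvertTo(ITypeDescriptorContext context, CultureInfo culture, object value, Type destinationType)
    {
        if (value is DateTime && destinationType == typeof(string))
        {
            return ((DateTime)value).ToUniversalTime()
                .ToString("yyyyMMdd'T'HHmmssfff'Z'", CultureInfo.InvariantCulture);
        }
        return base.ConvertTo(context, culture, value, destinationType);
    }
}

// The customization itself: let the base class do the work, then lowercase string results.
public class LowerCaseIndexFieldUtcDateTimeValueConverter : StandInUtcDateTimeConverter
{
    public override object ConvertTo(ITypeDescriptorContext context, CultureInfo culture, object value, Type destinationType)
    {
        object result = base.ConvertTo(context, culture, value, destinationType);
        var text = result as string;
        return text != null ? text.ToLowerInvariant() : result;
    }
}

public static class ConverterDemo
{
    public static void Main()
    {
        var converter = new LowerCaseIndexFieldUtcDateTimeValueConverter();
        var date = new DateTime(2015, 12, 11, 23, 0, 0, DateTimeKind.Utc);
        Console.WriteLine(converter.ConvertTo(null, CultureInfo.InvariantCulture, date, typeof(string)));
        // prints "20151211t230000000z"
    }
}
```

In a real project the class would derive from the Sitecore converter rather than the stand-in, and an index rebuild is needed before the lowercased values show up.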


Future solution

Currently, search queries are always generated in lower case and this behavior is not configurable: the "LowercaseExpandedTerms" property of the "Lucene.Net.QueryParsers.QueryParser" class is always set to true, which lowercases the terms in a search query string. A feature request for the product was made so that this can be considered for future implementations. That should make these tweaks unnecessary.

Wednesday, December 16, 2015

Sitecore WFFM 8.1 and multilingual save actions

WFFM Save actions

Every Sitecore developer who has had the 'pleasure' of working with WFFM knows about save actions, and probably also knows that save actions are shared by default. And so: not multilingual. In some cases this is no issue, but if you want to send an email to your visitor you might want to do that in their own language.

KB : Solution 2

A solution for this issue is given in https://kb.sitecore.net/articles/040124. Most people use "Solution 2":

Apply the following customization:
  • Navigate to /sitecore/templates/Web Forms for Marketers/Form
  • Uncheck "Shared" checkbox for the Save Action field.
After this change, you must add Save Actions to each language version of the form item. This means that each language of the form item will keep its own list of Save Actions.

The error

This works up until Sitecore 8.0, but when we tried this in Sitecore 8.1 with WFFM 8.1 we got an error.

When we go to a form and switch to another language, this nice error appears. Apparently the reason is simple: by making the save action field un-shared, we caused empty values in some languages for that field. Sounds very logical indeed.

But Sitecore does not expect an empty value in that field; it expects some XML.

The fix (workaround)

  1. Go to an affected form (in a working language)
  2. Switch to "Raw values"
  3. Open a needed language version of a form item.
  4. Insert the following value into the Save Actions field:
<?xml version="1.0" encoding="utf-16"?>
 <li xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <g id="{E5EABB1F-40BC-45BB-8D87-3B6C239B521B}" displayName="Actions" 
     onclick="javascript:return scForm.postEvent(this,event,'forms:addaction')" />
 </li>
Switch back to normal view (and save). This xml will insert an empty save actions block and you are good to go.

Monday, December 7, 2015

Sitecore Lucene index with integers

The situation

We recently discovered an issue when using a facet on an integer field in a Sitecore (8.1) Lucene index. We had a number of articles (items) with a date field. We had to query these items, order them by date and determine the number of items in each year.

The code

We created a ComputedField "year" and filled it with the year part of the date:
var dateTime = ((DateField)publicationDateField).DateTime;
return dateTime.Year;
We added the field to a custom index, and created an entry in the fieldmap to mark it as System.Int32. We rebuilt the index, checked the contents with Luke and all was fine. So we created a class based on SearchResultItem to use for the query:

class NewsItem : SearchResultItem
{
    [IndexField("title")]
    public string Title { get; set; }

    [IndexField("publication date")]
    public DateTime Date { get; set; }

    [IndexField("category")]
    public Guid Category { get; set; }

    [IndexField("year")]
    public int PublicationYear { get; set; }
}

The query

When we use this class for querying, we get no results when filtering on the year. Apparently integer fields need to be tokenized to be used in searches (indexType="TOKENIZED"). Sounds weird, as this is surely not true for text fields, but the NumericField constructor makes it clear:

Lucene.Net.Documents.NumericField.NumericField(string name, int precisionStep, Field.Store store, bool index) : base(name, store, index ? Field.Index.ANALYZED_NO_NORMS : Field.Index.NO, Field.TermVector.NO)

So, we changed the field in the fieldmap and set it to tokenized. We added an analyzer to prevent the integer from being cut into parts (Lucene.Net.Analysis.KeywordAnalyzer or Sitecore.ContentSearch.LuceneProvider.Analyzers.LowerCaseKeywordAnalyzer).

Success?

Yeah! We have results! We got the news items for 2015! And 2014. But... there is always a but, or this post would be too easy. We still needed a facet. And there it went wrong. The facet resulted in this:


Not what we expected actually...

So back to our query and index. Sitecore Support found out that this happens because of the specific way numeric fields are indexed by Lucene: they are indexed not as single tokens but as a trie structure, with several terms per value at different precisions (http://lucene.apache.org/core/2_9_4/api/all/org/apache/lucene/document/NumericField.html).

Unfortunately, Sitecore cannot do faceting on such fields at this moment - this is now logged as a bug.

The Solution

The solution was actually very simple. We threw the field out of the fieldmap and changed the int in our NewsItem to a string. If we want to use the values as integers we need to parse them afterwards, but for now we don't even need that.
Luckily for us, even the sorting doesn't care, as our ints are four-digit years and those sort the same way as strings. So we were set: queries are working and facets are fine.
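
A small sketch of why the string version is good enough here. It runs in memory on plain strings (it does not use the Sitecore facet API, and the names are made up for the example): four-digit years sort identically as strings and as numbers, and parsing afterwards is a one-liner when an int is needed:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class YearStringDemo
{
    // Sorts year strings ordinally; for four-digit years this equals numeric order.
    public static List<string> SortYears(IEnumerable<string> years)
    {
        return years.OrderBy(y => y, StringComparer.Ordinal).ToList();
    }

    // Counts items per year, parsing the string to an int afterwards.
    public static Dictionary<int, int> CountPerYear(IEnumerable<string> years)
    {
        return years
            .GroupBy(y => int.Parse(y, CultureInfo.InvariantCulture))
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var years = new[] { "2015", "2013", "2014", "2015" };
        Console.WriteLine(string.Join(",", SortYears(years))); // 2013,2014,2015,2015
        Console.WriteLine(CountPerYear(years)[2015]);          // 2
    }
}
```

Note that this only holds while all values have the same number of digits; mixed-length numbers would need zero padding to sort correctly as strings.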