Solr will use Highlighter instead of FastVectorHighlighter warning

Hi, I'm developing a Rails app with the Solr 4.1 search engine.

When I add highlighting to a search, Solr starts spamming the tomcat6 log with this warning:

Jan 29, 2015 12:13:38 PM org.apache.solr.highlight.DefaultSolrHighlighter useFastVectorHighlighter
WARNING: Solr will use Highlighter instead of FastVectorHighlighter because *Field_Name* field does not store TermPositions and TermOffsets.

Example of my field in schema.xml:

<field name="name" type="text" indexed="true" stored="true" multiValued="true"/>

What I found in the documentation:

The Standard Highlighter is the swiss-army knife of the highlighters. It has the most sophisticated and fine-grained query representation of the three highlighters. For example, this highlighter is capable of providing precise matches even for advanced queryparsers such as the surround parser. It does not require any special datastructures such as termVectors, although it will use them if they are present. If they are not, this highlighter will re-analyze the document on-the-fly to highlight it. This highlighter is a good choice for a wide variety of search use-cases.

FastVector Highlighter

The FastVector Highlighter requires term vector options (termVectors, termPositions, and termOffsets) on the field, and is optimized with that in mind. It tends to work better for more languages than the Standard Highlighter, because it supports Unicode breakiterators. On the other hand, its query-representation is less advanced than the Standard Highlighter: for example it will not work well with the surround parser. This highlighter is a good choice for large documents and highlighting text in a variety of languages.

FastVector highlighting also provides a faster search: http://solr.pl/en/2011/06/13/solr-3-1-fastvectorhighlighting/.

But what is the difference in configuration between Highlighting and FastVectorHighlighting?

And will users see a difference in search results when I change Highlighting to FastVectorHighlighting?

Is all I need to do to turn on FastVectorHighlighting to add termVectors="on" termPositions="on" termOffsets="on" to each field in schema.xml? Like:

<field name="name" type="text" indexed="true" stored="true" multiValued="true" termVectors="on" termPositions="on" termOffsets="on"/>

I also found this problem in the Solr issue tracker: https://issues.apache.org/jira/browse/SOLR-5544

But I still don't know how to fix the WARNING, because my log file is growing by 500 MB each second! It is critical, because the search server will stop if there is no free space left on the volume.

Please help.

Retrospective answered 28/1, 2015 at 16:54 Comment(10)
Do you want to fix which highlighter it uses, or that your log gets spammed? - Liberticide
@Liberticide The problem is only in the logs. It is critical, because it is on a production server. I want to fix the log file spamming. - Retrospective
@Liberticide As I understood from the Solr 4.1 documentation, the only configuration difference between Highlighter and FastVectorHighlighter is adding termVectors="on" termPositions="on" termOffsets="on" to each field in schema.xml? And users don't see a difference in search results whether Highlighter or FastVectorHighlighter is turned on (except for search speed). Am I right? - Retrospective
Partly. The FastVectorHighlighter performs better and may deliver slightly different results. But you cannot just change the schema and restart the server. If you change this, you are required to re-create the index ... - Liberticide
@chefe So to turn on FastVectorHighlighting, all I need is to add three attributes to each field and rebuild the indexes? In Solr 4.1, do I need to add termVectors="on" termPositions="on" termOffsets="on" or termVectors="true" termPositions="true" termOffsets="true" to each field? - Retrospective
@Liberticide Recreating the index is bad, because search stops working for a few hours. OK, I'll keep the default highlighter in this case, but how can I fix the problem with the mad Tomcat logs? It is very critical and has a higher priority for me... - Retrospective
I am not familiar with Rails, but this SO question looks interesting: #4976675 - Liberticide
@Liberticide I think the problem is not in the Rails configs, because Solr is located on an external server, which is configured by Chef. This warning is written to the Tomcat log file var/log/tomcat6/catalina.2015-01-29.log. - Retrospective
@Liberticide But I will try it in any case. Thanks! - Retrospective
@Liberticide I added termVectors="on" termPositions="on" termOffsets="on" to the fields and rebuilt the indexes after that. Unfortunately, it did not help. Maybe I need to make some changes in solrconfig.xml? - Retrospective

I found fields in my schema.xml that include the termVectors="true" attribute without termPositions="true" termOffsets="true".

That was the cause of the warnings.
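To find such fields without reading the whole schema by hand, something like the following sketch may help. It is only an illustration: the function name, the sample schema, and the set of values treated as "enabled" (`true`, `on`, `yes`) are my assumptions, and it looks only at attributes set directly on `<field>` elements, not at defaults inherited from a field type.

```python
import xml.etree.ElementTree as ET

TERM_ATTRS = ("termVectors", "termPositions", "termOffsets")
# Assumption: values treated as "enabled"; both "on" and "true" appear in this thread.
ENABLED = {"true", "on", "yes"}


def incomplete_term_vector_fields(schema_xml):
    """Return names of <field> elements that enable termVectors
    but do not enable both termPositions and termOffsets."""
    root = ET.fromstring(schema_xml)
    incomplete = []
    for field in root.iter("field"):
        flags = {a: field.get(a, "false").lower() in ENABLED for a in TERM_ATTRS}
        if flags["termVectors"] and not (flags["termPositions"] and flags["termOffsets"]):
            incomplete.append(field.get("name"))
    return incomplete


# Hypothetical sample schema, not the real one from this question.
SAMPLE = """<schema name="example" version="1.5">
  <fields>
    <field name="name" type="text" indexed="true" stored="true"
           termVectors="true" termPositions="true" termOffsets="true"/>
    <field name="phone" type="text" indexed="true" stored="true" termVectors="true"/>
    <field name="id" type="string" indexed="true" stored="true"/>
  </fields>
</schema>"""

print(incomplete_term_vector_fields(SAMPLE))  # ['phone']
```

For a real schema, read the file contents and pass them in instead of `SAMPLE`.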

So, here is what I did:

  • added termPositions="true" termOffsets="true" to the fields in schema.xml which had only the termVectors="true" attribute
  • added termVectors="true" termPositions="true" termOffsets="true" to each field which I found in the warnings (e.g. "...field phone does not store positions and offsets...")
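Collecting the field names from the warnings can be sketched roughly as below. The regular expression is based on the warning text quoted in the question; the function name and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the warning text quoted in the question and captures the field name.
WARNING_RE = re.compile(
    r"instead of FastVectorHighlighter because (\S+) field does not store"
)


def fields_in_warnings(log_lines):
    """Count how often each field name appears in the highlighter warning."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log excerpt in the format shown in the question.
SAMPLE_LOG = [
    "Jan 29, 2015 12:13:38 PM org.apache.solr.highlight.DefaultSolrHighlighter useFastVectorHighlighter",
    "WARNING: Solr will use Highlighter instead of FastVectorHighlighter because phone field does not store TermPositions and TermOffsets.",
    "WARNING: Solr will use Highlighter instead of FastVectorHighlighter because name field does not store TermPositions and TermOffsets.",
    "WARNING: Solr will use Highlighter instead of FastVectorHighlighter because phone field does not store TermPositions and TermOffsets.",
]

for field, count in fields_in_warnings(SAMPLE_LOG).most_common():
    print(field, count)
```

The function accepts any iterable of lines, so an open file object can be passed directly and a very large log is read line by line rather than loaded into memory.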

After that I ran reindexing, but it did not fix the warning spam in the logs.

The reason for this: Solr does not see schema.xml updates until Tomcat is restarted.

So, I restarted Tomcat:

  • sudo /etc/init.d/tomcat6 restart

  • I kicked off reindexing again, because all highlighting was lost

Many thanks to @chefe for the help!

Retrospective answered 29/1, 2015 at 14:55 Comment(0)
