Solr 4.4: StopFilterFactory and enablePositionIncrements

Asked 7/9, 2013 at 0:13 Answered 9/3, 2018 at 16:16

While attempting to upgrade from Solr 4.3.0 to Solr 4.4.0 I ran into this exception:

 java.lang.IllegalArgumentException: enablePositionIncrements=false is not supported anymore as of Lucene 4.4 as it can create broken token streams

which led me to this issue. I need to be able to match queries irrespective of intervening stopwords (which used to work with enablePositionIncrements="false"). For instance, "foo of the bar" would find documents matching "foo bar", "foo of bar", and "foo of the bar". With this option no longer supported in 4.4.0, I'm not clear on how to maintain the same functionality.

The package javadoc adds:

If the selected analyzer filters the stop words "is" and "the", then for a document containing the string "blue is the sky", only the tokens "blue", "sky" are indexed, with position("sky") = 3 + position("blue"). Now, a phrase query "blue is the sky" would find that document, because the same analyzer filters the same stop words from that query. But the phrase query "blue sky" would not find that document because the position increment between "blue" and "sky" is only 1.

If this behavior does not fit the application needs, the query parser needs to be configured to not take position increments into account when generating phrase queries.
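
To make the javadoc example concrete, here is a minimal plain-Java sketch of the behavior it describes. It does not use Lucene at all; `PositionGapDemo`, `Token`, `analyze` and `phraseMatches` are hypothetical names invented for this illustration, and the matching logic is a toy model of an exact phrase query, not the real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of stop-word removal that keeps position gaps (not Lucene code). */
final class PositionGapDemo {

    /** A surviving token together with its absolute position in the text. */
    static final class Token {
        final String term;
        final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /** Splits on whitespace and drops stop words, but keeps counting their positions. */
    static List<Token> analyze(String text, Set<String> stopWords) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        for (String term : text.trim().toLowerCase().split("\\s+")) {
            if (!stopWords.contains(term)) {
                tokens.add(new Token(term, position));
            }
            position++; // a removed stop word still consumes a position
        }
        return tokens;
    }

    /** Exact phrase match: same terms in the same order with identical position gaps. */
    static boolean phraseMatches(List<Token> doc, List<Token> query) {
        for (int start = 0; start + query.size() <= doc.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < query.size() && ok; i++) {
                Token d = doc.get(start + i);
                Token q = query.get(i);
                int docOffset = d.position - doc.get(start).position;
                int queryOffset = q.position - query.get(0).position;
                ok = d.term.equals(q.term) && docOffset == queryOffset;
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("is", "the"));
        List<Token> doc = analyze("blue is the sky", stop); // blue@0, sky@3
        System.out.println(phraseMatches(doc, analyze("blue is the sky", stop))); // true
        System.out.println(phraseMatches(doc, analyze("blue sky", stop)));        // false
    }
}
```

In this model the query "blue sky" fails only because its two terms are one position apart while the indexed terms are three positions apart, which is the mismatch the javadoc describes.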

But there's no mention of how to actually configure the query parser to do this. Does anyone know how to deal with this issue as Solr moves toward 5.0?

Lair answered 7/9, 2013 at 0:13 Comment(6)

Have you found a solution to this problem? – Voltaire 8/5, 2014 at 6:35

@VishalParekh nope - haven't found a solution yet... – Lair 8/5, 2014 at 15:24

@Lair I have the same problem. I was thinking about re-implementing the StopFilterFactory and re-enabling the option to set enablePositionIncrements to false – Marvamarve 5/3, 2015 at 15:21

@condit, I'm facing the same issue. Any solutions? – Blondellblondelle 6/4, 2015 at 8:59

@MMTac - nope. Still stuck on this. – Lair 6/4, 2015 at 15:21

@Lair hmm, thanks for your prompt response. I'm still looking for a way to overcome this issue. – Blondellblondelle 7/4, 2015 at 3:28

You can use proximity searching:

"foo bar"~2

Metalline answered 7/9, 2013 at 13:59 Comment(2)

This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. – Devout 7/9, 2013 at 14:45

removed the rhetorical question :) – Metalline 7/9, 2013 at 14:55
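
For reference, here is a rough plain-Java picture of why the proximity query "foo bar"~2 suggested above covers the three cases from the question when stop words leave position gaps. This is a simplified model that only handles two terms appearing in order and only looks at the first occurrence of each; `SlopDemo`, `positionOf` and `sloppyMatch` are hypothetical names, and none of this is Lucene's actual sloppy-phrase scoring code.

```java
/** Toy model of a two-term, in-order proximity match (not Lucene code). */
final class SlopDemo {

    /** Word index of the first occurrence of term, or -1 if absent. */
    static int positionOf(String text, String term) {
        String[] words = text.trim().toLowerCase().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (words[i].equals(term)) {
                return i;
            }
        }
        return -1;
    }

    /**
     * True if 'second' follows 'first' with at most 'slop' extra positions
     * between them. Word indexes stand in for token positions, assuming
     * removed stop words still consume a position.
     */
    static boolean sloppyMatch(String text, String first, String second, int slop) {
        int a = positionOf(text, first);
        int b = positionOf(text, second);
        if (a < 0 || b < 0 || b <= a) {
            return false; // missing term or reversed order: outside this toy model
        }
        int extraPositions = (b - a) - 1; // adjacent terms need no slop
        return extraPositions <= slop;
    }

    public static void main(String[] args) {
        int slop = 2; // as in "foo bar"~2
        System.out.println(sloppyMatch("foo bar", "foo", "bar", slop));        // true
        System.out.println(sloppyMatch("foo of bar", "foo", "bar", slop));     // true
        System.out.println(sloppyMatch("foo of the bar", "foo", "bar", slop)); // true
        System.out.println(sloppyMatch("foo baz qux bar", "foo", "bar", slop)); // true
    }
}
```

The last line shows the trade-off: under this model the slop does not distinguish stop words from any other intervening words, so the proximity query is broader than the original stopword-insensitive phrase match.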

I don't know if this is recommended for usage, but there are still some legacy classes in Lucene 5, such as Lucene43StopFilter.

Unfortunately they seem to have disappeared in Lucene 6...

Ranged answered 18/5, 2016 at 0:42 Comment(0)

I found an implementation of the filter behind a RemoveTokenGapsFilterFactory somewhere on the net:

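import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

// Collapses the position gaps left behind by removed tokens (e.g. stop words).
public final class RemoveTokenGapsFilter extends TokenFilter {

    private final PositionIncrementAttribute posIncrAttribute = addAttribute(PositionIncrementAttribute.class);

    public RemoveTokenGapsFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (input.incrementToken()) {
            // Force every token to sit exactly one position after the previous one.
            posIncrAttribute.setPositionIncrement(1);
            return true;
        }
        return false;
    }
}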
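
As a rough illustration of what this filter does to a token stream, the plain-Java sketch below replays the same rule on bare position increments. It is not Lucene code: `RemoveGapsDemo`, `positions` and `removeGaps` are hypothetical names, and the only convention borrowed from Lucene is that a first token with increment 1 lands at position 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of resetting every position increment to 1 (not Lucene code). */
final class RemoveGapsDemo {

    /** Turns position increments into absolute positions; increment 1 on the first token means position 0. */
    static List<Integer> positions(int[] increments) {
        List<Integer> result = new ArrayList<>();
        int position = -1;
        for (int increment : increments) {
            position += increment;
            result.add(position);
        }
        return result;
    }

    /** Mirrors the filter's rule: whatever the incoming increment was, emit 1. */
    static int[] removeGaps(int[] increments) {
        int[] result = new int[increments.length];
        Arrays.fill(result, 1);
        return result;
    }

    public static void main(String[] args) {
        // "foo of the bar" after stop-word removal: foo (+1), bar (+3)
        int[] withGaps = {1, 3};
        System.out.println(positions(withGaps));             // [0, 3]
        System.out.println(positions(removeGaps(withGaps))); // [0, 1]

        // Two tokens stacked on the same position (increment 0), e.g. a synonym
        int[] stacked = {1, 0};
        System.out.println(positions(stacked));              // [0, 0]
        System.out.println(positions(removeGaps(stacked)));  // [0, 1]
    }
}
```

The second case is worth keeping in mind: because the rule rewrites every increment, tokens that were deliberately stacked on one position end up on consecutive positions as well. Presumably the filter would need to sit after the stop filter in both the index-time and query-time analyzer chains so that both sides see the same gap-free positions.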

Zhang answered 9/3, 2018 at 16:16 Comment(0)
