Solr can't find resource stopwords_en.txt

I'm trying to set up Solr 3.6.0 with Django-haystack Beta 2.0.0.

After running ./manage.py build_solr_schema and moving schema.xml to the conf directory, upon visiting http://localhost:8983/solr/admin, I receive an error exactly like the one produced in this thread.

org.apache.solr.common.SolrException: No cores were created, please check the logs for errors

java.lang.RuntimeException: Can't find resource 'stopwords_en.txt' in classpath or 'solr/./conf/', cwd=/home/randall/startupsearch_live/apache-solr-3.6.0/example

At the bottom of the thread, a user mentions that schema.xml must be edited so that its stopwords reference matches a stopwords_en.txt file in the /example/solr/conf/ directory. I tried this two ways: with a symbolic link, and by replacing every instance of stopwords.txt with /solr/conf/stopwords_en.txt in the generated schema.xml file. The error persists, with slightly different output:

java.lang.RuntimeException: Can't find resource '/solr/conf/stopwords_en.txt' in classpath or 'solr/./conf/', cwd=/home/randall/startupsearch_live/apache-solr-3.6.0/example

What file must I edit to fix this problem?

Morganstein answered 8/7, 2012 at 21:37 Comment(0)
S
7

Solr can't find the stopwords_en.txt file in the classpath or in the conf directory. Add a stopwords_en.txt file to the solr/conf/ directory. You can find more information about stopwords here.
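To see which word-list files the generated schema expects, something like the following may help. It is only a sketch: it assumes the schema references its lists through `words="..."` attributes (as `solr.StopFilterFactory` does), and the function name `missing_word_files` is made up for this example.

```python
import re
from pathlib import Path

# Matches attributes such as words="stopwords_en.txt" in filter definitions.
WORDS_ATTR = re.compile(r'\bwords="([^"]+)"')

def missing_word_files(schema_text, conf_dir):
    """Return referenced word-list paths that do not exist under conf_dir."""
    conf = Path(conf_dir)
    # De-duplicate, since the same list is usually referenced by several filters.
    referenced = sorted(set(WORDS_ATTR.findall(schema_text)))
    # Relative names are checked against the conf directory, which is where
    # the error message says Solr looks ('solr/./conf/').
    return [name for name in referenced if not (conf / name).is_file()]
```

Call it with the text of schema.xml and the path to your conf directory (for example `example/solr/conf` in the setup from the question). Every name it returns is a file Solr will probably fail to load at startup.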

Selflove answered 9/7, 2012 at 11:7 Comment(0)
B
6

A better way is to find all occurrences of stopwords_en.txt in schema.xml and replace them with lang/stopwords_en.txt, so that they point at the copy under conf/lang.
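The replacement can be scripted. This is a minimal sketch, again assuming the references look like `words="stopwords_en.txt"`; the helper name `use_lang_stopwords` is made up for this example.

```python
import re

def use_lang_stopwords(schema_text):
    """Point bare stopwords_en.txt references at lang/stopwords_en.txt."""
    # Match only the bare file name so an already-prefixed path is left alone,
    # which makes the rewrite safe to run more than once.
    return re.sub(
        r'words="stopwords_en\.txt"',
        'words="lang/stopwords_en.txt"',
        schema_text,
    )
```

Read schema.xml, pass its contents through the function, write the result back, and restart Solr. Note that regenerating the schema with build_solr_schema would presumably undo the change.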

Bennettbenni answered 14/9, 2013 at 11:35 Comment(0)
H
2

You have to put stopwords_en.txt somewhere Solr can find it. Create a file named stopwords_en.txt and place it next to schema.xml. I assume you know which stopword filter is being used.

Heilner answered 9/7, 2012 at 9:35 Comment(0)
F
1

To combine all three of the above answers: you need stopwords_en.txt because the schema uses it when analyzing English-language text.

From http://wiki.apache.org/solr/LanguageAnalysis#Stopwords

Stopwords affect Solr in three ways: relevance, performance, and resource utilization.

From a relevance perspective, these extremely high-frequency terms tend to throw off the scoring algorithm, and you won't get very good results if you leave them. At the same time, if you remove them, you can return bad results when the stopword is actually important.

From a performance perspective, if you keep stopwords, some queries (especially phrase queries) can be very slow.

From a resource utilization perspective, if you keep stopwords, the index is much larger than if you remove them.

One tradeoff you can make if you have the disk space: You can use CommonGramsFilter/CommonGramsQueryFilter instead of StopFilter. This solves the relevance and performance problems, at the expense of even more resource utilization, because it will form bigrams of stopwords to their adjacent words.

What you need to do is copy the original version, located in the /conf/lang folder of your Solr directory, into the /conf directory itself:

cp PATH/TO/solr/conf/lang/stopwords_en.txt PATH/TO/solr/conf
Fletafletch answered 29/10, 2014 at 13:54 Comment(0)
O
1

In Solr 5 I got the same error. I had used the Solr ZooKeeper CLI shell to upload my configuration. I had copied the contents of an existing Solr config from server/solr/configsets/basic_configs, but I somehow missed the lang directory.

The conf/lang directory contains stopwords_en.txt.
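If you copy a config set by hand, comparing the two directories before uploading can catch an omission like this. The function below is a rough sketch using only the standard library; its name and both directory arguments are hypothetical and nothing here is Solr-specific.

```python
import os

def files_missing_from_copy(source_conf, copied_conf):
    """Return relative paths present under source_conf but absent under copied_conf."""
    missing = []
    for root, _dirs, files in os.walk(source_conf):
        for name in files:
            # Path of this file relative to the top of the source config.
            rel = os.path.relpath(os.path.join(root, name), source_conf)
            if not os.path.isfile(os.path.join(copied_conf, rel)):
                missing.append(rel)
    return sorted(missing)
```

Pass the original conf directory first and your copy second. In the case described here, the result would have included the files under lang.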

Optical answered 25/2, 2016 at 23:45 Comment(0)
