I've been following the Nutch tutorial but I'm having a bit of a problem with the schema.xml file.
I was told to copy the Nutch-provided schema into my project, essentially this:
cp ${NUTCH_RUNTIME_HOME}/conf/schema.xml ${APACHE_SOLR_HOME}/example/solr/conf/
I have deployed Solr in Tomcat, and the error I get when I go to the Solr dashboard is:
collection1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Plugin init failure for [schema.xml] fieldType "text":
Plugin init failure for [schema.xml] analyzer/filter:
Error loading class 'solr.EnglishPorterFilterFactory'
This relates to the following element in my schema.xml file (I can comment it out, but I'm not sure how important it is yet):
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
I have edited my solrconfig.xml to try to include some of the jar files that come with Solr, specifically:
<lib path="/etc/solr/collection1/libs/dist/solr-core-4.2.1.jar" />
<lib path="/etc/solr/collection1/libs/dist/solr-analysis-extras-4.2.1.jar" />
But I don't think they contain the missing class "solr.EnglishPorterFilterFactory".
Does anyone have an idea why this might not be working, or whether I have missed something? I'm not a Java developer, by the way, so no doubt it will be something simple :)
UPDATE: After finding out that the schema was referencing some old classes, I had another look in nutch/conf, and it looks like there is a ${NUTCH_RUNTIME_HOME}/conf/schema-solr4.xml file, which seems to work.
I'm not 100% sure this is correct, but hey...
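For anyone hitting the same thing, this is roughly how I'd redo the copy step with the Solr 4 schema. It's only a sketch: install_schema is a helper name I made up, and it assumes the core reads its schema from a file called schema.xml in its conf directory, so the Nutch file gets renamed on the way in and any existing schema.xml is kept as a .bak.

```shell
# Sketch only: install_schema is a made-up helper, not part of Nutch or Solr.
# Usage: install_schema <path to schema-solr4.xml> <Solr core conf directory>
install_schema() {
    src="$1"
    dest_dir="$2"

    # Refuse to continue if the source schema or target directory is missing.
    if [ ! -f "$src" ]; then
        echo "schema not found: $src" >&2
        return 1
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "conf directory not found: $dest_dir" >&2
        return 1
    fi

    # Keep the schema that is currently in place so it can be restored.
    if [ -f "$dest_dir/schema.xml" ]; then
        cp "$dest_dir/schema.xml" "$dest_dir/schema.xml.bak"
    fi

    # Install the Solr 4 schema under the name the core is assumed to load.
    cp "$src" "$dest_dir/schema.xml"
}
```

In my case I'd call it as install_schema "${NUTCH_RUNTIME_HOME}/conf/schema-solr4.xml" /etc/solr/collection1/conf (that conf path is a guess based on where my libs live, so adjust it to your layout) and then restart Tomcat so the core picks up the new schema.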