Error while indexing in solr data crawled by nutch

I have started working with Nutch and Solr, and I have a problem integrating Solr with Nutch. I followed this tutorial: http://wiki.apache.org/nutch/NutchTutorial and after running: bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5 Nutch shows this message:

java.io.IOException: Job failed!

and Solr logs:

SEVERE: org.apache.solr.common.SolrException: ERROR: [doc=http://nutch.apache.org/] unknown field 'host'

I thought the reason might be a missing 'host' field in $SOLR_HOME/example/solr/conf/schema.xml, but the field is defined there. I would be very grateful for your help.

Mckenziemckeon answered 17/11, 2012 at 9:56 Comment(4)
Did you copy the Nutch schema to Solr? cp ${NUTCH_RUNTIME_HOME}/conf/schema.xml ${APACHE_SOLR_HOME}/example/solr/conf/ (Affenpinscher)
Check whether host is defined in the schema. Stop Solr. Remove the data directory. Start Solr. Try again. (Affenpinscher)
OK, I had to define this field in ${APACHE_SOLR_HOME}/example/solr/collection1/conf and now it is working. Thanks for the help. (Mckenziemckeon)
Please add answers as Answers, and mark this question "Answered". (Samos)

Changing the configuration on the Nutch side does not affect Solr's schema. You have to define that field in the schema.xml that Solr actually loads.
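A quick way to see whether a given schema.xml declares the field is a small shell check like the sketch below. The function name schema_has_field is hypothetical (it is not part of Nutch or Solr), and the collection1/conf path is taken from the comments above; your Solr layout may differ. It is a plain text match, not an XML parser, so it assumes the field declaration sits on a single line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch or Solr): succeeds if the given
# schema file contains a <field ... name="NAME" ...> declaration on one line.
# The [[:space:]] after "<field" keeps <fieldType ...> entries from matching.
schema_has_field() {
    schema_file=$1
    field_name=$2
    grep -q "<field[[:space:]][^>]*name=\"${field_name}\"" "$schema_file"
}

# Usage sketch: runs only when both variables are set. The directory layout
# is an assumption based on the comments above.
if [ -n "${APACHE_SOLR_HOME:-}" ] && [ -n "${NUTCH_RUNTIME_HOME:-}" ]; then
    conf_dir="${APACHE_SOLR_HOME}/example/solr/collection1/conf"
    if schema_has_field "${conf_dir}/schema.xml" host; then
        echo "host field is defined in ${conf_dir}/schema.xml"
    else
        echo "host field is missing; consider copying the Nutch schema:"
        echo "  cp ${NUTCH_RUNTIME_HOME}/conf/schema.xml ${conf_dir}/"
    fi
fi
```

If the check fails for the directory Solr really uses, copy or edit the schema there and restart Solr before crawling again.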

Quar answered 6/4, 2014 at 20:53 Comment(0)