Using Nutch crawler with Solr
E

3

13

Can I integrate the Apache Nutch crawler with the Solr index server?

Edit:

One of our devs came up with a solution based on these posts:

  1. Running Nutch and Solr
  2. Update for Running Nutch and Solr

Answer

Yes

Edlin answered 17/10, 2008 at 8:32 Comment(0)
B
6

If you're willing to upgrade to Nutch 1.0, you can use the solrindex command, as described in this article by Lucid Imagination: http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/.
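
As a rough sketch of what that looks like in practice, the script below pushes an existing Nutch 1.x crawl into Solr with solrindex. The Solr URL, the crawl directory name, and the SOLR_URL, CRAWL_DIR and INDEX_CMD variables are assumptions for illustration, not taken from the article. It also assumes that Solr is already running with a schema that accepts the fields Nutch sends.

```shell
#!/bin/sh
# Sketch only: index an existing Nutch 1.x crawl into Solr.
# The URL and directory defaults below are assumptions; adjust to your setup.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"

# solrindex takes the Solr URL, the crawldb, the linkdb, and one or more segments.
INDEX_CMD="bin/nutch solrindex $SOLR_URL $CRAWL_DIR/crawldb $CRAWL_DIR/linkdb $CRAWL_DIR/segments/*"

if [ -x bin/nutch ]; then
    # Left unquoted on purpose so the command splits into words
    # and the segments glob expands.
    $INDEX_CMD
else
    # Dry run when not started from a Nutch install directory.
    echo "bin/nutch not found; would run: $INDEX_CMD"
fi
```

Run it from the top of the Nutch install directory after a crawl has finished. Started anywhere else, it only prints the command it would have run.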

Bushed answered 8/8, 2009 at 19:8 Comment(1)
Yeah, that's the definitive article for Nutch/Solr. - Edlin
W
1

Nutch 2.x is designed to use Solr by default. You can follow the steps at http://wiki.apache.org/nutch/Nutch2Tutorial, or, for better instructions, see the book "Web Crawling and Data Mining with Apache Nutch".

Wino answered 17/10, 2008 at 8:32 Comment(0)
P
1

It's still an open issue. If you're feeling adventurous, you could try applying those patches yourself, although it looks like it's not so simple.

Piscine answered 19/12, 2008 at 1:13 Comment(1)
Yeah, I'm preparing a user group talk on Lucene, so I'll test out this setup. I was hoping there was a quick Y/N answer out there. - Edlin
