Has anyone had any luck writing a custom indexer for Nutch to index crawl results with ElasticSearch? Or do you know of any that already exist?
I haven't done it myself, but it is definitely doable: you would need to piggyback on the Solr indexer code (src/java/org/apache/nutch/indexer/solr) and adapt it to ElasticSearch. It would be a nice contribution to Nutch, BTW.
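To give a rough idea of what an adapted indexer would have to produce: where the Solr code sends documents as Solr updates, an ElasticSearch version could batch them into the newline-delimited JSON body that the bulk API accepts (one action line, then one source line, per document). The sketch below is not taken from Nutch. The class name `BulkBodySketch`, the index name `nutch`, and the `url` and `title` fields are all assumptions for illustration, and it only handles flat string fields. Older ElasticSearch releases also expected a document type alongside the index name, which is left out here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch or any ElasticSearch client.
public class BulkBodySketch {

    // Quote a string as a JSON string literal, escaping what JSON requires.
    public static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Serialize a flat map of string fields as a single-line JSON object.
    public static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(',');
            sb.append(jsonString(e.getKey())).append(':').append(jsonString(e.getValue()));
            first = false;
        }
        return sb.append('}').toString();
    }

    // Build the bulk body: for each document, an "index" action line
    // followed by the document source, each terminated by a newline.
    public static String toBulkBody(String index, String idField, List<Map<String, String>> docs) {
        StringBuilder sb = new StringBuilder();
        for (Map<String, String> doc : docs) {
            String id = doc.get(idField);
            if (id == null) {
                throw new IllegalArgumentException("document has no '" + idField + "' field");
            }
            sb.append("{\"index\":{\"_index\":").append(jsonString(index))
              .append(",\"_id\":").append(jsonString(id)).append("}}\n");
            sb.append(toJson(doc)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed field names; a real crawl document would carry more fields.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", "http://example.com/");
        doc.put("title", "Example");
        System.out.print(toBulkBody("nutch", "url", List.of(doc)));
    }
}
```

Using the page URL as the document id means a re-crawl overwrites the earlier copy instead of adding a duplicate. Sending the resulting body to the cluster (and mapping real Nutch documents into the maps) is left out of the sketch.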
I wrote an ElasticSearch plugin that mocks the Solr API. Using this plugin and the standard Nutch Solr indexer, you can easily send crawled data into ElasticSearch. The plugin and an example of how to use it with Nutch can be found on GitHub:
I know that Nutch will be adding pluggable backends, and I'm glad to see it. I needed to integrate ElasticSearch with Nutch 1.3. The code is posted here; it piggybacks off the Solr indexer code (src/java/org/apache/nutch/indexer/solr).
Time has passed, and Nutch is now well integrated with ElasticSearch. Here is a nice tutorial.