How to run different Apache Nutch jobs in parallel

I am using Nutch 2.3. All jobs run one after another, i.e. first the generator, then fetch, parse, index, etc. I want to run some jobs simultaneously. I know that some jobs cannot run in parallel, but others can, e.g. the parse, dbupdate, and index jobs should be able to run alongside fetch.

Is it possible? My basic objective is to keep the fetcher job running all the time. I suppose we could do this with different timestamps. Can anyone point me to the proper way?

Walkin answered 5/5, 2015 at 6:35 Comment(1)
Maybe you should use Hadoop with Nutch. - Napolitano

If you check out the Nutch web app server, you will find that it can execute multiple crawl jobs in parallel. You should check out the Nutch 2.3 source code for the webapp [NutchUiServer]. Hope this helps.
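
To make the idea from the question concrete (keep fetching while earlier batches are parsed, updated, and indexed), here is a minimal scheduling sketch in Python. It is not taken from the Nutch source or from NutchUiServer. The function `run_pipelined` and the stage callables are hypothetical names used only for illustration; in a real setup each callable would presumably launch the corresponding Nutch job for that batch, and you would need to verify that your storage backend tolerates dbupdate running alongside fetch.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def run_pipelined(batch_ids, generate_and_fetch, post_fetch_stages):
    """Fetch batches back to back while earlier batches are post-processed.

    generate_and_fetch: hypothetical callable taking a batch id.
    post_fetch_stages: list of (name, callable) pairs, e.g. parse,
    updatedb, index; each callable also takes a batch id.
    Returns a log of (stage_name, batch_id) tuples in completion order.
    """
    log = []
    lock = threading.Lock()

    def record(stage_name, batch_id):
        with lock:
            log.append((stage_name, batch_id))

    def post_process(batch_id):
        # Stages for a single batch still run strictly in order.
        for name, stage in post_fetch_stages:
            stage(batch_id)
            record(name, batch_id)

    # A single worker handles post-processing, so only one batch is being
    # parsed/updated/indexed at a time, while the main thread keeps fetching.
    with ThreadPoolExecutor(max_workers=1) as post_pool:
        pending = []
        for batch_id in batch_ids:
            generate_and_fetch(batch_id)
            record("fetch", batch_id)
            pending.append(post_pool.submit(post_process, batch_id))
        for future in pending:
            future.result()  # re-raise any stage failure here
    return log


if __name__ == "__main__":
    # Stand-in stages that only print; replace with real job launchers.
    def make_stage(name):
        return name, (lambda batch_id: print(name, batch_id))

    stages = [make_stage("parse"), make_stage("updatedb"), make_stage("index")]
    run_pipelined(["batch-1", "batch-2"], lambda b: print("fetch", b), stages)
```

The design choice here is that the fetcher never waits for post-processing: as soon as one batch is fetched it is handed to the background worker and the next fetch starts. Each batch carries its own id, which corresponds to the "different timestamp" idea in the question.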

Involve answered 17/5, 2015 at 18:29 Comment(0)
