Imagine you have millions of records containing text with average 2000 words (each), and also you have an other list with about 100000 items.
e.g: In the keywords list you a have item like "president Obama" and in one of the text records you have some thing like this: "..... president Obama ....", so i want to find this keyword in the text and replace it with some thing like this: "..... {president Obama} ...." to highlight the keyword in the text, the keywords list contains multi-noun word like the example.
What is the fastest way to this in such a huge list with millions of text records?
Notes:
Now I do this work in a greedy way, check word by word and match them, but it takes about 2 seconds for each text record, and I want some thing near zero time.
Also I know this is something like named-entity-recognition and I worked with many of the NER framework such as Gate and ..., but because I want this for a language which is not supported by the frameworks I want to to this manually.