I have N different keywords that I am tracking (for the sake of simplicity, let N=3). So in GET statuses/filter, I pass 3 keywords in the "track" argument.
The tweets I receive can match ANY of the 3 keywords. The problem is that I want to determine which tweet corresponds to which keyword, i.e. a mapping between each tweet and the keyword(s) from the "track" argument that it matched.
Apparently, there is no way to do this without doing any processing on the tweets received.
So I was wondering: what is the best way to do this processing? Search for the keywords in the text of the tweet? What about case-insensitive matching? What about a keyword that consists of multiple words, e.g. "Katrina Kaif"?
I am currently trying to formulate some regular expression...
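For illustration, here is a minimal sketch of the kind of matching I have in mind. It rests on an assumption I have not confirmed against Twitter's behavior: a keyword matches when every one of its space-separated terms appears in the tweet text as a whole word, in any order and ignoring case. The function names `build_matchers` and `match_keywords` are my own, not part of Tweepy or the Twitter API.

```python
import re


def build_matchers(keywords):
    """Compile case-insensitive whole-word patterns, one per term of each keyword."""
    matchers = {}
    for keyword in keywords:
        # A multi-word keyword such as "Katrina Kaif" is split into terms.
        # ASSUMPTION: all terms must be present, in any order.
        matchers[keyword] = [
            re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
            for term in keyword.split()
        ]
    return matchers


def match_keywords(text, matchers):
    """Return the tracked keywords whose terms all occur in the given text."""
    return [
        keyword
        for keyword, patterns in matchers.items()
        if all(pattern.search(text) for pattern in patterns)
    ]


if __name__ == "__main__":
    tracked = ["Katrina Kaif", "python", "mongodb"]
    matchers = build_matchers(tracked)
    print(match_keywords("KAIF Katrina spotted learning #Python", matchers))
```

This only looks at the tweet text; if the filter also matches on other fields (URLs, hashtags, screen names), those would have to be checked separately.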
I was thinking the BEST way would be to use the same logic (regular expressions etc.) that the statuses/filter API itself uses. How can I find out what logic the Twitter statuses/filter API uses to match tweets to keywords?
Advice? Help?
P.S.: I am using Python, Tweepy, regex, and MongoDB/Apache S4 (for distributed computing).