This question involves both programming and knowledge of Chinese. I have a set of Chinese queries and a separate list of Chinese phrases, and I need to find which of the queries contain any of the phrases.
In English, this is a very simple task. However, I don't understand Chinese at all (its semantics, grammar rules, etc.), so I would appreciate it if somebody on this forum who also understands Chinese could give me a basic understanding of the language and of how pattern matching is done for Chinese.
My basic perception is that in Chinese a single unit (a run of characters with no spaces in between) can actually contain more than one word. Is this correct? If so, are there any rules for how multiple words combine to form one unit? It is confusing because there are spaces in Chinese writing, yet even a unit without spaces can contain more than one word.
Any links that explain Chinese from a computational point of view (pattern matching, etc.) would be very useful.
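To make the task concrete, here is a minimal sketch of what I am after, assuming plain substring matching turns out to be acceptable for Chinese (that is exactly the part I am unsure about). The function name and the sample queries and phrases are made up for illustration only.

```python
def queries_with_phrases(queries, phrases):
    """Map each query to the list of phrases it contains.

    Assumption: a phrase "occurs" in a query if it appears as an exact
    substring. No word segmentation or normalization is attempted.
    """
    hits = {}
    for query in queries:
        # Python's `in` compares Unicode code points, so it works on
        # Chinese text the same way it does on English text.
        found = [phrase for phrase in phrases if phrase in query]
        if found:
            hits[query] = found
    return hits


# Hypothetical sample data, not my real queries.
queries = ["北京 天气 预报", "上海酒店预订", "how to cook rice"]
phrases = ["天气", "酒店"]

result = queries_with_phrases(queries, phrases)
for query, found in result.items():
    print(query, "->", found)
```

Under this assumption, a phrase that is split by a space inside a query would not match, which is part of why I am asking how units and spaces work in Chinese.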