How can you determine the semantic similarity between two texts in python using WordNet?
The obvious preproccessing would be removing stop words and stemming, but then what?
The only way I can think of would be to calculate the WordNet path distance between each word in the two texts. This is standard for unigrams. But these are large (400 word) texts, that are natural language documents, with words that are not in any particular order or structure (other than those imposed by English grammar). So, which words would you compare between texts? How would you do this in python?