I use sklearn.feature_extraction.text.CountVectorizer to compute n-grams. Example:
import sklearn.feature_extraction.text # FYI http://scikit-learn.org/stable/install.html
ngram_size = 4
string = ["I really like python, it's pretty awesome."]
vect = sklearn.feature_extraction.text.CountVectorizer(ngram_range=(ngram_size,ngram_size))
vect.fit(string)
print('{1}-grams: {0}'.format(vect.get_feature_names(), ngram_size))
This outputs:
4-grams: [u'like python it pretty', u'python it pretty awesome', u'really like python it']
The punctuation is removed. How can I keep punctuation marks as separate tokens in the n-grams?
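For reference, here is a minimal sketch of one possible approach, not necessarily the best one: CountVectorizer's default token_pattern only matches runs of two or more word characters, which is why the punctuation (and the single-letter "I") disappears. Passing a custom token_pattern that also matches single punctuation characters might do what I want. The regex below is my own assumption about what should count as a token, and I read the n-grams from vocabulary_ because newer scikit-learn releases replace get_feature_names() with get_feature_names_out().

```python
from sklearn.feature_extraction.text import CountVectorizer

ngram_size = 4
docs = ["I really like python, it's pretty awesome."]

# Assumed tokenization rule: a token is either a run of word characters
# or one single character that is neither a word character nor whitespace.
punct_aware_pattern = r"(?u)\w+|[^\w\s]"

vect = CountVectorizer(
    ngram_range=(ngram_size, ngram_size),
    token_pattern=punct_aware_pattern,
)
vect.fit(docs)

# vocabulary_ maps each n-gram to its column index; the keys are the n-grams.
ngrams = sorted(vect.vocabulary_)
print('{1}-grams: {0}'.format(ngrams, ngram_size))
```

With this pattern the input is lowercased and split into 11 tokens, so the vocabulary contains 8 4-grams, including 'really like python ,' and 's pretty awesome .'. Note that the apostrophe in "it's" also becomes its own token, which may or may not be desirable.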