I am using sklearn to obtain tf-idf values as follows.
from sklearn.feature_extraction.text import TfidfVectorizer
myvocabulary = ['life', 'learning']
corpus = {1: "The game of life is a game of everlasting learning", 2: "The unexamined life is not worth living", 3: "Never stop learning"}
tfidf = TfidfVectorizer(vocabulary = myvocabulary, ngram_range = (1,3))
tfs = tfidf.fit_transform(corpus.values())
Now I want to view my calculated tf-idf scores in a matrix as follows.
I tried to do it as follows.
idf = tfidf.idf_
dic = dict(zip(tfidf.get_feature_names(), idf))
print(dic)
However, this only gives me the idf value of each term, not the tf-idf scores per document:
{'life': 1.2876820724517808, 'learning': 1.2876820724517808}
Please help me.
The output of tfidf.fit_transform() is already in this form. The only thing needed is the column names, which you get from tfidf.get_feature_names(). Just wrap these two into a dataframe. – Berkeleianism
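The suggestion in the comment could be sketched roughly as below. This is a minimal illustration, not a definitive answer. It assumes pandas is installed. Using the corpus keys as the row index is my own choice rather than something the comment specifies. The `hasattr` check is there because newer scikit-learn releases provide `get_feature_names_out()`, while older ones only have `get_feature_names()`.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

myvocabulary = ['life', 'learning']
corpus = {
    1: "The game of life is a game of everlasting learning",
    2: "The unexamined life is not worth living",
    3: "Never stop learning",
}

tfidf = TfidfVectorizer(vocabulary=myvocabulary, ngram_range=(1, 3))
# fit_transform returns a sparse (documents x terms) matrix of tf-idf scores
tfs = tfidf.fit_transform(list(corpus.values()))

# Newer scikit-learn exposes get_feature_names_out(); fall back to the
# older get_feature_names() when it is not available.
if hasattr(tfidf, "get_feature_names_out"):
    feature_names = list(tfidf.get_feature_names_out())
else:
    feature_names = list(tfidf.get_feature_names())

# Densify the sparse matrix and label rows with the corpus keys (an assumption)
# and columns with the vocabulary terms.
df = pd.DataFrame(tfs.toarray(), index=list(corpus.keys()), columns=feature_names)
print(df)
```

Each row is one document and each column one vocabulary term. With the default settings the rows are L2-normalised, so a document that contains only one of the two terms (such as document 2 with `life`) should show 1.0 in that column and 0.0 in the other.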