When using linear_kernel or cosine_similarity with TfidfVectorizer, I get the error "Kernel died, restarting".
I am running scikit-learn's TfidfVectorizer and its fit_transform method on some text data, as in the example below. When I try to compute the similarity matrix, I get the error "Kernel died, restarting", whether I use cosine_similarity or linear_kernel:
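from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

tf = TfidfVectorizer(analyzer='word', stop_words='english')
tfidf_matrix = tf.fit_transform(products['ProductDescription'])
# Either of the following two calls kills the kernel:
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)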
Maybe the problem is the size of my data?
My TF-IDF matrix has shape (178350, 143529), which should generate a (178350, 178350) cosine_sim matrix.
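A rough back-of-the-envelope check of that suspicion, assuming the result is returned as a dense float64 array (which is what linear_kernel and cosine_similarity produce by default, even for sparse input):

```python
# Estimate the memory needed for a dense (178350, 178350) float64 matrix.
n_docs = 178_350
bytes_per_value = 8  # assumption: float64, 8 bytes per entry

dense_bytes = n_docs * n_docs * bytes_per_value
dense_gib = dense_bytes / 2**30

print(f"{dense_bytes:,} bytes (about {dense_gib:.0f} GiB)")
```

That comes to roughly 237 GiB, far more RAM than a typical machine has, so the process being killed is consistent with running out of memory.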
tfidf_matrix * tfidf_matrix.T
– Kizzie
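If only the most similar items per product are needed, one possible workaround is to take the sparse product in row chunks and keep just the top k scores per row, so the full matrix never exists in memory. This is a minimal sketch, not a tested solution for the data above: top_k_similar, k and chunk_size are hypothetical names, and it assumes the rows are L2-normalised (the TfidfVectorizer default, norm='l2'), so the dot product equals the cosine similarity.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer


def top_k_similar(tfidf_matrix, k=5, chunk_size=256):
    """Return (indices, scores) of the k most similar rows for every row.

    Hypothetical helper: assumes a SciPy sparse matrix with L2-normalised rows
    and at least two rows.
    """
    n_rows = tfidf_matrix.shape[0]
    k = min(k, n_rows - 1)
    top_idx = np.empty((n_rows, k), dtype=np.int64)
    top_sim = np.empty((n_rows, k), dtype=np.float64)

    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        # Dense block of shape (stop - start, n_rows); only this chunk is in memory.
        block = (tfidf_matrix[start:stop] @ tfidf_matrix.T).toarray()
        # Exclude each row's similarity with itself.
        block[np.arange(stop - start), np.arange(start, stop)] = -1.0
        # Pick the k largest scores per row, then sort those k in descending order.
        idx = np.argpartition(-block, k - 1, axis=1)[:, :k]
        sims = np.take_along_axis(block, idx, axis=1)
        order = np.argsort(-sims, axis=1)
        top_idx[start:stop] = np.take_along_axis(idx, order, axis=1)
        top_sim[start:stop] = np.take_along_axis(sims, order, axis=1)

    return top_idx, top_sim


# Toy data standing in for products['ProductDescription'].
docs = ["red apple fruit", "green apple fruit", "fast red car", "slow blue car"]
toy_matrix = TfidfVectorizer().fit_transform(docs)
neighbours, scores = top_k_similar(toy_matrix, k=2, chunk_size=3)
```

With chunk_size=256 and 178350 columns, each dense block is about 365 MB as float64; lower chunk_size if that is still too much. Alternatively, cosine_similarity accepts dense_output=False to return a sparse result, which only helps if most pairs share no terms.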