Using Pearson correlation in sklearn FeatureAgglomeration - McMap


Using Pearson correlation in sklearn FeatureAgglomeration

Asked 14/8, 2018 at 18:38 Answered 14/8, 2018 at 18:38

python scipy scikit-learn affinity


I have a pandas DataFrame with 100 rows and 10,000 features. I want to fit hierarchical clustering on my data, using Pearson correlation as the affinity argument of sklearn.cluster.FeatureAgglomeration.

I've tried two ways to make it work so far. The first is:

feature_agglomator = FeatureAgglomeration(n_clusters=10, affinity=np.corrcoef, linkage='average')

The second one:

from scipy.spatial.distance import correlation
feature_agglomator = FeatureAgglomeration(n_clusters=10, affinity='correlation', linkage='average')

After running:

feature_agglomator.fit_transform(X)

Both ended with the same exception:

ValueError: The condensed distance matrix must contain only finite values.

What can I do to make it work properly?

Southeastwards answered 14/8, 2018 at 18:38 Comment(4)

I think you should read these two GitHub threads related to your issue: github.com/scikit-learn/scikit-learn/issues/7689 and github.com/scikit-learn/scikit-learn/issues/10076. Both seem to point to scipy refusing to perform agglomerative clustering when using cosine distance with zero vectors. – Periodontal 14/8, 2018 at 19:13

I think that the correlation is giving you NaN. Check out your input values. – Psychographer 14/8, 2018 at 19:17

@Psychographer you were right, I had columns filled with 0's. Thanks! – Southeastwards 15/8, 2018 at 8:36

I’m voting to close this question because the error was fixed by the author. – Cowberry 28/3 at 6:29
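
For anyone hitting the same ValueError: the comments trace it to constant columns (here, columns filled with 0's), whose Pearson correlation is undefined and comes out as NaN. Below is a minimal sketch of how one might check for that and drop such columns before clustering. It uses NumPy only, on a small random stand-in for the 100 x 10,000 data. The helper name drop_constant_columns and the tol parameter are made up for illustration and are not part of scikit-learn or scipy.

```python
import numpy as np


def drop_constant_columns(X, tol=0.0):
    """Hypothetical helper: return X without constant columns, plus the keep-mask."""
    X = np.asarray(X, dtype=float)
    # A column whose max equals its min has zero variance, so its
    # Pearson correlation with any other column is 0/0, i.e. NaN.
    keep = np.ptp(X, axis=0) > tol
    return X[:, keep], keep


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))  # small stand-in for the real data
X[:, 2] = 0.0                  # a column filled with zeros, as in the question

# rowvar=False treats columns (features) as the variables to correlate.
# errstate silences the divide-by-zero warning NumPy raises for column 2.
with np.errstate(invalid="ignore", divide="ignore"):
    corr_before = np.corrcoef(X, rowvar=False)

X_clean, keep = drop_constant_columns(X)
corr_after = np.corrcoef(X_clean, rowvar=False)

print("NaN before filtering:", np.isnan(corr_before).any())
print("NaN after filtering:", np.isnan(corr_after).any())
print("Shape after filtering:", X_clean.shape)
```

The filtered array could then be passed to FeatureAgglomeration with affinity='correlation' as in the question. Note that newer scikit-learn releases renamed the affinity parameter to metric, so the keyword may need adjusting depending on the installed version.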
