Extract dendrogram from seaborn clustermap

The following example is taken from https://python-graph-gallery.com/404-dendrogram-with-heat-map/

It generates a heat map with dendrograms, which I assume are computed with scipy.

# Libraries
import seaborn as sns
import pandas as pd
from matplotlib import pyplot as plt

# Data set
url = 'https://python-graph-gallery.com/wp-content/uploads/mtcars.csv'
df = pd.read_csv(url)
df = df.set_index('model')
df.index.name = None  # drop the index label ('del df.index.name' raises an error on recent pandas)
df

# Default plot
sns.clustermap(df)

Question: How can one get the dendrogram in non-graphical form?

Background information: I want to cut the dendrogram just below its root. The root has one edge to a left cluster (L) and one edge to a right cluster (R). I'd like to get the lengths of these two edges and cut the whole dendrogram at the longer of the two.

Best regards

Hindgut answered 21/10, 2018 at 13:45 Comment(0)

clustermap returns a handle to the ClusterGrid object, which includes a child object for each dendrogram, h.dendrogram_col and h.dendrogram_row. Each of these holds the dendrogram itself in its dendrogram attribute: a dictionary with the dendrogram geometry in the format returned by scipy.cluster.hierarchy.dendrogram, from which you can compute the length of a specific branch.

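import numpy as np

h = sns.clustermap(df)
dgram = h.dendrogram_col.dendrogram  # dict in the format of scipy's dendrogram()
D = np.array(dgram['dcoord'])        # heights of the four corners of each link
I = np.array(dgram['icoord'])        # horizontal positions of the same corners

# The root link is the last entry. Each row of D is
# [left child height, merge height, merge height, right child height],
# so the lengths of the L/R branches are
yy = D[-1]
lenL = yy[1] - yy[0]
lenR = yy[2] - yy[3]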
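
To check what these indices mean without drawing anything, here is a small sketch that calls scipy directly. The random data is only a stand-in (it is not the mtcars table), and average linkage is used because that is, as far as I know, clustermap's default method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Stand-in data (an assumption for illustration, not the mtcars table)
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
Z = linkage(X, method="average")

# no_plot=True returns the geometry dictionary without needing matplotlib
dgram = dendrogram(Z, no_plot=True)
D = np.array(dgram["dcoord"])

# Each row of D is [left child height, merge height, merge height, right child height]
root = D[-1]
root_height = root[1]
len_left = root[1] - root[0]
len_right = root[2] - root[3]
print(root_height, len_left, len_right)
```

The root height read from dcoord should equal the distance stored in the last row of the linkage matrix, Z[-1, 2].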

The linkage matrix, the input used to compute the dendrogram, might also help:

h.dendrogram_col.linkage
h.dendrogram_row.linkage
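
A possible way to do the cut from the linkage matrix alone, sketched under some assumptions: the data below is a random stand-in and Z is built with scipy so the snippet runs without seaborn; with the clustermap above you would use Z = h.dendrogram_col.linkage instead. The helper name root_edge_lengths is made up for this example, and cutting halfway along the longer root edge is only one reading of the question.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def root_edge_lengths(Z):
    """Return the lengths of the two edges leaving the root (hypothetical helper)."""
    n = Z.shape[0] + 1  # number of original observations
    first, second = int(Z[-1, 0]), int(Z[-1, 1])  # children of the final merge
    root_height = Z[-1, 2]

    def height(node):
        # Leaves (index < n) sit at height 0; merged clusters at their merge distance
        return 0.0 if node < n else Z[node - n, 2]

    return root_height - height(first), root_height - height(second)

# Stand-in data: two well-separated groups (an assumption for illustration)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (10, 3)), rng.normal(8, 1, (10, 3))])
Z = linkage(X, method="average", metric="euclidean")

len_first, len_second = root_edge_lengths(Z)
longest = max(len_first, len_second)

# Cut halfway along the longer root edge
t = Z[-1, 2] - longest / 2
labels = fcluster(Z, t=t, criterion="distance")
print(len_first, len_second, labels)
```

If the threshold ends up below the merge height of the other child, that child is split as well; fcluster(Z, t=2, criterion="maxclust") is an alternative when exactly two clusters are wanted.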
Calamite answered 21/10, 2018 at 18:27 Comment(1)
Thank you very much for the detailed elaboration, this helped me to solve the problem! - Hindgut
