Plotting dendrogram in Scipy error for large dataset
E

2

17

I am using SciPy for hierarchical clustering. I can get flat clusters at a threshold using fcluster, but I also need to visualize the resulting dendrogram. The dendrogram function works fine for 5-6k user vectors, but my dataset consists of 16k user vectors. When I run it on all 16k users, dendrogram throws the following error:

  File "/home/enthought/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2333, in _dendrogram_calculate_info
    leaf_label_func, i, labels)
  File "/home/enthought/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2205, in _append_singleton_leaf_node
    ivl.append(str(int(i)))
RuntimeError: maximum recursion depth exceeded while getting the str of an object

Any ideas on visualizing the dendrogram for a larger dataset?

Eddie answered 18/4, 2012 at 6:42 Comment(2)
A simple idea is to extend your memory, otherwise you may need to dive into the implementation detail to make the routine memory friendly.Ewens
I had the same thing happen to me, but only when the clustering was done with some methods (single, average, complete), not with ward. I wonder what triggers this: what properties of same-size linkage matrices make the recursion go so deep?Musk
P
31

This may be a bit late, but if you feel comfortable with increasing your recursion limit to subvert the recursion depth limit, you could do so. It's not recommended, and definitely not 'pythonic', but it will likely get you the results you want.

import sys
sys.setrecursionlimit(10000)
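
If you would rather not leave the higher limit in place for the rest of the program, one option is to raise it only around the call that needs it. The following is a minimal sketch, not part of SciPy: the `recursion_limit` helper and the `depth` function are hypothetical names, and `depth` merely stands in for a deeply recursive call such as `dendrogram` on a large linkage matrix.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit, restoring the old value on exit.

    Hypothetical helper for illustration; it is not a SciPy or stdlib API.
    """
    previous = sys.getrecursionlimit()
    # Never lower the limit below what is already configured.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


def depth(n):
    # Stand-in for a deeply recursive routine (one frame per level).
    return 0 if n == 0 else 1 + depth(n - 1)


limit_before = sys.getrecursionlimit()

# Deep enough to fail under Python's usual default limit of 1000.
with recursion_limit(20000):
    reached = depth(3000)

limit_after = sys.getrecursionlimit()
```

In real use you would put the `dendrogram(...)` call inside the `with` block instead of `depth(3000)`. Note that a very high limit only moves the failure point: if the recursion is deep enough, the interpreter can still run out of C stack and crash.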
Papeterie answered 1/8, 2013 at 21:48 Comment(0)
T
1

Using sys.setrecursionlimit(1000000), I was able to process a large matrix and successfully return from a seaborn.clustermap call. I imagine this error could also be resolved by upgrading SciPy, or by supplying additional arguments and building the clustermap more thoughtfully with SciPy directly.
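
As one example of "additional arguments", `scipy.cluster.hierarchy.dendrogram` accepts `truncate_mode` and `p`, which collapse the lower part of the tree into single leaves. A dendrogram with 16k leaves is hard to read anyway, and truncating should also keep the recursion shallow. Here is a rough sketch under stated assumptions: the data is random and much smaller than the 16k vectors in the question, the `ward` method is an arbitrary choice, and `no_plot=True` is used only so the snippet runs without matplotlib.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

# Illustrative random data; the real input would be the user vectors.
rng = np.random.default_rng(0)
user_vectors = rng.normal(size=(500, 8))

# Build the linkage matrix (shape: (n - 1, 4)).
linkage_matrix = linkage(user_vectors, method="ward")

# Show only the last p merges; everything below them is contracted
# into a single leaf, so the drawn tree has p leaves instead of n.
truncated = dendrogram(
    linkage_matrix,
    truncate_mode="lastp",
    p=30,
    no_plot=True,  # drop this argument to draw the plot with matplotlib
)

# Contracted leaves are labelled with their member count, e.g. "(17)".
leaf_labels = truncated["ivl"]
```

If you need the detail inside one of the contracted branches, you can cut flat clusters with fcluster as in the question and plot a separate dendrogram for just that subset.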

Toenail answered 4/2, 2020 at 22:30 Comment(0)
