cluster-analysis Questions

6

Solved

I am currently working on trying to write code to calculate the degree matrix, so that I may compute the Laplacian L = D - A, where D=degree matrix, A=adjacency matrix. This will be later used in...
Vulpine asked 3/9, 2015 at 16:46
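
The relation L = D - A from the excerpt above can be illustrated with a minimal sketch. It assumes an undirected, unweighted graph stored as a dense list-of-lists adjacency matrix; the function names `degree_matrix` and `laplacian` are hypothetical and are not taken from the question.

```python
def degree_matrix(adjacency):
    """Return the diagonal degree matrix D for a square adjacency matrix A."""
    n = len(adjacency)
    degree = [[0] * n for _ in range(n)]
    for i, row in enumerate(adjacency):
        # The degree of node i is the sum of row i of A.
        degree[i][i] = sum(row)
    return degree


def laplacian(adjacency):
    """Return the unnormalized graph Laplacian L = D - A."""
    degree = degree_matrix(adjacency)
    n = len(adjacency)
    return [
        [degree[i][j] - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


# Small example graph: node 0 is connected to nodes 1 and 2.
A = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
L = laplacian(A)
```

Each row of L sums to zero, which is a quick sanity check on the result. For large graphs a sparse representation would likely be preferable to nested lists.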

2

Solved

I have a sparse matrix from scipy.sparse import * M = csr_matrix((data_np, (rows_np, columns_np))); then I'm doing clustering that way from sklearn.cluster import KMeans km = KMeans(n_clusters...
Pagas asked 22/4, 2015 at 13:26

2

Solved

We've been using Kmeans for clustering our logs. A typical dataset has 10 million samples with 100k+ features. To find the optimal k, we run multiple Kmeans in parallel and pick the one with the b...
Snowball asked 11/10, 2019 at 18:13

1

I have somewhere between 10-20k different time-series (24 dimensional data -- a column for each hour of the day) and I'm interested in clustering time series that exhibit roughly the same patterns ...
Thing asked 12/10, 2019 at 20:16

2

Solved

I'm exploring the possibility of clustering some categorical data with Python. I currently have 8 features, each with approximately 3-10 levels. As I understood both one-hot encoding with kmeans and...
Centrifugal asked 16/5, 2019 at 15:19

3

Solved

I'm working with BOW object detection and I'm working on the encoding stage. I have seen some implementations that use kd-Tree in the encoding stage, but most writings suggest that K-means clusteri...

2

Solved

I have a collection of photos and I'd like to distinguish clusters of the similar photos. Which features of an image and which algorithm should I use to solve my task?

2

I am trying to translate the R implementations of gap statistics and prediction strength http://edchedch.wordpress.com/2011/03/19/counting-clusters/ into python scripts for the estimation of number...
Lowbred asked 8/1, 2014 at 17:39

4

I have some dots in a 3-dimensional space and would like to cluster them. I know Python's module "cluster", but it has only K-Means. Do you know a module which has FCM (Fuzzy C-Means)? (If you know...
Timothee asked 18/7, 2011 at 16:47

4

I'm using a Gaussian Mixture Model (GMM) from sklearn.mixture to perform clustering of my data set. I could use the function score() to compute the log probability under the model. However, I am ...
Bortman asked 2/12, 2015 at 16:14

2

Solved

I'm building Kmeans in pytorch using gradient descent on centroid locations, instead of expectation-maximisation. Loss is the sum of square distances of each point to its nearest centroid. To ident...

4

Solved

I want to use silhouette to determine the optimal value for k when using KMeans clustering in Spark. Is there an optimal way to parallelize this, i.e. make it scalable?

3

Solved

I was learning about non-linear clustering algorithms and I came across this 2-D graph. I was wondering which clustering algorithm and combination of hyper-parameters will cluster this data well. ...

5

Solved

I have already trained my clustering model using hclust: model=hclust(distances,method="ward") And the result looks good. Now I get some new data records, and I want to predict which cluster ev...
Pigtail asked 11/1, 2014 at 15:48

1

Solved

I have a python code which uses igraph library import igraph edge = [(0, 6), (0, 8), (0, 115), (0, 124), (0, 289), (0, 359), (0, 363), (6, 60), (6, 115), (6, 128), (6, 129), (6, 130), (6, 131), (6...
Trapeziform asked 19/5, 2019 at 7:46

1

I have a 3000x50 feature vector matrix. I obtained a similarity matrix for this using sklearn.metrics.pairwise_distances as 'Similarity_Matrix'. Now I used networkx to create a graph using the simi...
Deidredeific asked 15/5, 2014 at 17:14

1

Solved

The Issue Given the following network of nodes and edges, I would like to derive all possible groupings of nodes where all nodes within a group are connected to all other nodes within that group ...
Consulate asked 29/4, 2019 at 20:25

4

Solved

I could use some advice on methods in R to determine the optimal number of clusters and later on describe the clusters with different statistical criteria. I’m new to R with basic knowledge about t...
Grania asked 6/11, 2012 at 10:51

1

Solved

I noticed that there are two different functions for spectral clustering in sklearn.cluster library: SpectralClustering and spectral_clustering. Although they differ in some details, both do spectr...
Kylander asked 13/4, 2019 at 21:40

1

I am performing mean shift clustering on a dataset. estimate_bandwidth function estimates the appropriate bandwidth to perform mean-shift clustering. Syntax: sklearn.cluster.estimate_bandwidth(X,...
Sulfuric asked 5/2, 2015 at 2:15

7

Solved

I'm looking for a decent implementation of the OPTICS algorithm in Python. I will use it to form density-based clusters of points ((x,y) pairs). I'm looking for something that takes in (x,y) pairs...

1

I have a data set which consists of more than one subset of data. If I plot Y vs. X, I get a few overlapping ellipses and I want to cluster them*. I have tried with the mixture from sklearn, the ...

2

Solved

I'm writing a piece of code to evaluate my clustering algorithm and I find that every kind of evaluation method needs the basic data from an m*n matrix like A = {aij} where aij is the number of dat...
Mainstream asked 30/9, 2011 at 15:56

2

I am trying to cluster ~30 million points (x and y co-ordinates) into clusters - the addition that makes it challenging is I am trying to minimise the spare capacity of each cluster while also ensu...

3

Solved

The k-medoids in the clara() function uses distance to form clusters so I get this pattern: a <- matrix(c(0,1,3,2,0,.32,1,.5,0,.35,1.2,.4,.5,.3,.2,.1,.5,.2,0,-.1), byrow=T, nrow=5) cl <- cla...
Crease asked 11/5, 2012 at 17:13
