cluster-analysis Questions
2
Solved
I have two N x N co-occurrence matrices (484x484 and 1060x1060) that I have to analyze. The matrices are symmetrical along the diagonal and contain lots of zero values. The non-zero values are inte...
Jakob asked 4/6, 2017 at 18:51
3
Solved
I am using R software (R commander) to cluster my data. I have a smaller subset of my data containing 200 rows and about 800 columns. I am getting the following error when trying kmeans cluster and...
Colorist asked 16/4, 2011 at 13:54
1
Solved
I'm running some cluster analysis and I'm trying to figure out two main things:
1) How to best interpret the results of the p-values in pvclust (what is the null that they are establishing?)
2) H...
Enterovirus asked 23/4, 2017 at 20:34
1
I've had encouraging results clustering a set of entity names using scikit-learn's affinity propagation implementation, with a modified Jaro-Winkler distance as the similarity metric, but my cluste...
Denumerable asked 24/4, 2017 at 14:8
2
I am trying to work on android mobile app where I have a functionality to find matches according to interest and location. Many dating apps are already doing some kinda functionality for example Ti...
Tetragon asked 23/4, 2017 at 16:23
8
Solved
Expectation Maximization (EM) is a kind of probabilistic method to classify data. Please correct me if I am wrong if it is not a classifier.
What is an intuitive explanation of this EM tech...
Irma asked 4/8, 2012 at 10:56
1
As per as the sklearn kmeans documentation, it says that k-means requires a matrix of shape=(n_samples, n_features). But I provided a distance matrix of shape=(n_samples,n_samples) where each index...
Decembrist asked 20/4, 2017 at 7:22
3
Solved
We're currently trying to detect the object regions in medical instruments images using the methods available in OpenCV, C++ version. An example image is shown below:
Here are the steps we're fol...
Ethridge asked 20/5, 2015 at 14:40
1
Solved
A dendrogram is a data structure used with hierarchical clustering algorithms that groups clusters at different "heights" of a tree - where the heights correspond to distance measures between clust...
Till asked 3/9, 2016 at 15:48
2
Solved
I want to cluster data with missing columns. Doing it manually I would calculate the distance in case of a missing column simply without this column.
With scikit-learn, missing data is not possibl...
Vagary asked 24/2, 2016 at 19:39
8
Solved
I am looking for Python implementation of k-means algorithm with examples to cluster and cache my database of coordinates.
Fetal asked 9/10, 2009 at 19:16
1
Solved
I'd like to find a threshold value for a bimodal distribution. For example, a bimodal distribution could look like the following:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(...
Gittern asked 10/2, 2017 at 1:22
3
I am a newbie in text mining, here is my situation.
Suppose i have a list of words ['car', 'dog', 'puppy', 'vehicle'], i would like to cluster words into k groups, I want the output to be [['car', ...
Poteen asked 31/1, 2017 at 11:25
3
Solved
I'm reading the following article for my master thesis:
http://graphics.cs.cmu.edu/projects/discriminativePatches/discriminativePatches.pdf
In section 2.1 it said: "we turn the classification step...
Swamp asked 5/1, 2014 at 18:42
0
I know how to group rows (genes) by annotation category using pheatmap, and I know how to perform Person's correlation clustering on the whole set of rows (genes), but what I would like to accompli...
Magdalenmagdalena asked 14/1, 2017 at 8:56
1
Solved
I have three matrices to compare. Each of them is 5x6. I originally wanted to use hierarchical clustering to cluster the matrices, such that the most similar matrices are grouped, given a threshold...
Microdont asked 1/1, 2017 at 15:19
1
I know how the algorithm works, but I'm not sure how it determines the clusters. Based on images I guess that it sees all the neurons that are connected by edges as one cluster. So that you might h...
Sturrock asked 13/12, 2016 at 12:24
3
Solved
I am doing some clustering using K-means in MATLAB. As you might know the usage is as below:
[IDX,C] = kmeans(X,k)
where IDX gives the cluster number for each data point in X, and C gives the ce...
Ashe asked 9/12, 2010 at 15:43
2
Solved
I'd like to understand the parameter max_iter from the class sklearn.cluster.KMeans.
According to the documentation:
max_iter : int, default: 300
Maximum number of iterations of the k-means algor...
Masera asked 1/12, 2016 at 10:10
1
Assume, I have hourly data corresponding to 5 categories for consective 10 days, created as:
library(xts)
set.seed(123)
timestamp <- seq(as.POSIXct("2016-10-01"),as.POSIXct("2016-10-10 23:59:59...
Lalonde asked 19/11, 2016 at 5:7
2
I have been trying to cluster multiple datasets of URLs (around 1 million each), to find the original and the typos of each URL. I decided to use the levenshtein distance as a similarity metric, al...
Rosalindarosalinde asked 2/8, 2016 at 12:20
2
Solved
I was trying to use DBSCAN algorithm from scikit-learn library with cosine metric but was stuck with the error.
The line of code is
db = DBSCAN(eps=1, min_samples=2, metric='cosine').fit(X)
wh...
Albertson asked 23/9, 2015 at 17:10
8
I have need to do some cluster analysis on a set of 2 dimensional data (I may add extra dimensions along the way).
The analysis itself will form part of the data being fed into a visualisation, r...
Inconformity asked 9/11, 2009 at 14:0
3
Solved
I am trying to run the package NbClust on my data (100 rows x 130 columns) to determine the number of clusters I should choose, but I keep getting this error if I try to apply it to the full data s...
Amalee asked 18/12, 2013 at 22:16
1
I would like to add to a clusplot plot the variables used for pca as arrows. I am not sure that a way has been implemented (I can't find anything in the documentation).
I have produced a clusplot ...
Broeker asked 21/4, 2015 at 11:11
© 2022 - 2024 — McMap. All rights reserved.