ELKI implementation of OPTICS clustering algorithm detects only one cluster

I'm having an issue with the OPTICS implementation in ELKI. I used the same data with the DBSCAN implementation and it worked like a charm. I'm probably missing something in the parameters, but I can't figure out what; everything seems to be right.

The data is a simple 300x2 matrix consisting of 3 clusters with 100 points each.

DBSCAN result:

Clustering result of DBSCAN

MinPts = 10, Eps = 1

OPTICS result:

Clustering result of OPTICS

MinPts = 10

Triglyceride answered 25/12, 2012 at 14:54 Comment(1)
I read the paper more carefully and discovered that the output of the OPTICS algorithm itself is not a clustering but an ordering of the points. It gives us information about the density-based cluster structure (in the form of a reachability plot), which we then have to work with afterwards. I recommend using the OPTICSXi algorithm in ELKI to extract the clusters. - Triglyceride

You apparently already found the solution yourself, but here is the long story:

The OPTICS class in ELKI only computes the cluster order / reachability diagram.

In order to extract clusters, you have different choices, one of which (the one from the original OPTICS publication) is available in ELKI.

So in order to extract clusters in ELKI, you need to use the OPTICSXi algorithm, which will in turn use either OPTICS or the index-based DeLiClu to compute the cluster order.

The reason this is split into two parts in ELKI is probably so that you can, on the one hand, implement other logic for extracting the clusters and, on the other hand, implement different methods such as DeLiClu for computing the cluster order. That would align well with the modular architecture of ELKI.
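To make the two-stage split concrete, here is a minimal pure-Python sketch, not ELKI code. The function names `optics_order` and `extract_by_threshold` and the synthetic grid data are hypothetical, made up for illustration. The first stage only produces the cluster order with reachability values; the second stage is a simple horizontal cut through the reachability plot (the DBSCAN-style extraction), not the Xi method that OPTICSXi implements.

```python
import math
from collections import Counter

def optics_order(points, min_pts, eps=math.inf, dist=math.dist):
    """Stage 1: compute cluster order, reachability and core distances (O(n^2) sketch)."""
    n = len(points)
    reach = [math.inf] * n      # reachability distance of each point
    core = [math.inf] * n       # core distance (inf if not a core point)
    processed = [False] * n
    order = []
    for start in range(n):
        if processed[start]:
            continue
        seeds = {start}
        while seeds:
            # Always continue with the unprocessed point of smallest reachability.
            i = min(seeds, key=lambda j: (reach[j], j))
            seeds.remove(i)
            processed[i] = True
            order.append(i)
            neigh = sorted((dist(points[i], points[j]), j) for j in range(n))
            neigh = [(d, j) for d, j in neigh if d <= eps]  # includes i itself
            if len(neigh) < min_pts:
                continue  # not a core point, so it cannot expand the order
            core[i] = neigh[min_pts - 1][0]
            for d, j in neigh:
                if not processed[j]:
                    reach[j] = min(reach[j], max(core[i], d))
                    seeds.add(j)
    return order, reach, core

def extract_by_threshold(order, reach, core, cut):
    """Stage 2: cut the reachability plot at `cut`; label -1 means noise."""
    labels = [-1] * len(order)
    cluster = -1
    for i in order:
        if reach[i] > cut:
            if core[i] <= cut:   # a peak followed by a valley starts a new cluster
                cluster += 1
                labels[i] = cluster
        else:
            labels[i] = cluster  # still inside the current valley
    return labels

# Hypothetical data: three 10x10 grids (100 points each), far apart from each other.
points = [(cx + 0.2 * a, cy + 0.2 * b)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for a in range(10) for b in range(10)]

order, reach, core = optics_order(points, min_pts=10)
labels = extract_by_threshold(order, reach, core, cut=1.0)
print(Counter(labels))
```

Because stage 2 only consumes `order`, `reach` and `core`, a different extraction strategy could be swapped in without touching stage 1, and stage 1 could be replaced by another way of computing the cluster order.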

IIRC there is at least one more method (apparently not yet in ELKI) that extracts clusters by looking for local maxima, then extending them horizontally until they hit the end of the valley. And there was a different one that used "inflexion points" of the plot.

Primal answered 25/12, 2012 at 22:45 Comment(0)

@AnonyMousse pretty much put it right. I just can't upvote or comment yet.

We hope to have some students contribute the other cluster extraction methods as small student projects over time. They are not essential for our research, but they are good tasks for students that want to learn about ELKI to get started.

ELKI is a fast-moving project, and it lives on community contributions. We would be happy to see you contribute some code to it. We know that the codebase is not easy to get started with: it is fairly large, and the generality of the implementation and the support for index structures add to the learning curve. We try to add tutorials to help you get started. And once you are used to it, you will actually benefit from the architecture: your algorithms get the benefits of indexing and arbitrary distance functions, whereas if you implemented them from scratch, you would likely only support Euclidean distance and no index acceleration.

Seeing that you struggled with OPTICS, I will try to write an OPTICS tutorial in the new year. In particular, OPTICS can benefit a lot from using an appropriate index structure.

Impertinence answered 31/12, 2012 at 14:19 Comment(6)
I would urge that the ELKI UI be made more intuitive. Currently, it is barely usable: choosing parameters requires significant guesswork, and exceptions are used to communicate, poorly, that a parameter value is not acceptable. - Itis
UI contributions would be welcome. I literally hate doing UI. But you need to understand that it is a tool that tries to offer all possible choices, not only the best pick. Any UI must be able to adapt to constantly arriving new algorithms; we cannot afford to update the UI code every time a new algorithm or option is added. In the end ELKI is a research tool, not an end-user application, so our focus is not on a nice UI (nevertheless, contributions are welcome). - Impertinence
Have you considered a wrapper to fit into another package such as Weka, which has a friendlier UI but less effective algorithms? This would certainly give more exposure than the current UI offers. I understand that ELKI is a research tool, but having a problematic UI limits the availability of the algorithms for investigation purposes. - Itis
I personally don't need that, so I'm not going to do this myself. In fact, I am very happy with the current UI; it serves my purposes perfectly. Yes, of course I would like a nicer UI, but who is going to do it? - Impertinence
Understood. I was asking in order to gauge interest in, or opposition to, being linked to Weka. I have ported algorithms to Weka in the past and could add some wrappers, as I have done for several clustering algorithms. It would just be a matter of understanding the parameters, which is part of the problem that I often face in using the MiniGUI. - Itis
Weka is much less performant, and I see no benefits in adding an ELKI module to Weka that would only copy data from Weka to ELKI and back. The Weka UI is not that pretty either. I'd like to see someone do a RELKI package for R, though, because R is popular and powerful for preprocessing data. There was someone working on an ELKI module for Rapidminer, but I have not seen this published yet. - Impertinence
