My naive maximal clique finding algorithm runs faster than Bron-Kerbosch's. What's wrong?

In short, my naive code (in Ruby) looks like:

# $seen is a hash to memoize previously seen sets
# $sparse is a hash of usernames to a list of neighboring usernames
# $sets is the list of output clusters

$seen = {}
def subgraph(set, adj)
    hash = (set + adj).sort
    return if $seen[hash]
    $sets.push set.sort.join(", ") if adj.empty? and set.size > 2
    adj.each {|node| subgraph(set + [node], $sparse[node] & adj)}
    $seen[hash] = true
end

$sparse.keys.each do |vertex|
    subgraph([vertex], $sparse[vertex])
end

And my Bron-Kerbosch implementation:

def bron_kerbosch(set, points, exclude)
    $sets.push set.sort.join(', ') if set.size > 2 and exclude.empty? and points.empty?
    points.each_with_index do |vertex, i|
        points[i] = nil
        bron_kerbosch(set + [vertex],
                      points & $sparse[vertex],
                      exclude & $sparse[vertex])
        exclude.push vertex
    end
end

bron_kerbosch [], $sparse.keys, []

I also implemented pivoting and degeneracy ordering, which cut down on bron_kerbosch execution time, but not enough to overtake my initial solution. It seems wrong that this is the case; what algorithmic insight am I missing? Here is a writeup with more detail if you need to see fully working code. I've tested this on pseudo-random sets up to a million or so edges in size.

Tights answered 1/3, 2011 at 1:13 Comment(6)
I tried your code on some other test cases and it was about twice as slow as B–K. What do your tests look like? - Hygienic
I generated edges with a pseudo-random routine. Would you mind dumping your test cases and code somewhere for me to play with? - Tights
mediafire.com/file/5x5p7tu2t9c7r1a/tests.zip - Hygienic
Thanks, I can definitely see some of these test cases being slower with my code. I guess it comes down to proper input selection. I'll try to puzzle out why. - Tights
Based on your title, I wondered which Olympics Bron Kerbosch competed in. - Frere
Maybe your pseudo-random routine generates something strange? Maybe N is too small for algorithmic complexity to matter? - Morn

I don't know how you generate the random graphs for your tests, but I suppose you use a function that draws numbers from a uniform distribution, so the graphs you obtain are very homogeneous. This is a common problem when testing graph algorithms: good test cases are very difficult to create (often as hard as solving the original problem).
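To make the difference concrete, here is a rough sketch of two generators: one that joins every pair of vertices independently with the same probability (the homogeneous case I'm guessing you have), and one that plants a dense clique on top of it. The names `uniform_graph` and `plant_clique!` are made up for this example and are not taken from your code; the result has the same shape as `$sparse` (a hash from vertex to a list of neighbours).

```ruby
# Hypothetical helper: every pair of vertices is joined independently
# with probability p, so all vertices end up with similar degrees.
def uniform_graph(n, p, rng)
  adj = Hash.new { |h, k| h[k] = [] }
  (0...n).each { |v| adj[v] }            # create an entry even for isolated vertices
  (0...n).to_a.combination(2) do |u, v|
    next unless rng.rand < p
    adj[u] << v
    adj[v] << u
  end
  adj
end

# Hypothetical helper: picks k random vertices and connects all of them
# to each other, which adds a dense pocket to an otherwise uniform graph.
def plant_clique!(adj, k, rng)
  members = adj.keys.sample(k, random: rng)
  members.combination(2) do |u, v|
    adj[u] |= [v]                        # |= avoids duplicate neighbours
    adj[v] |= [u]
  end
  members.sort
end

rng   = Random.new(42)                   # fixed seed so runs are repeatable
graph = uniform_graph(200, 0.05, rng)
plant_clique!(graph, 12, rng)
```

Timing both algorithms on `graph` before and after the call to `plant_clique!` is one way to see whether the structure of the input matters; the sizes above are arbitrary.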

The max-clique problem is a well-known NP-hard problem, and both algorithms (the naive one and Bron-Kerbosch) have the same worst-case complexity, so we can't expect an improvement across all test cases, only on some particular ones. Because you generated your graphs from a uniform distribution, those particular cases don't show up in your data.

That's why both algorithms perform very similarly on your data. And because the Bron-Kerbosch algorithm is a little more complex than the naive one, the naive one comes out faster.
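If you want to check whether a given input has the kind of structure that the Bron-Kerbosch refinements exploit, counting recursive calls is more telling than wall-clock time. Below is a minimal sketch, not your implementation: `bk_plain` and `bk_pivot` are names I made up, `adj` plays the role of `$sparse`, and the pivot rule (pick the vertex with the most neighbours among the remaining candidates) is just one common choice.

```ruby
# Bron-Kerbosch without pivoting. r = current clique, p = candidates,
# x = already processed vertices; calls is a one-element array used as a counter.
def bk_plain(r, p, x, adj, out, calls)
  calls[0] += 1
  out << r.sort if p.empty? && x.empty?
  p.dup.each do |v|                      # iterate over a snapshot of p
    bk_plain(r + [v], p & adj[v], x & adj[v], adj, out, calls)
    p -= [v]                             # v is done: move it from p to x
    x += [v]
  end
end

# Same recursion, but only vertices that are NOT neighbours of the pivot
# are branched on.
def bk_pivot(r, p, x, adj, out, calls)
  calls[0] += 1
  if p.empty?
    out << r.sort if x.empty?
    return
  end
  pivot = (p + x).max_by { |u| (p & adj[u]).size }
  (p - adj[pivot]).each do |v|
    bk_pivot(r + [v], p & adj[v], x & adj[v], adj, out, calls)
    p -= [v]
    x += [v]
  end
end

# Tiny example: a complete graph on 5 vertices.
k5 = (0...5).to_h { |v| [v, (0...5).to_a - [v]] }
plain_out, plain_calls = [], [0]
pivot_out, pivot_calls = [], [0]
bk_plain([], k5.keys, [], k5, plain_out, plain_calls)
bk_pivot([], k5.keys, [], k5, pivot_out, pivot_calls)
puts "plain: #{plain_calls[0]} calls, pivot: #{pivot_calls[0]} calls"
```

On this dense example the plain version makes 32 calls and the pivoting version 6, and both report the single clique of all five vertices. On a sparse, homogeneous graph the two counts tend to be much closer, which is the situation I suspect your tests are in.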

Cushitic answered 24/5, 2012 at 14:21 Comment(0)
