The score makes sense if you think about the quantity that it is meant to capture.
The problem being addressed here is figuring out what's the best way to place vertices of a graph into semi-clusters (simply a group of vertices where each vertex can be in more than one semi-cluster) with some upper bound on the total number of semi-cluster. So one method of finding the "best" way is to assign a score to any potential semi-cluster (in other words, to any arbitrary group of vertices). Then the problem becomes that of maximizing the total score.
So, semi-cluster are meant to capture cliques in a graph. For example in a social graph, a semi-cluster might be the members of a high school basketball team.
Thus more interior edges equates to a "better" semi-cluster. This explains the I_c
in the numerator. Similarly, you want to have very few boundary edges since if there are a lot of boundary edges, then that means there is likely to be a better semi-group containing the one you are examining. This gives the -f_b * B_c
in the numerator. f_b
is simply a scaling factor so that you can tune how much penalty you want to assign the boundary edges.
The denominator is also kind of a scaling factor. It is used to normalize the semi-cluster scores so that small clusters don't get completely dominated by larger ones. An extreme example of this is if you consider the semi-group of everyone in the world. Clearly there are no boundary edges and tons of interior edges, but it is undoubtably a less useful semi-group than the high school basketball team.