Effects of violating compareTo transitivity contract due to numerical precision errors
Asked Answered
B

3

7

I have some numbers I'm trying to compare. They represent lengths of paths through different spaces.

Unfortunately for me, some imprecision was causing false comparisons. For instance, after noting the wrong effects, I found that I was having comparisons like this:

a = 384.527100541296
b = 384.52710054129614 // Note the trailing 14 

For my purposes, a and b are supposed to be equal.

I noted the guava has a fuzzyCompare() method for doubles which seems to do what I want in ignoring some of this precision:

private static final double COMPARISON_PRECISION=1e-10;

private static final Comparator<Double> fuzzyCompare= new Comparator<Double>(){
    public int compare(Double o1, Double o2) {
        return DoubleMath.fuzzyCompare(o1, o2, COMPARISON_PRECISION);
    }   
};

public int compareTo(Object o) {
    if (o instanceof Spam) {
       Spam other = (Spam) (o);
       return ComparisonChain.start()
       .compare(this.getLength(),other.getLength(),fuzzyCompare)
       //...
       .result();
    } else {
       throw new ClassCastException();
    }
}

The warning on that fuzzy compare did not slip my notice:

This is not a total ordering and is not suitable for use in Comparable.compareTo(T) implementations. In particular, it is not transitive

My question is, is this lack of transitivity a real problem? If it is, how would it present itself? I would think that if the comparison were really truly violated it would throw an error similar to this question: Java error: Comparison method violates its general contract, and its not doing that even against a variety of values I've tested.

Or perhaps since an IllegalArgumentException is a runtime error, I just haven't run across the problems yet because only some devious values will trigger the problem?

Or perhaps it is doing something wrong right now, it just subtle enough that I haven't noticed it?

Branscum answered 30/10, 2017 at 15:18 Comment(1)
It is more subtle. Actually it depends on the initial sequence of your List<T> to be sorted, whether Collection.sort(List<T>, Comparator<T>) will throw the Java error: Comparison method violates its general contract or not.Commutable
L
5

TL;DR:

Your operator is not transitive. Consider a = 0, b = 0.6, c = 1.2 with a tolerance of 1. a==b, b==c but a!=c. The solution is to partition your values into classes (for example by rounding or truncating) and use Double.compare() to preserve transitivity.

Detailed explanation:

First lets discuss if your data is transitive while using fuzzyCompare(double, double, double):

While in most cases your data will be transitive, it is possible to generate samples which are not. Lets take the following values:

a = 384.52710054120
b = 384.52710054126
c = 384.52710054132

As you can see that using our new metric the following are true: a==b, b==c, but a!=c. As you can see, you have violated transitivity.

Is it a problem if your Comparator is non-transitive?

Methods assert certain conditions by using documentation and/or annotations. The compare method promises that the method is transitive. Breaking that promise might be ok for a lot of cases were transitivity is not important, but code which relies on that promise could be broken.

What is an example of code which might not work if the promise of transitivity is broken?

Lets create a scenario where we have 3 elements of type Foo which are not transitive according to some Comparator called fooComparator . We call them f1, f2 and f3.

Comparator<Foo> fooComparator = new Comparator<Foo>(){
    public int compare(Foo o1, Foo o2) {
        // some non-transitive return value
    }   
};

Since they are not transitive, lets assume f0 < f1, f1 < f2, f2 < f0 holds true. What would happen if you put them in a list and tried to sort() them?

List<Foo> foos = new LinkedList<>();
Collections.addAll(f1, f2, f3)
Collections.sort(foos, fooComparator);

How to fix the problem

You can create a transitive operator by mapping data to another data set and use a transitive operator defined on that set. Lets map the real numbers to the real numbers with a lower precision.

Consider the following values:

a = 0.01; b = 0.05; c = 0.13; d = 0.19; e = 0.21

If you truncate them to the second digit (Math.truncate(x * 10)/10) and use Double.compare(), transitivity is preserved.

You can see that we have put our values into three classes {a, b} < {c, d} < {e}. There surely is some important theorem which proves that this is the case, but I can't remember its name..

Layette answered 30/10, 2017 at 15:23 Comment(9)
I suppose that in order for f2<f0 and f0<f1 to be true, the difference must be small enough that the fuzzy compare is coming into effect. I would argue then that even though the actual numbers are as you described, for the intent of my program, I'm asserting that f0=f1=f2. I think the sort is still stable, so at least I haven't introduced non-determinism right?Branscum
well, if your data is transitive within the boundaries of your application: yes. you can ignore that warning. the warning is still very important, as that might not be the case for every applicationLayette
i have to admit i didn't get your question at the first shot. my answer is still: yes, its a problem. give me a minute to edit my answerLayette
i fixed my answer. look at the last secion i addedLayette
Well. That is a problem. I hadn't thought of that. I suppose I'll have to track down whats causing the imprecision in the first place.Branscum
why dont you just round your numbers to the tenth digit and use the common Double.compare()Layette
Wouldn't that be essentially the same problem? Choosing what rounding method would be important I think.Branscum
no, just truncate it at a certain precision. transitivity will be maintained. check it with the examples providedLayette
i added an example to the bottom of my answerLayette
B
1

is this lack of transitivity a real problem

Maybe, depends on the problem you're trying to solve. But you might run into subtle problems where code expecting implementations of Comparator to work in a transitive way. It is hard to say what the effects are, other than "undefined".

I would not be very happy if I saw this code in a review: you are overloading Java's well-defined concept of comparison with your own - valid, but different - notion of comparison.

If you call it something different - fuzzyCompare, FuzzyComparator etc - there is no confusion of the two notions.

Brahmin answered 30/10, 2017 at 15:24 Comment(2)
I guess I was trying to understand what the effects might be. You're saying that there is no general "bad effect" like a specific error, but that just the violation itself opens the code up to a whole world of "undefined" behavior?Branscum
@Branscum exactly. There is nothing to say that it won't work, just you shouldn't be surprised if it doesn't.Brahmin
L
1

Using a non-transitive compareTo is a terrible idea:

  • Sorting may throw as already pointed.
  • Even worse, sorting may return wrong results, possibly completely wrong. It relies on the contract which you're violating and there's absolutely no guarantee that it ends up well (i.e., with an exception). Analyzing TimSort won't help, as the algorithm may get replaced by a better one in a few years.
  • Any SortedMap may break terribly. It may happen that things you put it, won't be found (such things do really happen with a HashMap with a broken equals or hashCode). Again, the implementation may change in a few years and then anything is possible.

I'd strongly suggest to name your method differently. Or, create a Comparator documented with a corresponding warning (this can lead to the same kind of problems, but it's much more explicit).

Note that with a broken Comparable, even a HashMap may break as in case of many collisions, it uses compareTo when possible.

Lowlife answered 31/10, 2017 at 13:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.