Why is ThreadLocalRandom implemented so bizarrely?
Asked Answered
C

1

10

This question regards the implementation of ThreadLocalRandom in OpenJDK version 1.8.0.

ThreadLocalRandom provides a per-thread random number generator without the synchronization overhead imposed by Random. The most obvious implementation (IMO) would be something like this, which appears to preserve backward compatibility without much complexity:

public class ThreadLocalRandom extends Random {
    private static final ThreadLocal<ThreadLocalRandom> tl =
        ThreadLocal.withInitial(ThreadLocalRandom::new);
    public static ThreadLocalRandom current() {
        return tl.get();
    }
    // Random methods moved here without synchronization
    // stream methods here
}

public class Random {
    private ThreadLocalRandom delegate = new ThreadLocalRandom();
    // methods synchronize and delegate for backward compatibility
}

However, the actual implementation is totally different and quite bizarre:

  • ThreadLocalRandom duplicates some of the methods in Random verbatim and others with minor modifications; surely much of this code could have been reused.
  • Thread stores the seed and a probe variable used to initialize the `ThreadLocalRandom, violating encapsulation;
  • ThreadLocalRandom uses Unsafe to access the variables in Thread, which I suppose is because the two classes are in different packages yet the state variables must be private in Thread - Unsafe is only necessary because of the encapsulation violation;
  • ThreadLocalRandom stores its next nextGaussian in a static ThreadLocal instead of in an instance variable as Random does.

Overall my cursory inspection seems to reveal an ugly copy of Random with no advantages over the simple implementation above. But the authors of the standard library are smart so there must be some reason for this weird approach. Does anyone have any insight into why ThreadLocalRandom was implemented this way?

Corespondent answered 15/11, 2016 at 21:31 Comment(9)
I haven't seen the implementation for ThreadLocalRandom, but I do know that parent types should not know of/depend on their subtypes (in regards to your simpler alternative)Draff
@VinceEmigh Well, both could delegate to UnsynchronizedRandom. Either way the major issues with the current implementation are trivially avoided.Corespondent
Looks like the class was rewritten in Java 8. The JDK 1.7u79 version only has 226 LOC while the OpenJDK 8b132 version has 1101 LOC. The code comments right after the class declaration in the Java 8 version seem to explain some of your questions.Bonn
The internal docs of the Java 8 version mention SplittableRandom, from which some code is duplicated.Bonn
@MickMnemonic I see. Comments like this do not inspire confidence: "Implementations of non-core methods are mostly the same as in SplittableRandom, that were in part derived from a previous version of this class." Perhaps such copying is required to maintain backward compatibility, but if so the authors have allowed too many implementation details to leak into the documented API.Corespondent
"Comments like this do not inspire confidence" - you seem out to get the JDK authors; why are you so concerned that they've done something wrong? If you find a bug, suggest a fix. Otherwise, rest easy in the knowledge that many, many smart people have worked on this code for almost 20 years. It's battle tested.Sharpnosed
To your comment about leaky implementation details, that's a testament to the standards the authors hold themselves to, not a flaw. By working so hard to ensure backwards compatibility the vast majority of Java applications can upgrade with minimal issues. No small feat for a language as widely used as Java.Sharpnosed
@Sharpnosed The very reason I asked this question is that I thought there must be a good reason for this bad-looking code. If I didn't think the JDK authors were good I would have assumed it's just shitty code. Regarding implementation details, they can and do change from version to version. String.substring changed to copy the substring in Java 7, which affects Java applications. The particular algorithm used for random number generation never should have been documented in such detail and never should have been relied on exactly. I think that's one of the many legacy mistakes in the Java API.Corespondent
The String.substring change was (and still is) a contentious one. I'm not saying the JDK team never makes mistakes, but I'm suggesting starting from an assumption of good intent and competency, rather than using language like "bizarrely", "ugly", "no advantages", and the like. That you don't see the advantages doesn't mean they don't exist. I realize that's what you're trying to get at, but your framing of the question is needlessly presumptuous. I'm simply encouraging giving the JDK folks a little more benefit of the doubt.Sharpnosed
S
13

The key problem is a lot of the code is legacy and can't (easily) be changed - Random was designed to be "thread-safe" by synchronizing all its methods. This works, in that instances of Random can be used across multiple threads, but it's a severe bottleneck as no two threads can simultaneously retrieve random data. A simple solution would be to construct a ThreadLocal<Random> object thereby avoiding the lock contention, however this still isn't ideal. There's still some overhead to synchronized methods even when uncontested, and constructing n Random instances is wasteful when they're all essentially doing the same job.

So at a high-level ThreadLocalRandom exists as a performance optimization, hence it makes sense that its implementation would be "bizarre", as the JDK devs have put time into optimizing it.

There are many classes in the JDK that, at first glance, are "ugly". Remember however that the JDK authors are solving a different problem than you. The code they write will be used by thousands if not millions of developers in countless ways. They have to regularly trade-off best-practices for efficiency because the code they're writing is so mission critical.

Effective Java: Item 55 also addresses this issue - the key point being that optimization should be done as a last resort, by experts. The JDK devs are those experts.

To your specific questions:

ThreadLocalRandom duplicates some of the methods in Random verbatim and others with minor modifications; surely much of this code could have been reused.

Unfortunately no, as the methods on Random are synchronized. If they were invoked ThreadLocalRandom would pull in Random's lock-contention trouble. TLR needs to override every method in order to remove the synchronized keyword from the methods.

Thread stores the seed and a probe variable used to initialize the ThreadLocalRandom, violating encapsulation;

First off, it's really not "violating encapsulation" since the field is still package-private. It's encapsulated from users, which is the goal. I wouldn't get too hung up on this as the decisions were made here to improve performance. Sometimes performance comes at the cost of normal good design. In practice this "violation" is undetectable. The behavior is simply encapsulated inside two classes instead of a single one.

Putting the seed inside Thread allows ThreadLocalRandom to be totally stateless (aside from the initialized field, which is largely irrelevant), and therefore only a single instance ever needs to exist across the whole application.

ThreadLocalRandom uses Unsafe to access the variables in Thread, which I suppose is because the two classes are in different packages yet the state variables must be private in Thread - Unsafe is only necessary because of the encapsulation violation;

Many JDK classes use Unsafe. It's a tool, not a sin. Again, I just wouldn't get too stressed out about this fact. The class is called Unsafe to discourage lay-developers from misusing it. We trust/hope the JDK authors are smart enough to know when it's safe to use.

ThreadLocalRandom stores its next nextGaussian in a static ThreadLocal instead of in an instance variable as Random does.

Since there will only ever be one instance of ThreadLocalRandom there's no need for this to be an instance variable. I suppose you could alternatively make the case that there's no need for it to be a static either, but at that point you're just debating style. At a minimum making it static more clearly leaves the class essentially stateless. As mentioned in the file, this field is not really necessary, but ensures similar behavior to Random.

Sharpnosed answered 20/1, 2017 at 9:33 Comment(2)
Couldn't the synchronized code be extracted to a package-private common base class, as was done with StringBuffer?Interlinear
In practice that wouldn't be possible since Random and ThreadLocalRandom are in separate packages. Why that's the case is harder to say. Looking at the history the class was introduced as part of JSR 166 which generally placed concurrent-friendly code in java.util.concurrent, so perhaps that was simply done to be consistent.Sharpnosed

© 2022 - 2024 — McMap. All rights reserved.