Creating one non-thread-safe object per thread and using happens-before guarantee
Asked Answered
A

3

16

I want to use SOAPConnectionFactory and MessageFactory classes from SAAJ with multiple threads, but it turns out that I can't assume they are thread-safe. Some related posts:

Here is an interesting little proof that it can be thread safe: http://svn.apache.org/repos/asf/axis/axis2/java/core/tags/v1.5.6/modules/saaj/src/org/apache/axis2/saaj/SOAPConnectionImpl.java it is said

Although thread safety is not explicitly required by the SAAJ specs, it appears that the SOAPConnection in Sun's reference implementation is thread safe.

But still I don't think it's enough proof to treat SAAJ classes as thread-safe.

So my question: is the idiom below correct? I create exactly one SOAPConnection and MessageFactory object using the possibly non-thread safe factories inside the main thread and then safely publish those object to an executor task using the happens-before guarantee of the CompletionService interface. I also use this happens-before guarantee to extract the result HashMap object.

Basically I just want to verify the sanity of my reasoning.

public static void main(String args[]) throws Exception {
    ExecutorService executorService = Executors.newFixedThreadPool(10);
    CompletionService<Map<String, String>> completionService = new ExecutorCompletionService<>(executorService);

    //submitting 100 tasks
    for (int i = 0; i < 100; i++) {
        // there is no docs on if these classes are thread-safe or not, so creating them before submitting to the
        // external thread. This seems to be safe, because we are relying on the happens-before guarantees of the
        // CompletionService.
        SOAPConnectionFactory soapConnectionFactory = SOAPConnectionFactory.newInstance();
        SOAPConnection soapConnection = soapConnectionFactory.createConnection();
        MessageFactory messageFactory = MessageFactory.newInstance();
        int number = i;// we can't just use i, because it's not effectively final within the task below
        completionService.submit(() -> {
            // using messageFactory here!
            SOAPMessage request = createSOAPRequest(messageFactory, number);
            // using soapConnection here!
            SOAPMessage soapResponse = soapConnection.call(request, "example.com");
            soapConnection.close();
            ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
            soapResponse.writeTo(outputStream);
            // HashMap is not thread-safe on its own, but we'll use the happens-before guarantee. See f.get() below.
            Map<String, String> result = new HashMap<>();
            result.put("soapResponse", new String(outputStream.toByteArray()));
            return result;

        });
    }

    // printing the responses as they arrive
    for (int i = 0; i < 100; i++) {
        Future<Map<String, String>> f = completionService.take();
        Map<String, String> result = f.get();
        System.out.println(result.get("soapResponse"));
    }

    executorService.shutdown();
}

/**
 * Thread-safe static method
 */
private static SOAPMessage createSOAPRequest(MessageFactory messageFactory, int number) throws Exception {
    SOAPMessage soapMessage = messageFactory.createMessage();
    SOAPPart soapPart = soapMessage.getSOAPPart();

    String serverURI = "example.com";

    SOAPEnvelope envelope = soapPart.getEnvelope();
    envelope.addNamespaceDeclaration("example", serverURI);

    SOAPBody soapBody = envelope.getBody();
    SOAPElement soapBodyElem = soapBody.addChildElement("number", "example");
    soapBodyElem.addTextNode(String.valueOf(number));

    soapMessage.saveChanges();

    return soapMessage;
}
Armed answered 22/3, 2016 at 10:29 Comment(7)
Now you're creating not instance-per-thread, but instance-per-task (so 100 instances of each will be created). Why don't you use TreadLocal to reduce instantiation and reuse them in non-interfering tasks?Breathy
Just wondering: why do you use a static main for testing; and not unit tests?Tympanic
@SashaSalauyou Yes, instance-per-task actually. But I don't think it changes much. Theoretically I could create an object inside the "submit" and kinda cache it inside ThreadLocal for the case when another task happens to run within the same thread, but I'll still have to invoke at least SOAPConnectionFactory.newInstance() inside the task's code block which is assumed here to not be thread-safe. Let me know if I'm missing something. Also, I don't care about "reusing" them that much. The service call is about 1 minute for my situation, so the speed of creating the objects is not a bottleneck.Armed
@Jägermeister it's not "testing" code, it's kinda simplified production code.Armed
@Armed if you're aware of thread-unsafety of .newInstance(), you may explicitly synchronize by factory in ThreadLocal.withInitial(). For 100 instances you won't see the difference, but in production reusing will be more effective.Breathy
@SashaSalauyou explicit synchronization is what I'm trying to avoid in the first place. That's why I asked the question:) To verify the idiom described above. And believe me or not - I really don't care about reusing those objects:-) I have a very special scenario which is basically out of the scope of my post.Armed
@SashaSalauyou You are saying that my idiom is correct in one comment and then that it's not correct in the other comment:) As for your last comment: the idiom is trying to tackle exactly that. It's using the shared objects inside the "main" thread, not inside the tasks' threads.Armed
H
8

Yes, your reasoning about the CompletionService is correct -- .submit() ensures that the task lambda will see the complete objects, and .take() ensures that the main thread will only see fully formed responses.

Generally, however, you don't need to do this. static factory methods should always be thread-safe, because there is no way to ensure that they are not in use in some other thread without global knowledge of the whole JVM, and you can't really write code that relies on that in many environments. Sometimes you'll see an implementation that might have a problem if one thread tries to use it while another is configuring it, however, but even that is rare.

Imagine a servlet that uses SOAPConnectionFactory. It's impossible to know that there isn't some other Web application in the same JVM that isn't also using it at the same time, so it has to be thread-safe.

So, really, MessageFactory.newInstance() and SOAPConnectionFactory.newInstance() would just be wrong if they weren't thread-safe. I would use them in multiple threads without worry, and just check the source if you're really concerned. But really they're fine.

On the other hand, objects (even other factories) created by static factory methods are often not thread safe, and you shouldn't assume that they are without documentation that says so. Even checking the source isn't sufficient, because if the interfaces aren't documented to be thread safe, then someone can add unsafe state to an implementation later.

Hardnosed answered 30/3, 2016 at 1:56 Comment(4)
Great answer! I also like the discussion about the static factories. Though, could you please provide some proof links to prove that "A non-thread-safe static factory is a bug"? I think I mostly understand your explanation, but just want to make sure I'm not missing anything and it's really a general accepted way of thinking.Armed
Proof links? Ah, no, sorry. I've been doing this sort of thing for a long time, so I don't validate my reasoning against random folks on the internet. Feel free to remove your checkmark if that bugs you. If you want to be really sure, check the source code.Hardnosed
It feels like I'm missing some common notion or idiom. That's why I asked you to give some proof. Usually Java guys refer to "Effective Java" or "Java Concurrency in Practice". I tried to find at least "a random folk in the internet" explaining that "A non-thread-safe static factory is a bug", but to no success. So, just wanted to make sure it's really true. And checking the source code is what I tried to avoid in the first place as the least safe method here.Armed
A lot of times, when you get past the "basic information", it will be impossible to find direct attestations like that. Sometimes the best you can do is to find example code written by people who should know about it, and deduce the unstated properties of the interface from that. If you google for "SOAPConnectionFactory examples", for example, you will find lots of code that would be wrong if .newInstance() wasn't thread-safe. Some of it will be from Oracle or Sun. Over time you will learn how the authors of such things expect their classes to be used even when they don't spell it out.Hardnosed
B
5

I've spent an hour discovering sources of com.sun.xml.internal.messaging.saaj (used as default SAAJ implementation in Oracle JDK) and found out that none of the factories returned by WhateverFactory.newInstance() have any inner state. So they're definitely thread-safe and don't require to be instantiated multiple times.

These factories are:

For example, HttpSOAPConnectionFactory effectively has just 3 lines in body:

public class HttpSOAPConnectionFactory extends SOAPConnectionFactory {

    public SOAPConnection createConnection() throws SOAPException {
        return new HttpSOAPConnection();
    }
}

What about SOAPMessage and SOAPConnection -- they must be used in one thread, though operations made on them involve several calls. (In fact, SOAPConnection#call() is also thread-safe since HttpSOAPConnection doesn't hold any inner state except closed variable. It may, but should not be reused unless you guarantee that .close() is never invoked, otherwise subsequent .call() will throw.) After processing is done, SOAPConnection should be closed and forgotten as well as SOAPMessage instances used in particular request-response cycle.

To conclude: I believe you do everything correctly except of creating separate factories for each call. At least in mentioned implementation these factories are completely thread-safe, so you can save on loading classes.


All said refers to default SAAJ implementation which comes with Oracle JDK. If you use a commercial Java EE application server (Websphere, JBoss etc) where implementation may be vendor-specific, it's better to address your question to their support.

Breathy answered 25/3, 2016 at 22:52 Comment(5)
Anyway, I still cannot understand your intention to use CompletionService abstraction. Everything you do can be easily made by Future<...> f = executor.submit(() -> { ... }). All these Futures you can collect in the list, then take one by one and call f.get() on each to get the results of each task. Future#get() returns guaranteed after the executor processes the task and returns the result.Breathy
f.get() on one task would block and prevent me to invoke f.get() on the task that has already finished. That's what CompletionService is for. It gives me the results at they arrive.Armed
@Armed yes you're right. Now I understand. I supposed you need them ordered.Breathy
The answer is not answering my question. I wasn't asking if those factories are thread-safe or not. I have already put my investigation to the question and concluded that there is no enough proof to treat them as thread-safe. Only explicit docs could prove it in "a general way". See "Java Concurrency In Practice" for how to treat such cases of lack of docs.Armed
I would strongly recommend you to read the corresponding chapter in "Java Concurrency In Practice" which explains how to deal with such situations. First, code can change any time and the maintainers don't have to preserve thread-safety. Second, as you said, there are other implementations. But anyways, the book gives some more considerations.Armed
R
2

I tested your code, It seems like you are creating a soapConnection thru a soapConnectionFactory which is perfectly fine. The following method in SAAJ 1.3 returns a new instance of MessageFactory

public static MessageFactory newInstance(String protocol) throws SOAPException {
    return SAAJMetaFactory.getInstance().newMessageFactory(protocol);
}

There is no info in description about thread safety but by looking at the code, it seems that this method uses mainly stack variables, e.g. has an SOAPConnection object in stack and uses it. I can not see a problem if soapConnection.call(request, "example.com") is called by multiple threads despite that there is no synchronized blocks.

One would expect that the threads would send their result message through different connections

Reprography answered 25/3, 2016 at 17:2 Comment(1)
It's not answering my question. I wasn't asking if those factories are thread-safe or not. I have already put my investigation to the answer and concluded that there is no enough proof to treat them as thread-safe. Only explicit docs could prove it in "a general way". See "Java Concurrency In Practice" for how to treat such cases of lack of docs.Armed

© 2022 - 2024 — McMap. All rights reserved.