Confusion regarding the Blocking of "peer threads" when a user-level thread blocks
C

3

7

I was reading about differences between threads and processes, and literally everywhere online, one difference is commonly written without much explanation:

If a process gets blocked, remaining processes can continue execution. If a user level thread gets blocked, all of its peer threads also get blocked.

It doesn't make any sense to me. What would be the sense of concurrency if a scheduler cannot switch between a blocked thread and a ready/runnable thread. The reason given is that since the OS doesn't differentiate between the various threads of a given parent process, it blocks all of them at once.

I find it very unconvincing, since all modern OSes have thread control blocks with a thread ID, even if it is valid only within the memory space of the parent process. As in the example given in Galvin's Operating Systems book, I wouldn't want the thread that handles my typing to be blocked just because the spell-checking thread cannot connect to some online dictionary.

Either I am understanding this concept wrong, or all these websites have just copied some old list of thread differences over the years. Moreover, I cannot find this statement in books such as Galvin's or William Stallings' COA book, where threads are discussed.

These are resources where I found the statements:

Copyread answered 30/8, 2021 at 9:6 Comment(8)
Please provide a source (f)or more context... Because maybe the source is wrong, or rather likely, the context of that paragraph changes the meaning. Perhaps this paragraph is all about controlling a group of threads? – Sander
The main difference between threads and processes is that the former share an address space while the latter don't. In fact, most OSes schedule threads, and the process entity is just an attribute of each thread. – Hogan
It seems your quoted text is present here. Always take what's on Medium with a grain of salt. Their information quality is pretty low and, in this case, just plain wrong. – Hogan
@JayC667, these are the sources where I read these statements: geeksforgeeks.org/difference-between-process-and-thread, tutorialspoint.com/difference-between-process-and-thread, guru99.com/difference-between-process-and-thread.html, javatpoint.com/process-vs-thread – Copyread
@AMANKUMAR: those are all user-submitted tutorials, often repeating statements that someone read somewhere but didn't fully understand, or didn't realize were outdated, or didn't put into context as to whether real-world modern systems actually do this or not. – Catechumen
The thing is, writing a good tutorial is a lot of work. Many posts on those sites are (I think) written as a learning exercise by beginners who are themselves just learning about a topic. The same thing works for Stack Overflow because comments are more visible, review from experts is more active, and edits to fix mistakes are more expected. Also, SO answers are shorter, so any mistakes are often more central to the point of the answer (although we certainly do see answers get upvotes for their main point while they contain mis-statements about other things). – Catechumen
So TL;DR: Margaret is right, geeksforgeeks, tutorialspoint, and sites like that are notorious for having mistakes in their articles. Sometimes code that's not tested, sometimes misleading conceptual explanations. Sometimes the tutorials there are better than nothing, but I prefer to write about stuff in Stack Overflow Q&As where the quality standards are higher thanks to up and down voting, and similarly to look for answers here. Or from blog posts of well-regarded folks, not random users on sites like tutorialspoint. (Or from an actual textbook, or reading about how real systems work.) – Catechumen
I think the person who wrote that meant that if the process gets blocked, i.e. suspended, swapped to disk, or killed, all its threads get blocked too, i.e. they mistakenly said "thread" instead of "process". – Mayorga
C
6

There is a difference between kernel-level and user-level threads. In simple words:

  • Kernel-level threads: Threads that are managed by the operating system, including scheduling. They are what is executed on the processor. This is probably what most of us think of as threads.
  • User-level threads: Threads that are managed by the program itself. They are also called fibers or coroutines in some contexts. In contrast to kernel-level threads, they need to "yield the execution", i.e. switching from one user-level thread to another is done explicitly by the program. User-level threads are mapped to kernel-level threads.

As user-level threads need to be mapped to kernel-level threads, you need to choose a suitable mapping. You could map each user-level thread to a separate kernel-level thread. You could also map many user-level threads to one kernel-level thread. In the latter mapping, you let multiple concurrent execution paths be executed by a single thread "as we know it". If one of those paths blocks without yielding (recall that user-level threads need to yield the execution explicitly), then the executing kernel-level thread blocks, which causes all other paths assigned to it to be effectively blocked as well. I think this is what the statement refers to. FYI: In Java, user-level threads (the multithreading you do in your programs) are mapped to kernel-level threads by the JVM, i.e. the runtime system.
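To make the many-to-one case concrete, here is a rough Java analogy, not a real user-level thread library. It assumes that two plain tasks stand in for the user-level threads and that a single-threaded executor stands in for the one kernel-level thread they share. The class name ManyToOneDemo, the task labels, and the 200 ms sleep are made up for illustration (Java 9+).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: an analogy for a many-to-one mapping.
class ManyToOneDemo {

    // Runs two tasks ("user-level threads") on ONE worker thread
    // (the "kernel-level thread") and returns the events in order.
    static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService carrier = Executors.newSingleThreadExecutor();

        carrier.execute(() -> {
            events.add("spellcheck: blocking");
            try {
                // Stands in for a blocking call such as a slow network read.
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("spellcheck: done");
        });
        // Ready to run right away, but the only carrier thread is blocked above.
        carrier.execute(() -> events.add("typing: handled"));

        carrier.shutdown(); // already queued tasks still run
        try {
            carrier.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        // prints [spellcheck: blocking, spellcheck: done, typing: handled]
        System.out.println(run());
    }
}
```

The "typing" task cannot start until the "spellcheck" task has stopped blocking, because both share the single carrier thread. That is the situation the quoted statement about "peer threads" seems to describe.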


Related stuff:

Chinookan answered 30/8, 2021 at 11:12 Comment(6)
And, regarding that last sentence, if Project Loom is successful Java will be able to have many JVM-managed virtual threads mapped to each host OS managed "real" thread. This brings huge efficiencies, making practical even millions of simultaneous threads. – Milden
Back in the early days of Java at least, user-level threads were called "green threads", to implement Java threading on OSes that didn't support native threading. There's still a Wiki article: en.wikipedia.org/wiki/Green_threads. IIRC on Solaris there was also some use of an N:M model where N user-space threads might be handled by fewer than N kernel threads, introducing the idea of "peer" threads (that share the same kernel thread) that is present in the OP's quote. – Catechumen
This makes some sense now. Just one more thing: don't most modern operating systems use one-to-one mapping for kernel and user threads? I could understand that the argument would hold true for old OSes where a many-to-one implementation was used. So maybe all these websites just kept copying from some outdated source... – Copyread
@AMANKUMAR Yes, I think so. I searched for the sources you likely had in mind and noticed that most of them just drop the statement without any context. This is very misleading if you don't know what a "user level thread" is and how to understand "peer threads" in this context. – Chinookan
@AMANKUMAR An operating system doesn't (really) do the mapping; it's the responsibility of the program or of the runtime system that executes the program. Imagine you implement a lightweight OS within your program that does things concurrently, and imagine that your program is executed by a single actual thread. The threads of your implemented OS are the user-level threads. This example is weird but I hope it kinda helps. – Chinookan
@AMANKUMAR To my non-expert knowledge, most programming languages (or runtime systems) do many-to-many mapping. The point is that this is something you don't really need to care about as a "normal" developer. – Chinookan
C
3

Back in the early days of Java at least, user-level threads were called "green threads", to implement Java threading on OSes that didn't support native threading. There's still a Wiki article https://en.wikipedia.org/wiki/Green_threads which explains the origin and meaning.

(This was back when desktops/laptops were uniprocessor systems, with a single-core CPU in their 1 physical socket, and SMP machines mostly only existed as multi-socket.)

You're right, this was terrible, and once mainstream OSes grew up to support native threads, people mostly stopped doing this. For Java specifically, "green threads" is the name of the original thread library for the language: it was released in version 1.1 and abandoned in version 1.3 in favor of native threads.

So use Java version 1.3 or later if you don't want your spell-check thread to block your whole application. :P This is ancient history.

There is some scope for using non-blocking I/O and doing a user-space context switch when a system call reports that it would block, but usually it's better to let the kernel handle threads blocking and unblocking, so that's what normal modern systems do.


IIRC on Solaris there was also some use of an N:M model where N user-space threads might be handled by fewer than N kernel threads. This could mean having some "peer" threads (that share the same kernel thread) like in your quote, without going all the way to the fully terrible, purely user-space green threads model.

(i.e. only some of your total threads are sharing the same kernel thread.)

pthreads on Linux uses a 1:1 model where every software thread is a separate task for the kernel to schedule.

Google found https://flylib.com/books/en/3.19.1.51/1/ which defines those thread models and talks about them some, including the N:M hybrid model, and the N:1 user-space aka green threads model that needs to use non-blocking I/O if it wants to avoid blocking other threads. (e.g. do a user-space context switch if a system call returns EAGAIN or after queueing an async read or write.)
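As a rough sketch of that last idea, here is a toy N:1 scheduler in Java. Everything in it is hypothetical: the class name CooperativeSchedulerDemo, the task labels, and the counters that pretend an I/O call "would block" twice before data is ready. A real green-thread library would switch stacks and poll actual non-blocking system calls; this only shows the control flow (Java 9+).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical demo class: a toy user-space (N:1) scheduler on one real thread.
class CooperativeSchedulerDemo {

    static List<String> run() {
        List<String> log = new ArrayList<>();
        int[] pollsUntilReady = {2}; // pretend the lookup "would block" twice
        int[] keysLeft = {2};        // pretend two key presses are pending

        // Each "green thread" does one small step and returns true when finished.
        // A step must never block; it reports "not ready" and yields instead.
        BooleanSupplier spellcheck = () -> {
            if (pollsUntilReady[0] > 0) {
                pollsUntilReady[0]--;
                log.add("spellcheck: would block, yielding");
                return false;
            }
            log.add("spellcheck: done");
            return true;
        };
        BooleanSupplier typing = () -> {
            log.add("typing: key handled");
            keysLeft[0]--;
            return keysLeft[0] == 0;
        };

        Deque<BooleanSupplier> ready = new ArrayDeque<>(List.of(spellcheck, typing));
        while (!ready.isEmpty()) {
            BooleanSupplier task = ready.poll();
            if (!task.getAsBoolean()) {
                ready.add(task); // unfinished: requeue it, i.e. a user-space context switch
            }
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Because the "spellcheck" step yields instead of blocking, the "typing" steps get to run in between even though only one real thread is involved. If any step made a genuinely blocking call, every other task in the queue would stall with it.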

Catechumen answered 30/8, 2021 at 17:57 Comment(0)
S
2

Okay, the other answers provide detailed information.

But to hit your main concern right in the middle:

  • the article puts it a bit wrongly and lacks the necessary context (see all the details in @akuzminykh 's explanation of user-level threads and kernel-level threads)
  • what this means for a Java programmer: don't bother with those explanations. If one of your Java threads blocks (due to I/O etc), that will have NO IMPACT on any other of your threads (unless, of course, you explicitly WANT it to, but then you'd have to explicitly use mechanisms for that)
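A small sketch of that second point, assuming ordinary platform threads on a mainstream JVM where each java.lang.Thread is backed by its own OS thread. The class and method names are invented for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: one blocked Java thread does not stop another.
class IndependentThreadsDemo {

    // Returns true if the "typing" thread finished while the
    // "spellcheck" thread was still blocked.
    static boolean typingFinishesWhileSpellcheckBlocked() {
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch typingDone = new CountDownLatch(1);

        Thread spellcheck = new Thread(() -> {
            try {
                release.await(); // blocks until the caller counts the latch down
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread typing = new Thread(typingDone::countDown);

        spellcheck.start();
        typing.start();
        try {
            boolean typingFinished = typingDone.await(5, TimeUnit.SECONDS);
            boolean spellcheckStillBlocked = spellcheck.isAlive();
            release.countDown(); // now let the blocked thread finish
            spellcheck.join();
            return typingFinished && spellcheckStillBlocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(typingFinishesWhileSpellcheckBlocked()); // prints true
    }
}
```

The "spellcheck" thread cannot finish until the latch is released, yet the "typing" thread completes anyway, which is the behaviour the question expected from threads in the first place.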

How do Threads get blocked in Java?

  • If you call sleep() or wait() etc, the Thread that currently executes that code (NOT the object you call them on) will be blocked. It gets released on certain events: sleep finishes once the timer runs out or the thread gets interrupted by another thread; wait releases once the thread gets notified by another thread.
  • if you run into a synchronized(lockObj) block or method: this will release once the other thread occupying that lockObj releases it
    • closely related to that, if you enter ThreadGates, mutexes etc, all those 1000s of specialized classes for extended thread control like rendezvous etc
  • If you call a blocking I/O method, like a blocking read from an InputStream: int amountOfBytesRead = read(buffer, offset, length), or String line = myBufferedReader.readLine();
    • in contrast, there are non-blocking I/O operations, such as java.nio ("New I/O") channels in non-blocking mode, that return immediately but may report that nothing was transferred (for example, a read that returns 0 bytes)
  • If the Garbage Collector does a quick cleanup cycle (these are usually so short that you will not even notice, and the Threads get released automatically again)
  • if you call .parallelStream() functions for certain long-lasting lambda functions on streams (like myList.parallelStream().forEach(myConsumerAction)) that - if too complex or with too many elements - get handled by automated multithreading mechanisms (which you will not notice, because after the whole stuff is done, your calling thread will resume normally, just as if a normal method was called). See more here: https://www.baeldung.com/java-when-to-use-parallel-stream
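To illustrate the difference between a blocking and a non-blocking read from the list above, here is a minimal sketch using an in-process java.nio.channels.Pipe. The class name NonBlockingReadDemo and the 16-byte buffer are arbitrary choices for the example, and the retry loop is only there in case the written bytes are not visible to the reader immediately (Java 9+).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: a non-blocking read returns at once, even with no data.
class NonBlockingReadDemo {

    // Returns {bytes read before anything was written, bytes read afterwards}.
    static int[] readCounts() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                ByteBuffer buffer = ByteBuffer.allocate(16);

                // Nothing has been written yet: the call returns 0 instead of blocking.
                int before = source.read(buffer);

                sink.write(ByteBuffer.wrap("hi".getBytes(StandardCharsets.US_ASCII)));

                // Poll until the two bytes show up.
                int after = 0;
                while (after == 0) {
                    after = source.read(buffer);
                    Thread.onSpinWait();
                }
                return new int[] {before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = readCounts();
        System.out.println("before=" + counts[0] + ", after=" + counts[1]);
    }
}
```

With the source channel left in its default blocking mode, the first read would instead park the calling thread until some data arrived.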
Sander answered 30/8, 2021 at 20:57 Comment(0)
