Socket vs SocketChannel

3

88

I am trying to understand SocketChannels, and NIO in general. I know how to work with regular sockets and how to make a simple thread-per-client server (using the regular blocking sockets).

So my questions:

  • What is a SocketChannel?
  • What extra do I get when working with a SocketChannel instead of a Socket?
  • What is the relationship between a channel and a buffer?
  • What is a selector?
  • The first sentence in the documentation is "A selectable channel for stream-oriented connecting sockets." What does that mean?

I have also read this documentation, but somehow I am not getting it...

Billybillycock answered 8/1, 2013 at 23:39 Comment(1)
I must apologize for the others who downvoted your question because of the background and not the content. As a fellow graduate student, I totally understand when you are forced to TA a course that is not exactly in your research area, especially when your funding depends on it. I think it's also good that you came here to seek clarification.Fondness
72

A Socket is a blocking input/output device. It makes the thread that is using it block on reads, and potentially also on writes if the underlying buffer is full. Therefore, a server with many open Sockets needs a correspondingly large number of threads, typically one per connection.

A SocketChannel gives you a non-blocking way to read from sockets (once it has been switched to non-blocking mode with configureBlocking(false)), so that one thread can communicate with many open connections at once. This works by registering the SocketChannels with a Selector, then looping on the selector's select() method, which tells you when connections have been accepted, data has been received, or a channel has been closed. This allows you to communicate with multiple clients in one thread without the overhead of multiple threads and synchronization.
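To make the select() loop concrete, here is a minimal sketch of a single-threaded echo server. The class name SelectorEchoSketch and the roundTrip helper are made up for illustration, and the write loop is a simplification: a real server would register OP_WRITE instead of spinning when the send buffer is full.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorEchoSketch {
    /** Echoes bytes back to every client on the calling thread until it is interrupted. */
    static void serve(ServerSocketChannel server) throws IOException {
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            ByteBuffer buf = ByteBuffer.allocate(4096);
            while (!Thread.currentThread().isInterrupted()) {
                selector.select(); // the only call that blocks
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        int n;
                        try { n = client.read(buf); } catch (IOException e) { n = -1; }
                        if (n < 0) { // peer closed the connection (or it failed)
                            key.cancel();
                            client.close();
                        } else {
                            buf.flip();
                            // Simplification: spins if the send buffer is full.
                            while (buf.hasRemaining()) client.write(buf);
                        }
                    }
                }
            }
        }
    }

    /** Demo helper (hypothetical): runs serve() on a loopback port and sends one message through it. */
    static String roundTrip(String msg) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        Thread t = new Thread(() -> {
            try { serve(server); } catch (IOException ignored) { }
        });
        t.setDaemon(true);
        t.start();
        try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
            byte[] out = msg.getBytes(StandardCharsets.UTF_8);
            client.write(ByteBuffer.wrap(out)); // the client side stays in blocking mode
            ByteBuffer in = ByteBuffer.allocate(out.length);
            while (in.hasRemaining() && client.read(in) >= 0) { }
            return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
        } finally {
            t.interrupt();
            t.join(1000);
            server.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));
    }
}
```

Note that the server thread only ever blocks in select(); accept(), read() and write() all return immediately because every channel registered with the selector is in non-blocking mode.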

Buffers are another feature of NIO that allows you to access the underlying data from reads and writes to avoid the overhead of copying data into new arrays.

Fondness answered 9/1, 2013 at 0:31 Comment(12)
Thanks very much. But I have some missing point still. Is there any advantage using a channel without a selector? or do they come together?Billybillycock
You can't use a channel without a selector, because the selector tells you when a channel is ready to be read, for example. I can't think of a reason why you'd want to use channels by themselves, as you would basically be reimplementing the functionality of the selector.Fondness
Excuse me, but in which way is SocketChannel non-blocking? Selector.select() is a blocking operation. You said that select() notifies, but in reality it simply blocks. This can't be called a notification, because it does not use callbacks.Diocletian
There are no callbacks in Java - maybe you're thinking about Node? It's called non blocking because there aren't a pile of threads waiting on blocking I/O.Fondness
@andrewmao Using one thread instead of multiple ones is not enough to be non-blocking. It still blocks waiting on select() and wasting a whole thread. It definitely can't be called "non-blocking". What do you mean by "no callbacks in Java"? Have you seen AsynchronousChannel?Diocletian
@orionll The "N" in NIO stands for nonblocking. The definition isn't going to change no matter how much you want it to :) Callbacks in Java are definitely not usable as in Javascript, because functions aren't first class objects.Fondness
@andrewmao There are a lot of definitions in computer science that are misleading. I have worked much with non-blocking code, and, believe me, NIO is definitely not non-blocking. The fact that callbacks are not convenient does not mean that they are not usable. Your phrase "There are no callbacks in Java" is incorrect.Diocletian
@orionll Yes, you're absolutely right. Can't argue with a troll.Fondness
@andrewmao Just a small question. Why did they introduce asynchronous channels in Java 7, if NIO was already non-blocking?Diocletian
Terminology is confusing only when misused. The "non-blocking" in "NIO" pertains to the "IO", i.e. the I/O operations. The thread is not "wasted", because it has nothing to do except I/O. And non-blocking I/O is not the same as asynchronous I/O: that's a different subject altogether.Edda
@AndrewMao You can indeed use a channel without a selector, even in non-blocking mode, which is not the default, although it's not recommended. The "N" in "NIO" stands for "New".Roncesvalles
I know this question is very old, but I am still struggling to understand how async IO has any benefit at all to just using a thread per socket or a threadpool. If you have a single thread communicating through a socket channel and looping through a bunch of open sockets in another thread to see if something happened, those sockets will still block the thread they are in, and the looping will just halt when it reaches a socket with no data available. What am I missing? It seems to me that behind the scenes, you still need one thread per socket to read the data. Please, help me understand!Teetotaler
22

By now NIO is so old that few remember what Java was like before 1.4, which is what you need to know in order to understand the "why" of NIO.

In a nutshell, up to Java 1.3, all I/O was of the blocking type. And worse, there was no analog of the select() system call to multiplex I/O. As a result, a server implemented in Java had no choice but to employ a "one-thread-per-connection" service strategy.

The basic point of NIO, introduced in Java 1.4, was to make the functionality of traditional UNIX-style multiplexed non-blocking I/O available in Java. If you understand how to program with select() or poll() to detect I/O readiness on a set of file descriptors (sockets, usually), then you will find the services you need for that in NIO: you will use SocketChannels for non-blocking I/O endpoints, and Selectors for fdsets or pollfd arrays. Servers with threadpools, or with threads handling more than one connection each, now become possible. That's the "extra".
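As a small illustration of what "non-blocking I/O endpoint" means in practice, the sketch below (class and method names are my own) connects a SocketChannel over loopback, switches it to non-blocking mode, and reads while the peer has sent nothing. In blocking mode that read() would stall; here it simply reports zero bytes.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class NonBlockingReadSketch {
    /** Returns what read() reports on an idle connection in non-blocking mode. */
    static int readFromIdlePeer() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel peer = server.accept()) {
                // Channels start out in blocking mode; this is the switch.
                client.configureBlocking(false);
                // The peer never writes, so there is nothing to read yet.
                return client.read(ByteBuffer.allocate(16));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read() returned " + readFromIdlePeer());
    }
}
```

Polling read() like this in a loop would waste CPU, which is where the Selector comes in: it is the place where the thread waits until at least one registered channel is ready.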

A Buffer is the kind of byte array you need for non-blocking socket I/O, especially on the output/write side. If only part of a buffer can be written immediately, with blocking I/O your thread will simply block until the entirety can be written. With non-blocking I/O, your thread gets a return value of how much was written, leaving it up to you to handle the left-over for the next round. A Buffer takes care of such mechanical details by explicitly implementing a producer/consumer pattern for filling and draining, it being understood that your threads and the kernel underneath the JVM will not be in sync.
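The fill/drain cycle can be tried without any sockets. In this sketch ThrottledSink is a made-up WritableByteChannel that accepts at most a few bytes per call, standing in for a socket whose send buffer is nearly full; writeOnce is a hypothetical helper showing the usual flip / write / compact sequence.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

class BufferDrainSketch {
    /** Hypothetical sink that takes at most `limit` bytes per write() call. */
    static final class ThrottledSink implements WritableByteChannel {
        final ByteArrayOutputStream received = new ByteArrayOutputStream();
        private final int limit;

        ThrottledSink(int limit) { this.limit = limit; }

        @Override public int write(ByteBuffer src) {
            int n = Math.min(limit, src.remaining());
            for (int i = 0; i < n; i++) received.write(src.get());
            return n;
        }
        @Override public boolean isOpen() { return true; }
        @Override public void close() { }
    }

    /** One write attempt: drain what the channel takes, keep the rest for the next round. */
    static int writeOnce(ByteBuffer buf, WritableByteChannel out) throws IOException {
        buf.flip();             // switch from filling to draining
        int n = out.write(buf); // may consume only part of the buffer
        buf.compact();          // move the left-over to the front, back to filling
        return n;
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put("hello world".getBytes(StandardCharsets.US_ASCII));
        ThrottledSink sink = new ThrottledSink(4);
        while (buf.position() > 0) {
            System.out.println("wrote " + writeOnce(buf, sink) + ", left over " + buf.position());
        }
    }
}
```

With an 11-byte message and a 4-byte limit, the loop in main takes three rounds (4, 4, then 3 bytes). After each compact() the buffer's position equals the number of bytes still waiting, so new data can be appended behind the left-over before the next attempt.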

Edda answered 9/1, 2013 at 20:13 Comment(12)
SocketChannel is blocking. At least read() and write() methods are. The true non-blocking channel is AsynchronousSocketChannel, which was introduced in Java 7.Diocletian
Really? So what happened to the configureBlocking(bool) method that was there ever since NIO was introduced in Java 1.4?Edda
Ok, but what if you need to read bytes as soon as they are available? You still have to call some blocking operation in this case.Diocletian
By design, the only blocking call is select(). On return, you examine the selector for the channels that are ready for I/O and perform the appropriate operations. This has been working for ages. Can you explain why you think this has not been working in millions of existing systems?Edda
I don't say that it does not work. I just want to say, that some blocking is still needed. And the need of blocking was completely eliminated in Java 7.Diocletian
It sounds like you have confused non-blocking I/O with asynchronous I/O. Even in asynch I/O, blocking is not "completely eliminated" - e.g. get() on a Future<> is a blocking call. The design issue is not whether to block (your essential confusion) but when to block.Edda
I said that "the need of blocking was completely eliminated", not "blocking was completely eliminated". Blocking will never be eliminated because of backward compatibility.Diocletian
Backward compatibility?? Sigh. Whatever.Edda
@Diocletian A SocketChannel in non-blocking mode is non-blocking. Asynchronous I/O is yet another beast, nothing to do with non-blocking mode whatsoever.Roncesvalles
@EJP The channel is non-blocking, but you still need to block the thread on select() if you want your data to be ready as soon as possible. This is what I meant.Diocletian
@Diocletian But it's not what you said. Selector.select() blocks. Neither the SocketChannel nor its read() or write() methods block if the channel is in non-blocking mode, contrary to what you specifically asserted above, where you erroneously also conflated non-blocking I/O with asynchronous I/O.Roncesvalles
@EJP Yes, I probably was wrong about read() and write(). It was a long time ago. And I don't agree that non-blocking I/O has nothing to do with asynchronous I/O. They are not the same but they are tightly connected.Diocletian
6

Even though you are using SocketChannels, it's still necessary to employ a thread pool to process the channels.

Think about the scenario where a single thread is responsible for both polling select() and processing the SocketChannels selected from the Selector: if one channel takes 1 second to process and there are 10 channels in the queue, you have to wait 10 seconds before the next poll, which is intolerable. So there should be a thread pool for processing the channels.

In this sense, I don't see a tremendous difference from the thread-per-client blocking sockets pattern. The major difference is that in the NIO pattern the tasks are smaller: it is more like thread-per-task, where a task can be a read, a write, business processing, and so on. For more detail, take a look at Netty's implementation of NioServerSocketChannelFactory, which uses one Boss thread to accept connections and dispatches tasks to a pool of Worker threads for processing.
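A rough sketch of that split, using only the JDK: one selector thread accepts and reads, and hands each chunk to an ExecutorService for processing and replying. This is not Netty's implementation. All names here are invented, process() is a stand-in for slow business logic, and the sketch ignores message framing and per-connection ordering, which a real server would have to handle.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SelectorWithWorkersSketch {
    /** Hypothetical slow business step; here it just upper-cases ASCII text. */
    static byte[] process(byte[] request) {
        String text = new String(request, StandardCharsets.US_ASCII);
        return text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
    }

    /** Selector thread: accepts and reads only; processing and replying run on the pool. */
    static void serve(ServerSocketChannel server, ExecutorService workers) throws IOException {
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            ByteBuffer buf = ByteBuffer.allocate(4096);
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel accepted = server.accept();
                        if (accepted != null) {
                            accepted.configureBlocking(false);
                            accepted.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buf.clear();
                        int n;
                        try { n = client.read(buf); } catch (IOException e) { n = -1; }
                        if (n < 0) { key.cancel(); client.close(); continue; }
                        if (n == 0) continue;
                        byte[] request = new byte[n];
                        buf.flip();
                        buf.get(request); // copy out so the shared buffer can be reused
                        workers.execute(() -> reply(client, process(request)));
                    }
                }
            }
        }
    }

    /** Runs on a worker thread. Simplification: spins if the send buffer is full. */
    static void reply(SocketChannel client, byte[] data) {
        ByteBuffer out = ByteBuffer.wrap(data);
        try {
            while (out.hasRemaining()) client.write(out);
        } catch (IOException e) {
            // Peer went away; nothing left to reply to.
        }
    }

    /** Demo helper (hypothetical): one request through the selector thread and the pool. */
    static String roundTrip(String msg) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        Thread boss = new Thread(() -> {
            try { serve(server, workers); } catch (IOException ignored) { }
        });
        boss.setDaemon(true);
        boss.start();
        try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
            byte[] out = msg.getBytes(StandardCharsets.US_ASCII);
            client.write(ByteBuffer.wrap(out));
            ByteBuffer in = ByteBuffer.allocate(out.length);
            while (in.hasRemaining() && client.read(in) >= 0) { }
            return new String(in.array(), 0, in.position(), StandardCharsets.US_ASCII);
        } finally {
            boss.interrupt();
            boss.join(1000);
            workers.shutdown();
            server.close();
        }
    }
}
```

The point of the design is that a slow process() call ties up a pool thread rather than the selector thread, so the next select() is not delayed by it.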

If you really fancy a single thread, the bottom line is that you should at least have pooled I/O threads, because I/O operations are often orders of magnitude slower than instruction-processing cycles, and you would not want the one precious thread to be blocked by I/O. This is exactly what NodeJS does: it uses one thread to accept connections, and all I/O is asynchronous and processed in parallel by a back-end pool of I/O threads.

Is the old-style thread-per-client dead? I don't think so. NIO programming is complex, and multithreading is not inherently evil. Keep in mind that modern operating systems and CPUs keep getting better at multitasking, so the overhead of multithreading becomes smaller over time.

Instrumentation answered 3/6, 2014 at 9:14 Comment(4)
If one channel takes 1 second for processing, you should not be using NIO.Sturdivant
@Sturdivant what would you be using then?Denaedenarius
@ElMac obviously more than a single thread. Reading / writing the socket channel may still be implemented as multiplexed single thread. But processing of the data should be done in some other thread pool.Triptolemus
I'd argue that "CPU's become better and better at multitasking" is wrong. The benefit of thread-per-core or process-per-core has been going up, but the relative cost of a context switch has also gone up - having any given core context switch is expensive. The brand new Loom virtual threads are a natural way to handle this.Alleluia
