Socket server with epoll and threads

Asked 28/11, 2011 at 13:24 Answered 17/12, 2011 at 23:23

Solved c multithreading sockets pthreads epoll

I am trying to create a socket server in C for a collaborative real-time editor (http://en.wikipedia.org/wiki/Collaborative_real-time_editor), but I don't know what the best server architecture for it is.

At first I tried to use select for the socket server, but then I read about epoll, and now I think epoll is the best choice: the client will send every letter the user types in the textarea to the server, so the server will have a lot of data to process.

I also want to use threads with epoll, but I don't know exactly how to use them. I want threads because I think it is better to use two or all of the CPUs on the target machine.

My plan is:

create 2 threads when the server starts
the first thread will handle new clients and prepare them for reading or sending
the second thread will read data from and send data to the clients

The problem is that both threads will run a while(1) loop around epoll_wait.

My question is: is this a good server architecture for using epoll with threads? If not, what options do I have?

EDIT: I can't use libevent, libev, or other libraries because this is a college project and I'm not allowed to use external libraries.

Chuchuah answered 28/11, 2011 at 13:24 Comment(2)

I would advise against using epoll, because it is not universally available and because you should start with something simple while figuring out the architecture. Also, epoll is good if you have a large number of connections, but it doesn't really matter for response time when you only have a few connections. – Siderolite 28/11, 2011 at 13:57

Unless you manage to saturate one core of the CPU, there's no reason to go multithreaded, and for that you need 10k clients (if your app is well written). – Scauper 28/11, 2011 at 21:32

I think you're trying to over-engineer this problem. The epoll architecture in Linux was intended for situations where you have thousands of concurrent connections. In those cases, the overhead imposed by the way the poll and select system calls are defined (the whole descriptor set is passed to and scanned by the kernel on every call) becomes the main bottleneck in a server. The decision to use poll or select vs. epoll is based on the number of connections, not the amount of data.

For what you're doing, it seems as though the humans at your editing system would go insane after you hit a few dozen concurrent editors. Using epoll will probably make you go crazy; they play a few tricks with the API to squeeze out the extra performance, and you have to be very careful processing the information you get back from the calls.

This sort of application sounds like it would be network-I/O-bound instead of CPU-bound. I would try writing it as a single-threaded server with poll first. When you receive new text, buffer it for your clients if necessary, and then send it out when the socket accepts write calls. Use non-blocking I/O; the only call you want to block is the poll call.
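A minimal sketch of one iteration of such a loop, under some assumptions: fds[0] holds the listening socket, the remaining entries are connected clients, and every byte received from one client is relayed to all the others. The names poll_once, set_nonblocking and MAX_FDS are made up for illustration. To stay short, the sketch ignores the result of send; the buffering described above (keep unsent bytes per client and wait for POLLOUT) is left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_FDS 64   /* hypothetical capacity of the fds array */

/* Put a descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* One pass of the event loop. poll is the only call that may block.
 * Returns what poll returned (ready count, 0 on timeout, -1 on error). */
static int poll_once(struct pollfd *fds, int *nfds, int timeout_ms)
{
    int ready = poll(fds, (nfds_t)*nfds, timeout_ms);
    if (ready <= 0)
        return ready;

    /* Listening socket readable: accept and register the new client. */
    if (fds[0].revents & POLLIN) {
        int c = accept(fds[0].fd, NULL, NULL);
        if (c >= 0) {
            if (*nfds < MAX_FDS && set_nonblocking(c) == 0) {
                fds[*nfds].fd = c;
                fds[*nfds].events = POLLIN;
                fds[*nfds].revents = 0;
                (*nfds)++;
            } else {
                close(c);
            }
        }
    }

    for (int i = 1; i < *nfds; i++) {
        if (!(fds[i].revents & (POLLIN | POLLHUP | POLLERR)))
            continue;

        char buf[512];
        ssize_t n = recv(fds[i].fd, buf, sizeof buf, 0);
        if (n > 0) {
            /* Relay to every other client. Simplification: bytes that
             * do not fit in a peer's socket buffer are dropped here. */
            for (int j = 1; j < *nfds; j++)
                if (j != i)
                    send(fds[j].fd, buf, (size_t)n, MSG_NOSIGNAL);
        } else if (n == 0 || (errno != EAGAIN && errno != EWOULDBLOCK)) {
            /* Peer closed or hard error: drop the client by moving
             * the last entry into its slot, then re-check that slot. */
            close(fds[i].fd);
            fds[i] = fds[*nfds - 1];
            (*nfds)--;
            i--;
        }
    }
    return ready;
}
```

A server built on this would create, bind and listen on a socket, make it non-blocking, store it in fds[0] with events set to POLLIN, and then call poll_once in a loop.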

If you are doing a significant amount of processing on the data after receiving it, but before sending it back out to clients, then you could benefit from multi-threading. Write the single-threaded version first, then if you are CPU-bound (check using top) and most of the CPU time is spent in the functions where you are doing data processing (check using gprof), add multithreading to do the data processing.

If you want, you can use pipes or Unix-domain sockets inside the program for communication between the different threads; this way, everything in the main thread can be event-driven and handled through poll. Alternatively, with this model, you could even use multiple processes with fork instead of multiple threads.

Spec answered 17/12, 2011 at 23:23 Comment(0)

You might want to consider using something like libev or libevent instead of writing your own event handling implementation. These give you a cross-platform event handler, which will use whatever is appropriate (be it select, poll, epoll, kqueue or anything else), most likely at a lower overhead than having two threads handing off work to one another.

Seurat answered 28/11, 2011 at 14:5 Comment(5)

How can libevent or libev then be used to utilise all available processors? Is there a standard way to do this? – Dunlin 28/11, 2011 at 15:4

@SlappyTheFish: From what I understand of the workload, it probably wouldn't be necessary. Even a fairly naive select implementation would likely work, as this is mostly I/O bound. Also, assuming TCP, the issue of every letter being sent separately doesn't really exist (unless forced, e.g. using TCP_NODELAY or similar). – Seurat 28/11, 2011 at 16:17

@Seurat I forgot to mention in my main post that I can't use libevent or libev, because this is a college project and I am not allowed to use any external libraries. – Chuchuah 28/11, 2011 at 18:6

@bugspy.net They teach us only plain C. For example, this was the year when they added epoll; in previous years there was no mention of epoll, only select and poll. :) – Chuchuah 28/11, 2011 at 20:19

@cemycc: A simple single-threaded server would probably work well enough for this case, unless you are required to do otherwise. You can use the same select or poll loop both to accept new connections and to send/receive data on connected sockets. I suggest you set your sockets to be non-blocking to prevent some surprises (e.g. select indicates recv won't block, but a subsequent recv blocks, which may happen for UDP). – Seurat 29/11, 2011 at 11:39

Just start using libevent or libev and follow their examples. There are numerous examples, so don't try to invent anything new here.

Candleberry answered 28/11, 2011 at 14:7 Comment(1)

Sometimes trying to do it yourself can make one appreciate the intricacies of a system, and appreciate the ease of use of an API more, while also diminishing the possible surprises one gets when using a higher-level API. If this is a college project, I'd say go for raw poll/select/epoll! – Orthostichy 24/4, 2013 at 14:47
