How does epoll detect a client-side close in Python?

Here is my server:

"""Server using epoll method"""

import os
import select
import socket
import time

from oodict import OODict

addr = ('localhost', 8989)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(addr)
s.listen(8)
s.setblocking(0) # Non blocking socket server
epoll = select.epoll()
epoll.register(s.fileno(), select.EPOLLIN) # Level-triggered

cs = {}
data = ''
while True:
    time.sleep(1)
    events = epoll.poll(1) # Timeout 1 second
    print 'Polling %d events' % len(events)
    for fileno, event in events:
        if fileno == s.fileno():
            sk, addr = s.accept()
            sk.setblocking(0)
            print addr
            cs[sk.fileno()] = sk
            epoll.register(sk.fileno(), select.EPOLLIN)

        elif event & select.EPOLLIN:
            data = cs[fileno].recv(4)
            print 'recv ', data
            epoll.modify(fileno, select.EPOLLOUT)
        elif event & select.EPOLLOUT:
            print 'send ', data
            cs[fileno].send(data)
            data = ''
            epoll.modify(fileno, select.EPOLLIN)

        elif event & select.EPOLLERR:
            print 'err'
            epoll.unregister(fileno)

Client-side input:

ideer@ideer:/home/chenz/source/ideerfs$ telnet localhost 8989
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
123456
123456
^]

telnet> q
Connection closed.

Server-side output:

ideer@ideer:/chenz/source/ideerfs$ python epoll.py 
Polling 0 events
Polling 0 events
Polling 1 events
('127.0.0.1', 53975)
Polling 0 events
Polling 1 events
recv  1234
Polling 1 events
send  1234
Polling 1 events
recv  56

Polling 1 events
send  56

Polling 0 events
Polling 0 events
Polling 0 events
Polling 1 events
recv  
Polling 1 events
send  
Polling 1 events
recv  
Polling 1 events
send  
Polling 1 events
recv  
Polling 1 events
send  
Polling 1 events
recv  
^CTraceback (most recent call last):
  File "epoll.py", line 23, in <module>
    time.sleep(1)
KeyboardInterrupt

It's strange that after the client has closed the connection, epoll still reports recv and send events! Why does the EPOLLERR event never happen? It's the same if you use EPOLLHUP.

I notice that the EPOLLERR event only happens when you try to write to a closed connection. Besides this, is there another way to tell whether the connection has been closed or not?

Is it correct to treat the connection as closed if you get nothing in an EPOLLIN event?

Cockeyed answered 27/4, 2009 at 14:6 Comment(0)
5

EPOLLERR and EPOLLHUP never seem to happen in the posted code because they have always occurred in conjunction with EPOLLIN or EPOLLOUT (several of these flags can be set at once), so the if/elif chain has always taken the EPOLLIN or EPOLLOUT branch first.
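
A minimal sketch of the reordering this implies, written for Python 3: test the error and hang-up bits before the read and write bits. The `classify` helper is a hypothetical name used only for illustration, and the numeric fallbacks are the Linux values, assumed here so the snippet also imports on platforms without epoll.

```python
import select

# Linux values, used as a fallback (assumption) where select lacks epoll.
EPOLLIN = getattr(select, "EPOLLIN", 0x001)
EPOLLOUT = getattr(select, "EPOLLOUT", 0x004)
EPOLLERR = getattr(select, "EPOLLERR", 0x008)
EPOLLHUP = getattr(select, "EPOLLHUP", 0x010)


def classify(event):
    """Pick one action for an event mask, checking ERR/HUP first."""
    if event & (EPOLLERR | EPOLLHUP):
        return "close"
    if event & EPOLLIN:
        return "read"
    if event & EPOLLOUT:
        return "write"
    return "ignore"


print(classify(EPOLLIN))                        # read
print(classify(EPOLLIN | EPOLLERR | EPOLLHUP))  # close
```

A real server may still want to drain pending data before closing when EPOLLIN is set together with the error bits; this sketch only shows the branch order.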

Experimenting, I've found that EPOLLHUP only happens in conjunction with EPOLLERR. The reason for this may be the way Python interfaces with epoll and low-level I/O. In C, a non-blocking recv returns -1 and sets errno to EAGAIN when nothing is available, and returns 0 at EOF; in Python, the first case raises socket.error with errno EAGAIN, while '' (nothing returned) signals EOF.

Closing your telnet session only closes that end of the TCP connection, so it's still perfectly valid to call recv on your side. There may be pending data in the TCP receive buffers that your application hasn't read yet, so the close by itself won't trigger an error condition.
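
This behaviour can be observed without a network, as a rough Python 3 sketch. It assumes `socket.socketpair()` as a local stand-in for the TCP connection between server and telnet client; the variable names are illustrative only.

```python
import socket

# A connected pair of local stream sockets stands in for server and client.
server_side, client_side = socket.socketpair()

client_side.sendall(b"123456")
client_side.close()  # the "client" goes away with data still unread

first = server_side.recv(4)   # pending data is still delivered
second = server_side.recv(4)  # the rest of the pending data
third = server_side.recv(4)   # empty bytes: end of stream, the peer closed
server_side.close()

print(first, second, third)   # b'1234' b'56' b''
```

The reads succeed after the close, and only the final empty read says that the peer is gone, which matches the `recv  ` lines with no data in the server output.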

It seems that EPOLLIN combined with a recv that returns an empty string indicates that the other end has closed the connection. However, using an older version of Python (before epoll was introduced) and plain select on a pipe, I've seen a read that returned '' when it did not indicate EOF, just a lack of available data.

Tomtit answered 5/5, 2009 at 17:56 Comment(0)
2

If the socket is still open but there is nothing to read or write, epoll.poll will time out.

If data is available from the peer, you get an EPOLLIN event and the data can be read.

If the socket is closed by the peer, you will also get an EPOLLIN event, but reading from the socket returns "".

You could then close the socket by shutting it down and picking up the resulting EPOLLHUP event to clean up your internal structures, or perform the cleanup right away and unregister the socket from epoll.

elif event & select.EPOLLIN:
    data = cs[fileno].recv(4)
    if not data:
        # Empty read: the peer closed. Stop polling for events and shut down.
        epoll.modify(fileno, 0)
        cs[fileno].shutdown(socket.SHUT_RDWR)
Orangeade answered 27/4, 2009 at 14:6 Comment(0)
1

My ad-hoc solution to bypass this problem:

--- epoll_demo.py.orig  2009-04-28 18:11:32.000000000 +0800
+++ epoll_demo.py   2009-04-28 18:12:56.000000000 +0800
@@ -18,6 +18,7 @@
 epoll.register(s.fileno(), select.EPOLLIN) # Level triggerred

 cs = {}
+en = {}
 data = ''
 while True:
     time.sleep(1)
@@ -29,10 +30,18 @@
             sk.setblocking(0)
             print addr
             cs[sk.fileno()] = sk
+            en[sk.fileno()] = 0
             epoll.register(sk.fileno(), select.EPOLLIN)

         elif event & select.EPOLLIN:
             data = cs[fileno].recv(4)
+            if not data:
+                en[fileno] += 1
+                if en[fileno] >= 3:
+                    print 'closed'
+                    epoll.unregister(fileno)
+                continue
+            en[fileno] = 0
             print 'recv ', data
             epoll.modify(fileno, select.EPOLLOUT)
         elif event & select.EPOLLOUT:
Withy answered 28/4, 2009 at 10:17 Comment(1)
I find it easier to always treat POLLIN and POLLHUP the same, like this. You get much more specific errors from simply reading and you want to handle them anyway. - Aikoail
1

The reason you're not detecting EPOLLHUP/EPOLLERR in your code is the order of your bitwise tests. When a socket is ready to read, epoll reports an event mask of 1, which is equal to select.EPOLLIN (select.EPOLLIN == 1). Now say the client hangs up (gracefully or not): epoll on the server reports a mask of 25, which is equal to EPOLLIN + EPOLLERR + EPOLLHUP (1 + 8 + 16). With event == 25 you can see why EPOLLERR is not being detected: every test in your if/elif chain except the EPOLLOUT one is nonzero, so the first one that matches, the EPOLLIN branch, is the one that runs. For example:

>>> from select import EPOLLIN,EPOLLOUT,EPOLLHUP,EPOLLERR
>>> event = 25
>>> event & EPOLLIN
1
>>> event & EPOLLERR
8
>>> event & EPOLLHUP
16
>>> event & EPOLLOUT
0

Notice how the first three don't return 0? That's why your code isn't detecting EPOLLERR/EPOLLHUP correctly. When a client hangs up, you can still read from the socket because the server side is still up (of course the read would return no data), hence EPOLLIN. But since the client hung up, it's also EPOLLHUP, and since it's EPOLLHUP it's also EPOLLERR, as a hangup is somewhat of an error. I know I'm late commenting on this, but I hope I helped someone out there lol
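
To make the arithmetic easier to follow, here is a small Python 3 sketch that names the flags set in a mask. The `describe_mask` helper and the `FLAGS` table are hypothetical names, and the numeric values are the Linux ones, hard-coded as an assumption so the snippet runs even where the select module has no epoll support.

```python
# Linux numeric values of the epoll flags discussed above (assumption:
# hard-coded so the snippet does not depend on select.epoll being present).
FLAGS = [
    ("EPOLLIN", 0x001),
    ("EPOLLPRI", 0x002),
    ("EPOLLOUT", 0x004),
    ("EPOLLERR", 0x008),
    ("EPOLLHUP", 0x010),
]


def describe_mask(event):
    """Return the names of all flags set in an epoll event mask."""
    return [name for name, bit in FLAGS if event & bit]


print(describe_mask(1))   # ['EPOLLIN']
print(describe_mask(25))  # ['EPOLLIN', 'EPOLLERR', 'EPOLLHUP']
```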

Here is a way I would rewrite your code to express what I'm saying better:

import select
import socket
import time

addr = ('localhost', 8989)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(addr)
s.listen(8)
s.setblocking(0) # Non blocking socket server
epoll = select.epoll()
read_only = select.EPOLLIN | select.EPOLLPRI | select.EPOLLHUP | select.EPOLLERR
read_write = read_only | select.EPOLLOUT
biterrs = [25, 24, 8, 16, 9, 17, 26, 10, 18]  # masks combining EPOLLERR (8) and/or EPOLLHUP (16) with EPOLLIN (1) or EPOLLPRI (2)
epoll.register(s.fileno(),read_only)

cs = {}
data = ''
while True:
    time.sleep(1)
    events = epoll.poll(1) # Timeout 1 second
    print 'Polling %d events' % len(events)
    for fileno, event in events:
        if fileno == s.fileno():
            sk, addr = s.accept()
            sk.setblocking(0)
            print addr
            cs[sk.fileno()] = sk
            epoll.register(sk.fileno(),read_only)

        # Compare masks by value; "is" tests object identity, not equality
        elif event == select.EPOLLIN or event == select.EPOLLPRI:
            data = cs[fileno].recv(4)
            print 'recv ', data
            epoll.modify(fileno, read_write)
        elif event == select.EPOLLOUT:
            print 'send ', data
            cs[fileno].send(data)
            data = ''
            epoll.modify(fileno, read_only)

        elif event in biterrs:
            print 'err'
            epoll.unregister(fileno)
Rayner answered 8/8, 2014 at 22:29 Comment(0)
0

Don't you just need to combine the masks to make use of EPOLLHUP and EPOLLIN at the same time?


epoll.register(sk.fileno(), select.EPOLLIN | select.EPOLLHUP)

Though to be honest I'm not really familiar with the epoll library, so it's just a suggestion really...

Poesy answered 27/4, 2009 at 15:53 Comment(1)
Hi John, thanks for your answer. I tried it, but it did not work either. It seems EPOLLERR and EPOLLHUP are always reported automatically, so there is no need to set them explicitly. - Cockeyed
0

After I moved the select.EPOLLHUP handling code to the line before select.EPOLLIN, the hup event still couldn't be caught with telnet. But by coincidence I found that if I use my own client script, there are hup events! Strange...

And according to man epoll_ctl:

   EPOLLRDHUP (since Linux 2.6.17)
          Stream socket peer closed connection, or shut down writing half of connection. (This flag is especially useful for writing simple code to
          detect peer shutdown when using Edge Triggered monitoring.)

   EPOLLERR
          Error condition happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it
          in events.

   EPOLLHUP
          Hang up happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it in
          events.

It seems there should be an EPOLLRDHUP event when the remote side closes the connection, but it is not implemented by Python. I don't know why.

Withy answered 15/5, 2009 at 4:1 Comment(0)
0

The EPOLLRDHUP flag is not defined in Python, for no apparent reason. If your Linux kernel is >= 2.6.17, you can define it yourself and register your socket with epoll like this:

import select
if not hasattr(select, "EPOLLRDHUP"):
    select.EPOLLRDHUP = 0x2000
...
# sock is the connected client socket (renamed so it does not shadow the socket module)
epoll.register(sock.fileno(), select.EPOLLIN | select.EPOLLRDHUP)

You can then catch the event you need using the same flag (EPOLLRDHUP):

elif event & select.EPOLLRDHUP:
    print "Stream socket peer closed connection"
    # unregister first: after close() the file descriptor is no longer valid
    epoll.unregister(sock.fileno())
    sock.close()
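
The flag can be tried out end to end with a short Python 3 sketch. It assumes Linux, uses `socket.socketpair()` as a local stand-in for a TCP connection, and falls back to the 0x2000 value quoted above if the constant is missing; `events_after_peer_half_close` is a hypothetical helper name.

```python
import select
import socket

# 0x2000 is the Linux value quoted above; used only if the constant is missing.
EPOLLRDHUP = getattr(select, "EPOLLRDHUP", 0x2000)


def events_after_peer_half_close():
    """Return the event mask epoll reports once the peer stops writing.

    Returns None on platforms without epoll.
    """
    if not hasattr(select, "epoll"):
        return None
    ours, peer = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(ours.fileno(), select.EPOLLIN | EPOLLRDHUP)
        peer.shutdown(socket.SHUT_WR)  # peer closes its writing half
        events = ep.poll(1)            # wait up to one second
        return events[0][1] if events else 0
    finally:
        ep.close()
        ours.close()
        peer.close()


mask = events_after_peer_half_close()
if mask is not None:
    print("peer closed:", bool(mask & EPOLLRDHUP))
```

Because the mask usually also contains EPOLLIN, the EPOLLRDHUP test needs to come before the EPOLLIN branch in an if/elif chain.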

For more info, you can check selectmodule.c in Python's repository.

Spheroidal answered 21/10, 2010 at 13:53 Comment(0)
0

I have another approach:

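import errno
import socket

def read_or_close(s):
    # Wrapped in a function so that the early returns are valid;
    # returns the data read, or None if nothing is ready or s was closed.
    try:
        data = s.recv(4096)
    except socket.error as e:
        if e.errno in (errno.EWOULDBLOCK, errno.EAGAIN): # since this is a non-blocking socket..
            return None # no error
        # error
        s.close()
        return None

    if not data: # empty read: closed as well
        s.close()
        return None
    return data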
Fuss answered 17/6, 2011 at 0:47 Comment(0)
0
if event & select.EPOLLHUP:
    epoll.unregister(fd)
Selfconsistent answered 18/10, 2011 at 6:1 Comment(0)
0
elif event & (select.EPOLLERR | select.EPOLLHUP):
    epoll.unregister(fileno)
    cs[fileno].close()
    del cs[fileno]
Example answered 23/2, 2012 at 11:36 Comment(1)
You didn't answer his question at the end of his post. - Ave
