Using the Python Requests library to consume from Twitter's user streams - how to detect disconnection?

I'm trying to use Requests to create a robust way of consuming from Twitter's user streams. So far, I've produced the following basic working example:

"""
Example of connecting to the Twitter user stream using Requests.
"""

import sys
import json

import requests
from oauth_hook import OAuthHook

def userstream(access_token, access_token_secret, consumer_key, consumer_secret):
    oauth_hook = OAuthHook(access_token=access_token, access_token_secret=access_token_secret, 
                           consumer_key=consumer_key, consumer_secret=consumer_secret, 
                           header_auth=True)

    hooks = dict(pre_request=oauth_hook)
    config = dict(verbose=sys.stderr)
    client = requests.session(hooks=hooks, config=config)

    data = dict(delimited="length")
    r = client.post("https://userstream.twitter.com/2/user.json", data=data, prefetch=False)

    # TODO detect disconnection somehow
    # https://github.com/kennethreitz/requests/pull/200/files#L13R169
    # Use a timeout? http://pguides.net/python-tutorial/python-timeout-a-function/
    for chunk in r.iter_lines(chunk_size=1):
        if chunk and not chunk.isdigit():
            yield json.loads(chunk)

if __name__ == "__main__":
    import pprint
    import settings
    for obj in userstream(access_token=settings.ACCESS_TOKEN,
                          access_token_secret=settings.ACCESS_TOKEN_SECRET,
                          consumer_key=settings.CONSUMER_KEY,
                          consumer_secret=settings.CONSUMER_SECRET):
        pprint.pprint(obj)

However, I need to be able to handle disconnections gracefully. Currently, when the stream disconnects, the code above simply hangs and no exception is raised.

What would be the best way to achieve this? Is there a way to detect this through the urllib3 connection pool? Should I use a timeout?

Wedurn answered 14/9, 2012 at 9:58

I would recommend adding a timeout parameter to the client.post() call. http://docs.python-requests.org/en/latest/user/quickstart/#timeouts
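As a rough sketch of what that could look like (not the asker's code): newer Requests releases use `stream=True` where the question uses `prefetch=False`, and accept `timeout` as a `(connect, read)` tuple. The helper name `stream_lines` and the 10 and 90 second values are my own assumptions for illustration, and `session` is expected to be a `requests.Session` with OAuth already configured.

```python
import json


def stream_lines(session, url, data, connect_timeout=10, read_timeout=90):
    """Yield decoded JSON objects from a streaming endpoint.

    `session` is assumed to be a requests.Session (auth already set up).
    The timeout values are illustrative assumptions, not recommendations.
    """
    response = session.post(
        url,
        data=data,
        stream=True,  # replaces prefetch=False in Requests >= 1.0
        timeout=(connect_timeout, read_timeout),  # tuple form: Requests >= 2.4
    )
    response.raise_for_status()
    for line in response.iter_lines():
        # Skip blank keep-alive lines and the numeric length prefixes
        # that are sent when delimited="length" is requested.
        if line and not line.isdigit():
            yield json.loads(line)
```

If the read timeout fires while iterating, Requests raises one of its `requests.exceptions.RequestException` subclasses, so the caller can catch that around the `for` loop instead of hanging.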

Note, however, that Requests doesn't set the TCP socket timeout itself, so a stalled read can still block. You can set a process-wide default, before the connection is opened, with the following:

import socket

TIMEOUT = 90  # seconds; example value only, tune it to the stream's keep-alive interval
socket.setdefaulttimeout(TIMEOUT)
Parthen answered 23/6, 2014 at 17:43
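With a socket timeout in place, a stalled stream should surface as an exception (`socket.timeout` is an `OSError` subclass, and Requests' own exceptions derive from `IOError` too), so the consumer can reconnect instead of hanging. Here is a minimal sketch of such a loop. The names `consume_with_reconnect`, `connect` and `handle` are hypothetical, and the retry count and delays are assumptions; check Twitter's streaming guidelines for the backoff they actually expect.

```python
import time


def consume_with_reconnect(connect, handle, max_retries=5, base_delay=1.0,
                           sleep=time.sleep):
    """Call `connect()` to get an iterable of messages; pass each to `handle`.

    On OSError (which covers socket.timeout) the stream is reopened after an
    exponentially growing delay. After more than `max_retries` consecutive
    failures the last error is re-raised.
    """
    failures = 0
    while True:
        try:
            for message in connect():
                failures = 0  # any received message resets the backoff
                handle(message)
            return  # the iterable ended without an error
        except OSError:
            failures += 1
            if failures > max_retries:
                raise
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (failures - 1))
```

Here `connect` would be something like `lambda: userstream(...)` from the question, and `handle` could be `pprint.pprint`. The `sleep` argument is only there so the delay can be swapped out when testing.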
