I'm trying to use Requests to create a robust way of consuming from Twitter's user streams. So far, I've produced the following basic working example:
"""
Example of connecting to the Twitter user stream using Requests.
"""

import sys
import json

import requests
from oauth_hook import OAuthHook


def userstream(access_token, access_token_secret, consumer_key, consumer_secret):
    oauth_hook = OAuthHook(access_token=access_token, access_token_secret=access_token_secret,
                           consumer_key=consumer_key, consumer_secret=consumer_secret,
                           header_auth=True)

    hooks = dict(pre_request=oauth_hook)
    config = dict(verbose=sys.stderr)
    client = requests.session(hooks=hooks, config=config)

    data = dict(delimited="length")
    r = client.post("https://userstream.twitter.com/2/user.json", data=data, prefetch=False)

    # TODO detect disconnection somehow
    # https://github.com/kennethreitz/requests/pull/200/files#L13R169
    # Use a timeout? http://pguides.net/python-tutorial/python-timeout-a-function/
    for chunk in r.iter_lines(chunk_size=1):
        if chunk and not chunk.isdigit():
            yield json.loads(chunk)


if __name__ == "__main__":
    import pprint
    import settings

    for obj in userstream(access_token=settings.ACCESS_TOKEN,
                          access_token_secret=settings.ACCESS_TOKEN_SECRET,
                          consumer_key=settings.CONSUMER_KEY,
                          consumer_secret=settings.CONSUMER_SECRET):
        pprint.pprint(obj)
However, I need to be able to handle disconnections gracefully. Currently, when the stream disconnects, the code above simply hangs and no exception is raised.
What would be the best way to achieve this? Is there a way to detect this through the urllib3 connection pool? Should I use a timeout?
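To make the timeout idea concrete, here is a minimal sketch of the behaviour I am after. It is not tied to Requests or urllib3: `resilient_stream` and its `connect` argument are hypothetical names, and the sketch assumes `connect()` returns a fresh iterator of decoded messages and raises an `IOError` subclass (such as `socket.timeout`) when nothing arrives within some read timeout. As far as I understand, the stream sends keep-alive newlines periodically, so a read timeout comfortably longer than that interval should only fire on a real stall.

```python
import time


def resilient_stream(connect, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Yield items from connect(), reconnecting when the stream stalls or drops.

    connect is a hypothetical zero-argument callable that returns a fresh
    iterator of decoded messages. It is assumed to raise an IOError subclass
    (for example socket.timeout) when no data arrives within its read timeout.
    """
    failures = 0
    while True:
        try:
            for item in connect():
                failures = 0  # data is flowing again, so reset the backoff
                yield item
        except IOError:
            pass  # stalled or dropped connection; fall through and retry
        # Reaching this point means the stream ended, cleanly or not.
        failures += 1
        if failures > max_retries:
            raise RuntimeError("stream failed %d times in a row" % failures)
        sleep(base_delay * 2 ** (failures - 1))  # exponential backoff
```

Usage would look something like `for obj in resilient_stream(lambda: userstream(...)): ...`, provided `userstream` can be made to raise on a stalled read instead of blocking forever, which is exactly the part I do not know how to do.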