Unable to stop Streaming in tweepy after one minute
I am trying to stream Twitter data for a fixed period, say 5 minutes, using the Stream.filter() method, and I store the retrieved tweets in a JSON file. The problem is that I cannot stop filter() from within the program; I have to stop execution manually. I tried stopping it based on system time using the time package. That stopped tweets from being written to the JSON file, but the stream itself keeps running and execution never continues to the next line of code. I am using an IPython notebook to write and execute the code. Here's the code:

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

from tweepy import Stream
from tweepy.streaming import StreamListener

class MyListener(StreamListener):

    def __init__(self, start_time, time_limit=60):
        self.time = start_time
        self.limit = time_limit

    def on_data(self, data):
        while (time.time() - self.time) < self.limit:
            try:
                saveFile = open('abcd.json', 'a')
                saveFile.write(data)
                saveFile.write('\n')
                saveFile.close()
                return True
            except BaseException as e:
                print 'failed ondata,', str(e)
                time.sleep(5)
        return True

    def on_status(self, status):
        if (time.time() - self.time) >= self.limit:
            print 'time is over'
            return false

    def on_error(self, status):
        if (time.time() - self.time) >= self.limit:
            print 'time is over'
            return false
        else:
            print(status)
            return True

start_time = time.time()
stream_data = Stream(auth, MyListener(start_time,20))
stream_data.filter(track=['name1','name2',...list ...,'name n'])#list of the strings I want to track

These links are similar, but they do not answer my question directly:

Tweepy: Stream data for X minutes?

Stopping Tweepy steam after a duration parameter (# lines, seconds, #Tweets, etc)

Tweepy Streaming - Stop collecting tweets at x amount

I used this link as my reference, http://stats.seandolinar.com/collecting-twitter-data-using-a-python-stream-listener/

Pulido answered 3/11, 2015 at 12:18 Comment(2)
You have a valid question, no need to worry there. What do you mean by "the problem is I am unable to stop the filter() method from within the program"? Are you trying to pause the stream? Or change the filter keywords? - Photoneutron
@Photoneutron I want the stream to run only when I need it, say once an hour. If I let it run forever, it will just hit the API limit and stop working. I want to open and close the stream programmatically within the code. - Pulido
C
29
  1. In order to close the stream you need to return False from on_data(), or on_status().

  2. Because tweepy.Stream() runs a while loop itself, you don't need the while loop in on_data().

  3. When initializing MyListener, you didn't call the parent class's __init__ method, so the listener wasn't initialized properly.

So for what you're trying to do, the code should be something like:

import time
import tweepy

class MyStreamListener(tweepy.StreamListener):
    def __init__(self, time_limit=60):
        self.start_time = time.time()
        self.limit = time_limit
        self.saveFile = open('abcd.json', 'a')
        # Initialize the parent class so the listener is set up properly.
        super(MyStreamListener, self).__init__()

    def on_data(self, data):
        if (time.time() - self.start_time) < self.limit:
            self.saveFile.write(data)
            self.saveFile.write('\n')
            return True
        else:
            # Time limit reached: close the file and return False to stop the stream.
            self.saveFile.close()
            return False

# `api` is the tweepy.API object created in the question.
myStream = tweepy.Stream(auth=api.auth, listener=MyStreamListener(time_limit=20))
myStream.filter(track=['test'])
Coburg answered 11/11, 2015 at 21:15 Comment(0)
P
0

Set the variable myListener.running to False. Instead of passing MyListener() directly to Stream, first assign it to a variable:

myListener = MyListener()
# timeout code here, such as time.sleep(20)
myListener.running = False
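
Stopping from outside the listener only works if the stream loop runs in the background; a blocking call never reaches the timeout code. Below is a minimal sketch of that pattern using only the standard library. `FlagStream`, `start`, and `stop` are hypothetical names that merely mimic a loop checking a `running` flag; this is not tweepy's class or API.

```python
import threading
import time


class FlagStream:
    """Hypothetical stand-in for a stream whose loop checks a `running` flag."""

    def __init__(self):
        self.running = False
        self.received = 0
        self._thread = None

    def _run(self):
        # Keep "receiving" until someone clears the flag.
        while self.running:
            self.received += 1  # pretend one message arrived
            time.sleep(0.01)

    def start(self):
        """Start the loop in a background thread so the caller is not blocked."""
        self.running = True
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        """Clear the flag and wait for the loop to exit."""
        self.running = False
        self._thread.join(timeout=1.0)


stream = FlagStream()
stream.start()
time.sleep(0.1)  # the "timeout code"; this would be 20 seconds in practice
stream.stop()
still_alive = stream._thread.is_alive()
```

As far as I know, tweepy 3.x offers a comparable pattern through a non-blocking `filter()` call followed by `Stream.disconnect()`, but the parameter names differ between versions, so check the one you have installed.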
Phraseogram answered 15/3, 2016 at 22:43 Comment(0)
B
0

So, I was having this issue as well. Fortunately Tweepy is open source, so it's easy to dig into the problem.

Basically the important part is this here:

def _data(self, data):
    if self.listener.on_data(data) is False:
        self.running = False

This is in the Stream class in streaming.py.

That means that to close the connection you just have to return False from the listener's on_data() method.
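
Note that the check is `is False`, so only the literal `False` stops the loop; a handler that returns `None` (for example, by falling off the end of the method) keeps the stream running. Here is a minimal stand-in sketch of that behaviour. `MiniStream`, `feed`, `LimitListener`, and `NoneListener` are hypothetical names for illustration only, not tweepy's code, and a message count stands in for the time check.

```python
class MiniStream:
    """Hypothetical stand-in for the read loop; not tweepy's real Stream."""

    def __init__(self, listener):
        self.listener = listener
        self.running = False

    def _data(self, data):
        # Same check as the snippet above: only the literal False stops the loop.
        if self.listener.on_data(data) is False:
            self.running = False

    def feed(self, messages):
        """Deliver messages until exhausted or the listener stops the stream."""
        self.running = True
        delivered = 0
        for message in messages:
            if not self.running:
                break
            self._data(message)
            delivered += 1
        return delivered


class LimitListener:
    """Stops after `limit` messages; a count stands in for the time check."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = []

    def on_data(self, data):
        if len(self.seen) >= self.limit:
            return False  # tells the stream to stop
        self.seen.append(data)
        return True


class NoneListener:
    """Returns None implicitly, which does NOT stop the stream."""

    def on_data(self, data):
        pass


stopped = MiniStream(LimitListener(limit=2)).feed(["a", "b", "c", "d"])
never_stopped = MiniStream(NoneListener()).feed(["a", "b", "c", "d"])
```

With the limit set to 2, the third message triggers the `False` return and the fourth is never delivered, while the `None`-returning listener receives all four.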

Borras answered 28/6, 2017 at 13:19 Comment(0)
H
0

For those who are using the Twitter API v2 (the StreamingClient class), here is the solution:

client.disconnect()

Hundredpercenter answered 3/11, 2022 at 14:45 Comment(0)
