Get All Follower IDs in Twitter by Tweepy

Is it possible to get the full follower list of an account that has more than one million followers, like McDonald's?

I use Tweepy with the following code:

c = tweepy.Cursor(api.followers_ids, id='McDonalds')
ids = []
for page in c.pages():
    ids.append(page)

I also tried this:

for id in c.items():
    ids.append(id)

But I always get the 'Rate limit exceeded' error, and only 5000 follower IDs are returned.

Houri answered 2/7, 2013 at 17:15 Comment(0)
49

To avoid hitting the rate limit, wait before requesting the next page of followers. It looks hacky, but it works:

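import time
import tweepy

auth = tweepy.OAuthHandler(..., ...)
auth.set_access_token(..., ...)

api = tweepy.API(auth)

ids = []
# followers_ids is the Tweepy v3 name; Tweepy v4 renamed it to get_follower_ids.
for page in tweepy.Cursor(api.followers_ids, screen_name="McDonalds").pages():
    ids.extend(page)  # each page holds up to 5000 IDs
    time.sleep(60)    # one request per minute stays within 15 requests per 15 minutes

print(len(ids))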

Hope that helps.

Donner answered 5/7, 2013 at 14:5 Comment(5)
It works, but not for a large number of followers. I tried it with an account that has 600K followers and kept receiving 'rate limit exceeded' errors. Any idea how to get around this problem? - Murrain
Maybe you don't need to sleep after the last page: if len(page) == 5000: time.sleep(60) - Mir
This worked great; I am able to retrieve a large number of follower IDs. Is there a way to get the numerical next_cursor value in Tweepy, or in your code? - Rancorous
@Rancorous take a look at this test. Might help. - Donner
I keep getting a code 34. Did the documentation change? - Tormentor
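
The suggestion in the comments to skip the pause after the final page could be sketched roughly as follows. This is only an illustration: `collect_ids`, its parameters, and the injectable `sleep` argument are hypothetical names, not part of Tweepy, and `pages` stands in for whatever the cursor's `.pages()` call yields.

```python
import time

PAGE_SIZE = 5000  # maximum IDs per page, the figure mentioned in this thread


def collect_ids(pages, delay=60, page_size=PAGE_SIZE, sleep=time.sleep):
    """Flatten an iterable of ID pages, pausing only after full pages.

    `pages` can be any iterable of lists. `sleep` is injectable so the
    function can be tried out without actually waiting.
    """
    ids = []
    for page in pages:
        ids.extend(page)
        # A short page means nothing is left to fetch, so do not wait.
        if len(page) == page_size:
            sleep(delay)
    return ids
```

If the follower count happens to be an exact multiple of the page size, the last page is full and this still sleeps once more than necessary, which is harmless.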
28

Use the rate-limiting arguments when creating the API object. The API will then throttle itself to stay within the rate limit.

The sleep pause is not bad; I use it to simulate a human and to spread activity out over time, with the API rate limiting as a final control.

api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True, compression=True)

Also add try/except to catch and handle errors.
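
As a minimal sketch of what such try/except handling might look like, here is a generic retry helper. `call_with_retry` and all of its parameters are hypothetical names chosen for illustration; in real use you would pass the Tweepy exception class for your version in `retry_on` rather than the default.

```python
import time


def call_with_retry(func, retries=3, wait_seconds=60, retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again.

    At most `retries` extra attempts are made after the first failure.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retry_on as error:
            if attempt >= retries:
                raise  # out of attempts: let the caller see the last error
            attempt += 1
            print("attempt", attempt, "failed:", error)
            sleep(wait_seconds)
```

Keeping `retry_on` narrow matters: retrying on every exception would also hide programming errors and bad credentials.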

Example code: https://github.com/aspiringguru/twitterDataAnalyse/blob/master/sample_rate_limit_w_cursor.py

I put my keys in an external file to make management easier.

https://github.com/aspiringguru/twitterDataAnalyse/blob/master/keys.py
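
One way to keep credentials out of the script is sketched below. It does not reproduce the linked keys.py; it assumes a JSON file instead, and `load_keys`, `REQUIRED_KEYS`, and the four field names are hypothetical choices for this example.

```python
import json

# Hypothetical field names for the four OAuth 1.0a credentials.
REQUIRED_KEYS = ("consumer_key", "consumer_secret", "access_token", "access_token_secret")


def load_keys(path):
    """Read the four credentials from a JSON file and fail early if any is missing."""
    with open(path, encoding="utf-8") as handle:
        keys = json.load(handle)
    missing = [name for name in REQUIRED_KEYS if not keys.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {name: keys[name] for name in REQUIRED_KEYS}
```

The returned dictionary can then be unpacked into whichever auth handler your Tweepy version provides, and the JSON file can be excluded from version control.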

Eurasian answered 22/7, 2016 at 23:36 Comment(1)
Good advice. Thanks! - Prog
5

I use this code and it works for a large number of followers. There are two functions: one saves the follower IDs collected so far before every sleep period, and the other fetches the list. It is a little messy, but I hope it is useful.

import csv
import time

import tweepy

# Assumes `api` is an authenticated tweepy.API object created beforehand.


def save_followers_status(filename, followers_ids):
    """Append a batch of follower IDs to <filename>_followers_status.csv, one per row."""
    path = '//content//drive//My Drive//Colab Notebooks//twitter//' + filename
    file = path + '_followers_status.csv'
    if len(followers_ids) > 0:
        print("save followers status of ", filename)
        # newline='' avoids blank lines between rows, see
        # https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row
        # Mode 'a' creates the file if it does not exist yet.
        with open(file, mode='a', newline='') as csv_file:
            writer = csv.writer(csv_file, delimiter=',')
            for follower_id in followers_ids:
                writer.writerow([follower_id])


def get_followers_id(person):
    """Return all follower IDs of `person`, saving each batch before a rate-limit sleep."""
    followers_ids = []  # every ID collected so far
    saved = 0           # how many of them are already written to the CSV

    influencer = api.get_user(screen_name=person)
    influencer_id = influencer.id
    number_of_followers = influencer.followers_count
    print("number of followers count : ", number_of_followers, '\n', 'user id : ', influencer_id)
    status = tweepy.Cursor(api.followers_ids, screen_name=person).items()
    while True:
        try:
            followers_ids.append(next(status))
        except tweepy.RateLimitError:
            # Tweepy v3 exception; any other TweepError propagates to the caller.
            print('Twitter rate limit reached, sleeping for 15 min')
            timestamp = time.strftime("%d.%m.%Y %H:%M:%S", time.localtime())
            print(timestamp)
            print('number fetched so far :', len(followers_ids), 'all followers count is : ', number_of_followers)
            save_followers_status(person, followers_ids[saved:])
            saved = len(followers_ids)
            time.sleep(15 * 60)  # the next loop pass asks the cursor again
        except StopIteration:
            print('end of followers ', len(followers_ids), 'all followers count is : ', number_of_followers)
            break
    save_followers_status(person, followers_ids[saved:])
    return followers_ids
Scoggins answered 31/8, 2019 at 12:18 Comment(0)
3

The answer from alecxe is good; however, no one has referred to the docs. The correct information and explanation for this question live in the Twitter API documentation. From the documentation:

Results are given in groups of 5,000 user IDs and multiple “pages” of results can be navigated through using the next_cursor value in subsequent requests.
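
The paging described in that quote can be sketched without any library, assuming the usual convention of Twitter's cursored endpoints (a cursor of -1 requests the first page and a next_cursor of 0 marks the end). `iter_id_pages` and `fetch_page` are hypothetical names; `fetch_page(cursor)` stands for one request that returns the IDs together with the next_cursor value.

```python
def iter_id_pages(fetch_page):
    """Yield pages of IDs by following next_cursor until the API reports the end.

    `fetch_page(cursor)` is a hypothetical callable that performs one
    request and returns a tuple (ids, next_cursor).
    """
    cursor = -1  # -1 asks for the first page
    while cursor != 0:  # a next_cursor of 0 means there are no more pages
        ids, cursor = fetch_page(cursor)
        yield ids
```

This is essentially the loop that tweepy.Cursor runs for you; writing it by hand is mainly useful when you want to store the cursor value and resume an interrupted download later.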

Demulsify answered 9/7, 2018 at 9:51 Comment(0)
0

Tweepy's get_follower_ids() uses the https://api.twitter.com/1.1/followers/ids.json endpoint. This endpoint has a rate limit of 15 requests per 15 minutes. You are getting the 'Rate limit exceeded' error because you are crossing that threshold.

Instead of manually putting the sleep in your code you can use wait_on_rate_limit=True when creating the Tweepy API object.

Moreover, the endpoint has an optional parameter, count, which specifies the number of IDs to return per page. The Twitter API documentation does not say anything about its default value; its maximum value is 5000. To get the most IDs per request, set it explicitly to the maximum so that you need fewer requests.

Here is my code for getting all the followers' ids:

import tweepy

auth = tweepy.OAuth1UserHandler(consumer_key='', consumer_secret='',
    access_token='', access_token_secret='')
# wait_on_rate_limit=True makes Tweepy wait until the rate-limit window resets
api = tweepy.API(auth, wait_on_rate_limit=True)
account_id = 71026122  # instead of user_id you can also pass screen_name
follower_ids = []
for page in tweepy.Cursor(api.get_follower_ids, user_id=account_id, count=5000).pages():
    follower_ids.extend(page)
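
To get a feel for how long such a download takes, the two figures above (5000 IDs per request, 15 requests per 15 minutes) can be turned into a rough estimate. This is a back-of-the-envelope sketch: `estimate_minutes` is a hypothetical helper, and it ignores network time and assumes the first window of requests goes out immediately.

```python
import math


def estimate_minutes(follower_count, page_size=5000, requests_per_window=15, window_minutes=15):
    """Return (pages, minutes_waiting) as a rough lower bound for a full download."""
    pages = math.ceil(follower_count / page_size)
    # The first window's requests are sent right away; every further
    # window of requests costs one full wait.
    full_waits = max(0, math.ceil(pages / requests_per_window) - 1)
    return pages, full_waits * window_minutes


print(estimate_minutes(1_000_000))  # (200, 195)
```

So an account with one million followers needs 200 requests and a little over three hours of waiting, which is why letting the library wait on the rate limit is more practical than failing on the first error.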
Megrims answered 13/7, 2022 at 23:19 Comment(0)
