Python HTTP/UDP BitTorrent tracker scrape library

Asked 10/3, 2013 at 10:12 Answered 11/3, 2013 at 3:56

Solved python bittorrent tracker libtorrent

I have a list of torrent info_hashes. For each info_hash, I have a list of trackers that correspond with that info_hash.

What I would like to do is scrape each tracker in the list to get the seeder/leecher/completed counts. However, I'd rather not write this myself, as I'm sure this code has been implemented elsewhere.

Does anyone know of a Python library that can scrape http:// and udp:// trackers?

I have been using libtorrent for other parts of this project; however, it can only scrape a tracker from a valid torrent_handle, and I don't want to add these info_hashes to a libtorrent session just to scrape the tracker, because that would start downloading the files, which I don't want.

Tackling answered 10/3, 2013 at 10:12 Comment(0)

I also didn't want to use libtorrent because it is quite inefficient for this: I want to be able to query a tracker for multiple info_hashes at once instead of one at a time.
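For HTTP trackers, several info_hashes can usually go into one scrape request by repeating the info_hash parameter. A minimal sketch, assuming the tracker follows the common convention of swapping "announce" for "scrape" in the last path segment; build_scrape_url is a hypothetical helper name, not part of any library:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def build_scrape_url(announce_url, info_hashes):
    """Turn an HTTP announce URL into a scrape URL covering several info_hashes.

    info_hashes is an iterable of raw 20-byte hashes (bytes), not hex strings.
    """
    parts = urlsplit(announce_url)
    head, _, last = parts.path.rpartition("/")
    if not last.startswith("announce"):
        # Assumption: trackers that break this convention do not support scrape.
        raise ValueError("announce URL does not follow the scrape convention")
    scrape_path = head + "/" + "scrape" + last[len("announce"):]

    # Each hash is percent-encoded and sent as its own info_hash parameter.
    params = ["info_hash=" + quote(h, safe="") for h in info_hashes]
    if parts.query:
        # Keep any existing query string (for example a passkey).
        params.insert(0, parts.query)
    return urlunsplit((parts.scheme, parts.netloc, scrape_path, "&".join(params), ""))
```

The response to such a request is a bencoded dictionary whose "files" entry is keyed by info_hash, with "complete", "downloaded" and "incomplete" counts for each one.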

I ended up writing my own Python HTTP/UDP tracker scraping code; see here: https://github.com/erindru/m2t/blob/master/m2t/scraper.py (improvements most welcome!)
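For readers who want the general shape of a UDP scrape without opening the link, here is a rough sketch based on the published UDP tracker protocol (connect first, then scrape with the returned connection_id). It is not the code from the repository above; all function names are made up for illustration, and retries and error responses (action 3) are left out:

```python
import random
import socket
import struct

PROTOCOL_ID = 0x41727101980  # magic constant defined by the UDP tracker protocol
ACTION_CONNECT, ACTION_SCRAPE = 0, 2


def build_connect_request(transaction_id):
    # 64-bit protocol id, 32-bit action, 32-bit transaction id (big-endian).
    return struct.pack(">QII", PROTOCOL_ID, ACTION_CONNECT, transaction_id)


def parse_connect_response(data, transaction_id):
    action, tid, connection_id = struct.unpack_from(">IIQ", data)
    if action != ACTION_CONNECT or tid != transaction_id:
        raise ValueError("unexpected connect response")
    return connection_id


def build_scrape_request(connection_id, transaction_id, info_hashes):
    # 16-byte header followed by the raw 20-byte hashes, back to back.
    header = struct.pack(">QII", connection_id, ACTION_SCRAPE, transaction_id)
    return header + b"".join(info_hashes)


def parse_scrape_response(data, transaction_id, info_hashes):
    action, tid = struct.unpack_from(">II", data)
    if action != ACTION_SCRAPE or tid != transaction_id:
        raise ValueError("unexpected scrape response")
    results = {}
    # One 12-byte record per hash, in the same order as the request.
    for i, info_hash in enumerate(info_hashes):
        seeders, completed, leechers = struct.unpack_from(">III", data, 8 + 12 * i)
        results[info_hash] = {
            "seeders": seeders,
            "completed": completed,
            "leechers": leechers,
        }
    return results


def scrape_udp(host, port, info_hashes, timeout=5.0):
    """Scrape one UDP tracker for several raw 20-byte info_hashes at once."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        tid = random.getrandbits(32)
        sock.sendto(build_connect_request(tid), (host, port))
        connection_id = parse_connect_response(sock.recv(2048), tid)
        tid = random.getrandbits(32)
        sock.sendto(build_scrape_request(connection_id, tid, info_hashes), (host, port))
        return parse_scrape_response(sock.recv(2048), tid, info_hashes)
```

The protocol document notes that only a limited number of hashes (about 74) fit in one scrape packet, so a longer list would need to be split into batches.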

Tackling answered 11/3, 2013 at 3:56 Comment(5)

Can this get you the peer list/ seeder list of IP addresses? – Ponderable 13/11, 2013 at 1:53

Nope, it currently doesn't care about that, but it could be extended to do so – Tackling 13/11, 2013 at 3:59

OK, thanks. One more question: I see the HTTP code expects a bencoded dictionary and reads the data from it, yet the UDP code just reads at offsets into the buffer. How did you know the order of the bytes and what they represent? If I need the IPs of peers, at what offset are they? Is there any documentation? – Ponderable 13/11, 2013 at 4:52

The UDP tracker protocol is not the same as HTTP; see xbtt.sourceforge.net/udp_tracker_protocol.html – Tackling 13/11, 2013 at 20:25
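On the offsets question: in the UDP protocol the peer addresses come back in the announce response (action 1), not in the scrape response. After a 20-byte header, each peer takes 6 bytes: a 4-byte IPv4 address followed by a 2-byte port. A minimal parsing sketch under that reading of the protocol document; parse_announce_response is a hypothetical name, and sending the announce request itself is not shown:

```python
import socket
import struct

ACTION_ANNOUNCE = 1


def parse_announce_response(data, transaction_id):
    """Decode a UDP announce response into counts and a list of (ip, port)."""
    # Header: action, transaction id, interval, leechers, seeders (5 x 32-bit).
    action, tid, interval, leechers, seeders = struct.unpack_from(">IIIII", data)
    if action != ACTION_ANNOUNCE or tid != transaction_id:
        raise ValueError("unexpected announce response")

    peers = []
    # Peers start at offset 20; any trailing partial record is ignored.
    for offset in range(20, len(data) - 5, 6):
        ip_bytes, port = struct.unpack_from(">4sH", data, offset)
        peers.append((socket.inet_ntoa(ip_bytes), port))

    return {
        "interval": interval,
        "leechers": leechers,
        "seeders": seeders,
        "peers": peers,
    }
```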

Thanks, I was looking at this earlier; it does not have a peer_list. Is it possible to extend your implementation to get the peer list for both HTTP and UDP? Otherwise, how do the torrent clients do that? – Ponderable 13/11, 2013 at 21:26

This is not directly an answer to your question, but a suggestion of how you could use libtorrent.

If you add the info-hash in a paused, non-auto-managed state (controlled by the flags in add_torrent_params), libtorrent won't start downloading it.

Keep in mind that libtorrent does not (yet) support scraping the DHT.

Viyella answered 10/3, 2013 at 23:56 Comment(0)
