Recording and saving internet radio streams in Python
I am looking for a Python snippet to read an internet radio stream (.asx, .pls, etc.) and save it to a file.

The final project is a cron'ed script that will record an hour or two of internet radio and then transfer it to my phone for playback during my commute (3G is kind of spotty along my commute).

Any snippets or pointers are welcome.

Dysphonia answered 22/11, 2010 at 15:47 Comment(0)
So after tinkering and playing with it, I've found Streamripper to work best. This is the command I use (-d sets the output directory, -l the recording length in seconds, so 10800 is three hours, and -a rips everything to a single file):

streamripper "http://yp.shoutcast.com/sbin/tunein-station.pls?id=1377200" -d ./streams -l 10800 -a "tb$FNAME"
Dysphonia answered 28/3, 2011 at 22:33 Comment(0)
The following has worked for me, using the requests library to handle the HTTP request.

import requests

stream_url = 'http://your-stream-source.com/stream'

r = requests.get(stream_url, stream=True)

with open('stream.mp3', 'wb') as f:
    try:
        for block in r.iter_content(1024):
            f.write(block)
    except KeyboardInterrupt:
        pass

That will save the stream to the stream.mp3 file until you interrupt it with Ctrl+C.
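Since the question asks for a cron'ed, fixed-length recording rather than a manual Ctrl+C, the same loop can be given a time limit. A minimal sketch (the helper name and the one-hour duration are my own choices, not from the answer):

```python
import time

def record_for(chunks, out, seconds):
    # Write chunks from an iterable (e.g. r.iter_content(1024))
    # to a file object until `seconds` have elapsed.
    deadline = time.monotonic() + seconds
    for chunk in chunks:
        if chunk:                      # skip keep-alive empty chunks
            out.write(chunk)
        if time.monotonic() >= deadline:
            break
```

Used with the response above, record_for(r.iter_content(1024), f, 3600) would stop after roughly an hour (it can overrun by one chunk's download time, since the check happens between chunks).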

Timeconsuming answered 8/12, 2015 at 15:4 Comment(5)
It does help, but how do I discard blocks which are repeated in the file? – Ulphiah
It seems that requests doesn't support the SHOUTcast protocol (some radios); it raises an error: BadStatusLine('ICY 200 OK\r\n',) – Regimentals
See my solution for this below, @Regimentals – it took me a while to find the best way, but I'm overriding the status check it does and telling it ICY 200 OK is really HTTP/1.0 200 OK, which is the cleanest way to get past that and on to more fun stuff. Hope it helps! – Helminthiasis
For those who are looking for the stream URL: on many websites it is not obvious where to find it, but many streaming sites provide a .pls (playlist) file for download, and if you open the .pls file in a text editor it will have the stream URL. – Comatulid
I wonder if there is a way to take the audio data in "block" and process it for dead space, so you can decide whether to write that block or not. – Newish
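Following up on the .pls comment above: a .pls file is INI-style text whose File1=, File2=, ... entries hold the actual stream URLs, so extracting them takes only a few lines. A sketch (the example URL is a placeholder):

```python
def pls_stream_urls(pls_text):
    # Collect the values of FileN= entries from .pls playlist text.
    urls = []
    for line in pls_text.splitlines():
        key, sep, value = line.strip().partition("=")
        # match "File" followed by digits, e.g. File1, File2, ...
        if sep and key.lower().startswith("file") and key[4:].isdigit():
            urls.append(value.strip())
    return urls
```

For example, pls_stream_urls(open("station.pls").read()) would return a list such as ["http://streams.example.com/radio.mp3"].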
If you find that your requests or urllib.request call in Python 3 fails to save a stream because you receive "ICY 200 OK" in return instead of an "HTTP/1.0 200 OK" header, you need to tell the underlying functions ICY 200 OK is OK!

What you can effectively do is intercept the routine that handles reading the status after opening the stream, just before processing the headers.

Simply put a routine like this above your stream opening code.

def NiceToICY(self):
    class InterceptedHTTPResponse():
        pass
    import io
    line = self.fp.readline().replace(b"ICY 200 OK\r\n", b"HTTP/1.0 200 OK\r\n")
    InterceptedSelf = InterceptedHTTPResponse()
    InterceptedSelf.fp = io.BufferedReader(io.BytesIO(line))
    InterceptedSelf.debuglevel = self.debuglevel
    InterceptedSelf._close_conn = self._close_conn
    return ORIGINAL_HTTP_CLIENT_READ_STATUS(InterceptedSelf)

Then put these lines at the start of your main routine, before you open the URL.

ORIGINAL_HTTP_CLIENT_READ_STATUS = urllib.request.http.client.HTTPResponse._read_status
urllib.request.http.client.HTTPResponse._read_status = NiceToICY

They will override the standard routine (this one time only) and run the NiceToICY function in place of the normal status check when it has opened the stream. NiceToICY replaces the unrecognised status response, then copies across the relevant bits of the original response which are needed by the 'real' _read_status function. Finally the original is called and the values from that are passed back to the caller and everything else continues as normal.

I have found this to be the simplest way to get round the problem of the status message causing an error. Hope it's useful for you, too.

Helminthiasis answered 27/12, 2016 at 2:6 Comment(0)
T
3

I am aware this is a year old, but this is still a viable question, which I have recently been fiddling with.

Most internet radio stations will give you a choice of stream type; I choose the MP3 version, then read the data from a raw socket and write it to a file. The trick is figuring out how fast your download is compared to playback speed, so you can balance the read/write sizes. This would be in your buffer def.

Now that you have the file, it is fine to simply leave it on your drive (record), but most players will delete the already-played chunk from the file and clear it off the drive and RAM when streaming is stopped.

I have used some code snippets from a file-archiving (without compression) app to handle a lot of the file handling, playing, and buffering magic. The process flow is very similar. If you write up some pseudo-code (which I highly recommend) you can see the similarities.
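The raw-socket approach described above can be sketched roughly as follows. The function takes an already-connected socket so the request/response handling stays visible; the host, path, and chunk size are illustrative, not from the answer:

```python
import socket

def rip_stream(sock, host, path, out, max_bytes=0):
    # Send a minimal ICY/HTTP request over a connected socket, skip
    # the response headers, and copy the audio bytes into the writable
    # file object `out` (max_bytes=0 means "until the server closes").
    request = ("GET {} HTTP/1.0\r\n"
               "Host: {}\r\n"
               "Icy-MetaData: 0\r\n\r\n").format(path, host)
    sock.sendall(request.encode("ascii"))
    f_in = sock.makefile("rb")
    while f_in.readline().strip():     # headers end at a blank line
        pass
    written = 0
    while max_bytes == 0 or written < max_bytes:
        limit = 4096 if max_bytes == 0 else min(4096, max_bytes - written)
        chunk = f_in.read(limit)
        if not chunk:
            break
        out.write(chunk)
        written += len(chunk)
    return written
```

Usage would look like sock = socket.create_connection(("stream.example.com", 8000)) followed by rip_stream(sock, "stream.example.com", "/", open("out.mp3", "wb")), with the hostname and port taken from the .pls file.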

Tetrastichous answered 7/11, 2012 at 23:19 Comment(1)
Is there really a need to worry about how fast it downloads with regard to buffering? You can simply have your code save the audio chunks as they arrive. That's pretty automatic in Python with something like: req = urllib.request.Request(URL); resp = urllib.request.urlopen(req); headers = resp.getheaders(); with open(outfile, 'wb') as f: shutil.copyfileobj(resp, f) – Helminthiasis
I'm only familiar with how shoutcast streaming works (which would be the .pls file you mention):

You download the .pls file, which is just a playlist. Its format is fairly simple: it's just a text file that points to where the real stream is.

You can connect to that stream, as it's just HTTP, streaming either MP3 or AAC. For your use, just save every byte you get to a file and you'll end up with an MP3 or AAC file you can transfer to your MP3 player.

Shoutcast has one optional addition: metadata. You can find how that works here, but it is not really needed.
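For the curious, the optional metadata works like this: if the client sends an Icy-MetaData: 1 request header, the server's icy-metaint response header gives an interval N, and after every N audio bytes the stream carries one length byte (counting 16-byte units) followed by that many bytes of StreamTitle='...'; text, null-padded. A sketch of de-interleaving a captured buffer (the function name is mine):

```python
def split_icy(data, metaint):
    # Separate a raw SHOUTcast capture into (audio_bytes, titles),
    # given the icy-metaint value from the response headers.
    audio = bytearray()
    titles = []
    i = 0
    while i + metaint <= len(data):
        audio += data[i:i + metaint]
        i += metaint
        if i >= len(data):
            break
        length = data[i] * 16          # length byte counts 16-byte units
        i += 1
        meta = data[i:i + length].rstrip(b"\x00").decode("latin-1", "replace")
        i += length
        if "StreamTitle='" in meta:
            titles.append(meta.split("StreamTitle='", 1)[1].split("';", 1)[0])
    audio += data[i:]                  # keep any trailing partial audio block
    return bytes(audio), titles
```

A length byte of zero (no metadata update) is handled naturally: the slice is empty and no title is recorded.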

If you want a sample application that does this, let me know and I'll make up something later.

Brannan answered 22/11, 2010 at 16:5 Comment(1)
Thanks, I'm going to give it a try. – Dysphonia
In line with the answer from https://stackoverflow.com/users/1543257/dingles (https://stackoverflow.com/a/41338150), here's how you can achieve the same result with the asynchronous HTTP client library, aiohttp:

import functools

import aiohttp
from aiohttp.client_proto import ResponseHandler
from aiohttp.http_parser import HttpResponseParserPy


class ICYHttpResponseParser(HttpResponseParserPy):
    def parse_message(self, lines):
        if lines[0].startswith(b"ICY "):
            lines[0] = b"HTTP/1.0 " + lines[0][4:]
        return super().parse_message(lines)


class ICYResponseHandler(ResponseHandler):
    def set_response_params(
        self,
        *,
        timer = None,
        skip_payload = False,
        read_until_eof = False,
        auto_decompress = True,
        read_timeout = None,
        read_bufsize = 2 ** 16,
        timeout_ceil_threshold = 5,
    ) -> None:
        # this is a copy of the implementation from here:
        # https://github.com/aio-libs/aiohttp/blob/v3.8.1/aiohttp/client_proto.py#L137-L165
        self._skip_payload = skip_payload

        self._read_timeout = read_timeout
        self._reschedule_timeout()

        self._timeout_ceil_threshold = timeout_ceil_threshold

        self._parser = ICYHttpResponseParser(
            self,
            self._loop,
            read_bufsize,
            timer=timer,
            payload_exception=aiohttp.ClientPayloadError,
            response_with_body=not skip_payload,
            read_until_eof=read_until_eof,
            auto_decompress=auto_decompress,
        )

        if self._tail:
            data, self._tail = self._tail, b""
            self.data_received(data)


class ICYConnector(aiohttp.TCPConnector):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._factory = functools.partial(ICYResponseHandler, loop=self._loop)

This can then be used as follows:

session = aiohttp.ClientSession(connector=ICYConnector())
async with session.get("url") as resp:
    print(resp.status)

Yes, this uses a few private classes and attributes, but it is the only way to change the handling of something that is part of the HTTP spec and (theoretically) should never need to be changed by the library's user...

All things considered, I would say this is still rather clean compared to monkey-patching, which would change the behavior for all requests (especially true for asyncio, where setting the patch before a request and resetting it after does not guarantee that something else won't make a request while the ICY request is in flight). This way, you can dedicate a ClientSession object specifically to requests against servers that respond with the ICY status line.

Note that this comes with a performance penalty for requests made with ICYConnector: to make it work, I am using the pure-Python implementation of HttpResponseParser, which is slower than the C parser aiohttp uses by default. This cannot really be done differently without vendoring the whole library, as the status-line parsing behavior is deeply hidden in the C code.

Redon answered 16/4, 2022 at 3:58 Comment(0)
