How to create video thumbnails with Python and Gstreamer
J

5

9

I'd like to create thumbnails for MPEG-4 AVC videos using Gstreamer and Python. Essentially:

  1. Open the video file
  2. Seek to a certain point in time (e.g. 5 seconds)
  3. Grab the frame at that time
  4. Save the frame to disc as a .jpg file

I've been looking at this other similar question, but I cannot quite figure out how to do the seek and frame capture automatically without user input.

So in summary, how can I capture a video thumbnail with Gstreamer and Python as per the steps above?

Julio answered 3/4, 2013 at 14:6 Comment(2)
Note that "5 seconds" probably won't work. For many commercial movies, you'll just get the intro/logo. Try to find black frames (they indicate scene changes) and then seek a few seconds into the scene. Offer the user 4-5 of those to find an image which is easy to recognize.Josejosee
This is for personal videos that are all longer than 5 seconds. In any case, the 5 seconds figure is just arbitrary and for the sake of the example. It could be 2, 10, or any other value below, let's say, 30 secs.Julio
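The black-frame idea from the first comment could be sketched roughly as below. This is only an illustration, assuming frames are already available as packed 24-bit RGB bytes; the function names, the sampling step, and the threshold of 16 are hypothetical choices, not part of GStreamer.

```python
def mean_luma(rgb, width, height, stride=None):
    """Average Rec. 601 luma (0-255) of a packed 24-bit RGB frame."""
    row_bytes = width * 3
    if stride is None:
        stride = row_bytes  # assumption: rows are not padded
    total = 0
    for y in range(height):
        row = bytearray(rgb[y * stride:y * stride + row_bytes])
        for x in range(0, row_bytes, 3):
            # Integer weights 299/587/114 sum to 1000.
            total += 299 * row[x] + 587 * row[x + 1] + 114 * row[x + 2]
    return total / (1000.0 * width * height)


def candidate_offsets(lumas, step=1.0, lead_in=2.0, threshold=16.0):
    """Offsets (seconds) a little after each black -> non-black transition.

    `lumas` holds one mean-luma value per frame sampled every `step` seconds.
    """
    offsets = []
    was_black = False
    for index, luma in enumerate(lumas):
        black = luma < threshold
        if was_black and not black:
            offsets.append(index * step + lead_in)
        was_black = black
    return offsets
```

The returned offsets would then be fed to the seek step, offering the user a handful of candidate thumbnails.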
B
8

To elaborate on ensonic's answer, here's an example:

import os
import sys

import gst

def get_frame(path, offset=5, caps=gst.Caps('image/png')):
    pipeline = gst.parse_launch('playbin2')
    pipeline.props.uri = 'file://' + os.path.abspath(path)
    pipeline.props.audio_sink = gst.element_factory_make('fakesink')
    pipeline.props.video_sink = gst.element_factory_make('fakesink')
    pipeline.set_state(gst.STATE_PAUSED)
    # Wait for state change to finish.
    pipeline.get_state()
    assert pipeline.seek_simple(
        gst.FORMAT_TIME, gst.SEEK_FLAG_FLUSH, offset * gst.SECOND)
    # Wait for seek to finish.
    pipeline.get_state()
    buffer = pipeline.emit('convert-frame', caps)
    pipeline.set_state(gst.STATE_NULL)
    return buffer

def main():
    buf = get_frame(sys.argv[1])

    # Open in binary mode so the PNG bytes are written unchanged.
    with open('frame.png', 'wb') as fh:
        fh.write(str(buf))

if __name__ == '__main__':
    main()

This generates a PNG image. You can get raw image data using gst.Caps("video/x-raw-rgb,bpp=24,depth=24") or something like that.
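If you go the raw route, the bytes still need a container before an image viewer can open them. Here is a minimal sketch that wraps packed 24-bit RGB data in a binary PPM file. The helper name is made up, the width and height are assumed to come from the caps of the returned buffer, and the default of rows padded to a multiple of 4 bytes is an assumption about GStreamer's RGB layout that you should verify for your version.

```python
def raw_rgb_to_ppm(data, width, height, stride=None):
    """Return binary PPM (P6) bytes for a packed 24-bit RGB frame."""
    row_bytes = width * 3
    if stride is None:
        # Assumption: each row is padded to a multiple of 4 bytes.
        stride = (row_bytes + 3) // 4 * 4
    if len(data) < stride * (height - 1) + row_bytes:
        raise ValueError("buffer too small for %dx%d frame" % (width, height))
    header = ("P6\n%d %d\n255\n" % (width, height)).encode("ascii")
    # Copy each row without its trailing padding bytes.
    rows = [bytes(data[y * stride:y * stride + row_bytes])
            for y in range(height)]
    return header + b"".join(rows)
```

Writing the result with `open('frame.ppm', 'wb')` gives a file that most image tools can read or convert.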

Note that in GStreamer 1.0 (as opposed to 0.10), playbin2 has been renamed to playbin and the convert-frame signal is named convert-sample.

The mechanics of seeking are explained in this chapter of the GStreamer Application Development Manual. The 0.10 playbin2 documentation no longer seems to be online, but the documentation for 1.0 is here.

Bwana answered 10/5, 2013 at 8:47 Comment(5)
That is excellent, thanks! I've tried to port the code to PyGI, and I've found an issue whereby gst.Caps('image/png') no longer works, as the new Gst.Caps() does not take any arguments, and I haven't found any replacement (Gst.caps_from_string('image/png') segfaults). Any pointers?Julio
I've created a gist with the PyGI version and it runs without error. However, it creates unreadable .png files. If any GStreamer expert could spot the mistake, any pointer would be welcome, thanks!Julio
My guess is that str(buf) no longer does what it used to and it's now giving you something like "<GStBuffer ...>". Have you tried looking at the resulting PNG file? I'm guessing you want something like buf.data.Bwana
You were right, the png file was textual data. I've managed to get hold of the Gst.Buffer (see updated gist), but I still can't figure out how to get the actual bytes from the buffer. This is proving to be a bit more difficult than I expected, I might have to open a new question.Julio
Ok, so it seems this is not possible to do in GStreamer 1.0, I believe I've now stumbled upon this bug :/Julio
T
4

An example in Vala, with GStreamer 1.0:

var playbin = Gst.ElementFactory.make ("playbin", null);
playbin.set ("uri", "file:///path/to/file");
// some code here.
var caps = Gst.Caps.from_string("image/png");
Gst.Sample sample;
Signal.emit_by_name(playbin, "convert-sample", caps, out sample);
if(sample == null)
    return;
var sample_caps = sample.get_caps ();
if(sample_caps == null)
    return;
unowned Gst.Structure structure = sample_caps.get_structure(0);
int width = (int)structure.get_value ("width");
int height = (int)structure.get_value ("height");
var memory = sample.get_buffer().get_memory (0);
Gst.MapInfo info;
memory.map (out info, Gst.MapFlags.READ);
uint8[] data = info.data;
Townley answered 4/2, 2015 at 10:52 Comment(2)
Thanks! However, I'm still looking for a Python example. It seems that due to this bug this is still not possible with GStreamer 1.0 and Python :/Julio
It doesn't answer the question, but it does help me. If someone could explain when "sample" might be null then I might be able to get this to work!Achaemenid
G
4

It's an old question, but I still haven't found it documented anywhere.
I found that the following worked on a playing video with GStreamer 1.0:

import gi
import time
gi.require_version('Gst', '1.0')
from gi.repository import Gst

def get_frame():
    caps = Gst.Caps.from_string('image/png')
    pipeline = Gst.ElementFactory.make("playbin", "playbin")
    pipeline.set_property('uri','file:///home/rolf/GWPE.mp4')
    pipeline.set_state(Gst.State.PLAYING)
    #Allow time for it to start
    time.sleep(0.5)
    # jump 30 seconds
    seek_time = 30 * Gst.SECOND
    pipeline.seek(1.0, Gst.Format.TIME,(Gst.SeekFlags.FLUSH | Gst.SeekFlags.ACCURATE),Gst.SeekType.SET, seek_time , Gst.SeekType.NONE, -1)

    #Allow video to run to prove it's working, then take snapshot
    time.sleep(1)
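    sample = pipeline.emit('convert-sample', caps)
    # Stop the pipeline on every path, not only on success.
    pipeline.set_state(Gst.State.NULL)
    if sample is None:
        return None
    buff = sample.get_buffer()
    result, map_info = buff.map(Gst.MapFlags.READ)
    if not result:
        return None
    data = bytes(map_info.data)  # copy the bytes before unmapping
    buff.unmap(map_info)
    return data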

if __name__ == '__main__':
    Gst.init(None)
    image = get_frame()
    # get_frame() returns None when no frame could be captured.
    if image is not None:
        with open('frame.png', 'wb') as snapshot:
            snapshot.write(image)

The code should run with both Python 2 and Python 3. I hope it helps someone.

Gavel answered 17/10, 2017 at 15:3 Comment(0)
P
2

Use playbin2: set the uri to the media file, use gst_element_seek_simple to seek to the desired time position, and then use g_signal_emit to invoke the "convert-frame" action signal.

Paratuberculosis answered 4/4, 2013 at 13:6 Comment(5)
Thanks for your answer. Would you care to elaborate a bit, perhaps with a code snippet? I understand the part with playbin2, but neither gst_element_seek_simple() nor gst.element_seek_simple() seem to be available in Python.Julio
Ok, figured out that there is gst.Element.seek_simple() in Python, and how to use it. Still, a Python snippet would be really helpful, as now the next thing to figure out is how to use the g_signal_emit equivalent.Julio
Sorry, I can't help on the Python side :/Paratuberculosis
And on the C side? I should be able to work out the Python code from a C snippet.Julio
It might be gobject.GObject.emit()Flannery
C
0

Here is a script in Python to capture an image with GStreamer 1.0 and playbin. If you want to resize the image, just load the data into a GdkPixbuf and use its scale methods. Hope it can still help...

import sys

import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst

def main(args):
    if len(args) != 2:
        sys.stderr.write("usage: %s <media file or uri>\n" % args[0])
        sys.exit(1)

    GObject.threads_init()
    Gst.init(None)
        
    playbin = Gst.ElementFactory.make("playbin", None)
    if not playbin:
        sys.stderr.write("'playbin' gstreamer plugin missing\n")
        sys.exit(1)

    # Take the command-line argument and ensure that it is a URI.
    if Gst.uri_is_valid(args[1]):
        uri = args[1]
    else:
        uri = Gst.filename_to_uri(args[1])
    playbin.set_property('uri', uri)
    # Use fakesink for audio and video so that no window is opened.
    playbin.set_property('audio_sink', Gst.ElementFactory.make("fakesink", None))
    playbin.set_property('video_sink', Gst.ElementFactory.make("fakesink", None))

    playbin.set_state(Gst.State.PAUSED) #init the first reading
    
    state=playbin.get_state(Gst.CLOCK_TIME_NONE) # Wait for state change to finish.

    assert(playbin.seek_simple(
        Gst.Format.TIME, Gst.SeekFlags.FLUSH, 1 * Gst.SECOND)) #move the cursor to 1second in the video
    
    playbin.get_state(Gst.CLOCK_TIME_NONE) # Wait for seek to finish.
    
    caps = Gst.Caps.from_string('image/png') # caps is the format of capture 
    
    sample = playbin.emit('convert-sample', caps)  # ask playbin to convert the current frame
    if sample is None:
        sys.stderr.write("no frame available to convert\n")
    else:
        buffer = sample.get_buffer()  # retrieve the buffer
        result, map_info = buffer.map(Gst.MapFlags.READ)  # map the buffer to read its bytes
        if result:
            with open('frame.png', 'wb') as snapshot:
                snapshot.write(map_info.data)  # the data of the capture
            buffer.unmap(map_info)

    playbin.set_state(Gst.State.NULL)  # cleanup

if __name__ == '__main__':
    sys.exit(main(sys.argv))
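
For the resizing step mentioned above, and to end up with a .jpg as the question asks, something along these lines might work. It is a sketch rather than tested production code: `fit_within` and `save_jpeg_thumbnail` are names invented here, the 320-pixel limit and quality of 85 are arbitrary, and it assumes the GdkPixbuf 2.0 introspection bindings with JPEG support are installed.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) so the longer side is at most max_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Never upscale: cap the factor at 1.0.
    scale = min(1.0, float(max_side) / max(width, height))
    return (max(1, int(round(width * scale))),
            max(1, int(round(height * scale))))


def save_jpeg_thumbnail(png_data, path, max_side=320, quality=85):
    """Decode PNG bytes, scale them down, and save the result as JPEG."""
    # Imported lazily so fit_within() works without PyGObject installed.
    import gi
    gi.require_version('GdkPixbuf', '2.0')
    from gi.repository import GdkPixbuf

    loader = GdkPixbuf.PixbufLoader()
    loader.write(png_data)
    loader.close()
    pixbuf = loader.get_pixbuf()

    new_w, new_h = fit_within(pixbuf.get_width(), pixbuf.get_height(),
                              max_side)
    scaled = pixbuf.scale_simple(new_w, new_h,
                                 GdkPixbuf.InterpType.BILINEAR)
    scaled.savev(path, 'jpeg', ['quality'], [str(quality)])
```

In the script above, `save_jpeg_thumbnail(map.data, 'frame.jpg')` could replace the block that writes frame.png.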
Coherent answered 21/3, 2023 at 18:36 Comment(0)
