Using Whisper API to Generate .SRT Transcripts?

Asked 2/10, 2023 at 20:2 Answered 20/10 at 13:31

I'm exploring the capabilities of the Whisper API and was wondering if it can be used to generate an .SRT file with transcriptions. From what I understand, this transcription to .SRT can be achieved when running the model locally using the Whisper package. Unfortunately, I don't possess the computational resources to run the model locally, so I'm leaning towards using the API directly.

Has anyone had experience with this or can provide guidance on how to approach it through the API?

The following Python script can be used as a starting point, but the question is about the capabilities of the model itself, not about any particular programming language.

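import os
import openai

# Legacy (pre-1.0) openai Python bindings; the key is read from the environment.
openai.api_key = os.environ["OPENAI_API_KEY"]

# Open the audio in binary mode and close it once the request is done.
with open("audio.mp3", "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)
print(transcript.text)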

Gerianne answered 2/10, 2023 at 20:2 Comment(0)

A cursory look at OpenAI's docs shows that srt is a supported value for the response_format parameter on the /v1/audio/transcriptions endpoint.

With the official Python bindings you're using in your example, you should be able to pass this as a named parameter to your openai.Audio.transcribe() invocation:

transcript = openai.Audio.transcribe("whisper-1", audio_file, response_format="srt")
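
To round this out, here is a minimal sketch of requesting SRT output and saving it to disk. It assumes the legacy (pre-1.0) openai bindings from the question and an OPENAI_API_KEY environment variable. The helper names transcribe_to_srt and save_srt are hypothetical, made up for illustration. As far as I can tell, with response_format="srt" the call hands back the subtitle text rather than a JSON object, so the sketch converts the result with str() to be safe instead of reading a .text attribute.

```python
import os
from pathlib import Path


def save_srt(srt_text, srt_path):
    """Write subtitle text to disk as UTF-8 and return the resulting path."""
    path = Path(srt_path)
    # str() is defensive: the response is assumed to be plain SRT text.
    path.write_text(str(srt_text), encoding="utf-8")
    return path


def transcribe_to_srt(audio_path, srt_path, model="whisper-1"):
    """Ask the API for SRT output and save it (hypothetical helper)."""
    import openai  # legacy (pre-1.0) bindings, imported lazily

    openai.api_key = os.environ["OPENAI_API_KEY"]
    with open(audio_path, "rb") as audio_file:
        srt_text = openai.Audio.transcribe(
            model, audio_file, response_format="srt"
        )
    return save_srt(srt_text, srt_path)
```

Calling transcribe_to_srt("audio.mp3", "audio.srt") should then leave a subtitle file next to the audio, which most players pick up automatically when the base names match.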

Flexure answered 2/10, 2023 at 20:11 Comment(0)

I use Pinokio and Whisper to convert audio to .srt directly, and it works almost perfectly.

Herzegovina answered 20/10 at 13:31 Comment(1)

It would be good if you provided a code block that supports your answer. – Blowzed 22/10 at 10:33
