I'm exploring the capabilities of the Whisper API and am wondering whether it can generate an .SRT file with transcriptions. From what I understand, .SRT output is available when running the model locally with the Whisper package. Unfortunately, I don't have the computational resources to run the model locally, so I'm leaning towards using the API directly.
Does anyone have experience with this, or can anyone provide guidance on how to approach it through the API?
The following Python script can be used as a starting point, but the question is about the capabilities of the model itself, not about any specific programming language.
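import os
import openai

# Read the API key from the environment (assumes OPENAI_API_KEY is set)
openai.api_key = os.environ["OPENAI_API_KEY"]

# Send the audio file to the hosted Whisper model and print the plain text
with open("audio.mp3", "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)

print(transcript.text)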
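
To make the target concrete, here is a minimal sketch of the output I'm after. It assumes the transcription yields timestamped segments with `start`, `end` (in seconds), and `text` fields, as the result of the local Whisper package does. Whether the API exposes the same data is exactly what I'm unsure about. `format_timestamp` and `segments_to_srt` are hypothetical helper names of my own, not part of any library.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from dicts with 'start', 'end' and 'text' keys.

    The segment shape is an assumption modelled on the local Whisper
    package's result["segments"]; it is not confirmed for the API.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue: sequence number, time range, then the caption text
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line
    return "\n".join(blocks)
```

For example, `segments_to_srt([{"start": 0.0, "end": 2.5, "text": " Hello"}])` would produce a single cue numbered 1 that spans `00:00:00,000 --> 00:00:02,500`.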