How to decode AWS Kinesis Video Stream GetMedia API output to mp3/wav?

I ingested audio into a Kinesis Video Stream (KVS) through the AWS Connect service. Using the GetMedia API I can read the Payload, but how can I convert that output to mp3/wav? I want to feed the result to the AWS Transcribe service to get a text transcript of the call that AWS Connect ingested into KVS.

The Payload returned by the code below looks like this:

00#AWS_KINESISVIDEO_CONTINUATION_TOKEND\x87....\x1faudio/L16;rate=8000;channels=1;\x12T\xc......00"AWS_KINESISVIDEO_MILLIS_BEHIND_NOWD\x87\x10\x00\x00\x074564302g\xc8\x10\x00\x00^E\xa3\x10\x00\x00#AWS_KINESISVIDEO_CONTINUATION_TOKEND\x87\x10\x00\x00/91343852333181432506572546233025969374566791063'

Note: the response above was too long, so I pasted only part of it.

import json
import boto3

# Control-plane client, used only to look up the data endpoint for GetMedia.
kinesis_client = boto3.client('kinesisvideo', region_name='us-east-1')

response = kinesis_client.get_data_endpoint(
    StreamARN='arn:aws:kinesisvideo:us-east-1:47...',
    APIName='GET_MEDIA')

# Media client bound to the endpoint returned above.
data_endpoint = response['DataEndpoint']
video_client = boto3.client('kinesis-video-media', endpoint_url=data_endpoint, region_name='us-east-1')
stream = video_client.get_media(
    StreamARN='arn:aws:kinesisvideo:us-east-1:47...',
    StartSelector={'StartSelectorType': 'EARLIEST'})

# The Payload is a streaming body; read() returns raw bytes.
streamingBody = stream['Payload']
print(streamingBody.read())

Please suggest how I can convert the payload output to mp3/wav, etc.
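The `D\x87`, `g\xc8` and `E\xa3` bytes in the output above are Matroska (MKV/EBML) element IDs, so the payload appears to be an MKV stream whose audio frames have to be pulled out before anything can be saved as wav. Below is a minimal sketch of that extraction in plain Python. It assumes the whole payload fits in memory, that frames are stored in unlaced SimpleBlock elements, and that only Segment and Cluster use the "unknown size" encoding; none of this is confirmed in this thread, and `read_vint`, `iter_simple_blocks` and `pcm_by_track` are hypothetical helper names.

```python
SEGMENT_ID = 0x18538067       # Matroska Segment (container)
CLUSTER_ID = 0x1F43B675       # Matroska Cluster (container)
SIMPLE_BLOCK_ID = 0xA3        # Matroska SimpleBlock (holds one frame)


def read_vint(buf, pos, max_len, strip_marker):
    """Read an EBML variable-length integer; return (value, byte_length)."""
    first = buf[pos]
    length, mask = 1, 0x80
    # The position of the first set bit gives the total length in bytes.
    while length <= max_len and not first & mask:
        mask >>= 1
        length += 1
    if length > max_len:
        raise ValueError("invalid EBML integer at offset %d" % pos)
    value = first & (mask - 1) if strip_marker else first
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, length


def iter_simple_blocks(buf):
    """Yield (track_number, frame_bytes) for every SimpleBlock in buf."""
    pos = 0
    while pos < len(buf):
        # Element IDs keep their length-marker bit; sizes drop it.
        element_id, n = read_vint(buf, pos, 4, strip_marker=False)
        pos += n
        size, n = read_vint(buf, pos, 8, strip_marker=True)
        pos += n
        unknown_size = size == (1 << (7 * n)) - 1
        if element_id in (SEGMENT_ID, CLUSTER_ID):
            continue  # descend: the children start right here
        if unknown_size:
            raise ValueError("unknown-size element 0x%X not handled" % element_id)
        if element_id == SIMPLE_BLOCK_ID:
            track, n = read_vint(buf, pos, 8, strip_marker=True)
            # After the track number: 2-byte relative timecode, 1-byte flags.
            yield track, bytes(buf[pos + n + 3:pos + size])
        pos += size  # skip (or step past) this element's content


def pcm_by_track(buf):
    """Concatenate the frame bytes of each track, keyed by track number."""
    tracks = {}
    for track, frame in iter_simple_blocks(buf):
        tracks.setdefault(track, bytearray()).extend(frame)
    return tracks
```

Calling `pcm_by_track(streamingBody.read())` would then return one byte string per track number. For a call ingested by AWS Connect, each track may hold one side of the conversation, and the bytes are raw samples that still need a WAV header (or an encoder, for mp3) before a player can open them.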

Brothel answered 18/3, 2019 at 19:9 Comment(2)
How did you solve this problem? I have a quite similar problem: I need to extract the first frame of video from the Payload. - Lalonde
@Lalonde my team followed and deployed this: github.com/aws-samples/amazon-connect-realtime-transcription - Brothel

I am facing the same problem. I can export the payload to S3 as a raw file, but when I listen to it, it is not properly audible; it sounds like an encrypted conversation.

I just save the payload into a file.

with open("myAudio.wav", "wb") as f:
    f.write(stream['Payload'].read())
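A likely reason the file above is not playable: the GetMedia payload is an MKV container (the question's output shows `audio/L16;rate=8000;channels=1` inside it), so writing it unchanged under a `.wav` name does not produce a WAV file. Once the raw PCM samples have been extracted from the container, the standard-library `wave` module can add the header. This is a minimal sketch, assuming 16-bit little-endian samples; `write_wav` is a hypothetical helper, and the default rate and channel count are taken from the codec string in the question.

```python
import io
import wave


def write_wav(target, pcm_bytes, sample_rate=8000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV header.

    target is a file path or a writable, seekable binary file object.
    pcm_bytes is assumed to hold little-endian samples that were already
    extracted from the MKV payload, not the payload itself.
    """
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)   # bytes per sample: 2 = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)


# Example: one second of 8 kHz mono silence, written to an in-memory buffer.
demo = io.BytesIO()
write_wav(demo, b"\x00\x00" * 8000)
```

If the extracted samples turn out to be big-endian, each 16-bit sample would need its two bytes swapped before being passed to `write_wav`.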
Libertinage answered 1/4, 2019 at 11:0 Comment(10)
Can you convert that audio to text? Try the code below and see whether the audio converts to text properly:

import speech_recognition as sr

r = sr.Recognizer()
audio = 'myAudio.wav'
with sr.AudioFile(audio) as source:
    print('Started!')
    audio = r.record(source)
    print('Done!')
try:
    text = r.recognize_google(audio)
    print(text)
except Exception as e:
    print(e)

- Brothel
By the way, it is a waste of time to try. We talked to the AWS technical team, and they told us clearly that, as of now, the Kinesis MKV-formatted media can only be parsed with Java, not with Python. So follow the link below step by step to deploy the AWS Connect transcription sample; there is no need to know Java, just follow the steps as they are: github.com/aws-samples/amazon-connect-realtime-transcription. Our team succeeded in doing this without Java knowledge, and I hope you will be able to as well. - Brothel
Hey, thanks for your answer. I haven't even tried to transcribe the audio yet. At the moment, I only want to save it in an S3 bucket and then listen to it as if it were a voicemail. But somehow the audio file is not properly audible. Did you manage to convert the payload into a listenable wav file? - Libertinage
Yes, my colleagues did it using the code in the GitHub link I provided above. What is your audio producer? I mean, from where are you ingesting audio into Kinesis? - Brothel
From AWS Connect. But the link you provided only explains how to transcribe the audio. I thought your concern was to output the payload in wav or mp3 format. Have you succeeded with that in Python? I don't want to transcribe it yet; I just want to save the payload from the getMedia function into a file that I can listen to with Audacity or QuickTime Player, for example. - Libertinage
Then you are looking for a non-realtime solution, and in that case you don't need Kinesis at all; Kinesis is for realtime streaming. To save to an S3 bucket, use the recording block instead of the Kinesis block in your contact flow in AWS Connect. The link to the S3 bucket where the audio is saved will be available in your AWS Connect account settings. Refer to this link: #48953859 - Brothel
Well, not really: the agent must be listening in order to record it. In my case, we want to create a voicemail, which means no agent will take the call. - Libertinage
In the same record block, you get an option to record only the customer's voice or both the agent's and the customer's voice. And yes, the agent can even be a computer (IVR), i.e. a played prompt, for which a prompt block is available in AWS Connect. - Brothel
@Libertinage hi, did you find the answer for this? - Justiciable
Hi James_Rajkumar, not in Python, unfortunately. We had to do it in Java instead; AWS has a solution for it. - Libertinage
