I have an audio file.
I have a bunch of [start, end] time stamp segments.
WHAT I WANT TO ACHIEVE:
Say audio is 6:00 minutes long.
The segments I have are: [[0.0,4.0], [8.0,12.0], [16.0,20.0], [24.0,28.0]]
After I pass these two to sox + python, the output should be audio that is still 6 minutes long but has sound only in the time ranges given by the segments.
i.e. I want to pass the time stamps and the original audio to sox + python so that an audio file is generated with everything silenced out except the portions corresponding to the passed segments.
I couldn't achieve the above, but after days of googling I came somewhat close to the opposite. I have this:
UPDATED, MORE CONCISE CODE + EXAMPLE:
A sox command template that takes padding and trimming, like this:
SOX__SILENCE = 'sox "{inputaudio}" -c 1 "{outputaudio}" {padding}{trimming}'
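To make it clear how the template is meant to be filled in, here is a minimal sketch. The helper name `build_sox_command` is hypothetical (it is not part of my script), and the file names are only placeholders:

```python
SOX__SILENCE = 'sox "{inputaudio}" -c 1 "{outputaudio}" {padding}{trimming}'


def build_sox_command(inputaudio, outputaudio, padding, trimming):
    # Hypothetical helper: substitutes the four fields of the template.
    # `padding` is expected to start with 'pad', `trimming` with ' trim'.
    return SOX__SILENCE.format(
        inputaudio=inputaudio,
        outputaudio=outputaudio,
        padding=padding,
        trimming=trimming,
    )


cmd = build_sox_command(
    'dinners.mp3', 'out.mp3', 'pad 4.0@0.0', ' trim 0 0.0 0 4.0 4.0'
)
print(cmd)
```

The resulting string can be pasted into a terminal, or split with `shlex.split` and handed to `subprocess.call`.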
Random Segments for testing:
# random segments:
A = [[0.0, 16.0]]
b = [[1.0, 2.0]]
z = [[1.6, 8.3], [13.2, 33.7], [35.0, 38.0], [42.0, 51.0], [70.2, 73.7], [90.0, 99.2], [123.0, 131.1]]
q = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]
A small python script to generate padding and trimming.
PADDING:
def get_pad_pattern_from_timestamps(my_segments):
    # Build a sox "pad" effect: one "duration@start" entry per segment.
    padding = 'pad'
    for segment in my_segments:
        duration = str(segment[1] - segment[0])
        padding = padding + ' ' + duration + '@' + str(segment[0])
    return padding

print(get_pad_pattern_from_timestamps(A))
print(get_pad_pattern_from_timestamps(b))
print(get_pad_pattern_from_timestamps(z))
print(get_pad_pattern_from_timestamps(q))
OUTPUT from ^:
pad 16.0@0.0
pad 1.0@1.0
pad 6.7@1.6 20.5@13.2 3.0@35.0 9.0@42.0 3.5@70.2 9.2@90.0 8.1@123.0
pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0 4.0@32.0 4.0@40.0
TRIMMING:
def get_trimm_pattern_from_timestamps(my_segments):
    # Build one sox "trim" effect per segment from its start and duration.
    trimming = ''
    for segment in my_segments:
        duration = str(segment[1] - segment[0])
        trimming = trimming + ' trim 0 ' + str(segment[0]) + ' 0 ' + duration + ' ' + duration
    return trimming

print(get_trimm_pattern_from_timestamps(A))
print(get_trimm_pattern_from_timestamps(b))
print("\n")
print(get_trimm_pattern_from_timestamps(z))
print("\n")
print(get_trimm_pattern_from_timestamps(q))
print("\n")
OUTPUT FROM TRIMMING:
trim 0 0.0 0 16.0 16.0
trim 0 1.0 0 1.0 1.0
trim 0 1.6 0 6.7 6.7 trim 0 13.2 0 20.5 20.5 trim 0 35.0 0 3.0 3.0 trim 0 42.0 0 9.0 9.0 trim 0 70.2 0 3.5 3.5 trim 0 90.0 0 9.2 9.2 trim 0 123.0 0 8.1 8.1
trim 0 0.0 0 4.0 4.0 trim 0 8.0 0 4.0 4.0 trim 0 16.0 0 4.0 4.0 trim 0 24.0 0 4.0 4.0 trim 0 32.0 0 4.0 4.0 trim 0 40.0 0 4.0 4.0
RUNNING SOX using the above outputs from a terminal:
Padding:
sox dinners.mp3 -c 1 testlongpad.mp3 pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0
Trimming:
sox dinners.mp3 -c 1 testrim.mp3 trim 0 0.0 0 16.0 16.0
Pad and trim:
sox dinners.mp3 -c 1 testlongpadtrim.mp3 pad 4.0@0.0 4.0@8.0 4.0@16.0 4.0@24.0 trim 0 0.0 0 4.0 4.0 trim 0 8.0 0 4.0 4.0 trim 0 16.0 0 4.0 4.0 trim 0 24.0 0 4.0 4.0
If S are my segments, then NS is everything else. In the approach above I'm passing NS, and NS is getting removed from the audio.
What I want to achieve is still the same, but in a different way, i.e. I want to pass S so that only the portions of audio corresponding to S are retained.
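To be precise about what I mean by S and NS, here is a small sketch of how NS could be derived from S. The function name `complement_segments` and the `total_duration` argument are hypothetical, and I'm assuming the total length of the audio (in seconds) is known:

```python
def complement_segments(segments, total_duration):
    # Return the gaps (NS) between the given segments (S),
    # from time 0 up to total_duration. Overlapping or unsorted
    # segments are tolerated by sorting and tracking a cursor.
    gaps = []
    cursor = 0.0
    for start, end in sorted(segments):
        if start > cursor:
            gaps.append([cursor, start])
        cursor = max(cursor, end)
    if cursor < total_duration:
        gaps.append([cursor, total_duration])
    return gaps


S = [[0.0, 4.0], [8.0, 12.0], [16.0, 20.0], [24.0, 28.0]]
NS = complement_segments(S, 360.0)  # 6 minutes = 360 seconds
print(NS)
# [[4.0, 8.0], [12.0, 16.0], [20.0, 24.0], [28.0, 360.0]]
```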
PS: My question is very specific. I am new to audio processing and unsure how to proceed, so kindly don't close the question as being too broad. I'd be happy to provide more details for clarification. Lastly, this is not a homework question; it is for a personal project.
Sample Audio : https://www.dropbox.com/s/1p27nfwney42ka2/LAZY_SALON_-03-_Hot_Dinners.mp3?dl=0
Sample segments ([[start, end], ...]): [[1.6, 8.3], [13.2, 33.7], [35.0,38.0], [42.0,51.0], [70.2,73.7], [90.0,99.2], [123.0,131.1]]
So when these time stamps are passed to sox/python with audio, everything in the audio except those portions in the supplied segments should be silenced out.
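To illustrate the result I'm after (not how to get it with sox), here is a rough sketch that works on already decoded samples. It assumes the audio is available as a plain Python list of mono samples with a known sample rate; the function name `keep_only_segments` is hypothetical:

```python
def keep_only_segments(samples, sample_rate, segments):
    # Start from pure silence of the same length as the input,
    # then copy back only the samples that fall inside a segment.
    out = [0] * len(samples)
    for start, end in segments:
        first = max(0, int(round(start * sample_rate)))
        last = min(len(samples), int(round(end * sample_rate)))
        out[first:last] = samples[first:last]
    return out


# Toy example: 10 "samples" at 1 sample per second.
samples = list(range(1, 11))
result = keep_only_segments(samples, 1, [[2.0, 4.0], [7.0, 9.0]])
print(result)
# [0, 0, 3, 4, 0, 0, 0, 8, 9, 0]
```

The output has the same length as the input, which is the property I need: the duration stays the same and only the parts outside the segments become silent.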