Detecting and printing timestamps of periods of silence using SoX

4

12

I am trying to output the start timestamps of periods of silence in a given audio file. Since there is background noise, by silence I mean audio that stays below a threshold. Eventually, I want to split the audio file into smaller audio files at these timestamps. It is important that no part of the original file be discarded.

I tried

sox in.wav out.wav silence 1 0.5 1% 1 2.0 1% : newfile : restart

(courtesy http://digitalcardboard.com/blog/2009/08/25/the-sox-of-silence/)

Although it somewhat did the job, it also trimmed and discarded the periods of silence, which I do not want.

Is 'silence' the right option, or is there a simpler way to accomplish what I need to do?

Thanks.

Ploss answered 6/8, 2013 at 0:5 Comment(1)
Any news on this topic? Were you able to accomplish this? I need to do exactly the same. Currently I detect silence with Audacity and export the label track as a text file. Thwart
17

Unfortunately not SoX, but ffmpeg has a silencedetect filter that does exactly what you're looking for:

ffmpeg -i in.wav -af silencedetect=noise=-50dB:d=1 -f null -

(detecting silence below a threshold of -50 dB that lasts at least 1 second; cribbed from the ffmpeg documentation)

...this would print a result like this:

Press [q] to stop, [?] for help
[silencedetect @ 0x7ff2ba5168a0] silence_start: 264.718
[silencedetect @ 0x7ff2ba5168a0] silence_end: 265.744 | silence_duration: 1.02612
size=N/A time=00:04:29.53 bitrate=N/A
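
The log lines above can be turned into timestamps with a short script. This is only a minimal sketch, assuming the log has the same shape as the sample shown here; the function name parse_silencedetect is made up for illustration and is not part of ffmpeg.

```python
import re

# Patterns for the two kinds of lines that silencedetect prints.
_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def parse_silencedetect(log_text):
    """Return a list of (start, end) pairs in seconds.

    If the log stops while a silence is still open (no matching
    silence_end line), the pair is reported as (start, None).
    """
    periods = []
    start = None
    for line in log_text.splitlines():
        match = _START.search(line)
        if match:
            start = float(match.group(1))
            continue
        match = _END.search(line)
        if match and start is not None:
            periods.append((start, float(match.group(1))))
            start = None
    if start is not None:
        periods.append((start, None))
    return periods
```

ffmpeg writes its log to stderr, so something like `ffmpeg ... -f null - 2> log.txt` captures it in a file that can then be read and passed to the function.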
Revert answered 12/5, 2016 at 2:36 Comment(1)
Are there any new libraries in 2017 that can accomplish this? That is, given an audio file, detect and output the timestamps of periods of speech and periods of silence. Thanks. Tadeas
5

There is (currently, at least) no way to make the silence effect output the position where it has detected silence, or to retain all of the silent audio.

If you are able to recompile SoX, you could add an output statement to report the cut positions, then use trim in a separate invocation to split the file. With the stock version, you are out of luck.
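
Once the cut positions are known from some other source, the splitting step with trim might be sketched as below. This only builds the command lines and does not run them. The function name trim_commands, the output naming scheme and the default file names are assumptions for illustration; total_duration is assumed to come from something like soxi -D in.wav. Segments are contiguous from 0 to the end, so no audio is discarded.

```python
def trim_commands(cut_points, total_duration, src="in.wav", prefix="part"):
    """Build one `sox SRC OUT trim START LENGTH` command per segment.

    cut_points: times in seconds at which to split the file.
    total_duration: length of the source file in seconds.
    Returns a list of argument lists, one per output file.
    """
    # Keep only cut points strictly inside the file, without duplicates.
    inner = sorted({p for p in cut_points if 0.0 < p < total_duration})
    bounds = [0.0] + inner + [total_duration]
    commands = []
    for index, (start, end) in enumerate(zip(bounds, bounds[1:]), start=1):
        commands.append([
            "sox", src, f"{prefix}{index:03d}.wav",
            "trim", f"{start:.6f}", f"{end - start:.6f}",
        ])
    return commands
```

Each returned list could be passed to subprocess.run, or joined with spaces and pasted into a shell.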

Noellanoelle answered 8/8, 2013 at 22:35 Comment(1)
Hi chirlu, I was hoping that wasn't true. I'll see what I can do. Ploss
3

SoX can easily give you the timestamp and value of every sample in a text file. It does not report periods of silence directly, but you can calculate those with a simple script. From the SoX documentation:

   .dat   Text  Data  files.   These  files  contain a textual representation of the sample data.  There is one line at the beginning that contains the sample
          rate, and one line that contains the number of channels.  Subsequent lines contain two or more numeric data items: the time since the beginning  of
          the first sample and the sample value for each channel.

          Values are normalized so that the maximum and minimum are 1 and -1.  This file format can be used to create data files for external programs such as
          FFT analysers or graph routines.  SoX can also convert a file in this format back into one of the other file formats.

          Example containing only 2 stereo samples of silence:

              ; Sample Rate 8012
              ; Channels 2
                          0   0    0
              0.00012481278   0    0

So you can run sox in.wav out.dat, then parse the text file and treat any sufficiently long run of rows whose values stay close to 0 (within your threshold) as a silence.
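
A minimal sketch of such a script, assuming the .dat layout quoted above (comment lines starting with ";", then one row per sample with the time followed by one value per channel). The function name find_silences and the default threshold and minimum duration are arbitrary choices for illustration, not anything SoX defines.

```python
def find_silences(dat_text, threshold=0.01, min_duration=0.5):
    """Return (start, end) times in seconds of quiet runs.

    A row is quiet when every channel value is within +/- threshold.
    A run counts as silence when it spans at least min_duration seconds.
    The end of a run is the time of the first loud row after it, or the
    time of the last row if the file ends while still quiet.
    """
    silences = []
    run_start = None
    last_time = None
    for line in dat_text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        t = float(fields[0])
        quiet = all(abs(float(v)) <= threshold for v in fields[1:])
        if quiet:
            if run_start is None:
                run_start = t
        else:
            if run_start is not None and t - run_start >= min_duration:
                silences.append((run_start, t))
            run_start = None
        last_time = t
    # Close a run that is still open at the end of the file.
    if run_start is not None and last_time - run_start >= min_duration:
        silences.append((run_start, last_time))
    return silences
```

Requiring a minimum duration matters because ordinary audio crosses zero constantly; single near-zero samples inside a loud passage should not count as silence. A .dat file has one row per sample, so for long recordings it may be worth reading the file line by line instead of loading it whole.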

Sayres answered 7/6, 2019 at 21:24 Comment(0)
1

Necroposting: you can run a separate script that iterates over all of the SoX output files (for f in *.wav) and uses the command soxi -D $f to obtain the duration of each sound clip. Then get the system time in seconds with date "+%s" and subtract the duration to find the time the recording started.

Egomania answered 20/5, 2014 at 19:16 Comment(0)
