Detect sound level in raw PCM data

I am writing a small program that needs to detect the sound level and write it out if the level is higher than the one set in the settings. I have done sound capturing via PortAudio and compression via libvorbis, but one part of the program is unfinished and I am stuck on it: I need to detect the sound level of raw PCM data. I have a poor understanding of what PCM data is and do not know any audio analysis/processing algorithms. Is there an existing C/C++ library that can do this, or is there some simple algorithm that can be implemented in C/C++?

Emigration asked 21/2, 2013 at 12:50 Comment(0)
R
2

Look into the Speex and WebRTC libraries; both include voice activity detectors. If you want a measure of sound level, you first need to decide between a linear and a logarithmic level indicator. A common PCM format is signed 16-bit (short), with samples in the range -32768 to 32767. One simple approach is to sum the absolute values of the samples in a period and divide by the number of samples, which gives an average level for that period.
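As a rough illustration of that averaging idea, here is a minimal C sketch. It assumes signed 16-bit samples already sitting in a buffer; the names `mean_abs_level` and `is_loud` and the `threshold` parameter are made up for this example and do not come from PortAudio or any other library.

```c
#include <stddef.h>
#include <stdint.h>

/* Average absolute sample value over one window of signed 16-bit PCM.
   The result is a linear level between 0 and 32768. */
double mean_abs_level(const int16_t *samples, size_t count)
{
    if (count == 0)
        return 0.0;

    int64_t sum = 0; /* wide accumulator so long windows cannot overflow */
    for (size_t i = 0; i < count; i++) {
        int32_t s = samples[i]; /* widen first so -32768 negates safely */
        sum += (s < 0) ? -s : s;
    }
    return (double)sum / (double)count;
}

/* Hypothetical helper: nonzero when the window is louder than threshold. */
int is_loud(const int16_t *samples, size_t count, double threshold)
{
    return mean_abs_level(samples, count) > threshold;
}
```

For 48 kHz mono audio a one-second window is 48000 samples; with interleaved stereo, `count` would be the number of frames times the number of channels. The threshold itself has to be found by experiment for the room and microphone in question.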

Rhythmical answered 21/2, 2013 at 12:55 Comment(4)
I have signed 16-bit 48 kHz PCM. As I understand it, to detect the level over one second I need to sum 48000 samples and compare the result with a threshold level? - Emigration
Sure, you can choose whatever time period you want; one second is fine. Remember to sum the >absolute< values. - Rhythmical
OK, it works better. Still not what I want, but better than my initial code; you can look here: sss.chaoslab.ru/git/?p=misc.git;a=blob;f=sound_detector/… Maybe you have more suggestions about sound level/silence detection? Maybe I should apply some filter to reduce general background noise (I have a few computers and other noisy equipment in the room). - Emigration
Yeah, you'll have to play with thresholds to figure out where your noise levels are. Look into automatic gain control algorithms. You might also calculate your zero-crossing rate (the number of times samples go from positive to negative or back), as that can be useful information too. - Rhythmical
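The zero-crossing rate mentioned in the last comment could be computed roughly as follows. This is only a sketch: the function name `zero_crossing_rate` is made up here, and treating a sample of exactly 0 as non-negative is an assumption, since the comment does not say how zeros should count.

```c
#include <stddef.h>
#include <stdint.h>

/* Fraction of adjacent sample pairs whose sign differs, from 0.0 to 1.0.
   A sample equal to 0 is treated as non-negative (an assumption). */
double zero_crossing_rate(const int16_t *samples, size_t count)
{
    if (count < 2)
        return 0.0;

    size_t crossings = 0;
    for (size_t i = 1; i < count; i++) {
        int prev_negative = samples[i - 1] < 0;
        int cur_negative = samples[i] < 0;
        if (prev_negative != cur_negative)
            crossings++;
    }
    return (double)crossings / (double)(count - 1);
}
```

The value is meant to be looked at alongside the level for the same window rather than on its own.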
A
5

It depends on how you define "sound level": it can be as simple as detecting a peak, or as complex as following industry standards/recommendations for measuring loudness.

PCM data is typically a stream of sample values: 0x00..0xFF for 8-bit PCM (usually stored unsigned, with silence at 0x80), -0x8000..+0x7FFF for signed 16-bit PCM, or -1.0..+1.0 for floating-point values.
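To make those ranges concrete, here is a small sketch that maps integer samples onto the floating-point range. It assumes 8-bit samples are stored unsigned with silence at 0x80, as in WAV files; the function names are invented for this example.

```c
#include <stdint.h>

/* Assumption: 8-bit PCM is unsigned (WAV style), so 0x80 is silence. */
float pcm_u8_to_float(uint8_t sample)
{
    return ((int)sample - 128) / 128.0f;
}

/* Signed 16-bit PCM: -0x8000..+0x7FFF maps to -1.0 up to just under +1.0. */
float pcm_s16_to_float(int16_t sample)
{
    return sample / 32768.0f;
}
```

Dividing by 32768 rather than 32767 is one common convention; it keeps -0x8000 at exactly -1.0, at the cost of the largest positive sample landing slightly below +1.0.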

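The easiest approach is to detect a simple peak by looking for the maximum absolute value within a given time frame. You can then convert it to decibels relative to full scale with 20 * log10(peak / full_scale), where full_scale is 32768 for 16-bit PCM.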

Attending answered 21/2, 2013 at 12:55 Comment(1)
Currently I look for the maximum value(s) during a time frame, but this works badly, to the point of being nearly unusable. - Emigration
