How to convert audio byte to samples

This is my struct

/* wave data block header */
typedef struct wavehdr_tag {
    LPSTR       lpData;                 /* pointer to locked data buffer */
    DWORD       dwBufferLength;         /* length of data buffer */
    DWORD       dwBytesRecorded;        /* used for input only */
    DWORD_PTR   dwUser;                 /* for client's use */
    DWORD       dwFlags;                /* assorted flags (see defines) */
    DWORD       dwLoops;                /* loop control counter */
    struct wavehdr_tag FAR *lpNext;     /* reserved for driver */
    DWORD_PTR   reserved;               /* reserved for driver */
} WAVEHDR, *PWAVEHDR, NEAR *NPWAVEHDR, FAR *LPWAVEHDR;

I have this variable WAVEHDR waveHeader;

I record 10 seconds from the microphone. waveHeader->lpData holds my raw recorded data and waveHeader->dwBytesRecorded is the length of that data in bytes. Now I want to calculate the volume in each second, to say which second has the highest volume and which one has the lowest.

I know I should sum the absolute values and divide by the number of samples.

I used sum += abs(waveHeader->lpData[i]); for i from 0 to the length of one second's data, but it doesn't give me a good result.

It always gives me the same result for each second, even though I am silent in some seconds and speak in others.

I read that I have to add samples, not bytes. How should I convert waveHeader->lpData[i] to samples?

//len = length of one secs data (waveHeader->dwBytesRecorded/10)
for (int i=0; i<len; i++)
{
    sum += abs(waveHeader->lpData[i]);
}
Kirshbaum answered 8/4, 2019 at 5:36 Comment(8)
The WAVEFORMATEX that you passed to waveInOpen gives you nChannels and wBitsPerSample. Multiply the two together and divide by 8 and that's the number of bytes per sample. - Anxious
@JonathanPotter nChannels = 2 and wBitsPerSample = 16, and (2*16)/8 is 4, so each sample has 4 bytes, yes? Now what should I do to sum the absolute values and divide by the number of samples? I got confused :( - Kirshbaum
For 4-byte samples, you would cast lpData to a DWORD*. But you probably want to handle the two channels individually (i.e. they're not actually 4-byte samples, they're 2x2-byte samples), so you could cast it to a WORD* and then calculate the average for each channel. - Anxious
@JonathanPotter Both channels are the same, so calculating the average for one channel is enough. I think I misunderstood something, because whichever way I calculate the average, the result is unusable. I added some code to the question. - Kirshbaum
You're not casting the pointer in that code; all you're doing is casting individual bytes to words. - Anxious
By the way, what are you trying to implement? Are you trying to render a "peak meter" or some visual indicator of volume on the screen? Or are you trying to do some other sort of analysis of the signal? The reason I ask is that if you are trying to get a visual rendering of the signal, there are lots of easy cheats you can apply to make this easy without having to do sophisticated signal processing. - Seigniory
Perhaps you should attach a sample WAV file you recorded. Specifically, WAVEHDR is a well-known structure on its own, but no one knows what format (values) you are using and whether the entire WAV you create is valid. - Neology
@RomanR. Thanks for the time you spent, I succeeded miraculously. - Kirshbaum
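
A minimal sketch of what the comments above describe, assuming the format from the question (16-bit little-endian PCM, nChannels = 2, so nBlockAlign = 4): each sample is rebuilt from two bytes before taking its absolute value, instead of calling abs on every single byte. The helper names decode_sample16 and sum_abs_channel0 are hypothetical, not part of the Windows API, and the buffer is treated as unsigned char rather than LPSTR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: rebuild one signed 16-bit sample from two raw bytes.
   WAV PCM stores the low byte first (little-endian). */
static int decode_sample16(const unsigned char *p)
{
    int v = p[0] | (p[1] << 8);            /* 0 .. 65535 */
    return (v >= 32768) ? v - 65536 : v;   /* map the upper half to negatives */
}

/* Sum of |sample| for the first channel of interleaved 16-bit PCM.
   frame_bytes plays the role of nBlockAlign (assumed 4 for 16-bit stereo). */
static long long sum_abs_channel0(const unsigned char *data, size_t bytes,
                                  size_t frame_bytes)
{
    assert(frame_bytes >= 2);
    long long sum = 0;
    for (size_t off = 0; off + frame_bytes <= bytes; off += frame_bytes)
        sum += abs(decode_sample16(data + off));
    return sum;
}
```

With the question's variables this would be called roughly as sum_abs_channel0((const unsigned char*)waveHeader->lpData, waveHeader->dwBytesRecorded, 4); dividing the result by the number of frames gives the average.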

You have the WAVEFORMATEX used for capturing the audio, right? If so, you can modify the following routine to meet your needs:

void ProcessSamples(WAVEHDR* header, WAVEFORMATEX* format)
{
    BYTE* pData = (BYTE*)(header->lpData);   // WAVEHDR's buffer member is lpData
    DWORD dwNumSamples = header->dwBytesRecorded / format->nBlockAlign;

    // 16-bit stereo, the most common format
    if ((format->wBitsPerSample == 16) && (format->nChannels == 2))
    {
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            short left = *(short*)pData; pData+=2;
            short right = *(short*)pData; pData+=2;
        }
    }
    else if ((format->wBitsPerSample == 16) && (format->nChannels == 1))
    {
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            short monoSample = *(short*)pData; pData+=2;
        }
    }
    else if ((format->wBitsPerSample == 8) && (format->nChannels == 2))
    {
        // 8-bit samples are unsigned.
        // "128" is the median silent value
        // normalize to a "signed" value
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            signed char left = (signed char)(*pData - 128); pData += 1;
            signed char right = (signed char)(*pData - 128); pData += 1;
        }
    }
    else if ((format->wBitsPerSample == 8) && (format->nChannels == 1))
    {
        for (DWORD index = 0; index < dwNumSamples; index++)
        {
            signed char monosample = (signed char)(*pData - 128); pData += 1;
        }
    }
}
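
To get from the routine above to the per-second volume the question asks about, one possible sketch follows. It assumes 16-bit little-endian PCM, measures only the first channel, and takes frame_bytes and rate as stand-ins for nBlockAlign and nSamplesPerSec from the WAVEFORMATEX. The function names mean_abs_per_second and index_of_max are hypothetical, and mean absolute amplitude is just one simple loudness measure (RMS would be another).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mean absolute amplitude of the first channel for each whole second.
   data/bytes  : the recorded buffer (lpData / dwBytesRecorded)
   frame_bytes : bytes per frame (nBlockAlign), rate : frames per second
   Writes at most max_out values and returns how many seconds were measured. */
static size_t mean_abs_per_second(const unsigned char *data, size_t bytes,
                                  size_t frame_bytes, size_t rate,
                                  double *out, size_t max_out)
{
    assert(frame_bytes >= 2 && rate > 0);
    size_t bytes_per_second = frame_bytes * rate;
    size_t seconds = bytes / bytes_per_second;   /* partial last second is ignored */
    if (seconds > max_out)
        seconds = max_out;

    for (size_t s = 0; s < seconds; s++) {
        const unsigned char *p = data + s * bytes_per_second;
        double sum = 0.0;
        for (size_t f = 0; f < rate; f++, p += frame_bytes) {
            int v = p[0] | (p[1] << 8);          /* little-endian 16-bit */
            if (v >= 32768)
                v -= 65536;                      /* restore the sign */
            sum += abs(v);
        }
        out[s] = sum / (double)rate;
    }
    return seconds;
}

/* Index of the largest value in v (n must be at least 1). */
static size_t index_of_max(const double *v, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (v[i] > v[best])
            best = i;
    return best;
}
```

For the 10-second recording, out would be a double[10]; the loudest second is index_of_max(out, n), and the quietest can be found the same way with the comparison reversed.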
Seigniory answered 10/4, 2019 at 4:48 Comment(1)
Hi, I searched and found your answer and tried to use your code in my project, but I have a problem (this is my question). I don't know how to memcpy_s the data, and I'm not sure whether my problem comes from samples versus bytes or from something else. I'm new to audio, please help me with my project. - Colchicum
