How to read the data in a wav file to an array

Asked 6/1, 2012 at 6:13 Answered 8/1, 2016 at 0:0

I need to get all the samples of a wav file into an array (or two if you need to do that to keep the stereo) so that I can apply some modifications to them. I was wondering if this is easily done (preferably without external libraries). I have no experience with reading in sound files, so I don't know much about the subject.

Randazzo answered 6/1, 2012 at 6:13 Comment(2)

Why don't you want to use libraries? If it's a licensing issue, look for an LGPL or similar license. If not, well, .NET is built on libraries so that you don't have to code everything yourself. Install NuGet, get NAudio (or another audio library), and don't reinvent the wheel :). Or do it in C ;) – Ultraconservative 14/11, 2013 at 5:37

Why is reinventing the wheel such a bad thing? Not only will you feel good for working it out yourself, you will also learn so much in the process. If you're coding for fun and have no time restraints then I'd say definitely reinvent the wheel! – Pianism 5/3, 2015 at 22:54

WAV files (at least, uncompressed ones) are fairly straightforward. There's a header, then the data follows it.

Here's a great reference: ~~https://ccrma.stanford.edu/courses/422/projects/WaveFormat/~~ (mirror)

Quinquennium answered 6/1, 2012 at 6:18 Comment(5)

So I can get the data in, then isolate the bytes after offset 44 and then reuse the same header to save the file, right? – Randazzo 6/1, 2012 at 6:30

not always, WAV files can contain other chunks as well after the fmt chunk and before the data chunk. It is best to properly parse the RIFF chunks – Accad 6/1, 2012 at 16:8

The data in the WAV file won't actually be floating point will it? Specifically, will it be a floating point representation of the actual time-domain waveform? – Ostrander 13/7, 2012 at 15:37

It will be if the AudioFormat in the "fmt " subchunk is 0x0003 (WAVE_FORMAT_IEEE_FLOAT) – Evince 18/12, 2012 at 3:59

That link is a great introduction, but watch out. Many wavs today have headers longer than 44 bytes. 46-byte headers are particularly common and that introduction does not discuss the possibility of extra data to parse. – Handlebar 15/11, 2013 at 19:8
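
To illustrate the point made in the comments above about extra chunks and headers longer than 44 bytes, here is a minimal sketch of walking the RIFF chunk list instead of assuming a fixed offset. The names `RiffChunks` and `Find` are hypothetical (they are not from any library), and the sketch assumes the whole file is already in a byte array and that the host is little-endian, which is what `BitConverter` relies on here.

```csharp
using System;
using System.Text;

static class RiffChunks
{
    // Returns the offset of the first payload byte of the chunk named 'id',
    // or -1 if no such chunk exists. 'size' receives the payload length.
    public static int Find(byte[] wav, string id, out int size)
    {
        size = 0;
        int pos = 12;                                   // skip "RIFF", file size, "WAVE"
        while (pos + 8 <= wav.Length)
        {
            string chunkId = Encoding.ASCII.GetString(wav, pos, 4);
            int chunkSize = BitConverter.ToInt32(wav, pos + 4);
            if (chunkSize < 0) return -1;               // corrupt size field, give up
            if (chunkId == id)
            {
                size = chunkSize;
                return pos + 8;
            }
            // Chunks are padded to an even number of bytes.
            pos += 8 + chunkSize + (chunkSize & 1);
        }
        return -1;
    }
}
```

With this, `RiffChunks.Find(wav, "data", out size)` gives the start of the sample data wherever it happens to be, and `BitConverter.ToInt16(wav, RiffChunks.Find(wav, "fmt ", out size))` reads the AudioFormat field (1 for PCM, 3 for IEEE float) mentioned above.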

This code should do the trick. It converts a wave file to a normalized double array (-1 to 1), but it should be trivial to make it an int/short array instead (remove the /32768.0 bit and add 32768 instead). The right[] array will be set to null if the loaded wav file is found to be mono.

I can't claim it's completely bullet proof (potential off-by-one errors), but after creating a 65536 sample array, and creating a wave from -1 to 1, none of the samples appear to go 'through' the ceiling or floor.

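// Requires: using System.IO; (for File.ReadAllBytes below)

// convert two bytes to one double in the range -1 to 1
static double bytesToDouble(byte firstByte, byte secondByte) {
    // convert two bytes to one short (little endian); the operands are promoted
    // to int for the shift, so the result must be cast back to short
    short s = (short)((secondByte << 8) | firstByte);
    // convert to range from -1 to (just below) 1
    return s / 32768.0;
}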

// Returns left and right double arrays. 'right' will be null if sound is mono.
public void openWav(string filename, out double[] left, out double[] right)
{
    byte[] wav = File.ReadAllBytes(filename);

    // Determine if mono or stereo
    int channels = wav[22];     // Forget byte 23 as 99.999% of WAVs are 1 or 2 channels

    // Get past all the other sub chunks to get to the data subchunk:
    int pos = 12;   // First Subchunk ID from 12 to 16

    // Keep iterating until we find the data chunk (i.e. 64 61 74 61 ...... (i.e. 100 97 116 97 in decimal))
    while(!(wav[pos]==100 && wav[pos+1]==97 && wav[pos+2]==116 && wav[pos+3]==97)) {
        pos += 4;
        int chunkSize = wav[pos] + wav[pos + 1] * 256 + wav[pos + 2] * 65536 + wav[pos + 3] * 16777216;
        pos += 4 + chunkSize + (chunkSize & 1);   // chunks are padded to an even length
    }
    pos += 8;

    // Pos is now positioned to start of actual sound data.
    int samples = (wav.Length - pos)/2;     // 2 bytes per sample (16 bit sound mono)
    if (channels == 2) samples /= 2;        // 4 bytes per sample (16 bit stereo)

    // Allocate memory (right will be null if only mono sound)
    left = new double[samples];
    if (channels == 2) right = new double[samples];
    else right = null;

    // Write to double array/s:
    int i=0;
    while (i < samples) {   // bounded by the sample count, so a trailing odd byte cannot overrun the arrays
        left[i] = bytesToDouble(wav[pos], wav[pos + 1]);
        pos += 2;
        if (channels == 2) {
            right[i] = bytesToDouble(wav[pos], wav[pos + 1]);
            pos += 2;
        }
        i++;
    }
}

Comeau answered 22/6, 2012 at 19:12 Comment(9)

Someone correct me if I am wrong. You can't do (secondByte << 8) to an 8-bit type like byte. byte is an 8-bit signed integer and cannot be bit-shifted by 8 because there is no space to push those 8 bits. – Heins 19/6, 2013 at 16:52

@MartinBerger Bytes are auto-promoted to int in most cases, including shifting. – Donegan 14/1, 2014 at 18:1

@Clément So during shifting, the variable is promoted to a platform-specific int, which usually is 32 bits? Didn't know that, thanks. – Heins 15/1, 2014 at 15:10

@MartinBerger yes, see msdn.microsoft.com/en-us/library/aa691330(v=vs.71).aspx and in particular the last rule: Otherwise, both operands are converted to type int.. – Donegan 15/1, 2014 at 23:47

@MartinBerger FYI there is no such thing as "platform specific int". C#'s int (which maps to the Int32 struct in .NET) is ALWAYS 32 bits. – Pianism 4/3, 2015 at 1:33

@Backwards_Dave What I meant was this. For example, for the int type, you have "thus at least 16 bits in size". C# probably has a fixed int size of 4 bytes, but in C it may differ across platforms. – Heins 4/3, 2015 at 9:22

@MartinBerger The OP used a C# tag, not a C tag. – Pianism 5/3, 2015 at 22:56

In the last loop I got "no such variable as length in this context", and in the byte converter I got "cannot convert short to byte" or something similar. I changed it to an int, but I'm stuck on which length to use. – Vet 2/5, 2016 at 8:24

@comprehensible pos is used to measure the index position in the wav array. I updated his example to wav.Length instead, and wrapped the entire s definition formula in a (short) cast. You will of course need to add using System.IO; to your own classes to use the File class he uses to open the wav file. – Pedersen 11/11, 2016 at 8:3

Assuming your WAV file contains 16 bit PCM (which is the most common), you can use NAudio to read it out into a byte array, and then copy that into an array of 16 bit integers for convenience. If it is stereo, the samples will be interleaved left, right.

using (WaveFileReader reader = new WaveFileReader("myfile.wav"))
{
    Assert.AreEqual(16, reader.WaveFormat.BitsPerSample, "Only works with 16 bit audio");
    byte[] buffer = new byte[reader.Length];
    int read = reader.Read(buffer, 0, buffer.Length);
    short[] sampleBuffer = new short[read / 2];
    Buffer.BlockCopy(buffer, 0, sampleBuffer, 0, read);
}

I know you wanted to avoid third party libraries, but if you want to be sure to cope with WAV files with extra chunks, I suggest avoiding approaches like just seeking 44 bytes into the file.
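
As a follow-up to the note above that stereo samples come out interleaved left, right, here is a minimal sketch of splitting such a buffer into one array per channel. `Interleave.Split` is a hypothetical helper, not part of NAudio; it only assumes the samples are 16-bit and already in a `short[]`.

```csharp
using System;

static class Interleave
{
    // Splits interleaved samples (L R L R ...) into one array per channel.
    // Any incomplete trailing frame is ignored.
    public static short[][] Split(short[] interleaved, int channels)
    {
        if (channels < 1) throw new ArgumentOutOfRangeException(nameof(channels));

        int frames = interleaved.Length / channels;     // samples per channel
        var result = new short[channels][];
        for (int c = 0; c < channels; c++)
            result[c] = new short[frames];

        for (int f = 0; f < frames; f++)
            for (int c = 0; c < channels; c++)
                result[c][f] = interleaved[f * channels + c];

        return result;
    }
}
```

With the reader above, this could be called as `Interleave.Split(sampleBuffer, reader.WaveFormat.Channels)`; element 0 is then the left channel and element 1 the right.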

Accad answered 6/1, 2012 at 16:47 Comment(0)

NOTE: As per Daniel Moller's comment, refer to http://soundfile.sapp.org/doc/WaveFormat/ to understand the code. (And upvote the comment).

At time of writing nobody has addressed 32-bit or 64-bit encoded WAVs.

The following code handles 16-bit integer PCM and 32/64-bit floating-point samples, in mono or stereo:

static bool readWav( string filename, out float[] L, out float[] R )
{
    L = R = null;

    try {
        using (FileStream fs = File.Open(filename,FileMode.Open))
        {
            BinaryReader reader = new BinaryReader(fs);

            // chunk 0
            int chunkID       = reader.ReadInt32();
            int fileSize      = reader.ReadInt32();
            int riffType      = reader.ReadInt32();
            

            // chunk 1
            int fmtID         = reader.ReadInt32();
            int fmtSize       = reader.ReadInt32(); // bytes for this chunk (expect 16 or 18)

            // 16 bytes coming...
            int fmtCode       = reader.ReadInt16();
            int channels      = reader.ReadInt16();
            int sampleRate    = reader.ReadInt32();
            int byteRate      = reader.ReadInt32();
            int fmtBlockAlign = reader.ReadInt16();
            int bitDepth      = reader.ReadInt16();

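            if (fmtSize > 16)
            {
                // Skip any extra format bytes (fmtSize is 18 or 40 for extended headers)
                reader.ReadBytes(fmtSize - 16);
            }

            // chunk 2: skip any other chunks (e.g. "LIST") until we reach "data"
            int dataID = reader.ReadInt32();
            int bytes = reader.ReadInt32();
            while (dataID != 0x61746164)    // "data" read as a little-endian int
            {
                reader.ReadBytes(bytes + (bytes & 1));  // chunks are padded to an even length
                dataID = reader.ReadInt32();
                bytes = reader.ReadInt32();
            }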
            
            // DATA!
            byte[] byteArray = reader.ReadBytes(bytes);
            
            int bytesForSamp = bitDepth/8;
            int nValues = bytes / bytesForSamp;


            float[] asFloat = null;
            switch( bitDepth ) {
                case 64:
                    double[] 
                        asDouble = new double[nValues];  
                    Buffer.BlockCopy(byteArray, 0, asDouble, 0, bytes);
                    asFloat = Array.ConvertAll( asDouble, e => (float)e );
                    break;
                case 32:
                    asFloat = new float[nValues];   
                    Buffer.BlockCopy(byteArray, 0, asFloat, 0, bytes);
                    break;
                case 16:
                    Int16 [] 
                        asInt16 = new Int16[nValues];   
                    Buffer.BlockCopy(byteArray, 0, asInt16, 0, bytes);
                    asFloat = Array.ConvertAll( asInt16, e => e / (float)(Int16.MaxValue+1) );
                    break;
                default:
                    return false;
            }

            switch( channels ) {
            case 1:
                L = asFloat;
                R = null;
                return true;
            case 2:
                // de-interleave
                int nSamps = nValues / 2;
                L = new float[nSamps];
                R = new float[nSamps];
                for( int s=0, v=0; s<nSamps; s++ ) {
                    L[s] = asFloat[v++];
                    R[s] = asFloat[v++];
                }
                return true;
            default:
                return false;
            }
        }
    }
    catch {
            Debug.Log( "...Failed to load: " + filename );
            return false;
    }

    return false;
}

Karlotte answered 8/1, 2016 at 0:0 Comment(12)

Shouldn't L and R be of Length (samps / 2), since asFloat is of size samps? – Anchoress 17/2, 2016 at 9:36

oo I think you're right. Also I should be using Float32 and Float64 rather than float and double. – Karlotte 18/2, 2016 at 11:13

I think float and double are right. C# does not have Float32 and Float64 rather System.Single and System.Double. Anyways, thanks for the code, it helped me fix an annoying error on mine. – Anchoress 18/2, 2016 at 13:9

Great! -- Read this answer along with this article to understand better what's going on: soundfile.sapp.org/doc/WaveFormat – Mervin 22/7, 2016 at 16:33

the inner loop of case 2 will loop i =0 to i = samps, s++ will execute 2x sample times. but asFloat is size of samps. Array out of bound guaranteed. I think it should be i<samps/2; – Rustin 19/12, 2017 at 21:37

Hello, how to call this function? help please – Sanitize 11/5, 2021 at 20:30

Hey P i, This is fantastic for me, I've beefed it up and created a writer too, but I can't work out how to turn the convenient float[] of sample data back into shorts during writing for 16-bit audio. Everything else works great, but converting to short the simple way gets me a bunch of zeros. Please help with the maths? – Jacquiline 8/9, 2022 at 21:34

@Jacquiline If you paste a gist I can take a look – Karlotte 9/9, 2022 at 22:57

@Rustin Code already has int nSamps = nValues / 2; – Karlotte 9/9, 2022 at 23:3

My BW.Write(Convert.ToInt16(lData[i])) gets me zeros coz I don't know how to do it properly. But I think it would really help me if you could explain the maths here: asFloat = Array.ConvertAll( asInt16, e => e / (float)(Int16.MaxValue+1)) and why you converted the 16-bit sample data to 4-byte floats? Also, wouldn't we lose precision when converting doubles to floats (for 64-bit)? It seems like you're basically "bitcrushing"(audio term) to 32-bit. It's probably my ignorance of numeric types :) – Jacquiline 10/9, 2022 at 0:15

@Jacquiline I'm using Int16.max+1 as int ranges from -32768 to +32767. i.e. slightly range-asymmetrical. Dividing thru by 32768 normalizes (ensures range [-1.f, +1.f]). I was converting to 32 bits as I wanted the same processing logic to act over any source wav & didn't care about precision loss. Bear in mind I was simply sharing code that worked for my use-case, rather than making any attempt at a commercial-grade function. You'll have to tweak it if out-of-the-box isn't cutting it for you. – Karlotte 10/9, 2022 at 20:15

Thanks man, as a sound-guy, it must be perfect! I worked out bw.Write(Convert.ToInt16((short.MaxValue + 1) * lData[i])) writes the data perfectly for 16-bit. I'll probably just use a double instead of float as the storage array, and it should all be great. Thanks again. – Jacquiline 10/9, 2022 at 20:43
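
For anyone else writing the samples back out, here is a minimal sketch of the float-to-16-bit conversion discussed in the comments above. `SampleConvert` is a hypothetical helper name. Converting the raw float directly rounds almost every sample to 0, which is where the zeros came from; scaling by 32768 fixes that, but a sample of exactly +1.0 then lands on 32768, which does not fit in a short (`Convert.ToInt16` throws an OverflowException for it), so this version clamps.

```csharp
using System;

static class SampleConvert
{
    // Maps a normalized sample in [-1, 1] to a 16-bit PCM value, clamping out-of-range input.
    public static short ToInt16(float sample)
    {
        if (float.IsNaN(sample)) return 0;              // treat invalid samples as silence
        double scaled = Math.Round(sample * 32768.0);
        if (scaled > short.MaxValue) scaled = short.MaxValue;
        if (scaled < short.MinValue) scaled = short.MinValue;
        return (short)scaled;
    }

    // Inverse mapping, matching the e / (float)(Int16.MaxValue+1) used when reading.
    public static float ToFloat(short sample)
    {
        return sample / 32768f;
    }
}
```

In a writer loop this would be used as `bw.Write(SampleConvert.ToInt16(lData[i]));`, assuming `bw` is a `BinaryWriter` and `lData` holds normalized floats.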

http://hourlyapps.blogspot.com/2008/07/open-source-wave-graph-c-net-control.html
Here is a control that displays the spectrum of a WAV file and also exposes a byte[] of the decoded WAV data, which you can play and/or modify.

Just download the Control and it's pretty good for WAV File manipulation.

Cyclosis answered 6/1, 2012 at 9:47 Comment(0)

To get the wav file into an array you can just do this:

byte[] data = File.ReadAllBytes("FilePath");

but like Fletch said you need to isolate the data from the headers. It should be just a simple offset.

Chauffer answered 6/1, 2012 at 6:27 Comment(0)

Try Play audio data from array

PlayerEx pl = new PlayerEx();

private static void PlayArray(PlayerEx pl)
{
    double fs = 8000; // sample freq
    double freq = 1000; // desired tone
    short[] mySound = new short[4000];
    for (int i = 0; i < 4000; i++)
    {
        double t = (double)i / fs; // current time
        mySound[i] = (short)(Math.Cos(2 * Math.PI * freq * t) * (short.MaxValue)); // 2*pi*f*t gives a tone at freq Hz
    }
    IntPtr format = AudioCompressionManager.GetPcmFormat(1, 16, (int)fs);
    pl.OpenPlayer(format);
    byte[] mySoundByte = new byte[mySound.Length * 2];
    Buffer.BlockCopy(mySound, 0, mySoundByte, 0, mySoundByte.Length);
    pl.AddData(mySoundByte);
    pl.StartPlay();
}

Cowles answered 1/2, 2014 at 16:24 Comment(1)

This answer does not match the question asked, but it might be helpful on the proper question. – Stanleystanly 23/8, 2021 at 12:41
