I need to get all the samples of a wav file into an array (or two if you need to do that to keep the stereo) so that I can apply some modifications to them. I was wondering if this is easily done (preferably without external libraries). I have no experience with reading in sound files, so I don't know much about the subject.
WAV files (at least, uncompressed ones) are fairly straightforward. There's a header, then the data follows it.
Here's a great reference: https://ccrma.stanford.edu/courses/422/projects/WaveFormat/ (mirror)
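To make the "header, then data" layout concrete, here is a minimal sketch that decodes the main fields of a canonical 44-byte PCM header from a byte array. The class and method names (WavHeaderSketch, Describe) are hypothetical, and the fixed offsets assume the "fmt " chunk immediately follows the RIFF header, which is not guaranteed for every file.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the answer above.
// Assumes the canonical layout: "fmt " chunk at byte 12, fields at fixed offsets.
public static class WavHeaderSketch
{
    public static string Describe(byte[] wav)
    {
        if (wav.Length < 44)
            throw new ArgumentException("Too short for a canonical WAV header.");

        string riff = Encoding.ASCII.GetString(wav, 0, 4); // should be "RIFF"
        string wave = Encoding.ASCII.GetString(wav, 8, 4); // should be "WAVE"
        if (riff != "RIFF" || wave != "WAVE")
            throw new ArgumentException("Not a RIFF/WAVE file.");

        // WAV fields are little-endian; BitConverter matches that on
        // little-endian hosts (check BitConverter.IsLittleEndian otherwise).
        short audioFormat   = BitConverter.ToInt16(wav, 20); // 1 = PCM, 3 = IEEE float
        short channels      = BitConverter.ToInt16(wav, 22);
        int   sampleRate    = BitConverter.ToInt32(wav, 24);
        short bitsPerSample = BitConverter.ToInt16(wav, 34);

        return $"format={audioFormat} channels={channels} rate={sampleRate} bits={bitsPerSample}";
    }
}
```

Calling WavHeaderSketch.Describe(File.ReadAllBytes(path)) before decoding is a cheap way to confirm the file really is 16-bit PCM.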
… "fmt " subchunk is 0x0003 (WAVE_FORMAT_IEEE_FLOAT) – Evince
This code should do the trick. It converts a wave file to a normalized double array (-1 to 1), but it should be trivial to make it an int/short array instead (remove the /32768.0 bit to keep the signed 16-bit value, or add 32768 instead to get an unsigned 0 to 65535 range). The right[] array will be set to null if the loaded wav file is found to be mono.
I can't claim it's completely bulletproof (there may be off-by-one errors), but after creating a 65536-sample array and generating a wave from -1 to 1, none of the samples appear to go 'through' the ceiling or floor.
using System.IO; // for File.ReadAllBytes

// Convert two bytes to one double in the range -1 to 1
static double bytesToDouble(byte firstByte, byte secondByte)
{
    // Convert two bytes to one short (little endian).
    // The shift promotes the operands to int, so an explicit cast back to short is required.
    short s = (short)((secondByte << 8) | firstByte);
    // Convert to range from -1 to (just below) 1
    return s / 32768.0;
}

// Returns left and right double arrays. 'right' will be null if sound is mono.
public void openWav(string filename, out double[] left, out double[] right)
{
    byte[] wav = File.ReadAllBytes(filename);

    // Determine if mono or stereo
    int channels = wav[22]; // Forget byte 23 as 99.999% of WAVs are 1 or 2 channels

    // Get past all the other sub chunks to get to the data subchunk:
    int pos = 12; // First Subchunk ID from 12 to 16

    // Keep iterating until we find the data chunk (i.e. 64 61 74 61 ...... (i.e. 100 97 116 97 in decimal))
    while (!(wav[pos] == 100 && wav[pos + 1] == 97 && wav[pos + 2] == 116 && wav[pos + 3] == 97))
    {
        pos += 4;
        int chunkSize = wav[pos] + wav[pos + 1] * 256 + wav[pos + 2] * 65536 + wav[pos + 3] * 16777216;
        pos += 4 + chunkSize;
    }
    pos += 8;

    // Pos is now positioned to start of actual sound data.
    int samples = (wav.Length - pos) / 2;  // 2 bytes per sample (16 bit sound mono)
    if (channels == 2) samples /= 2;       // 4 bytes per sample (16 bit stereo)

    // Allocate memory (right will be null if only mono sound)
    left = new double[samples];
    if (channels == 2) right = new double[samples];
    else right = null;

    // Write to double array/s.
    // Bounded by the sample count (derived from wav.Length) so a trailing odd byte cannot overrun the array.
    int i = 0;
    while (i < samples)
    {
        left[i] = bytesToDouble(wav[pos], wav[pos + 1]);
        pos += 2;
        if (channels == 2)
        {
            right[i] = bytesToDouble(wav[pos], wav[pos + 1]);
            pos += 2;
        }
        i++;
    }
}
… (secondByte << 8) to an 8bit type like byte. byte is an 8-bit signed integer and cannot be bit-shifted by 8 because there is no space to push those 8 bits. – Heins
… Otherwise, both operands are converted to type int. – Donegan
pos is used to measure the index position in the wav array. I updated his example to wav.Length instead, and wrapped the entire s definition formula in a (short) cast. You will of course need to add using System.IO; to your own classes to use the File class he uses to open the wav file. – Pedersen
Assuming your WAV file contains 16 bit PCM (which is the most common), you can use NAudio to read it out into a byte array, and then copy that into an array of 16 bit integers for convenience. If it is stereo, the samples will be interleaved left, right.
using (WaveFileReader reader = new WaveFileReader("myfile.wav"))
{
    Assert.AreEqual(16, reader.WaveFormat.BitsPerSample, "Only works with 16 bit audio");
    byte[] buffer = new byte[reader.Length];
    int read = reader.Read(buffer, 0, buffer.Length);
    short[] sampleBuffer = new short[read / 2];
    Buffer.BlockCopy(buffer, 0, sampleBuffer, 0, read);
}
I know you wanted to avoid third party libraries, but if you want to be sure to cope with WAV files with extra chunks, I suggest avoiding approaches like just seeking 44 bytes into the file.
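For readers who do want to stay library-free, the "extra chunks" problem can be handled by walking the RIFF chunk list instead of assuming a 44-byte header. The sketch below is one way to do that; WavChunkSketch and FindDataChunk are hypothetical names, and it assumes a seekable stream and that the first "data" chunk is the one you want.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: returns the payload of the first "data" chunk,
// or null if the stream contains none. Assumes a seekable stream.
public static class WavChunkSketch
{
    public static byte[] FindDataChunk(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            stream.Position = 12; // skip "RIFF", overall size, "WAVE"
            while (stream.Position + 8 <= stream.Length)
            {
                string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
                long size = reader.ReadUInt32(); // chunk sizes are little-endian
                if (id == "data")
                {
                    // Guard against a declared size that runs past the end of the file
                    long available = stream.Length - stream.Position;
                    return reader.ReadBytes((int)Math.Min(size, available));
                }
                // Chunks are word-aligned: an odd size is followed by one pad byte
                stream.Position += size + (size & 1);
            }
            return null;
        }
    }
}
```

The returned bytes can then be copied into a short[] with Buffer.BlockCopy, exactly as in the NAudio snippet above.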
NOTE: As per Daniel Moller's comment, refer to http://soundfile.sapp.org/doc/WaveFormat/ to understand the code. (And upvote the comment).
At time of writing nobody has addressed 32-bit or 64-bit encoded WAVs.
The following code handles 16/32/64 bit and mono/stereo:
static bool readWav( string filename, out float[] L, out float[] R )
{
    L = R = null;
    try {
        using (FileStream fs = File.Open(filename,FileMode.Open))
        {
            BinaryReader reader = new BinaryReader(fs);

            // chunk 0
            int chunkID  = reader.ReadInt32();
            int fileSize = reader.ReadInt32();
            int riffType = reader.ReadInt32();

            // chunk 1
            int fmtID   = reader.ReadInt32();
            int fmtSize = reader.ReadInt32(); // bytes for this chunk (expect 16 or 18)

            // 16 bytes coming...
            int fmtCode       = reader.ReadInt16();
            int channels      = reader.ReadInt16();
            int sampleRate    = reader.ReadInt32();
            int byteRate      = reader.ReadInt32();
            int fmtBlockAlign = reader.ReadInt16();
            int bitDepth      = reader.ReadInt16();

            if (fmtSize == 18)
            {
                // Read any extra values
                int fmtExtraSize = reader.ReadInt16();
                reader.ReadBytes(fmtExtraSize);
            }

            // chunk 2 (assumes the data chunk directly follows the fmt chunk)
            int dataID = reader.ReadInt32();
            int bytes  = reader.ReadInt32();

            // DATA!
            byte[] byteArray = reader.ReadBytes(bytes);

            int bytesForSamp = bitDepth/8;
            int nValues = bytes / bytesForSamp;

            float[] asFloat = null;
            switch( bitDepth ) {
                case 64:
                    double[] asDouble = new double[nValues];
                    Buffer.BlockCopy(byteArray, 0, asDouble, 0, bytes);
                    asFloat = Array.ConvertAll( asDouble, e => (float)e );
                    break;
                case 32:
                    // treats the samples as IEEE floats; fmtCode is not checked, so 32-bit integer PCM is not handled
                    asFloat = new float[nValues];
                    Buffer.BlockCopy(byteArray, 0, asFloat, 0, bytes);
                    break;
                case 16:
                    Int16[] asInt16 = new Int16[nValues];
                    Buffer.BlockCopy(byteArray, 0, asInt16, 0, bytes);
                    // scale -32768..32767 to -1..(just below) 1
                    asFloat = Array.ConvertAll( asInt16, e => e / (float)(Int16.MaxValue+1) );
                    break;
                default:
                    return false;
            }

            switch( channels ) {
                case 1:
                    L = asFloat;
                    R = null;
                    return true;
                case 2:
                    // de-interleave
                    int nSamps = nValues / 2;
                    L = new float[nSamps];
                    R = new float[nSamps];
                    for( int s=0, v=0; s<nSamps; s++ ) {
                        L[s] = asFloat[v++];
                        R[s] = asFloat[v++];
                    }
                    return true;
                default:
                    return false;
            }
        }
    }
    catch {
        Debug.Log( "...Failed to load: " + filename );
        return false;
    }
    return false;
}
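When writing samples back out as 16-bit audio, the scaling in the 16-bit case has to be inverted: converting a normalized float straight to a short rounds nearly every value to 0, because the values lie between -1 and 1. The sketch below multiplies by 32768 and clamps, since +1.0 would otherwise map to 32768 and overflow. SampleConvertSketch and FloatToInt16 are hypothetical names, not part of the answer's code.

```csharp
using System;

// Hypothetical helper: inverse of the e / 32768f scaling used when reading 16-bit samples.
public static class SampleConvertSketch
{
    public static short FloatToInt16(float sample)
    {
        // Scale the normalized range back to 16-bit integer units
        double scaled = Math.Round(sample * 32768.0);

        // Clamp: +1.0 scales to 32768, which does not fit in a short
        if (scaled > short.MaxValue) scaled = short.MaxValue;
        if (scaled < short.MinValue) scaled = short.MinValue;

        return (short)scaled;
    }
}
```

A BinaryWriter can then emit each sample with bw.Write(SampleConvertSketch.FloatToInt16(value)), which writes the two bytes in little-endian order.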
… float[] of sample data back into shorts during writing for 16-bit audio. Everything else works great, but converting to short the simple way gets me a bunch of zeros. Please help with the maths? – Jacquiline
… int nSamps = nValues / 2; – Karlotte
BW.Write(Convert.ToInt16(lData[i])) gets me zeros because I don't know how to do it properly. But I think it would really help me if you could explain the maths here: asFloat = Array.ConvertAll( asInt16, e => e / (float)(Int16.MaxValue+1)) and why you converted the 16-bit sample data to 4-byte floats? Also, wouldn't we lose precision when converting doubles to floats (for 64-bit)? It seems like you're basically "bitcrushing" (audio term) to 32-bit. It's probably my ignorance of numeric types :) – Jacquiline
bw.Write(Convert.ToInt16((short.MaxValue + 1) * lData[i])) writes the data perfectly for 16-bit. I'll probably just use a double instead of float as the storage array, and it should all be great. Thanks again. – Jacquiline
http://hourlyapps.blogspot.com/2008/07/open-source-wave-graph-c-net-control.html
Here is a control that displays the spectrum of a WAV file and also exposes a byte[] of the decoded WAV data, which you can play and/or modify.
Just download the control; it's pretty good for WAV file manipulation.
To get the wav file into an array you can just do this:
byte[] data = File.ReadAllBytes("FilePath");
But, as Fletch said, you need to isolate the data from the headers. For a canonical file that is a fixed 44-byte offset; files with extra chunks need the data chunk located explicitly.
Try Play audio data from array
PlayerEx pl = new PlayerEx();

private static void PlayArray(PlayerEx pl)
{
    double fs = 8000;   // sample freq
    double freq = 1000; // desired tone
    short[] mySound = new short[4000];
    for (int i = 0; i < 4000; i++)
    {
        double t = (double)i / fs; // current time
        // a tone at freq Hz needs the angular frequency 2*pi*freq
        mySound[i] = (short)(Math.Cos(2 * Math.PI * freq * t) * (short.MaxValue));
    }
    IntPtr format = AudioCompressionManager.GetPcmFormat(1, 16, (int)fs);
    pl.OpenPlayer(format);
    // copy the 16-bit samples into a byte array (2 bytes per sample)
    byte[] mySoundByte = new byte[mySound.Length * 2];
    Buffer.BlockCopy(mySound, 0, mySoundByte, 0, mySoundByte.Length);
    pl.AddData(mySoundByte);
    pl.StartPlay();
}