How to get float array of samples from audio file
I’m developing a UWP application (for Windows 10) which works with audio data. At startup it receives a buffer of samples in the form of a float array, whose items range from -1f to 1f. Earlier I used NAudio.dll 1.8.0, which provided all the necessary functionality: I worked with the WaveFileReader, waveBuffer.FloatBuffer and WaveFileWriter classes. However, when I finished this app and tried to build the Release version, I got this error: ILT0042: Arrays of pointer types are not currently supported: 'System.Int32*[]'.

I’ve tried to solve it:

  1. https://forums.xamarin.com/discussion/73169/uwp-10-build-fail-arrays-of-pointer-types-error

There the advice is to remove the reference to the .dll, but I need it.

  2. I’ve tried to install the same version of NAudio using Manage NuGet Packages, but the WaveFileReader and WaveFileWriter classes are not available there.

  3. In the NAudio developer’s answer (How to store a .wav file in Windows 10 with NAudio) I read about using AudioGraph, but it lets me build a float array of samples only during real-time playback, while I need all the samples right after the audio file is loaded. Example of getting samples during recording or playback: https://learn.microsoft.com/ru-ru/windows/uwp/audio-video-camera/audio-graphs

That’s why I need help: how can I get a FloatBuffer for working with samples right after the audio file is loaded? For example, for building audio waveforms or for calculations when applying audio effects.

Thank you in advance.


Update: I’ve tried to use FileStream and BitConverter.ToSingle(); however, the result differed from NAudio’s. In other words, I’m still looking for a solution.

     private float[] GetBufferArray()
     {
         string _path = ApplicationData.Current.LocalFolder.Path.ToString() + "/track_1.mp3";
         FileStream _stream = new FileStream(_path, FileMode.Open);
         BinaryReader _binaryReader = new BinaryReader(_stream);
         int _dataSize = _binaryReader.ReadInt32();
         byte[] _byteBuffer = _binaryReader.ReadBytes(_dataSize);
    
         int _sizeFloat = sizeof(float);
         float[] _floatBuffer = new float[_byteBuffer.Length / _sizeFloat];
         // <= so the last full sample is not skipped
         for (int i = 0, j = 0; i <= _byteBuffer.Length - _sizeFloat; i += _sizeFloat, j++)
         {
             _floatBuffer[j] = BitConverter.ToSingle(_byteBuffer, i);
         }
         return _floatBuffer;
     }
    
Irritating answered 27/2, 2017 at 10:48 Comment(5)
Links: 1) msdn.microsoft.com/en-us/library/ff827591.aspx 2) msdn.microsoft.com/ru-ru/library/… – Irritating
If you're using NAudio, why don't you use the AudioFileReader or Mp3FileReader class? AudioFileReader will give you a float[] of the audio data right out of the box. – Maidie
The functions you described really do work, but in the Release build of the UWP project with NAudio.dll there is the following error: 'ILT0042: Arrays of pointer types are not currently supported: System.Int32*[]'. In the NuGet NAudio package these methods can't be used, and I couldn't find an alternative for them. – Irritating
If the library itself is throwing exceptions, you may want to look into finding a more compatible version or alternative before trying to use it fully. The only other alternative I can think of is the MediaFoundationReader class, which should read most common audio files for you. – Maidie
Thanks for the help; I found only an example of working with AudioFileReader. I haven't come across examples of using MediaFoundationReader to get the audio data. I will definitely try this class. – Irritating

Another way to read samples from an audio file in UWP is to use the AudioGraph API. It works for all audio formats that Windows 10 supports.

Here is sample code:

namespace AudioGraphAPI_read_samples_from_file
{
    // App opens a file using FileOpenPicker and reads samples into an array of
    // floats using the AudioGraph API
// Declare COM interface to access AudioBuffer
[ComImport]
[Guid("5B0D3235-4DBA-4D44-865E-8F1D0E4FD04D")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
unsafe interface IMemoryBufferByteAccess
{
    void GetBuffer(out byte* buffer, out uint capacity);
}

public sealed partial class MainPage : Page
{
    StorageFile mediaFile;

    AudioGraph audioGraph;
    AudioFileInputNode fileInputNode;
    AudioFrameOutputNode frameOutputNode;

    /// <summary>
    /// We are going to fill this array with audio samples
    /// This app loads only one channel 
    /// </summary>
    float[] audioData;
    /// <summary>
    /// Current position in audioData array for loading audio samples 
    /// </summary>
    int audioDataCurrentPosition = 0;

    public MainPage()
    {
        this.InitializeComponent();            
    }

    private async void Open_Button_Click(object sender, RoutedEventArgs e)
    {
        // We ask user to pick an audio file
        FileOpenPicker filePicker = new FileOpenPicker();
        filePicker.SuggestedStartLocation = PickerLocationId.MusicLibrary;
        filePicker.FileTypeFilter.Add(".mp3");
        filePicker.FileTypeFilter.Add(".wav");
        filePicker.FileTypeFilter.Add(".wma");
        filePicker.FileTypeFilter.Add(".m4a");
        filePicker.ViewMode = PickerViewMode.Thumbnail;
        mediaFile = await filePicker.PickSingleFileAsync();

        if (mediaFile == null)
        {
            return;
        }

        // We load samples from file
        await LoadAudioFromFile(mediaFile);

        // We wait 5 sec
        await Task.Delay(5000);

        if (audioData == null)
        {
            ShowMessage("Error loading samples");
            return;
        }

        // After LoadAudioFromFile method finished we can use audioData
        // For example we can find max amplitude
        float max = audioData[0];
        for (int i = 1; i < audioData.Length; i++)
            if (Math.Abs(audioData[i]) > Math.Abs(max))
                max = audioData[i];
        ShowMessage("Maximum is " + max.ToString());
    }

    private async void ShowMessage(string Message)
    {
        var dialog = new MessageDialog(Message);
        await dialog.ShowAsync();
    }

    private async Task LoadAudioFromFile(StorageFile file)
    {
        // We initialize an instance of AudioGraph
        AudioGraphSettings settings = 
            new AudioGraphSettings(
                Windows.Media.Render.AudioRenderCategory.Media
                );
        CreateAudioGraphResult result1 = await AudioGraph.CreateAsync(settings);
        if (result1.Status != AudioGraphCreationStatus.Success)
        {
            ShowMessage("AudioGraph creation error: " + result1.Status.ToString());
        }
        audioGraph = result1.Graph;

        if (audioGraph == null)
            return;

        // We initialize FileInputNode
        CreateAudioFileInputNodeResult result2 = 
            await audioGraph.CreateFileInputNodeAsync(file);
        if (result2.Status != AudioFileNodeCreationStatus.Success)
        {
            ShowMessage("FileInputNode creation error: " + result2.Status.ToString());
        }
        fileInputNode = result2.FileInputNode;

        if (fileInputNode == null)
            return;

        // We read audio file encoding properties to pass them to FrameOutputNode creator
        AudioEncodingProperties audioEncodingProperties = fileInputNode.EncodingProperties;

        // We initialize FrameOutputNode and connect it to fileInputNode
        frameOutputNode = audioGraph.CreateFrameOutputNode(audioEncodingProperties);
        fileInputNode.AddOutgoingConnection(frameOutputNode);

        // We add a handler for when the end of the file is reached
        fileInputNode.FileCompleted += FileInput_FileCompleted;
        // We add a handler which will transfer every audio frame into audioData 
        audioGraph.QuantumStarted += AudioGraph_QuantumStarted;

        // We initialize audioData
        int numOfSamples = (int)Math.Ceiling(
            (decimal)0.0000001
            * fileInputNode.Duration.Ticks
            * fileInputNode.EncodingProperties.SampleRate
            );
        audioData = new float[numOfSamples];

        audioDataCurrentPosition = 0;

        // We start process which will read audio file frame by frame
        // and will generate QuantumStarted events when a frame is in memory
        audioGraph.Start();

    }

    private void FileInput_FileCompleted(AudioFileInputNode sender, object args)
    {
        audioGraph.Stop();
    }

    private void AudioGraph_QuantumStarted(AudioGraph sender, object args)
    {
        AudioFrame frame = frameOutputNode.GetFrame();
        ProcessInputFrame(frame);

    }

    unsafe private void ProcessInputFrame(AudioFrame frame)
    {
        using (AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.Read))
        using (IMemoryBufferReference reference = buffer.CreateReference())
        {
            // We get data from current buffer
            ((IMemoryBufferByteAccess)reference).GetBuffer(
                out byte* dataInBytes,
                out uint capacityInBytes
                );
            // We discard first frame; it's full of zeros because of latency
            if (audioGraph.CompletedQuantumCount == 1) return;

            float* dataInFloat = (float*)dataInBytes;
            uint capacityInFloat = capacityInBytes / sizeof(float);
            // Number of channels defines step between samples in buffer
            uint step = fileInputNode.EncodingProperties.ChannelCount;
            // We transfer audio samples from buffer into audioData
            for (uint i = 0; i < capacityInFloat; i += step)
            {
                if (audioDataCurrentPosition < audioData.Length)
                {
                    audioData[audioDataCurrentPosition] = dataInFloat[i];
                    audioDataCurrentPosition++;
                }
            }
        }
    }
}

}

Edited: this solves the problem because it reads samples from a file into a float array.

Attenuator answered 8/10, 2017 at 10:53 Comment(1)
Just linking to your own library or tutorial is not a good answer. Linking to it, explaining why it solves the problem, providing code on how to do so and disclaiming that you wrote it makes for a better answer. See: What signifies “Good” self promotion?Atheism

First way of getting AudioData: reading a wav file directly.

Thanks to PI user’s answer How to read the data in a wav file to an array, I’ve solved the problem of reading a wav file into a float array in a UWP project. But when the file is recorded using AudioGraph, its structure differs from the standard one (maybe this problem exists only in my project), which leads to an unpredictable result: instead of the expected format id 544501094 ("fmt " as a little-endian Int32) we receive the value 1263424842 (which decodes to the ASCII id "JUNK"). After that, all the following values are read incorrectly. I found the correct id by searching through the bytes sequentially. I realised that AudioGraph adds an extra chunk of data to the recorded wav file, but the record’s format is still PCM. This extra chunk looks like data about the file format, but it also contains empty values, empty bytes. I couldn’t find any information about it; maybe somebody here knows? I adapted the solution from PI to my needs. This is what I’ve got:

            using (FileStream fs = File.Open(filename, FileMode.Open))
            {
                BinaryReader reader = new BinaryReader(fs);

                int chunkID = reader.ReadInt32();   // "RIFF"
                int fileSize = reader.ReadInt32();
                int riffType = reader.ReadInt32();  // "WAVE"
                int fmtID;

                // Scan byte by byte for the "fmt " chunk id
                // (544501094 == "fmt " as a little-endian Int32)
                long _position = reader.BaseStream.Position;
                while (_position < reader.BaseStream.Length - sizeof(int))
                {
                    reader.BaseStream.Position = _position;
                    int _fmtId = reader.ReadInt32();
                    if (_fmtId == 544501094)
                    {
                        fmtID = _fmtId;
                        break;
                    }
                    _position++;
                }
                int fmtSize = reader.ReadInt32();
                int fmtCode = reader.ReadInt16();

                int channels = reader.ReadInt16();
                int sampleRate = reader.ReadInt32();
                int byteRate = reader.ReadInt32();
                int fmtBlockAlign = reader.ReadInt16();
                int bitDepth = reader.ReadInt16();

                // WAVEFORMATEX with cbSize: skip the extra format bytes
                int fmtExtraSize;
                if (fmtSize == 18)
                {
                    fmtExtraSize = reader.ReadInt16();
                    reader.ReadBytes(fmtExtraSize);
                }

                int dataID = reader.ReadInt32();    // "data"
                int dataSize = reader.ReadInt32();

                byte[] byteArray = reader.ReadBytes(dataSize);

                int bytesForSamp = bitDepth / 8;
                int samps = dataSize / bytesForSamp;

                // Convert 16-bit PCM samples to floats in -1f..1f
                float[] asFloat = null;
                switch (bitDepth)
                {
                    case 16:
                        Int16[] asInt16 = new Int16[samps];
                        Buffer.BlockCopy(byteArray, 0, asInt16, 0, dataSize);
                        IEnumerable<float> tempInt16 =
                            from i in asInt16
                            select i / (float)Int16.MaxValue;
                        asFloat = tempInt16.ToArray();
                        break;
                    default:
                        return false;
                }

                // For one-channel wav audio
                floatLeftBuffer.AddRange(asFloat);
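Instead of scanning byte by byte for the "fmt " id, one can also walk the RIFF chunk list directly, skipping unknown chunks (such as the padding "JUNK" chunk mentioned above). A minimal sketch, assuming a well-formed little-endian RIFF file; `RiffChunkWalker` and `FindChunk` are hypothetical names, not part of any library:

```csharp
using System;
using System.IO;

static class RiffChunkWalker
{
    // Returns the stream position of the body of the chunk with the given
    // 4-character id, or -1 if it is not found
    public static long FindChunk(BinaryReader reader, string chunkId)
    {
        reader.BaseStream.Position = 12; // skip "RIFF", file size, "WAVE"
        while (reader.BaseStream.Position + 8 <= reader.BaseStream.Length)
        {
            string id = new string(reader.ReadChars(4));
            int size = reader.ReadInt32();
            if (id == chunkId)
                return reader.BaseStream.Position; // start of chunk body
            // Chunk bodies are word-aligned: skip a pad byte if size is odd
            reader.BaseStream.Position += size + (size & 1);
        }
        return -1;
    }
}
```

With this, the "JUNK" chunk written before "fmt " is skipped in one step rather than found by brute force.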

Writing from the buffer back to a file uses the inverse algorithm. At the moment this is the only correct algorithm I have for working with wav files that allows getting the audio data. I used this article on working with AudioGraph: https://learn.microsoft.com/ru-ru/windows/uwp/audio-video-camera/audio-graphs. Note that you can set the necessary record format with AudioEncodingQuality when recording from the microphone to a file.
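The inverse conversion (turning the float buffer back into 16-bit PCM bytes before writing the data chunk) can be sketched like this. `FloatToPcm16` is a hypothetical helper name; the Int16.MaxValue scale mirrors the division used in the reading code above:

```csharp
using System;

static class FloatToPcm16
{
    // Converts normalized samples (-1f..1f) back to little-endian
    // 16-bit PCM bytes, the inverse of the read path above
    public static byte[] Convert(float[] samples)
    {
        byte[] bytes = new byte[samples.Length * sizeof(short)];
        for (int i = 0; i < samples.Length; i++)
        {
            // Clamp to the valid range before scaling
            float clamped = Math.Max(-1f, Math.Min(1f, samples[i]));
            short pcm = (short)Math.Round(clamped * short.MaxValue);
            BitConverter.GetBytes(pcm).CopyTo(bytes, i * sizeof(short));
        }
        return bytes;
    }
}
```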

Second way of getting AudioData: using NAudio from NuGet packages.

I used MediaFoundationReader class.

        float[] floatBuffer;
        using (MediaFoundationReader media = new MediaFoundationReader(path))
        {
            // Wave16ToFloatProvider turns 16-bit samples into 32-bit floats,
            // so the converted stream is twice as long in bytes
            int _byteBuffer32_length = (int)media.Length * 2;
            int _floatBuffer_length = _byteBuffer32_length / sizeof(float);

            IWaveProvider stream32 = new Wave16ToFloatProvider(media);
            WaveBuffer _waveBuffer = new WaveBuffer(_byteBuffer32_length);
            stream32.Read(_waveBuffer, 0, _byteBuffer32_length);
            floatBuffer = new float[_floatBuffer_length];

            for (int i = 0; i < _floatBuffer_length; i++) {
                floatBuffer[i] = _waveBuffer.FloatBuffer[i];
            }
        }

Comparing the two ways, I noticed:

  • The received sample values differ by about 1/1,000,000. I can’t say which way is more precise (if you know, I’ll be glad to hear);
  • The second way of getting AudioData works for MP3 files, too.
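A plausible source of these tiny differences (an assumption, not verified against NAudio's source here) is the normalization divisor: some decoders divide a 16-bit sample by 32768, while the wav-reading code above divides by Int16.MaxValue (32767). The resulting discrepancy is on the order of the values observed:

```csharp
using System;

class NormalizationDiff
{
    static void Main()
    {
        short sample = 1000;
        // Two common ways to map a 16-bit sample into -1f..1f
        float byMaxValue = sample / (float)short.MaxValue; // / 32767
        float byPowerOfTwo = sample / 32768f;              // / 32768
        // The difference equals sample / (32767 * 32768), i.e. roughly
        // 1e-6 for small-amplitude samples
        Console.WriteLine(byMaxValue - byPowerOfTwo);
    }
}
```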

If you’ve found any mistakes or have comments about this, you’re welcome to share them.

Irritating answered 14/3, 2017 at 12:38 Comment(0)

Import statements

using NAudio.Wave;
using NAudio.Wave.SampleProviders;

Inside function

AudioFileReader reader = new AudioFileReader(filename);
ISampleProvider isp = reader.ToSampleProvider();
float[] buffer = new float[reader.Length / 2];
isp.Read(buffer, 0, buffer.Length);

The buffer array will contain 32-bit IEEE float samples. This uses the NAudio NuGet package in Visual Studio.

Pronucleus answered 29/6, 2020 at 21:31 Comment(1)
The float array size should be the number of samples. reader.Length gives the size of the audio in bytes. Since the bit depth here is 16 bits (2 bytes), each sample takes 2 bytes, so converting bytes to the number of samples requires dividing by 2. If the wav file had a 32-bit depth, you would divide by 4. – Pronucleus
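To avoid hardcoding the divisor for a particular bit depth, the samples can also be read in a loop until `Read` returns 0. A sketch with a hypothetical `SampleLoader` helper; the delegate shape matches NAudio's `ISampleProvider.Read(float[], int, int)`:

```csharp
using System;
using System.Collections.Generic;

static class SampleLoader
{
    // Accumulates samples from any chunked reader with the
    // Read(buffer, offset, count) shape, until it returns 0.
    // With NAudio this could be called as, e.g.:
    //   var isp = new AudioFileReader(filename).ToSampleProvider();
    //   float[] all = SampleLoader.ReadAll(isp.Read, 8192);
    public static float[] ReadAll(Func<float[], int, int, int> read, int chunkSize)
    {
        var samples = new List<float>();
        float[] chunk = new float[chunkSize];
        int got;
        while ((got = read(chunk, 0, chunk.Length)) > 0)
            for (int i = 0; i < got; i++)
                samples.Add(chunk[i]);
        return samples.ToArray();
    }
}
```

This way the total sample count comes from the provider itself rather than from byte arithmetic.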
