Issue encoding and decoding an audio recording to G711 (PCMU - uLaw) format

There isn't much information about applying this codec when streaming audio. Without the codec, my code works like a charm, establishing communication between two devices, but I need to encode/decode in that format because I will eventually stream to a server rather than between two devices (I am testing this code using two devices).

I am hoping one of you can spot the key to my problem. I've tried different configurations of the input parameters. Maybe the codecs I am using are wrong (I took them from a project with an Apache license).

These values are set the same in the recorder-sender as in the player-receiver device:

private int port=50005;
private int sampleRate = 8000 ;//44100;
private int channelConfig = AudioFormat.CHANNEL_OUT_MONO;    
private int audioFormat = AudioFormat.ENCODING_PCM_16BIT;       
int minBufSize = AudioTrack.getMinBufferSize(sampleRate, channelConfig, audioFormat);

Note: CHANNEL_OUT_MONO in the player and CHANNEL_IN_MONO in the recorder.

And these are my methods:

public void startStreamingEncoding() {

    Thread streamThread = new Thread(new Runnable() {

        @Override
        public void run() {
            try {

                android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO);

                DatagramSocket socket = new DatagramSocket();

                short[] buffer = new short[minBufSize];

                DatagramPacket packet;

                final InetAddress destination = InetAddress.getByName(ip_receiver); 

                recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,sampleRate,channelConfig,audioFormat,minBufSize*10);

                recorder.startRecording();

                /////Encoding:
                Encoder encoder = new G711UCodec();
                byte[] outBuffer = new byte[minBufSize];

                while(status == true) {

                    //reading data from MIC into buffer
                    minBufSize = recorder.read(buffer, 0, buffer.length);
                    //Encoding:
                    encoder.encode(buffer, minBufSize, outBuffer, 0);

                    //putting buffer in the packet
                    packet = new DatagramPacket (outBuffer, outBuffer.length, destination,port);

                    socket.send(packet);
                }

            } catch(UnknownHostException e) {
                Log.e("VS", "UnknownHostException");
            } catch (IOException e) {
                e.printStackTrace();
                Log.e("VS", "IOException");
            } 
        }

    });
    streamThread.start();
 }

And the method to play and decode the stream:

public void playerAudioDecoding() {
    Thread thrd = new Thread(new Runnable() {
        @Override
        public void run() 
        {
            android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO);

            AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 
                    sampleRate, AudioFormat.CHANNEL_CONFIGURATION_MONO, 
                    AudioFormat.ENCODING_PCM_16BIT, minBufSize, 
                    AudioTrack.MODE_STREAM);
            track.play();

            Decoder decoder = new G711UCodec();

            try
            {
                DatagramSocket sock = new DatagramSocket(port);
                byte[] buf = new byte[minBufSize];

                while(true)
                {
                    DatagramPacket pack = new DatagramPacket(buf, minBufSize);
                    sock.receive(pack);

                    //Decoding:
                    int size = pack.getData().length;
                    short[] shortArray = new short[size];

                    decoder.decode(shortArray, pack.getData(), minBufSize, 0);
                    byte[] array = MyShortToByte(shortArray);
                    track.write(array, 0, array.length);
                }
            }
            catch (SocketException se)
            {
                Log.e("Error", "SocketException: " + se.toString());
            }
            catch (IOException ie)
            {
                Log.e("Error", "IOException" + ie.toString());
            }
        } // end run
    });
    thrd.start();
}

And this is the codec class (Apache license) that I am using:

public class G711UCodec implements Encoder, Decoder {
// s00000001wxyz...s000wxyz
// s0000001wxyza...s001wxyz
// s000001wxyzab...s010wxyz
// s00001wxyzabc...s011wxyz
// s0001wxyzabcd...s100wxyz
// s001wxyzabcde...s101wxyz
// s01wxyzabcdef...s110wxyz
// s1wxyzabcdefg...s111wxyz

private static byte[] table13to8 = new byte[8192];
private static short[] table8to16 = new short[256];

static {
    // b13 --> b8
    for (int p = 1, q = 0; p <= 0x80; p <<= 1, q+=0x10) {
        for (int i = 0, j = (p << 4) - 0x10; i < 16; i++, j += p) {
            int v = (i + q) ^ 0x7F;
            byte value1 = (byte) v;
            byte value2 = (byte) (v + 128);
            for (int m = j, e = j + p; m < e; m++) {
                table13to8[m] = value1;
                table13to8[8191 - m] = value2;
            }
        }
    }

    // b8 --> b16
    for (int q = 0; q <= 7; q++) {
        for (int i = 0, m = (q << 4); i < 16; i++, m++) {
            int v = (((i + 0x10) << q) - 0x10) << 3;
            table8to16[m ^ 0x7F] = (short) v;
            table8to16[(m ^ 0x7F) + 128] = (short) (65536 - v);
        }
    }
}

public int decode(short[] b16, byte[] b8, int count, int offset) {
    for (int i = 0, j = offset; i < count; i++, j++) {
        b16[i] = table8to16[b8[j] & 0xFF];
    }
    return count;
}

public int encode(short[] b16, int count, byte[] b8, int offset) {

    for (int i = 0, j = offset; i < count; i++, j++) {
        b8[j] = table13to8[(b16[i] >> 4) & 0x1FFF];
    }
    return count;
}

public int getSampleCount(int frameSize) {
    return frameSize;
}

}
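One way to sanity-check a codec class like this is to round-trip samples through encode and decode and measure the error (as one commenter below also suggests). The sketch that follows is not the table-based G711UCodec above; it is a hypothetical, self-contained implementation of the standard G.711 µ-law algorithm, which implements the same transfer curve the class is supposed to produce. Per G.711, the round-trip error for a 16-bit sample should stay within one quantization step (at most about 512 near full scale); if the class above produces much larger errors on the same input, the bug is in the tables.

```java
// Reference G.711 mu-law round-trip check (standard algorithm, not the
// table-based G711UCodec from the question). ALL names here are
// illustrative, not part of the original code.
public class MuLawRoundTrip {
    static final int BIAS = 0x84;   // standard G.711 mu-law bias
    static final int CLIP = 32635;  // clip magnitude before encoding

    static byte linearToMuLaw(short pcm) {
        int sign = (pcm >> 8) & 0x80;          // 0x80 for negative samples
        if (sign != 0) pcm = (short) -pcm;
        if (pcm > CLIP) pcm = CLIP;
        int magnitude = pcm + BIAS;
        // exponent = index of the segment holding the highest set bit
        int exponent = 7;
        for (int mask = 0x4000; (magnitude & mask) == 0 && exponent > 0; mask >>= 1) {
            exponent--;
        }
        int mantissa = (magnitude >> (exponent + 3)) & 0x0F;
        return (byte) ~(sign | (exponent << 4) | mantissa); // mu-law is inverted on the wire
    }

    static short muLawToLinear(byte mu) {
        int u = ~mu & 0xFF;                    // undo the inversion
        int exponent = (u >> 4) & 0x07;
        int mantissa = u & 0x0F;
        int magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS;
        return (short) (((u & 0x80) != 0) ? -magnitude : magnitude);
    }

    public static void main(String[] args) {
        int maxErr = 0;
        for (int s = -32000; s <= 32000; s += 7) {
            short back = muLawToLinear(linearToMuLaw((short) s));
            maxErr = Math.max(maxErr, Math.abs(back - s));
        }
        System.out.println("max round-trip error: " + maxErr);
    }
}
```

Running this prints a maximum error well under one top-segment step (1024), which is the behavior any correct µ-law implementation should match.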

Really, I don't know what is happening. If I change the sampleRate to 4000 I can recognize my voice and some words, but there is a lot of echo. And I repeat: if I disable the encoding/decoding process and stream raw PCM, the quality is fantastic. Let's see if anybody can help me, and thanks in advance.

Padang answered 24/4, 2014 at 15:53 Comment(3)
Some old code I have uses twice the recommended buffer for the AudioTrack and 10 times for the AudioRecord. I can't remember exactly why, but it could be related to your problem. – Sheelagh
I am working on a solution with PCM over RTP/UDP and I constantly have this clicking, bumping, almost drum-like sound; it seems to be in sync with the samples received. Have you had this issue? Any thoughts on what it could be? – Aret
It's been a long time since I played around with this feature/implementation. I remember that at the beginning I reproduced the same sound issues as you, but I think we solved it by playing around with buffering. I can't give you a solution because I don't remember; I only know that using the code I posted, the audio was fine. – Padang

OK guys, I finally resolved the encoding/decoding problem myself. It's been an annoying task over the last week. The main problem with my code was that the encoding was done well but the decoding wasn't, so I worked around it, modified the class with the help of other resources, and created my own encoding/decoding methods (and these are working like a charm!).

Another important decision was to change the encoding format. I am now using aLaw instead of uLaw, for the sole reason that aLaw is programmatically easier to implement than uLaw.

I also had to play a lot with parameters such as buffer sizes, etc.

I will post my code, and I hope my references save some of you a lot of time.

private int port=50005;
private int sampleRate = 8000; //44100;
private int channelConfig = AudioFormat.CHANNEL_IN_MONO;    
private int audioFormat = AudioFormat.ENCODING_PCM_16BIT;       
int minBufSize = AudioRecord.getMinBufferSize(sampleRate, channelConfig, audioFormat);

public void startStreamingEncoding() {

    Thread streamThread = new Thread(new Runnable() {

        @Override
        public void run() {
            try {

                android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO);

                DatagramSocket socket = new DatagramSocket();

                byte[] buffer = new byte[4096];

                DatagramPacket packet;

                final InetAddress destination = InetAddress.getByName(ip_receiver); 

                recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,sampleRate,channelConfig,audioFormat, minBufSize * 10);

                recorder.startRecording();

                /////Encoding:
                CMG711 encoder = new CMG711();
                byte[] outBuffer = new byte[4096];

                int read, encoded;
                File sdCard = Environment.getExternalStorageDirectory();
                FileOutputStream out = new FileOutputStream( new File( sdCard ,"audio-bernard.raw" ));

                while(status == true) {

                    //reading data from MIC into buffer
                    read = recorder.read(buffer, 0, buffer.length);
                    Log.d(getTag(), "read: "+read );

                    //Encoding:
                    encoded = encoder.encode(buffer,0, read, outBuffer);                      

                    //putting buffer in the packet
                    packet = new DatagramPacket (outBuffer, encoded, destination,port);
                    out.write( outBuffer, 0, encoded );

                    socket.send(packet);
                }

            } catch(UnknownHostException e) {
                Log.e("VS", "UnknownHostException");
            } catch (IOException e) {
                e.printStackTrace();
                Log.e("VS", "IOException");
            } 
        }

    });
    streamThread.start();
 }

And for the receiver-and-player class or method:

private int port=50005;
private int sampleRate = 8000 ;//44100;
private int channelConfig = AudioFormat.CHANNEL_OUT_MONO;    
private int audioFormat = AudioFormat.ENCODING_PCM_16BIT;       
int minBufSize = AudioTrack.getMinBufferSize(sampleRate, channelConfig, audioFormat);


public void playerAudioDecodingBernard() {
        Thread thrd = new Thread(new Runnable() {
            @Override
            public void run() 
            {
                android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO);

                AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, 
                        sampleRate, AudioFormat.CHANNEL_OUT_MONO, 
                        AudioFormat.ENCODING_PCM_16BIT, minBufSize * 10, 
                        AudioTrack.MODE_STREAM);


                CMG711 decoder = new CMG711();

                try
                {
                    DatagramSocket sock = new DatagramSocket(port);
                    byte[] buf = new byte[4096];

                    int frame = 0;
                    while(true)
                    {
                        DatagramPacket pack = new DatagramPacket(buf, 4096);
                        sock.receive(pack);

                        //Decoding:                         
                        int size = pack.getLength();
                        //Log.d( "Player", "Player: "+ size +", "+pack.getLength() + ", "+pack.getOffset() );
                        byte[] byteArray = new byte[size*2];

                        decoder.decode(pack.getData(), 0, size, byteArray);
                        track.write(byteArray, 0, byteArray.length);

                        if( frame++ > 3 )
                            track.play();
                    }
                }
                catch (SocketException se)
                {
                    Log.e("Error", "SocketException: " + se.toString());
                }
                catch (IOException ie)
                {
                    Log.e("Error", "IOException" + ie.toString());
                }
            } // end run
        });
        thrd.start();
    }

And this is the class that encodes/decodes in aLaw format:

public class CMG711
{
/** decompress table constants */
private static short aLawDecompressTable[] = new short[]
{ -5504, -5248, -6016, -5760, -4480, -4224, -4992, -4736, -7552, -7296, -8064, -7808, -6528, -6272, -7040, -6784, -2752, -2624, -3008, -2880, -2240, -2112, -2496, -2368, -3776, -3648, -4032, -3904, -3264, -3136, -3520, -3392, -22016, -20992, -24064, -23040, -17920, -16896, -19968, -18944, -30208, -29184, -32256, -31232, -26112, -25088, -28160, -27136, -11008, -10496, -12032, -11520, -8960, -8448, -9984, -9472, -15104, -14592, -16128, -15616, -13056, -12544, -14080, -13568, -344, -328, -376,
        -360, -280, -264, -312, -296, -472, -456, -504, -488, -408, -392, -440, -424, -88, -72, -120, -104, -24, -8, -56, -40, -216, -200, -248, -232, -152, -136, -184, -168, -1376, -1312, -1504, -1440, -1120, -1056, -1248, -1184, -1888, -1824, -2016, -1952, -1632, -1568, -1760, -1696, -688, -656, -752, -720, -560, -528, -624, -592, -944, -912, -1008, -976, -816, -784, -880, -848, 5504, 5248, 6016, 5760, 4480, 4224, 4992, 4736, 7552, 7296, 8064, 7808, 6528, 6272, 7040, 6784, 2752, 2624,
        3008, 2880, 2240, 2112, 2496, 2368, 3776, 3648, 4032, 3904, 3264, 3136, 3520, 3392, 22016, 20992, 24064, 23040, 17920, 16896, 19968, 18944, 30208, 29184, 32256, 31232, 26112, 25088, 28160, 27136, 11008, 10496, 12032, 11520, 8960, 8448, 9984, 9472, 15104, 14592, 16128, 15616, 13056, 12544, 14080, 13568, 344, 328, 376, 360, 280, 264, 312, 296, 472, 456, 504, 488, 408, 392, 440, 424, 88, 72, 120, 104, 24, 8, 56, 40, 216, 200, 248, 232, 152, 136, 184, 168, 1376, 1312, 1504, 1440, 1120,
        1056, 1248, 1184, 1888, 1824, 2016, 1952, 1632, 1568, 1760, 1696, 688, 656, 752, 720, 560, 528, 624, 592, 944, 912, 1008, 976, 816, 784, 880, 848 };

private final static int cClip = 32635;
private static byte aLawCompressTable[] = new byte[]
{ 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7 };

public int encode( byte[] src, int offset, int len, byte[] res )
{
    int j = offset;
    int count = len / 2;
    short sample = 0;

    for ( int i = 0; i < count; i++ )
    {
        sample = (short) ( ( ( src[j++] & 0xff ) | ( src[j++] ) << 8 ) );
        res[i] = linearToALawSample( sample );
    }
    return count;
}

private byte linearToALawSample( short sample )
{
    int sign;
    int exponent;
    int mantissa;
    int s;

    sign = ( ( ~sample ) >> 8 ) & 0x80;
    if ( !( sign == 0x80 ) )
    {
        sample = (short) -sample;
    }
    if ( sample > cClip )
    {
        sample = cClip;
    }
    if ( sample >= 256 )
    {
        exponent = (int) aLawCompressTable[( sample >> 8 ) & 0x7F];
        mantissa = ( sample >> ( exponent + 3 ) ) & 0x0F;
        s = ( exponent << 4 ) | mantissa;
    }
    else
    {
        s = sample >> 4;
    }
    s ^= ( sign ^ 0x55 );
    return (byte) s;
}

public void decode( byte[] src, int offset, int len, byte[] res )
{
    int j = 0;
    for ( int i = 0; i < len; i++ )
    {
        short s = aLawDecompressTable[src[i + offset] & 0xff];
        res[j++] = (byte) s;
        res[j++] = (byte) ( s >> 8 );
    }
}
}
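As an independent cross-check of a class like CMG711, the following sketch computes the A-law transfer arithmetically, per the G.711 specification, instead of using lookup tables, and round-trips samples through it. It keeps the same sign/0x55 conventions as the class above, but it is an illustrative assumption of mine, not part of the original code; as with µ-law, the round-trip error should stay within one quantization step (at most about 512 near full scale).

```java
// A-law round-trip check computed arithmetically (illustrative sketch,
// same sign and 0x55-toggle conventions as the CMG711 class above).
public class ALawRoundTrip {
    static final int CLIP = 32635;

    static byte linearToALaw(short pcm) {
        int sign = ((~pcm) >> 8) & 0x80;       // 0x80 for non-negative samples
        int mag = (pcm < 0) ? -pcm : pcm;
        if (mag > CLIP) mag = CLIP;
        int s;
        if (mag >= 256) {
            // exponent 1..7 = position of the highest set bit of (mag >> 8)
            int exponent = 32 - Integer.numberOfLeadingZeros((mag >> 8) & 0x7F);
            int mantissa = (mag >> (exponent + 3)) & 0x0F;
            s = (exponent << 4) | mantissa;
        } else {
            s = mag >> 4;                      // first segment is linear
        }
        return (byte) (s ^ sign ^ 0x55);       // A-law toggles even bits on the wire
    }

    static short aLawToLinear(byte alaw) {
        int a = (alaw ^ 0x55) & 0xFF;          // undo the even-bit toggle
        int exponent = (a >> 4) & 0x07;
        int mantissa = a & 0x0F;
        int mag = (exponent == 0)
                ? (mantissa << 4) + 8
                : ((mantissa << 4) + 0x108) << (exponent - 1);
        return (short) (((a & 0x80) != 0) ? mag : -mag);
    }

    public static void main(String[] args) {
        int maxErr = 0;
        for (int s = -32000; s <= 32000; s += 7) {
            short back = aLawToLinear(linearToALaw((short) s));
            maxErr = Math.max(maxErr, Math.abs(back - s));
        }
        System.out.println("max round-trip error: " + maxErr);
    }
}
```

Note also that CMG711.encode assembles each 16-bit sample in little-endian order (low byte first), which matches what AudioRecord.read delivers into a byte[] on Android, so N PCM bytes become N/2 A-law bytes and the receiver expands them back.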

I hope this is useful to some of you! Thanks anyway for the help received, especially to bonnyz.

Padang answered 12/5, 2014 at 7:55 Comment(3)
Thank you Juan Pedro Martinez! I had some issues with encoding and decoding and this helped me rectify them. – Leban
@Juan Pedro I have implemented your encode and decode, but the output is really bad; there is way too much noise. Did you rectify it? – Donnelly
Thank you very much for sharing your code. In the end I couldn't use the PCM aLaw, but the raw PCM was very helpful in solving my problem. – Barraza

What sampleRate values have you tried? The sample rate (in both playback and recording) is very important because it affects the whole audio pipeline, and only a few setups are guaranteed to work on every device (I'm sure of 44100). Also, keep in mind that you cannot specify an arbitrary sampleRate (like 4000), because it will (or should) be scaled to the nearest supported sample rate. Similar considerations apply to the buffer size.

My guess is that a wrong setup of the pipeline produces sound artifacts which degenerate further after the "compression" step.

What happens if you set up your clients with 44100?

Can you try querying the AudioManager and then testing various supported sampleRate/buffer-size values?

AudioManager audioManager = (AudioManager) this.getSystemService(Context.AUDIO_SERVICE);
String rate = audioManager.getProperty(AudioManager.PROPERTY_OUTPUT_SAMPLE_RATE);
String size = audioManager.getProperty(AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER);
Leadin answered 29/4, 2014 at 20:56 Comment(2)
I have been trying different sample rates and buffer sizes and the result is really bad. I am pretty sure the issue is related to the coding/decoding code, maybe also to converting an array of bytes into one of shorts. Still trying things without success. Thanks anyway, bonnyz. – Padang
About this: did you try to encode/decode a sample buffer using that class (G711UCodec)? For example, if you start with s00000001wxyz, do you get s000wxyz and then s00000001wxyz again? – Leadin
