Modifying the pitch of an audio file at specific positions in the file, using the Java code below

Asked 18/3, 2015 at 6:20 Answered 19/3, 2015 at 1:58

I want to modify the pitch of an audio clip dynamically, based on user input, at different time instants. Say the user asks to change the pitch of the audio after 10 seconds: how can I achieve that?

I found this link that describes how to modify the pitch, but I want to repeat this process at different time instants of the audio clip, and only for a short duration each time. Can anybody guide me on this?

Some Edits

Edit 1

I found this code, as I mentioned previously:

// Assumes static imports of AudioSystem.getAudioInputStream
// and AudioFormat.Encoding.PCM_SIGNED.
// source file
final File file1 = new File("Sample.mp3");
// destination file
final File file2 = new File("Sample_cat.wav");
// audio stream of file1
final AudioInputStream in1 = getAudioInputStream(file1);
// build the target format from the source format
final AudioFormat outFormat = getOutFormat(in1.getFormat());
// get the converted audio stream in the target format
final AudioInputStream in2 = getAudioInputStream(outFormat, in1);
// write the converted audio to the destination file
AudioSystem.write(in2, AudioFileFormat.Type.WAVE, file2);

// change the sample rate declared in the audio format
private AudioFormat getOutFormat(AudioFormat inFormat) {
    int ch = inFormat.getChannels();
    float rate = inFormat.getSampleRate();
    return new AudioFormat(PCM_SIGNED, 72000, 16, ch, ch * 2, rate,
            inFormat.isBigEndian());
}

Edit 2

I found another piece of code that sets the positions in the audio file where playback should start and stop.

File audioFile = new File(audioFilePath);
AudioInputStream audioStream = AudioSystem.getAudioInputStream(audioFile);
AudioFormat format = audioStream.getFormat();
DataLine.Info info = new DataLine.Info(Clip.class, format);

Clip audioClip = (Clip) AudioSystem.getLine(info);
audioClip.open(audioStream);
audioClip.setLoopPoints(10_000, 500_000);
audioClip.loop(1);

Now, how can I change the pitch for the duration set in Edit 2, i.e. 10 ms to 50 ms, using the code in Edit 1?

Can anybody suggest whether I can do the same thing some other way, without Java? Suggestions are welcome. Please help, I am new to this.

Edit 3

The exact problem is described at this link: link

These are the values (in milliseconds) that I am referring to:

// Per-character durations (in milliseconds); declared here so the snippet is complete.
static double[] duration;

public static void convertMsgToAudio(String msg) {
    msg = msg.toUpperCase();
    System.out.println("Msg 2 : " + msg);
    duration = new double[msg.length()];

    for (int i = 0; i < msg.length(); i++) {
        char c = msg.charAt(i);
        if (c >= 'A' && c <= 'T') {
            // A = 50000, B = 100000, ..., T = 1000000 (steps of 50000)
            duration[i] = 50000 * (c - 'A' + 1);
        } else if (c >= 'U' && c <= 'Z') {
            // U = 1100000, V = 1200000, ..., Z = 1600000 (steps of 100000)
            duration[i] = 1000000 + 100000 * (c - 'T');
        }
        // any other character keeps the default duration of 0
    }
}

Comma answered 18/3, 2015 at 6:20 Comment(2)

Can nobody help me with this? – Comma 18/3, 2015 at 7:2

can anybody suggest me if I can do the same thing in any other ways except Java ? then suggestions are welcome... – Comma 18/3, 2015 at 12:31

Java does not expose the data in a Clip for editing, as far as I know.

I've never tried altering the pitch by messing with the sample rate. Maybe that is a good way to go. There is a section of the Java tutorials that covers changes in formatting of wav files: Using Files and Format Converters. Seems like this would be good background info and may even cover the solution you are attempting.

Here is what I do, call it a VarispeedWavPlayer.

(1) have a volatile instance float variable that is a speed factor (1 is same speed, 1.1 is 110%, 0.5 is half speed, etc.)

(2) have a float that will be a running 'tapehead'

(3) start with normal code for reading in from an AudioInputStream and outputting to a SourceDataLine (there is a good example in "Reading Sound Files" in the above Java Tutorial link).

(4) in the area where there is the comment

// Here, do something useful with the audio data that's 
// now in the audioBytes array...

(a) convert the incoming bytes to PCM data.

Example of how to do this, with 16-bit encoding, stereo, little-endian ("CD quality"). This uses a read buffer size of 1024 bytes, which converts to 256 frames, i.e. 512 short values (remember, there are two tracks, left and right), each ranging from -32768 to 32767.

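// Assumes: byte[] rawByteBuffer = new byte[1024]; short[] pcmBuffer = new short[512]; int bytesRead;
while ((bytesRead = audioInputStream.read(rawByteBuffer, 0, 1024)) != -1)
{
    for (int i = 0, n = bytesRead / 2; i < n; i++)
    {
        // low byte first (little-endian); the high byte carries the sign
        pcmBuffer[i] = (short) ( ( rawByteBuffer[i * 2] & 0xff )
                        | ( rawByteBuffer[(i * 2) + 1] << 8 ) );
    }
}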

The above was edited to be clear, and can use some performance optimizations.

(b) fetch the current "speedfactor" and write a loop that iterates through the PCM frame values (keeping in mind that with stereo, the "next" frame for that track is +2):

tapehead += speedfactor;

(c) this will usually land at a fractional value. Use linear interpolation to calculate a value at that intermediate spot. For example, if you land at tapehead = 10.25, and frame 10 is 0.5, and frame 11 is 0.6, you'd return 0.525.

(d) convert the interpolated values back to bytes (reverse of step 4a)

(e) accumulate the bytes and send them out via the SourceDataLine.
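Steps (a) through (d) could be sketched roughly as below. This is only an illustration under assumptions, not the actual VarispeedWavPlayer code: the class and method names (VarispeedSketch, bytesToPcm, resample, pcmToBytes) are made up for the example, it assumes 16-bit little-endian PCM, it works on one whole buffer at a time, and it uses a double for the tapehead. Carrying the fractional tapehead across buffer boundaries is left out.

```java
class VarispeedSketch {

    /** Step (a): decode 16-bit little-endian bytes into signed samples. */
    static short[] bytesToPcm(byte[] raw, int length) {
        short[] pcm = new short[length / 2];
        for (int i = 0; i < pcm.length; i++) {
            pcm[i] = (short) ((raw[2 * i] & 0xff) | (raw[2 * i + 1] << 8));
        }
        return pcm;
    }

    /** Steps (b) and (c): advance a fractional tapehead and interpolate linearly. */
    static short[] resample(short[] pcm, int channels, double speedFactor) {
        int frames = pcm.length / channels;
        if (frames == 0) {
            return new short[0];
        }
        // Number of tapehead positions that fit between frame 0 and the last frame.
        int outFrames = (int) Math.floor((frames - 1) / speedFactor) + 1;
        short[] out = new short[outFrames * channels];
        double tapehead = 0.0;
        for (int f = 0; f < outFrames; f++) {
            int base = Math.min((int) tapehead, frames - 1);
            int next = Math.min(base + 1, frames - 1);
            double frac = tapehead - base;
            // Interleaved audio: the same track's next frame is `channels` samples ahead.
            for (int c = 0; c < channels; c++) {
                double a = pcm[base * channels + c];
                double b = pcm[next * channels + c];
                out[f * channels + c] = (short) Math.round(a + frac * (b - a));
            }
            tapehead += speedFactor;
        }
        return out;
    }

    /** Step (d): encode samples back to 16-bit little-endian bytes. */
    static byte[] pcmToBytes(short[] pcm) {
        byte[] raw = new byte[pcm.length * 2];
        for (int i = 0; i < pcm.length; i++) {
            raw[2 * i] = (byte) pcm[i];            // low byte
            raw[2 * i + 1] = (byte) (pcm[i] >> 8); // high byte
        }
        return raw;
    }
}
```

With a speed factor below 1 the output has more frames than the input (slower, lower pitch); above 1 it has fewer (faster, higher pitch). The bytes from pcmToBytes are what would be written to the SourceDataLine in step (e).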

I've left out some details in terms of managing the fact that the input and output bytebuffers will not match one-to-one. But if you grasp the basic concept, then I think you will be able to solve this.

Note, this will only update the speed at the point when the "speedFactor" variable is consulted, once per input buffer. So you don't want the input buffer to be overly large.
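To put a rough number on "overly large", the amount of audio held by one read buffer can be computed from the buffer size and the format. The helper below is a hypothetical sketch (the name bufferMillis is not from any library) and assumes a fixed frame size, such as 4 bytes per frame for 16-bit stereo.

```java
class BufferLatencySketch {

    /** Milliseconds of audio contained in one read buffer. */
    static double bufferMillis(int bufferBytes, int bytesPerFrame, float framesPerSecond) {
        int frames = bufferBytes / bytesPerFrame;
        return 1000.0 * frames / framesPerSecond;
    }
}
```

For the 1024-byte buffer used above, at 44100 frames per second and 4 bytes per frame, this comes to about 5.8 ms, so a speed change would be picked up within a few milliseconds.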

Quotha answered 19/3, 2015 at 1:58 Comment(16)

thanks for the response :) But I don't have a good grasp of audio .. specifically I did not understand how to convert the incoming bytes to PCM data .. – Comma 19/3, 2015 at 5:19

and afterwards ... so if you can provide me the code for this rather than the pseudocode that will be a great help ... as I am running out of time and I am not very good at java .. but have no other choice .. – Comma 19/3, 2015 at 5:30

Each of the steps has been addressed at StackOverflow. I think you will have some success if you search for them. My time is short as well. It may be that you have bit off a bit more than you are ready for, as this does require more than just a beginner's skill level of Java coding. – Quotha 19/3, 2015 at 6:17

thanks for your advice I will do a bit on this .. can you tell me one more thing ,... I have stated the exact problem on #29138669 .. can you please tell me if the way that you are saying will help me achieve this .. ? thanks in advance :) – Comma 19/3, 2015 at 7:5

To take two bytes (assumes 16-bit encoding, little-endian) and convert to PCM ranging from -32768 to 32767, try this line of code: int audioVal = ( buffer[i+1] << 8 ) | ( buffer[i] & 0xff ) ; // I tried a search and am having trouble finding the posts that had similar code. I know they used to be here! – Quotha 19/3, 2015 at 7:31

I do think the method I described can do this. What I have written in the past is this: code to play back a wav file via a SourceDataLine at a given speed (faster, slower). I haven't done the part where one changes the speed part way through before. But your description of giving "kill bill" is slightly easier to implement than what I described which was for real-time pitch alteration effects. – Quotha 19/3, 2015 at 7:34

I have edited my question.. do look at it once .. and one more thing: is the approach that you are telling me not similar to the answer posted on the link that I gave .. although it is in python ? – Comma 19/3, 2015 at 8:14

is this the link you were talking about ? #12120899 – Comma 19/3, 2015 at 8:34

can you now provide me the code.. sorry for asking like this but I have tried many different things and they are not working .. not this but some other things .. it will be great if you can help me with the code .. – Comma 19/3, 2015 at 18:14

I added code at 4a to help with converting 16-bit encoded bytes to PCM. This answer is a different approach from the reply given in your Python version. I do not have experience with that approach (see 2nd paragraph). If you really are desperate for someone to write code for you, you might consider adding a bounty or hiring a tutor/coder. The more you ask for people to provide code for you, the greater the risk of being labelled a "Help Vampire" meta.stackexchange.com/questions/19665/the-help-vampire-problem – Quotha 19/3, 2015 at 19:24

@Ohil Freihofner : thanks for advice :) but issue is not that I cannot code, issue is that I am not able to understand what are you trying to do in that code... without understanding the logic you cannot code anything .. if I am not wrong ? I tried understanding linear interpolation.. but in sound it was totally different code than the one one that guy wrote in python who was also doing linear interpolation .. thanks for all your help till now.. believe me I am not a kind of person who will keep asking for code but a small thing would have helped at this stage .. :) – Comma 19/3, 2015 at 21:5

hey I have started understanding a bit .. about the code .. just to clarify ... is tapehead the variable i.e. jumping to the desired sample whose pitch we want to change and speedfactor is the variable which represents the number of frames which help in jumping to the desired frame ? – Comma 19/3, 2015 at 21:34

Yes, I think you are getting it. It's a lot to digest if this is new! Analogy: tape head reading a tape flowing by. With normal reading (if tape were digital stream of digits instead of an analog wave) tape advances 1 space (used for 1 frame) and the tapeHead reads the frame. With accelerated reading, tape advances, say, 1.7 spaces (instead of 1) and tapeHead interpolates that value that would be between frame 1 and frame 2. Sometimes the term "cursor" is used for a variable that progressively moves through an array. tapeHead is a cursor. Linear interpolation is good enough for decent audio. – Quotha 20/3, 2015 at 0:42

I am sorry but can you tell me one more thing .. the duration array that I mentioned in my question has the values in milliseconds ... so first should I convert them into frames ? this duration will be equivalent to the speed factor right ? If my initial tapehead is at frame 1 and speedfactor (= some duration value) is some floating value then adding it to frame 1 gives me say frame 10 , then will linearly interpolating frame 1 and frame 10 change the pitch for all frames from 1-10 ? – Comma 20/3, 2015 at 5:3

I just want once you convert the data into PCM sample values .. iterate through duration array such that ... all the samples that lie in each of the durations are modified with the new pitch value .. I think now you will understand better what I want to say ... Pls clarify on this :) – Comma 20/3, 2015 at 5:5

Incoming audio is 44100 frames per second stereo, probably (depends on your source files), and never changes. If you want milliseconds 1000 to 3000 to go at 3/4 speed of the source, then when you get to frame 44100, start incrementing the tapeHead by 0.75, until you cross frame 132300 (3 * 44100). If you count incoming bytes instead of frames, multiply the above by 4 (assuming 4 bytes per frame). When your tapeHead crosses the threshold value for the next speed, calculate the new speedIncrement and continue. If you want to go 1/4 FASTER than the source, the speedIncrement would be 1.25. – Quotha 20/3, 2015 at 15:54
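The arithmetic in the comment above can be sketched as follows. This is an illustration only: the names (SpeedScheduleSketch, millisToFrame, frameToByteOffset, incrementAt) are hypothetical, and it assumes a single slowed or sped-up segment with normal speed everywhere else.

```java
class SpeedScheduleSketch {

    /** Converts a time offset in milliseconds to a source frame index. */
    static long millisToFrame(double millis, float framesPerSecond) {
        return Math.round(millis * framesPerSecond / 1000.0);
    }

    /** Converts a frame index to a byte offset (4 bytes per frame for 16-bit stereo). */
    static long frameToByteOffset(long frame, int bytesPerFrame) {
        return frame * bytesPerFrame;
    }

    /**
     * Tapehead increment to use at a given source frame: `speed` inside
     * [startMs, endMs), and 1.0 (unchanged) outside that range.
     */
    static double incrementAt(long frame, double startMs, double endMs,
                              double speed, float framesPerSecond) {
        long start = millisToFrame(startMs, framesPerSecond);
        long end = millisToFrame(endMs, framesPerSecond);
        return (frame >= start && frame < end) ? speed : 1.0;
    }
}
```

At 44100 frames per second, milliseconds 1000 to 3000 map to frames 44100 to 132300, so incrementAt returns 0.75 for frames in that range when called with a speed of 0.75, and 1.0 before and after it. The millisecond values from the duration array in the question would need the same conversion before being compared with the tapehead.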
