Algorithm and package to modify the pitch of the sound for certain durations repeatedly

I want to create a new audio file from an existing one, modifying the pitch over different time ranges of the file. For example, if the file is 36 seconds long, I want to change the pitch by one value for the first 2 seconds, by another value from the 6th to the 9th second, and so on.

Basically, I am trying to modify the audio file based on a text message that the user gives. Say the user inputs "kill bill": for each character in the message (k, i, l, b, ...) I look up a duration in a table covering the 26 letters a, b, c, d, ... and store it in an array. Based on these durations, I want to modify the file over the corresponding ranges. The issue is that I don't have much hands-on experience with audio; I even tried doing this in Java but was unable to.

Is there some other parameter that could be changed in an audio file without making the change very noticeable?

I am referring to the values below. The code is in Java, but just ignore that; I will port it to Python later. Values are in milliseconds.

public static void convertMsgToAudio(String msg){

        int len = msg.length();
        duration = new double[len];
        msg = msg.toUpperCase();
        System.out.println("Msg 2 : " + msg);

        int i;
        //char ch;
        for(i=0;i<msg.length();i++){

            if(msg.charAt(i) == 'A'){
                duration[i] = 50000;
            }
            else if (msg.charAt(i) == 'B'){
                duration[i] = 100000; // value in milliseconds 
            }
            else if (msg.charAt(i) == 'C'){
                duration[i] = 150000;
            }
            else if (msg.charAt(i) == 'D'){
                duration[i] = 200000;               
            }
            else if (msg.charAt(i) == 'E'){
                duration[i] = 250000;
            }
            else if (msg.charAt(i) == 'F'){
                duration[i] = 300000;
            }
            else if (msg.charAt(i) == 'G'){
                duration[i] = 350000;
            }
            else if (msg.charAt(i) == 'H'){
                duration[i] = 400000;
            }
            else if (msg.charAt(i) == 'I'){
                duration[i] = 450000;
            }
            else if (msg.charAt(i) == 'J'){
                duration[i] = 500000;
            }
            else if (msg.charAt(i) == 'K'){
                duration[i] = 550000;
            }
            else if (msg.charAt(i) == 'L'){
                duration[i] = 600000;
            }
            else if (msg.charAt(i) == 'M'){
                duration[i] = 650000;
            }
            else if (msg.charAt(i) == 'N'){
                duration[i] = 700000;
            }
            else if (msg.charAt(i) == 'O'){
                duration[i] = 750000;
            }
            else if (msg.charAt(i) == 'P'){
                duration[i] = 800000;
            }
            else if (msg.charAt(i) == 'Q'){
                duration[i] = 850000;
            }
            else if (msg.charAt(i) == 'R'){
                duration[i] = 900000;
            }
            else if (msg.charAt(i) == 'S'){
                duration[i] = 950000;
            }
            else if (msg.charAt(i) == 'T'){
                duration[i] = 1000000;
            }
            else if (msg.charAt(i) == 'U'){
                duration[i] = 1100000;
            }
            else if (msg.charAt(i) == 'V'){
                duration[i] = 1200000;
            }
            else if (msg.charAt(i) == 'W'){
                duration[i] = 1300000;
            }
            else if (msg.charAt(i) == 'X'){
                duration[i] = 1400000;
            }
            else if (msg.charAt(i) == 'Y'){
                duration[i] = 1500000;
            }
            else if (msg.charAt(i) == 'Z'){
                duration[i] = 1600000;
            }

        }

    }

Now I am trying to do the same in Python. I am very new to audio processing, and this is the first time I have run into problems like this.
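For the Python port, the if/else chain above can be collapsed into a lookup table. The following is only a sketch: the names `DURATIONS` and `convert_msg_to_durations` are made up for illustration, and it assumes that characters outside A-Z should get 0.0, which is what the Java `double[]` default gives.

```python
import string

# Same values as the Java code: A..T go up in steps of 50000,
# U..Z go up in steps of 100000 starting at 1100000.
DURATIONS = {}
for i, ch in enumerate(string.ascii_uppercase[:20]):   # A..T
    DURATIONS[ch] = 50000.0 * (i + 1)
for i, ch in enumerate(string.ascii_uppercase[20:]):   # U..Z
    DURATIONS[ch] = 1100000.0 + 100000.0 * i


def convert_msg_to_durations(msg):
    """Return one duration per character of msg (0.0 for non-letters)."""
    return [DURATIONS.get(ch, 0.0) for ch in msg.upper()]
```

Calling `convert_msg_to_durations("kill bill")` returns a list of nine values, one per character, with the space mapped to 0.0.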

Ambidexterity answered 19/3, 2015 at 6:4 Comment(4)
Maybe I misunderstand your question, but I think the Karplus-Strong algorithm might be of some help here. - Cacuminal
Are you looking for the algorithm, a package recommendation, or both? - Economical
@Economical both. I would not have asked for this, but the issue is I don't have prior knowledge about this at all :( - Ambidexterity
No problem, it's interesting, but the algorithm part might be marginally off-topic. This is getting several close-votes. I say leave it open. - Economical

A simple way is to work on raw PCM data directly; in this format the audio data is just a sequence of values from -32768 to 32767 stored as 2 bytes per sample (assuming 16-bit signed, mono), sampled at regular intervals (e.g. 44100Hz).

To alter the pitch you can just "read" this data faster or slower, e.g. at 45000Hz or 43000Hz, which is easily done with a resampling procedure. For example:

 import struct

 # Read raw PCM: 16-bit signed, mono, 2 bytes per sample
 with open("pcm.raw", "rb") as f:
     data = f.read()
 parsed = struct.unpack("%ih" % (len(data)//2), data)
 # Here parsed is a tuple of numbers

 pos = 0.0     # position in the source file (in samples)
 speed = 1.0   # current read speed = original sampling speed
 result = []

 while pos < len(parsed)-1:
     # Compute a new sample (linear interpolation)
     ip = int(pos)
     v = int(parsed[ip] + (pos - ip)*(parsed[ip+1] - parsed[ip]))
     result.append(v)

     pos += speed     # Next position
     speed += 0.0001  # raise the pitch (per-sample increment, tune as needed)

 # write the result to disk (note the * to unpack the list of samples)
 with open("out.raw", "wb") as f:
     f.write(struct.pack("%ih" % len(result), *result))

This is a very simple approach to the problem. Note, however, that raising the pitch this way also shortens the audio; avoiding that requires more sophisticated math than plain interpolation.

I used this approach, for example, to raise a song by one tone over its length (I wanted to see if this was noticeable... it isn't).
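To change the pitch only during certain time ranges, as the question asks, the same interpolation loop can pick its read speed from a list of segments. This is a rough sketch under assumptions: the function name `resample_segments` and the `(start_ms, end_ms, speed)` tuple layout are invented here, and outside the listed segments the data is read at the original speed of 1.0.

```python
def resample_segments(samples, rate, segments):
    """Resample with linear interpolation.

    samples  : sequence of PCM values
    rate     : sampling rate in Hz
    segments : list of (start_ms, end_ms, speed); speed > 1 raises the
               pitch, speed < 1 lowers it. Times refer to the source audio.
    """
    def speed_at(pos):
        t_ms = pos * 1000.0 / rate   # source time of the current position
        for start_ms, end_ms, speed in segments:
            if start_ms <= t_ms < end_ms:
                return speed
        return 1.0                   # unchanged outside all segments

    result = []
    pos = 0.0
    while pos < len(samples) - 1:
        ip = int(pos)
        frac = pos - ip
        result.append(int(samples[ip] + frac * (samples[ip + 1] - samples[ip])))
        pos += speed_at(pos)
    return result
```

For example, `resample_segments(parsed, 44100, [(0, 2000, 1.02), (6000, 9000, 0.98)])` would read the first 2 seconds slightly faster and seconds 6 to 9 slightly slower. The affected ranges still get shorter or longer, so the output length differs from the input.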

Malaspina answered 19/3, 2015 at 7:15 Comment(4)
Thanks for the response :) May I ask where you are taking the audio source file from? Is pcm.raw the source file, i.e. is the source in PCM format? My file is in .wav format. - Ambidexterity
There are free audio editing programs (e.g. audacity: audacity.sourceforge.net) and also free command line tools (e.g. sox: sox.sourceforge.net) to convert from wav/mp3 to raw and back. - Malaspina
I want pos to follow the duration array from my example above, which is in milliseconds. How can I use it in the code you posted? - Ambidexterity
@POOJAGUPTA: the samples in raw format are uniformly spaced; for example, if the data is sampled at 44100Hz each value covers 0.022675736961451247... milliseconds (i.e. 1000./44100). In other words, 44100 samples are one second of input, and the length of the output will be shorter or longer depending on the pitch variation. Audio editing programs provide options to decide what sampling frequency to use when converting from/to raw. - Malaspina
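As an alternative to converting with an external tool, Python's standard `wave` module can read and write uncompressed .wav files directly. The sketch below assumes a 16-bit mono file; the helper names `ms_to_sample`, `read_wav_mono16` and `write_wav_mono16` are hypothetical. It also shows the millisecond-to-sample conversion described in the last comment.

```python
import struct
import wave


def ms_to_sample(ms, rate):
    """Index of the sample at time `ms` for audio sampled at `rate` Hz."""
    return int(ms * rate / 1000)


def read_wav_mono16(path):
    """Return (samples, rate) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono WAV")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw)), rate


def write_wav_mono16(path, samples, rate):
    """Write a list of 16-bit values as a mono WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

With a 44100Hz file, `ms_to_sample(2000, 44100)` gives 88200, i.e. the first 2 seconds cover samples 0 to 88199.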
