Now that's a neat idea, especially for a mobile app.
I'd probably start offline, using a .wav file as input, to get the effects working the way I wanted. You can use any high-level language for this, but you probably want something that will map reasonably well onto C/C++.
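To give a rough idea of what that offline prototype could look like, here's a minimal sketch in Python using only the standard library. The effect (ring modulation, which gives a crude "robot voice") is just my pick for illustration, and the names `ring_modulate`, `process_wav` and `carrier_hz` are made up for this example. It assumes a 16-bit mono PCM file; the inner loop is deliberately plain so it would port to C/C++ almost line for line.

```python
import array
import math
import sys
import wave


def ring_modulate(samples, sample_rate, carrier_hz=30.0):
    """Multiply each sample by a sine carrier (a simple 'robot voice' effect)."""
    out = array.array("h")
    step = 2.0 * math.pi * carrier_hz / sample_rate  # carrier phase step per sample
    for n, s in enumerate(samples):
        # |sin| <= 1, so the result always stays within the 16-bit range.
        out.append(int(s * math.sin(step * n)))
    return out


def process_wav(in_path, out_path, carrier_hz=30.0):
    """Read a 16-bit mono WAV, apply the effect, and write the result."""
    with wave.open(in_path, "rb") as src:
        if src.getsampwidth() != 2 or src.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    processed = ring_modulate(samples, rate, carrier_hz)

    if sys.byteorder == "big":
        processed.byteswap()
    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(processed.tobytes())
```

You'd call it as something like `process_wav("voice.wav", "morphed.wav", carrier_hz=50.0)` (file names hypothetical), listen to the result, and tweak from there. Once you're happy with the sound, the body of `ring_modulate` is the part you'd carry over to the native version.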
For a production version, I'd go native and write it in C or C++. You want something fast for real-time audio processing, and I like to avoid dependencies on things like .NET for distribution. (Not that I have anything against .NET; it's great for servers and for distribution within a company, but I'm not so keen on having it as a dependency for shrink-wrapped software.)
Windows DirectShow would be a tempting option. If you implemented the voice morpher as a DirectShow filter, you could also do some interesting effects with multimedia.