text-to-speech-to-wav in Delphi

I imported the SAPI type library into Delphi. I can output speech to the PC speakers with this code:

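procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
begin
  Voice := TSpVoice.Create(nil);
  try
    // Flags = 0 is the default (synchronous) mode: Speak returns after the text is spoken
    Voice.Speak('Hello World!', 0);
  finally
    Voice.Free; // created with a nil owner, so free it explicitly
  end;
end;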

I can output speech to a .wav file with this code:

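procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
  Stream: TSpFileStream;
begin
  Voice := TSpVoice.Create(nil);
  Stream := TSpFileStream.Create(nil);
  try
    // Create (or overwrite) the .wav file; no format is set, so SAPI uses its default
    Stream.Open('c:\temp\test.wav', SSFMCreateForWrite, False);
    // Send the voice output to the file instead of the speakers
    Voice.AudioOutputStream := Stream.DefaultInterface;
    Voice.Speak('Hello World!', 0);
    Stream.Close;
  finally
    // Both objects were created with a nil owner, so free them explicitly
    Stream.Free;
    Voice.Free;
  end;
end;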

The problem is that when I play back the .wav file it sounds terrible, like it's using a really low bitrate. Audacity tells me the file is mono 16-bit 22.05kHz but it sounds much worse than that.

How do I output speech to a mono 16-bit 44.1kHz .wav file that will sound exactly the same as speech output directly to the PC speakers? I could not figure out how to modify the second code sample to set the bits per sample and the bitrate.

Follow-up: Glenn's answer solves the bitrate issue. Thanks for that. But the quality of the speech output to the .wav file is still inferior to what is output directly to the speakers. I used screen recording software to record the output from the first block of code as helloworldtospeakers.wav. The second block of code, with Glenn's line added, produces helloworldtowav.wav. The second file clearly has some distortion in it. Any ideas?

Pule answered 14/10, 2012 at 4:54

Comment (Zoril):
I imported it as well into XE8 but can't get it to speak at all and it's asking for 3 parameters: pwcs: PWideChar; dwFlags: Cardinal; out pulStreamNumber: Cardinal

See the Format attribute on your file stream object. It's an SpAudioFormat type which has a Type property you use to set the audio format. That's an enumerated type, which has a great many options, so you'll need to study them to get what you want.

This line should do it for you (at least with the version of the type library I used).

Stream.Format.Type_ := SAFT44kHz16BitMono;
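
For context, here is a rough sketch of how that line might slot into the second code sample from the question. It assumes the unit generated by the type library import (often named SpeechLib_TLB, but yours may differ) is in the uses clause, and it assumes the format has to be assigned before the file is opened so that it ends up in the .wav header. The try/finally cleanup is an addition for illustration and not part of the original code.

```delphi
// Sketch only: assumes the imported SAPI unit (e.g. SpeechLib_TLB) is in the uses clause.
procedure TForm1.Button1Click(Sender: TObject);
var
  Voice: TSpVoice;
  Stream: TSpFileStream;
begin
  Voice := TSpVoice.Create(nil);
  Stream := TSpFileStream.Create(nil);
  try
    // Assumption: choose the audio format before opening the file
    Stream.Format.Type_ := SAFT44kHz16BitMono;
    Stream.Open('c:\temp\test.wav', SSFMCreateForWrite, False);
    // Send the voice output to the file instead of the speakers
    Voice.AudioOutputStream := Stream.DefaultInterface;
    Voice.Speak('Hello World!', 0);
    Stream.Close;
  finally
    // Objects were created with a nil owner, so free them explicitly
    Stream.Free;
    Voice.Free;
  end;
end;
```

Other members of the same enumeration follow the same naming pattern (for example SAFT22kHz16BitMono), so swapping the constant should be enough to try a different sample rate or channel count.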
Dottiedottle answered 14/10, 2012 at 6:35
