C++ Microsoft SAPI: How to set Windows text-to-speech output to a memory buffer?

Asked 7/5, 2010 at 3:47 Answered 1/6, 2014 at 7:3

I have been trying to figure out how to "speak" text into a memory buffer using Windows SAPI 5.1, but so far without success, even though it seems like it should be quite simple.

There is an example of streaming the synthesized speech into a .wav file, but no examples of how to stream it to a memory buffer.

In the end I need the synthesized speech in a char* buffer in 16 kHz 16-bit little-endian PCM format. Currently I create a temporary .wav file, redirect the speech output there, and then read it back, but that seems like a rather clumsy solution.

Does anyone know how to do that?

Thanks!

Scrambler answered 7/5, 2010 at 3:47 Comment(1)

did you manage to do it? – Goldstone 31/5, 2014 at 10:29

Look at ISpStream::SetBaseStream. Here's a little helper:

inline HRESULT SPCreateStreamOnHGlobal(
                    HGLOBAL hGlobal,            //Memory handle for the stream object
                    BOOL fDeleteOnRelease,      //Whether to free memory when the object is released
                    const WAVEFORMATEX * pwfex, //WaveFormatEx for stream
                    ISpStream ** ppStream)      //Address of variable to receive ISpStream pointer
{
    HRESULT hr;
    IStream * pMemStream;
    *ppStream = NULL;
    hr = ::CreateStreamOnHGlobal(hGlobal, fDeleteOnRelease, &pMemStream);
    if (SUCCEEDED(hr))
    {
        hr = ::CoCreateInstance(CLSID_SpStream, NULL, CLSCTX_ALL, __uuidof(*ppStream), (void **)ppStream);
        if (SUCCEEDED(hr))
        {
            hr = (*ppStream)->SetBaseStream(pMemStream, SPDFID_WaveFormatEx, pwfex);
            if (FAILED(hr))
            {
                (*ppStream)->Release();
                *ppStream = NULL;
            }
        }
        pMemStream->Release();
    }
    return hr;
}
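
The helper expects the caller to describe the audio format through pwfex. As a rough sketch, the fields for the 16 kHz 16-bit PCM format the question asks for could be derived as shown below. PcmFormat and MakePcmFormat are hypothetical names: PcmFormat is a stand-in that mirrors the field names of WAVEFORMATEX so the arithmetic can be checked without the Windows headers. Mono output is an assumption, since the question does not state a channel count.

```cpp
#include <cstdint>

// Hypothetical stand-in mirroring the WAVEFORMATEX field names (not the real struct).
struct PcmFormat {
    std::uint16_t wFormatTag;      // 1 == WAVE_FORMAT_PCM
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;          // extra bytes after the struct; 0 for plain PCM
};

// Derive the dependent fields from sample rate, bit depth and channel count.
inline PcmFormat MakePcmFormat(std::uint32_t sampleRate,
                               std::uint16_t bitsPerSample,
                               std::uint16_t channels)
{
    PcmFormat f{};
    f.wFormatTag      = 1;  // WAVE_FORMAT_PCM
    f.nChannels       = channels;
    f.nSamplesPerSec  = sampleRate;
    f.wBitsPerSample  = bitsPerSample;
    // Bytes per sample frame (all channels together).
    f.nBlockAlign     = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    // Bytes of audio per second of speech.
    f.nAvgBytesPerSec = sampleRate * f.nBlockAlign;
    f.cbSize          = 0;
    return f;
}
```

With MakePcmFormat(16000, 16, 1) the block alignment is 2 bytes and the data rate is 32000 bytes per second. Copying the same values into a real WAVEFORMATEX gives a candidate pwfex argument for the helper.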

Leasehold answered 11/5, 2010 at 21:2 Comment(3)

Eric, how can you find out the size you need for the GlobalAlloc call to get the HGLOBAL memory handle? I am guessing it would vary depending on how much speech is spoken, but how can you find this out? – Carnation 13/3, 2013 at 2:41

You don't need to. The memory stream managed by ::CreateStreamOnHGlobal will reallocate the memory as needed. – Leasehold 15/3, 2013 at 19:44

I used this example as a basis for my implementation of streaming speech to a buffer. But when reading from the IStream object I always get zero bytes read. Looking at the stream object, bytes had been written (checked using IStream::Stat). Do I need to use IStream::LockRegion to get the data? – Pubilis 14/7, 2017 at 8:45

I accomplished it using ISpStream. Use ISpStream::SetBaseStream to bind the ISpStream to an IStream, and then set the output of the ISpVoice to that ISpStream.

Here is my working solution if anybody wants it:

https://github.com/itsyash/MS-SAPI-demo
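
Independently of the linked repository, the steps described in this thread can be sketched roughly as follows. This is not the code from that repository, only an untested illustration under stated assumptions: SpeakToBuffer and DecodePcm16LE are hypothetical names, COM is assumed to be initialized on the calling thread, and the stream is expected to hold raw sample data without a WAV header, which is worth verifying on your own setup. The SAPI part is guarded with _WIN32 because it only compiles on Windows. It also rewinds the memory stream before reading; a seek pointer left at the end of the data after Speak is a likely cause of reads that return zero bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable part: interpret raw bytes as 16-bit little-endian PCM samples.
// A trailing odd byte, if any, is ignored.
inline std::vector<std::int16_t> DecodePcm16LE(const std::vector<unsigned char>& bytes)
{
    std::vector<std::int16_t> samples;
    samples.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const std::uint16_t u =
            static_cast<std::uint16_t>(bytes[i] | (bytes[i + 1] << 8));
        samples.push_back(static_cast<std::int16_t>(u));
    }
    return samples;
}

#ifdef _WIN32
#include <windows.h>
#include <sapi.h>   // assumption: project links against ole32.lib and sapi.lib

// Speaks `text` into a growable memory stream and copies the bytes to `out`.
// Assumes CoInitializeEx has already succeeded on this thread.
inline HRESULT SpeakToBuffer(const wchar_t* text, const WAVEFORMATEX& fmt,
                             std::vector<unsigned char>& out)
{
    IStream*   mem    = nullptr;
    ISpStream* stream = nullptr;
    ISpVoice*  voice  = nullptr;

    // NULL HGLOBAL: the stream allocates and reallocates its own memory.
    HRESULT hr = ::CreateStreamOnHGlobal(nullptr, TRUE, &mem);
    if (SUCCEEDED(hr))
        hr = ::CoCreateInstance(CLSID_SpStream, nullptr, CLSCTX_ALL,
                                __uuidof(ISpStream), reinterpret_cast<void**>(&stream));
    if (SUCCEEDED(hr))
        hr = stream->SetBaseStream(mem, SPDFID_WaveFormatEx, &fmt);
    if (SUCCEEDED(hr))
        hr = ::CoCreateInstance(CLSID_SpVoice, nullptr, CLSCTX_ALL,
                                __uuidof(ISpVoice), reinterpret_cast<void**>(&voice));
    if (SUCCEEDED(hr))
        hr = voice->SetOutput(stream, TRUE);
    if (SUCCEEDED(hr))
        hr = voice->Speak(text, SPF_DEFAULT, nullptr);  // synchronous by default

    STATSTG stat = {};
    if (SUCCEEDED(hr))
        hr = mem->Stat(&stat, STATFLAG_NONAME);         // total bytes written
    if (SUCCEEDED(hr)) {
        LARGE_INTEGER zero = {};
        hr = mem->Seek(zero, STREAM_SEEK_SET, nullptr); // rewind before reading
    }
    if (SUCCEEDED(hr)) {
        out.resize(static_cast<std::size_t>(stat.cbSize.QuadPart));
        ULONG got = 0;
        hr = mem->Read(out.data(), static_cast<ULONG>(out.size()), &got);
        out.resize(got);
    }

    if (voice)  voice->Release();
    if (stream) stream->Release();
    if (mem)    mem->Release();
    return hr;
}
#endif
```

On Windows, a caller would fill a WAVEFORMATEX for the desired format, call SpeakToBuffer, and then either use the bytes directly as the char* buffer or pass them to DecodePcm16LE to get signed samples.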

Goldstone answered 1/6, 2014 at 7:3 Comment(0)

Do you know how to create a memory-mapped file? You could see if the ISpStream will bind to it.

Ferrotype answered 7/5, 2010 at 4:13 Comment(0)
