How to incorporate SSML into Python

B

6

5

I need to use SSML to play an audio file with the <audio> tag in my Alexa Skill (as per Amazon's instructions).

The problem is that I don't know how to use SSML with Python. I know I can use it with Java, but I want to build my skills in Python. I've looked all over but haven't found any working examples of SSML in a Python script or program. Does anyone know of one?

Bryson answered 14/4, 2016 at 20:44 Comment(0)

H

5

This was asked two years ago, but maybe someone will benefit from the below.

I've just checked, and if you use the Alexa Skills Kit SDK for Python you can simply add SSML to your response. For example:

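from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_request_type

sb = SkillBuilder()

@sb.request_handler(can_handle_func=is_request_type("LaunchRequest"))
def launch_request_handler(handler_input):
    # Single quotes on the outside keep the double quotes in the SSML attribute legal.
    speech_text = 'Wait for it 3 seconds<break time="3s"/> Buuuu!'
    return handler_input.response_builder.speak(speech_text).response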

Hope this helps.
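
For the original question (playing an audio file), the same approach should carry over to an <audio> tag. Below is a minimal sketch of building that string safely. The helper name build_audio_ssml and the example.com URL are made up for illustration and are not part of the ASK SDK; the resulting string is what you would pass to response_builder.speak(...), as in the handler above.

```python
from xml.sax.saxutils import escape, quoteattr


def build_audio_ssml(intro_text, audio_url):
    """Return SSML markup: escaped intro text followed by an <audio> tag.

    Hypothetical helper for illustration, not part of the ASK SDK.
    """
    # escape() protects &, < and > in the spoken text;
    # quoteattr() wraps the URL in quotes and escapes it for use as an attribute.
    return "{} <audio src={}/>".format(escape(intro_text), quoteattr(audio_url))


# Placeholder URL; Alexa has its own requirements for hosted audio files.
speech_text = build_audio_ssml("Rock & roll:", "https://example.com/audio/clip.mp3")
print(speech_text)
```

Escaping matters because a stray & or < in the spoken text makes the SSML invalid.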

Hosiery answered 1/9, 2018 at 11:53 Comment(0)

F

3

The SSML goes in the response.outputSpeech.ssml attribute. Here is an example object with the other required parameters removed:

{
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak>Welcome to Car-Fu. <audio src=\"https://carfu.com/audio/carfu-welcome.mp3\" /> You can order a ride, or request a fare estimate. Which will it be?</speak>"
    }
  }
}
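
If you are not using an SDK, the same object can be built as a plain Python dict and serialized with json. This is only a sketch: build_ssml_response is a hypothetical helper, and the "version" and "shouldEndSession" fields are my assumption about what the omitted "other required parameters" include.

```python
import json


def build_ssml_response(ssml_body, end_session=True):
    """Build a minimal Alexa-style response dict whose outputSpeech is SSML.

    Hypothetical helper; ssml_body is the markup that goes inside <speak>.
    """
    return {
        "version": "1.0",  # assumed to be among the omitted fields
        "response": {
            "outputSpeech": {
                "type": "SSML",  # tells Alexa to read "ssml" rather than "text"
                "ssml": "<speak>{}</speak>".format(ssml_body),
            },
            "shouldEndSession": end_session,
        },
    }


response = build_ssml_response(
    'Welcome to Car-Fu. <audio src="https://carfu.com/audio/carfu-welcome.mp3" />'
)
# json.dumps takes care of escaping the double quotes inside the SSML string.
print(json.dumps(response, indent=2))
```

Letting json.dumps produce the text avoids hand-escaping the quotes in the src attribute.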

Further reference:

Firm answered 21/4, 2016 at 1:12 Comment(0)

I

3

Install ssml-builder with "pip install ssml-builder", then use it:

from ssml_builder.core import Speech

speech = Speech()
speech.add_text('sample text')
ssml = speech.speak()
print(ssml)

Impede answered 28/9, 2019 at 11:43 Comment(0)

I

2

These comments really helped me figure out how to make SSML work with ask-sdk-python. Instead of nesting double quotes, as in

speech_text = "Wait for it 3 seconds<break time="3s"/> Buuuu!" - from wmatt's comment

I defined variables that represent the start and end of every tag I'm using:

ssml_start = '<speak>'
speech_text = ssml_start + whispered_s + "Here are the latest alerts from MMDA" + whispered_e

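I used single quotes and concatenated those strings into the speech output, and it worked! Here whispered_s and whispered_e are, like ssml_start, variables holding an opening and a closing tag. Thanks a lot, guys! I really appreciate it.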
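
A self-contained sketch of that approach might look like the following. The post does not show how the whisper variables were defined, so the two amazon:effect strings below are my assumption, and the closing </speak> constant is added so the markup is balanced.

```python
# Tag constants. Single-quoted Python strings let the SSML attributes
# use double quotes without any escaping.
SSML_START = '<speak>'
SSML_END = '</speak>'
# Assumed definitions: Alexa's whisper effect tag (not shown in the post).
WHISPERED_S = '<amazon:effect name="whispered">'
WHISPERED_E = '</amazon:effect>'

speech_text = (
    SSML_START
    + WHISPERED_S
    + "Here are the latest alerts from MMDA"
    + WHISPERED_E
    + SSML_END
)
print(speech_text)
```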

Inhaler answered 4/4, 2019 at 21:13 Comment(0)

B

1

This question was somewhat vague; however, I did manage to figure out how to incorporate SSML into a Python script. Here's a snippet that plays some audio:

  if 'Item' in intent['slots']:
    chosen_item = intent['slots']['Item']['value']
    session_attributes = create_attributes(chosen_item)

    # Note the spaces around chosen_item so the words do not run together.
    speech_output = ('<speak> Here is something to play ' +
                     chosen_item +
                     ' <audio src="https://s3.amazonaws.com/example/example.mp3" /> </speak>')
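
When SSML is assembled by string concatenation like this, it is easy to leave a tag unclosed. One way to catch that early is to parse the result as XML before returning it. This is only a sketch using the standard library; is_well_formed is a hypothetical helper and chosen_item is a stand-in value.

```python
import xml.etree.ElementTree as ET


def is_well_formed(ssml):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


chosen_item = "jazz"  # stand-in for the slot value
speech_output = ('<speak> Here is something to play ' +
                 chosen_item +
                 ' <audio src="https://s3.amazonaws.com/example/example.mp3" /> </speak>')

print(is_well_formed(speech_output))
# An <audio> tag that is never closed is rejected:
print(is_well_formed('<speak>unclosed <audio src="x.mp3"></speak>'))
```

This only checks XML well-formedness, not whether Alexa accepts the tags. It will also reject prefixed tags such as amazon:effect, because the parser sees an undeclared namespace prefix.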

Bryson answered 20/4, 2016 at 16:54 Comment(1)

User BMW has pointed out the correct answer. When you set the type param of the outputSpeech JSON object to SSML and use ssml instead of text, you can use SSML tags (as documented in the Speech Synthesis Markup Language (SSML) Reference). – Gapes 29/1, 2018 at 14:50

S

1

An SSML package for Python, pyssml, exists.

You can install it with pip:



    $ pip install pyssml
    or
    $ pip3 install pyssml

An example is at the link below (sorry, the post is in Korean):

http://blog.naver.com/chandong83/221145083125



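    # -*- coding: utf-8 -*-
    # for Amazon Polly
    import time
    from boto3 import client
    from botocore.exceptions import BotoCoreError, ClientError
    import vlc
    from pyssml.PySSML import PySSML
    # AmazonSpeech is used below but was never imported; this assumes it
    # lives in pyssml.AmazonSpeech.
    from pyssml.AmazonSpeech import AmazonSpeech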


    # Amazon Polly service function.
    # If isSSML is True, the text is sent as SSML;
    # otherwise it is sent as plain text.
    def aws_polly(text, isSSML = False):
        voiceid = 'Joanna'

        try:
            polly = client("polly", region_name="ap-northeast-2")

            if isSSML:
                textType = 'ssml'
            else:
                textType = 'text'

            response = polly.synthesize_speech(
                    TextType=textType,
                    Text=text,
                    OutputFormat="mp3",
                    VoiceId=voiceid)

            # get Audio Stream (mp3 format)
            stream = response.get("AudioStream")

            # save the audio Stream File
            with open('aws_test_tts.mp3', 'wb') as f:
                data = stream.read()
                f.write(data)


            # VLC play audio
            # non block
            p = vlc.MediaPlayer('./aws_test_tts.mp3')
            p.play()

        except ( BotoCoreError, ClientError) as err:
            print(str(err))


    if __name__ == '__main__':
        # normal pyssml
        #s = PySSML()

        # amazon speech ssml
        s = AmazonSpeech()

        # normal 
        s.say('i am normal')

        #  speed is very slow
        s.prosody({'rate':"x-slow"}, 'i am very slow')

        #  volume is very loud
        s.prosody({'volume':'x-loud'}, 'my voice is very loud')

        # pause for one second
        s.pause('1s')

        #  pitch is very high
        s.prosody({'pitch':'x-high'}, 'my tone is very high')

        # Amazon-specific effect: whisper
        s.whisper('i am whispering')
        # print the generated SSML
        print(s.ssml())

        # request aws polly and play
        aws_polly(s.ssml(), True)

        # Wait while the audio plays (p.play() does not block).
        time.sleep(50)

Sweeping answered 23/11, 2017 at 12:48 Comment(0)
