In Chrome 46, the XML is being interpreted properly as an XML document, on Windows, when the language is set to en
; however, I see no evidence that the tags are actually doing anything. I heard no difference between the <emphasis>
and non-<emphasis>
versions of this SSML:
var msg = new SpeechSynthesisUtterance();
msg.text = '<?xml version="1.0"?>\r\n<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><emphasis>Welcome</emphasis> to the Bird Seed Emporium. Welcome to the Bird Seed Emporium.</speak>';
msg.lang = 'en';
speechSynthesis.speak(msg);
The <phoneme>
tag was also completely ignored, which made my attempt to speak IPA fail.
var msg = new SpeechSynthesisUtterance();
msg.text='<?xml version="1.0" encoding="ISO-8859-1"?> <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="en-US"> Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream. The name is pronounced <phoneme alphabet="ipa" ph="pævˈloʊvə">...</phoneme> or <phoneme alphabet="ipa" ph="pɑːvˈloʊvə">...</phoneme>, unlike the name of the dancer, which was <phoneme alphabet="ipa" ph="ˈpɑːvləvə">...</phoneme> </speak>';
msg.lang = 'en';
speechSynthesis.speak(msg);
This is despite the fact that the Microsoft speech API does handle SSML correctly. Here is a C# snippet, suitable for use in LinqPad:
var str = "Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream. The name is pronounced /pævˈloʊvə/ or /pɑːvˈloʊvə/, unlike the name of the dancer, which was /ˈpɑːvləvə/.";
var regex = new Regex("/([^/]+)/");
if (regex.IsMatch(str))
{
str = regex.Replace(str, "<phoneme alphabet=\"ipa\" ph=\"$1\">word</phoneme>");
str.Dump();
}
SpeechSynthesizer synth = new SpeechSynthesizer();
PromptBuilder pb = new PromptBuilder();
pb.AppendSsmlMarkup(str);
synth.Speak(pb);
var xmldoc = new DOMParser().parseFromString(text, 'text/xml')
does not help either, so I think matt's point is correct. – Quincythe square root of [[pbas +4]] 2 [[char LTRL]]a[[char NORM]] to the [[pbas +4]] 14 [[char LTRL]]x[[char NORM]]
. I do not know if this is only for Mac native voices, though. developer.apple.com/library/mac/documentation/UserExperience/… – Frangos