To capture free-form speech input (rather than a defined list of possible values), you'll need to use the AMAZON.LITERAL slot type. The Amazon documentation for the Literal slot type describes a use case similar to yours, where a skill is created to take any phrase and post it to a social media site. This is done by creating a StatusUpdate intent:
    {
      "intents": [
        {
          "intent": "StatusUpdate",
          "slots": [
            {
              "name": "UpdateText",
              "type": "AMAZON.LITERAL"
            }
          ]
        }
      ]
    }
Since it uses the AMAZON.LITERAL slot type, this intent will be able to capture any arbitrary phrase. However, to ensure that the speech engine will do a decent job of capturing real-world phrases, you need to provide a variety of example utterances that resemble the sorts of things you expect the user to say.
Since your scenario involves capturing very dynamic phrases, there are a couple of points in the documentation that you'll want to give extra consideration to:
If you are using the AMAZON.LITERAL type to collect free-form text
with wide variations in the number of words that might be in the slot,
note the following:
- Covering this full range (minimum, maximum, and all in between) will
require a very large set of samples. Try to provide several hundred
samples or more to address all the variations in slot value words as
noted above.
- Keep the phrases within slots short enough that users can
say the entire phrase without needing to pause.
Lengthy spoken input can lead to lower accuracy experiences, so avoid
designing a spoken language interface that requires more than a few
words for a slot value. A phrase that a user cannot speak without
pausing is too long for a slot value.
That said, here are the example sample utterances from the documentation:
StatusUpdate post the update {arrived|UpdateText}
StatusUpdate post the update {dinner time|UpdateText}
StatusUpdate post the update {out at lunch|UpdateText}
...(more samples showing phrases with 4-10 words)
StatusUpdate post the update {going to stop by the grocery store this evening|UpdateText}
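Since the documentation asks for a large sample set that covers every slot length, it can help to generate the sample utterance lines from a phrase list rather than writing them by hand. The following is only a minimal sketch of that idea: the helper names (format_utterances, word_count_coverage) and the 1-10 word target range are assumptions made for illustration, not part of the Alexa tooling.

```python
from collections import Counter


def format_utterances(intent, carrier, slot, phrases):
    """Return one sample-utterance line per example phrase.

    Each line follows the documented pattern:
    IntentName carrier phrase {example value|SlotName}
    """
    return [f"{intent} {carrier} {{{phrase}|{slot}}}" for phrase in phrases]


def word_count_coverage(phrases):
    """Count how many example phrases exist for each slot length (in words)."""
    return Counter(len(phrase.split()) for phrase in phrases)


# Example phrases taken from the documentation samples above.
phrases = [
    "arrived",
    "dinner time",
    "out at lunch",
    "going to stop by the grocery store this evening",
]

lines = format_utterances("StatusUpdate", "post the update", "UpdateText", phrases)
coverage = word_count_coverage(phrases)

# Slot lengths in the assumed 1-10 word range that have no example yet.
missing = [n for n in range(1, 11) if coverage[n] == 0]

for line in lines:
    print(line)
print("Word counts with no samples yet:", missing)
```

Running a check like this against your real phrase list shows which lengths still need examples before you paste the generated lines into the skill's sample utterances.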
If you provide enough examples of different lengths to give an accurate picture of the range of expected user utterances, then your intent should be able to capture dynamic phrases in real use cases. The captured phrase is then available in the UpdateText slot. Based on this, you should be able to implement an intent specific to your needs.
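On the receiving side, your skill's backend reads the captured phrase out of the incoming request. Here is a rough sketch of that step in Python, assuming the usual Alexa request layout in which slot values sit under request.intent.slots; the get_slot_value helper and the sample event are hypothetical and only illustrate the lookup.

```python
def get_slot_value(event, intent_name, slot_name):
    """Return the spoken value of a slot, or None if it is not present.

    Assumes the request body has the shape
    {"request": {"type": "IntentRequest",
                 "intent": {"name": ..., "slots": {SlotName: {"value": ...}}}}}
    """
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        return None
    intent = request.get("intent", {})
    if intent.get("name") != intent_name:
        return None
    slot = intent.get("slots", {}).get(slot_name, {})
    # "value" may be absent when the speech engine could not fill the slot.
    return slot.get("value")


# Hypothetical, trimmed-down request body for illustration only.
sample_event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "StatusUpdate",
            "slots": {
                "UpdateText": {"name": "UpdateText", "value": "out at lunch"}
            },
        },
    }
}

update_text = get_slot_value(sample_event, "StatusUpdate", "UpdateText")
print(update_text)
```

Returning None for a missing value lets the caller decide whether to re-prompt the user instead of posting an empty update.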