Questions and Answers
What is the main purpose of Amazon Polly?
- To transcribe speech into text
- To analyze and interpret speech patterns
- To create interactive voice assistants
- To convert text into lifelike speech using deep learning (correct)
What is a lexicon in the context of Amazon Polly?
- A rule-based system for generating speech
- A dictionary that defines how to pronounce specific words or phrases (correct)
- A set of pre-recorded voice samples
- A tool for creating custom voice models
What does SSML stand for and what is its primary function?
- Speech Segmentation Markup Language; to break down speech into individual units
- Speech Syntax Markup Language; to define the grammatical structure of speech
- Speech Stream Markup Language; to manage the flow of speech data
- Speech Synthesis Markup Language; to provide instructions on how to pronounce text (correct)
Which of the following is NOT a voice engine option available in Amazon Polly?
What is the primary benefit of using speech marks in Amazon Polly?
Flashcards
Amazon Polly
A service that converts text into lifelike speech using deep learning.
Lexicons
Definitions for how specific text should be pronounced by Polly.
SSML
Speech Synthesis Markup Language for controlling speech output and pauses.
Speech Marks
Voice Engines
Study Notes
Amazon Polly Overview
- Amazon Polly synthesizes lifelike speech from text using deep learning.
- It's the opposite of Amazon Transcribe, which transcribes speech to text.
- It lets you build applications that speak.
- Example: Inputting "Hi, my name is Stephane, and this is a demo of Amazon Polly" generates the corresponding spoken audio.
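As a rough illustration of the example above, the sketch below assembles the parameters for a text-to-speech request using the boto3 SDK's `synthesize_speech` call. The voice name `Joanna`, the MP3 output format, and the helper names `build_request` and `synthesize_to_file` are assumptions made for illustration. The network call needs AWS credentials, so it lives in a separate function that is not run here.

```python
def build_request(text, voice_id="Joanna", output_format="mp3"):
    """Return keyword arguments for Polly's synthesize_speech call."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"Text": text, "VoiceId": voice_id, "OutputFormat": output_format}


def synthesize_to_file(text, path):
    """Send the request to Polly and save the audio (needs AWS credentials)."""
    import boto3  # imported lazily so build_request works without the SDK

    polly = boto3.client("polly")
    response = polly.synthesize_speech(**build_request(text))
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


request = build_request("Hi, my name is Stephane, and this is a demo of Amazon Polly")
print(request["VoiceId"], request["OutputFormat"])
```

Keeping the parameter assembly separate from the network call makes the request easy to inspect or test without an AWS account.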
Advanced Features
- Lexicons: Allow custom pronunciations of words or phrases.
- Example: Specify that "AWS" should be pronounced as "Amazon Web Services".
- Example: Specify that "W3C" should be pronounced as "World Wide Web Consortium".
- SSML (Speech Synthesis Markup Language): Provides markup to control how text is pronounced.
- Example: SSML can create pauses, whispers, emphasis, and control the pronunciation of abbreviations.
- Example: Placing a break between "Hello" and "how are you?" produces a pause after "Hello".
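The two features above can be sketched as plain XML documents. The block below builds a pronunciation lexicon in the W3C PLS format for the "AWS" and "W3C" examples, and an SSML string with a `<break>` between "Hello" and "how are you?". The helper names, the `en-US` language, and the one-second pause are assumptions for illustration; the block only generates and parses the XML and does not contact Polly.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(aliases, lang="en-US"):
    """Build a PLS lexicon mapping each written form to a spoken alias."""
    entries = "".join(
        f"<lexeme><grapheme>{escape(written)}</grapheme>"
        f"<alias>{escape(spoken)}</alias></lexeme>"
        for written, spoken in aliases.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<lexicon version="1.0" xmlns="{PLS_NS}" alphabet="ipa" xml:lang="{lang}">'
        f"{entries}</lexicon>"
    )


def hello_with_pause(pause="1s"):
    """Return SSML that pauses after "Hello" (pause length is an assumption)."""
    return f'<speak>Hello <break time="{pause}"/> how are you?</speak>'


lexicon = build_lexicon(
    {"AWS": "Amazon Web Services", "W3C": "World Wide Web Consortium"}
)
ssml = hello_with_pause()

# Parsing confirms that both documents are well-formed XML.
root = ET.fromstring(lexicon.encode("utf-8"))
print(len(root), ET.fromstring(ssml).tag)
```

With boto3, a lexicon like this would typically be uploaded with `put_lexicon` and referenced by name in a later request, and the SSML string would be sent with `TextType="ssml"`; both steps need AWS credentials and are left out of the sketch.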
Voice Engines
- Different voice engines are available with varying characteristics.
- Voice engines range from the older standard engine to the newer neural, long-form, and generative engines.
- Newer engines produce more human-like voices.
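In the API, the engine is chosen per request through the `Engine` parameter. The sketch below validates an engine name and adds it to a set of request parameters. The helper name `with_engine` and the base request are hypothetical, and not every voice is available on every engine, so the voice and engine pairing shown is an assumption to check against the current voice list.

```python
# Engine names accepted by the Engine parameter of SynthesizeSpeech.
ENGINES = ("standard", "neural", "long-form", "generative")


def with_engine(request, engine):
    """Return a copy of the request kwargs with a validated Engine value."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    return {**request, "Engine": engine}


# Hypothetical base request; the voice name is an assumption.
base = {"Text": "Hello", "VoiceId": "Joanna", "OutputFormat": "mp3"}
print(with_engine(base, "neural"))
```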
Speech Marks
- Provide metadata describing where each word or sentence falls in the synthesized audio.
- Used together with the audio for functions such as lip-syncing and highlighting words as they are spoken.
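Speech marks are returned as newline-delimited JSON objects, each carrying a `time` offset in milliseconds, a `type`, and the matching `value`. The sketch below parses such a stream and looks up the word being spoken at a given playback time, which is the core of word highlighting. The sample data is hand-written for illustration and is not real Polly output; the helper names are hypothetical, and the lookup assumes the marks are sorted by time.

```python
import json

# Hand-written sample in the newline-delimited JSON shape of speech marks;
# the timing values are illustrative, not real Polly output.
SAMPLE_MARKS = """\
{"time": 0, "type": "sentence", "start": 0, "end": 19, "value": "Hello, how are you?"}
{"time": 6, "type": "word", "start": 0, "end": 5, "value": "Hello"}
{"time": 720, "type": "word", "start": 7, "end": 10, "value": "how"}
{"time": 910, "type": "word", "start": 11, "end": 14, "value": "are"}
{"time": 1040, "type": "word", "start": 15, "end": 18, "value": "you"}
"""


def parse_marks(raw):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_at(marks, ms):
    """Return the last word that started at or before `ms`, or None."""
    current = None
    for mark in marks:
        if mark["type"] == "word" and mark["time"] <= ms:
            current = mark["value"]
    return current


marks = parse_marks(SAMPLE_MARKS)
print(word_at(marks, 800))
```

With boto3, such a stream would typically be requested by passing `OutputFormat="json"` and `SpeechMarkTypes=["word", "sentence"]` to `synthesize_speech`, separately from the request that produces the audio.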