New Text-to-Speech API for Chrome extensions
By Dominic Mazzoni, Software Engineer
Interested in making your Chrome Extension (or packaged app) talk using synthesized speech? Chrome now includes a Text-to-Speech (TTS) API that’s simple to use, powerful, and flexible for users.
Let’s start with the "simple to use" part. A few clever apps and extensions
figured out how to talk before this API was available – typically by sending text to
a remote server that returns an MP3 file that can be played using HTML5 audio. With the new
API, you just need to add "tts" to your permissions and then write:
chrome.tts.speak('Hello, world!');
It’s also very easy to change the rate, pitch, and volume. Here’s an example that speaks more slowly:
chrome.tts.speak('Can you understand me now?', {rate: 0.6});
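As a rough sketch of adjusting all three settings together, the helper below builds an options object and clamps each value before speaking. The helper names (`clamp`, `makeSpeechOptions`, `speakWith`) are hypothetical and not part of the API, and the value ranges are assumptions based on the TTS API reference, so check them against the docs.

```javascript
// Sketch only: these helpers are hypothetical, not part of chrome.tts.
// Assumed ranges (verify in the TTS API docs): rate 0.1-10, pitch 0-2, volume 0-1.
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Builds the options object passed as the second argument to speak().
// Missing or non-numeric settings fall back to 1.0.
function makeSpeechOptions(settings) {
  settings = settings || {};
  function pick(value, fallback) {
    return typeof value === 'number' ? value : fallback;
  }
  return {
    rate: clamp(pick(settings.rate, 1.0), 0.1, 10.0),
    pitch: clamp(pick(settings.pitch, 1.0), 0.0, 2.0),
    volume: clamp(pick(settings.volume, 1.0), 0.0, 1.0)
  };
}

// `tts` is expected to be chrome.tts. It is a parameter so the helper can be
// exercised outside an extension with a stand-in object.
function speakWith(tts, text, settings) {
  tts.speak(text, makeSpeechOptions(settings));
}
```

Inside an extension this would be called as `speakWith(chrome.tts, 'Can you understand me now?', {rate: 0.6, volume: 0.8})`.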
How about powerful? To get even fancier and synchronize speech
with your application, you can register to receive callbacks when the speech starts and
finishes. When a TTS engine supports it, you can get callbacks for individual words too. You
can also get a list of possible voices and ask for a particular voice – more on this
below. All the details can be found in the TTS API docs, and we provide complete example code on the samples page.
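To make the callback and voice-selection ideas concrete, here is a minimal sketch. It assumes the `onEvent` option of `speak()` and the `getVoices()` call behave as described in the TTS API docs; the wrapper functions and the `handlers` object are hypothetical names used only for illustration.

```javascript
// Sketch only: wrapper names and the `handlers` shape are hypothetical.
// `tts` is expected to be chrome.tts.
function speakWithEvents(tts, text, handlers) {
  tts.speak(text, {
    onEvent: function (event) {
      if (event.type === 'start' && handlers.onStart) {
        handlers.onStart();
      } else if (event.type === 'word' && handlers.onWord) {
        // Word events are only sent by engines that support them.
        handlers.onWord(event.charIndex);
      } else if (event.type === 'end' && handlers.onEnd) {
        handlers.onEnd();
      } else if (event.type === 'error' && handlers.onError) {
        handlers.onError(event.errorMessage);
      }
    }
  });
}

// Returns the name of the first voice whose language matches, or undefined.
function findVoiceForLang(voices, lang) {
  for (const voice of voices) {
    if (voice.lang === lang) {
      return voice.voiceName;
    }
  }
  return undefined;
}

// Asks for the voice list, then speaks with a matching voice if one exists;
// otherwise falls back to the default voice.
function speakInLanguage(tts, text, lang) {
  tts.getVoices(function (voices) {
    const voiceName = findVoiceForLang(voices, lang);
    tts.speak(text, voiceName ? { voiceName: voiceName } : {});
  });
}
```

An app could use the `onWord` handler to highlight the word currently being spoken, for example.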
In fact, the API is powerful enough that ChromeVox, the Chrome OS screen reader for visually impaired users, is built using this API.
Here are three examples you can try now:
TTS Demo (app)
Talking Alarm Clock (extension)
SpeakIt (extension)
Finally, let's talk about flexibility for users. One of the
most important things we wanted to do with this API was to make sure that users have a great
selection of voices to choose from. So we've opened that up to developers, too.
The TTS Engine API enables you to implement a speech engine as an extension for Chrome. Essentially, you provide some information about your voice in the extension manifest and then register a JavaScript function that gets called when the client calls chrome.tts.speak. Your extension then takes care of synthesizing and outputting the speech, using any web technology you like, including HTML5 Audio, the new Web Audio API, or Native Client.
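The registration step might look roughly like the sketch below. It assumes the engine extension declares its voices in the manifest (under a `tts_engine` key, with the `ttsEngine` permission) and that `chrome.ttsEngine` exposes `onSpeak` and `onStop` events as described in the TTS Engine API docs. The `backend` object, with its `play` and `stop` methods, is hypothetical and stands in for whatever synthesis code you write.

```javascript
// Sketch only. `ttsEngine` is expected to be chrome.ttsEngine.
// `backend` is a hypothetical object you implement yourself:
//   backend.play(utterance, options, onDone) synthesizes and plays the audio,
//   backend.stop() halts playback immediately.
function registerSpeechEngine(ttsEngine, backend) {
  // Called whenever a client calls chrome.tts.speak with one of this
  // extension's voices selected.
  ttsEngine.onSpeak.addListener(function (utterance, options, sendTtsEvent) {
    sendTtsEvent({ type: 'start', charIndex: 0 });
    backend.play(utterance, options, function onDone() {
      sendTtsEvent({ type: 'end', charIndex: utterance.length });
    });
  });

  // Called when the client asks for speech to stop.
  ttsEngine.onStop.addListener(function () {
    backend.stop();
  });
}
```

Reporting `start` and `end` back through `sendTtsEvent` is what lets client apps receive the callbacks described earlier.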
Here are two voices implemented using the TTS Engine API that you can install now:
Lois TTS - US English
Flite SLT Female TTS - US English
These voices both use Native Client to synthesize speech. The experience is very easy for end users: just click and install one of those voices, and immediately any talking app or extension has the ability to speak using that voice.
If a user doesn't have any voices installed, Chrome automatically speaks using the native speech capabilities of the user's Windows or Mac operating system, if possible. Chrome OS comes with a built-in speech engine, too. For now, there's unfortunately no default voice support on Linux, but TTS is fully supported once users install a voice from the Chrome Web Store.
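Because a voice may not always be available, an app might check before speaking. The sketch below assumes that `getVoices()` passes an empty list to its callback when no voice can be used, which is an assumption worth verifying on your target platforms. The function name and the `onNoVoice` callback are hypothetical.

```javascript
// Sketch only. `tts` is expected to be chrome.tts.
// Assumption: getVoices() yields an empty array when no voice is available.
// `onNoVoice` is a hypothetical callback, e.g. one that shows a hint asking
// the user to install a voice from the Chrome Web Store.
function speakIfVoiceAvailable(tts, text, onNoVoice) {
  tts.getVoices(function (voices) {
    if (voices && voices.length > 0) {
      tts.speak(text);
    } else {
      onNoVoice();
    }
  });
}
```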
Now it's your turn: add speech capability to your app or extension today! We can't wait to hear what you come up with, and if you talk about it, please add the hashtag #chrometts so we can join the conversation. If you have any feedback, direct it to the Chromium-extensions group.
Dominic Mazzoni is a Software Engineer working on Chrome accessibility. He's the original author of Audacity, the free audio editor.
Posted by Scott Knaster, Editor