Saturday, October 19, 2013
To those of us from a certain decade of computing, the phrase “text-to-speech” reminds us favorably of Dr. Sbaitso. A fun take on that reminiscence is an article titled “Dr. Sbaitso was my only friend”.
It would be neat if we could have access to text-to-speech functionality from Factor. And it would be especially neat if it was cross-platform!
We’ll start by defining speak-text, a generic word that dispatches on the value of the os object, so we can provide a different implementation for each platform:
HOOK: speak-text os ( str -- )
On Mac OS, we cheat a bit and just call out to the say command-line tool built into Mac OS:
M: macosx speak-text "say \"%s\"" sprintf try-process ;
We use the default voice set in System Preferences, but the voice is only one of the many options you can change, along with things like the number of words spoken per minute. For more information on Mac OS support for speech, read the Speech Synthesis Programming Guide.
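As a rough illustration of those options, here is a minimal sketch that passes a voice and a speaking rate to say through its -v and -r flags. The word speak-with-voice is a hypothetical helper, not part of the code described in this post, and the voice name in the usage line is assumed to be installed on your machine. Passing the command as an array of strings, instead of one formatted string, also means text containing double quotes does not need escaping.

```factor
USING: io.launcher locals math.parser ;

! Hypothetical helper (not from the original post): speak str
! using the named voice at roughly wpm words per minute.
:: speak-with-voice ( str voice wpm -- )
    ! say expects the rate as a string argument
    wpm number>string :> rate
    ! each array element is passed to say as a separate argument
    { "say" "-v" voice "-r" rate str } try-process ;
```

For example, "Hello from Factor" "Alex" 180 speak-with-voice should speak the phrase with the Alex voice, assuming that voice is available.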
On Linux, text-to-speech is not built in. Instead, I decided to use the Festival Speech Synthesis System, which includes a command-line tool that can be configured to speak text:
M: linux speak-text "festival --tts" utf8 [ print ] with-process-writer ;
In addition to this, you can find a whole host of other features in the Festival manual.
On Windows, it would probably be cool to bind to the Microsoft Speech API, but that seemed a little bit harder than the quick-and-dirty approach I took.
Support required two commits to the main Factor repository by Doug Coleman and myself:
- google.translate: adding translate-tts - using Google Translate to “speak” text to an MP3
- windows.winmm: Add binding to play mp3s - using WinMM to play MP3 files
Those two commits allow us to implement speak-text on Windows:
M: windows speak-text translate-tts open-command play-command close-command ;
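With a method defined for each platform, calling code never has to mention the operating system. A minimal usage sketch, assuming the words above live in a vocabulary named text-to-speech (the vocabulary name is an assumption here, so adjust it to wherever you load the code from):

```factor
! Assumes the vocabulary is named text-to-speech.
USING: text-to-speech ;

! The hook looks at the current value of os and runs the
! matching method (say, Festival, or the Windows MP3 route).
"Hello from Factor!" speak-text
```

On a platform with no method defined, the call fails with a no-method error instead of silently doing nothing.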
The code for this is available on my GitHub.