Using a Speech engine to highlight words

From Rosetta Code
Using a Speech engine to highlight words is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Display a piece of text and produce spoken output via a speech engine. As each word is being spoken, highlight the word on the display. In languages where cursor control and highlighting are not possible, it is permissible to output each word as it is spoken.


We use the simple SAPI.SPVoice COM Object and a parsing loop. The highlighting is done with EM_SETSEL and Notepad. Rather crude, but it works. Due to the simplistic nature of the parsing loop, the text ends with a space.

SetTitleMatchMode 2
EM_SETSEL := 0x00B1
Run notepad,,,pid
WinWaitActive ahk_pid %pid%
ControlSetText, Edit1, % text := "AutoHotkey was the first to implement this task! ", ahk_pid %pid%
pVoice := ComObjCreate("Sapi.spvoice"), i := 1 ; the spvoice COM Object ships with the OS
; parse the text
While lf := SubStr(text, i, 1)
If lf = %A_Space%
SendMessage, EM_SetSel, % i-StrLen(word)-1, % i-1, Edit1, ahk_pid %pid%
pVoice.speak(word), word := "", i++
Else word .= lf, i++

M2000 Interpreter[edit]

Module UsingEvents {
Form 60, 32
Cls 5, 0
Pen 14
Declare WithEvents sp "SAPI.SpVoice"
That$="Rosetta Code is a programming chrestomathy site"
\\ this function called as sub routine - same scope as Module
\\ we can call it from event function too
Function Localtxt {
\\ move the cursor to middle line
Cursor 0, height/2
\\ using OVER the line erased with background color and then print text over
\\ ordinary Print using transparent printing of text
\\ $(0) set mode to non proportional text, @() move the cursor to sepecific position
Print Over $(0),@(margin), That$
Call Local LocalTxt()
Function sp_Word {
Read New &StreamNumber, &StreamPosition, &CharacterPosition, &Length
Call Local LocalTxt()
Cursor 0, height/2
Pen 15 {Print Part $(0), @(CharacterPosition+margin); Mid$(That$, CharacterPosition+1, Length)}
Function sp_EndStream {
Const SVEEndInputStream = 4
Const SVEWordBoundary = 32
Const SVSFlagsAsync = 1&
With sp, "EventInterests", SVEWordBoundary+SVEEndInputStream
Method sp, "Speak", That$, SVSFlagsAsync
While Not EndStream {Wait 10}
Call Local LocalTxt()


DynamicModule[{text = "This is some text.", words, i = 0}, 
[email protected]@{Dynamic[
If[i != 0, MapAt[Style[#, Red] &, #, i], #] &@(words =
[email protected]), " "]]], InputField[[email protected], String],
While[i < [email protected], i++; FinishDynamic[]; Speak[words[[i]]];
Pause[Max[0.7, 0.12 StringLength[words[[i]]]]]]; i = 0]}]


Library: Shoes

I'm having difficulty figuring out how to get Shoes to update the GUI (like Tk's update command), so the user must click the button once for each word.

Uses the Ruby code from Speech synthesis

load 'speechsynthesis.rb'
if ARGV.length == 1
$text = "This is default text for the highlight and speak program"
$text = ARGV[1..-1].join(" ")
$words = $text.split do
@idx = 0
stack do
@sentence = para(strong($words[0] + " "), $words[1..-1].map {|word| span(word + " ")})
button "Say word" do
keypress do |key|
case key
when :control_q, "\x11" then exit
def say_and_highlight
speak $words[@idx]
@idx = (@idx + 1) % $words.length
@sentence.replace($ {|word, idx| idx == @idx ? strong(word + " ") : span(word + " ")})


This code uses the external /usr/bin/say program (known available on Mac OS X) as its interface to the speech engine; this produces rather stilted speech because it forces the text to be spoken one word at a time instead of as a whole sentence (in order to keep the highlighting synchronized).

Library: Tk
package require Tcl 8.5
package require Tk 8.5
proc say {text button} {
grab $button
$button configure -state disabled -cursor watch
set starts [$text search -all -regexp -count lengths {\S+} 1.0]
foreach start $starts length $lengths {
lappend strings [$text get $start "$start + $length char"]
lappend ends [$text index "$start + $length char"]
$text tag remove sel 1.0 end
foreach from $starts str $strings to $ends {
$text tag add sel $from $to
update idletasks
exec /usr/bin/say << $str
$text tag remove sel 1.0 end
grab release $button
$button configure -state normal -cursor {}
pack [text .t]
pack [button .b -text "Speak, computer!" -command {say .t .b}] -fill x
.t insert 1.0 "This is an example of speech synthesis with Tcl/Tk."