bah.speech

Russell

(Posted 2011) [#1]

Hello, Brucey! Hope everything is good with you.

I'm having a lot of fun with the bah.speech module but had a couple of questions.

First, I don't see it listed in the modules list ("* The Modules *") or on your Google Code site. Is that because it is Windows only? [On that note, I found this cross-platform speech synthesis and recognition library on SourceForge: http://voce.sourceforge.net]

Second, the bah.speech/doc folder is empty. Do you know where I can get instructions on fine-tuning the speech (adding pauses, pitch rises, etc.)? [Never mind. I found it on Microsoft's website. Interesting... I see that SAPI (the Speech API) also supports recognition... ;)]

Take care!
Russell

Russell

(Posted 2011) [#2]

Actually, Microsoft's website was not much help in figuring out how to change the rate or pitch of the speech (although there is supposedly a rate() method). Pitch and pause control would be more useful, though, so that the spoken text has more "life" in it. It looks like MS SAPI is primarily designed to speak words from applications, so inflection and the like are not much of a factor.

I have to say I'm impressed with the default voice quality, though. Quite a step above the Amiga's built-in speech capability (up to AmigaDOS version 2.05 anyway, if I remember correctly). Still, that was impressive for 1985-ish.

Later,
Russell

Brucey

(Posted 2011) [#3]

I've found the Mac voices to be generally clearer than the Windows ones.
That said, there are some outstanding commercial TTS packages available that will knock your socks off! A bit expensive though, unfortunately.

I've kept the interface purposely simple so that it's easier to use the same calls across platforms. Once you start looking at low-level tweaks to the voices, the APIs tend to wander off in different incompatible directions.

Brucey

(Posted 2011) [#4]

I need to sort out the voice listing for Windows too...
On the Mac, you can get a list like this:


0 : com.apple.speech.synthesis.voice.Agnes
1 : com.apple.speech.synthesis.voice.Albert
2 : com.apple.speech.synthesis.voice.Alex
3 : com.acapela.CoWa.voice.Alyona22kM
4 : com.apple.speech.synthesis.voice.BadNews
5 : com.apple.speech.synthesis.voice.Bahh
6 : com.apple.speech.synthesis.voice.Bells
7 : com.apple.speech.synthesis.voice.Boing
8 : com.apple.speech.synthesis.voice.Bruce
9 : com.apple.speech.synthesis.voice.Bubbles
10 : com.apple.speech.synthesis.voice.Cellos
11 : com.apple.speech.synthesis.voice.Deranged
12 : com.apple.speech.synthesis.voice.Fred
13 : com.apple.speech.synthesis.voice.GoodNews
14 : com.acapela.CoWa.voice.Graham22kM
15 : com.apple.speech.synthesis.voice.Hysterical
16 : com.apple.speech.synthesis.voice.Junior
17 : com.apple.speech.synthesis.voice.Kathy
18 : com.acapela.CoWa.voice.Lucy22kM
19 : com.acapela.CoWa.voice.Peter22kM
20 : com.apple.speech.synthesis.voice.Organ
21 : com.apple.speech.synthesis.voice.Princess
22 : com.acapela.CoWa.voice.Rachel22kM
23 : com.apple.speech.synthesis.voice.Ralph
24 : com.apple.speech.synthesis.voice.Trinoids
25 : com.apple.speech.synthesis.voice.Vicki
26 : com.apple.speech.synthesis.voice.Victoria
27 : com.apple.speech.synthesis.voice.Whisper
28 : com.apple.speech.synthesis.voice.Zarvox

Which you can then use to change the voice.

Russell

(Posted 2011) [#5]

Well that's interesting! What library does the Mac use for speech? So those are "built-in" voice profiles? I noticed in the example, example_01.bmx, that the program does a search for available voices:

SuperStrict

Framework BaH.Speech
Import brl.StandardIO

Local speech:TSpeech = New TSpeech

Local s:String[] = TSpeech.availableVoices()
For Local i:Int = 0 Until s.length
	Print s[i]
Next

Print s.length  '<----Returns zero!

If s.length > 0 Then ' Choose the first voice; keep the default if none were found.
	speech.setVoice(s[0])
End If

speech.speak("Hello, this is BaH dot Speech")

While speech.isSpeaking()
	Delay 10
Wend

End

When I check the length of 's' it returns zero. The default voice sounds androgynous, which is probably just fine for most situations. I'm jealous of the Mac with its 29 built-in profiles, though :)

Russell

Brucey

(Posted 2011) [#6]

Yeah, availableVoices() doesn't currently return anything on Windows. I need to work out the best way to get Max to pull the data out of the API...

wmaass

(Posted 2011) [#7]

Btw, very useful mod, Brucey. I used it for a golf simulator app that, among other things, provides text-to-speech during round play for things like who is up, how far to the pin, etc. I am directing my customers to get a voice from Cepstral for best results.

Last edited 2011

Brucey

(Posted 2011) [#8]

Finally implemented availableVoices() and setVoice() for Win32.

Example list:

Microsoft Anna - English (United States)
Microsoft Mary
Microsoft Mike
Sample TTS Voice

However, on my 64-bit Windows 7 machine it only plays the first (default) one, since the others appear to be unplayable on this platform. So even though you may get a list of voices, you may not hear anything from some of them. It might be a 64-bit thing.
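
A quick way to find out which listed voices actually produce sound is to walk the list and speak a short phrase with each. This is only a minimal sketch: it uses nothing beyond the calls already shown in this thread (availableVoices(), setVoice(), speak(), isSpeaking()), and it assumes that setVoice() can be called repeatedly on the same TSpeech object and that isSpeaking() eventually returns false even for a voice that stays silent.

```blitzmax
SuperStrict

Framework BaH.Speech
Import brl.StandardIO

Local speech:TSpeech = New TSpeech
Local voices:String[] = TSpeech.availableVoices()

' Try each listed voice in turn; an entry that stays silent is one
' the platform cannot play.
For Local i:Int = 0 Until voices.length
	Print String.FromInt(i) + " : " + voices[i]

	' Assumption: switching voices on an existing TSpeech object is allowed.
	speech.setVoice(voices[i])
	speech.speak("This is voice number " + String.FromInt(i))

	' Wait for the current phrase to finish before moving on.
	While speech.isSpeaking()
		Delay 10
	Wend
Next
```

Any index that prints but is not heard can then be skipped when choosing a voice.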

Still, the API is in line with its Mac counterpart now... :o)

wmaass

(Posted 2011) [#9]

That's great Brucey, will try it out.

Brucey

(Posted 2011) [#10]

Added some Linux support too now... although there is much less control than on the other platforms, unless there are more low-level APIs hiding somewhere...

Brucey

(Posted 2011) [#11]

Ooh, and apparently OS X Lion now integrates the Nuance voices, which adds another 40 or so, including dialects. For example, there are Australian, South African and Scottish-accented female voices...
In fact, many of the international voices (Thai, Russian, etc.) are female too. Perhaps they are more soothing on the ears ;-)