Hi,
I've been trying to find more information about text-to-speech in Captivate 5.5, but all the information out there seems to be outdated and written for version 4. I have successfully downloaded the voices and inserted the text-to-speech text, but I now have 2 questions and would GREATLY appreciate anyone's insight.
1) The audio gets cut off at the end of the slide when I preview all my slides. For each slide the last word always cuts off. I tried adding 1 second of silence at the end of the audio but I am still having the issue. Does anyone know how to fix this?
2) How do I edit the pronunciation etc. of the voices? They sound a little robotic, and the VTML codes that were posted for Captivate 4 do not work in 5.5. I just want a little more emotional fluctuation so that it doesn't sound so robotic. Again, any insight is much appreciated.
-Amy
I also had the cut-off problem in Captivate 5. Add punctuation, such as a comma or period, before the end of the line.
For the NeoSpeech voices, my line looks like:
Welcome to this getting started tour <vtml_pause time="150"/> of Myu Studio Performance. In this video, we will look at managing Scenarios in the Library.<vtml_pause time="100"/>
Both the period and the vtml_pause tag are required to avoid clipping.
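If you prepare your narration text outside Captivate, a small script can apply that rule to every line for you. This is only a rough Python sketch: the function name pad_line and the 100 ms default are my own choices, not anything built into Captivate, and it assumes you paste the result into the text-to-speech box yourself.

```python
def pad_line(text: str, pause_ms: int = 100) -> str:
    """Make sure a narration line ends with punctuation plus a vtml_pause tag.

    pad_line is a hypothetical helper, not a Captivate feature.
    """
    text = text.rstrip()
    # Leave the line alone if it already ends with a vtml_pause tag.
    if text.endswith("/>") and "<vtml_pause" in text[text.rfind("<"):]:
        return text
    # Add a period when the line has no closing punctuation.
    if not text.endswith((".", ",", "!", "?", ";", ":")):
        text += "."
    return f'{text}<vtml_pause time="{pause_ms}"/>'


if __name__ == "__main__":
    print(pad_line("In this video, we will look at managing Scenarios in the Library"))
```

Lines that already end in a vtml_pause tag are returned unchanged, so it should be safe to run more than once on the same script.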
As for sounding more natural, I do three things:
1) Insert pauses, like the <vtml_pause time="150"/> when things sound too rushed.
2) Use funky spellings. For example, writing "Mu Studio Performance" results in "moo studio performance", but the funky spelling "Myu" makes TTS pronounce the word correctly.
3) Use the CMU Pronouncing Dictionary (http://www.speech.cs.cmu.edu/cgi-bin/cmudict?in=handy&stress=-s) to generate better pronunciations. I include the pronunciations inline, because I never got the dictionary editor to actually work. I have assembled a mini-dictionary of words I use, for example:
<vtml_phoneme alphabet="x-cmu" ph="P R IH1 N T AH0 B AH0 L">printable</vtml_phoneme>
<vtml_phoneme alphabet="x-cmu" ph="L AY1 V">live</vtml_phoneme>
<vtml_phoneme alphabet="x-cmu" ph="R AW1 T S">routes</vtml_phoneme>
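Since the pronunciations go inline, a mini-dictionary like this can be applied to a whole script automatically before pasting it into Captivate. The following is a minimal Python sketch under my own assumptions: the names PRONUNCIATIONS and apply_pronunciations are made up for illustration, the dictionary keys must be lowercase, and the input text must not already contain vtml_phoneme tags (otherwise words would get wrapped twice).

```python
import re

# Mini-dictionary: lowercase word -> CMU phoneme string (entries from this post).
PRONUNCIATIONS = {
    "printable": "P R IH1 N T AH0 B AH0 L",
    "live": "L AY1 V",
    "routes": "R AW1 T S",
}


def apply_pronunciations(text, pronunciations=PRONUNCIATIONS):
    """Wrap every whole-word dictionary match in an inline vtml_phoneme tag.

    apply_pronunciations is a hypothetical helper, not part of Captivate.
    """
    if not pronunciations:
        return text
    # \b keeps "print" from matching "printable"; matching ignores case.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in pronunciations) + r")\b",
        re.IGNORECASE,
    )

    def wrap(match):
        word = match.group(0)
        phonemes = pronunciations[word.lower()]
        return f'<vtml_phoneme alphabet="x-cmu" ph="{phonemes}">{word}</vtml_phoneme>'

    return pattern.sub(wrap, text)


if __name__ == "__main__":
    print(apply_pronunciations("Open the printable list of live routes."))
```

Note that this replaces every occurrence, so a word like "live" gets the same pronunciation whether it is used as an adjective or a verb. Check the output by ear before publishing.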
The voices always seem a little robotic, but with some legwork, they can get pretty good. And when I needed to change a product name in 8 videos, it took less than half a day to republish everything. Using a voice actor would have taken much, much longer.