Can Adobe Audition CC help improve Text-to-Speech output?

Dec 19, 2013 2:44 AM

Tags: #effects #audition #voiceover #voice #text-to-speech #tts #narration #speech_synthesis

I was wondering...

 

if I use high-end speech synthesis software (voices) to produce voiceover narration for a video, can I somehow improve it further in Adobe Audition CC?

 

I am not talking about pronunciation, speed and pitch of course (those can be improved only within TTS editors, with punctuation and tags), but just about smoothing the output to make it softer, more professional and more pleasant to listen to. I am already pretty satisfied with the results, but I would like to know whether there are any preset effects in Adobe Audition that could improve it at least a bit.

 

This is computer speech, so of course I have no illusions that it can be perfect.

 

Thanks for reading this!

 
Replies
  • SteveG(AudioMasters)
    5,602 posts
    Oct 26, 2006
    Dec 19, 2013 2:50 AM   in reply to Maroon83

    Absolutely couldn't tell without listening to some of it. The one thing that I can be absolutely sure of though is that there's no preset for it. All audio is different, and presets are only ever a starting point for the real treatment you might need.

     

    In principle though, it might be possible to make computer-generated speech a little easier on the ear - just by adding a little artificial ambience to it, for instance. Still need to hear it first, though.

     
  • SteveG(AudioMasters)
    5,602 posts
    Oct 26, 2006
    Dec 19, 2013 5:47 AM   in reply to Maroon83

    You won't actually improve on the speech, but I think that the total lack of ambience isn't going to make long-term listening that easy. So a very small touch of reverb, and possibly a little background noise, will probably improve things a lot. Yes I know that sounds counter-intuitive, but just try it!

     
  • SteveG(AudioMasters)
    5,602 posts
    Oct 26, 2006
    Dec 19, 2013 2:35 PM   in reply to Maroon83

    On the speech itself I used the Convolution Reverb with the 'classroom' impulse. Set the mix to 20% and also set the room size to 20%. This will make it sound as though it's synthesised in a room...
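
For anyone curious what a 20% mix means numerically, here is a minimal sketch of a convolution reverb with a wet/dry blend. It is not Audition's implementation: the linear blend is an assumption, the room size control is not modelled at all, and the sample values and the three-tap "room" impulse are made up purely for illustration.

```python
def convolve(signal, impulse):
    """Direct-form convolution of two sample lists (fine for short examples)."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def mix_wet_dry(dry, wet, mix=0.20):
    """Blend dry and wet signals; mix=0.20 means 20% wet, 80% dry (assumed linear blend)."""
    n = max(len(dry), len(wet))
    dry = dry + [0.0] * (n - len(dry))  # pad the shorter signal with silence
    wet = wet + [0.0] * (n - len(wet))
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, wet)]


# Hypothetical toy data: a single click and a made-up three-tap "room" response.
speech = [1.0, 0.0, 0.0, 0.0]
room_impulse = [1.0, 0.5, 0.25]

wet = convolve(speech, room_impulse)
result = mix_wet_dry(speech, wet, mix=0.20)
```

With a low mix value the dry speech dominates and the reverb tail is only a faint addition, which is the point of keeping the setting small.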

     

    As for suitable background ambience - well, strangely enough, that's not so easy. In the end I downloaded the Adobe Ambience 1 file (see Audition Help menu), unzipped it and used the Ambience Air Conditioner 180 01 file, but with the Notch filter to remove the pitched hum. I pulled down 1431Hz, 297Hz, 441Hz, 518Hz and 882Hz all at around -36dB and that made it more 'general' sounding (you may have to mess about with that a bit - it might be improvable). The most important thing about this though is the level it should be at, which should be no higher than 55dB below the speech peak level - in other words, you should hardly hear it.
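
The level rule above is simple decibel arithmetic, and a rough sketch may help when setting the ambience fader. The function names and the peak values below are hypothetical, chosen only to show the calculation; peaks are linear amplitudes with full scale at 1.0.

```python
import math


def db_to_linear(db):
    """Convert a decibel value to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def ambience_gain(speech_peak, ambience_peak, offset_db=-55.0):
    """Gain to apply to the ambience so its peak sits offset_db relative to the speech peak."""
    target_peak = speech_peak * db_to_linear(offset_db)
    return target_peak / ambience_peak


# Hypothetical peak values (linear, full scale = 1.0), for illustration only.
gain = ambience_gain(speech_peak=0.9, ambience_peak=0.3)
gain_db = 20.0 * math.log10(gain)  # the same gain expressed in dB, as a fader would show it
```

An offset of 55 dB corresponds to an amplitude ratio of roughly 0.0018, which is why the ambience should be barely audible under the speech.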

     

    To achieve all of this, you need to put all the files in Multitrack view in separate tracks, and use the Mixer to add the reverb to the speech, and the Notch filter to the ambience. You can loop the ambience as many times as you need it.
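
Outside Audition, the same looping and mixing idea can be sketched in a few lines. This is only an illustration under assumptions: the sample lists are made-up toy data, the function names are hypothetical, and the loop is a hard butt-join with no crossfade, so real audio might click at the seams.

```python
def loop_to_length(clip, length):
    """Repeat clip end to end, then trim so the result has exactly `length` samples."""
    if not clip:
        raise ValueError("clip must contain at least one sample")
    repeats = -(-length // len(clip))  # ceiling division
    return (clip * repeats)[:length]


def mix_tracks(speech, ambience, ambience_gain):
    """Sum two equal-length tracks sample by sample, scaling the ambience first."""
    return [s + ambience_gain * a for s, a in zip(speech, ambience)]


# Hypothetical toy data standing in for real audio samples.
speech = [0.5, -0.5, 0.25, -0.25, 0.1, 0.0, -0.1]
ambience_clip = [0.2, -0.2, 0.1]

looped = loop_to_length(ambience_clip, len(speech))
mixed = mix_tracks(speech, looped, ambience_gain=0.01)
```

The ambience gain here is an arbitrary small number; in practice it would come from the level rule described earlier in this reply.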

     

    The only other thing I'd say about your file is that it's way too fast to learn from! People assimilate information in the gaps between sentences, and with this file there are hardly any - it needs pacing.

     
  • Jan 3, 2014 9:00 PM   in reply to Maroon83

    Can I ask you which text to speech software/service you used?

     
