If you are looking for an awesome text to speech generator, there are several things to consider:
- How natural does the generated voice sound?
- How easy is the software to use (do you have to sign up or download anything)?
- How much can you extend it (add voice breaks, emphasis, improve pitch or tone)?
- Is it free?
The best text to speech system I can recommend is SPIK.ai, a free web app that uses AI to generate natural-sounding voices. Why is this the best option, in my opinion? The reasons are below.
Using deep learning to generate natural voices
The human voice is very hard to generate with older algorithms. With newer deep learning approaches (for example, convolutional or recurrent neural networks), the voices sound a lot more natural.
There are two very popular approaches to generating human-sounding voices, and the underlying concept is the same for both. What doesn’t work very well is stitching together recorded voice clips to build new sounds, even though we might expect the result to sound natural. It doesn’t. It is costly and time-consuming, and in the end the voice may be smooth but emotionless.
What does work is skipping the recorded clips and generating the audio waveform directly. Two very good approaches are based on convolutional neural networks (WaveNet) and recurrent neural networks (SampleRNN). Fun fact about the two leading approaches: both can also be used to generate music, if properly trained.
SPIK.ai uses the WaveNet algorithm, which lets it provide natural-sounding voices that you can use to generate audio files from text.
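To give a feel for the convolutional idea behind WaveNet, here is a minimal sketch in plain Python of a causal, dilated convolution, the building block that lets each generated audio sample depend on a long stretch of earlier samples. This is only an illustration, not SPIK.ai's code: the function names, the averaging weights and the test signal are all made up for the example, and a real model adds learned weights, nonlinearities and many more layers.

```python
def causal_dilated_conv(signal, weights, dilation):
    """Apply a 1-D causal convolution with the given dilation.

    Output sample t depends only on signal[t], signal[t - dilation],
    signal[t - 2 * dilation], ... Samples before the start count as zero.
    """
    out = []
    for t in range(len(signal)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                total += w * signal[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input samples (present and past) one output sample can see."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Illustrative values only: a single click followed by silence.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# Stack three layers, doubling the dilation each time.
layer1 = causal_dilated_conv(impulse, [0.5, 0.5], dilation=1)
layer2 = causal_dilated_conv(layer1, [0.5, 0.5], dilation=2)
layer3 = causal_dilated_conv(layer2, [0.5, 0.5], dilation=4)

print(layer3)                         # the click has spread over 8 samples
print(receptive_field(2, [1, 2, 4]))  # 8
```

With only three small layers, one output sample already "hears" eight input samples. Doubling the dilation at every layer is what makes it practical to cover the thousands of samples in even a short piece of speech.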
Easy to use voice generation software
Spik.AI is easy to use. So much so that you can just go to its front page, add your text, choose a voice and press generate.
No signup is required, you don’t have to download or install anything, and there are no ads getting in the way of generating the right voice.
Once you have pressed the “Generate” button, the app takes you to another page where you can preview your generated speech and/or download or share it with your friends. That’s it. As simple as it gets.
Go beyond simple text to speech with SSML (Speech Synthesis Markup Language)
Spik.AI can even help you generate scripted speech files using SSML. The markup lets you improve the quality of your audio files with some awesome tricks:
- Breaks: use the <break> tag to set how long the voice should pause.
- Emphasis: use the <emphasis> tag to emphasize an important part of your text.
- Prosody: improve tone, pitch and rate with the <prosody> tag.
- Instruct the app on how to speak special parts of the text with the
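
As a rough sketch of what such a script can look like, the Python snippet below builds a small SSML document using the <break>, <emphasis> and <prosody> tags. The tag and attribute names follow the W3C SSML standard. The helper functions (pause, emphasize, prosody, speak) are hypothetical names invented for this example, and whether Spik.AI accepts every attribute shown here is an assumption worth checking in the app.

```python
from xml.sax.saxutils import escape, quoteattr


def pause(time="500ms"):
    """Return a <break> tag that pauses the voice for the given time."""
    return "<break time=%s/>" % quoteattr(time)


def emphasize(text, level="strong"):
    """Wrap text in an <emphasis> tag, escaping &, < and >."""
    return "<emphasis level=%s>%s</emphasis>" % (quoteattr(level), escape(text))


def prosody(text, rate="medium", pitch="medium"):
    """Wrap text in a <prosody> tag that sets speaking rate and pitch."""
    return "<prosody rate=%s pitch=%s>%s</prosody>" % (
        quoteattr(rate),
        quoteattr(pitch),
        escape(text),
    )


def speak(*parts):
    """Join already-built SSML fragments inside the root <speak> element."""
    return "<speak>" + "".join(parts) + "</speak>"


# Sample text for illustration only.
ssml = speak(
    escape("Welcome to the show."),
    pause("700ms"),
    emphasize("This part matters."),
    prosody(" And this part is slower and lower.", rate="slow", pitch="low"),
)
print(ssml)
```

The printed string is what you would paste into the text box in place of plain text. Escaping the text first matters because a stray & or < would otherwise break the markup.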
These are just some of the cool things you can do with Spik.AI’s text to speech generation software. By the way, the reverse process is also coming soon: speech to text, a great option for quickly creating transcripts from voice recordings. Just make sure to sign up to be reminded when it launches.
Is this text to speech software free?
This is a big one. Spik.AI is awesome at generating voices and easy to use and, unlike many other apps, it is also free. That’s definitely a big plus.