As more and more businesses adopt AI, it’s becoming commonplace to see AI-narrated voiceover in online services. In this article, we explore when AI voices work well for a brand and when they can make your brand seem hard to relate to. For readers who plan to use AI voices, we will share our fair and balanced research results on the best-sounding AI voice generators.
From Synthetic to Authentic: An Overview of AI Voiceover Technology
Numerous businesses are adopting AI technology nowadays. For example, AI chatbots for customer service have changed the way companies contact their customers, and response rates have skyrocketed. But do AI voiceovers benefit a business in the same way?
Although AI is more efficient and can save time, it can also sound cold and emotionless, which makes for an unpleasant viewer experience. Many AI voices are trained on large datasets of professional voice actor recordings, from which they learn intonation, pace, and accent. Once the model has learned appropriate speech patterns, text-to-speech (TTS) generates the voice you want – from an energetic female US commentator to a UK male with a deep, soothing voice. Why do some AI voices feel great while others sound flat? TTS alone can sound monotonous and robotic. That’s where natural language processing (NLP) helps: it improves the AI’s ability to mimic human tone, rhythm, pitch, and vocal fluctuations so that the result sounds more human.
Exploring the Capabilities of AI Voiceovers
Voice generators. AI voices. Synthetic voices. Whatever you call them, chances are good that you hear them frequently – in commercials and corporate marketing videos, for example, or in the e-learning courses and video tutorials that an increasing number of companies now use.
Despite their efficiency, AI voices can be a liability for a brand. If viewers sense a robotic tone, they might reflexively dismiss the voiced content as unimportant or untrustworthy. But are there any current AI voices on the market that are sufficiently sophisticated to avoid these pitfalls? We undertook a comprehensive study of more than 100 voices from more than 20 AI voiceover tools. Our process involved grouping these tools into three categories based on how human-sounding their voices were.
AI Tools That Provide the Highest Voice Quality
We decided that our task would be to look at AI voices for marketing or explainer videos, and we limited ourselves to tools that allow commercial use. We did not consider text-to-speech tools for personal use, such as reading web content or books out loud, nor did we consider AI voices destined for TV broadcasts, audiobooks, etc.
To make our analysis more manageable, we focused on male voices speaking British English, drawn from a shortlist of some 100 options.
Here are all the AI voices we tested, categorized based on quality:
Basic AI Voices | Standard AI Voices | Premium AI Voices |
Aidocmaker.com Good Normal Audiate Davis Chat, Guy Default, Ryan, Tony Friendly Cohesive Adam Descript Malcolm Lovo Marcus, Shawn Murf Finn, Freddie Music Radio Creative Echo, Mike Speechify Evan, Guy Friendly, Liam, Nate Revoicer Andrew, Caleb, Grayson Synthesys Ian Synthesia Newscaster, Professional Voicebooking David |
ElevenLabs Brian, Jeremy, Liam, Paul, Tyler Kyrk (Commercial version) Arnold, Jeremy, Jon William |
We put ElevenLabs’ voices at the top of our list, and for good reason. The Jeremy and Liam voices are the most stable: the free and paid versions are of similar quality, and they’ve been reliably great for more than six months.
We expected all these tools to deliver this level of reliability, but we were terribly disappointed with another tool we had tried that also offers the Jeremy voice. The preview sounded great, but the paid version didn’t match ElevenLabs' quality.
AI vs. a Real Voice Actor
AI voices are increasingly difficult to distinguish from human ones, which is good news for video creators. Still, although quality has improved greatly and AI voices are no longer flat or monotone, they often sound “second-rate” next to their human counterparts. A human voiceover lets you sense the speaker’s body language – the raised eyebrows, the hand gestures, the energy. Without a body to go with the voice, AI voices can’t compete. There is also a telltale “agreeableness” to the AI voice. It isn’t so much that it sounds flat or boring as that its smoothness makes it sound a little unnatural. To put it poetically, the human voice is like a river with its natural meanders, shallows and ever-changing flow, while the AI voiceover is more like an artificial canal: unvarying and direct. The difference is subtle enough that you need good ears to notice it, but it’s there. With that in mind, here are some thoughts on when and why you should trust the human voice.
Putting a Price Tag on a High-Quality AI Voiceover
Synthetic speech is now good enough that many of us instinctively feel the balance has tipped irrevocably in AI’s favor. Human voices are more convincing, but they come at a high price, often an unjustifiably high one.
For instance, GVAA (Global Voice Acting Academy) suggests that e-learning voice actors charge around $0.20–0.35 per word (that’s about $35–50 per minute). A voiceover from Fiverr costs around $15 per minute, or you could post a voiceover request on Upwork with a budget of $2–3 per minute for an ongoing job and still get decent applicants.
By comparison, an AI-generated voiceover costs $0.20–3.50 per minute. The cost depends on the tool and the monthly subscription you choose, which in turn depends on how many minutes of recording you need.
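The per-word and per-minute rates quoted above can be related with a little arithmetic. The sketch below is only a rough illustration: it assumes a narration pace of 150 words per minute (our assumption; no pace is stated above), and the function name is hypothetical.

```python
# Rough conversion from a per-word voiceover rate to a per-minute rate.
# ASSUMPTION: narration runs at about 150 words per minute; the article
# does not state the pace behind its own per-minute figures.
ASSUMED_WORDS_PER_MINUTE = 150


def per_minute_rate(rate_per_word: float,
                    words_per_minute: int = ASSUMED_WORDS_PER_MINUTE) -> float:
    """Return the cost in dollars of one minute of narration."""
    return rate_per_word * words_per_minute


# GVAA-suggested e-learning range quoted above: $0.20 to $0.35 per word.
low = per_minute_rate(0.20)
high = per_minute_rate(0.35)
print(f"Human e-learning rate: ${low:.2f} to ${high:.2f} per minute")
```

Under that assumption the range comes out at roughly $30 to $52.50 per minute, in the same ballpark as the $35–50 quoted above. A slower or faster reading pace shifts the result proportionally.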
Here are the rates for the three AI voiceover tools that we found to be adequate (ranked from higher to lower quality):
Tool Name | Free Version (Monthly) | Paid Version (Monthly) |
ElevenLabs | 10K characters (about 10 mins) | $5 – 30,000 characters (about 30 mins); $22 – 100,000 characters (about 120 mins); $99 – 500,000 characters (about 600 mins); $330 – 2,000,000 characters (about 2400 mins) |
NaturalReader | 5K characters per day (about 5 mins) | $99 – 300,000 characters (about 360 mins) |
Play.ht | 12.5K characters (about 12.5 mins) | $39 – 250,000 characters (about 330 mins); $99 – unlimited |
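To compare these plans on equal terms, it can help to work out an effective price per minute. The sketch below is a minimal illustration that uses the prices and minute estimates from the table; the plan labels and the function name are made up here for readability, and it assumes the whole monthly allowance gets used.

```python
# Effective price per minute for the paid plans listed in the table above.
# Prices and minute estimates are copied from the table; the plan labels
# (e.g. "ElevenLabs $5") are hypothetical names used only for this sketch.
PLANS = {
    "ElevenLabs $5": (5, 30),
    "ElevenLabs $22": (22, 120),
    "ElevenLabs $99": (99, 600),
    "ElevenLabs $330": (330, 2400),
    "NaturalReader $99": (99, 360),
    "Play.ht $39": (39, 330),
}


def price_per_minute(monthly_price: float, minutes: float) -> float:
    """Dollars per minute, assuming the full monthly allowance is used."""
    return monthly_price / minutes


for name, (price, minutes) in PLANS.items():
    print(f"{name}: ${price_per_minute(price, minutes):.3f} per minute")
```

On these figures every listed plan works out to under $0.30 per minute when fully used; minutes you pay for but never use push the effective rate up. The unlimited Play.ht plan is left out because it has no fixed allowance to divide by.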
How Much Time Does AI Voiceover Technology Save?
Let’s say that you found and successfully onboarded a voice actor. Now comes the crunch: timing. How soon can they start voicing the script? How long will it be before you get the first pass? How much time will revisions take? And what about the post-production of the final version?
Based on our own experience and that of our industry colleagues, we estimate that each of these stages – drafting, revisions, and post-production – requires around 24 to 48 hours. So, a 5-minute voiceover takes at least 72 hours in total (three full days), and up to 144 hours if every stage runs to 48.
What about an AI voiceover?
Basic & Standard AI Voices
These usually need substantial effort to put inflection where it should be.
Premium AI Voices
Premium voices typically get the timbre right about 95 percent of the time and place emphasis where it belongs. A good AI voice, for instance, might read, ‘You’re probably already aware that humans and AI are VERY different creatures,’ without any special fiddling. You might need to tweak no more than 1–2 sentences out of 10 – say, using quotation marks or ALL CAPS to shift the emphasis slightly. With a good-quality AI voice, producing a 5-minute voiceover takes about 40–60 minutes, but only if you don’t need to rewrite anything.
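The time estimates above can be put side by side with some simple arithmetic. This is a rough sketch using only the figures already quoted (three human stages of 24 to 48 hours each, and 40 to 60 minutes for a premium AI voice); the variable names are our own.

```python
# Turnaround comparison for a 5-minute voiceover, using the estimates above.
# Human: three stages, each taking 24 to 48 hours.
# AI (premium voice): 40 to 60 minutes in total.
STAGES = 3
HOURS_PER_STAGE = (24, 48)
AI_MINUTES = (40, 60)

human_hours = (STAGES * HOURS_PER_STAGE[0], STAGES * HOURS_PER_STAGE[1])
ai_hours = (AI_MINUTES[0] / 60, AI_MINUTES[1] / 60)

# Most conservative speed-up: fastest human turnaround vs. slowest AI run.
min_speedup = human_hours[0] / ai_hours[1]

print(f"Human: {human_hours[0]} to {human_hours[1]} hours")
print(f"AI: {AI_MINUTES[0]} to {AI_MINUTES[1]} minutes")
print(f"AI is at least {min_speedup:.0f}x faster on elapsed time")
```

Keep in mind that the two numbers measure different things: the human figure is mostly waiting time, while the AI figure is your own hands-on work.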
Comparing AI and Human Voiceovers: Advantages and Disadvantages
Given that AI can produce voiceovers in a matter of minutes, often for free, is there a downside to them?
When it comes to the production of an audio recording, we’ve already explored how and why quality and cost issues can be of utmost importance. However, in our experience, other factors can prove decisive in making a choice between a human or AI voiceover.
Factor | AI Voiceover | Human Voiceover |
Budget | Generally cheaper: a decent AI voiceover will cost $0.2–3.5 per minute | Considerably more expensive: even though certain voice actors may work at $2–5 per minute, most experts will charge $15 or more per minute |
Time | Nearly instant | Fairly time-consuming |
Editing | Very easy | Hard |
Consistency | High | Medium |
Impact | Low to medium | High |
Originality | Low to medium | High |
Main Findings
Many AI solutions provide voiceovers of differing quality. In general, when a project needs results fast, AI is a good solution: AI voiceovers are quick, accurate, and reliable. Even so, human voices still hold an advantage. A human can add depth and emotion to a script, which helps it connect with the audience. It all comes back to your requirements, budget, and brand ethos, and to sending the right message to your potential and current customers. In the booming world of generative AI, the most effective choice is one based on your particular context and on detailed, well-thought-out requirements.