'Voice Engine': OpenAI unveils AI technology that clones anyone's voice
Acknowledging election risks, San Francisco-based company says the model can duplicate someone's speech based on a 15-second audio sample, but it has yet to decide how to deploy the technology.
OpenAI has revealed a voice-cloning tool it plans to keep tightly controlled until safeguards are in place to thwart audio fakes meant to dupe listeners.
A model called "Voice Engine" can essentially duplicate someone's speech based on a 15-second audio sample, according to an OpenAI blog post sharing results of a small-scale test of the tool.
"We recognise that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year," the San Francisco-based company said on Friday.
"We are engaging with US and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build."
Disinformation researchers fear rampant misuse of AI-powered applications in a pivotal election year thanks to proliferating voice-cloning tools, which are cheap, easy to use and hard to trace.
Acknowledging these problems, OpenAI said it was "taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse."
In a post dated March 29, 2024, OpenAI (@OpenAI) wrote: "We're sharing our learnings from a small-scale preview of Voice Engine, a model which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker." https://t.co/yLsfGaVtrZ
Deepfakes, robocalls and more
The cautious unveiling came a few months after a political consultant working for the long-shot presidential campaign of a Democratic rival to Joe Biden admitted being behind a robocall impersonating the US leader.
The AI-generated call, the brainchild of an operative for Minnesota congressman Dean Phillips, featured what sounded like Biden's voice urging people not to cast ballots in January's New Hampshire primary.
The incident caused alarm among experts who fear a deluge of AI-powered deepfake disinformation in the 2024 White House race as well as in other key elections around the globe this year.
OpenAI said that partners testing Voice Engine agreed to rules including requiring explicit and informed consent of any person whose voice is duplicated using the tool.
It must also be made clear to audiences when the voices they are hearing are AI-generated, the company added.
"We have implemented a set of safety measures, including watermarking to trace the origin of any audio generated by Voice Engine, as well as proactive monitoring of how it's being used," OpenAI said.
Sam Altman, CEO of OpenAI, appears poised to challenge Apple's Siri and Amazon's Alexa voice assistants next. He hinted as much in a recent media interview, where he emphasised OpenAI's focus on releasing "a lot of other important things" before GPT-5's launch.