OpenAI knows that AI-generated voice tools can be a sketchy business.
In a blog post sharing the early test phase results of its new synthetic voice tool, the artificial intelligence company addressed concerns about the use of AI to replicate human voices, especially in an election year.
OpenAI’s “Voice Engine” tool, which the company says it first developed in late 2022, uses a 15-second audio clip of a real person’s voice to create an eerily realistic, human-sounding replica of that voice.
And users can make that voice say anything — even in other languages.
The tool is not yet available to the public, and OpenAI says it is still considering “whether and how to deploy this technology at scale.”
“We recognize that generating speech that resembles people’s voices has serious risks, which are especially top of mind in an election year,” OpenAI wrote in its blog post. “We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society, and beyond to ensure we are incorporating their feedback as we build.”
OpenAI currently uses the tool to power ChatGPT’s “read aloud” features, as well as the company’s text-to-speech API.
Late last year, OpenAI began offering the tool externally, working with what it described as a “small group of trusted partners” to test Voice Engine for uses such as children’s educational materials, language translation, and medical voice recovery, the company said in its post.
OpenAI stressed that its partner organizations must obey strict policies to use Voice Engine, like getting consent from every individual being impersonated and informing listeners that the voice is AI-generated.
“We are taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse,” the company wrote. “We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities.”
Though the company said it’s not yet sure whether it will ever release the tool to the general public, it urged policymakers and developers to take steps to prevent dangerous misuse of the technology it is developing.
For example, OpenAI suggested establishing a “no-go voice list” to prevent the nonconsensual replication of prominent voices, like politicians or celebrities.
The company also recommended that banks stop using voice-based authentication as a security measure and that researchers develop techniques to determine whether a voice is real or fake.