OpenAI’s Voice Engine Announced: Potential and Concerns

OpenAI has unveiled a new AI-based voice cloning tool called Voice Engine. From just a 15-second audio sample, the technology can generate natural-sounding speech that closely resembles the original speaker. Given a piece of text, the system reads it aloud in a synthetic voice that sounds like the user’s, even in a language other than the user’s native one.

The high-profile AI start-up has allowed a small group of businesses to test this new system. However, OpenAI is not sharing the technology more widely yet due to potential dangers. A voice generator could help spread disinformation across social media and allow criminals to impersonate people online or during phone calls. The company is particularly worried that this kind of technology could be used to break voice authenticators that control access to online banking accounts and other personal applications.

OpenAI is exploring ways of watermarking synthetic voices and of adding controls that prevent people from using the technology with the voices of politicians or other prominent figures. The company is taking a cautious approach to a broader release so that it can better understand the technology’s potential dangers.

The generative AI model powering Voice Engine has been hiding in plain sight for some time. The same model underpins the “read aloud” capabilities in ChatGPT, OpenAI’s AI-powered chatbot, as well as the preset voices available in OpenAI’s text-to-speech API. Spotify has been using it since early September 2023 to dub podcasts for high-profile hosts like Lex Fridman into different languages.

Read more: www.nytimes.com
