In today’s episode of What Could Go Wrong: AI Edition, a tech company that uses artificial intelligence to mimic voices is adding more “safeguards” to its tech after it was used to generate clips of celebrities reading offensive content.
ElevenLabs is a research company specializing in AI speech software that generates realistic-sounding voices for audiobooks, games, and articles in any language. One of its tools, Voice Lab, lets users “clone” a voice by simply uploading a one-minute clip of themselves speaking. From there, you can use the cloned voice to read up to 2,500 characters using its text-to-speech feature. I know what you’re thinking: there’s no way anyone could exploit this system by uploading someone else’s voice, right?
Enter 4chan. 4chan posters used the application to generate sound clips of celebrities saying racist, homophobic, and other offensive messages and then spread them online. I won’t link them here, but the clips that have been circulating most widely include Emma Watson reading an excerpt of Adolf Hitler’s Mein Kampf and Joe Biden announcing that the US will send troops into Ukraine.
ElevenLabs says it has been taking steps to keep Voice Lab from being used for “malicious purposes,” posting how it plans to keep its tech out of the wrong hands in a lengthy Twitter thread.
ElevenLabs claims it “always had the ability to trace any generated audio clip back to a specific user.” Next week it will release a tool that will allow anyone to confirm that a clip was generated using its technology and report it.
The company says that the malicious content was created by “free anonymous accounts,” so it will add a new layer of identity verification. Voice Lab will be made available only on paid tiers, and the free version will be removed from its site immediately. ElevenLabs is currently tracking and banning any account that creates harmful content in violation of its policies.
ElevenLabs admits that putting the tech behind a paywall “won’t always prevent abuse” but says it does “make VoiceLab users less anonymous and force them to think twice before sharing improper content.”
But what about the problem of people using celebrity voices in general instead of their own? ElevenLabs has suggested requiring users to read a sample prompt to train the AI on their voice instead of uploading any ole audio file.
Free accounts will still be able to use the text-to-speech functionality, but only with access to pre-made voices. ElevenLabs says it will continue to monitor the situation and that all affected accounts will get a refund.
Just this week, the CEO of OpenAI said that the misuse of AI could be “lights out for all of us.” But seriously, what did ElevenLabs think would happen if you gave the internet a tool that could make any voice say anything? Come on, guys.