Meta unveils speech-generation AI: Voicebox
Meta, the parent company of Facebook and Instagram, announced a speech-generation AI model called Voicebox on June 16.
The company said Voicebox could generate speech from text and noted that the model could match an audio style based on a sample just two seconds long.
Voicebox can also convert a text sample to another language and, given a separate speech sample, read the translated text in the speaker’s original voice. This capability supports six languages: English, French, German, Spanish, Polish, and Portuguese.
The model can also edit existing recordings to remove background noise. More generally, it can generate speech modeled on a wide variety of speech samples.
Voicebox could be leveraged by various users
Meta said that Voicebox and other similar AI models could allow virtual assistants and non-player characters in its metaverse to have realistic voices. The tool could also be of use to content creators and to users with accessibility needs, it said.
Meta said that Voicebox is currently a research project. It did not say when the model might become publicly available, but it shared a demo video.
Meta announced several consumer AI tools earlier in June, revealed details about its AI chips in May, and discussed internal AI applications in an April investor call.