Meta’s Voicebox is an AI-powered text-to-speech studio

Meta’s Voicebox technology is not yet available to the general public. — Picture courtesy of AndreyPopov / Getty Images via ETX Studio

Thursday, 22 Jun 2023 8:01 AM MYT

SAN FRANCISCO, June 22 — After virtual reality, the Meta group is now entering the audio arena. The American tech giant has unveiled Voicebox, a handy online studio for transforming text into audio, in six different languages. For the time being, Meta has decided not to share its new AI tool with the general public.

In a blog post, the social networking giant describes the new tool as “a generative AI model that can help with audio editing, sampling and styling.”

More natural voices

First and foremost, Meta’s studio will enable text-to-speech generation, i.e., it will be able to transform written text into spoken audio using a synthetic voice. Among other options, users will be able to benefit from cross-lingual style transfer. “Given a sample of speech and a passage of text in English, French, German, Spanish, Polish, or Portuguese, Voicebox can produce a reading of the text in that language,” says Meta.

Even more impressive is Voicebox’s ability to reproduce an audio style from a sample just two seconds long, which can then be used to generate other audio content. The resulting speech is more representative of the way people speak in everyday life, so it sounds more natural and more pleasing to the ear.

In addition to transforming text into audio and reproducing an audio style, the studio offers the option of editing an existing recording. The user can delete a sound or any other part of an audio track to clean up the content without having to make a new recording.

“We trained Voicebox with more than 50,000 hours of recorded speech and transcripts from public domain audiobooks in English, French, Spanish, German, Polish, and Portuguese. Voicebox is trained to predict a speech segment when given the surrounding speech and the transcript of the segment,” explains Meta.

However, the American group is not the first to have taken an interest in synthetic voices. TikTok caused a buzz with its own text-to-speech tool when the feature launched in 2020. The Chinese giant even made it possible to use the voices of Disney movie characters such as Rocket Raccoon from Guardians of the Galaxy, C-3PO from Star Wars and Stitch from Lilo & Stitch to read text aloud.

Seen as more engaging and more inclusive, synthetic voices continue to appeal to users and to major players in social networking. For Meta, “this type of technology could be used in the future to help creators easily edit audio tracks, allow visually impaired people to hear written messages from friends in their voices, and enable people to speak any foreign language in their own voice.” For the company, it is also a way of strengthening ties and attracting new users. — ETX Studio
