OpenAI releases Voice Engine text-to-speech model

Published

March 29, 2024

OpenAI today unveiled Voice Engine, a text-to-speech model that generates natural-sounding speech from text input.

The company has shared a small-scale preview of the model to show how closely the generated speech resembles the original speaker.

The company says this small model can create emotive and realistic voices from a single 15-second audio sample. However, the generated samples still sound like they need some polish.

OpenAI first developed Voice Engine in late 2022 and has used it to power the preset voices in its text-to-speech API as well as ChatGPT’s voice and read-aloud features. Here are a few of the use cases the company is looking to support with this model:

Reading assistance
Content translation
Supporting people who are non-verbal
Helping patients recover their voice

The Voice Engine model is in an early phase of testing with a small group of trusted partners. The preview is not yet available to a wider group of testers.

A few improvements are expected in future builds of Voice Engine. You can check out all of the speech samples on OpenAI’s official website, linked below.

(source)