OpenAI gives ChatGPT mouth and eyes

A new version of OpenAI’s ChatGPT chatbot gains voice and image capabilities. Users can hold spoken conversations with it and solve problems using photos.
As with Amazon’s Alexa, Apple’s Siri and other digital voice assistants, users can speak to ChatGPT and the bot will respond.

Bedtime stories from ChatGPT
ChatGPT’s new voice function makes it possible to hold conversations on the go, “request a bedtime story for your family or settle a discussion at the dinner table,” says OpenAI, citing a few example uses.

OpenAI claims that ChatGPT’s synthetic voices sound more natural than those of popular digital voice assistants. There are five voices to choose from, including male and female voices. According to OpenAI, the voice feature is based on a new text-to-speech model that can generate a human-like voice from text and a few seconds of sample speech. To create the voices, the company says it worked with professional voice actors.

According to OpenAI, Spotify is also using the technology in the pilot phase of its Voice Translation feature, which lets the platform’s podcasters translate their content into other languages in their own voices.

Cooking with photos of the contents of the refrigerator

ChatGPT users will be able not only to converse with the chatbot but also to take photos of things around them and ask it to troubleshoot, for example, why the grill won’t start. When presented with a photo, table or chart, ChatGPT can provide a detailed description of the image and answer questions about its contents. Users can also upload a photo of the inside of their refrigerator, and the chatbot can suggest a list of dishes they can prepare with the ingredients on hand.

ChatGPT combines chatbot and voice assistant
The success of ChatGPT from Microsoft-backed OpenAI has created hype around AI. The rapidly improving technology can summarize documents, write computer code, and produce intelligible speech and even photos and videos by processing and synthesizing massive amounts of data. More and more companies are betting on AI and trying to launch their own applications based on generative AI.

With the new version of ChatGPT, OpenAI is moving beyond rival chatbots like Google Bard and into competition with technologies like Alexa and Siri. Amazon’s and Apple’s voice assistants have long offered ways to interact with smartphones, laptops and other devices through spoken words. However, chatbots such as ChatGPT and Google Bard have more powerful language capabilities and can instantly write emails, poems or term papers and comment on almost any topic thrown at them. With the ChatGPT update, OpenAI has now partly combined the two communication methods.

According to OpenAI, the new voice and image features in ChatGPT will be released to subscribers of the Plus and Enterprise plans in the next two weeks. Other user groups are expected to follow. The voice function will initially be available only on iOS (iPhones and iPads) and Android devices; the image function, on the other hand, will be available on all platforms.

