OpenAI’s latest upgrade essentially lets users livestream with ChatGPT

cyptouser | 6 months ago (05-14) | Cryptocurrencies News

ChatGPT creator OpenAI has announced its latest AI model, GPT-4o, a chattier, more humanlike AI chatbot, which can interpret a user’s audio and video and respond in real time.

A series of demos released by the firm shows GPT-4 Omni helping potential users with things like interview preparation — by making sure they look presentable for the interview — as well as calling a customer service agent to get a replacement iPhone.

Other demos show it can share dad jokes, translate a bilingual conversation in real time, be the judge of a rock-paper-scissors match between two users, and respond with sarcasm when asked. One demo even shows how ChatGPT reacts to being introduced to the user’s puppy for the first time.

"Well hello, Bowser! Aren't you just the most adorable little thing?" the chatbot exclaimed.

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN

Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx
— OpenAI (@OpenAI) May 13, 2024

“It feels like AI from the movies; and it’s still a bit surprising to me that it’s real,” said the firm’s CEO, Sam Altman, in a May 13 blog post.

“Getting to human-level response times and expressiveness turns out to be a big change.”

A text and image-only input version was launched on May 13, with the full version set to roll out in the coming weeks, OpenAI said in a recent X post.

GPT-4o will be available to both paid and free ChatGPT users and will also be accessible through OpenAI’s API.

OpenAI said the “o” in GPT-4o stands for “omni,” a name meant to mark a step toward more natural human-computer interactions.

Introducing GPT-4o, our new model which can reason across text, audio, and video in real time.

It's extremely versatile, fun to play with, and is a step towards a much more natural form of human-computer interaction (and even human-computer-computer interaction): pic.twitter.com/VLG7TJ1JQx
— Greg Brockman (@gdb) May 13, 2024

GPT-4o’s ability to process any combination of text, audio and image inputs at the same time is a considerable advancement compared with OpenAI’s earlier AI tools, such as GPT-4, which often “loses a lot of information” when forced to multi-task.

OpenAI said “GPT-4o is especially better at vision and audio understanding compared to existing models,” which even includes picking up on a user’s emotions and breathing patterns.

It is also “much faster” and “50% cheaper” than GPT-4 Turbo in OpenAI’s API.

The new AI tool can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, OpenAI claims, which it says is similar to human response times in an ordinary conversation.
