ChatGPT Gains Real-Time Video Capability After 7 Months
2024-12-12
OpenAI has built a reputation for pushing the boundaries of what AI assistants can do, and the recent release of real-time video capabilities for ChatGPT marks another significant step. The new feature lets users show ChatGPT what they are looking at, adding visual context to conversations that were previously limited to text and voice.

Unlock the Power of Visual Interaction with ChatGPT

Introduction to Real-Time Video Capabilities

After nearly seven months of anticipation, OpenAI has finally made its real-time video capabilities for ChatGPT a reality. During a livestream on Thursday, the company announced that its Advanced Voice Mode, a human-like conversational feature for ChatGPT, is now getting vision. This means that users subscribed to ChatGPT Plus, Team, or Pro can use their phones to point at objects and receive near real-time responses from ChatGPT.

Advanced Voice Mode with vision can also understand what's on a device's screen through screen sharing. For example, it can explain various settings menus or offer suggestions on a math problem. This added visual element enhances the user experience and opens up new ways of interacting with ChatGPT.

Accessing Advanced Voice Mode with Vision

To access Advanced Voice Mode with vision, users simply need to tap the voice icon next to the ChatGPT chat bar and then tap the video icon on the bottom left. This will start the video and allow users to interact with ChatGPT in a visual context. To screen-share, users can tap the three-dot menu and select "Share Screen."

The rollout of Advanced Voice Mode with vision will begin on Thursday and be completed within the next week. However, not all users will have access immediately. ChatGPT Enterprise and Edu subscribers will have to wait until January, and there is no timeline yet for users in the EU, Switzerland, Iceland, Norway, or Liechtenstein.

Demo on CBS's "60 Minutes"

In a recent demo on CBS's "60 Minutes," OpenAI President Greg Brockman showed Advanced Voice Mode with vision in action. He had ChatGPT quiz Anderson Cooper on his anatomy skills as Cooper drew body parts on a blackboard. ChatGPT was able to "understand" what Cooper was drawing, demonstrating the visual analysis component's ability to interpret rough sketches in real time.

However, in the same demo, Advanced Voice Mode with vision made a mistake on a geometry problem, a reminder that the feature is not perfect and remains prone to hallucinations. This is an important consideration for users who rely on the accuracy of ChatGPT's responses.

Delays and Focus on Additional Platforms

Advanced Voice Mode with vision has been delayed multiple times, reportedly due in part to OpenAI announcing the feature far before it was production-ready. In April, the company promised that the feature would roll out to users "within a few weeks," but it took several months for it to finally arrive in early fall for some ChatGPT users. At that time, it lacked the visual analysis component.

In the lead-up to Thursday's launch, OpenAI has been focused on bringing the voice-only Advanced Voice Mode experience to additional platforms and users in the EU. This shows the company's commitment to expanding the reach and usability of its products.

Rivals in the Race

OpenAI's competitors, such as Google and Meta, are also working on similar capabilities for their respective chatbot products. This week, Google made its real-time, video-analyzing conversational AI feature, Project Astra, available to a group of "trusted testers" on Android. This highlights the growing importance of visual interaction in the world of chatbots and the intense competition among companies to offer the most advanced features.

Launch of "Santa Mode"

In addition to Advanced Voice Mode with vision, OpenAI launched a festive "Santa Mode" on Thursday. This adds Santa's voice as a preset voice in ChatGPT, providing a fun and seasonal experience for users. Users can find it by tapping or clicking the snowflake icon in the ChatGPT app next to the prompt bar.
