ChatGPT gains multimodal capabilities to better assist users

ChatGPT has gained multimodal capabilities, allowing it to receive and respond to image and voice inputs. These new features should make ChatGPT more helpful in a variety of tasks, such as solving math problems, identifying objects, and providing recipes.

To use the image input feature, users simply need to snap a picture of what they are looking at and add the question they’d like an answer to. ChatGPT will then analyze the image and provide a response. For example, users could use this feature to identify the name of a plant, look up the nutritional information of a food item, or get help solving a math problem.

The voice input and output feature lets ChatGPT function much like a voice assistant. Users can now ask ChatGPT to perform tasks or answer questions simply by speaking. ChatGPT will then process the request and respond verbally.

The sources for this piece include an article in ZDNET.
