Apple's ReALM AI can identify on-screen references and understand background context

Apple has been researching a new large language model called ReALM, which can detect on-screen references and understand their context.

ReALM (Reference Resolution As Language Modeling) works out what a reference points to when its context or background information is not stated, so that it can give the end user the relevant details.

Combined with a voice assistant, the model could substantially improve general conversation.

In the background, ReALM reconstructs the screen using parsed on-screen data and locations to generate a text-based representation that captures the visual layout (via VentureBeat).
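The screen-to-text step described above can be illustrated with a minimal Python sketch. It is not Apple's implementation: the `ScreenEntity` type, the `screen_to_text` function, the `line_margin` threshold, and the choice of tabs and newlines as separators are all assumptions made for illustration. The sketch sorts parsed elements by position, groups elements at roughly the same height into one line, and emits plain text that roughly preserves the layout.

```python
from dataclasses import dataclass


@dataclass
class ScreenEntity:
    """A parsed on-screen element (hypothetical structure for illustration)."""
    text: str
    x: float  # left edge of the element
    y: float  # top edge of the element


def screen_to_text(entities, line_margin=10.0):
    """Render entities as text: one row per screen line, tabs between columns.

    line_margin is an assumed tolerance: elements whose top edges lie within
    this distance of a line's first element are placed on that line.
    """
    # Read the screen top to bottom, then left to right.
    ordered = sorted(entities, key=lambda e: (e.y, e.x))

    lines = []
    for entity in ordered:
        if lines and abs(entity.y - lines[-1][0].y) <= line_margin:
            lines[-1].append(entity)
        else:
            lines.append([entity])

    # Within each line, order elements left to right and join with tabs.
    return "\n".join(
        "\t".join(e.text for e in sorted(line, key=lambda e: e.x))
        for line in lines
    )


screen = [
    ScreenEntity("Call", x=200, y=102),
    ScreenEntity("Pharmacy", x=10, y=100),
    ScreenEntity("555-0100", x=10, y=140),
]
layout_text = screen_to_text(screen)
print(layout_text)
```

A text rendering like this could then be placed in a language model's prompt next to the user's request, so that a phrase such as "call that number" can be matched against the entities shown on screen.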

The researchers have tuned the model to resolve different kinds of references. It could have the capability to outperform GPT-4.

The researchers note that ReALM improves on an existing system with similar functionality. Its smallest model is 5 percent better at identifying on-screen references.

They add that the model may surpass GPT-4's performance. Despite these advantages, the model still has limitations in the number of applications it can handle.

There’s no confirmation whether Apple will implement this model in its Siri voice assistant. However, the company keeps pushing boundaries to compete against its AI rivals, and it is researching new AI solutions to bring to its devices.

Mannoo specializes in Generative AI, Large Language Models (LLMs), and Aerospace Science. Prior to delving into these fields, he was a Python programmer, a game designer, and an Android and iOS app developer with over 5 years of experience. Before working at Eonmsk.com, he wrote about smartphones and technology. You can explore his X/Twitter and LinkedIn pages or contact him through his email.