Connect with us

News

xAI releases Grok 4 Fast, its cost-efficient models with 2 million token context window

Published

on

xAI Grok 4 Fast

xAI has launched Grok 4 Fast, its new cost-efficient reasoning models. It offers a massive token context window to provide more freedom while interacting with AI.

This new model outperforms Grok 3 Mini on all reasoning benchmarks with the highest token cost efficiency. This is the result of large-scale reinforcement learning, which optimizes Grok 4 Fast’s intelligence density. According to xAI’s evaluation, Grok 4 has used 40% fewer thinking tokens on average compared to Grok 4, while providing the same performance as Grok 4, as tested on benchmarks.

This token efficiency and lower price per token allow the AI maker to reduce the price by up to 98% to achieve the same performance as Grok 4.

This new model can seamlessly browse the internet and X (formerly Twitter) to augment queries with real-time data. It searches through links, online media, including the ones that are available on X, and combines the results as fast as possible. This capability comes in handy when asked for coding or web-related queries.

Unified Model

Unlike previous models, Grok 4 Fast comes with a unified architecture where reasoning and non-reasoning are handled by the same model weights. xAI explains that this unified approach reduces end-to-end latency and cuts token costs.

Users can now access the Grok 4 Fast via the Grok web, Android, and iOS applications. The Auto mode will now choose this new model for difficult queries for a fast experience without degrading response quality. Interestingly, this model is now available for all Grok users, including the free ones. And for a limited time, the new model will also be available for free on OpenRouter and Vercel AI Gateway.

API

xAI has released two models – grok-4-fast-reasoning and grok-4-fast-non-reasoning. These two have a 2-million token context window. These two models are now available on the xAI API console.

(source)

Mannoo specializes in Generative AI, Large Language Model (LLM), and Aerospace Science. Prior to delving into these fields, he was a Python programmer, a game designer, and an Android and iOS app developer with over 5 years of experience. He has prior writing experience in creative writing about smartphones and technology before working at Eonmsk.com. You can explore his X/TWitter and LinkedIn pages or contact him through his email.