Grok 4 Surpasses Rivals to Become the Top AI Model in Latest Benchmarks

Elon Musk’s xAI has taken the top spot in the latest AI benchmark rankings with its newest model, Grok 4. According to newly released results, Grok 4 scored 73 on the Artificial Analysis Intelligence Index, ahead of OpenAI’s o3 and Google’s Gemini 2.5 Pro (both at 70), DeepSeek R1 0528 (68), and Anthropic’s Claude 4 Opus (64).

This marks a pivotal moment for xAI: it is the first time one of its models has claimed the top spot in these performance evaluations. While Grok 3 was already competitive with models from industry leaders, Grok 4 sets a new standard, excelling in coding, mathematical reasoning, and general knowledge. The benchmarks were run against the xAI API, so the results may differ slightly from the behavior of the version currently integrated into the X (Twitter) platform.

Grok 4’s capabilities go beyond raw power: it is designed for advanced reasoning, making its responses more thoughtful and nuanced. Its per-token pricing matches that of Claude 4 Sonnet ($3/$15 per 1M input/output tokens), which makes it more expensive than Gemini 2.5 Pro or o3. Even so, Grok 4 is expected to be widely available, not only on X and through the xAI API but also through platforms like Microsoft Azure AI Foundry.