I learned this the hard way... If you use the recently released Grok-3 Mini reasoning model (which is great, by the way), your token usage might be reported differently than you expect...
TLDR;
While both OpenAI and xAI report reasoning usage in the `usage.completion_tokens_details.reasoning_tokens` field:
- OpenAI includes reasoning tokens in `usage.completion_tokens`
- xAI doesn't include them
Hence for OpenAI (and, according to my tests, for DeepSeek R1) you can get the total completion tokens from the good old `completion_tokens` field. With xAI you need to add the two values together to get the right totals (and keep your cost estimates correct).
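Here's a minimal sketch of how I normalize the numbers (the field names follow the OpenAI Chat Completions response shape; the `provider` check is just an illustration, not anything the libraries do for you):

```python
def total_completion_tokens(usage, provider: str) -> int:
    """Return completion tokens including reasoning tokens.

    `usage` is the `usage` object from an OpenAI-compatible
    chat completions response.
    """
    completion = usage.completion_tokens
    details = getattr(usage, "completion_tokens_details", None)
    reasoning = getattr(details, "reasoning_tokens", 0) if details else 0

    if provider == "xai":
        # xAI reports reasoning tokens separately, so add them in.
        return completion + (reasoning or 0)
    # OpenAI (and, in my tests, DeepSeek R1) already include
    # reasoning tokens in completion_tokens.
    return completion
```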
Neither litellm nor AG2 (of the LLM libraries I've used recently) adjusts the reported usage for this Grok quirk.
Not fully OpenAI Chat Completions API Compliant
The Grok API provides an OpenAI-compatible endpoint. For reasoning models they didn't reinvent the wheel and use the standard `reasoning_effort` parameter, just like OpenAI does with its o1/o3/o4 models. Yet for some reason xAI decided to deviate from OpenAI's approach to reasoning token accounting.
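For reference, here's roughly what a request looks like through the OpenAI Python SDK pointed at xAI's endpoint (the base URL and model name are from memory, so double-check them against the xAI docs):

```python
from openai import OpenAI

# xAI exposes an OpenAI-compatible Chat Completions endpoint.
client = OpenAI(
    api_key="YOUR_XAI_API_KEY",
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-3-mini",               # reasoning model
    reasoning_effort="low",            # same parameter OpenAI uses for o1/o3/o4
    messages=[{"role": "user", "content": "What is 101 * 3?"}],
)

usage = response.usage
print("completion_tokens:", usage.completion_tokens)
print("reasoning_tokens:", usage.completion_tokens_details.reasoning_tokens)
# With xAI, total completion tokens = completion_tokens + reasoning_tokens.
```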
It's unfortunate that this inconsistency made it into xAI's production API.