Fix chat completions returning null logprobs for models without reasoning #430

Open
ethanj801 wants to merge 1 commit into theroyallab:main from ethanj801:fix-chat-logprobs-content-span

ethanj801 commented Jun 27, 2026

Is your pull request related to a problem? Please describe.
Yes. This fixes a bug where the chat completions endpoint returns no logprobs when the model is not configured with reasoning or tool-call tags.

When you call /v1/chat/completions or /v1/completions, you can ask for logprobs by setting the logprobs and top_logprobs parameters in the request. The server then reports, for each generated token, the log-probability the model gave it along with the most likely alternative tokens.

The problem is that /v1/completions returns logprobs correctly, while /v1/chat/completions returns logprobs: null unless the model has both reasoning tags (for example <think>) and tool-call tags configured. The request still succeeds with a 200, so nothing signals that anything went wrong. The logprobs field simply comes back empty.

The cause is in the chat streaming collector (_chat_stream_collector in endpoints/OAI/utils/chat_completion.py). A chat response can contain hidden reasoning spans and tool-call spans. Judging by the structure of the code and its comments, logprobs are only meant to cover the visible answer. That is enforced by the following if statement:

# Collect logprobs in content span only. Also make sure we're not just coming
# out of a </think> tag
if (
     "logprobs_content" in generation
     and tag not in [t_think_end, t_tool_end]   # PR NOTE: issue is on this line
     and not in_reasoning
     and not in_tool
):
     collected_logprobs += generation["logprobs_content"]

The variables in_reasoning and in_tool are set to True when a reasoning/tool start tag is detected and back to False when the matching end tag is detected. Because those flags are updated earlier in the code, they are already False by the time the end tag itself reaches this check. The if statement therefore needs a separate test (tag not in [t_think_end, t_tool_end]) to recognise that the current token is a reasoning/tool end tag and skip it.

That separate test is where the bug lies. The variable tag is set to the reasoning/tool tag matched by a regex (see below), or to None if no tag is found. For a model with no reasoning or tool-call tags configured, t_think_end and t_tool_end are also None. An ordinary content token has tag set to None, so tag not in [t_think_end, t_tool_end] evaluates None not in [None, None], which is False. The condition therefore never passes and the log probabilities are never collected.

tag = None

while text:
    # Find + identify tag and split text into before and after parts
    if split_re:
        match = split_re.search(text)
        if match:
            i, j = match.span()
            sub, text, tag = text[:i], text[j:], match[0]
        else:
            sub, text, tag = text, "", None
    else:
        sub, text, tag = text, "", None
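The failure described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: should_collect is a hypothetical helper that mirrors only the tag and flag parts of the guard, and the tag strings are placeholder values chosen for illustration.

```python
def should_collect(tag, t_think_end, t_tool_end, in_reasoning=False, in_tool=False):
    """Hypothetical stand-in for the guard, without the generation-dict check."""
    return (
        tag not in [t_think_end, t_tool_end]
        and not in_reasoning
        and not in_tool
    )

# Both end tags configured: an ordinary content token (tag is None) is collected.
both = should_collect(None, "</think>", "</tool_call>")

# No tags configured: both ends are None, so None counts as a member of the list.
neither = should_collect(None, None, None)

# Only reasoning configured: the unset tool end tag is still None in the list.
one = should_collect(None, "</think>", None)

print(both, neither, one)  # True False False
```

The third case suggests that a single unset end tag is enough to trigger the problem, which matches the "lacks either" wording used later in this description.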

Why should this feature be added?
This is a bugfix that restores expected behavior. logprobs are an OpenAI-compatible feature that the chat endpoint advertises, so a user would reasonably expect them to work. The condition that triggers the bug, using a model that lacks either reasoning or tool calling, is relatively common, especially with older models. The fix is one line and leaves behavior unchanged for models that do use reasoning or tool tags.

Examples
Request against a model with no reasoning or tool tags, using the chatml template (tested with gemma3_270M). After the fix the response contains logprobs; before the fix it does not:

  POST /v1/chat/completions
  {
    "messages": [{"role": "user", "content": "Write one short sentence."}],
    "max_tokens": 8,
    "logprobs": true,
    "top_logprobs": 5
  }
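To check the result from a script, something like the following could be used. It is a sketch under assumptions: the helper names, the base URL, and the Bearer authorization header are hypothetical, and the response is assumed to follow the OpenAI chat completion shape, where per-token entries live under choices[0].logprobs.content.

```python
import json
import urllib.request


def build_payload():
    """The same request body as the example above."""
    return {
        "messages": [{"role": "user", "content": "Write one short sentence."}],
        "max_tokens": 8,
        "logprobs": True,
        "top_logprobs": 5,
    }


def count_logprob_tokens(response):
    """Return how many tokens carry logprobs; 0 when the field is null or absent."""
    logprobs = response["choices"][0].get("logprobs")
    if not logprobs:
        return 0
    return len(logprobs.get("content") or [])


def post_chat(base_url, payload, api_key=None):
    """POST the payload to /v1/chat/completions and return the parsed JSON body."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the server accepts a Bearer token; adjust to your setup.
        headers["Authorization"] = f"Bearer {api_key}"
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Calling count_logprob_tokens(post_chat("http://127.0.0.1:5000", build_payload())) against a running server (the address is a placeholder) should return 0 before the fix and a positive number after it.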

The content-span logprobs guard collected a token's logprobs only when
`tag not in [t_think_end, t_tool_end]`. With no reasoning or tool tags
configured both ends are None, and an ordinary content token also has tag
None, so `None not in [None, None]` is False and logprobs are never collected.
/v1/chat/completions then returns null logprobs for any model lacking both
reasoning and tool tags, while /v1/completions returns them correctly.

Filter unset tags out of the membership test, matching the `if s` filter the
split regex above already applies to the same tags.
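One way the filtered membership test might look is sketched below. This is an illustration of the idea in the commit message rather than the actual diff: should_collect_fixed and the tag strings are hypothetical names and values.

```python
def should_collect_fixed(tag, t_think_end, t_tool_end, in_reasoning=False, in_tool=False):
    """Hypothetical guard that ignores end tags which are not configured."""
    # Drop unset (None or empty) tags so that tag=None cannot match them.
    end_tags = [t for t in (t_think_end, t_tool_end) if t]
    return tag not in end_tags and not in_reasoning and not in_tool


# No tags configured: ordinary content tokens are now collected.
no_tags = should_collect_fixed(None, None, None)

# Reasoning configured: the end tag itself is still skipped.
end_tag = should_collect_fixed("</think>", "</think>", None)

# Both configured: unchanged behavior for ordinary content tokens.
both_tags = should_collect_fixed(None, "</think>", "</tool_call>")

print(no_tags, end_tag, both_tags)  # True False True
```

When both end tags are set, the filtered list is identical to the original one, so models that use reasoning and tool tags should see no change.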