Fix chat completions returning null logprobs for models without reasoning #430
Open
ethanj801 wants to merge 1 commit into
The content-span logprobs guard collected a token's logprobs only when `tag not in [t_think_end, t_tool_end]`. With no reasoning or tool tags configured both ends are None, and an ordinary content token also has tag None, so `None not in [None, None]` is False and logprobs are never collected. /v1/chat/completions then returns null logprobs for any model lacking both reasoning and tool tags, while /v1/completions returns them correctly. Filter unset tags out of the membership test, matching the `if s` filter the split regex above already applies to the same tags.
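The membership pitfall described in the commit message can be reproduced in isolation. This is a minimal sketch, assuming plain Python values stand in for the collector's variables; the names mirror the commit message, and the surrounding collector code is not reproduced here.

```python
# Stand-ins for the collector's variables when no reasoning or tool tags
# are configured (assumed values, for illustration only).
t_think_end = None
t_tool_end = None
tag = None  # an ordinary content token matches no tag

# Original membership test: None is found in [None, None],
# so "not in" evaluates to False and the guard never passes.
old_check = tag not in [t_think_end, t_tool_end]

# Filtered membership test: unset tags are dropped before the comparison,
# leaving an empty list, so "not in" evaluates to True.
new_check = tag not in [t for t in (t_think_end, t_tool_end) if t]

print(old_check, new_check)  # False True
```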
Is your pull request related to a problem? Please describe.
Yes. This fixes a bug where the chat completions endpoint returns without logprobs when using a model that does not have reasoning or tool calling.
When you call `/v1/chat/completions` or `/v1/completions`, you can ask for logprobs by setting the `logprobs` and `top_logprobs` parameters in the request. The server then reports, for each generated token, the log-probability the model gave it along with the most likely alternative tokens.

The problem is that `/v1/completions` returns logprobs correctly, while `/v1/chat/completions` returns `logprobs: null` for any model that does not have both reasoning tags and tool-call tags configured. The request still succeeds with a 200, so nothing signals that anything went wrong. The logprobs field simply comes back empty.

The cause is in the chat streaming collector (`_chat_stream_collector` in `endpoints/OAI/utils/chat_completion.py`). A chat response can contain hidden reasoning spans and tool-call spans. Judging by the structure of the code and its comments, logprobs are only meant to cover the visible answer. However, this is decided by a single if statement that checks the `in_reasoning` and `in_tool` flags together with `tag not in [t_think_end, t_tool_end]`.

The variables `in_reasoning` and `in_tool` are set to True at the start of a reasoning or tool span, when the corresponding start token is seen, and set back to False when the corresponding end token is detected. Because those flags are updated earlier in the code, the if statement needs a separate check (`and tag not in [t_think_end, t_tool_end]`) to determine whether the current token is itself the reasoning or tool end token.

The current implementation is incorrect. The variable `tag` is set to the current reasoning or tool tag using a regex (see below). If no tag is detected, it is set to `None` instead. This causes a problem with a model that does not use reasoning or tool calling: `t_think_end` and `t_tool_end` are also `None`, so `None not in [None, None]` evaluates to False and the condition can never be true. The log probabilities are therefore never added.

Why should this feature be added?
This is a bugfix that restores expected behavior. logprobs are an OpenAI-compatible feature that the chat endpoint advertises, so a user would reasonably expect them to work. The condition that triggers the bug, using a model that lacks either reasoning or tool calling, is relatively common, especially with older models. The fix is one line and leaves behavior unchanged for models that do use reasoning or tool tags.
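To illustrate the claim that the fix leaves tagged models unaffected, here is a rough sketch comparing the guard before and after the change. The function `collects_logprobs`, its `filtered` switch, and the tag strings `</think>` and `</tool_call>` are hypothetical stand-ins, and the exact shape of the real condition is assumed from the description above rather than copied from the project.

```python
def collects_logprobs(tag, in_reasoning, in_tool, t_think_end, t_tool_end, filtered):
    """Hypothetical stand-in for the content-span guard (not the project's code)."""
    end_tags = [t_think_end, t_tool_end]
    if filtered:
        # The fix: drop unset (None) tags before the membership test.
        end_tags = [t for t in end_tags if t]
    return not in_reasoning and not in_tool and tag not in end_tags


# Hypothetical tag strings for a model that has both tags configured.
THINK_END, TOOL_END = "</think>", "</tool_call>"

# Both tags configured, ordinary content token: collected before and after.
tagged_content = [
    collects_logprobs(None, False, False, THINK_END, TOOL_END, filtered=f)
    for f in (False, True)
]

# Both tags configured, current token is the reasoning end tag: skipped both ways.
tagged_end = [
    collects_logprobs(THINK_END, False, False, THINK_END, TOOL_END, filtered=f)
    for f in (False, True)
]

# No tags configured, ordinary content token: skipped before, collected after.
untagged_content = [
    collects_logprobs(None, False, False, None, None, filtered=f)
    for f in (False, True)
]

print(tagged_content, tagged_end, untagged_content)
```

In this sketch only the untagged case changes between the two variants, which is the behavior the fix is meant to restore.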
Examples
Request against a model with no reasoning or tool tags, using the chatml template (tested with gemma3_270M). After the fix the response contains logprobs; before the fix it does not: