fix(runtime): normalize Kimi temperature in worker-side helper calls #335

@drewstone

Description

Problem

The agent-lab powered-binary follow-up found that Kimi author/reproducer calls can be normalized in the lab runner, but worker-side helper calls inside generated/runtime strategies still go through a separate chat path that sends a temperature below 1.

Observed in agent-lab run runs/2026-06-18/powered-binary-followup/kimi-k27-binary-n12p3-invalid.json:

  • model: kimi-k2.7-code
  • cell: kimi-k27-binary-n12p3
  • status: stopped invalid after 12/12 gen0 rows and 3/12 gen1 rows
  • trace: runs/2026-06-18/powered-binary-followup/kimi-k27-binary-n12p3-trace/events.jsonl
  • error count: 6 events with "invalid temperature: only 1 is allowed for this model"

Example trace event shape:

LLM call 400: {"error":{"message":"invalid temperature: only 1 is allowed for this model" ...}}
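
A minimal sketch of how a helper could recognize this error shape, assuming the rejection arrives as plain text like the trace line above. The function name `required_temperature` and the regular expression are hypothetical, not part of agent-eval. Because the payload in the example is elided, the sketch matches the message text rather than parsing the JSON.

```python
import re
from typing import Optional

# Hypothetical pattern: matches the provider message shown in the trace,
# e.g. "invalid temperature: only 1 is allowed for this model".
_TEMPERATURE_REJECTION = re.compile(
    r"invalid temperature: only (\d+(?:\.\d+)?) is allowed",
    re.IGNORECASE,
)


def required_temperature(error_text: str) -> Optional[float]:
    """Return the temperature the provider demands, or None for other errors."""
    match = _TEMPERATURE_REJECTION.search(error_text)
    return float(match.group(1)) if match else None
```

Reading the required value from the message, rather than hard-coding 1, keeps the check usable if the provider's allowed value ever differs.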

Why it matters

This makes Kimi unusable for certifiable strategy-evolution experiments even when the runner config sets WORKER_TEMPERATURE=1 and wraps the author ChatClient. The failure appears in worker-side helper calls such as critique/analyst paths used by generated strategies, not just top-level worker calls.

Expected fix

Normalize or retry temperature-rejected Kimi requests wherever runtime worker/helper calls bind agent-eval chat clients, matching the existing router-client behavior that retries temperature errors at temperature=1.
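
One possible shape for the retry, sketched under assumptions: `ChatCallError`, `is_temperature_rejection`, and `with_temperature_retry` are hypothetical names, and the request is modeled as a plain dict rather than the real agent-eval chat client types. The idea is to wrap whatever send function the worker/helper path binds, so the retry applies at the binding site rather than only in the lab runner.

```python
from typing import Any, Callable, Dict

Request = Dict[str, Any]


class ChatCallError(Exception):
    """Hypothetical stand-in for the error the real chat client raises on a 4xx."""


def is_temperature_rejection(error: Exception) -> bool:
    # Matches the message text seen in the trace events.
    return "invalid temperature" in str(error).lower()


def with_temperature_retry(
    send: Callable[[Request], Any],
    required_temperature: float = 1.0,
) -> Callable[[Request], Any]:
    """Wrap a send function so temperature rejections are retried once."""

    def wrapped(request: Request) -> Any:
        try:
            return send(request)
        except ChatCallError as error:
            already_normalized = request.get("temperature") == required_temperature
            if not is_temperature_rejection(error) or already_normalized:
                raise  # a different failure, or retrying would change nothing
            # Retry once with the provider-required temperature.
            return send({**request, "temperature": required_temperature})

    return wrapped
```

The wrapper retries at most once and re-raises anything that is not a temperature rejection, so unrelated failures still reach the caller unchanged.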

Guardrail

Do not hide these as scored model failures. Either retry with the provider-required temperature or surface a structured infra/model-adapter error so evals can mark the cell invalid instead of treating it as a bad policy.
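
For the second option, a structured error could look roughly like the sketch below. `ModelAdapterError`, `classify_failure`, and `cell_status` are hypothetical names, and the "invalid"/"scored" statuses are illustrative labels rather than the evals' actual schema.

```python
from typing import Iterable, Optional


class ModelAdapterError(Exception):
    """Hypothetical infra-level error, kept distinct from a scored model failure."""

    def __init__(self, model: str, reason: str, detail: str) -> None:
        super().__init__(f"{reason} for {model}: {detail}")
        self.model = model
        self.reason = reason
        self.detail = detail


def classify_failure(model: str, error_text: str) -> Optional[ModelAdapterError]:
    """Map a provider temperature rejection to a structured adapter error."""
    if "invalid temperature" in error_text.lower():
        return ModelAdapterError(model, "temperature_rejected", error_text)
    return None  # not an adapter problem; leave it to normal scoring


def cell_status(row_errors: Iterable[Exception]) -> str:
    """Mark the cell invalid if any row failed for adapter reasons."""
    if any(isinstance(error, ModelAdapterError) for error in row_errors):
        return "invalid"
    return "scored"
```

Keeping the adapter error as its own type lets the eval layer separate infrastructure problems from policy quality without matching on message strings.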
