Given a search question, the model returns reasoning and whitespace-only content ("\n\n") with no tool calls in the first turn, ending with finish_reason==stop

#77
by xingmo - opened

Hi there! Can anybody help?
I have run into a confusing problem many times, as described in the title.

Here is my DEBUG log:

  [DEBUG] finish_reason=stop
  [DEBUG] reasoning_content=The user is asking about the distance from Sacramento to a flight school in Atwater. Let me first search for information about this flight school.

Atwater is a c…[+202]
  [DEBUG] content=


  [DEBUG] tool_calls=[]

This is my vLLM launch script:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python -m vllm.entrypoints.openai.api_server \
    --model "$MODEL" \
    --served-model-name qwen3.5-35b-a3b \
    --tensor-parallel-size 8 \
    --enable-expert-parallel \
    --max-model-len 40960 \
    --language-model-only \
    --enable-prefix-caching \
    --dtype bfloat16 \
    --trust-remote-code \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder \
    --reasoning-parser qwen3 \
    --port 8002 \
    --host 0.0.0.0
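
For anyone who wants to detect this case on the client side, here is a minimal sketch. It only recognizes the symptom and does not explain or fix the cause. It assumes the response choice is a plain dict shaped like an OpenAI-style chat completion, with the same fields as in my DEBUG log (finish_reason, content, tool_calls). The helper name is_empty_stop is hypothetical, and the example reasoning text is shortened.

```python
from typing import Any, Dict


def is_empty_stop(choice: Dict[str, Any]) -> bool:
    """Return True when a turn stopped with no visible answer and no tool call.

    `choice` is assumed to be one entry of `choices` from a chat completion
    response, already converted to a plain dict.
    """
    message = choice.get("message") or {}
    # Whitespace-only content such as "\n\n" counts as empty.
    content = (message.get("content") or "").strip()
    tool_calls = message.get("tool_calls") or []
    return (
        choice.get("finish_reason") == "stop"
        and not content
        and not tool_calls
    )


# Example shaped like the DEBUG log above (reasoning text shortened).
example = {
    "finish_reason": "stop",
    "message": {
        "reasoning_content": "The user is asking about the distance ...",
        "content": "\n\n",
        "tool_calls": [],
    },
}

print(is_empty_stop(example))
```

A caller could use this check to retry the request or to log the full reasoning_content for further debugging.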
