[Feature] support min_p_sampling #2872

Open: wants to merge 19 commits into develop

Conversation

Contributor

@lizexu123 lizexu123 commented Jul 16, 2025

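Feature description
Following the flashinfer implementation (thanks), we implemented min_p_from_prob. min_p can be passed in as a tensor, and two paths are supported: a fused GPU kernel and a composition of individual Paddle ops.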
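As a rough illustration of what min-p filtering does to a probability row, here is a minimal pure-Python sketch. It is not the PR's kernel or flashinfer's code: the names `min_p_filter` and `min_p_sample_batch` are hypothetical, and it assumes the usual min-p definition (keep tokens whose probability is at least `min_p` times the largest probability, then renormalize), with one `min_p` value per request.

```python
import random


def min_p_filter(probs, min_p):
    """Drop tokens below min_p * max(probs) and renormalize (hypothetical helper)."""
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p is assumed to lie in [0, 1]")
    if min_p == 0.0:
        # min_p == 0 disables filtering, so the row is returned unchanged.
        return list(probs)
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)  # > 0, because the most likely token always survives
    return [p / total for p in kept]


def min_p_sample_batch(batch_probs, min_p_arr, rng=None):
    """Sample one token id per row, using a per-row min_p value."""
    rng = rng or random.Random()
    token_ids = []
    for probs, min_p in zip(batch_probs, min_p_arr):
        filtered = min_p_filter(probs, min_p)
        token_ids.append(rng.choices(range(len(filtered)), weights=filtered, k=1)[0])
    return token_ids


row = [0.5, 0.3, 0.15, 0.05]
# With min_p=0.2 the threshold is 0.1, so only the last token is removed.
print(min_p_filter(row, 0.2))
print(min_p_sample_batch([row, row], [0.2, 1.0], rng=random.Random(0)))
```

Passing `min_p` per row mirrors the tensor-valued `min_p` described above, so requests in one batch can use different values.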
Usage
Request through the serving API:

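# Assumes `client` is an OpenAI-compatible client already connected to the server.
response = client.chat.completions.create(
    model="default",
    messages=[
        # Prompt: "Where is Tiananmen in Beijing?"
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=0.1,
    metadata={"min_p": 0.1},  # min_p is passed through the request metadata
    stream=False,
)

print(response.choices[0].message.content)
print("\n")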

Offline usage:

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# For offline inference, min_p is set directly on SamplingParams.
sampling_params = SamplingParams(temperature=1.0, min_p=0.1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"  # "Where is Tiananmen in Beijing?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)

paddle-bot bot commented Jul 16, 2025

Thanks for your contribution!

@@ -282,6 +286,7 @@ def forward_cuda(
sampled_token_ids=next_tokens,
logprobs_tensors=logprobs_tensors,
)
self.step+=1
Collaborator

delete it

"""
min_p_sampling
"""
if paddle.count_nonzero(min_p_arr)==0:
Collaborator

Run pre-commit on all files.

Contributor Author

Has our pre-commit stopped working?

Collaborator

@qingqing01 qingqing01 left a comment

The unit test needs to check correctness against a composition of small ops.
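One way to structure such a comparison is sketched below. This is a pure-Python sketch under stated assumptions rather than the PR's actual test: `min_p_fused` is a hypothetical stand-in for the fused kernel under test, and `min_p_reference` spells out the same computation as separate small steps (max, scale, compare, mask, normalize). Both assume `0 <= min_p <= 1`.

```python
import random


def min_p_fused(probs, min_p):
    # Hypothetical stand-in for the fused implementation under test.
    threshold = min_p * max(probs)
    total = sum(p for p in probs if p >= threshold)
    return [p / total if p >= threshold else 0.0 for p in probs]


def min_p_reference(probs, min_p):
    # Reference built from small steps, one elementary operation per line.
    max_prob = max(probs)
    scaled = min_p * max_prob
    mask = [p >= scaled for p in probs]
    masked = [p if keep else 0.0 for p, keep in zip(probs, mask)]
    norm = sum(masked)
    return [p / norm for p in masked]


def random_distribution(rng, vocab_size):
    # Random non-negative weights, normalized so they sum to 1.
    raw = [rng.random() for _ in range(vocab_size)]
    total = sum(raw)
    return [x / total for x in raw]


def check_against_reference(num_cases=100, vocab_size=32, seed=0, tol=1e-9):
    """Compare both implementations on seeded random inputs; return cases checked."""
    rng = random.Random(seed)
    for _ in range(num_cases):
        probs = random_distribution(rng, vocab_size)
        min_p = rng.choice([0.0, 0.05, 0.1, 0.5, 1.0])
        got = min_p_fused(probs, min_p)
        want = min_p_reference(probs, min_p)
        assert all(abs(g - w) <= tol for g, w in zip(got, want)), (min_p, probs)
    return num_cases


print(check_against_reference())
```

A fixed seed keeps failures reproducible, and including `min_p` values of 0.0 and 1.0 exercises the "no filtering" and "keep only the top token" edge cases.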

# limitations under the License.


import matplotlib.pyplot as plt
Collaborator

The unit test does not need to depend on matplotlib.

@lizexu123 lizexu123 force-pushed the min_p_1 branch 2 times, most recently from 644ac9d to 13d4cdd Compare July 18, 2025 09:56
3 participants