Commit b610689

Merge branch 'main' into cpu-offload
2 parents d57b974 + 865233e commit b610689

118 files changed: +2574 -2492 lines changed

.github/workflows/nightly-eval.yml

+4 -4
@@ -27,14 +27,14 @@ jobs:
           bash scripts/ci_install_dependency.sh
           pip install --upgrade "evalplus[vllm] @ git+https://github.com/evalplus/evalplus"
 
-      - name: Test human eval
+      - name: Test gsm8k
         timeout-minutes: 120
         run: |
           cd test/srt
-          python3 test_nightly_human_eval.py
+          python3 test_nightly_gsm8k_eval.py
 
-      - name: Test gsm8k
+      - name: Test human eval
         timeout-minutes: 120
         run: |
           cd test/srt
-          python3 test_nightly_gsm8k_eval.py
+          python3 test_nightly_human_eval.py

.github/workflows/pr-test.yml

+2 -2
@@ -118,7 +118,7 @@ jobs:
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_latency.TestBenchLatency.test_default
+          python3 -m unittest test_bench_one_batch.TestBenchOneBatch.test_default
 
       - name: Benchmark online latency
         timeout-minutes: 10
@@ -194,7 +194,7 @@ jobs:
         timeout-minutes: 10
         run: |
           cd test/srt
-          python3 -m unittest test_bench_latency.TestBenchLatency.test_moe_default
+          python3 -m unittest test_bench_one_batch.TestBenchOneBatch.test_moe_default
 
   accuracy-test-1-gpu:
     if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
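
The renamed CI steps select a single test by dotted name (`module.Class.method`). A minimal Python sketch of how `unittest` resolves such a name is shown here. The `TestBenchOneBatch` class is a hypothetical stand-in with placeholder bodies, not the repository's test, and `run_by_name` is a helper introduced only for illustration.

```python
import types
import unittest


class TestBenchOneBatch(unittest.TestCase):
    """Hypothetical stand-in for the test class named in the CI step."""

    def test_default(self):
        self.assertTrue(True)  # placeholder body, not a real benchmark

    def test_moe_default(self):
        self.assertTrue(True)  # placeholder body, not a real benchmark


def run_by_name(dotted_name: str) -> unittest.TestResult:
    """Load and run tests the way `python3 -m unittest <name>` selects them."""
    # Wrap the stand-in class in a throwaway module so the loader can walk
    # module -> class -> method, as it would for a real test file.
    module = types.ModuleType("test_bench_one_batch")
    module.TestBenchOneBatch = TestBenchOneBatch
    suite = unittest.defaultTestLoader.loadTestsFromName(dotted_name, module)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    single = run_by_name("TestBenchOneBatch.test_default")
    print(single.testsRun, single.wasSuccessful())
```

Naming only the class (`TestBenchOneBatch`) instead of a method would run every `test_*` method in it.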

.pre-commit-config.yaml

+7 -6
@@ -1,6 +1,3 @@
-default_language_version:
-  python: python3.9
-
 default_stages: [pre-commit, pre-push, manual]
 
 repos:
@@ -28,7 +25,11 @@ repos:
   - repo: https://github.com/psf/black
     rev: 24.10.0
     hooks:
-      - id: black
-        types: [python]
       - id: black-jupyter
-        types: [jupyter]
+  - repo: https://github.com/kynan/nbstripout
+    rev: 0.8.1
+    hooks:
+      - id: nbstripout
+        args:
+          - '--keep-output'
+          - '--extra-keys=metadata.kernelspec metadata.language_info.version'
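
The added hook passes `--keep-output` and an `--extra-keys` list to nbstripout. As a rough illustration of the effect on notebook JSON, the sketch below drops the listed dotted metadata paths and leaves cell outputs untouched. It is a hypothetical re-implementation for explanation only, not nbstripout's own code, and the real tool handles more cases. `strip_extra_keys` and the sample notebook are assumptions.

```python
import copy

# Same dotted paths as the hook's --extra-keys argument.
EXTRA_KEYS = "metadata.kernelspec metadata.language_info.version".split()


def strip_extra_keys(notebook: dict, dotted_keys=EXTRA_KEYS) -> dict:
    """Return a copy of the notebook with the given dotted keys removed."""
    cleaned = copy.deepcopy(notebook)
    for dotted in dotted_keys:
        *parents, leaf = dotted.split(".")
        node = cleaned
        for part in parents:
            node = node.get(part)
            if not isinstance(node, dict):
                break  # path does not exist, nothing to strip
        else:
            node.pop(leaf, None)
    return cleaned


# Hypothetical minimal notebook structure for illustration.
notebook = {
    "metadata": {
        "kernelspec": {"name": "python3"},
        "language_info": {"name": "python", "version": "3.9.18"},
    },
    "cells": [
        {"cell_type": "code", "source": "1 + 1", "outputs": [{"text": "2"}]}
    ],
}
cleaned = strip_extra_keys(notebook)
```

Stripping environment-specific fields such as the kernel name and interpreter version keeps notebook diffs from changing whenever a contributor runs them under a different setup.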

LICENSE

+1 -1
@@ -186,7 +186,7 @@
       same "printed page" as the copyright notice for easier
       identification within third-party archives.
 
-   Copyright [yyyy] [name of copyright owner]
+   Copyright 2023-2024 SGLang Team
 
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.

README.md

+1 -1
@@ -37,7 +37,7 @@ The core features include:
 
 - **Fast Backend Runtime**: Provides efficient serving with RadixAttention for prefix caching, jump-forward constrained decoding, continuous batching, token attention (paged attention), tensor parallelism, FlashInfer kernels, chunked prefill, and quantization (INT4/FP8/AWQ/GPTQ).
 - **Flexible Frontend Language**: Offers an intuitive interface for programming LLM applications, including chained generation calls, advanced prompting, control flow, multi-modal inputs, parallelism, and external interactions.
-- **Extensive Model Support**: Supports a wide range of generative models (Llama, Gemma, Mistral, QWen, DeepSeek, LLaVA, etc.), embedding models (e5-mistral, gte) and reward models (Skywork), with easy extensibility for integrating new models.
+- **Extensive Model Support**: Supports a wide range of generative models (Llama, Gemma, Mistral, QWen, DeepSeek, LLaVA, etc.), embedding models (e5-mistral, gte, mcdse) and reward models (Skywork), with easy extensibility for integrating new models.
 - **Active Community**: SGLang is open-source and backed by an active community with industry adoption.
 
 ## Getting Started

benchmark/lora/lora_bench.py

+13 -14
@@ -1,17 +1,16 @@
-"""
-Copyright 2023-2024 SGLang Team
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-"""
+# Copyright 2023-2024 SGLang Team
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
 
 import argparse
 import asyncio
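
One observable effect of moving the license header from a triple-quoted string to `#` comments is that a leading string literal becomes the module docstring, while comments do not. A minimal sketch using the standard `ast` module follows. The two source strings are abbreviated stand-ins for the headers in the diff, and the commit itself does not state whether this was the motivation.

```python
import ast

# Abbreviated stand-ins for the old and new file headers (not the full text).
OLD_HEADER = '"""\nCopyright 2023-2024 SGLang Team\n"""\n\nimport argparse\n'
NEW_HEADER = "# Copyright 2023-2024 SGLang Team\n\nimport argparse\n"


def module_docstring(source: str):
    """Return the docstring Python would assign to a module with this source."""
    return ast.get_docstring(ast.parse(source))


old_doc = module_docstring(OLD_HEADER)  # the license text, exposed as __doc__
new_doc = module_docstring(NEW_HEADER)  # None: comments are not docstrings

if __name__ == "__main__":
    print(repr(old_doc))
    print(repr(new_doc))
```

With the comment form, tools that display `__doc__` (for example `help()`) no longer show license text as the module's documentation.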

docker/Dockerfile.rocm

+1 -1
@@ -1,5 +1,5 @@
 # Usage (to build SGLang ROCm docker image):
-# docker build --build-arg SGL_BRANCH=v0.3.5.post2 -t testImage -f Dockerfile.rocm .
+# docker build --build-arg SGL_BRANCH=v0.3.6 -t testImage -f Dockerfile.rocm .
 
 # default base image
 ARG BASE_IMAGE="rocm/vllm-dev:20241022"
