Add OLMo model #1676
Conversation
Thanks for the contribution. Can you add it here (`sglang/test/srt/models/test_generation_models.py`, lines 52 to 56 in f1088e0) and run the test? Also list it in `sglang/docs/en/model_support.md` (lines 16 to 23 in f1088e0).
I had to bump the error tolerance for decode for the test to pass. The resulting tokens are all the same.
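A minimal sketch of the tolerance bump described above, with made-up stand-in numbers (not the actual test values): decode-step logits from two implementations can drift slightly in floating point, so the element-wise comparison tolerance is loosened, while the argmax (the decoded token) stays identical.

```python
import math

# Stand-in logits: a reference implementation vs. one with small numeric drift.
reference = [1.000, 2.500, 0.750]
candidate = [1.002, 2.497, 0.748]

def close(a, b, atol):
    """Element-wise comparison under an absolute tolerance."""
    return all(math.isclose(x, y, abs_tol=atol) for x, y in zip(a, b))

strict_ok = close(reference, candidate, atol=1e-4)  # fails: drift exceeds 1e-4
loose_ok = close(reference, candidate, atol=1e-2)   # passes with bumped tolerance
# The decoded token (argmax of the logits) is unaffected by the drift.
same_token = reference.index(max(reference)) == candidate.index(max(candidate))
```

This is why bumping the tolerance is acceptable here: the numeric difference never changes which token is produced.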
```python
from torch import nn
from transformers import OlmoConfig
from vllm.distributed import get_tensor_model_parallel_world_size
from vllm.model_executor.layers.linear import (
```
@janimo Could you submit another PR to replace this with SGLang's linear?
Support for OLMo models
Note: only the new checkpoints with the `-hf` suffix are supported, not the old ones built before OLMo was integrated into transformers; those are deprecated and require an extra package.
https://huggingface.co/collections/allenai/olmo-suite-65aeaae8fe5b6b2122b46778
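To illustrate the `-hf` distinction above, here is a hypothetical helper (not part of SGLang) that flags which OLMo repo ids fall under the supported, transformers-native format:

```python
# Hypothetical helper illustrating the note above: only OLMo checkpoints
# re-exported for transformers (repo name ending in "-hf") are supported;
# the original pre-integration checkpoints are deprecated.
def is_supported_olmo_checkpoint(repo_id: str) -> bool:
    """Return True for transformers-native OLMo repos (name ends in -hf)."""
    org, _, name = repo_id.partition("/")
    return org == "allenai" and name.startswith("OLMo") and name.endswith("-hf")

is_supported_olmo_checkpoint("allenai/OLMo-1B-hf")  # supported format
is_supported_olmo_checkpoint("allenai/OLMo-1B")     # deprecated format
```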