Nottlespike

At the very least, when using the new Qwen2.5 models (which still use the Qwen2 architecture), the architecture line in the YAML file for the MoE needs to be, verbatim, architecture: Qwen MoE. Otherwise the Qwen models won't be detected correctly.

cg123 (Collaborator) commented Oct 26, 2024

Thanks for the doc fix! It looks like you're also bringing in the changes from multi-module-architecture, though, and that isn't quite ready to merge into main. Could you please rebase this onto main?
