Implemented PEFT feature in CSModel class #28
Open
MKHarsha03 wants to merge 19 commits into vandijklab:master
Author
@aakashdp6548 @SyedA5688 I am an engineering student, and this is my first contribution. I am working on a project that uses your library, and I added the LoRA feature because I needed it for that project. Please review my code and point out any corrections it needs. Thank you.
Pull request overview
This PR adds initial PEFT/LoRA support to the CSModel wrapper and updates packaging/docs to support newer model/tokenizer requirements and a newer Python baseline.
Changes:
- Add optional LoRA (PEFT) configuration to `CSModel` initialization.
- Update packaging requirements (Python >= 3.10; add tokenizer-related deps; add `peft`/`bitsandbytes`).
- Update documentation + README tutorials list and adjust ReadTheDocs configuration.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| `src/cell2sentence/csmodel.py` | Adds PEFT/LoRA parameters, HF login option, and updates Trainer initialization. |
| `src/cell2sentence/tests/test_csmodel.py` | Replaces prior CSModel test with a new PEFT-loading-focused test (currently problematic). |
| `src/cell2sentence/tests/small_data_diffgenes.csv` | Modifies a test fixture CSV (currently breaks existing CSData tests). |
| `setup.cfg` | Raises minimum Python version and adds new dependencies. |
| `docs/source/csmodel.rst` | Updates CSModel docs page (currently contains an unresolved conflict marker). |
| `README.md` | Updates Python version, install instructions, and tutorials list (contains a broken tutorial link and an incomplete row). |
| `.readthedocs.yaml` | Attempts to configure RTD installs (currently duplicated blocks). |
| `docs/Makefile`, `docs/make.bat` | Adds trailing newline only. |
docs/source/csmodel.rst
Outdated
Comment on lines +1 to +23

A CSModel object is a wrapper around a Cell2Sentence model, which tracks the path of the model
saved on disk. When needed, the model is loaded from the path on disk for inference or finetuning.
The class contains utilities for model generation and cell embedding with a Huggingface backend.

.. autofunction:: csmodel.CSModel

.. autofunction:: csmodel.CSModel.__init__

.. autofunction:: csmodel.CSModel.__str__

.. autofunction:: csmodel.CSModel.fine_tune

.. autofunction:: csmodel.CSModel.generate_from_prompt

.. autofunction:: csmodel.CSModel.generate_from_prompt_batched

.. autofunction:: csmodel.CSModel.embed_cell

.. autofunction:: csmodel.CSModel.embed_cells_batched

.. autofunction:: csmodel.CSModel.push_model_to_hub
=======
@@ -2,4 +2,3 @@
g1,0,3,0,1,3
g2,0,0,1,1,2
g3,3,1,0,0,1
src/cell2sentence/csmodel.py
Outdated
Comment on lines +87 to +88
if not os.path.exists(save_dir):
    os.mkdir(save_dir)
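The check-then-create pattern above fails when a parent of `save_dir` is missing, and it can race with another process that creates the same directory between the two calls. One possible alternative, offered as a sketch rather than as the change adopted in this PR, is `os.makedirs` with `exist_ok=True`. The helper name `ensure_save_dir` is hypothetical.

```python
import os


def ensure_save_dir(save_dir: str) -> str:
    """Create save_dir (and any missing parents) if needed, then return it."""
    # exist_ok=True makes the call idempotent, so no separate existence check is needed.
    os.makedirs(save_dir, exist_ok=True)
    return save_dir
```

Calling the helper twice with the same path is harmless, which is the property the original two-line check was trying to provide.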
Comment on lines +78 to +80
if huggingface_token:
    login(huggingface_token)
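Passing the token as a constructor argument makes it easy to leave a secret in a notebook or script. A minimal sketch of one alternative, assuming the caller is willing to fall back to an environment variable: `HF_TOKEN` is the variable that `huggingface_hub` itself reads, while the helper name `resolve_hf_token` is hypothetical and not part of this PR.

```python
import os
from typing import Optional


def resolve_hf_token(explicit_token: Optional[str] = None,
                     env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the explicit token if given, else the environment value, else None."""
    if explicit_token:
        return explicit_token
    # An empty string in the environment is treated the same as "not set".
    return os.environ.get(env_var) or None
```

The result could then be handed to `login(token=...)` only when it is not `None`, so that anonymous use of public models keeps working.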
src/cell2sentence/csmodel.py
Outdated
    data_collator=data_collator,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    processing_class=self.tokenizer #changed argument from tokenizer to processing_class as per modern documentation
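Hard-coding `processing_class` ties the call to recent `transformers` releases, while the `install_requires` list in this PR leaves `transformers` unpinned. A hedged sketch of one way to stay compatible with both argument names is to inspect the constructor signature at runtime. The helper name `tokenizer_kwarg` is hypothetical, and the sketch assumes the older name is `tokenizer`, as the inline comment in the diff states.

```python
import inspect
from typing import Any, Callable, Dict


def tokenizer_kwarg(trainer_cls: Callable[..., Any], tokenizer: Any) -> Dict[str, Any]:
    """Build the tokenizer keyword argument under whichever name trainer_cls accepts."""
    params = inspect.signature(trainer_cls).parameters
    # Prefer the newer name when it is available, otherwise fall back to the older one.
    name = "processing_class" if "processing_class" in params else "tokenizer"
    return {name: tokenizer}
```

A call site would then look like `Trainer(..., **tokenizer_kwarg(Trainer, self.tokenizer))`. Pinning a minimum `transformers` version in `setup.cfg` would be a simpler alternative if older releases do not need to be supported.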
Comment on lines +52 to +70
class TestCSModelPeftModelLoadingAndErrorHandling:
    @classmethod
    def setup_class(self):
        self.save_dir = "/mnt/c/Users/khmam/Desktop/c2s_model_directory"
        self.save_name = "lora_gemma_model"
        hf_model_path = "vandijklab/C2S-Scale-Gemma-2-2B"
        self.csmodel = CSModel(
            model_name_or_path=hf_model_path,
            save_dir=self.save_dir,
            save_name=self.save_name,
            peft = True,
        )

    def test_csmodel_created_correctly(self):
        assert self.csmodel.save_path == os.path.join(self.save_dir, self.save_name)

    def test_layers_are_created_correctly(self):
        model = AutoModelForCausalLM.from_pretrained(self.csmodel.save_path, trust_remote_code = True)
        print(model)
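The test above writes to a path that exists only on the author's machine, and its second test prints the model without asserting anything. A minimal sketch of the path handling alone, assuming a pytest-style test class and a temporary directory from the standard library: it deliberately does not construct a `CSModel`, since that would download the model, and the class name `TestSavePathUsesTemporaryDirectory` is hypothetical.

```python
import os
import tempfile


class TestSavePathUsesTemporaryDirectory:
    @classmethod
    def setup_class(cls):
        # A fresh directory per test class avoids depending on one machine's layout.
        cls._tmp = tempfile.TemporaryDirectory()
        cls.save_dir = cls._tmp.name
        cls.save_name = "lora_gemma_model"
        cls.save_path = os.path.join(cls.save_dir, cls.save_name)

    @classmethod
    def teardown_class(cls):
        # Remove the directory and everything written into it.
        cls._tmp.cleanup()

    def test_save_path_is_under_temp_dir(self):
        assert self.save_path == os.path.join(self.save_dir, self.save_name)
        assert os.path.isdir(self.save_dir)
```

In a real test the `save_dir` attribute would be passed to `CSModel`, and the layer check would assert on the loaded model rather than print it.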
README.md
Outdated
| [c2s_tutorial_4_cell_type_prediction.ipynb](tutorials/c2s_tutorial_4_cell_type_prediction.ipynb) | Cell type prediction using C2S models |
| [c2s_tutorial_5_cell_generation.ipynb](tutorials/c2s_tutorial_5_cell_generation.ipynb) | Cell generation conditioned on cell type |
| [c2s_tutorial_6_cell_annotation_with_foundation_model.ipynb](tutorials/c2s_tutorial_6_cell_annotation_with_foundation_model.ipynb) | Cell type annotation with foundation model |
| [c2s_tutorial_7_custom_prompt_templates.ipynb](tutorials/c2s_tutorials_7_custom_prompt_templates.ipynb) | Custom Prompt Templates with C2S PromptFormatter class |
Comment on lines +23 to +29
install_requires =
    torch
    transformers
    peft
    bitsandbytes
    datasets
    anndata
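Listing `peft` and `bitsandbytes` under `install_requires` makes them mandatory for every install, even though LoRA is an optional feature of `CSModel`. If they were instead moved to an optional extra, which this PR does not do, the code would need a clear failure when LoRA is requested without the package. A sketch under that assumption, with hypothetical helper names `peft_available` and `require_peft`:

```python
import importlib.util


def peft_available() -> bool:
    """Report whether the peft package can be imported in this environment."""
    return importlib.util.find_spec("peft") is not None


def require_peft(use_peft: bool) -> None:
    """Raise a clear error when LoRA is requested but peft is not installed."""
    if use_peft and not peft_available():
        raise ImportError("peft=True requires the 'peft' package: pip install peft")
```

Calling `require_peft(peft)` at the top of the constructor would surface the missing dependency before any model weights are loaded.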