Commit f46f537

Update finetune.md

Signed-off-by: Swati Allabadi <[email protected]>

1 parent bae75d2 commit f46f537

File tree

1 file changed (+4 −2 lines)


docs/source/finetune.md

Lines changed: 4 additions & 2 deletions
```diff
@@ -69,7 +69,7 @@ tensorboard --logdir runs/<file> --bind_all
 
 To run fine tuning for any user specific dataset, prepare the dataset using the following steps:
 
-1) Create a directory named 'dataset' inside efficient-transformers.
+1) Create a directory named 'dataset' inside efficient-transformers.
 2) Inside this directory, create a file named 'custom_dataset.py'. This is different than the custom_dataset.py present at efficient-transformers/QEfficient/finetune/dataset.
 3) Inside the newly created efficient-transformers/dataset/custom_dataset.py, define a function named 'get_custom_dataset'.
 4) get_custom_dataset() should have following 4 parameters: dataset_config, tokenizer, split, context_length. This function gets called twice through Qefficient/cloud/finetune.py with the name get_preprocessed_dataset.
@@ -87,10 +87,12 @@ def get_custom_dataset(dataset_config, tokenizer, split, context_length=None):
     # based on split, retrieve only the specific portion of the dataset (train or eval) either here or at the last
 
     def apply_prompt_template():
+        # transform the passed datapoint by applying the prompt on it
 
     def tokenize():
+        # tokenize the passed datapoint
 
-    # define prompt
+    # define the prompt
     # call apply_prompt_template() for each data point:
     # dataset = dataset.map(apply_prompt_template ,<other args>)
     # call tokenize() for each data point:
```
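The skeleton in the diff can be fleshed out as a runnable sketch. Only the `get_custom_dataset()` signature and the `apply_prompt_template`/`tokenize` helper names come from the documented contract; the toy data, prompt format, and `DummyTokenizer` below are illustrative assumptions, and a real implementation would load an actual dataset and use `dataset.map(...)` from the Hugging Face datasets library as the comments in the diff suggest.

```python
# Hypothetical sketch of efficient-transformers/dataset/custom_dataset.py.
# Only the get_custom_dataset() signature and inner helper names follow the
# docs; the toy data, prompt format, and DummyTokenizer are assumptions.

def get_custom_dataset(dataset_config, tokenizer, split, context_length=None):
    # In a real implementation this would load the dataset (e.g. with
    # datasets.load_dataset) and keep only the portion selected by `split`.
    data = {
        "train": [{"question": "2+2?", "answer": "4"}],
        "test": [{"question": "3+3?", "answer": "6"}],
    }[split]

    def apply_prompt_template(sample):
        # transform the passed datapoint by applying the prompt on it
        return {"text": f"Question: {sample['question']}\nAnswer: {sample['answer']}"}

    def tokenize(sample):
        # tokenize the passed datapoint, truncating to context_length if set
        ids = tokenizer.encode(sample["text"])
        if context_length is not None:
            ids = ids[:context_length]
        return {"input_ids": ids, "labels": list(ids)}

    # With a Hugging Face datasets.Dataset these would be dataset.map(...)
    # calls; plain list comprehensions keep the sketch dependency-free.
    data = [apply_prompt_template(s) for s in data]
    data = [tokenize(s) for s in data]
    return data


class DummyTokenizer:
    # Stand-in tokenizer: one token id per character.
    def encode(self, text):
        return [ord(c) for c in text]


ds = get_custom_dataset({}, DummyTokenizer(), "train", context_length=8)
print(len(ds), len(ds[0]["input_ids"]))  # prints: 1 8
```

Because finetune.py calls this function once per split, keying the toy data on `split` mirrors how the same entry point serves both the train and eval portions.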
