conda env create --file env.yml
- Galaxy help forum
- Biostars Q&A
- Navigate to the `llama2` directory and then execute `python qlora-train.py`
- Utilizes HuggingFace's Transformers package to download pre-trained LLMs
- QLoRA to drastically reduce the number of trainable parameters (from 2B to 6 million)
- SFT (supervised fine-tuning) to set up the training process (see the training sketch below)
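The actual configuration lives in `qlora-train.py`; the sketch below only illustrates how the Transformers + QLoRA + SFT pieces listed above typically fit together. The base model id, the dataset file `galaxy_qa.json`, the `text` field name, and the LoRA/quantisation hyperparameters are assumptions, and it targets a 2023-era PEFT/TRL API, so argument names may differ between versions.

```python
# Minimal QLoRA + SFT sketch (illustrative, not the repository's qlora-train.py)
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

BASE_MODEL = "meta-llama/Llama-2-7b-chat-hf"   # assumed base model id

# Load the frozen base weights in 4-bit so they fit in GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)

# LoRA adapters: only a few million parameters are actually trained
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Assumed JSON dataset with a "text" field holding the formatted Q&A examples
dataset = load_dataset("json", data_files="galaxy_qa.json", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=512,
    args=TrainingArguments(
        output_dir="qlora-out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
trainer.save_model("qlora-out")   # saves the trained LoRA adapters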
- Dataset collection from Galaxy's training material and GitHub pull requests
- https://github.com/uwwint/discourse-scraper/blob/master/extract_documents/create_RAG_docs.ipynb
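The document-collection details are in the `create_RAG_docs.ipynb` notebook linked above. As a rough, hedged illustration of that step, the snippet below walks a local checkout of the Galaxy training material, chunks each markdown file, and writes the chunks to a JSON file; the input path, chunk size, and output file name `rag_docs.json` are all assumptions.

```python
import json
from pathlib import Path

# Walk a local checkout of the training material (path assumed) and turn each
# markdown tutorial into retrievable chunks.
records = []
for md_file in Path("training-material/topics").rglob("*.md"):
    text = md_file.read_text(encoding="utf-8", errors="ignore")
    # Split long tutorials into ~1000-character chunks so retrieval stays focused
    for i in range(0, len(text), 1000):
        records.append({"text": text[i:i + 1000], "source": str(md_file)})

Path("rag_docs.json").write_text(json.dumps(records, indent=2))
print(f"Wrote {len(records)} document chunks")
```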
- RAG pipeline built with Haystack that uses the fine-tuned Llama2 model (see the sketch below)
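The sketch below shows one way to wire such a pipeline with Haystack 2.x components: index the prepared chunks, retrieve with BM25, build a prompt, and generate an answer with the fine-tuned model. The component choices (in-memory store, BM25 retriever, `HuggingFaceLocalGenerator`), the prompt template, and the `rag_docs.json` input are assumptions; the repository's pipeline may use a different Haystack version and wiring.

```python
import json
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator

# Index the document chunks prepared in the previous step (assumed file name)
with open("rag_docs.json") as fh:
    records = json.load(fh)
store = InMemoryDocumentStore()
store.write_documents(
    [Document(content=r["text"], meta={"source": r["source"]}) for r in records]
)

template = """Answer the question using only the context below.
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ question }}
Answer:"""

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=store, top_k=3))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", HuggingFaceLocalGenerator(
    model="anuprulez/fine-tuned-gllm", task="text-generation"))
pipe.connect("retriever.documents", "prompt_builder.documents")
pipe.connect("prompt_builder.prompt", "llm.prompt")

question = "How do I upload a dataset to Galaxy?"
result = pipe.run({"retriever": {"query": question},
                   "prompt_builder": {"question": question}})
print(result["llm"]["replies"][0])
```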
- Save model to HuggingFace Hub: https://github.com/uwwint/discourse-scraper/blob/master/llama2/save_to_hub.ipynb
- Model name: anuprulez/fine-tuned-gllm
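The linked `save_to_hub.ipynb` notebook covers the upload step; the lines below are only a hedged sketch of one common way to do it: load the trained LoRA adapters, merge them into the base weights, and push the merged model and tokenizer to the Hub. The local checkpoint directory `qlora-out` is an assumption, and pushing requires a prior `huggingface-cli login`.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load base model + LoRA adapters from the local checkpoint, then merge the
# adapters into the base weights so the Hub repo holds a standalone model.
model = AutoPeftModelForCausalLM.from_pretrained("qlora-out")   # assumed checkpoint dir
merged = model.merge_and_unload()
merged.push_to_hub("anuprulez/fine-tuned-gllm")

tokenizer = AutoTokenizer.from_pretrained("qlora-out")
tokenizer.push_to_hub("anuprulez/fine-tuned-gllm")
```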