uwwint/discourse-scraper

Scientific question-answering by fine-tuned large language models (LLMs)

Create an environment using the following command:

conda env create --file env.yml

Dataset collection

  • Galaxy help forum
  • Biostars Q&A
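As a rough illustration of how a scraped forum thread could become a training example, the sketch below turns one Discourse-style topic payload into a question/answer record. It assumes the payload shape returned by a Discourse `/t/<id>.json` endpoint (`title`, `post_stream.posts`, each post with a `cooked` HTML body); the `accepted_answer` flag and the function names are assumptions for illustration, not taken from this repository, and Biostars would need its own parser.

```python
import html
import re
from typing import Optional


def strip_html(cooked: str) -> str:
    """Drop HTML tags and unescape entities in a post's rendered body."""
    text = re.sub(r"<[^>]+>", " ", cooked)
    return re.sub(r"\s+", " ", html.unescape(text)).strip()


def topic_to_qa(topic: dict) -> Optional[dict]:
    """Build one question/answer record from a topic payload (hypothetical helper).

    Returns None when the topic has no reply to use as an answer.
    """
    posts = topic.get("post_stream", {}).get("posts", [])
    if len(posts) < 2:
        return None
    question, replies = posts[0], posts[1:]
    # Assumption: prefer a reply flagged as accepted, else fall back to the first reply.
    answer = next((p for p in replies if p.get("accepted_answer")), replies[0])
    return {
        "title": topic.get("title", ""),
        "question": strip_html(question["cooked"]),
        "answer": strip_html(answer["cooked"]),
    }
```

Records of this form can then be written to a JSON Lines file and loaded as a fine-tuning dataset.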

Fine-tune Llama2 (2B and 7B)

  • Navigate to the llama2 directory and run python qlora-train.py
  • Uses Hugging Face's Transformers package to download the pre-trained LLMs
  • Uses QLoRA to drastically reduce the number of trainable parameters (from 2B to 6 million)
  • Uses supervised fine-tuning (SFT) to set up the training process
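The drop in trainable parameters follows from how LoRA adapters are sized: a weight matrix of shape (d_out, d_in) stays frozen, and only two low-rank factors with rank r are trained, adding r * (d_in + d_out) parameters. The sketch below works through that arithmetic. The rank, the number of adapted matrices per layer, and the layer dimensions are illustrative assumptions, not the settings used in qlora-train.py, so the result is not expected to match the 6 million figure above.

```python
def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    The adapter is two factors: A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def lora_params_total(hidden_size: int, num_layers: int, rank: int,
                      matrices_per_layer: int) -> int:
    """Total adapter parameters when square (hidden x hidden) projections are adapted."""
    per_matrix = lora_params_per_matrix(hidden_size, hidden_size, rank)
    return per_matrix * matrices_per_layer * num_layers


# Assumed example: hidden size 4096, 32 layers, rank 8, two adapted projections per layer.
total = lora_params_total(hidden_size=4096, num_layers=32, rank=8, matrices_per_layer=2)
print(f"{total:,} trainable adapter parameters")  # 4,194,304
```

QLoRA keeps the same adapter sizes and additionally stores the frozen base weights in 4-bit precision, which is what lowers the memory needed for training.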

Outcomes

[Figure: llama2_ans1]

[Figure: llama2_answers]

Retrieval augmented generation (RAG)

[Figure: rag_llm2]
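The general idea of RAG can be sketched in a few lines: rank stored passages by similarity to the question, then place the best matches in the prompt given to the LLM. The example below is a minimal sketch that uses bag-of-words cosine similarity in place of a learned embedding model; the retriever, prompt template, and function names are assumptions for illustration and may differ from what this repository uses.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list, k: int = 2) -> str:
    """Assemble a prompt with the retrieved passages as context (hypothetical template)."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )
```

The resulting prompt string is what would be passed to the base or fine-tuned model for generation.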

Save the fine-tuned model to HuggingFace Hub
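A minimal sketch of the upload step, assuming the model and tokenizer are Hugging Face Transformers (or PEFT) objects, which expose a `push_to_hub` method. The helper name and the repository id in the usage note are hypothetical, and authentication with the Hub (for example via `huggingface-cli login`) is assumed to have been done beforehand.

```python
def publish_to_hub(model, tokenizer, repo_id: str, private: bool = True) -> str:
    """Upload a model and its tokenizer to the same Hub repository.

    Both objects only need a push_to_hub(repo_id, private=...) method,
    as provided by Transformers models and tokenizers.
    """
    model.push_to_hub(repo_id, private=private)
    tokenizer.push_to_hub(repo_id, private=private)
    return f"https://huggingface.co/{repo_id}"
```

A call such as `publish_to_hub(model, tokenizer, "your-username/llama2-galaxy-qa")` would upload both artifacts; when the model is a PEFT-wrapped model, only the adapter weights are uploaded rather than the full base model.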
