From 268cbe3c54e0fa65936299c3fe500472fc721c7e Mon Sep 17 00:00:00 2001 From: JanLeyvaFocusEconomics Date: Mon, 1 Apr 2024 23:10:55 +0200 Subject: [PATCH 1/7] added catala translate --- notebooks/ca/_toctree.yml | 6 + notebooks/ca/index.md | 33 ++ notebooks/ca/rag_zephyr_langchain.ipynb | 525 ++++++++++++++++++++++++ 3 files changed, 564 insertions(+) create mode 100644 notebooks/ca/_toctree.yml create mode 100644 notebooks/ca/index.md create mode 100644 notebooks/ca/rag_zephyr_langchain.ipynb diff --git a/notebooks/ca/_toctree.yml b/notebooks/ca/_toctree.yml new file mode 100644 index 00000000..352f598d --- /dev/null +++ b/notebooks/ca/_toctree.yml @@ -0,0 +1,6 @@ +- title: Open-Source AI Cookbook + sections: + - local: index + title: Open-Source AI Cookbook + - local: rag_zephyr_langchain + title: Simple RAG using Hugging Face Zephyr and LangChain diff --git a/notebooks/ca/index.md b/notebooks/ca/index.md new file mode 100644 index 00000000..ed5282c2 --- /dev/null +++ b/notebooks/ca/index.md @@ -0,0 +1,33 @@ +# Open-Source AI Cookbook + +The Open-Source AI Cookbook is a collection of notebooks illustrating practical aspects of building AI +applications and solving various machine learning tasks using open-source tools and models. + +## Latest notebooks + +Check out the recently added notebooks: + +- [Using LLM-as-a-judge 🧑‍⚖️ for an automated and versatile evaluation](llm_judge) +- [Create a legal preference dataset](pipeline_notus_instructions_preferences_legal) +- [Suggestions for Data Annotation with SetFit in Zero-shot Text Classification](labelling_feedback_setfit) +- [Implementing semantic cache to improve a RAG system](semantic_cache_chroma_vector_database) +- [Building A RAG Ebook "Librarian" Using LlamaIndex](rag_llamaindex_librarian) +- [Stable Diffusion Interpolation](stable_diffusion_interpolation) +- [Building A RAG System with Gemma, MongoDB and Open Source Models](rag_with_hugging_face_gemma_mongodb) +- [Prompt Tuning with PEFT Library](prompt_tuning_peft) +- [Migrating from OpenAI to Open LLMs Using TGI's Messages API](tgi_messages_api_demo) +- [Automatic Embeddings with TEI through Inference Endpoints](automatic_embedding_tei_inference_endpoints) +- [Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain](rag_zephyr_langchain) +- [Embedding multimodal data for similarity search using 🤗 transformers, 🤗 datasets and FAISS](faiss_with_hf_datasets_and_clip) +- [Fine-tuning a Code LLM on Custom Code on a single GPU](fine_tuning_code_llm_on_single_gpu) +- [RAG Evaluation Using Synthetic data and LLM-As-A-Judge](rag_evaluation) +- [Advanced RAG on HuggingFace documentation using LangChain](advanced_rag) +- [Detecting Issues in a Text Dataset with Cleanlab](issues_in_text_dataset) + +You can also check out the notebooks in the cookbook's [GitHub repo](https://github.com/huggingface/cookbook). + +## Contributing + +The Open-Source AI Cookbook is a community effort, and we welcome contributions from everyone! +Check out the cookbook's [Contribution guide](https://github.com/huggingface/cookbook/blob/main/README.md) to learn +how you can add your "recipe". 
diff --git a/notebooks/ca/rag_zephyr_langchain.ipynb b/notebooks/ca/rag_zephyr_langchain.ipynb
new file mode 100644
index 00000000..234fbf66
--- /dev/null
+++ b/notebooks/ca/rag_zephyr_langchain.ipynb
@@ -0,0 +1,525 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Kih21u1tyr-I"
+ },
+ "source": [
+ "# Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain\n",
+ "\n",
+ "_Autora: [Maria Khalusova](https://github.com/MKhalusova)_\n",
+ "_Traductor: [Jan Leyva](https://github.com/JanLeyva)_\n",
+ "\n",
+ "Aquesta llibreta ensenya com pots construir ràpidament un RAG *(Retrieval Augmented Generation)* per a un projecte de *GitHub issues* utilitzant el model [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) i LangChain.\n",
+ "\n",
+ "\n",
+ "\n",
+ "**Què és un *RAG*?**\n",
+ "\n",
+ "Un *RAG* és una tècnica popular per corregir el problema que apareix quan un *LLM* no té la informació necessària, o bé perquè no estava en el seu conjunt de dades d'entrenament, o bé perquè al·lucina tot i haver-la vist abans. Aquestes dades poden ser de propietat privada, sensibles o, com en aquest exemple, actualitzades sovint.\n",
+ "\n",
+ "Si les teves dades són estàtiques i no canvien sovint, hauries de considerar fer *fine-tune* a un model de llenguatge. En molts casos, però, el *fine-tune* pot ser costós i, quan es fa repetidament, pot comportar problemes (e.g. que el model es desviï). Això passa quan el model té un comportament no desitjable.\n",
+ "\n",
+ "\n",
+ "**RAG (Retrieval Augmented Generation)** no requereix *fine-tine* (ajustar) el model. En comptes, el que fa el *RAG* es proporcionar mes context de dades rellevants al model i així pot generar millor respostes informades.\n",
+ "\n",
+ "\n",
+ "Aqui tenim una il·lustració:\n",
+ "\n",
+ "![RAG diagram](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/rag-diagram.png)\n",
+ "\n",
+ "* Les dades externes es converteixen en vectors *embedding* (representacions vecgtoritzada del text) amb un model que crea aquests *embeddings* diferent. Embeddings models son tipicament petits, així actualitzar els vectors creats es més ràpid, barat i fàcil que *fine-tune* el model.\n",
+ "\n",
+ "* A la vegada, el fet de que el *fine-tune* no sigui necessari et dona mes llivertat a l'hora de canviar el *LLM* per un més potent quan estigui disponible.
O canviar-lo per una versió mes petita i optima quan necessitis que la generació sigui mes ràpida.\n",
+ "\n",
+ "Ilustrem com construir un *RAG* utilitzant un model lliure *LLM*, *embeddings* model i LangChain.\n",
+ "\n",
+ "Primer, instala les dependencies requerides:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "lC9frDOlyi38"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q torch transformers accelerate bitsandbytes transformers sentence-transformers faiss-gpu"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "-aYENQwZ-p_c"
+ },
+ "outputs": [],
+ "source": [
+ "# If running in Google Colab, you may need to run this cell to make sure you're using UTF-8 locale to install LangChain\n",
+ "import locale\n",
+ "locale.getpreferredencoding = lambda: \"UTF-8\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "W5HhMZ2c-NfU"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q langchain"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R8po01vMWzXL"
+ },
+ "source": [
+ "## Prepara les dades"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "3cCmQywC04x6"
+ },
+ "source": [
+ "En aquest exemple, carreguem totes les incidències (tant obertes com tancades) des del [PEFT library's repo](https://github.com/huggingface/peft).\n",
+ "\n",
+ "Primer, necessitem aconseguir un [GitHub personal access token](https://github.com/settings/tokens?type=beta) per accedir a l'API de GitHub."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "8MoD7NbsNjlM"
+ },
+ "outputs": [],
+ "source": [
+ "from getpass import getpass\n",
+ "ACCESS_TOKEN = getpass(\"YOUR_GITHUB_PERSONAL_TOKEN\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fccecm3a10N6"
+ },
+ "source": [
+ "Després, carreguem totes les incidències del repositori [huggingface/peft](https://github.com/huggingface/peft):\n",
+ "\n",
+ "- Per defecte, les *pull requests* també es consideren incidències; per això escollim excloure-les de les dades amb la configuració `include_prs=False`\n",
+ "- Posar `state=\"all\"` vol dir que carregarem tant les incidències tancades com les obertes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "id": "8EKMit4WNDY8"
+ },
+ "outputs": [],
+ "source": [
+ "from langchain.document_loaders import GitHubIssuesLoader\n",
+ "\n",
+ "loader = GitHubIssuesLoader(\n",
+ "    repo=\"huggingface/peft\",\n",
+ "    access_token=ACCESS_TOKEN,\n",
+ "    include_prs=False,\n",
+ "    state=\"all\"\n",
+ ")\n",
+ "\n",
+ "docs = loader.load()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CChTrY-k2qO5"
+ },
+ "source": [
+ "The content of individual GitHub issues may be longer than what an embedding model can take as input. If we want to embed all of the available content, we need to chunk the documents into appropriately sized pieces.\n",
+ "\n",
+ "The most common and straightforward approach to chunking is to define a fixed size of chunks and whether there should be any overlap between them. Keeping some overlap between chunks allows us to preserve some semantic context between the chunks. The recommended splitter for generic text is the [RecursiveCharacterTextSplitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/recursive_text_splitter), and that's what we'll use here.
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "OmsXOf59Pmm-" + }, + "outputs": [], + "source": [ + "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", + "\n", + "splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=30)\n", + "\n", + "chunked_docs = splitter.split_documents(docs)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DAt_zPVlXOn7" + }, + "source": [ + "## Create the embeddings + retriever" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-mvat6JQl4yp" + }, + "source": [ + "Now that the docs are all of the appropriate size, we can create a database with their embeddings.\n", + "\n", + "To create document chunk embeddings we'll use the `HuggingFaceEmbeddings` and the [`BAAI/bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5) embeddings model. There are many other embeddings models available on the Hub, and you can keep an eye on the best performing ones by checking the [Massive Text Embedding Benchmark (MTEB) Leaderboard](https://huggingface.co/spaces/mteb/leaderboard).\n", + "\n", + "\n", + "To create the vector database, we'll use `FAISS`, a library developed by Facebook AI. This library offers efficient similarity search and clustering of dense vectors, which is what we need here. FAISS is currently one of the most used libraries for NN search in massive datasets.\n", + "\n", + "We'll access both the embeddings model and FAISS via LangChain API." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ixmCdRzBQ5gu" + }, + "outputs": [], + "source": [ + "from langchain.vectorstores import FAISS\n", + "from langchain.embeddings import HuggingFaceEmbeddings\n", + "\n", + "db = FAISS.from_documents(chunked_docs,\n", + " HuggingFaceEmbeddings(model_name='BAAI/bge-base-en-v1.5'))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2iCgEPi0nnN6" + }, + "source": [ + "We need a way to return(retrieve) the documents given an unstructured query. For that, we'll use the `as_retriever` method using the `db` as a backbone:\n", + "- `search_type=\"similarity\"` means we want to perform similarity search between the query and documents\n", + "- `search_kwargs={'k': 4}` instructs the retriever to return top 4 results.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "mBTreCQ9noHK" + }, + "outputs": [], + "source": [ + "retriever = db.as_retriever(\n", + " search_type=\"similarity\",\n", + " search_kwargs={'k': 4}\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WgEhlISJpTgj" + }, + "source": [ + "The vector database and retriever are now set up, next we need to set up the next piece of the chain - the model." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tzQxx0HkXVFU" + }, + "source": [ + "## Load quantized model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9jy1cC65p_GD" + }, + "source": [ + "For this example, we chose [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta), a small but powerful model.\n", + "\n", + "With many models being released every week, you may want to substitute this model to the latest and greatest. 
The best way to keep track of open source LLMs is to check the [Open-source LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).\n", + "\n", + "To make inference faster, we will load the quantized version of the model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "L-ggaa763VRo" + }, + "outputs": [], + "source": [ + "import torch\n", + "from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig\n", + "\n", + "model_name = 'HuggingFaceH4/zephyr-7b-beta'\n", + "\n", + "bnb_config = BitsAndBytesConfig(\n", + " load_in_4bit=True,\n", + " bnb_4bit_use_double_quant=True,\n", + " bnb_4bit_quant_type=\"nf4\",\n", + " bnb_4bit_compute_dtype=torch.bfloat16\n", + ")\n", + "\n", + "model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)\n", + "tokenizer = AutoTokenizer.from_pretrained(model_name)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hVNRJALyXYHG" + }, + "source": [ + "## Setup the LLM chain" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RUUNneJ1smhl" + }, + "source": [ + "Finally, we have all the pieces we need to set up the LLM chain.\n", + "\n", + "First, create a text_generation pipeline using the loaded model and its tokenizer.\n", + "\n", + "Next, create a prompt template - this should follow the format of the model, so if you substitute the model checkpoint, make sure to use the appropriate formatting." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "id": "cR0k1cRWz8Pm" + }, + "outputs": [], + "source": [ + "from langchain.llms import HuggingFacePipeline\n", + "from langchain.prompts import PromptTemplate\n", + "from transformers import pipeline\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "\n", + "text_generation_pipeline = pipeline(\n", + " model=model,\n", + " tokenizer=tokenizer,\n", + " task=\"text-generation\",\n", + " temperature=0.2,\n", + " do_sample=True,\n", + " repetition_penalty=1.1,\n", + " return_full_text=True,\n", + " max_new_tokens=400,\n", + ")\n", + "\n", + "llm = HuggingFacePipeline(pipeline=text_generation_pipeline)\n", + "\n", + "prompt_template = \"\"\"\n", + "<|system|>\n", + "Answer the question based on your knowledge. Use the following context to help:\n", + "\n", + "{context}\n", + "\n", + "\n", + "<|user|>\n", + "{question}\n", + "\n", + "<|assistant|>\n", + "\n", + " \"\"\"\n", + "\n", + "prompt = PromptTemplate(\n", + " input_variables=[\"context\", \"question\"],\n", + " template=prompt_template,\n", + ")\n", + "\n", + "llm_chain = prompt | llm | StrOutputParser()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "l19UKq5HXfSp" + }, + "source": [ + "Note: _You can also use `tokenizer.apply_chat_template` to convert a list of messages (as dicts: `{'role': 'user', 'content': '(...)'}`) into a string with the appropriate chat format._\n", + "\n", + "\n", + "Finally, we need to combine the `llm_chain` with the retriever to create a RAG chain. 
We pass the original question through to the final generation step, as well as the retrieved context docs:" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "id": "_rI3YNp9Xl4s" + }, + "outputs": [], + "source": [ + "from langchain_core.runnables import RunnablePassthrough\n", + "\n", + "retriever = db.as_retriever()\n", + "\n", + "rag_chain = (\n", + " {\"context\": retriever, \"question\": RunnablePassthrough()}\n", + " | llm_chain\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UsCOhfDDXpaS" + }, + "source": [ + "## Compare the results\n", + "\n", + "Let's see the difference RAG makes in generating answers to the library-specific questions." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "id": "W7F07fQLXusU" + }, + "outputs": [], + "source": [ + "question = \"How do you combine multiple adapters?\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KC0rJYU1x1ir" + }, + "source": [ + "First, let's see what kind of answer we can get with just the model itself, no context added:" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 125 + }, + "id": "GYh-HG1l0De5", + "outputId": "277d8e89-ce9b-4e04-c11b-639ad2645759" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "\" To combine multiple adapters, you need to ensure that they are compatible with each other and the devices you want to connect. Here's how you can do it:\\n\\n1. Identify the adapters you need: Determine which adapters you require to connect the devices you want to use together. For example, if you want to connect a USB-C device to an HDMI monitor, you may need a USB-C to HDMI adapter and a USB-C to USB-A adapter (if your computer only has USB-A ports).\\n\\n2. Connect the first adapter: Plug in the first adapter into the device you want to connect. For instance, if you're connecting a USB-C laptop to an HDMI monitor, plug the USB-C to HDMI adapter into the laptop's USB-C port.\\n\\n3. Connect the second adapter: Next, connect the second adapter to the first one. In this case, connect the USB-C to USB-A adapter to the USB-C port of the USB-C to HDMI adapter.\\n\\n4. Connect the final device: Finally, connect the device you want to use to the second adapter. For example, connect the HDMI cable from the monitor to the HDMI port on the USB-C to HDMI adapter.\\n\\n5. Test the connection: Turn on both devices and check whether everything is working correctly. If necessary, adjust the settings on your devices to ensure optimal performance.\\n\\nBy combining multiple adapters, you can connect a variety of devices together, even if they don't have the same type of connector. 
Just be sure to choose adapters that are compatible with all the devices you want to connect and test the connection thoroughly before relying on it for critical tasks.\"" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "llm_chain.invoke({\"context\":\"\", \"question\": question})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "i-TIWr3wx9w8" + }, + "source": [ + "As you can see, the model interpreted the question as one about physical computer adapters, while in the context of PEFT, \"adapters\" refer to LoRA adapters.\n", + "Let's see if adding context from GitHub issues helps the model give a more relevant answer:" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 125 + }, + "id": "FZpNA3o10H10", + "outputId": "31f9aed3-3dd7-4ff8-d1a8-866794fefe80" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "\" Based on the provided context, it seems that combining multiple adapters is still an open question in the community. Here are some possibilities:\\n\\n 1. Save the output from the base model and pass it to each adapter separately, as described in the first context snippet. This allows you to run multiple adapters simultaneously and reuse the output from the base model. However, this approach requires loading and running each adapter separately.\\n\\n 2. Export everything into a single PyTorch model, as suggested in the second context snippet. This would involve saving all the adapters and their weights into a single model, potentially making it larger and more complex. The advantage of this approach is that it would allow you to run all the adapters simultaneously without having to load and run them separately.\\n\\n 3. Merge multiple Lora adapters, as mentioned in the third context snippet. This involves adding multiple distinct, independent behaviors to a base model by merging multiple Lora adapters. It's not clear from the context how this would be done, but it suggests that there might be a recommended way of doing it.\\n\\n 4. Combine adapters through a specific architecture, as proposed in the fourth context snippet. This involves merging multiple adapters into a single architecture, potentially creating a more complex model with multiple behaviors. Again, it's not clear from the context how this would be done.\\n\\n Overall, combining multiple adapters is still an active area of research, and there doesn't seem to be a widely accepted solution yet. If you're interested in exploring this further, it might be worth reaching out to the Hugging Face community or checking out their documentation for more information.\"" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "rag_chain.invoke(question)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hZQedZKSyrwO" + }, + "source": [ + "As we can see, the added context, really helps the exact same model, provide a much more relevant and informed answer to the library-specific question.\n", + "\n", + "Notably, combining multiple adapters for inference has been added to the library, and one can find this information in the documentation, so for the next iteration of this RAG it may be worth including documentation embeddings." 
+ ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "gpuType": "T4", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.7" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +} From 033bdb4655e4cc77a973bd5b059eef31f722b5de Mon Sep 17 00:00:00 2001 From: JanLeyvaFocusEconomics Date: Fri, 5 Apr 2024 20:00:44 +0200 Subject: [PATCH 2/7] translated notebook --- notebooks/ca/rag_zephyr_langchain.ipynb | 65 +++++++++++++------------ 1 file changed, 33 insertions(+), 32 deletions(-) diff --git a/notebooks/ca/rag_zephyr_langchain.ipynb b/notebooks/ca/rag_zephyr_langchain.ipynb index 234fbf66..f55f5eef 100644 --- a/notebooks/ca/rag_zephyr_langchain.ipynb +++ b/notebooks/ca/rag_zephyr_langchain.ipynb @@ -25,7 +25,7 @@ "**RAG (Retrieval Augmented Generation)** no requereix *fine-tine* (ajustar) el model. En comptes, el que fa el *RAG* es proporcionar mes context de dades rellevants al model i així pot generar millor respostes informades.\n", "\n", "\n", - "Aqui tenim una il·lustració:\n", + "Aqui tenim una il·lustració d'un RAG:\n", "\n", "![RAG diagram](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/rag-diagram.png)\n", "\n", @@ -33,7 +33,7 @@ "\n", "* A la vegada, el fet de que el *fine-tune* no sigui necessari et dona mes llivertat a l'hora de canviar el *LLM* per un més potent quan estigui disponible. O canviar-lo per una versió mes petita i optima quan necessitis que la generació sigui mes ràpida.\n", "\n", - "Ilustrem com construir un *RAG* utilitzant un model lliure *LLM*, *embeddings* model i LangChain.\n", + "Ilustrem com construir un *RAG* utilitzant un model lliure *LLM*, *embeddings* del model i LangChain.\n", "\n", "Primer, instala les dependencies requerides:" ] @@ -143,9 +143,9 @@ "id": "CChTrY-k2qO5" }, "source": [ - "The content of individual GitHub issues may be longer than what an embedding model can take as input. If we want to embed all of the available content, we need to chunk the documents into appropriately sized pieces.\n", + "La llargada d'una incidencia de GitHub pot ser mes llarga de la capacitat màxima que pot admetre el *embedding*. Per això, si volem aplicar el *embbeding* a tot el contingut, hem de separar en troços del tamany adecuat les incidencies.\n", "\n", - "The most common and straightforward approach to chunking is to define a fixed size of chunks and whether there should be any overlap between them. Keeping some overlap between chunks allows us to preserve some semantic context between the chunks. The recommended splitter for generic text is the [RecursiveCharacterTextSplitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/recursive_text_splitter), and that's what we'll use here. " + "La manera més directe de fer això es separar el contingut en una mesura definida i marcar una sobreposició d'aquest. D'aquesta manera, mantenint una sobreposició entre la separació del text, mantenim el context semantic entre les diferents divisions del text. Per separar el text es recomana utilitzar [RecursiveCharacterTextSplitter](https://python.langchain.com/docs/modules/data_connection/document_transformers/recursive_text_splitter), i per això es el que utilitzarem." 
] }, { @@ -169,7 +169,7 @@ "id": "DAt_zPVlXOn7" }, "source": [ - "## Create the embeddings + retriever" + "## Crear els embeddings + retriever" ] }, { @@ -178,14 +178,13 @@ "id": "-mvat6JQl4yp" }, "source": [ - "Now that the docs are all of the appropriate size, we can create a database with their embeddings.\n", + "Una vegada totes els documents tenen la llargada apropiada, podem crear una base de dades amb aquests *embbedings*.\n", "\n", - "To create document chunk embeddings we'll use the `HuggingFaceEmbeddings` and the [`BAAI/bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5) embeddings model. There are many other embeddings models available on the Hub, and you can keep an eye on the best performing ones by checking the [Massive Text Embedding Benchmark (MTEB) Leaderboard](https://huggingface.co/spaces/mteb/leaderboard).\n", + "Per crear els troços dels documents i fer els embeddings utilitzarem `HuggingFaceEmbeddings` and the [`BAAI/bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5). Hi ha moltes mes models de embbeding disponibles en la plataforma, pots fer-hi una ullada als que millor funcionen a [Massive Text Embedding Benchmark (MTEB) Leaderboard](https://huggingface.co/spaces/mteb/leaderboard).\n", "\n", + "Per crear la base de dades vectoritzada, utilitzarem `FAISS`, una llibreria desenvolupada per Facebook AI. Aquesta llibreria ofereix similars resultats per busqueda i agrupament que una base de dades convencional. FAISS es actualmentr una de les llibreries mes utilitzades per busqueda de NN en base de dades gegants.\n", "\n", - "To create the vector database, we'll use `FAISS`, a library developed by Facebook AI. This library offers efficient similarity search and clustering of dense vectors, which is what we need here. FAISS is currently one of the most used libraries for NN search in massive datasets.\n", - "\n", - "We'll access both the embeddings model and FAISS via LangChain API." + "Utilitarem LangChain API per accedir ambdues llibreries FAISS i el model de *embedding*." ] }, { @@ -209,9 +208,9 @@ "id": "2iCgEPi0nnN6" }, "source": [ - "We need a way to return(retrieve) the documents given an unstructured query. For that, we'll use the `as_retriever` method using the `db` as a backbone:\n", - "- `search_type=\"similarity\"` means we want to perform similarity search between the query and documents\n", - "- `search_kwargs={'k': 4}` instructs the retriever to return top 4 results.\n" + "Necessitem una manera de retornar els documents necessaris amb una demanda no estructurada (query unstructured). Per això, utilitzarem el métode `as_retriever` utilitzant la `db` com a suport:\n", + "- `search_type=\"similarity\"` vol dir que volem fer una busqueda amb resultats similars entre la demanda/*query* i els documents.\n", + "- `search_kwargs={'k': 4}` ens retornara nomes els 4 resultats principals." ] }, { @@ -234,7 +233,7 @@ "id": "WgEhlISJpTgj" }, "source": [ - "The vector database and retriever are now set up, next we need to set up the next piece of the chain - the model." + "Tant la base de dades vectoritzada com el retornador de documents estan iniciats i configuratrs, el següent pas serà configurar la cadena del model." ] }, { @@ -243,7 +242,9 @@ "id": "tzQxx0HkXVFU" }, "source": [ - "## Load quantized model" + "## Carrega el model quantitzat\n", + "\n", + "(un model quantitzat es un model que en comptes de expressar-lo en la seva màxima precisió els valors o fem una precisió menor (e.g. 
int8, in4))" ] }, { @@ -252,11 +253,11 @@ "id": "9jy1cC65p_GD" }, "source": [ - "For this example, we chose [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta), a small but powerful model.\n", + "Per aquest exemple, hem escollit [`HuggingFaceH4/zephyr-7b-beta`](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta), un model petit però potent.\n", "\n", - "With many models being released every week, you may want to substitute this model to the latest and greatest. The best way to keep track of open source LLMs is to check the [Open-source LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).\n", + "Amb tant models que surten cada setmana, potser voldras canviar a un model més nou i més gran. La millor manera d'estar alerta dels ultims models de llicencia lliure es mirant [Open-source LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).\n", "\n", - "To make inference faster, we will load the quantized version of the model:" + "Per tal de fer més ràpida la generació de text carregarem el model quantitzat:" ] }, { @@ -289,7 +290,7 @@ "id": "hVNRJALyXYHG" }, "source": [ - "## Setup the LLM chain" + "## Configura la cadena del LLM" ] }, { @@ -298,11 +299,11 @@ "id": "RUUNneJ1smhl" }, "source": [ - "Finally, we have all the pieces we need to set up the LLM chain.\n", + "Finalment, tenim totes les peces que necessitavem per configurar la cedena del LLM.\n", "\n", - "First, create a text_generation pipeline using the loaded model and its tokenizer.\n", + "Primer de tot, creem un generador de text `text_generation`utilitzant el model carregat i el seu *tokenitzador*.\n", "\n", - "Next, create a prompt template - this should follow the format of the model, so if you substitute the model checkpoint, make sure to use the appropriate formatting." + "Despres, crearem una plantilla de *prompt* - això ha de seguir el format del model, això vol dir que si en algun moment canvies els *checkpoint* del model, assegura't de canviar la plantilla per una apropiada també." ] }, { @@ -359,10 +360,9 @@ "id": "l19UKq5HXfSp" }, "source": [ - "Note: _You can also use `tokenizer.apply_chat_template` to convert a list of messages (as dicts: `{'role': 'user', 'content': '(...)'}`) into a string with the appropriate chat format._\n", + "Nota: _També pots utilitzar `tokenizer.apply_chat_template` per convertir la llista de missatges (com a diccionaris: `{'role': 'user', 'content': '(...)'}`) a str amb la plantilla apropiada._\n", "\n", - "\n", - "Finally, we need to combine the `llm_chain` with the retriever to create a RAG chain. We pass the original question through to the final generation step, as well as the retrieved context docs:" + "Finalment, necessitem convinar `llm_chain` amb el que ens torna la base de dades vectoritzada per tal de crear la cadena del RAG. Per això, necessitem passar la pregunta original a traves del generador de text (LLM) juntament amb els documents mes rellevants que ens torni dels que hem fet el *embbeding* abans:" ] }, { @@ -389,9 +389,9 @@ "id": "UsCOhfDDXpaS" }, "source": [ - "## Compare the results\n", + "## Comparem els resultats\n", "\n", - "Let's see the difference RAG makes in generating answers to the library-specific questions." + "Mirem quina diferencia que hi ha generant les respostes amb el RAG a preguntes específiques." 
] }, { @@ -411,7 +411,7 @@ "id": "KC0rJYU1x1ir" }, "source": [ - "First, let's see what kind of answer we can get with just the model itself, no context added:" + "Primer de tot mirem quins resultats obtenim sense passar extra de context al model." ] }, { @@ -450,8 +450,9 @@ "id": "i-TIWr3wx9w8" }, "source": [ - "As you can see, the model interpreted the question as one about physical computer adapters, while in the context of PEFT, \"adapters\" refer to LoRA adapters.\n", - "Let's see if adding context from GitHub issues helps the model give a more relevant answer:" + "Com pots veuere el model interpreta la pregunta com si parlessim de adaptadors fisics d'ordinador, mentres que amb el contexte del PEFT, \"adapters\" es refereix als adaptadors de LoRa.\n", + "\n", + "Anem a veure sia afeguint context de les incidencies de GitHub ajuda al model a donar una resposta més rellevant:" ] }, { @@ -490,9 +491,9 @@ "id": "hZQedZKSyrwO" }, "source": [ - "As we can see, the added context, really helps the exact same model, provide a much more relevant and informed answer to the library-specific question.\n", + "Tal i com podem veure, afeguir context, realment ajuda al mateix model, i així retorna molt millor resposta i més relevant per el context que estavem preguntat.\n", "\n", - "Notably, combining multiple adapters for inference has been added to the library, and one can find this information in the documentation, so for the next iteration of this RAG it may be worth including documentation embeddings." + "Hem de tenir en compte, que combinar diferents \"adapters\" per generar text s'ha afeguit a la llibreria recentment, aquesta informació es pot trobar a la documentació, i per això podría ser beneficios afeguir la documentació a la base de dades vectoritzada havent fet els *embedding* corresponents. Per la pròxima itereció podria ser beneficios incluir això al RAG." ] } ], From 3914b637031c140ac7745f649d6180ecf32eb1f5 Mon Sep 17 00:00:00 2001 From: JanLeyvaFocusEconomics Date: Fri, 5 Apr 2024 20:06:11 +0200 Subject: [PATCH 3/7] translated notebook --- notebooks/ca/index.md | 26 +++++++------------------- 1 file changed, 7 insertions(+), 19 deletions(-) diff --git a/notebooks/ca/index.md b/notebooks/ca/index.md index ed5282c2..6efca48f 100644 --- a/notebooks/ca/index.md +++ b/notebooks/ca/index.md @@ -3,26 +3,14 @@ The Open-Source AI Cookbook is a collection of notebooks illustrating practical aspects of building AI applications and solving various machine learning tasks using open-source tools and models. 
-## Latest notebooks
-
-Check out the recently added notebooks:
-
-- [Using LLM-as-a-judge 🧑‍⚖️ for an automated and versatile evaluation](llm_judge)
-- [Create a legal preference dataset](pipeline_notus_instructions_preferences_legal)
-- [Suggestions for Data Annotation with SetFit in Zero-shot Text Classification](labelling_feedback_setfit)
-- [Implementing semantic cache to improve a RAG system](semantic_cache_chroma_vector_database)
-- [Building A RAG Ebook "Librarian" Using LlamaIndex](rag_llamaindex_librarian)
-- [Stable Diffusion Interpolation](stable_diffusion_interpolation)
-- [Building A RAG System with Gemma, MongoDB and Open Source Models](rag_with_hugging_face_gemma_mongodb)
-- [Prompt Tuning with PEFT Library](prompt_tuning_peft)
-- [Migrating from OpenAI to Open LLMs Using TGI's Messages API](tgi_messages_api_demo)
-- [Automatic Embeddings with TEI through Inference Endpoints](automatic_embedding_tei_inference_endpoints)
+El Codi-Obert IA Cookbook és una col·lecció de llibretes que ensenyen com construir aplicacions d'IA i resoldre diferents problemes de *machine learning* utilitzant models i eines de codi obert.
+
+## Últimes llibretes
+
+Revisa les llibretes afegides recentment en català:
+
+
 - [Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain](rag_zephyr_langchain)
-
From b383e6d6090d476994ae60268191d4530f8fc12a Mon Sep 17 00:00:00 2001 From: JanLeyva Date: Thu, 16 May 2024 11:46:12 +0200 Subject: [PATCH 4/7] added ca in .github/workflow to the PR --- .github/workflows/build_documentation.yml | 2 +- .github/workflows/build_pr_documentation.yml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/build_documentation.yml b/.github/workflows/build_documentation.yml index 3aed3267..7870ecc5 100644 --- a/.github/workflows/build_documentation.yml +++ b/.github/workflows/build_documentation.yml @@ -17,7 +17,7 @@ jobs: package_name: cookbook path_to_docs: cookbook/notebooks/ additional_args: --not_python_module - languages: en zh-CN + languages: en zh-CN ca convert_notebooks: true secrets: hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }} \ No newline at end of file diff --git a/.github/workflows/build_pr_documentation.yml b/.github/workflows/build_pr_documentation.yml index 64aaf9fe..1c482cd8 100644 --- a/.github/workflows/build_pr_documentation.yml +++ b/.github/workflows/build_pr_documentation.yml @@ -20,5 +20,5 @@ jobs: package_name: cookbook path_to_docs: cookbook/notebooks/ additional_args: --not_python_module - languages: en zh-CN + languages: en zh-CN ca convert_notebooks: true \ No newline at end of file From 4467599aceadf412bbaf6ba466d01e1c09487ac2 Mon Sep 17 00:00:00 2001 From: Jan Leyva <78868781+JanLeyva@users.noreply.github.com> Date: Fri, 17 May 2024 18:03:02 +0200 Subject: [PATCH 5/7] Update _toctree.yml fixing PR caused by error in `_toctree.yml` --- notebooks/ca/_toctree.yml | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/notebooks/ca/_toctree.yml b/notebooks/ca/_toctree.yml index 352f598d..884e909f 100644 --- a/notebooks/ca/_toctree.yml +++ b/notebooks/ca/_toctree.yml @@ -2,5 +2,8 @@ sections: - local: index title: Open-Source AI Cookbook - - local: rag_zephyr_langchain + +- title: LLM and RAG recipes with other Libraries + sections: + - local: rag_zephyr_langchain title: Simple RAG using Hugging Face Zephyr and LangChain From 1ed301ce016f770801156911952daac716850b67 Mon Sep 17 00:00:00 2001 From: Jan Leyva <78868781+JanLeyva@users.noreply.github.com> Date: Fri, 17 May 2024 18:06:38 +0200 Subject: [PATCH 6/7] Update _toctree.yml --- notebooks/ca/_toctree.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/notebooks/ca/_toctree.yml b/notebooks/ca/_toctree.yml index 884e909f..e7a7bcb7 100644 --- a/notebooks/ca/_toctree.yml +++ b/notebooks/ca/_toctree.yml @@ -3,7 +3,7 @@ - local: index title: Open-Source AI Cookbook -- title: LLM and RAG recipes with other Libraries +- title: Receptes de LLM i RAG amb altres Llibreries sections: - local: rag_zephyr_langchain - title: Simple RAG using Hugging Face Zephyr and LangChain + title: Simple RAG utilitzant Hugging Face Zephyr i LangChain From 7cc18510af3d315dfbcbba1bd00c668c68598ab6 Mon Sep 17 00:00:00 2001 From: Jan Leyva <78868781+JanLeyva@users.noreply.github.com> Date: Fri, 17 May 2024 18:10:48 +0200 Subject: [PATCH 7/7] Update index.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit translated to català --- notebooks/ca/index.md | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/notebooks/ca/index.md b/notebooks/ca/index.md index 6efca48f..89ba51a1 100644 --- a/notebooks/ca/index.md +++ b/notebooks/ca/index.md @@ -1,8 +1,5 @@ # Open-Source AI Cookbook -The Open-Source AI Cookbook is a collection of notebooks illustrating practical aspects of building AI 
-applications and solving various machine learning tasks using open-source tools and models.
-
 El Codi-Obert IA Cookbook és una col·lecció de llibretes que ensenyen com construir aplicacions d'IA i resoldre diferents problemes de *machine learning* utilitzant models i eines de codi obert.
 
 ## Últimes llibretes
@@ -10,12 +7,11 @@ El Codi-Obert IA Cookbook és una col·lecció de llibretes que ensenyen com constru
 Revisa les llibretes afegides recentment en català:
 
 
-- [Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain](rag_zephyr_langchain)
+- [Simple RAG per a GitHub issues utilitzant Hugging Face Zephyr i LangChain](rag_zephyr_langchain)
 
-You can also check out the notebooks in the cookbook's [GitHub repo](https://github.com/huggingface/cookbook).
+També pots consultar les llibretes del cookbook al [GitHub repo](https://github.com/huggingface/cookbook).
 
 ## Contributing
 
-The Open-Source AI Cookbook is a community effort, and we welcome contributions from everyone!
-Check out the cookbook's [Contribution guide](https://github.com/huggingface/cookbook/blob/main/README.md) to learn
-how you can add your "recipe".
+El Codi-Obert IA Cookbook és un esforç de la comunitat i la col·laboració de tothom hi és benvinguda!
+Mira la [Contribution guide](https://github.com/huggingface/cookbook/blob/main/README.md) del Cookbook per aprendre com pots afegir les teves "receptes".
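For quick reference, the retrieval half of the pipeline that the translated notebook assembles can be condensed into a single script. This is only a minimal sketch that reuses the calls already shown in the notebook cells (`GitHubIssuesLoader`, `RecursiveCharacterTextSplitter`, `HuggingFaceEmbeddings`, `FAISS`); reading the token from a `GITHUB_TOKEN` environment variable instead of `getpass` is an assumption made here for non-interactive use, not part of the PR.

```python
# Minimal sketch of the retrieval pipeline built across the notebook's cells.
# Assumes the same libraries installed by the notebook and a GitHub personal
# access token exported in the GITHUB_TOKEN environment variable (assumption).
import os

from langchain.document_loaders import GitHubIssuesLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS

# Load all issues (open and closed, excluding pull requests) from the PEFT repo.
docs = GitHubIssuesLoader(
    repo="huggingface/peft",
    access_token=os.environ["GITHUB_TOKEN"],
    include_prs=False,
    state="all",
).load()

# Split long issues into overlapping chunks that fit the embedding model.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=512, chunk_overlap=30
).split_documents(docs)

# Embed the chunks and index them with FAISS, then expose a top-4 retriever.
db = FAISS.from_documents(
    chunks, HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")
)
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 4})

# Retrieve the most relevant chunks for the library-specific question.
for doc in retriever.get_relevant_documents("How do you combine multiple adapters?"):
    print(doc.metadata.get("url"), doc.page_content[:80])
```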