updated: 05/31/2023
This repository contains references to open-source models similar to ChatGPT, as well as Langchain and prompt engineering libraries. It also includes related samples and research on Langchain, Vector Search (including feasibility checks on Elasticsearch, Azure Cognitive Search, Azure Cosmos DB), and more.
Rule: Brief each item on one or a few lines as much as possible.
-
Section 1 : Llama-index and Vector Storage (Search)
-
Section 2 : ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search
-
Section 3 : Microsoft Semantic Kernel with Azure Cosmos DB
-
Section 4 : Langchain sample code
-
Section 5: Prompt Engineering, Finetuning, and Langchain
- Prompt Engineering
- OpenAI Prompt Guide
- DeepLearning.ai Prompt Engineering Course and others
- Awesome ChatGPT Prompts
- ChatGPT : “user”, “assistant”, and “system” messages.
- Finetuning : PEFT - LoRA - QLoRA
- Quantization : Quantization & Run ChatGPT on a Raspberry Pi / Android
- Sparsification
- Langchain vs Semantic Kernel
-
Section 6: Improvement
- Math problem-solving skill
- OpenAI's plans according to Sam Altman Humanloop interview has been removed from the site. Instead of that, Web-archived link.
-
Section 7: List of OSS LLM
-
Section 8 : References
- Langchain and Prompt engineering library
- AutoGPT
- picoGPT : tiny implementation of GPT-2
- Communicative Agents
- Democratizing the magic of ChatGPT with open models
- Hugging face Transformer
- Hugging face StarCoder
- MLLM (multimodal large language model)
- Generate 3D
- DragGAN
- string2string
- Tiktoken Alternative in C#
- UI/UX
- PDF with ChatGPT
- Edge and Chrome Extension / Plugin
- etc. + MS Fabric
- 日本語(Japanese Materials)
-
Acknowledgements
This repository has been created for testing and feasibility checks using vector and language chains, specifically llama-index. These libraries are commonly used when implementing Prompt Engineering and consuming one's own data into LLM.
- docker : Opensearch Docker-compose
- docker-elasticsearch : Not working for ES v8, requiring security plug-in with mandatory
- docker-elk : Elasticsearch Docker-compose, Optimized Docker configurations with solving security plug-in issues.
- es-open-search-set-analyzer.py : Put Language analyzer into Open search
- es-open-search.py : Open search sample index creation
- es-search-set-analyzer.py : Put Language analyzer into Elastic search
- es-search.py : Usage of Elastic search python client
- files : The Sample file for consuming
- index.json : Vector data local backup created by llama-index
- index_vector_in_opensearch.json : Vector data stored in Open search (Source:
files\all_h1.pdf
) - llama-index-azure-elk-create.py: llama-index ElasticsearchVectorClient (Unofficial file to manipulate vector search, Created by me, Not Fully Tested)
- llama-index-lang-chain.py : Lang chain memory and agent usage with llama-index
- llama-index-opensearch-create.py : Vector index creation to Open search
- llama-index-opensearch-query-chatgpt.py : Test module to access Azure Open AI Embedding API.
- llama-index-opensearch-query.py : Vector index query with questions to Open search
- llama-index-opensearch-read.py : llama-index ElasticsearchVectorClient (Unofficial file to manipulate vector search, Created by me, Not Fully Tested)
- env.template : The properties. Change its name to
.env
once your values settings is done.
OPENAI_API_TYPE=azure
OPENAI_API_BASE=https://????.openai.azure.com/
OPENAI_API_VERSION=2022-12-01
OPENAI_API_KEY=<your value in azure>
OPENAI_DEPLOYMENT_NAME_A=<your value in azure>
OPENAI_DEPLOYMENT_NAME_B=<your value in azure>
OPENAI_DEPLOYMENT_NAME_C=<your value in azure>
OPENAI_DOCUMENT_MODEL_NAME=<your value in azure>
OPENAI_QUERY_MODEL_NAME=<your value in azure>
INDEX_NAME=gpt-index-demo
INDEX_TEXT_FIELD=content
INDEX_EMBEDDING_FIELD=embedding
ELASTIC_SEARCH_ID=elastic
ELASTIC_SEARCH_PASSWORD=elastic
OPEN_SEARCH_ID=admin
OPEN_SEARCH_PASSWORD=admin
- Not All Vector Databases Are Made Equal
- Printed version for "Medium" limits. - Link
- Vector Search in Azure Cosmos DB for MongoDB vCore
- Vector search (private preview) - Azure Cognitive Search
pip install milvus
- Docker compose: https://milvus.io/docs/install_offline-docker.md
- Milvus Embedded through python console only works in Linux and Mac OS.
- In Windows, Use this link, https://github.com/matrixji/milvus/releases.
# Step 1. Start Milvus
1. Unzip the package
Unzip the package, and you will find a milvus directory, which contains all the files required.
2. Start a MinIO service
Double-click the run_minio.bat file to start a MinIO service with default configurations. Data will be stored in the subdirectory s3data.
3. Start an etcd service
Double-click the run_etcd.bat file to start an etcd service with default configurations.
4. Start Milvus service
Double-click the run_milvus.bat file to start the Milvus service.
# Step 2. Run hello_milvus.py
After starting the Milvus service, you can test by running hello_milvus.py. See Hello Milvus for more information.
-
Azure Open AI Embedding API,text-embedding-ada-002, supports 1536 dimensions. Elastic search, Lucene based engine, supports 1024 dimensions as a max. Open search can insert 16,000 dimensions as a vector storage.
-
Lang chain interface of Azure Open AI does not support ChatGPT yet. so that reason, need to use alternatives such astext-davinci-003
.
@open ai documents: text-embedding-ada-002: Smaller embedding size. The new embeddings have only 1536 dimensions, one-eighth the size of davinci-001 embeddings, making the new embeddings more cost effective in working with vector databases. https://openai.com/blog/new-and-improved-embedding-model
@open search documents: However, one exception to this is that the maximum dimension count for the Lucene engine is 1,024, compared with 16,000 for the other engines. https://opensearch.org/docs/latest/search-plugins/knn/approximate-knn/
@llama-index examples: However, the examples in llama-index uses 1536 vector size.
The files in this directory, extra_steps
, have been created for managing extra configurations and steps for launching the demo repository.
https://github.com/Azure-Samples/azure-search-openai-demo
- fix_from_origin : The modified files, setup related
- ms_internal_az_init.ps1 : Powershell script for Azure module installation
- ms_internal_troubleshootingt.ps1 : Set Specific Subscription Id as default
- (optional) Check Azure module installation in Powershell by running
ms_internal_az_init.ps1
script - (optional) Set your Azure subscription Id to default
Start the following commands in
./azure-search-openai-demo
directory
- (deploy azure resources) Simply Run
azd up
The azd stores relevant values in the .env file which is stored at ${project_folder}\.azure\az-search-openai-tg\.env
.
AZURE_ENV_NAME=<your_value_in_azure>
AZURE_LOCATION=<your_value_in_azure>
AZURE_OPENAI_SERVICE=<your_value_in_azure>
AZURE_PRINCIPAL_ID=<your_value_in_azure>
AZURE_SEARCH_INDEX=<your_value_in_azure>
AZURE_SEARCH_SERVICE=<your_value_in_azure>
AZURE_STORAGE_ACCOUNT=<your_value_in_azure>
AZURE_STORAGE_CONTAINER=<your_value_in_azure>
AZURE_SUBSCRIPTION_ID=<your_value_in_azure>
BACKEND_URI=<your_value_in_azure>
- Move to
app
bycd app
command - (sample data loading) Move to
scripts
then Change into Powershell byPowershell
command, Runprepdocs.ps1
- console output (excerpt)
Uploading blob for page 20 -> role_library-20.pdf
Uploading blob for page 21 -> role_library-21.pdf
Uploading blob for page 22 -> role_library-22.pdf
Uploading blob for page 23 -> role_library-23.pdf
Uploading blob for page 24 -> role_library-24.pdf
Uploading blob for page 25 -> role_library-25.pdf
Uploading blob for page 26 -> role_library-26.pdf
Uploading blob for page 27 -> role_library-27.pdf
Uploading blob for page 28 -> role_library-28.pdf
Uploading blob for page 29 -> role_library-29.pdf
Uploading blob for page 30 -> role_library-30.pdf
Indexing sections from 'role_library.pdf' into search index 'gptkbindex'
Splitting './data\role_library.pdf' into sections
Indexed 60 sections, 60 succeeded
- Move to
app
bycd ..
andcd app
command - (locally running) Run
start.cmd
- console output (excerpt)
Building frontend
> [email protected] build \azure-search-openai-demo\app\frontend
> tsc && vite build
vite v4.1.1 building for production...
✓ 1250 modules transformed.
../backend/static/index.html 0.49 kB
../backend/static/assets/github-fab00c2d.svg 0.96 kB
../backend/static/assets/index-184dcdbd.css 7.33 kB │ gzip: 2.17 kB
../backend/static/assets/index-41d57639.js 625.76 kB │ gzip: 204.86 kB │ map: 5,057.29 kB
Starting backend
* Serving Flask app 'app'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
127.0.0.1 - - [13/Apr/2023 14:25:31] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [13/Apr/2023 14:25:31] "GET /assets/index-184dcdbd.css HTTP/1.1" 200 -
127.0.0.1 - - [13/Apr/2023 14:25:31] "GET /assets/index-41d57639.js HTTP/1.1" 200 -
127.0.0.1 - - [13/Apr/2023 14:25:31] "GET /assets/github-fab00c2d.svg HTTP/1.1" 200 -
127.0.0.1 - - [13/Apr/2023 14:25:32] "GET /favicon.ico HTTP/1.1" 304 -
127.0.0.1 - - [13/Apr/2023 14:25:42] "POST /chat HTTP/1.1" 200 -
Running from second times
- Move to
app
bycd ..
andcd app
command - (locally running) Run
start.cmd
Another Reference Architectue
C# Implementation ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search
Azure Cosmos DB + OpenAI ChatGPT C# blazor and Azure Custom Template
Azure Open AI work with Cognitive Search act as a Long-term memory
- ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search
- Can ChatGPT work with your enterprise data?
- Azure OpenAI と Azure Cognitive Search の組み合わせを考える
Options: 1. Vector similarity search, 2. Pure Vector Search, 3. Hybrid Search, 4. Semantic Hybrid Search
# Semantic Hybrid Search
query = "what is azure sarch?"
search_client = SearchClient(
service_endpoint, index_name, AzureKeyCredential(key))
results = search_client.search(
search_text=query, #text
vector=Vector(value=generate_embeddings(
query), k=3, fields="contentVector"), #vector
select=["title", "content", "category"],
query_type="semantic", query_language="en-us", semantic_configuration_name='my-semantic-config', query_caption="extractive", query_answer="extractive", #semantic
top=3
)
semantic_answers = results.get_answers()
Microsoft Langchain Library supports C# and Python and offers several features, some of which are still in development and may be unclear on how to implement. However, it is simple, stable, and faster than Python-based open-source software. The features listed on the link include: Semantic Kernel Feature Matrix
This section includes how to utilize Azure Cosmos DB for vector storage and vector search by leveraging the Semantic-Kernel.
- appsettings.template.json : Environment value configuration file.
- ComoseDBVectorSearch.cs : Vector Search using Azure Cosmos DB
- CosmosDBKernelBuild.cs : Kernel Build code (test)
- CosmosDBVectorStore.cs : Embedding Text and store it to Azure Cosmos DB
- LoadDocumentPage.cs : PDF splitter class. Split the text to unit of section. (C# version of
azure-search-openai-demo/scripts/prepdocs.py
) - LoadDocumentPageOutput : LoadDocumentPage class generated output
- MemoryContextAndPlanner.cs : Test code of context and planner
- MemoryConversationHistory.cs : Test code of conversation history
- Program.cs : Run a demo. Program Entry point
- SemanticFunction.cs : Test code of conversation history
- semanticKernelCosmos.csproj : C# Project file
- Settings.cs : Environment value class
- SkillBingSearch.cs : Bing Search Skill
- SkillDALLEImgGen.cs : DALLE Skill (Only OpenAI, Azure Open AI not supports yet)
{
"Type": "azure",
"Model": "<model_deployment_name>",
"EndPoint": "https://<your-endpoint-value>.openai.azure.com/",
"AOAIApiKey": "<your-key>",
"OAIApiKey": "",
"OrdId": "-", //The value needs only when using Open AI.
"BingSearchAPIKey": "<your-key>",
"aoaiDomainName": "<your-endpoint-value>",
"CosmosConnectionString": "<cosmos-connection-string>"
}
-
Semantic Kernel has recently introduced support for Azure Cognitive Search as a memory. However, it currently only supports Azure Cognitive Search with a Semantic Search interface, lacking any features to store vectors to ACS.
-
According to the comments, this suggests that the strategy of the plan could be divided into two parts. One part focuses on Semantic Search, while the other involves generating embeddings using OpenAI.
Azure Cognitive Search automatically indexes your data semantically, so you don't need to worry about embedding generation.
samples/dotnet/kernel-syntax-examples/Example14_SemanticMemory.cs
.
// TODO: use vectors
// @Microsoft Semactic Kernel
var options = new SearchOptions
{
QueryType = SearchQueryType.Semantic,
SemanticConfigurationName = "default",
QueryLanguage = "en-us",
Size = limit,
};
- SemanticKernel Implementation sample to overcome Token limits of Open AI model. Semantic Kernel でトークンの限界を超えるような長い文章を分割してスキルに渡して結果を結合したい (zenn.dev) Semantic Kernel でトークンの限界を超える
Semantic Kernel sample code to integrate with Bing Search (ReAct??)
\ms-semactic-bing-notebook
- gs_chatgpt.ipynb: Azure Open AI ChatGPT sample to use Bing Search
- gs_davinci.ipynb: Azure Open AI Davinci sample to use Bing Search
Bing Search UI for demo
\bing-search-webui
: (utility)
cite: @practical-ai
- Langchain_1_(믹스의_인공지능).ipynb : Langchain Get started
- langchain_1_(믹스의_인공지능).py : -
- Langchain_2_(믹스의_인공지능).ipynb : Langchain Utilities
- langchain_2_(믹스의_인공지능).py : -
from langchain.chains.summarize import load_summarize_chain
chain = load_summarize_chain(chat, chain_type="map_reduce", verbose=True)
chain.run(docs[:3])
- stuff: Sends everything at once in LLM. If it's too long, an error will occur.
- map_reduce: Summarizes by dividing and then summarizing the entire summary.
- refine: (Summary + Next document) => Summary
- map_rerank: Ranks by score and summarizes to important points.
- Zero-shot
- Few-shot Learning
- Chain of Thought (CoT): ReAct and Self Consistency also inherit the CoT concept.
- Recursively Criticizes and Improves (RCI)
- ReAct: Grounding with external sources. (Reasoning and Act)
- Chain-of-Thought Prompting (paper)
- Tree of Thought (github)
- Prompt Concept
- Question-Answering
- Roll-play:
Act as a [ROLE] perform [TASK] in [FORMAT]
- Reasoning
- Prompt-Chain
- Program Aided Language Model
- Recursive Summarization: Long Text -> Chunks -> Summarize pieces -> Concatenate -> Summarize
To be specific, the ChatGPT API allows for differentiation between “user”, “assistant”, and “system” messages.
- always obey "system" messages.
- all end user input in the “user” messages.
- "assistant" messages as previous chat responses from the assistant.
Presumably, the model is trained to treat the user messages as human messages, system messages as some system level configuration, and assistant messages as previous chat responses from the assistant. (@https://blog.langchain.dev/using-chatgpt-api-to-evaluate-chatgpt/)
PEFT: Parameter-Efficient Fine-Tuning (Youtube)
-
Training language models to follow instructions with human feedback
@Binghchat
Sparsification is a technique used to reduce the size of large language models (LLMs) by removing redundant parameters without significantly affecting their performance. It is one of the methods used to compress LLMs. LLMs are neural networks that are trained on massive amounts of data and can generate human-like text. The term “sparsification” refers to the process of removing redundant parameters from these models.
Langchain | Semantic Kernel |
---|---|
Memory | Memory |
Tookit | Skill |
Tool | Function (Native, Semantic) |
Agent | Planner |
Chain | Steps, Pipeline |
Tool | Connector |
expressed in natural language in a text file "skprompt.txt" using SK's Prompt Template language. Each semantic function is defined by a unique prompt template file, developed using modern
-
Variables : use the {{$variableName}} syntax : Hello {{$name}}, welcome to Semantic Kernel!
-
Function calls: use the {{namespace.functionName}} syntax : The weather today is {{weather.getForecast}}.
-
Function parameters: {{namespace.functionName $varName}} and {{namespace.functionName "value"}} syntax : The weather today in {{$city}} is {{weather.getForecast $city}}.
-
Prompts needing double curly braces : {{ "{{" }} and {{ "}}" }} are special SK sequences.
-
Values that include quotes, and escaping :
For instance:
... {{ 'no need to \"escape" ' }} ... is equivalent to:
... {{ 'no need to "escape" ' }} ...
-
If you're using a text LLM, first try
zero-shot-react-description
. -
If you're using a Chat Model, try
chat-zero-shot-react-description
. -
If you're using a Chat Model and want to use memory, try
conversational-react-description
. -
self-ask-with-search
: self ask with search paper -
react-docstore
: ReAct paper
Journey | Short Description |
---|---|
ASK | A user's goal is sent to SK as an ASK |
Kernel | The kernel orchestrates a user's ASK |
Planner | The planner breaks it down into steps based upon resources that are available |
Resources | Planning involves leveraging available skills, memories, and connectors |
Steps | A plan is a series of steps for the kernel to execute |
Pipeline | Executing the steps results in fulfilling the user's ASK |
GET | And the user gets what they asked for ... |
- List of OSS LLM
- Printed version for "Medium" limits. - Link
- Auto-GPT
- babyagi: Most simplest implementation
- microsoft/JARVIS
- An unnecessarily tiny implementation of GPT-2 in NumPy. picoGPT
- lightaime/camel: 🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society (github.com)
- 1:1 Conversation between two ai agents Camel Agents - a Hugging Face Space by camel-ai Hugging Face (camel-agents)
- facebookresearch/llama
- Falcon LLM Apache 2.0 license
- StableVicuna Open Source RLHF LLM Chatbot
- Alpaca
- gpt4all
- vicuna
- dolly
- Cerebras-GPT
- GPT4All Download URL
- KoAlpaca
- Facebook: ImageBind / SAM (Just Info)
- facebookresearch/ImageBind: ImageBind One Embedding Space to Bind Them All (github.com)
- facebookresearch/segment-anything(SAM): The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model. (github.com)
- Microsoft: Kosmos-1
- [2302.14045] Language Is Not All You Need: Aligning Perception with Language Models (arxiv.org)
- Language Is Not All You Need
openai/shap-e: Generate 3D objects conditioned on text or images (github.com)
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold (paper)
The string2string library is an open-source tool that offers a comprehensive suite of efficient algorithms for a broad range of string-to-string problems.
microsoft/Tokenizer: .NET and Typescript implementation of BPE tokenizer for OpenAI LLMs. (github.com) microsoft/Tokenizer
- Gradio
- Text generation web UI
- Very Simple Langchain example using Open AI: langchain-ask-pdf
- Open AI Chat Mockup: An open source ChatGPT UI. (github.com) mckaywrigley/chatbot-ui
- Streaming with Azure OpenAI SSE
- BIG-AGI FKA nextjs-chatgpt-app
- Embedding does not use Open AI. Can be executed locally. pdfGPT
-
activeloopai/deeplake: AI Vector Database for LLMs/LangChain. Doubles as a Data Lake for Deep Learning. Store, query, version, & visualize any data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai (github.com)
-
mosaicml/llm-foundry: LLM training code for MosaicML foundation models (github.com)
-
Microsoft Fabric: Fabric integrates technologies like Azure Data Factory, Azure Synapse Analytics, and Power BI into a single unified product
-
OpenAI Cookbook Examples and guides for using the OpenAI API
-
gpt4free for educational purposes only