- Linux VM / WSL with Ubuntu 22.04 LTS
- Docker
- CUDA Toolkit:
- "sudo apt install nvidia-cuda-toolkit"
- .deb package
- or detailed installation
- GCC and G++ Compilers:
- "sudo apt install gcc g++"
- Git + Git LFS:
- "sudo apt install git"
- "curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash"
- "sudo apt-get install git-lfs"
- "git lfs install"
- Install some model with "git clone
link here
"
- PostgreSQL (Local in Docker):
- Default PostgreSQL 16.X
- PGvector Extension
- Leaderboard
- GGUF - quantization format (like "optimization" process).
- Instruct - fine-tuned models for understanding natural language instructions in the prompt.
- Llama 3:
- Leaderboard
- intfloat/multilingual-e5-large
- BAAI/bge-m3
- sentence-transformers/distiluse-base-multilingual-cased-v2
- intfloat/multilingual-e5-small
- Python 3.10.x:
- install requirements - in backend directory "python3 -m venv venv" then "pip install -r requirements.txt"
- freeze requirements - in backend directory "pip freeze > requirements.txt"
- Requirements:
- fastapi[all]
- sqlalchemy[asyncio]
- alembic
- asyncpg
- pyjwt[crypto]
- llama-cpp-python:
- create seperate venv
python3 -m venv venv_llama
(because version of dependencies can cause some bugs with GPU acceleration) - install or reinstall with
CMAKE_ARGS="-DLLAMA_CUDA=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir
orCMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir
- for server api inference -
pip install llama-cpp-python[server]
- run server api inference -
python3 -m llama_cpp.server --config_file <config_file or config file path>
- create seperate venv
- langchain langchain-community langchain_openai
- transformers
- sentence-transformers
- faiss-cpu / faiss-gpu
- argon2-cffi
- sqladmin
- langfuse (sdk + docker)
- Python 3.10.x:
- Requirements:
- streamlit extra-streamlit-components
- httpx
- pydantic[email] pydantic_settings
- pyscaffold
- pyinstrument
- pyjwt[crypto]