A Streamlit web application that estimates the GPU memory required to run Hugging Face models. Based on an original gist by @philschmid.
- Search Hugging Face models by name
- Quick search buttons for popular model families
- Support for different data types (float16, bfloat16, float32)
- Real-time memory requirement calculations
- User-friendly interface
- Clone the repository:

```shell
git clone https://github.com/gabzofar/LLM-GPU-Memory-Calculator.git
cd LLM-GPU-Memory-Calculator
```

- Install the required dependencies:

```shell
pip install -r requirements.txt
```

Run the Streamlit app:

```shell
streamlit run app.py
```
The application will open in your default web browser. You can then:
- Search for models using the search bar
- Use quick search buttons for popular model families
- Select a specific model from the search results
- Choose your desired data type
- Click "Calculate Memory Requirements" to see the estimated GPU memory needed
- Memory estimates include an 18% overhead for CUDA kernels and runtime requirements
- Actual memory usage may vary depending on your specific setup
- Memory calculations use binary prefixes (1 GiB = 1024³ bytes)
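
The estimate described in the notes above can be sketched in a few lines of Python. This is a minimal illustration of the formula (parameter count × bytes per parameter × 1.18 overhead, converted to GiB), not the app's actual code; the function name and dtype table are assumptions:

```python
# Bytes per parameter for each supported data type.
DTYPE_BYTES = {"float32": 4, "float16": 2, "bfloat16": 2}

def estimate_gpu_memory_gib(num_params: int, dtype: str = "float16",
                            overhead: float = 0.18) -> float:
    """Estimate GPU memory in GiB: model weights plus a fixed overhead
    fraction (18% by default) for CUDA kernels and runtime state."""
    total_bytes = num_params * DTYPE_BYTES[dtype] * (1 + overhead)
    return total_bytes / 1024**3  # binary prefix: 1 GiB = 1024**3 bytes
```

For example, a 7B-parameter model in float16 comes out to roughly 15.4 GiB under this formula, while the same model in float32 needs about twice that.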
- Original concept: @philschmid
- Built with Streamlit
- Model data from 🤗 Hugging Face