This repository contains a customized version of OpenCompass 0.4.2, tailored for model evaluation in our business scenarios.

It provides a comprehensive framework for evaluating language models with OpenCompass, using custom configurations and datasets specific to our use cases. The evaluation environment is containerized with Docker to ensure consistency and reproducibility.
- Docker with GPU support (i.e. the NVIDIA Container Toolkit installed; a quick check follows this list)
- Git
- NVIDIA GPU drivers
- Sufficient disk space for models and datasets
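
Before building the image, it's worth confirming that Docker can actually see the GPU. The one-liner below is a common sanity check; the CUDA image tag is only an example:

```bash
# Should print the GPU table via the NVIDIA driver from inside a container
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu22.04 nvidia-smi
```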
Clone the repository:

```bash
git clone [email protected]:YoctoHan/opencompass.git
cd opencompass
```
Navigate to the `docker` directory and build the evaluation environment:

```bash
cd docker
docker build -t aix-opencompass-eval-250524:latest .
```
Use the provided script to create and start a new container:
```bash
cd ../scripts
./launch_container.sh
```
This script will (a rough equivalent is sketched after this list):
- Create a new container with GPU support
- Mount the parent directory as a workspace
- Set up the necessary network configurations
- Activate the pre-configured OpenCompass environment
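
If you need to adapt the launch for a different machine, it boils down to a `docker run` along these lines. This is a sketch only: the container name, mount path, and extra flags are assumptions, so treat `scripts/launch_container.sh` as the source of truth.

```bash
#!/usr/bin/env bash
# Hypothetical equivalent of launch_container.sh; names, paths, and flags are assumptions.
#   --gpus all        expose the host GPUs to the container
#   --network host    share the host network stack
#   -v ..:/workspace  mount the repository's parent directory as the workspace
docker run -d --name aix-opencompass-eval \
  --gpus all \
  --network host \
  --shm-size 16g \
  -v "$(pwd)/..:/workspace" \
  aix-opencompass-eval-250524:latest sleep infinity

# Attach an interactive shell with the pre-configured environment
docker exec -it aix-opencompass-eval bash
```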
Once inside the container, run the data preparation script:
```bash
./workspace/scripts/prepare_data.sh
```
This script will download and organize all required datasets for evaluation.
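
For reference, OpenCompass data preparation typically amounts to downloading a dataset archive and unpacking it into a `data/` directory next to the code. Below is a minimal sketch with a placeholder archive URL; the real `prepare_data.sh` is authoritative for which datasets are fetched and where they land.

```bash
#!/usr/bin/env bash
# Minimal sketch only; the URL is a placeholder and the paths are assumptions.
set -euo pipefail

DATA_DIR=/workspace/data                                   # assumed to match the mounted workspace
ARCHIVE_URL="https://example.com/OpenCompassData-core.zip" # placeholder, not the real source

mkdir -p "${DATA_DIR}"
wget -O /tmp/opencompass-data.zip "${ARCHIVE_URL}"
unzip -q /tmp/opencompass-data.zip -d "${DATA_DIR}"
rm /tmp/opencompass-data.zip
```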
After data preparation is complete, you're ready to start model evaluations. The OpenCompass environment is pre-configured and activated by default.
Example evaluation command:
```bash
python run.py evaluations/eval_aixcoder_debug.py
```
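
`run.py` also accepts the standard OpenCompass CLI options alongside the config path. The flags below exist in upstream OpenCompass 0.4.x, but double-check with `python run.py --help` inside the container:

```bash
# Write results to a dedicated work dir and limit concurrent tasks
python run.py evaluations/eval_aixcoder_debug.py -w outputs/aixcoder_debug --max-num-workers 4

# Resume the latest run, reusing finished predictions, with verbose logging
python run.py evaluations/eval_aixcoder_debug.py -r latest --debug
```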
Repository layout:

```
opencompass/
├── docker/        # Docker configuration files
├── scripts/       # Utility scripts
├── configs/       # Evaluation configurations
├── data/          # Dataset storage (created after preparation)
├── outputs/       # Evaluation results
└── evaluations/   # Evaluation scripts
```
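
Each run writes predictions, per-dataset results, and a summary table under `outputs/`, typically in a timestamped subdirectory of the work dir. The layout below is typical of OpenCompass 0.4.x rather than guaranteed:

```bash
# Typical layout: <work_dir>/<timestamp>/{predictions,results,summary}/
ls outputs/*/*/summary/
cat outputs/*/*/summary/summary_*.csv   # the aggregated score table
```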
Environment details:

- Base Image: PyTorch 2.6.0 with CUDA 12.6 and cuDNN 9
- Python Version: 3.10
- OpenCompass Version: 0.4.2
- Pre-configured proxy settings for package installation (these can be overridden at build time; see the example below)
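
If the baked-in proxy does not match your network, Docker's predefined proxy build arguments can override it when building the image; the address below is a placeholder:

```bash
# Override the proxy used during the image build (placeholder address);
# http_proxy/https_proxy are predefined Docker build args and are not persisted in the image.
docker build \
  --build-arg http_proxy=http://proxy.example.com:8080 \
  --build-arg https_proxy=http://proxy.example.com:8080 \
  -t aix-opencompass-eval-250524:latest .
```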
- For questions or issues related to this evaluation framework, please open an issue in the repository.
- This project follows the licensing terms of the original OpenCompass project. Please refer to the LICENSE file for more details.