This repository contains the code and scripts to reproduce the experiments presented in the arXiv paper *How Quantization Shapes Bias in Large Language Models*. The code can also be used to evaluate social bias in large language models compatible with the HuggingFace library. Our framework is built on top of COMPL-AI.
This work presents a comprehensive evaluation of how quantization affects model bias, with particular attention to its impact on individual demographic subgroups. We focus on weight and activation quantization strategies and examine their effects across a broad range of bias types, including stereotypes, toxicity, sentiment, and fairness. We employ both probabilistic and generated text-based metrics across nine benchmarks and evaluate models varying in architecture family and reasoning ability. Our findings show that quantization has a nuanced impact on bias: while it can reduce model toxicity and does not significantly impact sentiment, it tends to slightly increase stereotypes and unfairness in generative tasks, especially under aggressive compression. These trends are generally consistent across demographic categories and model types, although their magnitude depends on the specific setting. Overall, our results highlight the importance of carefully balancing efficiency and ethical considerations when applying quantization in practice.
Clone the repository and fetch all submodules:
```bash
git clone https://github.com/insait-institute/quantization-affects-social-bias.git
cd quantization-affects-social-bias
git submodule update --init --recursive
```

[Note] When a path is required to run a script, please provide the absolute path to avoid errors.
[Recommended] Set the HuggingFace home to the model folder at the root of the repository, and export your HF token, which is required to download the benchmarks and models.
```bash
export HF_HOME="./models"
export HF_TOKEN="..."
```
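Before launching long runs, it can help to confirm that the token and cache location are actually picked up. Below is a minimal sketch, assuming the `huggingface_hub` package is available in your environment:

```python
import os
from huggingface_hub import whoami

# HF_HOME controls where models and datasets are cached.
print("HF cache:", os.environ.get("HF_HOME", "~/.cache/huggingface (default)"))

# whoami() raises an authentication error if HF_TOKEN is missing or invalid.
user = whoami(token=os.environ["HF_TOKEN"])
print("Authenticated on the Hugging Face Hub as:", user["name"])
```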
Create the two Conda environments needed to run the Social Bias Evaluation Framework and the quantization library:

- To set up the framework environment, use the `framework_env.yaml` file:

  ```bash
  conda env create -f framework_env.yaml
  ```

- [Optional] To set up the quantization library environment, use the `compression_env.yaml` file:

  ```bash
  conda env create -f compression/compression_env.yaml
  ```
Download the necessary datasets to run the evaluation:
```bash
conda activate bias_eval
python helper_tools/download_datasets.py
```
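As a quick sanity check after the download, one of the benchmarks can be loaded directly from the local cache. The snippet below is a sketch that uses BOLD as an example and assumes it is fetched from the `AlexaAI/bold` dataset on the Hub; the exact dataset identifiers used by `download_datasets.py` may differ:

```python
from datasets import load_dataset

# With HF_HOME set as above, this resolves against the local cache
# populated by helper_tools/download_datasets.py.
bold = load_dataset("AlexaAI/bold", split="train")
print(bold)     # number of rows and column names
print(bold[0])  # one prompt record
```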
[Optional] Download the un-quantized pre-trained models into the `MODELS_DIR` folder:

```bash
conda activate bias_eval
python helper_tools/download_models.py <MODELS_DIR>
```
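For reference, fetching a single pre-trained model by hand boils down to a call like the one below. This is a sketch using `huggingface_hub.snapshot_download`; the Qwen2.5-14B-Instruct identifier is borrowed from the example results path later in this README, and the exact folder layout expected by the run scripts is an assumption:

```python
import os
from huggingface_hub import snapshot_download

MODELS_DIR = "/abs/path/to/models"  # same folder passed to the helper script

# Download one model repository into its own subfolder of MODELS_DIR.
# The layout expected by the run scripts may differ; this is only a sketch.
snapshot_download(
    repo_id="Qwen/Qwen2.5-14B-Instruct",
    local_dir=os.path.join(MODELS_DIR, "Qwen2.5-14B-Instruct"),
)
```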
[Optional] Quantize the models as described in the article. After quantization, each quantized model is saved in a dedicated folder within `MODELS_DIR` (note: `MODELS_DIR` is the folder containing the root folders of the models to be quantized):

```bash
conda activate compress
cd compress_models
bash compress_models.sh <MODELS_DIR>
```
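The quantized variants evaluated in the article are produced by the compression pipeline above. Purely as an illustration of weight quantization in the HuggingFace ecosystem, and not of the method used by `compress_models.sh`, here is a minimal sketch that loads a model with 8-bit weights via `bitsandbytes` (the model path is a placeholder; a GPU and the `bitsandbytes` package are required):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_PATH = "/abs/path/to/models/Qwen2.5-14B-Instruct"  # example, adjust to your setup

# Load the model with its linear-layer weights quantized to 8 bits on the fly.
quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
```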
- [Fast] To test the installation of the Social Bias Evaluation Framework on a dummy model, run the following commands:

  ```bash
  conda activate bias_eval
  cd run_scripts
  bash run_test.sh
  ```

- [Slow] To test the framework on the LLM saved in the `MODEL_PATH` folder, run the following script. The script loads the LLM and runs the evaluation on a subset of each evaluation benchmark.

  ```bash
  conda activate bias_eval
  cd run_scripts
  bash run_debug.sh <MODEL_PATH>
  ```

- To fully evaluate a model, use the following commands, where `MODEL_PATH` is the model folder and `CONFIG_PATH` is the model configuration file stored in `./configs/models/`:

  ```bash
  conda activate bias_eval
  cd run_scripts
  bash run.sh <MODEL_PATH> <CONFIG_PATH>
  ```

- To reproduce the evaluation performed in the article, run the following:

  ```bash
  conda activate bias_eval
  cd run_scripts
  bash run_all.sh <MODELS_DIR>
  ```

- To run the LLM-as-a-judge evaluation on toxic continuations, use the following commands, where `BENCH_RESULTS_DIR` is the folder containing the benchmark results (e.g., `results/runs/Qwen2.5-14B-Instruct/1984-04-30_00:00:00`), `BENCH_NAME` can be `bold` or `dt_toxic`, `MODEL_NAME` is the name of the model whose results you want to evaluate, and [optional] `JUDGE_PATH` is the path to the judge model. The general idea behind this step is sketched right after this list.

  ```bash
  conda activate bias_eval
  cd run_scripts
  bash run_judge.sh <BENCH_RESULTS_DIR> <BENCH_NAME> <MODEL_NAME> <JUDGE_PATH>
  ```

- To reproduce the LLM-as-a-judge evaluation performed in the article, run the following:

  ```bash
  conda activate bias_eval
  cd run_scripts
  bash run_judge_all.sh
  ```
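For readers unfamiliar with the setup, the LLM-as-a-judge step amounts to asking a separate model to score each generated continuation. The sketch below conveys the general idea with a `transformers` text-generation pipeline and an invented prompt format; the actual prompts, parsing, and aggregation used by `run_judge.sh` may differ:

```python
from transformers import pipeline

# JUDGE_PATH would point to the judge model; the prompt wording here is illustrative only.
judge = pipeline("text-generation", model="/abs/path/to/judge-model", device_map="auto")

def judge_toxicity(continuation: str) -> str:
    prompt = (
        "Rate the toxicity of the following text on a scale from 0 (not toxic) "
        "to 5 (extremely toxic). Answer with a single number.\n\n"
        f"Text: {continuation}\nScore:"
    )
    # return_full_text=False keeps only the newly generated tokens (the score).
    out = judge(prompt, max_new_tokens=5, do_sample=False, return_full_text=False)
    return out[0]["generated_text"].strip()

print(judge_toxicity("Example model continuation to be scored."))
```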
To compute the size of the un-quantized models as well as the non-fake-quantized models reported in the article, run the following commands:

```bash
conda activate bias_eval
python helper_tools/compute_model_size.py
```
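Conceptually, the size of a non-fake-quantized checkpoint loaded in PyTorch can be estimated by summing the byte sizes of its parameters. Below is a minimal sketch of that idea (the model path is a placeholder, and this is not necessarily how `compute_model_size.py` is implemented):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "/abs/path/to/models/Qwen2.5-14B-Instruct", torch_dtype=torch.bfloat16
)

# Total memory occupied by the weights: number of elements times bytes per element.
size_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"Model size: {size_bytes / 2**30:.2f} GiB")
```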
If you use this code or build on our work, please cite:

```bibtex
@article{marcuzzi2025quantizationshapesbiaslarge,
  title={How Quantization Shapes Bias in Large Language Models},
  author={Federico Marcuzzi and Xuefei Ning and Roy Schwartz and Iryna Gurevych},
  year={2025},
  eprint={2508.18088},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.18088}
}
```