The dataset is described in the paper "GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models". If you find the dataset useful, please cite the paper. The format is simple: each entry contains a pair of texts, one "chosen" and one "rejected".
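
As a rough illustration of the entry format, the sketch below reads entries into (chosen, rejected) pairs. It assumes the data is stored as JSON Lines with the keys `chosen` and `rejected`. The on-disk format is not specified here, so adjust the parsing to match the actual files. The helper `load_pairs` and the sample records are hypothetical placeholders, not part of the dataset.

```python
import json


def load_pairs(lines):
    """Parse JSON Lines records into (chosen, rejected) tuples.

    Assumption: each non-empty line is a JSON object with string
    fields "chosen" and "rejected". This helper is illustrative only.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        pairs.append((record["chosen"], record["rejected"]))
    return pairs


# Made-up placeholder records, not real dataset content.
sample = [
    '{"chosen": "placeholder preferred text", "rejected": "placeholder dispreferred text"}',
    "",
    '{"chosen": "another preferred text", "rejected": "another dispreferred text"}',
]

pairs = load_pairs(sample)
for chosen, rejected in pairs:
    print("chosen:", chosen, "| rejected:", rejected)
```

To read from a file instead, pass an open file handle, since iterating over it yields lines: `load_pairs(open(path, encoding="utf-8"))`, where `path` is wherever you saved the data.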
Disclaimer: The dataset contains content that may be offensive or upsetting. Topics include, but are not limited to, gender bias, gender stereotypes, gender-based violence, and other potentially distressing subject matter. Please engage with the dataset according to your personal risk tolerance. The dataset is intended for research purposes, especially for studies aimed at reducing gender bias in models. The views expressed in the data do not reflect those of the authors.