FEAT: Added Red Team Social Bias dataset #714
Conversation
Nice work, @MoolmanM! This one is almost ready, just a few small things.
Something seems off given the number of changes included now. Perhaps the merge went wrong? Happy to help if you want.
Thank you for the offer! I really appreciate it. I decided to take this on myself as part of my learning process, and I believe I've got everything sorted now. Thanks again for your support!
Nice, just a small comment.
@svannie678 FYI
Overview
This PR integrates the Red Team Social Bias dataset. The dataset aggregates and unifies existing red-teaming prompts designed to identify stereotypes, discrimination, hate speech, and other representation harms in text-based Large Language Models (LLMs).
Work Completed
1. Dataset Integration: Added functionality to fetch the Red Team Social Bias dataset using `fetch_red_team_social_bias_prompts_dataset` (a minimal usage sketch follows this list).
2. Demonstration Notebook and Script: Introduced a demo `.ipynb` and `.py` under `doc/code/orchestrators/`.
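A minimal sketch of how the new fetch function could be used, assuming a configured PyRIT prompt target; the `OpenAIChatTarget` and `PromptSendingOrchestrator` wiring below is illustrative, and the exact orchestrator API may differ between PyRIT versions:

```python
# Minimal sketch: fetch the Red Team Social Bias dataset and send its prompts.
# fetch_red_team_social_bias_prompts_dataset is the function added in this PR;
# the target and orchestrator choices below are illustrative assumptions.
import asyncio

from pyrit.datasets import fetch_red_team_social_bias_prompts_dataset
from pyrit.orchestrator import PromptSendingOrchestrator
from pyrit.prompt_target import OpenAIChatTarget


async def main():
    dataset = fetch_red_team_social_bias_prompts_dataset()

    # Assumes the dataset exposes its entries as seed prompts with a text value.
    prompts = [seed.value for seed in dataset.prompts]

    target = OpenAIChatTarget()
    orchestrator = PromptSendingOrchestrator(objective_target=target)
    await orchestrator.send_prompts_async(prompt_list=prompts)


if __name__ == "__main__":
    asyncio.run(main())
```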
Related Issue
Issue #661
Concern: Handling Multi-Turn Prompts
Currently, I have implemented extraction of single prompts from the dataset, but I am unsure how to proceed with multi-turn prompts (a rough sketch of the current approach follows below).
Let me know how you'd like me to handle this, and I can adjust accordingly.
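To make the multi-turn concern concrete, here is a rough sketch of the current single-prompt filtering; the field names and values ("prompt_type", "Single Prompt") are assumptions about the source data rather than the confirmed schema:

```python
# Illustrative sketch of the current behaviour: only single prompts are kept,
# and multi-turn entries are skipped until a handling strategy is agreed on.
# The field names below are hypothetical, not the confirmed dataset schema.
raw_examples = [
    {"prompt_type": "Single Prompt", "prompt": "..."},
    {"prompt_type": "Multi Turn", "prompt": ["turn 1", "turn 2"]},
]

single_turn_prompts = [
    example["prompt"]
    for example in raw_examples
    if example.get("prompt_type") == "Single Prompt"
]
```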