FEAT: Added Red Team Social Bias dataset #714
Conversation
Nice work, @MoolmanM! This one is almost ready, just a few small things.
Something seems off given the number of changes included now. Perhaps the merge went wrong? Happy to help if you want.
Thank you for the offer! I really appreciate it. I decided to take this on myself as part of my learning process, and I believe I've got everything sorted now. Thanks again for your support!
Nice, just a small comment.
@svannie678 FYI
Overview
This PR integrates the Red Team Social Bias dataset. The dataset aggregates and unifies existing red-teaming prompts designed to identify stereotypes, discrimination, hate speech, and other representation harms in text-based Large Language Models (LLMs).
Work Completed
1. Dataset Integration: Added functionality to fetch the Red Team Social Bias dataset using `fetch_red_team_social_bias_prompts_dataset` (a minimal usage sketch follows this list).
2. Demonstration Notebook and Script: Introduced a demo `.ipynb` and `.py` under `doc/code/orchestrators/`.
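A minimal sketch of how the new fetch function could be used, assuming a configured PyRIT prompt target; the `OpenAIChatTarget` and `PromptSendingOrchestrator` wiring below is illustrative, and the exact orchestrator API may differ between PyRIT versions:

```python
# Minimal sketch: fetch the Red Team Social Bias dataset and send its prompts.
# fetch_red_team_social_bias_prompts_dataset is the function added in this PR;
# the target and orchestrator choices below are illustrative assumptions.
import asyncio

from pyrit.datasets import fetch_red_team_social_bias_prompts_dataset
from pyrit.orchestrator import PromptSendingOrchestrator
from pyrit.prompt_target import OpenAIChatTarget


async def main():
    dataset = fetch_red_team_social_bias_prompts_dataset()

    # Assumes the dataset exposes its entries as seed prompts with a text value.
    prompts = [seed.value for seed in dataset.prompts]

    target = OpenAIChatTarget()
    orchestrator = PromptSendingOrchestrator(objective_target=target)
    await orchestrator.send_prompts_async(prompt_list=prompts)


if __name__ == "__main__":
    asyncio.run(main())
```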
Related Issue
Issue #661
Concern: Handling Multi-Turn Prompts
Currently, I have implemented extraction of single prompts from the dataset, but I am unsure how to proceed with multi-turn prompts (a rough sketch of the current approach follows below).
Let me know how you'd like me to handle this, and I can adjust accordingly.
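To make the multi-turn concern concrete, here is a rough sketch of the current single-prompt filtering; the field names and values ("prompt_type", "Single Prompt") are assumptions about the source data rather than the confirmed schema:

```python
# Illustrative sketch of the current behaviour: only single prompts are kept,
# and multi-turn entries are skipped until a handling strategy is agreed on.
# The field names below are hypothetical, not the confirmed dataset schema.
raw_examples = [
    {"prompt_type": "Single Prompt", "prompt": "..."},
    {"prompt_type": "Multi Turn", "prompt": ["turn 1", "turn 2"]},
]

single_turn_prompts = [
    example["prompt"]
    for example in raw_examples
    if example.get("prompt_type") == "Single Prompt"
]
```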