- Survey
- LLM as Attacker
- Transferability/Diversity/Dynamism
- Benchmark
- Multi-Turn Attack
- Mechanism and Vulnerability
- Safety Alignment
- Multi-Target
- Unlearning
- Bias
- Self-Evolution
- New Jailbreak Method
Publish Date | Title | Authors | Paper | Code
---|---|---|---|---
2024-12-08 | Heuristic-Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models | Ma Teng et al. | 2412.05934 | null
2024-02-29 [ICLR 2024] | Curiosity-driven Red-teaming for Large Language Models | Zhang-Wei Hong et al. | 2402.19464 | link
2024-06-07 [ICML 2024] | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Xingang Guo et al. | 2402.08679 | link
2024-02-05 | Weak-to-Strong Jailbreaking on Large Language Models | Xuandong Zhao et al. | 2401.17256 | link