Introducing LLM Breakout DB: The Public, Crowd-Sourced Jailbreak Prompt Database
As AI systems become more prevalent and powerful, ensuring their robustness and security has never been more critical. LLM Breakout DB is an initiative designed to catalog and track jailbreak techniques used to bypass safety mechanisms in Large Language Models (LLMs). Inspired by the cybersecurity CVE (Common Vulnerabilities and Exposures) framework, this project aims to create an open and structured repository of known vulnerabilities in AI systems, fostering responsible disclosure and ethical AI development.
Jailbreaking AI models is a growing concern, with researchers and adversaries continuously finding ways to bypass safeguards. By openly documenting these exploits, this database serves multiple purposes:
- Enhancing AI Safety: By studying known jailbreaks, AI developers can strengthen defenses and build more secure models.
- Community Collaboration: Encouraging public contributions and verification fosters transparency and shared learning.
- Responsible Disclosure: AI providers can address vulnerabilities proactively, reducing potential harm before widespread misuse.
- Structured Vulnerability Tracking: Inspired by the CVE model, each jailbreak entry is standardized, ensuring clear documentation and enabling reproducibility testing (a sketch of such an entry follows this list).
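To make the CVE analogy concrete, here is a minimal sketch of what a standardized entry could look like. The field names and the "LBD-" ID scheme are illustrative assumptions, not the platform's actual schema:

```python
from dataclasses import dataclass, field
from datetime import date

# Illustrative sketch of a standardized entry, loosely modeled on CVE records.
# The field names and the "LBD-" ID scheme are hypothetical, not the platform's
# actual schema.
@dataclass
class JailbreakEntry:
    entry_id: str                  # e.g. "LBD-2025-0001", a CVE-style identifier
    title: str                     # short human-readable summary
    prompt: str                    # the jailbreak prompt text
    affected_models: list[str]     # models the exploit was reproduced against
    provider: str                  # vendor of the affected model(s)
    disclosed: date                # public disclosure date
    effectiveness: float = 0.0     # community rating, 0.0 to 1.0
    reproducibility: float = 0.0   # community rating, 0.0 to 1.0
    evidence: list[str] = field(default_factory=list)  # links to logs/screenshots
```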
The Minimum Viable Product (MVP) of this system is now available for public testing. The following features are included:
A searchable and categorized database where users can:
- Browse known jailbreak prompts and their effectiveness scores.
- Filter vulnerabilities by affected LLM model, version, and provider (a filtering sketch follows this list).
- View community-rated effectiveness and reproducibility scores.
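As a rough illustration of the browse-and-filter workflow, the following sketch reuses the hypothetical JailbreakEntry above to filter entries by model and provider and sort them by community effectiveness score:

```python
def search(entries: list[JailbreakEntry],
           model: str | None = None,
           provider: str | None = None,
           min_effectiveness: float = 0.0) -> list[JailbreakEntry]:
    """Filter entries by affected model, provider, and a minimum
    effectiveness score, then sort by community rating, highest first."""
    hits = [
        e for e in entries
        if (model is None or model in e.affected_models)
        and (provider is None or e.provider == provider)
        and e.effectiveness >= min_effectiveness
    ]
    return sorted(hits, key=lambda e: e.effectiveness, reverse=True)
```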
Registered users can:
- Submit new jailbreak prompts with supporting evidence such as screenshots, logs, and explanations (an example submission is sketched after this list).
- Rate and comment on existing jailbreaks.
- Flag outdated or ineffective exploits for review.
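For illustration, a submission might look something like the call below. The endpoint, authentication scheme, and payload fields are assumptions made for this sketch, not the platform's documented API:

```python
import requests

# Hypothetical submission call: the endpoint, auth scheme, and payload fields
# are assumptions for illustration, not the platform's documented API.
payload = {
    "title": "Role-play override via nested instructions",
    "prompt": "<prompt text>",
    "affected_models": ["example-model-v2"],
    "provider": "ExampleAI",
    "evidence": ["https://example.com/chat-log.txt"],
    "explanation": "Model complies when the request is framed as fiction.",
}
resp = requests.post(
    "https://llmbreakoutdb.example/api/submissions",   # placeholder URL
    json=payload,
    headers={"Authorization": "Bearer <api-token>"},   # placeholder token
)
resp.raise_for_status()
print(resp.json())  # assumed to echo back the created entry
```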
To ensure data quality and integrity, the platform implements a multi-tiered verification system (one possible status workflow is sketched after this list):
- Verified Researchers: Experts assess submissions for credibility and impact.
- Moderators: Oversee submissions, manage disputes, and ensure compliance with disclosure policies.
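One plausible way to model this workflow is a small state machine over verification statuses. The status names and transitions below are assumptions, not the platform's actual moderation pipeline:

```python
from enum import Enum

# Hypothetical verification statuses and allowed transitions; the real
# moderation pipeline may differ.
class Status(Enum):
    SUBMITTED = "submitted"        # new report from a registered user
    UNDER_REVIEW = "under_review"  # moderators are triaging it
    VERIFIED = "verified"          # a verified researcher reproduced it
    DISPUTED = "disputed"          # credibility or impact is contested
    RETIRED = "retired"            # flagged as outdated or patched

TRANSITIONS = {
    Status.SUBMITTED: {Status.UNDER_REVIEW},
    Status.UNDER_REVIEW: {Status.VERIFIED, Status.DISPUTED},
    Status.VERIFIED: {Status.DISPUTED, Status.RETIRED},
    Status.DISPUTED: {Status.UNDER_REVIEW, Status.RETIRED},
    Status.RETIRED: set(),
}

def advance(current: Status, target: Status) -> Status:
    """Move an entry to a new status, rejecting invalid transitions."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.name} to {target.name}")
    return target
```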
AI developers can opt into pre-publication notifications of vulnerabilities affecting their models. A controlled disclosure window (e.g., 30 days) allows vendors to implement fixes before public release.
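In code, the embargo logic is straightforward; this sketch assumes the 30-day example window mentioned above:

```python
from datetime import date, timedelta

EMBARGO_DAYS = 30  # the example window from this post; actual policy may differ

def public_release_date(vendor_notified: date,
                        embargo_days: int = EMBARGO_DAYS) -> date:
    """Earliest date an entry becomes publicly visible after the
    affected vendor has been notified."""
    return vendor_notified + timedelta(days=embargo_days)

# A vendor notified on 2025-06-01 would see publication no earlier than 2025-07-01.
print(public_release_date(date(2025, 6, 1)))
```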
To incentivize high-quality contributions, a leaderboard ranks contributors based on the following (a possible scoring formula is sketched after this list):
- Verified submissions.
- Effectiveness and reproducibility ratings.
- Community engagement.
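The announcement does not specify the ranking formula, but a simple weighted sum over these three criteria might look like this, with weights invented for illustration:

```python
# Invented weights for the three leaderboard criteria; the announcement
# does not specify the actual formula.
WEIGHTS = {"verified": 10.0, "ratings": 5.0, "engagement": 1.0}

def contributor_score(verified_submissions: int,
                      avg_rating: float,        # mean effectiveness/reproducibility, 0-1
                      engagement_events: int    # comments, flags, reviews
                      ) -> float:
    """Weighted sum of a contributor's activity on the platform."""
    return (WEIGHTS["verified"] * verified_submissions
            + WEIGHTS["ratings"] * avg_rating
            + WEIGHTS["engagement"] * engagement_events)
```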
While the MVP lays the foundation, Version 1.0 and beyond will expand capabilities with:
- API Access: Providing programmatic access to jailbreak data (a speculative sketch follows this list).
- Advanced Security Measures: Anti-spam protections, tamper-proof records, and stronger user verification.
- Extended Responsible Disclosure Options: More granular vendor control over disclosure timing.
- Enhanced UI & Analytics: Improving search, visualization, and reporting functionalities.
LLM Breakout DB is a collaborative effort that relies on contributions from security researchers, AI developers, and ethical hackers. Whether you want to submit findings, verify reports, or explore AI vulnerabilities, you can start engaging with the platform today.
🔗 Access the MVP here: LLM Breakout DB
📢 Interested in collaborating? Contact us to discuss research partnerships with AI alignment institutions and universities: https://linkedin.com/in/carloshvp
Together, we can build a safer and more transparent AI ecosystem.