GenAI-Gurus/llm_breakout_db

Introducing LLM Breakout DB: the Public Crowd-Sourced Jailbreak Prompt Database

A New Initiative for AI Security and Transparency

As AI systems become more prevalent and powerful, ensuring their robustness and security has never been more critical. LLM Breakout DB is an initiative designed to catalog and track jailbreak techniques used to bypass safety mechanisms in Large Language Models (LLMs). Inspired by the cybersecurity CVE (Common Vulnerabilities and Exposures) framework, this project aims to create an open and structured repository of known vulnerabilities in AI systems, fostering responsible disclosure and ethical AI development.

Why This Project Matters

Jailbreaking AI models is a growing concern, with researchers and adversaries continuously finding ways to bypass safeguards. By openly documenting these exploits, this database serves multiple purposes:

  • Enhancing AI Safety: By studying known jailbreaks, AI developers can strengthen defenses and build more secure models.
  • Community Collaboration: Encouraging public contributions and verification fosters transparency and shared learning.
  • Responsible Disclosure: AI providers can address vulnerabilities proactively, reducing potential harm before widespread misuse.
  • Structured Vulnerability Tracking: Inspired by the CVE model, each jailbreak entry follows a standardized format, ensuring clear documentation and enabling reproducibility testing.
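To illustrate the CVE-inspired structure, here is a minimal sketch of what a standardized jailbreak entry might look like. The field names, ID scheme, and status values are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class JailbreakEntry:
    """Hypothetical CVE-style record for one documented jailbreak."""
    entry_id: str                  # e.g. "LBD-2025-0001", mirroring CVE IDs
    title: str
    prompt: str
    affected_models: list          # e.g. ["example-llm-1.0"]
    provider: str
    effectiveness: float           # community-rated, 0.0 to 1.0
    reproducibility: float         # fraction of successful reproductions
    status: str = "pending"        # pending -> verified -> patched/retired
    evidence: list = field(default_factory=list)  # screenshots, logs, notes

# Example entry (all values are placeholders):
entry = JailbreakEntry(
    entry_id="LBD-2025-0001",
    title="Role-play persona override",
    prompt="<redacted example prompt>",
    affected_models=["example-llm-1.0"],
    provider="ExampleAI",
    effectiveness=0.7,
    reproducibility=0.9,
)
```

A fixed record shape like this is what makes cross-model comparison and reproducibility testing tractable.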

Core Features of the MVP

The Minimum Viable Product (MVP) of this system is now available for public testing. The following features are included:

1. Jailbreak Prompt Database

A searchable and categorized database where users can:

  • Browse known jailbreak prompts and their effectiveness scores.
  • Filter vulnerabilities by affected LLM model, version, and provider.
  • View community-rated effectiveness and reproducibility scores.
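The filtering described above could be sketched as a simple predicate over entry records. The dictionary keys and parameters here are illustrative assumptions, not the platform's real API.

```python
def filter_entries(entries, model=None, provider=None, min_effectiveness=0.0):
    """Return entries matching an affected model and/or provider,
    keeping only those at or above a minimum effectiveness score."""
    result = []
    for e in entries:
        if model is not None and model not in e["affected_models"]:
            continue
        if provider is not None and e["provider"] != provider:
            continue
        if e["effectiveness"] < min_effectiveness:
            continue
        result.append(e)
    return result

# Toy data demonstrating a provider + score filter:
entries = [
    {"affected_models": ["example-llm-1.0"], "provider": "ExampleAI",
     "effectiveness": 0.7},
    {"affected_models": ["other-llm"], "provider": "OtherAI",
     "effectiveness": 0.4},
]
hits = filter_entries(entries, provider="ExampleAI", min_effectiveness=0.5)
```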

2. Community Submissions & Verification

Registered users can:

  • Submit new jailbreak prompts with supporting evidence (screenshots, logs, explanations).
  • Rate and comment on existing jailbreaks.
  • Flag outdated or ineffective exploits for review.

3. Structured Review and Moderation

To ensure data quality and integrity, a multi-tiered verification system is implemented:

  • Verified Researchers: Experts assess submissions for credibility and impact.
  • Moderators: Oversee submissions, manage disputes, and ensure compliance with disclosure policies.

4. Responsible Disclosure Mechanism

AI developers can opt into pre-publication notifications of vulnerabilities affecting their models. A controlled disclosure window (e.g., 30 days) allows vendors to implement fixes before public release.
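The controlled disclosure window above could be enforced with date arithmetic like the following. The 30-day default matches the example in the text; the function names and the rest of the logic are assumptions.

```python
from datetime import date, timedelta

def public_release_date(vendor_notified_on: date, window_days: int = 30) -> date:
    """Earliest date an entry may be published after the vendor is notified."""
    return vendor_notified_on + timedelta(days=window_days)

def is_publishable(vendor_notified_on: date, today: date,
                   window_days: int = 30) -> bool:
    """True once the disclosure window has elapsed."""
    return today >= public_release_date(vendor_notified_on, window_days)
```

For example, an entry whose vendor was notified on 2025-01-01 would become publishable on 2025-01-31 under the default 30-day window.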

5. Leaderboard & Scoring System

To incentivize high-quality contributions, a leaderboard ranks contributors based on:

  • Verified submissions.
  • Effectiveness and reproducibility ratings.
  • Community engagement.
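One way the three ranking factors above could be combined is sketched below. The weighting is an illustrative assumption, not the platform's published formula.

```python
def contributor_score(verified_submissions: int,
                      avg_effectiveness: float,
                      avg_reproducibility: float,
                      engagement_points: int) -> float:
    """Combine the three leaderboard factors into a single score.
    Quality (effectiveness + reproducibility) scales the value of each
    verified submission; engagement adds a flat bonus."""
    quality = (avg_effectiveness + avg_reproducibility) / 2  # 0.0 to 1.0
    return verified_submissions * 10 * quality + engagement_points

# A contributor with 5 verified submissions, 0.8 avg effectiveness,
# 0.6 avg reproducibility, and 12 engagement points:
score = contributor_score(5, 0.8, 0.6, 12)  # 5 * 10 * 0.7 + 12 = 47.0
```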

Future Development Roadmap

While the MVP lays the foundation, Version 1.0 and beyond will expand capabilities with:

  • API Access: Providing programmatic access to jailbreak data.
  • Advanced Security Measures: Anti-spam protections, tamper-proof records, and stronger user verification.
  • Extended Responsible Disclosure Options: More granular vendor control over disclosure timing.
  • Enhanced UI & Analytics: Improving search, visualization, and reporting functionalities.

Get Involved

LLM Breakout DB is a collaborative effort that relies on contributions from security researchers, AI developers, and ethical hackers. Whether you want to submit findings, verify reports, or explore AI vulnerabilities, you can start engaging with the platform today.

🔗 Access the MVP here: LLM Breakout DB

📢 Interested in collaborating? Contact us to discuss research partnerships with AI alignment institutions and universities: https://linkedin.com/in/carloshvp

Together, we can build a safer and more transparent AI ecosystem.
