Prompt Jailbreak Example - HuggingFace Inference Endpoints

This repository demonstrates the use of a prompt jailbreak to expose information within a system prompt. Specifically, we target any LLM hosted on HuggingFace Inference Endpoints. The standard example shows jailbreaks upon google/gemma-7b-it.

Instructions

Execute pip install -r requirements.txt to install the necessary dependencies.
Set your HF_TOKEN environment variable in the jailbreak.py file - Line 28.
Run the jailbreak with python jailbreak.py.

The expected output should be two arrays - one containing the original responses from the LLM, and another with the jailbreak applied.

Other LLMs and Examples

This repo supports any LLM hosted on HuggingFace Inference. If you wish to change the target LLM, simply modify API_URL on line 27 in jailbreak.py. Furthermore, if you wish to send different user prompts or alter/change the jailbreak or system prompt, you can also find this at the top of the script.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.devcontainer		.devcontainer
.gitignore		.gitignore
README.md		README.md
jailbreak.py		jailbreak.py
model.py		model.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Prompt Jailbreak Example - HuggingFace Inference Endpoints

Instructions

Other LLMs and Examples

About

Releases

Packages

Languages

Mindgard/prompt_jailbreak

Folders and files

Latest commit

History

Repository files navigation

Prompt Jailbreak Example - HuggingFace Inference Endpoints

Instructions

Other LLMs and Examples

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages