From 760bf76d77a69300c24927dc95fa3ad3dd096313 Mon Sep 17 00:00:00 2001
From: Leon Derczynski
Date: Wed, 30 Oct 2024 09:08:17 +0100
Subject: [PATCH 1/3] add scope comment

---
 docs/source/contributing.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/source/contributing.rst b/docs/source/contributing.rst
index f077716f..fb649c3f 100644
--- a/docs/source/contributing.rst
+++ b/docs/source/contributing.rst
@@ -10,6 +10,14 @@ You can use github's fork function to do this, which makes a copy of the ``garak
 In there, you can edit code and build.
 Once you're done, make a pull request to the main repo and describe what you've done -- and we'll take it from there!
 
+Checking your contribution is within scope
+------------------------------------------
+
+``garak`` is a security toolkit rather than a content safety or bias toolkit.
+The project scope relates primarily to LLM & dialog system security.
+This is a huge area, and you can get an idea of the kind of contributions that are in scope from our `FAQ _` and our `Github issues `_ page.
+
+
 Connecting with the ``garak`` team & community
 ----------------------------------------------
 

From c90311b54a252b95745ea08d6f86e1a48a654714 Mon Sep 17 00:00:00 2001
From: Leon Derczynski
Date: Wed, 30 Oct 2024 09:22:50 +0100
Subject: [PATCH 2/3] link primitives to doc pages

---
 docs/source/basic.rst | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/docs/source/basic.rst b/docs/source/basic.rst
index efcf51e4..174460a1 100644
--- a/docs/source/basic.rst
+++ b/docs/source/basic.rst
@@ -9,8 +9,7 @@ organise this process.
 
 generators
 ----------
-
-Generators wrap a target LLM or dialogue system. They take a prompt
+:doc:`` wrap a target LLM or dialogue system. They take a prompt
 and return the output. The rest is abstracted away.
 Generator classes deal with things like authentication, loading, connection
 management, backoff, and all the behind-the-scenes things that need to happen
@@ -18,14 +17,14 @@ to get that prompt/response interaction working.
 
 probes
 ------
-Each probe tries to exploit a weakness and elicit a failure. The probe
+:doc:`` tries to exploit a weakness and elicit a failure. The probe
 manages all the interaction with the generator. It determines how often to
 prompt, and what the content of the prompts is. Interaction between probes
 and generators is mediated in an object called an attempt.
 
 attempt
 -------
-Attempts represent one unique try at breaking the target. A probe wraps
+An :doc:`` represents one unique try at breaking the target. A probe wraps
 up each of its adversarial interactions in an attempt object, and passes this
 to the generator. The generator adds responses into the attempt and sends the
 attempt back. This is logged in ``garak`` reporting which contains (among other
@@ -37,7 +36,7 @@ detector.
 
 detectors
 ---------
-Each detector attempts to identify a single failure mode. This could be
+:doc:`` attempt to identify a single failure mode. This could be
 for example some unsafe contact, or failure to refuse a request. Detectors
 do this by examining outputs that are stored in a prompt, looking for a
 certain phenomenon. This could be a lack of refusal, or continuation of a
@@ -46,19 +45,19 @@ string in a certain way, or decoding an encoded prompt, for example.
 
 buffs
 -----
-Buffs adjust prompts before they're sent to a generator. This could involve
+:doc:`` adjust prompts before they're sent to a generator. This could involve
 translating them to another language, or adding paraphrases for probes that
 have only a few, static prompts.
 
 evaluators
 ----------
-When detectors have added judgments to attempts, an evaluator converts the results
+When detectors have added judgments to attempts, :doc:`` converts the results
 to an object containing pass/fail data for a specific probe and detector pair.
 
 harnesses
 ---------
-The harnesses manage orchestration of a ``garak`` run. They select probes, then
+The :doc:`` manage orchestration of a ``garak`` run. They select probes, then
 detectors, and co-ordinate running probes, passing results to detectors, and
 doing the final evaluation

From 7c81501164109e5dbc783a616f4b9fa70b50218a Mon Sep 17 00:00:00 2001
From: Leon Derczynski
Date: Wed, 30 Oct 2024 09:22:57 +0100
Subject: [PATCH 3/3] flesh out contributing guide more

---
 docs/source/contributing.rst | 51 ++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/docs/source/contributing.rst b/docs/source/contributing.rst
index fb649c3f..32c0f61a 100644
--- a/docs/source/contributing.rst
+++ b/docs/source/contributing.rst
@@ -6,8 +6,8 @@ Getting ready
 
 ``garak``'s codebase is managed using github.
 The best and only way to contribute to ``garak`` is to start by getting a copy of the source code.
-You can use github's fork function to do this, which makes a copy of the ``garak`` codebase under your github user.
-In there, you can edit code and build.
+You can use github's fork function to do this, which makes a copy of the ``garak`` codebase under your github account.
+In there, you can branch, edit code and build.
 Once you're done, make a pull request to the main repo and describe what you've done -- and we'll take it from there!
 
 Checking your contribution is within scope
@@ -18,7 +18,6 @@ The project scope relates primarily to LLM & dialog system security.
 This is a huge area, and you can get an idea of the kind of contributions that are in scope from our `FAQ _` and our `Github issues `_ page.
 
 
-
 Connecting with the ``garak`` team & community
 ----------------------------------------------
 
@@ -31,6 +30,30 @@ There are a number of ways you can reach out to us:
 
 We'd love to help, and we're always interested to hear how you're using garak.
 
+
+Checklist for contributing
+--------------------------
+
+#. Set up a `Github `_ account, if you don't have one already. We develop in the open and the public repository is the authoritative one.
+#. Fork the ``garak`` repository - ``_
+#. Work out what you're doing. If it's from a good first issue (`see the list `_), drop a note on that issue so that we know you're working on it, and so that nobody else also starts working on it.
+#. Before you code anything: create a new branch for your work, e.g. ``git checkout -b feature/spicy_probe``
+#. Check out the rest of this page, which includes links to detailed step-by-step guides to developing garak plugins
+#. Code!
+#. Run ``black`` on your code, so that it's well-formatted. Our github commit hook can refuse to accept code that doesn't pass ``black``.
+#. Write your own tests - these are a requirement for merging!
+#. When you're done, send a pull request. Github has big buttons for this and there's a template for you to fill in.
+#. We'll discuss the code together with you, tune it up, and hopefully merge it in, maybe with some edits!
+#. Now you're an official ``garak`` contributor, and will be permanently recognized in the project credits from the next official release. Thank you!
+
+
+
+Code structure
+--------------
+
+We have a page describing the :doc:`top-level concepts in garak `.
+Rather than repeat that, take a look, so you have an idea about the code base!
+
 Developing your own plugins
 ---------------------------
 
@@ -40,10 +63,10 @@ The recipe for writing a new plugin or plugin class isn't outlandish:
 * Only start a new module if none of the current modules could fit
 * Take a look at how other plugins do it
 
-  * For an example Generator, check out `garak/probes/replicate.py`
-  * For an example Probe, check out `garak/probes/malwaregen.py`
-  * For an example Detector, check out `garak/detectors/toxicity.py` or `garak/detectors/specialwords.py`
-  * For an example Buff, check out `garak/buffs/lowercase.py`
+  * For an example Generator, check out :class:`garak.generators.replicate`
+  * For an example Probe, check out :class:`garak.probes.malwaregen`
+  * For an example Detector, check out :class:`garak.detectors.toxicity` or :class:`garak.detectors.specialwords`
+  * For an example Buff, check out :class:`garak.buffs.lowercase`
 
 * Start a new module inheriting from one of the base classes, e.g. :class:`garak.probes.base.Probe`
 * Override as little as possible.
@@ -63,7 +86,7 @@ Describing your code changes
 Commit messages
 ~~~~~~~~~~~~~~~
 
-Commit messages should describe what is changed in the commit. Try to keep one "theme" per commit. We read commit messages to work out what the intent of the commit is. We're all trying to save time here, and clear commit messages that include context can be a great time saver. Check out this guide to writing [commit messages](https://www.freecodecamp.org/news/how-to-write-better-git-commit-messages/).
+Commit messages should describe what is changed in the commit. Try to keep one "theme" per commit. We read commit messages to work out what the intent of the commit is. We're all trying to save time here, and clear commit messages that include context can be a great time saver. Check out this guide to writing `commit messages `_.
 
 Pull requests
 ~~~~~~~~~~~~~
@@ -83,13 +106,13 @@ Testing during development
 You can test your code in a few ways:
 
 * Start an interactive Python session
-  * Import the model, e.g. `import garak.probes.mymodule`
-  * Instantiate the plugin, e.g. `p = garak.probes.mymodule.MyProbe()`
-* Get ``garak`` to list all the plugins of the type you're writing, with `--list_probes`, `--list_detectors`, or `--list_generators`: `python3 -m garak --list_probes`
+  * Instantiate the plugin, e.g. ``import garak._plugins`` then ``probe = garak._plugins.load_plugin("garak.probes.mymodule.MyProbe")``
+  * Check that the values and methods work as you'd expect
+* Get ``garak`` to list all the plugins of the type you're writing, with ``--list_probes``, ``--list_detectors``, or ``--list_generators``: ``python3 -m garak --list_probes``
 * Run a scan with test plugins
-  * For probes, try a blank generator and always.Pass detector: `python3 -m garak -m test.Blank -p mymodule -d always.Pass`
-  * For detectors, try a blank generator and a blank probe: `python3 -m garak -m test.Blank -p test.Blank -d mymodule`
-  * For generators, try a blank probe and always.Pass detector: `python3 -m garak -m mymodule -p test.Blank -d always.Pass`
+  * For probes, try a blank generator and always.Pass detector: ``python3 -m garak -m test.Blank -p mymodule -d always.Pass``
+  * For detectors, try a blank generator and a blank probe: ``python3 -m garak -m test.Blank -p test.Blank -d mymodule``
+  * For generators, try a blank probe and always.Pass detector: ``python3 -m garak -m mymodule -p test.Blank -d always.Pass``
 
 garak supports pytest tests in garak/tests. You can run these with ``python -m pytest tests/`` from the root directory.