Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: clarify "contributing" document #968

Merged
merged 3 commits into from
Oct 30, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 7 additions & 8 deletions docs/source/basic.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,23 +9,22 @@ organise this process.

generators
----------

Generators wrap a target LLM or dialogue system. They take a prompt
:doc:`<generators>` wrap a target LLM or dialogue system. They take a prompt
and return the output. The rest is abstracted away. Generator classes
deal with things like authentication, loading, connection management,
backoff, and all the behind-the-scenes things that need to happen
to get that prompt/response interaction working.

probes
------
Each probe tries to exploit a weakness and elicit a failure. The probe
:doc:`<probes>` tries to exploit a weakness and elicit a failure. The probe
manages all the interaction with the generator. It determines how
often to prompt, and what the content of the prompts is. Interaction
between probes and generators is mediated in an object called an attempt.

attempt
-------
Attempts represent one unique try at breaking the target. A probe wraps
An :doc:`<attempt>` represents one unique try at breaking the target. A probe wraps
up each of its adversarial interactions in an attempt object, and passes this
to the generator. The generator adds responses into the attempt and sends
the attempt back. This is logged in ``garak`` reporting which contains (among other
Expand All @@ -37,7 +36,7 @@ detector.

detectors
---------
Each detector attempts to identify a single failure mode. This could be
:doc:`<detectors>` attempt to identify a single failure mode. This could be
for example some unsafe contact, or failure to refuse a request. Detectors
do this by examining outputs that are stored in a prompt, looking for a
certain phenomenon. This could be a lack of refusal, or continuation of a
Expand All @@ -46,19 +45,19 @@ string in a certain way, or decoding an encoded prompt, for example.

buffs
-----
Buffs adjust prompts before they're sent to a generator. This could involve
:doc:`<buffs>` adjust prompts before they're sent to a generator. This could involve
translating them to another language, or adding paraphrases for probes that
have only a few, static prompts.


evaluators
----------
When detectors have added judgments to attempts, an evaluator converts the results
When detectors have added judgments to attempts, :doc:`<evaluators>` converts the results
to an object containing pass/fail data for a specific probe and detector pair.

harnesses
---------
The harnesses manage orchestration of a ``garak`` run. They select probes, then
The :doc:`<harnesses>` manage orchestration of a ``garak`` run. They select probes, then
detectors, and co-ordinate running probes, passing results to detectors, and
doing the final evaluation

Expand Down
57 changes: 44 additions & 13 deletions docs/source/contributing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,17 @@ Getting ready

``garak``'s codebase is managed using github.
The best and only way to contribute to ``garak`` is to start by getting a copy of the source code.
You can use github's fork function to do this, which makes a copy of the ``garak`` codebase under your github user.
In there, you can edit code and build.
You can use github's fork function to do this, which makes a copy of the ``garak`` codebase under your github account.
In there, you can branch, edit code and build.
Once you're done, make a pull request to the main repo and describe what you've done -- and we'll take it from there!

Checking your contribution is within scope
------------------------------------------

``garak`` is a security toolkit rather than a content safety or bias toolkit.
The project scope relates primarily to LLM & dialog system security.
This is a huge area, and you can get an idea of the kind of contributions that are in scope from our `FAQ <https://github.com/leondz/garak/blob/main/FAQ.md>_` and our `Github issues <https://github.com/leondz/garak/issues>`_ page.


Connecting with the ``garak`` team & community
----------------------------------------------
Expand All @@ -23,6 +30,30 @@ There are a number of ways you can reach out to us:

We'd love to help, and we're always interested to hear how you're using garak.


Checklist for contributing
--------------------------

1. Set up a `Github <https://github.com/>`_ account, if you don't have one already. We develop in the open and the public repository is the authoritative one.
1. Fork the ``garak`` repository - `<https://github.com/leondz/garak/fork>`_
1. Work out what you're doing. If it's from a good first issue (`see the list <https://github.com/leondz/garak/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22>`_), drop a note on that issue so that we know you're working on it, and so that nobody else also starts working on it.
1. Before you code anything: create a new branch for your work, e.g. ``git checkout -b feature/spicy_probe``
1. Check out the rest of this page which includes links to detailed step-by-step guides to developing garak plugins
1. Code!
1. Run ``black`` on your code, so that it's well-formatted. Our github commit hook can refuse to accept ``black``-passing code.
1. Write your own tests - these are a requirement for merging!
1. When you're done, send a pull request. Github has big buttons for this and there's a template for you to fill in.
1. We'll discuss the code together with you, tune it up, and hopefully merge it in, maybe with some edits!
1. Now you're an official ``garak`` contributor, and will be permanently recognized in the project credits from the next official release. Thank you!



Code structure
--------------

We have a page describing the :doc:`top-level concepts in garak <basic>`.
Rather than repeat that, take a look, so you have an idea about the code base!

Developing your own plugins
---------------------------

Expand All @@ -32,10 +63,10 @@ The recipe for writing a new plugin or plugin class isn't outlandish:

* Only start a new module if none of the current modules could fit
* Take a look at how other plugins do it
* For an example Generator, check out `garak/probes/replicate.py`
* For an example Probe, check out `garak/probes/malwaregen.py`
* For an example Detector, check out `garak/detectors/toxicity.py` or `garak/detectors/specialwords.py`
* For an example Buff, check out `garak/buffs/lowercase.py`
* For an example Generator, check out :class:`garak.probes.replicate`
* For an example Probe, check out :class:`garak.probes.malwaregen`
* For an example Detector, check out :class:`garak.detectors.toxicity` or :class:`garak.detectors.specialwords`
* For an example Buff, check out :class:`garak.buffs.lowercase`
* Start a new module inheriting from one of the base classes, e.g. :class:`garak.probes.base.Probe`
* Override as little as possible.

Expand All @@ -55,7 +86,7 @@ Describing your code changes
Commit messages
~~~~~~~~~~~~~~~

Commit messages should describe what is changed in the commit. Try to keep one "theme" per commit. We read commit messages to work out what the intent of the commit is. We're all trying to save time here, and clear commit messages that include context can be a great time saver. Check out this guide to writing [commit messages](https://www.freecodecamp.org/news/how-to-write-better-git-commit-messages/).
Commit messages should describe what is changed in the commit. Try to keep one "theme" per commit. We read commit messages to work out what the intent of the commit is. We're all trying to save time here, and clear commit messages that include context can be a great time saver. Check out this guide to writing `commit messages <https://www.freecodecamp.org/news/how-to-write-better-git-commit-messages/>`_.

Pull requests
~~~~~~~~~~~~~
Expand All @@ -75,13 +106,13 @@ Testing during development
You can test your code in a few ways:

* Start an interactive Python session
* Import the model, e.g. `import garak.probes.mymodule`
* Instantiate the plugin, e.g. `p = garak.probes.mymodule.MyProbe()`
* Get ``garak`` to list all the plugins of the type you're writing, with `--list_probes`, `--list_detectors`, or `--list_generators`: `python3 -m garak --list_probes`
* Instantiate the plugin, e.g. ``import garak._plugins`` then ``probe = garak._plugins.load_plugin("garak.probes.mymodule.MyProbe")``
* Check out that the values and methods work as you'd expect
* Get ``garak`` to list all the plugins of the type you're writing, with ``--list_probes``, ``--list_detectors``, or ``--list_generators``: ```python3 -m garak --list_probes``
* Run a scan with test plugins
* For probes, try a blank generator and always.Pass detector: `python3 -m garak -m test.Blank -p mymodule -d always.Pass`
* For detectors, try a blank generator and a blank probe: `python3 -m garak -m test.Blank -p test.Blank -d mymodule`
* For generators, try a blank probe and always.Pass detector: `python3 -m garak -m mymodule -p test.Blank -d always.Pass`
* For probes, try a blank generator and always.Pass detector: ``python3 -m garak -m test.Blank -p mymodule -d always.Pass``
* For detectors, try a blank generator and a blank probe: ``python3 -m garak -m test.Blank -p test.Blank -d mymodule``
* For generators, try a blank probe and always.Pass detector: ``python3 -m garak -m mymodule -p test.Blank -d always.Pass``


garak supports pytest tests in garak/tests. You can run these with ``python -m pytest tests/`` from the root directory.
Expand Down
Loading