Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT: Gradio HiTL Scorer #722

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

mart123p
Copy link
Contributor

Description

This PR introduces a new UI for scoring using Gradio, which improves the user experience by running in a separate process.

UI Implementation

The new UI is implemented using Gradio and runs in a different process. This allows the UI to wait on scoring tasks and respond accordingly, preventing the UI from flashing and ensuring it remains on screen throughout the scoring process.

Communication Mechanism

The communication between the UI and the scoring process is done using rpyc, which implements an RPC mechanism on localhost.

Gradio Features

Gradio allows us to display more than just messages. We can display entire conversations or even multimedia content. By default, the Gradio HiTL will be launched in a web view, but the web browser can be opened by setting the argument open_browser to true.

Platform Support

Currently, this implementation supports Windows and is limited to string scoring. Future updates will include support for other platforms and content types.

Dependency Changes

The dependency aiofiles was downgraded from 24.1.0 to 23.2.1 to resolve a compatibility issue with Gradio. No issues were identified with this downgrade in PyRIT.

Tests and Documentation

I am looking for feedback on the types of tests (unit, integration, etc.) and documentation (user guide, API reference, etc.) that should be added for this PR. This scorer is meant as a drop-in replacement for the current HiTL scorer.

@mart123p mart123p changed the title FEAT: Gradio HiTL Scorer [DRAFT] FEAT: Gradio HiTL Scorer Feb 18, 2025
@mart123p mart123p changed the title [DRAFT] FEAT: Gradio HiTL Scorer FEAT: Gradio HiTL Scorer Feb 19, 2025
Copy link
Contributor

@romanlutz romanlutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the comments, I have two more questions:

  • Is there any way to test this in an automated fashion?
  • Do we want to document this already or keep it "soft launched"/experimental until some other features are added or we get feedback? This is pretty cool and warrants a blog at some point for sure!

pyproject.toml Outdated
@@ -37,7 +37,7 @@ classifiers = [
requires-python = ">=3.10, <3.13"
dependencies = [
"aioconsole>=0.7.1",
"aiofiles>=24.1.0",
"aiofiles>=23.2.1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this downgrade necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes there's an issue with Gradio

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we need to document that here otherwise we'll accidentally break it soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this just allows more versions and doesn't actually force downgrading

from typing import Optional

class HumanInTheLoopScorerGradio(Scorer):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class or constructor needs a docstring

if len(sys.argv) > 1:
open_browser = sys.argv[1] == "True"

from scorer import GradioApp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this is here so that we don't import it if the gradio extra isn't installed?

If so, we import gradio in the scorer file. So that won't help, right?

pyrit/ui/rpc.py Outdated
"""
RPC service is the service that RPyC is using
"""
def __init__(self, score_received_sem: Semaphore, client_ready_sem: Semaphore):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things:

  • I'd recommend spelling it out. Characters are free 🙂
  • In most cases, I tell people to prefer kw-only args, so that would be __init__(self, *, ...)

pyrit/ui/rpc.py Outdated

# Start the RPC server.
self.__rpc_service = self.RpcService(self.__score_received_sem, self.__client_ready_sem)
self.__server = self.rpyc.ThreadedServer(self.__rpc_service, port=DEFAULT_PORT, protocol_config={"allow_all_attrs": True})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be configurable in case there's a conflict. A default value is reasonable, of course.

@mart123p
Copy link
Contributor Author

mart123p commented Mar 3, 2025

Aside from the comments, I have two more questions:

  • Is there any way to test this in an automated fashion?
  • Do we want to document this already or keep it "soft launched"/experimental until some other features are added or we get feedback? This is pretty cool and warrants a blog at some point for sure!
  • Automated testing could be done at the RPC level. Testing E2E would require a UI testing framework. I would rather go the low hanging fruit to start testing first.
  • I agree it should definitely be labeled as experimental, and not be ready for mass adoption yet. This will give us some time to review the architecture, maybe add supports for other platforms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants