Debug studio #1831

Closed
wants to merge 31 commits into from

zolinthecow
Contributor

Motivation

When developing apps (especially more complex ones) with SGLang's frontend, it is hard to tell what is actually being passed to the /generate endpoint. This PR introduces a "prompt debugging studio": when debug mode is turned on, generate requests are intercepted and saved to a debug server, and a web interface shows all the prompts and responses for SGL functions.

For example, if you start the debug server:

python -m sglang.launch_debug_server

It will start a debug server on http://0.0.0.0:56765

Then, you can add a debug region to an sgl function:

import sglang as sgl

@sgl.function
def text_qa(s, question):
    # Everything generated after this call is recorded under the name "TEXT_QA".
    s.begin_debug_region("TEXT_QA")
    s += "Q: " + question + "\n"
    s += "A:" + sgl.gen("answer", stop="\n")

state = text_qa.run(
    question="What is the capital of France?",
    temperature=0.1,
    stream=True,
)

The s.begin_debug_region call starts debugging under the prompt name "TEXT_QA". You can then open http://localhost:56765 (you will need SSH port forwarding if you are developing on a remote server), and prompts should start appearing.

image

Modifications

  • Adds BeginDebugRegion and EndDebugRegion to SGLExpr for the frontend language
  • Adds launch_debug_server, which relies on enochian-studio (one of my packages) for the debug server under the hood
    • enochian-studio can be found here; it is kept up to date, since I maintain it together with Etched
  • Adds posting to the debug server if enabled for all backends

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@zolinthecow
Contributor Author

^ Couldn't reproduce CI error locally :(

@zhaochenyang20 self-requested a review on November 1, 2024, 18:11
@zhaochenyang20
Collaborator

@zolinthecow Nice work! I will start to review this PR this weekend. Stay tuned!

Contributor

@merrymercy left a comment

Overall, it looks good! I left a few minor comments.
One thing we want to ensure is that it does not introduce any overhead when debug mode is turned off (e.g., use as few function calls and if statements as possible at runtime).

To use it, first start the debug server:

```bash
python -m sglang.launch_debug_server
```

Contributor

Suggested change
- python -m sglang.launch_debug_server
+ python -m sglang.lang.launch_debug_server

@@ -9,4 +9,5 @@
- `bench_serving.py`: Benchmark online serving with dynamic requests.
- `global_config.py`: The global configs and constants.
- `launch_server.py`: The entry point for launching the local server.
- `launch_debug_server.py`: The entry point for launching the debug server + web app

Contributor

This is an experimental feature for frontend language only, so please move it under python/sglang/lang.

@@ -50,13 +52,37 @@ def generate(
if s.cur_images
else s.text_
)

debug_request_id = str(uuid.uuid4())
s.log_debug(

Contributor

It is better to add s.log_debug to interpreter.py so we do not need to do it for every backend file.
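
A minimal sketch of what this suggestion could look like, under stated assumptions: the logging lives in one interpreter-level method, so the backends stay unaware of debugging. `Interpreter`, `EchoBackend`, `run_gen`, and the `responseContent` field are hypothetical names for illustration, not code from this PR or from SGLang.

```python
import uuid
from datetime import datetime


class EchoBackend:
    """Hypothetical backend; it contains no debug code at all."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


class Interpreter:
    """Hypothetical stand-in for the executor in interpreter.py."""

    def __init__(self, backend, debug_enabled: bool = False):
        self.backend = backend
        self.debug_enabled = debug_enabled
        self.debug_log = []  # records that would be posted to the debug server

    def run_gen(self, prompt: str) -> str:
        # Fast path: a single flag check when debugging is off.
        if not self.debug_enabled:
            return self.backend.generate(prompt)

        # Debug path: record the request, call the backend, record the response.
        record = {
            "id": str(uuid.uuid4()),
            "requestPrompt": prompt,
            "requestTimestamp": datetime.now().isoformat(),
        }
        response = self.backend.generate(prompt)
        record["responseContent"] = response  # hypothetical field name
        self.debug_log.append(record)
        return response
```

With this layout, adding a new backend requires no debug-specific code, because every generate call passes through the same method.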

Comment on lines +97 to +109
debug_request_id = str(uuid.uuid4())
debug_obj = s.log_debug(
    [
        {
            "id": debug_request_id,
            "requestPrompt": str(
                [{"role": "system", "content": system}] + messages
            ),
            "requestTimestamp": datetime.now().isoformat(),
            "requestMetadata": sampling_params.to_anthropic_kwargs(),
        }
    ]
)

Contributor

This is not efficient enough. When debug is turned off, you still run the code to construct the argument, which takes some time. Please minimize the overhead and do not construct any objects when debug is turned off.
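
One way to address this, sketched under assumptions: check a flag before building anything, so the disabled path costs one attribute test and creates no UUID, string, or dict. `DebugState`, `debug_enabled`, and `build_debug_record` are hypothetical names for illustration; only the record fields are taken from the diff above.

```python
import uuid
from datetime import datetime


class DebugState:
    """Hypothetical stand-in for the program state `s` in the diff."""

    def __init__(self, debug_enabled: bool = False):
        self.debug_enabled = debug_enabled
        self.records = []  # records that would be posted to the debug server

    def log_debug(self, records):
        self.records.extend(records)


def build_debug_record(system, messages, metadata):
    """Build the request record; only called when debugging is on."""
    return {
        "id": str(uuid.uuid4()),
        "requestPrompt": str([{"role": "system", "content": system}] + messages),
        "requestTimestamp": datetime.now().isoformat(),
        "requestMetadata": metadata,
    }


def generate(s, system, messages, metadata):
    # Guard first: nothing below the check is evaluated when debug is off.
    if s.debug_enabled:
        s.log_debug([build_debug_record(system, messages, metadata)])
    return "response"  # placeholder for the real backend call
```

The same guard could sit in the interpreter instead of each backend, which would also cover the earlier comment about not repeating the logging per backend file.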

@merrymercy
Contributor

Closed due to inactivity and low priority.

@merrymercy closed this on Dec 26, 2024