Add GRPO on Ray guided example with README by Fiona-Waters · Pull Request #86 · red-hat-data-services/red-hat-ai-examples

Fiona-Waters · 2026-05-28T15:36:02Z

Summary

Add guided example for multi-node distributed GRPO fine-tuning on Ray via CodeFlare SDK and KubeRay
New examples/fine-tuning/grpo_ray/ directory with README and notebook
Update parent fine-tuning README to include "Distributed on Ray" as a fourth execution mode

What's included

examples/fine-tuning/grpo_ray/README.md — algorithm overview, hardware requirements, GRPO-specific considerations, setup guide
examples/fine-tuning/grpo_ray/grpo_lora-rayjob.ipynb — example notebook: multi-node RayJob submission via CodeFlare SDK, training parameter configuration, reward curve plotting
examples/fine-tuning/README.md — updated to list Ray as an execution mode

Details

Uses Training Hub's lora_grpo() with the verl backend (FSDP + vLLM) distributed across 2 nodes (4 GPUs total)
Submitted via CodeFlare SDK RayJob + ManagedClusterConfig
Ray runtime image: tested and verified only, not productised
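
For readers new to GRPO, the following is a minimal sketch of the group-relative advantage that gives the algorithm its name: several completions are sampled for one prompt, each is scored by a reward function, and the rewards are normalised within that group. It is illustrative plain Python only. The function name `group_relative_advantages` and the `eps` parameter are hypothetical, the use of the population standard deviation is a choice made for this sketch, and nothing here claims to reflect how Training Hub's lora_grpo() or verl compute advantages internally.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalise the rewards of completions sampled for ONE prompt.

    Hypothetical helper for illustration: each advantage is the reward's
    distance from the group mean, scaled by the group standard deviation.
    `eps` avoids division by zero when all rewards in the group are equal.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a choice made for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored 1.0 (correct) or 0.0 (incorrect).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Completions that score above the group mean receive a positive advantage and the others a negative one, so no separate value model is needed to provide a baseline.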

Dependencies

Ray runtime image with training-hub, verl, and vLLM (RHOAIENG-61568)
CodeFlare SDK with RayJob + ManagedClusterConfig support

TODO

Update image reference once Konflux build succeeds (currently using spike image)
Validate notebook end-to-end on cluster with final image
Confirm reward curve parsing works with verl log output
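
As a starting point for the reward-curve validation item above, here is a hedged sketch of how per-step rewards might be extracted from job logs before plotting. The log line shape (`step:N - key:value - key:value`) and the metric name `critic/rewards/mean` are assumptions made for this example and still need to be checked against real verl output on the cluster. The function `parse_reward_curve` is hypothetical and is not the notebook's parser.

```python
import re

# Assumed line shape: "step:<int> - <metric>:<float> - ..." (unverified).
STEP_RE = re.compile(r"step:(\d+)")


def parse_reward_curve(log_text, metric="critic/rewards/mean"):
    """Return a list of (step, value) pairs for `metric` found in `log_text`.

    Lines that lack either a step number or the metric are skipped, so
    unrelated Ray or vLLM log lines do not break the parse.
    """
    value_re = re.compile(
        re.escape(metric) + r":(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
    )
    curve = []
    for line in log_text.splitlines():
        step = STEP_RE.search(line)
        value = value_re.search(line)
        if step and value:
            curve.append((int(step.group(1)), float(value.group(1))))
    return curve


# Made-up sample lines in the assumed format, not real job output.
sample = """\
step:1 - critic/rewards/mean:0.125 - actor/lr:0.000
some unrelated Ray log line
step:2 - critic/rewards/mean:0.250 - actor/lr:0.000
"""
print(parse_reward_curve(sample))
```

Keeping the metric name as a parameter means only the call site changes if the real logs use a different key.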

Related Tickets

Related Jira tickets:

🎫 RHOAIENG-63703

How Has This Been Tested?

TODO: Run the notebook manually on an OpenShift cluster

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
Any dependent changes have been merged and published in downstream modules

coderabbitai · 2026-05-28T15:36:15Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 228f8047-0601-4477-9a33-94add55cee4c

- Add examples/fine-tuning/grpo_ray/ with multi-node distributed GRPO training via CodeFlare SDK RayJob + ManagedClusterConfig
- Notebook demonstrates verl backend with FSDP + vLLM across 2 nodes
- Update fine-tuning README to include Ray as a fourth execution mode

Signed-off-by: Fiona-Waters <fiwaters6@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

Fiona-Waters force-pushed the grpo_ray_example branch 2 times, most recently from c2c9e3e to 2bcdd02 on May 29, 2026 09:14

Fiona-Waters force-pushed the grpo_ray_example branch from 2bcdd02 to d24acc6 on May 29, 2026 09:22

Fiona-Waters wants to merge 1 commit into red-hat-data-services:main from Fiona-Waters:grpo_ray_example
