(shortfin) (sdxl) (flux) Revisit refactoring of shortfin SD and Flux services. #1114

Open
monorimet opened this issue Mar 18, 2025 · 2 comments

@monorimet
Contributor

These two applications structure their inference tasks very similarly and should reuse utilities and other code where possible.

While paying down some technical debt from the SDXL MLPerf sprint, I ended up branching some SDXL builder logic that @KyleHerndon had generalized.

There are a few more things we want to improve and align for the diffusion apps, and perhaps LLM as well.

  • Batcher improvements
  • Return/response handling (align FLUX with SDXL -- tracked in (shortfin) (flux) Align response handling with shark-ui specs #1113)
  • Cleanup and refactor of builder utilities to be config-driven and application-agnostic -- the generalization is mostly complete thanks to @KyleHerndon's refactor, but the shared utilities still contain a lot of needlessly complicated logic. We can move more of this into configuration.

This list is not exhaustive for now; it focuses on what we can aim for in the upcoming release.
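As a rough illustration of what "config-driven, application-agnostic" could look like for the builder utilities, here is a minimal sketch. Everything in it is hypothetical: `SubmodelSpec`, `AppBuildConfig`, `artifact_names`, the submodel lists, and the file naming scheme are assumptions made for the example and do not reflect the actual shortfin builders or the iree.build API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmodelSpec:
    """One compiled component of a pipeline (hypothetical schema)."""
    name: str
    batch_sizes: tuple[int, ...] = (1,)


@dataclass(frozen=True)
class AppBuildConfig:
    """Everything the shared builder needs to know about one application."""
    app: str
    precision: str
    target: str
    submodels: tuple[SubmodelSpec, ...] = ()


def artifact_names(cfg: AppBuildConfig) -> list[str]:
    """Derive artifact file names purely from config, with no per-app branching."""
    names = []
    for sub in cfg.submodels:
        for bs in sub.batch_sizes:
            # Assumed naming scheme, for illustration only.
            names.append(f"{cfg.app}_{sub.name}_bs{bs}_{cfg.precision}_{cfg.target}.vmfb")
    return names


# Illustrative configs: the shared code above never checks which app it is building.
SDXL = AppBuildConfig(
    app="sdxl",
    precision="fp16",
    target="gfx942",
    submodels=(SubmodelSpec("clip"), SubmodelSpec("unet", (1, 4)), SubmodelSpec("vae")),
)
FLUX = AppBuildConfig(
    app="flux",
    precision="bf16",
    target="gfx942",
    submodels=(SubmodelSpec("clip"), SubmodelSpec("t5xxl"), SubmodelSpec("sampler"), SubmodelSpec("vae")),
)
```

The point of the sketch is that adding or changing an application means editing data (a config object or file), while the shared utility stays free of `if sdxl ... elif flux ...` branches.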

@monorimet monorimet self-assigned this Mar 18, 2025
@KyleHerndon
Contributor

This sounds great. I definitely saw things I thought were needlessly complicated and tried to simplify where it seemed easy enough to do so, but I was mostly focused on code deduplication, so there's definitely work left to be done!

@KyleHerndon
Contributor

One thing that's not necessarily directly related to this, but is definitely tech debt we should address to make the shortfin flux/sdxl pipeline easier to use, is improved debugging from builders.py/iree.build.

This was particularly painful for me when debugging, especially because iree.build is naturally multiprocess. Better logic/structure for propagating failures back, or even just a debug option that concatenates the output streams of all processes into one debug file, would be extremely useful to me. Maybe such a thing already exists, but I was not able to find it; if it does, it would be nice to create/update the iree.build documentation to cover it.
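To make the "concatenate all output streams into a debug file" idea concrete, here is a minimal sketch using only the Python standard library. It is not based on iree.build internals: `run_and_capture`, `run_all`, the job labels, and the log layout are hypothetical names and choices for illustration, and the approach assumes each build step can be launched as a separate command.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_and_capture(label, cmd):
    """Run one command, merging stderr into stdout so ordering within a job is kept."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return label, proc.returncode, proc.stdout


def run_all(jobs, debug_path):
    """Run all jobs concurrently and write every job's output to one debug file.

    `jobs` maps a label to an argv list. Returns the labels of jobs that failed,
    so the caller can surface failures instead of losing them in a child process.
    """
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda job: run_and_capture(*job), jobs.items()))

    failed = []
    with open(debug_path, "w", encoding="utf-8") as log:
        for label, code, output in results:
            # One labeled section per job keeps interleaved output readable.
            log.write(f"===== {label} (exit code {code}) =====\n")
            log.write(output)
            if not output.endswith("\n"):
                log.write("\n")
            if code != 0:
                failed.append(label)
    return failed
```

A caller would pass something like `run_all({"unet": [...], "vae": [...]}, "build_debug.log")` and raise if the returned list is non-empty. Writing one section per job after completion, rather than streaming, trades live feedback for a log that is never interleaved.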
