Implement non-blocking model loading with accurate health state management #36
Conversation
pytrickle/stream_processor.py
Outdated
# Schedule non-blocking background preload so server can accept /health immediately
async def _background_preload():
    try:
        if getattr(self._frame_processor, "state", None) is not None:
This seems to be repeated a lot; we could extract it into a method, call it once, and maybe assign the result to a variable where appropriate to avoid the repeated lookups.
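As a hypothetical sketch of that extraction (the class shell and method name here are illustrative, not the actual pytrickle API):

```python
# Illustrative helper consolidating the repeated getattr lookup.
class StreamProcessor:
    def __init__(self, frame_processor):
        self._frame_processor = frame_processor

    def _processor_state(self):
        """Return the frame processor's state, or None if it has no state attribute."""
        return getattr(self._frame_processor, "state", None)
```

Callers could then test `self._processor_state()` (or bind it once to a local) instead of repeating the `getattr` expression.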
`_on_startup` is a "callback" method that's being registered to the server's existing post-startup event to initiate `load_model`:
pytrickle/pytrickle/stream_processor.py
Lines 123 to 126 in f888c13
try:
    self.server.app.on_startup.append(_on_startup)
except Exception as e:
    logger.error(f"Failed to register startup hook: {e}")
This can also be done manually outside of pytrickle by accessing the StreamProcessor's server property as was done in ComfyStream to load the pipeline:
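For illustration, a minimal sketch of that manual pattern using a stand-in for the aiohttp-style `on_startup` list (in pytrickle the real object would be the StreamProcessor's `server.app`; all names here are illustrative):

```python
import asyncio

# Stand-in for a server app exposing an aiohttp-style on_startup list
# (illustrative; the real object would be StreamProcessor.server.app).
class FakeApp:
    def __init__(self):
        self.on_startup = []  # callbacks awaited once at server startup

    async def run_startup(self):
        for callback in self.on_startup:
            await callback(self)

# User-defined hook that loads the pipeline before traffic is served.
async def load_pipeline(app):
    app.pipeline_ready = True  # stand-in for the real model/pipeline load

app = FakeApp()
app.on_startup.append(load_pipeline)  # manual registration, as in ComfyStream
asyncio.run(app.run_startup())
```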
Ah...you're referring to getting the current state. Totally agree!
The issue is that the state of the frame processor and the server are still separate. I think we could simplify this by getting and updating the state of the Server instead of keeping a state on the frame processor and using "attach_state". wdyt?
I changed the state property on the base frame_processor class so these None checks could be removed: e810f27
pytrickle/stream_processor.py
Outdated
        logger.error(f"Error preloading model on startup: {e}")

try:
    asyncio.get_running_loop().create_task(_background_preload())
Do we really want this async? It seems like we could get the stated intent of the PR, loading the model synchronously (blocking), with `await _background_preload()`.
- with asyncio, the server might report "ready" before the model is loaded
Well, due to this being called via the server startup event, it blocks the server from being available unless a task is created to run it in the background (non-blocking). The health state begins with `LOADING` and transitions to `IDLE`. For managed containers, it is important for the server to be available immediately to rule out other potential docker container issues and complete a health check (in this case `LOADING` is returned until `set_startup_complete()` is called at the end). In a sense, model loading is now synchronous due to the model loading lock and the health state.
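A minimal sketch of the non-blocking pattern being described, with the `LOADING`/`IDLE` state names taken from the PR and everything else illustrative:

```python
import asyncio

class Processor:
    """Illustrative processor whose health flips LOADING -> IDLE after load."""

    def __init__(self):
        self.health = "LOADING"  # what the health endpoint would report
        self._load_task = None

    async def _load_model(self):
        await asyncio.sleep(0.01)  # stand-in for slow model loading
        self.health = "IDLE"       # set_startup_complete() equivalent

    async def on_startup(self):
        # Creating a task instead of awaiting lets the server answer
        # /health immediately while the model loads in the background.
        self._load_task = asyncio.create_task(self._load_model())

async def main():
    p = Processor()
    await p.on_startup()
    before = p.health      # still LOADING right after startup returns
    await p._load_task
    return before, p.health

print(asyncio.run(main()))  # → ('LOADING', 'IDLE')
```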
Here is where the user's `load_model` callback is used with the lock:
pytrickle/pytrickle/frame_processor.py
Lines 68 to 76 in e810f27
async def ensure_model_loaded(self, **kwargs):
    """Thread-safe wrapper that ensures model is loaded exactly once."""
    async with self._model_load_lock:
        if not self._model_loaded:
            await self.load_model(**kwargs)
            self._model_loaded = True
            logger.debug(f"Model loaded for {self.__class__.__name__}")
        else:
            logger.debug(f"Model already loaded for {self.__class__.__name__}")
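To illustrate the exactly-once guarantee that lock provides, here is a self-contained reimplementation of the pattern (not the pytrickle class itself) exercised by ten concurrent callers:

```python
import asyncio

class FrameProcessor:
    """Minimal stand-in reproducing the ensure_model_loaded lock pattern."""

    def __init__(self):
        self._model_load_lock = asyncio.Lock()
        self._model_loaded = False
        self.load_count = 0  # instrumentation for the demo

    async def load_model(self, **kwargs):
        await asyncio.sleep(0.01)  # stand-in for slow model loading
        self.load_count += 1

    async def ensure_model_loaded(self, **kwargs):
        async with self._model_load_lock:
            if not self._model_loaded:
                await self.load_model(**kwargs)
                self._model_loaded = True

async def main():
    fp = FrameProcessor()
    # Ten concurrent callers; the lock serializes them and only the
    # first one actually loads the model.
    await asyncio.gather(*(fp.ensure_model_loaded() for _ in range(10)))
    return fp.load_count

print(asyncio.run(main()))  # → 1
```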
This can be tested by simply running process_video_example.py from the launch config and sending a curl request within the first 3 seconds:
curl -X GET http://localhost:8000/health -H "Accept: application/json"
It should read `LOADING` and flip to `IDLE` after 3 seconds. The delay can be adjusted here:
MODEL_LOAD_DELAY_SECONDS = 3.0
Made another change to track the background model loading from the stream processor and cancel it if needed. I think this is more of what you were looking for? 3a8c523
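A sketch of that track-and-cancel idea (illustrative names, not the actual commit):

```python
import asyncio

class StreamProcessor:
    """Illustrative sketch of tracking the background load task."""

    def __init__(self):
        self._load_task = None

    def start_background_load(self, coro):
        # Keep a reference so the task isn't garbage-collected mid-flight
        # and can be cancelled later.
        self._load_task = asyncio.create_task(coro)

    async def cleanup(self):
        # Called on shutdown: cancel an unfinished load and swallow the
        # CancelledError so teardown continues cleanly.
        if self._load_task and not self._load_task.done():
            self._load_task.cancel()
            try:
                await self._load_task
            except asyncio.CancelledError:
                pass

async def main():
    sp = StreamProcessor()
    sp.start_background_load(asyncio.sleep(60))  # long "model load"
    await sp.cleanup()
    return sp._load_task.cancelled()

print(asyncio.run(main()))  # → True
```

Holding the task reference also addresses the memory-leak concern: asyncio only keeps weak references to tasks, so fire-and-forget tasks can otherwise be collected before completing.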
This pull request introduces several improvements to PyTrickle's model loading and server startup flow, focusing on ensuring that the model is loaded exactly once, synchronizing server health status with model readiness, and improving background task management to prevent memory leaks. The changes enhance reliability and make the server's health reporting more accurate during startup.
Model loading and state management:
- Added an `ensure_model_loaded` method to `FrameProcessor` that uses an `asyncio.Lock` to guarantee the model is loaded only once, and sets the pipeline state to ready after loading. (`pytrickle/frame_processor.py` L57-R76)
- `ensure_model_loaded` is called before starting frame processing, ensuring the model is ready on the correct event loop. (`pytrickle/client.py` [1], `pytrickle/stream_processor.py` [2])
- Server health status stays `LOADING` until the model is fully loaded, improving health endpoint accuracy. (`pytrickle/server.py` L634-R634)

Server and processor integration:
- (`pytrickle/stream_processor.py` R87-R120)

Background task management:
- Background tasks are tracked in `StreamProcessor` to prevent memory leaks, including a `cleanup` method called on shutdown. (`pytrickle/stream_processor.py` [1] [2])

Testing and examples:
- (`tests/test_state_integration.py` L186-R190)
- The `/health` endpoint reports `LOADING` until the model is ready. (`examples/process_video_example.py` R27-L40)