Memory leak (not VRAM) by websocket api #6494

Description

Expected Behavior

Environment:

Linux
ComfyUI: v0.3.10
NVIDIA-SMI 535.104.12   Driver Version: 535.104.12   CUDA Version: 12.2
torch 2.5.1+cu121

testcomfyui_api.py sends a prompt queue request to the native ComfyUI server over a websocket (a minimal sketch of the client follows below):

Workflow 0 loads the models so they can be cached by the Cache Backend Data (Inspire) node.
Load init workflow 0: python testcomfyui_api.py 0 1.png 2.png
Load API workflow 1:  python testcomfyui_api.py 1 1.png 2.png
Then run python testcomfyui_api.py 1 1.png 2.png in a loop.
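For reference, here is a minimal sketch of what such a client typically looks like, modeled on ComfyUI's stock websockets_api_example.py; the server address, workflow file names, and helper names are assumptions for illustration, not the attached script:

import json
import sys
import uuid
import urllib.request

import websocket  # pip install websocket-client

SERVER = "127.0.0.1:8188"  # assumed default address
CLIENT_ID = str(uuid.uuid4())

def queue_prompt(prompt):
    # POST the API-format workflow to /prompt; the server returns a prompt_id.
    data = json.dumps({"prompt": prompt, "client_id": CLIENT_ID}).encode("utf-8")
    req = urllib.request.Request("http://{}/prompt".format(SERVER), data=data)
    return json.loads(urllib.request.urlopen(req).read())["prompt_id"]

def wait_until_done(ws, prompt_id):
    # The server pushes "executing" events over the websocket; node == None
    # with a matching prompt_id means the prompt has finished.
    while True:
        msg = ws.recv()
        if isinstance(msg, str):  # binary frames (previews) are skipped
            message = json.loads(msg)
            if message["type"] == "executing":
                data = message["data"]
                if data["node"] is None and data.get("prompt_id") == prompt_id:
                    return

if __name__ == "__main__":
    workflow_index = sys.argv[1]  # "0" = init/cache workflow, "1" = API workflow
    with open("workflow_api_{}.json".format(workflow_index)) as workflow_api_file:
        prompt_data = json.load(workflow_api_file)  # the json.load seen in the traces
    ws = websocket.WebSocket()
    ws.connect("ws://{}/ws?clientId={}".format(SERVER, CLIENT_ID))
    wait_until_done(ws, queue_prompt(prompt_data))
    ws.close()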

Then add some trace code with tracemalloc and objgraph.show_backrefs to prompt_worker in main.py:

import tracemalloc  # added for tracing; gc, time, logging, etc. are already imported in main.py
import objgraph     # pip install objgraph

def prompt_worker(q, server_instance):
    current_time: float = 0.0
    e = execution.PromptExecutor(server_instance, lru_size=args.cache_lru)
    last_gc_collect = 0
    need_gc = False
    gc_collect_interval = 10.0

    # Baseline snapshot, taken once before any prompt has been processed.
    tracemalloc.start()
    snapshot0 = tracemalloc.take_snapshot()

    while True:
        timeout = 100.0
        if need_gc:
            timeout = max(gc_collect_interval - (current_time - last_gc_collect), 0.0)

        queue_item = q.get(timeout=timeout)
        snapshot1 = tracemalloc.take_snapshot()  # snapshot at the start of this iteration
        if queue_item is not None:
            item, item_id = queue_item
            execution_start_time = time.perf_counter()
            prompt_id = item[1]
            server_instance.last_prompt_id = prompt_id

            e.execute(item[2], prompt_id, item[3], item[4])
            need_gc = True
            q.task_done(item_id,
                        e.history_result,
                        status=execution.PromptQueue.ExecutionStatus(
                            status_str='success' if e.success else 'error',
                            completed=e.success,
                            messages=e.status_messages))
            if server_instance.client_id is not None:
                server_instance.send_sync("executing", {"node": None, "prompt_id": prompt_id}, server_instance.client_id)

            current_time = time.perf_counter()
            execution_time = current_time - execution_start_time
            logging.info("Prompt executed in {:.2f} seconds".format(execution_time))
        flags = q.get_flags()
        free_memory = flags.get("free_memory", False)

        if flags.get("unload_models", free_memory):
            comfy.model_management.unload_all_models()
            need_gc = True
            last_gc_collect = 0

        if free_memory:
            e.reset()
            need_gc = True
            last_gc_collect = 0

        if need_gc:
            current_time = time.perf_counter()
            if (current_time - last_gc_collect) > gc_collect_interval:
                gc.collect()
                comfy.model_management.soft_empty_cache()
                last_gc_collect = current_time
                need_gc = False

        snapshot2 = tracemalloc.take_snapshot()
        # Per-iteration growth: end of this iteration vs. its start.
        top_stats = snapshot2.compare_to(snapshot1, 'lineno')

        print("[ Top 10 differences ]")
        for stat in top_stats[:10]:
            print(stat)
            logging.info("stat:{}".format(stat))

        # Cumulative growth: end of this iteration vs. the pre-loop baseline.
        top_stats1 = snapshot2.compare_to(snapshot0, 'lineno')
        print("[ Top 10 differences top_stats1]")
        for stat in top_stats1[:10]:
            print(stat)
            logging.info("stat1:{}".format(stat))

        print("item[2]:",item[2])
        objgraph.show_backrefs(item[2], filename='sample-backref-graph.png')
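The 'lineno' comparison only shows the allocation site (json/decoder.py:355), not the call chain that keeps those objects alive. A possible refinement, sketched on top of the code above: start tracemalloc with a deeper frame limit and group the statistics by 'traceback':

# Sketch: capture full call chains for the allocations surviving in json/decoder.py.
# tracemalloc.start(25) would need to replace the plain tracemalloc.start() above.
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation

def dump_json_decoder_traces(snapshot, limit=3):
    # Keep only allocations made inside the stdlib json decoder, then print
    # the full traceback for the biggest surviving groups.
    snapshot = snapshot.filter_traces(
        [tracemalloc.Filter(True, "*/json/decoder.py")])
    for stat in snapshot.statistics('traceback')[:limit]:
        print("%d blocks, %.1f KiB" % (stat.count, stat.size / 1024))
        for line in stat.traceback.format():
            print(line)

Calling dump_json_decoder_traces(snapshot2) inside the loop would show which caller (the websocket handler, the queue, or the history) made the allocations that survive.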

Actual Behavior

Hours later, we see a memory increase. It may happen near "prompt_data = json.load(workflow_api_file)", or it may be caused by ComfyUI keeping a reference to the prompt; we have no idea.

Some logs from main.py show the allocations made in the json decoder growing:

stat1:/usr/lib/python3.10/json/decoder.py:355: size=22.2 MiB (+22.2 MiB), count=358297 (+358297), average=65 B
stat1:/usr/lib/python3.10/json/decoder.py:355: size=22.6 MiB (-10.2 KiB), count=365251 (-139), average=65 B
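One way to tell whether the parsed prompts are retained by the server rather than by the client script: ComfyUI keeps each executed prompt and its outputs in its execution history, so periodically wiping the history and watching whether the decoder count stops growing would narrow things down. A sketch, assuming the stock POST /history endpoint and the default address:

# Sketch: if the json/decoder.py count stops growing after calls to this, the
# retained dicts live in the server's prompt history, not in the client.
import json
import urllib.request

def clear_history(server="127.0.0.1:8188"):  # address is an assumption
    req = urllib.request.Request(
        "http://{}/history".format(server),
        data=json.dumps({"clear": True}).encode("utf-8"))
    urllib.request.urlopen(req)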

Steps to Reproduce

Sorry, for file-type reasons the attachment is a .txt file; please rename it to .py (testcomfyui_api.txt is testcomfyui_api.py).
testcomfyui_api.txt

Debug Logs

stat1:/usr/lib/python3.10/json/decoder.py:355: size=22.2 MiB (+22.2 MiB), count=358297 (+358297), average=65 B
stat1:/usr/lib/python3.10/json/decoder.py:355: size=22.6 MiB (-10.2 KiB), count=365251 (-139), average=65 B

Other

No response
