I've managed to run run_tokenflow_pnp.py on a small excerpt of my video (5 s), and it looks really cool, but when I run it on the full one (5 min) it crashes with a CUDA OOM error even when I drop the batch size down to 1.
This scaling of memory with video length, presumably caused by the extended attention, seems like a major limitation of the method, and it is not highlighted in the discussion section or anywhere else in the paper (as far as I can tell).
Is it possible to offload part of the attention computation to the CPU so that the number of frames is not a bottleneck?
That's exactly what I did in #32 (in a way).
It handled longer sequences, but not unlimited ones, since the whole attention data still has to be fed back to the GPU before the denoising-latents step (and I didn't manage to make that step work in batches).
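For reference, here is a minimal sketch of the CPU-offloading idea being discussed, not TokenFlow's actual code: keep the (large) key/value tensors on the CPU and stream them to the GPU in chunks, combining partial results with a running log-sum-exp so the output matches full attention exactly. The function name and shapes are illustrative assumptions.

```python
import torch

def chunked_attention(q, k_cpu, v_cpu, chunk_size=4096):
    """Attention where K/V live on the CPU and are streamed to q's
    device in chunks, trading transfer time for GPU memory.
    q: (n_q, d) on GPU; k_cpu, v_cpu: (n_kv, d) on CPU."""
    scale = q.shape[-1] ** -0.5
    # Running output and log-sum-exp, so chunks combine exactly
    # (online softmax, as in memory-efficient attention).
    out = torch.zeros_like(q)
    lse = torch.full((q.shape[0], 1), float("-inf"), device=q.device)
    for start in range(0, k_cpu.shape[0], chunk_size):
        # Move only one chunk of K/V onto the GPU at a time.
        k = k_cpu[start:start + chunk_size].to(q.device, non_blocking=True)
        v = v_cpu[start:start + chunk_size].to(q.device, non_blocking=True)
        scores = (q @ k.T) * scale                       # (n_q, chunk)
        chunk_lse = torch.logsumexp(scores, dim=-1, keepdim=True)
        new_lse = torch.logaddexp(lse, chunk_lse)
        # Rescale previous accumulation, then add this chunk's share.
        out = out * (lse - new_lse).exp() + (scores - new_lse).exp() @ v
        lse = new_lse
    return out
```

Peak GPU memory then depends on `chunk_size` rather than on the total number of frames, which is the bottleneck raised above; the remaining problem mentioned in #32 is doing the same streaming at the denoising step itself.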