feat: support most of features in ccvfi #4

Merged
Tohrusky merged 20 commits into EutropicAI:main from TohruskyDev:main
Sep 25, 2025

Conversation

@Tohrusky
Member

No description provided.

@Tohrusky changed the title from "feat: ccvfi" to "feat: support most of features in ccvfi" on Sep 25, 2025
@gemini-code-assist

Summary of Changes

Hello @Tohrusky, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for Video Frame Interpolation (VFI) into the system. It integrates two prominent VFI models, RIFE and DRBA, providing their respective architectures, configurations, and inference mechanisms. A key aspect of this feature is the seamless integration with VapourSynth, allowing for efficient video processing and frame generation. The changes also include necessary refactorings of core model and configuration classes, along with new utility functions for VFI-specific operations like scene change detection. Additionally, a Docker Compose setup has been added to streamline the development workflow.

Highlights

  • Video Frame Interpolation (VFI) Support: Introduced a new module for Video Frame Interpolation, enabling the generation of intermediate frames in a video sequence.
  • RIFE Model Integration: Added the RIFE (Real-time Intermediate Flow Estimation) model, including its architecture, configuration, and inference capabilities.
  • DRBA Model Integration: Implemented the DRBA (Deep Recursive Bidirectional Attention) model, complete with its architecture, configuration, and inference logic.
  • VapourSynth Integration for VFI: Integrated VFI models with VapourSynth, allowing users to apply frame interpolation directly within VapourSynth scripts.
  • Refactored Base Classes and Utilities: Updated base configuration and model classes, and utility functions to support the new VFI models and their specific requirements (e.g., VFIBaseConfig, inference_vfi, check_scene).
  • Docker Compose for Development: Added a vs-docker-compose.yml file to simplify the development environment setup for VapourSynth and CUDA.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Supported commands:

  • Code Review: /gemini review performs a code review for the current pull request in its current state.
  • Pull Request Summary: /gemini summary provides a summary of the current pull request in its current state.
  • Comment: @gemini-code-assist responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help: /gemini help displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


gemini-code-assist bot left a comment


Code Review

This pull request introduces significant new functionality for Video Frame Interpolation (VFI), including new model architectures, configurations, and VapourSynth integration. The changes are extensive and well-structured. I've identified several critical issues related to tensor manipulation and global PyTorch settings that require attention. Additionally, I've noted some medium to high severity issues concerning potential bugs, code clarity, and best practices. Addressing these points will improve the robustness and maintainability of the new features.


disable_drm = False
if (_left_scene and not _right_scene) or (not _left_scene and _right_scene):
    drm01r, drm21r = (ones_mask.clone() * 0.5 for _ in range(2))


medium

Iterable unpacking does accept a generator expression here, so drm01r and drm21r receive the tensor values rather than the generator object. Still, the explicit tuple form is clearer and makes the two independent clones obvious.

Suggested change
drm01r, drm21r = (ones_mask.clone() * 0.5 for _ in range(2))
drm01r, drm21r = ones_mask.clone() * 0.5, ones_mask.clone() * 0.5
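For reference, both spellings can be checked in plain Python; this sketch uses lists standing in for tensors, with a trivial clone() as a stand-in for Tensor.clone():

```python
# Sketch: both forms bind two independent values; lists stand in for tensors.
def clone(x):
    return list(x)  # stand-in for Tensor.clone()

ones_mask = [1.0, 1.0]
# Generator unpacking works: two values are produced and bound.
drm01r, drm21r = (clone(ones_mask) for _ in range(2))
# Explicit tuple form: equivalent, and easier to read.
drm01r_b, drm21r_b = clone(ones_mask), clone(ones_mask)
```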

grid_cache = {}
batch_cache = {}
torch.set_float32_matmul_precision("medium")
torch.set_grad_enabled(False)


critical

Setting torch.set_grad_enabled(False) at the module level is dangerous as it disables gradient computation globally for your entire application. This can have unintended side effects. The functions in this file are already correctly decorated with @torch.inference_mode(), which handles this locally. This global setting is unnecessary and should be removed.
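The locality argument can be sketched without PyTorch: a scoped toggle restores the previous state on exit, while a module-level assignment leaks globally. The tiny context manager below is a stand-in for torch.inference_mode(), not its implementation:

```python
from contextlib import contextmanager

GRAD_ENABLED = True  # stand-in for PyTorch's global autograd switch

@contextmanager
def no_grad():
    # Disable only for the duration of the block, then restore.
    global GRAD_ENABLED
    prev, GRAD_ENABLED = GRAD_ENABLED, False
    try:
        yield
    finally:
        GRAD_ENABLED = prev  # restored on exit; nothing leaks out

with no_grad():
    inside = GRAD_ENABLED   # False only inside the scoped block
outside = GRAD_ENABLED      # True again afterwards
```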

Comment on lines +57 to +61
I0 = resize(I0, scale).unsqueeze(0)
I1 = resize(I1, scale).unsqueeze(0)
I2 = resize(I2, scale).unsqueeze(0)

inp = torch.cat([I0, I1, I2], dim=1)


critical

The input tensor inp is being constructed with an incorrect shape. The unsqueeze(0) and torch.cat combination results in a shape of (1, 3*B, C, H', W'), but the DRBA model expects (B, 3, C, H', W'). You should use torch.stack to create the correct input shape.

Suggested change
I0 = resize(I0, scale).unsqueeze(0)
I1 = resize(I1, scale).unsqueeze(0)
I2 = resize(I2, scale).unsqueeze(0)
inp = torch.cat([I0, I1, I2], dim=1)
I0 = resize(I0, scale)
I1 = resize(I1, scale)
I2 = resize(I2, scale)
inp = torch.stack([I0, I1, I2], dim=1)
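The shape difference is easy to verify with NumPy arrays standing in for tensors (np.concatenate/np.stack follow the same semantics as torch.cat/torch.stack):

```python
import numpy as np

# Three (B, C, H, W) frames: unsqueeze-then-cat merges along the existing
# batch axis, while stack inserts a new frame axis.
B, C, H, W = 2, 3, 8, 8
I0, I1, I2 = (np.zeros((B, C, H, W)) for _ in range(3))

# x[None] mimics unsqueeze(0); concatenating along axis 1 gives (1, 3*B, C, H, W).
cat_shape = np.concatenate([x[None] for x in (I0, I1, I2)], axis=1).shape
# Stacking along axis 1 gives the intended (B, 3, C, H, W).
stack_shape = np.stack([I0, I1, I2], axis=1).shape
```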


results, reuse = self.model(inp, minus_t, zero_t, plus_t, left_scene_change, right_scene_change, scale, reuse)

results = torch.cat(tuple(de_resize(result, h, w).unsqueeze(0) for result in results), dim=1)


critical

The output tensor results is being constructed with an incorrect shape. The current implementation with unsqueeze(0) and torch.cat results in a shape of (1, N*B, C, h, w). The expected output shape is (B, N, C, h, w). Using torch.stack will produce the correct shape.

Suggested change
results = torch.cat(tuple(de_resize(result, h, w).unsqueeze(0) for result in results), dim=1)
results = torch.stack([de_resize(r, h, w) for r in results], dim=1)


if _right_scene:
    for _ in plus_t:
        zero_t = np.append(zero_t, 0)


high

The variable zero_t is a list of floats, but np.append returns a NumPy array. This changes the type of zero_t, which can lead to unexpected behavior. It's better to use list.append() to maintain type consistency.

Suggested change
zero_t = np.append(zero_t, 0)
zero_t.append(0)
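The silent type change is easy to demonstrate: np.append always returns a new ndarray, even when handed a list, while list.append mutates in place and preserves the type:

```python
import numpy as np

zero_t = [0.0, 0.5]
appended = np.append(zero_t, 0)  # result is an ndarray, not a list
zero_t.append(0)                 # in-place; zero_t stays a plain list
```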

    torch.cat((img0[:, :3], img1[:, :3], f0, f1, timestep), 1), None, scale=scale_list[i]
)
if ensemble:
    print("warning: ensemble is not supported since RIFEv4.21")


medium

Using print for warnings is not ideal as it writes to standard output and cannot be easily configured (e.g., silenced or redirected). Consider using the logging module for warnings, like logging.warning(...). This applies to other print statements in this file as well (lines 74, 86).
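A minimal sketch of the suggestion; the logger name "cccv.rife" is an assumption for illustration, not the project's actual logger. The StringIO handler here only exists to show that the output becomes configurable:

```python
import io
import logging

# Route the warning through logging so callers can silence or redirect it,
# unlike a bare print() to stdout.
buf = io.StringIO()
handler = logging.StreamHandler(buf)
logger = logging.getLogger("cccv.rife")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the demo output out of the root logger

logger.warning("ensemble is not supported since RIFEv4.21")
message = buf.getvalue().strip()
```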

Comment on lines +164 to +165
_drm01r[holes01r] = _drm01r[holes01r]
_drm21r[holes21r] = _drm21r[holes21r]


medium

These lines are no-ops as they assign a tensor slice to itself. They appear to be leftover code and should be removed if they are not intended to have any effect.

Comment on lines +63 to +64
elif True:
    print(strVariable, type(objValue))


medium

This elif True: block with a print statement appears to be leftover debugging code and should be removed.

:return: 1 output frames (img0_1)
"""
if len(img_list) != 2:
    raise ValueError("IFNet img_list must contain 2 images")


medium

The error message refers to IFNet, but this is inside RIFEModel. For clarity and consistency, the error message should refer to RIFEModel.

Suggested change
raise ValueError("IFNet img_list must contain 2 images")
raise ValueError("RIFEModel img_list must contain 2 images")

cccv/vs/vfi.py Outdated
return vfi_methods[num_frame](inference, clip, mapper, scale, scdet, scdet_threshold, device)


def inference_vsr_two_frame_in(


medium

The function inference_vsr_two_frame_in (and inference_vsr_three_frame_in on line 160) is located in vfi.py and handles Video Frame Interpolation. The vsr in the name is misleading. Consider renaming them to inference_vfi_two_frame_in and inference_vfi_three_frame_in respectively to better reflect their purpose.

@Tohrusky requested a review from Copilot on September 25, 2025 at 15:54
@codecov

codecov bot commented Sep 25, 2025

Codecov Report

❌ Patch coverage is 94.09449% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.05%. Comparing base (ff5f80a) to head (7722d28).
⚠️ Report is 1 commit behind head on main.

Files with missing lines (patch coverage, lines missing):

  • cccv/util/misc.py: 92.47%, 7 missing ⚠️
  • cccv/model/vfi_base_model.py: 81.25%, 3 missing ⚠️
  • cccv/model/vfi/drba_model.py: 96.00%, 2 missing ⚠️
  • cccv/model/vfi/rife_model.py: 95.74%, 2 missing ⚠️
  • cccv/model/vsr_base_model.py: 50.00%, 1 missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main       #4      +/-   ##
==========================================
+ Coverage   92.81%   93.05%   +0.24%     
==========================================
  Files          35       40       +5     
  Lines         918     1166     +248     
==========================================
+ Hits          852     1085     +233     
- Misses         66       81      +15     

☔ View full report in Codecov by Sentry.

Copilot AI left a comment


Pull Request Overview

This PR adds comprehensive support for Video Frame Interpolation (VFI) functionality to the CCCV library, expanding beyond its existing Super Resolution (SR) and Video Super Resolution (VSR) capabilities.

  • Implements VFI base models and specific implementations for RIFE and DRBA architectures
  • Adds VapourSynth integration for video frame interpolation processing
  • Introduces utility functions for scene detection, temporal mapping, and image processing operations

Reviewed Changes

Copilot reviewed 44 out of 57 changed files in this pull request and generated 5 comments.

Summary per file:

  • vs-docker-compose.yml: Adds Docker Compose configuration for the development environment
  • cccv/model/vfi_base_model.py: Implements the base VFI model class with video inference capabilities
  • cccv/model/vfi/rife_model.py: RIFE model implementation for two-frame interpolation
  • cccv/model/vfi/drba_model.py: DRBA model implementation for three-frame interpolation
  • cccv/vs/vfi.py: VapourSynth integration for VFI processing
  • cccv/util/misc.py: Utility functions for interpolation, scene detection, and image operations
  • tests/vfi/: Test files for VFI functionality
  • tests/util.py: Enhanced test utilities with VFI support
Comments suppressed due to low confidence (1)

cccv/vs/vfi.py:1

  • The error message refers to 'length of the input frames' but the parameter is named 'num_frame'. The message should be: '[CCCV] The number of input frames should be odd'
import math


Comment on lines +54 to +60
for k in [
EVAL_IMG_PATH_DRBA_0,
EVAL_IMG_PATH_DRBA_1,
EVAL_IMG_PATH_DRBA_2,
EVAL_IMG_PATH_DRBA_3,
EVAL_IMG_PATH_DRBA_4,
]

Copilot AI Sep 25, 2025


[nitpick] The hardcoded list of DRBA evaluation image paths creates maintenance overhead. Consider using a range-based approach or storing the paths in a list to make it easier to modify the number of evaluation images.
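A sketch of the range-based approach; the path pattern below is illustrative only, not the repository's actual asset layout:

```python
# Hypothetical: derive the evaluation paths from a range so changing the
# number of images is a one-character edit instead of adding constants.
NUM_DRBA_EVAL_IMAGES = 5
EVAL_IMG_PATHS_DRBA = [f"eval_img_drba_{i}.png" for i in range(NUM_DRBA_EVAL_IMAGES)]

for k in EVAL_IMG_PATHS_DRBA:
    pass  # e.g. load and check each evaluation image
```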



def test_resize() -> None:
img = torch.randn(1, 3, 64, 64) # 创建一个随机的 4D 张量

Copilot AI Sep 25, 2025


Chinese comment should be translated to English for consistency with the rest of the codebase: '# Create a random 4D tensor'

if src_fps > tar_fps:
    raise ValueError("[CCCV] The target fps should be greater than the clip fps")

if scale < 0 or not math.log2(scale).is_integer():

Copilot AI Sep 25, 2025


The condition scale < 0 should be scale <= 0 to prevent division by zero and invalid scale values. A scale of 0 would cause issues in subsequent calculations.

Suggested change
if scale < 0 or not math.log2(scale).is_integer():
if scale <= 0 or not math.log2(scale).is_integer():
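The corrected check can be sketched as a small standalone validator; validate_scale is a hypothetical helper, not part of cccv. Note that math.log2 raises on non-positive input anyway, so the scale <= 0 short-circuit also turns a cryptic "math domain error" into a clear message:

```python
import math

def validate_scale(scale: float) -> None:
    # Reject non-positive values before calling log2, then require a power of two.
    if scale <= 0 or not math.log2(scale).is_integer():
        raise ValueError("[CCCV] scale must be a positive power of two")
```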

Comment on lines 36 to 38
# Flow distance calculator



Copilot AI Sep 25, 2025


[nitpick] The comment '# Flow distance calculator' is disconnected from the function it describes due to extra blank lines. Move the comment directly above the function definition for better code readability.

Suggested change: move the "# Flow distance calculator" comment so it sits directly above the function definition (the diff deletes the comment here and re-adds it immediately before the function, which renders as the same line removed and re-added).

Comment on lines +176 to +177
for _ in plus_t:
    zero_t = np.append(zero_t, 0)

Copilot AI Sep 25, 2025


Using np.append() in a loop is inefficient as it creates a new array each time. Consider extending the list with [0] * len(plus_t) or converting to list, extending, then back to numpy array if needed.

Suggested change
for _ in plus_t:
    zero_t = np.append(zero_t, 0)
zero_t = np.concatenate([zero_t, np.zeros(len(plus_t), dtype=zero_t.dtype)])

@Tohrusky merged commit 4226db0 into EutropicAI:main on Sep 25, 2025
9 checks passed