FEAT Video Converter: Adding Images to Videos #702
Open: jbolor21 wants to merge 21 commits into Azure:main from jbolor21:users/bjagdagdorj/video_converter
+375
−3
21 commits
1367ae0
initial commit adding new files
48e3c9e
adding converter file
a24a804
adding converter file
c425af0
removing testing code
0997c53
pre-commit
40e9b00
Merge branch 'main' of https://github.com/Azure/PyRIT into users/bjag…
841a7e3
Merge branch 'main' of https://github.com/Azure/PyRIT into users/bjag…
a598b14
working version of video converter still draft though
cb15543
working version of video converter still draft though
c77e9e8
addressed feedback
e8229f2
Merge branch 'main' of https://github.com/Azure/PyRIT into users/bjag…
419d7da
addressed feedback, fix doc strings, format, minor edits
39953eb
Merge branch 'main' of https://github.com/Azure/PyRIT into users/bjag…
e8cdb43
added unit tests, addressed formatting changes
023f8bf
remove print statement
81870a0
adding new notebook to toc
5f16d2b
adding sample video and formatting
60e4d18
Merge branch 'main' of https://github.com/Azure/PyRIT into users/bjag…
ddc1728
adding comment and getting latest code
eea9c3b
make opencv optional dependency
cafb2b1
make opencv optional dependency
Binary file not shown.
@@ -0,0 +1,63 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Adding Images to a Video\n",
    "\n",
    "This notebook shows how to use `AddImageVideoConverter` to overlay an image on a video.\n",
    "The converter requires OpenCV, which you can install with\n",
    "`pip install pyrit[opencv]`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ConverterResult(output_text='..\\\\..\\\\..\\\\assets\\\\output_video.mp4', output_type='video_path')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pathlib\n",
    "\n",
    "from pyrit.common import IN_MEMORY, initialize_pyrit\n",
    "from pyrit.prompt_converter import AddImageVideoConverter\n",
    "\n",
    "initialize_pyrit(memory_db_type=IN_MEMORY)\n",
    "\n",
    "input_video = str(pathlib.Path(\".\") / \"..\" / \"..\" / \"..\" / \"assets\" / \"sample_video.mp4\")\n",
    "input_image = str(pathlib.Path(\".\") / \"..\" / \"..\" / \"..\" / \"assets\" / \"pyrit_architecture.png\")\n",
    "\n",
    "video = AddImageVideoConverter(video_path=input_video)\n",
    "converted_vid = await video.convert_async(prompt=input_image, input_type=\"image_path\")\n",
    "converted_vid"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
@@ -0,0 +1,31 @@
# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.16.4
# ---

# %% [markdown]
# # Adding Images to a Video
#
# This notebook shows how to use `AddImageVideoConverter` to overlay an image on a video.
# The converter requires OpenCV, which you can install with
# `pip install pyrit[opencv]`

# %%
import pathlib

from pyrit.common import IN_MEMORY, initialize_pyrit
from pyrit.prompt_converter import AddImageVideoConverter

initialize_pyrit(memory_db_type=IN_MEMORY)

input_video = str(pathlib.Path(".") / ".." / ".." / ".." / "assets" / "sample_video.mp4")
input_image = str(pathlib.Path(".") / ".." / ".." / ".." / "assets" / "pyrit_architecture.png")

video = AddImageVideoConverter(video_path=input_video)
converted_vid = await video.convert_async(prompt=input_image, input_type="image_path")
converted_vid
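Because OpenCV is an optional extra, it may help to check that it is installed before running the example above. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and is not part of PyRIT.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found (hypothetical helper)."""
    # find_spec returns None when no importable top-level module has this name.
    return importlib.util.find_spec(name) is not None


if not has_module("cv2"):
    # Same install command the notebook text mentions.
    print("OpenCV not found; install it with: pip install pyrit[opencv]")
```

Checking with `find_spec` avoids actually importing the module, so the check itself stays cheap.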
@@ -0,0 +1,183 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import logging
import os
from pathlib import Path
from typing import TYPE_CHECKING, Optional

import numpy as np

from pyrit.common.path import DB_DATA_PATH
from pyrit.models import PromptDataType, data_serializer_factory
from pyrit.prompt_converter import ConverterResult, PromptConverter

logger = logging.getLogger(__name__)


if TYPE_CHECKING:
    import cv2

# Choose the codec based on extension
video_encoding_map = {
    "mp4": "mp4v",
    "avi": "XVID",
    "mov": "MJPG",
    "mkv": "X264",
}


class AddImageVideoConverter(PromptConverter):
    """
    Adds an image to a video at a specified position.
    Currently the image is overlaid on every frame of the video, not at a specific timepoint.

    Args:
        video_path (str): File path of the video to add the image to.
        output_path (str, Optional): File path of the output video. Defaults to None.
        img_position (tuple, Optional): Position (x, y) at which to place the image in the video.
            Defaults to (10, 10).
        img_resize_size (tuple, Optional): Size (width, height) to resize the image to.
            Defaults to (500, 500).
    """

    def __init__(
        self,
        video_path: str,
        output_path: Optional[str] = None,
        img_position: tuple = (10, 10),
        img_resize_size: tuple = (500, 500),
    ):
        if not video_path:
            raise ValueError("Please provide valid video path")

        self._output_path = output_path
        self._img_position = img_position
        self._img_resize_size = img_resize_size
        self._video_path = video_path

    async def _add_image_to_video(self, image_path: str, output_path: str):
        """
        Adds image to video

        Args:
            image_path (str): The image path to add to video.
            output_path (str): The output video path.

        Returns:
            output_path (str): The output video path.
        """

        if not image_path:
            raise ValueError("Please provide valid image path value")

        # cv2 is imported at module level only for type checking, so import it here at runtime.
        # OpenCV is an optional dependency.
        try:
            import cv2
        except ModuleNotFoundError as e:
            logger.error("Could not import opencv. You may need to install it via 'pip install pyrit[opencv]'")
            raise e

        input_image_data = data_serializer_factory(
            category="prompt-memory-entries", data_type="image_path", value=image_path
        )
        input_video_data = data_serializer_factory(
            category="prompt-memory-entries", data_type="video_path", value=self._video_path
        )

        # Read the video to ensure it exists
        video_bytes = await input_video_data.read_data()

        azure_storage_flag = input_video_data._is_azure_storage_url(self._video_path)
        video_path = self._video_path

        # Initialize handles up front so the cleanup in `finally` is safe even if setup fails partway
        cap = None
        output_video = None
        local_temp_path = None

        try:
            if azure_storage_flag:
                # The video lives in Azure storage, so save the downloaded bytes to a temporary local file
                local_temp_path = Path(DB_DATA_PATH, "temp_video.mp4")
                with open(local_temp_path, "wb") as f:
                    f.write(video_bytes)
                video_path = str(local_temp_path)

            cap = cv2.VideoCapture(video_path)

            # Get video properties
            fps = int(cap.get(cv2.CAP_PROP_FPS))
            width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
            height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
            file_extension = video_path.split(".")[-1].lower()
            if file_extension in video_encoding_map:
                video_char_code = cv2.VideoWriter_fourcc(*video_encoding_map[file_extension])
                output_video = cv2.VideoWriter(output_path, video_char_code, fps, (width, height))
            else:
                raise ValueError(f"Unsupported video format: {file_extension}")

            # Load and resize the overlay image
            input_image_bytes = await input_image_data.read_data()
            image_np_arr = np.frombuffer(input_image_bytes, np.uint8)
            overlay = cv2.imdecode(image_np_arr, cv2.IMREAD_UNCHANGED)
            overlay = cv2.resize(overlay, self._img_resize_size)

            # Get overlay image dimensions
            image_height, image_width, _ = overlay.shape
            x, y = self._img_position  # Position where the overlay will be placed

            while cap.isOpened():
                ret, frame = cap.read()
                if not ret:
                    break

                # Ensure overlay fits within the frame boundaries
                if x + image_width > width or y + image_height > height:
                    logger.info("Overlay image is too large for the video frame. Resizing to fit.")
                    overlay = cv2.resize(overlay, (width - x, height - y))
                    image_height, image_width, _ = overlay.shape

                # Blend overlay with frame
                if overlay.shape[2] == 4:  # A fourth channel means the image has an alpha channel
                    alpha_overlay = overlay[:, :, 3] / 255.0
                    for c in range(0, 3):
                        frame[y : y + image_height, x : x + image_width, c] = (
                            alpha_overlay * overlay[:, :, c]
                            + (1 - alpha_overlay) * frame[y : y + image_height, x : x + image_width, c]
                        )
                else:
                    frame[y : y + image_height, x : x + image_width] = overlay

                # Write the modified frame to the output video
                output_video.write(frame)

        finally:
            # Release everything that was actually opened
            if cap is not None:
                cap.release()
            if output_video is not None:
                output_video.release()
            cv2.destroyAllWindows()
            if local_temp_path is not None and os.path.exists(local_temp_path):
                os.remove(local_temp_path)

        logger.info(f"Video saved as {output_path}")

        return output_path

    async def convert_async(self, *, prompt: str, input_type: PromptDataType = "image_path") -> ConverterResult:
        """
        Converter that adds an image to a video

        Args:
            prompt (str): The image file name to be added to the video.
            input_type (PromptDataType): type of data
        Returns:
            ConverterResult: The filename of the converted video as a ConverterResult Object
        """
        if not self.input_supported(input_type):
            raise ValueError("Input type not supported")

        output_video_serializer = data_serializer_factory(category="prompt-memory-entries", data_type="video_path")

        if not self._output_path:
            output_video_serializer.value = await output_video_serializer.get_data_filename()
        else:
            output_video_serializer.value = self._output_path

        # Add the image to the video
        updated_video = await self._add_image_to_video(image_path=prompt, output_path=output_video_serializer.value)
        return ConverterResult(output_text=str(updated_video), output_type="video_path")

    def input_supported(self, input_type: PromptDataType) -> bool:
        return input_type == "image_path"

    def output_supported(self, output_type: PromptDataType) -> bool:
        return output_type == "video_path"
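The fit check and alpha blend inside `_add_image_to_video` can be illustrated without OpenCV or NumPy. The sketch below is pure Python and works on plain integers; `fit_overlay_size` and `blend_channel` are hypothetical names introduced here for illustration, not functions from this PR. Note one deliberate difference, labeled in the code: the sketch clamps each dimension independently, whereas the converter resizes the overlay to `(width - x, height - y)` as soon as either dimension overflows.

```python
from typing import Tuple


def fit_overlay_size(
    frame_size: Tuple[int, int],
    position: Tuple[int, int],
    overlay_size: Tuple[int, int],
) -> Tuple[int, int]:
    """Clamp an overlay's (width, height) so it stays inside the frame.

    Assumption of this sketch: each dimension is clamped on its own, which
    differs from the converter's resize to (width - x, height - y).
    """
    frame_w, frame_h = frame_size
    x, y = position
    overlay_w, overlay_h = overlay_size
    if not (0 <= x < frame_w and 0 <= y < frame_h):
        raise ValueError("Overlay position must lie inside the frame")
    return min(overlay_w, frame_w - x), min(overlay_h, frame_h - y)


def blend_channel(overlay_value: int, frame_value: int, alpha: int) -> int:
    """Alpha-blend one 8-bit channel value.

    `alpha` is 0..255, as stored in the fourth channel of an RGBA/BGRA image.
    Rounding to the nearest integer is a choice made by this sketch.
    """
    a = alpha / 255.0
    return int(round(a * overlay_value + (1 - a) * frame_value))


# A 500x500 overlay at (10, 10) in a 640x480 frame only overflows vertically.
print(fit_overlay_size((640, 480), (10, 10), (500, 500)))  # (500, 470)
# Fully opaque keeps the overlay value; fully transparent keeps the frame value.
print(blend_channel(200, 100, 255), blend_channel(200, 100, 0))  # 200 100
```

The blend formula is the same per-channel expression used in the converter: `alpha * overlay + (1 - alpha) * frame`, with alpha scaled from 0..255 to 0..1.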
Suggest adding a return type annotation:
async def _add_image_to_video(self, image_path: str, output_path: str) -> str: