Adding the pipeline for the task explanation and LLM #2190

Open
Bepitic wants to merge 50 commits into main from llm-pipeline
Changes from 1 commit (50 commits total)
Commits
adbca17
Add Task EXPLANATION and the visualization of images with description.
Bepitic Jul 15, 2024
5611ec1
upd dataset task with explanation
Bepitic Jul 15, 2024
8ed23a3
fix task type on metrics, depth, dataset, inferencer.
Bepitic Jul 15, 2024
a463b5b
Merge branch 'main' into llm-pipeline
Bepitic Jul 15, 2024
d5baf6b
fix lint on visualization/image
Bepitic Jul 16, 2024
b7c8eaa
Merge branch 'openvinotoolkit:main' into llm-pipeline
Bepitic Jul 18, 2024
5b563d9
Merge branch 'llm-pipeline' of github.com:Bepitic/anomalib into llm-p…
Bepitic Jul 18, 2024
bfd936e
Fix formatting dataset
Bepitic Jul 18, 2024
f541316
fix format data/base/depth
Bepitic Jul 18, 2024
4e392a9
Fix formatting openvino_inferencer
Bepitic Jul 18, 2024
5fc70ba
fix formatting
Bepitic Jul 18, 2024
75099af
Add Explanation to error-msg.
Bepitic Aug 2, 2024
e5040d3
OpenAI - VLM init
Bepitic Aug 3, 2024
86ad803
Add wrapper to run OpenAI
Bepitic Aug 4, 2024
3678f72
add in pyproject
Bepitic Aug 4, 2024
7413842
Add Test and fix description/title
Bepitic Aug 12, 2024
dc42cbd
Add Readme and fix bug.
Bepitic Aug 13, 2024
5788d22
Update src/anomalib/models/image/openai_vlm/lightning_model.py
Bepitic Aug 13, 2024
e4f6bec
Update src/anomalib/models/image/openai_vlm/__init__.py
Bepitic Aug 13, 2024
5437467
Add fix pipeline bug.
Bepitic Aug 13, 2024
982c9ca
Add test.
Bepitic Aug 13, 2024
642fd26
Merge branch 'OpenAI-VLM' of github.com:Bepitic/anomalib into OpenAI-VLM
Bepitic Aug 13, 2024
b8cacf0
add changes
Bepitic Aug 16, 2024
0929dc9
Add integration test and unit test + skip export.
Bepitic Aug 16, 2024
39cf996
change to LANGUAGE
Bepitic Aug 16, 2024
671693d
Update images in Readme.
Bepitic Aug 17, 2024
224118b
Update src/anomalib/models/image/chatgpt_vision/__init__.py
Bepitic Aug 20, 2024
b703a41
Update src/anomalib/models/image/chatgpt_vision/chatgpt.py
Bepitic Aug 20, 2024
24c5486
Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
Bepitic Aug 20, 2024
68e757e
Update tests/integration/model/test_models.py
Bepitic Aug 20, 2024
86714a1
Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
Bepitic Aug 20, 2024
196d2a3
Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
Bepitic Aug 20, 2024
b7f345a
fix comments
Bepitic Aug 20, 2024
b285d10
remove last file of chatgpt_vision.
Bepitic Aug 20, 2024
a688530
fix tests
Bepitic Aug 20, 2024
0fb5f79
Merge pull request #1 from Bepitic/OpenAI-VLM (GPTVad)
Bepitic Aug 20, 2024
6503543
Merge branch 'main' into llm-pipeline
Bepitic Aug 20, 2024
8e92e5e
Update src/anomalib/models/image/gptvad/chatgpt.py
Bepitic Aug 21, 2024
5ab044d
upd: language -> VISUAL_PROMPTING
Bepitic Aug 21, 2024
3f9ca93
fix visual prompting and model_name
Bepitic Aug 21, 2024
391b4c4
fix GPT for Gpt and the folder of the tests.
Bepitic Aug 21, 2024
ca1a0bb
fix: change import error outside.
Bepitic Aug 21, 2024
022dcb7
fix readme pointing to the right model.
Bepitic Aug 21, 2024
af7b9e9
fix import cycle, and separate usecase by explicit if.
Bepitic Aug 21, 2024
faf334f
upd: add comments to the few shot / zero shot.
Bepitic Aug 21, 2024
3ed8d3f
fix: dataset expected columns
Bepitic Aug 21, 2024
7f454c4
upd: add the same logic of the label on visualize_full.
Bepitic Aug 22, 2024
45bd520
Merge branch 'main' into llm-pipeline
Bepitic Aug 22, 2024
44586d6
Fix in the logic of the code.
Bepitic Aug 22, 2024
7adb835
Merge branch 'llm-pipeline' of github.com:Bepitic/anomalib into llm-p…
Bepitic Aug 22, 2024
OpenAI - VLM init
Signed-off-by: Bepitic <bepitic@gmail.com>
Bepitic committed Aug 3, 2024
commit e5040d32c6ad9d71740a20746d20f912eae3dcad
2 changes: 2 additions & 0 deletions src/anomalib/models/image/__init__.py
@@ -14,6 +14,7 @@
from .fastflow import Fastflow
from .fre import Fre
from .ganomaly import Ganomaly
from .openai_vlm import OpenaiVlm
from .padim import Padim
from .patchcore import Patchcore
from .reverse_distillation import ReverseDistillation
@@ -34,6 +35,7 @@
"Fastflow",
"Fre",
"Ganomaly",
"OpenaiVlm",
"Padim",
"Patchcore",
"ReverseDistillation",
8 changes: 8 additions & 0 deletions src/anomalib/models/image/openai_vlm/__init__.py
@@ -0,0 +1,8 @@
"""Llm model."""

# Copyright (C) 2023-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from .lightning_model import OpenaiVlm

__all__ = ["OpenaiVlm"]
263 changes: 263 additions & 0 deletions src/anomalib/models/image/openai_vlm/lightning_model.py
@@ -0,0 +1,263 @@
"""OpenAI Visual Large Model: Zero-/Few-Shot Anomaly Classification.

No reference paper is associated with this model.
"""
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import base64
import logging
from pathlib import Path

import openai
import torch
from lightning.pytorch.utilities.types import STEP_OUTPUT
from torch.utils.data import DataLoader

from anomalib import LearningType
from anomalib.metrics.threshold import ManualThreshold
from anomalib.models.components import AnomalyModule

logger = logging.getLogger(__name__)

__all__ = ["OpenaiVlm"]


class OpenaiVlm(AnomalyModule):
"""OpenaiVlm Lightning model using OpenAI's GPT-4 for image anomaly detection.

Args:
key (str): API key for OpenAI.
https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key
k_shot(int): The number of images that will compare to detect if it is an anomaly.
"""

def __init__(
self,
k_shot: int = 0,
openai_key: str | None = None,
) -> None:
super().__init__()

self.k_shot = k_shot

# An OpenAI API key is required; fail fast if it is missing.
if not openai_key:
from anomalib.engine.engine import UnassignedError

msg = "OpenAI key not found."
raise UnassignedError(msg)

self.openai_key = openai_key
# The v1 OpenAI SDK exposes a client class; ``openai.client()`` does not exist.
self.openai = openai.OpenAI(api_key=self.openai_key)
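# NOTE: a manual threshold is used here, presumably because the LLM returns discrete
# YES/NO answers, so there is no continuous score distribution to fit a threshold to.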
self.image_threshold = ManualThreshold()

def _setup(self) -> None:
"""Collect the few-shot reference images from the train dataloader."""
dataloader = self.trainer.datamodule.train_dataloader()
self.pre_images = self.collect_reference_images(dataloader)

def api_call_few_shot(self, key: str, pre_img: list[str], prompt: str, image: str) -> str:
"""Makes an API call to OpenAI's GPT-4 model to detect anomalies in an image.

Args:
key (str): API key for OpenAI.
pre_img (list): List of paths to images that serve as examples of typical images without anomalies.
prompt (str): The prompt to provide to the GPT-4 model (not used in the current implementation).
image (str): Path to the image that needs to be checked for anomalies.

Returns:
str: The response from the GPT-4 model indicating whether the image has anomalies or not.
It returns 'NO' if there are no anomalies and 'YES: description' if there are anomalies,
where 'description' provides details of the anomaly and its position.

Raises:
openai.OpenAIError: If there is an error during the API call.
"""

# Function to encode the image
def encode_image(image_path: str) -> str:
with Path(image_path).open("rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")

# Getting the base64 string
base64_image = encode_image(image)
base64_image_pre = [encode_image(img) for img in pre_img]
messages = [
{
"role": "system",
"content": "You will receive one example of a typical image without any anomaly, then a last image that you must classify. Answer 'NO' if it has no anomalies, or 'YES: description' where description describes the anomaly and its position.",
},
# Vision inputs must be sent as ``image_url`` content parts, not plain text.
{
"role": "user",
"content": [
{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image_pre[0]}"}},
{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}},
],
},
]

try:
# Make the API call using the openai library; the few-shot comparison needs a
# vision-capable model such as gpt-4o (plain gpt-4 does not accept images).
response = self.openai.chat.completions.create(model="gpt-4o", messages=messages, max_tokens=300)
return response.choices[-1].message.content
except openai.OpenAIError:
logger.exception("An error occurred while calling the OpenAI API")
raise

def api_call(self, key: str, prompt: str, image: str) -> str:
"""Makes an API call to OpenAI's GPT-4 model to detect anomalies in an image.

Args:
key (str): API key for OpenAI.
prompt (str): The prompt to provide to the GPT-4 model (not used in the current implementation).
image (str): Path to the image that needs to be checked for anomalies.

Returns:
str: The response from the GPT-4 model indicating whether the image has anomalies or not.
It returns 'NO' if there are no anomalies and 'YES: description' if there are anomalies,
where 'description' provides details of the anomaly and its position.

Raises:
openai.OpenAIError: If there is an error during the API call.
"""
prompt = """
Examine the provided image carefully to determine if there is an obvious anomaly present.
Anomalies may include mechanical malfunctions, unexpected objects, safety hazards, structural damages,
or unusual patterns or defects in the objects.

Instructions:

1. Thoroughly inspect the image for any irregularities or deviations from normal operating conditions.

2. Clearly state if an obvious anomaly is detected.
- If an anomaly is detected, begin with 'YES,' followed by a detailed description of the anomaly.
- If no anomaly is detected, simply state 'NO' and end the analysis.

Example Output Structure:

'YES:
- Description: Conveyor belt misalignment causing potential blockages.
This may result in production delays and equipment damage.
Immediate realignment and inspection are recommended.'

'NO'

Considerations:

- Ensure accuracy in identifying anomalies to prevent overlooking critical issues.
- Provide clear and concise descriptions for any detected anomalies.
- Focus on obvious anomalies that could impact the object's final use or safety.
"""

# Function to encode the image
def encode_image(image_path: str) -> str:
with Path(image_path).open("rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")

# Getting the base64 string
base64_image = encode_image(image)

messages = [
{
"role": "system",
"content": prompt,
},
# Vision input must be sent as an ``image_url`` content part, not plain text.
{
"role": "user",
"content": [
{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}},
],
},
]

try:
# Make the API call using the openai library.
response = self.openai.chat.completions.create(model="gpt-4o", messages=messages, max_tokens=300)
return response.choices[-1].message.content
except openai.OpenAIError:
logger.exception("An error occurred while calling the OpenAI API")
raise

def training_step(self, batch: dict[str, str | torch.Tensor], *args, **kwargs) -> dict[str, str | torch.Tensor]:
"""Training step: a no-op, since the LLM is not trained."""
del args, kwargs  # These variables are not used.
# The batch is passed through unchanged.
return batch

@staticmethod
def configure_optimizers() -> None:
"""WinCLIP doesn't require optimization, therefore returns no optimizers."""
return

def validation_step(
self,
batch: dict[str, str | list[str] | torch.Tensor],
*args,
**kwargs,
) -> STEP_OUTPUT:
"""Get batch of anomaly maps from input image batch.

Args:
batch (dict[str, str | list[str] | torch.Tensor]): Batch containing image filename, image, label and mask
args: Additional arguments.
kwargs: Additional keyword arguments.

Returns:
dict[str, Any]: The input batch extended with ``str_output`` (the raw LLM response)
and ``pred_scores`` (1.0 when the image is predicted anomalous).
"""
del args, kwargs # These variables are not used.
bsize = len(batch["image_path"])
out_list: list[str] = []
pred_list: list[float] = []
for i in range(bsize):
try:
if self.k_shot > 0:
output = str(
self.api_call_few_shot(self.openai_key, self.pre_images, "", batch["image_path"][i]),
).strip()
else:
output = str(self.api_call(self.openai_key, "", batch["image_path"][i])).strip()
except Exception:
# Keep the loop alive on any API failure and record the error in the output.
logger.exception("Error calling the OpenAI API for image %s", batch["image_path"][i])
output = "Error"

# Map the textual answer to a score: 'NO...' -> 0.0, 'YES...' -> 1.0, and 0.5
# when the answer (or an error) cannot be parsed.
prediction = 0.5
if output.startswith("N"):
prediction = 0.0
elif output.startswith("Y"):
prediction = 1.0

out_list.append(output)
pred_list.append(prediction)
logger.debug("Output: %s, Prediction: %s", output, prediction)

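# ``pred_labels`` mirrors ``pred_scores`` because the LLM emits a hard YES/NO
# decision rather than a continuous anomaly score.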
batch["str_output"] = out_list
batch["pred_scores"] = torch.tensor(pred_list).to(self.device)
batch["pred_labels"] = torch.tensor(pred_list).to(self.device)
return batch

@property
def trainer_arguments(self) -> dict[str, int | float]:
"""Set model-specific trainer arguments."""
return {}

@property
def learning_type(self) -> LearningType:
"""The learning type of the model.

The model is zero-/few-shot depending on the user configuration: the learning type is
``LearningType.FEW_SHOT`` when ``k_shot`` is greater than zero and ``LearningType.ZERO_SHOT`` otherwise.
"""
return LearningType.ZERO_SHOT if self.k_shot == 0 else LearningType.FEW_SHOT

def collect_reference_images(self, dataloader: DataLoader) -> list[str]:
"""Collect reference image paths for few-shot inference.

The reference image paths are collected by iterating over the training dataset until
the required number of images is reached.

Returns:
ref_images (list[str]): Paths to the ``k_shot`` reference images.
"""
ref_images: list[str] = []
for batch in dataloader:
images = batch["image_path"][: self.k_shot - len(ref_images)]
ref_images.extend(images)
if self.k_shot == len(ref_images):
break
return ref_images
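
For reviewers, a minimal end-to-end usage sketch (assuming anomalib's Engine and MVTec datamodule APIs at the time of this PR; the key value is a placeholder):

from anomalib.data import MVTec
from anomalib.engine import Engine
from anomalib.models.image.openai_vlm import OpenaiVlm

# k_shot=0 runs zero-shot; k_shot>0 collects reference image paths from the train
# dataloader (via collect_reference_images) and sends one alongside each test image.
model = OpenaiVlm(k_shot=1, openai_key="sk-...")  # placeholder key
engine = Engine()
engine.test(model=model, datamodule=MVTec(category="bottle"))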