From 376a651b3faf1e5412c96ff3fb0cf9572195aad2 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 22:18:03 +0800
Subject: [PATCH 01/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index e34d6a71..5c40a969 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -268,6 +268,10 @@ Your dataset
 ```shell
 cd path/to/playground/
 # build from source
+# 查看文档配置 mmdetection https://mmdetection.readthedocs.io/zh_CN/latest/get_started.html
+# pip install -U openmim
+# mim install mmengine
+# mim install "mmcv>=2.0.0"
 git clone https://github.com/open-mmlab/mmdetection.git
 cd mmdetection; pip install -e .; cd ..
 ```

From 97eabb1012f39df96d418f446f461137c225ae40 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 22:22:02 +0800
Subject: [PATCH 02/17] Update readme.md

---
 label_anything/readme.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index ae01dd6f..c0280228 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -232,6 +232,8 @@ The polygon instance format is not easy to control the number of points, too muc
 
 Here we provide a conversion script to convert the json format of label-studio output to COCO format.
 
+⚠ Currently, only projects in which every image has been annotated are supported.
+
 ```shell
 cd path/to/playground/label_anything
 python tools/convert_to_rle_mask_coco.py --json_file_path path/to/LS_json --out_dir path/to/output/file
@@ -261,8 +263,12 @@ First get mmdetection in the playground directory.
 ```shell
 cd path/to/playground/
 # build from source
+# View the documentation to install mmdetection https://mmdetection.readthedocs.io/en/latest/get_started.html#
+# pip install -U openmim
+# mim install mmengine
+# mim install "mmcv>=2.0.0"
 git clone https://github.com/open-mmlab/mmdetection.git
-cd mmdetection; pip install -e . ; cd .
+cd mmdetection; pip install -e .; cd ..
 ```
 
 Then use this script to output the config for training on demand, where the template `mask-rcnn_r50_fpn` is provided in `label_anything/config_template`.

From 84bb415bcaf57e839366db82705f0422f2cd4519 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 22:22:57 +0800
Subject: [PATCH 03/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index 5c40a969..13d3185e 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -240,6 +240,8 @@ polygon 实例格式由于不太好控制点数，太多不方便微调（不像
 
 此处提供将 label-studio 输出的json格式转换为COCO格式的转换脚本。
 
+⚠目前仅支持已经标注完所有图片的项目.
+
 ```shell
 cd path/to/playground/label_anything
 python tools/convert_to_rle_mask_coco.py --json_file_path path/to/LS_json --out_dir path/to/output/file

From b445cd407258a1ecb983c0a89d1849408973e7ad Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 23:31:57 +0800
Subject: [PATCH 04/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index 13d3185e..d06b82b4 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -271,9 +271,9 @@ Your dataset
 cd path/to/playground/
 # build from source
 # 查看文档配置 mmdetection https://mmdetection.readthedocs.io/zh_CN/latest/get_started.html
-# pip install -U openmim
-# mim install mmengine
-# mim install "mmcv>=2.0.0"
+pip install -U openmim
+mim install mmengine
+mim install "mmcv>=2.0.0"
 git clone https://github.com/open-mmlab/mmdetection.git
 cd mmdetection; pip install -e .; cd ..
 ```

From a1b0456d21c19cb39226bfedb03a4e71c332cdf4 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 23:32:31 +0800
Subject: [PATCH 05/17] Update readme.md

---
 label_anything/readme.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index c0280228..e5851803 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -264,9 +264,9 @@ First get mmdetection in the playground directory.
 cd path/to/playground/
 # build from source
 # View the documentation to install mmdetection https://mmdetection.readthedocs.io/en/latest/get_started.html#
-# pip install -U openmim
-# mim install mmengine
-# mim install "mmcv>=2.0.0"
+pip install -U openmim
+mim install mmengine
+mim install "mmcv>=2.0.0"
 git clone https://github.com/open-mmlab/mmdetection.git
 cd mmdetection; pip install -e .; cd ..
 ```

From 6b139462a4edeb46a9e50ab0494f3b09711f976a Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 23:34:42 +0800
Subject: [PATCH 06/17] Update readme.md

---
 label_anything/readme.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index e5851803..60a99d82 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -263,7 +263,6 @@ First get mmdetection in the playground directory.
 ```shell
 cd path/to/playground/
 # build from source
-# View the documentation to install mmdetection https://mmdetection.readthedocs.io/en/latest/get_started.html#
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From cea0139d0e9c9022a725f8ac9759df053de677ad Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 23:35:07 +0800
Subject: [PATCH 07/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index d06b82b4..f8d7645f 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -270,7 +270,6 @@ Your dataset
 ```shell
 cd path/to/playground/
 # build from source
-# 查看文档配置 mmdetection https://mmdetection.readthedocs.io/zh_CN/latest/get_started.html
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From 1e15f8198b6d6f58d3853564677a9eec7637f9a4 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 23:37:51 +0800
Subject: [PATCH 08/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index f8d7645f..b8438cb4 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -270,6 +270,7 @@ Your dataset
 ```shell
 cd path/to/playground/
 # build from source
+conda activate rtmdet-sam
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From 7f8674996980cc6063e421dc7067260126240694 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 8 May 2023 23:38:07 +0800
Subject: [PATCH 09/17] Update readme.md

---
 label_anything/readme.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index 60a99d82..903e2925 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -263,6 +263,7 @@ First get mmdetection in the playground directory.
 ```shell
 cd path/to/playground/
 # build from source
+conda activate rtmdet-sam
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From a9b9e25b5c7684cc119b7ccf47bd205a8fa28080 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Tue, 9 May 2023 00:02:30 +0800
Subject: [PATCH 10/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index b8438cb4..4ec33758 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -271,6 +271,8 @@ Your dataset
 cd path/to/playground/
 # build from source
 conda activate rtmdet-sam
+# Windows 用户需要使用 conda 安装 pycocotools
+# conda install pycocotools -c conda-forge
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From 86b44d35f37187b35b73133e46ce99ffd3863f85 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Tue, 9 May 2023 00:03:53 +0800
Subject: [PATCH 11/17] Update readme.md

---
 label_anything/readme.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index 903e2925..554e7282 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -264,6 +264,8 @@ First get mmdetection in the playground directory.
 cd path/to/playground/
 # build from source
 conda activate rtmdet-sam
+# Windows users need to install pycocotools using conda
+conda install pycocotools -c conda-forge 
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From 145210a778ec6da0066b198a498b0e22817f1909 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Tue, 9 May 2023 00:05:29 +0800
Subject: [PATCH 12/17] Update readme.md

---
 label_anything/readme.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index 554e7282..51c1b7e9 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -265,7 +265,7 @@ cd path/to/playground/
 # build from source
 conda activate rtmdet-sam
 # Windows users need to install pycocotools using conda
-conda install pycocotools -c conda-forge 
+# conda install pycocotools -c conda-forge
 pip install -U openmim
 mim install mmengine
 mim install "mmcv>=2.0.0"

From 8cfb57d592e66bc277982c4526bf6ce9d6ccbf58 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 15 May 2023 12:36:07 +0800
Subject: [PATCH 13/17] Add optional ONNX Runtime backend to mmdetection.py

---
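Reviewer note: predict() converts the Label Studio prompt in four
near-identical places (point and rectangle, for the ONNX and PyTorch
branches). The sketch below is not part of this patch. It only illustrates,
under assumptions, how a Label Studio result in percent coordinates maps to
the point layout that run_decoder builds. `percent_to_pixels` and
`build_prompt` are hypothetical helper names, and plain lists stand in for
the NumPy arrays used in the patch.

```python
from typing import Dict, List, Tuple


def percent_to_pixels(value: Dict[str, float], width: int, height: int) -> Dict[str, float]:
    """Scale Label Studio percentage coordinates (0-100) to pixel units."""
    scaled = {"x": value["x"] * width / 100, "y": value["y"] * height / 100}
    if "width" in value and "height" in value:
        scaled["width"] = value["width"] * width / 100
        scaled["height"] = value["height"] * height / 100
    return scaled


def build_prompt(result: dict) -> Tuple[List[List[float]], List[int]]:
    """Return (coords, labels) in the layout used by the ONNX decoder path.

    A keypoint becomes one point with label 1. A rectangle becomes its
    top-left and bottom-right corners with labels 2 and 3. A padding point
    (0, 0) with label -1 is appended in both cases, mirroring run_decoder.
    """
    px = percent_to_pixels(
        result["value"], result["original_width"], result["original_height"]
    )
    if result["type"] == "keypointlabels":
        coords, labels = [[px["x"], px["y"]]], [1]
    elif result["type"] == "rectanglelabels":
        coords = [
            [px["x"], px["y"]],
            [px["x"] + px["width"], px["y"] + px["height"]],
        ]
        labels = [2, 3]
    else:
        raise ValueError(f"unsupported prompt type: {result['type']}")
    return coords + [[0.0, 0.0]], labels + [-1]
```

With a helper along these lines, both branches could share one conversion
step and differ only in how the coordinates are handed to the model.
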
 label_anything/sam/mmdetection.py | 209 ++++++++++++++++++++++++------
 1 file changed, 171 insertions(+), 38 deletions(-)

diff --git a/label_anything/sam/mmdetection.py b/label_anything/sam/mmdetection.py
index b7149a78..75653633 100644
--- a/label_anything/sam/mmdetection.py
+++ b/label_anything/sam/mmdetection.py
@@ -7,6 +7,7 @@
 import numpy as np
 from label_studio_converter import brush
 import torch
+from torch.nn import functional as F
 
 import cv2
 
@@ -19,8 +20,13 @@
 
 # from mmdet.apis import inference_detector, init_detector
 from segment_anything import SamPredictor, sam_model_registry, SamAutomaticMaskGenerator
+from segment_anything.utils.transforms import ResizeLongestSide
 import random
 import string
+import time
+import onnxruntime
+
+
 logger = logging.getLogger(__name__)
 
 def load_my_model(device="cuda:0",sam_config="vit_b",sam_checkpoint_file="sam_vit_b_01ec64.pth"):
@@ -34,6 +40,28 @@ def load_my_model(device="cuda:0",sam_config="vit_b",sam_checkpoint_file="sam_vi
         return predictor
 
 
+def load_my_onnx(onnx_config:dict):
+    # !wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/encoder.onnx
+    # !wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/decoder.onnx
+    encoder_model_abs_path = "./encoder.onnx"
+    decoder_model_abs_path = "./decoder.onnx"
+
+
+    providers = onnxruntime.get_available_providers()
+    if providers:
+        logging.info(
+                "Available providers for ONNXRuntime: %s", ", ".join(providers)
+            )
+    else:
+        logging.warning("No available providers for ONNXRuntime")
+    encoder_session = onnxruntime.InferenceSession(
+            encoder_model_abs_path, providers=providers
+            )
+    decoder_session = onnxruntime.InferenceSession(
+            decoder_model_abs_path, providers=providers
+        )
+
+    return encoder_session,decoder_session
 
 class MMDetection(LabelStudioMLBase):
     """Object detector based on https://github.com/open-mmlab/mmdetection."""
@@ -50,21 +78,23 @@ def __init__(self,
                  out_poly=False,
                  score_threshold=0.5,
                  device='cpu',
+                 onnx=False,
                  **kwargs):
 
         super(MMDetection, self).__init__(**kwargs)
+        self.onnx=onnx
+        if self.onnx:
+            PREDICTOR=load_my_onnx(device)
+        else:
+            PREDICTOR=load_my_model(device,sam_config,sam_checkpoint_file)
 
-        PREDICTOR=load_my_model(device,sam_config,sam_checkpoint_file)
+        
         self.PREDICTOR = PREDICTOR
 
         self.out_mask = out_mask
         self.out_bbox = out_bbox
         self.out_poly = out_poly
 
-        # config_file = config_file or os.environ['config_file']
-        # checkpoint_file = checkpoint_file or os.environ['checkpoint_file']
-        # self.config_file = config_file
-        # self.checkpoint_file = checkpoint_file
         self.labels_file = labels_file
         # default Label Studio image upload folder
         upload_dir = os.path.join(get_data_dir(), 'media', 'upload')
@@ -76,8 +106,6 @@ def __init__(self,
         else:
             self.label_map = {}
 
-        # self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(  # noqa E501
-        #     self.parsed_label_config, 'RectangleLabels', 'Image')
 
         self.labels_in_config = dict(
                 label=self.parsed_label_config['KeyPointLabels']
@@ -132,6 +160,78 @@ def __init__(self,
         # self.model = init_detector(config_file, checkpoint_file, device=device)
         self.score_thresh = score_threshold
 
+
+    def pre_process(self, image):
+        image_size = 1024
+        transform = ResizeLongestSide(image_size)
+
+        input_image = transform.apply_image(image)
+        input_image_torch = torch.as_tensor(input_image, device="cpu")
+        input_image_torch = input_image_torch.permute(2, 0, 1).contiguous()[None, :, :, :]
+        pixel_mean = torch.Tensor([123.675, 116.28, 103.53]).view(-1, 1, 1)
+        pixel_std = torch.Tensor([58.395, 57.12, 57.375]).view(-1, 1, 1)
+        x = (input_image_torch - pixel_mean) / pixel_std
+        h, w = x.shape[-2:]
+        padh = image_size - h
+        padw = image_size - w
+        x = F.pad(x, (0, padw, 0, padh))
+        x = x.numpy()
+
+        encoder_inputs = {
+            "x": x,
+        }
+        return encoder_inputs, image.shape[:2]
+
+    def run_encoder(self, encoder_inputs):
+        output = self.encoder_session.run(None, encoder_inputs)
+        image_embedding = output[0]
+        return image_embedding
+
+
+
+    def run_decoder(
+        self, image_embedding, input_prompt,img_size):
+        (original_height,original_width)=img_size
+        points=input_prompt['points']
+        masks=input_prompt['mask']
+        boxes=input_prompt['boxes']
+        labels=input_prompt['label']
+
+        image_size = 1024
+        transform = ResizeLongestSide(image_size)
+        if boxes is not None:
+            onnx_box_coords = boxes.reshape(2, 2)
+            input_labels = np.array([2,3])
+
+            onnx_coord = np.concatenate([onnx_box_coords, np.array([[0.0, 0.0]])], axis=0)[None, :, :]
+            onnx_label = np.concatenate([input_labels, np.array([-1])], axis=0)[None, :].astype(np.float32)
+        elif points is not None:
+            input_point=points
+            input_label = np.array([1])
+            onnx_coord = np.concatenate([input_point, np.array([[0.0, 0.0]])], axis=0)[None, :, :]
+            onnx_label = np.concatenate([input_label, np.array([-1])], axis=0)[None, :].astype(np.float32)
+
+        onnx_coord = transform.apply_coords(onnx_coord, img_size).astype(np.float32)
+
+        onnx_mask_input = np.zeros((1, 1, 256, 256), dtype=np.float32)
+        onnx_has_mask_input = np.zeros(1, dtype=np.float32)
+
+        
+        decoder_inputs = {
+            "image_embeddings": image_embedding,
+            "point_coords": onnx_coord,
+            "point_labels": onnx_label,
+            "mask_input": onnx_mask_input,
+            "has_mask_input": onnx_has_mask_input,
+            "orig_im_size": np.array(
+                img_size, dtype=np.float32
+            ),
+        }
+        masks, _, _ = self.decoder_session.run(None, decoder_inputs)
+        masks = masks > 0.0
+
+        return masks
+
     def _get_image_url(self, task):
         image_url = task['data'].get(
             self.value) or task['data'].get(DATA_UNDEFINED_NAME)
@@ -156,8 +256,7 @@ def _get_image_url(self, task):
 
     def predict(self, tasks, **kwargs):
 
-        predictor = self.PREDICTOR
-
+        start = time.time()
         results = []
         assert len(tasks) == 1
         task = tasks[0]
@@ -167,48 +266,83 @@ def predict(self, tasks, **kwargs):
         if kwargs.get('context') is None:
             return []
         
-        # image = cv2.imread(f"./{split}")
         image = cv2.imread(image_path)
         image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
-        predictor.set_image(image)
-        
         prompt_type = kwargs['context']['result'][0]['type']
         original_height = kwargs['context']['result'][0]['original_height']
         original_width = kwargs['context']['result'][0]['original_width']
 
+        if self.onnx:
+            self.encoder_session,self.decoder_session=self.PREDICTOR
+            encoder_inputs,_ = self.pre_process(image)
 
-        if prompt_type == 'keypointlabels':
-            # getting x and y coordinates of the keypoint
-            x = kwargs['context']['result'][0]['value']['x'] * original_width / 100
-            y = kwargs['context']['result'][0]['value']['y'] * original_height / 100
-            output_label = kwargs['context']['result'][0]['value']['labels'][0]
+            input_prompt={}
 
+            input_prompt['boxes']=input_prompt['mask']=input_prompt['points']=input_prompt['label']=None
+            if prompt_type == 'keypointlabels':
+                # getting x and y coordinates of the keypoint
+                x = kwargs['context']['result'][0]['value']['x'] * original_width / 100
+                y = kwargs['context']['result'][0]['value']['y'] * original_height / 100
+                output_label = kwargs['context']['result'][0]['value']['labels'][0]
 
-            masks, scores, logits = predictor.predict(
-                point_coords=np.array([[x, y]]),
-                # box=np.array([x.cpu() for x in bbox[:4]]),
-                point_labels=np.array([1]),
-                multimask_output=False,
-            )
+                input_prompt['points']=np.array([[x, y]])
+                input_prompt['label']=np.array([1])
+
+            
+            if prompt_type == 'rectanglelabels':
 
+                x = kwargs['context']['result'][0]['value']['x'] * original_width / 100
+                y = kwargs['context']['result'][0]['value']['y'] * original_height / 100
+                w = kwargs['context']['result'][0]['value']['width'] * original_width / 100
+                h = kwargs['context']['result'][0]['value']['height'] * original_height / 100
 
-        if prompt_type == 'rectanglelabels':
+                output_label = kwargs['context']['result'][0]['value']['rectanglelabels'][0]
+            
+                input_prompt['boxes']=np.array([x, y, x+w, y+h])
 
+                input_prompt['label'] = np.array([2,3])
+            
+            
+            #encoder
+            image_embedding = self.run_encoder(encoder_inputs)
+            masks = self.run_decoder(image_embedding,input_prompt,\
+                                     (original_height,original_width))
+            masks = masks[0].astype(np.uint8)
 
-            x = kwargs['context']['result'][0]['value']['x'] * original_width / 100
-            y = kwargs['context']['result'][0]['value']['y'] * original_height / 100
-            w = kwargs['context']['result'][0]['value']['width'] * original_width / 100
-            h = kwargs['context']['result'][0]['value']['height'] * original_height / 100
+        else:
+            predictor = self.PREDICTOR
+            predictor.set_image(image)
 
-            output_label = kwargs['context']['result'][0]['value']['rectanglelabels'][0]
+            if prompt_type == 'keypointlabels':
+                # getting x and y coordinates of the keypoint
+                x = kwargs['context']['result'][0]['value']['x'] * original_width / 100
+                y = kwargs['context']['result'][0]['value']['y'] * original_height / 100
+                output_label = kwargs['context']['result'][0]['value']['labels'][0]
+
+
+                masks, scores, logits = predictor.predict(
+                    point_coords=np.array([[x, y]]),
+                    # box=np.array([x.cpu() for x in bbox[:4]]),
+                    point_labels=np.array([1]),
+                    multimask_output=False,
+                )
 
-            masks, scores, logits = predictor.predict(
-                # point_coords=np.array([[x, y]]),
-                box=np.array([x, y, x+w, y+h]),
-                point_labels=np.array([1]),
-                multimask_output=False,
-            )
 
+            if prompt_type == 'rectanglelabels':
+
+                x = kwargs['context']['result'][0]['value']['x'] * original_width / 100
+                y = kwargs['context']['result'][0]['value']['y'] * original_height / 100
+                w = kwargs['context']['result'][0]['value']['width'] * original_width / 100
+                h = kwargs['context']['result'][0]['value']['height'] * original_height / 100
+
+                output_label = kwargs['context']['result'][0]['value']['rectanglelabels'][0]
+
+                masks, scores, logits = predictor.predict(
+                    # point_coords=np.array([[x, y]]),
+                    box=np.array([x, y, x+w, y+h]),
+                    point_labels=np.array([1]),
+                    multimask_output=False,
+                )
 
         mask = masks[0].astype(np.uint8) # each mask has shape [H, W]
         # converting the mask from the model to RLE format which is usable in Label Studio
@@ -216,12 +350,11 @@ def predict(self, tasks, **kwargs):
         # 找到轮廓
         contours, hierarchy = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
 
-
+        end = time.time()
+        logger.info('predict() elapsed: %.3f s', end - start)
 
 
         # 计算外接矩形
-
-
         if self.out_bbox:
             new_contours = []
             for contour in contours:

From da641c599481c67f846fc08860533e48cc5dfa26 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 15 May 2023 12:45:20 +0800
Subject: [PATCH 14/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index d27f8a2a..a6f11c89 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -384,5 +384,35 @@ python tools/test.py data/my_set/mask-rcnn_r50_fpn.py path/of/your/checkpoint --
 
 到此半自动化标注就完成了, 通过 Label-Studio 的半自动化标注功能，可以让用户在标注过程中，通过点击一下鼠标，就可以完成目标的分割和检测，大大提高了标注效率。部分代码借鉴自 label-studio-ml-backend ID 为 253 的 Pull Request，感谢作者的贡献。同时感谢社区同学 [ATang0729](https://github.com/ATang0729) 为脚本测试重新标注了喵喵数据集，以及 [JimmyMa99](https://github.com/JimmyMa99) 同学提供的转换脚本、 config 模板以及文档优化。
 
+## (测试阶段)🚀使用 onnx runtime 进行 SAM 后端推理🚀（可选）
+
+我们使用 onnx runtime 进行 SAM 后端推理以提升 SAM 的推理速度，在一张 3090 上测试，使用 pytorch 需要 4.6s ，使用 onnx runtime 只要 0.24s。
+
+首先下载 huggingface 上转换好的 onnx。
+
+```shell
+cd path/to/playground/label_anything
+wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/encoder.onnx
+wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/decoder.onnx
+```
+
+接着开启后端推理。
+
+```shell
+cd path/to/playground/label_anything
+
+label-studio-ml start sam --port 8003 --with \
+sam_config=vit_b \
+sam_checkpoint_file=./sam_vit_b_01ec64.pth \
+out_mask=True \
+out_bbox=True \
+device=cuda:0 \
+onnx=True \
+# device=cuda:0 为使用 GPU 推理，如果使用 cpu 推理，将 cuda:0 替换为 cpu
+# out_poly=True 返回外接多边形的标注
+```
+
+⚠目前仅支持 sam_vit_b。
+
 
 

From e0498ace5a9dce4b9f3d0fe129e730f5d5e98c3a Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Mon, 15 May 2023 12:46:04 +0800
Subject: [PATCH 15/17] Document ONNX Runtime inference in readme.md

---
 label_anything/readme.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/label_anything/readme.md b/label_anything/readme.md
index 91a660ee..754f4609 100644
--- a/label_anything/readme.md
+++ b/label_anything/readme.md
@@ -378,3 +378,33 @@ When finished, we can get the model test visualization. On the left is the annot
 With the semi-automated annotation function of Label-Studio, users can complete object segmentation and detection by simply clicking the mouse during the annotation process, greatly improving the efficiency of annotation.
 
 Some of the code was borrowed from Pull Request ID 253 of label-studio-ml-backend. Thank you to the author for their contribution. Also, thanks to fellow community member [ATang0729](https://github.com/ATang0729) for re-labeling the meow dataset for script testing, and [JimmyMa99](https://github.com/JimmyMa99) for the conversion script, config template, and documentation Optimization.
+
+## (beta)🚀 SAM backend inference using ONNX Runtime🚀 (optional)
+
+ONNX Runtime can be used for SAM backend inference to speed it up: in our test on a single 3090, inference took 4.6s with PyTorch and 0.24s with ONNX Runtime.
+
+First, download the converted ONNX models from Hugging Face.
+
+```shell
+cd path/to/playground/label_anything
+wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/encoder.onnx
+wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/decoder.onnx
+```
+
+Then start the backend inference service.
+
+```shell
+cd path/to/playground/label_anything
+
+label-studio-ml start sam --port 8003 --with \
+sam_config=vit_b \
+sam_checkpoint_file=./sam_vit_b_01ec64.pth \
+out_mask=True \
+out_bbox=True \
+device=cuda:0 \
+onnx=True \
+# device=cuda:0 uses the GPU; for CPU inference, replace cuda:0 with cpu
+# out_poly=True returns enclosing polygon annotations
+```
+
+⚠ Currently only sam_vit_b is supported.

From d82c3319d0cefa689a87b70c5d916d480debf29a Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Thu, 18 May 2023 11:09:29 +0800
Subject: [PATCH 16/17] Make ONNX model paths configurable in mmdetection.py

---
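Reviewer note: with this patch, onnx_encoder_file and onnx_decoder_file
default to None and are passed straight to load_my_onnx. The sketch below is
not part of this patch. It shows one possible way to fail early with a
readable message before any ONNX Runtime session is created.
`check_onnx_paths` is a hypothetical helper name, and where it would be
called from is an assumption.

```python
import os
from typing import Optional, Tuple


def check_onnx_paths(
    encoder_file: Optional[str], decoder_file: Optional[str]
) -> Tuple[str, str]:
    """Validate both ONNX model paths and return them as absolute paths."""
    checked = []
    for name, path in (
        ("onnx_encoder_file", encoder_file),
        ("onnx_decoder_file", decoder_file),
    ):
        if not path:
            # Neither option has a usable default, so require it explicitly.
            raise ValueError(f"{name} must be set when onnx=True")
        if not os.path.isfile(path):
            raise FileNotFoundError(f"{name} does not point to a file: {path}")
        checked.append(os.path.abspath(path))
    return checked[0], checked[1]
```

A call such as `check_onnx_paths(onnx_encoder_file, onnx_decoder_file)` at
the top of the `if self.onnx:` branch would be one place to use it.
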
 label_anything/sam/mmdetection.py | 87 ++++++++++++++++++++-----------
 1 file changed, 57 insertions(+), 30 deletions(-)

diff --git a/label_anything/sam/mmdetection.py b/label_anything/sam/mmdetection.py
index 75653633..708fb54c 100644
--- a/label_anything/sam/mmdetection.py
+++ b/label_anything/sam/mmdetection.py
@@ -23,29 +23,24 @@
 from segment_anything.utils.transforms import ResizeLongestSide
 import random
 import string
-import time
-import onnxruntime
-
-
 logger = logging.getLogger(__name__)
 
-def load_my_model(device="cuda:0",sam_config="vit_b",sam_checkpoint_file="sam_vit_b_01ec64.pth"):
-        """
-        Loads the Segment Anything model on initializing Label studio, so if you call it outside MyModel it doesn't load every time you try to make a prediction
-        Returns the predictor object. For more, look at Facebook's SAM docs
-        """
-        sam = sam_model_registry[sam_config](checkpoint=sam_checkpoint_file)
-        sam.to(device=device)
-        predictor = SamPredictor(sam)
-        return predictor
 
+import onnxruntime
+import time
 
-def load_my_onnx(onnx_config:dict):
+def load_my_onnx(encoder_model_abs_path,decoder_model_abs_path):
     # !wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/encoder.onnx
     # !wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/decoder.onnx
-    encoder_model_abs_path = "./encoder.onnx"
-    decoder_model_abs_path = "./decoder.onnx"
-
+    # if onnx_config == 'vit_b':
+    #     encoder_model_abs_path = "models/segment_anything_vit_b_encoder_quant.onnx"
+    #     decoder_model_abs_path = "models/segment_anything_vit_b_decoder_quant.onnx"
+    # elif onnx_config == 'vit_l':
+    #     encoder_model_abs_path = "models/segment_anything_vit_l_encoder_quant.onnx"
+    #     decoder_model_abs_path = "models/segment_anything_vit_l_decoder_quant.onnx"
+    # elif onnx_config == 'vit_h':
+    #     encoder_model_abs_path = "models/segment_anything_vit_h_encoder_quant.onnx"
+    #     decoder_model_abs_path = "models/segment_anything_vit_h_decoder_quant.onnx"
 
     providers = onnxruntime.get_available_providers()
     if providers:
@@ -62,6 +57,19 @@ def load_my_onnx(onnx_config:dict):
         )
 
     return encoder_session,decoder_session
+ 
+
+def load_my_model(device="cuda:0",sam_config="vit_b",sam_checkpoint_file="sam_vit_b_01ec64.pth"):
+        """
+        Loads the Segment Anything model on initializing Label studio, so if you call it outside MyModel it doesn't load every time you try to make a prediction
+        Returns the predictor object. For more, look at Facebook's SAM docs
+        """
+        sam = sam_model_registry[sam_config](checkpoint=sam_checkpoint_file)
+        sam.to(device=device)
+        predictor = SamPredictor(sam)
+        return predictor
+
+
 
 class MMDetection(LabelStudioMLBase):
     """Object detector based on https://github.com/open-mmlab/mmdetection."""
@@ -79,22 +87,27 @@ def __init__(self,
                  score_threshold=0.5,
                  device='cpu',
                  onnx=False,
+                 onnx_encoder_file=None,
+                 onnx_decoder_file=None,
                  **kwargs):
 
         super(MMDetection, self).__init__(**kwargs)
+
         self.onnx=onnx
         if self.onnx:
-            PREDICTOR=load_my_onnx(device)
+            PREDICTOR=load_my_onnx(onnx_encoder_file,onnx_decoder_file)
         else:
-            PREDICTOR=load_my_model(device,sam_config,sam_checkpoint_file)
-
-        
+            PREDICTOR=load_my_model(device,sam_config,sam_checkpoint_file)
         self.PREDICTOR = PREDICTOR
 
         self.out_mask = out_mask
         self.out_bbox = out_bbox
         self.out_poly = out_poly
 
+        # config_file = config_file or os.environ['config_file']
+        # checkpoint_file = checkpoint_file or os.environ['checkpoint_file']
+        # self.config_file = config_file
+        # self.checkpoint_file = checkpoint_file
         self.labels_file = labels_file
         # default Label Studio image upload folder
         upload_dir = os.path.join(get_data_dir(), 'media', 'upload')
@@ -106,6 +119,8 @@ def __init__(self,
         else:
             self.label_map = {}
 
+        # self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(  # noqa E501
+        #     self.parsed_label_config, 'RectangleLabels', 'Image')
 
         self.labels_in_config = dict(
                 label=self.parsed_label_config['KeyPointLabels']
@@ -160,6 +175,7 @@ def __init__(self,
         # self.model = init_detector(config_file, checkpoint_file, device=device)
         self.score_thresh = score_threshold
 
+####################################################################################################
 
     def pre_process(self, image):
         image_size = 1024
@@ -187,8 +203,6 @@ def run_encoder(self, encoder_inputs):
         image_embedding = output[0]
         return image_embedding
 
-
-
     def run_decoder(
         self, image_embedding, input_prompt,img_size):
         (original_height,original_width)=img_size
@@ -228,9 +242,11 @@ def run_decoder(
             ),
         }
         masks, _, _ = self.decoder_session.run(None, decoder_inputs)
+        # masks = masks[0, 0, :, :]  # Only get 1 mask
         masks = masks > 0.0
-
+        # masks = masks.reshape(img_size)
         return masks
+##########################################################################################
 
     def _get_image_url(self, task):
         image_url = task['data'].get(
@@ -255,7 +271,7 @@ def _get_image_url(self, task):
         return image_url
 
     def predict(self, tasks, **kwargs):
-
+        # Shared section (used by both the ONNX and PyTorch paths)
         start = time.time()
         results = []
         assert len(tasks) == 1
@@ -266,12 +282,13 @@ def predict(self, tasks, **kwargs):
         if kwargs.get('context') is None:
             return []
         
+        # image = cv2.imread(f"./{split}")
         image = cv2.imread(image_path)
         image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
         prompt_type = kwargs['context']['result'][0]['type']
         original_height = kwargs['context']['result'][0]['original_height']
         original_width = kwargs['context']['result'][0]['original_width']
-
+        #############################################
         if self.onnx:
             self.encoder_session,self.decoder_session=self.PREDICTOR
             encoder_inputs,_ = self.pre_process(image)
@@ -298,6 +315,7 @@ def predict(self, tasks, **kwargs):
 
                 output_label = kwargs['context']['result'][0]['value']['rectanglelabels'][0]
             
+            
                 input_prompt['boxes']=np.array([x, y, x+w, y+h])
 
                 input_prompt['label'] = np.array([2,3])
@@ -308,10 +326,16 @@ def predict(self, tasks, **kwargs):
             masks = self.run_decoder(image_embedding,input_prompt,\
                                      (original_height,original_width))
             masks = masks[0].astype(np.uint8)
+            # mask = masks.astype(np.uint8)
+            # shapes = self.post_process(masks, resized_ratio)
 
         else:
             predictor = self.PREDICTOR
+
             predictor.set_image(image)
+            
+
+
 
             if prompt_type == 'keypointlabels':
                 # getting x and y coordinates of the keypoint
@@ -344,17 +368,20 @@ def predict(self, tasks, **kwargs):
                     multimask_output=False,
                 )
 
+
+            
+
+        # Find contours
         mask = masks[0].astype(np.uint8) # each mask has shape [H, W]
         # converting the mask from the model to RLE format which is usable in Label Studio
-
-        # 找到轮廓
         contours, hierarchy = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
-
         end = time.time()
         print(end-start)
-
+########################
 
         # 计算外接矩形
+
+
         if self.out_bbox:
             new_contours = []
             for contour in contours:

From 4cb1f9aded6ab7a17f679ecc3d2712a8a3c73bc9 Mon Sep 17 00:00:00 2001
From: Ma Zhiming <101508488+JimmyMa99@users.noreply.github.com>
Date: Thu, 18 May 2023 11:13:41 +0800
Subject: [PATCH 17/17] Update readme_zh.md

---
 label_anything/readme_zh.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/label_anything/readme_zh.md b/label_anything/readme_zh.md
index a6f11c89..b667f7f5 100644
--- a/label_anything/readme_zh.md
+++ b/label_anything/readme_zh.md
@@ -394,6 +394,7 @@ python tools/test.py data/my_set/mask-rcnn_r50_fpn.py path/of/your/checkpoint --
 cd path/to/playground/label_anything
 wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/encoder.onnx
 wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/decoder.onnx
+#其他版本可以在 https://github.com/vietanhdev/anylabeling-assets/releases/tag/v0.2.0 下载
 ```
 
 接着开启后端推理。
@@ -402,17 +403,16 @@ wget https://huggingface.co/visheratin/segment-anything-vit-b/resolve/main/decod
 cd path/to/playground/label_anything
 
 label-studio-ml start sam --port 8003 --with \
-sam_config=vit_b \
-sam_checkpoint_file=./sam_vit_b_01ec64.pth \
 out_mask=True \
 out_bbox=True \
 device=cuda:0 \
 onnx=True \
+onnx_encoder_file='encoder.onnx' \
+onnx_decoder_file='decoder.onnx'
 # device=cuda:0 为使用 GPU 推理，如果使用 cpu 推理，将 cuda:0 替换为 cpu
 # out_poly=True 返回外接多边形的标注
 ```
 
-⚠目前仅支持 sam_vit_b。