---
layout: hub_detail
background-class: hub-background
body-class: hub
title: Deeplabv3
summary: DeepLabV3 models with ResNet-50, ResNet-101 and MobileNet-V3 backbones
category: researchers
image: deeplab2.png
author: Pytorch Team
github-id: pytorch/vision
featured_image_1: deeplab1.png
featured_image_2: deeplab2.png
accelerator: cuda-optional
order: 1
---
```python
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet50', pretrained=True)
# or any of these variants
# model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_resnet101', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'deeplabv3_mobilenet_v3_large', pretrained=True)
model.eval()
```
All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape `(N, 3, H, W)`, where `N` is the number of images, and `H` and `W` are expected to be at least `224` pixels. The images have to be loaded into a range of `[0, 1]` and then normalized using `mean = [0.485, 0.456, 0.406]` and `std = [0.229, 0.224, 0.225]`.
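To make the expected preprocessing concrete, the same normalization can be written out by hand with broadcasting. This is only an illustrative sketch (the tensor names are made up here); the sample execution further below does the same thing with `torchvision.transforms`:

```python
import torch

# Per-channel statistics quoted above, reshaped to broadcast over a (3, H, W) image.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

img = torch.rand(3, 224, 224)      # stand-in for an RGB image already scaled to [0, 1]
normalized = (img - mean) / std    # equivalent to transforms.Normalize(mean, std)
batch = normalized.unsqueeze(0)    # a mini-batch of shape (N, 3, H, W) with N = 1
print(batch.shape)                 # torch.Size([1, 3, 224, 224])
```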
The model returns an `OrderedDict` with two Tensors that have the same height and width as the input Tensor, but with 21 classes. `output['out']` contains the semantic masks, while `output['aux']` contains the auxiliary loss values per pixel. In inference mode, `output['aux']` is not useful. So, `output['out']` is of shape `(N, 21, H, W)`. More detailed documentation can be found in the torchvision docs.
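A quick way to see this structure is to run a placeholder batch through the model. The sketch below assumes only the `model` loaded above and uses a random tensor in place of a real, normalized image batch:

```python
import torch

dummy_batch = torch.rand(1, 3, 224, 224)  # placeholder for a normalized (N, 3, H, W) batch
with torch.no_grad():
    out = model(dummy_batch)

print(type(out).__name__)   # OrderedDict
print(list(out.keys()))     # ['out', 'aux']
print(out['out'].shape)     # torch.Size([1, 21, 224, 224])
print(out['aux'].shape)     # same spatial size; only relevant during training
```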
```python
# Download an example image from the pytorch website
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/deeplab1.png", "deeplab1.png")
try: urllib.URLopener().retrieve(url, filename)
except: urllib.request.urlretrieve(url, filename)
```
```python
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
input_image = input_image.convert("RGB")
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')

with torch.no_grad():
    output = model(input_batch)['out'][0]
output_predictions = output.argmax(0)
```
Here the output is of shape `(21, H, W)`, where at each location there are unnormalized probabilities corresponding to the prediction of each class. To get the maximum prediction of each class, and then use it for a downstream task, run `output_predictions = output.argmax(0)`.
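The class indices can then be mapped back to labels for downstream use. The list below is the standard 20 Pascal VOC categories plus background, reproduced here for illustration; it is not returned by the model itself:

```python
# Index 0 is background; the remaining 20 entries are the Pascal VOC categories.
voc_classes = [
    'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
    'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike',
    'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
]

# Count how many pixels were assigned to each predicted class.
labels, counts = output_predictions.unique(return_counts=True)
for label, count in zip(labels.tolist(), counts.tolist()):
    print(f"{voc_classes[label]}: {count} pixels")
```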
Here is a small snippet that plots the predictions, with each color assigned to each class.
```python
# create a color palette, selecting a color for each class
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1])
colors = torch.as_tensor([i for i in range(21)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")

# plot the semantic segmentation predictions of 21 classes in each color
r = Image.fromarray(output_predictions.byte().cpu().numpy()).resize(input_image.size)
r.putpalette(colors)

import matplotlib.pyplot as plt
plt.imshow(r)
# plt.show()
```
Deeplabv3-ResNet is constructed by a Deeplabv3 model using a ResNet-50 or ResNet-101 backbone. Deeplabv3-MobileNetV3-Large is constructed by a Deeplabv3 model using the MobileNetV3 large backbone. The pre-trained models have been trained on a subset of COCO train2017, on the 20 categories that are present in the Pascal VOC dataset.
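This composition is visible on the loaded model itself. The sketch below assumes the `model` loaded at the top of the page; `backbone` and `classifier` are the attribute names used by torchvision's segmentation models:

```python
# Inspect how the segmentation model is put together.
print(type(model).__name__)             # DeepLabV3
print(type(model.backbone).__name__)    # wrapper around the ResNet / MobileNetV3 feature extractor
print(type(model.classifier).__name__)  # the DeepLabV3 segmentation head

num_params = sum(p.numel() for p in model.parameters())
print(f"total parameters: {num_params:,}")
```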
The accuracies of the pre-trained models evaluated on the COCO val2017 dataset are listed below (a sketch of how these two metrics can be computed follows the table).
| Model structure              | Mean IOU | Global Pixelwise Accuracy |
| ---------------------------- | -------- | ------------------------- |
| deeplabv3_resnet50           | 66.4     | 92.4                      |
| deeplabv3_resnet101          | 67.4     | 92.4                      |
| deeplabv3_mobilenet_v3_large | 60.3     | 91.2                      |
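For reference, both metrics can be derived from a per-class confusion matrix. The sketch below is only illustrative and is not the evaluation code used to produce the numbers above:

```python
import torch

def segmentation_metrics(pred, target, num_classes=21):
    """Mean IoU and global pixelwise accuracy from integer label maps."""
    pred, target = pred.flatten(), target.flatten()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    conf = torch.bincount(target * num_classes + pred,
                          minlength=num_classes ** 2).reshape(num_classes, num_classes).float()
    intersection = conf.diag()
    union = conf.sum(0) + conf.sum(1) - intersection
    iou = intersection / union.clamp(min=1)        # per-class IoU
    mean_iou = iou[union > 0].mean()               # average over classes that actually appear
    global_acc = intersection.sum() / conf.sum()   # fraction of correctly labelled pixels
    return mean_iou.item(), global_acc.item()

# Example with random label maps; a real evaluation compares predictions
# against COCO val2017 ground-truth masks.
pred = torch.randint(0, 21, (224, 224))
target = torch.randint(0, 21, (224, 224))
print(segmentation_metrics(pred, target))
```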