---
layout: hub_detail
background-class: hub-background
body-class: hub
title: FCN
summary: Fully-Convolutional Network model with ResNet-50 and ResNet-101 backbones
category: researchers
image: fcn2.png
author: Pytorch Team
tags: [vision, scriptable]
github-id: pytorch/vision
featured_image_1: deeplab1.png
featured_image_2: fcn2.png
accelerator: cuda-optional
order: 10
---
```python
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'fcn_resnet50', pretrained=True)
# or
# model = torch.hub.load('pytorch/vision:v0.10.0', 'fcn_resnet101', pretrained=True)
model.eval()
```

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (N, 3, H, W), where N is the number of images and H and W are expected to be at least 224 pixels. The images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. The model returns an OrderedDict with two Tensors that have the same height and width as the input Tensor but 21 classes. output['out'] contains the semantic masks, and output['aux'] contains the auxiliary loss values per pixel. In inference mode, output['aux'] is not useful, so output['out'] is of shape (N, 21, H, W). Additional documentation can be found here.
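As a quick sanity check, the normalization described above can be reproduced by hand. This is a minimal sketch assuming only PyTorch is installed; the arithmetic is the same as `transforms.Normalize` with these statistics:

```python
import torch

# Per-channel statistics the pre-trained models expect, shaped for broadcasting.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

img = torch.rand(3, 224, 224)    # stand-in image, already scaled to [0, 1]
normalized = (img - mean) / std  # same arithmetic transforms.Normalize applies
batch = normalized.unsqueeze(0)  # (N, 3, H, W) mini-batch with N = 1
print(batch.shape)               # torch.Size([1, 3, 224, 224])
```

Undoing the normalization (`batch[0] * std + mean`) recovers the original `[0, 1]` image, which is a handy check when debugging a preprocessing pipeline.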

# ํŒŒ์ดํ† ์น˜ ์›น์‚ฌ์ดํŠธ์—์„œ ์˜ˆ์ œ ์ด๋ฏธ์ง€ ๋‹ค์šด๋กœ๋“œ
import urllib
url, filename = ("https://github.com/pytorch/hub/raw/master/images/deeplab1.png", "deeplab1.png")
try: urllib.URLopener().retrieve(url, filename)
except: urllib.request.urlretrieve(url, filename)
# ์‹คํ–‰ ์˜ˆ์‹œ (torchvision ํ•„์š”)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
input_image = input_image.convert("RGB")
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # ๋ชจ๋ธ์—์„œ ์š”๊ตฌ๋˜๋Š” ๋ฏธ๋‹ˆ๋ฐฐ์น˜ ์ƒ์„ฑ

# ๊ฐ€๋Šฅํ•˜๋‹ค๋ฉด ์†๋„๋ฅผ ์œ„ํ•ด ์ž…๋ ฅ๊ณผ ๋ชจ๋ธ์„ GPU๋กœ ์˜ฎ๊น๋‹ˆ๋‹ค
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')

with torch.no_grad():
    output = model(input_batch)['out'][0]
output_predictions = output.argmax(0)

์—ฌ๊ธฐ์„œ์˜ ์ถœ๋ ฅ ํ˜•ํƒœ๋Š” (21, H, W)์ด๋ฉฐ, ๊ฐ ์œ„์น˜์—๋Š” ๊ฐ ํด๋ž˜์Šค์˜ ์˜ˆ์ธก์— ํ•ด๋‹นํ•˜๋Š” ์ •๊ทœํ™”๋˜์ง€ ์•Š์€ ํ™•๋ฅ ์ด ์žˆ์Šต๋‹ˆ๋‹ค. ๊ฐ ํด๋ž˜์Šค์˜ ์ตœ๋Œ€ ์˜ˆ์ธก์„ ๊ฐ€์ ธ์˜จ ๋‹ค์Œ ์ด๋ฅผ ๋‹ค์šด์ŠคํŠธ๋ฆผ ์ž‘์—…์— ์‚ฌ์šฉํ•˜๋ ค๋ฉด output_propertions = output.slmax(0)๋ฅผ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ์€ ๊ฐ ํด๋ž˜์Šค์— ํ• ๋‹น๋œ ๊ฐ ์ƒ‰์ƒ๊ณผ ํ•จ๊ป˜ ์˜ˆ์ธก์„ ํ‘œ์‹œํ•˜๋Š” ์ž‘์€ ํ† ๋ง‰๊ธ€ ์ž…๋‹ˆ๋‹ค(์™ผ์ชฝ์˜ ์‹œ๊ฐํ™” ์ด๋ฏธ์ง€ ์ฐธ์กฐ).

# ๊ฐ ํด๋ž˜์Šค์— ๋Œ€ํ•œ ์ƒ‰์ƒ์„ ์„ ํƒํ•˜์—ฌ ์ƒ‰์ƒ ํŒ”๋ ˆํŠธ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1])
colors = torch.as_tensor([i for i in range(21)])[:, None] * palette
colors = (colors % 255).numpy().astype("uint8")

# ๊ฐ ์ƒ‰์ƒ์˜ 21๊ฐœ ํด๋ž˜์Šค์˜ ์‹œ๋ฉ˜ํ‹ฑ ์„ธ๊ทธ๋ฉ˜ํ…Œ์ด์…˜ ์˜ˆ์ธก์„ ๊ทธ๋ฆผ์œผ๋กœ ํ‘œ์‹œํ•ฉ๋‹ˆ๋‹ค.
r = Image.fromarray(output_predictions.byte().cpu().numpy()).resize(input_image.size)
r.putpalette(colors)

import matplotlib.pyplot as plt
plt.imshow(r)
# plt.show()
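The `argmax` collapse used above can also be seen in isolation. In this sketch, `scores` is a random stand-in for the `(21, H, W)` tensor the model produces for one image:

```python
import torch

scores = torch.randn(21, 4, 4)  # stand-in for the (21, H, W) class scores
labels = scores.argmax(0)       # per-pixel class index, shape (H, W)
print(labels.shape)             # torch.Size([4, 4])
```

Taking the argmax over dimension 0 picks, for every pixel, the class with the highest score, so every value in `labels` is an integer in `[0, 20]`.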

๋ชจ๋ธ ์„ค๋ช…

FCN-ResNet is a Fully-Convolutional Network model built on a ResNet-50 or ResNet-101 backbone. The pre-trained models have been trained on a subset of COCO 2017, on the 20 categories that are present in the Pascal VOC dataset.

The accuracies of the pre-trained models evaluated on the COCO val2017 dataset are listed below.

| Model structure | Mean IOU | Global Pixelwise Accuracy |
| --------------- | -------- | ------------------------- |
| fcn_resnet50    | 60.5     | 91.4                      |
| fcn_resnet101   | 63.7     | 91.9                      |
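For reference, the two metrics in the table can be computed from predicted and ground-truth label maps as follows. This is an illustrative sketch with hypothetical helper names, not the evaluation script that produced the numbers above:

```python
import torch

def pixel_accuracy(pred, target):
    # fraction of pixels whose predicted class matches the ground truth
    return (pred == target).float().mean().item()

def class_iou(pred, target, cls):
    # intersection-over-union for a single class index
    inter = ((pred == cls) & (target == cls)).sum().item()
    union = ((pred == cls) | (target == cls)).sum().item()
    return inter / union if union else float('nan')

# Tiny 2x2 example: three of four pixels match, and class 1 has
# intersection 2 and union 3.
pred = torch.tensor([[0, 1], [1, 1]])
target = torch.tensor([[0, 1], [0, 1]])
print(pixel_accuracy(pred, target))  # 0.75
print(class_iou(pred, target, 1))    # 0.666...
```

Mean IOU is then the average of `class_iou` over all 21 classes (skipping classes absent from both maps).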

### Resources