---
layout: hub_detail
background-class: hub-background
body-class: hub
title: Transformer (NMT)
summary: Transformer models for English-French and English-German translation
category: researchers
image: fairseq_logo.png
author: Facebook AI (fairseq Team)
tags: [nlp]
github-link: https://github.com/pytorch/fairseq/
github-id: pytorch/fairseq
featured_image_1: no-image
featured_image_2: no-image
accelerator: cuda-optional
order: 2
demo-model-link: https://huggingface.co/spaces/pytorch/Transformer_NMT
---

λͺ¨λΈ μ„€λͺ…

The Transformer, introduced in the paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762), is a powerful sequence-to-sequence modeling architecture that enables state-of-the-art neural machine translation systems.
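As a rough illustration of the architecture's encoder-decoder shape, PyTorch itself ships a generic `torch.nn.Transformer` module. The sizes below are toy values chosen only for the sketch; the fairseq checkpoints used in this document wrap much larger, trained variants:

```python
import torch
import torch.nn as nn

# A tiny, untrained Transformer: 2 encoder and 2 decoder layers,
# model dimension 32, 4 attention heads.
model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(10, 1, 32)  # (source length, batch, d_model)
tgt = torch.rand(7, 1, 32)   # (target length, batch, d_model)

# The decoder output has the target sequence's shape.
out = model(src, tgt)
assert out.shape == (7, 1, 32)
```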

Recently, the fairseq team has improved on previous translation quality through large-scale semi-supervised training of Transformers using back-translated data. More details can be found in this blog post.

### Requirements

A few extra Python packages are required for preprocessing:

```bash
pip install bitarray fastBPE hydra-core omegaconf regex requests sacremoses subword_nmt
```

μ˜μ–΄ ➑️ ν”„λž‘μŠ€μ–΄ λ²ˆμ—­

μ˜μ–΄λ₯Ό ν”„λž‘μŠ€μ–΄λ‘œ λ²ˆμ—­ν•˜κΈ° μœ„ν•΄ Scaling Neural Machine Translation λ…Όλ¬Έμ˜ λͺ¨λΈμ„ ν™œμš©ν•©λ‹ˆλ‹€:

```python
import torch

# Load an English-to-French Transformer model trained on WMT'14 data:
en2fr = torch.hub.load('pytorch/fairseq', 'transformer.wmt14.en-fr', tokenizer='moses', bpe='subword_nmt')

# Use the GPU (optional):
en2fr.cuda()

# Translate with beam search:
fr = en2fr.translate('Hello world!', beam=5)
assert fr == 'Bonjour à tous !'

# Manually tokenize:
en_toks = en2fr.tokenize('Hello world!')
assert en_toks == 'Hello world !'

# Manually apply BPE:
en_bpe = en2fr.apply_bpe(en_toks)
assert en_bpe == 'H@@ ello world !'

# Manually binarize:
en_bin = en2fr.binarize(en_bpe)
assert en_bin.tolist() == [329, 14044, 682, 812, 2]

# Generate five translations with top-k sampling:
fr_bin = en2fr.generate(en_bin, beam=5, sampling=True, sampling_topk=20)
assert len(fr_bin) == 5

# Convert one of the samples to a string and detokenize it
fr_sample = fr_bin[0]['tokens']
fr_bpe = en2fr.string(fr_sample)
fr_toks = en2fr.remove_bpe(fr_bpe)
fr = en2fr.detokenize(fr_toks)
assert fr == en2fr.decode(fr_sample)
```

μ˜μ–΄ ➑️ 독일어 λ²ˆμ—­

Semi-supervised training with back-translation is an effective way of improving translation systems. In the paper [Understanding Back-Translation at Scale](https://arxiv.org/abs/1808.09381), more than 200 million German sentences are back-translated for use as additional training data. An ensemble of five of these models was the winning submission to the WMT'18 English-German news translation competition.
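Back-translation pairs each monolingual target-side sentence with a machine-generated source sentence. A minimal sketch of that data construction, with a hypothetical stub in place of a real German-to-English model so it runs offline (a real pipeline would load a fairseq hub model and call its `translate` method):

```python
# `De2EnStub` is a hypothetical stand-in for a trained German-to-English
# model; a real pipeline would load one via torch.hub instead.
class De2EnStub:
    def translate(self, sentence):
        # A real model would return an English translation here.
        return '<synthetic English for: %s>' % sentence

def back_translate(monolingual_de, de2en):
    # Pair each monolingual German sentence with a synthetic English
    # source, yielding (source, target) training pairs.
    return [(de2en.translate(de), de) for de in monolingual_de]

pairs = back_translate(['Hallo Welt !', 'Guten Morgen .'], De2EnStub())
assert len(pairs) == 2
assert pairs[0][1] == 'Hallo Welt !'
```

The synthetic pairs are then mixed with the genuine parallel data when training the forward (English-to-German) model.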

This approach can be improved further with noisy-channel reranking. More details can be found in this blog post. An ensemble of models trained with this technique was the winning submission to the WMT'19 English-German news translation competition.

To translate English to German using one of the competition-winning models introduced above:

```python
import torch

# Load an English-to-German Transformer model trained on WMT'19 data:
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model', tokenizer='moses', bpe='fastbpe')

# Access the underlying Transformer model
assert isinstance(en2de.models[0], torch.nn.Module)

# Translate from English to German
de = en2de.translate('PyTorch Hub is a pre-trained model repository designed to facilitate research reproducibility.')
assert de == 'PyTorch Hub ist ein vorgefertigtes Modell-Repository, das die Reproduzierbarkeit der Forschung erleichtern soll.'
```

κ΅μ°¨λ²ˆμ—­μœΌλ‘œ 같은 λ¬Έμž₯에 λŒ€ν•œ μ˜μ—­μ„ λ§Œλ“€ μˆ˜λ„ μžˆμŠ΅λ‹ˆλ‹€:

# μ˜μ–΄ ↔️ 독일어 κ΅μ°¨λ²ˆμ—­:
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model', tokenizer='moses', bpe='fastbpe')
de2en = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.de-en.single_model', tokenizer='moses', bpe='fastbpe')

paraphrase = de2en.translate(en2de.translate('PyTorch Hub is an awesome interface!'))
assert paraphrase == 'PyTorch Hub is a fantastic interface!'

# μ˜μ–΄ ↔️ λŸ¬μ‹œμ•„μ–΄ κ΅μ°¨λ²ˆμ—­κ³Ό 비ꡐ:
en2ru = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-ru.single_model', tokenizer='moses', bpe='fastbpe')
ru2en = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.ru-en.single_model', tokenizer='moses', bpe='fastbpe')

paraphrase = ru2en.translate(en2ru.translate('PyTorch Hub is an awesome interface!'))
assert paraphrase == 'PyTorch is a great interface!'

### References