Highlights

NeMo ASR

HybridTransducer-CTC ASR
Greedy timestamp decoding with inference script
MHA adapters
Conformer local attention (longformer)
High level beam search API
Multiblank transducer
Multi-channel audio processing model
AIstore for ASR datasets

NeMo Megatron

ALiBi position embeddings support for T5
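
For readers unfamiliar with ALiBi, here is a minimal sketch of the general idea: each attention head adds a linear distance penalty to its attention scores instead of using learned position embeddings. This is not NeMo's implementation. The function names are hypothetical, the head count is assumed to be a power of two, and the symmetric distance `abs(q - k)` is an assumption chosen to suit a bidirectional encoder such as T5's.

```python
def alibi_slopes(num_heads):
    # Geometric sequence of per-head slopes: 2^(-8/n), 2^(-16/n), ...
    # Assumes num_heads is a power of two (hypothetical helper, not a NeMo API).
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (head + 1) for head in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    # bias[h][q][k] is added to the attention score of query q and key k in head h.
    # The penalty grows linearly with the distance between the two positions.
    return [
        [[-slope * abs(q - k) for k in range(seq_len)] for q in range(seq_len)]
        for slope in alibi_slopes(num_heads)
    ]


bias = alibi_bias(num_heads=8, seq_len=4)
print(bias[0][0])  # penalties seen by the first query position in the first head
```

Heads with larger slopes focus on nearby tokens, while heads with smaller slopes can attend further away.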

NeMo TTS

Chinese TTS pipeline with polyphone disambiguation

NeMo Core

Optimizer based EMA
MLFlow logger support
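
As a rough illustration of what "optimizer based EMA" can mean, the sketch below keeps an exponential moving average of the weights and updates it inside the optimizer step rather than in a separate callback. The class `EmaSgdSketch` and its parameters are hypothetical and use plain Python floats; this is not the NeMo implementation.

```python
class EmaSgdSketch:
    """Toy SGD optimizer that also tracks an EMA copy of the weights (hypothetical)."""

    def __init__(self, params, lr=0.1, decay=0.999):
        self.params = list(params)      # live training weights
        self.ema_params = list(params)  # smoothed copy, e.g. for evaluation
        self.lr = lr
        self.decay = decay

    def step(self, grads):
        for i, grad in enumerate(grads):
            # Regular SGD update of the live weight.
            self.params[i] -= self.lr * grad
            # EMA update performed in the same step, using the new weight.
            self.ema_params[i] = (
                self.decay * self.ema_params[i] + (1.0 - self.decay) * self.params[i]
            )


opt = EmaSgdSketch([1.0, -2.0], lr=0.5, decay=0.9)
opt.step([1.0, 0.0])
print(opt.params, opt.ema_params)
```

Tying the average to the optimizer step keeps the live and smoothed weights in sync without a separate pass over the model.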

Models

stt_eo_conformer_ctc_large (HF, NGC) Esperanto ASR model.
stt_eo_conformer_transducer_large (HF, NGC) Esperanto ASR model.

Detailed Changelogs

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.12

ASR

Changelog

optimized loop and bugfix by @Jorjeous :: PR: #5573
Update torchmetrics by @nithinraok :: PR: #5566
Add an option to defer data setup from init to setup by @anteju :: PR: #5569
AIStore for ASR datasets by @anteju :: PR: #5462
Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
Update documentation and tutorials for Adapters by @titu1994 :: PR: #5610
Conformer local attention by @sam1373 :: PR: #5525
Add core classes and functions for online clustering diarizer part 1 by @tango4j :: PR: #5526
[Add] ASR+VAD Inference Pipeline by @stevehuang52 :: PR: #5575
[ASR] Audio processing base, multi-channel enhancement models by @anteju :: PR: #5356
Expose ClusteringDiarizer device by @SeanNaren :: PR: #5681
Add Beam Search support to ASR transcribe() by @titu1994 :: PR: #5443
Multiblank Transducer by @hainan-xv :: PR: #5527
pin torchmetrics version by @nithinraok :: PR: #5720
Update torchaudio dependency version for tutorials by @titu1994 :: PR: #5781
update torchmetrics to latest version by @nithinraok :: PR: #5801
Fix transducer and question answering tutorial bugs by @Zhilin123 :: PR: #5809
[BugFix] Updated CTC decoders installation in tutorial by @vsl9 :: PR: #5833
update torchmetrics args confusionmatrix by @nithinraok :: PR: #5853
indentation fix by @nithinraok :: PR: #5861
Fix wrong label mapping in batch_inference for label_model by @fayejf :: PR: #5767

TTS

Changelog

Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
[TTS] fix ranges of char set for accented letters. by @XuesongYang :: PR: #5607
[TTS] add type hints and change variable names for tokenizers and g2p by @XuesongYang :: PR: #5602
Fixed RadTTS unit test by @borisfom :: PR: #5572
[TTS][ZH] Disambiguate polyphones with augmented dict and Jieba segmenter for Chinese FastPitch by @yuekaizhang :: PR: #5541
Add duration padding support for RADTTS inference by @kevjshih :: PR: #5650
[TTS] add tts dict cust notebook by @ekmb :: PR: #5662
[TN/TTS docs] TN customization, g2p docs moved to tts by @ekmb :: PR: #5683
typo and link fixed by @ekmb :: PR: #5741
link fixed by @ekmb :: PR: #5745
Update Tacotron2 NGC checkpoint load to latest version by @redoctopus :: PR: #5760
Docs g2p update by @ekmb :: PR: #5769
[TTS][ZH] bugfix import jieba errors. by @XuesongYang :: PR: #5776

NLP / NMT

Changelog

Text generation improvement (UI client, data parallel support) by @yidong72 :: PR: #5437
O2 style amp for gpt3 ptuning by @JimmyZhang12 :: PR: #5246
Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
Bert interleaved by @shanmugamr1992 :: PR: #5556
Port stateless timer to exp manager by @MaximumEntropy :: PR: #5584
Add interface for making amax reduction optional for FP8 by @ksivaman :: PR: #5447
Propagate attention_dropout flag for GPT-3 by @mikolajblaz :: PR: #5669
Enc-Dec model size reporting fixes by @MaximumEntropy :: PR: #5623
Add prompt learning tests by @arendu :: PR: #5649
Fix missing torchelastic fixes for PTL 1.8 by @MaximumEntropy :: PR: #5691
ALiBi Positional Embeddings by @michalivne :: PR: #5467
Megatron export triton update by @Davood-M :: PR: #5766
Fix transducer and question answering tutorial bugs by @Zhilin123 :: PR: #5809
Update description for question answering tutorial by @Zhilin123 :: PR: #5814
TPMLP for T5-based models by @Davood-M :: PR: #5840
Megatron positional encoding alibi fix by @michalivne :: PR: #5808

Export

Changelog

Add keep_initializers_as_inputs to _export method by @pks :: PR: #5731
Megatron export triton update by @Davood-M :: PR: #5766

General Improvements

Changelog

Update to pytorch 22.12 container by @ericharper :: PR: #5694
optimized loop and bugfix by @Jorjeous :: PR: #5573
Expose ClusteringDiarizer device by @SeanNaren :: PR: #5681
remove useless files. by @XuesongYang :: PR: #5580
[Fix] setup_multiple validation/test data by @anteju :: PR: #5585
Move to optimizer based EMA implementation by @SeanNaren :: PR: #5169
[Temp workaround] Disable test with cache_audio to unblock CI by @anteju :: PR: #5615
[EMA] Change success message to reduce confusion by @SeanNaren :: PR: #5621
Temporarily disable prompt learning CI tests by @ericharper :: PR: #5633
[Dockerfile] Remove AIS archive from docker image by @anteju :: PR: #5629
[workflow] add exclude labels option to ignore cherry-picks in releas… by @XuesongYang :: PR: #5645
Add DLLogger support to exp_manager by @milesial :: PR: #5658
Fix EMA restart by allowing device to be set by the class init by @SeanNaren :: PR: #5668
Remove SDP (moved to separate repo) - merge to main by @erastorgueva-nv :: PR: #5630
temp disable speaker recognition CI test by @fayejf :: PR: #5696
Don't print exp_manager warning when max_steps == -1 by @milesial :: PR: #5725
Add tabular data generation documents to the index file by @yidong72 :: PR: #5733
fix token id bug by @yidong72 :: PR: #5777
Update numpy requirements from 1.21 to 1.22 by @Zhilin123 :: PR: #5785
Fix setuptools to usable version by @titu1994 :: PR: #5798
add apt-get upgrade -y in dockerfile by @fayejf :: PR: #5817
Update NeMo Multi-Run docs by @titu1994 :: PR: #5844
add ambernet to readme by @fayejf :: PR: #5872
update apex install instructions for 1.15 by @ericharper :: PR: #5901

NVIDIA Neural Modules 1.15.0
