04 Aug 19:50

ericharper

2baef81

NVIDIA Neural Modules 1.20.0

Highlights

Models

STT En Fast Conformer CTC XXLarge - 1.2 B param Fast Conformer CTC
STT En Fast Conformer Transducer XXLarge - 1.2 B param Fast Conformer Transducer
STT En Fast Conformer Transducer XLarge - XLarge Fast Conformer English
STT En Fast Conformer CTC XLarge - XLarge Fast Conformer CTC
STT En Fast Conformer Transducer XLarge - XLarge Fast Conformer Transducer
STT En Fast Conformer CTC Large - Large Fast Conformer CTC
STT En Fast Conformer Transducer Large - Large Fast Conformer Transducer
STT It Fast Conformer Hybrid Large P&C - Large P&C Italian Fast Conformer
STT Ua Fast Conformer Hybrid Large P&C - Large Ukrainian Fast Conformer

NeMo ASR

Graph-RNN-T #6168
WildCard-RNN-T #6168
Confidence Ensembles for ASR
Token-and-Duration Transducer (TDT) #6536
Spellchecking ASR #6179
Numba FP16 RNNT Loss #6991

NeMo TTS

TTS Adapter Customization
TTS Dataloader Framework

NeMo Framework

LoRA for T5 and mT5 #6612
Flash Attention integration #6666
Mosaic 7B compatibility
Models with LongContext (32K) #6666, #6687, #6773

NeMo Tools

Speech Data Explorer: Utterance level ASR model comparison #6669
Speech Data Processor: Spanish P&C
NeMo Forced Aligner: Large sequence alignment + memory reduction #6695

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.06

Detailed Changelogs

ASR

Changelog

[ASR] Adding ssl config for fast-conformer by @krishnacpuvvada :: PR: #6672
Fix for interctc test random failure by @Kipok :: PR: #6644
sharded manifests docs by @bmwshop :: PR: #6751
[TTS] Implement new vocoder dataset by @rlangman :: PR: #6670
TDT model pull request by @hainan-xv :: PR: #6536
Spec aug fix by @tbartley94 :: PR: #6775
Support large inputs to Conformer and Fast Conformer by @bmwshop :: PR: #6556
sharded manifests updated docs by @bmwshop :: PR: #6833
added fc-xl, xxl and titanet-s models by @nithinraok :: PR: #6832
Multi-lookahead cache-aware streaming models by @VahidooX :: PR: #6711
Update transcribe_utils.py by @stevehuang52 :: PR: #6865
Fix k2 build topo helper by @artbataev :: PR: #6887
Fix transcribe_utils.py for hybrid models in partial transcribe mode by @stevehuang52 :: PR: #6899
Add hybrid model support to transcribe_speech_parallel.py by @stevehuang52 :: PR: #6906
Update Frame-VAD doc by @stevehuang52 :: PR: #6902
Make sure asr_model.change_attention_model is run if either cfg.model_path or cfg.pretrained_name is specified by @erastorgueva-nv :: PR: #6908
Update fvad doc by @stevehuang52 :: PR: #6920
Online Code Switching Dataset for ASR by @trias702 :: PR: #6579
Fix AN4 dataset links by @artbataev :: PR: #6926
Fix confidence ensembles RNNT logprobs selection logic for exclude_blank scenario by @KunalDhawan :: PR: #6937
Adding cache-aware streaming ASR checkpoints. by @VahidooX :: PR: #6940
Remove compute_on_step from metrics by @titu1994 :: PR: #6979
Hybrid conformer export by @borisfom :: PR: #6983
Cache handling without input tensors mutation by @borisfom :: PR: #6980
Fixing an issue with confidence ensembles by @Kipok :: PR: #6987
Add ASR with TTS Tutorial. Fix enhancer usage. by @artbataev :: PR: #6955
fix install_beamsearch_decoders.sh by @karpnv :: PR: #7019
Add support for Numba FP16 RNNT Loss (#6991) by @titu1994 :: PR: #7038
Fix typo and branch in tutorial by @artbataev :: PR: #7048
Refined export_config by @borisfom :: PR: #7053
Fix documentation for Numba by @titu1994 :: PR: #7065
Adding docs and models for multiple lookahead cache-aware ASR by @VahidooX :: PR: #7067
Add updated fc ctc and rnnt xxl models by @nithinraok :: PR: #7128
Update notebook branch by @ericharper :: PR: #7135
Fixed main and merging this to r1.20 by @tango4j :: PR: #7127
Fix default context size by @nithinraok :: PR: #7141
Fix incorrect embedding grads with distopt BF16 grad reductions by @timmoon10 :: PR: #6958

TTS

Changelog

[TTS] Add callback for saving audio during FastPitch training by @rlangman :: PR: #6665
[TTS] Add script for text preprocessing by @rlangman :: PR: #6541
[TTS] Fix adapter duration issue by @hsiehjackson :: PR: #6697
[TTS] Filter out silent audio files during preprocessing by @rlangman :: PR: #6716
[TTS] fix inconsistent type hints for IpaG2p by @XuesongYang :: PR: #6733
[TTS] relax hardcoded prefix for phonemes and tones and infer phoneme set through dict by @XuesongYang :: PR: #6735
[TTS] corrected misleading deprecation warnings. by @XuesongYang :: PR: #6702
Fix TTS adapter tutorial by @hsiehjackson :: PR: #6741
[TTS][zh] refine hardcoded lowercase for ASCII letters. by @XuesongYang :: PR: #6781
[TTS] Append pretrained FastPitch & SpectrogramEnhancer pair to available models by @racoiaws :: PR: #7012

NLP / NMT

Changelog

minor fix for missing chat attr by @arendu :: PR: #6671
eval fix by @arendu :: PR: #6685
VP Fixes for converter + Config management by @titu1994 :: PR: #6698
lora notebook by @arendu :: PR: #6765
peft eval directly from ckpt by @arendu :: PR: #6785
GPT inference long context by @ekmb :: PR: #6687
Fix validation with drop_last=False by @mikolajblaz :: PR: #6704
fix spellmapper tutorial, change branch to main by @bene-ges :: PR: #6803
text_generation_utils memory reduction if no logprob needed by @yzhang123 :: PR: #6773
Add optional index mapping dir in mmap text datasets by @gheinrich :: PR: #6683
Add inference kv cache support for transformer TE path by @yen-shi :: PR: #6627
add reference to our paper by @bene-ges :: PR: #6821
added changes to ramp up bs by @dimapihtar :: PR: #6799
t5 lora tuning by @arendu :: PR: #6612
Added rouge monitoring support for T5 by @jubick1337 :: PR: #6737
GPT extrapolatable position embedding (xpos/sandwich/alibi/kerple) and Flash Attention by @hsiehjackson :: PR: #6666
Import Enum for chatbot component by @ericharper :: PR: #6877
typo fix from #6666 by @arendu :: PR: #6882
removed unnecessary print by @dimapihtar :: PR: #6884
Fix destructor for delayed mmap dataset case by @mikolajblaz :: PR: #6703
Make Gradio library optional by @yidong72 :: PR: #6904
Fix fast-glu activation in change partitions by @hsiehjackson :: PR: #6909
Documentation for ONNX export of Megatron Models by @asfiyab-nvidia :: PR: #6914
Fix TextMemMapDataset index file creation in multi-node setup by @gheinrich :: PR: #6768
Fix flash-attention by @hsiehjackson :: PR: #6901
ptuning oom fix by @arendu :: PR: #6916
add rampup bs assertion by @dimapihtar :: PR: #6927
Enable methods in bert-like models by @sararb :: PR: #6898
support value attribution condition by @yidong72 :: PR: #6934
Add missing save restore connector to eval scripts by @titu1994 :: PR: #6935
Merge release r1.19.0 into main by @ericharper :: PR: #6948
Stop at the stop token by @yidong72 :: PR: #6957
fixes for spellmapper by @bene-ges :: PR: #6994
Fix tabular data text generation by @yidong72 :: PR: #7022
fix pos id - hf update by @ekmb :: PR: #7075
fix syntax error introduced in PR-7079 by @bene-ges :: PR: #7102

NeMo Tools

Changelog

SDE utterance-level comparison by @Jorjeous :: PR: #6669
hot fix SDE by @Jorjeous :: PR: #6897

Bugfixes

Changelog

small Bugfix by @fayejf :: PR: #7079
Fix caching bug in causal convolutions for cache-aware ASR models by @VahidooX :: PR: #7034
Fix masking bug for TTS Aligner by @redoctopus :: PR: #6677
[bugfix] avoid the random shuffle of phoneme and tone tokens. by @XuesongYang :: PR: #6855
fix ptuning residuals bug by @arendu :: PR: #6866
TE bug fix by @dimapihtar :: PR: #7027
Update distopt API for coalesced NCCL calls by @timmoon10 :: PR: #6886

General Improvements

Changelog

update batch size recommendation to min 32 for 43b by @Zhilin123 :: PR: #6675
Make Note usage consistent in adapter_mixins.py by @BrianMcBrayer :: PR: #6678
Update all invalid tree references to blobs for NeMo samples by @BrianMcBrayer :: PR: #6679
Update README.rst about container by @fayejf :: PR: #6686
karpnv/issues6690 by @karpnv :: PR: #6705
Limit codeql scope by @titu1994 :: PR: #6710
Not pinning Gradio version by @yidong72 :: PR: #6680
preprocess squad in sft format by @arendu :: PR: #6727
Fix Codeql config by @titu1994 :: PR: #6731
Fix fastpitch test nightly by @hsiehjackson :: PR: #6730
Lora/PEFT training script CI test by @arendu :: PR: #6664
fixed decor to show messages only when the wrapped object is called. by @XuesongYang :: PR: #6793
lora pp2 by @arendu :: PR: #6818
Upperbound Numpy to < 1.24 by @titu1994 :: PR: #6829
Fix typo in documentation by @Dounx :: PR: #6838
NFA updates by @erastorgueva-nv :: PR: #6695
Update container for import action by @ericharper :: PR: #6883
removed some tests by @arendu :: PR: #6900
Update contai...

Contributors

bmwshop, karpnv, and 39 other contributors

13 Jul 20:42

ericharper

v1.19.1

55f8ce5

NVIDIA Neural Modules 1.19.1

This release is a small patch that fixes a torchmetrics compatibility issue.

Remove deprecated arg compute_on_step. See #6979.
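The note above says only that the deprecated compute_on_step argument was removed. As a rough illustration of that kind of change (not the code from #6979), the sketch below drops the deprecated keyword before a metric is constructed. DummyMetric, build_metric, and strip_deprecated_kwargs are hypothetical names invented for this example, and the block runs without torchmetrics installed.

```python
from typing import Any, Callable, Dict

# Keyword arguments treated as deprecated in this sketch (assumption for illustration).
DEPRECATED_KWARGS = ("compute_on_step",)


def strip_deprecated_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without the deprecated metric arguments."""
    return {k: v for k, v in kwargs.items() if k not in DEPRECATED_KWARGS}


class DummyMetric:
    """Hypothetical stand-in for a metric class that rejects unknown kwargs."""

    def __init__(self, dist_sync_on_step: bool = False) -> None:
        self.dist_sync_on_step = dist_sync_on_step


def build_metric(metric_cls: Callable[..., Any], **kwargs: Any) -> Any:
    """Construct a metric after filtering out deprecated keyword arguments."""
    return metric_cls(**strip_deprecated_kwargs(kwargs))


# Passing compute_on_step directly to DummyMetric would raise a TypeError;
# going through build_metric drops it first.
metric = build_metric(DummyMetric, dist_sync_on_step=True, compute_on_step=False)
```

Removing the argument at every call site is the cleaner long-term fix; a filter like this one is only useful while older configurations still pass the keyword.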

15 Jun 23:46

ericharper

v1.19.0

2331b06

NVIDIA Neural Modules 1.19.0

Highlights

NeMo ASR

Sharded Manifests for Tarred Datasets #6395
Frame-VAD model + datasets support #6441
Noise Norm Perturbation #6445
Code Switched Dataset with IID Sampling #6448

NeMo TTS

Speaker adaptation for FastPitch #6416, #6417

NeMo Megatron

Batch size rampup #6424
Unify dataset and model classes for all PEFT #6391
LoRA for GPT #6391
Convert interleaved pipeline model to non-interleaved #6498
Dialog Dataset for SFT #6654
Dynamic length batches for GPT SFT #6510
Merge LoRA weights into base model #6597
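Merging LoRA weights generally means folding the low-rank update back into the frozen weight matrix, so inference needs no separate adapter layers. The sketch below applies the commonly used formula W' = W + (alpha / r) * B A to plain Python lists. It illustrates the general technique and is not the code from #6597; merge_lora, matmul, alpha, and the matrix shapes are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / rank) * (B @ A) without modifying the inputs."""
    rank = len(lora_a)  # lora_a has shape (rank, in_features)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out, rank) @ (rank, in) -> (out, in)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


w = [[1.0, 0.0], [0.0, 1.0]]   # base weight, shape (2, 2)
lora_b = [[1.0], [2.0]]        # shape (out=2, rank=1)
lora_a = [[0.5, 0.5]]          # shape (rank=1, in=2)
merged = merge_lora(w, lora_b, lora_a, alpha=2.0)
```

After merging, the adapter matrices can be discarded, at the cost of no longer being able to swap adapters on the same base weights.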

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.04

Detailed Changelogs

ASR

Changelog

Sharded manifests for tarred datasets by @bmwshop :: PR: #6395
Update script for ngram rnnt and hat beam search decoding by @andrusenkoau :: PR: #6370
Add disclaimer about dataset for ASR by @titu1994 :: PR: #6496
New noise_norm perturbation based on Riva work by @trias702 :: PR: #6445
Add Frame-VAD model and datasets by @stevehuang52 :: PR: #6441
removing unnecessary avoid_bfloat16_autocast_context by @bmwshop :: PR: #6481
FC models in menu by @bmwshop :: PR: #6473
Separate punctuation by whitespace by @karpnv :: PR: #6574
Cherry pick commits in #6601 to main by @fayejf :: PR: #6611
Offline and streaming inference support for hybrid model by @fayejf :: PR: #6570
Disable interctc tests by @Kipok :: PR: #6638
ASR-TTS Models: Support hybrid RNNT-CTC, improve docs. by @artbataev :: PR: #6620
Confidence ensembles implementation by @Kipok :: PR: #6614
Confidence ensembles: fix issues and add tuning functionality by @Kipok :: PR: #6657
Add support for RNNT/hybrid models to partial transcribe by @stevehuang52 :: PR: #6609
eval_beamsearch_ngram.py with hybrid ctc by @karpnv :: PR: #6656

TTS

Changelog

[TTS] FastPitch adapter fine-tune and conditional layer normalization by @hsiehjackson :: PR: #6416
[TTS] whitelist broken path fix. by @XuesongYang :: PR: #6412
[TTS] FastPitch speaker encoder by @hsiehjackson :: PR: #6417
Update NeMo_TTS_Primer.ipynb by @pythinker :: PR: #6436
[TTS] Create functions for TTS preprocessing without dataloader by @rlangman :: PR: #6317
[TTS] Fix FastPitch energy code by @rlangman :: PR: #6511
[TTS] Add script for computing feature stats by @rlangman :: PR: #6508
[TTS] Add tutorials for FastPitch TTS speaker adaptation with adapters by @hsiehjackson :: PR: #6431
[TTS] Create initial TTS dataset feature processors by @rlangman :: PR: #6507
[TTS] Add script for mapping speaker names to indices by @rlangman :: PR: #6509
[TTS] Implement new TextToSpeech dataset by @rlangman :: PR: #6575

NLP / NMT

Changelog

Add patches for Virtual Parallel conversion by @titu1994 :: PR: #6589
Update wfst_text_normalization.rst by @jimregan :: PR: #6374
add rampup batch size support for Megatron GPT by @dimapihtar :: PR: #6424
Add interleaved pp support by @titu1994 :: PR: #6498
Support dynamic length batches with GPT SFT by @aklife97 :: PR: #6510
Framework for PEFT via mixins by @arendu :: PR: #6391
Add GPT eval mode fix for interleaved to main (#6449) by @aklife97 :: PR: #6610
sft model can use this script for eval by @arendu :: PR: #6637
Patch memory used for NeMo Megatron models by @titu1994 :: PR: #6615
merge lora weights into base model by @arendu :: PR: #6597
Dialogue dataset by @yidong72 :: PR: #6654
check for first or last stage by @ericharper :: PR: #6708
A few small typo fixes by @Kipok :: PR: #6599
Lddl bert by @wdykas :: PR: #6761
Debug Transformer Engine FP8 support with Megatron-core infrastructure by @timmoon10 :: PR: #6740
Tensor-parallel communication overlap with userbuffer backend by @erhoo82 :: PR: #6780
Add ub communicator initialization to validation step by @erhoo82 :: PR: #6807
Add trainer.validate example for GPT by @ericharper :: PR: #6794
Add API docs for NeMo Megatron by @ericharper :: PR: #6850
Apply garbage collection interval to validation steps by @erhoo82 :: PR: #6870

Bugfixes

Changelog

[BugFix] Force _get_batch_preds() to keep logits in decoder timestamps generator by @tango4j :: PR: #6499
small bugfix for asr_evaluator by @fayejf :: PR: #6636
fix bucketing bug issue for picking new bucket by @nithinraok :: PR: #6663
[TTS] Fix TTS audio preprocessing bugs by @rlangman :: PR: #6628
Fix a bug, use _ceil_to_nearest instead as _round_to_nearest is not d… by @BestJuly :: PR: #6681
Bug fix to restore act ckpt by @markelsanz14 :: PR: #6753
Bug fix to reset sequence parallelism by @markelsanz14 :: PR: #6756
Bug fix for reset_sequence_parallel_args by @markelsanz14 :: PR: #6802
Fix adapter tutorial r1.19.0 by @hsiehjackson :: PR: #6776
Fix error appearing when using tar datasets by @Jorjeous :: PR: #6502
Fix normalization of impulse response in ImpulsePerturbation by @anteju :: PR: #6505
Fix typos by @titu1994 :: PR: #6523
Fix notebook bad json by @titu1994 :: PR: #6561
[ASR] Fix for old models in change_attention_model by @sam1373 :: PR: #6608
Fix k2 installation in Docker with CUDA 12 by @artbataev :: PR: #6707
Tutorial fixes by @titu1994 :: PR: #6717
Vp fixes by @titu1994 :: PR: #6738
[TTS] Fix aligner nan loss in fp32 by @hsiehjackson :: PR: #6435
fix conversion and eval by @arendu :: PR: #6648
Fix checkpointed forward and add test for full activation checkpointing by @aklife97 :: PR: #6744
add call to p2p overlap by @aklife97 :: PR: #6779
Fix get_parameters when using main params optimizer by @ericharper :: PR: #6764
Fix GPTDataset Assert by @MaximumEntropy :: PR: #6798
fix notebook error by @yidong72 :: PR: #6840
final fix of notebook by @yidong72 :: PR: #6842

General Improvements

Changelog

Code-Switching dataset creation - upgrading to aggregate tokenizer manifest format by @KunalDhawan :: PR: #6448
Fix an invalid link in get_data.py of ljspeech by @pythinker :: PR: #6456
Update manifest.py to use os.path for get_full_path by @stevehuang52 :: PR: #6598
Cherry pick commits in #6528 to main by @timmoon10 :: PR: #6613
Move black parameters to pyproject.toml by @artbataev :: PR: #6647
handle artifacts when path is an extracted dir by @arendu :: PR: #6658
remove upgrading setuptools in reinstall.sh by @XuesongYang :: PR: #6659
Upgrade to PyTorch 23.04 Container by @ericharper :: PR: #6660
Fix fastpitch test nightly by @hsiehjackson :: PR: #6742
Fix Links for tutorials by @titu1994 :: PR: #6777
Update core version in Jenkinsfile by @aklife97 :: PR: #6817
Update mcore requirement to 0.2.0 by @ericharper :: PR: #6875

Contributors

jimregan, bmwshop, and 29 other contributors

17 May 19:09

titu1994

v1.18.1

85f5f47

NVIDIA Neural Modules 1.18.1

Highlights

For the complete release note, please see NeMo 1.18.0 Release Notes

Bugfix

This patch release fixes a major bug in ASR bucketing datasets that was introduced in r1.17.0 by PR #6191. Because of this bug, although each bucket was randomly shuffled before selection on each rank, only a single bucket would loop infinitely, and subsequent buckets were never reached.

Effect: WER is significantly worse because not all buckets are used.

This has been patched and should work correctly in 1.18.1 onwards.
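To make the failure mode concrete, here is a toy sketch under a simplified bucketing scheme. It is not NeMo's dataset code. buggy_epoch and fixed_epoch are hypothetical helpers: the first keeps cycling through one bucket and never reaches the others, while the second visits every bucket once.

```python
import itertools
import random
from typing import Iterator, List


def buggy_epoch(buckets: List[List[str]], seed: int) -> Iterator[str]:
    """Shuffle the bucket order, then loop forever over the first bucket picked."""
    rng = random.Random(seed)
    order = list(range(len(buckets)))
    rng.shuffle(order)
    # Bug being illustrated: cycle() never ends, so later buckets are never reached.
    yield from itertools.cycle(buckets[order[0]])


def fixed_epoch(buckets: List[List[str]], seed: int) -> Iterator[str]:
    """Shuffle the bucket order, then yield every bucket exactly once."""
    rng = random.Random(seed)
    order = list(range(len(buckets)))
    rng.shuffle(order)
    for idx in order:
        yield from buckets[idx]


buckets = [["short_1", "short_2"], ["mid_1", "mid_2"], ["long_1", "long_2"]]

# Take 12 samples from the endless buggy iterator; only one bucket ever appears.
seen_buggy = set(itertools.islice(buggy_epoch(buckets, seed=0), 12))
seen_fixed = set(fixed_epoch(buckets, seed=0))
print(len(seen_buggy), len(seen_fixed))  # prints "2 6"
```

In the buggy variant the model only ever sees utterances from one length range, which is consistent with the degraded WER described above.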

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.03

12 May 17:49

ericharper

v1.18.0

ab651ca

NVIDIA Neural Modules 1.18.0

Highlights

Models

NeMo ASR

Hybrid Autoregressive Transducer (HAT) #6260
Apple MPS Support for ASR Inference #6289
InterCTC Support for Hybrid ASR Models #6215
RNNT N-Gram Fusion with mAES algo #6118
ASR + Apple M2 CPU/GPU MPS #6289

NeMo TTS

TTS directory structure refactor
User-set symbol vocabulary #6172

NeMo Megatron

Model parallelism from Megatron Core #6393
Continued training for P-tuning #6273
SFT for GPT-3 #6210
Tensor and pipeline model parallel conversion #6218
Megatron NMT Export to Riva

NeMo Core

Detailed Changelogs

ASR

Changelog

minor cleanup by @messiaen :: PR: #6311
docs on the use of heterogeneous test / val manifests by @bmwshop :: PR: #6352
[WIP] add buffered chunked streaming for nemo force aligner by @Slyne :: PR: #6185
Word boosting for Flashlight decoder by @trias702 :: PR: #6367
Add installation and ASR inference instructions for Mac by @artbataev :: PR: #6377
specaug speedup by @1-800-BAD-CODE :: PR: #6347
updated lr for FC configs by @bmwshop :: PR: #6379
Make possible to control tqdm progress bar in ASR models by @SN4KEBYTE :: PR: #6375
[ASR] Conformer global tokens in local attention by @sam1373 :: PR: #6253
fixed torch warning on using a list of numpy arrays by @MKNachesa :: PR: #6382
Fix FastConformer config: correct bucketing strategy by @artbataev :: PR: #6413
fixing the ability to use temp sampling with concat datasets by @bmwshop :: PR: #6423
add conformer configs for hat model by @andrusenkoau :: PR: #6372
[ASR] Add optimization util for linear sum assignment algorithm by @tango4j :: PR: #6349
Added/updated new Conformer configs by @VahidooX :: PR: #6426
Fix typos by @titu1994 :: PR: #6494
Fix typos (#6523) by @titu1994 :: PR: #6539
added back the fast emit section to the configs. by @VahidooX :: PR: #6540
Add FastConformer Hybrid ASR models for EN, ES, IT, DE, PL, HR, UA, BY by @KunalDhawan :: PR: #6549
Add scores for FastConformer models by @titu1994 :: PR: #6557
Patch transcribe and support offline transcribe for hybrid model by @fayejf :: PR: #6550
More streaming conformer export fixes by @messiaen :: PR: #6567
Documentation for ASR-TTS models by @artbataev :: PR: #6594
Patch transcribe_util for streaming mode and add wer calculation back to inference scripts by @fayejf :: PR: #6601
Add HAT image to docs by @andrusenkoau :: PR: #6619
Patch decoding for PC models by @titu1994 :: PR: #6630
Fix wer.py where 'errors' variable was not set by @stevehuang52 :: PR: #6633
Fix for old models in change_attention_model by @VahidooX :: PR: #6635

TTS

Changelog

VITS HiFiTTS doc by @treacker :: PR: #6288
fix broken links r1.18.0 by @ekmb :: PR: #6501
[TTS] fixed broken path. by @XuesongYang :: PR: #6514

NLP / NMT

Changelog

[Core] return_config=True now extracts just config, not full tarfile by @titu1994 :: PR: #6346
restore path for p-tuning by @arendu :: PR: #6273
taskname and early stopping for adapters by @arendu :: PR: #6366
Adapter tuning accepts expanded language model dir by @arendu :: PR: #6376
Update gpt_training.rst by @blisc :: PR: #6378
Megatron GPT model finetuning by @MaximumEntropy :: PR: #6210
[NeMo Megatron] Cleanup configs to infer the models TP PP config automatically by @titu1994 :: PR: #6368
Fix prompt template unescaping by @MaximumEntropy :: PR: #6399
Add support for Megatron GPT Untied Embd TP PP Change by @titu1994 :: PR: #6388
Move Parallelism usage from Apex -> Megatron Core by @aklife97 :: PR: #6393
Add ability to enable/disable act ckpt and seq parallelism in GPT by @markelsanz14 :: PR: #6327
Refactor PP conversion + add support for TP only conversion by @titu1994 :: PR: #6419
fix CPU overheads of GPT synthetic dataset by @xrennvidia :: PR: #6427
check if grad is none before calling all_reduce by @arendu :: PR: #6428
Fix replace_bos_with_pad not found by @aklife97 :: PR: #6443
Support Swiglu in TP PP Conversion by @titu1994 :: PR: #6437
BERT pre-training mp fork to spawn by @aklife97 :: PR: #6442
Megatron encoder decoder fix for empty validation outputs by @michalivne :: PR: #6459
Reduce workers on NMT CI by @aklife97 :: PR: #6472
Switch to NVIDIA Megatron repo by @aklife97 :: PR: #6465
Megatron KERPLE positional embeddings by @michalivne :: PR: #6478
Support in external sample mapping for Megatron datasets by @michalivne :: PR: #6462
Fix custom by @aklife97 :: PR: #6512
GPT fp16 inference fix by @MaximumEntropy :: PR: #6543
Fix for T5 FT model by @aklife97 :: PR: #6529
Pass instead of scaler object to core by @aklife97 :: PR: #6545
Change Megatron Enc Dec model to use persistent_workers by @aklife97 :: PR: #6548
Turn autocast off when precision is fp32 by @aklife97 :: PR: #6554
Fix batch size reconf for T5 FT for multi-validation by @aklife97 :: PR: #6582
Make tensor split contiguous for qkv and kv in attention by @aklife97 :: PR: #6580
Patches from main to r1.18.0 for Virtual Parallel by @titu1994 :: PR: #6592
Create dummy iters to satisfy iter type len checks in core + update core commit by @aklife97 :: PR: #6600
Restore GPT support for interleaved pipeline parallelism by @timmoon10 :: PR: #6528
Add megatron_core to requirements by @ericharper :: PR: #6639

Export

Changelog

Bugfixes

Changelog

Fix the GPT SFT datasets loss mask bug by @yidong72 :: PR: #6409
[BugFix] Fix multi-processing bug in data simulator by @tango4j :: PR: #6310
Fix cache aware hybrid bugs by @VahidooX :: PR: #6466
[BugFix] Force _get_batch_preds() to keep logits in decoder timestamp… by @tango4j :: PR: #6500
Fixing bug in unsort_tensor by @borisfom :: PR: #6320
Bugfix for BF16 grad reductions with distopt by @timmoon10 :: PR: #6340
Limit urllib3 version to patch issue with RTD by @aklife97 :: PR: #6568

General improvements

Changelog

Pin the version to hopefully fix rtd build by @SeanNaren :: PR: #6334
enabling diverse datasets in val / test by @bmwshop :: PR: #6306
extract inference weights by @arendu :: PR: #6353
Add opengraph support for NeMo docs by @titu1994 :: PR: #6380
Adding basic preemption code by @athitten :: PR: #6161
Add documentation for preemption support by @athitten :: PR: #6403
Update hyperparameter recommendation based on experiments by @Zhilin123 :: PR: #6405
exceptions with empty test / val ds config sections by @bmwshop :: PR: #6421
Upgrade pt 23.03 by @ericharper :: PR: #6430
Update README to add core installation by @aklife97 :: PR: #6488
Not doing CastToFloat by default by @borisfom :: PR: #6524
Update manifest.py for speedup by @stevehuang52 :: PR: #6565
Update SDP docs by @erastorgueva-nv :: PR: #6485
Update core commit hash in readme by @aklife97 :: PR: #6622
Remove from jenkins by @ericharper :: PR: #6641
Remove dup by @ericharper :: PR: #6643

Contributors

bmwshop, XuesongYang, and 32 other contributors

05 Apr 00:10

ericharper

v1.17.0

d3017e4

NVIDIA Neural Modules 1.17.0

Highlights

NeMo ASR

Online Clustering Diarizer
High Level Diarization API
PyCTC Decode Beam Search Support
RNNT Beam Search Alignment Extraction
InterCTC Loss
AIStore Documentation
ASR & AWS Multi-node Integration
Convolution Invariant SDR losses

NeMo TTS

NeMo Megatron

SquaredReLU, SwiGLU, No-Dropout
Rotary Position Embedding
Untie word embeddings and output projection

NeMo Core

Dynamic freezing of modules during training
NeMo Multi-Run Documentation
ClearML Logging
Early Stopping
Experiment Manager Docs Update

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.02

Detailed Changelogs

ASR

Changelog

Support Alignment Extraction for all RNNT Beam decoding methods by @titu1994 :: PR: #5925
Use module-based k2 import guard by @artbataev :: PR: #6006
Default RNNT loss to int64 targets by @titu1994 :: PR: #6011
Added documentation section for ASR datasets from AIStore by @anteju :: PR: #6008
Change perturb rng for reproducing results easily by @fayejf :: PR: #6042
InterCTC loss and stochastic depth implementation by @Kipok :: PR: #6013
Add pyctcdecode to high level beam search API by @titu1994 :: PR: #6026
Convert esperanto into a notebook by @SeanNaren :: PR: #6070
[ASR] Added a script for evaluating metrics for audio-to-audio by @anteju :: PR: #5971
[ASR] Convolution-invariant SDR loss + unit tests by @anteju :: PR: #5992
Adjust stochastic depth dropout probability calculation by @anteju :: PR: #6120
Add file class based inference API for diarization by @SeanNaren :: PR: #5945
Ngram by @karpnv :: PR: #6063
remove duplicate definition of manifest read and write func. by @XuesongYang :: PR: #6088
Streaming conformer CTC export by @messiaen :: PR: #5837
[TTS] Make mel spectrogram norm configurable by @rlangman :: PR: #6155
Ngram lm fusion for RNNT maes decoding by @andrusenkoau :: PR: #6118
ASR Beam search documentation by @titu1994 :: PR: #6244

TTS

Changelog

[TTS][ZH] added new NGC model cards with polyphone disambiguation. by @XuesongYang :: PR: #5940
[TTS] deprecate AudioToCharWithPriorAndPitchDataset. by @XuesongYang :: PR: #5959
[TTS][G2P] deprecate add_symbols by @XuesongYang :: PR: #5961
Added list_available_models by @treacker :: PR: #5967
Update Fastpitch energy bug by @blisc :: PR: #5969
removed WHATEVER(1) ˌhwʌˈtɛvɚ from scripts/tts_dataset_files/ipa_cmudict-0.7b_nv22.10.txt by @MikyasDesta :: PR: #5869
ONNX export for RadTTS by @borisfom :: PR: #5880
Add some info about FastPitch SSL model by @redoctopus :: PR: #5994
Vits doc by @treacker :: PR: #5989
Ragged batching changes for RadTTS, some refactoring by @borisfom :: PR: #6020
Working enabled ragged batching with ONNX by @borisfom :: PR: #6030
[TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982
[TTS] Add Spanish IPA dictionaries and heteronyms by @rlangman :: PR: #6037
[TTS] Separate TTS tokenization and g2p util to fix circular import by @rlangman :: PR: #6080
[TTS][refactor] Part 7 - move module from model file. by @XuesongYang :: PR: #6098
[TTS][refactor] Part 1 - nemo.collections.tts.data by @XuesongYang :: PR: #6099
[TTS][refactor] Part 2 - nemo.collections.tts.parts by @XuesongYang :: PR: #6105
[TTS][refactor] Part 6 - remove nemo.collections.tts.torch.README.md and tts_dataset.yaml by @XuesongYang :: PR: #6103
[TTS][refactor] Part 3 - nemo.collections.tts.g2p.models by @XuesongYang :: PR: #6113
[TTS] update German NGC models trained on Thorsten Datasets by @XuesongYang :: PR: #6125
[TTS] remove old waveglow model that relies on torch_stft. by @XuesongYang :: PR: #6128
[TTS] Move Spanish polyphones from heteronym to dictionary by @rlangman :: PR: #6123
[TTS][refactor] Part 8 - added model inference tests to safeguard changes. by @XuesongYang :: PR: #6129
remove duplicate definition of manifest read and write func. by @XuesongYang :: PR: #6088
[TTS][refactor] update tutorial import paths. by @XuesongYang :: PR: #6176
[TTS] Add univnet scheduler by @ArtyomZemlyak :: PR: #6157
[TTS] Make mel spectrogram norm configurable by @rlangman :: PR: #6155

NLP / NMT

Changelog

add new languages to doc by @yzhang123 :: PR: #5939
Distributed Adam optimizer overlaps param all-gather with forward compute by @timmoon10 :: PR: #5684
Refactor the retrieval services for microservice architecture by @yidong72 :: PR: #5910
make validation accuracy reporting optional for adapters/ptuning by @arendu :: PR: #5843
Add BERT support for overlapping forward compute with distopt communication by @timmoon10 :: PR: #6024
[TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982
adding early stop callback to ptuning by @arendu :: PR: #6028
Pr doc tn by @yzhang123 :: PR: #6041
Adds several configurable flags for Megatron GPT models by @MaximumEntropy :: PR: #5991
P-tuning refactor Part 1/N by @arendu :: PR: #6054
Fast glu activations by @MaximumEntropy :: PR: #6058
P-tuning refactor Part 2/N by @arendu :: PR: #6056
P-tuning refactor Part 3/N by @arendu :: PR: #6106
Explicitly check for untied embeddings when logging params by @MaximumEntropy :: PR: #6085
Add flag to get attention from fusion by @ericharper :: PR: #6049
Improving text memmap generated index files error messages by @michalivne :: PR: #6093
Megatron Encoder-Decoder Sampler Function by @michalivne :: PR: #6095
Sentence piece legacy false compatibility by @arendu :: PR: #6154
convert Megatron LM ckpt to NeMo PP support. by @yidong72 :: PR: #6159
Avoid multiple warnings for loss mask by @mikolajblaz :: PR: #6062
Propagate LayerNorm1P to TE by @mikolajblaz :: PR: #6061
Filter p-tuning by example length by @arendu :: PR: #6182
Add sequence parallel support to Rope positional embedding by @yidong72 :: PR: #6178
Use a separate communicator for DP AMAX reduction by @erhoo82 :: PR: #6022
Add persistent workers to GPT by @ericharper :: PR: #6205
Micro batch loader for bert model by @shanmugamr1992 :: PR: #6046
GPT P tuning Eval changes (#5952) by @aklife97 :: PR: #6272
add template for taskname=taskname by @Zhilin123 :: PR: #6283
added RPE + fixed RMSNorm by @Davood-M :: PR: #6304
simplified notebook for p-tuning by @arendu :: PR: #6326
Added num decoder blocks in megatron export by @Davood-M :: PR: #6331

Text Normalization / Inverse Text Normalization

Changelog

[TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982

Export

Changelog

ONNX export for RadTTS by @borisfom :: PR: #5880
Working enabled ragged batching with ONNX by @borisfom :: PR: #6030
Update docs for ExpManager and Exportable frameworks by @titu1994 :: PR: #6165
Streaming conformer CTC export by @messiaen :: PR: #5837
MixedFusedRMSNorm Export Fix by @Davood-M :: PR: #6296
Added num decoder blocks in megatron export by @Davood-M :: PR: #6331

Bugfixes

Changelog

Fix bug where GPT always enabled distopt overlapped param sync by @timmoon10 :: PR: #5995
CS bugfix by @bmwshop :: PR: #6122
RNNT patch by @titu1994 :: PR: #6231
Notebook fixes by @titu1994 :: PR: #6212
Small fixes for flashlight decoder by @trias702 :: PR: #6071
Various fixes in docs and RNNT by @titu1994 :: PR: #6156
Fix k2 and torchaudio installation (Docker, macOS) by @artbataev :: PR: #6094
update and deprecate warning for Mic notebook by @fayejf :: PR: #6307
small bugfix and add asr evaluator to doc by @fayejf :: PR: #6229
Bug fixing for bucketing dataset by @VahidooX :: PR: #6191
Fix character beam decoding algorithm with vocab index map by @titu1994 :: PR: #6140
fix typo in asr evaluator readme by @fayejf :: PR: #6053
Fix typos by @titu1994 :: PR: #6241
[ASR]:fixed augmentor arguments for transcribe functionality of Hybrid CTC-RNNT model by @KunalDhawan :: PR: #6290
Fix hybrid transcribe by @ArtyomZemlyak :: PR: #6003
Fix bucketing seeding by @VahidooX :: PR: #6254
Fix for CTC decoder setup by @vsl9 :: PR: #6303
Fix RNNT Joint narrow() by @titu1994 :: PR: #6336
Fix bugs with interctc mixin by @Kipok :: PR: #6228
Update IPA dict path in tutorial by @redoctopus :: PR: #6208
[TTS] fix broken tutorial for Tacotron2 by @XuesongYang :: PR: #6199
[TTS] fix bugs for chinese and german tutorials. by @XuesongYang :: PR: #6216
Fix radtts sort r17 by @borisfom :: PR: #6344
Quick Fix for RadTTS test by @blisc :: PR: #6034
Disabling radtts tests until we have real model by @borisfom :: PR: #6036
fix val loss computation in megatron by @anmolgupt :: PR: #5871
Fix incomplete batches by @mikolajblaz :: PR: #6083
Avoid unnecessarily accessing data loader with pipeline parallelism by @timmoon10 :: PR: #6164
bugfix: file handlers are not closed. by @XuesongYang :: PR: #5956
Fix Silence Sampling Algorithm for ASR Multi-speaker Data Simulator by @stevehuang52 :: PR: #5897
Fix Windows bug with save_restore_connector by @trias702 :: PR: #5919
fix broken link by @ericharper :: PR: #5968
Fix torchaudio installation by @artbataev :: PR: #5850
Fix reinstall.sh dependencies by @titu1994 :: PR: #6027
Adding changes to fix the mv error by @tango4j :: PR: #6087
Fix README by @flx42 :: PR: #6137
Fix typos in voiceapp notebook by @titu1994 :: PR: #6262
[BugFix] Fix diarization result path errors in tutorial notebook for r1.17.0 by @tango4j :: PR: #6234
[BugFix] Fix ...

Contributors

bmwshop, karpnv, and 39 other contributors

08 Mar 04:35

ericharper

v1.16.0

1631118

NVIDIA Neural Modules 1.16.0

Highlights

NeMo ASR

ASR Evaluator
Multi-channel dereverberation algorithm
Hybrid ASR-TTS Models
Flashlight Decoder Beam Search
FastConformer Encoder with 8x subsampling

NeMo TTS

SSL Voice Conversion
Spectrogram Enhancer
VITS

NeMo Megatron

Per microbatch dataloader for GPT and BERT
Adapters compatible with Faster Transformer

NeMo Core

Nested model support

NeMo Tools

NeMo Forced Aligner

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:23.01

ASR

Changelog

Fix for incorrect computation of batched alignment in transducers by @Kipok :: PR: #5692
Set the stream position to 0 for pydub by @jonghwanhyeon :: PR: #5752
[Fix] ConformerEncoder forward when length is None by @anteju :: PR: #5761
ASR evaluator by @fayejf :: PR: #5728
[ASR][Test] Enable test for cache audio with a single worker by @anteju :: PR: #5763
Flashlight Decoder for Nemo by @trias702 :: PR: #5790
Fix data simulator by @stevehuang52 :: PR: #5813
[ASR] Mask-based dereverb algorithm by @anteju :: PR: #5693
Concat dataset and aistore support for label models by @Kipok :: PR: #5826
Adding new features and speed up for multi-speaker data simulator by @tango4j :: PR: #5846
Add Esperanto ASR example by @andrusenkoau :: PR: #5772
Fix memory allocation of NeMo Multi-speaker Data Simulator by @stevehuang52 :: PR: #5864
[ASR] Separate Audio-to-Text (BPE, Char) dataset construction by @artbataev :: PR: #5774
Reduce memory usage in getMultiScaleCosAffinityMatrix function by @gabitza-tech :: PR: #5876
Hybrid ASR-TTS models by @artbataev :: PR: #5659
Set providers for onnxruntime inference session by @athitten :: PR: #5903
[ASR] Configurable metrics for audio-to-audio + removed experimental decorators by @anteju :: PR: #5827
Correct doc for RNNT transcribe() function by @titu1994 :: PR: #5904
Update isort to the latest version by @artbataev :: PR: #5895
FilterbankFeaturesTA to match FilterbankFeatures by @msis :: PR: #5913
Fix hybridasr bug by @VahidooX :: PR: #5950
replace symbols by @nithinraok :: PR: #5974
fast conformer configs and doc by @bmwshop :: PR: #5970
Update TitaNet-L and MSDD models by @nithinraok :: PR: #6023
Fix enhancer usage by @artbataev :: PR: #6059
update librosa args by @nithinraok :: PR: #6086
Fix enhancer usage in ASR-TTS examples by @artbataev :: PR: #6116
Fix k2 and torchaudio installation (Docker, macOS). Cherry-pick (#6094) by @artbataev :: PR: #6124

TTS

Changelog

[TTS] Update Spanish TTS model to 1.15 by @rlangman :: PR: #5742
[TTS][DE] refine grapheme-based tokenizer and fastpitch training recipe on thorsten's neutral datasets. by @XuesongYang :: PR: #5753
No-script TS export, prepared for ONNX export by @borisfom :: PR: #5653
Fixing masking in RadTTS bottleneck layer by @borisfom :: PR: #5771
Port Riva's mel cepstral distortion w/ dynamic time warping notebook by @redoctopus :: PR: #5778
Update radtts' infer path by @blisc :: PR: #5788
[TTS][DE] Augment tokenization/G2P to preserve capitalization of words and mix phonemes with word-level graphemes for an input text. by @XuesongYang :: PR: #5805
[TTS] porting VITS implementation by @treacker :: PR: #5600
[TTS][DE] updated IPA dictionary and heteronyms by @XuesongYang :: PR: #5860
[TTS] GAN-based spectrogram enhancer by @racoiaws :: PR: #5565
TTS inference with Heteronym classification model, hc model inference refactoring by @ekmb :: PR: #5768
Remove MCD_DTW tarball by @redoctopus :: PR: #5889
Hybrid ASR-TTS models by @artbataev :: PR: #5659
Moved eval notebook data to aws by @redoctopus :: PR: #5911
[G2P] fixed typos and broken import library. by @XuesongYang :: PR: #5978
[G2P] backward compatibility for english tokenizer and bugfix by @XuesongYang :: PR: #5980
fix links, add missing file by @ekmb :: PR: #6044
[TTS] Spectrogram Enhancer: correct dim for length when loading data by @racoiaws :: PR: #6048
[TTS] bugfix for fastpitch German tutorial by @XuesongYang :: PR: #6051
[TTS] bugfix Chinese Fastpitch tutorial by @XuesongYang :: PR: #6055
Fix enhancer usage by @artbataev :: PR: #6059
[TTS] Spectrogram Enhancer: support arbitrary input length by @racoiaws :: PR: #6060
Fix enhancer usage in ASR-TTS examples by @artbataev :: PR: #6116
[TTS] Spectrogram Enhancer: add option to zero out the initial tensor by @racoiaws :: PR: #6136

NLP / NMT

Changelog

Fix P-Tuning Truncation by @vadam5 :: PR: #5663
Adithyare/prompt learning seed by @arendu :: PR: #5749
Add extra data args to support proper finetuning of HF converted T5 checkpoints by @MaximumEntropy :: PR: #5719
Don't add output directory twice when creating shared sentencepiece tokenizer by @pks :: PR: #5737
add constraint info on batch size for tar dataset by @yzhang123 :: PR: #5812
remove transformer version upper bound by @Zhilin123 :: PR: #5831
Adithyare/adapter new placement by @arendu :: PR: #5791
Add SSL import functionality for Audio Lexical PNC Models by @trias702 :: PR: #5834
validation batch sizing and drop_last controls by @arendu :: PR: #5830
Remove ending newlines when encoding strings w/ sentencepiece tokenizer by @pks :: PR: #5739
Fix segmenting for pcla inference by @jubick1337 :: PR: #5849
RETRO model finetuning by @yidong72 :: PR: #5800
Optimizing distributed Adam when running with one work queue by @timmoon10 :: PR: #5560
Add option to disable distributed parameters in distributed Adam optimizer by @timmoon10 :: PR: #5685
set max_steps for lr decay through config by @anmolgupt :: PR: #5780
Fix Prompt text space issue by @aklife97 :: PR: #5983
Add batch_size to prompt_learning generate by @aklife97 :: PR: #6091

NeMo Tools

Changelog

[Tools] NeMo Forced Aligner by @erastorgueva-nv :: PR: #5571
[Tools] Fix ctc segmentation: exclude audacity files by @ekmb :: PR: #6009

Export

Changelog

No-script TS export, prepared for ONNX export by @borisfom :: PR: #5653
Set providers for onnxruntime inference session by @athitten :: PR: #5903
Add segmentation export to Audacity label file by @Ca-ressemble-a-du-fake :: PR: #5857

General Improvements

Changelog

Pin lightning version less than 1.9.0 by @SeanNaren :: PR: #5822
Davidm/cherrypick r1.16.0 by @Davood-M :: PR: #6082
Update files for lightning 1.9.0 by @SeanNaren :: PR: #5823
Tn doc 16 by @yzhang123 :: PR: #5954
Ensure EMA checkpoints are also deleted when normal checkpoints are by @SeanNaren :: PR: #5724
[Fix] ConformerEncoder forward when length is None by @anteju :: PR: #5761
Fix EMA topk checkpoint deletion by @SeanNaren :: PR: #5758
[BugFix] decoder timestamp count has a mismatch when is decoded by @tango4j :: PR: #5825
Update 00_NeMo_Primer.ipynb by @schaltung :: PR: #5740
Sanitize params before DLLogger log_hyperparams by @milesial :: PR: #5736
NeMo Forced Aligner by @erastorgueva-nv :: PR: #5571
Add EMA Docs, fix common collection documentation by @SeanNaren :: PR: #5757
Add container info to main page by @fayejf :: PR: #5816
CommonVoice support for script by @SeanNaren :: PR: #5797
Support nested NeMo models by @artbataev :: PR: #5671
fix max len generation t5 by @ekmb :: PR: #5852
NFA samples fix by @erastorgueva-nv :: PR: #5856
fix(readme): fix typo by @jqueguiner :: PR: #5883
Block large files from being merged into NeMo main by @SeanNaren :: PR: #5898
Pin isort version by @artbataev :: PR: #5914
fixed missing long_description_content_type by @XuesongYang :: PR: #5909
Update container to 23.01 by @ericharper :: PR: #5917
remove conda pynini install by @ekmb :: PR: #5921
Update align.py by @Slyne :: PR: #6043
Fixing data simulator argument and bash scripting error by @tango4j :: PR: #6112
Update apex commit by @ericharper :: PR: #6148

Contributors

pks, bmwshop, and 42 other contributors

02 Feb 00:49

ericharper

v1.15.0

8c785ec

NVIDIA Neural Modules 1.15.0

Highlights

NeMo ASR

Hybrid Transducer-CTC ASR
Greedy timestamp decoding with inference script
MHA adapters
Conformer local attention (longformer)
High level beam search API
Multiblank transducer
Multi-channel audio processing model
AIstore for ASR datasets

NeMo Megatron

ALiBi position embeddings support for T5
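
As a rough illustration of what ALiBi position embeddings do, the sketch below computes per-head slopes and the distance-proportional bias that is added to attention scores. It follows the general ALiBi recipe, uses a symmetric distance as an assumption for a bidirectional encoder, and is not the NeMo Megatron implementation; the function names `alibi_slopes` and `alibi_bias` are hypothetical.

```python
def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8 / num_heads).

    Assumes num_heads is a power of two (the simple case of the ALiBi recipe).
    """
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (head + 1) for head in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Bias matrix added to attention logits: -slope * |query_pos - key_pos|.

    The symmetric distance is an assumption made here for illustration.
    """
    return [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


slopes = alibi_slopes(8)            # one slope per attention head
bias = alibi_bias(4, slopes[0])     # bias for the first head, sequence length 4
```

Because the bias depends only on relative distance, no learned position table is needed, which is the property that lets such models extrapolate to longer sequences.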

NeMo TTS

Chinese TTS pipeline with polyphone disambiguation

NeMo Core

Optimizer based EMA
MLFlow logger support

Models

stt_eo_conformer_ctc_large (HF, NGC) Esperanto ASR model.
stt_eo_conformer_transducer_large (HF, NGC) Esperanto ASR model.

Detailed Changelogs

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.12

ASR

Changelog

optimized loop and bugfix by @Jorjeous :: PR: #5573
Update torchmetrics by @nithinraok :: PR: #5566
Add an option to defer data setup from init to setup by @anteju :: PR: #5569
AIStore for ASR datasets by @anteju :: PR: #5462
Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
Update documentation and tutorials for Adapters by @titu1994 :: PR: #5610
Conformer local attention by @sam1373 :: PR: #5525
Add core classes and functions for online clustering diarizer part 1 by @tango4j :: PR: #5526
[Add] ASR+VAD Inference Pipeline by @stevehuang52 :: PR: #5575
[ASR] Audio processing base, multi-channel enhancement models by @anteju :: PR: #5356
Expose ClusteringDiarizer device by @SeanNaren :: PR: #5681
Add Beam Search support to ASR transcribe() by @titu1994 :: PR: #5443
Multiblank Transducer by @hainan-xv :: PR: #5527
pin torchmetrics version by @nithinraok :: PR: #5720
Update torchaudio dependency version for tutorials by @titu1994 :: PR: #5781
update torchmetrics to latest version by @nithinraok :: PR: #5801
Fix transducer and question answering tutorial bugs by @Zhilin123 :: PR: #5809
[BugFix] Updated CTC decoders installation in tutorial by @vsl9 :: PR: #5833
update torchmetrics args confusionmatrix by @nithinraok :: PR: #5853
indentation fix by @nithinraok :: PR: #5861
Fix wrong label mapping in batch_inference for label_model by @fayejf :: PR: #5767

TTS

Changelog

Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
[TTS] fix ranges of char set for accented letters. by @XuesongYang :: PR: #5607
[TTS] add type hints and change variable names for tokenizers and g2p by @XuesongYang :: PR: #5602
Fixed RadTTS unit test by @borisfom :: PR: #5572
[TTS][ZH] Disambiguate polyphones with augmented dict and Jieba segmenter for Chinese FastPitch by @yuekaizhang :: PR: #5541
Add duration padding support for RADTTS inference by @kevjshih :: PR: #5650
[TTS] add tts dict cust notebook by @ekmb :: PR: #5662
[TN/TTS docs] TN customization, g2p docs moved to tts by @ekmb :: PR: #5683
typo and link fixed by @ekmb :: PR: #5741
link fixed by @ekmb :: PR: #5745
Update Tacotron2 NGC checkpoint load to latest version by @redoctopus :: PR: #5760
Docs g2p update by @ekmb :: PR: #5769
[TTS][ZH] bugfix import jieba errors. by @XuesongYang :: PR: #5776

NLP / NMT

Changelog

Text generation improvement (UI client, data parallel support) by @yidong72 :: PR: #5437
O2 style amp for gpt3 ptuning by @JimmyZhang12 :: PR: #5246
Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
Bert interleaved by @shanmugamr1992 :: PR: #5556
Port stateless timer to exp manager by @MaximumEntropy :: PR: #5584
Add interface for making amax reduction optional for FP8 by @ksivaman :: PR: #5447
Propagate attention_dropout flag for GPT-3 by @mikolajblaz :: PR: #5669
Enc-Dec model size reporting fixes by @MaximumEntropy :: PR: #5623
Add prompt learning tests by @arendu :: PR: #5649
Fix missing torchelastic fixes for PTL 1.8 by @MaximumEntropy :: PR: #5691
ALiBi Positional Embeddings by @michalivne :: PR: #5467
Megatron export triton update by @Davood-M :: PR: #5766
Fix transducer and question answering tutorial bugs by @Zhilin123 :: PR: #5809
Update description for question answering tutorial by @Zhilin123 :: PR: #5814
TPMLP for T5-based models by @Davood-M :: PR: #5840
Megatron positional encoding alibi fix by @michalivne :: PR: #5808

Export

Changelog

Add keep_initializers_as_inputs to _export method by @pks :: PR: #5731
Megatron export triton update by @Davood-M :: PR: #5766

General Improvements

Changelog

Update to pytorch 22.12 container by @ericharper :: PR: #5694
optimized loop and bugfix by @Jorjeous :: PR: #5573
Expose ClusteringDiarizer device by @SeanNaren :: PR: #5681
remove useless files. by @XuesongYang :: PR: #5580
[Fix] setup_multiple validation/test data by @anteju :: PR: #5585
Move to optimizer based EMA implementation by @SeanNaren :: PR: #5169
[Temp workaround] Disable test with cache_audio to unblock CI by @anteju :: PR: #5615
[EMA] Change success message to reduce confusion by @SeanNaren :: PR: #5621
Temporarily disable prompt learning CI tests by @ericharper :: PR: #5633
[Dockerfile] Remove AIS archive from docker image by @anteju :: PR: #5629
[workflow] add exclude labels option to ignore cherry-picks in releas… by @XuesongYang :: PR: #5645
Add DLLogger support to exp_manager by @milesial :: PR: #5658
Fix EMA restart by allowing device to be set by the class init by @SeanNaren :: PR: #5668
Remove SDP (moved to separate repo) - merge to main by @erastorgueva-nv :: PR: #5630
temp disable speaker recognition CI test by @fayejf :: PR: #5696
Don't print exp_manager warning when max_steps == -1 by @milesial :: PR: #5725
Add tabular data generation documents to the index file by @yidong72 :: PR: #5733
fix token id bug by @yidong72 :: PR: #5777
Update numpy requirements from 1.21 to 1.22 by @Zhilin123 :: PR: #5785
Fix setuptools to usable version by @titu1994 :: PR: #5798
add apt-get upgrade -y in dockerfile by @fayejf :: PR: #5817
Update NeMo Multi-Run docs by @titu1994 :: PR: #5844
add ambernet to readme by @fayejf :: PR: #5872
update apex install instructions for 1.15 by @ericharper :: PR: #5901

Contributors

pks, kevjshih, and 29 other contributors

24 Dec 02:49

ericharper

v1.14.0

0a0b8a1

NVIDIA Neural Modules 1.14.0

Highlights

NeMo ASR

Hybrid CTC + Transducer loss ASR #5364
Sampled Softmax RNNT (Enables large vocab RNNT, for speech translation and multilingual ASR) #5216
ASR Adapters hyper parameter search scripts #5159
RNNT {ONNX, TorchScript} x GPU export infer #5248
Exportable MelSpectrogram (TorchScript) #5512
Audio To Audio Dataset Processor #5196
Multi Channel Audio Transcription #5479
Silence Augmentation #5476

NeMo Megatron

Support for the Mixture of Experts for T5
Fix PTL model size output for GPT-3 and BERT
BERT with Tensor Parallelism & Pipeline Parallel Support

NeMo Core

Hydra Multirun core support + NeMo HP optim in YAML #5159

NeMo Models

TTS Zh Fastpitch HifiGan SFSpeech

Detailed Changelogs

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.11

ASR

Changelog

[Tools][ASR] Tool for generating data using simulated RIRs by @anteju :: PR: #5158
Modernize RNNT ONNX export and add TS export by @titu1994 :: PR: #5248
Add Gradio App to ASR Docs by @titu1994 :: PR: #5270
Add support for Sampled Softmax for RNNT Joint by @titu1994 :: PR: #5216
Speed up HF data processing script for ASR by @titu1994 :: PR: #5330
bugfix in volume loss for CTC models by @bmwshop :: PR: #5348
Add cpWER for evaluation of ASR with diarization by @tango4j :: PR: #5279
Fix for getting tokenizer in character-based ASR models when using tarred dataset by @jonghwanhyeon :: PR: #5442
Refactor/unify ASR offline and buffered inference by @fayejf :: PR: #5440
Standalone diarization+ASR evaluation script by @tango4j :: PR: #5439
[ASR] Transcribe for multi-channel signals by @anteju :: PR: #5479
Add Silence Augmentation by @fayejf :: PR: #5476
add exportable mel spec by @1-800-BAD-CODE :: PR: #5512
add RNN-T loss implemented by PyTorch and test code by @hainan-xv :: PR: #5312
[ASR] AudioToAudio datasets and related test by @anteju :: PR: #5196
Add StreamingFeatureBufferer class for real-life streaming decoding by @tango4j :: PR: #5534
Pool stats with padding by @1-800-BAD-CODE :: PR: #5403
Adding Hybrid RNNT-CTC model by @VahidooX :: PR: #5364
Fix ASR Buffered inference scripts by @titu1994 :: PR: #5552
Add wer details - insertion, deletion, substitution rate by @fayejf :: PR: #5557
Add support for Time Stamp calculation using transcribe_speech.py by @titu1994 :: PR: #5568
[STT] Add Esperanto (Eo) ASR Conformer-CTC and Conformer-Transducer models by @andrusenkoau :: PR: #5639
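
The "wer details" entry above refers to splitting the word error rate into insertion, deletion, and substitution rates. A minimal sketch of that breakdown, using a standard word-level edit-distance alignment, is shown below. The function name `wer_details` and the returned keys are hypothetical and do not reflect the NeMo API.

```python
def wer_details(reference, hypothesis):
    """Return WER and its substitution/deletion/insertion rates for one pair."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (total_edits, substitutions, deletions, insertions)
    # for aligning the first i reference words with the first j hypothesis words.
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, a = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, a)              # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, a)      # substitution
            t, s, d, a = dp[i - 1][j]
            deletion = (t + 1, s, d + 1, a)
            t, s, d, a = dp[i][j - 1]
            insertion = (t + 1, s, d, a + 1)
            # Tuples compare on total edits first, so this picks a minimal alignment.
            dp[i][j] = min(diagonal, deletion, insertion)
    total, subs, dels, ins = dp[-1][-1]
    n = max(len(ref), 1)                 # avoid division by zero on empty references
    return {
        "wer": total / n,
        "sub_rate": subs / n,
        "del_rate": dels / n,
        "ins_rate": ins / n,
    }


details = wer_details("the cat sat on the mat", "the cat sit on mat")
```

All rates are normalized by the reference length, so the three rates sum to the overall WER.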

TTS

Changelog

[TTS] Fastpitch energy condition and refactoring by @subhankar-ghosh :: PR: #5218
[TTS] HiFi-TTS Download Script by @oleksiivolk :: PR: #5241
[TTS] Add Mandarin/English Bilingual Recipe for Training Fastpitch Models by @yuekaizhang :: PR: #5208
[TTS] fixed type of filepath and rename openslr. by @XuesongYang :: PR: #5276
[TTS] replace obsolete torch_tts unit test marker with run_only_on('CPU') by @XuesongYang :: PR: #5307
[TTS] bugfix IPAG2P and refactor to remove duplicate process. by @XuesongYang :: PR: #5304
Update path to get_data.py in TTS tutorial by @redoctopus :: PR: #5311
[TTS] Replace IPA lambda arguments with locale string by @rlangman :: PR: #5298
[TTS] expand to support flexible dictionary entry formats in IPAG2P. by @XuesongYang :: PR: #5318
[TTS] update organization of model checkpoints and their pointers. by @XuesongYang :: PR: #5327
[TTS] bugfix for the script of generating mels from fastpitch. by @XuesongYang :: PR: #5344
[TTS] Add Spanish model documentation by @rlangman :: PR: #5390
[TTS] Add Spanish FastPitch training configs by @rlangman :: PR: #5383
[TTS] replace pitch normalization params with ??? by @XuesongYang :: PR: #5392
[TTS] Create script for processing TTS training audio by @rlangman :: PR: #5262
[TTS] remove useless logic for set_tokenizer. by @XuesongYang :: PR: #5430
[TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue by @borisfom :: PR: #5358
JOC Optimization in FastPitch by @subhankar-ghosh :: PR: #5450
[TTS] Support speaker level pitch normalization by @rlangman :: PR: #5455
TTS tutorial update: use speaker 9017 instead of 6097 by @redoctopus :: PR: #5532
[TTS] Remove unused TTS eval function by @redoctopus :: PR: #5605
[TTS][ZH] add fastpitch and hifigan model NGC urls and update NeMo docs. by @XuesongYang :: PR: #5596
[TTS][DOC] add notes about automatic conversion to target sampling ra… by @XuesongYang :: PR: #5624
[TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. by @XuesongYang :: PR: #5643
[TTS][ZH] bugfix for ngc cli installation. by @XuesongYang :: PR: #5652
[TTS][ZH] fix broken link for the script. by @XuesongYang :: PR: #5666

NLP / NMT

Changelog

Option to pad the last validation input sequence if it's smaller than the encoder sequence length for MegatronGPT by @anmolgupt :: PR: #5243
Fixes bugs with loss averaging for Megatron GPT by @shanmugamr1992 :: PR: #5329
Fixing bug in Megatron BERT when loss mask is all zeros by @shanmugamr1992 :: PR: #5424
support to disable sequence length + 1 input tokens for each sample in MegatronGPT by @anmolgupt :: PR: #5363
[TN] raise NotImplementedError for unsupported languages and other minor fixes by @XuesongYang :: PR: #5414
Bug fix/gpt by @shanmugamr1992 :: PR: #5493
prompt tuning fix for unscale grad errors by @arendu :: PR: #5523
Bert sequence parallel support by @shanmugamr1992 :: PR: #5494
NLP docs fixes by @vsl9 :: PR: #5528
Switch order of args in optimizer_step override by @ericharper :: PR: #5549
Upgrade to 22.11 by @ericharper :: PR: #5550
Merge r1.13.0 main by @ericharper :: PR: #5570
some tokenizers do not have additional_special_tokens_ids attribute by @arendu :: PR: #5642
Remove cell output from tutorial by @ericharper :: PR: #5689

Text Normalization / Inverse Text Normalization

Changelog

[ITN] fix year date graph, cardinals extension for hundreds by @ekmb :: PR: #5435
[TN] raise NotImplementedError for unsupported languages and other minor fixes by @XuesongYang :: PR: #5414

Export

Changelog

Fixed the onnx bug in conformer for non-streaming models. by @VahidooX :: PR: #5242
Modernize RNNT ONNX export and add TS export by @titu1994 :: PR: #5248
Fixes for Conformer-xl export by @borisfom :: PR: #5309
Remove onnx graphsurgery from Dockerfile by @titu1994 :: PR: #5320
add exportable mel spec by @1-800-BAD-CODE :: PR: #5512

General Improvements

Changelog

bugfix in volume loss for CTC models by @bmwshop :: PR: #5348
Fix setting up of learning rate scheduler by @PeganovAnton :: PR: #5444
Better patch hydra by @titu1994 :: PR: #5591
[TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. by @XuesongYang :: PR: #5643
Add fully torch.jit.script-able speaker clustering module by @tango4j :: PR: #5191
Update perturb.py by @stevehuang52 :: PR: #5231
remove CV requirements. by @XuesongYang :: PR: #5233
checks for accepted adapter type at module level by @arendu :: PR: #5194
fix hypotheses return by @nithinraok :: PR: #5253
Support for inserting additional subsampling in conformer encoder by @shan18 :: PR: #5224
update tutorials to use meeting config as default and VAD by @nithinraok :: PR: #5237
Specifying audio signal dropout separately for the Conformer Encoder by @shan18 :: PR: #5263
created by @bmwshop :: PR: #5268
Fix failing speaker counting for short audio samples by @tango4j :: PR: #5267
O2bert + apex pipeline functions by @shanmugamr1992 :: PR: #5221
Upperbound PTL by @titu1994 :: PR: #5302
Update Interface(s) phonetic entry by @blisc :: PR: #5212
add label inference support to EncDecSpeakerLabel class by @nithinraok :: PR: #5278
Add italian model checkpoints by @Kipok :: PR: #5315
Text Memmap Parsing Improvements by @michalivne :: PR: #5265
Update librosa signature in HF processing script by @titu1994 :: PR: #5321
Force wav file format for audio_filepath by @titu1994 :: PR: #5323
Updates to T0 Dataset and Model by @MaximumEntropy :: PR: #5201
[DOC] add sphinx-copybutton requirement to copy button on code snippets. by @XuesongYang :: PR: #5326
Add support for Hydra multirun to NeMo by @titu1994 :: PR: #5159
typo fix by @arendu :: PR: #5328
add pre-commit hook to automatically sort entries in requirements. by @XuesongYang :: PR: #5333
Add speaker clustering arguments to forward function by @tango4j :: PR: #5306
Fixing de-autocast by @borisfom :: PR: #5319
[Bugfix] Added rm -f / wget -nc command to avoid bash error in multispeaker sim notebook by @tango4j :: PR: #5292
[DOC] added ipython dependency to support IPython.sphinxext extension by @XuesongYang :: PR: #5345
Bug fix (removing old compute consumed samples) by @shanmugamr1992 :: PR: #5355
removed uninstall nemo_cv and nemo_simple_gan and relax numba version… by @XuesongYang :: PR: #5332
Enable mlflow logger by @whrichd :: PR: #4893
Fix Python type hints according to Python Docs by @artbataev :: PR: #5370
Distributed optimizer support for BERT by @timmoon10 :: PR: #5305
SpeakerClustering: fix tensor dimensions in forward() by @virajkarandikar :: PR: #5387
add squad by @arendu :: PR: #5407
added python and c++ alignment code by @yzhang123 :: PR: #5346
Add MoE support for T5 model (w/o expert parallel) by @aklife97 :: PR: #5409
Fix...

Contributors

bmwshop, jonghwanhyeon, and 39 other contributors

07 Dec 21:14

ericharper

v1.13.0

1ff05cc

NVIDIA Neural Modules 1.13.0

Highlights

NeMo ASR

Spoken Language Understanding (SLU) models based on Conformer encoder and transformer decoder
Support for codeswitched manifests during training
Support for Language ID during inference for ML models
Support of cache-aware streaming for offline models
Word confidence estimation for CTC & RNNT greedy decoding

NeMo Megatron

Interleaved Pipeline schedule
Transformer Engine for GPT
HF T5v1.1 -> NeMo-Megatron conversion and finetuning/p-tuning
IA3 and Adapter Tuning (Tensor + Pipeline Parallel)
Pipeline Parallel Support for T5 Prompt Learning
MegatronNMT export

NeMo TTS

TTS introductory tutorial
Phonemizer/espeak removal (Spanish/German)
Char-only support for Spanish/German models
Documentation Refactor

NeMo Core

Upgrade to NGC PyTorch 22.09 container
Add pre-commit hooks
Exponential moving average (EMA) of weights during training
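
For readers unfamiliar with weight EMA, the idea is to keep a smoothed copy of the model parameters that is updated after every optimizer step. The sketch below shows the update rule on plain Python lists. The `decay` value and the function name `ema_update` are assumptions for illustration, not NeMo's configuration or API.

```python
def ema_update(ema_weights, weights, decay=0.999):
    """Return updated EMA weights: ema = decay * ema + (1 - decay) * weight."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


weights = [1.0, -2.0]      # hypothetical model parameters
ema = list(weights)        # the EMA copy starts from the current weights

for step in range(3):
    # Stand-in for an optimizer step: shift each weight by a fixed amount.
    weights = [w + 0.5 for w in weights]
    ema = ema_update(ema, weights, decay=0.9)
```

The EMA copy lags behind the raw weights, which smooths out step-to-step noise. The averaged copy is typically the one used for evaluation.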

NeMo Models

ASR Conformer Croatian: stt_hr_conformer_ctc_large and stt_hr_conformer_transducer_large
ASR Conformer Belarusian: stt_be_conformer_ctc_large and stt_be_conformer_transducer_large
ASR Squeezeformer Librispeech: 6 checkpoints (XS, S, SM, M, ML, L)
SLURP Intent Classification / Slot Filling: slu_conformer_transformer_large_slurp
LanguageID AmberNet: langid_ambernet

Detailed Changelogs

Container

For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo

docker pull nvcr.io/nvidia/nemo:22.09

Known Issues

Issues

Pytest tests for RadTTSModel_export_to_torchscript fail intermittently due to random input values. Fixed in main.

ASR

Changelog

Add docs tutorial on kinyarwanda asr by @bene-ges :: PR: #4953
Asr codeswitch by @bmwshop :: PR: #4821
Add test for nested ASR model by @titu1994 :: PR: #5002
Greedy decoding confidence for CTC and RNNT by @GNroy :: PR: #4931
[ASR][Tools] RIR corpus generator by @anteju :: PR: #4927
Add Squeezeformer CTC model checkpoints on Librispeech by @titu1994 :: PR: #5121
adding loss normalization options to rnnt joint by @bmwshop :: PR: #4829
Asr concat dataloader by @bmwshop :: PR: #5108
Added ASR model comparison to SDE by @Jorjeous :: PR: #5043
Add scripts for converting Spoken Wikipedia to asr dataset by @bene-ges :: PR: #5138
ASR confidence bug fix for older Python versions by @GNroy :: PR: #5180
Update ASR Scores and Results by @titu1994 :: PR: #5254
[STT] Add Ru ASR Conformer-CTC and Conformer-Transducer by @ssh-meister :: PR: #5340

TTS

Changelog

[TTS] Adding speaker embedding conditioning in fastpitch by @subhankar-ghosh :: PR: #4986
[TTS] Remove PhonemizerTokenizer by @rlangman :: PR: #4990
[TTS] FastPitch speaker interpolation by @subhankar-ghosh :: PR: #4997
RADTTS model changes to accommodate export with batch size > 1 by @borisfom :: PR: #4947
[TTS] remove phonemizer.py by @XuesongYang :: PR: #5090
[TTS] Add NeMo TTS Primer Tutorial by @rlangman :: PR: #4933
[TTS] Add SpanishCharsTokenizer by @rlangman :: PR: #5135
Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training by @Kipok :: PR: #5118
refactor TTS documentation organization and add new contents. by @XuesongYang :: PR: #5137
[TTS][DOC] update models trained on HifiTTS dataset. by @XuesongYang :: PR: #5173
[TTS] Fix TTS Primer image markup by @rlangman :: PR: #5192
[TTS] deprecate TextToWaveform base class. by @XuesongYang :: PR: #5205
[TTS] remove the avoidance of circular imports by @XuesongYang :: PR: #5214
[TTS] remove LinVocoder and apply Vocoder as parent class. by @XuesongYang :: PR: #5206
[TTS] unify requirements_tts.txt and requirements_torch_tts.txt by @XuesongYang :: PR: #5232
Minor typo fixes in TTS tutorial by @redoctopus :: PR: #5266
Radtts 1.13 by @borisfom :: PR: #5451
Radtts 1.13 plus by @borisfom :: PR: #5457

NLP / NMT

Changelog

IA3 support for GPT and T5 by @arendu :: PR: #4909
Fix and refactor consumed samples save/restore for Megatron models. by @MaximumEntropy :: PR: #5077
Remove unsupported arguments from MegatronNMT by @MaximumEntropy :: PR: #5065
Update megatron interface to dialogue by @Zhilin123 :: PR: #4936
gpt ia3 CI tests by @arendu :: PR: #5140
Fix NMT Eval Sampler by @aklife97 :: PR: #5154
Add interleaved pipeline schedule to GPT by @ericharper :: PR: #5025
fix for bug in bignlp by @arendu :: PR: #5172
Fixes some args that were not removed properly for multilingual Megatron NMT by @MaximumEntropy :: PR: #5142
Fix absolute path in GPT Adapter CI tests by @arendu :: PR: #5184
Add ability to configure drop last batch for validation datasets with MegatronGPT by @shanmugamr1992 :: PR: #5067
Megatron Export Update by @Davood-M :: PR: #5343
Fix GPT generation when using sentencepiece tokenizer by @MaximumEntropy :: PR: #5413
Disable sync_batch_comm in validation_step for GPT by @ericharper :: PR: #5397
Set sync_batch_comm=False in prompt learning and inference by @MaximumEntropy :: PR: #5448
Fix a bug with positional vs key-word based argument passing in the transformer layer by @MaximumEntropy :: PR: #5475

Text Normalization / Inverse Text Normalization

Changelog

[Chinese text normalization] speed up graph building by @pengzhendong :: PR: #5128

NeMo Tools

Changelog

Added ASR model comparison to SDE by @Jorjeous :: PR: #5043

Export

Changelog

Fix export bug by @VahidooX :: PR: #5009
RADTTS model changes to accommodate export with batch size > 1 by @borisfom :: PR: #4947
Support TorchScript export for Squeezeformer by @titu1994 :: PR: #5164
Expose keep_initializers_as_inputs to Exportable class by @pks :: PR: #5052
Fix the self-attention export bug for cache-aware streaming Conformer by @VahidooX :: PR: #5114
replace ColumnParallelLinear with nn.Linear in export_utils by @arendu :: PR: #5217
Megatron Export Update by @Davood-M :: PR: #5343
Fix Conformer Export in 1.13.0 (cherry-pick from main) by @artbataev :: PR: #5446
export_utils bugfix by @Davood-M :: PR: #5480
Export fixes for Riva by @borisfom :: PR: #5496

General Improvements and Bugfixes

Changelog

don't use bfloat16 when in jit by @bmwshop :: PR: #5051
Set sync_batch_comm=False in prompt learning and inference by @MaximumEntropy :: PR: #5448
Fix a bug with positional vs key-word based argument passing in the transformer layer by @MaximumEntropy :: PR: #5475
Pin Transformers version to fix CI by @SeanNaren :: PR: #4955
Fix changelog builder (#4962) by @titu1994 :: PR: #4963
Checkpoint averaging class fix by @michalivne :: PR: #4946
Add ability to give separate datasets for test, train and validation by @shanmugamr1992 :: PR: #4798
Add simple pre-commit file by @SeanNaren :: PR: #4983
Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execut… by @liji-nv :: PR: #4951
Improvements to AMI script by @SeanNaren :: PR: #4974
clean warnings from tests and CI runs, and prepare for upgrade to PTL 1.8 by @nithinraok :: PR: #4830
Update libraries by @titu1994 :: PR: #5010
add close inactive issues and PRs github action. by @XuesongYang :: PR: #5015
Fix filename extraction in vad_utils.py by @GKPr0 :: PR: #4999
Add black to pre-commit by @SeanNaren :: PR: #5027
[CI] Enable previous build abort when new commit pushed by @SeanNaren :: PR: #5041
Tutorials and Docs for Multi-scale Diarization Decoder by @tango4j :: PR: #4930
Refactor output directory for MSDD Inference Notebook by @SeanNaren :: PR: #5044
text_memmap dataset index range testing fix by @michalivne :: PR: #5034
fix undefined constant in code example by @bene-ges :: PR: #5046
Text generation refactor and RETRO text generation implementation by @yidong72 :: PR: #4985
Lids by @bmwshop :: PR: #4820
Add datasets folder, add diarization datasets voxconverse/aishell by @SeanNaren :: PR: #5042
Fix the bugs in cache-aware streaming Conformer by @VahidooX :: PR: #5032
Bug fix - Limit val batches set to 1.0 by @shanmugamr1992 :: PR: #5023
[bug_fix] kv_channels is used when available by @arendu :: PR: #5066
Add spe_split_by_unicode_script arg by @piraka9011 :: PR: #5072
Transformer Engine Integration by @ericharper :: PR: #5104
Text memmap dataset index memory efficiency by @michalivne :: PR: #5056
Add NGC links for Aligner and FastPitch by @redoctopus :: PR: #5235
Fix link to inference notebook by @redoctopus :: PR: #5247
Fix links to speaker identification notebook by @SeanNaren :: PR: #5260
Fix bug in Dialogue tutorial by @Zhilin123 :: PR: #5277
PCLA tutorial typo fix by @jubick1337 :: PR: #5288
Fix dialogue tutorial bug by @Zhilin123 :: PR: #5297
small bugfix for r1.13.0 by @fayejf :: PR: #5310
Add italian model checkpoints by @Kipok :: PR: #5316
Pcla tutorial fixes by @jubick1337 :: PR: #5313
Fix issue with HF Model upload tutorial by @titu1994 :: PR: #5359
P&C LA tutorial fixes by @jubick1337 :: PR: #5354
Add SDP documentation by @erastorgueva...

Contributors

pks, bmwshop, and 36 other contributors

Releases: NVIDIA/NeMo