Releases: NVIDIA/NeMo
NVIDIA Neural Modules 1.20.0
Highlights
Models
- STT En Fast Conformer CTC XXLarge - 1.2 B param Fast Conformer CTC
- STT En Fast Conformer Transducer XXLarge - 1.2 B param Fast Conformer Transducer
- STT En Fast Conformer Transducer XLarge - XLarge Fast Conformer English
- STT En Fast Conformer CTC XLarge - XLarge Fast Conformer CTC
- STT En Fast Conformer Transducer XLarge - XLarge Fast Conformer Transducer
- STT En Fast Conformer CTC Large - Large Fast Conformer CTC
- STT En Fast Conformer Transducer Large - Large Fast Conformer Transducer
- STT It Fast Conformer Hybrid Large P&C - Large P&C Italian Fast Conformer
- STT Ua Fast Conformer Hybrid Large P&C - Large Ukrainian Fast Conformer
NeMo ASR
- Graph-RNN-T #6168
- WildCard-RNN-T #6168
- Confidence Ensembles for ASR
- Token-and-Duration Transducer (TDT) #6536
- Spellchecking ASR #6179
- Numba FP16 RNNT Loss #6991
NeMo TTS
- TTS Adapter Customization
- TTS Dataloader Framework
NeMo Framework
- LoRA for T5 and mT5 #6612
- Flash Attention integration #6666
- Mosaic 7B compatibility
- Models with LongContext (32K) #6666, #6687, #6773
NeMo Tools
- Speech Data Explorer: Utterance level ASR model comparison #6669
- Speech Data Processor: Spanish P&C
- NeMo Forced Aligner: Large sequence alignment + memory reduction #6695
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.06
Detailed Changelogs
ASR
Changelog
- [ASR] Adding ssl config for fast-conformer by @krishnacpuvvada :: PR: #6672
- Fix for interctc test random failure by @Kipok :: PR: #6644
- sharded manifests docs by @bmwshop :: PR: #6751
- [TTS] Implement new vocoder dataset by @rlangman :: PR: #6670
- TDT model pull request by @hainan-xv :: PR: #6536
- Spec aug fix by @tbartley94 :: PR: #6775
- Support large inputs to Conformer and Fast Conformer by @bmwshop :: PR: #6556
- sharded manifests updated docs by @bmwshop :: PR: #6833
- added fc-xl, xxl and titanet-s models by @nithinraok :: PR: #6832
- Multi-lookahead cache-aware streaming models by @VahidooX :: PR: #6711
- Update transcribe_utils.py by @stevehuang52 :: PR: #6865
- Fix k2 build topo helper by @artbataev :: PR: #6887
- Fix transcribe_utils.py for hybrid models in partial transcribe mode by @stevehuang52 :: PR: #6899
- Add hybrid model support to transcribe_speech_parallel.py by @stevehuang52 :: PR: #6906
- Update Frame-VAD doc by @stevehuang52 :: PR: #6902
- Make sure asr_model.change_attention_model is run if either cfg.model_path or cfg.pretrained_name is specified by @erastorgueva-nv :: PR: #6908
- Update fvad doc by @stevehuang52 :: PR: #6920
- Online Code Switching Dataset for ASR by @trias702 :: PR: #6579
- Fix AN4 dataset links by @artbataev :: PR: #6926
- Fix confidence ensembles RNNT logprobs selection logic for exclude_blank scenario by @KunalDhawan :: PR: #6937
- Adding cache-aware streaming ASR checkpoints. by @VahidooX :: PR: #6940
- Remove compute_on_step from metrics by @titu1994 :: PR: #6979
- Hybrid conformer export by @borisfom :: PR: #6983
- Cache handling without input tensors mutation by @borisfom :: PR: #6980
- Fixing an issue with confidence ensembles by @Kipok :: PR: #6987
- Add ASR with TTS Tutorial. Fix enhancer usage. by @artbataev :: PR: #6955
- fix install_beamsearch_decoders.sh by @karpnv :: PR: #7019
- Add support for Numba FP16 RNNT Loss (#6991) by @titu1994 :: PR: #7038
- Fix typo and branch in tutorial by @artbataev :: PR: #7048
- Refined export_config by @borisfom :: PR: #7053
- Fix documentation for Numba by @titu1994 :: PR: #7065
- Adding docs and models for multiple lookahead cache-aware ASR by @VahidooX :: PR: #7067
- Add updated fc ctc and rnnt xxl models by @nithinraok :: PR: #7128
- Update notebook branch by @ericharper :: PR: #7135
- Fixed main and merging this to r1.20 by @tango4j :: PR: #7127
- Fix default context size by @nithinraok :: PR: #7141
- Fix incorrect embedding grads with distopt BF16 grad reductions by @timmoon10 :: PR: #6958
TTS
Changelog
- [TTS] Add callback for saving audio during FastPitch training by @rlangman :: PR: #6665
- [TTS] Add script for text preprocessing by @rlangman :: PR: #6541
- [TTS] Fix adapter duration issue by @hsiehjackson :: PR: #6697
- [TTS] Filter out silent audio files during preprocessing by @rlangman :: PR: #6716
- [TTS] fix inconsistent type hints for IpaG2p by @XuesongYang :: PR: #6733
- [TTS] relax hardcoded prefix for phonemes and tones and infer phoneme set through dict by @XuesongYang :: PR: #6735
- [TTS] corrected misleading deprecation warnings. by @XuesongYang :: PR: #6702
- Fix TTS adapter tutorial by @hsiehjackson :: PR: #6741
- [TTS][zh] refine hardcoded lowercase for ASCII letters. by @XuesongYang :: PR: #6781
- [TTS] Append pretrained FastPitch & SpectrogramEnhancer pair to available models by @racoiaws :: PR: #7012
NLP / NMT
Changelog
- minor fix for missing chat attr by @arendu :: PR: #6671
- eval fix by @arendu :: PR: #6685
- VP Fixes for converter + Config management by @titu1994 :: PR: #6698
- lora notebook by @arendu :: PR: #6765
- peft eval directly from ckpt by @arendu :: PR: #6785
- GPT inference long context by @ekmb :: PR: #6687
- Fix validation with drop_last=False by @mikolajblaz :: PR: #6704
- fix spellmapper tutorial, change branch to main by @bene-ges :: PR: #6803
- text_generation_utils memory reduction if no logprob needed by @yzhang123 :: PR: #6773
- Add optional index mapping dir in mmap text datasets by @gheinrich :: PR: #6683
- Add inference kv cache support for transformer TE path by @yen-shi :: PR: #6627
- add reference to our paper by @bene-ges :: PR: #6821
- added changes to ramp up bs by @dimapihtar :: PR: #6799
- t5 lora tuning by @arendu :: PR: #6612
- Added rouge monitoring support for T5 by @jubick1337 :: PR: #6737
- GPT extrapolatable position embedding (xpos/sandwich/alibi/kerple) and Flash Attention by @hsiehjackson :: PR: #6666
- Import Enum for chatbot component by @ericharper :: PR: #6877
- typo fix from #6666 by @arendu :: PR: #6882
- removed unnecessary print by @dimapihtar :: PR: #6884
- Fix destructor for delayed mmap dataset case by @mikolajblaz :: PR: #6703
- Make Gradio library optional by @yidong72 :: PR: #6904
- Fix fast-glu activation in change partitions by @hsiehjackson :: PR: #6909
- Documentation for ONNX export of Megatron Models by @asfiyab-nvidia :: PR: #6914
- Fix TextMemMapDataset index file creation in multi-node setup by @gheinrich :: PR: #6768
- Fix flash-attention by @hsiehjackson :: PR: #6901
- ptuning oom fix by @arendu :: PR: #6916
- add rampup bs assertion by @dimapihtar :: PR: #6927
- Enable methods in bert-like models by @sararb :: PR: #6898
- support value attribution condition by @yidong72 :: PR: #6934
- Add missing save restore connector to eval scripts by @titu1994 :: PR: #6935
- Merge release r1.19.0 into main by @ericharper :: PR: #6948
- Stop at the stop token by @yidong72 :: PR: #6957
- fixes for spellmapper by @bene-ges :: PR: #6994
- Fix tabular data text generation by @yidong72 :: PR: #7022
- fix pos id - hf update by @ekmb :: PR: #7075
- fix syntax error introduced in PR-7079 by @bene-ges :: PR: #7102
NeMo Tools
Bugfixes
Changelog
- small Bugfix by @fayejf :: PR: #7079
- Fix caching bug in causal convolutions for cache-aware ASR models by @VahidooX :: PR: #7034
- Fix masking bug for TTS Aligner by @redoctopus :: PR: #6677
- [bugfix] avoid the random shuffle of phoneme and tone tokens. by @XuesongYang :: PR: #6855
- fix ptuning residuals bug by @arendu :: PR: #6866
- TE bug fix by @dimapihtar :: PR: #7027
- Update distopt API for coalesced NCCL calls by @timmoon10 :: PR: #6886
General Improvements
Changelog
- update batch size recommendation to min 32 for 43b by @Zhilin123 :: PR: #6675
- Make Note usage consistent in adapter_mixins.py by @BrianMcBrayer :: PR: #6678
- Update all invalid tree references to blobs for NeMo samples by @BrianMcBrayer :: PR: #6679
- Update README.rst about container by @fayejf :: PR: #6686
- karpnv/issues6690 by @karpnv :: PR: #6705
- Limit codeql scope by @titu1994 :: PR: #6710
- Not pinning Gradio version by @yidong72 :: PR: #6680
- preprocess squad in sft format by @arendu :: PR: #6727
- Fix Codeql config by @titu1994 :: PR: #6731
- Fix fastpitch test nightly by @hsiehjackson :: PR: #6730
- Lora/PEFT training script CI test by @arendu :: PR: #6664
- fixed decor to show messages only when the wrapped object is called. by @XuesongYang :: PR: #6793
- lora pp2 by @arendu :: PR: #6818
- Upperbound Numpy to < 1.24 by @titu1994 :: PR: #6829
- Fix typo in documentation by @Dounx :: PR: #6838
- NFA updates by @erastorgueva-nv :: PR: #6695
- Update container for import action by @ericharper :: PR: #6883
- removed some tests by @arendu :: PR: #6900
- Update contai...
NVIDIA Neural Modules 1.19.1
This release is a small patch to fix torchmetrics.
- Remove deprecated arg compute_on_step. See #6979.
NVIDIA Neural Modules 1.19.0
Highlights
NeMo ASR
- Sharded Manifests for Tarred Datasets #6395
- Frame-VAD model + datasets support #6441
- Noise Norm Perturbation #6445
- Code Switched Dataset with IID Sampling #6448
NeMo TTS
NeMo Megatron
- Batch size rampup #6424
- Unify dataset and model classes for all PEFT #6391
- LoRA for GPT #6391
- Convert interleaved pipeline model to non-interleaved #6498
- Dialog Dataset for SFT #6654
- Dynamic length batches for GPT SFT #6510
- Merge LoRA weights into base model #6597
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.04
Detailed Changelogs
ASR
Changelog
- Sharded manifests for tarred datasets by @bmwshop :: PR: #6395
- Update script for ngram rnnt and hat beam search decoding by @andrusenkoau :: PR: #6370
- Add disclaimer about dataset for ASR by @titu1994 :: PR: #6496
- New noise_norm perturbation based on Riva work by @trias702 :: PR: #6445
- Add Frame-VAD model and datasets by @stevehuang52 :: PR: #6441
- removing unnecessary avoid_bfloat16_autocast_context by @bmwshop :: PR: #6481
- FC models in menu by @bmwshop :: PR: #6473
- Separate punctuation by whitespace by @karpnv :: PR: #6574
- Cherry pick commits in #6601 to main by @fayejf :: PR: #6611
- Offline and streaming inference support for hybrid model by @fayejf :: PR: #6570
- Disable interctc tests by @Kipok :: PR: #6638
- ASR-TTS Models: Support hybrid RNNT-CTC, improve docs. by @artbataev :: PR: #6620
- Confidence ensembles implementation by @Kipok :: PR: #6614
- Confidence ensembles: fix issues and add tuning functionality by @Kipok :: PR: #6657
- Add support for RNNT/hybrid models to partial transcribe by @stevehuang52 :: PR: #6609
- eval_beamsearch_ngram.py with hybrid ctc by @karpnv :: PR: #6656
TTS
Changelog
- [TTS] FastPitch adapter fine-tune and conditional layer normalization by @hsiehjackson :: PR: #6416
- [TTS] whitelist broken path fix. by @XuesongYang :: PR: #6412
- [TTS] FastPitch speaker encoder by @hsiehjackson :: PR: #6417
- Update NeMo_TTS_Primer.ipynb by @pythinker :: PR: #6436
- [TTS] Create functions for TTS preprocessing without dataloader by @rlangman :: PR: #6317
- [TTS] Fix FastPitch energy code by @rlangman :: PR: #6511
- [TTS] Add script for computing feature stats by @rlangman :: PR: #6508
- [TTS] Add tutorials for FastPitch TTS speaker adaptation with adapters by @hsiehjackson :: PR: #6431
- [TTS] Create initial TTS dataset feature processors by @rlangman :: PR: #6507
- [TTS] Add script for mapping speaker names to indices by @rlangman :: PR: #6509
- [TTS] Implement new TextToSpeech dataset by @rlangman :: PR: #6575
NLP / NMT
Changelog
- Add patches for Virtual Parallel conversion by @titu1994 :: PR: #6589
- Update wfst_text_normalization.rst by @jimregan :: PR: #6374
- add rampup batch size support for Megatron GPT by @dimapihtar :: PR: #6424
- Add interleaved pp support by @titu1994 :: PR: #6498
- Support dynamic length batches with GPT SFT by @aklife97 :: PR: #6510
- Framework for PEFT via mixins by @arendu :: PR: #6391
- Add GPT eval mode fix for interleaved to main (#6449) by @aklife97 :: PR: #6610
- sft model can use this script for eval by @arendu :: PR: #6637
- Patch memory used for NeMo Megatron models by @titu1994 :: PR: #6615
- merge lora weights into base model by @arendu :: PR: #6597
- Dialogue dataset by @yidong72 :: PR: #6654
- check for first or last stage by @ericharper :: PR: #6708
- A few small typo fixes by @Kipok :: PR: #6599
- Lddl bert by @wdykas :: PR: #6761
- Debug Transformer Engine FP8 support with Megatron-core infrastructure by @timmoon10 :: PR: #6740
- Tensor-parallel communication overlap with userbuffer backend by @erhoo82 :: PR: #6780
- Add ub communicator initialization to validation step by @erhoo82 :: PR: #6807
- Add trainer.validate example for GPT by @ericharper :: PR: #6794
- Add API docs for NeMo Megatron by @ericharper :: PR: #6850
- Apply garbage collection interval to validation steps by @erhoo82 :: PR: #6870
Bugfixes
Changelog
- [BugFix] Force _get_batch_preds() to keep logits in decoder timestamps generator by @tango4j :: PR: #6499
- small bugfix for asr_evaluator by @fayejf :: PR: #6636
- fix bucketing bug issue for picking new bucket by @nithinraok :: PR: #6663
- [TTS] Fix TTS audio preprocessing bugs by @rlangman :: PR: #6628
- Fix a bug, use _ceil_to_nearest instead as _round_to_nearest is not d… by @BestJuly :: PR: #6681
- Bug fix to restore act ckpt by @markelsanz14 :: PR: #6753
- Bug fix to reset sequence parallelism by @markelsanz14 :: PR: #6756
- Bug fix for reset_sequence_parallel_args by @markelsanz14 :: PR: #6802
- Fix adapter tutorial r1.19.0 by @hsiehjackson :: PR: #6776
- Fix error appearing when using tar datasets by @Jorjeous :: PR: #6502
- Fix normalization of impulse response in ImpulsePerturbation by @anteju :: PR: #6505
- Fix typos by @titu1994 :: PR: #6523
- Fix notebook bad json by @titu1994 :: PR: #6561
- [ASR] Fix for old models in change_attention_model by @sam1373 :: PR: #6608
- Fix k2 installation in Docker with CUDA 12 by @artbataev :: PR: #6707
- Tutorial fixes by @titu1994 :: PR: #6717
- Vp fixes by @titu1994 :: PR: #6738
- [TTS] Fix aligner nan loss in fp32 by @hsiehjackson :: PR: #6435
- fix conversion and eval by @arendu :: PR: #6648
- Fix checkpointed forward and add test for full activation checkpointing by @aklife97 :: PR: #6744
- add call to p2p overlap by @aklife97 :: PR: #6779
- Fix get_parameters when using main params optimizer by @ericharper :: PR: #6764
- Fix GPTDataset Assert by @MaximumEntropy :: PR: #6798
- fix notebook error by @yidong72 :: PR: #6840
- final fix of notebook by @yidong72 :: PR: #6842
General Improvements
Changelog
- Code-Switching dataset creation - upgrading to aggregate tokenizer manifest format by @KunalDhawan :: PR: #6448
- Fix an invalid link in get_data.py of ljspeech by @pythinker :: PR: #6456
- Update manifest.py to use os.path for get_full_path by @stevehuang52 :: PR: #6598
- Cherry pick commits in #6528 to main by @timmoon10 :: PR: #6613
- Move black parameters to pyproject.toml by @artbataev :: PR: #6647
- handle artifacts when path is an extracted dir by @arendu :: PR: #6658
- remove upgrading setuptools in reinstall.sh by @XuesongYang :: PR: #6659
- Upgrade to PyTorch 23.04 Container by @ericharper :: PR: #6660
- Fix fastpitch test nightly by @hsiehjackson :: PR: #6742
- Fix Links for tutorials by @titu1994 :: PR: #6777
- Update core version in Jenkinsfile by @aklife97 :: PR: #6817
- Update mcore requirement to 0.2.0 by @ericharper :: PR: #6875
NVIDIA Neural Modules 1.18.1
Highlights
For the complete release note, please see NeMo 1.18.0 Release Notes
Bugfix
This patch release fixes a major bug in ASR bucketing datasets that was introduced in r1.17.0 in PR #6191. Because of this bug, although each bucket was randomly shuffled before selection on each rank, only a single bucket would loop infinitely, and iteration never continued on to the subsequent buckets.
Effect: Significantly worse WER would be obtained since not all buckets would be used.
This has been patched and should work correctly in 1.18.1 onwards.
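The failure mode described above can be pictured with a small toy sketch. This is not NeMo's bucketing code: the function names (`buggy_stream`, `fixed_stream`, `coverage`) and the bucket layout are hypothetical, chosen only to show why looping over a single bucket leaves the remaining buckets unused.

```python
import random
from itertools import cycle, islice


def shuffled(buckets, seed):
    """Return a shuffled copy of every bucket (the shuffling itself was not the problem)."""
    rng = random.Random(seed)
    return [rng.sample(bucket, len(bucket)) for bucket in buckets]


def buggy_stream(buckets, seed=0):
    """Toy version of the bug: the first bucket loops forever, later buckets are never reached."""
    first_bucket = shuffled(buckets, seed)[0]
    return cycle(first_bucket)


def fixed_stream(buckets, seed=0):
    """Toy version of the expected behaviour: every bucket is visited before the next pass starts."""
    return cycle([batch for bucket in shuffled(buckets, seed) for batch in bucket])


def coverage(stream, n_batches):
    """Collect the ids of the buckets that contributed to the first n_batches batches."""
    return {bucket_id for bucket_id, _ in islice(stream, n_batches)}


# Three hypothetical buckets of four batches each; every batch is tagged with its bucket id.
buckets = [[(bucket_id, i) for i in range(4)] for bucket_id in range(3)]

print(coverage(buggy_stream(buckets), 24))  # {0}
print(coverage(fixed_stream(buckets), 24))  # {0, 1, 2}
```

In the toy setup the buggy stream only ever yields batches from one bucket, which mirrors the reported effect: a model trained this way sees a narrow slice of the data, so worse WER is expected.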
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.03
NVIDIA Neural Modules 1.18.0
Highlights
Models
- GPT-2B-001, trained on 1.1T tokens with 4K sequence length.
- STT En Fast Conformer-CTC Large
- STT En Fast Conformer-Transducer Large
- STT En Fast Conformer-Transducer Large LibriSpeech
- STT En FastConformer Hybrid Transducer-CTC Large P&C
- STT De FastConformer Hybrid Transducer-CTC Large P&C
- STT Es FastConformer Hybrid Transducer-CTC Large P&C
- STT It FastConformer Hybrid Transducer-CTC Large P&C
- STT Pl FastConformer Hybrid Transducer-CTC Large P&C
- STT Ua FastConformer Hybrid Transducer-CTC Large P&C
- STT Hr FastConformer Hybrid Transducer-CTC Large P&C
- STT By Conformer-RNNT Large
NeMo ASR
- Hybrid Autoregressive Transducer (HAT) #6260
- Apple MPS Support for ASR Inference #6289
- InterCTC Support for Hybrid ASR Models #6215
- RNNT N-Gram Fusion with mAES algo #6118
- ASR + Apple M2 CPU/GPU MPS #6289
NeMo TTS
- TTS directory structure refactor
- User-set symbol vocabulary #6172
NeMo Megatron
- Model parallelism from Megatron Core #6393
- Continued training for P-tuning #6273
- SFT for GPT-3 #6210
- Tensor and pipeline model parallel conversion #6218
- Megatron NMT Export to Riva
NeMo Core
Detailed Changelogs
ASR
Changelog
- minor cleanup by @messiaen :: PR: #6311
- docs on the use of heterogeneous test / val manifests by @bmwshop :: PR: #6352
- [WIP] add buffered chunked streaming for nemo force aligner by @Slyne :: PR: #6185
- Word boosting for Flashlight decoder by @trias702 :: PR: #6367
- Add installation and ASR inference instructions for Mac by @artbataev :: PR: #6377
- specaug speedup by @1-800-BAD-CODE :: PR: #6347
- updated lr for FC configs by @bmwshop :: PR: #6379
- Make possible to control tqdm progress bar in ASR models by @SN4KEBYTE :: PR: #6375
- [ASR] Conformer global tokens in local attention by @sam1373 :: PR: #6253
- fixed torch warning on using a list of numpy arrays by @MKNachesa :: PR: #6382
- Fix FastConformer config: correct bucketing strategy by @artbataev :: PR: #6413
- fixing the ability to use temp sampling with concat datasets by @bmwshop :: PR: #6423
- add conformer configs for hat model by @andrusenkoau :: PR: #6372
- [ASR] Add optimization util for linear sum assignment algorithm by @tango4j :: PR: #6349
- Added/updated new Conformer configs by @VahidooX :: PR: #6426
- Fix typos by @titu1994 :: PR: #6494
- Fix typos (#6523) by @titu1994 :: PR: #6539
- added back the fast emit section to the configs. by @VahidooX :: PR: #6540
- Add FastConformer Hybrid ASR models for EN, ES, IT, DE, PL, HR, UA, BY by @KunalDhawan :: PR: #6549
- Add scores for FastConformer models by @titu1994 :: PR: #6557
- Patch transcribe and support offline transcribe for hybrid model by @fayejf :: PR: #6550
- More streaming conformer export fixes by @messiaen :: PR: #6567
- Documentation for ASR-TTS models by @artbataev :: PR: #6594
- Patch transcribe_util for streaming mode and add wer calculation back to inference scripts by @fayejf :: PR: #6601
- Add HAT image to docs by @andrusenkoau :: PR: #6619
- Patch decoding for PC models by @titu1994 :: PR: #6630
- Fix wer.py where 'errors' variable was not set by @stevehuang52 :: PR: #6633
- Fix for old models in change_attention_model by @VahidooX :: PR: #6635
TTS
Changelog
NLP / NMT
Changelog
- [Core] return_config=True now extracts just config, not full tarfile by @titu1994 :: PR: #6346
- restore path for p-tuning by @arendu :: PR: #6273
- taskname and early stopping for adapters by @arendu :: PR: #6366
- Adapter tuning accepts expanded language model dir by @arendu :: PR: #6376
- Update gpt_training.rst by @blisc :: PR: #6378
- Megatron GPT model finetuning by @MaximumEntropy :: PR: #6210
- [NeMo Megatron] Cleanup configs to infer the models TP PP config automatically by @titu1994 :: PR: #6368
- Fix prompt template unescaping by @MaximumEntropy :: PR: #6399
- Add support for Megatron GPT Untied Embd TP PP Change by @titu1994 :: PR: #6388
- Move Parallelism usage from Apex -> Megatron Core by @aklife97 :: PR: #6393
- Add ability to enable/disable act ckpt and seq parallelism in GPT by @markelsanz14 :: PR: #6327
- Refactor PP conversion + add support for TP only conversion by @titu1994 :: PR: #6419
- fix CPU overheads of GPT synthetic dataset by @xrennvidia :: PR: #6427
- check if grad is none before calling all_reduce by @arendu :: PR: #6428
- Fix replace_bos_with_pad not found by @aklife97 :: PR: #6443
- Support Swiglu in TP PP Conversion by @titu1994 :: PR: #6437
- BERT pre-training mp fork to spawn by @aklife97 :: PR: #6442
- Megatron encoder decoder fix for empty validation outputs by @michalivne :: PR: #6459
- Reduce workers on NMT CI by @aklife97 :: PR: #6472
- Switch to NVIDIA Megatron repo by @aklife97 :: PR: #6465
- Megatron KERPLE positional embeddings by @michalivne :: PR: #6478
- Support in external sample mapping for Megatron datasets by @michalivne :: PR: #6462
- Fix custom by @aklife97 :: PR: #6512
- GPT fp16 inference fix by @MaximumEntropy :: PR: #6543
- Fix for T5 FT model by @aklife97 :: PR: #6529
- Pass instead of scaler object to core by @aklife97 :: PR: #6545
- Change Megatron Enc Dec model to use persistent_workers by @aklife97 :: PR: #6548
- Turn autocast off when precision is fp32 by @aklife97 :: PR: #6554
- Fix batch size reconf for T5 FT for multi-validation by @aklife97 :: PR: #6582
- Make tensor split contiguous for qkv and kv in attention by @aklife97 :: PR: #6580
- Patches from main to r1.18.0 for Virtual Parallel by @titu1994 :: PR: #6592
- Create dummy iters to satisfy iter type len checks in core + update core commit by @aklife97 :: PR: #6600
- Restore GPT support for interleaved pipeline parallelism by @timmoon10 :: PR: #6528
- Add megatron_core to requirements by @ericharper :: PR: #6639
Export
Changelog
Bugfixes
Changelog
- Fix the GPT SFT datasets loss mask bug by @yidong72 :: PR: #6409
- [BugFix] Fix multi-processing bug in data simulator by @tango4j :: PR: #6310
- Fix cache aware hybrid bugs by @VahidooX :: PR: #6466
- [BugFix] Force _get_batch_preds() to keep logits in decoder timestamp… by @tango4j :: PR: #6500
- Fixing bug in unsort_tensor by @borisfom :: PR: #6320
- Bugfix for BF16 grad reductions with distopt by @timmoon10 :: PR: #6340
- Limit urllib3 version to patch issue with RTD by @aklife97 :: PR: #6568
General improvements
Changelog
- Pin the version to hopefully fix rtd build by @SeanNaren :: PR: #6334
- enabling diverse datasets in val / test by @bmwshop :: PR: #6306
- extract inference weights by @arendu :: PR: #6353
- Add opengraph support for NeMo docs by @titu1994 :: PR: #6380
- Adding basic preemption code by @athitten :: PR: #6161
- Add documentation for preemption support by @athitten :: PR: #6403
- Update hyperparameter recommendation based on experiments by @Zhilin123 :: PR: #6405
- exceptions with empty test / val ds config sections by @bmwshop :: PR: #6421
- Upgrade pt 23.03 by @ericharper :: PR: #6430
- Update README to add core installation by @aklife97 :: PR: #6488
- Not doing CastToFloat by default by @borisfom :: PR: #6524
- Update manifest.py for speedup by @stevehuang52 :: PR: #6565
- Update SDP docs by @erastorgueva-nv :: PR: #6485
- Update core commit hash in readme by @aklife97 :: PR: #6622
- Remove from jenkins by @ericharper :: PR: #6641
- Remove dup by @ericharper :: PR: #6643
NVIDIA Neural Modules 1.17.0
Highlights
NeMo ASR
- Online Clustering Diarizer
- High Level Diarization API
- PyCTC Decode Beam Search Support
- RNNT Beam Search Alignment Extraction
- InterCTC Loss
- AIStore Documentation
- ASR & AWS Multi-node Integration
- Convolution Invariant SDR losses
NeMo TTS
NeMo Megatron
- SquaredReLU, SwiGLU, No-Dropout
- Rotary Position Embedding
- Untie word embeddings and output projection
NeMo Core
- Dynamic freezing of modules during training
- NeMo Multi-Run Documentation
- ClearML Logging
- Early Stopping
- Experiment Manager Docs Update
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.02
Detailed Changelogs
ASR
Changelog
- Support Alignment Extraction for all RNNT Beam decoding methods by @titu1994 :: PR: #5925
- Use module-based k2 import guard by @artbataev :: PR: #6006
- Default RNNT loss to int64 targets by @titu1994 :: PR: #6011
- Added documentation section for ASR datasets from AIStore by @anteju :: PR: #6008
- Change perturb rng for reproducing results easily by @fayejf :: PR: #6042
- InterCTC loss and stochastic depth implementation by @Kipok :: PR: #6013
- Add pyctcdecode to high level beam search API by @titu1994 :: PR: #6026
- Convert esperanto into a notebook by @SeanNaren :: PR: #6070
- [ASR] Added a script for evaluating metrics for audio-to-audio by @anteju :: PR: #5971
- [ASR] Convolution-invariant SDR loss + unit tests by @anteju :: PR: #5992
- Adjust stochastic depth dropout probability calculation by @anteju :: PR: #6120
- Add file class based inference API for diarization by @SeanNaren :: PR: #5945
- Ngram by @karpnv :: PR: #6063
- remove duplicate definition of manifest read and write func. by @XuesongYang :: PR: #6088
- Streaming conformer CTC export by @messiaen :: PR: #5837
- [TTS] Make mel spectrogram norm configurable by @rlangman :: PR: #6155
- Ngram lm fusion for RNNT maes decoding by @andrusenkoau :: PR: #6118
- ASR Beam search documentation by @titu1994 :: PR: #6244
TTS
Changelog
- [TTS][ZH] added new NGC model cards with polyphone disambiguation. by @XuesongYang :: PR: #5940
- [TTS] deprecate AudioToCharWithPriorAndPitchDataset. by @XuesongYang :: PR: #5959
- [TTS][G2P] deprecate add_symbols by @XuesongYang :: PR: #5961
- Added list_available_models by @treacker :: PR: #5967
- Update Fastpitch energy bug by @blisc :: PR: #5969
- removed WHATEVER(1) ˌhwʌˈtɛvɚ from scripts/tts_dataset_files/ipa_cmudict-0.7b_nv22.10.txt by @MikyasDesta :: PR: #5869
- ONNX export for RadTTS by @borisfom :: PR: #5880
- Add some info about FastPitch SSL model by @redoctopus :: PR: #5994
- Vits doc by @treacker :: PR: #5989
- Ragged batching changes for RadTTS, some refactoring by @borisfom :: PR: #6020
- Working enabled ragged batching with ONNX by @borisfom :: PR: #6030
- [TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982
- [TTS] Add Spanish IPA dictionaries and heteronyms by @rlangman :: PR: #6037
- [TTS] Separate TTS tokenization and g2p util to fix circular import by @rlangman :: PR: #6080
- [TTS][refactor] Part 7 - move module from model file. by @XuesongYang :: PR: #6098
- [TTS][refactor] Part 1 - nemo.collections.tts.data by @XuesongYang :: PR: #6099
- [TTS][refactor] Part 2 - nemo.collections.tts.parts by @XuesongYang :: PR: #6105
- [TTS][refactor] Part 6 - remove nemo.collections.tts.torch.README.md and tts_dataset.yaml by @XuesongYang :: PR: #6103
- [TTS][refactor] Part 3 - nemo.collections.tts.g2p.models by @XuesongYang :: PR: #6113
- [TTS] update German NGC models trained on Thorsten Datasets by @XuesongYang :: PR: #6125
- [TTS] remove old waveglow model that relies on torch_stft. by @XuesongYang :: PR: #6128
- [TTS] Move Spanish polyphones from heteronym to dictionary by @rlangman :: PR: #6123
- [TTS][refactor] Part 8 - added model inference tests to safeguard changes. by @XuesongYang :: PR: #6129
- remove duplicate definition of manifest read and write func. by @XuesongYang :: PR: #6088
- [TTS][refactor] update tutorial import paths. by @XuesongYang :: PR: #6176
- [TTS] Add univnet scheduler by @ArtyomZemlyak :: PR: #6157
- [TTS] Make mel spectrogram norm configurable by @rlangman :: PR: #6155
NLP / NMT
Changelog
- add new languages to doc by @yzhang123 :: PR: #5939
- Distributed Adam optimizer overlaps param all-gather with forward compute by @timmoon10 :: PR: #5684
- Refactor the retrieval services for microservice architecture by @yidong72 :: PR: #5910
- make validation accuracy reporting optional for adapters/ptuning by @arendu :: PR: #5843
- Add BERT support for overlapping forward compute with distopt communication by @timmoon10 :: PR: #6024
- [TTS/TN/G2P] Remove Text Processing from NeMo, move G2P to TTS by @ekmb :: PR: #5982
- adding early stop callback to ptuning by @arendu :: PR: #6028
- Pr doc tn by @yzhang123 :: PR: #6041
- Adds several configurable flags for Megatron GPT models by @MaximumEntropy :: PR: #5991
- P-tuning refactor Part 1/N by @arendu :: PR: #6054
- Fast glu activations by @MaximumEntropy :: PR: #6058
- P-tuning refactor Part 2/N by @arendu :: PR: #6056
- P-tuning refactor Part 3/N by @arendu :: PR: #6106
- Explicitly check for untied embeddings when logging params by @MaximumEntropy :: PR: #6085
- Add flag to get attention from fusion by @ericharper :: PR: #6049
- Improving text memmap generated index files error messages by @michalivne :: PR: #6093
- Megatron Encoder-Decoder Sampler Function by @michalivne :: PR: #6095
- Sentence piece legacy false compatibility by @arendu :: PR: #6154
- convert Megatron LM ckpt to NeMo PP support. by @yidong72 :: PR: #6159
- Avoid multiple warnings for loss mask by @mikolajblaz :: PR: #6062
- Propagate LayerNorm1P to TE by @mikolajblaz :: PR: #6061
- Filter p-tuning by example length by @arendu :: PR: #6182
- Add sequence parallel support to Rope positional embedding by @yidong72 :: PR: #6178
- Use a separate communicator for DP AMAX reduction by @erhoo82 :: PR: #6022
- Add persistent workers to GPT by @ericharper :: PR: #6205
- Micro batch loader for bert model by @shanmugamr1992 :: PR: #6046
- GPT P tuning Eval changes (#5952) by @aklife97 :: PR: #6272
- add template for taskname=taskname by @Zhilin123 :: PR: #6283
- added RPE + fixed RMSNorm by @Davood-M :: PR: #6304
- simplified notebook for p-tuning by @arendu :: PR: #6326
- Added num decoder blocks in megatron export by @Davood-M :: PR: #6331
Text Normalization / Inverse Text Normalization
Export
Changelog
- ONNX export for RadTTS by @borisfom :: PR: #5880
- Working enabled ragged batching with ONNX by @borisfom :: PR: #6030
- Update docs for ExpManager and Exportable frameworks by @titu1994 :: PR: #6165
- Streaming conformer CTC export by @messiaen :: PR: #5837
- MixedFusedRMSNorm Export Fix by @Davood-M :: PR: #6296
- Added num decoder blocks in megatron export by @Davood-M :: PR: #6331
Bugfixes
Changelog
- Fix bug where GPT always enabled distopt overlapped param sync by @timmoon10 :: PR: #5995
- CS bugfix by @bmwshop :: PR: #6122
- RNNT patch by @titu1994 :: PR: #6231
- Notebook fixes by @titu1994 :: PR: #6212
- Small fixes for flashlight decoder by @trias702 :: PR: #6071
- Various fixes in docs and RNNT by @titu1994 :: PR: #6156
- Fix k2 and torchaudio installation (Docker, macOS) by @artbataev :: PR: #6094
- update and deprecate warning for Mic notebook by @fayejf :: PR: #6307
- small bugfix and add asr evaluator to doc by @fayejf :: PR: #6229
- Bug fixing for bucketing dataset by @VahidooX :: PR: #6191
- Fix character beam decoding algorithm with vocab index map by @titu1994 :: PR: #6140
- fix typo in asr evaluator readme by @fayejf :: PR: #6053
- Fix typos by @titu1994 :: PR: #6241
- [ASR]:fixed augmentor arguments for transcribe functionality of Hybrid CTC-RNNT model by @KunalDhawan :: PR: #6290
- Fix hybrid transcribe by @ArtyomZemlyak :: PR: #6003
- Fix bucketing seeding by @VahidooX :: PR: #6254
- Fix for CTC decoder setup by @vsl9 :: PR: #6303
- Fix RNNT Joint narrow() by @titu1994 :: PR: #6336
- Fix bugs with interctc mixin by @Kipok :: PR: #6228
- Update IPA dict path in tutorial by @redoctopus :: PR: #6208
- [TTS] fix broken tutorial for Tacotron2 by @XuesongYang :: PR: #6199
- [TTS] fix bugs for chinese and german tutorials. by @XuesongYang :: PR: #6216
- Fix radtts sort r17 by @borisfom :: PR: #6344
- Quick Fix for RadTTS test by @blisc :: PR: #6034
- Disabling radtts tests until we have real model by @borisfom :: PR: #6036
- fix val loss computation in megatron by @anmolgupt :: PR: #5871
- Fix incomplete batches by @mikolajblaz :: PR: #6083
- Avoid unnecessarily accessing data loader with pipeline parallelism by @timmoon10 :: PR: #6164
- bugfix: file handlers are not closed. by @XuesongYang :: PR: #5956
- Fix Silence Sampling Algorithm for ASR Multi-speaker Data Simulator by @stevehuang52 :: PR: #5897
- Fix Windows bug with save_restore_connector by @trias702 :: PR: #5919
- fix broken link by @ericharper :: PR: #5968
- Fix torchaudio installation by @artbataev :: PR: #5850
- Fix reinstall.sh dependencies by @titu1994 :: PR: #6027
- Adding changes to fix the mv error by @tango4j :: PR: #6087
- Fix README by @flx42 :: PR: #6137
- Fix typos in voiceapp notebook by @titu1994 :: PR: #6262
- [BugFix] Fix diarization result path errors in tutorial notebook for r1.17.0 by @tango4j :: PR: #6234
- [BugFix] Fix ...
NVIDIA Neural Modules 1.16.0
Highlights
NeMo ASR
- ASR Evaluator
- Multi-channel dereverberation algorithm
- Hybrid ASR-TTS Models
- Flashlight Decoder Beam Search
- FastConformer Encoder with 8x subsampling
NeMo TTS
- SSL Voice Conversion
- Spectrogram Enhancer
- VITS
NeMo Megatron
- Per microbatch dataloader for GPT and BERT
- Adapters compatible with Faster Transformer
NeMo Core
- Nested model support
NeMo Tools
- NeMo Forced Aligner
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:23.01
ASR
Changelog
- Fix for incorrect computation of batched alignment in transducers by @Kipok :: PR: #5692
- Set the stream position to 0 for pydub by @jonghwanhyeon :: PR: #5752
- [Fix] ConformerEncoder forward when length is None by @anteju :: PR: #5761
- ASR evaluator by @fayejf :: PR: #5728
- [ASR][Test] Enable test for cache audio with a single worker by @anteju :: PR: #5763
- Flashlight Decoder for Nemo by @trias702 :: PR: #5790
- Fix data simulator by @stevehuang52 :: PR: #5813
- [ASR] Mask-based dereverb algorithm by @anteju :: PR: #5693
- Concat dataset and aistore support for label models by @Kipok :: PR: #5826
- Adding new features and speed up for multi-speaker data simulator by @tango4j :: PR: #5846
- Add Esperanto ASR example by @andrusenkoau :: PR: #5772
- Fix memory allocation of NeMo Multi-speaker Data Simulator by @stevehuang52 :: PR: #5864
- [ASR] Separate Audio-to-Text (BPE, Char) dataset construction by @artbataev :: PR: #5774
- Reduce memory usage in getMultiScaleCosAffinityMatrix function by @gabitza-tech :: PR: #5876
- Hybrid ASR-TTS models by @artbataev :: PR: #5659
- Set providers for onnxruntime inference session by @athitten :: PR: #5903
- [ASR] Configurable metrics for audio-to-audio + removed experimental decorators by @anteju :: PR: #5827
- Correct doc for RNNT transcribe() function by @titu1994 :: PR: #5904
- Update isort to the latest version by @artbataev :: PR: #5895
- FilterbankFeaturesTA to match FilterbankFeatures by @msis :: PR: #5913
- Fix hybridasr bug by @VahidooX :: PR: #5950
- replace symbols by @nithinraok :: PR: #5974
- fast conformer configs and doc by @bmwshop :: PR: #5970
- Update TitaNet-L and MSDD models by @nithinraok :: PR: #6023
- Fix enhancer usage by @artbataev :: PR: #6059
- update librosa args by @nithinraok :: PR: #6086
- Fix enhancer usage in ASR-TTS examples by @artbataev :: PR: #6116
- Fix k2 and torchaudio installation (Docker, macOS). Cherry-pick (#6094) by @artbataev :: PR: #6124
TTS
Changelog
- [TTS] Update Spanish TTS model to 1.15 by @rlangman :: PR: #5742
- [TTS][DE] refine grapheme-based tokenizer and fastpitch training recipe on thorsten's neutral datasets. by @XuesongYang :: PR: #5753
- No-script TS export, prepared for ONNX export by @borisfom :: PR: #5653
- Fixing masking in RadTTS bottleneck layer by @borisfom :: PR: #5771
- Port Riva's mel cepstral distortion w/ dynamic time warping notebook by @redoctopus :: PR: #5778
- Update radtts' infer path by @blisc :: PR: #5788
- [TTS][DE] Augment tokenization/G2P to preserve capitalization of words and mix phonemes with word-level graphemes for an input text. by @XuesongYang :: PR: #5805
- [TTS] porting VITS implementation by @treacker :: PR: #5600
- [TTS][DE] updated IPA dictionary and heteronyms by @XuesongYang :: PR: #5860
- [TTS] GAN-based spectrogram enhancer by @racoiaws :: PR: #5565
- TTS inference with Heteronym classification model, hc model inference refactoring by @ekmb :: PR: #5768
- Remove MCD_DTW tarball by @redoctopus :: PR: #5889
- Hybrid ASR-TTS models by @artbataev :: PR: #5659
- Moved eval notebook data to aws by @redoctopus :: PR: #5911
- [G2P] fixed typos and broken import library. by @XuesongYang :: PR: #5978
- [G2P] backward compatibility for english tokenizer and bugfix by @XuesongYang :: PR: #5980
- fix links, add missing file by @ekmb :: PR: #6044
- [TTS] Spectrogram Enhancer: correct dim for length when loading data by @racoiaws :: PR: #6048
- [TTS] bugfix for fastpitch German tutorial by @XuesongYang :: PR: #6051
- [TTS] bugfix Chinese Fastpitch tutorial by @XuesongYang :: PR: #6055
- Fix enhancer usage by @artbataev :: PR: #6059
- [TTS] Spectrogram Enhancer: support arbitrary input length by @racoiaws :: PR: #6060
- Fix enhancer usage in ASR-TTS examples by @artbataev :: PR: #6116
- [TTS] Spectrogram Enhancer: add option to zero out the initial tensor by @racoiaws :: PR: #6136
NLP / NMT
Changelog
- Fix P-Tuning Truncation by @vadam5 :: PR: #5663
- Adithyare/prompt learning seed by @arendu :: PR: #5749
- Add extra data args to support proper finetuning of HF converted T5 checkpoints by @MaximumEntropy :: PR: #5719
- Don't add output directory twice when creating shared sentencepiece tokenizer by @pks :: PR: #5737
- add constraint info on batch size for tar dataset by @yzhang123 :: PR: #5812
- remove transformer version upper bound by @Zhilin123 :: PR: #5831
- Adithyare/adapter new placement by @arendu :: PR: #5791
- Add SSL import functionality for Audio Lexical PNC Models by @trias702 :: PR: #5834
- validation batch sizing and drop_last controls by @arendu :: PR: #5830
- Remove ending newlines when encoding strings w/ sentencepiece tokenizer by @pks :: PR: #5739
- Fix segmenting for pcla inference by @jubick1337 :: PR: #5849
- RETRO model finetuning by @yidong72 :: PR: #5800
- Optimizing distributed Adam when running with one work queue by @timmoon10 :: PR: #5560
- Add option to disable distributed parameters in distributed Adam optimizer by @timmoon10 :: PR: #5685
- set max_steps for lr decay through config by @anmolgupt :: PR: #5780
- Fix Prompt text space issue by @aklife97 :: PR: #5983
- Add batch_size to prompt_learning generate by @aklife97 :: PR: #6091
NeMo Tools
Changelog
- [Tools] NeMo Forced Aligner by @erastorgueva-nv :: PR: #5571
- [Tools] Fix ctc segmentation: exclude audacity files by @ekmb :: PR: #6009
Export
Changelog
General Improvements
Changelog
- Pin lightning version less than 1.9.0 by @SeanNaren :: PR: #5822
- Davidm/cherrypick r1.16.0 by @Davood-M :: PR: #6082
- Update files for lightning 1.9.0 by @SeanNaren :: PR: #5823
- Tn doc 16 by @yzhang123 :: PR: #5954
- Ensure EMA checkpoints are also deleted when normal checkpoints are by @SeanNaren :: PR: #5724
- [Fix] ConformerEncoder forward when length is None by @anteju :: PR: #5761
- Fix EMA topk checkpoint deletion by @SeanNaren :: PR: #5758
- [BugFix] decoder timestamp count has a mismatch when is decoded by @tango4j :: PR: #5825
- Update 00_NeMo_Primer.ipynb by @schaltung :: PR: #5740
- Sanitize params before DLLogger log_hyperparams by @milesial :: PR: #5736
- NeMo Forced Aligner by @erastorgueva-nv :: PR: #5571
- Add EMA Docs, fix common collection documentation by @SeanNaren :: PR: #5757
- Add container info to main page by @fayejf :: PR: #5816
- CommonVoice support for script by @SeanNaren :: PR: #5797
- Support nested NeMo models by @artbataev :: PR: #5671
- fix max len generation t5 by @ekmb :: PR: #5852
- NFA samples fix by @erastorgueva-nv :: PR: #5856
- fix(readme): fix typo by @jqueguiner :: PR: #5883
- Block large files from being merged into NeMo main by @SeanNaren :: PR: #5898
- Pin isort version by @artbataev :: PR: #5914
- fixed missing long_description_content_type by @XuesongYang :: PR: #5909
- Update container to 23.01 by @ericharper :: PR: #5917
- remove conda pynini install by @ekmb :: PR: #5921
- Update align.py by @Slyne :: PR: #6043
- Fixing data simulator argument and bash scripting error by @tango4j :: PR: #6112
- Update apex commit by @ericharper :: PR: #6148
NVIDIA Neural Modules 1.15.0
Highlights
NeMo ASR
- HybridTransducer-CTC ASR
- Greedy timestamp decoding with inference script
- MHA adapters
- Conformer local attention (longformer)
- High level beam search API
- Multiblank transducer
- Multi-channel audio processing model
- AIstore for ASR datasets
NeMo Megatron
- ALiBi position embeddings support for T5
NeMo TTS
- Chinese TTS pipeline with polyphone disambiguation
NeMo Core
- Optimizer based EMA
- MLFlow logger support
Models
- stt_eo_conformer_ctc_large (HF, NGC) Esperanto ASR model.
- stt_eo_conformer_transducer_large (HF, NGC) Esperanto ASR model.
Detailed Changelogs
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:22.12
ASR
Changelog
- optimized loop and bugfix by @Jorjeous :: PR: #5573
- Update torchmetrics by @nithinraok :: PR: #5566
- Add an option to defer data setup from init to setup by @anteju :: PR: #5569
- AIStore for ASR datasets by @anteju :: PR: #5462
- Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
- Update documentation and tutorials for Adapters by @titu1994 :: PR: #5610
- Conformer local attention by @sam1373 :: PR: #5525
- Add core classes and functions for online clustering diarizer part 1 by @tango4j :: PR: #5526
- [Add] ASR+VAD Inference Pipeline by @stevehuang52 :: PR: #5575
- [ASR] Audio processing base, multi-channel enhancement models by @anteju :: PR: #5356
- Expose ClusteringDiarizer device by @SeanNaren :: PR: #5681
- Add Beam Search support to ASR transcribe() by @titu1994 :: PR: #5443
- Multiblank Transducer by @hainan-xv :: PR: #5527
- pin torchmetrics version by @nithinraok :: PR: #5720
- Update torchaudio dependency version for tutorials by @titu1994 :: PR: #5781
- update torchmetrics to latest version by @nithinraok :: PR: #5801
- Fix transducer and question answering tutorial bugs by @Zhilin123 :: PR: #5809
- [BugFix] Updated CTC decoders installation in tutorial by @vsl9 :: PR: #5833
- update torchmetrics args confusionmatrix by @nithinraok :: PR: #5853
- indentation fix by @nithinraok :: PR: #5861
- Fix wrong label mapping in batch_inference for label_model by @fayejf :: PR: #5767
TTS
Changelog
- Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
- [TTS] fix ranges of char set for accented letters. by @XuesongYang :: PR: #5607
- [TTS] add type hints and change variable names for tokenizers and g2p by @XuesongYang :: PR: #5602
- Fixed RadTTS unit test by @borisfom :: PR: #5572
- [TTS][ZH] Disambiguate polyphones with augmented dict and Jieba segmenter for Chinese FastPitch by @yuekaizhang :: PR: #5541
- Add duration padding support for RADTTS inference by @kevjshih :: PR: #5650
- [TTS] add tts dict cust notebook by @ekmb :: PR: #5662
- [TN/TTS docs] TN customization, g2p docs moved to tts by @ekmb :: PR: #5683
- typo and link fixed by @ekmb :: PR: #5741
- link fixed by @ekmb :: PR: #5745
- Update Tacotron2 NGC checkpoint load to latest version by @redoctopus :: PR: #5760
- Docs g2p update by @ekmb :: PR: #5769
- [TTS][ZH] bugfix import jieba errors. by @XuesongYang :: PR: #5776
NLP / NMT
Changelog
- Text generation improvement (UI client, data parallel support) by @yidong72 :: PR: #5437
- O2 style amp for gpt3 ptuning by @JimmyZhang12 :: PR: #5246
- Add support for MHA adapters to ASR by @titu1994 :: PR: #5396
- Bert interleaved by @shanmugamr1992 :: PR: #5556
- Port stateless timer to exp manager by @MaximumEntropy :: PR: #5584
- Add interface for making amax reduction optional for FP8 by @ksivaman :: PR: #5447
- Propagate attention_dropout flag for GPT-3 by @mikolajblaz :: PR: #5669
- Enc-Dec model size reporting fixes by @MaximumEntropy :: PR: #5623
- Add prompt learning tests by @arendu :: PR: #5649
- Fix missing torchelastic fixes for PTL 1.8 by @MaximumEntropy :: PR: #5691
- ALiBi Positional Embeddings by @michalivne :: PR: #5467
- Megatron export triton update by @Davood-M :: PR: #5766
- Fix transducer and question answering tutorial bugs by @Zhilin123 :: PR: #5809
- Update description for question answering tutorial by @Zhilin123 :: PR: #5814
- TPMLP for T5-based models by @Davood-M :: PR: #5840
- Megatron positional encoding alibi fix by @michalivne :: PR: #5808
Export
Changelog
General Improvements
Changelog
- Update to pytorch 22.12 container by @ericharper :: PR: #5694
- optimized loop and bugfix by @Jorjeous :: PR: #5573
- Expose ClusteringDiarizer device by @SeanNaren :: PR: #5681
- remove useless files. by @XuesongYang :: PR: #5580
- [Fix] setup_multiple validation/test data by @anteju :: PR: #5585
- Move to optimizer based EMA implementation by @SeanNaren :: PR: #5169
- [Temp workaround] Disable test with cache_audio to unblock CI by @anteju :: PR: #5615
- [EMA] Change success message to reduce confusion by @SeanNaren :: PR: #5621
- Temporarily disable prompt learning CI tests by @ericharper :: PR: #5633
- [Dockerfile] Remove AIS archive from docker image by @anteju :: PR: #5629
- [workflow] add exclude labels option to ignore cherry-picks in releas… by @XuesongYang :: PR: #5645
- Add DLLogger support to exp_manager by @milesial :: PR: #5658
- Fix EMA restart by allowing device to be set by the class init by @SeanNaren :: PR: #5668
- Remove SDP (moved to separate repo) - merge to main by @erastorgueva-nv :: PR: #5630
- temp disable speaker recognition CI test by @fayejf :: PR: #5696
- Don't print exp_manager warning when max_steps == -1 by @milesial :: PR: #5725
- Add tabular data generation documents to the index file by @yidong72 :: PR: #5733
- fix token id bug by @yidong72 :: PR: #5777
- Update numpy requirements from 1.21 to 1.22 by @Zhilin123 :: PR: #5785
- Fix setuptools to usable version by @titu1994 :: PR: #5798
- add apt-get upgrade -y in dockerfile by @fayejf :: PR: #5817
- Update NeMo Multi-Run docs by @titu1994 :: PR: #5844
- add ambernet to readme by @fayejf :: PR: #5872
- update apex install instructions for 1.15 by @ericharper :: PR: #5901
NVIDIA Neural Modules 1.14.0
Highlights
NeMo ASR
- Hybrid CTC + Transducer loss ASR #5364
- Sampled Softmax RNNT (Enables large vocab RNNT, for speech translation and multilingual ASR) #5216
- ASR Adapters hyper parameter search scripts #5159
- RNNT {ONNX, TorchScript} x GPU export infer #5248
- Exportable MelSpectrogram (TorchScript) #5512
- Audio To Audio Dataset Processor #5196
- Multi Channel Audio Transcription #5479
- Silence Augmentation #5476
NeMo Megatron
- Support for the Mixture of Experts for T5
- Fix PTL model size output for GPT-3 and BERT
- BERT with Tensor Parallelism & Pipeline Parallel Support
NeMo Core
- Hydra Multirun core support + NeMo HP optim in YAML #5159
NeMo Models
Detailed Changelogs
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:22.11
ASR
Changelog
- [Tools][ASR] Tool for generating data using simulated RIRs by @anteju :: PR: #5158
- Modernize RNNT ONNX export and add TS export by @titu1994 :: PR: #5248
- Add Gradio App to ASR Docs by @titu1994 :: PR: #5270
- Add support for Sampled Softmax for RNNT Joint by @titu1994 :: PR: #5216
- Speed up HF data processing script for ASR by @titu1994 :: PR: #5330
- bugfix in volume loss for CTC models by @bmwshop :: PR: #5348
- Add cpWER for evaluation of ASR with diarization by @tango4j :: PR: #5279
- Fix for getting tokenizer in character-based ASR models when using tarred dataset by @jonghwanhyeon :: PR: #5442
- Refactor/unify ASR offline and buffered inference by @fayejf :: PR: #5440
- Standalone diarization+ASR evaluation script by @tango4j :: PR: #5439
- [ASR] Transcribe for multi-channel signals by @anteju :: PR: #5479
- Add Silence Augmentation by @fayejf :: PR: #5476
- add exportable mel spec by @1-800-BAD-CODE :: PR: #5512
- add RNN-T loss implemented by PyTorch and test code by @hainan-xv :: PR: #5312
- [ASR] AudioToAudio datasets and related test by @anteju :: PR: #5196
- Add StreamingFeatureBufferer class for real-life streaming decoding by @tango4j :: PR: #5534
- Pool stats with padding by @1-800-BAD-CODE :: PR: #5403
- Adding Hybrid RNNT-CTC model by @VahidooX :: PR: #5364
- Fix ASR Buffered inference scripts by @titu1994 :: PR: #5552
- Add wer details - insertion, deletion, substitution rate by @fayejf :: PR: #5557
- Add support for Time Stamp calculation using transcribe_speech.py by @titu1994 :: PR: #5568
- [STT] Add Esperanto (Eo) ASR Conformer-CTC and Conformer-Transducer models by @andrusenkoau :: PR: #5639
TTS
Changelog
- [TTS] Fastpitch energy condition and refactoring by @subhankar-ghosh :: PR: #5218
- [TTS] HiFi-TTS Download Script by @oleksiivolk :: PR: #5241
- [TTS] Add Mandarin/English Bilingual Recipe for Training Fastpitch Models by @yuekaizhang :: PR: #5208
- [TTS] fixed type of filepath and rename openslr. by @XuesongYang :: PR: #5276
- [TTS] replace obsolete torch_tts unit test marker with run_only_on('CPU') by @XuesongYang :: PR: #5307
- [TTS] bugfix IPAG2P and refactor to remove duplicate process. by @XuesongYang :: PR: #5304
- Update path to get_data.py in TTS tutorial by @redoctopus :: PR: #5311
- [TTS] Replace IPA lambda arguments with locale string by @rlangman :: PR: #5298
- [TTS] expand to support flexible dictionary entry formats in IPAG2P. by @XuesongYang :: PR: #5318
- [TTS] update organization of model checkpoints and their pointers. by @XuesongYang :: PR: #5327
- [TTS] bugfix for the script of generating mels from fastpitch. by @XuesongYang :: PR: #5344
- [TTS] Add Spanish model documentation by @rlangman :: PR: #5390
- [TTS] Add Spanish FastPitch training configs by @rlangman :: PR: #5383
- [TTS] replace pitch normalization params with ??? by @XuesongYang :: PR: #5392
- [TTS] Create script for processing TTS training audio by @rlangman :: PR: #5262
- [TTS] remove useless logic for set_tokenizer. by @XuesongYang :: PR: #5430
- [TTS] Fixing RADTTS training - removing view buffer and fixing accuracy issue by @borisfom :: PR: #5358
- JOC Optimization in FastPitch by @subhankar-ghosh :: PR: #5450
- [TTS] Support speaker level pitch normalization by @rlangman :: PR: #5455
- TTS tutorial update: use speaker 9017 instead of 6097 by @redoctopus :: PR: #5532
- [TTS] Remove unused TTS eval function by @redoctopus :: PR: #5605
- [TTS][ZH] add fastpitch and hifigan model NGC urls and update NeMo docs. by @XuesongYang :: PR: #5596
- [TTS][DOC] add notes about automatic conversion to target sampling ra… by @XuesongYang :: PR: #5624
- [TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. by @XuesongYang :: PR: #5643
- [TTS][ZH] bugfix for ngc cli installation. by @XuesongYang :: PR: #5652
- [TTS][ZH] fix broken link for the script. by @XuesongYang :: PR: #5666
NLP / NMT
Changelog
- Option to pad the last validation input sequence if it's smaller than the encoder sequence length for MegatronGPT by @anmolgupt :: PR: #5243
- Fixes bugs with loss averaging for Megatron GPT by @shanmugamr1992 :: PR: #5329
- Fixing bug in Megatron BERT when loss mask is all zeros by @shanmugamr1992 :: PR: #5424
- support to disable sequence length + 1 input tokens for each sample in MegatronGPT by @anmolgupt :: PR: #5363
- [TN] raise NotImplementedError for unsupported languages and other minor fixes by @XuesongYang :: PR: #5414
- Bug fix/gpt by @shanmugamr1992 :: PR: #5493
- prompt tuning fix for unscale grad errors by @arendu :: PR: #5523
- Bert sequence parallel support by @shanmugamr1992 :: PR: #5494
- NLP docs fixes by @vsl9 :: PR: #5528
- Switch order of args in optimizer_step override by @ericharper :: PR: #5549
- Upgrade to 22.11 by @ericharper :: PR: #5550
- Merge r1.13.0 main by @ericharper :: PR: #5570
- some tokenizers do not have additional_special_tokens_ids attribute by @arendu :: PR: #5642
- Remove cell output from tutorial by @ericharper :: PR: #5689
Text Normalization / Inverse Text Normalization
Changelog
- [ITN] fix year date graph, cardinals extension for hundreds by @ekmb :: PR: #5435
- [TN] raise NotImplementedError for unsupported languages and other minor fixes by @XuesongYang :: PR: #5414
Export
Changelog
- Fixed the onnx bug in conformer for non-streaming models. by @VahidooX :: PR: #5242
- Modernize RNNT ONNX export and add TS export by @titu1994 :: PR: #5248
- Fixes for Conformer-xl export by @borisfom :: PR: #5309
- Remove onnx graphsurgery from Dockerfile by @titu1994 :: PR: #5320
- add exportable mel spec by @1-800-BAD-CODE :: PR: #5512
General Improvements
Changelog
- bugfix in volume loss for CTC models by @bmwshop :: PR: #5348
- Fix setting up of learning rate scheduler by @PeganovAnton :: PR: #5444
- Better patch hydra by @titu1994 :: PR: #5591
- [TTS][ZH] bugfix for the tutorial and add NGC CLI installation guide. by @XuesongYang :: PR: #5643
- Add fully torch.jit.script-able speaker clustering module by @tango4j :: PR: #5191
- Update perturb.py by @stevehuang52 :: PR: #5231
- remove CV requirements. by @XuesongYang :: PR: #5233
- checks for accepted adapter type at module level by @arendu :: PR: #5194
- fix hypotheses return by @nithinraok :: PR: #5253
- Support for inserting additional subsampling in conformer encoder by @shan18 :: PR: #5224
- update tutorials to use meeting config as default and VAD by @nithinraok :: PR: #5237
- Specifying audio signal dropout separately for the Conformer Encoder by @shan18 :: PR: #5263
- created by @bmwshop :: PR: #5268
- Fix failing speaker counting for short audio samples by @tango4j :: PR: #5267
- O2bert + apex pipeline functions by @shanmugamr1992 :: PR: #5221
- Upperbound PTL by @titu1994 :: PR: #5302
- Update Interface(s) phonetic entry by @blisc :: PR: #5212
- add label inference support to EncDecSpeakerLabel class by @nithinraok :: PR: #5278
- Add Italian model checkpoints by @Kipok :: PR: #5315
- Text Memmap Parsing Improvements by @michalivne :: PR: #5265
- Update librosa signature in HF processing script by @titu1994 :: PR: #5321
- Force wav file format for audio_filepath by @titu1994 :: PR: #5323
- Updates to T0 Dataset and Model by @MaximumEntropy :: PR: #5201
- [DOC] add sphinx-copybutton requirement to copy button on code snippets. by @XuesongYang :: PR: #5326
- Add support for Hydra multirun to NeMo by @titu1994 :: PR: #5159
- typo fix by @arendu :: PR: #5328
- add precommit hook to automatically sort entries in requirements. by @XuesongYang :: PR: #5333
- Add speaker clustering arguments to forward function by @tango4j :: PR: #5306
- Fixing de-autocast by @borisfom :: PR: #5319
- [Bugfix] Added rm -f / wget -nc command to avoid bash error in multispeaker sim notebook by @tango4j :: PR: #5292
- [DOC] added ipython dependency to support IPython.sphinxext extension by @XuesongYang :: PR: #5345
- Bug fix (removing old compute consumed samples) by @shanmugamr1992 :: PR: #5355
- removed uninstall nemo_cv and nemo_simple_gan and relax numba version… by @XuesongYang :: PR: #5332
- Enable mlflow logger by @whrichd :: PR: #4893
- Fix Python type hints according to Python Docs by @artbataev :: PR: #5370
- Distributed optimizer support for BERT by @timmoon10 :: PR: #5305
- SpeakerClustering: fix tensor dimensions in forward() by @virajkarandikar :: PR: #5387
- add squad by @arendu :: PR: #5407
- added python and c++ alignment code by @yzhang123 :: PR: #5346
- Add MoE support for T5 model (w/o expert parallel) by @aklife97 :: PR: #5409
- Fix...
NVIDIA Neural Modules 1.13.0
Highlights
NeMo ASR
- Spoken Language Understanding (SLU) models based on Conformer encoder and transformer decoder
- Support for codeswitched manifests during training
- Support for Language ID during inference for ML models
- Support of cache-aware streaming for offline models
- Word confidence estimation for CTC & RNNT greedy decoding
NeMo Megatron
- Interleaved Pipeline schedule
- Transformer Engine for GPT
- HF T5v1.1 -> NeMo-Megatron conversion and finetuning/p-tuning
- IA3 and Adapter Tuning (Tensor + Pipeline Parallel)
- Pipeline Parallel Support for T5 Prompt Learning
- MegatronNMT export
NeMo TTS
- TTS introductory tutorial
- Phonemizer/espeak removal (Spanish/German)
- Char-only support for Spanish/German models
- Documentation Refactor
NeMo Core
- Upgrade to NGC PyTorch 22.09 container
- Add pre-commit hooks
- Exponential moving average (EMA) of weights during training
NeMo Models
- ASR Conformer Croatian: stt_hr_conformer_ctc_large and stt_hr_conformer_transducer_large
- ASR Conformer Belarusian: stt_be_conformer_ctc_large and stt_be_conformer_transducer_large
- ASR Squeezeformer Librispeech: 6 checkpoints (XS, S, SM, M, ML, L)
- SLURP Intent Classification / Slot Filling: slu_conformer_transformer_large_slurp
- LanguageID AmberNet: langid_ambernet
Detailed Changelogs
Container
For additional information regarding NeMo containers, please visit: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo
docker pull nvcr.io/nvidia/nemo:22.09
Known Issues
Issues
- Pytest tests for RadTTSModel_export_to_torchscript fail intermittently due to random input values. Fixed in main.
ASR
Changelog
- Add docs tutorial on kinyarwanda asr by @bene-ges :: PR: #4953
- Asr codeswitch by @bmwshop :: PR: #4821
- Add test for nested ASR model by @titu1994 :: PR: #5002
- Greedy decoding confidence for CTC and RNNT by @GNroy :: PR: #4931
- [ASR][Tools] RIR corpus generator by @anteju :: PR: #4927
- Add Squeezeformer CTC model checkpoints on Librispeech by @titu1994 :: PR: #5121
- adding loss normalization options to rnnt joint by @bmwshop :: PR: #4829
- Asr concat dataloader by @bmwshop :: PR: #5108
- Added ASR model comparison to SDE by @Jorjeous :: PR: #5043
- Add scripts for converting Spoken Wikipedia to asr dataset by @bene-ges :: PR: #5138
- ASR confidence bug fix for older Python versions by @GNroy :: PR: #5180
- Update ASR Scores and Results by @titu1994 :: PR: #5254
- [STT] Add Ru ASR Conformer-CTC and Conformer-Transducer by @ssh-meister :: PR: #5340
TTS
Changelog
- [TTS] Adding speaker embedding conditioning in fastpitch by @subhankar-ghosh :: PR: #4986
- [TTS] Remove PhonemizerTokenizer by @rlangman :: PR: #4990
- [TTS] FastPitch speaker interpolation by @subhankar-ghosh :: PR: #4997
- RADTTS model changes to accommodate export with batch size > 1 by @borisfom :: PR: #4947
- [TTS] remove phonemizer.py by @XuesongYang :: PR: #5090
- [TTS] Add NeMo TTS Primer Tutorial by @rlangman :: PR: #4933
- [TTS] Add SpanishCharsTokenizer by @rlangman :: PR: #5135
- Fixes for docs/typos + remove max_utts parameter from tarred datasets as it causes hang in training by @Kipok :: PR: #5118
- refactor TTS documentation organization and add new contents. by @XuesongYang :: PR: #5137
- [TTS][DOC] update models trained on HifiTTS dataset. by @XuesongYang :: PR: #5173
- [TTS] Fix TTS Primer image markup by @rlangman :: PR: #5192
- [TTS] deprecate TextToWaveform base class. by @XuesongYang :: PR: #5205
- [TTS] remove the avoidance of circular imports by @XuesongYang :: PR: #5214
- [TTS] remove LinVocoder and apply Vocoder as parent class. by @XuesongYang :: PR: #5206
- [TTS] unify requirements_tts.txt and requirements_torch_tts.txt by @XuesongYang :: PR: #5232
- Minor typo fixes in TTS tutorial by @redoctopus :: PR: #5266
- Radtts 1.13 by @borisfom :: PR: #5451
- Radtts 1.13 plus by @borisfom :: PR: #5457
NLP / NMT
Changelog
- IA3 support for GPT and T5 by @arendu :: PR: #4909
- Fix and refactor consumed samples save/restore for Megatron models. by @MaximumEntropy :: PR: #5077
- Remove unsupported arguments from MegatronNMT by @MaximumEntropy :: PR: #5065
- Update megatron interface to dialogue by @Zhilin123 :: PR: #4936
- gpt ia3 CI tests by @arendu :: PR: #5140
- Fix NMT Eval Sampler by @aklife97 :: PR: #5154
- Add interleaved pipeline schedule to GPT by @ericharper :: PR: #5025
- fix for bug in bignlp by @arendu :: PR: #5172
- Fixes some args that were not removed properly for multilingual Megatron NMT by @MaximumEntropy :: PR: #5142
- Fix absolute path in GPT Adapter CI tests by @arendu :: PR: #5184
- Add ability to configure drop last batch for validation datasets with MegatronGPT by @shanmugamr1992 :: PR: #5067
- Megatron Export Update by @Davood-M :: PR: #5343
- Fix GPT generation when using sentencepiece tokenizer by @MaximumEntropy :: PR: #5413
- Disable sync_batch_comm in validation_step for GPT by @ericharper :: PR: #5397
- Set sync_batch_comm=False in prompt learning and inference by @MaximumEntropy :: PR: #5448
- Fix a bug with positional vs key-word based argument passing in the transformer layer by @MaximumEntropy :: PR: #5475
Text Normalization / Inverse Text Normalization
Changelog
- [Chinese text normalization] speed up graph building by @pengzhendong :: PR: #5128
NeMo Tools
Export
Changelog
- Fix export bug by @VahidooX :: PR: #5009
- RADTTS model changes to accommodate export with batch size > 1 by @borisfom :: PR: #4947
- Support TorchScript export for Squeezeformer by @titu1994 :: PR: #5164
- Expose keep_initializers_as_inputs to Exportable class by @pks :: PR: #5052
- Fix the self-attention export bug for cache-aware streaming Conformer by @VahidooX :: PR: #5114
- replace ColumnParallelLinear with nn.Linear in export_utils by @arendu :: PR: #5217
- Megatron Export Update by @Davood-M :: PR: #5343
- Fix Conformer Export in 1.13.0 (cherry-pick from main) by @artbataev :: PR: #5446
- export_utils bugfix by @Davood-M :: PR: #5480
- Export fixes for Riva by @borisfom :: PR: #5496
General Improvements and Bugfixes
Changelog
- don't use bfloat16 when in jit by @bmwshop :: PR: #5051
- Set sync_batch_comm=False in prompt learning and inference by @MaximumEntropy :: PR: #5448
- Fix a bug with positional vs key-word based argument passing in the transformer layer by @MaximumEntropy :: PR: #5475
- Pin Transformers version to fix CI by @SeanNaren :: PR: #4955
- Fix changelog builder (#4962) by @titu1994 :: PR: #4963
- Checkpoint averaging class fix by @michalivne :: PR: #4946
- Add ability to give separate datasets for test, train and validation by @shanmugamr1992 :: PR: #4798
- Add simple pre-commit file by @SeanNaren :: PR: #4983
- Import pycuda.autoprimaryctx or pycuda.autoinit to init pycuda execut… by @liji-nv :: PR: #4951
- Improvements to AMI script by @SeanNaren :: PR: #4974
- clean warnings from tests and CI runs, and prepare for upgrade to PTL 1.8 by @nithinraok :: PR: #4830
- Update libraries by @titu1994 :: PR: #5010
- add close inactive issues and PRs github action. by @XuesongYang :: PR: #5015
- Fix filename extraction in vad_utils.py by @GKPr0 :: PR: #4999
- Add black to pre-commit by @SeanNaren :: PR: #5027
- [CI] Enable previous build abort when new commit pushed by @SeanNaren :: PR: #5041
- Tutorials and Docs for Multi-scale Diarization Decoder by @tango4j :: PR: #4930
- Refactor output directory for MSDD Inference Notebook by @SeanNaren :: PR: #5044
- text_memmap dataset index range testing fix by @michalivne :: PR: #5034
- fix undefined constant in code example by @bene-ges :: PR: #5046
- Text generation refactor and RETRO text generation implementation by @yidong72 :: PR: #4985
- Lids by @bmwshop :: PR: #4820
- Add datasets folder, add diarization datasets voxconverse/aishell by @SeanNaren :: PR: #5042
- Fix the bugs in cache-aware streaming Conformer by @VahidooX :: PR: #5032
- Bug fix - Limit val batches set to 1.0 by @shanmugamr1992 :: PR: #5023
- [bug_fix] kv_channels is used when available by @arendu :: PR: #5066
- Add spe_split_by_unicode_script arg by @piraka9011 :: PR: #5072
- Transformer Engine Integration by @ericharper :: PR: #5104
- Text memmap dataset index memory efficiency by @michalivne :: PR: #5056
- Add NGC links for Aligner and FastPitch by @redoctopus :: PR: #5235
- Fix link to inference notebook by @redoctopus :: PR: #5247
- Fix links to speaker identification notebook by @SeanNaren :: PR: #5260
- Fix bug in Dialogue tutorial by @Zhilin123 :: PR: #5277
- PCLA tutorial typo fix by @jubick1337 :: PR: #5288
- Fix dialogue tutorial bug by @Zhilin123 :: PR: #5297
- small bugfix for r1.13.0 by @fayejf :: PR: #5310
- Add Italian model checkpoints by @Kipok :: PR: #5316
- Pcla tutorial fixes by @jubick1337 :: PR: #5313
- Fix issue with HF Model upload tutorial by @titu1994 :: PR: #5359
- P&C LA tutorial fixes by @jubick1337 :: PR: #5354
- Add SDP documentation by @erastorgueva...