Install all dependencies before testing/training.
```shell
pip install -r ./requirements.txt
```
If the ctcdecode package doesn't install properly, follow the instructions in the README of the official repository: https://github.com/parlance/ctcdecode
```shell
git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode && pip install .
```
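As a quick sanity check (not part of the original instructions), you can verify that the package built correctly by importing it:

```shell
# Fails with an ImportError if the ctcdecode build did not succeed.
python -c "import ctcdecode; print('ctcdecode imported successfully')"
```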
The final model was trained in 4 steps; the configs are located in this directory. The first run trained for 31 epochs, the second for 19 (epochs 32 to 50), the third for 24 (51 to 74), and the fourth for 26 (75 to 100).
In order to replicate the model you need to run the train script 4 times: twice with the `run1and2.json` config, once with the `config3.json` config, and once with the `config4.json` config. Each subsequent run takes the `model_best.pth` checkpoint from the previous one. The `data_dir` arguments in the config files point to the directory containing the training data.
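A minimal sketch of the four runs, assuming `train.py` accepts the same `--config` and `--resume` flags as `test.py`; the exact flag names, config locations, and checkpoint paths are assumptions and may need to be adjusted to your checkout:

```shell
# Sketch only: flags and checkpoint paths are assumed, not taken from the train script.
python train.py --config run1and2.json                          # run 1: epochs 1-31
python train.py --config run1and2.json --resume model_best.pth  # run 2: epochs 32-50
python train.py --config config3.json --resume model_best.pth   # run 3: epochs 51-74
python train.py --config config4.json --resume model_best.pth   # run 4: epochs 75-100
```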
The main report (in Russian) is hosted on Wandb: report
- Train the model to get the checkpoint file.
- Copy the config file `ds_config.json` from `hw_asr/configs` to the root directory of the project (next to the checkpoint):
```shell
cp ./hw_asr/configs/ds_config.json config.json
```
- Run the test:
```shell
python test.py --config config.json --resume model_best.pth --output testout.json
```
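To take a quick look at the predictions, you can pretty-print the output file; this assumes `testout.json` is ordinary JSON, which is what the `--output` flag suggests:

```shell
# Pretty-print the test output for a quick manual inspection.
python -c "import json, pprint; pprint.pprint(json.load(open('testout.json')))"
```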