How do I train a multilingual model? Is there a script for it?

I see that the w2v-conformer pre-trained model was trained on a multilingual dataset, but so far I have not found a corresponding multilingual training solution or script.

One problem I have run into so far is how to choose the text modeling unit: should it be BPE, char, or something else?