blingfire pypi package v0.1.3

@SergeiAlonichau SergeiAlonichau released this 25 Jun 17:27
· 172 commits to master since this release
ccd642c

Four tokenization algorithms are now supported: patterns, WordPiece, unigram LM, and BPE. Added a space normalization API, a few more popular models, and unigram LM tokenization models trained on ~84 uniformly represented languages from the WikiMatrix set. Also includes bug fixes and parity fixes.
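
For readers unfamiliar with the algorithms listed above, here is a minimal pure-Python sketch of one of them, WordPiece, using the common greedy longest-match-first formulation. It is an illustration only and is not how blingfire implements it; the function name `wordpiece_tokenize`, the `##` continuation prefix, the `[UNK]` token, and the toy vocabulary are all assumptions made for this example.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Split one word into sub-word pieces by greedy longest-match-first.

    Hypothetical helper for illustration; not part of the blingfire API.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                # Non-initial pieces carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary, invented for this example only.
TOY_VOCAB = {"token", "##ization", "##s", "un", "##known"}

print(wordpiece_tokenize("tokenization", TOY_VOCAB))
print(wordpiece_tokenize("tokens", TOY_VOCAB))
print(wordpiece_tokenize("xyz", TOY_VOCAB))
```

With this toy vocabulary, `tokenization` splits into `token` and `##ization`, while a word with no matching pieces collapses to the single unknown token. Real models ship much larger vocabularies, and production tokenizers typically avoid the quadratic substring scan shown here.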