Scripts for testing lexicon-related things in GiellaLT, plus some Python scripts for processing lexc files.
Uses pyhfst to load HFST automata. Run poetry install to install dependencies.
Spell-checker testing uses divvunspell binaries. You can install divvunspell with cargo.
You can install giellaltlextools with pipx: pipx install git+https://github.com/divvun/giellaltlextools.
This project uses Poetry's build system to ensure optimal pyhfst installation.
The project is configured to automatically optimize pyhfst installation with Cython for better performance:
- Build System: Declares Cython as a build-time requirement
- Build Script: scripts/build.py automatically handles pyhfst optimization
- Dependencies: Cython is included as both a runtime and build dependency
The build script runs automatically during poetry install and poetry build, ensuring pyhfst is always installed with Cython support when available.
The programs are mainly run from make check in the GiellaLT infrastructure.
There are currently three programs installed:
- gtlemmatest for testing that a generator generates the lemmas found in a lexc file
- gtparadigmtest for testing that a generator generates full paradigms of the lemmas
- gtspelltest for testing that a spell checker accepts lemmas from lexc files
$ gtlemmatest -l src/fst/morphology/stems/nouns.lexc \
-a src/fst/analyser-gt-desc.hfstol \
-g src/fst/generator-gt-desc.hfstol \
-t +N+Sg+Nom -t +N+Pl+Nom
The lexc files should mainly contain lexc lines that contain full lemma forms.
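To illustrate what "lexc lines that contain full lemma forms" means in practice, here is a minimal sketch of pulling lemma candidates out of lexc entries. It is not the code gtlemmatest uses: the function name `extract_lemmas`, the sample entries, and the continuation class names are all made up for illustration, and the parsing ignores lexc escapes such as `%!`.

```python
def extract_lemmas(lexc_text):
    """Collect lemma candidates from lexc entry lines (simplified sketch)."""
    lemmas = []
    for raw in lexc_text.splitlines():
        # Drop lexc comments; a real parser would also honour %-escapes.
        line = raw.split("!", 1)[0].strip()
        if not line.endswith(";"):
            continue  # skips blank lines and LEXICON headers
        tokens = line.rstrip(";").split()
        if len(tokens) < 2:
            continue  # only a continuation class, no lemma on this line
        upper = tokens[0].split(":", 1)[0]  # upper side of upper:lower
        lemma = upper.split("+", 1)[0]      # strip tags such as +N
        if lemma:
            lemmas.append(lemma)
    return lemmas


# Hypothetical sample; the continuation class names are invented.
SAMPLE = """\
LEXICON Nouns
guolli+N:guoll N_ODD "fish" ;
beana N_EVEN ;  ! dog
! a full comment line
N_PLACEHOLDER ;
"""

print(extract_lemmas(SAMPLE))
```

With the sample above, the sketch yields `['guolli', 'beana']`. Entries whose upper side is only a stem fragment rather than a full lemma would give misleading candidates, which is why the input files should mostly consist of full-lemma lines.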
$ gtparadigmtest -l src/fst/morphology/stems/nouns.lexc \
-p src/fst/morphology/test/testnounparadigm.txt \
-g src/fst/generator-gt-desc.hfstol
The lexc files should mainly contain lexc lines that contain full lemma forms.
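The idea behind the lemma and paradigm tests can be sketched as: combine each lemma with each tag string and check that the generator produces at least one surface form. This is an assumption-laden illustration, not the tool's implementation. The layout of the paradigm file is not shown here, so the sketch simply takes a list of tag strings, and the generator is replaced by a dictionary-backed stand-in (`fake_generate`) instead of a real HFST lookup.

```python
def missing_forms(lemmas, tags, generate):
    """Return the analysis strings for which `generate` yields nothing."""
    missing = []
    for lemma in lemmas:
        for tag in tags:
            analysis = lemma + tag  # e.g. "guolli" + "+N+Sg+Nom"
            if not generate(analysis):
                missing.append(analysis)
    return missing


# Stand-in for a real generator automaton, so the sketch runs on its own.
FAKE_GENERATOR = {
    "guolli+N+Sg+Nom": ["guolli"],
    "guolli+N+Pl+Nom": ["guolit"],
    "beana+N+Sg+Nom": ["beana"],
}


def fake_generate(analysis):
    """Hypothetical lookup: surface forms for an analysis, or an empty list."""
    return FAKE_GENERATOR.get(analysis, [])


print(missing_forms(["guolli", "beana"], ["+N+Sg+Nom", "+N+Pl+Nom"], fake_generate))
```

Here the only reported gap is `beana+N+Pl+Nom`, because the stand-in generator has no entry for it. Passing the lookup in as a callable keeps the check independent of how the automaton is actually loaded.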
$ gtspelltest -z tools/spellcheckers/se.zhfst -D divvunspell \
src/fst/morphology/stems/*.lexc
The lexc files should mainly contain lexc lines that contain full lemma forms.