Scripts for IMPACT ground-truth generation.
Install Go and make sure that you have both python2 and python3
installed.
- Make sure that $HOME/go/bin is in your PATH
- Install the impgtt helper tools: cd impgtt && go install
- Install python3-venv if you are on Debian or Ubuntu
- Install libtk
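The prerequisites above can be checked before running the setup. The following is a minimal sketch, not part of the repository: the helper names path_contains and missing_commands are hypothetical, and it assumes the interpreters are available under the command names python2 and python3. It only reports what is missing and does not install or change anything.

```shell
#!/usr/bin/env bash
# Sketch: report which prerequisites are missing. Nothing is installed or changed.

# Succeeds if directory $1 is an entry of the colon-separated list $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints the names of the given commands that cannot be found.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# The command names python2 and python3 are assumptions; adjust them if your
# interpreters are installed under different names.
for cmd in $(missing_commands go python2 python3); do
  echo "missing: $cmd"
done
path_contains "$HOME/go/bin" "$PATH" || echo "$HOME/go/bin is not in PATH"
```

If the script prints nothing, Go and both Python interpreters were found and $HOME/go/bin is on the PATH. It does not check for python3-venv or libtk, since how to detect those depends on the distribution.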
Use make setup py2=my-python2 py3=my-python3 to set up the
tools. This will
- install ocropus into the env/2 virtual environment,
- install calamari into the env/3 virtual environment and
- download the calamari OCR models into the model folder.
If impgtt is for any reason not installed at its default location,
$HOME/go/bin, you can set its location: make setup py2=my-python2 py3=my-python3.
There are various scripts in the scripts directory:
- scripts/run.bash runs the whole segmentation and alignment process (i.e. runs the following three scripts in the right order)
- scripts/segment.bash segments the GT into regions and the regions into lines using ocropus-nlbin
- scripts/predict.bash runs the OCR recognition on the lines (using calamari-predict)
- scripts/align.bash aligns the OCRed lines with the ground-truth lines
- impgtt/... IMPACTGT-tools: helper tools for the scripts.
General usage: scripts/run.bash [-nobin] [-imgext EXT] IN [OUT]
From this repository's root directory, run the segmentation over the
data in the IN directory using bash scripts/run.bash IN. The
result will be written to the segmented/IN directory. You can use
the -imgext EXT option to set the extension of the input images,
e.g. bash scripts/run.bash -imgext .sau.png IN runs the segmentation
over all the .sau.png image files.
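To preview which files a given extension would select, a suffix match can be sketched as follows. This illustrates the idea only and is not how run.bash is implemented; the function name list_images is hypothetical, and the demo uses a temporary directory in place of a real IN directory.

```shell
#!/usr/bin/env bash
# Sketch: list the files that a given image extension would match.

# Prints the files below directory $1 whose names end in extension $2, sorted.
list_images() {
  find "$1" -type f -name "*$2" | sort
}

# Demo on a throwaway directory standing in for IN.
demo=$(mktemp -d)
touch "$demo/0001.sau.png" "$demo/0001.png" "$demo/0001.xml"
list_images "$demo" .sau.png   # prints only the path of 0001.sau.png
rm -r "$demo"
```

Note that a shorter extension such as .png would also match 0001.sau.png, since the match is purely on the end of the file name.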
Currently it is not possible to run the scripts outside of this
repository's root directory. This is because the
run.bash script expects to find ocropus and calamari installed
in the env/2 and env/3 directories.
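A small guard can catch the wrong working directory before a run is started. The sketch below is not part of the repository, and the function name is_repo_root is hypothetical. It only checks that scripts/run.bash, env/2 and env/3 exist, not that ocropus and calamari are actually installed in them.

```shell
#!/usr/bin/env bash
# Sketch: check that a directory looks like the repository root after setup.

# Succeeds if directory $1 contains scripts/run.bash and the env/2 and env/3
# virtual environment directories.
is_repo_root() {
  [ -f "$1/scripts/run.bash" ] && [ -d "$1/env/2" ] && [ -d "$1/env/3" ]
}

if is_repo_root .; then
  echo "ok: run 'bash scripts/run.bash IN' from here"
else
  echo "not in the repository root, or 'make setup' has not been run" >&2
fi
```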