pip3 install -r requirements.txt
NOTE: If you are using pyenv to install older versions of Python, you might need to install the development versions of libsqlite3x, ncurses, readline, and tkinter. For example, on Fedora:
dnf install libsqlite3x-devel ncurses-devel readline-devel tk-devel
Run the dewarp.py script:
python ./dewarp.py ./sample.png ./output.png
Run the tight_dewarp.py script:
python ./tight_dewarp.py ./sample.png ./output.png
Both scripts perform comparably, and neither shows a clear advantage over the other. They differ in scope: dewarp.py operates on the entire image, whereas tight_dewarp.py finds the leftmost and rightmost black pixels in the Otsu-thresholded image and works only within that range.
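The column range that tight_dewarp.py is described as using can be illustrated with a minimal sketch. It assumes a binarized image in which text pixels are 0 and background pixels are non-zero; the function name black_pixel_column_range is hypothetical and is not taken from the repository.

```python
import numpy as np

def black_pixel_column_range(binary):
    """Return (left, right) column indices that bound all black pixels.

    Assumes `binary` is a 2-D array where text pixels are 0 (black) and
    background pixels are non-zero, as in an Otsu-thresholded image.
    """
    has_black = (binary == 0).any(axis=0)   # one boolean flag per column
    cols = np.flatnonzero(has_black)        # indices of columns with text
    if cols.size == 0:
        raise ValueError("image contains no black pixels")
    return int(cols[0]), int(cols[-1])

# Tiny synthetic example: a white 5x10 image with black pixels in columns 3..6.
img = np.full((5, 10), 255, dtype=np.uint8)
img[2, 3:7] = 0

left, right = black_pixel_column_range(img)
cropped = img[:, left:right + 1]            # restrict later steps to this range
```

Restricting the later steps to this range keeps empty margins from influencing the curve estimate, which is one plausible motivation for the tighter variant.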
- Load Image
- Convert from RGB to Grayscale
- Apply Otsu's Thresholding Method, Erosion and then Dilation
- Calculate curve using Generalized Additive Model
- Final Image
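The steps above can be sketched roughly as follows, using NumPy only. This is an illustration under stated assumptions, not the repository's implementation: the erosion and dilation step is omitted, a degree-2 polynomial fit stands in for the Generalized Additive Model, and all function names are hypothetical.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to 8-bit grayscale (BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb[..., :3] @ weights).astype(np.uint8)

def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean up to t
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                 # undefined where one class is empty
    return int(np.argmax(sigma_b))

def text_centerline(binary):
    """Per-column mean row index of black (0) pixels, returned as (xs, ys)."""
    rows, cols = np.nonzero(binary == 0)
    xs = np.unique(cols)
    ys = np.bincount(cols, weights=rows)[xs] / np.bincount(cols)[xs]
    return xs, ys

# Synthetic stand-in for a cropped text line: a dark, gently curved stroke.
height, width = 40, 100
x = np.arange(width)
y_true = 20 + 0.002 * (x - 50) ** 2
rgb = np.full((height, width, 3), 220, dtype=np.uint8)
for col, row in zip(x, np.rint(y_true).astype(int)):
    rgb[row - 1:row + 2, col] = 30            # 3-pixel-thick stroke

gray = to_grayscale(rgb)
threshold = otsu_threshold(gray)
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

xs, ys = text_centerline(binary)
# Assumption: polynomial smoothing used here in place of a GAM.
coeffs = np.polyfit(xs, ys, deg=2)
fitted = np.polyval(coeffs, xs)
```

The fitted curve gives, for each column, the vertical offset that a dewarping step would remove. A spline-based GAM plays the same role but can follow shapes that a low-degree polynomial cannot.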
- Input Image
- Output Image
- Input Image
- Semi-processed Image
- Output Image
The rectification dataset can be viewed and downloaded through this link.
The number of splines used for the initial curve estimation is 8, and the number of splines used for the final alignment is 12.
Warping Function | DW | Word Error Rate w/o Rectification | Character Error Rate w/o Rectification | Word Error Rate w/ Rectification | Character Error Rate w/ Rectification |
---|---|---|---|---|---|
y = -x | 99.86% | 0.9440 | 0.5063 | 0.1552 | 0.0237 |
y = x^2 | 99.86% | 1.3352 | 0.8339 | 0.3973 | 0.0620 |
y = -x^3 | 99.88% | 1.1067 | 0.6613 | 0.1838 | 0.0318 |
y = x^4 | 99.92% | 1.7962 | 0.7910 | 0.3772 | 0.0575 |
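For readers unfamiliar with the metrics in the table, the sketch below shows the usual definitions of Word Error Rate and Character Error Rate as an edit distance divided by the reference length. It assumes the standard Levenshtein-based definitions; the exact evaluation tooling behind the table is not stated here, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character Error Rate: character edits / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word Error Rate: word edits / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

reference = "curved text line"
hypothesis = "curved test line"
example_wer = wer(reference, hypothesis)   # 1 wrong word out of 3
example_cer = cer(reference, hypothesis)   # 1 wrong character out of 16

# Insertions count as errors, so the rate can exceed 1.
fragmented_wer = wer("text", "t e x t")
```

Under this definition, values above 1 such as those in the "w/o Rectification" WER column are possible, because an OCR output that splits or adds words can accumulate more errors than the reference has words.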
Suppose we want to improve performance for the y = x^2 case by finding better spline counts for the two stages. Below is the variation in CER and WER scores based on the number of splines used:
If you find this repository useful, please consider citing it:
Stogiannopoulos, Thomas. “Curved Line Text Alignment: A Function That Takes as Input a Cropped Text Line Image, and Outputs the Dewarped Image.” GitHub, December 1, 2022. https://github.com/TomStog/curved-text-alignment.
For more information, you can also check my paper "Curved Text Line Rectification via Bresenham’s Algorithm and Generalized Additive Models" here.
@article{Stogiannopoulos2024CurvedTL,
title={Curved Text Line Rectification via Bresenham’s Algorithm and Generalized Additive Models},
author={Thomas Stogiannopoulos and Ilias Theodorakopoulos},
journal={Signals},
year={2024},
url={https://api.semanticscholar.org/CorpusID:273595704}
}