OCR/ICR on documents
Objective:
The goal is to find an algorithm that extracts as much information as possible from a given page.
I broke the process into the following six steps:
- Character isolation
- Noise reduction
- Boundary removal
- Normalising
- Thinning
- Feature extraction
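As a rough illustration of character isolation combined with a simple form of noise reduction, here is a minimal sketch in plain Python. It assumes the page has already been binarised into a list of rows, with 1 for ink and 0 for background. The names `connected_components` and `isolate_characters` and the `min_pixels` threshold are hypothetical, chosen for this example only, and 8-connected labelling is just one possible way to isolate characters.

```python
from collections import deque


def connected_components(img):
    """Group 8-connected ink pixels (value 1) of a binary image.

    Returns a list of components, each a list of (row, col) pixels.
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def isolate_characters(img, min_pixels=3):
    """Return bounding boxes (top, left, bottom, right), ordered left to right.

    Components with fewer than min_pixels pixels are treated as noise
    (an assumed threshold; tune it for the scan resolution).
    """
    boxes = []
    for pixels in connected_components(img):
        if len(pixels) < min_pixels:
            continue  # speck: drop it as noise
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return sorted(boxes, key=lambda b: b[1])


# Tiny example: two "C"-shaped glyphs with a one-pixel speck between them.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(isolate_characters(page))  # [(1, 1, 3, 2), (1, 6, 3, 7)]
```

Each bounding box can then be cropped and passed on to the normalising, thinning and feature extraction steps. Note that a size threshold like this would also discard small legitimate marks such as the dot of an "i", so it is only a starting point.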
Challenges:
Working with real forms raised several challenges:
- Black Border Removal
- ICR (Intelligent Character Recognition): recognize handwritten characters and convert them into text
- Scanned page (Detect edges and apply a perspective transform to obtain the top-down view of the document)
- Remove noise
- Shape detection and extraction
- OCR
- Handwriting recognition
- Minimize errors
However, the main problem was to “identify which part of the form contains text”.
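One simple way to approach the "which part contains text" question is a horizontal projection profile: count the ink pixels in every row and keep the runs of rows that contain some ink but are not almost solid black (which would suggest a ruling line of the form). The sketch below is only an illustration under that assumption. The function name `find_text_bands` and the `min_ink` and `line_fill` thresholds are hypothetical, and the input is again a binarised image stored as a list of rows of 0s and 1s.

```python
def find_text_bands(img, min_ink=1, line_fill=0.9):
    """Return (first_row, last_row) pairs for runs of rows that look like text.

    A row counts as text if it has at least min_ink ink pixels and is not
    nearly solid (ink >= line_fill * width), which is assumed to be a form line.
    """
    width = len(img[0])
    bands = []
    start = None  # first row of the band currently being collected
    for r, row in enumerate(img):
        ink = sum(row)
        is_rule = ink >= line_fill * width
        is_text = ink >= min_ink and not is_rule
        if is_text and start is None:
            start = r
        elif not is_text and start is not None:
            bands.append((start, r - 1))
            start = None
    if start is not None:
        bands.append((start, len(img) - 1))  # band runs to the bottom edge
    return bands


# Tiny form: two ruling lines (rows 0 and 5) and two bands of "text".
form = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 0, 0],
]
print(find_text_bands(form))  # [(2, 3), (6, 6)]
```

The same profile taken column-wise inside each band would narrow the result down to text boxes. This heuristic assumes the page is already deskewed and that form lines are roughly horizontal.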
My Approach:
Input image => Detecting the orientation of the image => Detecting and fixing the skew angle => Removing the form/table structure => Removing noise and making the text clearer => Applying OCR and handwriting recognition
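To make the skew detection stage more concrete, here is a minimal sketch of one common technique, the projection profile method. It is not necessarily the method used in this project. For each candidate angle the ink pixels are sheared back by that angle and binned by row; the angle at which the text lines collapse into the fewest, tallest bins is taken as the skew. The names `profile_score`, `estimate_skew` and `synthetic_lines`, the 1-degree search step and the plus or minus 5 degree range are all assumptions for illustration. A positive angle here means the text drifts downward as x increases, in image coordinates.

```python
import math
from collections import Counter


def profile_score(pixels, angle_deg):
    """Sharpness of the row histogram after undoing a skew of angle_deg.

    pixels is a list of (row, col) ink coordinates. The score is the sum of
    squared row counts, which is largest when lines fall into single rows.
    """
    slope = math.tan(math.radians(angle_deg))
    rows = Counter(round(y - x * slope) for y, x in pixels)
    return sum(n * n for n in rows.values())


def estimate_skew(pixels, max_angle=5, step=1):
    """Try every candidate angle (in degrees) and return the best-scoring one."""
    candidates = range(-max_angle, max_angle + 1, step)
    return max(candidates, key=lambda a: profile_score(pixels, a))


def synthetic_lines(angle_deg, width=100, baselines=(20, 40, 60)):
    """Build test data: three one-pixel-thick lines skewed by angle_deg."""
    slope = math.tan(math.radians(angle_deg))
    return [(y0 + round(x * slope), x) for y0 in baselines for x in range(width)]


pixels = synthetic_lines(3)
print(estimate_skew(pixels))  # 3
```

Once the angle is known, the image can be rotated by its negative to fix the skew before the form structure is removed. A real scan would need a finer angle step and usually a coarse-to-fine search.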