Abstract:
Handwriting recognition of historical documents is still a largely unsolved problem in the
field of pattern recognition. This thesis investigates how state-of-the-art deep learning
techniques perform handwriting recognition in the context of historical Ge’ez manuscripts.
Though Ge’ez was the language of literature in Ethiopia until the middle of the 19th
century, it is underrepresented in the research areas of document image analysis and
recognition. Thus, a handwriting recognition system is proposed based on real-world,
large-scale digitization scenarios. Its architecture comprises the following tasks:
pre-processing (binarization and skew estimation), page layout analysis, recognition, and
post-processing. For each task, an experimental setup was designed. In the binarization task,
four binarization methods (Otsu’s global method, Otsu’s local method, Sauvola’s method,
and Gatos’s adaptive method) were compared using the FM, p-FM, PSNR, and DRD
evaluation metrics. Sauvola’s method outperformed the other methods on all metrics. In
the document image skew estimation task, a Hough transform-based method was
evaluated experimentally on a dataset. The evaluation
criteria AED, TOP80, and CE were used, yielding values of 0.3115, 0.058,
and 76.00, respectively. In the page layout analysis task, the performance of Leptonica,
an open-source C library, was investigated; it achieved a high success
rate at the region and text-line levels over a wide variety of page layouts of actual historical
Ge’ez manuscripts. The final experimental setup was designed for building a recognition
model using the Tesseract OCR engine. Because it is difficult to prepare large training data
with ground truth from actual historical documents, a fine-tuning approach was proposed and
applied in the context of historical Ge’ez manuscripts. A total of 257 text line images
collected from 15 different pages were prepared and used to build a recognition model
with a character error rate of 2.632%. Overall, the experiments performed with the
prototyping approach produced encouraging results, indicating that developing a complete
OCR system for historical Ge’ez manuscripts is feasible. The major weakness of the
study is optimization. Therefore, further optimization with a larger training sample
is required. Furthermore, future work should consider incorporating
post-processing into the recognition process.
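
For reference, the character error rate reported above is commonly defined as the
character-level edit distance between the recognized text and the ground truth, divided by
the number of ground-truth characters. The following is a minimal sketch of that common
definition, not the evaluation code used in this thesis. The function names are
hypothetical, and counting one Unicode code point as one character is an assumption.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """CER in percent over paired text lines (assumption: one code point = one character)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of lines")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(levenshtein(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_edits / total_chars
```

Under this definition, a single wrong character in a four-character ground-truth line
corresponds to a character error rate of 25%.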