Thai handwritten recognition on text block-based from Thai archive manuscripts
Date
2020
Authors
Chamchong, R.
Gao, W.
McDonnell, M.D.
Type
Conference paper
Citation
Proceedings of the ... International Conference on Document Analysis and Recognition / sponsored by the IAPR TC-11 and TC-10, in cooperation with the IEEE Computer Society and IGS. International Conference on Document Analysis and Recog..., 2020, pp.1346-1351
Conference Name
International Conference on Document Analysis and Recognition (ICDAR) (20 Sep 2019 - 25 Sep 2019 : Sydney, NSW, Australia)
Abstract
Automatic transcription of ancient handwritten manuscripts can be more challenging than transcription of contemporary handwriting. Characters and words can have unusual and varying shapes, with significant variation between writers, and sufficient labelled data for training machine learning algorithms can be difficult to obtain. This paper describes text-block-based transcription of ancient Thai handwriting from archive manuscripts, using a hybrid deep neural network with both convolutional (CNN) and recurrent (RNN) layers, trained using Connectionist Temporal Classification (CTC) loss. Six architecture variations are compared. Data augmentation was applied to synthetically increase the number of training samples, resulting in improved learning. The Thai archive manuscripts were collected from the Thai National Library. The character error rate (CER) of the best architecture was 11.9 percent.
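The abstract reports a character error rate (CER) but does not say how it was computed. The sketch below shows the conventional definition, which may differ in detail from the one the authors used: the total Levenshtein edit distance between reference and predicted transcriptions, divided by the total number of reference characters. The function names `edit_distance` and `character_error_rate` are illustrative and do not come from the paper, and counting Unicode code points (rather than grapheme clusters, which matters for Thai combining vowels and tone marks) is an assumption.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per Unicode code point."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Under this definition, a CER of 11.9 percent corresponds to roughly 12 character edits per 100 reference characters. Note that the value can exceed 1.0 when predictions contain many insertions.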
Rights
Copyright 2019 IEEE