Thai handwritten recognition on text block-based from Thai archive manuscripts

Date

2020

Authors

Chamchong, R.
Gao, W.
McDonnell, M.D.

Type

Conference paper

Citation

Proceedings of the ... International Conference on Document Analysis and Recognition / sponsored by the IAPR TC-11 and TC-10, in cooperation with the IEEE Computer Society and IGS. International Conference on Document Analysis and Recog..., 2020, pp.1346-1351

Conference Name

International Conference on Document Analysis and Recognition (ICDAR) (20 Sep 2019 - 25 Sep 2019 : Sydney, NSW, Australia)

Abstract

Automatic transcription of ancient handwritten manuscripts can be more challenging than transcription of contemporary handwriting. Characters and words can have unusual and varying shapes, with significant variation between writers, and sufficient labelled data for training machine learning algorithms can be difficult to obtain. This paper describes text-block-based transcription of ancient Thai handwriting from archive manuscripts, using a hybrid deep neural network with both convolutional (CNN) and recurrent (RNN) layers, trained with Connectionist Temporal Classification (CTC) loss. Six architecture variations are compared. Data augmentation was applied to synthetically increase the number of training samples, which improved learning. The Thai archive manuscripts were collected from the Thai National Library. The best architecture achieved a character error rate (CER) of 11.9 percent.
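
The abstract reports its result as a character error rate. As a minimal sketch of how such a figure is commonly computed, the Python below divides the total Levenshtein edit distance between reference and predicted transcriptions by the total number of reference characters. The abstract does not give the paper's exact definition, so this formula, the function names `edit_distance` and `character_error_rate`, and the choice to count Unicode code points (Thai vowel and tone marks are separate code points) are all assumptions made for illustration.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses) -> float:
    """Assumed definition: total edit distance / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Under this assumed definition, a CER of 11.9 percent would correspond to roughly 12 character edits per 100 reference characters, aggregated over the test set.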

Rights

Copyright 2019 IEEE
