Cross domain evaluation of text detection models.

Ali-Gombe, Adamu; Elyan, Eyad; Moreno-García, Carlos; Jayne, Chrisina

doi:10.1007/978-3-031-15934-3_5

Authors

Adamu Ali-Gombe

Professor Eyad Elyan e.elyan@rgu.ac.uk
Professor

Dr Carlos Moreno-Garcia c.moreno-garcia@rgu.ac.uk
Associate Professor

Chrisina Jayne

Contributors

Elias Pimenidis
Editor

Plamen Angelov
Editor

Chrisina Jayne
Editor

Antonios Papaleonidas
Editor

Mehmet Aydin
Editor

Abstract

Text detection is a common task across a wide range of domains, such as document image analysis and remote identity verification. It is also an integral component of any text recognition system, where recognition performance depends largely on how accurately text components are detected. Various text detection models have been developed in the past decade. However, localising text characters is still considered one of the most challenging computer vision tasks within text recognition. Typical challenges include illumination, font types and sizes, languages, and many others. Furthermore, detection models are often evaluated on specific datasets, with little work on cross-dataset and cross-domain evaluation. In this paper, we present an experimental framework to evaluate the generalisation capability of state-of-the-art text detection models across different application domains. Extensive experiments were carried out using different established methods (EAST, CRAFT, Tesseract and ensembles) applied to various publicly available datasets. The generalisation performance of the models was evaluated and compared using precision, recall and F1-score. This paper opens a future direction in investigating ensemble models for text detection to improve generalisation.
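
As an illustration of the metrics named in the abstract, the following is a minimal sketch of how precision, recall and F1-score are commonly computed for box-level text detection. The abstract does not state the matching protocol used in the paper, so everything here is an assumption: axis-aligned boxes, greedy one-to-one matching, an IoU threshold of 0.5, and the hypothetical names `iou` and `detection_scores`.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    predicted: List[Box],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using greedy one-to-one matching."""
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for pred in predicted:
        best_index, best_iou = -1, 0.0
        for index, truth in enumerate(ground_truth):
            if index in matched:
                continue
            score = iou(pred, truth)
            if score > best_iou:
                best_index, best_iou = index, score
        if best_index >= 0 and best_iou >= iou_threshold:
            matched.add(best_index)
            true_positives += 1

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up boxes for illustration only; not data from the paper.
    predictions = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    truths = [(0, 0, 10, 10), (21, 20, 31, 30)]
    print(detection_scores(predictions, truths))
```

In a cross-domain setting, the same function would be applied to the detections of each model on each dataset, so that the scores can be compared across domains.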

Citation

ALI-GOMBE, A., ELYAN, E., MORENO-GARCÍA, C. and JAYNE, C. 2022. Cross domain evaluation of text detection models. In Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A. and Aydin, M. (eds.) Artificial neural networks and machine learning - ICANN 2022: proceedings of the 31st International conference on artificial neural networks (ICANN 2022), 6-9 September 2022, Bristol, UK, part III. Lecture notes in computer science, 13531. Cham: Springer [online], pages 50-61. Available from: https://doi.org/10.1007/978-3-031-15934-3_5

Presentation Conference Type	Conference Paper (published)
Conference Name	31st International conference on artificial neural networks 2022 (ICANN22)
Start Date	Sep 6, 2022
End Date	Sep 9, 2022
Acceptance Date	Jun 20, 2022
Online Publication Date	Sep 9, 2022
Publication Date	Dec 31, 2022
Deposit Date	Sep 13, 2022
Publicly Available Date	Sep 10, 2023
Publisher	Springer
Peer Reviewed	Peer Reviewed
Pages	50-61
Series Title	Lecture notes in computer science
Series Number	13531
Series ISSN	0302-9743; 1611-3349
Book Title	Artificial neural networks and machine learning - ICANN 2022
ISBN	9783031159336
DOI	https://doi.org/10.1007/978-3-031-15934-3_5
Keywords	Text detection; Efficient and accurate scene text detector; Character aware region awareness for text detection; Tesseract; Ensembles
Public URL	https://rgu-repository.worktribe.com/output/1752655

Files

ALI-GOMBE 2022 Cross domain evaluation (AAM) (3.8 Mb)
PDF
