Engineering drawings such as Piping and Instrumentation Diagrams contain a large amount of text that is essential for identifying shapes, pipeline activities and tags, among other elements. These diagrams are often stored in undigitised formats, such as paper copies, so the information they contain is not readily accessible for inspection or further data analytics. In this paper, we draw on recent advances in deep learning by selecting models for both text detection and text recognition, and apply them to the digitisation of text in complex real-world engineering diagrams. Results show that 90% of text strings were detected, including vertical text strings; however, certain non-text diagram elements were also detected as text. The text recognition method returned text strings for 86% of the detected text instances. The findings show that whilst the chosen deep learning methods were able to detect and recognise text in simple scenarios, more complex representations of text, including text strings located in close proximity to other drawing elements, remain a challenge.
JAMIESON, L., MORENO-GARCIA, C.F. and ELYAN, E. 2020. Deep learning for text detection and recognition in complex engineering diagrams. In Proceedings of the 2020 Institute of Electrical and Electronics Engineers (IEEE) International joint conference on neural networks (IEEE IJCNN 2020), part of the 2020 IEEE World congress on computational intelligence (IEEE WCCI 2020) and co-located with the 2020 IEEE congress on evolutionary computation (IEEE CEC 2020) and the 2020 IEEE International fuzzy systems conference (FUZZ-IEEE 2020), 19-24 July 2020, [virtual conference]. Piscataway: IEEE [online], article ID 9207127. Available from: https://doi.org/10.1109/IJCNN48605.2020.9207127