A multiclass imbalanced dataset classification of symbols from piping and instrumentation diagrams.

Authors

Jamieson, Laura; Moreno-García, Carlos Francisco; Elyan, Eyad

Contributors

Elisa H. Barney Smith
Editor

Marcus Liwicki
Editor

Liangrui Peng
Editor

Abstract

Engineering diagrams provide a rich source of information and are widely used across different industries. Recent years have seen growing research interest in developing solutions for processing and analysing these diagrams using a wide range of image-processing and computer vision techniques. In this paper, we first present a new multiclass imbalanced dataset of symbols extracted from Piping and Instrumentation Diagrams (P&IDs). The dataset contains 7,728 instances representing 48 different types of engineering symbols, and it is considered the first of its kind in the research community. Second, we present a new method for handling multiclass imbalance classification based on class decomposition by means of unsupervised machine learning methods. Experiments using Convolutional Neural Networks showed that class decomposition significantly improves classification performance without causing the information loss associated with other class imbalance data sampling approaches.
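
The class decomposition idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says "unsupervised machine learning methods", so the choice of k-means, the function names (`kmeans`, `decompose_classes`, `to_original`), the parameters `k` and `min_size`, and the toy 2-D features are all assumptions made here for illustration. The sketch splits each sufficiently large class into sub-classes by clustering, so that a classifier could be trained on the more balanced sub-labels and its predictions mapped back to the original classes. No samples are discarded or synthesised.

```python
import random
from collections import defaultdict


def _sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns a cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assign = [
            min(range(k), key=lambda c: _sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: keep the old centroid if a cluster ends up empty.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = _mean(members)
    return assign


def decompose_classes(features, labels, k=2, min_size=4, seed=0):
    """Relabel each sample as (original_class, cluster_index).

    Classes with fewer than min_size samples are left as a single sub-class.
    """
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    sub_labels = [None] * len(labels)
    for y, idxs in by_class.items():
        if len(idxs) < min_size:
            clusters = [0] * len(idxs)
        else:
            clusters = kmeans([features[i] for i in idxs], k, seed=seed)
        for i, c in zip(idxs, clusters):
            sub_labels[i] = (y, c)
    return sub_labels


def to_original(sub_label):
    """Map a predicted sub-class back to its original class."""
    return sub_label[0]


# Toy data (hypothetical): a majority class with two visual variants
# and a minority class that is left undivided.
features = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [9.0, 0.0], [9.1, 0.1],
]
labels = ["valve"] * 6 + ["sensor"] * 2
sub_labels = decompose_classes(features, labels, k=2, min_size=4)
```

In this toy case the six "valve" samples are split into two sub-classes of three, which brings the largest class closer in size to the two-sample "sensor" class. In a real pipeline the features would more plausibly come from the symbol images themselves, and the number of clusters per class would need to be chosen per dataset.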

Citation

JAMIESON, L., MORENO-GARCÍA, C.F. and ELYAN, E. 2024. A multiclass imbalanced dataset classification of symbols from piping and instrumentation diagrams. In Barney Smith, E.H., Liwicki, M. and Peng, L. (eds.) Proceedings of the 18th International conference on Document analysis and recognition 2024 (ICDAR 2024), 30 August - 04 September 2024, Athens, Greece. Lecture notes in computer science, 14804. Cham: Springer [online], part 1, pages 3-16. Available from: https://doi.org/10.1007/978-3-031-70533-5_1

Presentation Conference Type: Conference Paper (published)
Conference Name: 18th International conference on Document analysis and recognition 2024 (ICDAR 2024)
Start Date: Sep 2, 2024
Acceptance Date: Mar 31, 2024
Online Publication Date: Sep 8, 2024
Publication Date: Dec 31, 2024
Deposit Date: Sep 8, 2024
Publicly Available Date: Sep 9, 2025
Peer Reviewed: Peer Reviewed
Volume: Part 1
Pages: 3-16
Series Title: Lecture notes in computer science (LNCS)
Series Number: 14804
Series ISSN: 0302-9743; 1611-3349
Book Title: Document Analysis and Recognition - ICDAR 2024
ISBN: 9783031705328; 9783031705335
DOI: https://doi.org/10.1007/978-3-031-70533-5_1
Keywords: Piping and instrumentation diagrams; Class imbalance; Convolutional neural networks
Public URL: https://rgu-repository.worktribe.com/output/2457618

Files

This file is under embargo until Sep 9, 2025 due to copyright reasons.

Contact publications@rgu.ac.uk to request a copy for personal use.