A graph repository for learning error-tolerant graph matching.

In recent years, efforts in the pattern recognition field have focused especially on developing systems that use graph-based representations. To that aim, some graph repositories have been presented to test graph-matching algorithms or to learn some of the parameters that such algorithms need. The aim of these tests has always been to increase the recognition ratio in a classification framework. Nevertheless, some graph-matching applications are not intended solely for classification, but also to detect similarities between the local parts of the objects that the graphs represent. For such applications, current state-of-the-art repositories provide insufficient information. We present a graph repository structure in which each register is composed not just of a graph and its class, but of a pair of graphs, a ground-truth correspondence between them, and their class. This repository structure is useful to analyse and develop graph-matching algorithms and to learn their parameters in a broad manner. We present seven


Introduction
For this reason, in 2008, a database specifically intended for benchmarking graph-based methods was published for the first time [6]. As the authors reported, they published the database and its accompanying paper to give the scientific community a public, general framework for evaluating graph representations and graph algorithms [7][8][9], such as error-tolerant graph matching [10][11][12][13][14][15], learning the consensus of several correspondences [16][17][18][19][20], image registration based on graphs [21,22], learning graph-matching parameters [23,24], and so on. Note that a huge number of methods have been presented, and the previous list is only a small sample of them. For a detailed list of methods, we refer to the aforementioned surveys [1][2][3].

In this paper, we present a new graph-database structure. Registers in this database are composed of a pair of graphs, a ground-truth correspondence between them, and the class of these graphs. This ground truth is independent of the graph-matching algorithm and of its specific parameters, since it has been imposed by a human or by an optimal automatic technique. Therefore, the quality measures that we can extract are not only those related to classification, but also those related to the ground-truth correspondence, such as the Hamming distance (HD) between the obtained correspondence and the ground-truth correspondence. Moreover, graph-matching learning algorithms that need a given ground-truth correspondence [19,33,35,36,37] can be applied and evaluated. We instantiate this structure in seven different databases, and we report some quality measures obtained on them.
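As an illustration of a correspondence-based quality measure, the following is a minimal sketch of the Hamming distance. It assumes that a correspondence is stored as a dictionary from nodes of the first graph to nodes of the second, that `None` stands for the null node, and that the distance counts the nodes whose image differs between the two correspondences. The names `Correspondence` and `hamming_distance` are hypothetical and chosen only for this example.

```python
from typing import Dict, Hashable, Optional

# Assumption: a correspondence maps each node of the first graph to a node
# of the second graph, or to None when it is mapped to the null node.
Correspondence = Dict[Hashable, Optional[Hashable]]


def hamming_distance(f_obtained: Correspondence, f_truth: Correspondence) -> int:
    """Count the nodes that the two correspondences map differently."""
    if f_obtained.keys() != f_truth.keys():
        raise ValueError("both correspondences must be defined on the same nodes")
    return sum(1 for node in f_truth if f_obtained[node] != f_truth[node])


# Toy data (not taken from the repository): nodes 1 and 3 are mapped differently.
truth = {0: 0, 1: 1, 2: None, 3: 2}
obtained = {0: 0, 1: 2, 2: None, 3: 1}
distance = hamming_distance(obtained, truth)
print(distance)
```

A distance of zero means that the algorithm reproduced the ground truth exactly; dividing by the number of nodes would give a normalised error rate, if that is preferred.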

The rest of the paper is structured in two sections. In the first, we present the graph repository and its benchmarks. In the second, we conclude the paper.

$G'_i$; $f_i$; $C_i$. Attributed graphs $G_i$ and $G'_i$ need to be defined in the same attribute domain, but may have different orders. The ground-truth correspondence $f_i$ between the nodes of $G_i$ and $G'_i$ may map some nodes of $G_i$ to nodes of $G'_i$ and others to a null node. Nevertheless, two nodes of $G_i$ cannot be mapped to the same node of $G'_i$.

The null node is a mechanism to represent that a node of $G_i$ does not have to be mapped to any node of $G'_i$ [10]. Note that some nodes of $G'_i$ may not be the image of any node of $G_i$ through $f_i$. Moreover, we require both graphs to belong to the same class, because we consider that it makes no sense to map local parts of objects that belong to different classes. For instance, if graphs represent hand-written characters, there is no ground-truth correspondence between an "A" and a "J".
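The constraints on a register can be checked mechanically. The sketch below is only an illustration: it assumes integer node identifiers, omits node and edge attributes, and uses `None` for the null node. The `Register` class and the `is_valid` function are hypothetical names, not part of the repository.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Register:
    nodes_g: List[int]           # node identifiers of the first graph
    nodes_g_prime: List[int]     # node identifiers of the second graph
    f: Dict[int, Optional[int]]  # ground truth; None stands for the null node
    label: str                   # class shared by both graphs


def is_valid(reg: Register) -> bool:
    """Check the constraints that the text places on a ground-truth correspondence."""
    # Every node of the first graph needs an image: a node of the second graph or null.
    if set(reg.f) != set(reg.nodes_g):
        return False
    images = [v for v in reg.f.values() if v is not None]
    # Non-null images must be nodes of the second graph.
    if any(v not in reg.nodes_g_prime for v in images):
        return False
    # Two nodes of the first graph cannot share the same non-null image.
    return len(images) == len(set(images))


# Node 2 goes to the null node; node 12 of the second graph is left unmapped.
ok = Register([0, 1, 2], [10, 11, 12], {0: 10, 1: 11, 2: None}, "A")
# Nodes 0 and 1 are mapped to the same node, which is not allowed.
clash = Register([0, 1], [10, 11], {0: 10, 1: 10}, "A")
print(is_valid(ok), is_valid(clash))
```

Because the two graphs may have different orders, the check deliberately allows several nodes to map to the null node and allows nodes of the second graph to remain unmapped.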

The output has the format $G_i$; $G'_i$; $C_i$; $f_i$; $I_i$; $I'_i$. $G_i$ and $G'_i$ are both graphs with class $C_i$, $f_i$ is the ground-truth correspondence, and $I_i$ and $I'_i$ are the indices of graphs $G_i$ and $G'_i$, respectively. These indices are useful to know which graphs have been mapped to which others, since any given graph can appear in several registers, although each time it has to be mapped to a different graph.
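To make the role of the indices concrete, the following sketch enumerates the index pairs that could form registers, given one class label per graph. It assumes that each unordered pair of distinct graphs from the same class yields one register; whether the repository also stores the reverse ordering is not stated here. The function name `same_class_pairs` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import List, Tuple


def same_class_pairs(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Return (index of first graph, index of second graph, class) for same-class pairs."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    pairs = []
    for label, indices in by_class.items():
        # A graph may appear in several pairs, each time with a different partner.
        for i, j in combinations(indices, 2):
            pairs.append((i, j, label))
    return pairs


# Toy labels: graphs 0, 1 and 3 belong to class "A"; graphs 2 and 4 to class "B".
labels = ["A", "A", "B", "A", "B"]
pairs = same_class_pairs(labels)
print(pairs)
```

Under this assumption, a class with n graphs contributes n(n-1)/2 registers, and no pair ever mixes two classes.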

To construct this database, we used palmprint images from the Tsinghua 500 DPI Database [41], which currently has more than 150 subjects whose right and left palms have each been scanned 8 times. Using the first 20 palms of the original database (10 right hands and 10 left hands), this database consists of 20 classes of 8 graphs each. Minutiae were extracted using the algorithm proposed in [42], and graphs were constructed with each node representing a minutia.

Node attributes contain information such as the minutia position, angle, type (termination or bifurcation) and quality (good or poor). Edges are built using the Delaunay triangulation and do not have attributes. Finally, a correspondence between all graphs of the same class is generated using a greedy matching algorithm based on the Hough transform [43]. An example of a palmprint image and its graph is provided in Fig. 3.

There are three variants of the database, viz. low, medium and high, depending on the degree of distortion applied to the original prototype (adding, deleting and moving nodes and edges). The ground-truth correspondence between the nodes is known exactly, because the graphs of each class are generated from an original prototype.
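The reason why the ground truth comes for free can be seen in a small sketch. It is not the generator used for the database: it distorts nodes only (edges are omitted), and the parameters `p_delete`, `sigma` and `n_add`, as well as the function name `distort`, are hypothetical. The point is that the prototype-to-graph correspondence is recorded while the distortion is applied.

```python
import random
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def distort(
    prototype: List[Point],
    p_delete: float,
    sigma: float,
    n_add: int,
    rng: random.Random,
) -> Tuple[List[Point], Dict[int, Optional[int]]]:
    """Delete, move and add nodes; return the new nodes and the known correspondence."""
    distorted: List[Point] = []
    f: Dict[int, Optional[int]] = {}
    for idx, (x, y) in enumerate(prototype):
        if rng.random() < p_delete:
            f[idx] = None  # deleted node: mapped to the null node
            continue
        f[idx] = len(distorted)  # position of the moved copy in the new graph
        distorted.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))
    for _ in range(n_add):
        # Spurious nodes are not the image of any prototype node.
        distorted.append((rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)))
    return distorted, f


prototype = [(0.1, 0.2), (0.4, 0.4), (0.8, 0.3), (0.5, 0.9)]
nodes, f = distort(prototype, p_delete=0.25, sigma=0.02, n_add=1, rng=random.Random(0))
print(len(nodes), f)
```

Raising `p_delete`, `sigma` and `n_add` together would correspond to moving from a low to a high degree of distortion. A correspondence between two distorted graphs of the same class could then be obtained by composing their two prototype correspondences.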

The Sagrada Familia 3D database consists of a set of graphs, where each one represents  (Fig. 6).