Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions.

Eldjárn, Grímur Hjörleifsson; Ramsay, Andrew; van der Hooft, Justin J.J.; Duncan, Katherine R.; Soldatou, Sylvia; Rousu, Juho; Daly, Rónán; Wandy, Joe; Rogers, Simon


Grímur Hjörleifsson Eldjárn

Andrew Ramsay

Justin J.J. van der Hooft

Katherine R. Duncan

Sylvia Soldatou

Juho Rousu

Rónán Daly

Joe Wandy

Simon Rogers


Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem is becoming ever more pressing with the increased availability of paired omics data sets. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. Based on standardising a commonly used score, we introduce a new, more effective score, and introduce a novel score using an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.


ELDJÁRN, G.H., RAMSAY, A., VAN DER HOOFT, J.J.J., DUNCAN, K.R., SOLDATOU, S., ROUSU, J., DALY, R., WANDY, J. and ROGERS, S. 2021. Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions. PLOS computational biology [online], 17(5), e1008920. Available from:

Journal Article Type Article
Acceptance Date Mar 26, 2021
Online Publication Date May 4, 2021
Publication Date May 31, 2021
Deposit Date May 31, 2021
Publicly Available Date May 31, 2021
Journal PLOS Computational Biology
Print ISSN 1553-734X
Electronic ISSN 1553-7358
Publisher Public Library of Science
Peer Reviewed Peer Reviewed
Volume 17
Issue 5
Article Number e1008920
Keywords Ecology; Modelling and simulation; Computational theory and mathematics; Genetics; Ecology, evolution, behavior and systematics; Molecular biology; Cellular and molecular neuroscience