Machine learning for improved pathological staging of prostate cancer: a performance comparison on a range of classifiers.

Regnier-Coudert, Olivier; McCall, John; Lothian, Robert; Lam, Thomas; McClinton, Sam; N'Dow, James

doi:10.1016/j.artmed.2011.11.003

Authors

Olivier Regnier-Coudert

Professor John McCall j.mccall@rgu.ac.uk
Professorial Lead

Dr Robert Lothian r.m.lothian@rgu.ac.uk
Lecturer

Thomas Lam

Sam McClinton

James N'Dow

Abstract

Prediction of prostate cancer pathological stage is an essential step in a patient's pathway, as it determines the treatment that will subsequently be applied. In current practice, urologists use the pathological stage predictions provided in Partin tables to support their decisions. However, Partin tables are based on logistic regression (LR) and built from US data. Our objective is to investigate a range of both predictive methods and predictive variables for pathological stage prediction and to assess their predictive quality on UK data.

The latest version of Partin tables was applied to a large-scale British dataset in order to measure its performance by means of the concordance index (c-index). The data were collected by the British Association of Urological Surgeons (BAUS) and comprise records from over 1700 patients treated with prostatectomy in 57 centers across the UK. The original methodology was replicated using the BAUS dataset and evaluated using the c-index. In addition, a selection of classifiers, including LR, artificial neural networks and Bayesian networks (BNs), among others, was applied to the same data and compared using the area under the ROC curve (AUC). Subsets of the data were created in order to observe how classifiers perform when extra variables are included. Finally, a local dataset prepared by the Aberdeen Royal Infirmary was used to study the effect on predictive performance of using different variables.

Partin tables have low predictive quality (c-index = 0.602) when applied to UK data to discriminate between patients with organ-confined (OC) disease and those with extra-prostatic extension, the two most frequently observed pathological stages. The use of replicate lookup tables built from British data shows an improvement in the classification, but the overall predictive quality remains low (c-index = 0.610). Comparing a range of classifiers shows that BNs generally outperform other methods. Using the four variables from Partin tables, naive Bayes is the best classifier for the prediction of each class label (AUC = 0.662 for OC). When two additional variables are added, the results of LR (0.675), artificial neural networks (0.656) and BN methods (0.679) improve overall. BNs show higher AUCs than the other methods as the number of variables increases.

The predictive quality of Partin tables can be described as low to moderate on UK data. This means that, following the predictions generated by Partin tables, many patients would receive an inappropriate treatment, generally associated with a deterioration in their quality of life. In addition to demographic differences between the UK and the original US population, the methodology, and LR in particular, presents limitations. BNs represent a promising alternative to LR from which prostate cancer staging can benefit. Heuristic search for structure learning and the inclusion of more variables are elements that further improve the quality of BN models.
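
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how a concordance index can be computed for a two-class outcome; in that setting it coincides with the AUC. It is not the authors' code. The function name `concordance_index`, the choice of which stage is coded as 1, and the toy labels and scores are all assumptions made for illustration and are not taken from the BAUS or Aberdeen data.

```python
from itertools import product


def concordance_index(labels, scores):
    """Share of (positive, negative) pairs in which the positive case
    receives the higher predicted score; tied scores count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")

    concordant = 0.0
    for pos_score, neg_score in product(positives, negatives):
        if pos_score > neg_score:
            concordant += 1.0
        elif pos_score == neg_score:
            concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = one pathological stage, 0 = the other.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2]  # predicted probability of class 1
c_index = concordance_index(labels, scores)
print(f"c-index = {c_index:.3f}")
```

A value of 0.5 corresponds to ranking pairs no better than chance and 1.0 to perfect discrimination, which is the scale on which the reported values of roughly 0.60 to 0.68 should be read.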

Citation

REGNIER-COUDERT, O., MCCALL, J., LOTHIAN, R., LAM, T., MCCLINTON, S. and N'DOW, J. 2012. Machine learning for improved pathological staging of prostate cancer: a performance comparison on a range of classifiers. Artificial intelligence in medicine [online], 55(1), pages 25-35. Available from: https://doi.org/10.1016/j.artmed.2011.11.003

Journal Article Type	Article
Acceptance Date	Nov 17, 2011
Online Publication Date	Dec 27, 2011
Publication Date	May 31, 2012
Deposit Date	Jul 20, 2018
Publicly Available Date	Jul 20, 2018
Journal	Artificial intelligence in medicine
Print ISSN	0933-3657
Electronic ISSN	1873-2860
Publisher	Elsevier
Peer Reviewed	Peer Reviewed
Volume	55
Issue	1
Pages	25-35
DOI	https://doi.org/10.1016/j.artmed.2011.11.003
Keywords	Predictive modelling; Bayesian networks; Logistic regression; Prostate cancer staging; Partin tables
Public URL	http://hdl.handle.net/10059/3010
Contract Date	Jul 20, 2018