An analytical prediction of breast cancer using machine learning.

Chilukuri, N.V.S. Guru Sai Sarma; Bano, Shahana; Tholeti, Guru Sree Ram; Kamma, Sai Pavan; Niharika, Gorsa Lakshmi

Authors

N.V.S. Guru Sai Sarma Chilukuri

Guru Sree Ram Tholeti

Sai Pavan Kamma

Gorsa Lakshmi Niharika



Contributors

Amit Kumar
Editor

Sabrina Senatore
Editor

Vinit Kumar Gunjan
Editor

Abstract

Breast cancer is one of the most frequent cancers among women, affecting about 2 million people. The 5-year survival rate is 98% when the disease is detected at an early stage. The breast cancer data used in this paper is the Wisconsin dataset, taken from Kaggle. This is a binary classification problem with two classes (0 representing a non-malignant tumor, 1 representing malignancy). A min-max scaler is used to preprocess the data, rescaling it to a fixed range (known as scaling). The algorithms used for classification are support vector classifier, random forest, naïve Bayes, decision tree and k-nearest neighbours. Evaluation metrics, such as the area under the receiver operating characteristic curve (AUC-ROC), the confusion matrix and the recall score, were used to assess model performance. To guard against overfitting, k-fold cross-validation is used with k = 3. Support vector classifier and random forest gave the highest accuracy.
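The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code: the abstract does not say which library or implementation was used, and the function names here (`min_max_scale`, `k_fold_indices`, `confusion_counts`, `recall`) are hypothetical. The folds are contiguous and unshuffled for simplicity, which may differ from the splitting used in the paper.

```python
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one feature column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def k_fold_indices(n_samples: int, k: int = 3) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    Assumption: no shuffling or stratification is applied.
    """
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
        start = stop
    return folds


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels, with 1 = malignant as in the abstract."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly malignant cases that were predicted malignant."""
    _, _, fn, tp = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

In a screening setting, recall is worth reporting alongside accuracy because a false negative (a malignant tumor classified as non-malignant) is usually the costlier error.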

Citation

CHILUKURI, N.V.S.G.S.S., BANO, S., THOLETI, G.S.R., KAMMA, S.P. and NIHARIKA, G.L. 2022. An analytical prediction of breast cancer using machine learning. In Kumar, A., Senatore, S. and Gunjan, V.K. (eds.) Proceedings of the 2nd International conference on data science, machine learning and applications (ICDSMLA 2020), 21-22 November 2020, Pune, India. Lecture notes in electrical engineering, 783. Singapore: Springer [online], pages 185-202. Available from: https://doi.org/10.1007/978-981-16-3690-5_17

Conference Name: 2nd International conference on data science, machine learning and applications (ICDSMLA 2020)
Conference Location: Pune, India
Start Date: Nov 21, 2020
End Date: Nov 22, 2020
Acceptance Date: Oct 15, 2020
Online Publication Date: Nov 9, 2021
Publication Date: Dec 31, 2022
Deposit Date: Jun 7, 2024
Publicly Available Date: Jun 7, 2024
Publisher: Springer
Pages: 185-202
Series Title: Lecture notes in electrical engineering
Series Number: 783
Series ISSN: 1876-1100; 1876-1119
ISBN: 9789811636899
DOI: https://doi.org/10.1007/978-981-16-3690-5_17
Keywords: Breast cancer detection; Artificial neural networks; Artificial intelligence and medicine; Random forests; Decision trees
Public URL: https://rgu-repository.worktribe.com/output/2063956
