Cancer Research Journal

High Accuracy Classification of Populations with Breast Cancer: SVM Approach

Breast cancer is one of the most commonly diagnosed cancers in the United States, and it can occur in both men and women. Deaths associated with the disease are steadily declining, largely because of earlier detection and a new personalized approach to treatment. In this article, we present a highly accurate and reliable classification approach based on feature engineering and an improved support vector machine (SVM) classifier. We examine a dataset with 30 features and use in-depth data analytics and visualization to identify the nine features with the greatest impact on classification accuracy. The SVM classifier outperformed the other classifiers considered, including kernel extensions, reaching an accuracy of 99.12%. The study underscores the value of machine learning in medical diagnosis, notably in the early detection of breast cancer, and points to further research in this area using deep learning architectures. Early detection of breast cancer is critical, and our findings contribute to the growing body of knowledge in this area, opening new avenues for improving cancer diagnosis and patient care.
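The workflow summarized in the abstract (reduce 30 features to nine, then train an SVM) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the data are synthetic rather than the dataset used in the article, the feature score (a standardized gap between class means) is a simple stand-in for the authors' analytics and visualization, and the classifier is a plain linear soft-margin SVM trained by subgradient descent rather than the improved SVM the article reports. The function names `make_data`, `rank_features`, and `train_linear_svm` are hypothetical.

```python
import random


def make_data(n, n_features=30, n_informative=9, shift=2.0, seed=0):
    """Synthetic stand-in data: only the first n_informative features depend on the class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1          # balanced classes, labels in {+1, -1}
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += label * shift / 2.0         # class means sit at +/- shift/2
        X.append(row)
        y.append(label)
    return X, y


def rank_features(X, y, k):
    """Return the indices of the k features with the largest standardized class-mean gap."""
    pos = [r for r, t in zip(X, y) if t == 1]
    neg = [r for r, t in zip(X, y) if t == -1]
    scores = []
    for j in range(len(X[0])):
        col = [r[j] for r in X]
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
        gap = abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))
        scores.append((gap / std, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=30, seed=0):
    """Linear soft-margin SVM: stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj - lr * lam * wj for wj in w]  # gradient of the regularization term
            if margin < 1:                        # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, row):
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1


X, y = make_data(400)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# Select features on the training split only, so the test split stays unseen.
selected = rank_features(X_train, y_train, k=9)
X_train_sel = [[r[j] for j in selected] for r in X_train]
X_test_sel = [[r[j] for j in selected] for r in X_test]

w, b = train_linear_svm(X_train_sel, y_train)
correct = sum(predict(w, b, r) == t for r, t in zip(X_test_sel, y_test))
accuracy = correct / len(y_test)
print("selected features:", selected)
print("test accuracy:", accuracy)
```

The accuracy printed by this sketch reflects only the synthetic data and says nothing about the 99.12% reported in the article. In practice a library implementation with kernel support would likely replace the hand-written trainer.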

Breast Cancer, Support Vector Machine, Feature Engineering, Early Detection, Machine Learning, Classification, Data Analytics

Philip de Melo, Mane Davtyan. (2023). High Accuracy Classification of Populations with Breast Cancer: SVM Approach. Cancer Research Journal, 11(3), 94-104.

Copyright © 2023 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
