A Narrative Review on Machine Learning Methods for Pre-Malignant Blood Cells Identification in Hematological Malignancies

Samuel Bamidele Afolabi; Tobechi Brendan Nnanna; Glory Ojoma, Simon; Damian Ndubuisi Nwajei; Victor Damilare Oladele; Adibia, Umoroye Nathan; Tobiloba Philip Olatokun

doi:10.9734/irjo/2026/v9i1203

Published: 2026-04-30

Page: 134-150

Issue: 2026 - Volume 9 [Issue 1]

Samuel Bamidele Afolabi *

Federal University, Oye-Ekiti, Nigeria.

Tobechi Brendan Nnanna

Aston Pharmacy School, College of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK.

Glory Ojoma, Simon

Department of Information Technology, AI, Data Science and Machine Learning, Halmstad University, Halmstad, Sweden.

Damian Ndubuisi Nwajei

Department of Microbiology, University of Port Harcourt, Port Harcourt, Nigeria.

Victor Damilare Oladele

Department of Biomedical Science, University of Salford, Salford, UK.

Adibia, Umoroye Nathan

Department of Microbiology, University of Port Harcourt, Port Harcourt, Nigeria.

Tobiloba Philip Olatokun

MPH Environmental Health, University of Illinois Springfield, Springfield, Illinois, USA.

*Author to whom correspondence should be addressed.

Abstract

Hematological malignancies, including leukemias, lymphomas, myelomas, and myelodysplastic syndromes, impose a substantial global health burden, accounting for approximately 10% of all new cancer diagnoses worldwide. Early identification of pre-malignant blood cells, which offers a critical window for preventive intervention, remains clinically challenging because conventional diagnostic tools such as light microscopy, flow cytometry, and next-generation sequencing are not individually optimized for risk stratification before overt disease manifests. This review examines machine learning (ML) approaches for classifying pre-malignant blood cells, synthesizing evidence from 25 studies encompassing 38,417 participants across diverse clinical settings. Ensemble methods and Random Forest algorithms demonstrated consistently strong discriminative performance, achieving AUC-ROC values ranging from 0.856 to 0.932. Multi-omics integration, which combines morphological, immunophenotypic, genetic, and epigenetic data, systematically outperformed single-domain approaches, underscoring the biological complexity of pre-malignant transformation. Key predictive biomarkers identified across studies included CD34 expression levels, telomere length attrition, and TP53 mutation status, consistent with established pathways of clonal hematopoietic evolution. Despite these promising findings, significant methodological limitations were identified: external validation was reported in only 44% of studies, and open-source code availability was documented in just 40%, raising concerns about reproducibility and generalizability. Additionally, most training cohorts lacked demographic diversity, limiting applicability across varied populations.
Successful translation of ML-based pre-malignant cell classification into routine clinical practice will require prospective validation trials, standardized reporting frameworks aligned with existing diagnostic criteria, and the development of ethnically and geographically diverse training datasets.

Keywords: Machine learning, hematological malignancies, pre-malignant blood cells, cancer risk stratification, multi-omics integration

How to Cite

Afolabi, Samuel Bamidele, Tobechi Brendan Nnanna, Glory Ojoma, Simon, Damian Ndubuisi Nwajei, Victor Damilare Oladele, Adibia, Umoroye Nathan, and Tobiloba Philip Olatokun. 2026. “A Narrative Review on Machine Learning Methods for Pre-Malignant Blood Cells Identification in Hematological Malignancies”. International Research Journal of Oncology 9 (1):134-50. https://doi.org/10.9734/irjo/2026/v9i1203.
