New machine learning technique found to be 30% better at predicting cancer cure rates


With the rapid growth in computing power over the past few decades, machine-learning (ML) techniques have become popular in medical settings as a way to predict survival rates and life expectancies among patients diagnosed with diseases such as cancer, heart disease, stroke, and more recently, COVID-19. Such statistical modeling helps patients and caregivers choose treatment that offers the highest chance of a cure while minimizing the consequences of potential side effects.

A professor and his doctoral student at The University of Texas at Arlington have published a new model for predicting cancer survival that they say is 30% more effective than previous models at predicting who will be cured of the disease. The model can help patients avoid treatments they don’t need while allowing treatment teams to focus on patients who need additional interventions.

The work is published in The Annals of Applied Statistics.

“Previous studies modeling the probability of a cure, also called the cure rate, used a generalized linear model with a known parametric link function such as the logistic link function. However, this type of research doesn’t capture non-linear or complex relationships between the cure probability and important covariates, such as the age of the patient or the age of a bone marrow donor,” said principal investigator Suvra Pal, associate professor of statistics in the Department of Mathematics.

“Our research takes the previously tested promotion time cure model (PCM) and combines it with a supervised type of ML algorithm called a support vector machine (SVM) that is used to capture non-linear relationships between covariates and cure probability.”
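For readers unfamiliar with the promotion time cure model, the short Python sketch below illustrates its standard textbook form, in which the population survival function is S(t) = exp(-theta(x) * F(t)) and the cure probability is the long-run limit exp(-theta(x)). It is only an illustration under stated assumptions, not the authors' method: the log-linear link, the coefficients b0 and b1, and the exponential distribution F(t) are all hypothetical, and the sketch does not show the SVM step, which the article says is used to capture non-linear covariate effects.

```python
import math

def promotion_rate(x, b0=-0.5, b1=0.03):
    """Hypothetical log-linear link theta(x) = exp(b0 + b1 * x).
    The coefficients are made up for illustration only."""
    return math.exp(b0 + b1 * x)

def baseline_cdf(t, rate=0.1):
    """Assumed exponential distribution function F(t) for the promotion time."""
    return 1.0 - math.exp(-rate * t)

def population_survival(t, theta):
    """Standard PCM population survival: S(t) = exp(-theta * F(t))."""
    return math.exp(-theta * baseline_cdf(t))

def cure_probability(theta):
    """Limit of S(t) as t grows without bound: exp(-theta)."""
    return math.exp(-theta)

# Compare two hypothetical patient ages (the covariate x).
for age in (30, 60):
    theta = promotion_rate(age)
    print(age, round(cure_probability(theta), 3),
          round(population_survival(5.0, theta), 3))
```

With a parametric link like this one, the cure probability can only rise or fall steadily with the covariate. That rigidity is the limitation Pal describes in the earlier models.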

The new SVM-integrated PCM (PCM-SVM) is developed in a way that builds upon a simple interpretation of covariates to predict which patients will be uncured at the end of their initial treatment and will need additional medical interventions.

To test the technique, Pal and his student Wisdom Aselisewine took real survival data for patients with leukemia, a type of blood cancer that is often treated with a bone marrow transplant. The researchers chose leukemia because it is caused by the rapid production of abnormal, cancerous white blood cells. Since this does not happen in healthy people, they were able to clearly see which patients in the historical data set were cured by treatments and which were not.

Both statistical models were tested on these data, and the newer PCM-SVM technique was found to be 30% more effective than the previous technique at predicting who would be cured by the treatments.

“These findings clearly demonstrate the superiority of the proposed model,” Pal said. “With our improved predictive accuracy of cure, patients with significantly high cure rates can be protected from the additional risks of high-intensity treatments. Similarly, patients with low cure rates can be recommended timely treatment so that the disease does not progress to an advanced stage for which therapeutic options are limited. The proposed model will play an important role in defining the optimal treatment strategy.”

More information:
Suvra Pal et al, A semiparametric promotion time cure model with support vector machine, The Annals of Applied Statistics (2023). DOI: 10.1214/23-AOAS1741

Journal information:
Annals of Applied Statistics
