Leveraging machine learning sparks innovation in bone marrow transplantation
Major innovation often follows from the intersection of multiple technologies or disciplines. One prominent example may be in your hand or pocket right now: the combination of a communication device, computer, camera, entertainment system and other electronics that make up the smartphone. Innovation can stem from even simpler mixtures of just two elements, such as how combining the chemical capture of light with theater created movies or how peanut butter and chocolate together make up a famous, orange-wrapped candy.

In medicine, the combination of computation and clinical expertise is opening doors toward better care. Machine learning, the use of computer systems designed to learn and adapt from data without following explicit instructions, is a pioneering technology. When scientists from St. Jude Children’s Research Hospital combined machine learning with expertise in bone marrow transplantation, they were able to create a machine learning algorithm that predicted patient survival at 100 days, one year and two years post-procedure more accurately than existing risk prediction models.

“One of the things that we have always struggled with is how to predict who’s going to have a bad outcome after transplantation,” said Akshay Sharma, MBBS, St. Jude Department of Bone Marrow Transplantation and Cellular Therapy, and co-corresponding author on the recent study in Blood Advances. “If we can predict who will have these bad outcomes, we can do something about it. This study was a proof-of-concept that it is possible to improve early prediction of bad outcomes after bone marrow transplantation using machine learning.”

“We are well aware that transplantation, being a major medical procedure, typically entails extended inpatient stays lasting for weeks,” added co-corresponding author Li Tang, PhD, St. Jude Department of Biostatistics. “It is also common practice in most centers to conduct frequent monitoring and regular tests on transplant patients.
However, prior to our research, there had been limited exploration into how continuous monitoring over time, the integration of electronic medical records and the assimilation of other procedural information could be leveraged to enhance the accuracy for predicting adverse outcomes in these patients.”

“Our algorithm achieved better accuracy for short- and long-term survival predictions than any preexisting risk prediction models,” said first author Yiwang Zhou, PhD, St. Jude Department of Biostatistics. “That outperformance came from our machine learning algorithm and incorporation of longitudinal measurements from these transplant patients.”

Bone marrow transplantation, a curative procedure for leukemia, can be very risky for patients. Patients are treated with chemotherapy to remove their own bone marrow, then receive new bone marrow from a donor. The process is meant to replace their cancerous blood cells, and remaining normal blood-forming cells, with healthy blood-forming cells from the donor. However, there are many potential complications, such as graft-versus-host disease, graft failure and opportunistic infections, which are all potentially life-threatening.

Given the significant risks associated with bone marrow transplantation, physicians have created risk prediction models to identify who needs additional medical interventions because they are at the highest risk of developing these complications after transplant. Many of these models are less than ideal, with only 50% accuracy when estimating mortality risk. That is because many of these risk prediction models have a fundamental design flaw: they only use data from a single point in time.

“In most models, that snapshot in time occurs before transplantation,” Sharma explained. “Nobody was accounting for the changes that happened to patients after transplantation, the most intense part of this whole treatment. So, we have been making predictions without considering the procedure.”

The St.
Jude algorithm is different because it looks at multiple time points, from one month before up to one month after the transplant, thus creating a longitudinal dataset and enabling the algorithm to find patterns.

“It’s very new for children with cancer,” Zhou said. “This is the first prediction model incorporating longitudinal data for pediatric patients during the allogeneic hematopoietic cell transplantation.”

“We were really surprised that even though the data in the model was only up to 30 days after transplantation, it greatly improved the accuracy of our predictions at one year and two years after transplant compared to just considering baseline values,” Sharma said. “It is a testament to the power of collecting and using longitudinal data in a practical way.”

In addition to including longitudinal data, the St. Jude model improved its predictions by including far more variables in its analysis than its predecessors. “Prior models only included 10 to 20 variables,” Zhou said. “In our method, in addition to that baseline information, we started with over 100 longitudinal measurements from clinical tests. That is a huge increase in the number of variables considered.”

To protect transplant patients, physicians regularly collect blood samples to monitor their condition. That data is housed in standardized electronic health records and represents an opportunity for machine learning. Unlike early statistical models, machine learning models can handle the volume of data held in these records. Instead of incorporating 10 to 20 variables, they can use over 100 per day per patient. That influx of information, previously too vast to analyze, combined with the trending nature of longitudinal data, allowed the St. Jude algorithm to find patterns indicating whether a patient was likely to develop complications and succumb to them.

The scientists validated the algorithm at both St. Jude and a partner institution. At St.
Jude, 70% of the data from the analyzed cohort was used to train the algorithm. The statisticians then validated its predictions on the remaining 30%. But the true test was whether it functioned for patients treated elsewhere. When they applied it to a large cohort from Memorial Sloan Kettering Cancer Center, the model continued to predict potential complications successfully.

“This validation shows our model is very robust,” Zhou explained. “It’s reliable, and the prediction can be reproduced based on data collected from a different institution.”

By combining two disparate disciplines, namely machine learning and bone marrow transplantation, St. Jude scientists created the first version of a powerful prediction tool that could one day improve patient outcomes. The special mixture of disciplines was the recipe for innovation.

“St. Jude is a unique place where we’re not only good at doing transplants, and we do a lot of them, but we also have a great biostatistics core,” Sharma said. “We can put our minds together to develop a solution no one else has been able to create yet.”

“We showed how both statisticians and doctors can work together to make something new,” Zhou said. “I would encourage pediatric oncology researchers to think about using machine learning algorithms in their future research. Databases are becoming bigger and bigger, collecting huge amounts of data in electronic health records. In the future, machine learning methods will help us uncover more of these discoveries in many areas of medicine.”

“This marks the beginning of our journey, and our plans involve delving deeper into this data with cutting-edge computational methods,” emphasized Tang. “We aim to uncover additional insights, piece together a clearer picture and develop enhanced solutions for the benefit of both clinicians and patients. It’s important to stress that this is just the outset.”
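
For readers curious how a 70/30 validation split like the one described above might look in practice, here is a minimal Python sketch. It is an illustration only, not the study's code: the function name `split_cohort`, the patient identifiers and the measurement days are all hypothetical, and splitting by patient (so that every measurement from one patient lands on the same side) is an assumption rather than something the study description confirms.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle unique patient IDs and split them into training and held-out sets."""
    unique_ids = sorted(set(patient_ids))  # one entry per patient, in a stable order
    rng = random.Random(seed)              # seeded so the split is reproducible
    rng.shuffle(unique_ids)
    n_train = round(len(unique_ids) * train_fraction)
    return unique_ids[:n_train], unique_ids[n_train:]


# Hypothetical cohort: 10 patients, each measured every 10 days
# from 30 days before to 30 days after transplant.
records = [(f"patient_{i:02d}", day) for i in range(10) for day in range(-30, 31, 10)]

train_ids, holdout_ids = split_cohort(pid for pid, _ in records)
train_set, holdout_set = set(train_ids), set(holdout_ids)

# Assumption: all longitudinal rows for a patient follow that patient's assignment.
train_records = [r for r in records if r[0] in train_set]
holdout_records = [r for r in records if r[0] in holdout_set]
```

External validation, such as the test on the second institution's cohort, would then apply the already-trained model to that cohort unchanged, with no refitting on the new data.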