For the application of real health data and federated learning in AI, we still have much work to do – and biases to address. If we do this right, a global health care system can emerge that not only improves patient outcomes but is also free of racism. That was the main conclusion of the international speakers at a special edition of the Medical Data + Pizza event series on Sept. 22.
The Amsterdam Medical Data Science (AMDS) network held a masterclass on ‘Real World Data – Promises, Pitfalls and Racism’ during this Medical Data + Pizza event. We still have much to do. But, as one participant put it, “Challenges are meant to be tackled.” Another said, “We need to take the time to do this right.”
And we might as well start with ourselves. “I think the most important concrete step we can take is to diversify the people involved in AI research,” said Leo Celi, of Harvard University and MIT. “Expertise is important. But so is perspective.”
Devil is in the detail
In his presentation, “How to use real-world data to improve patient outcomes,” clinician-scientist Chris Sauer, a doctoral candidate at VU Amsterdam, talked about his paper, which would be released 45 minutes later. “So this is really hot off the presses,” he smiled. “The paper is about the lessons learned in working with electronic health records. I believe these data hold a lot of promise.”
The application of electronic health record (EHR) data is not just about using already available real-world data to dramatically reduce research time and costs. These data can also be used in their own right to provide unique and localised insights. At the same time, they are sufficiently random to minimise bias.
But it remains tricky. Sauer’s paper, “Leveraging electronic health records for data science: common pitfalls and how to avoid them”, emphasises the importance of in-depth knowledge of clinical workflows. “The devil is in the details,” Sauer said more than once during his presentation. One might assume, for example, that the ICUs in Amsterdam and Berlin are similar in terms of outcomes. In reality, there appears to be a nearly 50% difference in ICU mortality rates. You would therefore want to find the root causes of this difference before sharing data for research.
In short, the quality of methodology – the application of thoughtful human intelligence – is the key to deriving valid and accurate study results. Or as Sauer puts it, “Be transparent. Never lie to yourself or the readers.”
First learn to swim, then deep dive
Christian Hinske is professor of data management and clinical decision support at Augsburg University Hospital. He was quick to answer the question posed by the title of his presentation, ‘Federated learning – A failed promise?’: “No, it’s not. But we still have a lot of work to do.”
Federated learning is an approach in which algorithms are trained collaboratively, without the need to exchange data. Thus, it directly confronts a major challenge surrounding the digital transformation of healthcare: data governance and privacy. The related secure multi-party computation approach originally arose from the ‘Millionaire’s Problem’: how two millionaires can figure out who is richer – and thus who pays the dinner bill – without exchanging their bank statements or relying on a third party to verify them.
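The core trick behind such protocols can be illustrated with a toy secure-aggregation sketch (a hypothetical example for illustration, not a production protocol): parties learn an aggregate without revealing their individual inputs. Here, each pair of parties agrees on a random mask; one adds it and the other subtracts it, so all masks cancel in the total.

```python
import random

def mask_inputs(values, rng):
    """Each pair of parties shares a random mask; one adds it, the
    other subtracts it, so the masks cancel when everything is summed.
    No single masked value reveals the input behind it."""
    n = len(values)
    masked = list(values)
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.uniform(-1e6, 1e6)  # pairwise shared random mask
            masked[i] += r
            masked[j] -= r
    return masked

# Three parties with private inputs (e.g. the millionaires' fortunes).
rng = random.Random(42)
secrets = [1_200_000, 850_000, 2_000_000]
masked = mask_inputs(secrets, rng)

# The coordinator sees only the masked values, yet their sum equals
# the true total (up to floating-point rounding).
total = sum(masked)
```

Full protocols such as Yao's solution to the Millionaire's Problem add cryptographic machinery on top of this cancellation idea, but the principle is the same: the computation travels, the raw inputs do not.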
“The idea of federated learning is that we don’t really bring the data together, but we try to bring the algorithm to the data,” Hinske said. The sector is still growing in terms of investment and number of tools, and it is still very much in its infancy. “I attended a talk with NVIDIA and they told us how easy their federated learning tool was to use in the ICU. But when I asked if I could do a simple t-test, silence fell.” With that, he echoed Chris Sauer’s lecture: “We need to do the simple things first, before we start deep learning.”
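“Bringing the algorithm to the data” can be sketched even for the simple statistics Hinske mentions (a minimal, hypothetical example, not the NVIDIA tool referenced above): each hospital computes local summary statistics, and only those summaries – never patient records – travel to the coordinator.

```python
def local_summary(values):
    """Run at each hospital: return count, sum, and sum of squares.
    Raw patient-level values never leave the site."""
    n = len(values)
    s = sum(values)
    ss = sum(v * v for v in values)
    return n, s, ss

def pooled_mean_and_variance(summaries):
    """Run at the coordinator: combine per-site summaries into
    global statistics, as needed for e.g. a pooled t-test."""
    n = sum(c for c, _, _ in summaries)
    s = sum(t for _, t, _ in summaries)
    ss = sum(q for _, _, q in summaries)
    mean = s / n
    var = (ss - n * mean * mean) / (n - 1)  # sample variance
    return mean, var

# Two sites with synthetic local data (e.g. ICU length of stay in days).
site_a = [2.0, 3.5, 4.0, 1.5]
site_b = [5.0, 6.5, 4.5]

summaries = [local_summary(site_a), local_summary(site_b)]
mean, var = pooled_mean_and_variance(summaries)
```

The pooled mean and variance come out identical to what a centralised analysis of the combined data would give, which is exactly the promise of the federated approach.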
Still, Hinske remains quite hopeful. Recent research has shown that collaborative models outperform single-setting models, and that performance remained reliable even on datasets with unbalanced and extreme distributions. “My message is that it is a young and powerful technology with exciting promise. But we need more time and experience to tackle the challenges. After all, challenges are meant to be met.”
As someone who has practiced medicine on three continents, Leo Anthony Celi has a global perspective when it comes to providing health care. As director of clinical research and principal investigator at the MIT Laboratory for Computational Physiology, he is a proponent of the idea that data and machine learning will prove to be the most impactful of all medicines. This is reflected in projects his lab has led, such as the MIMIC Code Repository, MIT Critical Data and the eICU Collaborative Research Database, a collaboration with Philips. As he puts it, “Our mission is to innovate the way we create medical knowledge.”
Celi has a practiced eye when it comes to biases in the system. In short: health care is intrinsically unfair. While Celi’s lecture was announced as ‘Why medical AI is racist,’ he renamed it ‘Healthcare is not ready for AI – and neither are humans’. “My main point is that we still have a lot of work to do – as the previous speakers also emphasised. We are only at the beginning of this journey. There is so much hype now. But I think we are going in the wrong direction and we need to reset our GPS.”
“But the really good thing is that AI has held up this mirror to us. We could no longer avoid it, because AI is all about the data. So now everything is under a magnifying glass; we really look at everything we do under the microscope. That to me is the biggest contribution of AI.”
The Amsterdam Medical Data Science Group meetings are supported by The Right Data Right Now consortium, which includes Amsterdam UMC, OLVG, Vrije Universiteit, Pacmed, Amsterdam Economic Board and Smart Health Amsterdam.