A new artificial intelligence predicts which diseases you will suffer from in the future

Smoking can cause cancer. Overeating and a lack of exercise lead to obesity, and excess weight in turn puts the heart and many other organs at risk. Knowing these risk factors can help predict which dangers the future holds, and to take measures to avoid them. But a new artificial intelligence developed by researchers at the European Molecular Biology Laboratory (EMBL) aims to do much more: to estimate, year by year, the risk each of us runs of developing more than a thousand diseases over the course of a lifetime.

Deciphering the grammar of health

The AI created by the EMBL researchers, in collaboration with the German Cancer Research Center and the University of Copenhagen, was recently presented in an article published in Nature. It is based on a purpose-built algorithm with an architecture similar to that of the large language models behind modern generative AI such as ChatGPT: programs that are trained on huge text databases and that learn on their own to build phrases and sentences by weighing the probability that each word is the right one in its context.

EMBL's medical AI does the same, but it works with the probability that a disease will arise in a person.

The algorithm was trained using the data of over 400 thousand people taken from the UK Biobank, a database that collects, for research purposes, detailed information on the health of hundreds of thousands of citizens of the United Kingdom. In this way, it learned on its own to analyze a patient's clinical history as a sequence of events unfolding over time, and to calculate which events are most likely to occur in the future based on those that occurred in the past. In a sense, it is as if it had learned to decipher the “grammar” with which health data write our future: habits, diagnoses, test results, and any other information that may appear in a medical record become variables with which the model can predict how the patient’s medical history will continue.
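The idea of reading a clinical history as a sequence of events can be illustrated with a toy sketch. This is not the researchers' model, which the article describes as similar in architecture to large language models. It is a much simpler counting approach that only looks at which event directly follows another. The event names, the histories and the function names are all invented for illustration.

```python
from collections import Counter, defaultdict


def train_transitions(histories):
    """Count how often each event is directly followed by another event."""
    counts = defaultdict(Counter)
    for events in histories:
        for current, following in zip(events, events[1:]):
            counts[current][following] += 1
    return counts


def next_event_probabilities(counts, last_event):
    """Convert the counts recorded after `last_event` into probabilities."""
    following = counts.get(last_event, Counter())
    total = sum(following.values())
    if total == 0:
        return {}  # event never seen in training, or never followed by anything
    return {event: n / total for event, n in following.items()}


# Hypothetical, made-up histories; not real patient data.
histories = [
    ["smoking", "hypertension", "heart_attack"],
    ["smoking", "hypertension", "stroke"],
    ["obesity", "hypertension", "heart_attack"],
    ["obesity", "diabetes", "heart_attack"],
]

counts = train_transitions(histories)
probs = next_event_probabilities(counts, "hypertension")

# Print the most likely next events first.
for event, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{event}: {p:.2f}")
```

In this made-up data, "hypertension" is followed by "heart_attack" in two of three histories, so the sketch assigns it the higher probability. A model of the kind described in the article would instead take the whole history and the timing of events into account.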

The field test

Once the model had been trained, all that remained was to test it. This was done first on a dataset taken from the UK Biobank, different from the one used in the training phase, and then on the data of almost two million people contained in the national patient registry of the Danish hospital system. The results were encouraging: the artificial intelligence proved able to reliably predict the onset of many diseases, in particular those that follow a relatively regular clinical course, such as some forms of cancer, cardiovascular diseases and sepsis.

Figure: the orange curves show the model's simulations, the blue curves the observed data.

As with a weather forecasting model, the AI’s predictions proved more accurate over short periods of time: calculating the risk of a heart attack over the next year is easier, and gives more reliable results, than forecasting the risk five or ten years ahead.

“Medical events often follow predictable paths,” explains Tom Fitzgerald, an EMBL researcher who collaborated on the development of the AI. “Our model learns to identify these patterns and to predict future health outcomes. It provides us with a tool to explore what could happen based on a person's medical history and other key factors. It is important to understand that it is not a certainty, but an estimate of the potential risks.”

What will we do with it?

For now, the model has been developed only for research purposes and is not intended for adoption in clinical practice, at least not in the near future. The limitations of the research, its authors warn, are still many. They start with the database used to train the model, composed mainly of people of European ethnicity aged between 40 and 60, which therefore does not allow the AI to reliably evaluate the risks faced by other age groups (such as children and adolescents) or by different ethnic or socioeconomic groups.

Although it is not ready for clinical use, the model will still be useful for research: to study in greater depth how diseases develop over time, to explore the long-term effects of lifestyles and previous illnesses on our health, and to carry out epidemiological and other simulations in contexts where real health data are not accessible.

“It is the beginning of a new way of understanding human health and the development of diseases,” explains Moritz Gerstung of the German Cancer Research Center. “Generative models like ours will one day help to personalize care and anticipate health needs on a large scale. By learning from large populations, these models offer a powerful magnifying glass for studying the development of diseases and may support earlier and more personalized interventions,” he concludes.