Predictive health intelligence: Potential, limitations and sense making

We discuss the new paradigm of predictive health intelligence, based on the use of modern deep learning algorithms and big biomedical data, along three dimensions: a) its potential, b) the limitations it encounters, and c) the sense it makes. We conclude by arguing that viewing data as the sole source of health knowledge, fully abstracting from human medical reasoning, may affect the scientific credibility of health predictions.


Introduction
As we have learned from the Covid-19 pandemic, anticipating the dynamics of a disease, worldwide or at the level of a specific country, means going beyond the monitoring of infections and developing an analytical intelligence able to assess, compare and predict the risks of outbreaks and the threats to the global or regional health of populations [1][2][3][4]. With the term predictive intelligence, we refer to a new computing framework that draws upon a sophisticated variety of statistical, mathematical and computational methods, ranging from traditional data mining and optimization techniques up to artificial intelligence (AI) and deep learning [5][6][7][8]. Not only are these techniques useful for analyzing and interpreting historical and current data, but they often become indispensable for understanding the relationships among different criteria and factors, thus guiding decision making in almost every sector. These predictive intelligence technologies, based on machine and deep learning in particular, have recently taken hold in the health context, especially in those cases where, given a certain emerging health phenomenon: i) huge amounts of sparse health data arrive from unreliable and heterogeneous sources, ii) quick interpretations and rapid responses are required to face the most urgent implications of that phenomenon, and iii) the human ability to make accurate health predictions is impaired by the increasing number of interacting dimensions [9][10][11]. In the following Sections of this Editorial, we discuss this notion of predictive health intelligence, based on the use of modern machine learning algorithms and of big biomedical data, along the dimensions of: a) its potential, b) the limitations it encounters, and c) the sense it makes.

Potential
Based on the use of big biomedical data, an interdisciplinary subject has recently emerged that combines medicine, computational sciences, biology and mathematics. It primarily uses methods of sub-symbolic artificial intelligence, together with great amounts of data, to understand the principles and the physiological mechanisms behind human diseases, providing guidance for disease prediction as well as medical diagnosis. Deep learning is a bright exemplar in this context: it has overcome the disadvantages of other, more traditional mathematical and computational methods, to the point that it has been used to map the concepts coded in patients' electronic health records and in clinical images, helping doctors to predict outcomes such as the need for hospitalization (or rehospitalization) and even mortality [12,13]. Moreover, the recent introduction of so-called attention mechanisms has helped further, enabling deep learning models to focus, within the multitude of medical data, on the information that contributes most to an accurate health prediction. Finally, not only can these AI-based methods account for the health conditions of a given individual, but, by exploiting any available dataset, they are also able to incorporate various key social determinants of health as wider predictors. AI-powered algorithms could, in fact, predict future risks for particular racial, gender or ethnic sectors of a population by considering social factors such as education and socioeconomic status, thus extending the reach of risk prediction, prevention and treatment well beyond the perspective of the single individual's biology [14,15].
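To make the idea of attention more concrete, the following minimal Python sketch shows only its core step: turning relevance scores into weights that sum to one, so that the most relevant items dominate the prediction. It is an illustration under stated assumptions, not the method of any cited study; the record items and the relevance scores are hypothetical, hand-set values, whereas in an actual deep learning model the scores would be computed from learned parameters.

```python
import math

def softmax(scores):
    """Turn arbitrary relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical items from a patient record, with hand-set relevance scores.
# In a trained model these scores would be computed from learned parameters.
record_items = ["recent lab result", "prior admission", "routine check-up"]
relevance_scores = [2.0, 1.0, -1.0]

attention_weights = softmax(relevance_scores)
for item, weight in zip(record_items, attention_weights):
    print(f"{item}: weight = {weight:.2f}")
```

The item with the highest score receives the largest weight, which is the sense in which the model "focuses" on part of the data.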

Limitations
The majority of health prediction technologies build on the principles of supervised learning to recognize data patterns and predict events. Unfortunately, a learning algorithm cannot predict events on which it has never been trained, either because they have never occurred or because they are completely unexpected. Resorting to unsupervised learning has not been shown to offer an effective solution to this problem, at least in the field of health predictions. Even more worrying is the case in which a prediction provides a wrong answer with erroneous confidence that it is right. Such prediction failures may occur frequently when we deal with incomplete or unrepresentative data, data of poor quality or precision, or ambiguous or biased sources of information [16]. From this point of view, electronic medical record data may often be at the root of prediction failures, because they can be flawed for many reasons, including their limited period of validity. Finally, one should never disregard the fact that we are dealing with statistical machines (according to the definition provided by Noam Chomsky) that, after taking in enormous quantities of data and searching for common patterns in them, have become tremendously proficient at generating statistically probable outcomes [17]. Nonetheless, balancing the probability values of those outputs in a way that may favor either false positives or false negatives remains an open issue, on which a misdiagnosis or an overdiagnosis can depend.
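The trade-off between false positives and false negatives can be illustrated with a small Python sketch. It assumes a model that outputs a risk probability for each patient and a decision threshold above which the patient is flagged; the probabilities, outcomes and thresholds below are hypothetical values chosen only for illustration, not data from any study.

```python
def confusion_counts(probs, labels, threshold):
    """Count false positives and false negatives at a given decision threshold."""
    # False positive: flagged as at risk (p >= threshold) but disease absent.
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False negative: not flagged (p < threshold) but disease present.
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    return fp, fn

# Hypothetical predicted risks and true outcomes (1 = disease present).
probs = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.95]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

for threshold in (0.3, 0.5, 0.7):
    fp, fn = confusion_counts(probs, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, lowering the threshold removes the missed cases at the price of more false alarms (a path towards overdiagnosis), while raising it does the opposite (a path towards missed diagnoses). Which balance is acceptable is a clinical judgment that the data alone do not settle.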

Sense making
With the term sense making we mean the process by which human beings give meaning to a collective experience, with the aim of providing a rationale for what they are doing. From this perspective, we should never forget what machine learning research actually is, especially when it deals with human health. Despite the huge help that the statistics and machine learning cultures are giving towards the goal of predicting health, it would be dangerous to think that health predictions lie in the data alone. Judea Pearl, a Turing Award laureate, has been one of the first to argue against this data-fitting ideology, contrasting it with a so-called data-interpreting approach [18]. He warns us against the danger of idolizing the possibility of a perfect prediction obtained simply by taking as input all the data that we can collect. He tells us that a fully synthetic, data-centric approach alone cannot rival human knowledge and the perfect balance between its implicit and explicit components. In fact, if we think about the nature of health knowledge only in terms of process-generated data, while abstracting from other fundamental notions, like those of theory or cause-effect relationship, we run the empirical risks that AI mechanisms pose in terms of sampling bias in the input data and of inaccurate health predictions arising from statistical outputs [19,20]. In contrast, restoring a balance between human reasoning and the data a given environment generates will provide the means for a better interpretation of reality, and hence better predictions [21,22].

Ethics of research
This Editorial does not contain any private information on patients. Therefore, ethical approval is not required.

Conclusions
Predictive health intelligence has recently made giant steps forward in providing accurate and credible health predictions, at the level of single individuals, of groups of individuals, and up to that of an entire population. These advances have been favored by the emergence of AI-powered mechanisms (e.g., deep learning) combined with the use of huge amounts of biomedical data available in different forms (electronic records, clinical images and others). In this Editorial, we have discussed this phenomenon from different perspectives. We have concluded our discussion by maintaining that viewing data as the sole source of health knowledge, without any deeper scrutiny by medical experts with clinical skills, may raise serious concerns about the validity of the resulting diagnoses and predictions, thus affecting the scientific credibility of the approach.