Clinical Judgment and “Big Data”

Introduction
In 1971 Godfrey Hounsfield and his team took the first in vivo image of the human brain at Atkinson Morley Hospital in London, in order to investigate a patient with a suspected frontal lobe tumour [1]. In the same period Tim de Dombal, a surgeon at St. James Infirmary in Leeds, successfully carried out a trial of a "computer assisted diagnosis" system that used statistical techniques to assess the likelihood that a patient presenting with acute abdominal pain was suffering from each of a number of conditions [2]. By 1975 the first commercial CT scanners were available (at about $1M a machine) and every major hospital in the western world aspired to have one; by 2007, 72 million CT scans were being performed annually in the USA alone [1]. By 1975 it was also possible to program de Dombal's method on a $99 Hewlett Packard calculator. Yet by 2007 the use of statistical methods in diagnosis and treatment decisions had achieved almost no penetration into routine clinical practice (Figure 1).

This is a remarkable contrast, particularly when one considers that the main uses of CT and most other imaging systems are limited to investigating conditions that manifest as anatomical abnormalities that can be visualised, while statistical decision analysis is potentially useful in any clinical decision, from risk assessment, diagnosis and prognosis to the selection of tests and investigations, treatment planning and many other routine clinical problems; and it doesn't require any special equipment.
Clinical decision support systems are an obvious class of applications for the modern tablet computer and smartphone, so why does statistical decision-making still have so little impact on clinical practice? It is not that the techniques are immature: Bayesian inference and decision analysis methods are very well understood from a mathematical point of view, and they have been used successfully in many other settings.
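The kind of calculation de Dombal's system performed can be sketched in a few lines. The fragment below applies Bayes' rule under the usual assumption that findings are conditionally independent given the disease; the diseases, findings and every probability value are hypothetical, chosen only to illustrate the arithmetic, not drawn from de Dombal's actual data.

```python
# Independence ("naive") Bayes over causes of acute abdominal pain.
# All diseases, findings and numbers below are hypothetical.

# Prior probabilities of each candidate condition.
priors = {"appendicitis": 0.25, "cholecystitis": 0.10, "non-specific": 0.65}

# P(finding present | disease) for two clinical findings.
likelihoods = {
    "appendicitis":  {"rlq_tenderness": 0.85, "nausea": 0.70},
    "cholecystitis": {"rlq_tenderness": 0.15, "nausea": 0.60},
    "non-specific":  {"rlq_tenderness": 0.20, "nausea": 0.30},
}

def posterior(findings):
    """Bayes' rule, assuming findings are independent given the disease."""
    scores = {}
    for disease, prior in priors.items():
        p = prior
        for f in findings:
            p *= likelihoods[disease][f]
        scores[disease] = p
    total = sum(scores.values())          # normalise so posteriors sum to 1
    return {d: p / total for d, p in scores.items()}

print(posterior(["rlq_tenderness", "nausea"]))
```

The whole computation is a handful of multiplications and one division, which is why it fitted on a 1975 pocket calculator; the hard part, as the next section explains, is obtaining trustworthy values for the priors and likelihoods.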
One reason often mentioned for the lack of take-up in medicine is that the techniques are famously "data hungry": in order to get enough information to diagnose a disease reliably, or to predict the success of a treatment or other clinical intervention, you need good data for a lot of patients. This can take a long time and be problematic in a variety of ways. GLADYS, the Glasgow Dyspepsia System, was an early computer-aided diagnosis project that used probabilistic methods for diagnosing upper abdominal pain. A key step for the GLADYS team was to estimate the probabilities of particular symptoms and signs occurring in patients with a range of different conditions. They did this by recording the clinical histories of patients who presented in their gastroenterology clinic, using a standardized questionnaire and terminology. They quickly accumulated data for patients presenting with common causes of dyspepsia (e.g. duodenal ulcer and "functional" disease), but the rate-limiting factor in obtaining good estimates of probabilities for the dyspepsia domain was the least common conditions, notably gastric cancer. In fact it took several years to compile a sufficiently large set of patients with cancer from a single clinic to fully populate the statistical database. Further complications arose from patients' and doctors' different and often vague use of terms [3], and from difficulties transferring the system to other geographical locations where the original statistical parameters may not be valid [4].
If it isn't practical to estimate statistical parameters on a large scale, perhaps we might turn to expert clinicians to provide subjective estimates of likelihoods? Unfortunately, as cognitive scientists have clearly shown, expert estimates of probability parameters are subject to various forms of bias, and subjective estimates of probabilities are often poorly "calibrated" with respect to objective estimates [5].
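What "calibration" means can be made concrete with a small sketch: group a set of probability estimates into bins and compare each bin's average forecast with the frequency at which the predicted event actually occurred. The forecasts and outcomes below are invented purely for illustration.

```python
# Minimal calibration check: bin probability forecasts and compare each
# bin's mean forecast with the observed event frequency.
# All data below are invented for illustration.

forecasts = [0.9, 0.8, 0.9, 0.7, 0.3, 0.2, 0.8, 0.1, 0.9, 0.3]
outcomes  = [1,   1,   0,   1,   0,   0,   0,   0,   1,   1]  # 1 = event occurred

def calibration_table(forecasts, outcomes, n_bins=2):
    """Return (mean forecast, observed frequency) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(forecasts, outcomes):
        i = min(int(p * n_bins), n_bins - 1)   # which bin this forecast falls in
        bins[i].append((p, y))
    table = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            freq = sum(y for _, y in b) / len(b)
            table.append((round(mean_p, 2), round(freq, 2)))
    return table

print(calibration_table(forecasts, outcomes))
```

For a well-calibrated forecaster the two numbers in each row sit close together; the systematic gaps that cognitive scientists report in expert judgment show up as rows where the mean forecast and the observed frequency diverge.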
By 1980 a new option for helping clinicians with their decision-making had appeared, one that took more of a logical "rule-based" approach rather than a statistical one. The hubristically named "expert systems" promised to use this approach to "model human knowledge" and "emulate expert reasoning" in many fields, one of the most prominent of which was medicine [6].
In the last 10 years or so rule-based techniques have become very popular for developing commercial clinical decision support systems and are being widely promoted. Despite practical successes there are reasons to be cautious about their use in clinical applications, particularly those where a lot is riding on making the right decision. Firstly, they are not grounded in a well-understood mathematical framework for clinical decision-making, in contrast with established statistical decision models. Furthermore, it is very difficult to check and validate rules without very precise terminology, and the old software adage "garbage in, garbage out" still applies.

Big Data
Perhaps it is now time for mathematical methods to return to the stage, and perhaps this time to achieve the kind of impact in routine clinical practice that CT and other medical imaging techniques have had. This is the promise of "big data". A large number of scientists, clinicians and computer scientists are taking this idea very seriously because of the rapid development of techniques for extracting useful information from the huge collections of structured and unstructured data to be found on the web. The appetite for large amounts of data characteristic of mathematical decision models may be about to be sated by the ability to rapidly estimate, and continuously update, reliable parameters that can be fed into statistical decision models.
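What "continuously updating" a parameter might look like can be sketched with the simplest possible case: a conjugate Beta-Bernoulli update of a single probability, such as P(symptom | disease), revised record by record as new cases arrive. The stream of observations below is invented for illustration.

```python
# Continuously updating a single probability estimate as records stream in,
# using a conjugate Beta-Bernoulli update. Observations are invented.

class BetaEstimate:
    """Posterior over an unknown probability, e.g. P(symptom | disease)."""

    def __init__(self, alpha=1.0, beta=1.0):   # alpha=beta=1 is a uniform prior
        self.alpha, self.beta = alpha, beta

    def update(self, observed):
        """Fold in one new case record: was the symptom present?"""
        if observed:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self):
        """Posterior mean estimate of the probability."""
        return self.alpha / (self.alpha + self.beta)

est = BetaEstimate()
for record in [True, True, False, True, False, True, True]:
    est.update(record)   # the estimate is usable after every single record
print(round(est.mean, 3))
```

Each update is a single increment, so the estimate is always current no matter how fast records arrive; the engineering challenge of "big data" lies in extracting such observations reliably from heterogeneous documents, not in the update itself.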
Entirely new fields of clinical information technology, called "predictive analytics" and "prescriptive analytics", are growing rapidly.
"Big data" seems to have captured people's imagination far beyond the medical and clinical research world, in part perhaps because of our everyday experience with the magic of Google and other search engines. Google, Autonomy, SAS clinical analytics, GE Healthcare and Microsoft are just a few of the companies actively looking to apply big data technologies in clinical decision-making, as is the new kid on the block, IBM's "Watson", which famously beat two national champions on the US general-knowledge game show Jeopardy and is now being refitted for healthcare applications. Watson's advocates are promising revolutionary benefits to clinicians: "properly applied this has the potential to totally change the practice of medicine" (e.g. The Next Cancer Breakthrough, http://www.youtube.com/watch?v=hMtXHvbecY0).

What does Watson do? In one IBM demonstration video (http://www.youtube.com/watch?v=HZsPc0h_mt or http://www.youtube.com/watch?v=8lGJ0h_jAp8) the voiceover says: "Dr. Mark Norton, a clinical oncologist, logs into the electronic medical record for one of his patients. Instead of spending time trying to find relevant information he uses the IBM oncology diagnosis and treatment advisor and pushes the Ask Watson button, and Watson analyses the patient data against tens of thousands of documents in its vast body of medical literature" (the screen shows "3,469 textbooks; 69 guidelines; 247,460 journal articles; 61,540 clinical trials; 106,054 other clinical documents").
The voiceover continues: "Dr. Norton starts with the case information tab and Watson pulls out the relevant information as well as suggestions for additional information to gather" [and] "tests that Watson suggests he consider ordering". At the push of a button Watson can give specific text snippets to support the suggestions it makes, and pull up the evidence that underpins these snippets from the open research literature. Finally "he presses the treatment options tab to see a panel of confidence-scored suggested treatments" (see Figure 2) "and a list of clinical trials to consider, and again can review the supporting evidence…". The magic continues, with Dr. Norton speaking directly to Watson to give additional information and ask questions (http://www.forbes.com/sites/johnnosta/2013/03/03/is-watson-in-jeopardy-this-oncologist-thinks-so/).
At this point the Watson scenario in the video is only a demonstration; the narrator tells us that the system is not yet operational. If and when it is, we will want to see evidence that the new capabilities (processing documents on a huge scale, estimating statistical parameters in real time and predicting the benefits of alternative treatments) are really going to deliver significant patient benefits. Nevertheless, despite the lack of firm and objective evidence to date, the possible implications of big data seem clear.

Caveats
There is a danger of getting a bit carried away, though. In another IBM marketing video (http://www.youtube.com/watch?v=8DBqLTdPolI) a senior clinician from Memorial Sloan Kettering says "we have the opportunity of going past intelligence to what I would call wisdom" and "Watson is going to enable us to … take that wisdom and put it in a way that people who don't have that much experience in any individual disease can have a wise counselor at their side".
The key thing to remember about Watson, however, is that despite its obvious power to crunch massive amounts of medical information and clinical data, its capabilities are not really comparable to the repertoire of abilities, skills and insights that a professional clinician calls clinical judgment. Watson is only what the Watson technical team designed it to be: "a smart question/answer system". Being able to answer the range of general-knowledge questions that arise in the Jeopardy game show is very impressive, and being able to rapidly come up with answers to important clinical questions like "what are the treatment options for my patient?" is surely one of the core skills of the expert clinician.
But clinical expertise and judgment are much more than the ability to answer questions. For starters, a key feature of clinical judgment is understanding what is relevant and what isn't: knowing what the important questions are, when to deviate from the standard or usual care pathway, and so forth. It means being able to plan treatments as well as answer questions, to help patients set up a personalized plan of care consistent with their individual needs and preferences and, critically these days, to consider a patient's personal circumstances, co-morbidities, poly-pharmacy and so on. For many patients it also includes the expectation that the clinician will explain things clearly (what is being recommended and why) and tailor the detail and depth of explanations to a patient's goals, abilities and desire for information.
A key question in thinking about how big data might address these challenges is how people in general, including doctors and their patients, actually make decisions. An important set of insights from cognitive science is that people don't make their decisions on statistical grounds, but in very different ways (e.g. see [7]). Patients don't apply numerical models in their decision-making, nor do their doctors most of the time, and when they try they don't do it very well. People do something else.

Cognitive Systems
IBM tells us that Watson is a "cognitive system". However, although humans are far from perfect in their reasoning and decision-making, and medical error is an important and growing challenge to safe patient care, humans are still the best cognitive systems we know of. It is true that we apply rough-and-ready "rules of thumb" and "fast and frugal heuristics" to come up quickly with hypotheses and options for action, but few artificial cognitive systems can yet match the range, flexibility and creativity of our thinking.
First, people frame their decisions flexibly, dynamically tailoring their goals and priorities in response to the current situation, using general problem-solving strategies to achieve their goals as well as bringing specialist expertise to bear.
This problem solving can be creative when we are facing a situation we have not encountered before. Such characteristically human capabilities are finely honed in experienced clinicians who can rapidly recognize any need to change direction if circumstances change.
People cope with uncertainty and ambiguity in ways that are very different from statistical methods. We construct and articulate reasons for our beliefs and actions, often in the form of logical arguments for and arguments against alternative options, and we can assess whether the rationale for arguments is sound as well as looking at whether they are consistent with new data and recent evidence.
Humans have another powerful capability that is not a feature of many computer systems. We are "meta-cognitive" creatures; we can reflect critically on our thinking, our knowledge and our assumptions, on the persuasiveness of evidence and adequacy of arguments, and whether our assumptions are still valid as the world around us changes. We can also reflect on the tradeoffs between costs and benefits of competing treatments, and balance competing criteria. Is the cost of a treatment combined with the risk of an unsuccessful outcome more or less attractive than the impact on quality of life offered by a less effective but also less aggressive chemotherapy? It is a core clinical skill to be able to help patients understand such questions and arrive at a choice that they will not later come to regret [8][9][10][11].
Finally, actually taking a decision is not just a matter of cool and rational assessment of data but is frequently bound up with the patient's personal aspirations, anxieties and cultural values.
Good clinical judgment is about trying to accommodate these and many other issues. Patients and their doctors want to discuss their decisions, to review and compare their options and how they might play out over their lives and family circumstances. They want to know the provenance of this or that treatment recommendation and the evidence it depends on. Why should I trust this damn machine?
The trouble with the state of the art in big data, and with the search engines we all now use day in, day out, is that they do not report why a particular set of items appeared on the first page of results, or the reasons that a particular item is present and another is not. Is the selection I see a dispassionate summary of the benefits of a treatment for a patient like me? Am I really like most other patients with my condition, or are my circumstances atypical? Or is it conceivable that the treatment providers have massaged the raw data to elevate their Google ranking? For the moment we need an experienced clinician to help with such issues. At this stage in its development "big data" is synonymous with "black box", because a big statistical calculation cannot provide much illumination of such questions.