

The Wisdom of David Kerridge—Part 2

Statistics in the real world aren’t quite as tidy as those in a textbook.

Analytic statistical methods stand in strong contrast to what is normally taught in most statistics textbooks, which describe the problem as one of “accepting” or “rejecting” hypotheses. In the real world of quality improvement, we must look for repeatability over many different populations. Walter Shewhart added the new concept of statistical control, which defines repeatability over time—sampling from a process rather than from a population.

For example, the effectiveness of a drug may depend on the age of the patient, or previous treatment, or the stage of the disease. Ideally we want one treatment that works well in all foreseeable circumstances, but we may not be able to get it. Once we recognize that the aim of the study is to predict, we can see which possibilities are most important. We design studies not only to cover a wide range of circumstances, but also to make the “inference gap” as small as possible.

By the inference gap we mean the gap between the circumstances under which the observations were collected and the circumstances in which the treatment will be used. This gap has to be bridged by assumptions, in this case, based on theoretical medical knowledge, about the importance of the differences.

Suppose that we compare two antibiotics in the treatment of an infection. We conclude that one did better in our tests. How does that help us? Well-planned and designed experiments are rarely possible in emergencies, so the gap may be quite large.

Suppose that we want to use an antibiotic in Africa; however, all our testing of the antibiotic was done in one hospital in New York in 2003. It’s quite possible that the best antibiotic in New York is not the best in a refugee camp in Zaire. The strains of bacteria may be different, and the problems of transport and storage certainly are. If the antibiotic is freshly made and stored in efficient refrigerators, it may be excellent. It may not work at all if transported to a camp with poor storage facilities.

And even if the same antibiotic works in both places, how long will it go on working? This will depend on how carefully it is used, and how quickly resistant strains of bacteria build up.

And then there are the sampling issues

Scenario 1: We often use random sampling in analytic studies, but it is not the same as that used in an enumerative study. For example, we may take a group of patients who attend a particular clinic and suffer from the same chronic condition; we then choose at random, or in some complicated way involving random numbers, who is to get which treatment. However, the resulting sample is not necessarily a random sample of the patients who will be treated in the future at that same clinic. Still less are they a random sample of the patients who will be treated in any other clinic.
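The random allocation described above—deciding by random numbers who gets which treatment—can be sketched in a few lines. This is an illustrative sketch only; the patient names and the two-treatment setup are invented for the example:

```python
import random

def randomize(patients, treatments=("A", "B"), seed=None):
    """Randomly assign each patient to one of the treatments.

    Shuffling first and then alternating treatments keeps the
    group sizes as balanced as possible (they differ by at most one).
    """
    rng = random.Random(seed)
    shuffled = list(patients)
    rng.shuffle(shuffled)
    return {p: treatments[i % len(treatments)]
            for i, p in enumerate(shuffled)}

# Hypothetical patients attending one clinic.
assignment = randomize(["p1", "p2", "p3", "p4", "p5"], seed=42)
```

Note what this does and does not give us: the assignment is random *within* the group who happened to attend this clinic, but that group is not a random sample of future patients anywhere.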

In fact, the patients who will be treated in the future will depend on choices that we and others have not yet made. Those choices will depend on the results of the study we are currently doing and on studies by other people that may be carried out in the future.

Scenario 2: Suppose that we want to know which of two antibiotics is better in treating typhoid. We cannot take a random sample of all the people who will be treated in the future; there is no readily available “bead box” of people waiting to be sampled, because we don’t know who will get typhoid in the future. We have no choice but to use the mathematics of random sampling, but this is a different kind of problem—sampling from an imaginary population. The famous statistician R. A. Fisher used the words: “a hypothetical infinite population.”

The practical difference, as Fisher saw it, is that we must not rely on what happens in any one experiment; we must repeat the experiment under as many different circumstances as we can. If the results under different circumstances are consistent, believe them. If they disagree, think again.
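Fisher’s advice—believe consistent results, think again about inconsistent ones—can be reduced to a deliberately crude rule of thumb. The function, the effect numbers, and the tolerance below are all invented for illustration; real agreement judgments would rest on subject-matter knowledge, not a single spread statistic:

```python
def consistent(effects, tolerance):
    """Crude version of Fisher's rule: believe a result only if
    the effect estimates from different circumstances agree to
    within `tolerance`; otherwise, think again.
    """
    return max(effects) - min(effects) <= tolerance

consistent([1.8, 2.1, 1.9], tolerance=0.5)   # agree -> believe
consistent([1.8, 2.1, -0.4], tolerance=0.5)  # disagree -> think again
```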

So with an analytic study, there are two distinct sources of uncertainty:

  • Uncertainty due to sampling, just as in an enumerative study. This can be expressed numerically by standard statistical theory.
  • Uncertainty due to the fact that we are predicting what will happen at some time in the future and to some group that is different from our original sample. This uncertainty is unknown and unknowable. We rarely know how the results we produce will be used, and so all we can do is to warn the potential user of the range of uncertainties which will affect different actions.
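The first bullet—the uncertainty that *can* be expressed numerically—amounts to ordinary standard-error arithmetic. A minimal sketch, with invented measurements:

```python
import math

def mean_and_se(sample):
    """Sample mean and its standard error: the numerically
    expressible uncertainty due to sampling."""
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased sample variance (divide by n - 1).
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return mean, math.sqrt(var / n)

mean, se = mean_and_se([4.1, 3.8, 4.4, 4.0, 3.9, 4.2])
# mean +/- 2*se gives a rough 95% interval for sampling error --
# but there is no formula at all for the second, "unknown and
# unknowable" uncertainty of extrapolating to a different future.
```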

The latter uncertainty, especially in management circumstances, will usually be an order of magnitude greater than the uncertainty due to sampling.

People want tidy solutions and feel uncomfortable with the “unknown and unknowable.” Of course, we would rather be certain if we can, but it is very dangerous to pretend to be more certain than we are. The result, in most statistics courses, has been a theory in which the unmeasured uncertainty has just been ignored.

Aim and method—five examples

In looking at a potential improvement opportunity, “What is your aim?” is always the first question. Here are examples of different aims, calling for different methods.

Aim 1: Describe accurately the state of things at one point in time and place.

Method 1: Define precisely the population to be studied, and use very exact random sampling.

Aim 2: Discover problems and possibilities, to form a new theory.

Method 2: Look for interesting groups, where new ideas will be obvious (using a common cause strategy to expose hidden, aggregated special causes). These may be focus groups, rather than random samples. The accuracy and rigor required in the first case is wasted. This assumes that the possibilities discovered will be tested by other means before making any prediction.

Aim 3: Predict the future, to test a general theory.

Method 3: Study extreme and atypical samples (special causes) with great rigor and accuracy.

Aim 4: Predict the future, to help management.

Method 4: Get samples as close as possible to the foreseeable range of circumstances in which the prediction will be used in practice.

Aim 5: Change the future, to make it more predictable.

Method 5: Use SPC to remove special causes, and experiment using the plan-do-study-act (PDSA) cycle to reduce common cause variation.
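The SPC half of Method 5 starts with a control chart. As one common choice, an individuals (XmR) chart flags points outside limits built from the average moving range; the data below are invented, and a real application would follow with investigation of each signal, not just the arithmetic:

```python
def xmr_limits(values):
    """Limits for an individuals (XmR) control chart.

    Points outside [lower, upper] are signals of special causes --
    the first target of Method 5, before PDSA work on common causes.
    """
    n = len(values)
    mean = sum(values) / n
    # Average moving range of successive points.
    mr_bar = sum(abs(values[i] - values[i - 1])
                 for i in range(1, n)) / (n - 1)
    # 2.66 is the standard XmR constant (3 / d2, with d2 = 1.128).
    return mean - 2.66 * mr_bar, mean, mean + 2.66 * mr_bar

lower, center, upper = xmr_limits([12, 14, 11, 13, 12, 15, 13, 12])
```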

The first case is enumerative; all the rest are analytic. How many statistics textbooks make these obviously necessary distinctions?

As a dear statistician friend of mine said as we talked about the futility of teaching degrees of freedom (DOF): “I wish people were asking better questions about the problem they’re trying to understand or solve, the quality of the data they’re collecting and crunching, and what on earth they’re actually going to do with the results and their conclusions. In a well-meaning attempt not to turn away any statistical questions, my own painful attempts to explain DOF have only served to distract the people who are asking from what they really should be thinking about.… People think it’s important, but in the big scheme of things, there are far more important issues in data collection and interpretation.… I’d rather people understood that the quality of their data is far more important than the quantity of it.”

Can we please stop the legalized torture and waste that is passing for alleged statistical training?

Copyright © 2018 Harmony Consulting, LLC. All rights reserved.