A Bayesian Way to Compare Wound Healing Rates

Wound Healing Rate, or WHR, is a widely used yet often divisive performance metric within the wound care industry. WHR is commonly understood to mean the number of patients with healed wounds divided by the total number of patients with wounds during some defined period of time, expressed as a percentage. For example, 50 patients with healed wounds out of 100 total patients with wounds over a quarter would be a 50% WHR.
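As a small sketch of that definition, the calculation can be written as a short Python function. The function name and the input checks are my own illustrative choices, not an industry standard.

```python
def wound_healing_rate(healed: int, total: int) -> float:
    """Return WHR as a percentage: healed patients out of all patients with wounds."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    if not 0 <= healed <= total:
        raise ValueError("healed must be between 0 and total")
    return 100.0 * healed / total


# The example from the text: 50 healed out of 100 patients with wounds.
example_whr = wound_healing_rate(50, 100)
print(f"WHR: {example_whr:.0f}%")
```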

Putting aside the Fifean commentary about the credibility of reported WHR, I want to focus on WHR for comparison, say when comparing WHR between skilled nursing facilities (SNFs). Let’s consider the following six fictional SNFs and ask: which SNF(s) have the best and worst WHR?

Our imaginary SNFs

As we can see, we have differing denominators (Total Residents with Wounds), which makes a straightforward comparison difficult. The classical tools here would be the Chi-squared test for proportions or Fisher’s exact test. The Chi-squared test outputs a chi-square statistic (a measure of how far the observed counts depart from what we would expect if every SNF had the same underlying WHR), degrees of freedom (when comparing six proportions, the number of SNFs minus one, i.e. five), and a p-value (the probability of seeing a difference at least as large as the one observed if there were in fact no real difference between the SNFs). Fisher’s exact test reports a p-value directly. These tests and outputs, at least to me, are unintuitive and confusing, and don’t really answer the question that I have.
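To make the classical approach concrete, here is a minimal sketch that computes the chi-square statistic and degrees of freedom by hand for a test of equal proportions. The counts are hypothetical placeholders that I made up for illustration (they are not the figures from the table above), and the function name is my own.

```python
# Hypothetical counts for six facilities; NOT the article's actual table.
healed = [1, 12, 30, 45, 60, 8]    # residents with healed wounds
total = [5, 20, 50, 60, 75, 25]    # total residents with wounds


def chi_square_homogeneity(healed, total):
    """Chi-square statistic and degrees of freedom for 'all facilities share one rate'."""
    pooled = sum(healed) / sum(total)          # rate if every facility were identical
    stat = 0.0
    for h, n in zip(healed, total):
        expected_healed = n * pooled
        expected_unhealed = n * (1 - pooled)
        stat += (h - expected_healed) ** 2 / expected_healed
        stat += ((n - h) - expected_unhealed) ** 2 / expected_unhealed
    dof = len(healed) - 1                      # number of facilities minus one
    return stat, dof


stat, dof = chi_square_homogeneity(healed, total)
print(f"chi-square = {stat:.2f}, degrees of freedom = {dof}")
```

The p-value would then be read from a chi-square distribution with that many degrees of freedom (for instance with `scipy.stats.chi2.sf(stat, dof)`, if SciPy is available). Note that even then the test only says whether the facilities differ at all, not which one is best or worst.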

I suggest that we try a more intuitive and elucidating approach: one based on Bayesianism.

Although this may seem a paradox, all exact science is dominated by the idea of approximation


What is Bayesianism, and how is it different from classical statistics a/k/a Frequentism? Dr. Jake VanderPlas articulates the difference succinctly:

For frequentists, probability only has meaning in terms of a limiting case of repeated measurements…probabilities are fundamentally related to frequencies of events.

For Bayesians, the concept of probability is extended to cover degrees of certainty about statements…probabilities are fundamentally related to our own knowledge about an event.

In short, Bayesianism allows one to encode their own beliefs and knowledge when doing statistical inference, and to provide a measure of uncertainty to the results of that inference. This is done by randomly sampling from distributions that represent our beliefs and knowledge, and conditioning this sampling on the data we have collected.
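As a minimal sketch of this idea for a single facility, suppose we start from a flat Beta(1, 1) prior (an assumption on my part, meaning "any healing rate is equally plausible") and observe the 50-healed-out-of-100 example from the introduction. For this simple case the conditioning can be done in closed form, so the code only needs to sample from the resulting posterior. The 90% interval is an arbitrary choice.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

prior_a, prior_b = 1.0, 1.0   # flat Beta(1, 1) prior: an assumption
healed, total = 50, 100       # the 50-of-100 example from the introduction

# Conditioning a Beta prior on binomial data gives another Beta distribution.
post_a = prior_a + healed
post_b = prior_b + (total - healed)

# Draw random samples from the posterior and summarize them.
draws = sorted(random.betavariate(post_a, post_b) for _ in range(20_000))
mean = sum(draws) / len(draws)
lower = draws[int(0.05 * len(draws))]   # 5th percentile
upper = draws[int(0.95 * len(draws))]   # 95th percentile

print(f"posterior mean {mean:.3f}, 90% credible interval ({lower:.3f}, {upper:.3f})")
```

The interval is the "measure of uncertainty" mentioned above: a range of healing rates that remain plausible given both the prior and the data.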

Let’s look at that table of SNFs again:

Our imaginary SNFs

And let’s represent the data in a chart:

What are my beliefs and what knowledge can I bring to bear to this problem?

  • First, I believe that facilities use different internal methods to measure WHR and the numbers should be treated with healthy skepticism.
  • Second, I believe that SNFs are generally faced with similar issues related to wounds, i.e. staffing, supplies, etc., so I can think of these facilities as being, despite differences in size, more similar than dissimilar.
  • Third, I trust the WHR from the SNFs with larger denominators (Total Residents with Wounds) more than I do the WHR from the SNFs with smaller denominators.
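The third belief can be illustrated with a little arithmetic. In the sketch below, two hypothetical facilities report the same 50% WHR, one from 4 residents and one from 100. Assuming a flat Beta(1, 1) prior (my assumption, chosen only for simplicity), the spread of the resulting Beta posterior is much larger for the small facility.

```python
import math


def posterior_sd(healed, total, a=1.0, b=1.0):
    """Standard deviation of the Beta posterior for a healing rate.

    a and b are the prior's parameters; Beta(1, 1) is an assumed flat prior.
    """
    a_post = a + healed
    b_post = b + (total - healed)
    s = a_post + b_post
    return math.sqrt(a_post * b_post / (s * s * (s + 1)))


# Same observed WHR (50%), very different denominators (hypothetical numbers).
small = posterior_sd(2, 4)
large = posterior_sd(50, 100)
print(f"spread with 4 residents:   {small:.3f}")
print(f"spread with 100 residents: {large:.3f}")
```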

I can encode all of these beliefs and this knowledge in what is called a Hierarchical Beta-Binomial model. It sounds more complicated than it really is. “Beta” refers to a distribution over values between 0 and 1, which makes it a natural way to represent a probability, here the probability of a wound healing. This part is called a Prior because we are encoding our belief about what we think the result will look like before seeing the data. “Binomial” refers to the likelihood: each wound either heals or does not heal, a binary process, so the number of healed wounds out of the total for each SNF follows a Binomial distribution. This is the part that is conditioned on the observed counts for each SNF. “Hierarchical” refers to how the model allows information about the SNFs to be shared during the calculations, i.e. that SNFs are similar and that each WHR will tend toward a global mean.
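Here is one way such a model could be sketched using only the Python standard library. Several things in it are assumptions of mine rather than the setup behind the results in this article: the counts are hypothetical placeholders, the shared Beta prior is parameterized by a global mean `mu` and a concentration `kappa`, the priors on those two are flat over the grids shown, and the posterior is computed by a simple grid approximation instead of random sampling.

```python
import math

# Hypothetical counts for six facilities; NOT the article's actual table.
healed = [1, 12, 30, 45, 60, 8]
total = [5, 20, 50, 60, 75, 25]


def log_beta(a, b):
    """Logarithm of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(mu, kappa):
    """Log-likelihood of all counts with each facility's own rate integrated out."""
    a, b = mu * kappa, (1 - mu) * kappa
    return sum(
        log_beta(a + h, b + n - h) - log_beta(a, b)
        for h, n in zip(healed, total)
    )


# Grid over the shared hyperparameters (assumed priors: flat over these grids).
mus = [i / 100 for i in range(1, 100)]             # global mean healing rate
kappas = [math.exp(k / 10) for k in range(0, 61)]  # how tightly facilities cluster

grid = [(m, k, log_marginal(m, k)) for m in mus for k in kappas]
peak = max(lp for _, _, lp in grid)
weights = [math.exp(lp - peak) for _, _, lp in grid]  # unnormalized posterior
norm = sum(weights)


def bayesian_mean(h, n):
    """Posterior mean of one facility's rate, averaged over the hyperparameter grid."""
    return sum(
        w * (m * k + h) / (k + n) for (m, k, _), w in zip(grid, weights)
    ) / norm


for i, (h, n) in enumerate(zip(healed, total), start=1):
    print(f"SNF {i}: observed {h / n:.2f}, Bayesian mean {bayesian_mean(h, n):.2f}")
```

Each facility's Bayesian mean is a blend of its own observed rate and the global mean, and the smaller the denominator, the more weight the global mean gets. That is the "sharing of information" the word Hierarchical refers to.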

We can then use probabilistic programming software to do the heavy lifting of drawing thousands of random samples based on our model, conditioned on our observed WHR for each nursing home.

We can summarize the result, or posterior, as follows:

We can now answer the question of whose WHR performance is better, in an intuitive way. We have credible intervals, which are the ranges of our belief about what the real WHR performance is for each SNF. The wider the range, the more uncertainty we have. We can see that the range for SNF 1 is very wide because we have a small sample size and a WHR that is very low; the width expresses our skepticism about this WHR. We also have Bayesian WHR means, which in most cases are close to the observed WHR. However, we will notice that most Bayesian means move toward the middle, or “global mean”. This reflects our belief that the SNFs are similar as it relates to wound care.

Based on this model, SNF 5 is the best performer. It has the highest Bayesian mean and the lowest uncertainty (edging out SNF 4). The worst performer is a toss-up between SNF 1 and SNF 6, although based on the credible intervals, I would pick SNF 6 as the worst performer. If we wished, we could also rank the SNFs or compare one SNF to another, and even easily calculate the probability that each SNF’s true WHR falls within any part of its credible interval. Because we have a full posterior, we can answer many more questions in the business realm, in a way that is intuitive and easily understood by stakeholders.
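As a rough sketch of those follow-up questions, the snippet below compares two facilities directly from posterior samples. To keep it self-contained it uses hypothetical counts and a simplified, non-hierarchical posterior for each facility (a flat Beta(1, 1) prior, which is my assumption). With samples from the full hierarchical model, the same two lines of counting would apply.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical (healed, total) counts for two facilities; not the article's data.
snf_a = (60, 75)
snf_b = (45, 60)


def posterior_draws(healed, total, n_draws=20_000, a=1.0, b=1.0):
    """Samples of a facility's healing rate under an assumed Beta(a, b) prior."""
    return [
        random.betavariate(a + healed, b + total - healed) for _ in range(n_draws)
    ]


draws_a = posterior_draws(*snf_a)
draws_b = posterior_draws(*snf_b)

# Probability that facility A truly heals at a higher rate than facility B.
prob_a_better = sum(x > y for x, y in zip(draws_a, draws_b)) / len(draws_a)

# Probability that facility A's true WHR is above a 70% target.
prob_a_above_70 = sum(x > 0.70 for x in draws_a) / len(draws_a)

print(f"P(A better than B) = {prob_a_better:.2f}")
print(f"P(A's WHR > 70%)   = {prob_a_above_70:.2f}")
```

Statements like "there is an X% chance that A outperforms B" are the kind of answer stakeholders tend to find easier to act on than a p-value.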