Bayesian statistics

Bayesian statistics is a branch of statistics that interprets probability as a measure of belief or confidence in an event occurring. This approach is named after Thomas Bayes, who formulated Bayes’ theorem. Here’s a simple breakdown:

  1. Prior Probability: This represents your initial belief about the probability of an event before seeing any new data.
  2. Likelihood: This is the probability of observing the new data given your initial belief.
  3. Posterior Probability: This is the updated probability of the event after taking the new data into account.

Bayes’ theorem, written as P(H | E) = P(E | H) * P(H) / P(E) where H is the hypothesis and E is the evidence, mathematically combines these elements to update your beliefs based on new evidence. This method is particularly useful in fields where you continuously gather new data and need to update your predictions or beliefs accordingly.
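As a concrete illustration, here is a minimal Python sketch of a single Bayesian update. The function and variable names are purely illustrative and not taken from any particular library:

```python
def bayes_update(prior, likelihood, likelihood_if_not):
    """Return the posterior P(H | E) given:
    prior             -- P(H), belief before seeing the evidence
    likelihood        -- P(E | H), probability of the evidence if H is true
    likelihood_if_not -- P(E | not H), probability of the evidence if H is false
    """
    # Total probability of the evidence:
    # P(E) = P(E | H) * P(H) + P(E | not H) * P(not H)
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    # Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)
    return likelihood * prior / evidence
```

The medical-test example below amounts to exactly one call to a function like this.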


Example

Let’s go through a simple example of Bayesian statistics.

Imagine you have a medical test for a rare disease. The disease affects 1 in 1,000 people (0.1% prevalence). The test is 99% accurate, meaning it correctly identifies 99% of those with the disease (true positive rate) and correctly identifies 99% of those without the disease (true negative rate).

Now, suppose you take the test and it comes back positive. What is the probability that you actually have the disease?

Here’s how we can use Bayes’ theorem to find out:

  1. Prior Probability (P(Disease)): The initial probability of having the disease is 0.1% or 0.001.
  2. Likelihood (P(Test Positive | Disease)): The probability of testing positive if you have the disease is 99% or 0.99.
  3. Probability of Test Positive (P(Test Positive)): This includes both true positives and false positives. It can be calculated as:
    • P(Test Positive) = P(Test Positive | Disease) * P(Disease) + P(Test Positive | No Disease) * P(No Disease)
    • P(Test Positive) = (0.99 * 0.001) + (0.01 * 0.999) = 0.00099 + 0.00999 = 0.01098
  4. Posterior Probability (P(Disease | Test Positive)): The updated probability of having the disease given a positive test result is:
    • P(Disease | Test Positive) = (P(Test Positive | Disease) * P(Disease)) / P(Test Positive)
    • P(Disease | Test Positive) = (0.99 * 0.001) / 0.01098 = 0.00099 / 0.01098 ≈ 0.090

So, even with a positive test result, the probability that you actually have the disease is about 9%.
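To double-check the arithmetic, here is a short, self-contained Python sketch of the same calculation (the variable names are illustrative):

```python
# Medical-test example: 0.1% prevalence, 99% sensitivity, 99% specificity.
prior = 0.001               # P(Disease)
sensitivity = 0.99          # P(Test Positive | Disease)
false_positive_rate = 0.01  # P(Test Positive | No Disease) = 1 - specificity

# Total probability of a positive test (true positives + false positives).
p_positive = sensitivity * prior + false_positive_rate * (1 - prior)

# Bayes' theorem: P(Disease | Test Positive).
posterior = sensitivity * prior / p_positive

print(f"P(Test Positive) = {p_positive:.5f}")           # ≈ 0.01098
print(f"P(Disease | Test Positive) = {posterior:.3f}")  # ≈ 0.090
```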

This example shows how Bayesian statistics can help update our beliefs based on new evidence: even a highly accurate test for a rare condition yields mostly false positives, because the prior probability of having the disease is so low.
