By Kanishk Nishar
Bayes’ Theorem deals with conditional probability. Before we dive into the theorem, let us talk about conditional probability.
The formula for conditional probability is:
P (A) = Probability that event A will occur,
P (B) = Probability that event B will occur, and
P (A∩ B) = Probability that both event A and B will occur.
How is this formula derived?
Well, you could think of it as:
That is, the probability of event A happening given that event B has happened is equal to the probability that event B has happened given that event A has occurred and the probability that both event A and B will occur. Rephrasing this very same equation gives the formula given above.
The theorem’s formula is:
P(A) = Probability that the event A will occur
P(B) = Probability that event B will occur
P(A|B) = Probability that event A will occur given that event B has occurred
P(B|A) = Probability that event B will occur given that event A has occurred
For example: In a town of 10,000 people, 1% of the population suffers from malaria. Tests for the disease are accurate 96% of the time when the patient is infected. Similarly, tests are accurate 99% of the time when the patient isn’t infected. Tests show subjects to be positive 1.2% of the time. What is the probability that a patient who tested positive for the test has malaria?
What this data is telling us is that, the tests have false positives 1% of the time.
To solve this question, let us say that:
A = People are infected with malaria
B = People who test positive for malaria
Substituting the values in the equation:
Therefore, despite testing positive for the test, the likelihood that you have malaria is only 0.8%.
Bayes’ Theorem is important because it shows us how counterintuitive probability can be.
Incidentally, despite the importance of this theorem in data science, Bayes did not even publish his findings. The theorem was found among his papers after his death.
Application of Bayes’ Theorem:-
Bayes’ Theorem is utilized to predict weather based on previous data about various contributing factors such as direct solar irradiance, sea water velocity, etc.
Insurance premiums skyrockets in areas affected by floods because its predicted that these areas are more likely to be hits by natural calamities.
Symptoms can be indicative of multiple diseases. Knowing your past medical history allows doctors to narrow down possible ailments.
Companies can make optimal decisions based on past data about interest rates. They can also decide whom to lend money to based on the entity’s past financial history. Businesses can also utilize knowledge about past contingent expenses to be able to better manage their resources.
Spam filters use keywords to predict how likely an email is going to be spam. As it analyzes more emails, it gets better at correctly detecting spam.