Great Bayesian exercise! So let’s say criminals are .5% of society. And the accuracy is 90%. What’s the probability that a person is a criminal if this algo tags them? https://t.co/XtX3OSwUsf (JD Long, @CMastication, March 6, 2019)
The tweet is related to a paper discussed in this article by Calling Bull. Imagine that such an algorithm were applied to the real world. What is the probability that a person is a criminal if the algorithm says so?
JD provided two data points:
• We assume that criminals are 0.5% of the population
• The accuracy of the algorithm is 90%
In the thread there are responses that answer the question using conditional probability formulas. Here's a little secret: I loathe formulas. Particularly when I can reason with an image instead; I may one day write a book about that, although Math with Bad Drawings is already out there, and you should get a copy. I prefer quick heuristics and visuals.
For my quick back-of-the-napkin-and-not-that-precise exercise on conditional probability I needed two other figures:
• The population: let's assume that we're in the U.S., so it's 325 million (you may not need the population if you use formulas, but it's useful for the diagram).
• The false positive rate: how often the algorithm tags a person as a criminal even if that person is not a criminal. After reading this I guessed a false positive rate of around 6%.
Here's the resulting tree diagram; the probability of your being a criminal if the algorithm tags you as such is roughly only 7%:
Of the roughly 20.9 million people the algorithm would tag, 93% (the 19,402,500) aren't criminals. The chance of false positives is enormous: if a photo is tagged as depicting a criminal, more than 9 out of 10 times that person won't be a criminal at all.
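For readers who would rather check the branches of the tree in code, here is a minimal Python sketch of the same counting. It assumes that "accuracy" means the share of actual criminals the algorithm tags (the true positive rate), and it uses my guessed 6% false positive rate; the function and variable names are made up for illustration.

```python
# Inputs from the post. The 6% false positive rate is a guess, not a measured value.
POPULATION = 325_000_000
BASE_RATE = 0.005            # criminals as a share of the population
TRUE_POSITIVE_RATE = 0.90    # assumption: "accuracy" = share of criminals that get tagged
FALSE_POSITIVE_RATE = 0.06   # share of non-criminals that get tagged anyway


def tagged_counts(population, base_rate, tpr, fpr):
    """Walk the two branches of the tree and return (true positives, false positives)."""
    criminals = population * base_rate
    non_criminals = population - criminals
    true_positives = criminals * tpr        # criminals the algorithm tags
    false_positives = non_criminals * fpr   # non-criminals the algorithm tags
    return true_positives, false_positives


tp, fp = tagged_counts(POPULATION, BASE_RATE, TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE)
share_criminal = tp / (tp + fp)  # probability of being a criminal, given a tag

print(f"Tagged criminals:     {tp:,.0f}")
print(f"Tagged non-criminals: {fp:,.0f}")
print(f"P(criminal | tagged): {share_criminal:.1%}")
```

Changing `POPULATION` leaves the final percentage unchanged, because it scales both branches equally; only the base rate and the two error rates matter.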
I double-checked the calculation using round numbers, beginning with a sample of 10,000 people; quants in the room, please let me know if I missed something:
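The round-number version can also be sketched in a few lines of Python. This is only an illustration under the same assumptions as before (90% read as the share of criminals tagged, and a guessed 6% false positive rate); `Fraction` is used so the head counts stay exact.

```python
from fractions import Fraction

sample = 10_000
base_rate = Fraction(5, 1000)   # 0.5% of people are criminals
tpr = Fraction(90, 100)         # assumption: 90% of criminals get tagged
fpr = Fraction(6, 100)          # guessed false positive rate

criminals = sample * base_rate            # 50 people
non_criminals = sample - criminals        # 9,950 people
true_positives = criminals * tpr          # 45 tagged criminals
false_positives = non_criminals * fpr     # 597 tagged non-criminals
tagged = true_positives + false_positives # 642 people tagged in total

share_criminal = true_positives / tagged  # 45/642, about 7%
print(f"{true_positives} of {tagged} tagged people are criminals "
      f"({float(share_criminal):.1%})")
```

Out of 642 tagged people only 45 are criminals, which matches the roughly 7% from the full-population tree.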