As humans, our behaviors are extremely erratic and complex. That’s why predictive analytics on human behavior is so fascinating: it lets us make sense of something that seems intangible. How does the human component play into the notions of chance and luck? Life is filled with unknowns. While we can observe the past, what we really want to see is the future, and the future remains unknown until it happens.
In the old days, humans would attribute all future events to luck, or to an unknowable deity whom they tried to influence through prayers or sacrifices. More recently, science has developed various techniques to predict the future with certainty (the sun will rise at 6:08 am tomorrow) or with some level of accuracy (e.g., it will rain tomorrow). Yet in human affairs there are too many unknowns, and people attribute predictive success to “luck”.
But isn’t luck just an example of chance, i.e., just an expression of probabilities? The difference between luck and chance is similar to the distinction between risk and uncertainty described by various authors, for example in Nate Silver’s book, “The Signal and the Noise”. Risk represents uncertainty about the future that you can capture with a probabilistic model. A prime example of this is gambling on card games like blackjack or poker, where it is possible to calculate the distribution of probabilities for every possible hand. In poker, there is the additional unknown of the players’ strategies, but it is still possible to come up with estimates of the return from different hands.
Other prediction problems are less well behaved. In the mortgage market, for example, financial engineers came up with great models for the probability distribution of people paying off their mortgages early when interest rates changed.
Unfortunately, these models did not capture behavior when house prices crashed nationwide. The great financial crisis of 2008 brought the realization that the models captured risk, but not the underlying uncertainty about when and where they were applicable and when they broke down completely.
When the models broke down, there were no easy fixes: the Gaussian assumptions underlying most models did not hold for the long-tailed (or “black swan”) events of the crisis. Yet even in games of chance where the probabilities are well known, human behavior adds an element of predictability that can be used to improve the odds. We will discuss how to improve the return in one of the simplest and most popular gambles, the lottery.
Can you be lucky playing the lottery?
Even in simple probabilistic games where everything is due to chance and the probabilities are well known, there is often a human element to prediction that can be used to improve the odds. A good example of this is a lottery. In a typical lottery, the player chooses 6 numbers from 1 to 49 without repetition. The lottery commission also chooses 6 numbers, and if the player’s numbers match all the numbers chosen by the lottery commission, the player wins the jackpot. If the jackpot is not won one week, it is carried over to the following week.
The lottery commission carefully tests its machines to ensure that all the 49 numbers are chosen with equal probability and therefore it is possible to calculate that the probability of any particular combination of 6 numbers is 1 in $\binom{49}{6} = \frac{49!}{6!\,43!}$, i.e., 1 in 13,983,816. Therefore, it seems obvious that you could not do any better with any particular ticket, and any number choice is as good as any other.
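The count of possible tickets quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `jackpot_odds` and its parameters are chosen here for illustration.

```python
import math


def jackpot_odds(pool_size: int = 49, picks: int = 6) -> int:
    """Number of distinct tickets when choosing `picks` numbers out of
    `pool_size` without repetition, ignoring order. The chance of
    matching every number with a single ticket is 1 in this value."""
    return math.comb(pool_size, picks)


odds = jackpot_odds()
print(f"1 in {odds:,}")  # 1 in 13,983,816
```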
But as is true with anything that involves human decision-making, things are more predictable than they appear. Humans clearly do not choose lottery numbers in the same proportions as the lottery machine: numbers like 3 and 7, and numbers that can appear in birthdays, are more popular than numbers like 40 and 48. As a consequence, if you choose a ticket that is very popular, e.g., 1,2,3,4,5,6, you will be sharing the jackpot with more people if you win, and therefore your expected return is lower. On the other hand, if you choose an unpopular ticket, you are more likely to take home the entire jackpot.
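One way to make the sharing effect concrete is the minimal sketch below. It assumes that the number of other players holding your exact ticket follows a Poisson distribution. That model, the function name, and the example values are assumptions made for illustration, not figures published by any lottery. Under that assumption, the fraction of the jackpot you expect to keep if your ticket wins is (1 - e^(-λ)) / λ, where λ is the expected number of other players with the same ticket.

```python
import math


def expected_jackpot_share(expected_co_winners: float) -> float:
    """Expected fraction of the jackpot kept by a winning ticket.

    Assumption: the number K of OTHER players holding the same ticket is
    Poisson-distributed with mean `expected_co_winners`. The jackpot is
    split equally, so the share is E[1 / (1 + K)] = (1 - exp(-lam)) / lam.
    """
    lam = expected_co_winners
    if lam < 0:
        raise ValueError("expected_co_winners must be non-negative")
    if lam == 0:
        return 1.0  # nobody else can hold the ticket
    return -math.expm1(-lam) / lam  # -expm1(-lam) equals 1 - exp(-lam)


# Hypothetical values: an average ticket with one expected co-holder,
# an unpopular ticket at a quarter of that, a popular one at ten times.
for label, lam in [("unpopular", 0.25), ("average", 1.0), ("popular", 10.0)]:
    print(f"{label:>9}: keep {expected_jackpot_share(lam):.1%} of the jackpot")
```

The less popular the ticket, the closer the expected share gets to the whole jackpot, which is the effect the paragraph above describes.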
Typically, only about 25% of the money collected by the lottery goes to the jackpot; the remainder is divided among the government need (e.g., education) that justifies the lottery, the administrative costs, and the smaller (non-jackpot) prizes. Your expected return when betting on the lottery is less than 50%, i.e., on average you will get back less than half of what you put in. But at times the jackpot builds up so that it is more than three times the normal jackpot, at which point your expected return could go above 1 (that is, above 100% of your stake) and it becomes worthwhile to bet on the lottery. Even in this case, it makes sense to bet only on unpopular ticket combinations, since if you bet on the popular combinations, the division of the jackpot would destroy the expected return.
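The break-even arithmetic can be sketched as follows. The 25% jackpot fraction comes from the text above. The 25% returned through smaller prizes is an assumed upper bound, chosen only to be consistent with the "less than 50%" total, and the `jackpot_share` parameter (the fraction of the jackpot you keep after splitting with other winners) is a hypothetical input. The sketch follows the article's simplified accounting rather than any real lottery's prize structure.

```python
def expected_return(
    jackpot_multiple: float,
    jackpot_fraction: float = 0.25,      # from the text: ~25% of sales fund the jackpot
    small_prize_fraction: float = 0.25,  # ASSUMED upper bound, not a published figure
    jackpot_share: float = 1.0,          # hypothetical: fraction kept after sharing
) -> float:
    """Expected payout per unit staked under the article's simplified model.

    `jackpot_multiple` is the current jackpot divided by the normal jackpot,
    so 1.0 is an ordinary week and 3.0 is a tripled, rolled-over jackpot.
    """
    jackpot_part = jackpot_multiple * jackpot_fraction * jackpot_share
    return jackpot_part + small_prize_fraction


print(expected_return(1.0))                     # ordinary week
print(expected_return(3.0))                     # tripled jackpot, kept whole
print(expected_return(4.0, jackpot_share=0.5))  # bigger jackpot, but split in half
```

Under these assumptions a tripled jackpot is only the break-even point, and a large jackpot on a popular ticket can still return less than the stake once the split is taken into account.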
What are the unpopular tickets? Unfortunately, the lottery commissions do not publish the popularity distribution of all possible ticket combinations. In the old days, 13 million numbers would have constituted “big data”, though such a data set is small by current standards (Google has published frequency counts for all n-grams of up to 5 words on the internet, which include more than a billion sequences of 4 or 5 words).
However, the Canadian lottery commission did once publish a list of the popularity of each number from 1 to 49, and using this distribution, it is possible to use a maximum entropy based approach described by Cover and Stern to estimate the popularity of each ticket. It turns out that the combination of the least popular numbers is 20, 30, 31, 39, 40, 48, which is about ¼ as popular as the average ticket, and this would be a good ticket to buy. Unfortunately, if even a few people read this article and choose the same ticket, the advantage disappears, and hence it is probably better to buy a ticket at random from among the less popular numbers. It is not enough to choose a low-probability ticket; it is necessary to choose one that no one else will choose. How to do this is something I plan to cover in a separate post soon.
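As a rough illustration of "a ticket at random from among the less popular numbers", the sketch below draws 6 numbers uniformly from the least popular part of a popularity table. It is not the Cover and Stern maximum entropy method and not the approach promised for the follow-up post. The function name, the `candidate_pool` parameter, and especially the `toy_popularity` table are made up for illustration, and the table is NOT the Canadian data.

```python
import random


def random_unpopular_ticket(popularity, picks=6, candidate_pool=12, rng=None):
    """Pick `picks` distinct numbers at random from the `candidate_pool`
    least popular numbers in `popularity` (a dict: number -> popularity)."""
    if not picks <= candidate_pool <= len(popularity):
        raise ValueError("need picks <= candidate_pool <= len(popularity)")
    rng = rng or random.Random()
    # Sort numbers from least to most popular and keep the bottom slice.
    least_popular = sorted(popularity, key=popularity.get)[:candidate_pool]
    return sorted(rng.sample(least_popular, picks))


# Placeholder table: popularity falls off as the number grows.
# Purely synthetic; substitute real per-number popularity if available.
toy_popularity = {n: 1.0 / n for n in range(1, 50)}

print(random_unpopular_ticket(toy_popularity))
```

A larger `candidate_pool` spreads your choice over more possible tickets, which lowers the chance of colliding with other readers using the same idea, at the cost of including somewhat more popular numbers.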
(Original blog article reproduced with permission from InsightsOne Inc.)