George Rebane
We’ll start by skipping the famous Yogism about predicting. But living life successfully requires us to constantly predict the future – events that may or may not happen in a time yet to come. We have to do that because we must prepare for various futures by deciding what actions to take right now, or at least before the predicted event occurs. Most of our predictions are about whether a certain thing will happen or not, whether it will happen before a given time or not, or which one of, say, five possible things will come true. In short, very discrete things, events that can be characterized by a ‘yes’ or ‘no’. Then there are predictions that aren’t discrete and instead cover a range of values for the anticipated outcome: how many marbles are in a big jar, when Billy will get home from work, how much rain the coming storm will bring, what our house will sell for, a candidate’s poll number after the debate, the closing value of the Dow Industrials at week’s end, this year’s GDP, and so forth.
A little thought shows that all the current answers to such questions or unknown values are really what the techies call random variables. But for each such random variable or r.v., all of us have a hunch or belief about how well we can characterize our ignorance of the actual outcome or value that will be realized in the future.
For discrete (random) events we often characterize the chance that something will come to pass as, say, ‘one out of three’, or we state our odds, such as ‘three to one it’ll rain tomorrow’. If we really don’t have a clue as to the outcome, we’ll use the old 50-50 coin flip analogy. For continuous r.v.s most people will state the smallest range of values that for them encompasses all or almost all of the possibilities. They’ll use phrases like ‘Tomorrow’s high temperature will be somewhere in the 45 to 55F range’ or ‘The fourth quarter GDP growth will be in the 1.5 to 2.0% range.’
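For readers who want to see the arithmetic behind such statements, here is a minimal Python sketch (my own illustration, not part of the article) showing how ‘one out of three’ fractions and ‘three to one’ odds map to probabilities; the function names are just placeholders.

```python
# A minimal sketch: turning everyday statements of chance into probabilities.

def prob_from_fraction(successes, total):
    """'one out of three' -> 1/3."""
    return successes / total

def prob_from_odds(odds_for, odds_against):
    """'three to one it'll rain tomorrow' (odds in favor) -> 3/4."""
    return odds_for / (odds_for + odds_against)

print(prob_from_fraction(1, 3))   # 0.333...
print(prob_from_odds(3, 1))       # 0.75
print(prob_from_fraction(1, 2))   # the 50-50 coin flip -> 0.5
```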
However, when pressed, or after thinking about it some more, almost everyone is able to pick a value somewhere in their stated range that they believe will be at or near the most likely value that comes to pass. And they can also tell you how confident they are about choosing that most likely or best-guess value by using words like ‘pretty sure’, ‘50-50 it’ll be somewhere near there’, ‘it’s just a wild hunch’, ‘I’d be surprised if it’s not close to that’, and so on.
Well, it turns out that you can usefully summarize your belief in such a future value by just stating your intuition as four numbers – the low and high of the range (L and H), the best guess or most likely value within that range (M), and your confidence (C) in the best guess. C values range between zero and one – low values for little confidence and higher values for more confidence. These four values, or the 4-tuple [L, H, M, C], expressing your belief define a very useful and powerful measure of probability that can be combined with other such measures to judge, say, whose prediction is best or what level of risk is represented by such a ‘state of knowledge’.
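As a concrete way to hold such a belief, here is a small Python sketch (my own, not from the cited paper) that records the 4-tuple and checks that it is coherent; the class and field names simply mirror L, H, M, and C above.

```python
from dataclasses import dataclass

@dataclass
class Belief:
    """A subjective prediction captured as the 4-tuple [L, H, M, C]."""
    L: float  # low end of the range
    H: float  # high end of the range
    M: float  # best guess (most likely value), with L <= M <= H
    C: float  # confidence in the best guess, 0 <= C <= 1

    def __post_init__(self):
        if not (self.L <= self.M <= self.H):
            raise ValueError("best guess M must lie between L and H")
        if not (0.0 <= self.C <= 1.0):
            raise ValueError("confidence C must be between 0 and 1")

# e.g. 'tomorrow's high will be 45 to 55F, most likely 52, and I'm fairly sure'
tomorrow_high = Belief(L=45, H=55, M=52, C=0.7)
```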
Similar results could be calculated for more serious predictions of, say, the Fed’s discount rate after the FOMC meeting, a value that will impact your securities portfolio and may call for some immediate rebalancing of your investments.
So how did we come up with those ‘likelihood’ values from your and your friend’s predictions? Well, you can just accept that it involves some technical mumbo-jumbo and quit reading right here. Or you can say, ‘I like the approach because it’s intuitive and simple enough to use; give me the spreadsheet and I’ll take it from there’ (email me and I’ll send it to you). Or you can hang on a bit and learn a little about probability and risk.
Eine Kleine Technische Musik. Any coherent collection of the four numbers or 4-tuple defines what is known as a probability density function (PDF); the most popular one, which everyone has heard of, is the Gaussian or normal, or simply the ‘bell curve’. A PDF may assume literally any reasonable shape and is defined over the relevant range of the yet unknown random variable (r.v.). Our 4-tuple defines a ‘univariate’ (single variable) PDF that has the shape of a house. It is called a Mode Augmented Boxcar (MAB) distribution and is shown below in Figure 1. (The mathematical derivation and formulas can be downloaded here - Download TR1509-1_MABv151231)
The confidence value C relates to XM, the most likely or best guess value. The closer C is to unity, the more peaked the MAB is at XM; the closer C is to zero, the flatter the ‘roof’ of the house-shaped PDF. At C = 1 the MAB assumes the shape of a triangle PDF, and at C = 0 it becomes the well-known uniform or ‘boxcar’ (rectangle) distribution, which essentially ignores the best guess value (see Figure 3).
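To make the shape concrete, here is a hedged Python sketch of a house-shaped density consistent with the description above: a C-weighted mixture of a flat boxcar on [XL, XH] and a triangle peaked at the best guess XM. This is my own simple construction for illustration only; the exact MAB formulas are in the downloadable paper (TR1509-1_MABv151231).

```python
def mab_pdf(x, L, H, M, C):
    """Height of a house-shaped PDF at x: a C-weighted mixture of a boxcar
    on [L, H] and a triangle peaked at M. A sketch only (assumes L < M < H);
    see the cited paper for the exact MAB formulas."""
    if x < L or x > H:
        return 0.0                                  # no mass outside the range
    box = 1.0 / (H - L)                             # flat boxcar component
    if x <= M:
        tri = 2.0 * (x - L) / ((H - L) * (M - L))   # rising side of the triangle
    else:
        tri = 2.0 * (H - x) / ((H - L) * (H - M))   # falling side of the triangle
    return (1.0 - C) * box + C * tri
```

At C = 0 the triangle term drops out and the height is the constant 1/(H - L); at C = 1 only the triangle remains, peaking at M.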
Probability, which takes on values from zero (impossible) to one (certainty), is computed as the area under the PDF between two values of the random variable. A moment’s thought then tells you that in the above figure the area under the blue PDF between XL and XH must be unity, since the PDF says it is certain that the random variable will take on some value from XL to XH. Other intervals, which can but don’t have to lie between XL and XH, will have areas less than one, indicating the probability that the random variable will fall within the specified range.
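Continuing the sketch (again my own illustration, not the paper’s formulas), the area between two values can be estimated numerically, and the whole range does indeed carry probability one.

```python
def mab_prob(a, b, L, H, M, C, n=10_000):
    """P(a <= X <= b): area under the sketch mab_pdf between a and b,
    estimated with a simple trapezoidal rule."""
    a, b = max(a, L), min(b, H)        # no probability lies outside [L, H]
    if b <= a:
        return 0.0
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    ys = [mab_pdf(x, L, H, M, C) for x in xs]
    h = (b - a) / n
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

# The whole range must carry probability one:
print(round(mab_prob(45, 55, L=45, H=55, M=52, C=0.7), 3))   # ~1.0
# A narrower interval carries less:
print(round(mab_prob(50, 54, L=45, H=55, M=52, C=0.7), 3))
```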
The height of the MAB (the blue PDF in Figure 1) is called the likelihood and is measured on the p(x) axis. In Figure 2 we have drawn the two MABs defined by your (blue) and your friend’s (green) predictions captured by the 4-tuples given above. You can now see why your prediction was the better of the two: its likelihood at the outcome value of 52.5% was 0.35, while your friend’s was 0.15. That was because his PDF was more spread out and flatter. Remember, both of your PDFs have equal areas under their curves, each being unity. So if someone is less certain about the outcome and expresses it with a greater range between their low and high values, and/or is less confident about his best guess, he must then pay for that with a flatter, more spread-out MAB that gives lower likelihoods (the math is in the above-cited paper).
Figure 2
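Since the two prediction 4-tuples behind Figure 2 are not reproduced here, the comparison can be illustrated with hypothetical numbers of my own choosing, evaluated with the sketch density above; the point is simply that the tighter, more confident prediction earns the higher likelihood at the value that actually occurred.

```python
# Hypothetical 4-tuples (not the article's), in the spirit of Figure 2:
you    = dict(L=50, H=55, M=52, C=0.8)   # tighter range, more confident
friend = dict(L=48, H=58, M=53, C=0.3)   # wider range, less confident

outcome = 52.5                           # the value that actually came to pass
for name, b in [("you", you), ("friend", friend)]:
    print(name, round(mab_pdf(outcome, **b), 3))
# The tighter, more confident prediction scores the higher likelihood here.
```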
Similar arguments can be made about the location of the best guess and the confidence in that most likely value. If, say, both of your MABs had identical ranges and most likely values, then the one with the higher confidence would win if the random variable were realized near the best guess, but the higher confidence would lose if the actual value came in far from the best guess, close to either edge of the MAB. This is illustrated in Figure 3, which shows the MAB for the 4-tuple [5, 15, 12, C] where C is varied from zero to one, illustrating how the MAB changes shape from a boxcar to a triangle PDF.
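Using the same sketch density, you can watch the roof change as C runs from zero to one for the Figure 3 4-tuple [5, 15, 12, C]; the specific heights below come from my illustrative mixture, not necessarily the paper’s exact MAB.

```python
# How the height at the best guess changes as C goes from 0 to 1,
# for the 4-tuple [5, 15, 12, C] of Figure 3 (sketch PDF from above):
for C in (0.0, 0.25, 0.5, 0.75, 1.0):
    peak = mab_pdf(12, L=5, H=15, M=12, C=C)      # height at the best guess
    edge = mab_pdf(5.01, L=5, H=15, M=12, C=C)    # height just inside the low edge
    print(f"C={C:.2f}  peak={peak:.3f}  near-edge={edge:.3f}")
# At C = 0 the shape is the flat boxcar (height 0.1 everywhere);
# at C = 1 it is the triangle, tallest at the best guess and vanishing at the edges.
```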
And that’s how you capture, compute, and compare subjective beliefs about the future in a way that is intuitive and accessible to the large population of non-technical prognosticators. (To extract and work with more sophisticated MAB parameters like expected values, variances, and cumulative distributions, please download the MAB paper cited above.)