Sources of inaccuracies in estimations
This article is part of the Challenges with estimations and possible solutions series.
Estimates always have some amount of inaccuracy involved. In the end, an estimate tends to be an extrapolation based on available information. Being able to do such an extrapolation well has its limits.
I believe that the magnitude of the inaccuracy also increases with the extent of extrapolation needed. Over a longer time frame, more risks can materialise that make the original estimate no longer applicable.
Below I'll list various sources of inaccuracy:
Personal bias
We humans, sadly, have all kinds of cognitive biases that are unhelpful when it comes to estimation. Some of us are very optimistic and tend to assume the best of people and circumstances, and therefore tend to underestimate. Others are the polar opposite: very pessimistic, seeing problems and risks that need mitigation everywhere they look, and therefore tend to overestimate.
Priming / Anchoring
Anchoring is a very specific bias to be aware of during estimation work. The Anchoring Bias causes us to over-value the first piece of information we receive. Conscious effort should be taken to combat this, both when asking for an estimate (avoid leading questions like "Could you verify if this is indeed something we can finish easily within 1 day?") and when answering (have everyone do a 'blind vote' on the estimate, and then compare).
Data problems
You might want to mitigate the impact of personal biases by relying on data instead. Smart, but there are challenges here too!
Incomplete data
Imagine being asked to build a Spriglygoog (you've never heard of this because I just made this word up): you'd have no way of translating your past experiences or already collected data to this proposal. Or perhaps you actually have worked with Sprigly-things before but never collected any data about them. Or perhaps you are fresh out of education. Without any basis to build on, any kind of extrapolation is going to be hard, if not impossible, to do well.
Inaccurate data
Maybe you have data about building this thing you are being asked to build, but there is a lot of variance in your measurements. Or the measuring device itself is inaccurate (analogous to measuring the length of a boat using a chain of bananas), or the measurement has some level of ambiguity involved ("Well, I cannot remember exactly how long it took. Probably a couple of days?").
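To get a feel for how much variance a data set carries, you can compare the spread of the measurements with their average. Below is a minimal sketch; the durations are made-up numbers purely for illustration, not real measurements.

```python
from statistics import mean, stdev

# Hypothetical durations (in days) recorded for five similar past tasks.
durations_days = [3.0, 5.0, 4.0, 8.0, 5.0]

average = mean(durations_days)
spread = stdev(durations_days)  # sample standard deviation

# Spread relative to the average (coefficient of variation).
# The larger this ratio, the less the average alone tells you.
relative_spread = spread / average

print(f"average: {average:.1f} days")
print(f"spread: {spread:.2f} days ({relative_spread:.0%} of the average)")
```

If the spread is a large fraction of the average, quoting only the average as "the" estimate hides most of the uncertainty.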
Incorrect data
Data collected could be misclassified, unintentionally causing incomplete or irrelevant data to be taken into account. Or there could be a mix-up in the unit of measure in which someone quotes a length (inches versus centimetres).
The unpredictability of the future
The future is often hard to predict. One thing we do know is that things tend to change over time. What exactly is going to change, and by how much, is the challenge. "Past performance is no guarantee of future results."
Cost of material/labour changes
From a macroeconomic perspective, there is a cycle of expansions and contractions. Transitions in these cycles can be overlooked or misjudged, and their impact can at times be hard to predict. Competition between supply and demand also causes prices to fluctuate.
Availability of materials/labour
Sometimes resources become scarce that have historically been taken for granted as always available. Labour strikes, wars, pandemics and (natural) disasters can cause access to once-abundant resources to be lost or greatly diminished.
Currency exchange rates
Due to globalisation, certain goods are likely to be sourced from abroad. This other country likely uses a different currency than yours. Geopolitics, desynchronised economic cycles and FOREX speculation can all impact the exchange rate, for better or worse.
Locality differences
A project done in one country or province can have a completely different outcome in another location. Cultural influences, social norms and ways of working can play a role, as can geography (the Netherlands hardly has any mountains, while Austria has a lot of them).
Misused methodologies
Extrapolation can be done in various ways. But what is the correct way?
Misused extrapolation function
Humans often presume that things scale linearly because that makes them easier to reason about. But when factors scale sub-linearly (logarithmically, for example) or super-linearly (exponentially, for example), the margin of error drastically increases over longer periods of extrapolation.
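To illustrate, here is a small sketch of a linear extrapolation applied to something that actually grows exponentially. The doubling-every-month growth and the function names are assumptions I made up for this example.

```python
def actual_effort(month: int) -> float:
    """Hypothetical 'true' effort in days: assumed to double every month."""
    return 2.0 ** month

# Pretend we only observed the first three months.
observed = {month: actual_effort(month) for month in (1, 2, 3)}

# Naive linear extrapolation: continue the straight line through
# the first and the last observation.
slope = (observed[3] - observed[1]) / (3 - 1)

def linear_estimate(month: int) -> float:
    return observed[3] + slope * (month - 3)

def shortfall(month: int) -> float:
    """How far the linear estimate falls below the actual effort."""
    return actual_effort(month) - linear_estimate(month)

for month in (4, 6, 12):
    print(f"month {month:2d}: estimated {linear_estimate(month):7.1f}, "
          f"actual {actual_effort(month):7.1f}, "
          f"shortfall {shortfall(month):7.1f}")
```

One month past the observations the linear estimate is off by 5 days, three months past by 47 days, and by month 12 it is off by thousands. The further you extrapolate with the wrong function, the worse it gets.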
Misassessment of correlation/causation
Sometimes we assume a random variable to be uncorrelated over time, or different factors to be uncorrelated with each other, because that makes them easier to reason about. However, those variables might still be correlated and influence each other in a significant way that the simplification fails to capture.
And sometimes an action is thought to be causal while in reality it may not be. The action is then less effective than originally expected.
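The effect of wrongly assuming independence can be made concrete with the standard formula for the variance of a sum of two random variables: Var(A + B) = Var(A) + Var(B) + 2 * correlation * std(A) * std(B). The sketch below uses made-up standard deviations of 3 and 4 days; the function name is mine.

```python
import math

def combined_std(std_a: float, std_b: float, correlation: float) -> float:
    """Standard deviation of A + B, given the correlation between A and B."""
    variance = std_a ** 2 + std_b ** 2 + 2 * correlation * std_a * std_b
    return math.sqrt(variance)

# Two hypothetical task estimates with a spread of 3 and 4 days.
assumed_independent = combined_std(3.0, 4.0, correlation=0.0)
fully_correlated = combined_std(3.0, 4.0, correlation=1.0)

print(f"assumed independent: {assumed_independent:.1f} days")
print(f"fully correlated:    {fully_correlated:.1f} days")
```

If both tasks tend to slip for the same reason (the same person falling ill, say), the real spread of the total is closer to the second number than to the first.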
Propagating inaccuracies
We've now listed various causes of inaccuracies in estimations. But often an estimation is a combination of estimations, where the act of combining also comes with its challenges.
Compounding error margin misjudgment
When independent, normally distributed random variables are added together, their variances add up, so the variance and standard deviation of the total grow with every variable added. As the spread grows, landing exactly on the average becomes less likely and the amount by which the estimate is off increases as well. However, due to the Gambler's Fallacy, most people's intuition tells them that combining random events makes the average outcome more likely (often referred to as 'errors cancelling each other out').
In other words, inaccuracies accumulate as more and more inaccurate estimates are combined into one estimation.
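As a rough sketch of this accumulation, assume every sub-estimate is independent and comes with a standard deviation of 2 days (both are assumptions for the sake of the example). Because variances add, the standard deviation of the total is the square root of the summed variances.

```python
import math

def total_std(task_stds: list[float]) -> float:
    """Standard deviation of a sum of independent estimates (variances add)."""
    return math.sqrt(sum(std ** 2 for std in task_stds))

# Hypothetical plans made of 1, 4 and 16 tasks, each with a spread of 2 days.
for task_count in (1, 4, 16):
    spread = total_std([2.0] * task_count)
    print(f"{task_count:2d} tasks: total spread of {spread:.1f} days")
```

The spread of the total keeps growing as more estimates are combined. The individual errors never fully cancel each other out, and if the tasks are correlated the total spread grows even faster.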