Oil Price Forecasts: Should We Score Accuracy?
Who is the top forecaster in the oil market? The surprising answer is that nobody knows because the accuracy of predictions is never properly tracked and measured after they are made.
Banks, consultancies, government agencies and even journalists routinely issue predictions about what will happen to oil supply, demand and prices in future. Forecasts about the trajectory of prices over the next few months and years drive decisions affecting billions of dollars of investment.
Oil companies rely on them to decide whether to drill more wells, develop new fields and hedge their future production. Consumers rely on them in deciding whether to buy a small fuel-efficient car or a gas-guzzler. Governments need them to produce revenue estimates and budgets.
Expectations about oil prices are among the most important variables in the world economy and central to the debate over greenhouse emissions and global warming. But given how pervasive and influential oil price predictions are, there is surprisingly little data on how accurate the forecasts have been and which forecasters have the best track record.
FORECASTERS AS STORYTELLERS
The problem is not confined to oil. Making decisions on the basis of forecasts of unknown accuracy is prevalent in many areas of business and politics, according to Philip Tetlock and Dan Gardner, who have published a new book titled "Superforecasting: The Art and Science of Prediction."
"Every day, the news media deliver forecasts without reporting, or even asking, how good the forecasters who made the forecasts really are," according to Tetlock and Gardner. "Every day, corporations and governments pay for forecasts that may be prescient or worthless or something in between. And every day, all of us - leaders of nations, corporate executives, investors and voters - make critical decisions on the basis of forecasts whose quality is unknown."
According to Tetlock and Gardner, the problem lies on the demand side rather than the supply side. Governments, businesses, investors and individuals don't demand evidence of accuracy before deciding whether to accept and act on a prediction.
Forecasts are routinely made but the results are almost never tracked. Prominent forecasters build reputations not because of their accuracy but because of their skill at telling a compelling story with conviction.
"You might think that the goal of forecasting is to foresee the future accurately, but that's often not the goal, or at least not the sole goal," Tetlock and Gardner said, also claiming forecasts are also meant to entertain, advance political agendas and impress clients.
Forecasting as entertainment is fine for inconsequential predictions. But the lack of verification should be worrying when billions of dollars of investment decisions rest on the outcome.
"Baseball managers wouldn't dream of getting out the checkbook to hire a player without consulting performance statistics. Even fans expect to see player stats on scoreboards and TV screens. And yet when it comes to decisions that matter far more than any baseball game, we're content to be ignorant."
FIRMING UP FUZZY FORECASTS
Part of the problem is the fuzzy language in which predictions are often expressed, which makes it difficult to tell whether a forecast was right or wrong even after the event. Forecasts are often couched in ambiguous words like probable, possible and risk, for which there are no agreed definitions, making it impossible to score them afterwards.
The U.S. Intelligence Community has struggled with the lack of precision in the meaning of words commonly used to express likelihood and chance since the 1960s ("Sherman Kent and the Profession of Intelligence Analysis" 2002). In other cases, relatively specific forecasts are matched with an unspecific timeframe, which also makes it difficult to score them for accuracy.
There is a maxim among professional analysts that cynically confirms the problem: always predict a price, or a timeframe, but never both.
However, in recent years, many oil market forecasters have been pushed to quantify their forecasts by making specific price predictions over specified time horizons.
Many have also embraced uncertainty by offering forecasts in the form of a probability distribution rather than a point estimate, which is a much more useful and realistic way to think about the future.
Oil forecasters are catching up with weather forecasters and the intelligence community in trying to estimate the likelihood of a whole range of outcomes, not just the central one.
Since 1939, the U.S. Weather Service has restricted the use of the terms "probably" and "possibly" and encouraged forecasters to make percentage predictions instead.
As long ago as 1920, the U.S. Weather Bureau in Roswell, New Mexico, got an enthusiastic reaction from farmers to its percentage forecasts for the chance of rain during the alfalfa harvesting season ("Verification of a Forecaster's Confidence and the Use of Probability Statements in Weather Forecasting" 1944).
FORECAST VERIFICATION
Percentage forecasts are an important step forward, but the oil market is still lagging behind in terms of measuring forecast accuracy after the event.
The problem with percentage forecasts is working out whether they were accurate, even in retrospect. Tetlock and Gardner call this the problem of being on the wrong side of "maybe".
To understand the problem, imagine a weather forecaster who says that tomorrow there is a 70 percent chance of rain. The forecast also implies there is a 30 percent chance it will not rain.
If it doesn't rain, the forecast was not necessarily wrong in a statistical sense. But it is still likely to be criticised by anyone who concentrates only on the most likely outcome rather than the whole range of possibilities.
Meteorologists pioneered the solution to the probability-verification problem, published by Glenn Brier of the U.S. Weather Bureau ("Verification of Forecasts Expressed in Terms of Probability" 1950).
The most accurate forecaster is the one whose forecast probability distributions get closest to the distribution of actual outturns over time. If a forecaster predicts there will be a 70 percent chance of rain they should be proved correct about 70 percent of the time.
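To make that calibration idea concrete, here is a minimal sketch in Python, using invented forecast data: it groups probability-of-rain forecasts by the stated probability and compares each group's claim with the observed frequency of rain.

    # A minimal sketch of a calibration check, with made-up data.
    # Group probability-of-rain forecasts by the stated probability
    # and compare each group's claim with the observed frequency.
    from collections import defaultdict

    # Hypothetical history of (stated chance of rain, did it rain?)
    history = [
        (0.7, True), (0.7, True), (0.7, False), (0.7, True), (0.7, False),
        (0.7, True), (0.7, True), (0.7, False), (0.7, True), (0.7, True),
        (0.3, False), (0.3, True), (0.3, False), (0.3, False), (0.3, False),
    ]

    groups = defaultdict(list)
    for stated, rained in history:
        groups[stated].append(rained)

    # A well-calibrated forecaster's 70 percent calls should verify
    # about 70 percent of the time, 30 percent calls about 30 percent.
    for stated in sorted(groups):
        hits = groups[stated]
        share = sum(hits) / len(hits)
        print(f"said {stated:.0%}: rained {share:.0%} of the time "
              f"({len(hits)} forecasts)")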
Verifying accuracy is obviously much easier for weather forecasts, where thousands of fresh forecasts are issued every day, sometimes even more frequently, and can be compared with thousands of actual outcomes.
Verification is more difficult for subjects like oil prices, but given how frequently prices are forecast it is not impossible and would be highly desirable. Brier published a careful methodology for comparing a set of forecasts expressed as probability distributions with eventual outcomes, and scoring forecasters on a standard scale from zero (complete accuracy) to 2.0 (perfect inaccuracy).
Brier scores, named after the author, have been used to benchmark weather forecasts for decades but in principle they can be used in any field where forecasts are expressed in terms of probability distributions.
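As a rough illustration of how the score works, here is a short Python sketch of the verification score Brier described, with invented data: each forecast assigns probabilities to every possible category, and the score is the mean squared difference between those probabilities and what actually happened.

    # A sketch of the verification score Brier described.
    # Each forecast assigns probabilities to every possible category;
    # the score is the mean squared difference between those
    # probabilities and what happened (1 for the category that
    # occurred, 0 for the rest). It runs from 0.0 (complete accuracy)
    # to 2.0 (perfect inaccuracy). Data here is invented.

    def brier_score(forecasts, outcomes):
        """forecasts: one probability list per forecast, covering all
        categories; outcomes: index of the category that occurred."""
        total = 0.0
        for probs, occurred in zip(forecasts, outcomes):
            for j, p in enumerate(probs):
                actual = 1.0 if j == occurred else 0.0
                total += (p - actual) ** 2
        return total / len(forecasts)

    # Two categories: rain (index 0) and no rain (index 1).
    forecasts = [[0.7, 0.3], [0.7, 0.3], [0.9, 0.1]]
    outcomes = [0, 1, 0]  # rained, stayed dry, rained

    print(round(brier_score(forecasts, outcomes), 3))  # lower is better

A forecaster who always puts 100 percent on the wrong category scores the maximum of 2.0; one who always puts 100 percent on the right category scores zero; hedged forecasts land in between.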
Tetlock has been employing them since 2011 to track the accuracy of a panel of forecasters answering questions about economics, politics and international relations posed by the U.S. intelligence community as part of a project funded by the Intelligence Advanced Research Projects Agency (IARPA).
Brier scoring price forecasts could also bring important benefits to the oil market. The aim would not just be to identify the most accurate forecasters, those most worth paying attention to, but to improve the accuracy of all forecasts by subjecting them to rigorous analysis after the event.
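One hedged sketch of how this might look in practice, with purely illustrative bins and numbers: split the price range into bins, ask each forecaster to spread probability across them, and score each forecast against the bin the price actually lands in.

    # A hedged sketch, not a real methodology: split the oil price
    # range into bins, have each forecaster assign a probability to
    # every bin, and score against the bin the price landed in.
    # Bin edges, forecasters and numbers are purely illustrative.

    BINS = ["under $40", "$40-$60", "$60-$80", "over $80"]

    def brier_score_single(probs, occurred):
        """Brier score for one forecast over the price bins."""
        return sum((p - (1.0 if j == occurred else 0.0)) ** 2
                   for j, p in enumerate(probs))

    # Hypothetical year-ahead forecasts from two forecasters.
    forecaster_a = [0.10, 0.50, 0.30, 0.10]
    forecaster_b = [0.25, 0.25, 0.25, 0.25]  # pleads total ignorance

    outturn = BINS.index("$40-$60")  # where the price actually averaged

    for name, probs in (("A", forecaster_a), ("B", forecaster_b)):
        print(f"forecaster {name}: {brier_score_single(probs, outturn):.3f}")

Averaged over many forecasting rounds, such scores would rank forecasters by accuracy and reward honest probabilities over confident storytelling.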
Weather forecasts have improved enormously over the last 50 years because they have been subjected to rigorous analysis. It is far less obvious that forecasts for oil prices and other financial markets have become any better. If we demand accuracy and accountability from weather forecasters and intelligence specialists, shouldn't we demand the same from oil market forecasters?
By John Kemp