This is my last blog post about coffee (I promise). Ever since stumbling upon this Atlantic article about which countries consume the most coffee per capita, I’ve pondered a deeper question—not just who drinks the most coffee, but who’s most addicted to the stuff. You might argue that these questions are one and the same, but when it comes to studying addiction, it actually makes more sense to look at it through the lens of elasticity rather than gross consumption.
You might recall elasticity from an earlier blog post. Generally speaking, elasticity is the economic concept that captures how sensitive consumption is to changes in another variable (in my earlier post, that variable was income). When it comes to studying addiction, economists focus on price elasticity—i.e. the % change in quantity divided by the % change in price. And it makes sense. If you want to know how addicted people are to something, why not see how badly they need it after you jack up the price? If they don't react very much, you can surmise that they are more addicted than if they do. Focusing on elasticity rather than gross consumption allows for a richer understanding of addiction. That's why economists regularly employ this type of analysis when designing public policy around cigarette taxes.
I would never want to tax coffee, but I am interested in applying the same approach to calculate the price elasticity of coffee across different countries. While price elasticity is not a super complicated idea to grasp, in practice it is actually quite difficult to calculate. In the rest of this blog post, I’ll discuss these challenges in detail.
Gettin’ Da Data
The first problem for any data analysis is locating suitable data; in this case, that means the two variables that make up the definition of elasticity: price and quantity. Thanks to the International Coffee Organization (ICO), retail coffee prices were surprisingly easy to find for 26 different countries, reaching (in some cases) back to 1990.
Although price data was remarkably easy to find, there were still a few wrinkles to deal with. First, for a few countries (e.g. the U.S.), there was missing data in some years. To remedy this, I used the handy R package imputeTS to generate reasonable values for the gaps in the affected time series.
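To give a sense of what that step looks like, here's a minimal sketch of the imputation (the price values are made up purely for illustration):

```r
# Minimal sketch: fill gaps in an annual retail price series with imputeTS
# (the numbers below are made up for illustration)
library(imputeTS)

us_prices <- ts(c(4.11, 3.95, NA, 3.60, NA, NA, 3.45, 4.02), start = 1990)
us_prices_filled <- na_interpolation(us_prices)  # linear interpolation across the NA gaps
plot(us_prices_filled)
```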
The other wrinkle related to inflation. I searched around ICO’s website to see if their prices were nominal or whether they controlled for the effects of inflation. Since I couldn’t find any mention of inflation, I assumed the prices were nominal. Thus, I had to make a quick stop at the World Bank to grab inflation data so that I could deflate nominal prices to real prices.
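The deflation itself is just a division by a price index; here's a rough sketch assuming a CPI series with 2010 as the base year (all numbers are illustrative):

```r
# Rough sketch: convert nominal retail prices to real prices using a CPI index
# (base year 2010 = 100; all numbers are illustrative)
cpi     <- c("2009" = 97.8, "2010" = 100.0, "2011" = 103.2, "2012" = 105.3)
nominal <- c("2009" = 3.05, "2010" = 3.20, "2011" = 3.45, "2012" = 3.50)
real    <- nominal / (cpi / 100)  # prices expressed in constant 2010 terms
round(real, 2)
```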
While I was at the Bank, I also grabbed data on population and real GDP. The former is needed to get variables on a per capita basis (where appropriate), while the latter is needed as a control in our final model. Why? If you want to see how people react to changes in the price of coffee, it is important to hold constant any changes in their income. We’ve already seen how positively associated coffee consumption is with income, so this is definitely a variable you want to control for.
Getting price data might have been pretty easy, but quantity data posed more of a challenge. In fact, I wasn't able to find any publicly available data about country-level coffee consumption. What I did find, however, was data about coffee production, imports and exports (thanks to the UN's Food and Agriculture Organization). So, using basic accounting logic (i.e. production – exports + imports), I was able to back into the net quantity of coffee left in a country in a given year.
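In code, the accounting step boils down to a single line per country-year (the data frame and its column names here are hypothetical):

```r
# Hypothetical country-year data; the identity is production - exports + imports
coffee_bal <- data.frame(
  year       = 2010:2012,
  production = c(0, 0, 0),               # tonnes; a purely importing country produces none
  imports    = c(120000, 130000, 125000),
  exports    = c(10000, 12000, 11000),
  population = c(16.6e6, 16.7e6, 16.8e6)
)
coffee_bal$net_quantity <- coffee_bal$production - coffee_bal$exports + coffee_bal$imports
coffee_bal$net_qty_pc   <- coffee_bal$net_quantity / coffee_bal$population  # per capita
```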
There are obvious problems with this approach. For one thing, it assumes that all coffee left in a country after accounting for imports and exports is actually consumed. Although coffee is a perishable good, it is likely that at least in some years, quantity is carried over from one year to another. And unfortunately, the UN’s data gives me no way to account for this. The best I can hope for is that this net quantity I calculate is at least correlated with consumption. Since elasticity is chiefly concerned with the changes in consumption rather than absolute levels, if they are correlated, then my elasticity estimates shouldn’t be too severely biased. In all, the situation is not ideal. But if I couldn’t figure out some sort of workaround for the lack of publicly available coffee consumption data, I would’ve had to call it a day.
Dogged by Endogeneity
Once you have your data ducks in a row, the next step is to estimate your statistical model. There are several ways to model addiction, but the simplest is the following:
$latex log(cof\_cons\_pc) = \alpha + \beta_{1} * log(price) + \beta_{2} * log(real\_gdp\_pc) + \beta_{3} * year + \varepsilon$
The above linear regression equation models per capita coffee consumption as a function of price while controlling for the effects of income and time. The regression is taken in logs mostly as a matter of convenience, since in a log-log model the coefficient on price, $latex \beta_1$, can be read directly as the price elasticity. But there is a major problem with estimating the above equation, as economists will tell you, and that is the issue of endogeneity.
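To make the setup concrete, here's what the naive (endogeneity-ignoring) version of that model looks like in R; the data frame below is simulated purely so the snippet runs end to end, and its column names mirror the equation above:

```r
# Naive OLS version of the elasticity model; the coefficient on log(price) is the elasticity.
# The data are simulated only so the example is self-contained.
set.seed(1)
n <- 25
coffee <- data.frame(
  year        = 1990 + seq_len(n),
  price       = exp(rnorm(n, log(3), 0.10)),      # real retail price
  real_gdp_pc = exp(rnorm(n, log(30000), 0.05))   # real GDP per capita
)
coffee$cof_cons_pc <- exp(0.5 - 0.3 * log(coffee$price) +
                          0.2 * log(coffee$real_gdp_pc) + rnorm(n, 0, 0.05))

ols_fit <- lm(log(cof_cons_pc) ~ log(price) + log(real_gdp_pc) + year, data = coffee)
summary(ols_fit)
```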
Endogeneity can mean several different things, but in this context, it refers to the fact that you can't isolate the effect of price on quantity because the data you have is the sum total of all the shocks that shift both supply and demand over the course of a year. Shocks can be anything that affects supply or demand apart from price itself—from changing consumer tastes to a freak frost that wipes out half the annual Colombian coffee crop. These shocks are for the most part unobserved, but all together they define the market dynamics that jointly determine the equilibrium quantity and price.
To isolate the effect of price, you have to locate the variation in price that is not also correlated with the unobserved shocks in a given year. That way, the corresponding change in quantity can safely be attributed to the change in price alone. This strategy is known as using an instrumental variable (IV). In the World Bank Tobacco Toolkit, one suggested IV is lagged price (of cigarettes in their case, though the justification is the same for coffee). The rationale is that shocks from one year are not likely to carry over to the next, while at the same time, lagged price remains a good predictor of price in the current period. This idea has its critics (which I’ll mention later), but has obvious appeal since it doesn’t require any additional data.
To implement the IV strategy, you first run the model:
$latex price_t = \alpha + \beta_{1} * price_{t-1} + \beta_{2} * real\_gdp\_pc_t + \beta_{3} * year_t + \varepsilon_t$
You then use the predicted values of $latex price_t$ from this model as the values for price in the original elasticity model I outlined above. This is commonly known as Two-Stage Least Squares (2SLS) regression.
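Here's a hedged sketch of that procedure, reusing the simulated data frame from the OLS snippet above and doing the two stages by hand (in practice, something like AER::ivreg() is preferable because it computes the correct second-stage standard errors):

```r
# Two-stage least squares by hand, with lagged price as the instrument
coffee$price_lag <- c(NA, head(coffee$price, -1))   # price_{t-1}
cf <- coffee[-1, ]                                  # drop the first year (no lag available)

# Stage 1 (the equation above): regress current price on lagged price and the controls
stage1 <- lm(price ~ price_lag + real_gdp_pc + year, data = cf)
cf$price_hat <- fitted(stage1)

# Stage 2: plug the predicted prices into the elasticity model
stage2 <- lm(log(cof_cons_pc) ~ log(price_hat) + log(real_gdp_pc) + year, data = cf)
summary(stage2)
```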
Next Station: Non-Stationarity
Endogeneity is a big pain, but it's not the only issue that makes it difficult to calculate elasticity. Since we're relying on time series data (i.e. repeated observations of country-level data) as our source of variation, we also open ourselves up to the various problems that often plague inference in time series regression.
Perhaps the most severe problem posed by time series data is the threat of non-stationarity. What is non-stationarity? Well, a variable is stationary when its mean and variance remain constant over time. Stationarity matters in linear regression because regressing non-stationary variables on one another can produce spurious—i.e. meaningless—coefficient estimates. Thus, finding a way to make sure your variables are stationary is rather important.
This is all made more complicated by the fact that there are several different flavors of stationarity. A series might be trend stationary, which means it's stationary around a trend line. Or it might be difference stationary, meaning it's stationary after you difference the series, i.e. subtract each year's value from the next so you're left with just the year-to-year changes. A series could also have structural breaks, like an outlier or a lasting shift in the mean (or multiple shifts). And finally, if two or more series are non-stationary but related to one another by way of something called co-integration, then you have to apply a whole different analytical approach.
At this point, a concrete example might help to illustrate the type of adjustments that need to be made to ensure stationarity of a series. Take a look at this log-scaled time series of coffee consumption in The Netherlands:
It seems like there is a slight downward trend overall from 1990 through 2007 (with a brief interruption in 1998/99). Then, from 2008 through 2010, there was a mean shift downward, followed by another shift down in 2011. But starting in 2011, there seems to be a strong upward trend. All of these quirks in the series are accounted for in my elasticity model for The Netherlands using dummy variables—interacted with year when appropriate to allow for different slopes in the different epochs described above.
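To give a flavor of those adjustments, here's a rough sketch of how the epoch dummies and year interactions might be constructed (the break years follow the description above; the variable and data frame names are illustrative):

```r
# Sketch: epoch dummies for a series with structural breaks, per the description above
years <- 1990:2015
epoch_2008_2010 <- as.numeric(years >= 2008 & years <= 2010)  # first mean shift
epoch_2011_on   <- as.numeric(years >= 2011)                  # second shift plus new trend

# Interacting the post-2011 dummy with year allows a different slope in that epoch, e.g.:
# lm(log(cof_cons_pc) ~ log(price_hat) + log(real_gdp_pc) + year +
#      epoch_2008_2010 + epoch_2011_on + epoch_2011_on:year, data = nl)
```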
This kind of fine-grained analysis had to be done on three variables per model—across twenty-six. different. models. Blech. Originally, I had hoped to automate much of this stage of the analysis, but the idiosyncrasies of each series made this impossible. The biggest issue was the structural breaks, which can easily throw off the Augmented Dickey-Fuller (ADF) test, the workhorse for statistically detecting whether or not a series is stationary.
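For reference, the ADF test itself is a one-liner; here's a sketch using tseries::adf.test (one implementation among several; urca::ur.df is another, and the post doesn't commit to a specific one):

```r
# Sketch: Augmented Dickey-Fuller test on a simulated log-consumption series
library(tseries)

set.seed(42)
log_cons <- 2 + cumsum(rnorm(26, 0, 0.05))  # a random walk, non-stationary by construction
adf.test(log_cons)        # typically fails to reject the unit root (non-stationary)
adf.test(diff(log_cons))  # the differenced series should look much closer to stationary
```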
This part of the project definitely took the longest to get done. It also involved a fair number of judgment calls—when a series should be de-trended or differenced, or how to separate the epochs when structural breaks were present. All of this lends credence to the critique that time series analysis can often be more art than science. The work was tedious, but at the very least, it gave me confidence that it might be a while before artificial intelligence replaces humans for this particular task. In prior work, I actually implemented a structural break detection algorithm I once found in a paper, but I wasn't impressed with its performance, so I wasn't going to go down that rabbit hole again (for this project, at least).
Other Complications
Even after you’ve dealt with stationarity, there are still other potential problem areas. Serial correlation is one of them. What is serial correlation? Well, one of the assumptions in linear regression is that the error term, or $latex \varepsilon$, as it appears in the elasticity model above, is independent across different observations. Since you observe the same entity multiple times in time series data, the observations in your model are by definition dependent, or correlated. A little serial correlation isn’t a big deal, but a lot of serial correlation can cause your standard errors to become biased, and you need those for fun stuff like statistical inference (confidence intervals/hypothesis testing).
Another problem that can plague your $latex \varepsilon$'s is heteroskedasticity, which is a complicated word that means the variance of your errors is not constant over time. Fortunately, both heteroskedasticity and serial correlation can be controlled for using robust covariance calculations known as Newey-West estimators. These are easily accessible in R via the sandwich package, which I used whenever I observed heteroskedasticity or serial correlation in my models.
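Concretely, that means swapping in a Newey-West covariance matrix at inference time; here's a sketch, reusing the stage-two model from the 2SLS snippet above (lmtest supplies the coefficient-table helpers):

```r
# Sketch: Newey-West (HAC) standard errors via sandwich, applied to the stage-two fit above
library(sandwich)
library(lmtest)

coeftest(stage2, vcov. = NeweyWest(stage2))  # coefficient table with robust standard errors
coefci(stage2, vcov. = NeweyWest(stage2))    # matching confidence intervals
```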
A final issue is the problem of multicollinearity. Multicollinearity is not strictly a time series related issue; it occurs whenever the covariates in your model are highly correlated with one another. When this happens, your beta estimates become highly unreliable and unstable. This occurred in the models for Belgium and Luxembourg between the IV price variable and GDP. There are not many good options when your model suffers from multicollinearity. Keep the troublesome covariate, and you’re left with some really weird coefficient values. Throw it out, and your model could suffer from omitted variable bias. In the end, I excluded GDP from these models because the estimated coefficients looked less strange.
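A quick way to spot this sort of trouble is to look at the correlation between the offending regressors or at variance inflation factors; a sketch (car::vif is one common option, not something used in the post itself):

```r
# Sketch: diagnosing multicollinearity between the instrumented price and GDP
cor(log(cf$price_hat), log(cf$real_gdp_pc))  # a correlation near +/-1 is a warning sign

library(car)  # one common source of a VIF function
vif(stage2)   # rule of thumb: values well above ~10 signal problematic collinearity
```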
Results/Discussion
In the end, only two of the elasticities I estimated (those for Lithuania and Poland) turned out to be statistically significant—or reliable—estimates. The rest were statistically insignificant (at $latex \alpha$ = 0.05), which means that positive values of elasticity are among the plausible values for a majority of the estimates. From an economic theory standpoint this makes little sense, since it is a violation of the law of demand. Higher prices should lead to a fall in demand, not a rise. Economists have a name for goods that violate the law of demand—Giffen goods—but they are rarely encountered in the real world (some say that rice in China is one of them). I'm pretty sure coffee is a normal, well-behaved good.
Whenever you wind up with insignificant results, there is always a question of how to present them (if at all). Since the elasticity models produce actual point estimates for each country, I could have just put those point estimates in descending order and spit out that list as a ranking of the countries most addicted to coffee. But that would be misleading. Instead, I think it's better to display entire confidence intervals—particularly to demonstrate the variation in certainty (i.e. width of the interval) across the different country estimates. The graphic below ranks countries from top to bottom in descending order by width of confidence interval. The vertical line at -1.0 is a reference for the threshold between goods considered price elastic ($latex \epsilon < -1.0$) versus price inelastic ($latex -1.0 < \epsilon < 0.0$).
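For what it's worth, pulling those intervals out of each country's fitted model is straightforward; a sketch using the HAC-robust intervals from earlier (one such row per country would then be stacked and sorted by interval width):

```r
# Sketch: extract the elasticity estimate and its confidence interval from one fitted model
est <- coef(stage2)["log(price_hat)"]
ci  <- coefci(stage2, vcov. = NeweyWest(stage2))["log(price_hat)", ]
data.frame(country = "simulated", estimate = est,
           lower = ci[1], upper = ci[2], width = ci[2] - ci[1])
```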
When looking at the graphic above, it is important to bear in mind that, apart from perhaps a few cases, it is not possible to draw conclusions about the differences in elasticities between individual countries. You cannot, for example, conclude that coffee is more elastic in the United States than in Spain. To generate a ranking of elasticities across all countries (and arrive at an answer to the question posed by the post title), we would need to perform a battery of pairwise comparisons between all the different countries ([26*25]/2 = 325 in total). Based on the graphic above, however, I am not convinced this would be worth the effort. Given the degree of overlap across confidence intervals—and the fact that the significance-level correction to account for multiple comparisons would only make this problem worse—I think the analysis would just wind up being largely inconclusive.
In the end, I'm left wondering what might be causing the unreliable estimates. In some cases, it could just be a lack of data; perhaps with access to more years—or more granular data taken at monthly or quarterly intervals—the confidence intervals would shrink toward significance. In other cases, I might have gotten unlucky in terms of the variation in a given sample. But I am also not supremely confident in the fidelity of my two main variables, quantity and price, since both have artificial qualities to them. Quantity is based on values I synthetically backed into rather than coming from a concrete, vetted source, and price is derived from IV estimation. Although I trusted the World Bank when it said lagged price was a valid IV, I subsequently read some literature suggesting it may not solve the endogeneity issue after all. Specifically, that literature argues that the assumption that shocks are not serially correlated is problematic.
If lagged price is not a valid IV, then another variable must be found that is correlated with price but not with shocks to demand. Upon another round of Googling, I managed to find data on global supply-side prices through the years. It would be interesting to compare the results using these two different IVs. But then again, I did promise that this would be my last article about coffee… Does that mean the pot is empty?