Welcome back, everybody, to the last session of this conference, the ECB Conference on Monetary Policy. We have the real pleasure of hosting Emi Nakamura, UC Berkeley, as our second keynote speaker of today. She is well known for her very wide-ranging research agenda, which spans many areas of great interest for scholars, for us central bank economists, and for practitioners of monetary policy. In that tradition, today she will speak about forecast errors, which is something central banks and other institutions have been struggling with in a big way for a good year now. She proposes a very interesting interpretation for why we see all the forecasting anomalies that she documents in her presentation. I must say I find that interpretation very convincing, but I won't reveal what it is; it will be for Emi to do so in her presentation. So Emi, over to you. You have 50 minutes and then we'll open the floor for Q&A and for debate.

Thank you very much for that very gracious introduction, and also for building some mystery around my results, which I will reveal in my presentation. This paper is called Learning about the Long Run. It's joint with Leland Farmer and Jón Steinsson. The paper is about forecast errors. As is well known, there are a variety of different kinds of forecast errors, even among professional forecasters, and these have been well documented in the literature. Let me just go through a few of the facts that I think have motivated the literature and also motivate our paper. One fact has to do with bias. Here I'm going to be talking about two leading examples. One is forecasting interest rates; that's the T-bill row here. The second is forecasting GDP growth; that's the GDP growth row here. One thing to keep in mind is that the forecast horizon for the T-bill is in terms of quarters, whereas for GDP growth it is in terms of years.
This is just because, as you'll see as we go along in the presentation, many of the most interesting forecast errors for interest rates are at shorter horizons, and for GDP growth at longer horizons. The interest rate forecast data are from the Survey of Professional Forecasters. For GDP growth, the forecasts are from the Congressional Budget Office. In the case of bias, the idea is that you can run a regression of the forecast error on a constant, where the forecast error is just the actual outcome of the variable minus the forecast at various horizons into the future. This would be one, two, three, four, and five quarters into the future for interest rates; for GDP growth, it would be several years into the future. You see that for interest rates, over the sample period we're looking at, there has been a negative bias in forecasts. For GDP growth, nothing statistically significant. Another anomaly that has been documented in the literature is autocorrelated forecast errors. This is the idea that when forecasters are off in one direction, they tend to be off repeatedly in the same direction. Of course, this is maybe an idea that we feel familiar with over the past year. For interest rates, the idea is to run a regression of the forecast error at time t of outcomes h periods ahead on the same object h periods ago. There would be different ways of documenting autocorrelated forecast errors, but this is one such statistic. You can see that for interest rates, there is significant autocorrelation in forecast errors. In contrast, again, for GDP growth, we don't see anything statistically significant. A third kind of classic test of the rationality of these forecasts is the one Mincer and Zarnowitz introduced in their classic paper, which is to look at the effect of a change in the forecast on outcomes at the same horizon.
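Before turning to that third test, here is a concrete illustration of the first two regressions, the bias test and the autocorrelation test, run on synthetic data. Everything here (the process, the parameter values, and the forecaster's misspecified belief in mean reversion to a fixed level) is an illustrative assumption of mine, not the paper's data or estimates; the point is only the mechanics of the two regressions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H = 200, 4  # quarters of data, forecast horizon in quarters

# True process (assumed): a random walk drifting down from 5 percent.
actual = 5 + np.cumsum(rng.normal(-0.03, 0.1, T + H))

# A forecaster who wrongly believes rates mean-revert to a fixed level of 5.
mu0, rho_hat = 5.0, 0.7
forecasts = mu0 + rho_hat**H * (actual[:T] - mu0)
errors = actual[H:T + H] - forecasts  # forecast error = actual minus forecast

# Bias test: regress the forecast error on a constant (i.e. take its mean).
alpha = errors.mean()

# Autocorrelation test: regress e_t on e_{t-H}, with an intercept.
X = np.column_stack([np.ones(T - H), errors[:-H]])
beta = np.linalg.lstsq(X, errors[H:], rcond=None)[0][1]

print(alpha, beta)  # negative bias, positively autocorrelated errors
```

With this particular misspecification, the forecaster is repeatedly too high as rates drift down, which delivers both a negative mean error and positive autocorrelation in the errors, the same signs as the T-bill row.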
The idea here is that with full information rational expectations, the forecasts are just Bayesian conditional expectations. In that situation, if the forecast moves by 1%, then the actual outcome should on average also move by 1%; in other words, this coefficient beta should be equal to 1. For interest rates, the coefficient does differ statistically significantly from 1 (here I want to mention that the stars reflect differences from 1, as opposed to differences from 0). It's a little bit below 1, but you can see that in economic terms it's not that far from 1. However, if we look at GDP forecasts, you see just massive deviations from the value of 1. In fact, at a forecast horizon of three years, the coefficient is basically 0. What this means is that, on average, when the forecast moved by 1%, actual outcomes at that horizon didn't move at all. Now, this is from the CBO, the Congressional Budget Office in the United States, but these forecasts are very, very close to private sector forecasts, so I think this is a more general phenomenon. Another set of facts, which some of you may be thinking of as related but which are not usually analyzed in the same literature, are the facts relating to the term structure: the expectations hypothesis of interest rates. Campbell and Shiller introduced a benchmark set of tests of the expectations hypothesis. The first regression test that they looked at puts the spread between long-term and short-term yields on the right-hand side, and on the left-hand side puts a sort of ex-post spread. So here we have the yield spread on the right-hand side; on the left-hand side, we have the average of future short-term interest rates minus the current short-term interest rate.
And it's easy to show that, under full information rational expectations, this coefficient beta should be equal to one. But a large literature has shown that, in fact, this coefficient is closer to zero than to one, and that's what we're documenting in this table here. If you run this regression at various horizons into the future, you get coefficients very close to zero. So while, according to full information rational expectations, these yield spreads should strongly predict the trajectory of future short-term interest rates, in fact that's not the case in the data. Now, I should add one more thing: I've been playing a little bit fast and loose with my terminology. There is another important potential gap between the predictions of full information rational expectations and this object on the left-hand side, which is the risk premium. But here I'm assuming that these risk premia are constant; I'll come back to that later on. A second test of the expectations hypothesis, the idea that long-term interest rates are approximately equal to the average of future short-term interest rates plus a constant risk premium, puts the change in the long-term yield on the left-hand side and the yield spread on the right-hand side. Here the intuition is a little bit more involved, but the idea is that when the yield spread is high, then if you just compare yields, the return on the long-term bond would be higher than the return on the short-term bond. So, to equalize returns, you actually need the yield on the longer-term bond to rise, so that the long-term bond takes a capital loss, and that equalizes returns between the short- and the long-term bond. Or another way to think about it is that under the expectations hypothesis, the long-term interest rate is just a weighted average of future short-term interest rates.
And in that situation, when the short-term interest rate is low relative to the long-term interest rate, then as you move forward one period in time, you're dropping out a relatively low value of the short-term interest rate, and that should make the average of the remaining periods rise over time. In any case, these are just intuitions for the fact that, under the expectations hypothesis, the prediction is that beta should be equal to 1: when the yield spread is unusually high, this should be a time when you're expecting long-term yields to rise. But in fact, as a long literature has documented, this coefficient is negative. So again, the null hypothesis under the expectations hypothesis is that beta would be equal to 1; what we see in the data is that beta is about minus 1. It's very far from the null of the expectations hypothesis. Now, there are a number of important potential explanations for these anomalies relating to the expectations hypothesis. One set of explanations has to do with time-varying risk premia, like I mentioned. A second set of explanations relates to deviations between full information rational expectations and the expectations that people actually have. That's the line of inquiry that we're going to pursue, and that's the connection I'm trying to develop between these term structure facts and the facts relating to anomalies in professional forecasts. An important seminal paper on this topic was by Froot in 1989, where he pointed out that a number of the puzzles relating to uncovered interest rate parity could be substantially ameliorated if you were to consider survey expectations as opposed to imposing full information rational expectations, which suggests that some of the locus of the problem relates to the formation of expectations.
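To see the Campbell-Shiller benchmark concretely, here is a small simulation of my own (illustrative parameters, not the paper's) in which the expectations hypothesis holds by construction: short rates follow an AR(1), long yields are averages of expected future short rates, and the first Campbell-Shiller regression then recovers a coefficient near one, the null that the data reject.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, rho = 5000, 4, 0.8  # sample length, bond maturity in periods, AR(1) persistence

# Short rate: AR(1) around zero (constants drop out of the spread regression).
x = np.zeros(T + n)
for t in range(1, T + n):
    x[t] = rho * x[t - 1] + rng.normal()

# Expectations hypothesis: the n-period yield is the average expected short rate.
# For an AR(1), E_t x_{t+j} = rho**j * x_t, so the yield is a known multiple of x_t.
y_long = x[:T] * np.mean([rho**j for j in range(n)])
spread = y_long - x[:T]

# First Campbell-Shiller regression: average realized future short rates minus the
# current short rate, regressed on the yield spread (with an intercept).
lhs = np.array([x[t:t + n].mean() - x[t] for t in range(T)])
X = np.column_stack([np.ones(T), spread])
beta = np.linalg.lstsq(X, lhs, rcond=None)[0][1]
print(round(beta, 2))  # near 1 under the expectations hypothesis
```

Small-sample bias and sampling noise move the estimate a bit, but with the expectations hypothesis imposed the coefficient sits near one, in sharp contrast to the near-zero (and, for the second regression, near minus one) estimates in the data.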
Now, in macroeconomics there's a long literature on these ideas, going back, as I mentioned earlier, to Mincer and Zarnowitz, and certainly to Benjamin Friedman's work early on, after the influx of rational expectations into macroeconomics. One traditional reaction to this set of facts is that the forecasters are irrational, or are using information in an inefficient way. A recent variant of this line of thinking is the work by Bordalo, Gennaioli, Ma, and Shleifer. An alternative reaction, where there has also been a lot of work in macroeconomics, is the idea that these forecasters face important forms of informational rigidities and frictions. This second literature falls into two branches. On the one hand, there's important work on sticky information models; in these models, the key assumption is that forecasters update their information infrequently. A second important branch of this literature relates to noisy information. In this case, the challenge the forecasters face is that they don't observe completely clean data on the variables they're interested in; they get a noisy signal of those variables. Now, while we find these explanations entirely plausible for explaining many facts in macroeconomics, particularly relating to households and firms, they seem less plausible for the case of professional forecasters. In all my interactions with professional forecasters, my strong impression has been that they are very up to date on the latest announcements by the Fed and the ECB and so on. So they're not using old information about macroeconomic variables. And of course they know exactly what the Fed funds rate is, or the other policy rates that we face in the world. So these seem like less plausible explanations for the specific case of professional forecasters.
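For reference, the two frictions just described can each be sketched in a few lines. This is a minimal caricature with parameter values I am assuming for illustration (an updating probability for the sticky-information forecaster, and a fixed gain for the noisy-information forecaster, rather than one derived from the signal-to-noise ratio); both mechanisms generate the kind of autocorrelated forecast errors discussed above.

```python
import numpy as np

rng = np.random.default_rng(3)
T, lam = 300, 0.3                        # periods; per-period updating probability
truth = np.cumsum(rng.normal(0, 1, T))   # the variable being tracked (a random walk)

# Sticky information: each period the forecaster refreshes with probability lam,
# otherwise keeps the stale estimate.
sticky = np.empty(T)
sticky[0] = truth[0]
for t in range(1, T):
    sticky[t] = truth[t] if rng.random() < lam else sticky[t - 1]

# Noisy information: the forecaster sees truth plus noise and updates with a
# constant gain (an assumed value standing in for the steady-state Kalman gain).
gain, belief = 0.4, 0.0
noisy = np.empty(T)
for t in range(T):
    signal = truth[t] + rng.normal(0, 2.0)
    belief += gain * (signal - belief)
    noisy[t] = belief

# Both frictions leave errors that are positively autocorrelated.
def ac1(e):
    return np.corrcoef(e[:-1], e[1:])[0, 1]

e_sticky, e_noisy = truth - sticky, truth - noisy
print(round(ac1(e_sticky), 2), round(ac1(e_noisy), 2))  # both positive
```

The sketch makes the paper's objection concrete: both mechanisms hinge on the forecaster not seeing the current data cleanly or promptly, which is exactly the premise that seems implausible for professional forecasters.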
And in addition to that, the professional forecasters typically know exactly what each other's forecasts are, and are intensely aware of where they lie in that distribution. But even if these particular information frictions may be less important for professional forecasters, that doesn't mean full information rational expectations isn't a strong set of assumptions, because in particular it implies that the forecasters in the model really know the model that is generating the data. They know the model and they know the parameters. And I think the last 18 months have made all of us feel that this is a strong assumption. It seems more realistic to assume that, just like us, the forecasters in the model are learning about the model that generates the data. But once you introduce parameter learning, even into a model with rational expectations, it fundamentally changes the dynamics of the model, and it can lead standard rational expectations tests to fail. This is a point that, in a qualitative sense at least, has been made going all the way back to Benjamin Friedman's early work on this topic, and I'm listing a number of important papers that have also made it. So given that this idea has been out there, that we don't know the model, we don't know the parameters in real time, we're learning, and that this can change the interpretation of failures of rational expectations tests, why isn't this the standard interpretation in the literature? Let me just read a very nice quote that summarizes the intuition for the idea that maybe people don't know the model in real time, from Anna Cieslak's 2018 paper.
She writes: "Ex-post predictability of forecast errors did not imply that people make obvious mistakes that could be easily fixed in real time. Even when conducting a quasi-real-time estimation, an econometrician uses ex-post knowledge of a statistical relationship that would have been much harder to uncover in real time." Okay, so this is the idea that maybe it's hard to figure these things out in real time. And as I said, this seems like a plausible idea, so why is it not a more prominent interpretation in the literature? Well, I think one major issue is that once you start to introduce parameter uncertainty into a model, the model often becomes much harder to solve, in particular if you allow people to be learning not just about transitory states but also about persistent parameters. As a consequence, to get closed-form solutions and more tractable models, a lot of the earlier work on parameter learning used relatively simple models. And in these models, there's a sort of folk theorem that Bayesian learning tends to be relatively fast. So the question remains whether this kind of learning explanation could really explain anomalies that persist over long periods of time. The anomalies that I showed you at the beginning have persisted over decades. So the question is: can this type of learning explanation actually generate persistent anomalies over that kind of time horizon? Or is it something we would expect to disappear over just a few years? There is informal discussion in a number of papers making the argument that perhaps, in some form, parameter breaks might sustain learning over longer periods of time. But it's been hard to formalize this idea of parameter breaks.
And so I think it's reasonable to say that the question is still open: in a model with rational expectations, can this kind of learning sustain anomalies over long periods of time? At the same time, I think it has been increasingly realized in thinking about the term structure that you actually need a somewhat more complicated model to understand interest rate dynamics. In particular, I think it has been pretty well established that you need some kind of shifting-endpoints model, of the type suggested by Kozicki and Tinsley, where people change their views about where interest rates are going in the longer run. Very similar issues arise in the case of GDP: where do we think growth is going in the longer run? These are questions on which people's views have changed over time, for example in the era of secular stagnation and so on, and that's an important component of how the forecasts are changing. But the challenge is that once you introduce these kinds of forces, you're thinking about an unobserved components model, where there are multiple components of the interest rate or of GDP growth, and this deep component, long-term interest rate expectations or long-term growth expectations, is not directly observable. In that kind of model, the parameters can be very difficult to learn. The basic intuition is that the model can yield a fairly similar fit to high-frequency behavior, which is the data you're getting all the time if you're the agent in the model, while yielding very different implications for low-frequency behavior.
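That high-frequency versus low-frequency point can be made with a two-line calculation. Starting from the same level today, a stationary component with persistence 0.9 and a unit-root component imply nearly identical one-quarter-ahead forecasts but wildly different ten-year-ahead forecasts; the parameter values here are purely illustrative.

```python
# Forecast of a component currently at level 1.0, h periods ahead:
# an AR(1) forecast decays as rho**h, while a unit-root forecast stays put.
level = 1.0
for h in (1, 4, 40):          # one quarter, one year, ten years ahead
    ar1 = 0.9**h * level      # stationary component with rho = 0.9
    rw = 1.0 * level          # unit-root (random walk) component
    print(h, round(ar1, 3), rw)
```

At one quarter the two forecasts differ by only 0.1; at ten years the AR(1) forecast has decayed essentially to zero while the unit-root forecast has not moved. An agent with only a few decades of quarterly data therefore has trouble telling the permanent and transitory components apart in real time.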
So intuitively, an unobserved component with an AR(1) coefficient of 0.9 might have fairly similar high-frequency implications to a model in which that component has a persistence of 1; but if you look at the longer-run implications, they might be very different. In this setting, we're going to show that Bayesian learning can in fact be very slow. And again, this follows up on other recent work making related points. So what we're going to do, very concretely, is to look at the two applications I presented at the beginning: forecasting interest rates and forecasting GDP growth. And we're going to be thinking about exactly the set of facts that I presented at the beginning. The model is conceptually very simple. We think about Bayesian forecasters: we endow them with an unobserved components model and initial beliefs about the parameters; then we feed them the data on interest rates and GDP in real time, and we have them generate real-time forecasts. The question is then whether we are able to match the anomalies in the data using this model. This analysis will all use actual data on GDP and interest rates. At the end, I'll also do a Monte Carlo simulation where I simulate the interest rate data from the model and then apply the learning model to that simulated data. That's useful because in the Monte Carlo, we know that the data are truly coming from the model that we propose. So the main result is going to be that we can match the forecasting anomalies I showed you when the forecasters are endowed with reasonable initial beliefs. Of course, the question is what counts as reasonable; I'll show you the priors, and the argument we're making is just that these are fairly dispersed initial beliefs.
They're not point masses, where the individual couldn't be convinced of a different view; they're fairly dispersed initial beliefs, and nevertheless there can be quite persistent anomalies. So the main interpretation that we come to is that in a situation where the low-frequency phenomena are hard to learn about, learning can generate very persistent forecasting anomalies, and standard rational expectations tests can in fact be quite misleading even over pretty long periods of time. Okay, let me start with the interest rate forecasting example. The data we're using on the short-term interest rate are the three-month T-bill rate over the period 1951 to 2019. We start in 1951 because we view the Treasury Accord as a major change in monetary policy. We're going to use zero-coupon yield curve data from Liu and Wu starting in 1961, and we're going to use forecasts from the Survey of Professional Forecasters, which start in 1981; the forecasters are asked to produce nowcasts and forecasts up to four quarters into the future. So here's a picture for the United States that many of you will be familiar with; of course there are related pictures for many other countries in the world. The U.S. saw a sustained run-up of interest rates until a peak in 1980; then Paul Volcker came into office, there was a major change in monetary policy, and interest rates and inflation have been falling since then. Here's another picture that many of you may be familiar with, and which I think gives you some intuition for the forecasting anomalies that I showed you at the beginning of the presentation. This picture, which some people refer to as the hair plot or the hairy caterpillar plot, shows forecasts of interest rates starting at each point in time, going back to around 1980. Each dot represents the forecast at a given quarter into the future.
And so you can see intuitively what is causing, for example, the autocorrelated forecast errors. During the early 2000s, the Fed is lowering interest rates, but the bond market repeatedly expects that things are going to go back to normal, and they don't, or at least not for a much longer time than the bond market thinks. The same thing happened during the Great Recession: the Fed lowered interest rates in a sustained way and then kept them at zero for a long time, but all the way through this episode, the bond market was repeatedly forecasting that interest rates were going to go back up to some normal level, and the same was true of the professional forecasts from the Survey of Professional Forecasters. So you see this persistent expectation that we were going to go back to normal.

Our model for short-term interest rates is very simple: it's a three-parameter model. The T-bill yield is the sum of two components: a random walk component mu and a transitory component x. The three parameters are gamma, the variance share of the permanent component (you can see that gamma enters on the error terms in the expressions for mu and for x); rho, the persistence of the transitory component; and sigma, the volatility of the short yield. The most important parameters to think about here are rho and gamma. As rho gets higher, the transitory component becomes more persistent; as gamma gets higher, the variance share of the permanent component becomes larger. So one of the identification problems the agents in this model have to grapple with is: when they see what look like persistent interest rates, how much of that is because rho is high, so that the transitory component is very persistent, and how much is because gamma is high, so that the variance share of the permanent component is large? The reason these things matter is that, as I alluded to at the beginning, while a rho of 0.9 might look pretty similar to a unit root over short time horizons, over longer time horizons a unit root is going to look very different from a transitory component with a rho of 0.9. That's an important challenge these agents face in trying to estimate the model.

The exercise we do is conceptually very simple. We feed the agents the actual data on interest rates in real time. We start with a set of initial beliefs in 1951, and then in every time period we give the agents a new observation, re-estimate the whole model, and use the posterior densities for the parameters and the states to produce forecasts up to 10 years into the future, which we in turn use to create estimates of the long yield through the expectations hypothesis. One shortcut we take is that we turn off parameter learning during the ZLB period; this basically allows us to stay in the domain of a linear model. One important idea to keep in mind in the context of our model is that learning is going to be slow; I'll tell you that at the outset. As a consequence, the initial beliefs of the forecasters will matter, and an important question is going to be not just whether we can match the anomalies for some value of the initial beliefs, but whether we can do so with reasonable, or reasonably dispersed, initial beliefs. This interpretation is going to be much more interesting if we can explain what happened with reasonably dispersed initial beliefs than if we were assuming, say, a point mass distribution for one of the priors. So the exercise is that we're going to try to minimize the sum of the squared deviations between the
model regressions and the data regressions, while searching over a space of initial beliefs. The space of initial beliefs is determined by two priors, over rho and gamma. It turns out that sigma squared isn't so important for the facts we're explaining, so we fix the parameters of the prior for sigma squared, but we search over the space of initial beliefs for rho and gamma: four parameters, the mean and the variance of the distribution of rho and of gamma. Remember, rho is the persistence of the transitory component in our model, and gamma is the fraction of the variation associated with the random walk component. So the question is going to be: can we find initial beliefs for rho and gamma that rationalize the set of anomalies we see in the data, when we constrain the agents in the model to otherwise update in a Bayesian manner? And the answer is that we can. I'll show you the priors in a moment, and you can think about whether you find these initial beliefs reasonable, but let me first show you the results in terms of the forecasts these agents actually make. Here I'm showing you the same picture I showed you before, for the forecasts going out four quarters, now in the model. You can see there are a bunch of wiggles that the model is not able to fit; however, the model does fit a number of the salient facts relating to autocorrelated forecast errors. If you look, for example, at the episode I discussed in the 2000s, you see that in the model, just as in the data, there's this tendency to repeatedly think that things are going back to normal when in fact the variation is much more persistent than people expected it to be. And the same is true during the Great
Recession: just as in the data, in the model there's this repeated view among the agents that interest rates are going to mean-revert, and this doesn't happen for a very long time. Now, to give you a sense of the variation that our model doesn't fit: during the zero lower bound period in the United States, at some point the Fed starts to use forward guidance, and then interest rate expectations really completely collapse to zero. That's something our model is not going to fit, because the only information we're giving the agents in our model (and this is clearly a simplifying assumption) is the most recent interest rate data, so we're not going to capture the consequences of something like forward guidance. I'm not going to go through this in great detail, but the basic finding is that we can fit the various regression tests I showed you at the beginning. There's a negative bias in the interest rate forecasts in the model, just as in the data; autocorrelated forecast errors in the model, just as in the data; and for the Mincer-Zarnowitz test, remember that for interest rates the coefficient on the forecast was a little bit less than one, and that's true in the model as in the data. Then we look at one additional test, developed by my colleague Yuriy Gorodnichenko together with Olivier Coibion. They regress the forecast error, on the left-hand side, on the update in the forecast, on the right-hand side, and the question is: when forecasts are being updated in the upward direction, does that still tend to be associated with an underestimate? When this coefficient is positive, it means that even though the forecasters are updating upward, they're still too low, so the forecast error is still positive, and they describe this as underreaction. This is something we see for interest rates in the data, and we can also generate it in the model. Now, thinking about the interest rate facts: the first
fact, from the first Campbell and Shiller regression, was that if you regress the ex-post average of future short rates on the yield spread, you get a coefficient very close to zero (actually a negative coefficient at the shortest horizons). The null hypothesis under the expectations hypothesis is that this would be equal to one, but in the data it's much closer to zero than to one, and that's something our model can replicate. For the second Campbell and Shiller regression, again the null under the expectations hypothesis is that beta equals one, but in the data the coefficient is actually around minus one, and that's again something we replicate in the model. Are the initial beliefs required to generate these results reasonable? That's a little bit in the eye of the beholder, but I think we would argue that they are reasonably dispersed. Here's the distribution we're using, based on the hyperparameters for rho and gamma that we estimate to fit the anomalies. The rho prior is centered around a level between 0.7 and 0.8, and here's the gamma parameter. The rho prior does put substantial mass on values that are close to a unit root, and also on significantly lower persistence; if you think about it in terms of half-lives, this would span a pretty wide range. What is the intuition for these results? One way of thinking about it, for the interest rate case, is that if you look at the rho estimates over time, there's a gradual drift upward in how persistent the agents in the model think these interest rate movements are. So relative to their prior, which puts quite a bit of weight on relatively stationary values, over time the agents come to believe in more persistence of these interest rate fluctuations. I'll come back to this in our Monte Carlo analysis, but I think one central intuition is that the priors
that they start with in 1960 actually put too little weight on very persistent movements in the interest rate. Here are the estimates of the state variable; you can see these move around a lot, rising dramatically around 1980 and falling thereafter. Now, what would be the intuition for why agents could be wrong in this particular way? I think Fama, in his 2006 paper, has a nice quote that relates to this. He says: "There was little prior experience with a fiduciary currency when the right to exchange currency for gold was discontinued in 1971, and it is reasonable that the higher inflation and interest rates that followed were a surprise. The experience led market participants to rationally predict that a fiduciary currency" (a currency that is not backed by a commodity like gold) "implied permanently higher expected inflation. In other words, the preceding positive shocks to expected inflation were judged to be permanent. It turns out, however, that the Federal Reserve won a long-odds game," and they learned how to manage a fiduciary currency. The argument he's making is that perhaps one of the things influencing people's beliefs about interest rates was that for most of history, people had been on a gold standard, and under the gold standard, interest rate movements are in fact a lot more transitory. So perhaps some of what happened subsequently, when we saw these much more persistent movements in interest rates, was surprising relative to this long history of countries being on a gold standard with much less persistent interest rate movements. In the paper, we do some additional analysis that considers, for example, a break in monetary policy in 1990, because we know there was a major change in monetary policy around then, and we're able to fit some additional facts related to finance. But let me spend a few minutes talking about our output growth forecasting analysis. This is going to be very easy to follow
given what I've already said about the interest rate case, because we're going to do a very similar exercise, but this time for GDP growth as opposed to interest rates. The data now are from the Philadelphia Fed's real-time data set, and the forecasts are from the Congressional Budget Office. An appealing thing about these forecasts is that they go out five years, but in fact they're very similar to other professional forecasts.

So here's the same kind of hair plot, or whisker plot, that I showed you for interest rates, but this time for GDP forecasts, now with the one additional complication that there's an issue of different vintages of data. The dashed black line is showing you the initial release, and then at each point in time these gray lines are showing you what the forecasts were going forward in time. You can see some of the major challenges that have been faced by the CBO and other professional forecasters on growth.

Look at the late 1990s. During this period, actual growth, as measured by the initial release, was very high, and there was this repeated expectation, as you can see from the gray lines, that growth was going to go back to normal. It wasn't until the very end, in the early 2000s, that the CBO finally started to predict that the high growth we were seeing in the late 1990s was actually going to persist. But unfortunately, almost right around the time they switched from believing this was going to be transitory to believing it was going to persist, it was over: there was the dot-com crash and, in fact, growth fell. So there again you see an intuition for why there can be these autocorrelated forecast errors.

The model that we are using is just slightly more complicated than the model we used in the interest rate case. Output is going to have two components: there's going to be, again, a random walk component and a transitory component, and it's
going to be a random walk with drift, because of course there's positive growth on average, and the transitory component is going to be an AR(2), to replicate some of the hump-shaped dynamics that we see in GDP. But still, it's only a five-parameter model, so it's a relatively parsimonious model of GDP.

We do the same exercise as before: we ask what initial beliefs you would have to have on these parameters to be able to fit the anomalies that we see in the data. And we are actually able to fit a number of these anomalies. I would say that the fit of this model to the anomalies is not quite as good as in the case of the interest rate data, but still we're able to fit a number of the salient facts. In particular, if you look at that episode I was just describing in the late 1990s, where growth is very high and the CBO repeatedly kept thinking it would go back to normal, we see the same kind of dynamics in our model. And remember, in our model these are Bayesian forecasters; they're forced to be Bayesian, at least conditional on the model that we give them.

The most interesting anomaly for the GDP forecasts was the Mincer-Zarnowitz regression. You might remember that the Mincer-Zarnowitz regression had this distressing feature that if you look three years out, running a regression of actual output growth on the forecast for that point in time, the coefficient is actually zero. So a 1% increase in the forecast was associated with no increase at all in actual outcomes. This general pattern is something that we're able to replicate in our model. We don't quite get zero here, but there's this strong downward slope, and in fact we're able to get this very counterintuitive kind of result that the coefficient is actually negative at longer horizons, which means that a higher forecast is actually associated with lower actual outcomes. And we're also able to fit a number of these other anomalies that
I showed you at the beginning. So, for example, the Coibion and Gorodnichenko statistic for the case of output: in fact, the coefficient here flips. For interest rates it was mildly positive, which in their language is associated with underreaction of the forecasts. For output at longer horizons, the coefficient is negative, which in their language is associated with overreaction: the forecasts are revised up, and that's associated with a negative subsequent forecast error, so they updated too much. And we're able to fit this qualitatively.

So what initial beliefs do we need to fit these facts? These are a little bit harder to interpret, because AR(2)s are harder to interpret than AR(1)s, but rho one plus rho two is, I think, one measure of the persistence of the transitory component in the GDP growth model. And you see that, again, from a half-life standpoint, we're putting weight on a pretty wide range of different half-lives. So these are certainly not uniform priors, and I want to emphasize this: the fact that they're informative priors is playing an important role in why we're able to fit these facts. But at the same time, it's not a situation where we're putting a point mass on particular strong beliefs about these parameters.

The last thing I want to talk about is a Monte Carlo exercise, which gives some insight into why it is that we get these results. Here what we're going to do is simulate data from the interest rate model, and we're going to consider three different versions of the truth relative to the initial beliefs. In the first version, we're going to think about a case where the initial beliefs are unbiased, in the sense that they're centered on the truth. This is actually still not full-information rational expectations, because it's not that the forecasters literally know the parameters; they know some distribution around the correct parameters, but that distribution is centered in the right place. And it's going to turn out that even
though this isn't quite full-information rational expectations, all the rational expectations results are going to hold in that case. The second setting we're going to think about is a case of downward-biased initial beliefs. Here we're going to have the same truth, but the initial beliefs are going to be centered around values of persistence which are too low. This is related to the quote I described about the gold standard: the idea that maybe, coming into the Volcker period, people just didn't anticipate how persistent these movements in interest rates could be. And then in the third example we're going to flip things and think about a case where the initial beliefs are upward biased relative to the truth. Here, because the truth we're choosing is something similar to the final posterior means of our parameters, and that is actually relatively persistent, to create this case with upward-biased initial beliefs we're actually going to change the truth to make the truth much more transitory.

So here's a visual description of this. The gray line is the truth and the black is the prior, the initial beliefs. In the unbiased case, the initial beliefs are centered around the truth, although there's still some uncertainty. In the downward-biased case, the initial beliefs imply much less persistence than the truth. And in the upward-biased case, the initial beliefs imply much more persistence than the truth.

What we see from this Monte Carlo analysis is that in the case of unbiased initial beliefs, we basically get back the full-information rational expectations results. So here, I'm looking at the autocorrelation of forecast errors. The null hypothesis under full-information rational expectations would be beta equals 0.
And in fact, we find essentially beta equals 0. But in contrast, if we assume these downward-biased initial beliefs, where people's initial beliefs are that rho implies much less persistence than actually turns out to be in the data, then we can generate autocorrelated forecast errors of the kind that we see in the data. And if we flip things and say, what if people thought things were going to be much more persistent but they actually turn out to be very transitory, then we get negatively autocorrelated forecast errors.

We see the same thing for the various other statistics we have analyzed. For the interest rate case, take the Coibion and Gorodnichenko coefficient for underreaction or overreaction: for the unbiased initial beliefs case, we get 0, as in full-information rational expectations. For the downward-biased initial beliefs, we get a slightly positive number, that's underreaction. And for the upward-biased initial beliefs, we get a negative number, which is overreaction. Similarly, for the various tests of the expectations hypothesis, remember that the null hypothesis here would be beta equals 1. For the unbiased initial beliefs case, we basically replicate beta equals 1. For the downward-biased initial beliefs case, we get a number close to 0, like in the data. And for the upward-biased initial beliefs case, we get a number much higher than 1. For the second expectations hypothesis test, again the null hypothesis is 1, and that's basically what we get for the unbiased initial beliefs case. For the downward-biased initial beliefs case, we get a number close to minus 1, like in the data. And for the upward-biased initial beliefs case, we get a number much higher than 1.

So in this sense, one way of thinking about this is as data reduction.
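To make the mechanism concrete, here is a minimal Monte Carlo sketch in the spirit of this exercise. It is not the model in the paper: it collapses the unobserved-components process to a single AR(1) and replaces Bayesian learning with a forecaster who simply applies a fixed, possibly mistaken, persistence parameter, with illustrative numbers throughout. Even so, it reproduces the sign pattern just described.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a persistent AR(1) "interest rate". Deliberate simplification of
# the paper's setup: one component instead of unobserved components, and a
# fixed belief about persistence instead of Bayesian parameter updating.
T, h = 20_000, 4
rho_true = 0.95
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho_true * y[t - 1] + rng.standard_normal()

def fe_autocorr_beta(rho_belief):
    """Slope from regressing the h-step forecast error on the error h periods earlier."""
    forecast = rho_belief**h * y[:-h]  # forecast made at t for t + h
    fe = y[h:] - forecast              # realized forecast errors
    x, z = fe[:-h], fe[h:]             # error made at t - h, error made at t
    return np.cov(z, x)[0, 1] / np.var(x)

beta_unbiased = fe_autocorr_beta(0.95)   # beliefs centered on the truth
beta_downward = fe_autocorr_beta(0.50)   # persistence underestimated
beta_upward = fe_autocorr_beta(0.999)    # persistence overestimated
```

With beliefs centered on the truth, the slope is essentially zero, as under full-information rational expectations; underestimating persistence produces strongly positively autocorrelated errors, and overestimating it flips the sign.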
We're able to see these different facts through a unified lens: if you have initial beliefs that are biased relative to what actually turns out to be true, then you can get all of these anomalies.

One more piece of intuition here. In this Monte Carlo, we can ask the question: how long does it actually take for people to learn the truth? And it turns out that it just takes a very long time in this kind of model. Here is the simulation, showing rho between 0.9 and 1. You see that there's learning going in the direction of the truth, but even over many decades there's still a gap between what these Bayesian agents in the model believe about rho and what it actually is. And the same thing for gamma. A lot of this has to do with the unobserved components structure. If we turn off updating about gamma, so that there's no more unobserved components structure and there's just one component in the interest rate model, then learning occurs much more rapidly. That's what I'm showing in this dashed line: convergence toward the true beliefs about rho happens much more rapidly than in our model, which has updating about both of these unobserved components.

So let me stop there. The basic argument we make is that these forecast anomalies can be explained by slow learning in a model where there's this unobserved components structure and some important degree of uncertainty about the long run. That's why we called the paper Learning About the Long Run. It's not that we're trying to argue that you could always explain failures of rational expectations using this type of story, but rather that this would potentially matter in situations where there's a significant amount of uncertainty about where things are going in the long run. And our sense is that the interest rate case and the GDP growth case are both cases where there is a lot of that kind of uncertainty.
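The point that the unobserved components structure slows learning can be illustrated with a small filtering exercise. This is only a sketch, not the paper's estimation: the parameter values are illustrative guesses, the other parameters are held fixed at their true values, and the posterior over rho is computed on a grid under a flat prior. It compares how well rho is pinned down when the transitory component is observed directly versus when it is buried inside the two-component process.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate y_t = tau_t + c_t: tau_t a random walk, c_t an AR(1).
# Parameter values are illustrative, not the paper's estimates.
T, rho_true, s_tau, s_c = 300, 0.95, 0.5, 1.0
tau = np.cumsum(s_tau * rng.standard_normal(T))
c = np.zeros(T)
for t in range(1, T):
    c[t] = rho_true * c[t - 1] + s_c * rng.standard_normal()
y = tau + c

def loglik_mixed(rho):
    """Exact log-likelihood of y under the two-component model via a Kalman filter."""
    F = np.array([[1.0, 0.0], [0.0, rho]])
    Q = np.diag([s_tau**2, s_c**2])
    H = np.array([1.0, 1.0])
    x = np.zeros(2)                              # state estimate [tau, c]
    P = np.diag([25.0, s_c**2 / (1 - rho**2)])   # loose initial uncertainty
    ll = 0.0
    for obs in y:
        x, P = F @ x, F @ P @ F.T + Q            # predict
        S = H @ P @ H                            # prediction-error variance
        v = obs - H @ x                          # prediction error
        ll += -0.5 * (np.log(2 * np.pi * S) + v**2 / S)
        K = P @ H / S                            # Kalman gain; then update
        x, P = x + K * v, P - np.outer(K, H @ P)
    return ll

def loglik_direct(rho):
    """Log-likelihood when the AR(1) component c is observed on its own."""
    e = c[1:] - rho * c[:-1]
    return -0.5 * np.sum(np.log(2 * np.pi * s_c**2) + e**2 / s_c**2)

# Grid posterior over rho under a flat prior, other parameters fixed at truth.
grid = np.linspace(0.5, 0.999, 200)
def posterior_std(loglik):
    ll = np.array([loglik(r) for r in grid])
    w = np.exp(ll - ll.max()); w /= w.sum()
    m = np.sum(w * grid)
    return np.sqrt(np.sum(w * (grid - m) ** 2))

s_direct = posterior_std(loglik_direct)  # spread of beliefs, c observed directly
s_mixed = posterior_std(loglik_mixed)    # spread of beliefs, only the sum observed
```

The posterior over rho is far more dispersed in the mixed case than in the direct case, which is one way of seeing why Bayesian agents learn the persistence parameters so slowly when they only ever observe the sum of the components.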
And looking at the historical data shows you these large secular shifts that it would be reasonable for forecasters to worry about going forward. OK, let me stop there and take some questions.

Fascinating presentation, Emi, thank you. You have already a couple of questions. One question has two parts to it. Let me read out the first part, then maybe you can react, and then we go to the second one. Romer and Romer (2000, AER) demonstrate that the Federal Reserve has considerable information about macroeconomic variables, for example inflation, beyond what is known to professional forecasters. Do you find that this is still the case? If yes, would it be optimal for central banks to release this information?

So I don't think that our results particularly speak to that question. The forecasts that we're using for GDP growth are from the Congressional Budget Office, so that's an arm of the government, of course, not quite the Fed. Over short horizons, those forecasts are actually very similar to private sector forecasts. There might be some small differences, but I think they're relatively small. So I don't think our results directly speak to that issue. However, as a more general matter, I think this is a very interesting question. I've myself done other work on this question of the Fed information effect, and I think it's an area that should be explored more.

OK, so let me move to the second part, on GDP forecasts. As you have shown, data revisions are very big, due to benchmark revisions by statistical agencies or advances in econometric techniques. Could the difference between forecasting anomalies for GDP and inflation, where there is no data uncertainty, give a measure of the importance of data uncertainty?

Yes. So, of course, for interest rates there isn't this updating, as you mentioned, but for GDP there is, and this is why we're using this real-time data.
So the information that the agents in our model are receiving is the initial release of GDP. But in my graphs you could see, just as you're saying, that there's often quite a large gap between the initial release and the final release, and I think that's an important issue as well, and very important in understanding the forecasts. If you looked at those hair plots that I showed, you would be completely confused if you were looking at the updated statistics, but they make sense in the context of the statistics people had at the time.

OK, there's another one from Sarah Holton. In general, professional forecasters are better at forecasting inflation and GDP than interest rates, especially long-term interest rates. Would this mean that the ultimate drivers of the forecast anomalies are different?

I'm not sure. We're arguing that, regardless of the overall level of forecast errors for these two variables, which clearly has to do with the amount of persistence to some extent, there could be common factors that are important for both. Most importantly, we're arguing that if there's this long-run component, which has been emphasized in the case of interest rates and I think also in GDP growth, then the question is: where are interest rates going in the long run? Is there going to be a secular decline in interest rates? Is there going to be a secular decline in growth? Changing views about those questions will affect forecast errors. For example, if you look at the last year, it might take some time for people to become convinced that the inflation is really persistent, and that kind of learning can play an important role in the forecast errors.

Thank you. Last question. Great presentation, so that's the first exclamation. The econometric literature has come up with tests that identify the horizon as of which model forecasts essentially become completely uninformative. For some variables, that horizon is going to be fairly short.
Can your framework say something about when those horizons are reached, i.e. when forecasting based on models doesn't really make sense anymore?

That's a good question; I will have to think about that. In our model, I think the main message would be this. You look at those CBO forecasts and you see that disturbing result that if you look five years out, you see a zero coefficient on the forecast if you're running a regression of actual outcomes on the forecast. And you ask yourself, does that imply that the CBO is doing something wrong? Actually, I think one of the consequences of our analysis is that, no, it doesn't necessarily imply that the CBO is doing something wrong, because even a Bayesian agent, facing a world in which they just don't know the parameters and are learning about the parameters, could generate data that looked like that: they're actually using all of the information efficiently, but they're just wrong about something, in the sense that their prior was centered on the wrong place, and so as a consequence they didn't have unbiased forecasts ex post. But of course, ex ante, nobody has a crystal ball; how do you know where to put your initial beliefs if you haven't seen the future yet? So I actually think that in certain ways our results are a sort of defense of mistakes that were made: even when they look persistent ex post, they could be the consequence of an agent that is actually using the information efficiently.

Now, one question you might ask, and which we asked ourselves, is: would it be better to just have a completely agnostic prior, like a flat prior? Would that work better? An interesting fact is that it actually doesn't work better from a forecasting standpoint.
We have done some analysis of this. In our model, in some sense, some of the anomalies come from the fact that there are these informative priors, so what if you used flat priors? One thing that will probably be unsurprising to those of you who are forecasters is that using flat priors is very problematic too, just because of all the overfitting problems. So there's a logic, from a forecasting perspective, that you want to shrink toward some values of the parameters, even if you don't know the right values ex ante. But when you do that, it can actually lead to this evidence of bias. And I think one of the consequences of our analysis is to say that that doesn't necessarily mean you're not using information in an efficient way.

Thank you, thank you, Emi, very much for your presentation, very well received by everybody. Let me close the session and the conference at this point. On behalf of the ECB, I would like to formally thank all the speakers for their contributions, all the participants for the questions and debate, and especially the organizers for putting together a terrific conference from which we all learned; at least I learned a lot. Thanks a lot and see you next year.