Thank you very much to the organizers for having me here to talk about part of my research agenda with Minchul Shin and my former student Alexey Khazanov. If you are searching for talent, I beg you to look at him, because he is underplaced at the moment. It is wonderful to be at a conference where I do not need to convince the audience that thinking about nonlinearities is important. I have been trying to do this for the last 20 years, and it is always an uphill battle with a general macro audience, but here it seems it will be an easy task. So let me talk about what we do in this project. We think about dynamic factor models, which you know are among the most useful time-series tools we have, both in academia and in central banking. The standard dynamic factor model establishes a linear relation between the factor today and the factor yesterday, and between the factor and the observed variables. There are twists around that, with Markov switching and time-varying parameters, but most of the literature has concentrated on the linear case. As we have seen in the previous presentations, over the last 15 years we have experienced a lot of tail events that introduce outliers into the data, so we are forced to rethink our models in terms of how we handle these nonlinearities. Stochastic volatility is one device people have used to deal with the COVID episode, for example. But on top of this we have binding constraints. One example is the zero lower bound. If you think in terms of probabilities, for example the probability of moving from employment to unemployment, that also imposes bounds on the data. So we need to think about how to use this toolbox once these sorts of nonlinearities kick in. That is what we do here. In this model we are going to start from a very abstract formulation of the relation between the factor and its own past, and between the factor and the data that we observe. Obviously, in full generality that is a very ambitious agenda; it would go along the lines of what Maximiliano was doing, trying to take a nonparametric approach. We are more humble, in the sense that we try to learn from what people have done in the structural macro literature. In particular, we are going to use some results from the solution of dynamic stochastic general equilibrium models: the representation coming from a second-order perturbation of these nonlinear models is the baseline for our dynamic factor model. What is interesting is that it is a very simple model, but out of this simple model we can generate some of the features we have seen in the previous presentations. For example, we can generate asymmetric impulse response functions, in the sense that a positive shock has a different impact than a negative shock. They are also state dependent: in a financial crisis the same shock has different implications than in normal times. The predictive distributions that I am going to show you display departures from normality. Time-varying volatility emerges as an endogenous element of the model without introducing stochastic volatility per se. And, if I have time, I will talk a bit about the asymmetric tail behavior.
We use two applications that highlight two different flavors of our model. The first one is a nonlinear credit cycle, where I will show you that as we move into the 2000s and more recently, the nonlinear component of the financial cycle becomes more important; if you look at the 90s and the 80s, you do not need to rely on our machinery to think about the data. The second one is an exercise where we try to extract a shadow rate, explicitly imposing the zero lower bound restriction on the short-term rates. That is what we do. What the second exercise emphasizes, or tries to show you, is that it is not that hard to do this nonlinear estimation once you impose the bounding constraints that come, in this case, from the zero lower bound. There is a big literature, and some of its authors are here, so let me skip the review and go to the model. This is how we start. We postulate that the factor today depends on the factor yesterday via an unknown function H. That is the highest level of abstraction we can take, and, coming from a structural macro training, the natural way for me to think about this nonlinear relation is to use a second-order approximation of the mapping between the factor and its own past. Now, if you have worked with this type of approximation, you know that a plain second-order approximation to the nonlinear dynamics introduces explosiveness into the model. So what do you do? You separate the factor into two components: a first-order component, which evolves as the standard linear factor you have been using all along, and a second-order component, which is the one that tracks the nonlinearity. Notice that what is critical for the stability of the model is that the second component depends on the square of the linear component; if you do not do this pruning trick, the system is not stable. This is something we learned from the work of, among others, Fernández-Villaverde and Rubio-Ramírez. If you impose this restriction on the autoregressive process, as we do, you can show that the factor has a rich nonlinear structure. As you can see, the volatility depends on the past, so the model has time-varying volatility that in our case emerges naturally, without having to impose it. And because of the dependence on the h_x and h_xx terms of the system, the shocks have asymmetric effects that also depend on where you are in the cycle of the economy. The first approach to take this model to the data is to use the nonlinear factor dynamics and impose a linear relation between the factor and the observables; this is what we do in the financial data application. When we deal with the zero lower bound, we replace that linear relation: we have a max operator, the maximum between what the factor predicts the interest rate should be and zero. The moment the prediction goes below zero, the data are bounded at zero, yet we leave the shadow rate to be informed by the longer maturities, and as you will see in a moment, the longer maturities are quite important for what the shadow rate should do during zero lower bound episodes.
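To fix ideas, a stylized version of the pruned second-order system and of the two measurement options just described can be written as follows. The notation (h_x, h_xx, lambda_i) and the placement of the measurement error inside the max operator are illustrative choices of the editor rather than the paper's exact specification:

```latex
\begin{aligned}
f_t &= f_t^{(1)} + f_t^{(2)} && \text{(total factor)}\\
f_t^{(1)} &= h_x\, f_{t-1}^{(1)} + \eta_t, \quad \eta_t \sim \mathcal{N}(0,\sigma_\eta^2) && \text{(first-order, linear component)}\\
f_t^{(2)} &= h_x\, f_{t-1}^{(2)} + \tfrac{1}{2}\, h_{xx}\, \bigl(f_{t-1}^{(1)}\bigr)^{2} && \text{(second-order, pruned component)}\\
y_{i,t} &= \lambda_i\, f_t + \varepsilon_{i,t} && \text{(linear measurement: credit application)}\\
y_{i,t} &= \max\bigl(\lambda_i\, f_t + \varepsilon_{i,t},\, 0\bigr) && \text{(censored measurement: ZLB application)}
\end{aligned}
```

The pruning is in the third line: the quadratic term is built from the lagged first-order component only, which is what keeps the system from becoming explosive.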
We impose normality on the innovations to the system. The reason is that, as I will show you in the MA decomposition, our system is already quite rich, so we do not want to blur the richness of the model with additional nonlinear elements like time-varying parameters or stochastic volatility. There is nothing in the approach that keeps you from doing that; here, for clarity, we keep things simple. So the model has this very nice state-space representation, and if you stare at it for a moment you can immediately see some of the nice characteristics of the system. For example, you get asymmetric responses to shocks. Why? Because the system depends on the square of the shock: if the shock is positive, the two terms reinforce each other, but if the shock is negative, the quadratic term undoes some of the effect of the linear term. You can also see the time-varying volatility component already in there, and other nice features, for example the dependence of the impulse response on where the economy is at a particular time, because the factor today depends on where the factor was yesterday, which in turn is informed by this part of the decomposition. So let me go through some of the properties of the model from numerical analysis. Here we plot generalized impulse response functions. The left graph is for a positive shock, the right graph for a negative shock. The blue line is how the factor behaves after these innovations, and we then decompose it into the first-order element, the one you would get out of a standard dynamic factor model, and the dark solid line, which is the contribution of the second-order component. As I mentioned before, you can see that depending on the sign of the shock these first- and second-order components either reinforce each other, that is the left-hand side, or fight each other, so there is some dampening if the shock is negative. Why? Because the quadratic term gives you a positive effect. The next example shows what the model generates depending on where you are in the cycle, so now I condition on where the economy is: for the blue line the linear factor is at 0.56, for the dotted black line it is at 3.33, values chosen just as an example. Again, the positive shock is on the left and the negative shock on the right. You can see a very nice feature of the model: where you are in the cycle can produce an endogenous propagation effect where you need only one shock and then you start getting these humps in the data. If you think about the European debt crisis, for example, the spreads that kept increasing during 2011 would, in our model, emerge from one shock, with the internal propagation doing the job for you. So this is another feature of the model, and it is quite tractable. As people have already discussed in the previous presentations, we can talk about the shortfall or the longrise of the predictive density, and this model generates asymmetric behavior in response to these shocks, as you can see at the bottom of this figure.
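As a side note, here is a minimal sketch of how generalized impulse responses of this kind can be computed by simulation, in the spirit of Koop, Pesaran, and Potter, assuming the stylized pruned system above; all parameter values below are made up for illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_pruned(f1_0, f2_0, shocks, h_x=0.9, h_xx=-0.4):
    """Simulate the stylized pruned second-order factor for a given shock path.

    Returns the path of the total factor f_t = f1_t + f2_t."""
    f1, f2 = f1_0, f2_0
    path = []
    for eta in shocks:
        f2 = h_x * f2 + 0.5 * h_xx * f1 ** 2   # quadratic term uses the *lagged* linear part
        f1 = h_x * f1 + eta                    # standard linear factor dynamics
        path.append(f1 + f2)
    return np.array(path)

def girf(f1_0, f2_0, shock_size, horizon=20, n_sim=5000, sigma=1.0):
    """Generalized IRF: mean difference between shocked and baseline paths,
    conditional on the initial state (f1_0, f2_0)."""
    diff = np.zeros(horizon)
    for _ in range(n_sim):
        eta = rng.normal(0.0, sigma, size=horizon)
        eta_shocked = eta.copy()
        eta_shocked[0] += shock_size           # impulse hits in period 0
        diff += simulate_pruned(f1_0, f2_0, eta_shocked) - simulate_pruned(f1_0, f2_0, eta)
    return diff / n_sim

# Sign asymmetry and state dependence: same machinery, different shock signs / starting points.
print(girf(0.56, 0.0, +1.0)[:5])   # positive shock, linear factor at 0.56
print(girf(0.56, 0.0, -1.0)[:5])   # negative shock, same state
print(girf(3.33, 0.0, +1.0)[:5])   # positive shock, linear factor at 3.33
```

Comparing the output for positive versus negative shocks, and across starting values of the linear component, reproduces the sign asymmetry and state dependence described above.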
You can see that in this particular exercise a positive shock pushes the distribution to the right and increases the variance of the distribution, which means that the longrise increases by more than the shortfall. So let me conclude this part by showing you the predictive densities from the model. At time zero you get the shock; the distribution is normal because the nonlinearity kicks in with a one-period lag, and as you move to the second period you can see the predictive density start to depart from normality. In this particular exercise there is not much persistence, and by period three you are back to the original pre-shock distribution. How much time do I have? As I said, we have two different applications that highlight two different flavors of the model. The applications also highlight some of the technical challenges, which I am sure you know better than I do, that arise when you try to estimate these models. Because of the nonlinearity of the model, of the particular type that we choose, these models need to be estimated with a nonlinear filter, and here we use two different filters. For the first application, credit growth in the US, we use a relatively new algorithm, particle Gibbs with ancestor sampling, PGAS for short, which is very useful in stochastic volatility models, for example. The advantage is that it is relatively efficient computationally, in the sense that you do not need that many particles to do the estimation; the results you are going to see use a thousand particles. On the other hand, once we go to the interest rate data, we impose the max operator on the relation between the factor and the observables, and for that one we use Metropolis-Hastings with a standard particle filter. The advantage is that the bootstrap filter is very well known and there are theoretical results about convergence, in this case to the posterior distribution of our model. The disadvantage is that it needs a lot of particles: in this exercise, because of the nonlinearity introduced by the max operator, we are talking about on the order of 10^5 particles, which is extremely expensive. So we have a trade-off: capturing these sharper nonlinearities requires the standard particle filter, which has to be implemented in a lower-level language because it is so computationally demanding, versus the other approach, which is easier to implement; that one we implemented with pretty good results in MATLAB. Let me talk about the applications, starting with the nonlinear credit cycle. We take data from the US, financial data, credit growth to be more precise, coming from different sectors of the US economy. This picture plots the dynamics of these credit growth variables, and you can see that it is typically around the crises, COVID, the Great Recession, and previous recessions, that we see the reversals in credit growth that carry strong nonlinearities. We estimate the model. In the top panel, the nonlinear factor model is in red and the linear component of the factor model is in blue; the second panel below is the second-order element of our factor decomposition.
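Returning for a moment to the estimation discussion above: here is a minimal sketch of the kind of bootstrap particle filter log-likelihood that the Metropolis-Hastings step would repeatedly evaluate, with the zero lower bound handled through a Tobit-style (censored) measurement density. Function and parameter names, and the exact treatment of the measurement error, are simplifying assumptions of the editor, not the paper's implementation:

```python
import numpy as np
from scipy.stats import norm

def bootstrap_pf_loglik(y, lam, h_x, h_xx, sig_eta, sig_eps,
                        n_part=100_000, zlb=0.0, seed=0):
    """Bootstrap particle filter log-likelihood for the stylized pruned factor model
    with a censored measurement y[t, i] = max(lam[i] * f_t + eps, zlb).

    y is a (T, N) array of observed rates, lam an (N,) vector of loadings."""
    rng = np.random.default_rng(seed)
    T, N = y.shape
    f1 = np.zeros(n_part)
    f2 = np.zeros(n_part)
    loglik = 0.0
    for t in range(T):
        # Propagate every particle through the pruned transition.
        f2 = h_x * f2 + 0.5 * h_xx * f1 ** 2
        f1 = h_x * f1 + rng.normal(0.0, sig_eta, size=n_part)
        f = f1 + f2
        # Measurement weights: Gaussian density away from the bound,
        # censoring probability when the observed rate sits at the bound.
        logw = np.zeros(n_part)
        for i in range(N):
            mean_i = lam[i] * f
            if y[t, i] > zlb:
                logw += norm.logpdf(y[t, i], loc=mean_i, scale=sig_eps)
            else:
                logw += norm.logcdf(zlb, loc=mean_i, scale=sig_eps)
        # Likelihood increment and multinomial resampling.
        m = logw.max()
        w = np.exp(logw - m)
        loglik += m + np.log(w.mean())
        idx = rng.choice(n_part, size=n_part, p=w / w.sum())
        f1, f2 = f1[idx], f2[idx]
    return loglik
```

A random-walk Metropolis-Hastings chain over the static parameters would call this function at every proposal, which is where the cost of the roughly 10^5 particles mentioned above comes from.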
Coming back to the figure: before the 1990s, or even up to the IT boom, you can see that there was no need for the nonlinear component; everything is about the linear factor model. That explains a bit why, when we started this nonlinear business with my co-authors Juan and Jesús, we had a very hard time convincing people, because back then you really did not need a rich nonlinear model to explain what you see in the data. However, as you move into the 2000s, and especially around the financial crisis, you can see that the second component of the factor model becomes statistically significant and becomes important. Here we illustrate again the properties of the model. For example, we show the impact of a positive shock and a negative shock when you are in a credit boom: there is not much happening there, what the nonlinear model predicts relative to a linear model is pretty much the same. As we move to a credit crunch, around 2008, you can see that the linear model predicts something different from what is actually going to happen to credit growth: an adverse shock, for example, makes credit growth contract by quite a lot compared to what a linear model would predict. Notice that the hump you see here is an endogenous element of the model: we do not need a sequence of shocks to produce it. It is just one shock at time zero, and then the model internally tells you this is what happens. We think this is important from a policy point of view, because if it is just one shock rather than a sequence of shocks, that arguably calls for a different action by the central bank, since the hump comes from the economy's internal propagation rather than from a sequence of adverse shocks. Again, this shows that the model can generate endogenous stochastic volatility and also this asymmetric behavior in the tails of the predictive distribution. In the last five minutes, let me talk about the second exercise. Here we take interest rates of different maturities, and the task is to extract a shadow rate: an indication of where the short-term interest rate of the economy would be if the zero lower bound were not binding, as it was in the US. We build forward rates using the methodology of Wu and Xia; let me skip the details, we can talk later. I keep the second-order factor structure that I mentioned before, and on top of this, although you do not see it here because it is imposed in the background of this particular implementation, we impose the zero lower bound restriction: if the factor says the three-month forward rate has to be below 0.3%, we cap it at that level. What this application highlights is the importance of the max operator in the model. You will see that, at least in this interest rate application, the second-order component is less important: you can get away without it, impose the max operator, and get very similar results. That is why we like this application, because it highlights that there is no one-size-fits-all model. Here you have the factor from our nonlinear decomposition. You can see that before the Great Recession the nonlinear factor model and the linear factor model predict something very similar, which is not surprising given that there were no important nonlinearities in interest rates.
As we get into the zero lower bound period, the gray shaded area, you can see that the linear model cannot capture it, because it is not aware that there is a zero lower bound. However, because we impose the zero lower bound and let the long rates inform how the factor has to behave, you can see that we capture important dynamics. It is easier to see the results in this last picture. The blue dotted line is the shadow rate from the Wu and Xia model; the red dotted line is what you get out of our model; and there is another line, capped at zero, which is the three-month interest rate for the US. As you can see, our model is quite parsimonious, yet we get very close to the Wu and Xia shadow rate without having to impose all the restrictions they need to make the estimation of their model feasible. Importantly, if you look at the bottom of the figure, at the 10-year rate, you can see in this hump that the dynamics of the factor at the zero lower bound are informed by the longer-term rates. Again, this is possible because we respect the zero lower bound, yet the factor still loads on the 10-year rate, because the 10-year, and even the 5-year, rates are not at the zero lower bound, so there are rich dynamics to inform how the shadow rate should behave. We did a likelihood ratio test and found that the nonlinear component of the factor structure is not necessary in this application; it is the max operator capping the interest rate at zero that is critical in this representation. To conclude, we propose a class of nonlinear dynamic factor models. We hope the applications show you the richness and flexibility of the approach: the model delivers the asymmetric behavior and state dependence of impulse responses that we have seen in previous applications and papers, and we think it should be part of our toolbox for forecasting and for more structural analysis. Thank you.

Thank you, Pablo. The discussant is Matteo Jacobini.

Okay, thank you. First of all, it was a real pleasure for me to read this paper and to have the chance to discuss it now. A little disclaimer: the author was so thorough that many of the things I had been planning to raise he has already answered, so my discussion will be a bit shorter, but I will try to add something on top of it. First, to recap a little what he has been talking about: we are looking at a proposal for a new nonlinear dynamic factor model, cast in state-space form, whose key feature is that both the measurement equation and the state equation are allowed to be nonlinear functions. This was constructed to allow for specific features in the impulse response functions of the latent factor: asymmetric responses to positive and negative shocks, state dependence, and size dependence, so several different notions of asymmetry. The two key applications we have seen are to US data: the recovery of the shadow rate during the zero lower bound period, and the credit cycle.
Once again, the key point of all this, in my view, is the specification of this very nice nonlinear dynamics, which, as we have seen in the different applications, can enter the measurement equation and/or the state transition, jointly or separately. So let me go very quickly through some of my comments. As mentioned, some of them are not that useful anymore, because he has already answered them. But one key suggestion to the author is to be as specific in the paper about what they are doing as he was during the talk, because I do believe this paper is really, really cool and the things they are doing are very nice, but sometimes they are a little overshadowed in the paper. I think you should stress them more, because you have a lot of very nice things here. In my view, once again, this double nonlinearity, which is not mandatory, you can turn it on and off according to your application, as you mentioned over the last few slides, is really, really nice, and I have a comment on this later, so I will not spend too much time on it now. But one of the key things I would suggest is to highlight a little more the differences and similarities between your specific framework and the two key papers on the pruned approximation that you were citing. You have some of this discussion during the talk, but I suggest you spend some more time on it, just to avoid any lack of clarity about what is new in your work and what comes from those papers. You mention this pruned second-order approximation of the state space, which is really nice, but my point is: does this approximation come directly from those papers, or is it an approximation that you apply to a model of your own? How do the equations in your model relate to the contributions of those two previous papers? What is the difference here? And in terms of the specification of the model in its most general form, say for generic G and H functions representing the measurement and state transition equations: they are allowed to be generic nonlinear functions, but I think that if you provide some more intuition about the specific kinds of nonlinearities you can take on, or can deal with, it is going to be a little nicer and clearer. On this side, in one application you use a max function to allow for the lower bound, which is one particular case, but did you try alternative forms of nonlinearity in the measurement equation, just to get an idea, in a simulation setting, of the performance of your methods? This is an open question. Another point I have is that, in your generic framework, Y stands for your observables and F for your factor: what is the dimension of these two objects? In the real-data applications F is fixed to one, so you have one factor, the shadow interest rate or the common driver of the credit cycle, but in the theoretical framework do you restrict F_t to be exactly a scalar, or is it allowed to be larger? I have a further comment on this in the next slide. Going forward, this is where I have a few more questions: you propose two different algorithms for the estimation of your model.
You have a nice slide in the presentation highlighting the particle Gibbs and the Metropolis-Hastings with a bootstrap particle filter. My question here, a very simple one to ask at least, is: why do you have two of them? You mention that the computational cost of the two is different, that one is approximate and the other is not, and you refer to the different numbers of particles, which was also one of my questions at the beginning. My point is: did you test both of them on the same DGP in a simulation study, to highlight the critical issues of one compared to the other? If so, what are the problems of one relative to the other? One is approximate, fine, but what is the loss you incur by just going for the quickest solution? A second question, which is not listed here: since at the end of the day you are using a bootstrap particle filter, and this class of algorithms is by design quite general and allows for essentially any kind of state-space model, without relying on a particularly simple or complex measurement or transition equation, maybe you can leverage the fact that you are already using a bootstrap filter to allow for more complex models eventually, since your inference approach does not really depend on the simplifications you have right now. Then some technical details about the design of the method, which may be a bit tedious for the audience, so we can discuss them later. My main question here is: why do you need such a long MCMC, with very long burn-in and thinning? My guess is that there is a lot of autocorrelation inside, but since there is not just a single parameter, what is the main source of the autocorrelation? Is it a single parameter or pairs of parameters, and if so, which ones? That is an additional question. And one very important point that I skipped before is the parameter h_xx. I really love this approach essentially because of this parameter: it is a scalar that essentially tells you whether your state transition is linear or nonlinear; if it goes to zero, you are back to the linear case. It is beautiful, because you are constructing a potentially nonlinear state transition as a one-parameter extension of the linear case. But my question is: why not leverage the Bayesian framework and put a shrinkage prior on this parameter, to learn it from the data, shrinking toward the linear case, which is the simplest, most intuitive one, the one that, as you mentioned, people are most prone to accept, and deviating from it only if the data tell you to? Right now you have a fairly standard Gaussian prior with a somewhat large variance, but since the parameter is a scalar, the computational burden would be small, and you would get a data-driven assessment of the nonlinearity of the state, if it is necessary. This is one of my main questions. And then, going multivariate, and I am done: going multivariate in terms of the factors, from one factor to more than one, is it feasible, theoretically speaking, in terms of the model design? And if so, computationally speaking, do you envisage any particular trouble in doing that, and if so, what kind?
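For concreteness, the shrinkage just suggested could be as simple as a spike-and-slab mixture prior on the scalar h_xx: a tight component centered at the linear case mixed with a diffuse slab. The sketch below only illustrates the discussant's suggestion, with made-up hyperparameters, and is not anything estimated in the paper:

```python
import numpy as np
from scipy.stats import norm

def log_prior_hxx(h_xx, p_linear=0.5, spike_sd=0.01, slab_sd=1.0):
    """Spike-and-slab log prior for the scalar h_xx: the 'spike' concentrates
    mass near 0 (the linear model), the 'slab' lets the data push the state
    transition toward a genuinely nonlinear dynamic if warranted."""
    spike = p_linear * norm.pdf(h_xx, loc=0.0, scale=spike_sd)
    slab = (1.0 - p_linear) * norm.pdf(h_xx, loc=0.0, scale=slab_sd)
    return float(np.log(spike + slab))
```

The posterior weight on the spike component would then give a data-driven measure of whether the nonlinearity in the state transition is actually needed.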
Then, just quickly, because I am running out of time: for the credit cycle application, at least in the version I read, I struggled a little to understand which of the different competing models is the best. It is really clear from the pictures and the narrative what your model is doing, but sometimes I struggled with the very practical question: should I go for model A or model B, in a more objective way than just looking at pictures? There is something like this in the other application, the shadow interest rate, where you have a marginal likelihood computation; maybe there is something you can do in this case as well, but I was unable to find it in the paper. And finally, for the marginal likelihood computation you use a harmonic mean estimator: did you run into any numerical issues working with that kind of estimator? Sometimes it is not that easy to work with, so I was wondering whether there was any numerical trouble there. I have other questions I can ask later, to let the rest of the audience speak. By the way, it was really nice, I enjoyed reviewing it, and thanks a lot for your presentation once again.

Let's take a few questions from the audience as well.

Thank you, this was a great paper and a great discussion. I wanted to follow up on one of Matteo's comments, about the relationship with DSGE models. You use the word "inspired" in your slides: can you establish a mapping between DSGE models solved at second order and your framework, and what can we learn relative to that? It would be interesting, because one of the issues we have with our stochastic volatility frameworks is that they do not have a direct connection with the DSGE model: using those models for prediction, or to find statistical moments or properties that I can then take back to my DSGE, is not immediate. So I was wondering whether that could be a strength of your approach.

Hi, I had some questions on your applications. On the first application, I was wondering whether you looked at the coefficients, in particular whether the coefficient on the quadratic term is positive or negative, and in general whether you compared the linear and nonlinear specifications and could come up with an interpretation of the result in terms of some economic story. The second question: normally those shadow rates differ from the actual rates when you are at the zero lower bound, but in your case they seem to differ over the entire history as well, so I was wondering what the story behind that is.

Pablo, I think it is a very nice paper and I have a very technical question. You did not mention the underlying variables you use in the credit model and in the interest rate model; for the interest rate model you talk about different interest rates, but for the credit model I would like to know which underlying variables you use, at least I did not see them in the presentation. And also, with respect to this approximation to the DSGE, I would like some clarification about the second-order approximation:
basically, did you try others? Would a third order give you different things? Or, if the second order is not important enough, perhaps you do not need it, or perhaps a third order could give you more room in terms of positive and negative shocks, so you could drop the second order and use the third order instead. I do not know; that is something I would like to know.

Thank you, very nice presentation. I liked the application to interest rates a lot, which is quite important given the zero lower bound period, and the likelihood ratio test result is quite reassuring for the whole literature. I was just wondering about the linear model: how is that factor identified, and how can it become flat? If you are just using all the forward rates, it is not restricted to follow the federal funds rate, it should be some kind of average, right? So I was wondering how this was identified.

So, Pablo?

Okay, thank you, Matteo, for the amazing discussion. I have to apologize, because we sent you an older version of the paper where a lot of these points are not yet addressed, so I apologize for that. Some of the issues you raise we pushed to the appendix, and I think we will have to bring them back to the main text to clarify them. Going back to the question about which variables we use: we use credit growth for different sectors of the US economy, five credit growth variables covering different sectors. Now, a common question raised by Matteo, Dario, and other people in the audience is the relation between our model and a structural DSGE model. That is an excellent question. In our representation, think about the h_xx term in our model: this second-order term will be a convolution of, for example, stochastic volatility in your model, or quadratic investment adjustment costs. So h_xx is a convolution of those elements; it is a kind of semi-structural analysis, and we cannot go as deep as you can go with the DSGE literature. We think it is a very useful approach now that we have these nonlinear DSGEs, because what people have done is to take a linear VAR with some form of nonlinearity, like conditioning on where you are in the business cycle, get an impulse response, and then try to match it; for me, there is no theoretical argument that says the empirical impulse response should match the theoretical impulse response. We think our formulation is a step toward bringing these nonlinearities from a reduced-form model to discipline the structural, fully nonlinear macro models that we have. That is the way we want to think about going forward. Another question is about the estimation of the h_xx parameter. Here we need a lot of particles, or these very long chains, because if you look at the MA representation of the model, this term introduces the endogenous time-varying volatility, which then starts to compete with the variance of the measurement error and the variance of the linear component of the factor. It took us a while to figure out; once we got nice convergence, it was clear that we need the long chains because this parameter was having identification issues with one of the variances of the model. That is one reason we need these very long MCMC chains. About the second versus third order approximation: well, if you have worked with
structural models, you know that the second order is already a stretch, because it is a Taylor expansion, so if you go to a third-order approximation the additional terms are going to be even tinier, and thus even less relevant than the second order. Related to this point, we did think about a model with more than one factor. We do two things; we apologize, this is in the appendix. The first is that we did a plain PCA and extracted the first two principal components, and we find that they line up with our first and second components, with a strong correlation between them; for that reason we did not push hard on going to a two-factor model. One application we thought about at some point is to have both macro and financial data in the system, where you could think of different factor structures, so that is a potential application, but again it pushes the computational limits of what we can do. I hope I have answered most of the questions. Thank you very much, Matteo, and thank you all for the questions.