Welcome to this new session at the useR! 2021 Conference, Session 4A, Trends in Markets and Models. Today you have three wonderful talks that will be delivered during this session. I hope you will enjoy them and learn a lot from the speakers. My name is Mouna Belaid. I am an engineer in statistics and data analysis, and I am a co-organizer of the Tunis R User Group. I am very pleased to chair this session and to take part in this volunteer work at the useR! 2021 Conference.

The first talk will be delivered by Harald Puhr, and he will cover how to measure global trends using Google Trends and the globaltrends package in R. Let me just introduce Harald: he is a research associate at WU Vienna in Austria, where he is part of the Department of Global Business and Trade. His research focuses on topics in global strategy and international finance, and he mostly uses R for data cleansing and statistical testing. I am sure that you will learn a lot from him. Just a reminder that there will be live captions in this session, which you can enable or disable on Zoom. One last thing: please don't hesitate to leave your questions in the chat, or share them in the corresponding channel on Slack as you go along, so that this session can be as interactive as possible. So I think the screen is yours, Harald.

Thanks a lot. Today I am presenting a project I have been working on with my colleague Jakob Müllner, also from WU Vienna. I am presenting the globaltrends package, which is an approach to access and analyze data from Google Trends in order to get an idea about global trends and what is going on in the data provided by Google. In terms of agenda, I would like to answer three quick questions today. First of all, what does globaltrends do, in terms of functionality? Second, what can you do with globaltrends? Here I will be talking about applications. And then, why should you care? This is something the reviewers have been pestering us about: what do we contribute to the community in a wider sense?

So what does globaltrends do, and why would you need this package? First of all, the globaltrends package allows you to access and download data from Google Trends, which lets you analyze the dispersion and the development of global trends — and this can be really anything, I will talk about it a bit later on — with data from Google Trends. We include functions for downloads, functions for computing and preparing the data, functions for exporting, but also some plot functions. And since this involves loads of data, also in terms of various categories, the data is stored in an SQLite file, which is a very easy way of storing and sharing the data. What we have been thinking of is basically a one-stop solution for working with data from Google Trends: you can download, compute, export, and visualize, all with this one package. The package is available on GitHub, and on GitHub I have also uploaded the presentation and a workable code example, so that you can see how the package works and what steps you have to take.

I don't think I have to tell you much about Google — Google really has loads of data — and I assume some of you already know Google Trends. This is what the Google Trends portal looks like: you enter a search term and you get some data.
I have entered the Super Bowl, and within your search query, Google normalizes the maximum value of your search query to 100. As you can see — that is the green dot — during the Super Bowl weekend the search volumes for the Super Bowl in Austria were at their maximum, so this is where the 100 was, and this is what is normalized to 100. However, when we extend the time period, the normalization basically starts from scratch, and so the 100 from before — the green dot — becomes a weak 50, more like a 40. Also, if we add additional keywords, in this case the Champions League, which is way more relevant in Austria than the Super Bowl, you see that the 100 from the first slide is now more like a 10. So this is quite a big problem when we think about large-scale data comparison and analysis.

What our package does in order to overcome this is renormalization. We have object keywords and control keywords. Object keywords are what we are actually interested in — in the previous example, the Super Bowl or the Champions League — and the control keywords should more or less mirror standard internet usage in a given location. So instead of having data that is normalized relative to the search query, as we had before, we transform the data so that it is relative to standard internet usage, that is, to these control terms. We have had good experiences with the control keywords you can see here, but we know that they are context-dependent, so you might have to play around with them; so far, however, the experience has been quite good. So instead of looking at the search volumes as we have them on Google Trends, the package transforms them into a search score, which is relative to standard internet usage. The idea is that this makes the data comparable across locations and across terms, and gives you really powerful data, which I will show you later on.

In the package, we include two measures of internationalization. The first one is the degree of internationalization, which is about dispersion: how equally distributed are search volumes across the globe? Are people in the US, the UK, Japan, Russia, and Australia equally interested in a given topic? If this distribution is very equal, you have a high degree of internationalization. We measure this in terms of an inverted Gini coefficient, there are some robustness checks included, and this is unweighted data, so the US counts as much as the UK, as Austria, as Luxembourg. The second measure we include is the volume of internationalization, which is about global search scores: how relevant a topic really is on a global level. So you can look at how important, globally, the Super Bowl is compared to the Champions League, compared to Donald Trump, or whatever. These are the two measures we use in this package, the two measures that allow you to look at the internationalization of global trends.

In terms of workflow, it is actually pretty simple, and you will find more details in the documentation and in the example I have uploaded to GitHub. The first step is the setup: you have to initialize the database file, connect to the database file that is stored locally, and then add your keywords to the database. They are stored in there, and then you are good to go and can start your downloads. There are a few download commands: first you have to download the control data, your baseline data, and then you can download the object data — whatever you are actually interested in.
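To make the two measures just described a bit more tangible, here is a small conceptual sketch in base R. It is not part of the globaltrends package, and every name and number in it is invented purely for illustration; it only mimics the logic of relating object search volumes to control search volumes and then summarizing the cross-location dispersion with an inverted Gini coefficient.

```r
# Toy search volumes for one object keyword and the sum of some control
# keywords in two locations (all numbers are made up for illustration).
object_hits  <- c(AT = 40, US = 90)    # e.g. volumes for "superbowl"
control_hits <- c(AT = 400, US = 300)  # e.g. summed volumes for "gmail", "maps", ...

# Search score: object volume relative to standard internet usage,
# which is what makes values comparable across locations and keywords.
search_score <- object_hits / control_hits
search_score

# Degree of internationalization as an inverted Gini coefficient of the
# (unweighted) distribution of search scores across locations.
inverted_gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  gini <- sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
  1 - gini  # 1 = interest is spread perfectly equally across locations
}
inverted_gini(search_score)
```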
These downloads might take some time, because you cannot send too many queries to the Google service at the same time, but these are the workhorse functions, let's say. Now that you have downloaded and stored the data in the database file, you compute first the search scores, then the volume of internationalization, and then the degree of internationalization. Now you have all your data together in your SQLite file. There are some export functions that provide you with data frames that you can store directly to Excel or CSV files or whatever. And then we have added some visualization functions that build on these exported data sets, where you can basically visualize the data and prepare some plots — I will show you some examples later on. But in terms of workflow, that's it. As you will see when you look at the example I have provided on GitHub, it is really not more than 20 lines of code from initializing the database to creating your first plot. So it is a really simple approach that takes care of basically everything you have to do in order to get a nice dataset.

So this is how globaltrends works, and that is the list of functionalities. It might have looked a bit quick, so you might ask: okay, what's the point, what can we do with that? I will try to answer this question now. The point where we started was the internationalization of firms, and this is just a quick example of six firms from the S&P 500. At face value, it makes sense: in this plot we can see that Coca-Cola, Facebook, and Microsoft are quite internationalized, while more domestic companies, where we would expect a low degree of internationalization, such as Alaska Air Group or Illinois Tool Works, have very low degrees of internationalization, which basically shows that the measures do what we expect them to do. In case you don't believe what you see at face value, in the backup slides of the slide deck you will also find some robustness checks where we test the validity of our corporate internationalization measures. What you can also see when you look at the box plots is that the distribution of these degree-of-internationalization values is actually quite dense, so there is little noise in there. This is monthly data and you only have very few outliers, so it is not distorted by some other trends; it really seems to capture the internationalization of firms.

So this is where we started and where we come from, but you can do lots of other stuff with this data and with our package. You can also look at the internationalization of products, because all you need are keywords on Google, and this is not limited to firms. You can do this for products, and as you can see in the examples of the Nintendo Switch or the Tesla Model 3, once the product was introduced, the degree of internationalization went up. So again, we see that the measures basically do what we expect them to do. In the case of the Volkswagen Golf, it is much more stable, as we would also expect. You can do this for individuals: for politicians, for athletes, for scholars like Paul Krugman. And the interesting thing is that it is always the same scale of comparison. So you can compare the internationalization of Donald Trump to the internationalization of Kylian Mbappé, to the internationalization of the Volkswagen Golf, but also to the internationalization of Facebook, because it is always the same scale. We have done this for organizations other than firms: football clubs, universities, even terror organizations.
You can do it for global trends and phenomena like Brexit, the ice bucket challenge, or even COVID — and you see that COVID became quite international quite quickly. Since the data is not limited to country-level data, you can also look at within-country dispersion, as we do here with some national newspapers in the US. For the Boston Globe and the Star Tribune, you also see that the interest, the search score, is highest in those states where these newspapers are actually from. So again, the data really shows more or less what we expect it to show, so we really think that this is valid. Another cool thing — and then I am more or less done with the applications of the package — is that, because globaltrends provides a monthly time series of internationalization measures, you can also do event studies. We have included some functions to do that, so that you can compute abnormal internationalization, similar to an event study in finance. We are still working on that, but here are two examples: the Cristiano Ronaldo transfer to Juventus and the Tiffany acquisition by LVMH. We think these also make sense.

So I hope I have convinced you that you can do cool stuff with our package, but you have probably been doing similar stuff already. So you might be asking: why should I care, what's the point of all of that? And the answer, as always in research, is: it depends. It depends on who you are. We believe that for academics and practitioners, our package provides systematic access to global trends and really gives access to an amazing data source that has been used before, but, we think, never in such a systematic way — and this is the core aspect. We believe that the renormalization that is integrated in the package allows really large-scale data analysis: you are not limited to a few keywords in a few locations, but can do this on a very large scale. And the SQLite database file that is used to store the data makes it very easy to share the data within projects and among colleagues.

As for current users: we don't want to replace gtrendsR. We know that gtrendsR is there because we actually use gtrendsR — this is what we use to access the Google Trends API — and we never thought about replacing it; that is absolutely not our intention. So if your code currently runs with gtrendsR, good for you, I am really happy; keep working towards your goals, you probably don't need our package right now. But if you are thinking about writing new code to analyze Google Trends, you might want to look into our package, because it provides a one-stop solution: a system of functions that guides you from downloads to exports to visualization, all in one package.

And finally, to conclude, for developers: I think our package is a case in point that you really have to adapt your package to the users you want to target. Who is your average user? Think about that. In academia — this is my experience, and my co-panelists can back me up on it — people don't really use R, they have to cope with R. Everything that goes beyond Stack Overflow or a tutorial gets tricky, and for these users we provide the globaltrends package. So depending on what your package is meant to do and whom you want to target, you might have to adapt your code to your target audience. Thanks a lot for your attention.
The package and everything else is available on GitHub. This is very much a work in progress, so any comments, suggestions, bug reports, criticism — whatever — if you file them through GitHub, they are highly appreciated. Thanks a lot, and I am looking forward to your questions.

Thank you so much, Harald, for this great presentation. Hopefully, dear attendees, you now have a clear understanding of the utility of this R package. I think it's time to move to questions, and I have two questions here. The first one: how can we search for local events, meaning, for example, a storm in my hometown — is there a good solution for that?

Actually, that's pretty simple, because on Google Trends, and therefore also in our package, you can specify the location you want to get the data for. It doesn't work on a city basis, but on a province basis, so for example US states. Your keyword of interest, your object keyword, would be storm or hurricane, and you would only use as a location, I don't know, the US state of Maine or Virginia or whatever your actual location is. So this is pretty simple. I am not sure whether you need our package for that, because it is really intended for large-scale comparison, but with Google Trends it is definitely possible.

Yeah, all right, awesome. You have another question, from Hernan — thank you for attending. Is there a limit to the number of requests, meaning keywords, that can be made per session or time unit?

Yes, but no one knows what the limit is; Google doesn't tell us. Our workaround is that we wait around 10 seconds between each download. This makes it a bit time-consuming, we know, but that is the only solution that seems to be around. And once you get blocked, the code checks every few seconds whether downloads are possible again, and once a download goes through, it continues until it has to wait again. So I don't know what the actual number of allowed downloads is, but for researchers time isn't that important — we have the time to wait until the downloads are open again. The package does this automatically, so you don't have to retry at an interval yourself; the package is doing that.

Yeah, that's so cool. So please don't hesitate: if someone else has a question, leave it in the chat and I will be happy to pass it on. Otherwise, don't hesitate to check the links for the package — the Zoom host will share them in the chat.

Okay, otherwise, let's move now to the second talk: computing the disposition effect on financial market data. It will be delivered by Lorenzo Mazzucchelli, who is a master's graduate in economics and finance and also a graduate in political science at the Università degli Studi di Milano, and by Marco Zanotti, who is an R developer working as a data scientist at T-Voice in Milan. They will introduce to us the new dispositionEffect R package, with which you can quickly evaluate the presence of disposition-effect behaviors of an investor. A really nice topic, I think. The screen is yours.

Okay, hi everybody, Marco here from Italy. Today my colleague Lorenzo and I are going to talk about dispositionEffect, which is the R package we developed to perform behavioral finance analyses. As was said, I am a senior data scientist and R developer, and I have been working for five years in a multinational company based in Italy. Here with me is Lorenzo, who will introduce the theoretical concepts behind the package. Hi everyone. We can, I think, switch to the next slide, Marco.
To begin, we must define what the disposition effect is. The disposition effect is a particular behavioral anomaly related to the tendency of investors to sell an asset when it is gaining value on the financial markets, while instead tending to hold the asset when it is losing value. This is irrational because, as Jegadeesh and Titman demonstrated back in 1993, stocks that have performed badly over the past three to twelve months tend to keep performing badly and to underperform the market over the next three to twelve months. That is why the disposition effect and the tendency to keep the losing stock are actually irrational. It is observed in financial and real market data, but also in other situations.

If we move to the next slide, Marco: the effect was identified in 1985 by Shefrin and Statman, and the reference paper is by Odean in 1998, using US financial market data. It has been demonstrated to be present in real market data for private investors and for experienced investors and financial institutions, most recently by Grinblatt and Keloharju in 2001. Current research is shifting the focus from the single-asset level to the entire portfolio of a single investor. So instead of determining whether the investor has a disposition effect linked to a single asset — the tendency to sell the asset when it is gaining or not to sell when it is losing — the focus is at the portfolio level: whether the portfolio is gaining or losing value.

How do we measure the disposition effect? As the difference between two percentage measures, built from the realized gains and realized losses on the one hand, and the paper gains and paper losses on the other. What do we have here? Realized gains and realized losses: any time an investor makes a trade, he is selling a stock for a gain or for a loss, and that is basically what a realized gain or a realized loss is. For paper gains and paper losses, instead, the situation is a bit more complicated. Suppose the investor holds 500 shares of a single asset and sells the position only partially, say 250 shares. He is left with 250 shares in his portfolio, but those 250 shares are a gaining or losing position that is not actually realized. This is an opportunity to realize a gain or a loss that is not taken by the investor, and that is what the paper gain and the paper loss measure: the situation of the portfolio. We then take the percentage ratios for the realized gains and the realized losses — how many times the investor realizes a gain when he has the opportunity to, and how many times he realizes a loss when he has the opportunity to do so — and the difference between the two is the disposition effect, which is present whenever the measure is greater than zero, that is, whenever the realization of gains is greater than the realization of losses.

Why is this important? It is important because, as I already said, it has been documented in a lot of different contexts. We are talking about financial markets, but it has also been found in the housing market, in auctions, and in policy contexts, because it is closely related to the phenomenon of the endowment effect.
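To see the measure just described in a small worked form, here is a sketch in R. The counts are invented purely for illustration and are not taken from the talk or the package; they simply show how an investor who realizes gains more readily than losses ends up with a positive disposition effect.

```r
# Invented counts of realized and paper (unrealized) gains and losses for one
# investor, accumulated over all trades.
realized_gains  <- 30  # times a gain was realized
paper_gains     <- 20  # times a gain could have been realized but was not
realized_losses <- 10  # times a loss was realized
paper_losses    <- 40  # times a loss could have been realized but was not

# Share of gain opportunities that were realized vs share of loss opportunities.
prop_gains_realized  <- realized_gains  / (realized_gains  + paper_gains)   # 0.6
prop_losses_realized <- realized_losses / (realized_losses + paper_losses)  # 0.2

# Disposition effect: greater than zero when gains are realized more readily
# than losses.
disposition_effect <- prop_gains_realized - prop_losses_realized            # 0.4
disposition_effect
```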
In particular, the disposition effect is also important because it has been documented for every type of financial player, such as funds and governmental institutions and, as I said, private investors, whether they are experienced or inexperienced. In this way, understanding the disposition effect would be really helpful for private banks and financial institutions, but also for financial authorities, in order to regulate the markets and to manage really stressful situations such as the COVID situation in 2020. At the same time, it is really important because it is an irrational phenomenon, and it leads directly to a negation of classical economic theory and of rationally behaving agents. But now I will leave the stage to my colleague Marco, who will present the technicalities of the package.

Okay, so Lorenzo told you about all the theoretical stuff. Because the disposition effect is so important, we decided to develop an R package. You can already install the development version from GitHub, and we are planning a release by September. So let's start talking about the package and its main functionalities, and cover all the steps towards the actual computation of the disposition effect. First of all, among the main functions that the package contains there are four fundamental interfaces. gains_losses is the core function of the package: it is the function that actually performs the computation of the realized and paper gains and losses that Lorenzo told you about. It does this under the hood, and it is a bit complicated. For this reason, the portfolio_compute function should be used instead, which is the user-friendly interface to the calculation of the realized and paper gains and losses. Finally, disposition_compute and disposition_summary allow you to compute the disposition effect and some aggregate statistics related to it, based on the results of portfolio_compute.

What kind of data do we need to compute the disposition effect? There are two types of data frames that need to be used as input to the portfolio_compute function: the transactions data frame, which contains all the financial transactions of an investor during a specific period of time, and the market prices, that is, the actual prices found on the stock market for each traded asset at each transaction datetime. The investor transactions data set must contain these variables: the investor ID, the type of the transaction (that is, whether the transaction is a buy or a sell), the asset ID, and also the traded quantity, price, and datetime. The market prices data frame instead needs just three variables: the asset ID, the reference datetime, and the reference price found on the stock market for that asset at that datetime.

So how can we compute the disposition effect? It is very easy, actually. We just need to feed these two data frames, the transactions and the market prices, into portfolio_compute. This way we obtain a new data frame containing the realized and paper gains and losses for each asset, as you can see here in the yellow box. Then we just need to apply the disposition_compute function to this result to obtain a value of the disposition effect for each asset, here in the red box. And of course, through disposition_summary, we can summarize the results with some aggregate statistics.
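A hedged sketch of the workflow just described might look as follows. The three function names come from the talk itself, but the column names, the buy/sell coding, and the toy numbers are assumptions for illustration only and should be checked against the package documentation before use.

```r
library(dispositionEffect)

# Investor transactions: one row per trade (column names and the "B"/"S"
# coding follow the variables listed in the talk and are assumptions here).
portfolio_transactions <- data.frame(
  investor = "INV1",
  type     = c("B", "B", "S"),          # buy or sell
  asset    = c("ACME", "ZORG", "ACME"),
  quantity = c(500, 100, 250),
  price    = c(10, 50, 12),
  datetime = as.POSIXct(c("2021-01-04 10:00:00",
                          "2021-01-05 11:00:00",
                          "2021-02-01 15:30:00"))
)

# Market prices for each traded asset at each transaction datetime.
market_prices <- data.frame(
  asset    = rep(c("ACME", "ZORG"), each = 3),
  datetime = rep(as.POSIXct(c("2021-01-04 10:00:00",
                              "2021-01-05 11:00:00",
                              "2021-02-01 15:30:00")), times = 2),
  price    = c(10, 10.5, 12,   # ACME
               48, 50, 45)     # ZORG
)

# Realized and paper gains and losses per asset, then the per-asset disposition
# effect and its aggregate summary.
pf <- portfolio_compute(portfolio_transactions, market_prices)
disposition_compute(pf)
disposition_summary(pf)
```

With several investors, the same calls can simply be repeated investor by investor, which is also what makes the computation embarrassingly parallel, as discussed next.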
Now I would like to spend a few minutes talking about portfolio_compute because, as you saw, it is the main user interface of the package. As I said, it is easy to use, but it is also highly flexible and allows you to perform many different analyses. So let me focus on a few of its most important parameters: the method argument, the allow_short argument, and the portfolio_driven_DE and time_series_DE arguments. First of all, the method argument: it allows you to calculate the realized and paper gains and losses based on different methods — the standard count method, or the total, value, and duration methods, where total stands for the aggregate amount of traded quantity, value is the value of the traded assets, and duration is the total holding period of each asset. The allow_short argument instead gives the user the possibility to include short selling in the calculation or not.

Then we have two parameters that allow you to perform analyses that are really at the frontier of the behavioral finance domain. portfolio_driven_DE allows you to separate the computation of the realized and paper gains and losses between positive and negative portfolios, while time_series_DE allows you to compute the evolution over time of the disposition effect for the investor. If, instead, you would like to compute the time series of the disposition effect for the traded assets, you can do that by specifying the assets for which you want to calculate the time series disposition effect by means of the assets_time_series_DE parameter. And of course we have many vignettes for this package — I told you that we are planning a CRAN submission — so there is plenty of documentation, including a vignette on the analysis of the disposition effect that shows you how to actually use all these parameters in more detail.

Finally, let me conclude with a very important topic, namely computational performance and efficiency. This is really important because financial data may be huge in size, hence the scalability of the computation may be an issue. We ran many tests on the computation time. Here you can see a small test on a sample of 120 investors with an average of a thousand transactions each: for an investor with a thousand transactions it takes almost 16 seconds to calculate the realized and paper gains and losses, that is, to use the portfolio_compute function, while disposition_compute — the actual calculation of the disposition effect — takes almost no time at all, as you can see. This of course depends on the number of transactions and the number of traded assets, but the good news is that portfolio computations are embarrassingly parallel across different investors. So if you want to analyze the disposition effect and perform all the computations of realized and paper gains and losses for different investors, that level is embarrassingly parallel. Hence you can easily parallelize the function, and we have also provided a vignette telling you how to do that with different parallelization options.

So, thank you very much for your attention and for your interest. If you want to talk more deeply about this topic, just contact me and Lorenzo, and we will be very glad to talk about it.

Thank you so much — that was a great presentation as well. Hopefully you enjoyed it, because I did. So let's move to questions. Yes, you have one question from Max, which I think is interesting.
Portfolio transactions and market prices can sometimes be hard to organize, so is there a way to import them from a trading platform? Well, this is a very interesting topic. Unfortunately we don't have that, because most of the time trading platforms do not make this data available for free — most of the time you have to pay for it — and, importantly, we were not able to work with the kind of API that would allow importing data automatically from a trading platform. But if someone knows how to do this, or how to collect data from a trading platform for free, we can certainly talk about that and try to develop some interface that can do it.

Yeah, that's nice. Maybe from my side I would ask a question: during the preparation of this work, what was the most challenging step or task using R? Well, actually the development of portfolio_compute, which is the main function. It was difficult to make it usable in a very easy way because, as I said, we developed the gains_losses function, which is the real core of the package, but it is very difficult to use. So the most difficult part was to build a function that is easier for our users, who most of the time come from the behavioral analysis domain and are not comfortable with using R at all.

Yeah, that's good, that's true. Okay, so if someone else has a question before we move to the last talk, please don't hesitate to share it in the chat. Okay, I think there is a question. Yes: irrationality means there should be an arbitrage possibility, right? So have you tried to test a portfolio strategy using your package?

I will answer this question. There is indeed a possibility of arbitrage, but for the moment we are working on the package itself, so that you are able to determine whether the disposition effect is present or not. All the consequences that you want to derive from the presence of the disposition effect are up to you — I mean, we are working on the computation methods, and how you use the results is up to you. So it does lead to this possibility and its consequences. There is a paper on this by Frazzini in 2006, which links the disposition effect to the possibility of arbitrage in financial markets. Our package actually starts from that point, and from the difficulties Frazzini found in confirming whether the disposition effect leads to arbitrage or not. We talked about the possibility for financial regulators to eliminate the arbitrage opportunity — of course, there is also another side of the equation.

Yeah, good, thank you for your answer. We have a question from Levi — thank you for attending. Can you use daily information in transactions and markets? Yes, of course, daily data; you can actually use any frequency you want. You are not limited to working with closing prices; you can also work with intraday data.

Yeah, that's good. So thank you for your time and for the valuable work you shared with the R community. Let's move to the last talk: estimation methods for markets in equilibrium and disequilibrium, which will be delivered by Pantelis Karapanagiotis from the Goethe University of Frankfurt, who is also an assistant professor at EBS University.
During his talk, he will present the R package diseq, which provides functionality to simplify the estimation of models for markets in equilibrium and disequilibrium using full information maximum likelihood methods. So let's enjoy this talk together. The screen is yours.

Welcome, everyone, and thank you for taking the time to watch this video. I am very excited to have the opportunity to present the estimation methods for markets in equilibrium and disequilibrium that the package diseq provides at the useR! 2021 conference. My name is Pantelis Karapanagiotis. I am an assistant professor at EBS University, and I am also a research affiliate of the Leibniz Institute SAFE in Frankfurt. In most of our work, I, as an economist, as well as most of my colleagues, use models for markets in equilibrium, and this is done for quite good reasons. First, equilibrium concepts tend to be analytically very convenient, and second, equilibrium models constitute reasonable econometric approximations that enable us to study our market of interest without giving up too much generality, at least on most occasions. However, on other occasions this is not the case, and equilibrium models are not the most appropriate choice. As an example, you can see embedded on this slide a Wikipedia article describing the current, ongoing computer chip shortage crisis, which I am guessing most of the people interested in this video have either personally experienced or at least heard of.

In economics, market models are typically represented as systems of potentially nonlinear equations. They can be broadly categorized into two types: market models that embrace the market clearing condition and are typically called equilibrium models, and market models that use the short side rule and are typically called disequilibrium models. On the left-hand side of this figure you can see the depiction of a model that embraces market clearing, the condition that is also printed below the figure. The econometric assumption of the market clearing condition is that the traded quantity Q is always equal to the demanded quantity D, which is always equal to the supplied quantity S. Thus, unlike the state of Schrödinger's cat, the state of the market is always known to you as a data scientist or an econometrician: you know that whenever you observe the market, the point that you pick lies directly at the intersection of demand and supply. We have known how to estimate such models since the 1950s, but in the 1970s many economists were not very fond of this idea of perpetual market clearing, and they began to set up models that use the short side rule instead. A depiction of such a model can be found on the right-hand side of the slide. The main econometric assumption of the short side rule, which is also printed below the figure, is that we observe either the demanded or the supplied quantity, depending on which of the two is smaller.

Now, although these models are quite appealing, they immediately become considerably more difficult to estimate because, for starters, the short side rule makes the model nonlinear. The most popular estimation method is that of full information maximum likelihood. Despite its popularity, until recently there was no standard software implementation that one could use to obtain out-of-the-box estimates for this type of model.
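The two assumptions just described can be summarized in a short simulation sketch. Everything in it — coefficients, variable names, sample size — is invented for illustration and is not taken from the diseq implementation; the point is only that under market clearing the observed quantity is both the demanded and the supplied quantity, while under the short side rule it is the minimum of the two, so the state of the market varies across observations.

```r
set.seed(42)

n      <- 200
price  <- runif(n, 1, 10)
demand <- 12 - 0.8 * price + rnorm(n)  # demanded quantity D
supply <-  2 + 0.7 * price + rnorm(n)  # supplied quantity S

# Equilibrium assumption (market clearing): the observed quantity satisfies
# Q = D = S, so every observation lies at the intersection of the two curves.

# Disequilibrium assumption (short side rule): only the short side trades.
traded        <- pmin(demand, supply)  # observed quantity Q = min(D, S)
excess_demand <- demand > supply       # latent market state, unknown to the econometrician

table(excess_demand)  # a mix of excess-demand and excess-supply observations
```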
This lack of a standard implementation led, of course, to duplication of effort, as every researcher had to reimplement the estimation procedure. It created issues with respect to the reproducibility of the research output, because in economics it is not standard for the implementation code to be shared, and it also created issues with respect to the comparability of results, because different researchers might have been using different optimization tools, initialization values, stopping criteria, and so on and so forth. The package diseq aims to fill exactly this gap.

The package has three main design goals. Firstly, to provide a simple, approachable, common estimation interface for all the models in the package, irrespective of whether they describe markets in equilibrium or disequilibrium. Secondly, to provide fast implementations. Perhaps a 30-40 second difference is not that important when estimating once, but the harsh reality for a researcher dealing with such models is that she will estimate them a hundred times, so performance gains are quite significant in this case. And thirdly, the package aims to provide some post-estimation functionality that can facilitate the analysis.

The common interface design goal is achieved in a typical, if I may say so, fashion, namely by adopting an object-oriented design for all five models of the package. Concerning computational performance, the package by default employs novel analytic expressions for the likelihoods of all the models. Although the user may override this behavior, it is typically not advisable to do so, because there are huge performance gains, which are documented in large-scale benchmarking estimations using simulated data; you will also have the opportunity to see some of the results later on in this video. Lastly, concerning the post-estimation utilities, the package offers functionality to calculate predicted demanded and supplied quantities, aggregate quantities, various indices that are helpful in the analysis of shortages, and marginal effects that take into account both sides of the market.

From an architectural perspective, the organization of the package is quite typical. There are two back-end classes and five front-end classes. The market model class contains all the functionality that is shared by all the front-end classes of the package — an abstract class in C++ terms, or an interface in Java terms. There are two classes derived from the market model class, namely the equilibrium model, which is a front-end class, and the disequilibrium model, which is still a back-end class. The disequilibrium model is a back-end class because there are four distinct disequilibrium specializations, and for each of these specializations we get a different front-end class. All five models can be estimated using full information maximum likelihood, which essentially boils down to a call to R's optim. The equilibrium model can also be estimated by two-stage least squares. I have also experimented with the native optimizer of the GNU Scientific Library for estimating the equilibrium model, without, however, achieving any significant performance gains, in spite of my efforts to parallelize the estimation. Still, the GSL optimizer functionality is exposed to the user, because the GNU library offers the ability to tweak the optimization parameters a bit more. In the online vignette you can find the benchmarking results for all five models of the package; in this video we are going to focus only on the basic model as a representative case.
The remaining models follow very similar patterns in the results, and the basic model is by far the most commonly used disequilibrium model in the economics and finance literature. The benchmarking exercises took place on the scientific computing cluster of Goethe University, and two variations were conducted for each of the five models. In the first, represented by the figure you see on the left, the number of market model parameters was kept constant and the number of observations was allowed to increase exponentially. In the second, represented by the figure you see on the right, the number of observations was kept constant and the number of market model parameters was allowed to increase linearly. The benchmarking statistics were gathered by simulating both the model parameters and the resulting generated dataset. For a generated dataset to be included in the measurements, it had to pass a series of sanity checks concerning the balance of the market data. Of course, the simulation times were not measured, only the estimation times, and before beginning any measurement the processors were warmed up by two untimed estimations. For all the benchmarking exercises, the models were estimated using three different optimization options, namely BFGS with an analytically calculated gradient, BFGS with a numerically approximated gradient, and the simplex method of Nelder and Mead. In all of the benchmarks, the fastest alternative was BFGS with analytically calculated gradients.

Each point that you see on a solid line in those two figures represents the average estimation time over 100 estimations using the simulated data. The dotted lines represent one-standard-deviation differences from the averages. As you can see in the figure on the right, the performance gain remains almost stable when we increase the number of parameters. As you can see in the figure on the left, the performance gain increases linearly as the number of observations increases. As a numerical example, with around 40,000 observations — which corresponds to the estimations in the figure on the right — the basic model is estimated 6.43 times faster when the analytic expressions are used than when they are not; this would be 10.73 seconds compared to 69.07 seconds, respectively.

Now we can take advantage of the common interface provided by diseq to initialize multiple market models while keeping the underlying specification constant. In this respect, on this slide we are going to initialize the six required variables and two optional variables that are used by the constructors of the models. In line 1 we specify the identifiers of our data set. In line 2 we specify the quantity column of our data, and in line 3 the price column. Three of the five models provided by diseq have a dynamic component, and to estimate them the calculation of price lags is required; for these models we therefore need to specify separately, as we do in line 4, the time column of our data. In lines 5 and 6 we specify the market structure, namely the demand and the supply equations. The format of the equations follows the classic pattern that is also used in linear models, and the user should also expect the functionality provided by linear models.
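As a hedged sketch, the initialization and a subsequent estimation might look roughly as follows. The data are simulated here only so that the snippet is self-contained, and the constructor arguments and their order are reconstructed from the description in the talk rather than taken from the package itself — the exact interface of new() and estimate() should be checked in the diseq documentation and vignettes.

```r
library(diseq)  # the package presented in this talk

# Toy data for a single market: identifiers, price, two exogenous shifters and
# a traded quantity generated with the short side rule (all names and numbers
# are purely illustrative).
set.seed(7)
n  <- 100
dt <- data.frame(id = 1, date = 1:n,
                 P = runif(n, 1, 10), Xd = rnorm(n), Xs = rnorm(n))
dt$Q <- pmin(12 - 0.8 * dt$P + dt$Xd,  # demanded quantity
              2 + 0.7 * dt$P + dt$Xs)  # supplied quantity

# The six required and two optional variables described on this slide
# (argument names and their order are assumptions based on the talk).
key_columns       <- c("id", "date")  # 1. identifiers of the data set
quantity_column   <- "Q"              # 2. quantity column
price_column      <- "P"              # 3. price column
time_column       <- "date"           # 4. time column (needed by the dynamic models)
demand_spec       <- "P + Xd"         # 5. demand equation
supply_spec       <- "P + Xs"         # 6. supply equation
correlated_shocks <- TRUE             # 7. optional: correlated demand/supply shocks
verbose           <- 2                # 8. optional: verbosity level

# One static model as an example; the other models are built with almost
# identical new() calls, adding the time column for the dynamic ones.
mdl <- new("diseq_basic",
           key_columns, quantity_column, price_column,
           demand_spec, supply_spec, dt,
           correlated_shocks = correlated_shocks, verbose = verbose)

# Full information maximum likelihood, capping the optimizer iterations.
est <- estimate(mdl, control = list(maxit = 1e5))
```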
For example, if one uses a factor column in these equations, then the indicator variables corresponding to the levels of this factor column will be automatically created and the corresponding coefficients will be automatically estimated. In line 7, the optional variable determines whether the shocks of demand and supply are allowed to be correlated in the estimation, and in line 8 the verbosity level is set. The verbosity level determines the eagerness with which the constructor, and subsequent calls to the constructed object, will communicate messages to the user; it ranges from a value of 0, which prints only errors, to a value of 4, which prints debug information.

Now we can initialize and compare multiple market models based on the same variables. In the first 7 lines we pass the variables to a new() call, asking it to construct an equilibrium model. The equilibrium model is static, and therefore we do not need to specify the time column separately. In lines 8 to 14 we use an almost identical new() call, the only difference being that we now construct a basic disequilibrium model. In lines 15 to 21 we ask the new() call to construct a directional disequilibrium model. The directional model is dynamic, and therefore in this case we have to separately pass a time column to the constructor. Lastly, in lines 22 to 28 we have a call almost identical to the preceding one, the only difference this time being that we ask it to construct a deterministic adjustment disequilibrium model.

The common interface can also be used when estimating the constructed model objects. For example, to estimate the equilibrium model that we have previously constructed, we call estimate(). In the estimate() call of line 2 we also pass the additional keyword argument control, which is essentially passed down to the underlying bbmle call and, as you can see in line 1, sets the maximum number of iterations for the optimizer. The results of this and all the subsequent estimation calls are presented in the table below the call block. Estimating the basic model can be done in a very similar manner. By default, estimate() initializes the optimizer using as starting values the estimates of simple linear regressions of the passed demand and supply equations. In case the user needs to, she can override this behavior by specifying the start keyword, which is also passed down to bbmle. For example, here we estimate the basic model using as starting values the estimates that we obtained in line 2 from estimating the equilibrium model. The user also has the ability to directly override the optimization algorithm that is used. By default, BFGS with analytic gradient expressions is used, but the user has the ability to switch to Nelder-Mead, as, for example, we do here in the estimation of the directional model. The last two lines of the call block estimate the deterministic adjustment model.

With this, I would like to close the presentation of the R package diseq, which provides estimation methods for markets in equilibrium and disequilibrium, by shortly reiterating the main points of the talk. The package provides methods for estimating models that cannot be estimated out of the box using alternative software. It does so by providing a very simple common interface for all the models, irrespective of whether these are models for markets in equilibrium or disequilibrium.
Perhaps the strongest point of the package is the implementation of very fast estimation routines that use novel expressions for the gradients of the likelihoods of the models. Lastly, the package also provides post-estimation analysis tools. From my point of view, the most interesting potential future extensions of the package would be to include additional models, to implement additional estimation methods, or to implement disequilibrium tests. Nevertheless, I would be more than happy to hear suggestions about future expansions from users or potential contributors. Thank you for taking the time to watch this video.

Thank you so much. That has been a really exciting learning experience about this R package — really great job, and thank you for sharing it with the R community. Let me check if there are questions. Please don't hesitate to share your questions in the chat; I will be happy to relay them to Pantelis. Okay, yes, we have one question: will the slides of this video be shared? Yes, I think I already shared them with the organizers, but I will make sure that this happens. Thanks. Yeah, that will be good, thank you. Yes, and it is possible to rewatch this session on YouTube — the link has already been shared in the chat, so don't hesitate to review it later. If there is another question, let me see. Okay. Yes, Mariam has also shared the link to the package code repository, so feel free to check it out and keep an eye on it.

Okay. So hopefully you also got a deep dive into the powerful features of the different R packages that were presented today. I hope you found them useful; personally, I highly encourage you to use them — such work and such contributions make the R programming language more and more powerful. After this talk, don't hesitate to join the elevator pitch session; "elevator pitches" is the name of the channel on Slack, so don't hesitate to go there, where there will be interactions, networking, and so on. And also don't forget to get social: share your attendance with your friends on social media, tag the useR! 2021 conference, and let's spread valuable knowledge. Thank you all so much for attending, and goodbye.