Okay. Hello everyone. My name is Kiril Simeonovski and I've been involved in the movement for 15 years in different capacities and roles. Today I'm going to present research on the content of the Wikimedia projects as free knowledge. As an economist, I've always been wondering: is there any economic justification or explanation for the work I'm doing on the community projects? There have always been people saying, oh, you're wasting your time, you're doing something that you don't earn anything for, why are you doing it? And my thought was always that I'm not alone. There are a lot of people doing the same thing and deriving some utility from it. So I set out to examine whether this is true. Is there an economic justification, and is the work that I'm producing really freely accessible to anyone in the world?

So free knowledge is commonly thought of as a public good, but in practice it's not a pure public good. It is perfectly non-rivalrous, meaning that if I know something, I don't prevent anyone else from having the same knowledge. For example, if I know that Singapore is an island country in Southeast Asia, that knowledge doesn't prevent anyone else from acquiring the same knowledge. But the problems come with excludability, and this is what makes free knowledge not a pure public good. Excludability comes from limitations of access, which are driven by economic, institutional and social factors. The result is that fewer people can consume free knowledge and, at the same time, fewer people can produce it, and in economics that creates inefficiency.

So the main research questions here are: Why is free knowledge not a pure public good? How can this impurity be measured, and what are its implications? How big are the implications across countries, and what are the driving factors that lead to this impurity?
So at the beginning I'm going to define the difference between a pure and an impure public good. Then I'm going to use this to develop a model of free knowledge. I also introduce the concept of an invisible tax as a measure of excludability, and at the end I calibrate the model with data from the Wikimedia projects in order to study how big the impurities are, or how big the invisible tax is, across countries.

So why is it so important? It's actually included in the Wikimedia vision: imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. The phrase "every single person on the planet is given free access" actually refers to non-excludability. So the ultimate goal of the Wikimedia movement is to make the content on the Wikimedia projects a pure public good, that is, publicly accessible to anyone.

Okay, the economic literature on this topic, especially on free knowledge, is pretty scarce, but there are many papers on contributions to public goods. I'm not going to delve too much into this; I'll just say that a must-read is the warm-glow theory by Andreoni, developed at the end of the 80s and in the early 90s. He argues that people are impure altruists: they are driven not only by their desire to contribute, but also by the joy they feel while contributing to public goods. I also strongly recommend the fairness and reciprocity models by Rabin and Fehr. These are very interesting because if a person contributes to a public good, then that person also expects to get something from the other side. That's called reciprocity. In recent times, there are also many models and theories developed around social image and pro-social behavior, aimed at understanding the social motives that drive people to contribute to public goods. With regard to the Wikimedia projects, I have seen only a couple of papers.
The one by Zhang and Zhu is published in a reputable peer-reviewed journal in economics. It consists of a natural experiment on the Chinese Wikipedia after it was blocked, in order to study how the behavior of editors changed as a result of the block. This paper by Algan and a group of co-authors was presented at Wikimania 2014 in London. It also consists of an experiment, but it's not an experiment about the public goods game in producing the Wikimedia content.

So I begin by explaining what a pure and what an impure public good is. I assume that there is a good G in the economy with two properties. The first one is excludability and the second one is rivalry. These are the main properties of public goods. I scale them on an interval between 0 and 1, where 1 denotes perfect non-excludability and perfect non-rivalry. I also assume that almost every good in the economy has a complementary good C, so that its excludability is a function not only of its price, but also of the access to its complementary good. For example, if I have an online newspaper subscription but I don't have a computer, then the subscription is worthless. Even if I don't need to pay the price, I cannot access the newspaper because I don't have a computer to do it. So I assume that every complementary good has a threshold price, which is the lowest price that some individuals cannot afford to pay. Every price below this price means that the good is non-excludable, because every person in the world can pay that price and can access the good. So technically, excludability is a function of this price. The break-even point of excludability is the highest level of excludability at which there are individuals who cannot access the good. This is the eta with a bar above it, eta-overlined. In the same way, I can also define the break-even point of rivalry. This is rho-overlined, which is the highest level of rivalry at which there are individuals who cannot consume the good.
And here is a very nice definition of what a pure and what an impure public good is. Technically, a pure public good is one which is perfectly non-rivalrous and perfectly non-excludable; both properties have the value of one. An impure public good is one which is not perfectly non-rivalrous or not perfectly non-excludable, but still has values above the threshold values. A private good is one which is either excludable or rivalrous, so at least one of these values is lower than its threshold value. On this chart, the green area in the top right corner represents the public good area. If the combination of values of rivalry and excludability puts the good there, then the good is a public good. If not, then it's a private good.

So why is this important in the context of the good G? Because a public good G is pure if it's perfectly non-rivalrous, its price is equal to zero, and the complementary good is a public good. And it's an impure public good if it's perfectly non-rivalrous, it's free of charge, so its price equals zero, and the complementary good is a private good. By applying mathematical induction, this can be easily extended to the case of infinitely many complementary goods. So a public good can be pure if and only if, for each sequence of complementary goods, all of them are public goods. If at least one of them is a private good, then the public good G is an impure public good. And here is a very nice implication: individuals who cannot afford to pay for one good in the network of complementary goods also cannot access the primary public good. To illustrate this better, let's take the Wikimedia content and assume that it has three complementary goods.
The first one is Internet access, the second one is the IT skills that people need to possess in order to access the Wikimedia content, and the third one is literacy, which is needed to read the content on the projects. If at least one of these complementary goods fails to be a public good, then the content of the Wikimedia projects will not be a pure public good.

Okay, so let's move on to the development of the model. I assume that the economic environment consists of a finite number of individuals, and they operate in discrete and infinite time, but the production of free knowledge takes place in a continuous-time setting. I also assume that every person has leisure time H, which they decide to divide between consuming free knowledge and contributing to the production of free knowledge. I denote these by V and W. W, the time spent contributing to the production of free knowledge, also depends on the altruism level, on the size of the population, on the number of people contributing to free knowledge, and on the development of the region that the person comes from. I think it's very important here to explain the altruism level. It can take a value equal to zero, which denotes the case of an egoist, or a value greater than zero, which denotes an altruist. If it is equal to zero, the time that the person spends contributing to free knowledge equals zero. This definition helps to divide the population into two groups. The first one consists of the people contributing to and producing free knowledge. The second one is the group of free riders, the people who only consume free knowledge but do not contribute to its production. Each person contributing to the production of free knowledge has an individual share, which is denoted by g. It's a function of the time that the person spends and of the person's productivity rate in producing the knowledge.
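As a minimal sketch of this split, the Python snippet below divides a population into contributors and free riders and computes each person's individual share. The `Individual` class, the rule mapping altruism to contribution time, and the linear share (productivity times contribution time) are illustrative assumptions rather than the functional forms of the model; the only property taken from the talk is that zero altruism implies zero contribution time.

```python
from dataclasses import dataclass


@dataclass
class Individual:
    name: str
    altruism: float      # 0 denotes an egoist, > 0 an altruist
    leisure: float       # total leisure time H
    productivity: float  # knowledge produced per unit of contribution time

    def contribution_time(self) -> float:
        # Assumption: an altruist gives the fraction a / (1 + a) of leisure
        # time to contributing. With a = 0 this is exactly zero.
        return self.leisure * self.altruism / (1.0 + self.altruism)

    def consumption_time(self) -> float:
        # Leisure time not spent contributing is spent consuming.
        return self.leisure - self.contribution_time()

    def individual_share(self) -> float:
        # Assumption: the share is linear in time and productivity.
        return self.productivity * self.contribution_time()


def split_population(population):
    """Return (contributors, free_riders) based on contribution time."""
    contributors = [p for p in population if p.contribution_time() > 0]
    free_riders = [p for p in population if p.contribution_time() == 0]
    return contributors, free_riders
```

Under these assumptions, an egoist always ends up in the free-rider group and has an individual share of zero, whatever their productivity.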
And then with this function, the total amount of free knowledge produced in the economy can be easily expressed. It's very important that all the components of the free knowledge function are independent, so that it can be decomposed into three parts. The first one is the individual production of each individual in the economy. The second one is the contribution of all other people. And the third one is the stock of free knowledge produced in all previous periods.

Another very important thing in the model is the social interactions. This is very important because the Wikimedia movement is a social environment: people communicate with each other in order to produce the knowledge. I also assume that every person derives additional utility from social interactions with other individuals contributing to the production of free knowledge. The idea here is that people expect that, through those social interactions, their productivity rate will increase and they will make higher-quality edits, so their contribution to free knowledge will be of higher quality. This of course alters the production function, so the total amount of free knowledge produced also includes the effect of the social interactions in the economy. To better illustrate this, I present a simple graph with four vertices and six directed edges, in which every person interacts with every other person in the economy. From those social interactions they derive additional utility, which helps to increase the quality of their edits.

And at the end, I think it's very important to explain the utility function. Every individual in the economy has a utility maximization problem. It consists of two utility functions. The first one is the utility of consumption.
It's an increasing function of the time that people spend consuming free knowledge, reading and learning from the knowledge on the Wikimedia projects. And it's also an increasing function of the amount of knowledge available. So if there are more articles on Wikipedia, people get higher utility because they can read more content. The second component is the social benefit of production. This is exactly what drives people to contribute to the Wikimedia projects. If it were equal to zero, people would spend their entire time consuming the Wikimedia content. But that's impossible, because if no one produces the content, then there will be no free knowledge available, and the whole value of the maximization problem would be zero. The social benefit function is an increasing function of the time spent producing free knowledge, and a decreasing function of the amount of free knowledge available. The assumption here is that if there is more content, people would be less motivated to contribute and would prefer to spend less time contributing to the production of free knowledge.

And here is an important proposition: the Nash equilibrium in the production of free knowledge in which the time spent contributing and the free knowledge available both equal zero is achieved if and only if the social benefit function is equal to zero. The intuition is very simple. If people don't derive utility from their contributions to free knowledge, they would prefer not to spend time on them, because it's useless, and would rather spend more time consuming the knowledge and deriving utility from it.

Okay, so the equilibrium in the economy. I assume this is a market economy: there is a market for free knowledge as for any other good. So the equilibrium is achieved when aggregate supply meets aggregate demand.
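The individual's time-allocation problem, and the intuition that nobody contributes when the social benefit is zero, can be illustrated with a small numerical sketch. The functional forms below (a logarithmic consumption utility and a square-root social benefit scaled by a hypothetical weight `alpha`) are assumptions chosen only to satisfy the stated properties: consumption utility increases in consumption time and in the knowledge available, and social benefit increases in contribution time and decreases in the knowledge available. The stock of knowledge `K` is taken as given by the individual.

```python
import math


def utility(w, H, K, alpha):
    """Total utility of spending w on contributing and H - w on consuming."""
    v = H - w
    # Assumed consumption utility: increasing in v and in K, zero when K = 0.
    consumption = math.log1p(v) * math.log1p(K)
    # Assumed social benefit: increasing in w, decreasing in K.
    # alpha = 0 switches the social benefit off entirely.
    social = alpha * math.sqrt(w) / (1.0 + K)
    return consumption + social


def best_contribution_time(H, K, alpha, steps=1000):
    """Grid search for the contribution time w in [0, H] maximizing utility."""
    grid = [H * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: utility(w, H, K, alpha))
```

With `alpha = 0` the search returns a contribution time of zero, and with a positive `alpha` it returns a strictly positive one. With `K = 0` and `alpha = 0` the value of the problem is zero, which matches the argument that without any contributions there is nothing to consume.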
And it's very important that at this equilibrium, at this steady state, individuals make decisions on how much time to spend contributing to free knowledge and how many interactions to have with other individuals in order to achieve that level. So after defining the equilibrium in the economy, I move on to explain the effect of excludability and rivalry.

Okay, so here I define a rate of excludability in the economy, which affects the number of people that can have access to free knowledge. The total number of individuals can be decomposed into two parts. The first one is the share of people who have access to free knowledge, and the second one is the share of people who are excluded from consuming it. The excludability rate depends not only on the vector of prices of the complementary goods, but also on the rivalry levels in accessing the good and on government policies. For example, if there is censorship in the country, then even though people could otherwise access free knowledge, they cannot do so if it's censored. So this is also a major source of excludability, which may result in fewer people having access to the knowledge. And at the end, if I analyze aggregate demand and aggregate supply, it can be noted that the demand curve is below the demand curve in the case of perfect non-excludability. Likewise, aggregate supply is lower, because the total amount of knowledge supplied in the economy is less than it would be in the case of perfect non-excludability. So here comes the main part, regarding the concept of the invisible tax on free knowledge.
It is actually the reflection of the lower supply of free knowledge as a result of excludability and rivalry in the economy, and it can be calculated as the ratio of the total amount of free knowledge that is not produced because some people cannot access free knowledge to the total amount that could have been produced had all people had access to free knowledge. And the important question is: why call it an invisible tax? First of all, in public economics, a tax is known as an amount levied by the government, by the authorities, to support the production and provision of public goods. But in microeconomics, a tax is something else. It's a source of economic inefficiency which results in lower supply and demand, and it leads to the creation of a deadweight loss. And it is called invisible because there are no monetary payments. It's not that someone collects money from the people in the economy. But even so, the effects in the economy are the same as in the presence of taxes.

On this chart, there is a nice depiction of how this affects the market equilibrium. As you can see, the red shaded area represents the deadweight loss. It's the area of economic inefficiency created because the lower demand and the lower supply lead to a lower steady state, a lower equilibrium, in the economy. And there is a major theorem at the end which states what the deadweight loss means in practical terms: the deadweight loss of taxing free knowledge is the sum of the utility functions of all individuals that have no access to free knowledge. In fact, it represents those people who do not have free access in the economy. This is very intuitive, because if I don't have access to free knowledge, I cannot enjoy the benefits of the knowledge which is available online. And at the end, I move on to calibrate the model.
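Before the calibration, the invisible tax rate described above can be written as a short Python function. This is a sketch under the assumption that the rate is the unproduced share of the potential maximum, that is, potential minus produced, divided by potential; the function name and the example figures are hypothetical and are not taken from the data.

```python
def invisible_tax_rate(produced, potential):
    """Share of potential free knowledge that is not produced.

    produced:  amount of free knowledge actually produced
    potential: amount that would be produced with no excludability
    """
    if potential <= 0:
        raise ValueError("potential must be positive")
    if not 0 <= produced <= potential:
        raise ValueError("produced must lie between 0 and potential")
    return (potential - produced) / potential


# Hypothetical example: 50 units produced out of a potential 200.
rate = invisible_tax_rate(50, 200)
print(f"invisible tax rate: {rate:.1%}")
```

A rate of zero corresponds to perfect non-excludability, and a rate close to one means that almost none of the potential free knowledge is produced.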
I use data from the Wikimedia projects, but I have to note that there are missing data on page edits for many countries, for example Russia, China, Pakistan, Iran, Turkey and Thailand. Page edits are very important because they're the main supply metric, the main metric used to calibrate the supply of free knowledge, whereas on the demand side I use page views as a metric. Then I move on to calculate the annual elasticities of page edits, which are estimated by using this quadratic regression. I regress page edits per capita on the share of internet users and the literacy rate, which enters as the quadratic term. I also aggregate the page edits per country using the formula of the average page edits and the average number of editors. Because the Wikimedia Foundation is still working on the development of concise and precise data sets of page edits per country, I use the statistics of bucketed page edits per country, taking the average of their buckets, of their intervals, and normalizing them in order to calculate the average number of page edits. This table shows the results of the calibration of the elasticities that are used to calculate the potential maximum of edits made and Wikimedia articles created. This is very important because the potential maximum represents the amount of free knowledge that would be produced if there were no excludability in the economy.

And here are the results of the calibration. These are results across countries. The pattern is really not something strange to me; I expected to have these results. But the magnitude of the differences across countries is really pretty surprising. I didn't expect such large differences. For example, the lowest rates can be observed mostly in the developed countries, in Europe, in North America (the US and Canada), as well as in Australia, New Zealand and Japan. The lowest rate has been observed in Luxembourg, 0.3%.
This means that, given the current state in this country, 99.7% of the potential free knowledge is produced in the economy, so only 0.3% represents the loss of knowledge that is not produced because of excludability. Similar invisible tax rates have been obtained for Norway and Finland. On the other hand, the highest invisible tax rates have been observed for Malawi, Chad and Lesotho, African countries in which almost 99.8% of the free knowledge that could be produced in the economy is not produced because of excludability. I also made a comparison between the Global South and the Global North. The invisible tax rate in the countries of the Global South is 77.2%, which means that only 22.8% of the potential maximum of free knowledge is produced there. In the Global North, it's only 14.6%, which means that more than 85% of the potential maximum is produced.

With regard to the factors of excludability, I analysed three of them. The first one is the digital divide. In 2022, the average share of internet users was slightly above 50% in the Global South, whereas in the Global North it was almost 90%, which is a large and drastic difference. Another source of excludability is the net neutrality versus zero-rating paradigm. For example, there are a lot of authors arguing that zero-rating has positive economic effects for consumers, which means that more consumers could consume free knowledge. And there was a very nice project by the Wikimedia Foundation, Wikipedia Zero, which was an attempt to reduce excludability across countries. The main criticism of this project was that it went against the principle of net neutrality. And at the end, we have censorship. This is a case in which governments deliberately block access to the Wikimedia projects so that people are prevented from accessing free knowledge.
This can be done by blocking content and by prosecuting editors. There have also been some traces of censorship in developed, democratic countries, like the UK, Australia, France and Germany, where there were disputes related to single articles.

As for future research, I think this is an ongoing project, something that could be improved in the future. It would be good to obtain new and more granular data on the Wikimedia projects so that the model can be recalibrated. There is also room to work on the model's components, for example to model the marginal utility functions, to estimate and forecast the contribution time spent by editors, and to study the social interactions between people. And there is a lot of room to conduct experiments, for example natural experiments to study the effect of reforms and censorship, and online experiments to study the behavior and preferences of Wikipedia editors. Thank you. If you have any questions, please raise your hand.