I'm Salvatore Babones, and today's lecture is on Wikipedia data and its sources. We all use Wikipedia; we even rely on it. Even as a university researcher and specialist in comparative data analysis, when I want a quick data summary, I turn to Wikipedia. In general, Wikipedia data are highly reliable and well audited, and any errors, omissions and vandalism are easily spotted. Wikipedia is especially useful for comparative summaries of data from different sources. Really, it's just a fantastic tool, and people who say not to use it are frankly being pedants. Wikipedia is indispensable in the 21st century.
It's especially good for comparative GDP data and other headline data series. It can be less good for more obscure figures. Remember, Wikipedia is updated by unpaid volunteers, human beings like you and me. There's no large paid staff updating Wikipedia data on a daily or monthly basis. It's all volunteer labor, and so the biggest, most headline-grabbing numbers get updated regularly. Other, more obscure data series may be updated rarely or may not be in Wikipedia at all.
Now ideally, Wikipedia data are always referenced to the original sources. If you're just using Wikipedia casually, for example, if you want to know for a classroom discussion the comparative life expectancy in the United States versus Brazil, sure, check Wikipedia. If you're doing professional research, something for a corporate audience or for a research audience, an academic journal, or if you're writing a school paper, definitely always go to the original source. Wikipedia will tell you where they got the data, so just go to the place they got the data and get it again for yourself. Wikipedia may not always have the most recent or best-quality data for data series that are not heavily used, and I'll illustrate some of this in today's demonstration. So let me just go to Wikipedia.
Without even mentioning Wikipedia, I'm going to go to Google and look up a list of countries by GDP per capita. And you'll see that, without my even asking for Wikipedia, Google automatically assumes I'm looking for things from Wikipedia. I have two options here: list of countries by GDP (purchasing power parity) and list of countries by GDP (nominal).
I'll start with the nominal series. These are based on current exchange rates, and if I scroll down, I will see three different sources of data, the International Monetary Fund, the World Bank and the United Nations, and these are all on a most-recent-available basis. I can sort by country name or leave it sorted by rank. I'm going to sort each of these by country name so we can very conveniently see some of the differences. All three of these are GDP per capita figures. Note the slightly different years, but the World Bank and United Nations figures are theoretically for the same year, and we'll still see some differences.
So I'm going to look up Brazil. Brazil in the World Bank series is at 11,612 US dollars per person per year. In the United Nations series, Brazil is at 11,387. So not a big difference, 11,612 or 11,387. What is Brazil's GDP per capita? Well, that's a difficult question. These are different series reported by different agencies. Note that the IMF figure for 2015 is much lower. That's just because the Brazilian real, the currency, declined substantially between 2014 and 2015, and so Brazil now appears poorer in US dollar terms than it did.
If we want to avoid some of those problems having to do with exchange rates, we can go to the PPP series, the purchasing power parity series. Purchasing power parity is a way of adjusting for the cost of living in different countries. Then I'll switch to alphabetical order. The sources here are the IMF, the World Bank and the CIA World Factbook. I'll compare the IMF and the World Bank data again for Brazil.
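To put a number on "not a big difference" for the two nominal figures quoted above, here is a minimal Python sketch. The function name `relative_gap` and the choice to measure the gap against the mean of the two values are assumptions made for illustration; they are not something the World Bank, the United Nations or Wikipedia prescribes.

```python
def relative_gap(a: float, b: float) -> float:
    """Return the absolute difference between a and b as a fraction of their mean."""
    mean = (a + b) / 2
    return abs(a - b) / mean


# Figures quoted in the lecture for Brazil's nominal GDP per capita
# (US dollars per person per year).
brazil_nominal = {"World Bank": 11_612, "United Nations": 11_387}

gap = relative_gap(brazil_nominal["World Bank"], brazil_nominal["United Nations"])
print(f"Gap between sources: {gap:.1%}")
```

On these two figures the gap works out to about 2 percent, which is small next to the drop caused by the exchange-rate move in the 2015 IMF figure.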
If I scroll down a little bit, I see Brazil at 16,000 dollars per person per year in the IMF series and 15,800 in the World Bank series. So again, very similar. Notice how the PPP, purchasing power parity, series gives a higher value for GDP per capita than the exchange-rate-based series. This is because in general, poor countries have lower costs of living than rich countries. As a result, when converting to dollars at purchasing power parity, poor countries appear to have greater income per person than they do at market exchange rates, because the cost of living is lower in those countries.
Now, let me look for some other series: lists of countries by life expectancy. Again, Wikipedia comes up as the top choice, and once again, if I scroll down, I get a very nice table of countries' life expectancy, which is then broken out into female and male life expectancy. And if I sort by country, I can get figures for specific countries. Brazil's life expectancy is 75 years, 79 for women, 72 for men. So far, so good. These data were for 2013, published in 2015 by the World Health Organization, and here is a link to the underlying data. So very well referenced, a very nice data series.
Now, let me look for data that are less robust. So if I look for lists of countries by women's education level, that's maybe not as commonly used a series, though it is published by various organizations, including the World Bank. Here Wikipedia gives data for 45 countries, but they're unreferenced. The list is incomplete and only includes countries with average years of schooling of 12 or greater. Why? We don't know. Where do the data come from? We don't know. There is a reference here, but it's a reference for the definition of the list. The reference isn't attached to the list itself, so I can't be absolutely sure that the data come from that source. Also, there's no year attached to these data. So these data are probably much less trustworthy.
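The three warning signs just described (no reference, no year, incomplete coverage) can be written down as a simple checklist. The following Python sketch is only an illustration: the `TableInfo` record, the `provenance_problems` function, the 90 percent coverage threshold and the country counts marked as placeholders are all hypothetical and do not come from Wikipedia or any of the agencies mentioned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableInfo:
    """Hypothetical record of what a reader can see about a data table."""
    title: str
    source: Optional[str]     # cited original source, or None if unreferenced
    year: Optional[int]       # reference year of the data, or None if undated
    countries_listed: int     # rows actually present in the table
    countries_expected: int   # reader's own estimate of full coverage


def provenance_problems(table: TableInfo, min_coverage: float = 0.9) -> List[str]:
    """Return the warning signs found for a table (empty list if none)."""
    problems = []
    if not table.source:
        problems.append("unsourced")
    if table.year is None:
        problems.append("undated")
    if table.countries_listed < min_coverage * table.countries_expected:
        problems.append("not comprehensive")
    return problems


# Source and year are as described in the lecture; the country counts of 190
# are placeholders chosen for illustration, not real figures.
life_expectancy = TableInfo(
    "Life expectancy", "World Health Organization", 2013, 190, 190
)
# 45 countries, no source and no year, as described in the lecture;
# the expected count of 190 is again a placeholder.
womens_education = TableInfo("Women's education level", None, None, 45, 190)

for table in (life_expectancy, womens_education):
    print(table.title, "->", provenance_problems(table) or "no obvious problems")
```

A table that trips any of these checks is a cue to go to the original source rather than to cite the Wikipedia figure.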
I could look for Brazil, but it's not here, because average years of schooling in Brazil is probably well under 12, so it's not included in this list. So this is the sort of list for which I certainly would not want to use Wikipedia as a source. It's undated, unsourced and not comprehensive. I would want to look for a more comprehensive data source than Wikipedia. In fact, I would probably go to the World Bank's World Development Indicators, and if I went down here to this link on girls' education from the World Bank, I could eventually navigate through to the World Development Indicators and their data source.
Key takeaways. First, Wikipedia data are useful for quick consultation, but for quote-unquote official data, the underlying sources should always be consulted. That applies in a work environment and in a school environment: anything that's being published should rely on the official sources that underlie Wikipedia, not on Wikipedia itself. Second, Wikipedia's editors make choices about what data to present, but we can usually trust their choices. These are usually unbiased choices that are well monitored by other editors. And finally, do remember that Wikipedia's editors usually are not deep subject-matter experts in the areas they cover. They are editors who contribute on hundreds of topics throughout the year, so their judgments should be treated with caution. You, as the data user, should make a judgment about the data you're using.
Thank you for watching this lecture. For more about me, you can go to salvatorebabones.com, where you can also subscribe to my monthly newsletter on global current affairs.