Live from the Congress Center in London, England, it's theCUBE at MIT and the Digital Economy, The Second Machine Age. Brought to you by headline sponsor, MIT.
Welcome back to London, everybody. This is Dave Vellante with Stu Miniman. And this is theCUBE, and we're here covering MIT's Initiative on the Digital Economy. Roberto Rigobon is here. Roberto, welcome to theCUBE. Great to see you.
Thank you so much for inviting me.
So tell us about the Billion Prices Project. What is that all about?
So several years ago, I would say actually nine years ago, a student of mine, Alberto Cavallo, decided that he wanted to try to compute alternative measures of the inflation rate. He's from Argentina, and he was motivated because in Argentina they intervened in the statistical office and they were lying about the inflation rate. So he wanted to construct the truthful inflation rate, in fact. So he started with food and things like that. And then I realized the tremendous potential of just taking that to all the sectors, not only food, and we started working together. And right now, we download from more than 70 countries worldwide. We download information from the web on almost every sector, on service prices and stores as well. I mean, stores are the easy ones. Getting the price of a taxi in London is a little bit harder. What we do is we have to make reservations every day from the airport to the hotel, and they tell you the price of the taxi. So you have to be kind of creative about how to get the data. But then you have to collect everything.
So is a robot doing that?
No, no, no. A person cannot do that, because you have far too many cities in the world to actually do that. So it's actually a robot.
Make sure you cancel those reservations, right?
Yeah, yeah, yeah, we do it. Actually, we ask for a quote, not a reservation.
Ah, that's right. It's a quote. Okay, good.
So we ask for the quote. It is perfectly fine.
So okay, so you're basically taking, you know, many, many, many data points. It's not a sampling of data points, as we talked about offline. You know, big data, we talk about Hadoop, sampling is dead, and you've gone right to the heart of it. So what does the data show?
So, I mean, they're very surprising. In some countries, for example, the inflation rate that we compute online is very similar to the one that is computed by the statistical office.
Which countries do a good job?
I would say that all developed nations do a very, very good job. So the numbers are not identical, and the numbers will never be identical month by month.
Sure, right.
And the reason is that the online business just moves faster than the offline business. Imagine in Italy, if you want to increase the price of pasta, well, you increase it first for the online users because, you know, they're probably richer, they don't care about the price, they have no memory. But you will not increase the price of pasta in the streets and expect the store not to be burned, no? So imagine increasing the price of bread in France. It will be there. So they don't increase the price of bread, they just make the bread smaller, and that's my version of how the croissant was invented, no? It used to be this size. We just put it that way. So, indeed, they move at different periods, but when you look at the year-on-year, for example, in the United States, I get a little bit more inflation than the BLS, and I would say that this is on the order of 0.02, 0.03 percent. So it's nothing. The numbers are very small. In Japan, I always get a little bit less, and that would be about 10 basis points less than the official data, but that's the order of discrepancy. Now, when you go to emerging markets, for example, in Brazil, our inflation rate is 2 percent bigger than the official one. So their official one will be, let's say, 6.5 or 7. The one that we will compute will be at least 2 percent more than that, on average.
And it's not because we have price controls or not. We collect the prices from the items that have price controls. I just collect way more items. What happens in the statistical office is that they put too much weight on the products that have controlled prices. By construction, you have a lot of items whose prices are flat. And then in Argentina, our inflation rate will be three times the official one, not 3 percent. So if they say 8, we will get 24.
Okay, so in the former example, the developed nations, it's more of a timing issue, maybe. But in the less developed nations, you're saying that it's essentially the government hiding the ball. Is that right?
Well, in some sense it's statistical capacity; it's how much can you do? I mean, the poorer the country, the fewer resources the statistical office has.
So it's a flawed measurement system?
It's a flawed measurement system because of resources, in some of them. In other countries, it's just manipulation. So, yeah. If your country's name is Argentina, Russia, Venezuela, they are just manipulating the data. So the other ones might be...
So they don't like you very much, I would imagine.
Yeah, no, they don't.
What's been the reaction to your project and the data?
Well, so what is interesting is, for example, in developed nations, in a country like Australia, where they produce the inflation rate only once every quarter, actually our data has been very well received by the central bank, by the statistical office, by everybody, in the sense that we are actually a complement. And for example, Australia is one of our earliest countries. We have this data from 2008 and it's just remarkably close to the statistical office. So in that sense they use it very actively, as a measurement of the daily inflation rate. So in developed nations it has been very well received, and in countries like Brazil it's well received, but they just disregard it and say, that's the inflation rate online.
I said, yeah, yours is the inflation rate offline. It's not clear which one is better.
Well, the academics in Brazil must appreciate that.
Yeah, so that is very well received. But it's interesting, central banks like the Chilean central bank really appreciate the data because it gives them a very early signal. So what happens is that when there's going to be inflation, you see that in the online prices about two or three months before you see it on the... yes, because, I mean, there's a very big delay. So as a central banker, when I can see this very fast shift in the online inflation rate and I can infer from that what is going to happen in the economy, that is actually very valuable.
So central banks are using your data to make decisions.
And I imagine some of the private sector and the financial system also use the data, for the same reason.
Yeah, I was curious, do you know if there are businesses or traders that are watching what you're doing and are able to take advantage of those opportunities?
I hope they are. In some sense, what we do is we distribute a big part of our research. About three years ago we did a spin-off. This was incredibly expensive to do with resources from MIT; by the way, this was created entirely with resources from MIT. So we did a spin-off, and now we actually distribute this through a bank, State Street Bank, and we distribute the index. So I really hope that somebody's using it.
So how do you distribute? You have an API into the data set?
No, actually, because I'm not very good at selling, we just have an agreement where they distribute and they are the ones that sell. I just give the index to them and then they are the ones that distribute. They have a research webpage, and what is publicly available is the inflation rate of Argentina and the U.S. That is entirely publicly available, so you can go to Alberto's or my webpage and you can see the inflation rate of these two countries.
But except for those two, the rest are distributed through State Street Bank.
For pay?
I think it's for free.
But then State Street Bank is funding it, essentially?
Yeah, State Street Bank is actually funding a big part of what we do. A very big part, yes.
Can you speak to, we were talking a little bit before off camera, just the growth of this and how much resources... I mean, you said you named it a billion prices before it was a billion, and now it's over a billion prices a day.
Yeah, so indeed, when we started, my goal was to actually try to get one billion prices in a quarter, and remember, I'm collecting every day, so this is totally cheating, no? And the reason is, I know it's not stupid, but collecting a million prices is still very hard. So I wanted a billion in a quarter, and today we have the ability to download way more than a billion in a day. So, I mean, we don't, because we don't want to. I mean, there are certain items where you don't need to download all the items, like books, for example. If you go to Amazon.com, for example, they will have about 17 million different books sold on that particular day. So now, how many of them do you really need to compute the inflation rate of books? Probably 500. I mean, you need a very small sample of books, which are the hot books. Everything else is sold once a month or once a week, so you don't need the prices. So even though we have the ability to download, now what we do is we restrict ourselves. We know that we don't need all 17 million to compute the inflation rate of books, so we don't need to go all that way. So in this process we have learned that you don't always need all the data. Now, in some sectors, like electronics, you need all the data; otherwise you will make humongous mistakes, because I need the new iPhone to be in the data and the old iPhone to be in the data to be able to compute an index.
Oh, we're getting the high sign, but we have to talk about 1000 Big Macs before we go.
Our new index: what we have done is borrowing from The Economist's idea of the Big Mac. The beautiful idea is that you get the exact same item to compare across nations. And we said, well, in our data we actually have a lot of items that are identical worldwide. I mean, I have Zara's data, so I have the same t-shirt; H&M, the same t-shirt; Nike, the same shoes. And so what we did was to create this index that compares identical items worldwide, and there are thousands of them, and I call it the 1000 Big Macs project because it's in that spirit. And the beauty is that this more or less tells me how expensive the tradable products are in my particular country. And when you think about it, all the tradable products are online. Food is online, electronics are online, clothing is online, cosmetics are online. I mean, everything except cars, you can find it online. And therefore what we can do is, we have a massive set of the basket and we compare it. So we detect what I call macroeconomic imbalances. So when Brazil becomes more expensive on everything, gasoline, food, electronics, when it's too expensive, something has to happen to the exchange rate. And in fact we are actually finding in our research that when we find these signals, nine months down the road they tend to depreciate.
They take action.
Oh yeah. So for example, right now in England there are a lot of questions about what is going to happen to the pound-dollar rate. There's a lot of political volatility.
Yeah, what's happening with the sterling?
Well, actually, I think the sterling, for example, in our data looks like it is right about at the equilibrium, that there's no trend. So there's going to be a lot of noise due to politics, but I don't think that the trend will drift too far from the 1.50. And so in our data it looks exactly where it should be. In other words, in England versus the US we have thousands and thousands and thousands of items, so, okay, it's reaching a big number. When I look at all of them and I ask who is too expensive and who is too cheap, the proportion is very small between England and the US. This, actually, the data is from yesterday.
Okay, so it's balanced. You would predict stability.
Well, I mean, volatility due to politics, but there's no trend in that sense. So that is mostly noise.
Yeah, you can normalize that line over time.
Exactly, you can normalize. Exactly, that's the way we did it. So it's kind of very interesting, and anyway, I'm very excited about this, and this is the first time we're going to present that research.
We're really excited to hear your conversation this afternoon. We'll be broadcasting it live. So Roberto, thanks very much for coming on theCUBE.
No, thank you. Thank you for having me.
Congratulations on the research, and appreciate your sharing.
Oh no, thank you. Thank you guys for having me.
All right, keep it right there. We'll be back with our next guest right after this. We're live from London. This is MIT IDE. This is theCUBE. We'll be right back.