interface of that at Uber. And the interesting thing here, with respect to transportation and data, is that a lot of the big data technologies built over the years by the likes of, say, Yahoo or Google or Facebook, Twitter, LinkedIn, were all built for web content, and for monetizing that web content through advertisements and other things. I think one thing that is unique about data at Uber, and some of the challenges that we face, is that the area we apply it to is very unique. It's transportation, and nobody has really looked at the implications of big data in transportation and what the unique challenges there are. So we see that a lot, and we'll talk about some of them today. So, data: we are a highly data-driven company. Data informs every decision at the company, and data also separates Uber from many of our more traditional competitors. You can think of Uber as a transportation platform; but if you think of Uber as a ride-hailing company, it's a very intelligent, smart ride-hailing company, because data differentiates how we provide that service to our customers in a very smart way. Likewise, if you think of it as a food delivery company, data again differentiates how we do that much, much better than a traditional food delivery company would. So in terms of us moving things, data informs every decision at the company, making us very efficient. Our massive data holds deep insights, and the data team's goal is to surface those insights. The data team serves 6,000 or so data scientists and engineers. Those are probably the skill sets you are most familiar with. Typically, they are programming in Java, Go, or Python; in the machine learning world, Python or R and so on. So we cater to them as well. Data quality and data freshness are pretty big challenges.
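As a rough illustration of the freshness side of this, here is a minimal sketch, with hypothetical names rather than Uber's actual pipeline, of watermark-based incremental ingestion: each run pulls only the rows updated since the previous run and appends them to hourly partitions, instead of re-reading the whole source table.

```python
from datetime import datetime

def hourly_partition(ts: datetime) -> str:
    """Map an event timestamp to an hourly partition path in the lake."""
    return ts.strftime("dt=%Y-%m-%d/hour=%H")

def incremental_ingest(source_rows, watermark: datetime):
    """Return (rows grouped by partition, new watermark).

    source_rows: iterable of (updated_at, payload) tuples.
    watermark: the max updated_at already ingested by the previous run.
    """
    partitions = {}
    new_watermark = watermark
    for updated_at, payload in source_rows:
        if updated_at <= watermark:  # already ingested in an earlier run
            continue
        partitions.setdefault(hourly_partition(updated_at), []).append(payload)
        new_watermark = max(new_watermark, updated_at)
    return partitions, new_watermark
```

The point of the sketch is only the shape of the idea: by keeping a watermark per source, each ingestion run touches new data only, which is what makes hourly availability over a petabyte-scale lake feasible at all.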
One aspect of it is also that there is this continuum between what is batch and what is real-time, and that gap keeps shrinking for us all the time. We've done a lot in terms of how we incrementally ingest data, and we make all data available for query in our petabyte-scale data lake within an hour. So there are many things that in general, in industry, are considered real-time, where we actually run in batch mode across our entire data set. And then we also have specialized systems based on real-time streaming analytics, either Samza- or Flink-based systems, which cater to a much smaller data set but are much more real-time. And real-time there is maybe minutes, or less than an hour. Data quality is a big deal, because bad data gets into the system all the time, which then has bad repercussions on the insights and the reports that we generate, so that's a problem. And data correctness in that context means that if there is incorrect data in one table, there is a cascading effect: aggregates, downstream tables, and downstream data sets may also be affected. So it's a pretty big problem to go all the way back and correct it all. Disaster recovery and multi-DC: we are already in multiple data centers simply because a single data center cannot hold all of our footprint. A data center has about 150 megawatts of power capacity, and our server footprint is much bigger than that. So we have to be multi-DC just to scale the business, and then there is also the aspect of disasters and how we recover. And as we go into 2018 and 2019, we have a rapidly evolving cloud strategy for how we want to leverage public cloud infrastructure in addition to our own data centers. Efficiency is a pretty big deal for us because, as I talked about, our data growth is exponential even relative to our business growth, and our business growth itself has been somewhat exponential.
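The cascading-correction problem described above can be pictured as a walk over a table-dependency graph. This is a minimal sketch with a made-up graph and made-up table names, not Uber's actual tooling: find everything transitively downstream of the corrected table, then rebuild those tables in an order where every table is recomputed only after all of its affected upstreams.

```python
from collections import deque

# Hypothetical dependency graph: each table maps to the tables that read it.
DOWNSTREAM = {
    "raw_trips": ["trips_cleaned"],
    "trips_cleaned": ["city_daily_agg", "driver_daily_agg"],
    "city_daily_agg": ["weekly_report"],
    "driver_daily_agg": ["weekly_report"],
    "weekly_report": [],
}

def backfill_order(corrected):
    # 1. Find every table transitively affected by the correction.
    affected, stack = {corrected}, [corrected]
    while stack:
        for dep in DOWNSTREAM[stack.pop()]:
            if dep not in affected:
                affected.add(dep)
                stack.append(dep)
    # 2. Kahn's topological sort restricted to the affected set, so each
    #    table is rebuilt only after all its affected upstreams.
    indegree = {t: 0 for t in affected}
    for t in affected:
        for dep in DOWNSTREAM[t]:
            if dep in affected:
                indegree[dep] += 1
    queue = deque(t for t in affected if indegree[t] == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for dep in DOWNSTREAM[t]:
            if dep in affected:
                indegree[dep] -= 1
                if indegree[dep] == 0:
                    queue.append(dep)
    return order
```

Even in this toy graph, fixing one mid-pipeline table forces three downstream rebuilds, which is why going back and correcting data at scale is so expensive.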
So it's very important for us to keep a check on how much money we are spending on all the things that we do with data, just in prepping the data and making it available. It's a very important factor, and we work on it a lot. And then another challenge is security, needless to say, with GDPR just behind us and things like SOX compliance and other requirements following us in 2019 as we think about an IPO. It's a pretty big aspect of how we think about data. With that, any questions? We can keep this fairly interactive; I think we are doing pretty reasonably on time. So, for instance, there is some partitioning on the user base as well. For instance, a subset of our data actually goes into the data warehouse, which is Vertica-based, and typically that is used by data operations for a lot of interactive querying against that data set. That is one aspect of how we think about partitioning. But in terms of our batch data set, all the users typically end up using that. So yes, the data operations folks typically end up using the data warehouse, which may be a subset of the data that we have in the entire data lake; I don't think it's all of the data. This is city ops, actually, or maybe the data wing of city ops. And we have a hybrid strategy: in some cases we work with partners or buy maps, and in a lot of cases we build our own maps as well. Especially in the Bangalore office here, there is a pretty big maps team, which works on a lot of the internal tooling for maps. And we also have a large maps operations team, which processes a lot of the data that comes from our own mapping systems.