Okay, so it's a pleasure to be here. I've actually never given a talk in this kind of big auditorium, so I must thank Xenob and Ashwin for inviting me; I'm glad I accepted the invitation. This is my first time at this conference and I've already heard a couple of very good talks. Okay, so let me start with my presentation. The title of my talk is somewhat of a mouthful, but I basically wanted to convey all the important things that we at Zendrive are working on. We work in the space of driving safety, and we work with smartphone sensors: we analyze data coming from the sensors, spatial data and location data, and then we characterize driving behavior. The reason why we are doing this can be summarized in five arguments, so I'm going to give you that sequence. First, driving safety is of utmost importance. I don't think I need to sell you on driving safety; anyone who drives a car or any vehicle, and certainly anyone who drives in Bangalore, is probably already sold on this idea. The sad fact is that there are just too many accidents happening worldwide, and people are losing lives in addition to property. So this is extremely important. Second, the automobile as such has undergone tremendous technological advances: safety systems and new engineering systems are being put into automobiles year on year. So if we go by Sherlock Holmes, if collisions are not reducing even though cars have become better, then it has to be something to do with the driving. If the cars have improved, then probably the drivers have become worse. Third, if driving has become worse, there are specific observable behaviors which are very well correlated with crash risk. People keep inventing newer and newer ways to keep themselves distracted while driving, so distracted driving typically features at the top of the list, but there are also other vehicular maneuvers.
You are braking hard, you are aggressively accelerating, you are taking dangerous turns, and so on. Many of these vehicular maneuvers are very well correlated with your propensity to get into crashes, so your driving behavior can in a way be characterized by these events. Fourth: if you do that, and if you give the right feedback, then significant improvements in safety can be achieved. And we have seen it first hand. We work with fleets and fleet aggregators; fleet managers can give feedback to their drivers, and week on week, month on month, we have seen reductions in these dangerous, risky behaviors, and thereby improvements in safety. Fifth, and perhaps most important for this particular talk, is that all of this can be done using a smartphone. In particular, all the vehicle maneuvers that we are talking about can be detected from data captured on smartphones. Today smartphones have a menagerie of sensors, starting with GPS. GPS is probably what everyone in this room uses. It gives you location, but it also gives you raw speed, it reports accuracy numbers, it gives you direction of motion, and so on. But there are also inertial sensors, like the accelerometer, gyroscope, and magnetometer, and they measure a bunch of things: total acceleration, your angular velocity, magnetic flux which can give you which way you are heading, those kinds of things. And then there are many more: the proximity sensor, the ambient light sensor, the barometer, and so on. All these sensors can be put to use to understand these vehicle maneuvers. I would say this was a very bold hypothesis when the founders of Zendrive started; it was an untested hypothesis. But now, standing here, I can say that to a great degree we have proven it out.
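To make the maneuver idea concrete, here is a minimal, illustrative sketch of flagging hard-braking events from a GPS speed trace. The 3 m/s² threshold, the 1 Hz sampling, and the function itself are assumptions for illustration, not Zendrive's actual detector.

```python
# Toy hard-brake detector: flag samples where the deceleration computed
# from consecutive GPS speed readings exceeds a threshold. The 3.0 m/s^2
# threshold and 1 Hz sampling below are illustrative assumptions only.

def hard_brake_indices(speeds_mps, timestamps_s, threshold_mps2=3.0):
    """Return indices i where the deceleration from sample i-1 to i
    exceeds the threshold."""
    events = []
    for i in range(1, len(speeds_mps)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            continue  # guard against out-of-order or duplicate timestamps
        accel = (speeds_mps[i] - speeds_mps[i - 1]) / dt
        if accel < -threshold_mps2:
            events.append(i)
    return events

speeds = [20.0, 20.0, 19.5, 12.0, 6.0, 5.5, 5.5]  # m/s, ~72 km/h then a hard stop
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]       # seconds
print(hard_brake_indices(speeds, times))          # flags the steep speed drops
```

A real pipeline would of course work on filtered speed and fuse the accelerometer, but the shape of the computation is the same.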
So what we do, very simply speaking, is collect data from smartphone sensors and analyze it using a variety of techniques. And you will see the reasons: we are basically dealing with noisy time series data, so machine learning is only one part of it. There is a multitude of techniques from signal processing, machine learning, statistics, geographical information systems, and then just raw smarts, because this is ultimately a physical system. Cars do not fly, cars do not go at 150 meters per second, and they do not stop within 1 meter while traveling at 80 kilometers per hour. All these things have to be put together, and then you can analyze this data. To date I would say we have amassed about a quarter billion miles of data, on the order of an interplanetary distance from Earth, so it is a huge amount of data. We analyze it, then we quantify it; this is where the statistical framework for risk modeling comes in. And finally we deploy these solutions, mainly in the form of an SDK or a consumer app, and the deployment of the algorithms can happen on your phone, as a completely client-side implementation, or it can happen on the backend, on the servers. Now, since I am not going to have time to discuss all of these, my focus is going to be on the middle two portions: identify and quantify. I understand that many people here would be interested in knowing how we ingest data, what our technology stack is, how we do big computations, and so on. My focus is not going to be on that; if there is a question I will try to answer it, and if not we can have the discussion offline. Okay, so here are the three broad categories of problems we are dealing with: characterization, quantification, and optimizations. Let's start from the top: drive detection. We don't collect data continuously.
So we first have to know that you have started moving. That's fairly easy: OSes these days will tell you when the phone has started moving, and there are the well-known concepts of geofencing and significant location updates. But we are not interested in motion as such; we are interested in high-speed motion. And if you are riding in a train or a bus, we are not interested either, so we have to infer that you are riding in a car. But even that does not suffice, because we want to attribute that driving to you, which means that we have to infer that you are driving the car. So it is identifying motion, finding that it is a car, and then that you are the driver. That is drive detection. Suppose you have done that. Next you want to identify all these different vehicle maneuvers, and we now have a bunch of them: we can detect aggressive acceleration, hard braking, hard cornering, over-speeding, whether you are using the phone while driving, and so on. So event detection comes as the second step. Collision detection is an event, an unfortunate event, that we want to detect, but for a slightly different purpose. At this point in time we have this technology, and it is basically to help, say, family members know that you have been in a collision, or for fleets that want to know that their vehicle has been in a collision, or to get early response teams onto the accident site. So collision detection is a special kind of event, and we do detect it. You have done that. Now what do you want to do? You want to summarize a drive or a trip and start quantifying your behavior. We do that in three large buckets: how good you are at controlling your car, whether you are focused on your drive, and whether you are making cautious maneuvers. You have done that, and that happens for, say, one drive. But driving behavior is actually a long-term quantity, so it has to be assessed over a long period.
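The drive-detection cascade above (is it moving? does it move like a car, rather than a train?) can be sketched with a crude rule-based gate. The speed and stop heuristics here are purely illustrative assumptions; a real detector would fuse many sensors and learned models.

```python
# Toy transport-mode gate: the first two stages of the drive-detection
# cascade. All thresholds are illustrative assumptions, not Zendrive's.

def likely_car_trip(speeds_mps):
    """Crude gate: fast enough to be vehicular, but with the stop-and-go
    variability typical of road traffic rather than rail."""
    if not speeds_mps:
        return False
    peak = max(speeds_mps)
    stops = sum(1 for v in speeds_mps if v < 0.5)  # near-standstill samples
    if peak < 5.0:                   # never exceeds ~18 km/h: walking or cycling
        return False
    if stops == 0 and peak > 30.0:   # sustained high speed with no stops: train-like
        return False
    return True

print(likely_car_trip([0.0, 3.0, 12.0, 15.0, 0.2, 8.0]))  # stop-and-go road profile
```

Attributing the trip to the driver (the third stage) is a separate inference problem and is not sketched here.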
That is where the risk modeling happens. All these things come together into a variable that we call the Zendrive score, along with other variables which allow us to model driving risk. Now, optimizations are a slightly special kind of problem, because our deployments are on the backend and on the phone. Data is coming in really fast; this is time series sensor data, so there are algorithmic optimizations. We are running this on the phone, and we don't want your battery to just drain, because then it is not going to be useful, so we are doing battery optimizations, and so on. So this is the lay of the land for the kinds of problems we are trying to solve. Again, since I don't have that much time, I am going to focus on event detection. So what do we get? This is our raw data. On the left-hand side you are seeing GPS: a simple trace on the map. This is location, latitude and longitude; you also get location accuracy numbers and course information. What you see on the right side is the temporal data: you get raw speed from your GPS sensor, you get the three-axis accelerometer, you get the three-axis gyroscope. I am not showing all the sensors here, but this is basically what we see. What we want to do is annotate this time series. In very simple terms, you want to say: okay, the blue bars are one kind of event, this is where these blue events happen, and then there are a bunch of red and green events happening. We want to know exactly where these things happen, because the location of these events is also important. If you are getting into a collision, or doing a certain maneuver at an intersection, that is probably more dangerous than doing it on a straight road, so we want to know the locations. In some cases we also want to know the durations of these events, because they matter.
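The annotation output described above, per-sample detections turned into events with start, end, and duration, can be sketched as a small interval-extraction step. The helper below is an illustrative assumption, not Zendrive's code.

```python
# Turn a per-sample boolean flag into annotated events: collapse runs of
# consecutive True samples into (start_time, end_time) intervals.

def mask_to_events(timestamps, flags):
    """Collapse consecutive True samples into (start, end) intervals."""
    events, start = [], None
    for t, f in zip(timestamps, flags):
        if f and start is None:
            start = t                    # event opens here
        elif not f and start is not None:
            events.append((start, t))    # event closes at first False sample
            start = None
    if start is not None:
        events.append((start, timestamps[-1]))  # event runs to end of trace
    return events

ts = [0, 1, 2, 3, 4, 5, 6]
fl = [False, True, True, False, False, True, True]
print(mask_to_events(ts, fl))  # two events, with their start and end times
```

Durations then fall out as `end - start` for each interval, and each interval can be joined back to the GPS trace to recover the event's location.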
Some of these events are split-second events; some of them are long. So this is going to be our output. How do we do it? Okay, the first thing you must understand is that we are trying to understand vehicle dynamics, but our measurement frame, which is the phone, is not necessarily aligned to the car. It can be anywhere in the car. We are not instrumenting the car; we do not have specialized hardware attached to the car whereby we could align it to the car's longitudinal and lateral axes. It is an arbitrary measurement frame. That by itself is not a very difficult problem: anyone who has done basic linear algebra knows this is just a change of basis, and by measuring gravity and true north you will be able to make that change of basis. Not very difficult. Care has to be taken, of course, because again the data is noisy, but it is doable. What makes the problem more interesting is that your measurement frame can actually be moving inside the car, so there is relative motion. You are using the phone, so it is moving inside the car; it has its own motion. Your phone can be in a mount that is vibrating because it is a shaky mount, so it is moving inside the car. Disentangling these two dynamics is actually a very interesting problem. This is the first step you have to take, the first challenge. The second, and these are more or less challenges with respect to raw data coming from sensors, is just the reality of life: you are going to get noisy data. They are, after all, sensors; they are going to have their limitations. The interesting fact is, as I said, we do not have custom hardware. We are basically asking you to put our software on your device. It could be an iPhone, it could be a Samsung, whatever; Samsung has maybe 50 different models. So there is an immense variety of sensors there. First, a sensor may not even be available. I have a Moto G3; it does not have a gyroscope.
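The gravity-and-north change of basis mentioned above can be sketched as follows: build an Earth-fixed orthonormal basis (east, north, up) from a measured gravity vector and a magnetometer reading, then express phone-frame acceleration in that basis. This assumes clean, non-parallel measurements; real pipelines must first denoise both vectors.

```python
import numpy as np

# Change of basis from an arbitrary phone frame to an Earth frame,
# using gravity ("down") and the magnetic field ("roughly north").
# Assumes the magnetic vector is not parallel to gravity.

def earth_frame(accel_phone, gravity_phone, mag_phone):
    up = -gravity_phone / np.linalg.norm(gravity_phone)  # gravity points down
    east = np.cross(mag_phone, up)                       # perpendicular to both
    east /= np.linalg.norm(east)
    north = np.cross(up, east)                           # completes right-handed basis
    R = np.vstack([east, north, up])                     # rows are the new basis vectors
    return R @ accel_phone                               # accel expressed in Earth frame

gravity = np.array([0.0, 0.0, -9.81])  # phone lying flat, screen up
mag = np.array([0.0, 30.0, -40.0])     # field tilted downward toward north
accel = np.array([0.0, 1.0, 0.0])      # 1 m/s^2 along the phone's y axis
print(earth_frame(accel, gravity, mag))
```

With the phone lying flat and its y axis pointing north, the acceleration comes out purely northward in the Earth frame, which is the sanity check one would expect.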
Gyroscopes used to be expensive, so in the earlier days only high-end phones would have one; these days many phones have it. So there is first the question of whether the sensor is there at all. Second, their characteristics are different. Maybe STM is manufacturing all the accelerometers that Samsung is putting in one model, but it could be a different chip on a different phone model. Their calibrations could be different: the iPhone 7 probably has a positive bias on the accelerometer, but the iPhone 5 has a negative bias. You basically have to deal with it. We do not want to build models for specific phone models; to abuse machine learning terminology, we don't want to overfit. So we have to have very robust and stable solutions, and this is what we deal with. Sensors malfunction: they will toggle, they will give you bad data, they will simply freeze, and you have to infer it. Again, that is the reality; there is no getting away from it. As I said, the data is noisy and unevenly spaced: you ask for data at 100 hertz and you get it at 91 hertz on average, and then there will be long periods of silence where you are not getting anything. You have to deal with it. Temporal data is, in my opinion, somewhat easier to handle, because we have well-established filtering frameworks we can use. But how do you filter the location data? Location data is, in some way, a noisy realization of your road network. If you are driving a vehicle, you are driving on a road; that may not always happen in Bangalore, but mostly people drive on roads. Yet you are getting GPS points here and there. In that chart, all the red portions you see are where your horizontal accuracy is bad: you are off the road, and your GPS trace is showing you going in some other direction. So filtering spatial data requires different techniques than filtering temporal data, and we basically have to build those up.
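One simple way to handle the uneven sampling described above is an exponential moving average whose weight adapts to the actual gap between samples. The time constant and the sample values are illustrative assumptions; production systems would use more elaborate filtering.

```python
import math

# Irregular-interval exponential smoothing: the filter weight depends on
# the real gap between samples, so bursts and long silences are both
# handled sensibly. tau is a smoothing time constant (illustrative).

def irregular_ema(timestamps, values, tau=1.0):
    out = [values[0]]
    for i in range(1, len(values)):
        dt = timestamps[i] - timestamps[i - 1]
        alpha = 1.0 - math.exp(-dt / tau)  # bigger gap: trust the new sample more
        out.append(out[-1] + alpha * (values[i] - out[-1]))
    return out

ts = [0.0, 0.01, 0.02, 1.5, 1.51]  # a burst, a long silence, another burst
vals = [0.0, 10.0, 0.0, 5.0, 5.0]  # spiky raw sensor values
print([round(v, 2) for v in irregular_ema(ts, vals)])
```

A closely spaced spike is heavily damped, while a sample arriving after a long silence is trusted almost at face value, which is the behavior you want when the sensor goes quiet.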
Okay, so now the final set of challenges, before I overwhelm you with all these challenges. When I say detectors, we are really doing predictive modeling: we want to detect events with very high accuracy, and what these ultimately are going to be are predictions. So what are the challenges? As I said, events span a range of time scales. Your phone use could be a few seconds, checking a notification, or ten minutes of navigation. Your hard brake is split-second; your aggressive acceleration is maybe 3 seconds; your hard cornering, depending on the type of turn, is maybe 3 seconds. It is a huge variety. These events also have interplay between them; certain events do not happen together. If the phone is moving inside the car, it can be confused with a turning event. So there is a lot of interplay there. And it is not that there is a specific sensor for every event; it is not aggressive acceleration, sensor 1; hard brake, sensor 2. You basically have to pull in data from multiple sensors, and different sensors may carry different parts of the signature of a particular event. That has to be built up. Model building in particular poses a very interesting problem for us: where is the ground truth? When we started, who had driving data? Had we seen what a hard-brake event looks like? How does the phone move inside the car? How does that get registered on the accelerometer? There is a lot of visual exploration before you get into modeling, but to create that visual picture you need data. So collecting experimental data is an essential step for us. It is expensive, but it is necessary, and the experiment design has to be done carefully. You have to have the right measuring instruments to create ground truth, because we don't want to use one phone to create ground truth for another phone.
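The feature-engineering point raised above, turning a raw time series into a rectangular X matrix, is usually done with windowed summary statistics. The window length and the feature set here (mean, standard deviation, peak-to-peak range) are illustrative choices.

```python
import statistics

# Windowed feature extraction: slice the signal into fixed-length
# windows and summarize each one, producing one feature row per window.

def window_features(signal, window=4):
    rows = []
    for start in range(0, len(signal) - window + 1, window):
        w = signal[start:start + window]
        rows.append([
            statistics.fmean(w),   # average level in the window
            statistics.pstdev(w),  # variability within the window
            max(w) - min(w),       # peak-to-peak range
        ])
    return rows

sig = [0, 0, 1, 1, 5, 7, 5, 7]  # a quiet stretch followed by an active one
print(window_features(sig))
```

In practice the windows would overlap, the features would be physics-informed (jerk, spectral energy, and so on), and the rows would feed the classifier, but the mechanics are the same.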
So recently we tied up with the University of Michigan Transportation Research Institute, called UMTRI. They have a big setup: well-instrumented cars which work in their facilities or drive on the road, with high-end data acquisition systems. I think Zendrive was the first company to be incubated in that program. So we are getting labeled data, in some way, from these kinds of engagements. But we also keep getting unlabeled data, because we have a lot of customers; as I said, it is about a quarter billion miles of unlabeled data, and we have to exploit it. Now, feature engineering is another step, and that comes naturally with this kind of data, because no one is giving me a rectangular data set with X as the predictors and Y as the response. I know what Y will look like, but X is something I have to construct. I can't put in the whole time series and say this is X; it has to be done very intelligently. So feature engineering is an extremely important activity, and physics also plays a role in it. And finally, once you have built a model and have a base classifier, I cannot directly use it, because this is a time series; I have to operationalize it. This operationalization can be in batch mode on a server: the whole data gets uploaded and I can process it in one shot. But if my implementation is on the phone, I have to have some kind of streaming version of it. And then there is efficiency. As I said, we don't want to kill your battery; there is a computational penalty being paid, so there is a detection-versus-battery trade-off. There has to be intelligent data collection, because GPS consumes a lot of battery and we rely on GPS to detect certain events. So that's the trade-off we have to make. Enough of challenges; let me give you a general picture of how we think about these problems and how we build our detectors.
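The batch-versus-streaming distinction above can be sketched as a detector that consumes one sample at a time with a bounded buffer, so it can run on-device with fixed memory. The window size and threshold are illustrative assumptions.

```python
from collections import deque

# Sketch of operationalizing a detector in streaming form: one sample
# in, one decision out, with O(window) memory. Threshold and window
# size are illustrative, not a production configuration.

class StreamingPeakDetector:
    def __init__(self, window=5, threshold=8.0):
        self.buf = deque(maxlen=window)  # bounded buffer: old samples fall off
        self.threshold = threshold

    def push(self, sample):
        """Feed one sample; return True when the windowed mean is above
        the threshold at this step."""
        self.buf.append(sample)
        return sum(self.buf) / len(self.buf) > self.threshold

det = StreamingPeakDetector()
stream = [1, 2, 1, 2, 1, 9, 10, 11, 12, 9]
fired = [i for i, x in enumerate(stream) if det.push(x)]
print(fired)  # fires only once the window fills with high values
```

The batch version of the same detector would simply slide the window over the full uploaded trace; the streaming version trades a little latency for never holding more than one window in memory.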
As I said, we don't want to leave out our unlabeled data, which is coming from the field and is immense; we want to use our labeled and unlabeled data very intelligently. So here is the general strategy we follow. We take the labeled data and start analyzing it, and as a first step we try to build very simple candidate generation schemes which, in one shot, can tell you that in a long time series certain events of interest might be happening. Scanning a time series from left to right is a very expensive operation; we want to be able to do it in one shot. These could be low-accuracy classification models with very high recall, which outline regions where things of interest might be happening. And then we have very specific models, which could be classification or regression models, which go and analyze these candidates. That is one line of how we get to specific events. The second is to use the unlabeled data: you construct very low-level features, because you can't really handcraft these features, and you cluster them in a suitable space. This can give you two kinds of outputs. It can tell you the likelihood of a specific event in a certain region, or it can eliminate regions where nothing of interest is happening, and that can feed into the candidate generation phase. So it's the interplay of these two things that ultimately goes into an event detector. Let me give you an example. We want to detect phone use while the person is actually driving. This, as I said, is a major safety concern anywhere. It's also important for us because we want to disentangle phone dynamics from vehicle dynamics: given the raw data, I should be able to say, oh, this is where the phone was moving, and then either explicitly pull out the phone dynamics and in some way subtract them to get the vehicle dynamics.
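The two-stage scheme above, a cheap high-recall candidate generator followed by a more specific model, can be sketched as follows. Here stage two is a stand-in rule; in practice it would be a trained classifier, and all thresholds are illustrative.

```python
# Two-stage event detection: stage 1 is cheap and deliberately
# over-inclusive (high recall, low precision); stage 2 runs a costlier
# check only on the proposed candidates. Thresholds are illustrative.

def candidates(signal, coarse_thresh=2.0):
    """Stage 1: flag any index whose magnitude looks interesting."""
    return [i for i, x in enumerate(signal) if abs(x) > coarse_thresh]

def refine(signal, idxs, fine_thresh=4.0):
    """Stage 2: a stricter check, applied only to the candidates.
    In a real pipeline this would be a trained classifier."""
    return [i for i in idxs if abs(signal[i]) > fine_thresh]

sig = [0.1, 2.5, 0.3, 4.8, -5.1, 0.2]
cand = candidates(sig)
print(cand, refine(sig, cand))  # candidates first, confirmed events second
```

The point of the split is cost: the expensive model never sees the uninteresting bulk of the time series, only the few windows the generator proposes.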
Or I can just carve out regions where I can be more confident about what was happening with the vehicle. So it is extremely important. There is no oracle, and there is no direct sensor to say when your phone is in use. No self-respecting OS is going to tell us that the person is talking on the phone. You don't want to use the microphone, because people invent new ways: they will start texting, so that's not very useful. It has to be inferred from what gets registered on the phone's sensors. And then there are just diverse usage patterns for phones: you keep the phone in the cup holder, pick it up, check a notification, keep it on the dashboard, then pick it up and start making a call, the call goes on for 10 or 15 seconds, and so on. There is just too much diversity, and the phone is moving inside the car. So we follow the method I described. In this case our candidate generation uses handcrafted features on nicely filtered time series from specific sensors, and then we refine those predictions using a classification model. It's based on ideas about how a rigid body moves inside another: how the bodies rotate relative to each other. So there is a little bit of physics in here. That's one stream. The other stream is to take raw data with very simple features and cluster; in this case spectral clustering seems to have worked for us, and that can tell us when the phone was in use. Stitch the two together, and you have a decent phone-use detector. Okay, I am running short on time, so, concluding remarks. Improving driving safety is extremely crucial, and we are committed to it. And what we are doing with smartphone data and analytics is, I think, already seeing results, enabling significant improvements in safety.
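Since spectral clustering is named as the technique that worked for the unlabeled stream, here is a toy two-way spectral clustering of one-dimensional signal features. The Gaussian affinity and the sign-of-the-Fiedler-vector split are the standard two-cluster recipe; the feature values themselves are made up.

```python
import numpy as np

# Toy spectral clustering into two groups: Gaussian affinity matrix,
# unnormalized graph Laplacian, then split by the sign of the
# eigenvector with the second-smallest eigenvalue (the Fiedler vector).

def spectral_two_way(features, sigma=1.0):
    x = np.asarray(features, dtype=float)
    d2 = (x[:, None] - x[None, :]) ** 2
    W = np.exp(-d2 / (2 * sigma ** 2))  # affinity: near points are similar
    D = np.diag(W.sum(axis=1))
    L = D - W                            # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    fiedler = vecs[:, 1]                 # second-smallest eigenvector
    return (fiedler > 0).astype(int)     # two clusters, split by sign

feats = [0.0, 0.2, 0.1, 5.0, 5.3, 4.9]   # two well-separated blobs
print(spectral_two_way(feats))           # first three in one cluster, last three in the other
```

Real phone-use features would be higher-dimensional and far noisier, but the same machinery separates "phone quiet" from "phone being handled" regions when the low-level features differ.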
And interestingly, for people like us, data science folks, theoreticians as well as practitioners, it is throwing up very new, non-traditional kinds of problems. We still don't have a vocabulary for these. In marketing analytics or business analytics, for example, certain problems became canonical over time; we might have a similar thing here, but for now these are very exciting, challenging, non-traditional problems. That's about it. Any questions?

Hi, great talk. So part of why an accident takes place is also location specific: there's a blind curve, potholes, a T-junction, things like that. And since the data you're collecting is very driver specific, any thoughts about pivoting that same data to become location specific? NHAI, for example, says that there are about 50 spots on Indian highways which are the most accident-prone. So can you pivot the data and identify the places which are most accident-prone?

Exactly. We actually have done that; in fact, some of our blog posts talk about it. Such studies have been conducted, for example, in San Francisco, where we have been able to identify certain roads and intersections which are prone to these kinds of events and give feedback to cities to make improvements.

Hi, my name is Shreya, here. So who are the main consumers of this data?

By consumers, you mean of the technology as such? Who are our customers?

Yeah, exactly. Who are your customers?

We work with ride-share companies, taxi aggregators, valet services, and truck companies.

Is it only behavioral aspects that you are trying to see, or is it also something else other than that?

No. See, the final objective is driving safety. Our argument has been that if you characterize driving behavior in the correct fashion, you are going to improve that.
So whatever it takes for us to help our customers improve driving safety, we do that. Great, next one.

Hi, just one question here. Do you also take data from telematics devices right now in India?

No, just phone sensors. By that you mean devices which are attached to the car?

Exactly, like in North America. So do you see your model evolving, because that's what is growing, right? There are limitations to using a smartphone. Do you see your model evolving and delving into telematics data as time progresses?

That looks like a more strategic question for the founders, so I'll refrain from that. It is possible; these are trends and you have to adapt to trends. Currently what we are saying is that you can do everything with smartphones.

Hi, here in the second row, on your left. You mentioned collecting data from the smartphone sensors. There could be other sources of data; maybe not in India, but in other countries you could get data from the traffic lights, at what point in time a particular traffic light is on for which road, et cetera. And there could also be information about traffic on a particular road, like Google Maps gives you. So would it help to use that information along with the sensor data to come up with patterns? Does that make sense to you?

Absolutely, any kind of traffic-related data would be useful to us. But think about it: we already have a quarter billion miles of data, so very likely we will be able to infer what you are describing.

So are you looking at tapping those sources of data or not?

Not actively, I would say, but we use GIS information extensively, just not in this form.

You said that you do driver identification, like whether the car is being driven by a particular person and whether he is driving the car. So how do you do that? Do you use any form of computer vision methods?
Currently it is just based on the sensors that I showed you; it's not based on vision.

I have one question here. Does this solution offer real-time feedback to the driver while he is driving, or is it that when he ends the trip you collect the data and then pass the feedback on to his company, for example the aggregators?

It works in both modes.

Is the accuracy the same for both modes: real time, as well as once you get the complete data? My main question is how you tackle the problem of scale in real time. Since you have lots of data, how do these models work in real time to offer feedback to the driver, and to multiple drivers?

Okay, so earlier implementations used to be on the backend, where you would upload the data and then the algorithms would run on it. We now have client-side implementations where everything runs on the phone; all the detectors are on your phone. So making it real time is not a major challenge for us at this point. Implementing on the client side poses certain challenges with regard to directly transferring these algorithms to the mobile, in view of battery and computational limitations, but we have not seen major accuracy discrepancies between the two.

Okay, thank you.

I have a question. Any algorithm has two phases: a learning mode and a real-time mode. In the learning mode you collect the data and fit the models. Is that happening on the phone, or on data that you collect?

No, model building all happens elsewhere; we need not do it on the phone. Model building is a separate system.

Each time I drive, the pattern will be different, so it is a collection of different patterns. Do you store all the patterns on the mobile? To recommend something to the driver you need patterns which are already analyzed, plus data in real time, to apply the models and then recommend something.
The computation will happen on the phone, say, and feedback can be given, and whatever has been computed can be stored back on the servers.

Thank you.

Hi, great talk, by the way. One thing, as was pointed out, is that instead of being driver specific, you could pivot the data to look at it from the road-condition point of view. And since, as you said, you already have a lot of data, are you looking at, beyond just safety, convenience? For example, with a lot of data, are you able to infer road condition? Are you able to figure out, say, places where an accident or road work has happened, so everybody is slowing down and then speeding up? Would you be looking into making that data available? Is the sensor data good enough to identify whether the road is bad or good, or can you infer that from average speed and traffic?

The answer to most of your questions is yes, it can be done. We are not doing it at this point in time; currently our focus is on safety, and that is where we are. Thanks. But with this huge amount of data, all the things you said have already been contemplated, and they seem feasible.

This might again be redundant, because I know you have the data. But with autonomous vehicles around the corner, and with what you are doing, any thoughts about looking at not a single driver but a swarm of drivers, to start predicting, since you are doing the behavior profiling and know how a driver reacts, that two drivers approaching each other are going to react in a certain way and an accident is imminent? Something of that sort, if you get the drift of where I am going.

Absolutely. Again, it is doable from the data that we have. And just to add to that, autonomous vehicles actually open a big opportunity for us. So yes, it is doable.

What does the tech stack look like?
Sorry, I can't see you. So, what does the tech stack look like to do all these algorithmic things behind the scenes? What are the technologies you are using underneath?

Okay, my expertise is actually not in that, so you will have to bear with me; I will just give you the bare bones of it. We have an SDK running on the phone, and then we have all these backend pipelines running on MapReduce and Amazon cloud infrastructure. That is really the extent of what I can say.

I have a question. So how did you manage to get the quarter billion miles of data? Is it crowdsourced? Do you have apps that people can download and run? How do you get that data?

We have customers.

Okay, I see. So it is field data; these are people actually driving on the road. Thank you.

Pardon? They are always in the pipeline.

Many of these algorithms, I am sure, are going to be trade secrets. My question is from a deployment point of view. My understanding is that it can be deployed globally, anywhere. So how do you take care of the data center? Is it on the cloud, or do you have your own platform that you take care of?

Currently, as I said, it is in the cloud; we don't have our own servers. It's basically on Amazon's systems, so it can be deployed anywhere. We have customers worldwide.

Is it specific to any cloud platform? Does it have to be Amazon, or can it be some other cloud platform?

I don't think I am the right person to answer that, sorry.

Okay, thank you.

Hi. You explained the model for event detection, and the final deliverable given to your customers is a score for every driver. Is there something, and this is just an extension, like early event detection, which is a broad area that moves us into the real-time world? So you have a real-time monitoring service?

Yes.

But does that also include the event detection service?
For example, let's say a guy is driving too fast too soon, a sudden acceleration. Is there a warning that comes up? Is that feedback built into the app?

The technology exists; it is a matter of the final implementation. Real-time feedback can be given in the system, but it's not part of the app right now.

Back over here. My question is that you are attributing the sensor data to my driving behavior, but my driving behavior might change with respect to what car I am driving. If I am driving a sedan I might be a bit careful, and if I am driving a smaller car I might be doing stronger maneuvers or faster turns, things like that. So are you able to detect what kind of car I am driving? I understand that you are not collecting any data from the car itself. Is that behavior differentiable at your end?

So we actually don't know what kind of car you are driving. And in my opinion, basic driving behavior does not change a lot with a change of cars; again, this is a hypothesis and I might be speculating here. In any case, if you are going to keep changing cars and you want to assess your driving risk, you would rather average it over all the cars you drive, all the different car models, because that gives the long-term propensity of getting into accidents. So rather than tailor our algorithms to specific cars, we take that holistic approach.