Trajectories are probably the most complex type of spatiotemporal data because they are ordered sequences of geospatial locations and timestamps of objects moving over space and time, such as the GPS trajectories of a taxi fleet, or the trajectories of massive movements of people across urban space. Many real-world problems in urban environments mix different data types, which makes them even more complex, and hence there are methods such as cross-domain data fusion to combine these into one unified representation. For example, in road traffic prediction you have the infrastructure, the road network, which has its own static properties: a fixed number of lanes, the speed limit, the bridge type, the road type, whether it is a highway. At the same time, you have dynamic features, the cars traveling through it, so the speed changes with time and throughout the day. The same holds for incident prediction: you have the boundary of a neighborhood, the number of buildings, the type of district, whether it is commercial or residential, and you have the population dynamics in that neighborhood; at the same time, the incident types change between day and night and across the seasons. Trajectory prediction is probably the most complex of all: you have moving people, group dynamics, whether individuals move alone or in groups, weekday versus weekend patterns, and so on. It gets more and more complex. Therefore, in this talk we will only discuss two problems in detail, traffic speed prediction and crime incident prediction, because they are related to each other and somewhat simpler than trajectory prediction.
So the first problem is how to predict fine-grained traffic speed over a big, complex road network in the most spatially extensive way, and also how to predict ahead of time, five or ten minutes into the future, so that you could, say, route a fleet of autonomous vehicles. I got a lot of data from Uber while I was in Pittsburgh, Pennsylvania, in 2015, doing my internship. From the data we have sensor readings: as this little figure shows, sensors at certain road segments produce readings at five-minute snapshots. The problem is how to infer the distribution of speed over the whole network, because you cannot practically place sensors everywhere, and how to infer a few minutes ahead of time so that you can plan for your fleet of autonomous taxis, so to speak, optimize for speed, and do real-time speed prediction. That is the spatiotemporal inference problem. For this problem I employed a method called the Gaussian process, which is basically non-linear prediction, or interpolation if you will, that takes advantage of the spatial correlation among these sensors: if you look at the road network, the speed over one particular road segment is influenced by the incoming traffic flow through the other branches of the network. Not every other edge in the graph has the same influence, so there is spatial influence, temporal influence depending on the time of day, and sudden events that feed into the network effect. Gaussian processes have the advantage of being able to model complex behaviors that are correlated over space and time. That is their main advantage.
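As a rough illustration of the idea (a sketch, not the actual model from the talk), a Gaussian process with a squared-exponential kernel can interpolate the speed at an unsensed road segment from nearby sensor readings. All coordinates, speeds, and hyperparameters below are invented for the example.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel: nearby points get covariance near 1."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior_mean(X_obs, y_obs, X_new, noise=1e-2):
    """GP posterior mean at new locations, with observations centered
    so the prior mean is the average observed speed."""
    mu = y_obs.mean()
    K = rbf_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_star = rbf_kernel(X_new, X_obs)
    return mu + K_star @ np.linalg.solve(K, y_obs - mu)

# Hypothetical sensors: 2-D coordinates and observed speeds (km/h).
X_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_obs = np.array([60.0, 45.0, 50.0])
X_new = np.array([[0.5, 0.5]])  # an unsensed road segment
print(gp_posterior_mean(X_obs, y_obs, X_new))
```

In the real problem, closeness would be measured along the directed road graph and in time, not by plain Euclidean distance as in this toy version.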
So in short, as I said, it is a non-linear regression technique that encapsulates the concept of closeness in space and time through kernel functions, which can further be used to incorporate static features such as infrastructure. At the same time, it is a Bayesian method. I do not want to go too deep into all this because it gets a bit technical, but briefly speaking, you can do Bayesian inference on GPs, that is, Bayesian updates without retraining the whole model, because training is expensive. If you know something is happening, you can update the parameters in a Bayesian fashion, which can be much faster than retraining everything in light of new evidence, new data, a new event, and so on. That is the main advantage. I am not going to scare you off with the details, but the point is that you can use kernel functions in Gaussian processes to capture the complexity of the network: the directedness of the graph, the spatiotemporal correlation, the network structure. At the same time, you can incorporate static infrastructure features using this kernel method. We evaluated the proposed method, which we call the local Gaussian process; it is something more complex, but essentially it is a Gaussian process on road segments clustered over space and time. We were able to do inference over the whole network at different time steps into the future, evaluate against state-of-the-art alternatives, and show that our error rate is among the most competitive, with pretty good performance when cross-validated against the actual data. The results were published in an archival journal last year, in the journal Big Data, in a special issue on urban computing. So that is the first problem.
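Without reproducing the exact kernel from the paper (which also handles the directedness of the graph), one common way to encode closeness in both space and time is a separable product kernel, optionally summed with a kernel over static infrastructure features. The length scales and readings below are assumptions made for the sketch.

```python
import numpy as np

def spacetime_kernel(s1, t1, s2, t2, ls_space=1.0, ls_time=15.0):
    """Separable spatiotemporal kernel: a spatial RBF over coordinates
    multiplied by a temporal RBF over timestamps (in minutes). Two
    readings covary strongly only if close in BOTH space and time."""
    d2_s = ((s1[:, None, :] - s2[None, :, :]) ** 2).sum(-1)
    d2_t = (t1[:, None] - t2[None, :]) ** 2
    return np.exp(-0.5 * d2_s / ls_space**2) * np.exp(-0.5 * d2_t / ls_time**2)

# Three hypothetical readings: two at the same segment an hour apart,
# and one at a distant segment at the same time as the first.
s = np.array([[0.0, 0.0], [0.0, 0.0], [3.0, 0.0]])
t = np.array([0.0, 60.0, 0.0])
K = spacetime_kernel(s, t, s, t)
# Static features (lanes, speed limit, ...) could enter as an added
# term, e.g. K_total = K + rbf_over_static_features.
print(K.round(3))
```

The product form is a modeling choice: it says spatial and temporal influence multiply, so being far away in either dimension alone is enough to kill the correlation.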
The second problem is crime incident prediction for urban law enforcement, which was consultancy work between me and my advisor at the time for a national law enforcement agency here in Singapore. Crime is an obvious effect of urbanization and population growth: as population density increases, more people move into certain neighborhoods, which increases incidents and puts a lot of stress on the local law enforcement agencies. The challenge is how to guarantee the quality of law enforcement service in light of these trends of rising demand and urbanization. For this problem, we were provided with a large-scale database of crime incidents with fine-grained details of when each incident happened, its context, and descriptions. So yes, it is a big database from the national law enforcement agency of Singapore, and with it we were able to make high-precision, high-fidelity predictions of crime incidents using machine learning, with much the same technique as before. As I said, the law enforcement agency wants to guarantee that, at any point in time, in any defined neighborhood with a certain manpower resource, such as a number of cars, no more than an alpha fraction of incidents go unresponded to within 15 minutes. That is the optimization problem underlying the prediction problem: if you can predict the trend of incidents in a particular place at a particular time, you can place resources at the right time and in the right place, and hence the quality of service can be guaranteed. As I said, the database is pretty big. Singapore is a pretty safe place; we got a couple of years of incident reports, with half a million reported incidents from emergency phone calls, police reports, and so on.
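The service-level guarantee described above can be stated as a simple check over observed response times. The alpha value, threshold, and response times below are illustrative assumptions, not the agency's actual figures.

```python
def qos_met(response_times_min, alpha=0.1, threshold_min=15.0):
    """Service-level constraint: at most an alpha fraction of incidents
    may have a response time exceeding the threshold."""
    late = sum(1 for t in response_times_min if t > threshold_min)
    return late / len(response_times_min) <= alpha

# Hypothetical response times for one neighborhood (minutes).
times = [4.0, 7.5, 12.0, 16.0, 9.0, 3.5, 14.9, 22.0, 6.0, 8.0]
print(qos_met(times, alpha=0.2))  # 2 of 10 are late: 0.2 <= 0.2 -> True
```

In the actual planning problem this check becomes a constraint: predicted incident counts per neighborhood drive where cars are placed so that the constraint holds everywhere.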
Each incident has a lot of features: spatiotemporal features, the response tactics, and other details that I cannot publicly discuss, plus metadata containing the police neighborhood boundaries, the sector boundaries, and the police deployments, where they go around patrolling and deploying resources. I was able to use the same technique of Gaussian processes, with kernels and features that capture the similarity of neighborhood boundaries and of space and time features, including whether it is a weekend or a weekday. By discretizing the time scale and counting incidents in each spatiotemporal bin, I could, given one bin, predict the neighboring bins, and predict ahead of time so that resources can be planned in advance. We did comparisons and came up with models that are competitive with the other state-of-the-art models, and we showed that our models can realistically capture the distribution of the number of incidents in certain neighborhoods, which was enough for the police force to do its resource planning. So that is the problem. I have only been able to introduce it at a very superficial level, because I cannot really go deep due to time constraints. But there are certain takeaways that I would like you to have after this talk, the urban computing takeaways. What is urban computing about? It is about the three Bs: big data, big challenges, and big cities, that is, solving big challenges using big data in big cities. It is about data management: how do you fuse cross-domain data, graph data, social network data, online and offline human behavior, trajectories of what people do in the real world against what they do in the online world?
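The spatiotemporal binning of incidents described above can be sketched as follows. The neighborhood IDs, timestamps, and six-hour slot width are all hypothetical choices for the example.

```python
from collections import Counter
from datetime import datetime

def bin_key(neighborhood, ts, slot_hours=6):
    """Map an incident to a (neighborhood, date, time-slot, weekend?) bin."""
    return (neighborhood, ts.date().isoformat(),
            ts.hour // slot_hours, ts.weekday() >= 5)

# Hypothetical incident log: (neighborhood id, timestamp).
incidents = [
    ("N1", datetime(2016, 3, 5, 2, 15)),   # Saturday, slot 0
    ("N1", datetime(2016, 3, 5, 3, 40)),   # same bin as above
    ("N2", datetime(2016, 3, 7, 14, 5)),   # Monday, slot 2
]
counts = Counter(bin_key(n, ts) for n, ts in incidents)
print(counts[("N1", "2016-03-05", 0, True)])  # -> 2
```

Counts per bin like these would then be the targets that the GP model regresses on, with the bin attributes (neighborhood, slot, weekend flag) feeding the kernel.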
It is about data mining: how do you get insights from the data, how do you group and cluster people, how do you learn and predict patterns of behavior? So it is about machine learning. And it is a win-win-win solution: a win for the people, for the city, and for the environment. Put these together and you have the three Bs and the three Ws. That is the takeaway. Thank you very much. You talked about bringing these together in your area, basically placing all these patrol cars? Yes, exactly. That is actually part of the work I did not mention: how do you predict the incidents, and then how do you predict the response time? Given the traffic conditions, can you optimize for the response time? You can give a prediction of the travel time for the police cars. That was also part of the research I did not mention; one problem leads to another. Can you repeat that? Right, City Brain by Alibaba. They are doing something really similar in large-scale cities; what is my opinion of it? To be honest, I am not aware of City Brain from Alibaba, but I know there are a lot of groups doing this kind of work. As I mentioned, the pioneering group is Microsoft Research Asia in Beijing, which proposed the whole problem and framework and did a lot of research on cross-domain data fusion, trajectory prediction, and all these sub-problems in urban computing. But I do not know much about Alibaba. Of the big challenges you put forward, which one do you find the simplest to solve? And secondly, which one do you think would have the biggest impact? Can you repeat the first question, please? Which one is the easiest? Right. Really, no problem is the easiest; they are all interlinked, and one leads to another. If you think of the urban system as a complex system, everything is interrelated.
I mean, nothing really is; as I said, the law enforcement problem calls for the deployment problem, how you deploy and what the response time is given the traffic conditions. And incidents can in turn affect the traffic conditions, if you have crime incidents or other incidents that disrupt the traffic. Everything is an organic, complex system, so first of all, I do not think anything is simpler than anything else. And what was the second question? Which of the problems will have the biggest impact? Right. Well, I do not know whether the impact is positive or negative, but from what I have been reading and researching on this problem, a lot of effort has been put into linking online and offline social behavior. Your phone records your offline behavior, where you go and at what time, and you also have your online social network friends. How do you pinpoint an individual and merge these offline trajectories with their online social network behavior? There is a lot of work going on there, particularly in China, as far as I am aware. So suddenly there are people who have the power to know too much about you, and there are a lot of concerns and problems about privacy. There is a lot of impact when it comes to collecting massive data and knowing too much about individual citizens, at any level, so that is a huge concern; but it is something that is actively going on in this urban computing research domain. What was the response from law enforcement, if you worked with them at all, in terms of using the results from the research? Right. The response is that they are adopting the model, and it will probably take years to calibrate its impact on law enforcement. But there is a growing concern about how to optimize resources given this growing trend of population and so on.
How do you guarantee that no more than an alpha fraction of incidents go unresponded to within 15 minutes? That was the original question. But for the impact to be calibrated, it will probably take years to know whether that quality of service can be maintained. Yes, exactly. Not only do the population and infrastructure evolve, but the types and nature of the crimes, the incidents themselves, also evolve, and that has to be taken into account in the model. So it is not only the number of incidents but also the types of incidents and how they evolve. So you worked with these two specific areas? Yes, exactly. I think there is a question over there. OK. Thank you very much. So next up, we have our next speaker.