Hi everyone. What I'm going to present is a bit more technical, so do interrupt me if you have any questions or feel confused. I will start my presentation with a very small demo. Here you see this small box; it is the same as this one. This is a sensor box developed by the 5G Center, which we are working with. The sensor box consists of a set of sensor modules that capture different information, and I will introduce the modules by showing what kind of information each one can capture. This is a live data stream coming from the sensor box. The different lines indicate the different information it is capturing: temperature, noise, humidity, brightness, distance, and dust density. I will start by showing you how these lines change according to my activity. For example, you can see this purple line changing if I come closer. Here. It goes down because I'm closer to the sensor box. This captures people passing by in the house, so we know at what time and for how long a person is moving back and forth. Now I just breathed into the sensor box, and you should notice this blue line increased. That's humidity, because I breathed air into the sensor. Then it drops down, because I only breathed a little bit. You will also notice the purple line goes down as well, because I was closer to the sensor box. This yellow line indicates brightness, the level of light it can sense. If you notice anything changing, that's because I'm using my cell phone to give it more light, so you can see the level has increased. Temperature is more tricky because it is slower to change, but now you have an idea: it's around 25, 27, 28, something like that. And dust is pretty low. That's good, right? There's little dust.
But if I put something into it, it increases a bit; you can see the increase because the baseline value is quite small. It captures particles, right? So if I put something in it, it registers a large amount of dust. Sound is a bit tricky when I'm using my computer, because there is interference on the connection. That's why the baseline value is pretty high. But if you plug the box into the wall, which is what we do in houses, it has a baseline value of 30 or 40 decibels. This here is a microphone, but we only capture decibels. We don't record any conversation or sound. Yes, you see that it increased. And this green line, that's air pressure. Sound is transmitted through air pressure, so that's why you see the change if I make a little sound. So that's a very brief overview of what kind of sensors we are using and what kind of information we are capturing. These are all raw data coming in from the live stream from the sensor box. Based on this information, we want to learn something meaningful, for example what kind of activities people are doing in their house. But this is only a part of the sensors that we're using, so let me know if you're interested and we can have further discussion.
So the demo is now over, and I'm going to my slides. I'm going to talk more about sensor-generated data, visualization and data analysis. First I'm going to give an overview of the sensor data we're capturing. Then I'm going to move on to how to visualize the data and then how to analyze it. So this is, wait a minute. This is the overview picture of data collection, data transmission, storage, and visualization and analysis. Here you see a lot of different kinds of sensors that you can use. You put the sensors in the house, in a workshop, in any place that you are interested in researching.
Then you have different ways of transmitting the data, whether Wi-Fi or Bluetooth or any other technology you can use. Also, very importantly, you want to store the data in a secure way and transmit it in a secure way, so you have to be careful about data security. On the other side, for us, we have a data server, and the data is transmitted in real time to the data server. But you can also choose to store it locally. I think that's the way Sphere is doing it, so you have local storage. For us encryption is really important, because we transmit data to our data server in real time.
So this is the sensor box that I just demoed. We also have another type of sensor, the energy monitor. This one we developed ourselves; this one is an off-the-shelf product. And this one as well is a commercial product we are using. But we're not using it as an activity monitor, measuring the number of steps in that sense. We're using it more like a Bluetooth device. People wear it. This box has a Bluetooth module that can sense nearby Bluetooth devices, and then we know the signal source, and in that way we know where a person is. If we deploy a number of sensor boxes in different rooms, the signals will be different according to the location of the person. That's the purpose. So we're using this together with this, rather than using the original functionality of measuring the number of steps. Those are what we call digital sensors. But we also have another way of capturing information. We call it human sensors.
Humans sense themselves and try to log what they're doing in terms of activities, like cooking, sleeping, or watching TV: at what time, using what kind of devices, in which room, with whom, and also the level of, you know, how much you enjoyed it... So those are the two types of sensor information we're collecting.
So then we capture the information. There are so many different kinds of information we are collecting, and if we want to understand it in an easy way, how do we do it? We want to visualize it. The first option is an open source platform, not only to visualize, but also to monitor data collection. You have loads of options here. There are so many open source platforms out there you can choose from, for example Kibana, Grafana, etc. We are using Kibana. With this kind of choice you have less development and a quick start, but it has basic functionality. What I mean by basic functionality is that if you want to do further analysis with the data and then visualize it in a more meaningful way, you might find it very limited. Another option is to do the coding yourself. You use any language you're familiar with, like R or Python, and then you can code your own visualization methods and visualize the data in the way that you want to investigate further. So this is just an overview. Next I will give you some examples of what we do and how we do it in HomeSense. For example, this is the Kibana platform. This is just monitoring data coming from different sensors, at different times, with different numbers of data points coming in, and these are the values. From this we know whether there is a sensor that is having a problem, for example one that stopped sending data. Then we know in which house and in which room.
Then we can contact the household to restart the sensor or tell us more about what's going on, or we go there to replace the sensor. And in a more meaningful way, if we visualize over a longer term, we can see patterns in the house. But this is a very initial stage of visualizing data, just very simple line charts. You see how the data are changing. For example, this one is the ranging sensor. Say we put the sensor in our office. Then we can see that in the morning some people come in and there are movements, and then at lunchtime there are people moving. So you can see that the patterns every day might be similar, but with differences between weekdays and weekends. Things like that. And also the change of temperature, humidity, and noise level. There are all kinds of correlations between those data. If you can gather an overview like this, you get a brief understanding of the data you are collecting. Let me see. I think it is at least more than a week. Yes.
Then that comes back to more customized visualization. For example, this is what we did for a paper. We wanted to understand electricity consumption in relation to different kinds of activities. You have electricity consumption changing over the day and on different days. You also have change points, where it changes. But you also have the length of the duration when nothing changed, or there were far fewer changes. Then you want to visualize it like: okay, here there is a big area where nothing is changing, and here as well, and here as well. Then you compare it with some other attributes, for example movements. That gives a better indication of sleeping. For most people, when you go to sleep, your house stabilizes.
Electricity usage will normally go to a minimum value and stay there for quite a long time. For this kind of visualization, you have to do some analysis first and then visualize the data based on your analysis. This is something that you might need to program yourself, because an open source platform, like I mentioned, has a lot of functionality, but only basic functionality. Or maybe you want a more advanced visualization like a 3D plot, depending on the data you have. And you can do a clustering of it. Wait a minute. For example, this is electricity, this is the number of movements, and this is the time of day. You cluster the data into however many groups. You can group the data into different categories, see which group has which kinds of values, and get a better understanding that these might mean different things. This is just an example of how you might want to visualize your data, and this might be something that you want to program yourself. There are so many choices you can make, but basically you have an open source platform to start with, and then if you want to do more sophisticated analysis you might want to do some coding yourself. So that's visualization. But again, I think it's not like a step one and a step two. It's always a mix. You do some visualization, you get an idea of your data, then you do some analysis, and then you want to do some visualization again. It's a loop, not one and then two.
So now I'm going to talk a little bit about data analysis. I think this is quite a standard procedure. You want to clean your data and do some transformation, and then you might also want to reduce the amount of data you use. For cleaning, maybe you have missing values, for example, as I mentioned earlier.
You might have some sensors malfunctioning or stopped functioning. A lot of times we have to contact the households to restart the sensor, because there might be some problem with their broadband, or sometimes they have electricity issues, or they have guests who came and they don't want to monitor the guests, so they just turn it off. So you definitely have missing values. There is also anomaly detection, because some of the sensor modules might break. For example, we had a sensor box where the temperature was always below zero, and that's not possible at all. For data analysis, depending on the algorithm you use, you might want to do some normalization, that is, normalize your data to zero mean and a standard deviation of one. You might also want to map your data to some other meaningful labels, or do window length slicing and feature extraction. So that's data transformation. For data reduction, let me give you a simple example. We want to deploy as many sensor boxes as possible, in every room. As technical people, we really want to capture everything that we can, so that we have a better understanding, can do more analysis, and have more information. But on the other hand, sometimes we see very similar data points coming from two different boxes. For example, if I put one here and one here, and people have very similar patterns every day, these two boxes close to each other might give you very similar readings. Then you might want to reduce the amount of data and select the more important features to use for the analysis. So that's pre-processing. You want to prepare the data you will use for further analysis, for extracting knowledge from the data.
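To make the pre-processing steps just described a bit more concrete, here is a minimal sketch in Python using only the standard library. It is not the code used in the project. The function names, the plausible indoor temperature range of 0 to 50 degrees, and the window length are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def clean(readings, low, high):
    """Drop missing values (None) and readings outside a plausible range."""
    return [r for r in readings if r is not None and low <= r <= high]

def normalize(values):
    """Rescale values to zero mean and unit standard deviation (z-score)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant signal carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def windows(values, length, step):
    """Slice a series into fixed-length windows, moving forward by `step`."""
    return [values[i:i + length]
            for i in range(0, len(values) - length + 1, step)]

def features(window):
    """Extract a few simple summary features from one window."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "range": max(window) - min(window),
    }

# Hypothetical temperature readings: None marks a gap in the data,
# and -40.0 stands in for a broken sensor module.
raw = [21.0, 21.5, None, -40.0, 22.0, 22.5, 23.0, None, 23.5]

cleaned = clean(raw, low=0.0, high=50.0)   # assumed plausible indoor range
scaled = normalize(cleaned)
feature_rows = [features(w) for w in windows(scaled, length=3, step=3)]
```

Here the cleaning step simply drops bad values. Depending on the analysis, you might instead want to interpolate gaps or flag them, which this sketch does not attempt.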
For data analysis, you can use very simple descriptive statistics like the mean, standard deviation, variance, quartiles, skewness and correlation. You can also use statistical inference: you assume a certain distribution, you try to estimate the parameters of the distribution, and then you have a model of your data. Or you can use more advanced techniques, like the most popular one nowadays, machine learning. It depends on the nature of your problem. If you want to predict a continuous, real value, that's regression. Or you want to classify: for example, you want to know which activity people are doing. That's classification, because it has discrete values. For example, whether you are sleeping or not is a binary thing, so that's classification. Or you want to cluster the households depending on their energy usage, which is grouping different households by the energy consumption they have. You can build models and try to fit them to your data, and then you can use the model either to understand or explain why something happens, or to predict future trends. But for all of this, the most important thing is that you evaluate whether the method is good enough to tell you future trends, or even for explanation. Again, depending on the nature of the task, you might have different metrics for evaluation, like precision, recall, F1, mean squared error, or edit distance. Don't be puzzled if this all seems a bit too technical to you. I will give you an example in a minute.
Okay, here comes the example. Feature extraction is more like a data transformation, right? So this one, don't look at the mess. Just think about a signal that is capturing temperature over the day. You want to know whether at a certain point the distribution of the data changed,
or whether there is a significant change in the temperature. For example, in the morning it is normally much cooler, and then at a certain point it becomes much warmer. Then in the afternoon maybe it's much, much warmer, and then it goes down. You want to find those change points. That's a natural change depending on the weather. But if you cook, it becomes even hotter in the room. So people do activities, and these activities change the readings from the sensor box, and we want to find the point when this happens. That's change point detection. What it looks like is this. This is temperature, this is humidity, and that's the signal we get as the values change over the day. We try to find the significant changes along the data. You can also pick up these small ones, but they might not be that meaningful to us; the weather also changes. So we really want to find the significant change points that might lead us to some conclusion about certain activities being performed. This is a kind of analysis where you find out where there is a significant change, and that leads you to further investigation of what is happening in the house.
And again, like I mentioned, there's so much information and so much data, and there are so many sensors you could deploy in the house. If you combine them all together, you might like to use the words big data, but it's not. You have so much duplicate information. Like I mentioned, if you put two sensor boxes here and here, the temperature will be the same, and the humidity might be the same as well. Even with the movements: if I'm talking here and walking back and forth, they are capturing similar information. The intention in deploying as many sensors as possible is not to miss any information. But on the other hand, you might get duplicate information as well.
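One simple way to spot this kind of duplicate information between two boxes is to compute the correlation between their readings and flag pairs that are almost perfectly correlated. The sketch below is an illustration only, not the method used in the project. The feature names, the sample values, and the 0.95 threshold are assumptions made up for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant series has no defined correlation; treat it as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)

def redundant_pairs(features, threshold=0.95):
    """Return (name_a, name_b, r) for feature pairs with |r| >= threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs

# Hypothetical readings from two boxes placed close to each other.
features = {
    "box1_temp": [20.0, 20.5, 21.0, 23.0, 22.0, 21.0],
    "box2_temp": [20.2, 20.6, 21.1, 23.3, 22.1, 21.2],
    "box1_dist": [150, 40, 150, 35, 150, 150],
}

pairs = redundant_pairs(features)
for a, b, r in pairs:
    print(f"{a} and {b} look redundant (r = {r:.3f})")
```

For each flagged pair you would typically keep one feature and drop the other. Which one to keep is a judgment call that this sketch leaves open.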
So we want to minimize the information we use for analysis, in the sense that we want to remove the duplication. Feature selection is about finding the more important information while reducing duplicated or repeated information. Here's a heat map. These are two boxes deployed in one kitchen, and you can see very correlated features, like the temperature change. The darker it is, the more similar they are. If you plot this, you might be able to tell for which features you only want to keep one and remove the other. And this one is more about activities and features. For example, doing laundry is very closely related to the electricity usage of the washing machine. I think that's very intuitive and straightforward. Another one, like this, is sleeping, which is very closely related to energy consumption, like I mentioned before. If you're sleeping, the house is calming down, and the electricity usage really stabilizes at a low level for a longer period.
Oh, okay, sorry, I might have too many slides. Like I mentioned, you have pre-processing, where you prepare the data, and you really want the minimum amount of data that is important for further analysis. And now what you really want to address is your research question. For us, we want to understand what people are doing in their house, that is, household activities. The sensors are sensing everything, and the changes in the sensor signals are very likely caused by changes of activity at home. For example, when you're cooking, the temperature and humidity will change, and electricity usage might also change. So whether we can recognize activities from sensor-generated data, that's the research question we have. But sensor data don't tell you what is going on.
They only tell you that temperature is changing, humidity is changing. That's why we have the other side of the information, the time use diary, where people tell you what they are doing. Then you can compare the two. So, a very simple explanation. This is the information we observe from the sensors. And this is a hidden state: people changing their activities. For example, I'm sleeping, I'm getting up, I'm preparing my breakfast, I'm working in front of my computer, and then cooking lunch or doing something else, or doing laundry, or watching TV or whatever. That's a hidden state from the point of view of the sensors. The sensors observe only sensor signals like temperature, electricity usage, movements, et cetera.
A simple example: cooking. We're using data from the energy monitor. This is only one house as an example. The most important indicator for cooking is electricity usage. Here, this is what the time use diary tells you; the black lines tell you I'm cooking, I'm cooking, I'm cooking. And these are the patterns learned from the sensor-generated data. Basically it's a binary classification: either something is happening or it is not. Based on the sensor-generated data, we learn this, and this is what the time use diary tells us. You can see that they are very similar, which means that recognizing cooking for this house is actually quite simple, because we have a strong indicator from the energy consumption.
Here it is even simpler, for sleeping. This is using a sensor box in the living room. The living room is the hub of this house. Whenever this person goes to sleep, they pass through the living room, and whenever the person gets up in the morning and goes to other rooms to do other activities, they also pass through the living room. So the movements in the living room are a strong indicator of the duration of sleep,
and of what time he or she goes to sleep and what time he or she gets up. But again, like I mentioned, the time use diary is also a type of sensor. It's a human sensor, and it does not necessarily tell you the ground truth. It is a kind of information you want to use, but you don't take it too seriously. So you want to validate between the two, to find the agreement between them: to what extent the pattern we learned from one agrees with the other. That's agreement evaluation. For our research, the evaluation metric we use is the edit distance. Basically what it does is this. The person tells you the sequence of activities they did. Just imagine you have a limited number of activities the person is doing: getting up, cooking, watching TV, doing laundry. This is a sequence. We also have a sequence learned from the sensor-generated data. You try to convert one sequence into the other by changing some of the labels. What is the minimum number of operations you need to convert one into the other? This is the evaluation metric we're using to compare the similarity between the two. By this, we get an understanding of what type of information from the sensors we can use to learn what kind of activities people are doing, which ones are more relevant, and to what extent an activity can be recognized, depending on the agreement between the two kinds of information we get from the two kinds of sensors.
Just the last two slides. This one is about energy consumption and activities. Nowadays, people's lives are closely related to the usage of appliances, right? For example, in my house there's no gas, only electricity. So if you know what type of appliances I'm using, then you almost know what type of activities I'm doing. I'm cooking with an electric hob. I'm taking a shower, again using electricity. I'm working with my laptop.
And what else? I'm playing games, whatever. So I'm an electricity person. If you know how much electricity I'm consuming, then you probably know what type of activities I'm doing. That's something interesting for us. If you know the total energy consumption and you can disaggregate it into the usage of individual appliances, then you have a better understanding of the type of activities people are doing. So that's the link between activities and energy consumption.
Last one, indoor localization. Like I mentioned, we have sensors in different rooms. If we know that we have a stronger signal here than there, then the person might be here. And with the times, from when to when, we have a better understanding of where this person is inside the house. Then again, different types of rooms have different functions. For example, if I'm cooking, 99.99% I'm in the kitchen, right? If I'm sleeping, I'm in the bedroom. So that's the type of relationship between activities and locations in the house.
Last slide: some experience to share. The first one is to choose the right sensors to capture the relevant information you want to explore. What I mean by right is functionality, ease of use and price. Like I mentioned, we use a mix of sensors. We developed our own sensors, and we also use commercial products. My experience is that if you develop your own, you have much more control, right? You can design how you collect the data and what type of data you want to collect. But it takes a lot of effort. There's a long learning curve, and you have to test it again and again and again. Everything works perfectly in the lab. The minute it goes out of the lab, don't even think about it. With commercial products, you have limited access to the data. That's why we use the smart band in a different way, because there's no open API for us to get the data from the band.
That's why we thought, okay, maybe we should go in another direction rather than using the original functionality of the band. The second one is to start with an open source platform. I think that's really, really valuable. For us, it saves a lot of time and effort, and we can get a better understanding of the data we are collecting, which again feeds back into whether we want to change something. So if you want to start something quickly, do choose an open source platform to start with. Otherwise, you might have a lot of trouble developing your own and testing it. That's a whole lot of work. The third one is to understand the nature of your research question, whether you want to do classification or regression or clustering. That really depends on the nature of the task. Then you have to try different kinds of methods and compare them to understand what works better and what goes wrong. For example, at the beginning we thought, okay, we can use supervised learning because we know the ground truth. People are telling us what they are doing. Don't believe it. I'm not saying that they are telling you something fake. It's just that if you ask people to record what they are doing every 10 minutes, they get bored and they don't follow that. Most people will just sit down in the evening and ask themselves, what have I been doing from what time to what time? That's what most people do. We're not blaming them, because this is really a difficult task for them. Asking people to update what they are doing every 10 minutes is really a lot of work. Finally, evaluation metrics need to reflect the characteristics of the data as well as the model you use to fit the data. Again, for us, the reason we want to use the edit distance is to alleviate the influence of missing values and of shifts in time.
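The edit distance mentioned above can be sketched as follows. This is the standard Levenshtein distance applied to activity labels instead of characters, counting insertions, deletions and substitutions with a cost of one each. The talk does not say exactly which operations or costs were used in the project, so treat those choices, and the example sequences, as assumptions.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn sequence `a` into sequence `b` (Levenshtein distance)."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x with y (or keep it)
            ))
        prev = curr
    return prev[-1]

# Hypothetical sequences: one reported in the diary, one inferred from sensors.
diary = ["getting up", "cooking", "watching TV", "doing laundry"]
sensed = ["getting up", "cooking", "doing laundry"]

print(edit_distance(diary, sensed))  # 1: "watching TV" is missing from the sensed sequence
```

Because the comparison is on the order of the labels rather than on exact timestamps, a diary entry that is shifted by a few minutes or missing altogether costs at most a small number of operations instead of breaking the whole comparison.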
So, yes, thank you. Sorry for the bit of a rush.