Hello, everyone. I hope you are all enjoying the conference so far. How do you like it? Nice. I hope that I live up to the expectations for this talk, so let's get started. I'm going to talk about the quantified self: all the different wearables, different apps and all the data streams that you can get, what the problems are with out-of-the-box apps, and what you can do about it.

A little bit about me. As already mentioned, I kind of settled into an MLE role, but in reality you are never just a machine learning engineer. You do everything: you're MLOps, you're a data scientist, you're a machine learning engineer. I am an engineer by day and a crazy data scientist by night. I absolutely love playing around with new technologies and with new gadgets, if I can get my hands on them, and yeah, it's a little perk of the work that I can get some pretty cool things. If you want to connect, I'm on Twitter, Mastodon, Discord. You can also find me in the EuroPython Discord. Just ping me if you have any questions afterwards.

So let's dive in. What about the quantified self? There are different types of it. I want to give you my motivation: why it is interesting for me and why I think it's important for everybody. Then we'll dive into different areas of tracking: what it is, how it is done, what biases it creates and what biases it fights. I'll give you some more concrete examples and tips on how to minimize the not-so-good parts. Then I'll give you an overview of a sample analysis of your data. I wouldn't say you're supposed to run it, but it's an idea to play around with, where you can get more insights. And in the end we will have time for Q&A. I will try not to be too slow, so that you have enough time to get something to eat and be ahead of the others. But let's see how it goes.

So, data collection. Before we start with this: there is a market of mobile apps and wearables. There are two markets. One is health apps and one is lifestyle apps.
Health apps are medical grade and strongly regulated. It's a rather big market, but it's also very expensive and not for everyone. So health issues we're not going to touch in this talk. We're going to talk about lifestyle apps: fitness tracking, meditation, sleeping, everything where regulations kind of say, yeah, you should be careful with the data, you should not use untrusted algorithms. So there are some specific guidelines from medicine, but generally speaking this is just someone who came up with an idea for some form of app.

Then, I'm going to talk about consumer-grade devices. I'm not going to talk about algorithms and devices used in the medical field, because again, you won't be able to get your hands on them, and they are actually very hard to use outside of the hospital context. We're also going to concentrate on structured data collection, but there are other ways, like taking pictures or free-form journaling, that you can use in your everyday life.

In this talk we're going to go from active, or manual, tracking to automated tracking, with hybrids in between. Manual means you give your impression about yourself. How are you doing today, from zero to ten? What was the highlight of the day? And you write free text. Down the line we get to 24/7 wearables that do not require any input: heart rate monitors like watches, rings, or something else like bands, you can imagine. And of course, hybrids. Hybrids would also be wearables, but you need to create a specific context, like "I'm starting a training now". Those gadgets are capable of identifying when you start a training, but funnily enough they don't track it as a training. Weird fact.

Also, let's talk about biases. A bias is a logical fallacy in the end. It can stem from the data collection process, where there is a specific flaw in the process or a flaw in the hypothesis about the customer. It can stem from data analysis.
Again: different hypotheses, different algorithms that are not really usable for the data you collected, or some wrong steps in the cleaning of the data. Different things can appear. And data representation will skew your understanding of what you're doing wrong, and it will actually influence your further judgment, thus influencing the data collection step. That's a crazy world.

So let's dive into active tracking. Within active tracking we have structured apps and unstructured apps. Let's briefly touch on unstructured. This would basically be therapy. It's free-form: it's journaling, it's mood journals where you write with your own hand, you can draw pictures and everything else. But the problem is that this is so hard to standardize and to put into one app for multiple different, non-homogeneous customer groups. So it doesn't make sense. Go to therapy. Always good.

Structured, on the other side, is a huge market. Lifestyle apps like mood trackers, different chatbots and so on were estimated at around 10 to 20 billion in 2000 pre-pandemic time, and the market was growing at crazy rates. And it's not even saturated now. So we're talking about apps that ask you once a day, how is your mood? And you give a number from 0 to 10, or from 0 to 5, or answer some specific things like, did you experience pain or an upset stomach, if you're tracking these specific things. In a nutshell, this group of apps tracks a very specific question about a very narrow part of your life.

But there are some really problematic things about it. For instance, again, they are built on the assumption of a homogeneous group of people, and this is never true, right? It's never the case. So they can have the wrong scope. For instance, your mood and your energy levels correlate heavily, and you got a mood tracker, but what you need is an energy tracker, because if you're productive, you're happy. If you're not productive, you are unhappy.
It's a situational thing, but still, for you personally, it would be a different scope.

Then, the wrong scale. There are quite a few apps that ask you to provide a number from 0 to 5, which is actually considered not enough, because you are losing a lot of in-between data. For instance, am I feeling meh today, or am I feeling meh-meh today? Am I, like, absolutely amazing, it's a 5, or is my 5 actually a 7? And between 7 and 10 there is a huge gap. So you are losing a lot of data, a lot of gradation, like gradual scaling, and in the end you will flatten out in your estimates as well, because everything is meh, everything is meh-good, but I don't have a specific number for that.

Time frequency is also a pretty cool thing. Again, a super example for mood: most of the apps ask you either once at a random time during the day or in the evening, how are you doing today? And I had a rollercoaster of emotions through the day. I was down on the floor, I thought I was going to die, or I was above the sky, above the clouds from happiness. What do I put in? The average, the highest, the lowest? In the moment where you actually think about it, it's not worth anything. It's already biased. It's already just trash.

And the representation, of course, is also pretty funny. Some apps show you the previous results before asking the questions, which conditions you to follow the trend or to follow your expectations about yourself, again through the number.

So, the biases. You can go to Wikipedia and do a small exercise for yourself: go through the list and ask, am I actually prone to this bias? Am I resilient to this one? Is this about me? Do I tend to do this? It's a really good game that can occupy you for hours. But there are some biases that these apps are trying to minimize, like anchoring. For instance, a lot of people think that it's because of one particular thing. I'm not going to drink alcohol. Everything's going to be fine. I'm going to do sports.
Everything's going to be fine. Not the case. People are complicated. Life is complicated.

But at the same time, the next really funny bias is "everything is connected". There is a really cool example of how correlation is not causation. For instance, the more ice cream people buy, the more pirate ships are out there at sea. And you're like, what the fuck? This is not the case. But this is a temporal correlation: it's summer, right? Everybody likes eating ice cream in the summer. And in the summer pirates are more active because of better weather, fewer storms and whatnot. The two don't have anything to do with each other. But this is the case. So these apps, by isolating different contexts, reduce these biases. And you can read on, and so forth. And again, for every single person the set of biases is going to be different.

But these apps are also causing cognitive distortions. Like, everything is connected, right? If you have a fitness tracker, it will say your sleep, your alcohol, your social interactions, your something else, everything is connected. And they do this by having a baseline assumption about all people, which again is not going to be true, or might not be true, for you specifically. The algorithms behind your app are not customized in most cases.

Then, presentation matters. It's like, I'm going to anchor you and tell you that you feel better than you actually feel. Or, for instance, I feel great, but I didn't sleep much. And I'm like, oh no, I'm actually supposed to feel sad. I'm probably going to crash today. And only because of this conditioning I will feel worse, which makes no sense. The brain is a crazy thing. Right. And things like hindsight or searching for information, those are minimized, and this is a good thing.

But I guess you have used something from here, and this is crazy in my eyes. It's a complete overload. You can count calories, you can count your social interactions, you can count anything to do everything.
And at some point you are oversaturated, and this is a problem of UI/UX as well as of the design of the data collection flow.

So there are a couple of solutions that I came up with. You can decide for yourself whether they work. In general, it's: keep it simple. Try to concentrate on one thing. Ideally, read up on some research papers or something general. I wouldn't advise blog posts, because they oversimplify or, on the other hand, sometimes present the issue as too complex. So read up if you have a specific concern, and try to search for an app that does exactly that. There is definitely at least one. Then get the data out of it and run your own analysis. I will show you in the end how it can be done.

Automated tracking. There is definitely a hypothesis that how you feel will be reflected in your physical parameters. Like, if I'm stressed right now: I'm at 127 BPM. I'm freaking out. And my watch tells me that I have no alerts, because they're set at 150, but we'll get there. So yeah, wearables and apps: automated tracking is paired with an app and with hardware. It's hardware driven, so it has better tracking, because it doesn't depend on your "I feel bad" or something else. It can basically tell you: something happened here, something happened here, something happened, nothing happened, you get it. And you can pair your mental state in that moment with the data and then get some results out of it.

Manual input is possible. You can adjust the numbers, or you can, again, pair the data with something that this tracking device was not supposed to measure, or is not measuring. It presents graphs, numbers and real-time feedback. But since there is an app, it suffers from the very same problems as the manual apps. And of course, on top of that, you have some problems with the sensors. They are not perfect. You have problems with the context or situation you are in, for instance, if I'm going for a swim.
My Garmin is not considered to be a swimmer's Garmin. So I probably have to get another watch, as always. But the problem is that the connection to my skin has an additional layer of water, which reduces the accuracy of the signal. There are also some very specific differences: darker skin, lighter skin or even something else. So there is data loss. Battery lifetime is still a problem, especially in very small devices. And of course, undisclosed algorithms. Most of the apps and producers of these wearables are writing papers and posting blog posts, but they are never, as far as I know, telling you what the formula is and whether it makes sense for you as a specific group.

That said, there are very different sets of sensors and different forms of tracking devices. The most used sensors are optical sensors. This means shining light through your skin and analyzing the returning light. How much energy did it lose? Based on that, according to different algorithms, you can estimate heart rate and heart rate variability, as well as your oxygen saturation. Don't ask me about the exact algorithms, or let's check them out together afterwards.

Then, a bioimpedance sensor tries to identify the resistance of your skin. It's not for heart rate, although I saw some claims that it can be used like this. I don't know how. But again, I'm not the one who is developing that. This is more used in scales to identify your body composition, like how much fat you have, what percentage is bone structure and stuff like that. But you can theoretically also use it for stress level estimation.

Then the accelerometer, you know that. The temperature sensor, you know that too, because the general assumption is that the higher your stress goes, the higher the temperature is. You can also use this type of process to estimate things. The gyroscope is there to properly identify movement, for instance, if you have a need for workout tracking or something else.
GPS, not everywhere, of course, but watches are mainly equipped with GPS. And this is also pretty cool, because you can more reliably, and thus more accurately, estimate the area you've covered on your hike or your walk. The algorithms are using additional sensor data to improve the prediction or estimation.

Then, I put those two down after many other sensors: ECG and EEG. ECG stands for electrocardiogram, which sends an electric signal through one point in your body and reads it out at another. It's kind of close to bioimpedance from a very high-level view, but it's not. It's a medical-grade approach. You've probably seen this in hospitals if you've ever been there. They put a small sensor somewhere and it reads out your heart rate and all the variations. And EEG is an electroencephalogram, where you put a cap with multiple different electrodes on your scalp or somewhere else, and they're reading out your brain waves. Super complex topic, super interesting topic. I'm not going to talk about it, because I just don't trust myself to be good enough in this.

So these things are specialized, and there are consumer-grade bands for sleeping, for meditation and for focus. I have no association with the companies, but in my experience the best-known one is Muse, for sleeping and focus. Others are still more for developers, like Neurosity, Emotiv and so on. They're still positioning themselves as research, consumer-grade gadgets. They're super expensive and they're rather big. Also, they are utilizing so-called dry electrodes, which have the problem of just sitting somewhere on your head, so you cannot guarantee proper readings. So these two are prone to huge errors just from the reading collection.

Otherwise: watches, bands, rings, even some implants. I'm not talking about implants, because they're mostly medical grade. So now you know where they are.
You know, I would say, I think Elon would argue with you, but let's see. I mean, yeah, they should be. But there are real implants, implants that go really deep and can only be extracted with surgery. And, for instance, glucose readings, where you just take a couple of drops of your blood, are considered intrusive. They're not implants as such, but yeah. Let's see. I really am skeptical whether they will go consumer grade. The future is gray.

But also, different apps, different algorithms and different types of sensors, or even how they are built, produce very different data. For instance, this is the same day, or rather night, with around the same score, but you can see the deep sleep differs by a factor of about two. One is a Garmin watch, one is an Oura Ring. Who is correct? I don't know. I would like to think that Oura is correct, because otherwise I'm in trouble. So what do you do about that? You get another gadget that uses a different algorithm and a different sensor to get a third value, to average it out and see who is right. And then you have the problem that this one also might be flawed. It's a vicious circle. But just be aware: if you have multiple sources for the same type of data, be very careful with that.

So, biases minimized and created by this one. Sorry, they're about the same. These devices definitely cause fewer biases, just because your input is not needed, and you can't really try to fake your stress all the time. Like, yeah, you can start breathing really fast all the time, but you're probably not going to last for too long. And why would you do that? The biases minimized are, of course, everything that I showed on the previous slide for manual tracking. In my experience, and again, you can experience something else, it works better, because you are taking the perception bias out of the data collection. Again, if you have any wearables, go through the bias list and check what you think about it.
Solutions for these things: search for an app, or ask your provider for a feature, that applies filters to your data. Don't react to absolute numbers. Try to see them in progression. Absolute numbers will have a meaning if you have an absolute outlier. Then keep an eye on it. And then you can rely on the data provided by your app within a small horizon. Because normally, anomalies in health data, and this is already health data although we are talking about lifestyle apps, are not occurring suddenly. They accumulate over a long period of time, but there are specific triggers. And you have a better chance of catching them by looking at up to two weeks' worth of data.

Export data. Most of the apps are pretty nice. Although the structure is awful, they still allow you to export data or get it through an API. Try to analyze it by comparing similar periods, like the first quarter of the previous year with the first quarter of this year. Or go for a full-year trend and split it by changes in behavioral patterns. For instance, I know that I tend to sleep longer in winter, thanks to not a lot of sun, and less in summer, and I feel great. So you would see this difference as well. But don't correlate directly: not enough sleep, bad; too much sleep, bad. It will depend on so many contexts. You can calibrate your incoming data and the graphs with other gadgets, but again, this is like a spiral. And it costs so much.

Analysis. So, a couple of these analyses. I tend to see my health data as a time series. And the first approach that I took is that I wanted to break it down into trend, cycles and noise. You can do a lot with cycles. Along with this, you will know how you are actually progressing: the trend. And also whether you are cyclic. Quite a lot of things in life are cyclic. The sun goes down, goes up. You tend to be hungry. You tend to be full. And so on and so forth. And hormones are definitely regulating all these processes.
And it's not a long stretch to think that there are cycles of different sizes in your life. Mood, energy, stuff like that. For women, it's even more than that. So if you are just aware of those, that's already pretty cool knowledge. And keep an eye on anomalies. Anomalies are not throwaway data. They are their own data, their own time series. And also, check in with your doctor regularly. Don't rely on the trackers.

Okay. Let me go to my small thing. Can you see it well? Can you read it well? Okay. Just to say: I'm not a statistician by trade, so I've learned this stuff again and again. I do have some mathematical training, but not in depth. So if you spot some, let's say, not optimal algorithms, do let me know, please. I would love to learn, but don't judge me too harshly.

So first, this is my data. I'm fine with sharing it. You can check it out on the GitHub profile. I'm fine with this. You have my confirmation for GDPR. Just go ahead. Right. Let's check out what Oura gives me. This is the BPM. And these four things are the arbitrary results that their algorithm comes up with to simplify the representation of the results, which we don't need. As you can see, it measures BPM every five minutes when I'm awake. And this is just awake data. So I need to do some data preparation, grouping, blah, blah, blah. Let's skip that. For simplicity, for this talk, I went with the minimum, maximum and average of the BPM readings for each day.

And then you plot it. I am comparing one month, January specifically, from 2022 and from 2023. I can just tell you that they were very different in my memory. In one, I was rather stressed. And you can see that's the case. In this case, the red line is the later period and the green line is the older period. In the older period I had about the same direction, but lower numbers. And this January was rough.
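As an editorial aside, the grouping step described above (five-minute BPM readings reduced to a daily minimum, maximum and average) can be sketched roughly as follows. This is a minimal sketch using only the Python standard library, not the speaker's notebook code: the function name `daily_bpm_summary`, the `(ISO timestamp, bpm)` input shape and the sample numbers are all assumptions made for illustration, and a real export will need its own parsing.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_bpm_summary(readings):
    """Group (ISO timestamp, bpm) pairs by calendar day.

    Returns {date: (min_bpm, max_bpm, mean_bpm)}, sorted by date.
    The input shape is an assumption, not a real export format.
    """
    by_day = defaultdict(list)
    for timestamp, bpm in readings:
        day = datetime.fromisoformat(timestamp).date()
        by_day[day].append(bpm)
    return {
        day: (min(values), max(values), mean(values))
        for day, values in sorted(by_day.items())
    }


# Hypothetical sample readings: made-up numbers, not the speaker's data.
sample = [
    ("2023-01-01T08:00:00", 62),
    ("2023-01-01T08:05:00", 70),
    ("2023-01-01T08:10:00", 66),
    ("2023-01-02T08:00:00", 58),
    ("2023-01-02T08:05:00", 64),
]

summary = daily_bpm_summary(sample)
for day, (low, high, average) in summary.items():
    print(day, low, high, round(average, 1))
```

Running the same function over two exports, for example one January from each year, gives two small tables of daily values that can be plotted on top of each other for the kind of comparison shown on the slide.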
And this is knowledge for you, because you can compare your contexts and adjust your context or your behavior based on the data.

The next one would be sleep. It's quite a lot. There is way, way more data than this, but I just wanted to see deep sleep, light sleep, REM sleep and total sleep duration, the general things that people are watching out for. Same thing: prepare, prepare. And let's compare three months' worth of days, from January to March of 2022 and 2023. I'm going to show you two different algorithms. And this is what I meant: whenever you work with time series, be careful, because there are different modeling approaches and you can get very different results. You will be the only one who can judge whether it makes sense or makes no sense.

This is the original, where the red line is 2022 and the blue line is 2023. We're talking about deep sleep, or rather total sleep. I am way more consistent in the first quarter of this year, as you can see. But at the same time, yeah, it was not so fixed. This is fine. But for some reason, and this is an interesting one, I also have a slightly declining trend, though not as bad as it was in 2022. Which, again, means I can identify specific stressors in my life thanks to this data and try to avoid them, which is a cool thing.

And these cycles are kind of weird. But you can also use something like Fourier analysis or autocorrelation analysis to get the period and even see whether you have a cycle. Because, again, it would be nice to know that every 14 days I have a bad night, and you don't have to worry. If you know that, you can kind of give yourself some leeway and say: I'm not a bad person. I'm doing what I can. My body works this way. Relax. And then, just by actually allowing yourself to make a mistake or have a bad day or a bad night, you will get rid of the stress, and stress is a hugely problematic factor.
And maybe you can improve this 14th day or night.

So another one would be deep sleep. This is a slightly different algorithm that I wanted to show you, and it shows a different, upward-going trend. So although I was sleeping less, I was sleeping more in the deep sleep phase, which is kind of good, I guess, according to the latest studies. Very cyclic behavior: I tend to sleep good, good, good, bad, good, good, good, bad. And I don't know right now whether it has something to do with the fact that I sleep very badly on Sunday because on Monday I have the first meeting at nine. Maybe, who knows. But you can definitely check that out for yourself with your calendar as an additional data source. Great.

And more, for the light sleep, the same type of analysis. It's similar to the total sleep. Everything goes down, which is not that great. The cycles are not that stable for the light sleep.

REM sleep. It's a really funny one. I was getting more light sleep at the beginning of 2022, and it's definitely declining in the later year. But I know why. If you want to know, just ask me personally. It's also really interesting how the cycles are behaving. The later trend is going haywire. I really need to do something about it.

Okay, but what about mood? So this was data from the structured data. And now let's talk about a structured-data mood thing. The app that I'm using asks me once a day, at nine o'clock, how my day was, from zero to five. And I can put in some tags, but I didn't utilize them. Same approach, same model, the same 30 days from June 21 and June 22. Super different periods. And as you can see, the early trend was horrible. I was rather quickly approaching a negative mood. I had some stress in life. Yeah. And right now I'm pretty happy. I also absolutely love EuroPython, and I hope you do too. And this gives me a boost. The cycles are going haywire. But that might also be a problem of the algorithm. So again, be careful about that.
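As an editorial aside, the autocorrelation idea mentioned earlier for finding a cycle length can be sketched like this. It is a rough, standard-library-only illustration, not the model used in the talk: the helper names, the centered moving average used as a trend estimate, the seven-day default window and the made-up "good, good, good, bad" series are all assumptions. The sketch first removes a slow trend and then picks the lag at which the remainder correlates best with itself.

```python
from statistics import mean


def moving_average(values, window):
    """Centered moving average; near the edges it averages whatever part of the window exists."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at a positive lag (0.0 for a constant series)."""
    centre = mean(values)
    denominator = sum((v - centre) ** 2 for v in values)
    if denominator == 0:
        return 0.0
    numerator = sum(
        (values[i] - centre) * (values[i + lag] - centre)
        for i in range(len(values) - lag)
    )
    return numerator / denominator


def estimate_period(values, trend_window=7, max_lag=None):
    """Subtract a moving-average trend, then return the lag with the highest autocorrelation.

    Lag 1 is skipped on purpose, because neighbouring days are usually similar anyway.
    """
    if len(values) < 4:
        raise ValueError("need at least 4 observations")
    trend = moving_average(values, trend_window)
    residual = [v - t for v, t in zip(values, trend)]
    max_lag = max_lag or len(values) // 2
    scores = {lag: autocorrelation(residual, lag) for lag in range(2, max_lag + 1)}
    return max(scores, key=scores.get)


# Hypothetical hours of sleep following a "good, good, good, bad" pattern.
nights = [7, 7, 7, 5] * 10
print(estimate_period(nights))
```

On real data the peak is rarely this clean, so it is worth looking at the whole set of scores rather than only the best lag, and checking that the suggested period makes sense against your calendar.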
I wanted to show you this one cycle computation, but I asked ChatGPT to write it and it didn't work. I mean, it did give me a number, but it made no sense. So let's skip that one. All right. So, oh my God, where are all my things?

After this analysis, what can you do? Look out for outliers. Rely on the short horizon from the app whenever you see an outlier, because the apps are still pretty good at what they're doing. Compare similar timeframes. Compute the trend and the other components on your own, according to the algorithm that works best for you, because there are seriously different things. Use multiple apps, because the context matters. Within an app, they are optimized to answer a specific question, and not to basically delete the borders. And don't sweat it. From time to time, like once in six months or so, just stop. Don't track. Give yourself some life.

And my personal insights: I found out what my meditation time is. If I don't meditate in the morning, I don't. Training also: if I don't train in the morning, I don't sleep. I also identified some bad sleep things that were, again, very surprising for me. Ask me personally. And I also figured out what helps me to get into the flow. I guess everybody wants to be super productive and have this feeling more often. You can use this data to identify your triggers and what helps you with that. Again, ask me after the talk.

And knowing your cycles and trends, where you are going, you can organize yourself better. You can, for instance, say: I'm not starting this new habit now. I will actually start it, wait for it, on the upward trend from the worst point in my mood. Because then you have enough motivation right from the beginning, and then you go into an "I feel great" period for long enough to keep the habit. Don't start at your highest point. It's also rather good to know whether you are a ruminating person, like: something happened a week ago, I still think about it and it kills my mood.
Just be aware of that. And again, it will decrease your stress, and there are many, many more things that you can learn for yourself. So let's turn into cyborgs with more wearables, and check out the resources. These are mainly research papers that I found surprisingly interesting and insightful as well. And thank you for coming to my talk. Thank you for your time.