And today I'm talking about citizen data science with Home Assistant. My own background, actually, is a scientific one. I did a PhD in physics and I was basically building experiments that would study man-made crystals. Okay, so just to give you a brief overview of what I'm going to talk about, I'm first going to introduce the topic of the smart home and smart home hubs and answer the question, what are they and why should I care. I'll then give a brief introduction to Home Assistant and tell you why I think it's a great platform for doing citizen science. Then we're going to have two case studies. I'm going to present the first one and that's on a bird classification project I've been working on. Okay, so is that working? The first topic, smart home and smart home hubs. What are they? Well, to start with, there's a few more people coming in. People are buying all kinds of connected devices, like Wi-Fi connected light bulbs and other kinds of smart things that they're putting in their home. This ranges from quite simple things, things that you toggle on and off with a remote control, through to more sophisticated smart devices. A nice example is this learning thermostat, and this basically learns your habits over time and adjusts the heating in your house to save you energy. So for instance, if nobody's home, the heating will go off and it'll save you some money. So typically people start buying these things and their home maybe starts filling up with a few of them, and what they really want is a hub to centralize control of all those different devices. So you've got a few commercial ones out there and the obvious one is Amazon Alexa. Does anybody in this room have an Alexa? Okay, quite a few people already. There's a couple of other ones.
So Google, they came to the game a bit later on, but they've got their Google Home, and Apple have just released this, well, six months ago now, this HomePod speaker, which basically is a smart home hub as well. So these are great services, great toys, but they're a bit limited in terms of how you can actually access the data that they're capturing. So you've got a dozen devices in your home, this thing is aggregating that data in some way, but you have no way of actually getting to it. They've also got privacy concerns. So there was a recent article in the news: an Alexa inadvertently sent somebody's private conversation over the web and it ended up in somebody else's home. So there's a couple of concerns people have about these commercial solutions. I want to do a bit of sort of future prospecting as well, like we had in the previous talk, and just sort of speculate for a second: what would a future smart home be capable of? Well, I think there's a number of things that we'd like it to do. We'd like it to have routines which can save us time, energy or money, a bit like the learning thermostat. There's a couple of challenges you have when living in a home that I think a smart home would be really good at solving. One that I've identified is non-invasive monitoring. So let's say you have an elderly relative and you're a bit concerned about their welfare, but you don't want to invade their privacy by sticking cameras in their home. The question is, can you come up with some kind of system which can sort of monitor them, give you that reassurance, but not invade their privacy? And I think a smart home might be one way to do that. Another idea I've had, which is actually inspired by an episode of Black Mirror, which is a really great TV series, you have to check it out, is this idea of the home learning your personal preferences over time.
So what actually happens in that episode is a smart home learns how the person likes their toast, and so they get perfect toast every time. But you can imagine it learning different habits for different people. So these are just a few of the ideas, and I think it's going to be probably people with a data background, maybe even Python data scientists, that are going to drive some of these developments and innovations, but what we really need is an appropriate platform to try these ideas out. So that's where Home Assistant comes in. What is Home Assistant? Well, it's an open source home automation hub. It's written in Python 3 and it runs totally locally. So most people have a Raspberry Pi in their home that they're using to run Home Assistant. This is also a very popular project on GitHub, and right now it's got over 14,600 stars and it's growing at an impressive rate. We've got a very active community as well. So based on regular downloads, we think there's at least 60,000 regular users of Home Assistant. If you want to check out the docs and get a feel for what Home Assistant looks like, head over to the website. I've just got a screenshot of it over there and you can see the URL at the top. So the next question is, what can you do with Home Assistant? Well, it's a smart home hub. So what it's really great at doing is linking together different services, allowing you to control them all from a central place, and you can then set up very advanced automations to do routines. I've got a screenshot over here just of some of the main, sort of high profile integrations that are available within Home Assistant. So it actually started out with the Philips Hue smart light system, but it now supports the IKEA system. There's also integrated media players, so you've got the Plex media player in there, and other kinds of web-based services. So, like, Dark Sky is an internet weather service.
Okay, so the next question is if you're curious about Home Assistant, what do you need to get started with it? Well, you don't need an awful lot of money. I reckon that you can pick up a Raspberry Pi for £10, put it in a case, you're talking £15 worth of hardware, if you've got your own power supply and an SD card, then you're ready to go. So the setup process is pretty straightforward. What you would do is just head over to the URL there, download a disk image, you then flash that onto the SD card and then an optional step is to set up some of the configuration for the Wi-Fi, for instance. You would basically just edit a text file on the SD card, plug the SD card into the Raspberry Pi, give it about 20 minutes and it's going to connect to the internet and download the latest updates to Home Assistant and then it's going to start a little web server which you can access via this URL over here. It's got a really neat auto-discovery system so it sort of scans your network and if there's devices that it can control, it will pick them up automatically and give you a nice step-by-step prompt to configure them with Home Assistant. Once you've done all that, you can start exploring and configuring extra services like the web-connected services and customizing the look and feel of Home Assistant and creating automations. Okay, so I've said it starts a web server, so what is the UI? Well, it's accessible via the browser and basically you get a panel. This is the sort of out-of-the-box interface to Home Assistant. You can see across the top here we've got some sensor readings, these are just numerical readings of Home Assistant, and these are showing controls to switch on devices that have been connected to Home Assistant. You've also, in this case, this guy is displaying a media player, so this gives you a very standard way to look at Home Assistant. 
But the great thing about Home Assistant being open source is you're not limited in what you can do with it, so you can completely customize the look and feel of Home Assistant. There's this really neat project on GitHub and it's called Floorplan, and it lets people put their devices onto a 2D map of their home. So this guy over here, this is a real-time image that updates. So what you're seeing on the graphics there are sensors he's got in his home, and if, for instance, the door opens, the door will actually open in real time on that display. So it gives you a really neat way to know what's going on in your home. This guy's also got his camera feeds over there and some other weather data, and I believe that the way he's displaying it is via a tablet on a wall, probably in his living room or something like that. Another point is there's lots of creative people out there trying new ways of interacting with a smart home. Okay, so this is a talk about citizen science and data, so the question is, how do I view the data that Home Assistant is capturing? Well, it actually gives you a standard interface to look at the data that's being recorded in the database via this history tool. What you can see across the top there is categorical sensors, so, for instance, binary sensors that are on or off. And then below, we've got numerical sensors, so in this case he's got, I think, some temperature sensors, and it's just a time series plot of the data they're capturing. Like I say, because Home Assistant is open source, you're not limited in the way you actually view the data that's being captured by it, and a real neat development of Home Assistant is this idea of add-ons. So basically, you have a kind of web store inside Home Assistant where you can, with a single click, install other services on the same Raspberry Pi and automatically bring them in with Home Assistant. One of the popular add-ons at the moment is this InfluxDB and Grafana add-on.
I don't know if you know about InfluxDB, but it's a very popular time series database in the Internet of Things. So basically, you set things up so that Home Assistant would plumb its data into InfluxDB. Grafana is another tool, and it allows you to build dashboards that look at data within databases. So this guy has actually configured, I think he's showing his download speed on the right there, and then across the bottom you've got some other sensor data. But the point being, you can actually create really nice visual ways of looking at the data you're capturing without very much effort at all. So while we're talking about add-ons, I just want to briefly mention an in-development add-on, which is that of a JupyterLab server. So if anybody's a data scientist, they're probably working with Jupyter notebooks some of the time at least. My idea here is, how do we get people doing data science? Fire up a Jupyter notebook and start looking at data they're capturing. So what we're working on is an add-on which allows you to create a JupyterLab server with a single click, and that will all be linked so you can automatically get the data from your database on Home Assistant. So that's a work in progress, but I think that could be really neat. Okay, so we can get a little bit more technical now, seeing as we're talking about data, and I just want to give you an overview of the backend that Home Assistant is using. So it's using a standard SQL database. It's a SQLite database, but you can change that and use MySQL or whatever else you want to use. This is great because it's a standard tool and you can use SQL queries or pandas or even R to look at the data that you're capturing in such a database.
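
As a rough illustration of that last point, here is a minimal sketch of pulling one sensor's history out of a SQLite file with plain SQL from Python. The table name `states` and the columns `entity_id`, `state` and `last_changed` are assumptions about how the recorder database is laid out and may not match every Home Assistant version. The entity names and readings are made up, and the demo builds a tiny in-memory database so it runs without a real installation.

```python
import sqlite3


def load_history(conn, entity_id):
    """Return (last_changed, state) rows for one entity, oldest first.

    Assumes a `states` table with `entity_id`, `state` and `last_changed`
    columns; check the schema of your own database before relying on this.
    """
    query = (
        "SELECT last_changed, state FROM states "
        "WHERE entity_id = ? ORDER BY last_changed"
    )
    return conn.execute(query, (entity_id,)).fetchall()


def numeric_readings(rows):
    """Keep only rows whose state parses as a number."""
    readings = []
    for timestamp, state in rows:
        try:
            readings.append((timestamp, float(state)))
        except (TypeError, ValueError):
            continue  # skip non-numeric states such as 'unknown'
    return readings


def demo_connection():
    """Build an in-memory database with invented readings for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE states (entity_id TEXT, state TEXT, last_changed TEXT)"
    )
    conn.executemany(
        "INSERT INTO states VALUES (?, ?, ?)",
        [
            ("sensor.living_room_temperature", "19.5", "2018-06-01 09:00:00"),
            ("sensor.living_room_temperature", "unknown", "2018-06-01 09:30:00"),
            ("sensor.living_room_temperature", "20.0", "2018-06-01 10:00:00"),
            ("sensor.outside_temperature", "12.1", "2018-06-01 09:00:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    rows = load_history(demo_connection(), "sensor.living_room_temperature")
    for timestamp, value in numeric_readings(rows):
        print(timestamp, value)
```

With a real installation you would open the database file instead of calling `demo_connection()`, and the same query could be handed to pandas if you prefer working with data frames.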
I actually wrote a small package called HASS-data-detective, which provides a bunch of convenience functions to help you with parsing the data out of the database, and I've got a plot that I created using that package, which is just a time series plot showing, well, I think in green the outside temperature and then in red just the temperature in my home. But this is really simple to do. And it's on PyPI, so you can install it pretty quickly, and I was playing around with it the other day, and I wanted to know, could you collaborate with other people online on analyzing your data in Home Assistant? And I picked up this tool, Google Colaboratory. Has anybody else tried that one? No? Okay, well, one guy. But basically it's like a Jupyter notebook server that's running on one of Google's servers, and it's pretty neat if you're looking to get into sort of deep learning and stuff, because actually with a single click of, like, a drop-down menu you can start using a GPU and you get 12 hours free usage of that. Anyway, the point being, it's an online environment with basically no setup, no install, with a lot of potential, and you can analyze your Home Assistant data online in that way using, well, the tool that I wrote. So check it out. Okay, so while we're talking about data, I wanted to briefly talk about different ways of getting data into Home Assistant, and I think we've got a couple of people here that are interested in MicroPython, is that right? Just the one, okay. So there's lots of different ways you can get data into Home Assistant, but I just want to focus on the ways you might do it, maybe from MicroPython or from one of these micro boards that run MicroPython. The first way you could do it is a physical connection. So if you're running Home Assistant on a Raspberry Pi, it has these GPIO pins, so you can plug wires and sensors into Home Assistant, and you could use your MicroPython board to sort of stream data in that way.
Another way you can get data in is by using the REST API that Home Assistant creates, so just using standard web calls. Another protocol that's supported is MQTT, and this is a very lightweight messaging protocol that's particularly popular in the Internet of Things, and actually this is the approach I used when I wanted to integrate a MicroPython board into my Home Assistant setup. So I bought a flat last year, and I upgraded the boiler in it. It was supposed to be, you know, brand new. Within six months, this boiler started leaking, and this was really annoying. I was getting messages in the middle of the night from my neighbour to say there was water coming through the floor. So I wanted to create a sensor that would tell me when this thing was leaking, capture more data about when the leak was happening, and hopefully in that way figure out what was wrong, because I had a couple of engineers come out and say, well, there's nothing wrong with the boiler, we can't see what's wrong. So what I did, I used one of these MicroPython boards, in this case I chose a Pycom board, and I put a temperature sensor on it and a moisture sensor. And what you can see in the image over here is, this is my boiler, and this is the temperature sensor on the side of it, and that's the moisture sensor down there. I wrote some MicroPython code that would ping regular readings from these sensors to my Home Assistant using the MQTT protocol, and had a little real-time display on Home Assistant of the readings. So I ran this and had a few more leaks, and I noticed that the leaks, they were always happening when I wasn't home, but they were happening exactly at 5.15pm. And this was such a specific time that with the engineer we were able to figure out what was basically causing the leak. So we solved it, I haven't had any problems since, so that's great. So this is an example of using data to solve a problem you have in the real world. Oh, and it's all Python as well.
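
To make the boiler example a bit more concrete, here is a minimal sketch of the kind of analysis that can surface a pattern like "always at 5.15pm". It is not the code used in the talk. It assumes the moisture readings have already been loaded as `(datetime, value)` pairs, and the threshold and sample readings are invented for illustration. The idea is simply to find each moment the sensor goes from dry to wet and count those moments by time of day.

```python
from collections import Counter
from datetime import datetime


def leak_times_of_day(readings, wet_threshold):
    """Count leak onsets by clock time (HH:MM).

    `readings` is an iterable of (datetime, moisture) pairs. An onset is a
    reading at or above `wet_threshold` that follows a drier reading.
    """
    counts = Counter()
    was_wet = False
    for timestamp, moisture in sorted(readings):
        is_wet = moisture >= wet_threshold
        if is_wet and not was_wet:
            counts[timestamp.strftime("%H:%M")] += 1
        was_wet = is_wet
    return counts


def demo_readings():
    """Invented readings: three days with a leak starting at 17:15."""
    readings = []
    for day in (1, 2, 3):
        readings.append((datetime(2018, 3, day, 17, 0), 5.0))    # dry
        readings.append((datetime(2018, 3, day, 17, 15), 80.0))  # leak starts
        readings.append((datetime(2018, 3, day, 17, 30), 75.0))  # still wet
        readings.append((datetime(2018, 3, day, 18, 0), 6.0))    # dry again
    return readings


if __name__ == "__main__":
    counts = leak_times_of_day(demo_readings(), wet_threshold=50.0)
    print(counts.most_common(1))
```

If the onsets pile up on one clock time across several days, that points to something on a schedule rather than a random fault, which is the kind of clue that helped here.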
Okay, so that's my introduction to Home Assistant. So, like I say, we're talking about citizen science and we have two case studies, and the one I'm presenting today is a project I've been working on very recently on bird identification. So my mum, she's really interested in bird watching and the sort of bird life, and she bought me this bird feeder, and it's got suction pads and you can stick it onto your window. So I stuck it onto my window at home and put some seeds in and pretty quickly had birds coming to my window, because this is over winter. The birds, you know, they're very hungry, so we had a lot of activity. It was really fun to watch. It turns out there's all these sort of science studies run by organisations like the RSPB, the Royal Society for the Protection of Birds, and what they ask people to do is to watch their bird feeder and then write down the birds they see coming, and then at the end of however long this project runs for, three months or so, you can get a picture over the whole country of what's going on with the bird populations. Well, I thought, I've got Home Assistant and I like to learn about new technology, so why don't we see if we can have Home Assistant automatically capture images of these birds and preferably classify the birds that it's seen as well. So what I did is I purchased, in this case, a 10 pound USB webcam. I plugged it into my Raspberry Pi 3, which is running Home Assistant, and I stuck it on my window sill pointing at the bird feeder, and this is the typical image that I would capture. I used one of these Home Assistant add-ons to set up a motion detection system, so the camera, when it detected motion, would take an image.
The only problem with this system was, motion triggered systems, they're not discriminative, so even if there was a funky light effect or a plane in the background, it would trigger the motion capture, and I ended up with literally hundreds, almost thousands of images. Probably less than half of them actually had a bird in, so this obviously was not going to work as an approach to quickly gather images just of birds. So really the first challenge was, what kind of classifier can I use to sort out the images with birds and those without? So this is an image of a bird, this is one without a bird.
I had a look online. I've been just generally interested in learning about deep learning and image classifiers, and I found a really neat tool that's really simple to use. Machine learning in a box, they call it Machine Box, and those guys are actually based in London as well. I think the whole thing's written in Go, actually, but basically there's different machine learning models, and they run inside a Docker container, and they expose the model via a REST API. So this is everything that I needed to get started integrating this into my project. So basically I had a thousand images and I physically sorted them, 500 with birds, 500 without birds. I've practically got RSI sorting these things into two folders. It took a while, but nevertheless I used one of their training scripts to then post these images to Classificationbox, which is what they call their classifier, and you do a test-train split, and the classifier came up with an accuracy of around 90%, which, really surprisingly, is pretty good. I guess probably the reason is that actually classifying the non-bird is pretty straightforward, but one of the things I want to get from this process was learning more about how you actually come up with accurate image classifiers.
Okay, so the next task was integrating that into Home Assistant, and I've got a diagram showing how I've gone about doing that. So on the top left I've got my USB webcam. This motion captures an image. Home Assistant, I've got running on a Raspberry Pi, and that has a component which detects when an image has been saved in a folder and will then post that image over my local network to Machine Box, which has the classifier running in Docker. In my case I'm running that on a local server, I'm actually running on a Synology. That will perform the classification, then return the result to Home Assistant. In Home Assistant I then have an automation which says, if the chance that there's a bird in the image is greater than 80%, send me that image. So that's what happens: if there's a bird in the image, it posts the photo to my phone with a notification. This is quite a funky image, you can actually see, I think it's a robin or something, leaping off the bird feeder [unclear]. So yeah, I wrote some custom code to do this, and the URL for the work on GitHub is down below.
So that brings me on to my final slide for this bit. What's the next steps for this particular project? Well, like I said, I want to learn about image classifiers. I want to see if, by curating the images that I use to train a classifier, I can improve the accuracy. Can I get up to 95%, maybe? After that, I want to also be able to discriminate by species, because I have probably blue tits, even parakeets, visiting this thing. It'd be great if Home Assistant can say, hey, there's a parakeet with a 95% probability, and I can start to actually gather data that would be directly relevant to the bird watch study I mentioned earlier on. Obviously I'd like to contribute this data to that study as well. And my next task is to integrate Classificationbox into Home Assistant, so it becomes really simple for anybody that's familiar with Home Assistant to start using it.
Another idea I've got is, how do we share this work with people that don't have a technical background? My mother, for instance, wouldn't really know about image classifiers. So my idea is probably to publish this work somewhere, maybe on Hackster, make the classifiers available, provide all the scripts that I use just to set things up, and then anybody with a budget of about 30 or 40 pounds will be able to reproduce this work.
Finally, I need to make my bird feeder magpie proof, because I've had a real problem recently. A magpie has discovered that, well, it was too big to actually get on the bird feeder, so what it does is it clings on with one leg and sort of scrapes up the window with the other leg, so it's pecking away through the hole to get the food, and it's actually ripped the bird feeder off the window several times now, and it has a hole in it. But I captured this, quite interesting: you can see the magpie having a feed, and it's actually managed to almost yank the bird feeder off the wall. This blue tit has come along, found that it can't feed in its regular way, flown around for a bit, and then discovered it can actually feed at this hole over there. So this has been quite a fun project overall. So that's my project, and I'll hand over to Oliver to talk about pollution, personalised pollution and everything.
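
For anybody who wants to try the bird notification step described above, the decision itself is small. Here is a minimal sketch, not the custom code from the talk, of the "greater than 80% bird, send me the image" rule. The shape of the classifier response (a `classes` list of `id` and `score` entries) and the class name `bird` are assumptions made for illustration, so check what your own classifier actually returns.

```python
BIRD_THRESHOLD = 0.80  # the 80% cut-off mentioned in the talk


def bird_score(response, class_id="bird"):
    """Return the score for `class_id`, or 0.0 if it is missing.

    `response` is assumed to look like
    {"classes": [{"id": "bird", "score": 0.93}, ...]}.
    """
    for entry in response.get("classes", []):
        if entry.get("id") == class_id:
            return float(entry.get("score", 0.0))
    return 0.0


def should_notify(response, threshold=BIRD_THRESHOLD):
    """True when the bird score is strictly greater than the threshold."""
    return bird_score(response) > threshold


if __name__ == "__main__":
    example = {
        "classes": [
            {"id": "bird", "score": 0.93},
            {"id": "not_bird", "score": 0.07},
        ]
    }
    print(should_notify(example))
```

Keeping the threshold as a single named value makes it easy to tune: raising it cuts down on false alarms from light effects, at the cost of missing a few real birds.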