Okay, I'm happy to be here. Welcome to the presentation "Build a Pump Monitor for Railway Applications with Zephyr OS". My name is Oliver Fölkers, and I'm here together with Jonas Remert. Together we will give this presentation. I will introduce you to the pump monitor project and address the question: can artificial intelligence recognize wastewater flow? That is quite an interesting question, as we will see. In the second part, Jonas will give an overview of the firmware that we developed together and explain the power management and what we learned from it. If we have some time left, we can take questions from the audience. If you have a really urgent question, just ask in between.

Okay, so what is the task? You might have heard about the ICE trains, the Intercity Express trains from Deutsche Bahn. If you travel with them regularly, you have seen the sign on the photo: sometimes the toilets in the ICE trains are out of order. Why is this the case? Of course the toilet might be dirty, but that can easily be resolved. Quite often the reason is that the wastewater tank is full, and it is full because it was not properly emptied before. There would be an overflow, and to avoid an overflow the toilet is shut down. The reason why the tanks are full is that pump failures go unnoticed, and when the train is traveling, it's too late to fix the toilet. So the task that Deutsche Bahn had for us was to develop a system that monitors the wastewater pump automatically and reliably. It should work fully automatically, and of course it needs to work reliably, because otherwise it doesn't make sense to monitor the quality.

So what does it look like? On the photo you can see an ICE train with a magnified section showing part of the tank with a pipe from the pump.
And the little black box there is the module that we at Berliner Sensor Technik have built. This little box with an LED, in this case a green LED indicating that everything is fine, is the product that we developed, and that's what I'm talking about.

The challenge in this case is that we had no energy supply, no wired connection, no gateway, almost nothing, because the module has to work independently for safety reasons. The only connection is via mobile radio, and the whole system is battery powered. This changes everything, because if it's battery powered, everything has to be extremely energy efficient.

At first we thought we could use a conventional flow meter, because there are lots of flow meters available in the industry for liquids, and generally they work very well. However, in this case, with wastewater, they cannot differentiate between an empty pipe and a blockage. This is a real challenge, because an empty pipe is okay and a blockage of course is not okay, but in both cases there is no flow. And wastewater is something that conventional flow meters cannot easily detect because it's variable: it's solid, liquid and gas, and they are mixed, such as in foam. If you have water with bubbles, is that a liquid or not? It depends, and that's quite complex. So from a scientific perspective it's quite an interesting matter.

In order to detect the flow, we have built a hybrid sensor system with pressure, vibration, temperature and humidity sensors, which all work together. On the diagram you can see the pressure sensor and the flow of the water over time.

Now why do we use Zephyr RTOS? After all, that's why we are speaking here at the conference. You probably all know bare metal programming, usually with C on a microcontroller.
In this case it would not be capable enough, because we have multitasking, we have requirements for real-time operation, and it needs to be energy saving. If you have multiple sensors working at the same time, bare metal programming is simply insufficient. On the other hand, a full PC operating system such as Linux would be too large and would use too much energy, too much memory and so on. Zephyr OS is exactly the right size, as it turned out.

Another reason why we are happy to use Zephyr OS is that we are using the Nordic Semiconductor nRF9160, which is very capable because it combines an Arm microcontroller with an LTE-M and NB-IoT wireless modem in one unit, and it is also energy saving. Nordic Semiconductor has an SDK, a software development kit, which is based on Zephyr, so it was an obvious choice for this project.

Another positive reason for Zephyr OS is that it's open source, which is a really strong argument if you build an embedded project, because you have access to the source code later: you can debug it, you can service it, you can handle security or safety issues and all that. So open source was a strong argument in favor of Zephyr OS when we worked on this for Deutsche Bahn.

Now what have we built after about three years? We developed a monitoring system that detects and reports failures automatically, and it really works the way it should. The result is that we now have a series of 36 modules in 24/7 operation, so they have been working night and day, continuously, for over one year. There are no visible parts from the outside and no buttons; nobody has to switch them on because everything is automatic.
Over time we have analyzed more than 50,000 wastewater disposals, and for each one a report has been sent wirelessly with an unambiguous detection of any issues. We send a report when everything is fine, and we also send an alert when there is a failure. In the middle you can see an MQTT report in JSON format. It's very short, like a telegram, and it reports every wastewater service.

Now, the architecture of the network. The obvious choice for most people would be cloud computing; it is something that everybody knows. Cloud computing has the advantage that it requires only simple nodes. All data are collected in the cloud, and you have powerful central processing. However, everything must be online, and it must always be online. It only works as long as you have access to the network, and if you don't have access to the network or to the cloud, you are out of luck. In a real-world railway application you cannot be sure that the network is always up, and if the network is not running, the trains could not be serviced, which would be unacceptable. That is why we are using edge computing.

Edge computing in this case means that processing is performed locally. It works independently of any central computer or central network, which keeps the traffic low, and it can handle interrupted connections. Even if there is no network at all, everything still works, and the report is sent with a delay, when the connection is up again. You can still use cloud functions, but these are optional, not required. So that's why we are using edge computing.
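The store-and-forward behavior described here, where reports are produced locally and delivered once the connection is back, can be illustrated with a small sketch. This is not the firmware from the talk: the queue size, the report length, the policy of overwriting the oldest entry when the queue is full, and the names `report_queue`, `queue_push`, `queue_flush`, `link_down` and `link_up` are all assumptions made for illustration. A real implementation would replace the two stub functions with an actual MQTT publish call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define REPORT_MAX_LEN 128 /* assumed maximum size of one JSON report */
#define QUEUE_CAPACITY 8   /* assumed number of reports kept while offline */

/* Ring buffer of pending reports (hypothetical, for illustration only). */
struct report_queue {
    char items[QUEUE_CAPACITY][REPORT_MAX_LEN];
    size_t head;  /* index of the oldest stored report */
    size_t count; /* number of stored reports */
};

/* Transport hook: returns true if the report was delivered. */
typedef bool (*report_send_fn)(const char *report);

void queue_init(struct report_queue *q)
{
    assert(q != NULL);
    memset(q, 0, sizeof(*q));
}

/* Store a report locally. If the queue is full, the oldest report is
 * overwritten (an assumed policy). Returns false if the report is too long. */
bool queue_push(struct report_queue *q, const char *report)
{
    assert(q != NULL && report != NULL);
    size_t len = strlen(report);
    if (len >= REPORT_MAX_LEN) {
        return false;
    }
    if (q->count == QUEUE_CAPACITY) {
        q->head = (q->head + 1) % QUEUE_CAPACITY; /* drop the oldest */
        q->count--;
    }
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    memcpy(q->items[tail], report, len + 1);
    q->count++;
    return true;
}

/* Deliver stored reports, oldest first. Stops at the first failure so that
 * nothing is lost while the link is down. Returns the number delivered. */
size_t queue_flush(struct report_queue *q, report_send_fn send)
{
    assert(q != NULL && send != NULL);
    size_t sent = 0;
    while (q->count > 0 && send(q->items[q->head])) {
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        sent++;
    }
    return sent;
}

/* Stand-ins for a real publish function: one simulates a missing network,
 * the other a working connection. */
bool link_down(const char *report) { (void)report; return false; }
bool link_up(const char *report) { (void)report; return true; }
```

With this shape, the application would call `queue_push` at the end of every pumping process and `queue_flush` whenever the modem reports a connection. While the link is down, `queue_flush` delivers nothing and the reports simply stay queued.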
Now the big question is: can we detect wastewater flow with machine learning or with artificial intelligence? That would be the obvious choice for most people. To give you an idea, I just took my iPhone and searched for dogs, which in German is "Hund", and you can see the results over there. At the top left you can see a Falabella horse, which is a miniature horse. It was detected as a dog. Okay, it's the size of a dog, but it's definitely not a dog. Can you see that? You might see even from a distance that it's not a dog. In the top center you can see my mother with a teddy bear. I think even if it's a small photo you can still see that this is not a dog but a teddy bear, so this is also a failure. At the top right you can see a dog, definitely. Picture number four, I think, is my daughter with a dog, okay. Then another dog, okay. Then there is a group of people at the airport with no dog at all, so I wondered why a dog had been recognized. It turns out that there's a poster in the background with the word "hundred" in English, which starts with H, U, N and D, so the iPhone found a dog there. Okay, you can argue about whether that's correct or not. Then there's another dog and another dog, and then there's a fox, and you could discuss whether a fox is actually a dog.

So that's the kind of result that you get with machine learning. In this case it's the most current iOS, Apple is a leading company, and so on, so it's a wonderful software system and I think it's very admirable. But still, if you have nine hits and at least two of them are absolutely questionable and two others are up for discussion, that's not the kind of path that we would want to go down with wastewater, or something that the customer would accept in this case.

Any machine learning system, as you can see on the left, works with training data. It has to have annotated data, and it creates some kind of model that associates patterns. This training data must be
extensive, and the distribution must be homogeneous. That's something I will explain on this slide. For machine learning to work, the training data must reflect the universe, where universe means all the data. If it does, as you can see in the leftmost box, so if the training data is a correct subset that represents the whole universe, then everything is fine and you can work with it. However, in the middle you can see a box with a certain subset that looks violet with a little green, but all in all there's much more green, and there's also yellow, which has not been captured at all. In this case the training data would not be representative of the universe, and then it will simply not work. It cannot work. And very often results are mixed, as you can see on the right side: you have training data that somehow reflects the universe, but only a bit. That's like the fox in the photo collection: it's a bit of a dog, but it's not really a dog. And of course this creates quite a challenge for wastewater detection.

Now what can we do to fix that? If you have seen my CV, I have worked in the medical industry with heart pacemakers and ECG. At first it looks completely different, but there is also a flow there, and the ECG reflects the activity of the heart. People who are experts in this area don't just look at the ECG; they understand biology and they understand medicine, and this knowledge turns out to be very relevant. And I just noticed that the heart is also a pump, so in some ways that is also pump monitoring, but that's just a side note.

Okay, so with the wastewater pumps there is also knowledge about wastewater. Of course the sensors reflect the wastewater flow, and it changes with temperature, for example, or with the diameter of the pipe, and all these things. So real-world experience does count, and artificial intelligence does not replace human knowledge. It does help, but it doesn't replace it. Now, in
order to create a real-time detection of the wastewater flow, we have to have real-time signal processing, in this case with Zephyr RTOS, a real-time operating system, which is a great match. We have to do the pattern recognition as early as possible to make sure that it works in time and that there is never a delay. We should never run out of time for the processing: it must be fast enough to always keep up with the sensor data stream. And we need to use an adequate resolution; a higher resolution is not necessarily better.

Then there is something that distinguishes our system from traditional artificial intelligence. Traditional AI is a black box. You cannot follow its reasoning, and you usually cannot trace or debug it, at least not as a user. I cannot ask Siri, "Well, is a fox really a dog, and why did you think that the teddy bear is a dog?" That's a question that you're not supposed to ask, and the system is not capable of responding to it. Maybe there's a developer version for that, but for normal users it doesn't work. In fact, in most cases people don't know why there are some strange results from the AI, and usually the response is, "Okay, we have to take more training data," but that's not satisfactory.

In our case we have a system that can provide automatic explanations, because there are certain rules for filtering and a logic of conclusions that can be traced. On the right side you can see an example of such an explanation, which has been generated automatically. If there's a conclusion that there is a blockage, or that the pumping process was handled properly, then the system can explain its rationale. It can explain that the pump duration is longer than 10 seconds, that there is no long interruption, and that there is always a certain pressure, and that way one can check whether the logic is correct or not.

Okay, to summarize: signal processing is hard. There are no shortcuts. If you want to build a good signal processing system, you need years of experience, sometimes decades. So in the
automotive industry or in the medical industry there are usually people who have decades of experience. Excellent algorithms are precise and fast, and they are really good, but they are not easy to do. The reason is that domain knowledge is also relevant. You have to know your profession: if you work in the automotive area you need to know cars, and if you work with wastewater you need to have some understanding of pumps and of the situation on site. Artificial intelligence can help to build a knowledge base, and it can help a lot, but by itself it's not sufficient. Okay, that's it from my part, and I'll hand over to Jonas.

Good morning everyone. I will present the Zephyr application that we developed for the pump monitor. First I'll start with an overview of the hardware and the firmware. Next I'll explain how we use power management and what we have learned from the project.

So first of all, what is the scope of the application? First, we need to read NFC tags. The NFC tags are attached to the tanks, and they tell us which tank is being emptied right now. Second, we need to monitor the pump pressure: we take pressure samples and later create a report from them. Third, we need to provide direct user feedback, because in case of a failure the operator can sometimes do something about it, and then we want direct feedback. We give that via an LED bar. Fourth, we evaluate pressure and gradient and generate the report. At the end of the pumping process we generate that report and then send it to an MQTT broker via LTE-M.

In this picture you can see the block diagram of the application. We have different components, and they're connected via different interfaces; we use, for example, I2C, SPI and analog. But what was more important for us when we thought about the application and how it needs to work is the power usage of each device. To give you an
understanding: there is, for example, the accelerometer, which runs on 1.8 to 3.3 V, so it's quite flexible, and it consumes around 5 microamps in operation, so it's really low. On the other side of the spectrum there is the NXP pressure sensor, which runs on 5 V and consumes 7 milliamps in operation. That's a factor of more than 1000. If you only run the accelerometer, you can run the system for more than 10 years, and if you have the pressure sensor continuously on, the battery is empty after a few days. So there is a great dynamic range, and you can see this in many low-power applications. That means you need some measure to switch off the devices that consume a lot of power. This can be done via power gating: you have a power switch, a MOSFET transistor, which switches off the actual operating voltage. The LED controller, the microSD card and the pressure sensor are power gated, so in the firmware we can switch them off when we don't need them.

On this slide I'll explain which modules we used to meet the specification. There are some trade-offs to make. First of all we use an RTOS, so we can use threads, work queues and the timer API, for example, and those make the development easier. We use the sensor, LED and storage APIs. For example, we have an SD card where we record the pressure samples in a raw data format so that we can analyze them later offline. In case we don't have network reception, this data can help us later to learn from the field test. Then we use the nRF Connect SDK because we use the nRF9160, and the SDK of course has modem support for the nRF9160 with the LTE-M and NB-IoT stack. In addition, the SDK also includes a driver for the NFC reader that we use. Then we use MQTT. Yes, there are more efficient protocols, like Lightweight M2M for example, but we chose MQTT because it was easier to develop for our application, and there is more flexibility when you choose cloud
providers. For MQTT brokers there is a big choice. Lightweight M2M also has some choice nowadays, but still this was the easier solution; it was easier to meet the timeline and the requirements. And then we use the power management subsystem from Zephyr to switch off the devices when we don't need them.

The system has three main states. The first is sleep mode; most of the time the system resides in sleep mode. When the accelerometer detects an acceleration above a threshold, an interrupt is generated and the system switches to sensing mode, which lasts for around 120 seconds. In sensing mode we continuously poll the NFC reader, and we take a pressure sample every two seconds. Usually pump monitoring is started when you hold the plumbing to the tank and the NFC tag is read. But in some cases, when an NFC tag is missing for example, we still want the option to start the pump monitoring, so it is started automatically when the pressure drops below a threshold. In pump monitoring the pressure sampling frequency is 50 Hz. Most of the energy is consumed in pump monitoring, which means that the power management works and the sleep mode is actually quite low power. Most of the energy is consumed by polling the NFC tag, reading pressure and signaling via the LEDs.

Here you can see the PCB connected to a Nordic Power Profiler Kit. The Power Profiler Kit supplies the PCB and at the same time measures the current, so that you can see the dynamic current flow of the system. In this diagram you can see an interval of two seconds, and every two seconds we take one pressure sample. The relatively high current peak represents the switch-on time of the pressure sensor: we switch it on for a short time, we have this peak current, and after we take the samples we switch it off. Next, when we zoom in, we see the individual polling of the NFC reader. This is an interval of 100 milliseconds, and that was also a decision
from the requirement: we needed good responsiveness. The system should just work, and if the user holds the plumbing to the tank, it should immediately recognize the NFC tag. That's why we made the trade-off to poll at a relatively high frequency of 10 hertz, even though that consumes more energy. Here we zoom in a little more, and now we have an interval of 20 milliseconds. This represents our 50 hertz pressure sampling. You can see the time when we take a pressure sample; the energy is very low, just CPU time and reading a few analog samples. In this case the pressure sensor needs to be always on. It has a warm-up time of 20 milliseconds, so when we sample pressure at 50 hertz we just keep it on.

You can make a categorization here into devices that are cheap and devices that are expensive to run in terms of energy. In our system, cheap is CPU time and taking temperature, humidity sensor or accelerometer samples. Those sensors are all very optimized and don't consume much energy. On the other side you have expensive things like the radio, the pressure sensor, the NFC reader and the LEDs. They consume a lot of current, and we need to switch them off if we can. From that you can also see that the edge computing use case makes sense in our case: we have CPU time, we can use it, and it doesn't consume too much energy. If you sent the raw data to the cloud instead, we would need to keep the radio on all the time or for much longer intervals, and that would ultimately consume much more energy. This is the case in many systems: radio time is always expensive.

So what can be the issues when we just switch off devices, when we just unplug them from power? In the case of the analog pressure sensor, it's a purely analog sensor and it doesn't need initialization, so we can just switch it off. When we switch it on again, we just need to wait for the 20 millisecond warm-up time, and then we can
take samples anytime. For the TI LED controller, for example, we need to initialize it after we switch it on. There is one register, chip enable, and this chip enable register needs to be set after power-up via I2C. In Zephyr you can do this via power management. On the right side you see part of the implementation of the driver. Inside the device driver there is a handler for the device action resume, where it calls LP5569 enable, a function to enable the device. This is executed automatically by the power management when we switch on a regulator. A regulator is a concept in Zephyr with which you can control voltages. That saves some effort, and Zephyr does the work for us.

Then some important design decisions in our project. Of course, power gating versus always on is a trade-off, and you have to check for each device individually whether you want to power gate it or keep it on in a power-saving mode. In our case the LP5569 had another problem: it was connected via cabling in a separate module, and sometimes these cables are not reliable; especially under shock conditions you might have very brief disconnects. If we have this hot-plugging capability anyway, if we initialize the device after power-up anyway, then these rare incidents don't matter anymore. We just re-initialize the device and we're good to go again.

Then the 5 volt requirement of the pressure sensor is actually a disadvantage, but we chose that sensor because it was a very robust solution, and robustness was more important for the system. Then, as I explained already, we went for MQTT because it was easier for us to develop, and the energy disadvantage of MQTT in combination with TCP was still low compared to the other devices. Then we had the decision to store the samples on an SD card. This would usually not be required, because we can send the result via the network, via LTE-M. But in the beginning we did not know
the state of the network. Some time back the network was still being built up in Germany and in Europe, and we were not exactly sure how good it was at that time. So we wanted to have the option to store the data locally, so that we could take it from the device later if needed.

What have we learned from the project? First, develop features in application modules. With Zephyr you can very easily separate software modules and test features and devices individually, and later connect everything and integrate it. As for the way we worked: for hardware we wrote a specification, we knew what we wanted, and for software it was more of a rapid prototyping approach, because some parts of the requirements were not known at the start of the project and we needed to figure out how best to take these samples and what data exactly we need to sample. Some examples: the NFC tag format was initially not very well specified, and there was some trial and error, reading many tags and then identifying what data is on them and how reliable this data is. And there were two options for how to move from sensing to pump monitoring: first, reading an NFC tag successfully, or second, detecting the start of the pumping process by an under-pressure, a low pressure. This requirement was not there at the beginning, so we changed our strategy in the middle and went for option B, so that we have the second option to activate the monitoring.

Then the field test with a limited number of devices: it's a great opportunity to learn, and I think for most projects you will not get around that. You need a field test, you need to gain experience and then optimize the system. One takeaway is also that LTE-M looks reliable. We had very few issues with it, and at nearly all sites we could continuously get a network connection and send the data that we required. Then
some notes from operations. Operations means the phase while the field test was deployed and the sensors were in the field. For debugging purposes we just stored all the MQTT messages in an SQL database. That's a very easy way of filtering them later at any time, finding issues, and comparing results, for example comparing temperature and humidity. To have some data, we took daily telemetry statistics. Those included humidity and temperature, then the wake-up cycles, that is, how many times the system switched from sleep mode to sensing mode and to pump mode, and ultimately the active time, which was important to calculate how much energy we need, and how many pump emptying processes there are daily. One interesting feature that we implemented later was a water intrusion alarm. We have a humidity sensor in the housing, so in case the sealing of the housing has an issue, there was heavy rain, or the housing is broken in some other way and water is inside the housing, we can detect this early, before the electronics break. If water stays in the housing for some days, it will surely harm the electronics.

What's next? There are plans to increase the battery life, and we are confident that we can reach a battery life of more than one year. For that we probably need to remove the 5 volt requirement to make the power management more efficient, and for that we need to change the pressure sensor. Then the capacitive NFC detection can be optimized; the NFC reader has two ways of detecting tags, either inductive or capacitive sensing. Then cloud-to-device communication: from the MQTT broker we can send small data, for example to set the log level on the device. Initially, when you deploy a new firmware version, you might want to have a high log level, and afterwards we can reduce that. And we need firmware over-the-air
updates for the application and the modem firmware of the device. So that's it. Do you have questions?

Yeah, my question is whether you have ever thought about adding a simple button, such as a "wake me up" button, for reading the NFC tags. It may improve the power efficiency.

Do you want to answer? You asked about the NFC reader specifically. The NFC reader is only read in a situation when the module is very near to the railway tank. It doesn't need to read permanently, because that would use a lot of energy. But if the system is activated, which it detects with the accelerometer, then it wakes up, and if it's near the tank then it will poll the NFC reader, and that's 10 times a second, which is relatively fast. So if you move it, you will not notice it. It's just like paying with a credit card with NFC, near field communication. If you poll 10 times a second, you will not notice a delay, but if you do it 100 times per second it would waste a lot of energy, and if you do it 2 times a second you would notice a delay. So I think adding a button would be easier technically, but this is a user interface problem: the user doesn't want to press a button, or he might forget it, and then it's easier to have the system automatic.

All right, I have a question about the data collection. While you can build the sensors, how did you go about establishing the ground truth? Was it sitting in the rail yard and recording raw readings of all sensors for a week or for some period, or some workers saying this one is blocked and this one is not, because they've been doing that for 20 years, or something like that?

We spent a lot of time together at the railway stations, several weeks, and we learned a lot from the people who do the work. In fact they are the real heroes, because everything that is in the signal processing of this system is something that we learned from them.

You mentioned that the most power-consuming parts are the NFC and the data communication or
data transfer. So what is the frequency of transferring data, and is it acceptable for the customer in this case to have longer intervals, so that you only have to transfer data every day or so? I mean, the volume of the data itself doesn't change, but maybe the polling or the frequency can be lowered to save energy.

That's true, and for example we have this interval where we send statistics once a day, and we can vary that; there is no problem with it. But when we measure a pump emptying process, the customer wants to have this data immediately, because the train is in the maintenance station and it has maybe 10 tank connections, and Deutsche Bahn wants to make sure that all of them could be emptied successfully. If not, then five minutes later they could take some action: they could connect a stronger pump, or there might be other measures, and they can learn from it. So they want to have that data immediately.

We have one minute left.

You said there is a second mechanism for starting the monitoring process, right? When there is no NFC, it's based on the pressure. But in this case, how do you know that the train is standing near the pumping station, or what's the trigger for starting the monitoring process if there is no NFC available? Because when the train is speeding up the pressure actually changes, and I'm wondering how you differentiate between the train standing near the pumping station and a pressure change because the train is moving.

Well, we know that the user has taken the system out of the holder because the accelerometer has triggered. And second, once the pressure changes, we know that the pump is on and something is connected to the plumbing; only then does the pressure change. If the plumbing is open, the pressure will not go below that threshold. And if there is liquid flowing, then there is always movement and there are always some kind of waves, and that's a distinctive pattern that we can recognize. So if you
would like to know more details about that, you can send me an email and I can send you a detailed description.

Just real quick, there is one question in the chat. I'm not sure of the context here, but they asked: why not shut down the temperature and humidity sensors as well? That came in during the middle of the talk.

Yep, we can do that, but it's not required because they are already very low power. TI has designed them so that they can be in a sleep mode, and they consume, I think, less than one microampere, so switching them off doesn't bring a significant benefit.

Very good, and with that we have actually reached the end of our time here. I'd like to thank our speakers, and I think you will also engage online in the chat afterwards. So thank you very much.

Thank you.