Hi, my name is Luigi Zapparelli, and I will be presenting the IoT PaaS, an Internet of Things platform as a service, with my colleague Fan. Unfortunately Fan could not make this recording session, so I will be covering his part of the presentation as well.

The ask was to come up with a platform as a service, which really is the application layer and data layer of the solution. The requirements were that it be API based, in other words RESTful API endpoints for data in and data out, based on JSON, so very standard requests; that it have an analytics element where we could aggregate specific data; that it have a pluggable AI scoring engine, which was a nice-to-have rather than a major part of the requirement; and that there be a dashboard to view the data in graphic form. We also had a requirement to keep seven days' worth of data for any device that was used, and I'll elaborate later on why seven days — it was simply part of the ask.

Over and above that, the technical requirements were that we adhere to an event-driven microservice architecture, that it be loosely coupled and based on Linux containers, and that the system be scalable and highly available, handling about 300 million documents a day — roughly three and a half thousand requests per second — so a fairly high throughput requirement.

The tech stack we chose for the PaaS is obviously based on a container orchestrator, since the requirement was Linux containers, and the obvious choices are Kubernetes or OpenShift. For the messaging queue we used Kafka, with message producers and consumers that we could implement and plug in and out for our loosely coupled architecture. On the database side we used Couchbase, a NoSQL database. Couchbase is fairly impressive — it gives you your basic CRUD database engine — but you need three dedicated nodes for a high-availability setup, replicated with persistent storage, plus an extra two nodes dedicated to the eventing and analytics engines. That consumes a lot of resources, so I also implemented a simple Redis interface, and the choice is up to the user to deploy whichever they prefer; Couchbase obviously gives you so much more.

The frontend was based entirely on Vue.js, and the backend on Golang, the language we are familiar with. For our CI/CD pipelines we implemented Argo CD and Tekton. These are great open source projects and can be deployed completely on Kubernetes; OpenShift also has OpenShift Pipelines, which is based on Tekton.

Before I elaborate on the actual IoT devices, I just want to say that the devices we were using and testing on were medical devices, based on non-invasive vital-sign measurements. We had various devices, and I'll show them on the next slide.
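Before getting to the hardware, to make the JSON requirement concrete: below is a minimal sketch of what one combined vital-signs document might look like as a Go struct. The field names and units are illustrative only, not the exact schema from the project repository.

```go
package model

import "time"

// VitalSigns is an illustrative shape for one combined reading posted
// to the ingestion endpoint as JSON. The real schema in the project
// repository may differ; this is only a sketch of the idea.
type VitalSigns struct {
	DeviceID        string    `json:"device_id"`
	PatientID       string    `json:"patient_id,omitempty"`
	Timestamp       time.Time `json:"timestamp"`
	SystolicBP      int       `json:"systolic_bp"`      // mmHg
	DiastolicBP     int       `json:"diastolic_bp"`     // mmHg
	SpO2            int       `json:"spo2"`             // % blood oxygen saturation
	HeartRate       int       `json:"heart_rate"`       // beats per minute
	Temperature     float64   `json:"temperature"`      // degrees Celsius
	RespiratoryRate int       `json:"respiratory_rate"` // breaths per minute
}
```

A reading of this shape is what the message producer accepts over the REST endpoint and what eventually lands in Couchbase or Redis, as described in the architecture later on.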
One of the devices was a wearable, shown in the top right here, that only had three vital-sign measurements: it gave you systolic and diastolic blood pressure, SpO2 — your blood oxygen saturation — and heart rate. It did not have temperature or respiratory rate, so we had to improvise and build an interface that would handle temperature and respiratory rate, and we would combine the readings into one JSON payload.

We also had a development kit that we were asked to implement on. The development kit uses the same photoplethysmography principle and pulse transit times — you can read up on that, there is some really good documentation online — and the device in the bottom left corner of the picture also has a temperature sensor and an accelerometer. The other piece of electronics was my development interface for building an SPI application channel for both the wearable device and this SDK-style hardware device. The data is then fed into this Omega Linux single-board computer over its SPI interface, and it also reads the data coming from the Bluetooth gateway, which is not shown here: the wearable device pushes its data to the Bluetooth gateway, the Bluetooth gateway runs a TCP server, and from that TCP server we can inject the additional readings we needed. Everything then goes through the Omega device, which is really a Wi-Fi interface running a Go process that pushes the data over HTTPS, so we have a secure connection to the platform as a service. That's an overview of the hardware used to implement the solution. As I've mentioned, it's very medical, vital-signs oriented, but you could change the implementation fairly easily, and I'll discuss that in the detailed architecture.

From a visual perspective, the interface — or rather the dashboard for the analytics and the views into the data — is based on Vue.js and deployed to Firebase. Firebase is really nice: it uses a CDN and pushes the assets to edge servers, so you could connect users in specific regions and areas and make use of those edge services. I've also shown the wearable device with the Bluetooth gateway here. Unfortunately I didn't have a picture of the Bluetooth gateway, but you can have several watches connect to it. You can then intercept the TCP server that the Bluetooth gateway connects to and inject extra data, and the hardware device from the previous slide also pushes to the IoT platform via its Wi-Fi module, so the user — doctor, physician, or patient — can then view the data in the interface.

Ingress into the application platform is via an Elastic Load Balancer; this was deployed on AWS. From the ELB, traffic goes to various ingress points. We had an ingress dedicated to the message producer. I've shown a couple of pods here, and the nice thing with Kubernetes and OpenShift is that you can scale these pods: they have the notion of a service, and the service uses a round-robin algorithm to connect to the different pods, which have a one-to-one correlation with Linux containers. The message consumer then reads the data pushed onto the Kafka messaging queue and writes it to either Couchbase or Redis. We also had another ingress — or rather route — to an API gateway, which is basically an NGINX service deployed as a reverse proxy.
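To show the message-producer pattern concretely, here is a minimal sketch in Go using the Sarama Kafka client. The broker address, topic name, device-ID header, and handler path are assumptions for the example rather than the exact values from our deployment.

```go
package main

import (
	"io"
	"log"
	"net/http"

	"github.com/Shopify/sarama"
)

// A minimal ingestion handler: accept a JSON vital-signs document and
// publish it to a Kafka topic. Topic, broker, and path are placeholders;
// the real producer in the repository may differ.
func main() {
	cfg := sarama.NewConfig()
	cfg.Producer.Return.Successes = true // required by the sync producer

	producer, err := sarama.NewSyncProducer([]string{"kafka-bootstrap:9092"}, cfg)
	if err != nil {
		log.Fatalf("kafka producer: %v", err)
	}
	defer producer.Close()

	http.HandleFunc("/api/v1/vitals", func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		// Key by device ID so compaction (discussed later) keeps only the
		// latest reading per device.
		msg := &sarama.ProducerMessage{
			Topic: "vitals",
			Key:   sarama.StringEncoder(r.Header.Get("X-Device-ID")),
			Value: sarama.ByteEncoder(body),
		}
		if _, _, err := producer.SendMessage(msg); err != nil {
			http.Error(w, "queue unavailable", http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusAccepted)
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Because the producer and the consumer only share the Kafka topic, either side can be replaced without touching the other, which is the loose coupling mentioned earlier.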
The nice thing with the NGINX gateway is that we use ConfigMaps, so you can update and change the configuration as you add or change services. The analytics service simply presents data for the dashboard. We have an auth service to ensure all endpoints are secure, and then a machine learning service, which is the pluggable machine learning interface. I also implemented a cron job; this is very Redis specific, because Redis doesn't have the eventing feature. Couchbase has a powerful eventing feature where you can write JavaScript functions to aggregate data and push it from one bucket to the next, so to cover that with Redis I used a cron job that reads data from Redis and stores it in a key-value structure for the aggregation. That's basically the implementation of the different services.

What I also did was implement a metrics interface. By the way, all the deployment templates are in the repo: to deploy Kafka we use Strimzi, for Couchbase we use the Couchbase Operator, and for Redis there's a Redis HA deployment template. The templates for the observability section are in the repo as well, and I'll have a link at the end of the slides that you can go look at. We have a dedicated ingress for a Grafana dashboard, which reads from a Prometheus server that does all the scraping from the different microservices.

The dedicated Kafka instance is again placed on three dedicated nodes, and these nodes are tainted so that no other services run on them. Each has a Kafka and a ZooKeeper instance — this was before the newer versions of Kafka that no longer need ZooKeeper. For Redis or Couchbase we also had three dedicated, tainted nodes running an HA deployment, one in each availability zone. We have persistent volumes for Kafka and ZooKeeper, so that in case of an outage all data is persisted, as well as persistent volumes for either Redis or Couchbase.

For the AI scoring, here is a quick overview of the elements needed. Basically a data engineer or data analyst develops the model in Scala, and this creates a bundle — a zip file. There is a great project called MLeap: you push that zip file into the pod, or deploy it within the pod, and MLeap picks it up and handles the scoring interface via JSON. All development is done locally, you create the artifact, the zip is uploaded into the MLeap service, and that can then be used for the machine learning scoring.

On the analytics side, as I mentioned, Couchbase has a very powerful eventing service where you can write JavaScript to aggregate data, and the great thing is that you can then keep the amount of data your database has to store under control. You don't want to keep too much raw data, so you set a time-to-live on the initial bucket where all the events arrive from Kafka. The Couchbase eventing routine then has to run within that TTL: if, for example, you set the time-to-live to 10 minutes — this depends on your capacity planning — you would need the aggregation eventing to run every five minutes to make sure you capture all the data.
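On the Redis side, where there is no eventing, the cron job mentioned earlier does the equivalent roll-up. Below is a minimal sketch of what it might look like with the go-redis client; the key layout (a list of readings per device rolled up into a hash of averages) and the field names are assumptions for illustration, not the project's exact scheme.

```go
package main

import (
	"context"
	"encoding/json"
	"log"

	"github.com/go-redis/redis/v8"
)

// reading mirrors the part of the JSON payload we aggregate here.
// Field names are illustrative only.
type reading struct {
	HeartRate float64 `json:"heart_rate"`
}

// Run periodically (e.g. as a Kubernetes CronJob): read the recent
// readings for a device from a Redis list and store the average in a
// hash, standing in for Couchbase's eventing-based aggregation.
func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "redis:6379"})

	deviceID := "device-001" // the real job would loop over all devices
	raw, err := rdb.LRange(ctx, "readings:"+deviceID, 0, -1).Result()
	if err != nil {
		log.Fatalf("lrange: %v", err)
	}

	var sum float64
	var n int
	for _, item := range raw {
		var r reading
		if json.Unmarshal([]byte(item), &r) == nil {
			sum += r.HeartRate
			n++
		}
	}
	if n == 0 {
		return
	}

	// Store the aggregate under a separate key, analogous to pushing
	// aggregated documents into a second Couchbase bucket.
	if err := rdb.HSet(ctx, "aggregates:"+deviceID, "avg_heart_rate", sum/float64(n)).Err(); err != nil {
		log.Fatalf("hset: %v", err)
	}
}
```

In Kubernetes this would run on a schedule tighter than the data's TTL, for the same reason the Couchbase eventing runs every five minutes against a ten-minute time-to-live.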
You aggregate the data and push it to another bucket, so it's a really powerful feature within Couchbase. As I mentioned, I haven't implemented that here: for this implementation I have used Redis, so the cron job runs instead, because Redis doesn't have the eventing.

Now to go into more detail on the IoT performance and how we got to 3,500 requests per second. After looking at the architecture, I found two levels of optimization. All the nodes were based on CoreOS streams, so the actual settings are very CoreOS specific. In /etc/security/limits.conf I set the soft and hard limits for any user and root to over one million, and in the /etc/sysctl.conf configuration I set fs.file-max to over one million. We are no Linux experts, but I found that these two configurations were enough to get the throughput I wanted. There is also a README in the repository that you can look at for additional tweaks and optimizations on the host, and every host in our deployment had these settings enabled.

For the Kafka optimization, what I found really worked well was setting up the log cleaner. The cleanup policy I used was compact and delete, and I set the log retention hours to one. You can obviously lower that — you could set it to a couple of minutes if you wanted to. What's so nice about this is that the algorithm takes a specific message ID and the message data, looks for duplicates, and compacts those duplicates so that you only keep the latest data. So if you had, for example, 1,000 devices, you would only have 1,000 message IDs within Kafka, each carrying the latest data, with the unwanted duplicates compacted and deleted. In this way you save a lot of overhead with regard to data storage. This was one of the optimizations we found really worked well, and it's something you can play around with and tweak.

So what I did here, just to indicate what this is all about, is that I used gobench, an open source Golang project. I cloned the project and made some changes specific to our payload version, and for this example I created 1,000 different users with different payloads and let it run. Running this over a full 24 hours is a bit time consuming, so I found I could do it within about six and a half hours by pushing the rate to 13,585 requests per second. I managed to get 326 million requests through in that time with no failures, which is really good. You can see here I've used the NodePort on my local Kubernetes lab implementation, and I found the optimum was 500 clients to give me the 13,585 requests per second. That's roughly three times, maybe more, than what was requested, so it gives you a good level of confidence that the system works and scales. This was really beneficial — it was good to see that with this architecture we could scale. And what I also did was add metrics.
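Those metrics come from instrumenting the Go producers and consumers for Prometheus scraping. Here is a minimal sketch of that kind of instrumentation; the metric names and the handler are illustrative, not the exact code in the repository.

```go
package main

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Illustrative metrics: a counter of processed requests and a histogram
// of request durations, the two figures shown on the Grafana dashboard.
var (
	requestsTotal = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "vitals_requests_total",
		Help: "Total number of vital-sign documents processed.",
	})
	requestDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "vitals_request_duration_seconds",
		Help:    "Time taken to handle one ingestion request.",
		Buckets: prometheus.DefBuckets,
	})
)

func handleVitals(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	defer func() {
		requestsTotal.Inc()
		requestDuration.Observe(time.Since(start).Seconds())
	}()
	// ... validate the payload and publish it to Kafka here ...
	w.WriteHeader(http.StatusAccepted)
}

func main() {
	prometheus.MustRegister(requestsTotal, requestDuration)
	http.HandleFunc("/api/v1/vitals", handleVitals)
	// Prometheus scrapes this endpoint on each pod.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```

Prometheus scrapes /metrics on each pod and Grafana reads from Prometheus, which is where the figures on the next slide come from.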
On the actual message consumer and producer you can see that the average heap memory and CPU usage were really exceptional — way below what I expected. The more important figures are the average request times, and it's a pity this graph is fairly small, but you can take my word for it that the request times were under half a second. At the point in time when I took the snapshot for this slide we were at 300 million requests, and the average request time was sitting at or under about 500 milliseconds, which is fairly impressive. The consumer was the same: low memory usage, low CPU usage. All in all I was really satisfied with the throughput. The final reading at the end, just to make sure I got most of the data through, was 326 million. If I go back to the initial test we see 326 million there too: gobench told me it had produced 326 million, and the Prometheus server's analytics had picked up 326 million. The small discrepancy you see here is because I initially ran some tests and did not clear them before I started the complete profiling run. All in all, we were really happy with the throughput and the response times. I think what we were happiest with was the simplicity of the design and the simplicity of the implementation. The offering is fairly customizable, and you could make it work in different industries — medical, industrial systems, automotive, education; the list goes on.

What I'll do now is share the frontend design and the overview we used to implement the system, and show you the actual dashboard. The thinking was simply to have a main dashboard, and yes, I know people are going to say, "You have PII data exposed here." I chose to include it as an example, but obviously a name tied to an ID does not have to be revealed. It could be completely hidden, with an opt-in and opt-out, and some way of tying a profile to a specific patient that falls under doctor-patient privilege. We have the specific pieces of information here: the systolic blood pressure (the diastolic was not needed for the statistics, or rather the analytics), the average heart rate, the average blood oxygen saturation, the respiratory rate — it's a weird icon, I know, but that's what we had at the time of the design — and the temperature that was measured. These are the basic averages over the week, since we only collect seven days of data.

Then, for the actual AI engine, we scored against NEWS, the National Early Warning Score, which has a matrix, and the scoring was done against that. Beyond that there is just other information — usage and so on — and then the level. So this patient, because of one or two measurements that exceeded the NEWS thresholds, is in critical condition, or rather critical mode. The indicators here show the early warning scores: you can see the heart rate is at level 2 according to the NEWS matrix, and the problematic area was at level 3, where the temperature was above the average.
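To give a feel for how that matrix works, here is a small Go sketch that scores heart rate and temperature against the published NEWS2 bands as I recall them. Treat the thresholds as approximate and check the official NEWS2 chart; in the platform itself the scoring sits behind the pluggable ML service, not in code like this.

```go
package main

import "fmt"

// newsHeartRate returns the NEWS2 sub-score for heart rate (bpm),
// using the published bands as I recall them; verify against the
// official NEWS2 chart before relying on these thresholds.
func newsHeartRate(bpm int) int {
	switch {
	case bpm <= 40:
		return 3
	case bpm <= 50:
		return 1
	case bpm <= 90:
		return 0
	case bpm <= 110:
		return 1
	case bpm <= 130:
		return 2
	default:
		return 3
	}
}

// newsTemperature returns the NEWS2 sub-score for temperature (°C),
// again using approximate published bands.
func newsTemperature(c float64) int {
	switch {
	case c <= 35.0:
		return 3
	case c <= 36.0:
		return 1
	case c <= 38.0:
		return 0
	case c <= 39.0:
		return 1
	default:
		return 2
	}
}

func main() {
	// A reading with an elevated heart rate and a fever.
	fmt.Println("heart rate score:", newsHeartRate(115))     // -> 2
	fmt.Println("temperature score:", newsTemperature(39.4)) // -> 2
}
```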
Lastly, we have a group of charts, with seven days of data for each measurement: blood pressure, heart rate, blood oxygen, temperature, and respiratory rate. I know this is a very simplistic view and interface. As mentioned, it was built around Vue.js, which has a very simple implementation; it makes calls to the backend services, and those services use JWTs to protect the endpoints. That's basically it for the design.

In conclusion, I think we met our criteria: we designed and developed the system with event-driven microservices on a loosely coupled architecture, with totally pluggable interfaces, based on Linux containers. We met the scalability and high-availability criteria, and we have a good level of confidence that the system can produce the throughput that the upfront requirements called for.

I want to thank you for listening to this talk. I believe these slides will be shared, and I've put the appropriate links in here. If you're really interested, I've also written a couple of blog posts on LinkedIn describing how I went about profiling the Golang microservices and how I used the Spark ML Scala engine to run against NEWS, the National Early Warning Score. As a little side project — because our Tekton pipelines were taking more than about eight minutes to build and deploy, and while I'm a great fan of Tekton and Argo CD, in a development environment that is a bit slow — I created a custom CI/CD pipeline, and I've written an article on that as well. It does linting, runs all the unit tests, does static code analysis via SonarQube, then builds the actual Linux container, embeds the binary in it, and pushes it to Quay.io using Kaniko (Google container tools). It's a fairly neat thing: because the build is Kubernetes-native it makes use of the cluster, and you can get end to end from development build to your Linux container registry within a minute.

I've also dropped in links to the complete project. In the project you'll find links to the message producer and the message consumer for both Couchbase and Redis, the analytics service, the cron interface, the auth interface, and the templates to deploy all the artifacts — the database and the message queue. Then I have my implementation of gobench for the performance profiling, and I've included the Spark ML vital-signs Scala implementation just as an indication. I could not share the data: it was provided by a third party, and because we were still at an early stage of the engagement they did not want any data revealed, so unfortunately I can't share it. The Spark ML part was cloned from another project, so it really was a simple implementation.

So again, I want to thank you. My contact details are here, and Fan's contact details are also here. Please feel free to contact us, clone the projects, and read up on what we have to offer. I think it's really easy to implement other types of services by changing the schemas — the message producer and consumer are extremely simple, there's no rocket science in them. I'll leave it open to questions and answers. Thank you very much for listening.