My name is Bogdan Mitra, and I'm an architect at Dentsply Sirona.

My name is Kevin Rewijk, and I'm a principal architect at Spectro Cloud. I've been working together with Bogdan over the last 10 months to finalize this solution, and we would like to walk you through it.

So, as we're going to be talking about taking dentistry into the 21st century, I want to start with a genuine trigger warning, because I don't know how this is for you guys, but the next image that we're going to show might trigger something that you had in your childhood. If you can't handle this sort of stuff, close your eyes, and I'll tell you when the image is gone.

So, if you ever got dental impressions for braces, then this is the stuff you had to bite into. And if you're 35 or over, then you can probably still imagine how horrible this stuff is. I had braces when I was 16, and for me this stuff was purple, not blue, but I can still taste how this stuff smells. And maybe you can too. Bogdan, please tell me that there is a better way to do this today.

Yes, actually, Kevin, there is definitely a better way, so I will share it with you. As you can see here, this is very fancy, and I will try to put it into words and show how the new technology helps us improve both the patient and the doctor experience.
As you can see here, it uses the latest technology and is pretty straightforward: the doctor can use an intraoral scanner to scan the patient's mouth and create a 3D model out of it. This helps the patient experience quite a lot, and it also gives the doctor a better understanding through the 3D model. Our research says that almost 80% of patients were much happier with this approach. I think, Kevin, you are one of those guys.

I would have loved it if this existed when I had to do this.

Okay, now briefly, how this actually works. The patient visits the dentist for a treatment and takes a seat, and the doctor scans the mouth using the intraoral scanner. All the data is streamed to our edge device, which is hidden somewhere in the office. There we process the information using our rendering services, which are backed by GPU power. As soon as the model is created, it is shown on the display to the doctor, who can take further actions: check that everything is fine, make some adjustments, and at the end push the model to the cloud, either to send it to a lab for ordering or to open other flows on our side.

So you can imagine that this is quite a big difference from how things used to be, right? On the left-hand side is what it was like for me. It wasn't great for the patient; I had to stop myself from hurling all over the doctor. It wasn't great for dentists: a lot of manual work and sometimes mopping the floor. But there was no IT involved. Maybe you had to do it twice if you were unlucky, but that was about it. On the right-hand side, I would have been a much happier patient.
The doctor is much happier because their patients feel they have gotten a much better service. But you can imagine that the DevOps team is like: how the hell are we going to put a Kubernetes device at the edge, where there's no one who can help and I have no remote access to it? How do I do that? That's what we're here to talk about. Because it's not just about running Kubernetes. Of course, if you're building a new application, you're building it on top of Kubernetes. But if you're shipping that to a remote dentist's office, now you're talking about edge Kubernetes, and there's a lot below the waterline in terms of the issues you run into. As I said, there are no local skills: the doctor can turn it on and turn it off, and that's about all they can do. How do you do remote maintenance? How do you work on networks that you don't control? What happens when the doctor goes on vacation and turns the thing off for six weeks? There's a bunch of problems here. So how do we tackle those?

Okay, so during this journey, which as Kevin mentioned has been running for almost one year for us, we had a lot of challenges with managing edge devices at scale. Here we group them into three big parts, which I think are very important to consider whenever you want to manage devices in on-premise environments.

The first one is onboarding: how we send the edge device to the doctors. Most of the time the doctors are good at what they do, but they don't have much technical expertise. So we cannot ask them to connect to the device and run some Linux commands. The other big challenge for us is that the edge device can run in uncontrolled networks, which obviously can differ from customer to customer, right?
But we still have to manage this in order to be able to deploy at scale. And we want to give a very good experience to our customers, right? The whole idea for us is to have a plug-and-play device that is shipped by a logistics company; the doctor gets it, turns it on, and then everything should happen automatically, or magically: updating the services, getting security updates, and so on.

The next big challenge is security. We really have to take this seriously because we operate in highly regulated environments, so we need to consider which sensitive data is available on the edge device and how we can protect it. And in the fast-moving software world, things change quickly and new vulnerabilities appear, so we should be able to react quickly and provide patches to the services running on premise.

Maintenance was another challenge that gave us headaches. The main problem here is that there is no physical access to the edge device. The device is running somewhere in the field, and in case of issues we have to be able to jump in, figure out what the problem is, and, more importantly, fix it. Then, of course, software drifts and ages: we need to provide operating system updates, we need to provide Kubernetes updates, and we also want to update our services over the air in case we need to patch or deploy new versions. And last but not least, we definitely need to have a proper fleet monitoring solution.
We deploy in the field, and we don't want the dentist or the clients to be the ones who figure out that there is an issue with the edge device, so we want to be one step ahead of them and discover it on our own. With all these challenges, it was pretty challenging to find a solution, and in the next slides I will try to explain, based on these challenges, how we came up with our platform.

Starting from the onboarding challenges, we came up with some requirements. As I said, one of the requirements from our business guys was to have a real plug-and-play solution: you just plug it into the network and it updates itself without the doctor needing to do anything. That removes the need for a service technician in the field, which, if you multiply by the number of practices we are managing, would be quite some money for the company. The other requirement that came from onboarding is that we need to be very careful about how we run Kubernetes clusters on different networks. On some networks DHCP is enabled; in other practices they need to set up proxies in order to get connected to the internet. We have to take care of all of this because, as I said, the environment might differ from customer to customer.

Now, on security, we came up with some clear requirements on our side. We definitely need to understand the flow of the data on the edge device, to see where sensitive information is used so that we can protect it. Based on these challenges, we concluded that we need full disk encryption enabled on the edge device, because we also store some raw patient data on the device for some amount of time.
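The networking requirement mentioned above (some practices offer plain DHCP, others need a proxy to reach the internet) can be sketched as a small per-site profile. This is only an illustration under stated assumptions: `SiteNetwork`, `proxy_environment`, and the example proxy URL are hypothetical names invented here, not part of Palette, Kairos, or the Dentsply Sirona onboarding app.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SiteNetwork:
    """Hypothetical per-practice network profile (illustrative only)."""
    site_id: str
    # None means the practice gives direct internet access (for example plain DHCP).
    proxy_url: Optional[str] = None
    # Hosts that should bypass the proxy.
    no_proxy: Tuple[str, ...] = ("localhost", "127.0.0.1")


def proxy_environment(site: SiteNetwork) -> Dict[str, str]:
    """Return the conventional proxy environment variables for a site.

    An empty dict means no proxy settings are needed.
    """
    if site.proxy_url is None:
        return {}
    return {
        "HTTP_PROXY": site.proxy_url,
        "HTTPS_PROXY": site.proxy_url,
        "NO_PROXY": ",".join(site.no_proxy),
    }


if __name__ == "__main__":
    direct = SiteNetwork(site_id="practice-a")
    proxied = SiteNetwork(
        site_id="practice-b",
        proxy_url="http://proxy.example.internal:3128",  # assumed example value
    )
    print(proxy_environment(direct))
    print(proxy_environment(proxied))
```

Keeping the site-specific part in one small record like this is one way to let the same device image work on very different customer networks; how the settings actually reach the device is a separate concern.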
Secure boot is also very important to consider. As I said, we have a manufacturer that delivers the edge devices, so nothing is in our hands: from the moment the edge device is built until the moment it reaches the dentist, a lot of things can happen. We have to make sure that nobody tampers with it or installs malicious software, and we should also be able to detect this, which is done through secure boot. And of course we want to stay on the latest version regarding patches and security updates, so we need to be able to update the Ubuntu operating system we are using over the air.

Coming from the maintenance challenges, we also came up with some requirements. The solution should be resilient against failures. Failures will always happen; I have been working in the field for almost 18 years and I have never seen perfect software, so at some point it will happen. And we definitely need a central solution where we can monitor all our devices. There is nothing more annoying for an operations team, as you might know, than having to open one application to see one thing and then another application to monitor something else. So it is really important to have a single pane of glass for everything related to the edge devices.

So now the question: how did we pick the right platform to help us manage our edge devices? I will say that I spent a lot of nights, drinking tons of coffee, doing different POCs with different providers. From a functionality point of view they are actually very similar, but a lot of solutions still have a data center management mindset, where one company manages a few hundred clusters. With edge devices it is a completely different game, because we are talking here about
thousands of devices. You can imagine how many practices there are, and on each of them we will have to deploy an edge device. That is how we ended up selecting Palette as the platform to manage them, and one of the main reasons was the flexibility it had to integrate with other applications, which I will talk about a bit later. So, Kevin.

If we take a small tour of what Palette Edge does: it is based on Kairos, which is an open source meta-distribution. If you go to kairos.io, there is an open source project that we have built. It took the starting components of Elemental, originally developed by Rancher, and we basically completed all of that work and took it to the next level. Elemental is no longer maintained by Rancher themselves. What we are able to do with this is set up any OS that you want, so it supports Ubuntu, Fedora, Alpine, RHEL; you can start with any OS that you want. It is flashed onto a real device: it makes an ISO, you stick it into the device, and it puts the software on there and makes an A and a B partition. Since it is immutable, it works from a container image. We download a container image, we have two copies of it on the box, and we can boot into one or the other. So when we do upgrades, we automatically upgrade the second image and try to boot into it; if that works, fine, and if it doesn't, we roll back. This gives you a really flexible platform.

Then we have added some commercial components to it. One is overlay networking support. These devices, whether it is one or multiple, will find each other at a location and then build an ad hoc VXLAN network with each other, so that all of the Kubernetes nodes that run on top of them have a VXLAN network with static IP addresses that will never change, while the real network underneath can be anything and can change. So Bogdan was actually able to deploy clusters, bring them home, and they would work there as well, which is really relevant because they want to be able to
pre-stage a cluster, make sure it's all working, and then ship it to a customer. It then goes to a completely different network, which means you need something like this to be able to do that.

The next thing we did is add trusted boot and full disk encryption capabilities. This is full secure boot, boot measurements, and full disk encryption, which allows the system to be completely tamper-proof from the get-go. We take care of all of the work of getting the keys imported into secure boot, and even make it possible to, for example, use signed NVIDIA drivers so that your GPU can still operate while the system is in secure boot mode.

On top of that, all of this is remotely managed. We have a device management agent and a cluster management agent running in here. The device management agent allows us to do full upgrades, so we can pull in another image for a different OS or a different Kubernetes version, so that you can do build-to-build upgrades. The cluster management agent allows you to build the cluster and update all of the components within it. This all reports into our SaaS platform. In the case of Dentsply, they get a dedicated SaaS platform that they are the only owner of, to make sure that all the data is within their jurisdiction.

If we go to the next slide: this SaaS platform provides a wealth of projects, both open source and commercial, that you can use to finalize the solution. So if you want to use NGINX and MetalLB, or Argo CD or Flux, you can just grab all of these things from either our public repository or our community repository and quickly build up a cluster profile, as you can see here, to build your clusters with and then maintain your clusters with.

So let's take a look at how we then actually ship a device and what happens when it gets there. Let me first talk about what we do out of the box and then what Dentsply did to take it to the next level. As I said, our builder will
create an ISO that you stick into the device. It bootstraps, it flashes the core OS to the disk with a registration agent, and then basically you can put it in a warehouse and stock it. At some point in the future it will be sent to the customer. When it gets to the customer, they can power it on, and then there are two options. Either you've pre-configured automatic registration, and it will just show up in Palette as an edge host that is ready for provisioning, and you can deploy a cluster to it. Or you can do some sort of self-service registration, where it shows a QR code on the screen, assuming there is a screen connected. They can scan that with a phone, they get redirected to a registration website, and there they can register the device, after which an admin can attach a cluster profile to it. Then the cluster builds and fully deploys, and the user can use it. But Dentsply wanted to go a couple of steps further and make this fully hands-off, so let's take a look at what they've done.

In our case we wanted to customize the baseline provided by the Spectro team a bit. Here, in the edge manufacturer warehouse part, on our side we just build an ISO image. We have a partner that builds edge devices for us based on our specifications, so they flash the ISO onto the edge devices, and at some point the devices are shipped to the warehouse. From there we go to step 2, when the edge device reaches the on-premise site. Here Spectro Cloud provides the possibility to auto-register edge devices with their platform whenever they turn on, but we don't want this approach. So we disable this functionality; instead we want to be in control and decide when an edge device is imported into the platform. Therefore we have an onboarding application that runs on a mobile phone and talks to the edge device via the Bluetooth
protocol. This gives you quite some benefits, because even if something is wrong with the edge device, for example it cannot connect to the internet, we can make some network adjustments through this Bluetooth application if necessary. As soon as the edge device is onboarded in our platform, we whitelist the connection to Palette by creating an edge host in Palette. As soon as this is created, the edge device and Palette in the cloud pair with each other, and at that moment, based on our configuration in Palette, all the profiles start being deployed to our edge device, meaning that our applications get deployed. For our full stack, and of course it also depends on the quality of the customer's network, we need roughly 30 to 40 minutes to fully install an edge device on premise. To create edge devices we also use Terraform modules provided by the Spectro team.

Now I want to highlight some key strategies that we think are very important whenever you are running edge devices on premise. The first one is data encryption. Depending on your business, you may store sensitive information on the edge device, so you really need to think beforehand about what you need to do. There are three things we definitely recommend: encrypt data in transit, encrypt data at rest, and have a solid approach to encryption key management, because if keys are leaked you have to be able to rotate or even revoke them.

The next thing we think is very good to follow is a zero trust approach. The main problem is that nothing is in our environment, so we cannot trust anybody. So go with least-privilege access, so we
even removed, on our side, the user password credentials from the edge device. So even if I wanted to, by default nobody is able to connect to the edge device, because there are no credentials. Also define some policies that have to be enforced. Here I'm talking mainly about Kubernetes, to make sure that an attacker or bad actor is not able to pull images from untrusted sources and deploy them on your Kubernetes cluster in case it is compromised.

The third thing, which I think is crucial, is intrusion detection. As I said, I have been in the field for quite some time and I don't think there is perfect software that is 100% secure, and I think you can confirm this. So it might be just a matter of time until something bad happens, and I think nobody wants to get a spot in the newspapers saying that there was a security breach or data breach at their company. So it's very important to have measures in place to detect whether a breach has actually occurred, and to be able to react fast when something like this happens. There have been many companies where somebody breached devices and then stayed there for months until they were detected. We definitely don't want this. So, to react fast, define a process so you know exactly what you have to do when a breach happens: rotate keys, and in case some keys are compromised, have a process to revoke them.

Now we will also talk a bit about our key findings regarding the maintenance of edge devices in the field. From this perspective, in the first place we want to deploy updates automatically, so automation is key. Nobody should consider doing something manually, because with such a number of devices it would be, I mean, not impossible, but really hard; we
would have a lot of headaches. So, talking about automation: service and operating system updates, as I already highlighted in some of the previous slides. And you don't need to reinvent the wheel; some of the wheels are already there. So try to follow well-known paradigms like the GitOps approach, which gives you the possibility to deploy services on Kubernetes, and also follow best practices related to Kubernetes deployment strategies, like rolling out deployments.

Related to maintenance, the other important part is alerting and monitoring. We definitely need to automate alerts and try to create some smart thresholds, because there is nothing more annoying than false positives on alerts; if that happens several times, people will just send them directly to the spam box. And obviously we need to be the first to find the issues, not our clients. The next step will be to consider AI and ML to improve this, by running anomaly detection to be able to detect deviations from your normal patterns.

And then finally, let's take a look at monitoring. You want to have a broad view of your fleet out in the field and, as Bogdan said, find out about issues before your customers do, so that you can react and be aware before they actually make the phone call that there is some sort of issue. At the bottom you see an example of our geographic operations screen, which shows you where your customers are running, with more information here on the side to give you an indication of whether those clusters are healthy or whether there is a problem. You can go even further with that. I know that Dentsply uses tags a lot, not just for location data but also for customer data, and they can use this to automate which specific configurations go to a customer. The tag-based system can also be used within Palette, so that you can drill
down the views into anything that matches particular tags. As you get to thousands of devices, this becomes really critical to make sure that you can zoom into a particular environment really quickly.

Actually, with tags you can even deliver updates based on certain tags, which means, for instance, that you can start deploying some updates in a specific region. Here I just want to emphasize that the takeaway is to consider having a single pane of glass for monitoring, because this will help a lot.

Now, as we are approaching the final minutes of the presentation, I will also share some of the lessons we learned during this almost one-year journey. So let's dive into those. The very first learning: initially I thought that it's not so difficult to manage an edge device, but it's actually way harder than we thought initially. There are extra security challenges, we go into unknown customer environments, and remote troubleshooting also gave us some headaches. The other big learning on our side is that selecting an edge management platform was also a complex and kind of complicated process, because there are some solutions in the field, but the technology is still young, as I mentioned, when we are talking about managing such a number of edge devices. The other thing that was important for us was flexibility and scalability, right? Because we want to be able to manage a lot of devices. And one big point from my side is integration with open source, because in this case you can easily try different things and then decide which one is best for you. Change is the only constant in the software environment, so things change fast: today we are using one library, and it can be that in 6 months that changes. And, you know, vendor lock-in: just avoid going with a solution and then being blocked
and finding it very difficult to try out something else. That's why my recommendation is really to embrace the open source community. Here on the right side I put an image where we see Kubernetes in the middle, and we can see that around almost all of our challenges and key takeaways there are a lot of libraries. That is also the reason I liked Palette when I was doing the POC: I had the freedom to try things without asking Spectro to do some custom implementation for Dentsply Sirona. I was able to try different solutions based on whatever I was targeting, and afterwards I was able to decide which is best for us. So that's pretty much it, and we will go to the last one.

All right. While you think of any questions you may have: if you're inspired by this talk and you're interested in joining Dentsply Sirona in making digital dentistry happen, the middle QR code brings you to the Dentsply Sirona careers page, where you can find open roles in their organization. And if you want to find more resources about this particular talk, the QR code on the right brings you to a landing page on the Spectro Cloud website, which will have this slide deck, a link to the video as soon as the CNCF releases it, and more resources about this talk. And with that, I'm opening it up for questions. Go ahead.

So, for example, if there are real hardware failures, do they replace the device itself?

Right now they are on a single-node solution, so in case of a failure they just ship another one. They push copies of the patient data out to the cloud for safekeeping, which means that at some point the device is replaceable. But they are looking at going to a multi-node strategy in the future, with 2 or 3 nodes running there, so that if they have a physical hardware failure the other nodes can take over.

On our side we have some use cases like, for example, a dentist scanning maybe 3 or 4 times a day, so it doesn't
make sense to bring two edge devices for this case, so there we definitely have a one-node cluster. But obviously we also have use cases like big hospitals, where we will definitely go with multi-node clusters.

Two questions. You mentioned proxies very briefly as one of the problems but never touched on the solution. Is this just covered by Palette as well, or did you have to do something for that?

Yes. When I mention proxies, just imagine that you go into some practice where, in order to reach the internet, which is definitely necessary for us, you need to configure a proxy, with settings provided to us by the IT department of that practice. We need to make sure that we can cope with this, because when we turn on the edge device there is no connectivity. That's why I mentioned on the onboarding slide that we have this onboarding application running on a mobile phone, which talks with the device via Bluetooth. From that application we are able to push network settings to the edge device, and one of those things is to adjust the proxy.

Yes, so out of the box we use cloud-init. You can add additional configuration to push a proxy at device flash time or at device boot-up time, but in their case they wanted a field engineer to be able to do it via an app at initial implementation time.

Okay, cool, thank you. Second question: you mentioned that the usual case in the onboarding is a QR code that they scan and that directs them to a website. Have you ever run into issues with corporate policies, where they couldn't log in on the phone, and that leads to a weird dance of getting that redirection to a laptop where they actually have their credentials?

Actually, what we're seeing is that while the QR code approach is really nice for a demo, all of our real customers use other options. They either use something like Terraform to pre-provision these devices, so that as they power up
they can immediately register, or, since we added auto-registration sometime last year, most customers really like that option, because then they can just power it on and it will automatically register into their Palette tenant, and then they can decide from there what happens with it.

Okay, cool, thank you.

No, actually it's nothing special. We have some specific specifications, but it's nothing more than some commercial products that are put together as an edge device.

Yeah, it's regular x86 hardware.

So I guess I have two questions. As you roll this out, have you run into issues yet where there's some CVE or something against the BIOS or some other firmware that you need to update in the field as well? Or is this something that you've seen yet?

So the way that we approach this: because this is an immutable OS, there is some reduction in attack surface, because attackers cannot make permanent changes to the device. Of course that doesn't solve everything, but it does help in mitigating how powerful the attack vector can be. Then, whether it's a kernel issue in Ubuntu Pro, which Dentsply is using, or a CVE in a regular package, what they do is run a fresh build that gets the newer versions of the packages and builds a new image. They upload that to their container repository, and then through Palette they can schedule an update to the newer image, which will download the fresh image and reboot into it, and then the cluster runs with the CVE fixed.

That doesn't address it if the UEFI BIOS has an issue. You are just talking about the Linux OS, which is important.

Right. So yeah, UEFI patching is not currently part of the solution, so the pre-staging step is most important here to make sure that everything is at the latest available versions. But it will probably be more a part of device lifecycle management, where these kinds of things are taken into consideration.

So we've seen this when we had customers deployed at the edge: they might have applications
built for the cloud, like containers, but those might assume that they have uninterruptible power, and then they go and run them on a device at the edge. And I would assume that the dentist is going to power off everything at the end of the day. There's a challenge at the application level to know whether the containers that end up being deployed are robust enough. I don't know if this is something that you guys have seen and whether you managed to test for this, because I don't really know how to do this, looking at all of the containers that are available out there.

To be honest with you, since we ship the edge device without any keyboard and even without any monitor, the easiest way to restart it is a hard restart. So sometimes we also had some issues. For example, we are using Longhorn as storage, and we had several issues with this hard reset approach, because it was not able to recover after the restart. But lately the Rancher team has improved this. We have some workarounds in place, but I see that now it looks way better. We didn't have many other problems, but the other issue we had with this kind of ninja restart approach was related to the Postgres database we are running on the edge device. After a hard reset, the next time the database started it really had to recover and do a lot of things, so it took quite some time to come back to a proper state.

I understood that you are relying on Cluster API. Is that something that you are using in this?

So Palette, the core platform, uses Cluster API for everything except edge, because there is basically no Cluster API solution for edge. Edge is based on Kairos OS plus a bunch of our own technology, where we mimic a lot of the capabilities of Cluster API for an edge-specific solution. So
what we do is, for all other platforms, like public clouds and on-prem VMware and bare metal with Canonical MAAS, we use Cluster API to build the cluster, and then our cluster agents take over to install all the other components. So you heard about all of the different software projects that Bogdan wanted to experiment with; that is part of our cluster profile, and our agents allow you to do that. For an edge device, that additional part is the same, and our cluster management agent is able to handle all of those additional components. But the building of the cluster itself (OS, Kubernetes, CNI, CSI) is handled through a proprietary solution that is built on top of Kairos.

All right, thank you all for showing up in such huge numbers, and have a good trip back.

Thank you guys, have a good trip back.