Good morning, good afternoon and good evening. Welcome to Kubernetes at Edge Day. We hope everyone is doing well in these interesting times. Today we are presenting on Kubernetes at Edge for AI/ML solutions. My name is Sachin Rathi. I'm a principal for Edge Cloud Solutions at Red Hat. Hi, my name is Amul Chaubey. I'm a solution architect lead at Red Hat. So let's go over the agenda. As part of today's presentation, we'll go over the following topics. We'll cover what Edge is, to understand the definition of Edge. What are the typical challenges at Edge? Deployment architecture types for Kubernetes at Edge. We'll get into machine learning and understand what TensorFlow is, and what technologies we have used in our Edge solution design. And then we'll go over the architecture for IoT anomaly detection using TensorFlow. So let's start by understanding what Edge is. Edge is defined as any computing and network resource along the path between data sources and cloud data centers. So Edge is not a point; rather, Edge is a continuum. As you can see in this picture, the Edge spans from Edge devices all the way back to the centralized cloud, and in between you can have multiple intermediary locations which can host your additional Edge infrastructure. The location of your Edge application and infrastructure is ultimately going to be determined by a number of factors, such as latency, response times, and available compute, network or storage resources. So, Edge computing challenges. Some of the key challenges for adopting Edge computing are listed here. When we look at Edge computing, there are a lot of different choices of hardware, and we need to define a solution which will cater to this large landscape of hardware. When it comes to data locality and security, the rules governing the data change based on the locality of the data, and this brings in compliance and security issues along with privacy laws.
The confidentiality of data, whether in transit or at rest, also poses a huge challenge. Edge devices are resource-constrained and energy-sensitive; that's a huge challenge, and workload prioritization becomes important. Another aspect is maintenance. Device maintenance is one of the key challenges: for example, how will I update or patch the device? Then there are bandwidth challenges. Bandwidth challenges arise due to multiple workloads sharing the same network link, so we have to be creative about this and use workload prioritization or network policy restrictions to overcome this problem. Also, due to the unreliable nature of the network, the edge needs to be able to manage itself autonomously. These challenges can be overcome using Kubernetes, which has different ways to navigate around them, or using other edge solutions that we can come up with. So with that understanding of the challenges, let's now take a look at three different deployment models for containers at the edge. The first and most straightforward option is to deploy Kubernetes as an all-in-one cluster. This provides a lightweight Kubernetes orchestration where resources are minimal and response-time requirements are quite stringent. The second option is where we continue to have stringent response-time requirements but have a bit more resources, which may however be distributed across multiple locations. Here we separate the Kubernetes control plane from the data plane: the control plane becomes centralized while the worker nodes are distributed to multiple locations. Finally, we have requirements where containers are deployed directly on the devices. In this case, the management of those devices is taken care of by another Linux Foundation project that we have recently been looking at, called Open Horizon. Now let's get into the machine learning aspect of this presentation. So let's understand: what is TensorFlow?
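To make the workload-prioritization idea concrete, one way to express it in Kubernetes is with PriorityClass objects, so that critical workloads preempt best-effort ones when an edge node runs short of resources. This is a minimal sketch; the class names, priority values and descriptions below are illustrative assumptions, not part of our solution:

```python
# Sketch: building Kubernetes PriorityClass manifests for edge workloads.
# The names and values are hypothetical; a real deployment would dump these
# dicts to YAML (or apply them via a Kubernetes client) and reference the
# class from each Pod spec's `priorityClassName` field.

def priority_class(name: str, value: int, description: str) -> dict:
    """Build a PriorityClass manifest as a plain dict."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,  # higher value = scheduled (and kept) preferentially
        "globalDefault": False,
        "description": description,
    }

critical = priority_class("edge-critical", 1_000_000,
                          "Anomaly-detection workloads that must keep running")
batch = priority_class("edge-batch", 1_000,
                       "Best-effort telemetry uploads over constrained links")
```

Pods carrying the higher-value class would survive preemption on a resource-constrained edge node, which is one way to tackle the bandwidth and resource contention described above.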
The definition is written there, but in short, TensorFlow is a machine learning framework, and it might be your new best friend if you have a lot of data and you are after state-of-the-art solutions in artificial intelligence like deep learning or neural networks. TensorFlow has been used for voice recognition, text-based applications, image recognition, time-series algorithms, video detection and many more. TensorFlow is open source, so you can download it for free and get started immediately. You can build and train machine learning models using APIs like Keras. You can train and deploy models in the cloud, on-prem, or even at the Edge. As you can see on the slide, there is a vast ecosystem of TensorFlow deployment types. For our project, we used TensorFlow Core as well as TensorFlow Lite. In order to build the Edge solution, a number of components come into play. Firstly, we need a messaging platform. This is needed to accept data over the MQTT protocol; AMQ provides this capability with Apache ActiveMQ Artemis. Next, for real-time streaming, we make use of Kafka on Kubernetes, and this is provided by the project Strimzi. Camel K provides the capability for mediation and routing of the sensor data. We also make use of Prometheus on the operations side for monitoring and alerting requirements. Ceph here is used to provide storage for the incoming sensor data, both for training the models and for historical data. Jupyter Notebook provides the development environment for building and training the model. The developed artifacts are then checked into a source code management system such as Git, from where the container images can be built and stored in a container registry such as Quay. Now, let's get into the architecture of IoT anomaly detection. We'll use an IoT device simulator in our design. We have deployed Kubernetes at the Edge using the all-in-one cluster deployment model; you can see it at the bottom of your screen.
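To give a feel for the IoT device simulator, here is a minimal Python sketch of the kind of telemetry payload it might publish over MQTT toward the Artemis broker. The field names, topic, and the `paho-mqtt` snippet in the comment are illustrative assumptions, not our actual simulator code:

```python
import json
import random
import time

def sensor_reading(device_id: str) -> dict:
    """Simulate one network-telemetry sample (field names are hypothetical)."""
    return {
        "device": device_id,
        "ts": time.time(),
        "packets_per_sec": random.gauss(500, 50),
        "bytes_per_sec": random.gauss(60_000, 5_000),
    }

# Serialize one reading as the MQTT message body.
payload = json.dumps(sensor_reading("edge-device-01"))

# With a broker available, the payload could be published like this:
#   import paho.mqtt.client as mqtt
#   client = mqtt.Client()
#   client.connect("broker.example", 1883)
#   client.publish("sensors/edge-device-01", payload)
```

From there, Camel K would pick these messages up for routing and normalization, exactly as described in the component list above.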
We also use Kubernetes in the cloud, at the top, which can run on any public cloud or even on a private cloud. You will see the components mentioned by Sachin on the previous slide in this diagram. Also, from the Kubernetes-at-Edge point of view, you'll see that we send alerts to a webhook, and the webhook can be used for feedback loop control as well as for alerting the end user. In the next slide, Sachin will go over the details of the training model flow for our architecture design. Excellent. So here the training workflow is highlighted in red, and that's what we're going to follow. The networking data captured on the devices is sent over the MQTT protocol to AMQ, which of course has the Artemis broker running. This data is then transformed in Camel K, and from this point on, the training and inference workflows diverge; those are two different workflows from this point. For the training workflow, the data is passed on to Camel K for any normalization and then stored in the Ceph data store. This allows the models and code running in Jupyter Notebooks to use this data for training and testing the models. Once the model training is done, the artifacts and code are checked into Git. From here, the build process is kicked off and the resulting image is stored in the Quay registry, from where it can be replicated to any other registries at any edge locations. So let's understand model serving with TensorFlow. If you look at this diagram, the green dots show you the serving model; we're talking about step number four, where TensorFlow is running in a Python application for serving and prediction. Model serving is simply the exposure of the trained model so that it can be accessed via an endpoint. The endpoint consumer here can be a direct user or other software. We are using an autoencoder model in TensorFlow. An autoencoder is a neural network which takes in streaming data and reconstructs it as output. We look at the mean squared error between the input and its reconstruction to detect anomalies.
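Our prototype uses a TensorFlow autoencoder, but the reconstruction-error idea can be sketched with a plain-NumPy linear autoencoder (which is equivalent to PCA). Everything below, the synthetic two-feature data, the one-dimensional bottleneck, and the 99th-percentile threshold, is an illustrative assumption rather than our actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" telemetry: two strongly correlated features plus noise.
t = rng.normal(size=(500, 1))
X_train = np.hstack([t, 2 * t]) + 0.05 * rng.normal(size=(500, 2))

# A linear autoencoder with a 1-D bottleneck is equivalent to PCA: encode by
# projecting onto the top principal component, decode by projecting back.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
W = Vt[:1].T  # shape (2, 1): shared encoder/decoder weights

def reconstruction_mse(X):
    """Per-sample mean squared error between input and its reconstruction."""
    Z = (X - mean) @ W        # encode
    X_hat = Z @ W.T + mean    # decode
    return ((X - X_hat) ** 2).mean(axis=1)

# Flag anything worse than the 99th percentile of training error as an anomaly.
threshold = np.percentile(reconstruction_mse(X_train), 99)

normal_sample = np.array([[1.0, 2.0]])      # follows the learned pattern
anomalous_sample = np.array([[1.0, -2.0]])  # violates the correlation
is_anomaly = reconstruction_mse(anomalous_sample)[0] > threshold
```

In the real serving path, the TensorFlow model plays the role of `reconstruction_mse`: the Python serving application computes this per-sample error for incoming data and exposes it for Prometheus to scrape.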
If the mean squared error for the real-time data is more than the threshold, Prometheus will scrape the result from the endpoint, and the Prometheus Alertmanager is then used to notify the end user. Webhooks can be used as a feedback loop control to remediate the detected anomaly. We have designed and prototyped this solution for anomaly detection, and we can use this design in a lot of different scenarios. Consistency and predictability become mission critical for this solution. Again, this is not the only design for anomaly detection using machine learning for edge devices. And that is all we can cover in our allocated ten minutes. We'd love to go over our code and design model, hopefully in some other presentation. However, please do reach out to us for any customization of this solution or any specific use case that you may have. Thanks a lot for attending. Thank you very much. So with that, we sincerely hope you enjoyed this session. Here are our email addresses; please do let us know how we can help. Also, don't forget to provide your feedback. Enjoy the rest of the sessions at Kubernetes at Edge Day. Goodbye.
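The feedback loop at the end of the pipeline can be sketched as a small webhook handler. The payload shape below follows Alertmanager's generic webhook format (`status` plus a list of `alerts` with `labels`), but the label names and the remediation action string are hypothetical placeholders:

```python
def handle_alert(payload: dict) -> list:
    """Return remediation actions for firing anomaly alerts.

    The payload shape follows the Prometheus Alertmanager webhook format;
    the `device` label and the action names are illustrative assumptions.
    """
    actions = []
    if payload.get("status") != "firing":
        return actions  # resolved alerts need no remediation
    for alert in payload.get("alerts", []):
        device = alert.get("labels", {}).get("device", "unknown")
        actions.append(f"restart-sensor-pipeline:{device}")
    return actions

# Example Alertmanager-style notification for one firing alert.
example = {
    "status": "firing",
    "alerts": [{"labels": {"alertname": "HighReconstructionError",
                           "device": "edge-device-01"}}],
}
```

A handler like this, running behind the webhook endpoint in the diagram, is what closes the loop from detection back to remediation on the edge device.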