So welcome everyone. My name is Francesco, I'm from Red Hat, and I work in Red Hat's AI Center of Excellence team. I'm specifically part of Thoth, which is a project focused on a recommender system for AI software stacks. Today I will talk about graph neural networks. These are different from the traditional ones because they focus on a specific data structure, the graph, and you will see why this is so important today, especially because there are many types of graphs everywhere: social networks, knowledge graphs, and many other kinds of graphs that contain useful information we want to use in neural networks. I will not be too mathematical, I will not use too much math, because I know it's quite late.

This is the agenda for today. I will briefly introduce neural networks and graphs, the basic components of graph neural networks, and then we will go a little more into the details.

So we will try to answer this simple question: what is a neural network, and what is a graph? Are neural networks also graphs? Yes, actually, deep neural networks are graphs too.

Neural networks are a very broad topic; it would take months to explain all of them, but I will try to do it in a few slides. First of all, from a general overview, we have the artificial intelligence domain, which includes robotics and other fields. Then we have a subdomain, machine learning, where we try to teach the machine a specific task using different classes of methods: supervised learning, unsupervised learning, and reinforcement learning. Going a layer deeper, we have neural networks, which can be adopted in all of these classes: you can use them in supervised, unsupervised, and also reinforcement learning. Neural networks are one type of approach you can use in machine learning. The last layer is deep learning.
Deep learning is when we use neural networks with many, many layers to learn deeper knowledge of the data set.

Let's start with an analogy. First of all, we know that neural networks are inspired by the most efficient and powerful machine in existence, which is the brain. Here, in the right corner, you can see the dendrites, which receive signals; these go into the cell body, which processes them, and the output is sent to the other neurons. As an analogy, we have the artificial neuron, which receives different inputs as numbers; these are processed with some weights and biases in order to produce an output. This is the general way in which the artificial neuron was inspired by the biological one.

Of course, this output needs to be bounded somehow, otherwise it could be anywhere between minus infinity and plus infinity. So how do we decide if the neuron is activated, or fired? We use an activation function. The activation function gives you a range in which the neuron can fire or not. There are several types of activation functions, and they differ depending on the architecture and on computational and performance considerations.

Now that we know how the single neuron works, we can move to the neural network, where we put together many neurons. Typically the data flow from the input to the output, from left to right, in something called feedforward propagation. A neural network has an input layer, hidden layers, and an output layer; in a deep neural network you will have many more hidden layers. This is the general structure of a neural network. But how does it actually learn?
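Before getting to learning, the feedforward pass just described can be sketched in a few lines. This is my own minimal illustration (not code from the talk), using a sigmoid activation as the bounding function and randomly initialized weights purely for demonstration:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real input into (0, 1), bounding the neuron's output
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """One feedforward pass: activations flow left to right through the layers."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)  # weighted sum of the inputs, then activation
    return a

# A tiny 2-3-1 network with illustrative random weights
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
biases = [np.zeros(3), np.zeros(1)]

y = forward(np.array([0.5, -1.0]), weights, biases)
print(y.shape)  # (1,)
```

The key point is the activation function: without it, the output would be an unbounded linear combination of the inputs.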
In order to train a model, we need a data set. The data set is one of the fundamental parts of machine learning: it is how we train the model to make predictions for a specific application. Then we need an algorithm able to find the weights and the biases. How do we quantify whether the algorithm is reaching the right performance? We define a cost function, and we minimize this cost function using an algorithm, which is typically gradient descent. This is a very simple way to explain how they work. Of course, in practice there are more complex things to consider: there are hyperparameters, there is a learning rate, and other issues you need to deal with when you train a model. But for the sake of this talk, this is what we need to know at the moment.

Just to give you a little comparison again with the brain: we know that artificial neural networks are different from the brain. In terms of size, the brain is said to have a huge number of neurons, on the order of tens of billions, while artificial neural networks have far fewer because of computational limitations of the machines. In terms of speed, the artificial networks depend on the power of your machine, while the speed of the brain depends on age, gender, and many other factors. The training algorithm for artificial neural networks is gradient descent, while for the brain we still don't know. The power consumption of the brain is quite low, but for artificial neural networks it depends: you can have models trained using many GPUs and many machines, which is quite demanding in terms of power. Regarding the field of application, artificial neural networks are focused on a specific task, while the brain can deal with many tasks and can learn other things very easily. And regarding the signals, as we saw at the beginning, there is an analogy
between the two: they act in more or less a similar way. But this is just to show that we are still far from mimicking the behavior of the brain, although we are closing the gap, and we will probably close it in the future.

So far I have shown just a simple neural architecture, but nowadays we have many types of architectures. I will not go through all of them, but I will show you some very famous and widely used ones for applications like image recognition or natural language processing. One is the convolutional network. Convolutional networks are used to extract local features that can be common across the whole data set, and typically they work very well for different tasks. Of course, sometimes they can also fail, for example in a case like this one. There are other types of architecture, like recurrent neural networks, which are used when we need loops among the neurons, giving the network some memory so it can remember some context. This is important especially for applications like machine translation and text in general.

I will stop here regarding neural networks. We could go further, but this is what we need to know for now. Let's move to the graph part.

Graphs are spread across many, or almost all, sectors; we can find examples with graphs everywhere. In general, what is a graph? A graph is a type of data structure made of nodes and edges. The nodes can contain entities or some specific objects, and the edges represent the connections between these nodes or entities. There are many types of graphs, which we can classify in different ways.
For example, we have directed and undirected graphs: an undirected graph can be traversed in any direction, while a directed graph must follow a specified path along the edge directions. Graphs can be weighted or labeled, so the edges can carry several types of numbers, or strings; in general they can be very different, and the approach to working with such graphs can be more complex than with the simple ones. They can be planar or non-planar: if you can draw the graph on a plane, a 2D space, so that the edges do not cross, then it is a planar graph; otherwise it is non-planar. Graphs can be dense or sparse, depending on the ratio between the number of edges and the number of vertices. They can be homogeneous or heterogeneous, depending on whether the types of nodes and edges are similar. And they can be static or dynamic, depending on whether the graph changes in time; a social network graph, for example, is dynamic, because it changes in time.

But how do we deal with graphs computationally? There are some well-known matrices, and the math here is very simple. Consider three nodes, for example. We can construct the adjacency matrix, which describes how the nodes are connected to each other: you can easily identify how many edges there are and which nodes they connect. The degree matrix is a diagonal matrix (I show it here as a single vector) that tells you how many edges each node has. And there is another well-known matrix used when dealing with graphs, the Laplacian matrix, which is simply the difference between the degree matrix and the adjacency matrix I showed before.

Here is a list of examples: we can find graphs basically everywhere, in natural language processing, in social networks. You can see that they cover almost every sector, and this type of data is produced by many companies, so this is very important.
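The three matrices just mentioned can be built directly from an edge list. This is a minimal sketch of my own (not from the talk's slides), using NumPy and a three-node undirected path graph:

```python
import numpy as np

# Undirected graph with three nodes: edges 0-1 and 1-2
edges = [(0, 1), (1, 2)]
n = 3

A = np.zeros((n, n), dtype=int)    # adjacency matrix
for u, v in edges:
    A[u, v] = A[v, u] = 1          # symmetric, since the graph is undirected

D = np.diag(A.sum(axis=1))         # degree matrix: edge count per node, on the diagonal
L = D - A                          # graph Laplacian: degree matrix minus adjacency matrix

print(np.diag(D))                  # [1 2 1]
print(L)
```

A useful property to check: each row of the Laplacian sums to zero, since the diagonal degree exactly cancels the node's edge entries.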
And this is data that we need to be able to deal with.

Now I will go a bit deeper into graph neural networks; at this point we should have the basic background to understand how they work. First of all, let's focus on the motivation: why did we move from traditional neural networks to graph neural networks? The first reason is a data perspective. Most traditional neural networks deal with Euclidean data: images, numbers, text, audio. This is what we handle today with most traditional neural networks. But there is another class of data, called non-Euclidean data, which requires a different type of approach. Here you find all the graph structures: trees, networks, manifolds. Non-Euclidean data is actually a superset of graphs, so graphs are one part of this data, and the approaches that deal with it are called geometric deep learning; this is the set above graph neural networks.

The second reason is related to graph embeddings. There are typically three approaches you can use to embed a network. Embedding a network means moving from the graph, from the nodes and the edges, to a low-dimensional vector, and once you have that vector you can apply all the conventional neural networks. What graph neural networks want to do is use this representation of networks, and most of them are based on graph embeddings.

The third reason is related to the human brain. We want to get a step closer to mimicking what the brain does, so we need to start dealing with these types of data.

Now, a description of graph neural networks. Graph neural networks are one type of geometric deep learning. As I said, geometric deep learning
is the domain that deals with all the non-Euclidean data, and graph neural networks are a subset of these approaches.

What can we actually learn with graphs? These are the typical types of problems you can pose. The first is node classification: you have a graph where some nodes are labeled and some are unlabeled, and you want to predict the labels of the unlabeled nodes. This is important in many fields; for example, predicting preferences for users in a social network, or predicting whether a customer should get a loan or not. The second type of problem is link prediction: you want to find hidden connections between vertices, which is very important for recommender systems in different fields. Then there is community detection, also called cluster detection: you want to identify clusters inside the graph, groups of nodes that have similarities and can be clustered together. This is very important when you have very large graphs, and it is also used in social networks and for websites. Finally there is graph and subgraph classification, where you want to classify entire graphs. This can be important for many applications, including images, because graphs are a superset of Euclidean data, so you can also map images to graphs; it is an extension of what you can do with graphs.

Regarding methods and applications: the core problem of graph neural networks is how to represent the graph in a way that can be used in a machine learning model. Here is the main idea. As I said before, we are talking about graph embeddings. If we focus on one of the problems I mentioned, node classification, this is what is done in node embedding.
Typically, for each node you can build a computational graph; you aggregate this graph for each node, and you obtain a simple representation of the whole graph for one target node. So the idea is as follows: first, you define an aggregation function over all the nodes. Then you define a loss function, as I showed at the beginning for how a neural network works, and you basically put a neural network in the middle. Then you train on a set of nodes: you decide which nodes you will use for your training model, and you test with the other nodes. Finally, you can generate embeddings for new nodes. So the core problem of how to define a graph neural network is deeply related to how you define the aggregation function.

Regarding a bit of history: graph neural networks are quite new, I would say, with respect to traditional neural networks. They were born around 2005. The first steps initialized a random embedding for each node and used a constrained algorithm to iterate over all the nodes, but this had some limitations that had to be overcome later on. You can see that there is a gap of quite a few years before things moved forward, because research was mostly focused on Euclidean data. One solution was to apply a type of recurrent neural network to graphs, and this was a huge improvement. At the same time, convolutional networks started to be applied to graphs: what I showed at the beginning for images was extended to graphs, and this led to the creation of new types of neural networks for graphs based on convolutional networks. From then on, there have been something like 20 or 30 papers every year since
2017, focused on new types of approaches, and most of them come from taking what we learned in traditional neural networks and applying it to graph neural networks.

So how do we distinguish most graph neural networks? As I said, it is mainly by how they aggregate over the neighborhood. Most of the graph neural networks you can find nowadays are distinguished by the aggregation step they take, and then perhaps by an update function specific to each method. As you can see, they all take some ideas from the traditional architectures: you can find graph convolutional networks, gated graph neural networks, and so on.

Regarding the applications of graph neural networks: they can be applied in almost all the domains we saw for the traditional ones. The advantage is that graphs carry semantic relationships between the nodes, so the representation can contain much more information. You can see here, for example, how you can deal with images as graphs, and with text: in labeling, in neural machine translation, they have started to be applied in almost every domain covered by the traditional ones. Physics is very important here, because the data there are naturally graphs, for example molecules, along with everything related to biology. They have also been used in combinatorial optimization, for example for the traveling salesman problem, and people are starting to try graph neural networks for knowledge graphs and graph generation as well. And related to that, this is the last point.
So, what are the open issues at the moment? We can see that some are related to the structure: we cannot use the same number of layers as, for example, a traditional convolutional network when building a graph convolutional network, because there are problems with over-smoothing, and this needs to be dealt with by other types of neural networks. There is scalability: there is not yet a general way to handle new types of graphs of very large size. There is heterogeneity: most graph neural networks are based on the assumption that the graphs are homogeneous, that you have similar types of nodes and similar types of edges. Dynamic graphs are still a big open point in research: how to deal with graphs that change in time. There is interpretability, which is also something we are dealing with in traditional neural networks. And there is still no consistent theoretical framework: there is no precise way to say that you have to use this kind of neural network for some specific application.

So, in conclusion: graphs are spread everywhere, and you can find data based on graphs in every application. We can use graphs with machine learning, and they can be applied in many tasks. We are taking a small step closer to mimicking the human brain, and there are still some points which are open in research at the moment.

[Q] Really nice talk, but I had a question regarding training the neural net. I'm assuming this is trained using backprop. Is there any care taken that there are no loops in the graph? How does backprop even work, doesn't it go in loops, with the gradient blowing up and things like that? How does
training this network actually work?

[A] So the idea is that the core problem is how to represent the graphs in a way that we can insert into a machine learning problem. What I showed is how to define the aggregation function in order to move from the space of the graph to a space where traditional networks can be applied. Initially the algorithms were different from backpropagation, but thanks to the new architectures you can basically apply traditional gradient descent to graph neural networks as well.

[Q] What I mean is: in the entire network, is there any loop inside? It's a graph, so do you ensure that there are no cycles in the graph?

[A] Well, for some of the graphs the assumption is that the graph is homogeneous and that there are no cycles inside, so there are still some limitations for certain types of graphs. There is still this assumption, and it needs to be dealt with.
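To make the aggregation idea from the talk concrete, here is a rough sketch of my own (not from the talk) of a single message-passing layer: each node averages its neighbors' feature vectors and combines that with its own features through two weight matrices, in the style of mean-aggregation approaches such as GraphSAGE. The graph, features, and weights below are illustrative assumptions:

```python
import numpy as np

def aggregate_layer(H, neighbors, W_self, W_neigh):
    """One message-passing step: each node averages its neighbors' features,
    then combines that with its own features through learned weights (plus ReLU)."""
    out = np.zeros((H.shape[0], W_self.shape[1]))
    for v, nbrs in neighbors.items():
        neigh_mean = H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])
        out[v] = np.maximum(0, H[v] @ W_self + neigh_mean @ W_neigh)  # ReLU
    return out

# Path graph 0-1-2 with 2-dimensional node features (all values illustrative)
neighbors = {0: [1], 1: [0, 2], 2: [1]}
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
rng = np.random.default_rng(2)
W_self, W_neigh = rng.normal(size=(2, 4)), rng.normal(size=(2, 4))

embeddings = aggregate_layer(H, neighbors, W_self, W_neigh)
print(embeddings.shape)  # (3, 4)
```

Because the layer is just differentiable matrix operations, ordinary backpropagation can train the weights, which is the point made in the answer above; cycles in the input graph do not create cycles in the computation, since the number of message-passing steps is fixed by the number of layers.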