Thank you very much for the introduction. In this talk I basically want to answer the following four questions. I want to give a short introduction to what SAP HANA is, then I want to explain how you can use it. The next question is why we are dealing with graphs at all, and which features we offer. At this point I have to say that I will demonstrate a version from our labs which includes features that are not in SAP products, and we reserve the right not to offer these features in future products.

Okay, so what is SAP HANA? How many people have worked with SAP HANA before? Quite a few. SAP as a company is the market leader for business software, and this business software has to process a lot of data. Hence these applications need a database, and SAP HANA is the enterprise solution for that. It's basically a relational database system. It's an in-memory database, which means that all data lies in main memory in a highly compressed form, and you can use it on premise or in the cloud. Within SAP we don't think of it as just a database; we think of it more as a platform, because we offer so many different capabilities, like advanced text search algorithms, processing of geospatial information, or machine learning algorithms.

Okay, so how can you use it? Usually SAP HANA runs in large server landscapes, but with the recently announced HANA express edition you can download it and develop on it. It's a stripped-down version of SAP HANA which is limited to 32 gigabytes of RAM, so basically it can be used for your projects as well.

Okay, before we come to graphs, let me give a little introduction, because a graph is not just a graph. There are many different kinds of graphs: you can talk about simple graphs, or directed versus undirected graphs. Basically there are two major graph models.
The first one is RDF, the Resource Description Framework. It was introduced for knowledge representation and the semantic web. It represents the data as triples with subject, predicate, and object, and the vertices and edges don't have any attributes. So to specify that a vertex has an age, you use a predicate for the age and a literal as the object. And it's a standard.

The second big graph model is the property graph model. In this model each edge and each vertex can have arbitrarily many attributes, there can be arbitrarily many edges between two vertices, and these edges are directed. In our use cases we have complex types and their relationships, and we think that's the natural way to represent the data, so we use the property graph model.

Okay. To explain why we are dealing with graphs, let's have a look at how graph problems are solved traditionally. Traditionally the graph algorithms run in the application layer. An application fetches all the data it needs from the database and then runs the algorithm. That means that sometimes you have to copy the whole graph out of the database into the application, and you have SQL as the interface, which doesn't offer a nice graph abstraction. And each application that wants to run a graph algorithm has to implement it on its own; it all happens in the application.

A more modern approach is to use a graph database, which offers a graph abstraction. Now the application can push the algorithm down to the graph database, and the graph database processes the information and returns only the result. But this also has disadvantages. Usually you have to replicate your data from the database into the graph database, and you cannot combine it with the different features that are already in the usual database. And it's not always clear whether you should replicate just the graph data or the graph data together with some attributes, because you may also want to process additional attributes.

Okay, let's have
a deeper look into the interfaces and the requirements for graph databases. We think there are two different kinds of queries. The first one is pattern matching or traversal style. These queries are usually very local in the graph: they access only a small piece of the graph, they are highly selective, and typically they are read and write transactions. A typical example is loading and updating a Facebook page. For this type of query, graph databases usually offer declarative interfaces like SPARQL, Cypher, or PGQL. A declarative interface abstracts from how the data is retrieved to what data should be provided, and it gives an optimizer the chance to deal with low-level concerns like selectivity estimation and parallelization.

The second type of query is graph analysis. Usually these graph analyses are much more complex. They touch the whole graph and they are usually read-only; typical examples are PageRank or connected components. For this type of algorithm, graph databases usually offer APIs or domain-specific languages that are designed for these queries. Like general-purpose languages, they give you the complete expressiveness, but they also add building blocks which make it easy to write what you want in a graph way. Typical examples are Green-Marl or Pregel.

Okay, let's come to our approach.
Our approach is to add graph functionality to the relational database system, and we want to offer an equivalent graph abstraction as interface, like graph databases do. For the declarative interface we want to use openCypher, and for the analytical graph queries we offer GraphScript, which is a language we defined; in a few minutes I will talk a little bit more about it. Within the database we can process the graph data natively and therefore efficiently.

The big advantage is that the graph engine is completely integrated into HANA. So we are completely transactional, we do not have to replicate the data, and we can process the most current data, so you never work on outdated data. Due to the fact that we are in the same process, we have direct access to the data, which is already in main memory, because in HANA all data is in memory. And our features can be combined with all other features of the database, like geospatial processing. As for persistency, we still use the tables of the database, so we use the existing infrastructure.

What you need to provide to get this graph abstraction is basically a set of attributes. We need a key for each vertex to identify the vertex. For each edge we need the source and the target vertex, and a key, because we want to allow more than one edge between two vertices. So we can create a column table for the vertices and define it with a key and whatever additional attributes you want. Then we have the edge table, in which you provide the key column and the source and target columns, and the source and target columns should reference the key column of the vertex table to enforce that the graph is consistent. If you have this schema, then you can create a graph workspace, which is a metadata object. It works like a view and gives you this graph abstraction.
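The table layout described here can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in, not SAP HANA and not HANA's SQL syntax, and the table names, column names, the `workspace` dictionary, and the helper `dangling_edge_rejected` are all made up for the example. What it shows is the idea from the talk: a keyed vertex table, an edge table with its own key plus source and target columns that reference the vertex key, and a piece of metadata that names those tables and columns.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration;
# this is not SAP HANA and not HANA syntax.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the references

# Vertex table: a key plus arbitrary additional attributes.
conn.execute("""
    CREATE TABLE vertices (
        vertex_key INTEGER PRIMARY KEY,
        type       TEXT,
        name       TEXT
    )
""")

# Edge table: its own key (so two vertices can share several edges),
# plus source and target columns that reference the vertex key.
conn.execute("""
    CREATE TABLE edges (
        edge_key INTEGER PRIMARY KEY,
        source   INTEGER NOT NULL REFERENCES vertices(vertex_key),
        target   INTEGER NOT NULL REFERENCES vertices(vertex_key),
        type     TEXT
    )
""")

conn.executemany("INSERT INTO vertices VALUES (?, ?, ?)",
                 [(1, "person", "Alice"), (2, "post", None)])
# Two different edges between the same pair of vertices.
conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?)",
                 [(10, 1, 2, "created"), (11, 1, 2, "liked")])

# Hypothetical "workspace": plain metadata naming the tables and columns.
workspace = {
    "vertex_table": "vertices", "vertex_key": "vertex_key",
    "edge_table": "edges", "edge_key": "edge_key",
    "edge_source": "source", "edge_target": "target",
}


def dangling_edge_rejected(connection):
    """Return True if an edge pointing at a missing vertex is refused."""
    try:
        connection.execute("INSERT INTO edges VALUES (12, 1, 99, 'liked')")
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling `dangling_edge_rejected(conn)` tries to insert an edge whose target vertex does not exist; with the references in place the insert is refused, which is the consistency guarantee the talk refers to.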
So in the graph workspace statement you say which edge table and which vertex table you are using, and which attributes of the corresponding tables to use. Then you have the graph workspace, and if you want to apply an algorithm to the graph, you just have to say that you want to use this graph workspace.

From database theory, it's usually not a good idea to denormalize your data, because you get problems with updating data. Now we know that the vertices need to be in one table, and usually there is a type, so a vertex can be of type person or of type post. Sometimes it's better to have persons in one table and posts in a different table, and so to have normalized data. In this case, if you want to stay with your normalized data schema, you can define views instead of tables and create a graph workspace on top of the views, and everything works.

Okay, I want to show a little demo. For the demo I have data that represents a social network. For vertices we have posts and persons as labels, and for edges we have 'created', so a person created a post, or liked a post, and a person can follow another person. We build a graph workspace on top of it, and this is the basis for running all the graph algorithms.

So let's go a little bit deeper into what we offer, and let's talk a bit about the declarative interface. For pattern matching we started to implement a subset of openCypher. openCypher is a project from Neo Technology which publishes a part of the Cypher syntax for free, so that other database developers can implement it as well. Basically we can match edges and vertices, we can express arbitrarily complex filters, and we can project the result and sort the result. In the first example we are looking for two vertices which are connected by two edges, where the first edge is of type 'creates' and the second edge is of type 'likes'. That means that we are looking
for persons that created a post and liked their own post. In the second example we have a path. Here we are looking for persons that follow the person with the name Franziska Schwarz. The edges in the path need to be 'follows' edges; this is defined here in the ALL statement. And for these persons we have a look at what posts they created.

Okay, so let's get to the little demo. Probably you can't read it; I'm sorry for that. I have here a HANA express edition and I have uploaded the data. In the social network schema I have two tables, which you can see here: the edge table and the vertex table. Now I create a graph workspace on top of them. To execute Cypher you need to create a calculation scenario, which is an analytical view; it's a HANA-specific extension. We need it because we need to specify on which graph workspace we want to apply the Cypher statement. So we create this calculation scenario. Once you have created the calculation scenario, the Cypher statement goes here; the rest of the calculation scenario can be viewed as a template, just copy it. Then we select from the calculation scenario and we get the result, which in this case is two vertices from Franziska Schwarz. So this was the first example, where a person liked and created the same post. And again for the path: we create the calculation scenario, we can select from it, and we get all the posts that persons who directly or indirectly follow another person wrote or created.

Okay. The next features we offer are built-in algorithms, and we have three of them. The first one is a neighborhood search. Neighborhood search is basically a breadth-first search which returns the vertices and the level at which each vertex is found. As input parameter you need to provide one vertex or a set of vertices. You have to specify the direction in which you want to traverse: whether you want to traverse outgoing or incoming edges, or direction
'any' if you want to ignore the direction. You need to specify which levels you want to get as the result; these levels are the usual breadth-first search levels. And you can additionally apply vertex and edge filters. As output you get a table. In this example we have a small graph whose vertices are German cities; think of it as a transportation network. When we start from Frankfurt and follow outgoing edges from level zero to two, we reach Frankfurt at level zero, since it's the start vertex, Munich and Stuttgart in the first level, and Berlin in the second level.

Okay, the second algorithm is single-source shortest path. Basically you have to provide a start vertex and the weight which we use to define 'shortest'. The weight needs to be a column of the edge table, so it needs to be an attribute of the edges. You can also omit it; then we take one as the weight for each edge. Again, as output you get all reachable vertices with the cost to traverse from the start vertex to the corresponding vertex.

Okay, the third algorithm we offer is strongly connected components. Within a strongly connected component each vertex is reachable from every other vertex. This algorithm doesn't need any parameters, and as output you get all vertices with a component number; all vertices with the same component number are in the same component.

Okay, let's talk a little bit about GraphScript. GraphScript is a domain-specific language that is especially designed to solve graph problems, and here I will talk a little bit about the vision of what we want this language to be. It should be a high-level language. You should be able to write your own complete graph algorithm in this language, but we also want to offer building blocks that are graph specific and they
deal with low-level concerns; these building blocks can be typical standard graph algorithms. Vertices, edges, graphs, and other objects like paths and trees should be types of this language, which makes it possible to return a graph type as a result and apply a second algorithm to it. So for example for shortest path you can return a set of paths or one path, and then you can loop over the set of paths and apply a second algorithm to each path. It also makes it possible to optimize in the background, because when we call an algorithm with a tree, sometimes we can use a much more efficient algorithm than if we call the same algorithm with a general graph. The idea is to make custom logic combinable with standard graph algorithms, so that you don't have to write the complete algorithm on your own, but can use these building blocks and a little bit of your own logic in between. This should be flexible enough that you can solve all the problems of your domain.

In the background we use just-in-time compilation for that. The goal is that you only have to write in a high-level language; we compile it into a low-level language and run that, and our goal is to be comparably fast as if you had written your graph algorithm directly in the low-level language.
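The idea of combining a standard building block with a little custom logic can be illustrated outside of GraphScript. The following is a minimal Python sketch, not GraphScript and not the HANA implementation: the function `neighborhood`, its parameters, and the city edges are assumptions made up for the example. The building block is a breadth-first neighborhood search with a direction and a level range, as described for the built-in algorithm; the custom logic afterwards simply groups the result by level.

```python
from collections import deque


def neighborhood(edges, start, direction="outgoing", min_level=0, max_level=2):
    """Breadth-first search returning {vertex: level} for the given level range.

    edges is a list of (source, target) pairs; direction is "outgoing",
    "incoming", or "any" (ignore edge direction).
    """
    adjacency = {}
    for source, target in edges:
        if direction in ("outgoing", "any"):
            adjacency.setdefault(source, []).append(target)
        if direction in ("incoming", "any"):
            adjacency.setdefault(target, []).append(source)

    levels = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if levels[vertex] == max_level:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in levels:
                levels[neighbor] = levels[vertex] + 1
                queue.append(neighbor)
    return {v: lvl for v, lvl in levels.items() if lvl >= min_level}


# Made-up transportation graph, for illustration only.
edges = [("Frankfurt", "Munich"), ("Frankfurt", "Stuttgart"),
         ("Munich", "Berlin"), ("Berlin", "Hamburg")]

# Building block: standard neighborhood search.
levels = neighborhood(edges, "Frankfurt", "outgoing", 0, 2)

# Custom logic on top of the building block: group the cities by level.
per_level = {}
for city, level in levels.items():
    per_level.setdefault(level, []).append(city)
```

In this sketch Hamburg is not part of the result, because it would only be reached at level three and the search stops expanding at the maximum level.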
So we want to offer you an abstraction, but we don't want to lose performance. I'll answer questions after the talk.

Okay, so here are a few examples of what the syntax would look like. GraphScript is a procedural language, and here we create a procedure. In this example I want to calculate the average out-degree of my graph. In the first line I bind my graph workspace to a graph variable. Then with the vertices construct I take all vertices out of the graph and get them as a multiset, so as a container, and I can do the same thing with the edges. Then I can count the elements in these multisets, divide the number of edges by the number of vertices, and get the average out-degree.

The second example shows some basic constructs, so here I have three built-in functions. I want to slightly adapt an is-reachable algorithm. In the first line I again take my graph workspace. Then I fetch an edge out of it with a specific key, which I provide as an input parameter, and I take the target vertex of this edge. I fetch a second vertex with a key taken from a second input parameter, and now I call the is-reachable built-in algorithm on graph G from vertex v to vertex v2 and get the result.

Okay, and the third example should show that we offer general-purpose programming language constructs, so that you can loop with a for-each over a container and use if-else blocks. Here in the example we want to sum the year of birth of two vertices if they are persons. Whatever reason there might be that you want to do that, you can do it.
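The logic of the first two examples can be mirrored in plain Python. This is a hedged sketch only: it is not GraphScript syntax, and the function names `average_out_degree`, `is_reachable`, and `target_reaches`, as well as the tiny graph, are invented for illustration. It shows the two computations as the talk describes them: edges divided by vertices, and a reachability check that starts from the target vertex of an edge fetched by key.

```python
def average_out_degree(vertices, edges):
    """Number of edges divided by number of vertices."""
    return len(edges) / len(vertices)


def is_reachable(edges, start, goal):
    """Depth-first search along edge direction from start to goal."""
    successors = {}
    for source, target in edges.values():
        successors.setdefault(source, []).append(target)

    stack, seen = [start], {start}
    while stack:
        vertex = stack.pop()
        if vertex == goal:
            return True
        for nxt in successors.get(vertex, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def target_reaches(edges, edge_key, vertex_key):
    """Fetch an edge by key, take its target vertex, and test reachability."""
    _source, target = edges[edge_key]
    return is_reachable(edges, target, vertex_key)


# Made-up graph: vertex key -> attributes, edge key -> (source, target).
vertices = {1: {"type": "person"}, 2: {"type": "post"},
            3: {"type": "post"}, 4: {"type": "person"}}
edges = {111: (1, 2), 112: (2, 3), 113: (4, 1)}
```

With this data, `average_out_degree(vertices, edges)` divides three edges by four vertices, and `target_reaches(edges, 111, 3)` starts at vertex 2, the target of edge 111, and follows edge 112 to vertex 3.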
So again you can fetch the graph workspace, get two vertices, put these two vertices into a set of vertices, and loop over this set. With an if you can check whether the type is person, and in the case that it is a person you can sum the year of birth.

Okay, again, I will show you that these examples work. This is my first procedure with the average out-degree. I just create the procedure, and then I can call the procedure and get the out-degree. With is-reachable it's the same. Here I've specified the input parameters: I took the edge with key 111 and the vertex with key 8, and I had a look whether that vertex is reachable from the target vertex of this edge, and that's the case. And the last procedure also delivers a result. In this case I get 3980, so both vertices are probably persons, because otherwise I wouldn't get such a big number.

Okay. These are basically the features we offer. Not all features are in the product, but many of them are, and as I said, you can download the HANA express edition and play with it on your own. Here is the link where you can easily download it. You have to log in to an account, then you can download it. You get an image of a virtual machine; just load it into a virtual machine and everything is already installed in it. Okay, thank you very much.

We have time for questions. So, first question.

Do you contribute any of your work to the open source community? And the second one: you use openCypher for pattern matching, but you use GraphScript, developed by you, for the other stuff. Is there something equivalent to GraphScript in open source, too?
So for the first question: no, nothing is open source. You can use it for free, but it's not open source. You can build your open source application on top of it. And for the second one: there are other domain-specific languages, but I don't know if they are open source. So there are typical examples, but I can't tell you if they are open source. The ones I know are not.

Are there reasons to develop GraphScript instead of using Cypher expressions?

The point I wanted to explain is that we think there are two typical types of queries. One is the pattern matching style, and there you can use the declarative interface and optimize it. But if you really want to solve any problem completely, domain specific, and you want to have the complete expressiveness, you need a different type of language.

Okay, so basically an edge needs to have a key and the source and the target column. But usually you also want to represent more, so we allow additional attributes, and type would be an additional attribute. Exactly.

Vertices? Okay, you need to provide basically one table or one view with any attribute you want to have for all of the vertices. So the columns in this table or in this view are a union of the attributes of all vertices.

But when a property is missing for a node, you have no value?

Exactly, yeah.

So you don't have the actual flexibility of...

So the property graph model is schema free as a model, but in the physical representation, in our case, we need to have some constraints.

Then you have a view. Can you actually write on it, with multiple tables?

So far we support writing by SQL only. So you just update your data with SQL, then the corresponding table is updated and everything will be fetched.

You can develop on it, but for more concrete information, please consult the license agreement.
There is a license agreement; read it. I don't have all the details present here, so I can't explain it to you.