Thanks for coming. My name is Kevin, over there sits my colleague Christopher, and together with our two students Timo and Lukas we are currently developing Gradoop. A short disclaimer first: we are currently hiring. We have a lot of open positions and projects in various topics, so if you're interested, please visit our page and of course send an application if you want. Okay, I can keep the motivation fairly short since we are here in the graph dev room: you all know what graphs are, you know that they can be directed, they can be labeled, and their nodes and edges can have certain properties. This is basically known as the property graph model, right? So let's take a look at Gradoop. Gradoop is an open source framework which can handle heterogeneous and, now, temporal graph data in a distributed manner. If we take a look at the Gradoop stack, we see that we are embedded in the Hadoop ecosystem, and with that we can use graph storages like Apache Accumulo and Apache HBase as data sources; on top of that we are building on Apache Flink. For those of you who don't know Apache Flink: it is basically a distributed dataflow engine which offers certain data structures and dataflow operators, which we use to implement our model and our graph operators. Currently we are on Java 8, we use the Apache License 2.0, we release weekly snapshots, and we deploy our artifacts to Maven Central, so if you want to get in touch you can just use our Maven dependencies. Okay, if you use Gradoop you will notice some unique features which are basically extensions of the common property graph model. The first is the concept of logical graphs. A logical graph is basically an abstraction of a subgraph which gives the subgraph more meaning. Like nodes and relationships, a logical graph can also possess labels and properties.
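To make that last point concrete, here is a minimal sketch of such a model, in which a logical graph carries a label and properties just like vertices and edges do. These are illustrative classes only, not Gradoop's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not Gradoop's classes): every graph element carries
// a label and a key-value property map, and a logical graph is itself such
// an element, so it can possess labels and properties too.
class Element {
    final String label;
    final Map<String, Object> properties = new HashMap<>();
    Element(String label) { this.label = label; }
    void setProperty(String key, Object value) { properties.put(key, value); }
    Object getProperty(String key) { return properties.get(key); }
}

class Vertex extends Element {
    Vertex(String label) { super(label); }
}

class Edge extends Element {
    final Vertex source, target;
    Edge(String label, Vertex source, Vertex target) {
        super(label);
        this.source = source;
        this.target = target;
    }
}

// A logical graph is an abstraction of a subgraph and, like vertices and
// edges, has a label and properties of its own.
class LogicalGraph extends Element {
    final java.util.List<Vertex> vertices = new java.util.ArrayList<>();
    final java.util.List<Edge> edges = new java.util.ArrayList<>();
    LogicalGraph(String label) { super(label); }
}
```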
Behind me you can see three partly overlapping logical graphs. Gradoop is able to process one single logical graph or all of them as a graph collection. The second feature that comes with Gradoop is its set of operators. Since we have those two classes, logical graphs and graph collections, we can divide our operators into two classes as well: one class of operators that consume logical graphs, one or two at a time, and one class of operators that consume graph collections, one or two at a time. For example, our grouping operator up there consumes a logical graph and produces a logical graph. On the other hand, our pattern matching operator consumes a logical graph and creates a graph collection, and in addition we are also able to run graph algorithms on logical graphs. Okay, the third extension to the property graph model is our temporal extension. Our approach to handling graphs that evolve over time is to add bi-temporal attributes to all the elements of our graph model: to nodes, to edges, and to logical graphs as well. We distinguish between two types of temporal attributes. The first is valid time, consisting of valid-from and valid-to. Valid time is basically a user-defined property; you decide what data you want to store in there. For example, Chris over here started at a company in 2011 and his contract goes up to 2019. The transaction time, on the other hand, is a system-maintained property: the transaction-from timestamp, for example, is set as soon as the element gets created within the Gradoop system. These time attributes are first-class citizens in our graph model. As soon as we finished extending our graph model with these time attributes, we added another class of operators: specific operators that can handle those time attributes. We basically implemented two new types of graphs.
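The two kinds of time attributes just described could be sketched like this. The class and field names here are hypothetical, chosen only to illustrate the distinction between user-defined valid time and system-maintained transaction time.

```java
// Illustrative sketch of bi-temporal attributes (hypothetical names, not
// Gradoop's API). Valid time is user-defined; transaction time is set by
// the system itself.
class TemporalElement {
    // valid time: a user-defined interval, e.g. a contract from 2011 to 2019
    long validFrom = Long.MIN_VALUE;
    long validTo = Long.MAX_VALUE;

    // transaction time: system-maintained
    final long txFrom;                 // set as soon as the element is created
    long txTo = Long.MAX_VALUE;        // MAX_VALUE means "still current"

    TemporalElement() {
        this.txFrom = System.currentTimeMillis();
    }

    // the user decides what valid time to store
    void setValidTime(long from, long to) {
        this.validFrom = from;
        this.validTo = to;
    }
}
```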
The temporal logical graph and the temporal graph collection. And when we implemented them, we made sure that all the classic operators are also able to use temporal logical graphs and temporal graph collections. So there is no trade-off in using the classic model versus the new temporal property graph model. So far we have implemented three temporal operators: snapshot, difference, and grouping. Pattern matching is still work in progress. What you can do with those operators, my colleague Chris will tell you. So, thank you. — So now I will continue with a very simple use case to show the flexibility of our operators and how we can chain them together to build a temporal analytical workflow with Gradoop. Imagine we have a hospital with different services and sites, from accident and emergency to surgery, cardiology, or oncology, and imagine the employees wear RFID sensors that capture when people meet in this hospital. That means when two people talk to each other, or get within a small radius of each other, the RFID sensor captures that and collects data like the two employee IDs, the timestamp when the meeting starts, and when it ends. So we have a time period for each contact. And this is captured over weeks and weeks in this hospital, to see how these connections evolve over time. Another fact is that we have a virus in our hospital, called virus X here. The symptom is that a third eye grows on the head, so you can see when someone is infected by the virus. Another property is that transmission of the virus to another person only happens if their contact lasts longer than five minutes; so it's a quite simple constraint in our case. And the incubation period is five time units in our case.
That means if someone is infected and we detect that, then we have to look at the time period from five time units before the infection to see which contacts this person had in the past, and, for example, which services of this hospital we have to put in quarantine. So our question is: in case of an infection, which hospital services are at risk of contracting the virus? Okay, imagine we have the full history of our graph and its connections here on the left side, with our time scale at the bottom. With our API we simply load the graph from a data source; it could be CSV from HDFS, or Accumulo, or HBase, and we have a lot of data integration operators to parse these contacts from the RFIDs into a logical graph, in our case here a temporal graph. Then let's take a look at how one specific edge looks: we have two employees of the oncology service, and our contact edge is attributed with the time interval that represents when these two people met each other. So that is the data; on the left side we see a five-time-unit contact, which I marked in our timeline as an example. And now the case arrives, breaking news: the virus is detected on one person in oncology at time point T20, which I marked here in our time range. Now we want to know what happened inside our temporal graph in the last five time units, because that is our incubation period. To achieve this without looking at the whole graph, but only at this period of the graph, we can apply our snapshot operator. The snapshot operator is applied to the full history of the graph and consumes a temporal predicate. We built some predefined temporal predicates oriented on the SQL:2011 standard, the temporal extension of SQL, like FromTo, AsOf, Between, and so on, but you can also implement your own temporal predicate if you have any special conditions.
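The snapshot operator can be sketched as a filter over the elements' time intervals. This is a simplified, self-contained illustration; the predicate names FromTo and AsOf follow the SQL:2011-inspired predicates mentioned in the talk, but the interval semantics shown here (overlap for FromTo, containment for AsOf) and all class names are my assumptions, not Gradoop's actual implementation.

```java
import java.util.List;
import java.util.stream.Collectors;

// A temporal predicate decides whether an element's interval qualifies.
interface TemporalPredicate {
    boolean test(long from, long to);
}

// AsOf(t): the element is alive at query time t.
class AsOf implements TemporalPredicate {
    private final long t;
    AsOf(long t) { this.t = t; }
    public boolean test(long from, long to) { return from <= t && t <= to; }
}

// FromTo(a, b): the element's interval overlaps the query interval [a, b).
class FromTo implements TemporalPredicate {
    private final long qFrom, qTo;
    FromTo(long qFrom, long qTo) { this.qFrom = qFrom; this.qTo = qTo; }
    public boolean test(long from, long to) { return from < qTo && to > qFrom; }
}

// A contact edge between two employees, valid during [from, to].
class ContactEdge {
    final String a, b;
    final long from, to;
    ContactEdge(String a, String b, long from, long to) {
        this.a = a; this.b = b; this.from = from; this.to = to;
    }
}

class Snapshot {
    // keep only the edges whose interval satisfies the temporal predicate
    static List<ContactEdge> apply(List<ContactEdge> edges, TemporalPredicate p) {
        return edges.stream()
                    .filter(e -> p.test(e.from, e.to))
                    .collect(Collectors.toList());
    }
}
```

In the hospital example, applying `new FromTo(15, 20)` keeps exactly the contacts that touch the five-time-unit incubation window before the detection at T20.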
In our case we apply the FromTo predicate from time point 15 to time point 20, and when we apply it, all edges that do not fall into that time range are removed. Now, this graph can be very large. We can still read it here in the small example, but if the graph is huge we need a better view on all these connections, and for that we apply our grouping operator. The grouping operator gets a specific grouping configuration. As the first argument we need a specification of the criteria by which the vertices should be grouped; we select the service property of each vertex, so that all employees working in a specific service are grouped together into a super vertex. As a second argument we can give aggregation functions for those super vertices; we don't need any for our use case here. The third argument is how we want to group the edges: we group the edges by label, because we only have contact edges here, and we want to aggregate all grouped edges to calculate the maximum duration of a contact. Because with our virus, transmission only happens if a contact lasts longer than five minutes, we want to know the maximum duration of a contact. If we apply that grouping operator, all vertices with the same service are now grouped together, and on each edge we have a new property, maximum duration, that captures the maximum duration over all grouped edges. And as you can see here, the only durations that exceed five minutes are among the oncology people, between oncology and surgery, and inside surgery.
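The grouping step just described can be sketched as follows: group vertices by their service property into super vertices, and for each pair of services aggregate the grouped contact edges into a maximum-duration value. Again, the names and the flat data representation here are hypothetical simplifications, not Gradoop's grouping API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of grouping with a max aggregation (not Gradoop's API).
class GroupingSketch {

    // serviceOf: vertex id -> service, e.g. "p1" -> "oncology"
    // edges: each entry is {source id, target id, duration in minutes}
    // result: super-edge key (pair of services) -> maximum contact duration
    static Map<String, Long> groupMaxDuration(Map<String, String> serviceOf,
                                              List<String[]> edges) {
        Map<String, Long> maxDuration = new HashMap<>();
        for (String[] e : edges) {
            String src = serviceOf.get(e[0]);
            String dst = serviceOf.get(e[1]);
            long dur = Long.parseLong(e[2]);
            // order the pair so the super edge is undirected, e.g. "oncology|surgery"
            String key = src.compareTo(dst) <= 0 ? src + "|" + dst : dst + "|" + src;
            // keep the maximum duration over all grouped edges
            maxDuration.merge(key, dur, Math::max);
        }
        return maxDuration;
    }
}
```

Each super edge then carries the one number that matters for the transmission rule: whether any underlying contact exceeded five minutes.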
This duration over here also exceeds five minutes, but those people are not connected to the component where the infected person is, so we don't have to consider them. We can then extend this workflow, for example by filtering for connections over five minutes, and we know that the employees of oncology and surgery have to be put in quarantine to prevent the spreading of our virus. So I know this is a very simple example, but you see that you can glue our operators together to build this analytical workflow. And we can put the result into a DOT file sink, where you can look at the graph with Graphviz, or write it as CSV back to HDFS, or back to Accumulo or HBase, and so on. So, to sum everything up: we have a distributed graph analysis platform called Gradoop. In Gradoop we have a temporal property graph model with bi-temporal support; in my example I only talked about valid time, but transaction time is supported by all these operators as well. We have logical graphs and sets of them called graph collections, and we can compose all our operators together, which gives us a flexible mechanism to build an analytical workflow. If you want more information, visit our website, Gradoop, which redirects to our GitHub repo. We have a good wiki where every operator is explained, how it works, with examples and so on. We have a getting started guide with a short input graph and a sample workflow showing how to work with Gradoop, and we have a lot of examples, also in this temporal area, where you can see how Gradoop works in detail. Since every operator is built on top of Flink operators, everything is distributed in our environment, and the papers linked on the Gradoop page also contain evaluation results that show the speedup on bigger clusters. Yeah, and that's it. Okay, thank you. Thanks a lot.
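The final filtering step mentioned above, keeping only the super edges whose aggregated maximum duration exceeds the five-minute transmission threshold, could look like this. As before, the names are illustrative assumptions, not Gradoop's filter API.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch: from the grouped result, keep only service pairs
// whose maximum contact duration exceeds the transmission threshold.
class RiskFilter {
    static Map<String, Long> atRisk(Map<String, Long> maxDuration,
                                    long thresholdMinutes) {
        return maxDuration.entrySet().stream()
                .filter(en -> en.getValue() > thresholdMinutes)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}
```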
Unfortunately we only have time for one or two questions, because we have a little hiccup in the schedule and the next talk begins in five minutes. — I was wondering whether you can also capture relations between time windows. Maybe one contact was on Monday and one contact was on Sunday; then of course the sequence is important. For instance, take your virus example: if A has contact with B and a day later B with C, then of course A can lead to the infection of C. But if B first has contact with C and then A has contact with B, then C cannot be infected by A. — Yeah, that's right. — And is this something that you can model with your language? — Yeah, so I think this question is about time-respecting paths in our case. This is one point of future work in our system: we want to put this into a temporal pattern matching operator, where you can define in your pattern that you are looking for a chronological order of these things. In this example here, this was only a grouping on that small period, and we consider that if these periods are, say, within one hour, then we consider all connections in this hour, and the chronological order doesn't play a role in that example.