Good afternoon, everybody. My pleasure to be here. So I'm interested in program verification, and one way to do it is to use the language-based approach. Here on the left side, you have a sequential program with some safety property, like the violation of some assertion. One way to check this property algorithmically is the language-based approach. In this approach, you extract a pushdown automaton for the control of the program, and you extract a finite-state automaton that speaks about the data of the program. For the control, we have basically the valid sequences of statements, but you don't interpret them: it's just a sequence of statements that the program can do. The finite-state automaton then selects, among those sequences of uninterpreted statements, the ones that are semantically valid. By taking the intersection of the two, you obtain the program executions.
Now look at the type of those languages. On the top, for the control, you have a context-free language. At the bottom, we have a regular language. The intersection of the two is again a context-free language, and the safety checking problem on my program reduces to language emptiness on this context-free language.
This approach has limitations. One serious limitation is that, with a regular language, you can only model finite data domains or finitized data domains. So in this talk, I will show what you can do when you want to consider infinite data domains. And that's the picture of the result. On top, unfortunately, we will not consider the full class of context-free languages. We're going to consider an under-approximation of context-free languages, and this under-approximation is parameterized by some integer. The intuition is that the higher the value of that integer, the more executions you cover. And at the bottom, we're going to consider Petri net languages.
The idea here is that one marking of the Petri net represents one data value, and since you have potentially infinitely many markings, you model infinite data domains. I will show that the combination of the two is actually what I call an extended Petri net language. And remember that safety checking reduced to some emptiness problem: it will also be the case here, and it's also decidable. So that's for the big picture. Now let's get into the details.
First, some notation. For the context-free language, I will not use a pushdown automaton; I will use a context-free grammar instead, and a context-free grammar that generates a bounded-index context-free language, which I will define in a minute. For the data, we have a Petri net with two markings, an initial and a final one. And that's the question we are asking: is there some word in the language, so in the bounded-index context-free language, where this word is a sequence of transitions, and this word brings me from the initial marking to the final one? I will sometimes use this picture here, which depicts a derivation tree in the grammar: at the leaves, it gives you the word that's generated, and you want to go from the initial to the final marking. I will also use this symbol.
OK. So let me define what those bounded-index under-approximations of context-free languages are. Just a quick reminder. In a grammar, you have a set of variables, you have an initial symbol, and you have an alphabet. Here I use t because my alphabet is actually the set of transitions of the Petri net. And then you have a set of productions. You have an example of a grammar here. A derivation is just a sequence of words over the variables and the terminals such that, from one word to the next, you apply some production. OK, now a more interesting step. So what's the index of a derivation?
For the index of a derivation, you look at each step in the derivation and you count how many variables you have. Here I have one variable, two variables, one, and zero. The index is just the maximum number of variables you have seen along the derivation, so for this derivation the index is two.
OK. So this is the definition of the language generated by a grammar; that's the classical one from textbooks. And this one, the d-index under-approximation of G, consists only of those words that you can derive using some derivation of index at most d. If you want to think about it intuitively: now, when you derive a word, you have a budget on the derivation, and you must stay within that budget. For any grammar, the under-approximation of index zero is always empty. For the above one, the under-approximation of index one is also empty, because the only choice you have at the beginning is to apply that production rule, which produces two variables. And the index-two approximation is this set of words. In that case we are very lucky, because it coincides with the context-free language. What to take away from this slide is just that the bigger the d, the larger the under-approximation of the context-free language.
Okay, this is one interesting result. When I defined the d-index approximation, I had to change the definition of the language generated by a grammar by putting this constraint on the index of the derivation. It turns out that, given a d, you can actually build a grammar that embeds the budget constraint directly in it. I'm not gonna go into the details, but the idea is that you put indexes on each variable of the grammar, and then you rewrite your production rules in that way. And the interesting case is this one.
When you have two variables on the right-hand side, one can keep index i, which was the one of the left-hand side, but the other one has to go one index lower. So you cannot have two symbols with the same index: one of the two has to decrease. And the language of that grammar, for, let's say, d equals three, coincides with the index-three under-approximation of the original language.
So okay, now you might think: he's gonna be speaking about those bounded-index approximations, but why should we care? Let's see the pros and cons of those approximations. If your language is regular, generated by a grammar of that form, then the index-one approximation coincides with the original language, because you only rewrite a variable into another variable, okay? Also, if the language is linear, for instance here, this grammar generates a to the n, b to the n, then the under-approximation of index one also coincides with the language of the grammar. You may notice that if you use a pushdown automaton for such a language, you cannot bound the stack space that you need to recognize it. Okay, and then there are some more exotic results. One says that if the language is a subset of this regular expression, then there exists a d such that the under-approximation coincides with the original language. Also, if you assume commutativity of concatenation, meaning that under that assumption a b equals b a, you also have that there exists some value d that captures the original language.
And the negative result is that, for instance, visibly pushdown languages are in general not of bounded index. If you consider the language of well-parenthesized expressions, where a is the opening parenthesis and a-bar is the closing parenthesis, there is no index that captures the whole language. Okay, so now you know what those bounded-index approximations are.
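
To make the budget intuition concrete, here is a minimal Python sketch that computes the index of a derivation and enumerates, up to a length bound, the words of the d-index under-approximation. The grammar is a hypothetical one chosen for illustration (S rewrites to a S b S or to the empty word, with b standing in for a-bar); it is not the grammar from the slides, and the names `index_of`, `bounded_index_words`, `DYCK` and `steps` are assumptions of this sketch.

```python
from collections import deque

def index_of(derivation, variables):
    """Index of a derivation: the maximum number of variables in any step."""
    return max(sum(1 for s in form if s in variables) for form in derivation)

def bounded_index_words(grammar, start, d, max_len):
    """Words with at most max_len terminals derivable with index at most d.

    grammar maps each variable to a list of right-hand sides (tuples of
    symbols); every symbol that is not a key of grammar is a terminal.
    """
    variables = set(grammar)

    def n_vars(form):
        return sum(1 for s in form if s in variables)

    initial = (start,)
    if n_vars(initial) > d:          # budget 0 admits no derivation at all
        return set()
    seen, queue, words = {initial}, deque([initial]), set()
    while queue:
        form = queue.popleft()
        if n_vars(form) == 0:        # no variable left: a generated word
            words.add("".join(form))
            continue
        for i, sym in enumerate(form):
            if sym not in variables:
                continue
            for rhs in grammar[sym]:
                nxt = form[:i] + tuple(rhs) + form[i + 1:]
                terminals = len(nxt) - n_vars(nxt)
                # Stay within the variable budget d and the length bound.
                if n_vars(nxt) <= d and terminals <= max_len and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return words

# Hypothetical well-parenthesized grammar: S -> a S b S | (empty word).
DYCK = {"S": [("a", "S", "b", "S"), ()]}

# A derivation with 1, 2, 1 and 0 variables per step, hence index 2.
steps = [("S",), ("a", "S", "b", "S"), ("a", "b", "S"), ("a", "b")]
print(index_of(steps, set(DYCK)))
for d in range(4):
    print(d, sorted(bounded_index_words(DYCK, "S", d, 6)))
```

On this grammar the word aabbab needs budget three, so it is missing from the index-two enumeration. That is a small instance of the remark that no fixed index captures all well-parenthesized words.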
So as input we have a grammar, a value for the index we want to consider, and this Petri net. And this is the problem we want to solve. So let's go. Actually, not yet: let's do something much simpler. Let's forget about Petri nets for now and just take a finite automaton, okay? So I replace the Petri net by a finite-state automaton with an initial and a final state, and I ask the same question: is there a word that I can derive in the grammar and that leads me from the initial state to the final state? It's basically the language intersection problem for a regular language and a context-free grammar.
Okay, so how can I solve this problem? I'm starting from this X zero variable. If there is a production that rewrites this variable into sigma, and from the initial state of the automaton, by reading sigma, I end up in the final state, then I'm done, because the word is just sigma, okay? That's the base case, if you like. For the inductive case you have the following. If your variable rewrites to B d minus one and C d, now you can ask the problem twice. Let me explain. What you want is to go from the initial to the final state. So you can say: I want to go from the initial state to some q prime by taking a derivation from B d minus one, and then from this very same q prime I want to go to the final state by using a derivation from C d. Because of that production, I know that if I can find those, I have solved my original problem. And number three is just the same thing where the indexes are swapped: there the d minus one comes first, and here the d minus one comes last.
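
The indexed productions used in this case analysis, where one right-hand-side variable keeps the index and the other drops by one, can be generated mechanically. The following Python sketch assumes a grammar in binary normal form (each production rewrites a variable either into a single terminal or into two variables); that normal form, the pair encoding of indexed variables, and the names `index_grammar`, `TERMINAL_RULES` and `BINARY_RULES` are assumptions of this sketch and may differ from the construction on the slides.

```python
def index_grammar(terminal_rules, binary_rules, d):
    """List the productions of the indexed grammar for budget d.

    An indexed variable is a pair (X, i) with 1 <= i <= d.
    terminal_rules: X -> list of terminals; binary_rules: X -> list of (B, C).
    """
    productions = []
    for i in range(1, d + 1):
        # A terminal production is allowed at every index.
        for x, sigmas in terminal_rules.items():
            for sigma in sigmas:
                productions.append(((x, i), (sigma,)))
        # With two variables on the right, one keeps index i and the other
        # goes one lower; index 0 has no productions, so i must be >= 2.
        if i >= 2:
            for x, pairs in binary_rules.items():
                for b, c in pairs:
                    productions.append(((x, i), ((b, i - 1), (c, i))))
                    productions.append(((x, i), ((b, i), (c, i - 1))))
    return productions

# Hypothetical grammar for a^n b^n (n >= 1) in binary normal form.
TERMINAL_RULES = {"A": ["a"], "B": ["b"]}
BINARY_RULES = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")]}

for lhs, rhs in index_grammar(TERMINAL_RULES, BINARY_RULES, 2):
    print(lhs, "->", rhs)
```

Because terminals are introduced through their own variables in this normal form, the a^n b^n grammar needs budget two here, whereas the linear grammar mentioned in the talk needs only one.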
Okay, now let's gather those intuitions and put them inside an algorithm. The property of this algorithm is that it reaches the return statement (what can happen otherwise is that the assert statement here fails) if and only if I have such a word w, okay? It's a recursive algorithm, and it takes as parameters a variable of the grammar, initially the initial one, and two states of the finite-state automaton, initially the initial and the final one, okay? You have the base case: you basically guess some production starting from X l that rewrites it into sigma, and if this assert succeeds, then you can just go to the return statement and you are done, because you found the witness, which is sigma.
Now you have the two inductive cases here, and this is what I explained to you: you want to go from the initial to the final state, so you guess one state in the middle and then you solve two problems, going from q i to q prime and from q prime to q f. But here, look what I did. I did something you would not expect: the first recursive call to query is always on the variable with the lowest index. Here the variable with the lowest index comes second, it comes after B of l, but I choose to call query first for this one anyway, okay? And it does not change the correctness of the algorithm. Also, if you compare the first recursive call to the caller, you can see that the index decreases each time you do this recursive call. And you can observe that the second recursive call is what is called a tail-recursive call, so you can get rid of this call by using a goto statement; that's a programming language technique. So what you really have is only one recursive call, because the second one can be replaced by a goto, and for that recursive call the index is always decreasing. So you know that the depth of the stack is gonna be bounded by l.
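
The query algorithm above is nondeterministic: it guesses productions and intermediate states. As a rough executable counterpart, the Python sketch below swaps the guessing for a deterministic fixpoint computation: for each index level and each variable, it computes the set of state pairs connected by some word derivable within that budget. This is not the algorithm from the slides. It assumes a grammar in binary normal form, and the names `compose`, `reach_relations` and `query`, as well as the example grammar and automaton, are hypothetical.

```python
def compose(r1, r2):
    """Relational composition: (p, r) such that (p, q) in r1 and (q, r) in r2."""
    return {(p, r) for (p, q1) in r1 for (q2, r) in r2 if q1 == q2}

def reach_relations(terminal_rules, binary_rules, delta, states, max_index):
    """levels[l][X] = pairs (q, q') linked by a word derivable from X at index <= l."""
    variables = set(terminal_rules) | set(binary_rules)
    for pairs in binary_rules.values():
        for b, c in pairs:
            variables.update((b, c))
    prev = {x: set() for x in variables}       # index 0: always empty
    levels = [prev]
    for _ in range(max_index):
        cur = {x: set() for x in variables}
        # Base case: X -> sigma, and the automaton reads sigma from q to t.
        for x, sigmas in terminal_rules.items():
            for sigma in sigmas:
                for q in states:
                    targets = {q} if sigma == "" else delta.get((q, sigma), set())
                    cur[x].update((q, t) for t in targets)
        # Inductive cases: one side uses the lower index, the other the same.
        changed = True
        while changed:
            changed = False
            for x, pairs in binary_rules.items():
                for b, c in pairs:
                    new = compose(prev[b], cur[c]) | compose(cur[b], prev[c])
                    if not new <= cur[x]:
                        cur[x] |= new
                        changed = True
        levels.append(cur)
        prev = cur
    return levels

def query(levels, x, l, q_i, q_f):
    """Is there a word derivable from x at index <= l leading from q_i to q_f?"""
    return (q_i, q_f) in levels[l][x]

# Hypothetical grammar for a^n b^n (n >= 1) in binary normal form.
TERMINAL_RULES = {"A": ["a"], "B": ["b"]}
BINARY_RULES = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")]}
# Hypothetical automaton: state 2 is reached only by an 'a' read after a 'b'.
STATES = {0, 1, 2}
DELTA = {(0, "a"): {0}, (0, "b"): {1}, (1, "b"): {1}, (1, "a"): {2}}

LEVELS = reach_relations(TERMINAL_RULES, BINARY_RULES, DELTA, STATES, 2)
print(query(LEVELS, "S", 2, 0, 1))
print(query(LEVELS, "S", 2, 0, 2))
```

The fixpoint at each level only ever consults the level below, which mirrors the observation that the recursion depth is bounded by the index.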
All right. Remember I had three parameters: the variable of the grammar and the two states of the automaton. Since I now know that the stack is bounded, I can take those parameters out. Actually I'm only gonna take out the states of the automaton, and I can deal with them on the side with two arrays. I can do that because I know that the recursion depth is bounded. So from this query algorithm you saw on the previous slide, I'm now gonna write a new one, which I call traverse, and it uses those two arrays, which basically model two stacks of bounded height. It's basically the same algorithm, but instead of referring to the parameters, I'm referring to the right components in those two arrays.
Let me explain how you have to deal with the arrays, because it's not that obvious. Here you have a call at level l and you want to know if you can go from q i to q f. Remember, you guess a state in the middle and you have two subqueries. So first you copy q f to the stack frame before, here in the array m f. Then you guess the q prime state in those two frames here. And now you have this recursive call for l minus one, so you actually solve this query now: can I, from C l minus one, find a word that brings me from q prime to q f? And when you return, you just continue by solving the query that remains. So there is this little bit of programming with the stacks, but nothing too complicated.
Okay, now let's get back to Petri nets, because I've basically put in place all the ingredients I need to deal with them. Now my arrays don't contain states of a finite-state automaton; they contain markings of the Petri net, with an initial and a final marking to start with. And that's now my version of traverse that deals with Petri nets. Those are the statements of the previous version for the finite-state automaton, so you can see how what we had for the finite-state automaton matches the Petri net case.
Now I will explain again how we deal with the logistics of the two stacks to prepare for the recursive call. We have to transfer the marking from this frame to this one; this is when I invoke this transfer-from-to function. Then we guess some marking, which just adds tokens nondeterministically to places of the net; for instance, I can guess this one. And then I do the same thing. One important observation is that when you do this transfer of tokens, you don't need to transfer everything. You can do just a partial transfer of the tokens and the whole algorithm stays correct; that's because of monotonicity. Why am I saying that? Because now, if I look at all my function calls here, all the red statements, for each of them I can write some Petri net. This is like low-level Petri net programming, and I'm not gonna go into the details; it's just boring. What you have to remember from that slide is that everything you see now, except the assert statement we have here, is just a Petri net. I have a fancy notation for it, but it's nothing but a Petri net. So I just have to deal now with this assert statement, and this was the assert statement displayed on the previous slide.
What we know about Petri nets and tests for zero (and when I say test for zero, it coincides with the assert statement) is that they don't mix well together. If you're not careful, you cross the decidability line all the time. There is only one known case where you preserve decidability: when you use assert statements of that form. You have to fix a total order on the places of the Petri net. When you test for zero, you can test S1 for zero by itself, but if you want to test S2, you cannot test it by itself: you also have to test S1, et cetera. So if you want to test S n, you have to test all the places below it for zero as well. In that case, the reachability problem for this extension of Petri nets is known to be decidable; it's a result by Reinhardt.
Now look at the assert statement I have in the program. What we have shown in the paper, and you can actually easily observe it, is that if you test all those places for zero instead of just those, it does not affect the correctness of the algorithm. Then you just have to find an ordering on those places, for instance this one, and you fall back on the result of Reinhardt. And this concludes the proof that deciding reachability along bounded-index context-free traces can be reduced to reachability in this model of Reinhardt. That's the main result of the paper. We have also shown the reverse direction: if you take the model of Reinhardt, you can set up some bounded-index context-free language and so on such that you can do the reduction the other way around. And there is still the general problem for full context-free languages, which is still open, and I think it's a very hard problem. Thank you.