Good evening. There are many terms in the title that demand explanation. In particular, what is an art, as Knuth said? Or what is a science? The question people have asked me is: why is it not the science of correct-by-construction programming? Are there not well-studied principles there? In some sense, the point is that if we knew it well enough that you could write a computer program to write all other programs, then it would be a science. Given a problem statement, that program would apply the principles and generate the resulting program. And then, of course, we would all be out of business. So, thankfully, it is still an art.

But more importantly, science tells you the principles: if you do this, this will happen. Science gives you a deterministic answer. In any situation, if you make an intervention, it tells you what the resulting state will be. But what intervention to make? That is an art. You have many choices at any point, each of which you understand scientifically well enough. But which particular choice to make, which particular series of interventions to take, that is an art. Of course, in that sense, the practice of every science is an art.

So now the question is, what do we mean by correctness? The obvious meaning is: does the program do what the user intended it to do? But how do you become sure what the user intended? A lot of errors happen because people think the program is doing something, or is supposed to do something, when it is doing something slightly different. So, ideally, you need a problem statement, a formal problem statement called a specification. That is, ideally, if possible, a mathematical, unambiguous statement. If that is not available, then you try to make it as precise as possible. Given this, what are the traditional approaches for ensuring correctness? The most popular one, which probably all of us employ, is testing.
You write a program and run various test cases. Here, in some sense, the specification is this set of test cases. The program is correct if it gives the expected answer on all the test cases. The problem is that you can only write a finite number of test cases, and there is always a risk of missing some. If you are more scientifically minded, you may employ what is called formal verification. You write the code, then you mathematically prove that the program does what it is supposed to do. The only issue is that verification happens after you have finished writing the code. If the program fails to verify, then you have to go back and change the program. So in verification, in some sense, the correctness criteria come after the program is written. Whereas in correct-by-construction programming, the correctness arguments come before you write the program. You say: by applying a scientific principle, I know that if I do this assignment, it will change the state in this way, which is what I need. The correctness argument guides the act of writing the program, so that by the time you finish writing the program, it is correct by construction.

I will give a very quick example. What I will do right now introduces the concept of correct by construction at a very high level. Do not worry if you do not understand the details, because I am going to come back to this and cover it in a lot more detail in the later half of the talk. Take the famous maximum segment sum problem, which essentially says: there is a given array; consider any sub-segment of it, take the sum of its elements, and find the maximum among all such sums. That is the problem. A formal problem statement is something like this.
Suppose you assume nothing, so what is called the precondition is trivially true. S is the unknown program, and R is the postcondition: the condition that the variable r contains the maximum of the segment sums. Now there are many choices available to you. You can divide the array into two halves, do some calculation, and merge; you can scan the array from left to right, right to left, or whatever you wish. In the correct-by-construction style, you employ some principle. For example, suppose you decide: I am going to scan the array from left to right. Then your program will terminate when you have finished looking at the array, and at that point you ask: why does the program achieve the desired condition? To ensure that the program achieves the desired condition when this loop terminates, something else, called a loop invariant, must be true. The loop invariant and the termination condition together give you the postcondition. The point is that every time you evaluate the guard, you do not know whether it will be true or false. Therefore this P0 must be true every time you come here, so that the two together give you the postcondition. In this particular case, coming up with P0 was very simple. In the expression which is your desired postcondition, you simply replace capital N by small n, and when small n becomes equal to capital N, by definition you get your postcondition. This is what we mean by correct by construction. This program is correct provided you can write the loop body S0 which maintains the loop invariant P0. S0 should be such that whenever you start with P0 and the guard true, you end with P0 true.
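Written out, the specification and the invariant described above might look as follows. This is only a sketch: the array name a and the half-open segment convention (a segment from p up to but excluding q) are assumptions, since the slide itself is not reproduced in the transcript.

```latex
\begin{align*}
&\{\,\mathit{true}\,\}\;\; S \;\;\{\,R\,\} \\
R:\quad & r = \Big(\max\; p,q \,:\, 0 \le p \le q \le N \,:\, \textstyle\sum_{i=p}^{q-1} a[i]\Big) \\
P_0:\quad & r = \Big(\max\; p,q \,:\, 0 \le p \le q \le n \,:\, \textstyle\sum_{i=p}^{q-1} a[i]\Big) \\
& P_0 \,\land\, n = N \;\Rightarrow\; R
\end{align*}
```

The last line is the whole point of the choice of P0: once the guard n ≠ N fails, the invariant is literally the postcondition.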
Again, this is a very quick one- or two-minute introduction. Now, what do you know about S0? Really nothing, except that I am going to go left to right. So at some point I am going to increment n to n + 1. That is the only thing that is known. (Sorry, is there a pointer or something? You cannot see the mouse.) So, n is assigned n + 1. The scientific principles will tell you what must be true before this. It is something similar to P0, but a slightly different formula. So you have filled in a part of the body, and you have derived new conditions that must hold for the remaining program fragment. Now, again by applying the principles of correct-by-construction programming, you can discover that r must change if n is to go to n + 1. Then, by doing some calculation that we are not getting into now, and that I will come back to later, we fill this in further. We calculate that r must be assigned r max s, where s has to satisfy a certain property given here, and so on. Then similarly we do something for s, and at this point our program is complete. If we did not make any error in doing this calculation, then this program is correct by construction. The key point here is that at all stages we had a correct-by-construction program. The program may have had holes, but the holes were specified precisely enough that if they could be filled by fragments with certain desired properties, then the whole program would be correct by construction. We were guided by formula manipulation. Whenever we had an intuition, maybe I will increment n, maybe I will introduce a loop, maybe I will do something else, we explored the logical consequences of it: if I want to do this, what else must be true so that the whole program is correct?
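The finished program can be sketched in Python as follows. The names `max_segment_sum` and `brute_force`, and the exact form of the update for s, are assumptions filled in from the standard linear-time solution, since the slides are not part of the transcript. The optional assertion re-checks the invariant on r against a brute-force reading of the specification after every iteration, which is only sensible for small inputs.

```python
def brute_force(a, n):
    """Specification: max of sum(a[p:q]) over 0 <= p <= q <= n (empty segments allowed)."""
    return max(sum(a[p:q]) for p in range(n + 1) for q in range(p, n + 1))


def max_segment_sum(a, check=False):
    n, r, s = 0, 0, 0          # establishes the invariants for the empty prefix
    while n != len(a):
        s = max(s + a[n], 0)   # s: best sum of a segment ending just after a[n]
        r = max(r, s)          # r := r max s
        n = n + 1
        if check:
            assert r == brute_force(a, n)   # the invariant on r holds again
    return r


# Sample input chosen for this sketch (not the array from the slides).
print(max_segment_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4], check=True))  # prints 6
```

Because empty segments are allowed, an all-negative or empty array yields 0 rather than an error.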
That is the basic philosophy of correct by construction, and we will cover this example in a lot more detail later on. But before that, a brief outline of the talk. I will talk about some of the preliminaries, and then I will cover two examples in detail: binary search and maximum segment sum. Finally, I will talk about a system that my PhD student Deepak has built to assist with this process.

First, I said we have to specify what the program is supposed to do. For that we use what is called the quantifier notation. Essentially this is the familiar sigma for sum, except that we write it in three parts. There is an operator, there is a set of dummies, then for the dummies that satisfy a certain range there is a term to which we apply the operator. For example: for all i between 0 and n, sum up the terms a[i]. Or in this case we are saying that the array is sorted: for all i and j in the given range (this should be j here), a[i] is less than or equal to a[j]. Similarly, max is an operator, so this is the maximum segment sum problem we just discussed: the maximum of the sums of the segments of the array. This is the notation we will use for specifying conditions.

Next is the specification of a program. When we say a program is correct, we mean a triple {Q} S {R}, which says that program S, when started in a state satisfying Q, will terminate, and it will terminate in a state satisfying R. This is what we really mean by program correctness. If this triple is valid, it is the specification of the program; Q is called the precondition and R is called the postcondition. For example, suppose exponentiation is an unknown program: if b is a non-negative integer, then the program is supposed to compute a to the power b, where a is an integer on which we have not put any constraint. Or this is the specification for maximum segment sum: if N is a non-negative integer, then the program must compute this maximum sum in the variable r.
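The three-part quantifier notation maps fairly directly onto comprehensions. The sketch below is only an illustration: the helper names, the half-open ranges, and the check of a triple over a handful of sample states are all assumptions made here, and running a triple on samples tests it rather than proves it.

```python
def total(a, n):
    """(sum i : 0 <= i < n : a[i])"""
    return sum(a[i] for i in range(n))


def is_sorted(a, n):
    """(forall i, j : 0 <= i <= j < n : a[i] <= a[j])"""
    return all(a[i] <= a[j] for i in range(n) for j in range(i, n))


def max_seg_sum_spec(a, n):
    """(max p, q : 0 <= p <= q <= n : (sum i : p <= i < q : a[i]))"""
    return max(sum(a[i] for i in range(p, q))
               for p in range(n + 1) for q in range(p, n + 1))


def holds_on(pre, prog, post, states):
    """Check {pre} prog {post} on the given sample states only."""
    return all(post(s, prog(s)) for s in states if pre(s))


def power(state):
    """A candidate program for the triple {b >= 0} S {r = a ** b}."""
    a, b = state
    r = 1
    for _ in range(b):
        r *= a
    return r


samples = [(a, b) for a in range(-3, 4) for b in range(-1, 5)]
print(holds_on(lambda s: s[1] >= 0, power, lambda s, r: r == s[0] ** s[1], samples))
```

States that violate the precondition (here b = -1) are skipped, which mirrors the fact that a triple promises nothing about them.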
So that is the first thing. Second is the notion of weakest precondition. Given any program S and a desired postcondition R, you can ask: what must be true before S is executed such that R is true afterwards? It is the weakest such condition; that means many more things may be true at this point, but this much must necessarily be true. Take this example: we decrement x, and then we assert that x is still non-negative. Of course, it is a valid specification to start this program in a state where x is greater than y and y is greater than 10. Then we know that x is still going to be positive after the decrement, so this is a valid specification for this program. But the weakest condition is simply that x be greater than or equal to 1 before the assignment statement is executed. That is the weakest precondition.

The weakest precondition for assignment is, in some sense, straightforward. If you are going to assign expression E to variable x, and you want the condition R to be true afterwards, then what must be true before the assignment? Well, in expression R, replace occurrences of x with E. This must be true before the assignment so that R is true after it. This is called the weakest precondition of assignment, and there is a simple example here. We wanted x greater than or equal to 0, so we replace x with x - 1; the weakest precondition is x - 1 greater than or equal to 0, which is of course the same as x greater than or equal to 1. Note that the assignment statement is the only statement that changes the program state; everything else is conditional. The only time you change the value of a variable is via an assignment, so this apparently simple concept is a fundamental concept in reasoning about programs.
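The substitution rule can be tried out with a small sketch. It models a predicate as a function on a state dictionary, so "replace x with E in R" becomes "evaluate R in the state where x has the value of E". The helper name `wp_assign` and this semantic (rather than textual) reading of substitution are assumptions made for illustration.

```python
def wp_assign(var, expr, post):
    """wp(var := expr, post): post evaluated with var replaced by the value of expr."""
    return lambda state: post({**state, var: expr(state)})


post = lambda st: st["x"] >= 0                       # R: x >= 0
pre = wp_assign("x", lambda st: st["x"] - 1, post)   # wp(x := x - 1, R)

# On a sample of states the computed condition agrees with x >= 1.
print(all(pre({"x": x}) == (x >= 1) for x in range(-5, 6)))
```

A stronger precondition such as x > y and y > 10 also guarantees the postcondition, but it rules out states (for example x = 1) from which the assignment would have worked just as well.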
Having covered this, I will now talk about the binary search example. I have chosen this example because all of you must have written binary search, and of course it is very easy to get it wrong. Moreover, binary search can be adapted to many different situations: search is a common problem, and a lot of the time you would like to apply a variant of binary search in a complex situation. So let us make sure that we always get binary search right, because there is this problem with the pointers crossing over, getting into an infinite loop, and other things. The first thing is to come up with a postcondition: what is the specification for the problem? It is a search problem, so given an array a, we want an index variable x such that a[x] is the value v we are looking for. But what happens if the value v does not occur in the array? What is the program supposed to do then? We can change the spec to say: if such a value exists, then x must give us its index, and if v does not occur in the array, then x can be anything. But that is not what we typically want in a binary search. Even if the value does not occur, we want the nearest value, so we would like to replace this expression with this specification. Note that if the value v occurs, then we are going to get it from this condition; otherwise we are going to get the point where the array jumps over v, so a[x] is less than or equal to v and a[x + 1] is greater than v.
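Looking ahead, a program meeting this specification can be sketched as below. The loop shape, the name `pick` for the choice of the probe point, and the convention that the last array element plays the role of the upper end are assumptions made for this illustration and are not taken from the slides. The sketch only requires that the probe point lie strictly inside the current interval.

```python
def search(a, v, pick=lambda x, y: (x + y) // 2):
    """Return x with a[x] <= v < a[x + 1], assuming a[0] <= v < a[-1]."""
    x, y = 0, len(a) - 1
    assert a[x] <= v < a[y], "precondition fails: answer this case separately"
    while x + 1 != y:
        h = pick(x, y)
        assert x < h < y          # the only requirement on the probe point
        if a[h] <= v:
            x = h                 # keeps a[x] <= v
        else:
            y = h                 # keeps v < a[y]
    return x


a = [1, 3, 3, 5, 8, 9]
for pick in (lambda x, y: (x + y) // 2, lambda x, y: x + 1, lambda x, y: y - 1):
    x = search(a, 5, pick)
    print(x, a[x] <= 5 < a[x + 1])
```

Swapping the midpoint for x + 1 or y - 1 changes how many iterations are needed, not whether the returned index satisfies the postcondition.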
I would already like to point to the art in the title: coming up with this specification for binary search is already an example of an art. We could have started from here also; this is a spec too, and there are many different specifications of binary search possible. So there are many different paths, and which path you choose is up to you. Furthermore, remember how we are going to go about this. You already have an informal idea in your mind: you will start from the two ends of the array and you want to zoom in. So somehow you have to bring these two ends of the interval into the picture. We would like to rewrite this expression so that we can bring in two variables that represent the two ends of the interval. This is what holds at the end of the program, but what holds at the beginning? So we rewrite this. (Yes, good point, I am just coming to it. If that is the case, our correct-by-construction process must discover somewhere what happens if v does not occur in the array. We will come back to that.) So we introduce a variable y in place of the expression x + 1, and say this is our desired postcondition. Obviously this assumes that the element v occurs in the array, and somewhere in our derivation we must confront that fact. Once you rewrite the postcondition in this form, I would say the problem is practically solved. After this it is reasonably straightforward; by straightforward I mean there are very few choices left, and in a reasonably scientific way we can derive the whole code and be sure that it is correct. Let us look at this. There is the standard transformation I showed you before: we take this as the termination condition and make this the loop invariant. Now the strategy is simple: you have to find some x and y. Start from any state and keep changing x and y, provided you
maintain this loop invariant, and if you ever reach this condition, then you will have found the right value. Going from here to here is a very standard transformation. Now, the only reasonable initial values you can think of are x equal to 0 and y equal to N, because with anything else you would have ignored a part of the input. That immediately gives us this precondition: if you are going to assign x equal to 0 and y equal to N, then this condition must be true before we start the program. So essentially we have now separated our binary search problem into two parts. You can trivially check whether this condition is true or not. If it is false, you can give a different answer; only if it is true do you proceed with this piece of code. In that sense the correctness considerations show themselves if we are rigorous about the process.

Given that we will maintain this invariant, you now have a strategy in mind: you are going to halve the interval by choosing some point in the middle. You know one of the errors in binary search: sometimes you do not get the middle right, and if the two points are nearby you may get into an infinite loop because the point was not properly chosen. So right now all we will say is: choose a point h in the middle that simply satisfies this condition, that is, the interval gets narrower. That is the only requirement we put on it. Then we want to explore under what condition the left end of the interval can be moved, that is, x is assigned h. It is very simple. You want the loop invariant P to be true after this assignment, so you compute the weakest precondition of P with respect to it. What is that? Simply, in P you substitute h for x, so you get this answer. When you enter your loop body P is true; after exiting
from the loop body, I mean on reaching the end of the loop body, you want this to be true. Because you have not changed y, a[y] greater than v stays, so you can do x is assigned h provided this is true. In some sense you have calculated this code fragment: if a[h] is less than or equal to v, you can safely do x is assigned h, and you know that the invariant will be maintained. Similarly, you can ask when you can assign h to y. Same thing: you find that x is unchanged, so y can be assigned h if a[h] is greater than v, by the weakest precondition. It so happens that these two conditions are complements of each other, making our life easy. So if a[h] is less than or equal to v, x is assigned h; else a[h] is greater than v, and y is assigned h.

Look at the whole program. In a way we have finished writing the program, and the program is correct, yet we have not decided what the value of h should be. The important point is that it does not matter: for any value that satisfies this condition, the whole program is correct. We could have chosen h as x + 1 or y - 1; it would be a traditional linear search or something, but the whole program would still be correct. That is the important point: we have identified what properties of the variables we are using. The fact that we choose the midpoint as (x + y) / 2 is only related to the efficiency of the program, not to its correctness. Whereas in a traditional binary search, when you choose (x + y) / 2 and the division is not handled properly, the program becomes incorrect. Look at this: it is a reasonably simple program, and I hope that never in your life will you get binary search wrong if you just get this specification right. And this can be adapted for many different situations. Of course, this is a template. Remember, we have not even used the fact that array a is sorted; nowhere did we need it in this derivation. The
only property we needed was this, and from here we reach this. Sortedness comes in to assert that a[x] equals v if there was that one value you wanted to find. It is only after this that you need the sortedness property; this piece of code does not need it. As a result, if you look, this is very similar to the bisection code for finding roots of a function. If you are not familiar with it you can ignore this, but this code can simply be transplanted to find a root of a function: find two end points, one where f(x) is less than 0 and one where f(x) is greater than 0, keep taking the middle point and moving the ends so that this condition is preserved, until you are within an epsilon threshold. That is what I meant by saying binary search can be adapted for many different situations. So that is my discussion of binary search. It is a simple example that I thought all of you would be familiar with, and now I will attempt something slightly more complicated. If you have any question, feel free to interrupt at any point.

(Yes, in this piece of the code? Yes, the fact that the array is sorted will be used if you want to know whether a[x] equals v or not. It is in this part that you need the sortedness property, when you want to know the exact index. If the array were not sorted, there could be many places where this condition is true, so x may be any one of multiple answers. Any other question? OK, yes.) The bigger picture here applies to any problem. You are doing something like this informally in your head when you start writing code; you have an informal coding strategy in your mind. What we want is to make sure that you can translate your ideas into a correct program. You have a strategy, but you explore it one step at a time. Just remember what we did here: we introduced a loop, then we said I am going to choose some point in the middle, I chose some h, and then I
just asked: under what condition can the left point be moved to that h? So we explore the logical consequences of the choices we want to make. You have a vague idea that this series of choices will lead to the final answer, and here we want to make sure that you can carry it out rigorously. In that sense I am suggesting programmers adopt it, to make sure that they can translate their ideas into correct code. Obviously many of these ideas can also be used in automatic program synthesis, but that is not part of my research agenda, although automatic program synthesis naturally has to deal with many of the same issues.

Now I come back to the maximum segment sum problem. We said: compute the maximum of all the segment sums of a given integer array. This is just an example: given this array, these are various segments, and the maximum is 6, which comes from this segment. This is a reasonably famous problem, with a linear-time algorithm given in 1984. Many times I have tried this with people, and they typically write a quadratic or even cubic solution. What we will see is that if you have some informal idea about how to get a linear-time solution, we can make sure that we get that solution correctly. Just before that, to show that the problem is non-trivial: this is the book we use, Logic in Computer Science by Huth and Ryan. It treats an almost identical problem, minimum segment sum. This is the code, and the book tries to verify it. In the course of a two-page proof it says (I have just taken a scan of this) essentially that the complexity and ingenuity of this program mean that its justification needs to be taken offline. So it gives up on the formal proof and takes it offline. The only point I want to make is that the book is saying this program is complex, it is ingenious, it is a very clever thing. The book is appreciating it, whereas our aim would be to argue
that any one of us could have discovered this program if we were systematic and had the basic idea. We would like to demystify how someone came up with a program like this. So this is now the specification for the problem. We want the max over all p and q, which represent the end points of the segment, and this is the sum. Note that this definition carefully allows the empty segment, because p is less than or equal to q. So if all elements are negative, we allow the empty segment to be the answer. N can also be 0, in which case the whole array is empty and 0, the empty sum, is the maximum segment sum. Of course, you can change the specification, but the point is that in correct-by-construction programming a lot of thought goes into coming up carefully with the specification. A lot of what we call program bugs are actually errors of specification and not errors of implementation. When you say, oh, I did not think of that case, it really means that maybe you thought p must be strictly less than q; you unconsciously assumed that in some part of the code. So this separation of errors of specification from errors of implementation is important.

So this is the postcondition. The first step is the same as before: we want to iterate over the array from left to right, so we apply the transformation I talked about. Note that this is a semantic step. Essentially, this is a strategy you have in mind, and this is a simple syntactic implementation of the strategy. The fact that you happen to replace capital N with small n is incidental; for some other problem you might have done something different. So we rewrite this as this, and the termination condition, of course, is that n equals N. This P1 we have introduced basically to make sure the loop will terminate. I will retain P1 but not talk about it; it is only important for the termination argument, which I am not going to cover today
in the talk. In general, it is a good idea that whenever you introduce a variable, you introduce bounds for it; then, if your variable is going to go out of those bounds during the program, your calculation will tell you. With this we get this simple program: while n is not equal to N, with P0 and P1 as your invariants. This variant is again required for loop termination: in every iteration of the loop this quantity must decrease, and when it becomes 0 the loop terminates. We will not discuss the variant beyond this. This is the very first step, very reasonable and straightforward.

Now the question is again how to construct the loop body. I have already told you that the only thing we can guess is that we are moving from left to right, so n has to become n + 1 at some point. What must be true before this? We compute the weakest precondition of the loop invariant P0, which is nothing but P0 with n replaced by n + 1. So we have an unknown program S1 whose precondition is P0 and whose postcondition is P0 with n replaced by n + 1, and our task is to construct S1. We know that if S1 can be constructed to satisfy this triple, then the whole program is correct by construction. Now how do we go about constructing S1? What is the guiding criterion for what should go in the body of S1? Well, you compare its pre- and postcondition. The precondition is P0; the postcondition is P0 with n replaced by n + 1. Look at P0 here: in the postcondition we replace n with n + 1, and p and q are dummies. If you are changing the right-hand side, r has to potentially change for this equality to hold. Therefore r must get some new value r'. The point I am emphasizing here is that in correct-by-construction programming there is always a justification for attempting something. Of course, sometimes you cannot think of a justification, and you may make a
blind guess also, but so far we have not made any blind guesses; we have always argued why we need to do something. So r is assigned r', where r' is some new value of r that we do not know yet. Remember that for S1, the unknown program, the precondition is P0 and the postcondition is this; r' must be such that this is satisfied. By now you can guess that what we really want to do is compute the weakest precondition. Remember what the weakest precondition was: what must be true at this point such that doing this assignment results in this. That is the necessary condition so that executing the assignment results in this state, and it is of course straightforward: we make a second substitution, putting r' in place of r. So in P0, put n + 1 in place of n and r' in place of r. (OK? Please ask me; this is really the fundamental step.) In some sense there is not much to program correctness when you think about it. In a pointer-free program, assignment is the only thing that changes state, and you have to relate the state before the assignment to the state after the assignment. So there are actually very few concepts, for any program.

Let us look at this in more detail. We simply substitute the definition of P0, and this wp is nothing but these replacements. So all we have obtained is r' equal to this. Everybody with me? This is what we want: to find the r' that is equal to this expression. You had done a certain number of iterations of the loop, and r was holding the partial result, the max seen so far. You do one more iteration of the loop and you want to know the new max. That is the insight. The goal in programming is to relate the new iteration of the loop to everything done so far. It is like the inductive step in an inductive proof: you have something that is known up to n. So essentially you would like to rewrite this
expression so that you can bring in the r whose value you already know. Because this is the (n + 1)-th step we are talking about, the simplest strategy is to separate the work done up to the n-th step from the work done in the (n + 1)-th step. So you look at this predicate, q less than or equal to n + 1. It is nothing but: either q is less than or equal to n, or q is equal to n + 1. So from standard rules of logic you can rewrite the range as this. The next rule tells you that you can split the range, and since the operator is max, all you are saying is that r' is the max of the max up to the n-th iteration and the max of this expression. So this is the extra value that needs to be computed in the (n + 1)-th iteration. The point is that these logical manipulations are telling us what needs to be done in the (n + 1)-th iteration, or in any iteration of the loop. This is where correct by construction comes in: our program will be correct by construction if r' has this value. Of course, these are logic expressions, not program variables, and in the final program we need to replace them with program variables. We already know that P0 means this expression is nothing but r, so we can replace this expression with r, and we get r max something. We can simplify further because q equals n + 1: q was a dummy, but we know its value, so we can replace q with n + 1 and get this expression. This is really the heart of the construction. When you ask how one thinks of something so clever, it is essentially this formula. You said that after the (n + 1)-th iteration you would like this, and everything is guided by it. You can ask: why did I do a range split, why did I do this? Because those are pretty much the standard things to do when you think
about doing the (n + 1)-th iteration: you want to relate the n-th iteration to the next one. So these are not wild guesses. They are guesses in some sense, since this may or may not have worked, but they are all educated guesses. That is why we call it an art: which rule to apply at which point is an art. Science tells you what the resulting expression will look like if you apply a given rule. Now the only issue is that this is again a logic expression, and we need program variables. At this point we have a choice. We could compute this directly: write a loop where p goes from 0 to n + 1 and compute this expression. But we are exploring whether we can derive a linear solution. To do that, we introduce another variable s which is equal to this expression. There is a technicality here: here we have n and here we have n + 1, so we have to take P2 with n replaced by n + 1. If we assume that holds at the point in the code where r is assigned r', then r' is r max s. So we have now filled one more hole. Remember, this whole thing was a hole, and now we have discovered this program fragment, r is assigned r max s. While discovering it we introduced a new invariant P2, which says that the new variable s holds this value. Now our job is to make sure that P2 is true before this as well. We chose to propose P2 as a loop invariant, so assuming P2 is true here, we have to get here, and then we have to make sure that P2 is true at the end of the loop body. It turns out that if you push this through the assignment n is assigned n + 1, P2 becomes true at the end. That was a separate check that needed to be done, and it happens to hold when we propose this as a loop invariant. Now this remaining hole is, like I said, an independent subproblem. r does not occur here, so this is
like a new problem to be solved. You can apply the same strategy, and then you can calculate s. It turns out s is a slightly more complicated expression, but this is mainly because n + 1 occurs in two places. The range split here gets rid of this n + 1, but it does not get rid of that n + 1, and that brings in the complication. But again, in some seven or eight steps, you can get there by straightforward calculation. When I say complication, I mean you have to be careful: if you are careless you will make an error, and if you do it carefully you will discover this. So, this is our final program, and I submit that any one of us who has studied the technique could have discovered it. There is nothing complex, nothing magical about this program. This gives you a linear-time program for maximum segment sum. Of course, it is possible that we may not have been able to complete this program. Maybe we chose a path that does not work out, or maybe we have another problem for which there is no linear-time solution; in that case we would not have reached this point. Please see if you have any questions, because this finishes the derivation of the maximum segment sum. Yes? Yes, finally, because we know this solution. How do you even know this set of heuristics? Because one has tried it. So, as one solves more and more problems, one can keep increasing the database of heuristics, and a brute force over that will naturally always give... I will tell you another example. I said there is a technicality. Remember what we could have done: we could have simply introduced this as an invariant, s equal to this expression. We did not do that. S equal to this expression would have been a natural choice to make, but actually that leads to an array out-of-bounds error. I told you at some place that there is a technicality. So, this is again, in that sense, a creative step: to not use n + 1 but to use this. That is one answer, for this particular case.
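The linear-time program that the derivation arrives at can be sketched roughly as follows. This is a reconstruction in Python from the invariants described in the talk, not a copy of the slide: the function names are made up here, the empty segment (with sum 0) is assumed to be allowed, and the exact form of P0 and P2 in the comments is an assumption based on the spoken description. A brute-force reading of the specification is included only to compare against.

```python
def max_segment_sum_spec(a):
    """Brute-force reading of the specification: best sum over all segments a[p:q]."""
    size = len(a)
    return max(sum(a[p:q]) for p in range(size + 1) for q in range(p, size + 1))


def max_segment_sum(a):
    """Linear-time program in the style of the derivation (reconstructed sketch)."""
    n = 0
    r = 0  # P0 (assumed form): r == max of sum(a[p:q]) over 0 <= p <= q <= n
    s = 0  # P2 (assumed form): s == max of sum(a[p:n]) over 0 <= p <= n
    while n != len(a):
        # Establish P2 with n replaced by n + 1.
        # a[n] is in bounds here because the guard guarantees n != len(a).
        s = max(s + a[n], 0)
        # Range split: the new r is the old r, or the best segment ending at n + 1.
        r = max(r, s)
        n += 1
    return r


if __name__ == "__main__":
    sample = [2, -5, 3, 4, -1, 2]
    print(max_segment_sum(sample), max_segment_sum_spec(sample))
```

Keeping s tied to n rather than n + 1 in the invariant is what avoids reading past the end of the array: the element a[n] is only touched inside the loop body, where the guard has already established that it exists.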
Similarly, there are other choices we made. For example, what if there was no linear solution to this problem? Then, in the program you are talking about, the computer may have to introduce a loop here to compute this expression, which is a sum. Suppose we are computing some other predicate; then somewhere you have to use a property of that predicate. So, which property of the predicate to use? Obviously, as I said, as your database increases more and more, and in a brute-force method you can try many things, many problems will be solved. Thankfully, as of now, programming is not a solved task, so we are still in business. I don't know if I answered your question, but I am saying yes: once you have solved something, you put that in the database, and if those steps are going to be tried in some combination, then eventually you will discover some of the solutions. The question is how effective it is on every new problem you have. To the extent that it is a science, and to the extent that, as I have been arguing, there are very few choices, well-defined heuristics may lead to a lot of good programs. I agree that a lot of these ideas will be useful for automatic program synthesis also; it is just that we are not there yet. The set of heuristics is large, too large to be discovered. So, wouldn't a bigger challenge be to come up with heuristics automatically? Because solving this one is kind of cheating, right? You know what the right places are, you know what the right things to do are. If you are saying that I knew the solution and therefore I could derive it, then that is not so. First, I will now show you a few examples where I did not know the solution. I had never seen that solution in my life before; I applied the technique and I got that solution. In fact, there were other known solutions for the problem. So, that is one part. The other part is that in our business it is good to have huge challenges, so you can do a PhD and, I mean, in a serious way come up with some set of heuristics.
In the process we have come up with some set of heuristics; we have systematized some of the knowledge of what heuristics to apply under what situation. So, to that extent, yes, it is a huge challenge, and we need a lot more people to attempt this challenge. But let me just show you some more examples. Okay, so before moving to the examples, the point I want to make is that on one end the challenge is huge, but for a lot of problems, at least to understand this concept, there is a small set of core concepts. Of course, you have to understand first-order logic. Then there is the specification, the loop invariant I talked about, and these assertions I put in the middle. They are called annotations because they are static things; an assertion is typically used as a run-time assertion, when you know the values of the variables, whereas the annotation is true regardless. And then I discussed some of these rules that we applied. Using this small set of rules, a lot of problems can be solved. The point is that this is what we are doing at least in the initial courses, when we are learning programming. When we are thinking about correctness, these concepts are enough, because what makes a program hard is a loop. Of course, you cannot go very far without a loop; otherwise the program will be a linear flow. In a loop, essentially you have to think about the next iteration of the loop and how it relates to the work done so far. That is why some of these rules suffice. I am saying there is a small core that you can master to solve many problems. Of course, you will run into complicated situations where this won't suffice. You will have to transform your formula in a different way, or you will have to think of new properties of the function being computed. That is where the creative step comes in. Okay, so the methodology, in some sense again answering your question, is that whenever possible you would like to follow this outer path. This is to the extent that it is mechanically possible: some natural predicate transformations
will suggest themselves. If they don't, then you have to think of the properties: what property of sum should you exploit? We have implicitly exploited many properties, for instance that sum is a symmetric, associative operator. I didn't talk about this because it is so obvious, but we implicitly exploited it, and if it was not so, then maybe this schema would not have worked. And finally, maybe you can't calculate it, but you want to try something. So, you would like to guess something and then locally verify whether this works or not, whether this maintains the invariant or not. So, this is roughly the programming methodology we are proposing. Just to give you an example, I will not go through the details of the derivation, but I will just show it. It's a very simple problem; just read the statement. In fact, you would think it's so simple that it should not be talked about. Given an array, write a program to compute whether there exists a sequence of 50 true entries followed by 50 false entries. Of course, the idea is that you don't want to read the last 100 entries every time. You want to make a single-pass algorithm, doing only maybe a few variable lookups, not 100, because this 50 could have been 500 or 5000. This is a very simple problem. So, now the specification gets slightly complicated: there exists a point m such that (and instead of 50 we use d) in the range m - 2d to m - d you have d values that are true, followed by d values that are false. That is the specification for this. So far it is the standard thing that we did before: we try n not equal to N and all that. At this point the calculations get a bit complicated. You try to compute the new value of r, and then you will find that something else gets introduced, and this time we don't make it a loop invariant. So, actually, we again introduce a new variable s, but this P3 need not be a loop invariant. In fact, we materialize it right here, without introducing a loop invariant.
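The specification stated above can be read as an executable check. The sketch below is only an illustration in Python and is not the program derived in the talk: `has_pattern_spec` transcribes the specification by brute force, and `has_pattern_single_pass` is one possible single-pass solution using two run-length counters. Both function names and the counter scheme are assumptions introduced here, and the half-open ranges [m - 2d, m - d) and [m - d, m) are an assumed reading of the spoken ranges.

```python
def has_pattern_spec(a, d):
    """Specification: some m with a[m-2d:m-d] all true and a[m-d:m] all false."""
    return any(
        all(a[m - 2 * d:m - d]) and not any(a[m - d:m])
        for m in range(2 * d, len(a) + 1)
    )


def has_pattern_single_pass(a, d):
    """One pass over the array, looking only at the current entry and two counters."""
    if d == 0:
        return True  # both ranges are empty, so the specification holds trivially
    trues = 0   # length of the most recent run of true entries
    falses = 0  # length of the run of false entries directly after that run
    for entry in a:
        if entry:
            if falses > 0:
                # A true entry after some falses starts a fresh run of trues.
                trues = 0
                falses = 0
            trues += 1
        else:
            falses += 1
            if trues >= d and falses >= d:
                return True
    return False


if __name__ == "__main__":
    sample = [False, True, True, True, False, False, True]
    print(has_pattern_single_pass(sample, 2), has_pattern_spec(sample, 2))
```

The single-pass version does a constant amount of work per entry regardless of d, which is the property asked for in the problem statement: the 50 could just as well be 500 or 5000.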
Some further calculation is needed, and then further invariants are needed, so we again go into another subprogram. Finally, we calculate that S2, again with a similar loop, and S2 is this. So, look at this final program. This is the final program; of course, there are a lot of assertions. But now compare it to a standard solution. This is an alternative solution. Both are correct and both are of similar complexity. But, just to answer the question, it's not that I had seen the solution before. This is the solution I came up with; I had never seen it in my life before, and it's dramatically different from the known solution, which also I had not seen. I saw this only afterwards. Hopefully that answers it. This question came in one of the CS101 tests, and this was the model solution. Further, this is essentially a state machine approach. For this approach also you can do a correct-by-construction derivation; it is a matter of what strategy you choose. I chose a different strategy here. I may have decided that I will employ a state machine, and then I can do a correct-by-construction derivation of this solution. In that sense, the methodology helps you implement your ideas correctly. Which idea, which path to follow, is an art; that was a choice we had to make. This is a framework for organizing your thought, for exploring the logical consequences of your idea. What you were doing informally earlier, now you are doing formally, and you have an early discovery of an error. Or sometimes you may have no clue what to do. And here, for example, take that s expression, in particular the way we got s. If you made an error in that calculation, then yes, the program will be wrong. But the point is that detecting errors in a calculation is an easier task than detecting errors in a program. Say I give you a program and I didn't put that max 0 part. You read it, somebody else reads it, and they have to figure out whether this program works or not. Either they will think about it, or they will test and they will discover it. So, in that sense, it's
clear that the methodology is helping you, because that calculation you can recheck, somebody else can recheck, and then there is much less chance of error. So, A, checking errors; but also B, the invariants suggesting themselves, because some of these invariants are not obvious. Once you learn the technique, the invariants suggest themselves. Part of it, as I have been emphasizing, is that if you are systematic there are very few things to be done. A good student need not know logic; they will be systematic in their mind, and they can see it. For most of us, the methodology will probably help. So, just one more example, because this question has been asked. Okay, maybe I will skip this example. Essentially, here again, what you have to compute is p equal to A to the power B. Of course, you can multiply A B times, but you would like to speed it up; somewhere you would like to use powers. So, the trick is to come up with this specification. That is again something you can templatize, but there are a lot of ways in which you can do it; in fact, it will lead to a lot of different solutions. The rest of it I will not go into, but essentially, once you decide on this as an invariant, it again gives you the same thing. Then you explore: what if I introduce a power of x? You can calculate the consequences of that, and you can come up with the right answer. But I will tell you another problem. Again, this is a problem published in Science of Computer Programming. Let's not worry about the problem; it is just to show that, again, I had not seen the solution. These are two separate routines with a lot of nested loops; this is the published solution. And this is the solution we got when we solved it using the methodology, following one particular approach. It is as different from this as it can be. These are two preprocessing routines, the L fill and the R fill, and this is the final code, which is fundamentally different from what we did, without any preprocessing. So, anyway, that's that part of it. So, this is
just some motivational examples. What is the research agenda here? Many of you want to know. Essentially, what I talked about is this calculational style of correct by construction programming, and we said it helps in construction and gives more opportunity to explore alternative solutions. But, as we discussed, one of the problems is that this is manual. What if you made an error, like it was asked? You would like a sense of certainty. If you are giving me a scientific method, I would like to know my answer is correct. I can recheck it, but it would be nice to have more than that. There is also drudgery involved, because a lot of obvious things that you can see through also have to be formally proved. So, this is the work. This system has been built by my PhD student Deepak, who is here, and the goal was to automate the mundane tasks and leave the creative tasks for the user: to provide a framework. Since we don't have much time, I will quickly run through this. If that previous derivation that I showed you was to be done in a tool, then you would need to organize it. Essentially, you have to think about what the programs are, and here we are just doing pure formula manipulation, so you have to switch between program mode and formula mode. Then you have these tactics to apply, which will do the transformations. You step into something, go back out, and sometimes this may not work out, so you may have to backtrack, take a different branch, and so on. Okay, so this is what the tool looks like. Essentially, there is a panel on the side where you can choose which tactic to apply, and for the chosen tactic you can supply the parameters. The important point about the tool is that the program is not a flat text here; this is a hierarchical system. After a lot of experimentation and thought, we have decided to use this hierarchical system where direct editing is not allowed. You are saying
the program has to be correct by construction at all points. If you arbitrarily change one part of the program, then obviously your whole program will break down. So, you are only allowed to change the program by applying a tactic. The system will check whether the tactic application is valid, and it will reject your tactic application if it is not valid. With each subprogram there are associated pre- and postconditions, and of course there are unknown program fragments, which also have associated pre- and postconditions. So, your whole program is in some sense a tree, a hierarchical tree. Then, of course, there are a lot of program transformation techniques like this one; we saw it many times. The important point here is that there are certain applicability conditions: this program can be translated into this one provided P implies R1. The system integrates with a theorem prover to verify this; otherwise it will say, sorry, P does not imply R1, and therefore this transformation will be rejected. Then you have to step into subformulas. A lot of formula manipulation facilities are provided here, and of course you have to extract the context properly, so there is a lot of interesting research work to be done here. And then, of course, sometimes, like we said, you do not want to go through the detailed formal thing. You may just want to guess the thing without doing the detailed calculation. You say, oh, I know x equal to 0 will do the task. So, you are allowed to do that, and the system will then verify it if it can. Of course, the verification may fail even though your guess is correct, because the system is not powerful enough; not all theorems can be verified automatically. Similarly, the formula manipulation can be verified. There are a lot of other things in the system: you can hide some of the annotations and go into a minimal view, and you can capture the derivation history, where you branched, what the different techniques were,
and all of that. So, that is all about the GUI. Then, of course, there is a part I won't discuss: a lot of research has gone into the theoretical foundations of the system. What kind of transformation rules are to be provided? How do you extract a context? Sometimes the formula manipulation rules change. Then there are other things like assumption propagation. When you are doing this manually, you are making a lot of jumps; when you try to formalize it, a lot of challenges come up. As I said, s is introduced as a new variable. Where is that variable supposed to be declared? When we say it is going to be part of a loop invariant, how do we propagate it? So, this is a theory of assumption propagation that we have come up with. Anyway, we can ignore this; this is the whole set of program transformation rules. These are the four publications that are available from my web page and the CAPS web page. The theoretical foundations are described here, and then the IDE design and the teaching experience. Okay, thank you.