Thank you. Let me remind you where we are. We looked at relational algebra and relational calculus. Relational algebra starts with some relations and then uses five basic operations: union, difference, Cartesian product, projection, and selection.
And then the first result that we saw was that algebra and calculus have the same expressive power, appropriately understood. And appropriately understood means that you have to look at the domain-independent formulas of first-order logic. Then it is true that relational algebra and the domain-independent fragment of relational calculus have identical expressive power. I also mentioned that testing whether a given first-order formula is domain independent is an undecidable problem, but we have an effective syntax that captures domain independence.
Then we looked at three fundamental decision problems for a query language, and in particular for relational calculus. The three problems are the equivalence problem, the containment problem, and the query evaluation problem. In the case of relational calculus, both equivalence and containment turn out to be undecidable. This is a very easy corollary of Trakhtenbrot's theorem, the undecidability of finite validity. We also mentioned that the query evaluation problem, which is just a model checking problem, is PSPACE-complete. In fact I didn't have to do any work for this; it was already covered in the lectures in the morning. So here are these problems again, by way of reminding you.
The query equivalence problem contains logical equivalence as a special case, namely the case when the queries are Boolean. In general you have two queries of the same arity and you ask: is it the case that on every database (and database always means finite database) they give you the same answer? In particular, in the case of Boolean queries, it means: are they true on precisely the same databases? Query containment means that the result of evaluating one query on a database is contained in the result of evaluating the other, and you want this to happen on every database.
Then we took a closer look at the query evaluation problem, which again was covered in the lectures yesterday morning. The query evaluation problem has two objects as input, the query and the database. You get two families of problems by fixing one of the two inputs. If you fix the query you talk about data complexity: you have one decision problem for every query, and there the complexity drops to LOGSPACE. Query complexity, or expression complexity, is where you fix the database and you let the query vary as the input, and then the problem can be PSPACE-complete.
So, coming from a database point of view, this really tells you that the problems you are interested in have very high complexity, and in fact can even be undecidable. Certainly query equivalence is something that you would care about: someone gives you two queries and you want to test whether or not they do the same thing. So this motivated the question: are there fragments of relational calculus, fragments of first-order logic, for which containment, equivalence and evaluation are easier than in the full case? And in fact there is a very nice language, the language of conjunctive queries, and you are all familiar with this language.
I'll give you different ways to think about conjunctive queries. The advantage of, and the interest in, focusing on this class of queries is that they capture the most frequently asked queries against real database systems. So this is the motivation for looking at this class of queries.
Formally speaking, a conjunctive query is a query definable by a formula of first-order logic that has this very simple syntax: a bunch of existential quantifiers applied to a conjunction of atomic formulas. By atomic formulas I mean here literally positive formulas, a conjunction of atoms of this form. Some of the variables are quantified out, some are free, and on every database the query returns the set of all k-tuples that satisfy this expression. By the way, this is a domain-independent calculus expression, so I don't have to worry about the universe. Active domain semantics works perfectly well here.
But there is another way to think of these queries. These are precisely the queries that are definable in relational algebra (remember the five operations) as projections of selections of Cartesian products, where the selection is very, very simple: the condition is a conjunction of equalities. Where does the conjunction of equalities come from? Well, basically, some atoms may share the same variable, right? That corresponds to an equality between the variables. And if you want to go to SQL, this is again a very natural class of expressions you can write in SQL. This is SELECT-FROM-WHERE, where in the WHERE clause (remember, the WHERE clause is the selection condition) you have a conjunction of equalities. So it's a very natural fragment to consider.
There is another way to write them, which will be of interest to us when we move to Datalog in the second half of this presentation: we can write them as rules of a logic program.
A logic programming rule is something like this, where the conjunctions are replaced by commas and the existential quantification has disappeared. It is implicitly denoted, because you have some variables that occur on the right-hand side but not on the left-hand side. These are assumed to be existentially quantified. Typically, when we write a query like this, the right-hand side is called the body of the query and the left-hand side is called the head of the query. And, as I said, the missing variables are existentially quantified and the comma stands for conjunction.
So let's take this syntax and look at some examples. Path of length 2: all the pairs connected by a path of length 2. In calculus you would write it like this: there exists z such that E(x, z) and E(z, y). As an algebra expression you would write it like this. As a rule you would write it like this. The existential quantification is suppressed. You infer it because z is a variable that occurs in the body but not in the head. And similarly, a cycle of length 3. It's a Boolean query that says there is a cycle of length 3. You would write it like this as a conjunctive query, or like this as a rule. It's perfectly fine to have heads with no variables. This simply means that all the variables in the body are existentially quantified. So these are just different syntaxes for the same type of object.
Let me point out that every relational join that we saw yesterday is a conjunctive query. In fact it's a conjunctive query in which no variables are existentially quantified: all the variables occurring on the right occur on the left. So it's a quantifier-free conjunctive query, if you will. And that's how you would write it in SQL. If you talk to database practitioners, they call conjunctive queries SPJ queries: select-project-join.
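The two examples just mentioned, path of length 2 and cycle of length 3, can be sketched in a few lines of Python. This is only an illustration: the relation name `E` follows the lecture, while the edge data and the rule head names `P` and `Q` are made up for the example.

```python
# Evaluate the conjunctive query  P(x, y) :- E(x, z), E(z, y)
# on a small edge relation. The data below is illustrative only.
E = {("a", "b"), ("b", "c"), ("c", "d")}

# Join E with itself on the shared variable z, then project onto (x, y).
# The shared variable is enforced by the equality test z == z2.
path2 = {(x, y) for (x, z) in E for (z2, y) in E if z == z2}

# Boolean query  Q :- E(x, y), E(y, z), E(z, x): is there a cycle of length 3?
# A rule with an empty head returns just true or false.
has_triangle = any(
    y == y2 and z == z2 and x == x2
    for (x, y) in E
    for (y2, z) in E
    for (z2, x2) in E
)

print(sorted(path2))   # [('a', 'c'), ('b', 'd')]
print(has_triangle)    # False
```

The comprehension mirrors the algebra reading of the query: a Cartesian product, a selection by equalities, and a projection.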
Although, for the "select", you have to understand that we are only talking about conjunctions of equalities. So these are known as SPJ queries.
So now let's look at our fundamental problems for conjunctive queries. For the time being I'll suppress the equivalence problem. I'm only going to look at conjunctive query evaluation and conjunctive query containment. These problems simply mean the following. Evaluation: given a conjunctive query and a database, find the result of evaluating the query on the database. Containment: is it true that on every instance I, Q1(I) is contained in Q2(I)? In the case of Boolean queries, this is logical implication between two existential positive sentences without any disjunctions at all.
Remember that query evaluation for the full calculus or algebra was PSPACE-complete, and query containment is undecidable. Have we gained anything by going to this small fragment? The answer is yes. There is a very nice result, and I'm going to give you a complete proof of it. It's actually not a difficult result. It's much, much easier than anything you've seen in the lectures of my colleagues here. But it is a very important result, by Chandra and Merlin from 1977: in some sense, conjunctive query evaluation and conjunctive query containment are the same problem. This is very different from the full algebra. There, one was decidable and the other was undecidable. Here they are the same problem, and they are both NP-complete. And of course the question is: what is the common link?
By the way, Chandra here is the same Ashok Chandra whose name we have seen mentioned, a graduate of IIT Kanpur. We saw his name yesterday morning with alternation, right, in Ram's presentation. We saw it in Anuj's talk: this is Chandra-Harel, the Chandra-Harel paper with the open problem of a logic for PTIME. So he also had this paper in STOC '77 with Merlin. He was at IBM at the time.
Today he's at Microsoft. So what is the common link? The common link is the so-called homomorphism problem. I'm going to introduce this homomorphism problem and just scratch the surface of this beautiful problem.
Let me remind you what a homomorphism is. You can think of it as a relaxation of the concept of isomorphism between two relational structures. So you have two databases, or two relational structures. A homomorphism, in our case of databases, is a function from the active domain of the one to the active domain of the other that preserves membership in relations from left to right. Meaning that if you have a tuple that belongs to some relation of I, then the image of the tuple under this homomorphism is a tuple that belongs to the corresponding relation in J. So it relaxes isomorphism in two ways. It does not have to be one-to-one and onto. And also the preservation is only from left to right. It's not an if and only if.
An example which I will use to illustrate some of the concepts, techniques and results here is that a graph is 3-colorable if and only if there is a homomorphism from the graph to K3, where K3 is the triangle, the clique with three elements. In fact, this is a very strong if and only if, in the sense that the 3-colorings of a graph are precisely the homomorphisms from the graph to the triangle. So 3-colorability is a special case of what we will call in a minute the homomorphism problem.
All right. So here is the homomorphism problem. Given two database instances, is there a homomorphism from one to the other? We will use the notation I → J to denote that such a homomorphism from I to J exists. One thing we can immediately identify is the exact complexity of the homomorphism problem. It is NP-complete. It certainly is in NP, because we guess a function and verify that it preserves membership from left to right. But it's also NP-hard.
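The definition of homomorphism and the 3-colorability example can be sketched as a brute-force search. This is a minimal illustration, not an efficient algorithm: the function names `homomorphisms`, `active_domain` and `symmetric`, and the sample graphs, are assumptions made for the example. Graphs are sets of directed pairs, so an undirected graph lists each edge in both directions.

```python
from itertools import product

def active_domain(edges):
    """All elements that occur in some edge."""
    return sorted({v for e in edges for v in e})

def homomorphisms(I, J):
    """Yield every homomorphism from graph I to graph J.

    A function h qualifies if every edge (u, v) of I is mapped to an
    edge (h(u), h(v)) of J. Preservation is left to right only.
    """
    dom_I, dom_J = active_domain(I), active_domain(J)
    # "Guess" every possible function and verify it: exponential time.
    for image in product(dom_J, repeat=len(dom_I)):
        h = dict(zip(dom_I, image))
        if all((h[u], h[v]) in J for (u, v) in I):
            yield h

def symmetric(edges):
    """Add the reverse of every edge (undirected graph)."""
    return set(edges) | {(v, u) for (u, v) in edges}

K3 = symmetric({(0, 1), (1, 2), (0, 2)})                       # triangle
C5 = symmetric({(i, (i + 1) % 5) for i in range(5)})           # 5-cycle
K4 = symmetric({(i, j) for i in range(4) for j in range(i + 1, 4)})

# G is 3-colorable iff there is a homomorphism from G to K3.
print(next(homomorphisms(C5, K3), None) is not None)  # True
print(next(homomorphisms(K4, K3), None) is not None)  # False
```

Each dictionary yielded by `homomorphisms(G, K3)` is one 3-coloring of `G`, which is the "strong if and only if" mentioned above.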
I just showed you why it's NP-complete. In fact, it's NP-complete even if we fix the right-hand side, right? For a fixed J: G is 3-colorable if and only if there is a homomorphism from G to K3. What is 2-colorability? 2-colorability is the existence of a homomorphism to K2, just the edge. And k-colorability is the existence of a homomorphism to the clique with k elements. Okay, so we know the homomorphism problem, and we know that the homomorphism problem is NP-complete.
Here is an exercise that is tied to something that Anuj was saying yesterday when he asked the question, I think: is 3-SAT NP-complete under first-order reductions? Of course you have to think about how you're going to code satisfiability as a class of finite structures. Well, if you do that, then a follow-up exercise is to show that 3-satisfiability is a special case of the homomorphism problem. The only difficulty is coming up with the right encoding of 3-SAT.
All right, so the homomorphism problem is a fundamental algorithmic problem. I just showed you that colorability is a special case. Satisfiability is a special case. Planning and many other problems in AI can be viewed as special cases of the homomorphism problem. In fact, there is a very important paper by Feder and Vardi back in 1993 in which they argued that every problem in constraint satisfaction, which is a large class, a large algorithmic paradigm in artificial intelligence, can be viewed as a special case of the homomorphism problem. And in fact, I think Moshe Vardi, tomorrow in his invited talk at the conference, will talk about aspects of constraint satisfaction and its connection to homomorphisms.
So now we know the homomorphism problem. Let's proceed and explain what is behind the Chandra-Merlin theorem. It is a rather surprising situation. Remember, evaluation and containment are very different problems for full first-order logic. You go to this fragment, and they are the same problem.
And the fact is that they are equivalent to the homomorphism problem, but I need a little bit of machinery to show you this connection. For this, we need to bring into the picture canonical conjunctive queries and canonical database instances. This is very simple. What I hope to do in the next five minutes is to make you able to do the transition: every time you see a conjunctive query, you also see a database, and every time you see a database, you also see a conjunctive query. So it's like a magic picture. You look at the database: oh, it's a conjunctive query. You look at the conjunctive query: oh, it's a database. So let's see what's going on. It's very, very simple.
Let's start with a database. With every database we are going to associate a Boolean query. It's going to be a Boolean conjunctive query, which we are going to call the canonical conjunctive query of the database. So what is this? Every element of the active domain becomes a variable. And then the facts of the database, which are simply the tuples that belong to some relation, become the atomic formulas, the conjuncts of the conjunctive query. So concretely, if your database had just these three pairs in a binary relation E, namely E(a, b), E(b, c), E(c, a), then the canonical conjunctive query, which I wrote as a rule, would say E(x, y), E(y, z), E(z, x), which really means that there exists x, there exists y, there exists z, such that we have E(x, y) and E(y, z) and E(z, x). So we look at the database, and we write down the query. Logicians know this: it's the positive atomic diagram of a structure. That's all there is to it. But now it is written as a formula in this little fragment of first-order logic. So from a database we can go to a conjunctive query. Here are more examples.
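The passage from a database to its canonical conjunctive query is mechanical, and a short sketch may make that concrete. The representation below is an assumption for illustration: a fact is a pair (relation name, tuple of elements), and the helper name `canonical_query` and the variable names `x1, x2, ...` are made up.

```python
def canonical_query(facts):
    """Return the canonical Boolean conjunctive query of a database,
    written as a rule with an empty head.

    Every element of the active domain becomes a variable, and every
    fact becomes an atom (a conjunct) over those variables.
    """
    facts = sorted(facts)  # fixed order, so the output is deterministic
    elements = sorted({a for (_, args) in facts for a in args})
    var = {a: f"x{i}" for i, a in enumerate(elements, start=1)}
    atoms = [
        f"{rel}({', '.join(var[a] for a in args)})" for (rel, args) in facts
    ]
    return "Q :- " + ", ".join(atoms)

# The three-fact database from the lecture: E(a, b), E(b, c), E(c, a).
db = {("E", ("a", "b")), ("E", ("b", "c")), ("E", ("c", "a"))}
print(canonical_query(db))  # Q :- E(x1, x2), E(x2, x3), E(x3, x1)
```

The reverse direction needs no code at all in this representation: the body of a conjunctive query, read with its variables as elements, already is a set of facts.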
This is a database which is basically the complete graph with three nodes, and the canonical conjunctive query really says there are three nodes and, for every two different variables, we have all the possible connections. So we go from the database to the canonical conjunctive query.
We can play the reverse game. We can start with a conjunctive query and view it as a database. Namely, if we have a query like this, we simply view the variables as elements of the active domain and we populate the database with the conjuncts of the query. So that's what I meant before: whenever you see a database, I want you to think of a conjunctive query, and whenever you see a conjunctive query, to think of a database.
Now, I wasn't planning to say this, but I was inspired by Anuj's presentation today. Anuj spent some time describing very nicely the treewidth of a structure, right? Well, there is a connection between the treewidth of a structure and canonical conjunctive queries. This goes back to some work that Moshe Vardi and I did, and then we had a paper with Victor Dalmau. So this is from a paper by Victor Dalmau, myself and Vardi back in 2002, where the following thing comes out: a database, or a relational structure, has treewidth less than k if and only if its canonical conjunctive query, which we call here Q^I, can be written with at most k variables. So if you don't like tree decompositions, and if you don't like the cops-and-robbers game, and you come from logic, this is, if you will, the logician's description of treewidth.
Now, what do I mean by "can be written with at most k variables"? I wrote the conjunctive queries in prenex normal form, right? But you can use fewer variables by nesting quantifiers and reusing some variables, right? So all you allow in the syntax is atomic formulas.
You allow atomic formulas, conjunctions and existential quantification, but you are allowed to nest them and reuse variables, right? So the condition is that you can write the query this way so that the width, so to speak, of the formula is at most k. That's exactly what it means to have treewidth less than k. So there is this tight connection between treewidth and the expressibility of canonical conjunctive queries in this way. But that's an aside. I just wanted to mention it to connect to what Anuj mentioned today.
So we have canonical queries and canonical instances. Here is another example. If we start with this conjunctive query, then we get the corresponding canonical instance. And there is a very, very basic fact which should be obvious. The obvious fact is that if you have a conjunctive query, then its canonical instance satisfies the query, right? I mean, that holds pretty much by definition. As Dana Scott once said, it is a self-proving kind of theorem, right?
All right. So now I'm almost ready to give you the proof of the Chandra-Merlin theorem. It uses a very simple lemma. When I teach this to my students, I call it the magic lemma, just to get them excited, but there is nothing really magic about this lemma other than trying to catch your attention. The magic lemma, which I'll state here, says that the following are equivalent for a conjunctive query Q and an instance J. One, J satisfies the query. Two, there is a homomorphism h from the canonical instance of the query to J. This is really nothing else but the semantics of first-order logic when you come to this simple fragment of conjunctive queries. Why is that? Well, let me argue informally here. Suppose that J satisfies the query. The query is of this form: there exists x1, ..., there exists xm, applied to a conjunction of atoms, right?
Therefore, by the semantics of first-order logic, there are elements, say a1, ..., am, in the domain such that when you plug them into the formula, the formula becomes true. But this really means that the function that assigns to every variable xi the value ai is a homomorphism from I^Q to J. Just follow the definition. And vice versa: if there is a homomorphism from I^Q to J, remember that I^Q is the instance you get from the query, right? So its active domain consists of the variables. So now we have a way to instantiate the variables and make the query true: we can use the values of the homomorphism as the witnesses for the existential quantifiers. So this is not a deep lemma; it's just the semantics of first-order logic.
And now, with this, I'm ready to show you the proof of the Chandra-Merlin theorem. In fact, you can state the Chandra-Merlin theorem in two different equivalent forms. One is the dual of the other. I'll state it first for Boolean conjunctive queries, and then we will consider queries with positive arity. So let me write it here: Chandra-Merlin, 1977. It says that if Q and Q' are conjunctive queries, then the following are equivalent. One, Q is contained in Q'. Two, there exists a homomorphism h from I^Q' to I^Q. In other words, the canonical instance of the bigger one can be homomorphically mapped into the canonical instance of the smaller one. The directions cross; there is nothing paradoxical about that, actually. And three, I^Q, the canonical instance of Q, satisfies Q'. The dual form is where you start with the instances.
And you ask yourself: is it the case that there is a homomorphism from I to I'? In other words, do we get a yes answer to the homomorphism question? This is equivalent to I' satisfying the canonical conjunctive query of I, and it's equivalent to Q^I' being contained in Q^I.
Now this is interesting, right? Because you take this homomorphism problem, which I showed you is very rich (you can do satisfiability, colorability, planning, in fact all of constraint satisfaction), and by this theorem it becomes a database problem, right? Conjunctive query containment or conjunctive query evaluation.
Notice that there is something else that's interesting about this theorem, before getting into the proof. If you look at the first statement, I give you two queries and I ask you: is it the case that Q is contained in Q'? What kind of statement is this? It is a statement about all databases, right? This is a big universal quantifier: for all I. But look at the second statement. This is now an existential statement: there is a homomorphism, right? It sounds a little bit like the completeness theorem. What is true on all structures is exactly what has a proof, right? That's the essence of the completeness theorem. We turn a universal quantifier into an existential quantifier. So this has a little bit of that flavor. Well, that's why we're getting NP-completeness. That's exactly right. On the other hand, at the logical level, the second statement is also second order. In the first statement we are quantifying over all databases, right? But the point is that those form an infinite set, while the functions between two canonical instances form a finite space. So we are, in fact, replacing a search over an infinite set by a search over a finite set.
Notice also that this is interesting in its own right, because to verify the statement "Q is contained in Q'" you have to test, for every database I, that Q(I) is contained in Q'(I). The theorem tells you it's enough to test it on only one database, right?
So again, we reduced a complicated test to something simple.
All right. So now we are ready to prove the homomorphism theorem, and we are going to use the magic lemma. So let me ask you: how many times do you think we're going to use the magic lemma in this proof? It's 1 implies 2, 2 implies 3, 3 implies 1. So how many times should we use it? One? Any other guess? Three? Five. All right, so let's see why.
Let's prove 1 implies 2. Assume that Q is contained in Q'. This means that on every database you have the containment. In particular, look, we had this trivial fact that the canonical database of Q satisfies Q, right? Therefore, the canonical database of Q satisfies Q', right? Ah, by the magic lemma, it means there is a homomorphism from I^Q' to I^Q. I'll be using the magic lemma with different substitutions for J and Q, of course, right?
2 implies 3 is again an application of the magic lemma, in the other direction, right? If there is a homomorphism from I^Q' to I^Q then, by the magic lemma, it follows that I^Q satisfies Q'. So we've used it twice.
So let's now prove that 3 implies 1. Assume that I^Q satisfies Q'. By the magic lemma, there is a homomorphism h from I^Q' to I^Q. That's the third application, and that's what we have. Now we have to prove that Q is contained in Q'. There is no other way but to start with some J that satisfies Q and prove that J satisfies Q'. So start with a J that satisfies Q. Well, by the magic lemma (that's the fourth application) there is a homomorphism h' from I^Q to J. But now look: we have a homomorphism from I^Q' to I^Q and a homomorphism from I^Q to J. There is a nice thing about homomorphisms: they compose. That's obvious from the definition. So the composition gives us a homomorphism from I^Q' to J. So by the magic lemma (that's the fifth application) we get that J satisfies Q'. So basically, this was really all that was in the Chandra-Merlin theorem.
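The theorem turns containment of Boolean conjunctive queries into a single homomorphism test, which can be sketched as follows. This is an illustrative sketch under assumptions that are not in the lecture: a query is represented by the set of atoms in its body, an atom is a pair (relation name, tuple of variables), and the names `find_homomorphism` and `contained_in` and the two sample queries are chosen for the example. The search is a plain backtracking search, exponential in the worst case, as expected for an NP-complete problem.

```python
def find_homomorphism(source, target):
    """Backtracking search for a homomorphism between two sets of facts.

    Returns a dict mapping elements of `source` to elements of `target`,
    or None if no homomorphism exists.
    """
    source = sorted(source)

    def extend(i, h):
        if i == len(source):
            return h  # every fact of the source has been mapped
        rel, args = source[i]
        for rel2, args2 in target:
            if rel2 != rel or len(args2) != len(args):
                continue
            h2 = dict(h)
            # Map the i-th fact onto this target fact, provided that is
            # consistent with the choices made so far.
            if all(h2.setdefault(a, b) == b for a, b in zip(args, args2)):
                result = extend(i + 1, h2)
                if result is not None:
                    return result
        return None

    return extend(0, {})

def contained_in(q, q_prime):
    """Q is contained in Q' iff the canonical instance of Q' maps
    homomorphically into the canonical instance of Q.

    The body of a query, with variables read as elements, is its
    canonical instance, so no conversion is needed.
    """
    return find_homomorphism(q_prime, q) is not None

# Hypothetical example queries (Boolean):
#   triangle :- E(x, y), E(y, z), E(z, x)
#   path2    :- E(x, y), E(y, z)
triangle = {("E", ("x", "y")), ("E", ("y", "z")), ("E", ("z", "x"))}
path2 = {("E", ("x", "y")), ("E", ("y", "z"))}

print(contained_in(triangle, path2))  # True
print(contained_in(path2, triangle))  # False
```

Note the crossing of directions: to test whether `triangle` is contained in `path2`, the search runs from the canonical instance of `path2` into that of `triangle`.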
In 1977, you could get a STOC paper out of this; not today, right? The proof looked more complicated there, but it's nice. We can fit it in one slide after we have built this machinery. This is a very simple result, but extremely, extremely useful, as you will see. I mean, we will see many applications of this as we go along. And I hope it's clear. Any questions about this?
All right. So here is an example. These are two queries. These are conjunctive queries. Well, look, I claim that Q is contained in Q'. Q is contained in Q' because there is a homomorphism. By the way, I don't know if you see what the canonical instance of Q' is. Do you see? This is a cycle with four elements, right? So I'm just 2-coloring this cycle with four elements. That's my homomorphism. So there is a homomorphism from I^Q' to I^Q, because I^Q' is the cycle with four elements and it has a homomorphism to the first one, which is just the edge, right? So there is this homomorphism, and the other way around it's very easy to see there is an obvious homomorphism from I^Q to I^Q', namely the identity. So the two together tell you that these two queries are actually equivalent, right? Because we have containment going both ways. Of course, you could reason about it in a different way, but ultimately you'll have to use something like this. It's just an illustration of the homomorphism theorem.
We saw before that a graph is 3-colorable if and only if there is a homomorphism from the graph to K3. But now we can use the Chandra-Merlin theorem to give equivalent database descriptions of this. It means that K3, the clique with three elements, satisfies the canonical conjunctive query of the graph. And of course this also means that the canonical conjunctive query of K3 is contained in the canonical conjunctive query of G. So really this tells you that you can take DB2 or Oracle and use it to do 3-colorability. Not a good idea. Why is that?
Because look, see what happens here. Look at this one. You have a tiny little database. Then there comes this big graph. So you get a huge conjunctive query with as many conjuncts, as many elements in the join, as there are edges, right? And that's not what database systems are good at doing. They are used with relatively small queries and huge databases, right? But in principle you could use a database engine as a solver to do colorability and, of course, to do satisfiability testing. By the way, this answers your question about uniqueness. I mean, here you have lots and lots of homomorphisms, in fact as many as there are colorings.
Okay. What happens if the queries are not Boolean, if they have positive arity? The homomorphism theorem goes through, but with a slight modification. Let's assume that the queries have the same three variables in the head. Then containment means there is a homomorphism from the canonical database of the bigger query to the canonical database of the smaller one which maps the variables of the head to the variables of the head, one by one. So the homomorphism cannot move these variables around; it respects them. And this is the same as saying that, since Q' now has three free variables, I^Q together with these elements satisfies Q'. So it goes through. It's a little easier to do it for Boolean conjunctive queries, and then you can play with applications. For instance, you can prove that if you take these binary conjunctive queries Q and Q', then Q is contained in Q', simply because you have a homomorphism from I^Q' to I^Q, and here is the homomorphism, which you can verify. So that illustrates the homomorphism theorem.
So now we have of course made progress in understanding the combined complexity of conjunctive query containment and conjunctive query evaluation, because we now know that the following problems are NP-complete.
Given two Boolean conjunctive queries, is one contained in the other? Given a Boolean conjunctive query and an instance, does the instance satisfy the query? They are the same as the homomorphism problem, by Chandra-Merlin, and therefore, since the homomorphism problem is NP-complete, these are NP-complete.
What about conjunctive query equivalence? This was containment and evaluation. Well, that's very easy to see also, and the reason is the following. It's very easy to see that the following problem is NP-complete: I give you a graph that contains a triangle; is it 3-colorable? That's NP-complete. You just take your graph and add a disjoint triangle. But now, if you have a graph that contains a triangle, then the graph is 3-colorable if and only if its canonical conjunctive query is equivalent to the canonical conjunctive query of the clique K3. So we get that the conjunctive query equivalence problem is also NP-complete.
So here is a picture. This is in some sense a very crisp way to quantify the gain in complexity and decidability that we obtained by lowering our expressive power. We gave up on union, we gave up on difference. We just kept projection, Cartesian product, and selection, and very careful selections at that: just conjunctions of equalities. So our language became more limited, but we replaced undecidability by NP-completeness for both equivalence and containment. What about query evaluation? Well, the gain of course is in the combined complexity, right? There it dropped from PSPACE-complete to NP-complete. For data complexity we are still in LOGSPACE, because we had LOGSPACE even for the full calculus. So the lesson is that, yes, you can get better behavior for these three fundamental database problems if you make your language smaller, right? I mean, that's the lesson.
Having seen this, let me not say much about the following, because Vardi is probably going to talk a lot about it. The combined complexity is, yes, decidable, but it's NP-complete. It's not as bad as PSPACE, but it's still NP-complete. A lot of work has gone into finding tractable cases of combined complexity. This started with a very nice paper on acyclic joins by Mihalis Yannakakis in VLDB 1981, where the idea was to start putting structural conditions on the conjunctive query. Now, since conjunctive queries are really instances, as we just saw, you can think of these as structural restrictions on the instances. This led to many, many extensions. For instance, if your instances have bounded treewidth, then you also get tractable behavior. There are generalizations to something called bounded hypertree width, and there is extensive interaction between constraint satisfaction, logic and graph theory that goes into this area. I could give a whole course on this, and I have in the past, but I will just move on.
So we saw before that this was the gain from restricting the language. Now we can go back and look at what we gave up. We gave up these two operations, union and difference, right? And also, in the selections, we kept only equality, right? Let's try to put some of these operations back and see what happens to the complexity.
Okay, so probably the most natural thing to do next would be to consider relational algebra expressions that also allow union. In other words, we are now talking about unions of conjunctive queries. A union of conjunctive queries is simply a disjunction of conjunctive queries: a disjunction of these simple existential formulas, where each of them is a conjunctive query. And there is another way to think of this. Namely, if you go back to the algebra, you can ask: what am I going to get if I close my relations under the operations of union, Cartesian product, projection, and selection with equality conditions?
The union of conjunctive queries you can think of as being a normal form, right? Like a sum of products. The second one is what I will call, for the purposes here, a monotone query. Sometimes people call it a positive query. So again, a union of conjunctive queries is a disjunction of conjunctive queries. A monotone query is an expression of the algebra that uses only union, Cartesian product, projection, and selection with equality conditions. Clearly every union of conjunctive queries is a monotone query. The converse is also true: if you have an expression built from these operations (union, projection, Cartesian product, selection with equality only), you can transform it into a union of conjunctive queries, but there is a blow-up in general. It's the same blow-up as when you take something in conjunctive normal form and translate it to disjunctive normal form. So monotone queries are precisely the queries expressible in first-order logic using conjunction, disjunction and existential quantification.
Here is an example of this transformation. This monotone query is a join, and remember the join is really a conjunctive query, applied to a union. It is equivalent to this union of conjunctive queries, but you can see we have made it much bigger. In terms of expressive power they are the same. We are going to see an interesting trade-off now as we go along. So is everyone clear? We are talking about the same class of queries, but with two different syntactic ways to express them.
Let's first talk about unions of conjunctive queries. Sagiv and Yannakakis in 1981, the same year that Yannakakis had his paper in VLDB, had another paper in VLDB where among other things they had this nice result. It's a simple result. I'll show you the proof. It settles the containment problem for unions of conjunctive queries, and it says the following thing. Suppose I have two finite unions of conjunctive queries, q1 ∪ ... ∪ qm and q1' ∪ ... ∪ qn'. Statement one: the first is contained in the second.
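The blow-up in going from a monotone query to a union of conjunctive queries can be made concrete with a toy normalizer. This is a minimal sketch under simplifying assumptions that are not from the lecture: expressions are nested tuples tagged `"and"` (join) or `"or"` (union), atoms are opaque strings, and projection is left out because existential quantification simply distributes over union. The names `to_ucq` and `join_of_unions` are hypothetical.

```python
from itertools import product

def to_ucq(expr):
    """Flatten a monotone expression into a list of conjunctive queries.

    Each conjunctive query is returned as a frozenset of atoms.
    """
    if isinstance(expr, str):
        return [frozenset([expr])]
    op, *args = expr
    parts = [to_ucq(a) for a in args]
    if op == "or":
        # A union of unions is just one longer union.
        return [cq for part in parts for cq in part]
    if op == "and":
        # Distribute the join over the unions: pick one disjunct per argument.
        return [frozenset().union(*choice) for choice in product(*parts)]
    raise ValueError(f"unknown operator: {op}")

def join_of_unions(n):
    """A join of n two-way unions: size linear in n as a monotone query."""
    return ("and", *[("or", f"R{i}(x{i})", f"S{i}(x{i})") for i in range(n)])

for n in (1, 2, 3, 10):
    print(n, len(to_ucq(join_of_unions(n))))
```

The monotone expression has 2n atoms, while the normal form has one conjunctive query for every way of choosing a disjunct in each of the n unions, which is the same effect as converting conjunctive normal form to disjunctive normal form.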
This holds if and only if statement two holds: for every i on the left, from 1 up to m, there is a j on the right, from 1 up to n, so that qi is contained in qj'. Intuitively, it is obvious that if two holds then one holds, right? That's the most blatant way we can have containment. This theorem tells you that, when you talk about unions of conjunctive queries, the most blatant way that containment can take place is the only way it can take place.
And the proof is now a two-line proof if you use the Chandra-Merlin theorem. Let's see why. I'm going to prove only that one implies two; I argued that two implies one is obvious. We're going to use the homomorphism theorem. So we assume one holds. We want to prove that each qi is contained in some qj'. So let's take one of them. Take qi. Well, we know one database that satisfies qi: it's the canonical database. So the canonical database of qi satisfies qi. Therefore it must satisfy the right-hand side. Therefore it must satisfy one member of the union, some qj'. But now the canonical database of qi satisfies qj', so by the homomorphism theorem qi must be contained in qj'. So that's a trivial proof using the Chandra-Merlin theorem, and that's the power of this result. The other direction is obvious. Any questions about that? Very simple.
But now look at what this gives us. It tells us that the query containment problem for unions of conjunctive queries is NP-complete. Is it in NP? The membership follows because now we can take all these homomorphisms and do one big guess: we can combine polynomially many guesses into one big polynomial guess. So for every query on the left we guess some query on the right and a homomorphism, and then we verify. That is, we guess the pairs and verify that for every i less than or equal to m the guessed function is a homomorphism from one to the other. And of course the NP-hardness follows from the previous case: the problem is NP-hard even in the special case of conjunctive queries.
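The theorem turns containment of unions into a collection of pairwise conjunctive query containments, each of which is a homomorphism test. Here is a small brute-force sketch for Boolean queries; the representation (a query is a list of atoms `(relation, (variables...))`) and the names `cq_contained` and `ucq_contained` are assumptions made for this illustration, and the deterministic search stands in for the nondeterministic guess.

```python
from itertools import product

def cq_contained(q, q_prime):
    """Boolean CQ containment q ⊆ q' by the homomorphism theorem.

    Looks for a homomorphism from q' into the canonical database of q,
    whose facts are the atoms of q with variables read as constants.
    """
    facts = set(q)
    domain = sorted({t for _, args in q for t in args})
    variables = sorted({t for _, args in q_prime for t in args})
    for image in product(domain, repeat=len(variables)):
        h = dict(zip(variables, image))
        if all((rel, tuple(h[t] for t in args)) in facts for rel, args in q_prime):
            return True
    return False

def ucq_contained(left, right):
    """Every disjunct on the left must be contained in some disjunct on the right."""
    return all(any(cq_contained(qi, qj) for qj in right) for qi in left)

# Hypothetical example queries over a single binary relation E.
path2 = [("E", ("x", "y")), ("E", ("y", "z"))]  # there is a path of length 2
edge = [("E", ("u", "v"))]                      # there is an edge
loop = [("E", ("w", "w"))]                      # there is a self-loop

print(ucq_contained([path2, loop], [edge]))
print(ucq_contained([edge], [path2, loop]))
```

In the first call each disjunct on the left is contained in `edge`, so the union is contained. In the second call `edge` is contained in neither `path2` nor `loop`, and by the theorem that is enough to rule out containment in their union.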
I will let you verify that the query evaluation problem for unions of conjunctive queries is NP-complete. That's an easy exercise. So this is nice, because it tells us that we could throw in the union and not make things worse than they were without the union. Okay, this looks promising.
So now let's look at the monotone queries. Remember, the monotone queries have the same expressive power as the unions of conjunctive queries, but we saw that the syntax is more compact. Well, Sagiv and Yannakakis had another result in the same paper. They didn't have just the almost trivial theorem I showed you before. They showed that for monotone queries the containment problem is Π₂ᵖ-complete. Basically this is the price you pay to translate the monotone query to the normal form as a union of conjunctive queries. And Π₂ᵖ is at the second level of the polynomial hierarchy: it contains NP and is contained in PSPACE. The prototypical complete problem is for-all-exists: quantified Boolean formulas with two blocks of quantifiers, starting with the universal. So what comes out of this is that when we go to monotone queries, we have the same expressive power and more compact syntax, and we pay a price in query equivalence and query containment. We go from NP-complete to Π₂ᵖ-complete. But nonetheless this is certainly below the full case, which was undecidable. So this gives a pretty good analysis of what happens if we include the union.
We can try to put in a little bit of negation. The most innocent form of negation that you can think of is what happens if you allow inequalities in the selection conditions: not equal, or less than. So what happens if we also allow inequalities? In this case you talk about conjunctive queries with inequalities. For example, this would be a conjunctive query with some inequalities. The first result here is due to Klug, who was a very bright young student but unfortunately died very, very young; this is from 1988.
Klug showed that the query containment problem for conjunctive queries with inequalities is in Π₂ᵖ. And van der Meyden some years later proved a matching lower bound, in PODS 92: it is Π₂ᵖ-hard. So we understand completely that the containment problem for conjunctive queries with inequalities is Π₂ᵖ-complete. The evaluation problem stays at the same complexity as before, NP-complete. I'm not going to show you this. In effect, what you lose here is the homomorphism theorem. You are not really dealing with only one canonical database any more. You can think that these inequalities are describing some partial information about equality types between the elements. And then you have to take all possible databases that extend this type, that complete this type. And that's exactly what makes the complexity go up.
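To see concretely why one canonical database is no longer enough, here is a brute-force sketch restricted to the "not equal" case only (no order comparisons, no constants). It is an illustration under stated assumptions, not the algorithm from the papers mentioned above: a query is a pair `(atoms, inequalities)`, every variable in an inequality is assumed to also occur in an atom, and the function names are hypothetical. The test runs over every way of identifying variables of the left query that respects its inequalities, builds the corresponding database, and evaluates the right query on it.

```python
from itertools import product

def set_partitions(items):
    """Yield every partition of a list as a dict: item -> block representative."""
    if not items:
        yield {}
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield {**part, first: first}        # `first` starts a new block
        for rep in sorted(set(part.values())):
            yield {**part, first: rep}      # `first` joins an existing block

def satisfies(facts, atoms, neqs):
    """Does some assignment make all atoms true in `facts` and all inequalities hold?"""
    domain = sorted({t for _, args in facts for t in args})
    variables = sorted({t for _, args in atoms for t in args})
    for image in product(domain, repeat=len(variables)):
        h = dict(zip(variables, image))
        if (all((rel, tuple(h[t] for t in args)) in facts for rel, args in atoms)
                and all(h[a] != h[b] for a, b in neqs)):
            return True
    return False

def contained_with_neq(q1, q2):
    """Check q1 ⊆ q2 by testing q2 on every completion of q1's equality type."""
    atoms1, neqs1 = q1
    atoms2, neqs2 = q2
    variables = sorted({t for _, args in atoms1 for t in args})
    for part in set_partitions(variables):
        if any(part[a] == part[b] for a, b in neqs1):
            continue  # this identification violates an inequality of q1
        facts = {(rel, tuple(part[t] for t in args)) for rel, args in atoms1}
        if not satisfies(facts, atoms2, neqs2):
            return False
    return True

edge = ([("R", ("x", "y"))], [])
edge_neq = ([("R", ("u", "v"))], [("u", "v")])
path_neq = ([("R", ("x", "y")), ("R", ("y", "z"))], [("x", "z")])

print(contained_with_neq(edge, edge_neq))
print(contained_with_neq(path_neq, edge_neq))
```

In the first example the database with x and y distinct satisfies the right-hand query, but the database with x and y identified does not, so looking at a single canonical database would give the wrong answer. In the second example containment holds, but different completions are witnessed by different assignments, which is where the universal quantifier over completions comes from.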