So, let me tell you about the problem that this talk is about. You've got some formal language, but unlike most real languages, English and so on, it's ordered, not totally, but partially: some elements will be comparable, some not. So you've got a partial order on letters, and that gives you an order on words by the normal dictionary order. Given such a language, how big can sets inside it be such that any two elements of the set are incomparable? Well, obviously the answer in general will be infinity, so to get a good question, we should ask about the part of the language consisting of words of length n. Then, for each n, this will be some finite number, and the question is: how does it grow as n grows? In the theory of partially ordered sets, the size of the biggest anti-chain is called the width; that's the reason for the title of the talk. So the problem is: if I give you a language, and typically we fix some special family of languages, like regular languages or context-free languages, we want to know how the width grows with n. So that's just the lexicographic order written down. And I should remark that for me the dictionary order, the lexicographic order, puts prefixes before their extensions: if a is a prefix of b, then a is below b. Of course, that doesn't actually affect this problem, because we're only interested in words of a fixed length n, but it is more convenient. Okay, so why are you supposed to care about this problem? Well, one reason is that it's a moderately natural generalization of quite a classical problem, which is just: if I give you a language, how many words does it have of length n?
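To pin the setup down, here is a small brute-force sketch. The alphabet, the toy partial order on letters, and all the names are my own illustrative choices, not from the talk: a partial order on letters induces the lexicographic order on words, with prefixes below their extensions, and the width at length n is the size of the largest set of pairwise incomparable words.

```python
from itertools import combinations, product

# Assumed toy partial order on letters: a < b, with c incomparable to both.
LETTER_LE = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "b")}

def word_le(u, v):
    """Lexicographic order induced by LETTER_LE, prefixes below extensions."""
    for x, y in zip(u, v):
        if x != y:
            return (x, y) in LETTER_LE  # decided at the first difference
    return len(u) <= len(v)  # u is a prefix of v (or u == v)

def incomparable(u, v):
    return not word_le(u, v) and not word_le(v, u)

def width(language, n):
    """Largest anti-chain among the length-n words (exponential brute force,
    only for tiny examples)."""
    words = [w for w in language if len(w) == n]
    for k in range(len(words), 0, -1):
        for subset in combinations(words, k):
            if all(incomparable(u, v) for u, v in combinations(subset, 2)):
                return k
    return 0

# Over all 9 words of length 2, a largest anti-chain is e.g. {ab, ac, cb, cc}.
sigma2 = ["".join(p) for p in product("abc", repeat=2)]
print(width(sigma2, 2))  # prints 4
```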
And in particular, something which has been independently published about nine times or so, sometimes for context-free languages, sometimes just for regular languages, is that if you have even a context-free language, then this grows either like an exponential in n or just like a polynomial in n. And our problem generalizes this, because it is just our problem for the partial order that makes all letters incomparable: then every set of words of a fixed length is an anti-chain. Okay, so that's motivation number one. The other motivation is more practical, which is computer security. In real life, you might be running your computer programs on some computer which you share with an untrusted party, and we're interested in controlling how much information can get from you to them about what you're doing, your password or whatever. So the setup is that Alice and Bob are interacting with some shared system, which we'll think of as some kind of finite state machine, and they provide inputs and get outputs. Maybe this is the thing on your computer which allocates memory: you send requests saying, please give me some memory, and it says, here you are, or, no, I don't have any. A long time ago, in a very classic paper in security, Goguen and Meseguer gave a definition of what it means for there to be no information flow from Alice to Bob, and basically it says that what Bob gets to see doesn't depend on what Alice did: whenever Bob provides the same inputs, he gets the same outputs. Okay, and our goal here is to make this quantitative. Maybe it's too much to ask that Alice doesn't affect Bob at all, because if Alice were not there, then Bob could have all the memory to himself, but if she is there, he only gets roughly half of the memory.
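As a concrete, entirely invented illustration of the Goguen-Meseguer definition, here is a toy sketch, with names of my own choosing: a deterministic shared machine, Bob's view of a trace, and a check that purging Alice's inputs never changes that view. The allocator example from the talk fails the check, precisely because Alice can use up memory that Bob would otherwise get.

```python
def bob_view(machine, trace):
    """Collect the outputs Bob sees along a trace of (user, input) pairs."""
    state, view = machine["init"], []
    for user, inp in trace:
        state, out = machine["step"](state, user, inp)
        if user == "bob":
            view.append(out)
    return view

def purge(trace):
    """Drop Alice's inputs, as in the Goguen-Meseguer 'purge' function."""
    return [(u, i) for (u, i) in trace if u != "alice"]

def noninterfering(machine, traces):
    """No flow from Alice to Bob: purging Alice never changes Bob's view."""
    return all(bob_view(machine, t) == bob_view(machine, purge(t))
               for t in traces)

# Toy allocator with one unit of memory: requests succeed while memory
# remains, so Alice's allocation changes what Bob observes.
def alloc_step(free, user, inp):
    if inp == "alloc" and free > 0:
        return free - 1, "ok"
    return free, "no"

allocator = {"init": 1, "step": alloc_step}
trace = [("alice", "alloc"), ("bob", "alloc")]
print(noninterfering(allocator, [trace]))  # prints False: information flows
```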
So maybe that's too strong a property, and what we want to be able to say is: some information can get to Bob, but not too much. That's the other motivation, and what it has to do with this talk is that this problem actually reduces to our problem. Given the specification of the system, you can construct an automaton which is Bob's view of the system, if you view Alice's actions as nondeterministic. It then turns out that what you want to do is count sets of observations that Bob might make, but subject to the constraint that they have to be consistent: there has to be some strategy for Bob, telling him what to do at each step, under which all these observations can actually be realized. And what that consistency condition ends up being is exactly that the set of words you're looking at is an anti-chain in this partial order. So answering the question of how much information gets to Bob is exactly answering the question of how big an anti-chain can be. Okay, so that's the motivation. The plan of the talk is: I'll tell you some basic definitions and some basic facts about anti-chains, which are going to be useful later on. Then I'll talk about the results on regular languages, then context-free languages. Then, well, we saw that the big thing for language growth was a dichotomy between polynomial and exponential, and we'll see the same thing for anti-chains; but maybe you want to know more, maybe you don't just want to know it's polynomial, you want to know whether it's n squared or n cubed or whatever, so we'll talk a bit about that. Then there's a similar story for tree languages, which I probably won't have time to talk about very much, and then one of the nicest things about this topic is that there are lots of quite appealing open problems, so I'll talk about those as well.
Okay, so definitions. I guess you know what an anti-chain is: a set in which no two elements are comparable. A chain is the opposite of an anti-chain: a set in which everything is comparable. Sometimes an anti-chain is going to be too strong a notion for us, so we'll also talk about quasi-anti-chains, which are like anti-chains except that they may contain prefixes, which in the lexicographic order are comparable, but which we're going to imagine are not comparable. Okay, I'm interested in quantitative bounds on things, so we'll say a language is polynomial if its number of words of length n is only polynomial, and exponential if it's exponential. In particular, what we're interested in is how this width grows, and I'll also call that anti-chain growth. A sufficient condition for exponential anti-chain growth is that you contain an anti-chain of exponential size; but it's not a necessary condition, because the anti-chain you want to consider at each length n could be completely different, so you could have exponential anti-chain growth without actually containing an exponential anti-chain. Okay, so regular languages are all about union, concatenation and Kleene star, so we probably want theorems about those. For union, it's obvious what happens to properties like polynomial and exponential anti-chain growth: finite union doesn't really change anything, so let's leave that aside. For concatenation, anti-chains behave nicely: if you concatenate two anti-chains you get another anti-chain, and if you have control of the anti-chain growth of L1 and of L2, then you get quantitative control on the anti-chain growth of the concatenation L1 L2. Okay, so that's concatenation; what about Kleene star?
Well, there's no hope of getting a version of this for Kleene star, because after all L star is going to contain lots of prefixes; so it can't be true that if L has an anti-chain then L star has an anti-chain. But the next best thing is true: you get an anti-chain apart from those prefixes, which means a quasi-anti-chain, and actually that's good enough for government work, because if what you care about is polynomial versus exponential, a quasi-anti-chain is just as good as an anti-chain. There's a kind of Ramsey-theory-type argument that says that if you have an exponential quasi-anti-chain, then you actually contain an exponential anti-chain. Okay, so let's talk about language growth for a moment, and then we'll see how to adapt the arguments used for language growth to anti-chain growth. The critical condition for languages is the following: for each state q, look at the language of words I can get by following some path starting at q and getting me back to q. I'll call that automaton, i.e. my original automaton but where I have to start and end at q, A_qq, and the critical question is: is the language of that automaton a subset of w* for some word w? That property is called being commutative, and the critical point is that if A_qq is commutative for all q, then your automaton has polynomial language growth, and if not, then it has exponential language growth. And that's easy to prove, because suppose there is some q, obviously I mean a relevant q, i.e. one you can reach from the starting state and from which you can reach an accepting state.
If there's some q for which L(A_qq) is not commutative, then you can take two words w1, w2 in that language which are not prefixes of each other, and then (w1 + w2)* is an exponential language, precisely because w1 and w2 are not prefixes of each other; and then you put something on the beginning and the end to get you from an initial state and to a final state, and so you have an exponentially growing set inside your language. On the other hand, if the A_qq are always commutative, then you can say: suppose I have a word in my language. It beetles around, visits the state q a few times, but after some point it never visits q again, so by induction, after that point I'm in a polynomial-size thing, and the part where I keep returning to q lies inside some w*. So overall I live inside w1* w2* ... wk* for some w1 up to wk, and that's manifestly polynomial. So that proves that. So if we want to make a similar argument, we need a partial-order version of being commutative: we want some condition on L(A_qq) which ensures polynomial anti-chain growth, and which, if it fails, gives exponential anti-chain growth. (Sorry, I should have said: this condition can be decided.) And it turns out that what you want that condition to be is L(A_qq) being a chain, i.e. every two elements of L(A_qq) being comparable. Okay, well, why is that? Well, if it's not a chain, then you have two incomparable elements w1, w2. That means {w1, w2} is an anti-chain, therefore (w1 + w2)* is a quasi-anti-chain, which is of course exponential, and therefore by our lemma it contains an exponential anti-chain. And then if you stick something on the front and the end to get you from the start to the end, as before, you get an exponential anti-chain.
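A quick sanity check of this construction, with an alphabet and words of my own toy choosing: if w1 and w2 are incomparable words of the same length, then any two distinct words in {w1, w2}^k first differ at a pair of incomparable letters, so (w1 + w2)* contains anti-chains of size 2^k.

```python
from itertools import combinations, product

# Assumed toy order: only the letters a and c are incomparable.
INCOMPARABLE_LETTERS = {("a", "c"), ("c", "a")}

def incomparable(u, v):
    """Equal-length words are incomparable iff they first differ at a
    pair of incomparable letters."""
    for x, y in zip(u, v):
        if x != y:
            return (x, y) in INCOMPARABLE_LETTERS
    return False  # equal words are comparable

w1, w2 = "ab", "cb"  # incomparable: they first differ at a vs c
k = 4
block_words = {"".join(p) for p in product([w1, w2], repeat=k)}
assert len(block_words) == 2 ** k  # exponentially many words of length 2k
assert all(incomparable(u, v)
           for u, v in combinations(sorted(block_words), 2))
```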
So in fact the conclusion of that is that for regular languages, if you have exponential anti-chain growth, then you do have an exponential anti-chain; so the thing I said at the beginning could happen actually can't happen here. Okay, now for the converse a similar argument works. You can say: for anything in my language, I go around visiting the state q a few times, then I do something else, then I never visit q again. Well, the part going around q is a chain by assumption, and the middle part is a singleton, so certainly a chain; so both of those have polynomial anti-chain growth, since chains have anti-chain growth one, i.e. their anti-chains all have size one. And then the language that we get by never visiting q again has, by induction, polynomial anti-chain growth. That means, by our lemma, the whole thing has polynomial anti-chain growth. So we're done; that proves the main theorem. Okay, and what if you want to actually check this for a given automaton? Well, you can do it: you basically build some little automaton B which accepts only the synchronized interleavings of pairs of words which are incomparable. Because what does incomparable mean? It means: look at the point where the two words differ for the first time; do they differ by comparable letters or by incomparable letters? And that's clearly something you can check with an automaton. And once you've done that, you just look at the interleaving of two copies of A_qq for each q and see whether it has empty intersection with B or not, and that's of course in polynomial time.
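Here is a rough sketch of that decision procedure in the NFA setting. The encoding and names are mine, and I've folded the interleaving automaton B directly into the search: the language fails to be a chain exactly when two runs over the same prefix can diverge at a pair of incomparable letters, with both runs still able to reach acceptance. In the talk's setting one would run this on A_qq for each state q.

```python
from collections import deque

def is_chain(nfa, incomparable_letters):
    """nfa = (inits, accepting, delta) with delta[(state, letter)] = set of
    successor states. Returns False iff the language contains two
    incomparable words (i.e. it is not a chain)."""
    inits, accepting, delta = nfa

    # States from which an accepting state is reachable.
    alive, changed = set(accepting), True
    while changed:
        changed = False
        for (s, _), targets in delta.items():
            if s not in alive and targets & alive:
                alive.add(s)
                changed = True

    # Explore pairs of runs that have read the same prefix so far.
    seen = {(p, q) for p in inits for q in inits}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        for (s1, x), t1 in delta.items():
            if s1 != p:
                continue
            for (s2, y), t2 in delta.items():
                if s2 != q:
                    continue
                if x == y:  # extend the common prefix in both copies
                    for pair in {(a, b) for a in t1 for b in t2} - seen:
                        seen.add(pair)
                        queue.append(pair)
                elif (x, y) in incomparable_letters:
                    # The runs diverge at incomparable letters; if both can
                    # still reach acceptance, the language has two
                    # incomparable words.
                    if t1 & alive and t2 & alive:
                        return False
    return True

inc = {("b", "c"), ("c", "b")}  # assumed: b and c are incomparable
accepts_ab_ac = ({0}, {2}, {(0, "a"): {1}, (1, "b"): {2}, (1, "c"): {2}})
accepts_a_ab = ({0}, {1, 2}, {(0, "a"): {1}, (1, "b"): {2}})
print(is_chain(accepts_ab_ac, inc))  # prints False: ab and ac incomparable
print(is_chain(accepts_a_ab, inc))   # prints True: a is a prefix of ab
```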
Okay, what about context-free languages? Again, the language growth question here is classical, and there's a quite similar condition. Instead of looking at L(A_qq), you say: for each non-terminal A, I want to look at the words I can get in derivations that start at that non-terminal and end up with that non-terminal again, so I get some word on the left of the non-terminal and some word on the right. So: what's the set Left(A) of words u that can appear on the left for some w on the right, and similarly Right(A) for the right? And again, what I care about for language growth is whether these sets are commutative, and the proof is similar in both directions. Okay, so what you might guess is that for anti-chain growth I should ask for Left(A) and Right(A) to be chains for every A, but that's actually not right; we need a slightly finer version. We want Left(A) to be a chain, and then not Right(A) but Right_w(A), by which I mean: for each fixed word w I might get on the left, what words can I get on the right? So Right(A) is the same thing but with w existentially quantified; here I'm looking at one fixed w and asking what I can get on the right. Okay, so how do you prove that this is necessary and sufficient? Well, if Left(A) is not a chain, if you can get things which are incomparable on the left, then it's easy to show you can get an actual anti-chain. If Right_w(A) is not a chain for some w and some A, it's a bit more delicate, because when you concatenate the things you're getting on the right, you're also getting stuff on the left, and you have to ask how the stuff on the left compares to the stuff on the right. But you can show that you get an anti-chain family: a set in which the words of any fixed length are incomparable. So now the thing I said could happen can happen, and in fact it's very easy to think of an example of a context-free language where
it has exponential anti-chain growth but all anti-chains are finite. Okay, and this proves the similar dichotomy, because in the other direction you do a similar kind of induction: you say, I might use A up to some point, and after that I never use it again, so we're done by induction. So that proves that context-free languages also have a dichotomy between polynomial and exponential. Okay, so I briefly said that for language growth the condition for context-free languages is also decidable in polynomial time, but interestingly, for anti-chain growth that is not so. In fact, it is undecidable whether a given context-free language has polynomial or exponential anti-chain growth, which you show by reduction from non-emptiness of intersection of context-free grammars, which the last talk told you is undecidable. You can reduce that to the problem of determining whether a given context-free language is a chain, and that in turn reduces to the problem of distinguishing exponential from polynomial anti-chain growth. So that proves that determining whether a context-free language has polynomial or exponential width is undecidable. So we have this maybe slightly surprising grid, where everything for language growth is decidable in polynomial time, and everything for NFAs is decidable in polynomial time, but the killer combination of context-free languages and anti-chain growth is undecidable; maybe that's not exactly what we'd expect. Okay, what about exact growth rates? In particular, in the world of security, maybe it's not very exciting just to say the growth is polynomial, even though that may satisfy the CS theorists. In some sense, log n bits of information in time n might be regarded as safe, because I could guess a key at that rate anyway; but if it's a thousand log n, maybe that's not so good. So we'd be interested in knowing whether it's, you know, n or n squared or n
cubed or whatever. So for the case of language growth, this is not so bad, and the reason is basically that having polynomial language growth is a very strong condition. In particular, if we look at the transition graph of our automaton, we can think of the strongly connected components independently, and of course there's a DAG of those; and within each component, everything has to be commutative, so what that really means is that for each component there's one word w such that, as you go round and round this component, whatever you do, you're getting something inside w*. So really all you have to keep track of is where you jump on and where you jump off. The complication is that for an NFA you need to be a bit careful to make sure that you can't get the same thing in two ways: if this component gives w* and that component gives w* for the same w, that doesn't really give you an extra polynomial factor. But modulo that, it's not so bad to keep track of everything and work out what's going on. For anti-chains, life is not so good, because we don't have this very strong condition that within each strongly connected component only one thing can happen. So what we end up doing is this: you look at your transition graph, and you're going to colour some edges a special colour, red, let's say. An edge from q to q' gets coloured red when, on the one hand, I can get from q to q' along it with some word, and on the other hand, I can get from q back to itself with a different word, which in particular is incomparable to
the word that took me from q to q'. So if we think about drawing the graph of this automaton, we have some white edges and some red edges, and the critical question is: how many red edges does a path go along? Because what a red edge means is that every time I see one, I have a free choice of doing w' some number of times and then eventually doing w and moving on with my long-term plan of getting to an accepting state. And of course there are no cycles involving red edges, because if there were, you would have exponential growth. Okay, so what we want to prove is that this quantity, the number of red edges you can go through on a path, maximized over paths, is in fact the order of anti-chain growth. As with everything, it's easier to prove this for DFAs than NFAs. I don't really have time to tell you the details of the proof, but the point is that if you think about the words you can get without visiting a red edge, those have to have the property that anti-chains in them have only constant size. This is a quite delicate proof, but you can show it. Basically the most important idea is this: you want to say you can't have two different runs that are incomparable, but that doesn't quite work; what you want is simple paths, and then you have to take into account the fact that a run which contains loops can have its loops removed in many different ways. But you can still show that if you think about not the simple path that
a path with loops corresponds to, because there isn't just one, but the set of simple paths, that set is still finite, and you can only have one representative of each of those inside your anti-chain. So that proves that your anti-chain is finite; that's the vague intuition. Okay, so that tells us what we want for DFAs; now we'd like to prove this also for NFAs. The way you prove that is maybe a bit surprising: you don't do it directly. You recall that this quantity d_A is well defined for both DFAs and NFAs; we just haven't shown that for NFAs it corresponds to anti-chain growth. So the way we prove that is to show that d_A is actually a property of the language: if I have two NFAs which generate the same language, then this quantity is the same for both. In particular, that means that given an NFA I can look at its determinization; that's a DFA, and it generates the same language as my original NFA, therefore the anti-chain growth of that language must be d_A', let's say, where A' is the determinization, which must be equal to d_A, which is what I wanted. So that proves that this count of how many red edges can be on a path is indeed the order of anti-chain growth for NFAs as well. Okay, and it's easy to see that
you can compute this in polynomial time, because what you want to know is which edges are red, and you can do that with a similar make-an-automaton-and-ask-about-emptiness trick to the one for the dichotomy problem. So you do that for each edge, then you have the coloured graph, and then you just want to find the path with the most red edges, which is an easy problem, kind of just a flood fill. So that shows you can do this in polynomial time. Okay, then there's a similar story for regular tree languages: you can define an analogue of the lexicographic order on trees, and if you do that, then what you find is that the growth is either polynomial, exponential, or doubly exponential, so like two to the two to the n. The main idea there is that you need an analogue of A_qq for trees, and it turns out that what you want is what I have referred to as pairs of trousers: a tree context which has a hole at the top and two holes at the bottom, with the property that if I'm in state q at the two bottom holes, then I can also be in state q at the top. That's the thing that determines whether the growth is polynomial, exponential or doubly exponential. In particular, and not that it's exactly a hard thing, but I'm not sure anyone has actually written it down, if you consider the discrete ordering, this tells you that the language growth of tree languages is polynomial, exponential or doubly exponential; so that's a true thing. Okay, so that's about it for the results of this paper, but let me tell you about some open problems. Perhaps a question you might be asking is: well, you had a section you called exact growth rates, and you talked about exact orders of polynomial growth, but you didn't talk about exact orders of exponential growth. Well, that's the first open problem, and actually this is
quite an embarrassing thing, because as far as I know, no one even knows how to determine the order of exponential language growth for an NFA, which is pretty extraordinary really. If I give you an NFA, and it has exponentially many words of length n, as far as I know nobody knows how to compute the rate of that exponential growth. For a DFA it's kind of easy: you look at the transition matrix, and its biggest eigenvalue controls the rate of exponential growth. But for an NFA that only tells you the number of accepting runs, and it's not clear how to use that idea. So: exponential language growth is known for DFAs; it isn't even known for NFAs; and it's also not clear how to use that eigenvalue idea to find the order of exponential anti-chain growth even for DFAs. So that's also an open problem. Another thing: I gave you a polynomial-time algorithm, which is basically quadratic; for language growth you can actually do it in linear time; can you do that for anti-chain growth? Another thing: again in security land, maybe it's not that helpful to know that your system is secure in the asymptotic limit. What you really want to know is: if I run it for a thousand time steps, how much trouble am I in? So you might be interested in the exact value of the width for some fixed n. If you're given a DFA, it's sort of easy to do that in time polynomial in n written in unary; but in time polynomial in n written in binary, i.e. in the log of the number of steps, or for an NFA, it's open. And then a more vague question is: are there other things that you can usefully think of as anti-chains with respect to the lexicographic order for some partial order? I guess the key feature is that any time you're asking about where the first point of difference comes, that's kind of what this is about. So, yeah, that's it. Okay, so, an infinite anti-chain? I
think that, well, okay, that is easy, because if the order of polynomial growth is zero, that means anti-chains are finite, and if it's one or more, then they're infinite. Well, it depends how you're given the automaton: of course you can determinize and pay an exponential cost, so for decidability it doesn't matter, but if you're asking about polynomial-time algorithms, then yes. And the order on words comes directly from whatever order you have on the alphabet, yes. So if you have two words which are incomparable, then all their extensions are incomparable, and that would immediately give you infinite anti-chains? Right, well, it depends: inside sigma star that's true, but of course it might not be, I mean, the question is which extensions are in the language. But I see: there's one case where the order on the alphabet is total, and then it extends to a total order on words, so you don't have any anti-chains at all; and in the other case you have at least two incomparable letters, in which case you have a lot of different anti-chains inside sigma star. Yes. So why is it so difficult when you go to, you know, regular languages? How is it lacking? Well, I guess ultimately the question is how the regular language meets these anti-chains: if you have incomparable letters a and b, then you're right that (a + b)* is an exponential anti-chain, but how does this intersect with your language L? The whole difficulty of the question is what's actually in L and what isn't.