Today's Thursday. Okay, so as I mentioned before, welcome back, my friends. This is work being created as we speak by Gujie and myself, hopefully soon to be on the arXiv, but it's turning out to be so rich that putting the paper together is a bit of a challenge. And what this amounts to is, how to phrase it: this morning I mentioned those results at the end about the minimal thermodynamic work of anything that implements a Turing machine. That's all based on specifying the rate matrices of a continuous-time Markov chain, in other words a physical system connected to an external bath that actually goes through and computes a Turing machine. So it's got that kind of model under the hood, and if you then ask, for example in that particular case, what's the minimal entropy production, you would come up with zero. In a certain sense, what you have to do is specify many, many things in addition to what a computer scientist defining that Turing machine would specify. But let's think a little bit more about what the computer scientist or engineer would actually do, and what the thermodynamic consequences of what they do might be. So actually, wait, I guess I've got to get the pointer working. Does this work? Yes it does.

Okay, so what we are working on is mostly focused, so far, on deterministic finite automata, and the question is: what is the minimal dissipated work? What is a lower bound on the entropy production for a given computational system to complete a run and then be reinitialized for the next one? It's crucial here that we're talking about complete cycles, as with a Carnot cycle or the modern resolution of Maxwell's demon, things like this. We're looking at the total cost when you get back to the starting point for the next iteration of your computational system.

The first thing to notice is that many computational systems (think back, for example, to deterministic finite automata) decompose in a very clean way. In the case of a deterministic finite automaton, there's the computational machine itself, which is the automaton, and then there's the string of inputs being fed into the machine. The engineer, by and large, is designing the automaton. They're not designing the process that generates the string of inputs; the string of inputs is something that's being entered by a human being, for example. So that means we can view the degrees of freedom, the variables, in the computational machine as accessible (I'll specify what that means more precisely in a little bit), whereas the other variables are inaccessible. The distinction between accessible and inaccessible is going to have to do with the amount of work that is required to reinitialize them. So basically, at the end of a run the engineer is going to reinitialize the DFA, and the bound on the amount of work that's going to be required reflects the fact that these variables are special: the engineer built the device, with all the little logic shortcuts and so on and so forth of the DFA.
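To pin down the bound being invoked here, a sketch in one consistent notation (the symbols are mine, not the slides'): with S the Shannon entropy over the accessible variables,

```latex
% Generalized Landauer bound on the reinitialization work: resetting the
% end-of-run distribution p_T back to the initial distribution p_0 costs
% at least k_B T times the entropy that has to be erased.
W_{\mathrm{reinit}} \;\ge\; k_B T \,\bigl[\, S(p_T) - S(p_0) \,\bigr]
```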
So when they reinitialize it at the end of a run, they can try to make that as thermodynamically efficient as possible. Whereas when the human being generating inputs to the DFA, or whatever is generating those inputs, starts on the next string, when they sample their own internal probability distribution to produce the next string, that reinitialization process is typically one the engineer cannot access. The engineer can't exploit any notion of thermodynamic reversibility as far as that is concerned.

So therefore we can go through the following. What is a lower bound on the work required to reinitialize the accessible variables after a run? Based on the things we've been talking about all along up to now, the generalized Landauer bound: since we're going from the final state back to the initial state when we reinitialize, this change in entropy is the minimal amount of work that would be required by the engineer to reinitialize the DFA. Okay, then let's look at the... oops, that should not have been there; well, already one typo on these slides.

Okay, so what about when you reinitialize the inaccessible degrees of freedom? In the case where they are all generated by a human being who is external, whom the engineer has no access to at all, the engineer can't extract any work by reinitializing those inaccessible degrees of freedom. But you can imagine there might be other processes, other scenarios, in which they can actually extract some energy by the reinitialization of the inaccessible degrees of freedom. So, to actually give meaning to "inaccessible", what we are saying is the following: when you reinitialize the inaccessible degrees of freedom, you cannot do it thermodynamically reversibly. You don't have that kind of fine-grained control of the inaccessible degrees of freedom. Instead (you can almost view this as an ansatz) we're saying that the way you reinitialize the inaccessible degrees of freedom is by coupling the state of those inaccessible variables to a heat bath, under a Hamiltonian whose Boltzmann distribution is the initial distribution of the inaccessible degrees of freedom.

So you run through the process and get to the end. Say, for example, this here is the probability distribution over the inaccessible degrees of freedom (for example, bit strings input to a DFA) at time zero. We run through the process and it gets modified to be this. The way we then reinitialize it is to take this particular system and couple it to a heat bath such that this right here is the Boltzmann distribution for that bath. So we take the blue distribution, and we put on the inaccessible degrees of freedom a Hamiltonian such that, when we go through this process of just thermodynamically irreversibly reinitializing those degrees of freedom, the amount of work you can get back is the work involved in taking this off-equilibrium distribution and letting it equilibrate again. And that's the change in the expected energy of that particular system. Okay, so that's our starting point: the lower bound on the dissipated work once we decompose this computational system into accessible variables and inaccessible ones.
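Spelling out that ansatz (a sketch; p_0^Y, p_T^Y and H_Y are one choice of notation for the inaccessible variables Y, with beta = 1/(k_B T)):

```latex
% Bath Hamiltonian chosen so that its Boltzmann distribution is the initial
% distribution over the inaccessible variables (normalized here so Z = 1):
H_Y(\omega) \;=\; -\,k_B T\,\ln p_0^Y(\omega)
\quad\Longleftrightarrow\quad
p_0^Y(\omega) \;=\; e^{-\beta H_Y(\omega)} .
% Work recoverable by letting Y freely thermalize from p_T^Y back to p_0^Y:
W_{\mathrm{extract}} \;=\; \langle H_Y\rangle_{p_T^Y} \;-\; \langle H_Y\rangle_{p_0^Y}
```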
It's going to require work of ΔS of the distribution over the accessible ones to reinitialize the computational machine. The engineer is going to have to burn that, but in general they might not be able to extract any energy from reinitializing the inaccessible ones, and we're saying that at best the amount they could extract back would be the change in the expected value of this particular Hamiltonian, the one whose Boltzmann distribution is the initial distribution over the inaccessible degrees of freedom.

Okay, so let me make that a little more concrete. Question: this is the lower bound on the amount of work that you have to do, the amount of work that you will lose and never recover again? Exactly so. And it's obviously a very, very weak lower bound, but nonetheless. So let's consider the following case. Let's say we have a DFA. It's got some initial state, and it might have some final states, but whatever: it's got its internal states. That's the DFA. It is being fed inputs from some bit string, from some stream of bits, equivalently. In this particular case we're saying this is the system of interest; these are the accessible degrees of freedom. This string of inputs, this input word, is the inaccessible degrees of freedom.

So let's say this input word is going to be... what symbol do we use, w? Yeah. So let's say that at each iteration the input word ω is generated by IID sampling some distribution before the process starts. What we do is note that this actually defines a Hamiltonian such that e to the negative of the Hamiltonian, suitably normalized, is that distribution. Now, the initial distribution of the accessible degrees of freedom is a delta function at q0. Typically you start in your initial state. You could imagine that there's more than one initial state, with some probability over which one you start in, but in the simplest case there's one and only one unique initial state of the system of interest, of the accessible degrees of freedom. This process goes through, and this becomes a distribution over the accessible degrees of freedom. So it starts from a delta function and becomes something else. To reinitialize it we've got to make it be a delta function again. That requires work.

Question: sorry, David, isn't the dynamics of the system of interest deterministic? Answer: we will eventually make it be so. But even though it is deterministic, remember this input is randomly chosen. So that means this becomes a distribution. Comment: in the finite-bath formalism, for example, when we partition a system (I think of this decomposition as a global system), even though we still write down the Hamiltonian that describes the deterministic behavior of the system, the system in general evolves stochastically because of its interaction with the environment. In this case the stochasticity is induced by this environment, which is composed of the bit strings. Yep, exactly. That is the exact identity we're going to make formally. This is identical to the finite-bath formalism, to the Hamiltonian formalism I should say, where this is the bath, and it's got a random initial state. This is our system of interest; because of its interaction with the bath, it starts in a delta-function distribution but then becomes something different.
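As a concrete sketch of this picture (everything here, the names and the mod-3 transition table alike, is illustrative rather than from the slides): drive a DFA with IID input bits, compute the end-of-run distribution over its states, and take its entropy, which by the bound above sets the minimal reinitialization work.

```python
import itertools
import math

def run_dfa(delta, q0, word):
    """Process a word with transition function delta, starting from q0."""
    q = q0
    for symbol in word:
        q = delta[(q, symbol)]
    return q

def end_of_run_distribution(delta, q0, word_length, p_bit=0.5):
    """Distribution over DFA states after reading an IID-sampled word."""
    dist = {}
    for word in itertools.product("01", repeat=word_length):
        p_word = math.prod(p_bit if b == "1" else 1 - p_bit for b in word)
        q = run_dfa(delta, q0, word)
        dist[q] = dist.get(q, 0.0) + p_word
    return dist

def shannon_entropy(dist):
    """Entropy in nats: k_B T times this is the minimal reset work."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

# Illustrative machine: binary numerals divisible by 3 (used again below).
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q2", ("q1", "1"): "q0",
         ("q2", "0"): "q1", ("q2", "1"): "q2"}
dist = end_of_run_distribution(delta, "q0", word_length=6)
print(dist, shannon_entropy(dist))   # the delta function at q0 has spread out
```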
And the minimal amount of thermodynamic work to reinitialize it is just that ending entropy minus the beginning entropy, which is zero because it's a delta function. In contrast, we're saying that because the string of bits coming in is not under the control of the engineer (the probability distribution over W is not in the control of the engineer), we're just assuming free relaxation, thermal relaxation, to reinitialize it. In fact, usually the engineer is not going to be able to get anything back. But as a kind of best case of what they might be able to do, if "inaccessible" is going to mean anything at all, we're going to say that the maximum amount of work they could actually get back by thermalizing this is going to be an upper bound on how much they can actually recover when the whole system gets reinitialized. If the engineer were given the opportunity to reinitialize the bath, we're assuming they would be able to extract at most this much work, by letting it just thermalize freely, without getting anywhere close to the Landauer bound.

Question: so essentially, reinitialization assumes that you decouple the two systems? Answer: at the reinitialization step they are actually decoupled, yes. We reinitialize the accessible degrees of freedom by quasi-static relaxation, and we reinitialize the inaccessible ones just by letting the system rethermalize through coupling to something external. Question: we switch off the interaction? Answer: we switch off the interaction, yes, good point, in the reinitialization step. Very, very good point.

Okay, so, timing is going to be interesting. Now, as I was saying: if you look at this formula here for the lower bound on the dissipated work, it is exactly the same formula that arises in the Hamiltonian approach to stochastic thermodynamics. But a couple of things to notice. First, here we're paying no attention at all to any dissipated work that actually occurs in running the process forward.
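Combining the two pieces, the lower bound on the dissipated work over one full cycle reads (again a sketch in the hedged notation above, with X the accessible and Y the inaccessible variables):

```latex
W_{\mathrm{diss}} \;\ge\;
\underbrace{k_B T\,\bigl[S(p_T^X) - S(p_0^X)\bigr]}_{\text{cost of resetting the machine}}
\;-\;
\underbrace{\bigl[\langle H_Y\rangle_{p_T^Y} - \langle H_Y\rangle_{p_0^Y}\bigr]}_{\text{best-case recovery from the inputs}}
```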
In general there would be dissipated work there as well, but we're only looking at the work that's dissipated in the reinitialization process; as I say, this is a lower bound on the dissipated work. So that's one contrast with the standard approach of the Hamiltonian formulation. But notice something else as well. In the Hamiltonian approach, and for that matter in quantum thermodynamics as well, we say that the initial distribution over the state of the system of interest and the state of the bath is p(x) times p(y): a product distribution. In the context of stochastic thermodynamics that's very weird. If I want to be able to say something about the second law, or the thermodynamics of the interacting gas molecules in this room, or what have you, they're not going to start in a product distribution in general. That's a very, very strange thing, and in fact if you allow them to be coupled arbitrarily, all the results in the Hamiltonian approach break down: Crooks' results, what Jarzynski did, and so on all depend crucially on this. And that means that, well, yes, I can make an experimental system where we've actually got that independence, but this is not going to describe things I find out in the wild, so to speak. In contrast, when we're applying this to computational systems, it is almost axiomatic that on every run of a computational system, the computational machine's initial state is formed by sampling one distribution, and the actual words coming into the DFA are generated by sampling some different distribution: statistically independent. So whereas it's a very restrictive assumption in the context of Hamiltonian thermodynamics, here it's actually just built into what happens with actual computers.

Okay, so anyway, that's a lower bound on the dissipated work; as I say, it's the exact same formula as in the Hamiltonian, finite-bath stochastic thermodynamics. Therefore we're going to call this dissipated work EP, the entropy production, even though strictly speaking that's not legitimate. There are subtleties involved with this. You're basically seeing the results of... last night I fell asleep, so I didn't manage to complete these slides, and Gujie is actually going to be taking over in a moment, but let me just point out some subtleties here. One of them (I just took the first bullet) compares this rationale, what we call the inclusive thermodynamic formulation of computational systems, to Hamiltonian stochastic thermodynamics. Notice that if there's any input string, any input word, that cannot ever occur, with zero probability of occurring, then the associated value of the Hamiltonian is going to have to be infinite, to forbid it. That means that to have a well-defined expected energy of that bath Hamiltonian, we must make sure that as the whole computational system evolves, the bath state never falls, with any probability, onto one of the values where the bath Hamiltonian is infinite. So if there are some bit strings that are initially impossible, we must choose our set of variables carefully, so that whatever we are identifying as the bath, which is initially this bit string, never actually takes, with any probability, one of those values where the associated Hamiltonian is infinite.
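That subtlety in symbols (same hedged notation as before): since H_Y(ω) = −k_B T ln p_0^Y(ω), every initially impossible word carries infinite energy, so keeping ⟨H_Y⟩ finite is a support condition on the evolving bath distribution:

```latex
p_0^Y(\omega) = 0 \;\;\Rightarrow\;\; H_Y(\omega) = +\infty ,
\qquad\text{so we must have}\qquad
\operatorname{supp} p_t^Y \;\subseteq\; \operatorname{supp} p_0^Y
\quad \text{for all } t .
```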
[A question from the audience; asked to elaborate.] Well, by definition: because we are reinitializing the inaccessible degrees of freedom by thermalizing them, and if we want to say that some of the states of the inaccessible degrees of freedom can never, ever arise, that means that in the thermalization process the Hamiltonian for that particular state has to be infinite, to make sure it's got zero probability. So there are these kinds of subtleties.

Yes, do you want to take over now, actually? Comment: can I make a comment, please? Essentially, if you are the engineer and you want to optimize this machine, what you would like to do is minimize the reinitialization work. Answer: yeah, exactly. And the formula for that is given by this. And then, because this is formally identical to the Hamiltonian formalism, we can actually take a lot of the results in that formalism and apply them here, to analyze the thermodynamics of arbitrary computational machines. Notice there's no specification here of rate matrices or temperatures; it can be non-Markovian as well, it's all non-Markovian, and so on and so forth, but we can exploit those results. For example, we've derived an integral fluctuation theorem for this lower bound on the dissipated work in computational machines. And it's a lower bound, so we're not paying attention to anything that actually goes on in the forward process, only the reinitialization step.

Okay, so Gujie will take over for the last part of it, and I might scoot up to get a coffee and come back. Oh, the microphone, by the way. I guess that's the end of my lecturing, so thank you for not throwing tomatoes at me, or balls, or anything like that, and I hope you at least got some glimmers of how productive these ideas might be, or just why this is such drop-dead cool stuff.

[Gujie:] Okay, now what we're going to do is actually prove a relation between the minimal dissipation that we want to get and... why am I... okay, I'm going to do this. Everyone hears me, right? Okay, so: chalk. Just to give the overall landscape: I'll say "physical" (in a sense we can argue about, but I mean energetic; these are almost equivalent). We talked, for example, in the previous lecture about algorithmic complexity measures, like Kolmogorov complexity, Levin complexity, and so on and so forth, and David actually presented a thermodynamic complexity measure, where you try to minimize the heat flow; that was for Turing machines. We didn't talk about computational complexity yet; now we're going to do it. There are two major components of computational complexity in computer science theory. One of them goes by the name of descriptive complexity; the other one is resource complexity. All the cool stuff in computational complexity in a sense comes from resource complexity, and lots of computer scientists are concentrated on that aspect. But to be honest, I don't know how to talk about that kind of computational complexity for an arbitrary computational physical system that we are modeling. There are simpler things we can do, though. For example, when we discussed the hierarchy of computational models yesterday, we introduced finite automata, pushdown automata and so on and so forth, and then Turing machines. So our goal is to start discussing computational complexity
starting from this part (we're going to be introducing that) and starting from the lowest level of computational machine. Because, as David also emphasized yesterday, the computer science aspects of finite automata, the computational complexity aspects of finite automata, we know about them already; it's great, we have access to this. What we have to do is come up with a physical model that we can use to describe these potential relations between computational complexity, the thermodynamic cost of running a computational machine, and thermodynamic complexity.

Okay, so finite automata. Let's remember the definition of a finite automaton. It's a five-tuple; this is the formal computer science definition you'll come across whenever you open a random textbook in computer science defining deterministic finite automata. So Q is the set of states. q0 is the initial start state; we'll mostly assume there is a unique start state, so you know where you start computing and processing these strings. ρ is the transition function: it takes a pair, the current state of the finite automaton and the symbol of the string that the automaton is processing at one iteration, at one time, and maps it to an element of the set of states again. Σ is the alphabet defining the symbols; for example, this is a binary alphabet. (I'm going to introduce a diagram in a second; it will all be great.) And F is the set of accept states; for now let's just stick with the assumption that there's just one.

Okay, so yesterday we introduced two equivalence relations. First I'm going to start with this, so that we can take this abstraction and actually make sense of it. (L stands for language.) Oh, by the way, there is one more definition I need to give. We define the language accepted by an automaton M in the following way. Σ*, with this Kleene star that David introduced, is the set of all strings you can generate from the alphabet Σ: 0, 1, 11, 00, and so on and so forth. A string ω is an element of it, and you define the language as the set of strings that satisfy the following: if you apply the transition function, starting from the start state, and you process the whole string, and you end up in one of the accept states (in your unique accept state of the automaton) when you halt, then yes, this string is in your language. Here ρ describes one step, one iteration. [Audience suggestion.] Oh yeah, you can do that; that's great, actually. So let's iterate it for the length of ω.

Okay, so this is the language, and now we can start actually defining the equivalence relations. We had two equivalence relations. One of them is defined over the language; it's denoted by tilde L. We take a pair of strings u and v, and we say these two strings are equivalent to one another if they satisfy the following condition: take an arbitrary string ω and concatenate it to each; u and v are equivalent if, for every ω you can come up with, uω and vω are either both in the language or both not. Question: what is F there? Answer: that's the set of accepting states. I'm going to put up a diagram in a minute. Or I can actually do it right now, I think, because it will help.
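A minimal Python sketch of the five-tuple (the class layout and names are mine, not the lecture's):

```python
from dataclasses import dataclass

@dataclass
class DFA:
    """The five-tuple (Q, Sigma, rho, q0, F) of a deterministic finite automaton."""
    Q: set        # set of states
    Sigma: set    # alphabet, e.g. {"0", "1"}
    rho: dict     # transition function: (state, symbol) -> state
    q0: str       # the unique start state
    F: set        # set of accept states

    def end_state(self, word):
        """Apply rho once per symbol: |word| iterations from the start state."""
        q = self.q0
        for a in word:
            q = self.rho[(q, a)]
        return q

    def accepts(self, word):
        """word is in L(M) iff processing it from q0 halts in an accept state."""
        return self.end_state(word) in self.F
```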
That's a great point, so let's do it like this. Okay, this is a finite automaton, a deterministic finite automaton, and this is the start state. In this interesting example it is also the accept state, but we don't care about that; the start state and the accept states might in general be different. Okay, so this is a finite automaton, and Q over there includes q0; if you want to write down Q, you list all the states. The alphabet is the binary alphabet, and the transition function, for example in this case, you can write as a table. That makes sense, right? Perfect.

So, equivalence relations. We gave the first definition of an equivalence relation, but we also said yesterday that when we run a computational machine such as a DFA, we want to be able to distinguish these strings if they are not equivalent to one another. So we are also defining another equivalence relation, over the DFA: tilde M. We say that two strings are equivalent to one another over the DFA, with respect to the DFA, if they satisfy the following condition. We start from the start state and start processing the strings that are given to us. These are different strings; they can come in different lengths, we don't care about that. And when we actually end processing these strings, when we've read the whole string and consumed it, if we end up in the same state after reading both of them, we say they are equivalent to one another over the DFA. For example, over here: you read 11 and you come back to q0, so 11 is equivalent over the DFA to just reading the string 0.

Okay, so now something else. Believe me that this is an automaton that recognizes the following language; let's say L3, I guess. This is a language, and this is the DFA that recognizes this language; let's say M. All right. First of all, I'm going to say something. If you remember, yesterday we talked about things like the Myhill-Nerode theorem and so on and so forth. We said there is going to be a unique DFA that recognizes this language with a minimal number of states, and we can talk about these things because we know there is an infinite number of finite automata that can recognize this language L. So I'm going to draw, for example, one of them. Let's call this M′. This is also an automaton that recognizes the bit strings divisible by three. And these two automata are equivalent to one another; you can just check it: read a string like 110, you come here, you accept it. Yeah, okay, it's true, right? If there's a problem with it, I can't think of it, but I guess that's okay.

Anyways: one way to see that these automata are equivalent to one another is to ask the following question. The only difference of this M′ from the M we just sketched here is that you are introducing a new state, right? So let's ask: does this q3 state that we just introduced do anything different from what q2 does here? For example, q2 reads a one and goes back to itself; it reads a zero and it goes to q1. Okay, so
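Continuing the sketch above: here is the mod-3 machine M from the board, plus one possible non-minimal M′ with a clone state q3 (this particular M′ is my construction for illustration; the one drawn in the lecture may differ in detail). A brute-force check confirms they accept the same strings:

```python
import itertools

# M: reading a binary numeral MSB-first, state q_i means "value so far = i (mod 3)".
rho = {("q0", "0"): "q0", ("q0", "1"): "q1",
       ("q1", "0"): "q2", ("q1", "1"): "q0",
       ("q2", "0"): "q1", ("q2", "1"): "q2"}
M = DFA(Q={"q0", "q1", "q2"}, Sigma={"0", "1"}, rho=rho, q0="q0", F={"q0"})

# M': clone q2 into a new state q3 that acts exactly the same way.
rho2 = dict(rho)
rho2[("q1", "0")] = "q3"      # reroute one edge into the clone
rho2[("q3", "0")] = "q1"      # q3 copies q2's transitions,
rho2[("q3", "1")] = "q2"      # so q3 and q2 are equivalent states
Mp = DFA(Q={"q0", "q1", "q2", "q3"}, Sigma={"0", "1"}, rho=rho2, q0="q0", F={"q0"})

# L(M) = L(M') on all words up to length 9:
assert all(M.accepts(w) == Mp.accepts(w)
           for n in range(10) for w in itertools.product("01", repeat=n))

# The ~_M relation: words are equivalent iff they drive M to the same state.
assert M.end_state("11") == M.end_state("0") == "q0"   # so 11 ~_M 0
```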
if you just follow the steps, you will basically see that q3 acts exactly the same way as q2 does. So it's basically a larger finite automaton that recognizes the same language, but it has a larger number of states. One thing that is really important to us (to computer scientists; I just became a computer scientist) is this. When you're running a DFA in the physical sense, on your computers, and you have, say, terabytes of data that you want to scan for regular patterns, for example a biological sequence, where what you have instead is an alphabet like this and you're trying to find these regular patterns, what you do is construct DFAs and let them process the strings. So if you introduce more states, then it will take longer to run this and come up with results, and so on and so forth. So this is something you do not want to do, for simple efficiency reasons. For that matter, people came up with, at least for finite automata, this measure of size complexity. [Handed a microphone.] Even better, thank you, so that I won't trip over myself while walking around. Okay, so, can I continue? I mean, how much longer do you folks want to go? I can be done in twelve to fifteen minutes, if that's okay with you. Is that okay with everybody? You hold the proverbial whip hand. Okay, so I'll go on. Thank you very much.

Okay, so we are continuing. Like I promised you, the end result will be really sweet, so we can continue with this. So yes, again, what we want to do is to be as efficient as possible, so we want to come up with these minimal DFAs to perform our computation. Computer scientists in the sixties and seventies came up with this baby measure of complexity. When I sketched the big picture, we talked about two different kinds of computational complexity: one of them is the cool one, resource complexity, and the other one is descriptive complexity. Size complexity is a form of what is called descriptive complexity, and it is basically given by this: the size complexity of a DFA, of any DFA that you have, is given by the number of states of the DFA, and the size complexity of a language (a language accepted by an infinite set of DFAs) is given by the number of states in the minimal DFA.

So now I can quickly prove that this is a minimal DFA that actually recognizes this language, and we're going to do it in the following way. Consider the set of strings you can form from the binary alphabet: 0, 1, 11, 00, and so on and so forth. If I put in the constraint that I'm defining a language and I want to identify the bit strings divisible by three, then I know there can be only three classes of strings, because if you give me a bit string and I try to divide it by three, either it's going to be divisible by three, remainder zero, or it's going to have remainder one, or it's going to have remainder two. Modular arithmetic, right? So you can basically partition this set of strings into three classes: remainder zero, remainder one, and remainder two. This basically suggests that if you want to recognize the language with any kind of machine, like a DFA, you need to have at least three states to identify it, right?
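Continuing the same sketch, the three residue classes line up one-to-one with the states of M:

```python
# Each word's value mod 3 determines, and is determined by, M's end state.
for w in ["0", "1", "10", "11", "110", "1001"]:
    print(w, int(w, 2) % 3, M.end_state(w))
# e.g. "110" (= 6) has remainder 0 and lands in q0, the accept state.
```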
How many states does it have? Three. So this is the best you can do. So let's go back to the equivalence relations we defined. We defined an equivalence relation over a language, this tilde L, which is highly abstract; it's not really easy to actually visualize it. But we also defined this equivalence relation over the machine, the automaton that recognizes that language. And one thing we can realize is the following. Sometimes, when you have an equivalence relation over a DFA, if, for example, it is a minimal DFA, this equivalence relation over the machine coincides with the equivalence relation over the language. But sometimes it doesn't: if you don't have a minimal DFA, it will not, in the following way.

So we say that, for example, this is the class of strings divisible by three, this is remainder one, this is remainder two. If you go back to this DFA, what we see is that we can actually relabel these classes by the states of the DFA. In our case each state basically symbolizes an equivalence class: q0 is divisible-by-three, q1 is remainder one, and q2 is remainder two. So there's a one-to-one relationship between the equivalence relations tilde L and tilde M. But in this case we just discussed that q3 doesn't do anything different from q2. So if you want to describe this equivalence relation tilde M′, what we do, in a schematic way, is this: we are actually getting a finer partitioning of the set of all strings. I'm using blue for the equivalence relation over the language, which is basically identical to the equivalence relation of the minimal M. If you come up with an equivalence relation that is defined with respect to a non-minimal automaton, you will always get a finer equivalence relation. What you're doing is basically splitting these equivalence classes, these partitions, into smaller and smaller pieces. You always get finer. And, I think this is a cute point: you cannot ever get a coarser relation, because otherwise the automaton you construct wouldn't be able to recognize that language.

So now, building on these kinds of equivalence relations, we are almost there. I'm going to keep this here and go like this. We discussed equivalence relations; interchangeably, in basic set theory, we say that equivalence relations over a set (in this case the set of all strings, or the set of this language that we defined) define partitions over the set. A partition of a set groups the elements of that set into pairwise disjoint, non-empty subsets. So one thing we're going to introduce is what we call a partition refinement; this is a definition. Let me give you two different partitions of that set: one of them will correspond to the partition that this minimal machine M constructs, and one of them will correspond to the non-minimal M′. Okay, so this is one partition, and this is another partition. If you want to actually write it in terms of this M′, you can take this q3 inside this guy here, but because they just belong to the same equivalence class (these guys are identical), nothing is going to change.
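Here is that refinement picture computed directly, continuing the Python sketch (the sample words are illustrative): group words by the state each machine ends in, and check which grouping refines which.

```python
# Group a few example words by the state they drive each machine to:
words = ["0", "1", "10", "11", "100", "101", "110"]

def partition_by_state(machine, words):
    blocks = {}
    for w in words:
        blocks.setdefault(machine.end_state(w), set()).add(w)
    return list(blocks.values())

P_M, P_Mp = partition_by_state(M, words), partition_by_state(Mp, words)

def refines(finer, coarser):
    """True iff every block of `finer` lies inside some block of `coarser`."""
    return all(any(b <= B for B in coarser) for b in finer)

assert refines(P_Mp, P_M) and not refines(P_M, P_Mp)   # M' is strictly finer
```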
Okay, so these things are called blocks: this is a block, this is a block, this is another block. Partitions are composed of blocks, and we define the partition refinement of any two partitions in the following way. We read this in the following way: M′ refines M if you can construct M′ by splitting the blocks of M into pieces, just as we did here. So, in the informal sense, the mathematical name for what we're doing with this operation here is a partition refinement.

One thing that is interesting about this relationship is that it defines what we call a partial order over the set; again, when I say set, just think about the set of the strings and how we partition them. A partial order is a binary relation that is antisymmetric (this is the difference) while preserving reflexivity and transitivity. We have the antisymmetry condition because we know that if M′ refines M, then M cannot refine M′. [Time check: probably four minutes or something like that.]

So we've got these schemes in our heads, right? We know what a minimal automaton and a non-minimal automaton are, and how we refine them: this automaton refines this automaton, and basically every non-minimal automaton refines the minimal automaton. Okay, let's now consider the set of all automata that actually recognize this language L. It's going to include infinitely many elements, things like M′, and you know that there is a unique element inside this set which we call the minimal DFA.

The last mathematical operation... okay, penultimate. Another mathematical structure we're going to consider is this: when you have a partial order over a set, you get something you actually know; we saw it: it's a lattice, a lattice of these partitions, ordered by the partition refinement. Now what I'm saying is going to become really clear. The lattice structure we have is like this: you have this point, and then you go like this, like this, like this, and you have this kind of tree-like structure, right? So this is something we're actually familiar with, even if only informally. What we are going to do is construct a lattice of automata over this set. It's essentially infinite, but (I'm not going to go into the mathematics) we can again partition it and work with a finite picture. So in our case we start with this guy over here, the minimal automaton; think about it as the root of the lattice. When we have a non-minimal automaton, what we do is basically find the minimal automaton and introduce a state that is equivalent to another state. For example, in this M′, q3 acts in the same way as q2, but I could add another state that would act equivalently to q1 or q0, and so on and so forth. So one of the M′s can look like that, but there will be another with a q1′ that you can't distinguish from q1, and another with a q2′, and so on and so forth. As you take this road upstairs, you are increasing your size complexity; as you go down toward the root of the lattice, what you're doing is minimizing your size complexity. And this is what we're using: a simple proof showing that if you minimize the size complexity, you're going to minimize dissipation.

And there's one final mathematical operation, the one I just kept talking about. This lattice is a mathematical structure on which you can introduce some really interesting mathematical operations, and one of them is what they call the join. It is defined in the following way. Let's say I'm going to use α and β, composed of these blocks that we just considered: this is a block, this is a block. The join is the operation written like this, where in general you are taking intersections over all the blocks that define these different partitions, with i and j ranging over the blocks, basically natural numbers.
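The join as code, continuing the same sketch:

```python
def join(alpha, beta):
    """Join of two partitions: all non-empty pairwise block intersections."""
    return [b & c for b in alpha for c in beta if b & c]

# Joining the root (minimal) partition with a finer one returns the finer one:
same = lambda P1, P2: sorted(map(sorted, P1)) == sorted(map(sorted, P2))
assert same(join(P_M, P_Mp), P_Mp)
```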
One thing that is interesting about this operator is the following. Instead of considering arbitrary α and β, think about the minimal automaton's partition at the root and a non-minimal one that corresponds to this kind of finer partition, and take the join of them. Because of the fact we just pronounced, that you can construct M′ by splitting M, if you take these intersection operators, what you get as the result of this join operation is M′ itself. So if I give you a minimal automaton and a non-minimal automaton, as their partition constructions, and I apply the join operation, what we have is the non-minimal automaton itself.

Okay, now the final step. Let's remember that we wanted to talk about thermodynamic cost, right? So we're going to do it now. Again, what we wanted to understand was the following. This is the formula that David used: you have a system, and you decompose it into a computational machine and an environment. So now we are coming back to this finite-bath formalism of stochastic thermodynamics. You have this sort of idealized universe, and you split it into two, which is the computational machine and the environment that actually generates, or feeds in, the input strings, like 1, 0, and so on. Now, one of the things we can do is define the universe as triples of these... or, let me not go into triples: basically just the pair of the state that describes the stage of the evolution of the DFA (let me use q here) and the string that is generated by the environment. And your system, your universe, evolves as this changes: as you iterate, as you process these strings and update the DFA correspondingly. We're not going to be using it here, but what we do in terms of partitions is also partition the whole phase space, writing it as a Cartesian product over these two: the elements that go here are the states of the DFA, and the elements that go there are the strings generated by the environment.

So this is how you can actually set up the question: my computational machine is the system of interest, and this denotes that I'm computing the entropy production, the entropy production I generate as I process the strings through the DFA, and this is the heat flow into the environment. So one thing we are going to do is take this result, plus the fact that we can actually locate these automata on a lattice defined by a partial order over the set of the strings accepted by these automata, and just see what happens to the change in the entropy.
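The formula in question, written in the hedged notation used earlier (the sign conventions are mine; the point made next is that the bath term is the same for every equivalent automaton fed by the same environment):

```latex
\mathrm{EP} \;=\; \underbrace{S(p_T^X) - S(p_0^X)}_{\Delta S \text{ of the machine}}
\;+\; \underbrace{\beta\, Q_{\mathrm{env}}}_{\text{heat flow into the environment}}
```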
One crucial thing we do here is emphasize that, from a computer science perspective, the state of the automaton, the state of the system, is actually a Dirac delta distribution, because it's initialized to one state and it just starts evolving from that one state, right? So if you actually want to compute ΔS, which is the difference between the final-time entropy and the initial-time entropy: well, the entropy of a Dirac delta distribution, where there is no uncertainty, is zero. So what you're going to compute is basically just the final-time entropy. One thing that is crucial about this lattice (we will show it, we put it in the paper, we don't have to talk about it right now) is that the lattice doesn't change as you process the strings. Okay, so this is important: whatever the time at which you're checking, the result we are going to prove now, in just one step, is going to hold for all probability distributions, at all iterations, independent of the dynamics and everything.

So now, we defined this join operation, and this is the key to our proof: one proposition, which was actually proved by Arnold back in, I don't know, the sixties, and then the theorem. Okay, we start with the proposition. He suggests, and he proves, that the entropy, let's say S, over a refined partition is always larger than the entropy over the partition that is not split into pieces, that is not refined. Let's show that very quickly. It turns out this is basically the joint entropy. [Audience: exactly.] Perfect, thank you. So this is basically the joint entropy; you can check it. Arnold has an ergodic theory book; I don't know exactly what it's called, just google it. So you can write this as this one. Now what we're going to do is take this one and insert it into this one, because we know that the join of two partitions, where one is a refined partition of the other, is basically given by the refined partition. So this left-hand side is S_{M′}, and this is a non-negative conditional entropy. That's it.

Okay, so one thing we now know is that ΔS, which is given by this, is always minimal for the minimal automaton, and we know this also because there is a unique minimal automaton, sitting at the root of the lattice, and so on and so forth. So if you want to minimize the change in the entropy, you use the minimal machine. But what we're actually asking for is to minimize the entropy production, right? Should I go into that? Okay, I can go into that, just one minute; we're going to jump a lot, but it's going to be there. This term is the change in the expected energy of the bath, right? When you run many equivalent DFAs, one of them the minimal DFA and the rest non-minimal DFAs, what you do is let them process these strings that are generated by the very same environment, right? So this term is essentially the same (there are also other, more formal arguments) for every automaton, for the minimal automaton and for the non-minimal ones, so you don't care about it. So if you have this, then you have this; sorry, this is the EP, right.
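Arnold's proposition in one line: since the join α ∨ β refines α, we have S(α ∨ β) = S(α) + S(β | α) ≥ S(α), the conditional entropy being non-negative. A quick numerical check, continuing the partition sketch above (the distribution p is made up for illustration):

```python
import math

def entropy(partition, p):
    """Shannon entropy (nats) of p coarse-grained over a partition's blocks."""
    probs = [sum(p[w] for w in block) for block in partition]
    return -sum(q * math.log(q) for q in probs if q > 0)

# Any distribution over the example words will do:
p = {"0": .1, "1": .2, "10": .25, "11": .15, "100": .1, "101": .1, "110": .1}
# Refining a partition can only increase entropy: S(M') >= S(M).
assert entropy(P_Mp, p) >= entropy(P_M, p) - 1e-12
```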
So because we have this, we can also write this: this one is for M′ and this one is for M. And then, of course, one of the things we didn't discuss is that if you have a different computational machine, the way you decompose this universe changes. You can introduce pointers, you can make the architecture as complex as possible, but basically what you can always do is recreate this kind of picture, where entropy production respects a partial order over the set of equivalent automata. So, one other point: this is a result that holds for all distributions, at all iterations, independent of how you run the automata, and so on and so forth. Another result can be derived by considering, say, two different automata, neither of them minimal. How do you compare the automata then? You start asking questions that require knowing something about the underlying dynamics or the probability distributions, and so on. And if you read the paper, you will see that by assuming some really mild, I think fairly sensible, assumptions, there are also very gentle ways of comparing the dissipation costs of any automata that recognize a language. So this is something cute, and I think it's a first step towards actually talking about computational complexity and thermodynamics, building on the algorithmic complexity work that David introduced. From there on, I think there can be some really marvelous, splendid landscapes that we might see. But yeah, I hope you enjoyed it, and thank you very much.

Okay, thank you. Are there questions? Well, okay. Then I think we thank you very much for this last lecture, and we meet again at four for the next lecture, and tomorrow for the exam. Okay.