working in Göttingen and he will talk about almost the same topic, so there's a lot of discussion in the field. Welcome. — Yeah, thank you very much. Thank you very much for the invitation. This is my first time at ICTP, even irrespective of the pandemic. Also thanks to Gili for giving a good introduction to the field; it should at least raise the motivation for why there are some very intricate questions here. I will talk about a series of works with my excellent postdoc David Hartich. We tried to address the same system from a slightly different perspective — the motivation was a bit different. It was not so much to quantify irreversibility, but rather how to physically define the notion of irreversibility when not all degrees of freedom are observed, either in time or in space or both. The outline is the following. I will have an extended introduction, because Edgar yesterday and the day before asked me to put a little more emphasis on the background, so that everyone can hopefully get the motivation for why these kinds of questions are really deep and interesting. Then I will present two different aspects. Both concern systems arbitrarily far from equilibrium. As Thomas mentioned, I will always take the perspective that the microscopic, fundamental equations of motion are Markovian, and that memory comes strictly from projecting out degrees of freedom, i.e. from coarse-graining. I will start with a little bit of nomenclature, because it is not so clear what one means by anything that is not Markov. The terminology is the following. Markov processes have no memory: the next state depends only on where you are now — this is the only thing you need to know. The first generalization are semi-Markov processes.
To describe such a process, you need to know not only where you are but also how long you have been in this state; together these fully determine the future, and you do not need to know anything that happened earlier. That is why it is called semi-Markov. Then there are second-order semi-Markov processes, where in addition to knowing where you are now and how long you have been there, you also need to know where you came from. And you can go to arbitrarily higher order: a process with infinite-range memory would be a semi-Markov process of infinite order, because you would essentially need to know the entire past — the previously visited state, the one before that, and so on. In this way you can cover the space of non-Markov processes. I will talk about processes that, after coarse-graining, evolve on a discrete state space, but you can generalize essentially the same construction to continuous space. The logic is the following: every Markov process is a semi-Markov process, and every semi-Markov process is also a second-order semi-Markov process, and so on — but the converse is not true. A semi-Markov process is generally not a Markov process, and a second-order semi-Markov process is generally not a semi-Markov process. What I will try to do is generalize the concepts of stochastic thermodynamics from Markov processes to semi-Markov processes. And by generalizing, I mean the results must also hold for Markov processes: if they only held for semi-Markov processes that are not Markov, I would not have generalized anything — I would just have derived something different. So I really mean to cover the entire blue region in the diagram, Markov processes included. Now, of course what we want is Markov, simply because then the full process is specified by a single matrix of numbers: the transition rates. For a complicated system, inferring these from experiments is hard in general, irrespective of whether there is memory or not, simply because it is difficult.
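As a minimal illustration (not from the talk; the rates are made-up numbers), a Markov jump process is indeed fully specified by one matrix of transition rates, and the standard way to simulate it is the Gillespie algorithm: exponential holding times and next states that depend only on the current state.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical transition-rate matrix w[i, j]: rate of jumping i -> j
# (illustrative numbers only; the diagonal is unused).
w = np.array([[0.0, 2.0, 0.5],
              [1.0, 0.0, 1.5],
              [0.5, 3.0, 0.0]])

def gillespie(w, state, n_jumps):
    """Simulate a Markov jump process: the exponential holding time and
    the next state depend only on the current state -- no memory."""
    states, times = [state], [0.0]
    for _ in range(n_jumps):
        rates = w[state].copy()
        rates[state] = 0.0
        total = rates.sum()
        times.append(times[-1] + rng.exponential(1.0 / total))
        state = int(rng.choice(len(rates), p=rates / total))
        states.append(state)
    return np.array(states), np.array(times)

states, times = gillespie(w, state=0, n_jumps=1000)
```

Everything about the future is encoded in `w` and the current state — this is the single matrix of numbers referred to above.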
Semi-Markov processes are more general and have higher resolution, but there is a cost: you need more. You need two matrices. One matrix tells you the likelihood of visiting the different adjacent states, and then you need a matrix of functions that tell you the distribution of the time of exiting from each state to every connected state. So this is a huge blow-up in the information needed to specify such a process. For a second-order semi-Markov process, you need this information for every incoming state as well: a matrix of numbers for the splitting probabilities, and then a higher-rank tensor of functions for the waiting times. So you gain resolution and you gain precision, but the demands of parameterizing such a process from an experiment explode. You therefore always try to coarse-grain in a way that is as close as possible to Markov, while at the same time the model is not complete crap — because you can always coarse-grain any system to a single state, which is trivially Markov; it is just that this state never moves. That is the extreme case. What I will talk about are only semi-Markov processes — everything in this blue and dark-blue blob — that arise from coarse-graining a Markov process. Okay, now another brief remark on coarse-graining. Of course, there is no unique coarse-graining; there is a continuum of different coarse-graining procedures, and I will simply sort all of them between two extremes. One extreme is called lumping: you divide the entire phase space, so that the coarse state space is a union of meso states covering the whole phase space. The other extreme is called milestoning, where you don't do that.
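The two objects just described — a splitting-probability matrix and a matrix of waiting-time distributions — are enough to simulate a semi-Markov process. A minimal sketch (all numbers and the choice of gamma waiting times are illustrative assumptions, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(2)

# Splitting probabilities P[i, j]: likelihood that the next state after i is j
# (illustrative numbers; rows sum to one, diagonal zero).
P = np.array([[0.0, 0.7, 0.3],
              [0.4, 0.0, 0.6],
              [0.5, 0.5, 0.0]])

# One waiting-time distribution per ordered pair of states: here gamma
# distributions, where shape 1.0 is exponential (Markov-like) and shape > 1
# is non-exponential, i.e. genuinely semi-Markov (all shapes are made up).
shape = {(0, 1): 1.0, (0, 2): 3.0, (1, 0): 1.0,
         (1, 2): 2.0, (2, 0): 1.0, (2, 1): 4.0}

def sample_waiting_time(i, j):
    return rng.gamma(shape[(i, j)], 1.0)

def semi_markov(P, state, n_jumps):
    """Generate a semi-Markov trajectory: first draw *where* to go from the
    splitting probabilities, then draw *when* from the pair-specific
    waiting-time distribution."""
    states, times = [state], [0.0]
    for _ in range(n_jumps):
        nxt = int(rng.choice(P.shape[0], p=P[state]))
        times.append(times[-1] + sample_waiting_time(state, nxt))
        states.append(nxt)
        state = nxt
    return np.array(states), np.array(times)

states, times = semi_markov(P, state=0, n_jumps=500)
```

Setting every shape to 1.0 makes all waiting times exponential and reduces the process back to Markov — which is the sense in which semi-Markov generalizes Markov.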
Instead, you select small metastable cores and you simply ignore the rest. A milestoned trajectory — this line here — encodes the current state in the colour: the state is just the last visited milestone. The trajectory is blue up until the first time it hits a new milestone, then orange up until the first time it hits the second milestone, and so on. This uniquely specifies the milestone trajectory. These milestones may be closed hypersurfaces or open surfaces; there is basically no rule. You choose them such that the model you get is as close as possible to Markov, because then you can actually do something with it — or, if that doesn't work, as close as possible to semi-Markov. It is always a compromise. The model you can build also depends on the resolution of the experimental data, both temporal and spatial. I will not talk about these experimental difficulties, but they are very real from a physical perspective. Now, the problem with lumping: if you look at the community that actually builds these models — not theoretical physicists, not the stochastic thermodynamics community, but people who try to infer models from experiments or computer simulations — they realized that lumping is really bad, simply because it prevents the emergence of Markov dynamics, and Satya can tell you all about why: you have frequent recrossings, and this completely spoils any exponential density of holding times in the lumped states. So if you want a Markov process on the lumped space, you need a very poor time resolution — the time resolution of the Markov state model must be much longer than any relaxation time in the hidden states, in the meso states — so this imposes a resolution.
But even then — assuming you build a Markov model with this very poor time resolution — functionals evaluated along the coarse-grained state-space trajectory do not agree with the functionals of the original dynamics. These are two fundamental limitations of such models. The good thing is that they are easy to build: if you have intuition, you just cut the space. That is a very strong motivation for using them, and it is the coarse-graining paradigm in stochastic thermodynamics. Milestoning, on the other hand, was actually invented by mathematicians to get as close as possible to Markov processes, and there is a rigorous condition under which the coarse-grained process is effectively Markov: the equilibration time within a milestone must be very short compared to the time of changing states, and the periods during which the full microscopic trajectory is in neither milestone must be short. So it will not be Markov when there are long excursions that leave and return to the same milestone, or long transition-path times commuting from one milestone to the other — but even in that case you still get a semi-Markov process. Milestones can also uncover hidden states on some occasions; I will not talk about that. What we found, and what was our interest, is that milestoning actually exactly preserves the microscopic entropy production. We had this result fast, but it took us two years to understand why, and we still don't understand it fully — this is not a closed story yet.
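The last-visited-milestone mapping described above can be sketched in a few lines; the trajectory and the milestone intervals here are purely illustrative, and the 1D intervals stand in for the hypersurfaces of the general case:

```python
import numpy as np

def milestone_trajectory(x, milestones):
    """Map a continuous trajectory x(t) onto the last-visited milestone.
    `milestones` is a list of (lo, hi) intervals; points outside every
    interval keep the previously assigned state (the colour-flash rule).
    Before the first milestone is hit, the state is undefined (-1)."""
    states = np.full(len(x), -1, dtype=int)
    current = -1
    for t, xt in enumerate(x):
        for k, (lo, hi) in enumerate(milestones):
            if lo <= xt <= hi:
                current = k
                break
        states[t] = current
    return states

# Illustrative trajectory commuting between milestones around x = 0 and x = 5.
x = np.array([0.1, 1.0, 2.5, 4.0, 5.1, 3.0, 2.0, 0.2, 1.5])
s = milestone_trajectory(x, milestones=[(-0.5, 0.5), (4.5, 5.5)])
# s -> [0, 0, 0, 0, 1, 1, 1, 0, 0]
```

Note that the excursion up to x = 4.0 and back does not change the state — only actually hitting the other milestone does, exactly as in the blue/orange picture.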
For those who might find milestoning exotic, let me remind you that Kramers in his original work actually did milestoning — he just didn't call it that; I don't know who decided on the name. What he did was coarse-grain barrier crossing in a potential with high barriers and deep wells into a Markov state process, to describe the kinetics of chemical reactions. He said: starting from equilibrium in one deep well, I just need to look at the crossing current, or the first-passage time, to another minimum. So you have one milestone here — and then you realize that if the barrier is high and sharp you don't even need the full barrier: you expand around the top to quadratic order and extend the integration to the full line. The rate he got contains a non-local effect, namely the curvature at the barrier top. The Kramers rates typically used in stochastic thermodynamics of Markov processes are basically milestoned processes: you have one milestone extending up to a certain point — this height here is roughly one k_B T — and another one on the other side. So this is very old; you just don't usually think about it this way. Kramers did not lump states — he really milestoned, so he already knew this is the way to do it. The other ingredient is time reversal, which was introduced very nicely earlier. It is about the asymmetry between systems evolving forward and backward in time, but there is a subtle difference between mathematical time reversibility and dissipation, i.e. physical time irreversibility. In other words, in general one must not simply read the trajectory backwards. The standard way to quantify irreversibility is the Kullback-Leibler divergence: the average of the logarithm of the ratio of the forward path measure to the backward path measure.
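In formulas, the construction just described reads (a sketch in common notation, not a formula transcribed from the slides; $\theta$ denotes the time-reversal involution — the "exclamation mark" discussed next):

```latex
\dot S \;=\; \lim_{t\to\infty}\frac{1}{t}\,
  D_{\mathrm{KL}}\!\left[\,\mathbb{P}[x_{[0,t]}]\;\big\|\;\mathbb{P}[\theta\,x_{[0,t]}]\,\right]
  \;=\; \lim_{t\to\infty}\frac{1}{t}
  \left\langle \ln\frac{\mathbb{P}[x_{[0,t]}]}{\mathbb{P}[\theta\,x_{[0,t]}]}\right\rangle
  \;\ge\; 0 ,
```

where, as explained next, $\theta$ must do more than read the path backwards: to recover the physical entropy production it must also flip the variables that are odd under time reversal (momenta, magnetic fields).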
The average is over the steady state, here over infinitely long forward paths. And only if one selects this exclamation mark — a very subtle thing; I used an exclamation mark deliberately — does one actually get the steady-state entropy production this way. A simple example of why this is problematic: take a Newtonian trajectory — a particle with a position and a velocity — and read the trajectory backwards. You get a trajectory that is physically impossible; it has probability zero, because you get a particle moving backwards whose momentum points forwards. It is well known that this is easily remedied: you just have to flip the momenta, because momenta are odd under time reversal. So this is well known and in a way obvious, but it highlights nicely the distinction between the mathematical irreversibility of trajectories and physical irreversibility, i.e. dissipation. And this concerns not only systems with momenta, but also overdamped systems in magnetic fields: you also need to flip the magnetic field. The magnetic field performs no work on a charged particle, but if you don't invert the field, you still get a positive entropy production when you read the trajectory backwards. So what we wanted to ask is: what about memory? If I now ignore degrees of freedom, is there such an exclamation mark? And if there is, what is it, and what does it mean? This is a very deep question; the entire project took us more than three years — getting all the proofs took maybe one and a half years, and it took us two years to understand what they mean, and we are still not fully there. To give you some intuition about Markov systems, I first look at the Langevin equation. This is overdamped.
I write it as a stochastic differential simply because the trajectory is nowhere differentiable — with probability one this would be infinite if I divided by dt — but that is just a mathematical concern. If you now look at the dissipation of an overdamped Brownian system, you can really use the classical time reversal without any tricks: if there is no magnetic field, you just take the log-ratio of the forward and backward path probabilities, and what you get relates the stochastic path integral along the diffusive trajectory — essentially the force integrated along the stochastic trajectory up to time t — to the entropy change in the bath. To get some intuition for what this means, it is best to look at closed trajectories: trajectories that start and end at the same point. If the force is a gradient of a potential, the system obeys detailed balance, and since this is a Stratonovich integral it behaves like an ordinary integral; you see that closing any path in a detailed-balance system causes no entropy change in the bath, so any cycle returns both the system and the bath to the same point. If the force is not a gradient field, the system returns to the same point but there is a net entropy change: you have dissipated something into the bath, or vice versa if the change has the other sign. That is the intuition. Now, if you have a coarse-grained state space — a discrete process — there is no unique way to do this. There is one paradigm that works for Markov processes, and this is called local detailed balance.
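The closed-loop intuition above can be checked with a plain midpoint-rule line integral — the discrete analogue of a Stratonovich integral along a closed path. The forces and the circular loop below are illustrative choices, not the example from the slides:

```python
import numpy as np

def loop_integral(force, n=2000, R=1.0):
    """Midpoint-rule line integral of `force` around a circle of radius R:
    the discrete (Stratonovich-like) work integral along a closed path."""
    th = np.linspace(0.0, 2.0 * np.pi, n + 1)
    pts = np.stack([R * np.cos(th), R * np.sin(th)], axis=1)
    mid = 0.5 * (pts[1:] + pts[:-1])          # midpoint evaluation
    dx = pts[1:] - pts[:-1]
    return np.sum(np.einsum('ij,ij->i', force(mid), dx))

# Gradient force F = -grad U with U = (x^2 + y^2)/2: detailed balance,
# so closing any loop dissipates nothing into the bath.
grad_force = lambda r: -r
# Non-gradient (rotational) force: a net amount is dissipated per cycle.
rot_force = lambda r: np.stack([-r[:, 1], r[:, 0]], axis=1)

cons = loop_integral(grad_force)   # ~ 0
rot = loop_integral(rot_force)     # ~ 2*pi*R^2
```

The gradient force integrates to (numerically exact) zero around the loop, while the rotational force leaves a finite amount per cycle — the closed-loop signature of broken detailed balance.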
Local detailed balance does not imply global detailed balance; it just means that if you look at pairs of states, they behave as if they were in local detailed balance. The paradigm says that the log-ratio of the forward rate and the backward rate is related to the entropy change of the transition, and the idea is that this should work as soon as the relaxation within the meso states is very fast compared to the transitions between them. In other words, while a transition happens, all the unobserved degrees of freedom are instantaneously equilibrated at the temperature of the bath at all times — that is essentially the paradigm of stochastic thermodynamics. If you put this into the machinery and calculate, you get the laws of thermodynamics as you would like them: first, second, and so on. The point is that this connects stochastic dynamics — the left-hand side — with thermodynamics — the right-hand side. So the question is: what happens if you do not have time-scale separation? That is basically my talk. First a motivation for what happens, because I am not saying that every coarse-grained process will be non-Markov — in fact, typically, if the barrier is high and sharp and the wells equilibrate quickly, you get a Markov process, so it works.
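Local detailed balance and the resulting steady-state entropy production can be made concrete on a toy three-state cycle. The rates below are illustrative (ln(w⁺/w⁻) = a on every link), and the entropy-production expression is the standard Markov-jump formula, not one transcribed from the slides:

```python
import numpy as np

def steady_state(W):
    """Stationary distribution of rate matrix W (W[i, j] = rate i -> j)."""
    L = W.T - np.diag(W.sum(axis=1))          # generator acting on columns
    vals, vecs = np.linalg.eig(L)
    p = np.real(vecs[:, np.argmin(np.abs(vals))])
    return p / p.sum()

def entropy_production(W):
    """Steady-state entropy production rate of a Markov jump process:
    sigma = sum_{i<j} (w_ij p_i - w_ji p_j) ln[(w_ij p_i)/(w_ji p_j)] >= 0."""
    p = steady_state(W)
    sigma = 0.0
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if W[i, j] > 0 and W[j, i] > 0:
                Jf, Jb = W[i, j] * p[i], W[j, i] * p[j]
                sigma += (Jf - Jb) * np.log(Jf / Jb)
    return sigma

a = 1.0   # affinity per link: local detailed balance sets ln(w+/w-) = a
driven = np.array([[0.0, np.exp(a / 2), np.exp(-a / 2)],
                   [np.exp(-a / 2), 0.0, np.exp(a / 2)],
                   [np.exp(a / 2), np.exp(-a / 2), 0.0]])
equilibrium = np.ones((3, 3)) - np.eye(3)    # symmetric rates: detailed balance
```

With symmetric rates every term vanishes (detailed balance, zero dissipation); with the driven cycle a steady current flows and the entropy production is strictly positive.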
Here is an example: a double well, with a diffusive trajectory commuting between the two minima. The subtlety is that if you look at the exit time from either of these states, the process is Markov when this exit time is approximately exponential, as here — the exponential shown has the same mean as the distribution in blue. The dwell time — basically the time the trajectory spends before a successful transition — is long, meaning you reach local equilibrium, and the transition itself is very fast: that is the black bar; you basically don't see it, so the total exit time is essentially the dwell time. If you want Markov processes, this transition time must be negligible compared to the dwell time. This is well known. Now, even when this is the case, over roughly the last ten years people have started to measure the statistics of these small black bars, even when the exit statistics are exponential — in protein folding, and even out of equilibrium, for example in the context of catch bonds, where the statistics of the duration of these transition paths tell you whether there are parallel pathways between two states. The physical content in the statistics of these very short segments is very rich, even when the exit statistics are exponential. But that is not what I am interested in; I am interested in what happens when they are not exponential — and this happens very easily, and this is where the nightmare begins. Whenever the barrier is rugged, or extended and flat on top, you have frequent revisitations and you spend a lot of time outside the milestone, and then the exit statistics are not exponential. What also happens is that if you drive one transition in one direction very strongly — you tilt one metastable meso state — then the dwell time becomes shorter. In both cases the condition is violated: the black bars are no longer much smaller. This we call a mild violation of Markovianity. Things really go sour when you have parallel pathways: two paths connecting the same pair of states, one fast and one slow — as in catch bonds. Catch bonds are non-covalent bonds with the bizarre property that if you pull on them weakly they break immediately, but if you pull very strongly they become very resilient. One explanation is that pulling with strain just puts more probability on the slow path, and then the system spends a very long time on the transition path. This we call a strong violation. Whenever you have either of these on at least one transition in the network — you don't need it everywhere — the entire process becomes semi-Markov. We cannot describe this in full generality, simply because it is too complicated, so we started with a diffusion process on a graph. This is an anti-Itô Langevin equation, overdamped; the force field and the diffusion landscape are quenched on the graph — they are always there, no annealing. This is how it looks; the anti-Itô convention is just the way you propagate things. The way to understand it physically is that between two meso states you can always find a coordinate, and this x is just a progress variable along the respective reaction coordinate. The model becomes a good representation whenever all the degrees of freedom orthogonal to these reaction coordinates are fast compared to the progress along them. Now what we do is milestoning, where we put the milestones in the vertices. Imagine a Gedankenexperiment with detectors of different colours in the vertices: whenever a trajectory passes a vertex you
see a flash of a colour. You then align those flashes on a timeline, and the state is the same as long as the flashes have the same colour; when the colour changes, you have a state change. Very simple. What you get out are the waiting-time distributions out of each of these states, shown here. This one is not exponential, because only this connection, 3–1, has a rugged barrier — that is how we defined it. But funnily enough, not only is this transition non-exponential: the other transitions out of state 3 are non-exponential as well. So the effect is non-local. This is a bit subtle: the process is not exponential, therefore not Markov — Gili explained this, so I don't have to — on the level of the probability density. This had been analyzed phenomenologically by Landman, Montroll and Shlesinger in the late 1970s, so we didn't invent anything new there, but we wanted to know what to feed into the dynamics: what are the waiting times and what are the splitting probabilities. And then, if the slide lets me, I will go further. Okay, these are now the main results, all obtained by carrying out the coarse-graining — the proof is massive and very long, and I will not even explain how we did it. What emerges is the following. The directional waiting-time density — the density of the time you spend in i before transiting to j, which is the conditional first-passage time density divided by the splitting probability of actually going there — is a sum of two statistically independent terms: one is the dwell time, the other is the transition-path time. They are statistically independent as soon as you condition on the next state: once you decide where to go, they are independent. And then there are three symmetries. The first we call thermodynamic consistency: it relates the log-ratio of the splitting probabilities to a conservative part, and everything that sits in the waiting times enters this conservative part — this was very difficult to see. All the contributions of the waiting times enter conservatively, so if you close a loop they cannot contribute to the physical entropy production, simply because they are conservative; the affinity part is, if you wish, what in a Markov state process is the force integrated along the link. If the process is Markov, this reduces exactly to local detailed balance — so this is the generalization of local detailed balance to semi-Markov processes. Symmetry two says that the dwell time is a state variable: it does not depend on where you go next, only on where you are. Symmetry three is the symmetry of transition-path times: the statistics of the forward transition paths are identical in law to those of the backward paths. That is not a new result; it was proven in this work from 2006. Okay, now coming back to entropy production and this symmetry. You see that the waiting times are not symmetric: the waiting times for exiting 3 to 1 and 3 to 2 are different — you see it in this picture. There were these works in 2001 and 2007 where the Kullback-Leibler divergence was calculated without the exclamation mark — just taking the plain forward-backward Kullback-Leibler divergence — and compared to the steady-state dissipation, and they realized that it comes out larger, while the inequality should go the other way around. They didn't fully understand what was going on, but they stated that a positive value of this Kullback-Leibler divergence cannot imply broken detailed balance; in particular, they constructed a detailed-balance example where the dissipation is zero but the divergence is not. And when you coarse-grain, you cannot get an entropy production that is higher than the microscopic one — that is just not possible, it is nonsense. So what is missing? What is the exclamation mark? This took us very long, and what we realized is that coarse-graining and time reversal do not commute. If you coarse-grain
the backward-in-time microscopic trajectory, you get a different trajectory: the transition paths are odd under time reversal. This is the exclamation mark. The dwell periods are completely symmetric, but the transition paths are odd, so whenever you calculate the dissipation, the only thing you need to take into account is the oddness of the transition periods. If you do that, then first there is an equality sign here: milestoning really preserves the microscopic steady-state entropy production — this is the microscopic steady-state entropy production, and this is the entropy production rate of the coarse-grained dynamics. And also if you hide cycles — if I coarse-grain in a way that leaves some states out and therefore masks cycles — then the inequality points in the correct direction: not greater-or-equal but less-or-equal. This only holds for semi-Markov processes. If I perform the coarse-graining differently, I can coarse-grain the system such that I get no kinetic hysteresis — that is what we call it, by the way. For example, if I decide to make the state change in the middle of these transition intervals, then I get exactly the same trajectory forward and backward. The problem is that this happens for both transition paths, and the process is then no longer semi-Markov but semi-Markov of second order: I need to know where I came from and where I am going. The previous slides imply that asymmetric waiting times in a semi-Markov process arising from coarse-graining Markov dynamics are only possible in the presence of kinetic hysteresis — for a semi-Markov process this is inherent, it is just there. I have no time to go through this part, so I will go directly to the conclusions; in the last part I only wanted to show that for a Markov process, local detailed balance is only a necessary condition, not a sufficient one — there is a direct counterexample, which basically involves the spectral gap. So here are the conclusions. Coarse-graining and time reversal in general do not commute. There are cases where they do commute, like Markov processes, but it is not true in general when you have memory. We have now solved the problem for semi-Markov processes — quite frankly, not in full generality: there are no time-dependent potentials, many things are still missing, and even more are still not understood. What I didn't show you, but want to advertise, is that time-scale separation is a necessary but not a sufficient condition for the emergence of local detailed balance in Markov processes — this relates Markov processes and stochastic thermodynamics — and that milestoning is thermodynamically consistent even if the process is not Markov. The reason is that even though you neglect parts of the trajectory, the entire information is still exactly encoded in the splitting probabilities — in how the microscopic dynamics visits states first, the probability of going from i to j first rather than to any of the other adjacent states; in the steady-state properties this is fully encoded. The origin of milestoning is in probabilistic potential theory, and this is the reason it is thermodynamically consistent. Thank you for your attention; I apologize for being four minutes over time, and I thank the German Research Foundation, the Max Planck Society and the Studienstiftung for funding. — Thank you very much for a very interesting talk. Questions from the students? — In going from lumping to milestoning, is there an optimal recipe for shrinking the states down
to metastable cores? — Rigorously, no: you can actually prove that you cannot know. Not that an optimum doesn't exist, but that you cannot tell whether one exists or not. From experience with toy models: yes — if you give me the microscopic model, so that I know what happens, I can give you a very good guess, and in certain cases we can potentially prove that it is optimal. But with an experimental observation, where a priori you do not see everything — you miss fast intermediates because of the time resolution, or for other reasons — it is a matter of trial and error. And the worst thing is that, to build such models for molecular machines, people typically need to combine different methods with different spatio-temporal resolutions: molecular dynamics reaching, say, a couple of hundred nanoseconds, fluorescence spectroscopy reaching microseconds to milliseconds, force spectroscopy similarly — and each of these sees different substates at a different resolution. So the modelers have a very difficult job there; our theory is trivial compared to that. — Thank you. I have a question about the model you started with, the diffusion on the graph. You said you are essentially taking one reaction coordinate connecting each pair of states, right? Which kinds of systems do you think this is a good representation for? Because in general, for a system with multiple metastable states, the transition from one state to another and the reverse transition do not follow the same reaction coordinate — that happens only if there is reversibility, i.e. everything is at equilibrium. What kind of models do you want to describe with this?
Parallel paths in that sense are not problematic — those one can treat in exactly the same way, so having two channels is not a problem for the theory per se, even if the forward path predominantly goes one way and the backward path another. The real problem is when the dynamics orthogonal to those paths is not much faster: Brownian paths will not stay in a narrow straight tube, they will be quite scattered, and there the description fails. If you have something like an instanton tube, it is a good representation. The good thing — and this is follow-up work that will appear soon — is that we can now replace these links by a Markov state network: instead of this line, a 1D process, I put in a Markov network, and the results remain unaffected. It is just dry algebra, with little extra insight, but the formulas do not change at all, and to the extent that the new microscopic Markov network is a good representation of the dynamics, it is more general. But to clarify: I am not implying that one can actually resolve all of that — that would be a bit backwards. I am only saying that what one can resolve, one can describe to good accuracy by such a process. In reality, if you don't know where the paths go, you can only try it and see how well it agrees. — Thanks. I think we have time for one more question; Satya will ask first. — Let me ask a very general question. Imagine I am an experimentalist and I give you a time series, as long as you want. Is there a practical way to figure out whether the underlying process is Markov, semi-Markov, or higher-order Markov? — Yes, there is. You know the concept of hidden Markov modeling; you can generalize it to hidden semi-Markov modeling, which is technically more demanding, but you can try to find the best possible hidden semi-Markov network and the best possible hidden Markov network and compare them. Quantifying the memory itself is easy — we did this in my group; I didn't talk about it. If you just have a time series, you construct a surrogate process via the Chapman-Kolmogorov construction — concatenating the propagator with itself over an intermediate time — and then compare this surrogate propagator with the original propagator. If the process is Markov at some scale, the two propagators coincide, and by tuning the intermediate time — which does not exist for a Markov process but does in this artificial Chapman-Kolmogorov construction — you can probe, via the Kullback-Leibler divergence between the two propagators, exactly past which scale the process becomes Markov. It is not a zero-one criterion: you never say "this is perfectly Markov" or "this is perfectly non-Markov," but given a temporal and spatial resolution, and your ignorance margin if you wish, you can at least say that a process is significantly non-Markov on a given time scale. — For the sake of time, I think we should go for the coffee break — I need it as well; you can ask more questions over coffee. Thanks a lot again, Aljaž, for your talk.
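The Chapman-Kolmogorov-style surrogate test described in the last answer can be sketched as follows. The two binary series and all numbers are illustrative; the second example doubles as a demonstration that lumping hidden states creates memory:

```python
import numpy as np

rng = np.random.default_rng(0)

def transition_matrix(x, lag, n=2):
    """Empirical lag-step propagator of a discrete time series."""
    C = np.zeros((n, n))
    for a, b in zip(x[:-lag], x[lag:]):
        C[a, b] += 1.0
    return C / C.sum(axis=1, keepdims=True)

def ck_deviation(x):
    """Chapman-Kolmogorov check: for a Markov series T(2) = T(1) @ T(1),
    so any deviation beyond sampling noise signals memory."""
    T1, T2 = transition_matrix(x, 1), transition_matrix(x, 2)
    return np.max(np.abs(T2 - T1 @ T1))

def sample_chain(P, n):
    s, out = 0, np.empty(n, dtype=int)
    for t in range(n):
        out[t] = s
        s = int(rng.choice(len(P), p=P[s]))
    return out

N = 200_000
# A genuinely Markov binary series.
markov = sample_chain(np.array([[0.5, 0.5], [0.1, 0.9]]), N)

# Hidden 3-state Markov chain; observing only "state 0 vs not" lumps
# states 1 and 2 (one fast, one slow) and thereby produces memory.
P_hidden = np.array([[0.50, 0.25, 0.25],
                     [0.90, 0.10, 0.00],
                     [0.05, 0.00, 0.95]])
lumped = np.minimum(sample_chain(P_hidden, N), 1)
```

For the Markov series the deviation is pure sampling noise, while the lumped series shows a large, systematic violation — exactly the kind of non-zero-one, scale-dependent memory diagnostic described in the answer.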