I will speak about how to approach sequential decision making, and a little bit about the computational issues. While you were still sleeping I prepared some slides; they are not perfect, I apologize, but I think they will give you an idea. There is a lot of literature, but there is not really a single textbook on this topic aimed at what we do. There are a number of textbooks on sequential decision making I can recommend, but they come mostly from computer science and control. So I cannot recommend one overall book, but I can point you to different literature that deals with this issue, and I will try to give you an overview. To start, let me come back to what we looked at yesterday. You remember this problem? We had a geotechnical problem: you want to build a foundation, and before you decide on the foundation, the geotechnical engineer says we should do a test, drill a hole. We saw yesterday that it was good to do the test: the value of information was larger than the cost of the test, so we concluded that we should do it. But in principle, in a free world, we can do additional things. We do a test, and after the test we have a certain outcome. Say the test indicates that there might be slight movements, so according to our analysis we should build the deep foundation. But we can also do something else: another test. We are free to do another test, because we are still not deterministically sure about the state of the slope, remember.
So I give evidence on the test result — remember what we did yesterday — and we observe a small movement following the test. As I said yesterday, there has to be a temporal order: first there is the test, and then the result might be that we should build the deep foundation. But we might say, you know, this test was not very conclusive, so why don't we do another test? And then another, and another. We saw yesterday that there is a maximum value of information that we can get, the value of perfect information. As I remember it was around 240, and the cost of a test is 20, so beyond some number of tests it is certainly no longer optimal — but we might well do more than one test. In principle it is straightforward to include this, and to save some time I have already done it here. But first let me ask you: if we want to add a second test, what do we have to add to this influence diagram? Yes — we need an additional decision node. So this is the first test, there is a test outcome, and then we have a second test: another decision. And what are the links going into it? We know the outcome of the first test when we decide on the second test, so there is a link from the first outcome to the second decision; then a link from the second decision to the outcome of the second test; and there is also a utility node. I can just show it to you — sorry, it is not very smart of me to do it like this, you can hardly read it, but you get the idea: there is a first test with an associated cost and an outcome; that outcome depends, of course, on the true state of the slope; and there is a second possible test, again a decision with a cost and with an outcome that again depends on the state of the slope. It does not depend
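The test-or-no-test logic just described can be sketched as a small preposterior analysis. All numbers below — prior, likelihoods, utilities — are illustrative assumptions, not the lecture's actual model; only the test cost of 20 is taken from the lecture.

```python
# Value-of-information sketch for the foundation example.
# All numbers are assumed for illustration, except the test cost of 20.

states = {"stable": 0.8, "moving": 0.2}          # assumed prior on slope state
# utility[action][state]: assumed consequences of the foundation choice
utility = {
    "shallow": {"stable": 0.0,    "moving": -1000.0},
    "deep":    {"stable": -200.0, "moving": -200.0},
}
# assumed test likelihoods P(outcome | state)
likelihood = {
    "stable": {"no_move": 0.9, "move": 0.1},
    "moving": {"no_move": 0.2, "move": 0.8},
}
test_cost = 20.0

def best_eu(belief):
    """Expected utility of the best foundation choice under a belief."""
    return max(sum(belief[s] * utility[a][s] for s in belief)
               for a in utility)

# Expected utility without the test
eu_prior = best_eu(states)

# Preposterior analysis: average over possible test outcomes,
# assuming we act optimally on each posterior
eu_with_test = 0.0
for z in ("no_move", "move"):
    p_z = sum(states[s] * likelihood[s][z] for s in states)
    posterior = {s: states[s] * likelihood[s][z] / p_z for s in states}
    eu_with_test += p_z * best_eu(posterior)

voi = eu_with_test - eu_prior
print(f"VoI = {voi:.1f}, test worthwhile: {voi > test_cost}")
```

With these assumed numbers the value of information comes out well above the test cost, so the test pays off — the same structure of argument as in the lecture, just with invented figures.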
directly on the first test. The assumption is that, for a given slope condition, the two tests are independent. Is that realistic? It might be. If you take the same team, the same method, and drill almost at the same place, there is probably some dependence in the performance of the tests; if you do a completely different test, then it is probably independent. So there is an assumption here. Now, one thing I should mention: you might wonder whether there should also be a link from this test outcome to the final decision. There is no link here. You can make that link if you want, but it is redundant, because the influence diagram has this idea of no forgetting: every piece of information that was available when you made a previous decision stays with you — you keep all the information in your head. This is one reason why humans are maybe not that bad at dealing with this problem: in our brains, we forget. It turns out that no forgetting makes the computation very demanding, because the amount of information you carry grows exponentially with the number of decisions, and we will see later that this makes the problem exponentially increasing. Humans just forget, and there are in fact algorithms that deal with this problem precisely by forgetting things. So the information was available when we made this decision and we keep it in mind, but you could also make a link from here to here to make that explicit, if you want. It is not needed, because the implicit assumption of the influence diagram is that you do not forget; if you wanted the opposite, you would have to model it explicitly. Yes — this common principle
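The conditional-independence assumption just mentioned has a direct computational consequence: given the true slope state, a second test outcome is updated with the same likelihood, so two tests are just two chained Bayesian updates. A minimal sketch, with assumed prior and likelihood numbers:

```python
# Two tests, conditionally independent given the true slope state:
# the same likelihood is applied twice. Prior and likelihoods are
# assumed numbers for illustration.

prior = {"stable": 0.8, "moving": 0.2}
like = {"stable": {"no_move": 0.9, "move": 0.1},
        "moving": {"no_move": 0.2, "move": 0.8}}

def update(belief, outcome):
    """One Bayesian update with the shared test likelihood."""
    unnorm = {s: belief[s] * like[s][outcome] for s in belief}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

# Because outcomes are independent given the state, updates chain:
b1 = update(prior, "no_move")    # belief after the first test
b2 = update(b1, "no_move")       # belief after the second test
print(b1["moving"], b2["moving"])
```

Note that if the two tests were dependent given the state — same team, same hole — the second update would need a likelihood conditional on the first outcome as well, and the chaining above would no longer be valid.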
also applies if you have multiple tests, whether sequentially or all at once in the lab: if there is correlation among your sample outcomes, there is always a loss of information. Okay, so this is how we model it. Now, what do you think: should we do a second test, yes or no? It is difficult, but you have the numbers. The assumption is that the second test is exactly the same type as the first, but an independent realization. For the first test the answer was always that we should do it — do you think the second test will still be optimal? The first test had a value of information of around 60 at a cost of 20. Actually it is a tricky question, because you cannot answer it even if you run the model — it is a sequential decision. To make the second decision, you should know the outcome of the first test. Otherwise, as Jochen pointed out, we are back to asking from the start whether we should order one or two tests right away. You could also do that: often, because of time constraints, you cannot wait for the first test result before deciding on the second; you have to order one or two tests from the company directly. In that case you do not have a sequential problem, just the problem of deciding directly whether to do zero, one, or two tests. But the way we do it here is to say: if there is no constraint on time, it is more optimal to wait for the first test first, and then, depending on what it gives us, we might see that it is optimal to do a second one. So let us run the model and see what comes out. The decision on the second test, as you see now, depends on the decision on the first test — and on the outcome of the first test. So let us
open it and see the result. It is a bit hard to read in the back, so let me open it. It tells us: if the decision on test one was "no test" — which does not make much sense, but assume we had first decided to do no test — then yes, we should do the second test. That is logical, because it is like the initial problem: should we do a test, yes or no? If the first test was not done, then of course we should do the "second" test, which would in that case be the first one. That is trivial. The interesting part is here: if we do the first test, should we do a second? What it tells us is that if the outcome of the first test was slight movement or strong movement, it is optimal not to do a second test — in expected utility it does not pay off, because even if the second test gave a different result, it would no longer change the decision. But if the first test outcome was that there is no movement, then it is still optimal to do a second test. So the second test is the optimal choice if the first outcome was "no movement"; otherwise it is better not to do a second test. That is the sequential nature of this problem, and it could go on: a third test, a fourth test, until at some point it no longer makes sense to do an additional one, whatever happens. So that is the introduction. [Question from the audience.] Yes — for this number, the value of information is the difference between these two numbers, and it is exactly the same as we had in the original case. No difference. All right, so that is what we did, and we will come back to this. I just
want to point out this. What we can see now is: when I optimize the second test — the first test is an optimization problem too — the optimization of the second test requires considering eight different cases. We know some of these cases are trivial, but what GeNIe does is consider eight cases, because it has to optimize conditional on the first decision and conditional on the outcome of the first test. So it has to run eight optimizations. If we implemented a third decision, that would be conditional on the first decision, the first outcome, the second decision, and the second outcome: two times four times two times four, which gives eight times eight — sixty-four. You would have to run sixty-four optimizations to optimize the third decision, to consider all the cases. And it actually has to do all of that, because — what I did not show — we of course also have to optimize the first test. For the first test it tells us: yes, we should do it. These numbers have now changed slightly compared to before, because they account for the fact that we can do the second test: when it calculates the optimum here, it has taken into account the eight possible optima there. With a third decision it would have to take into account the sixty-four optima at the third decision; with a fourth, eight to the power of three; and so on. So the problem that has to be considered increases exponentially with the number of time steps, or sequential steps. Very quickly, solved in this crude way, the problem becomes intractable, and that is
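The counting argument above is simple enough to write down directly. With 2 options per test decision and 4 possible states of a test outcome (the three movement levels plus "no test done", as in the lecture's eight-case count), decision k must be optimized once for every possible history:

```python
# Number of conditional optimizations for the k-th sequential decision:
# one per history of earlier (decision, outcome) pairs.

def n_cases(k, n_decisions=2, n_outcomes=4):
    """Histories before decision k: (n_decisions * n_outcomes)^(k-1)."""
    return (n_decisions * n_outcomes) ** (k - 1)

for k in (1, 2, 3, 4):
    print(k, n_cases(k))
# grows as 8**(k-1): 1, 8, 64, 512 — exponential in the horizon
```

This matches the lecture's figures: 8 cases for the second decision, 64 for a third, 8³ histories entering a fourth.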
why I want to address some computational aspects a bit later, and explain the same problem again with another example. All right. You can also think of this as a decision tree, by the way: the corresponding decision tree looks like this, and there you can see that the number of branches, even in this simple example, increases exponentially with the number of tests — the number of sequential steps — we potentially consider. That is the fundamental problem we face. I would say that all decision problems are inherently sequential, but in quite a number of problems we can ignore the sequential aspect. Often we say: in this case it is not actually feasible to wait for the first test result before deciding on the second; because of time constraints — the construction crew is already waiting and needs to know what to do — we have to decide from the start whether we do zero, one, two, or three tests. That is a common way to do it. Sometimes we simplify and say: we should really treat this sequentially, but the computations are too hard, so we approximate by ignoring the sequential aspect; we know this gives a suboptimal solution, but it is still reasonably accurate. So in that sense we can often reduce the problem to a non-sequential one — as in most of the problems Michael showed yesterday in his lecture. His sensor placement problem is hard enough in itself, because it has to consider in principle infinitely many possible arrangements of sensors in space. It is in principle also a sequential problem, but the sequential part is simplified by just assuming the costs you have for the inspection and for the wrong decisions — the C00, the C01, and so on. In many cases that is completely okay,
but there are cases, in particular when we think of making decisions over the lifetime of a structure, where we cannot ignore the sequential nature of the problem and have to confront it. That is this part here, and now I am going to motivate it with inspection planning — that is what my PhD was about. This slide is not from my PhD, but it clarifies that this is a problem where we have to consider the sequential nature of the issue. The idea is: I want to minimize the life-cycle cost of a system; the system is deteriorating, and to manage the deterioration we can do inspections. For inspections we have to decide on a number of parameters — and here I should really credit my PhD supervisor, Michael Faber; these are the points I took from him when I started my PhD. He said: we have to figure out when to inspect, where to inspect, what to inspect, and how to inspect. Other people have come up with exactly the same points. The "when" is exactly this sequential decision problem: should we inspect in year two, in year three, or every five years? The "where" — should we inspect here, or here, or here — is also sequential, or could be: in an optimal approach we would do the first inspection, and then use its results to decide where to focus our attention next. Then: "what" should we inspect for? As Michael also pointed out, we need to know what we are looking for when we plan an inspection; we cannot plan inspections to look for something unknown or not yet thought of. For example, if we know we potentially have fatigue problems, there is a specific type of inspection we might want to do, and the type of inspection to find a general, not yet identified problem would be a different one. So
we have to know what we are looking for, and then we have to figure out how: as was pointed out yesterday, we have different techniques we can use for inspection or monitoring. These are the questions we have to answer. The sequential problem is really just the first question, and I point out the others to say: the sequential problem alone, as I have tried to explain, is difficult enough, but then you have these additional dimensions. In real life, the "what" and "how" are typically answered in a discrete sense: you have a discrete number of possible problem types and a discrete number of possible inspection methods, and that is typically settled by expert assessment, and also by availability — you might have certain techniques available and others not, and you have experts who tell you: this is the type of deterioration we would expect here, and methods A and B are the ones that make the most sense for finding this type of damage. So that is typically done in a discrete way, a priori. The "when" and "where", however — sometimes experts can answer those too, but in many cases it is difficult to impossible to find optimal, or even near-optimal, solutions by asking experts, and what we are trying to achieve is to solve this problem in a quantitative manner. But it is a high-dimensional problem. I will come back to this — that was just the motivation. I have another example that I will bring at the end; I will skip it for now and go directly to the computational aspects while you are still awake, and come back to the last example at the end. So, computational aspects. As I said already, the problem can grow exponentially, and also polynomially with system size. It is basically this decision tree — this figure was made by a student of mine, nothing new — and what you see here is
just the deterioration process. These, again, are random variables, and these are decisions. It starts out with an uncertain development of deterioration — think of a fatigue problem: a fatigue crack might or might not grow, in different dimensions. And remember, the structure is affected not only at one place but potentially at many different locations; many of the structures we look at are quite big. That is why I speak of a polynomial increase with system size: the larger the system, the more states I have here. Then there is the system condition, and we just assume the system either works or it does not. For infrastructure, in most cases we can simplify and say the function of the structure is to stand there and support the load: either it fails or it does not. There might be serviceability limit states, but we can say the complete structure either works or it does not, and for offshore structures that is very much the case — they are called support structures; either they stand there and are okay, or they collapse. A "one" here actually means failure, which is a bit misleading. If the structure fails we can stop the analysis, because that is it — but of course we expect the structure not to fail. Then we have to decide, after some time, whether we should do an inspection, and then come the inspection outcomes — this is always one step behind. So here is the decision on the inspection: we can decide not to inspect at all, or to inspect, and then where — here, here, or here. So not only the number of inspections but also the locations, because the components are not all the same: some are more loaded in fatigue, while other components are
more critical. So this in itself is already an enormous problem to solve — you have almost infinitely many possible options here. Then you have outcomes, again random; then the decision on repairs, which we often simplify by saying that whenever we find something, we fix it — but in principle that is also a decision. Then you have the condition of the system: the deterioration potentially goes on, it evolves, and again there is a possible failure or not, and the whole cycle repeats. You can imagine: this is an exponential increase in problem size, and if you tried to plot all the branches and attach all the probabilities, you would very quickly run out of computation. It is not possible even for a small system; it simply does not work. So what are the options? I have worked a bit in this field, and I have really found only two — apart from asking experts, and some other things you can do. As computational approaches to this particular problem: one is based on the partially observable Markov decision process and related concepts, and the other, which I am advocating, is what we call direct policy search, or a heuristic approach. I will explain both concepts quickly. A Markov decision process — as I said this morning, the example is missing, lucky you, but I can make one here; it is not difficult. For example: the decision problem of going home after dinner. Let us set it up like this: here is "how do I feel" — no, not F, that is for failure — this is how I feel at time t, and this will inform a decision on whether I should have another raki at time t, or whether I should go
home. If I go home, it is finished; otherwise this decision translates into how I feel at the next time step, and then again I have to decide: do I drink my raki, or should I go home? There is utility associated with this, both positive and negative. The raki is for free — no, you pay for it, but let us say I have no utility term for that; I have a utility for how I feel, including how I feel the next morning. (I feel perfect, by the way.) And these are related: if I already feel bad here, I will feel worse there. So this is a sequential decision problem: at every point in time another decision, until eventually I go home. And if you are experienced — you are students, so you have experience — at some point you know that when you decide here, you should already be thinking three time steps ahead. You say: if I drink this second raki, I will get into a good spirit, and I will drink the third one, and the fourth one, and then the fifth one, and then I will regret it. So I cannot think only of the next time step; I have to think three or four time steps ahead when I make this decision. If we translate this to the type of problems we are supposed to deal with: think of operating a water pipe network. This is the condition of the pipes in the system, and at each point in time you can decide on maintenance actions; those affect the condition at the next time step, at which point you can again decide whether to do a maintenance action. The assumption is that when I make the decision, I know the state of the system — I know
my own feeling, and in the water pipe system the assumption is that I know how many leaks there are at each point in time, which is probably realistic, because you can measure how much water you lose. So this is a Markov decision process. Once I fix this node here, the future is decoupled from the past — that is the Markov property we saw in Bayesian networks; this is a Markov process, and the decision process is also Markovian, because once I know the current state, whatever happened in the past no longer affects the future, and vice versa. The strong advantage of this is that I can break the curse of dimensionality — this exponential increase with the number of sequences. How do I solve such a problem? There are finite and infinite time-horizon processes; for an infinite time horizon there are even easier solutions, which I will not discuss, but we typically deal with finite-horizon problems: a 50-year or 100-year service life. We solve the problem by starting at the end. Assume this is my last possible time step — here is where your money runs out for raki, so it cannot go beyond this. This is my last decision; there are many decisions before it, but start with the last one. What I do is fix this state: say I might feel terrible, bad, good, or euphoric. For each of those four states I optimize: given that I feel terrible, what is the optimal decision? Given that I feel euphoric, what is the optimal decision? I optimize just this decision conditional on the state — and because once I fix the state, whatever happened earlier does not matter; it is independent — I can optimize this
for these four different cases, conditional on the state only, not on the whole history. Then I have the optimal solution there, and I go back one time step and optimize that decision conditional on how I feel at that step. The good thing is: I have already solved the problem one step ahead. I know the optimal decision there for each of the four states, which means each of those states is associated with an expected utility that already takes into account that the optimal decision is taken later. I continue like this, and the computation time increases only linearly with the number of time steps — and since the state space here is not very big, there is no problem. And guess what: this is exactly the Bellman equation that was pointed out yesterday in optimal control — at least the basis for it. There have been a lot of developments since, but the basic idea is exactly this, nothing else: dynamic programming. So if you have a problem like this, it is not much of a problem. However, that is very rarely the case. Why? Because in most real problems we do not observe the state exactly: we do not know exactly what the condition is. If you had perfect health monitoring you would know, but that is typically not the case; as we saw yesterday, we have a likelihood function that gives us indirect information we can use to update, but we will never know exactly what the state is. So we get to a partially observable Markov decision process. Unfortunately — I should have made a simpler example; this one is already complicated — what you see here is the condition of the system. The idea is that at time one we can make a measurement; this is the measurement node, and we have to decide whether we do it or not, which is why there is a decision node here on whether we should do a
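The backward recursion just described can be sketched compactly. This is not the lecture's model: the states, transition probabilities, and rewards below are invented for the "one more drink?" flavour of the example; only the backward-induction structure is the point.

```python
# Finite-horizon backward induction (Bellman recursion) for a tiny
# Markov decision process. All numbers are assumed for illustration.

states = ["good", "bad"]
actions = ["stay", "go_home"]

# P[a][s] = next-state distribution (assumed)
P = {
    "stay":    {"good": {"good": 0.6, "bad": 0.4},
                "bad":  {"good": 0.1, "bad": 0.9}},
    "go_home": {"good": {"good": 0.9, "bad": 0.1},
                "bad":  {"good": 0.3, "bad": 0.7}},
}
# immediate reward R[a][s]: staying is fun now, costly when feeling bad
R = {"stay":    {"good": 2.0, "bad": -1.0},
     "go_home": {"good": 0.0, "bad": 0.0}}

def solve(horizon):
    """Work backwards from the last decision; cost is linear in horizon."""
    V = {s: 0.0 for s in states}          # terminal value
    policy = []
    for _ in range(horizon):
        # Q-value: immediate reward plus expected value one step ahead,
        # which already assumes optimal behaviour in all later steps
        Q = {s: {a: R[a][s] + sum(P[a][s][s2] * V[s2] for s2 in states)
                 for a in actions} for s in states}
        pi = {s: max(Q[s], key=Q[s].get) for s in states}
        V = {s: Q[s][pi[s]] for s in states}
        policy.append(pi)
    policy.reverse()                      # policy[0] is the first decision
    return V, policy

V, policy = solve(horizon=3)
print(V, policy[0])
```

Note that each sweep optimizes per state, not per history — that is exactly how the Markov property breaks the exponential growth of the decision tree.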
measurement. Then, based on the measurement, we have a new condition behind it. There would also be a repair decision, but we have skipped it here, because we say: whenever there is a problem, we fix it; when there is no problem, we do nothing. So that decision has already been fixed and is not shown. Then you have some costs here, and then the system state. The main point is: this is the state, the condition, and these are the observations. Even if I observe this, I do not know the state with certainty — I never have deterministic knowledge about it, and so the earlier solution is not applicable: given this observation, the past does not become independent of the future. I cannot just condition on it and consider only the future. It does not work, and in principle I would again have to solve the exponentially growing problem, which is not convenient. Here is actually a simpler example — I should have shown this first, sorry — from a recent master's thesis, but it is the same thing: this is the condition, here is the observation, and the observation is used to choose actions that affect the consequences. What is done instead is that something called a belief state is introduced. This is the main, quite ingenious, idea of partially observable Markov decision processes. These concepts came up in the 70s or 80s, and I think in the 90s people first started to use them for planning inspections. In recent years an increasing number of people have been using these concepts to solve inspection planning problems, because the algorithms have improved quite tremendously. The main idea is that we solve the problem by introducing this belief state. What does that mean? Instead of
conditioning on theta, as we do here — which is not possible — we condition on a combination of theta and the observations: the belief state. Instead of saying the system is in state good or bad at this time, we condition on what we believe about the state. If we have just two states, bad or good, our belief state is described by the probability of the system being bad or good — one is one minus the other. So say we have a probability between zero and one that, at this point in time, we believe the system is in a good state. The belief we have comes from the combination of the prior and the observations: we do Bayesian updating and get a posterior distribution. In the simplest case with just two states, this means we have a distribution over p, the probability of being in the bad state, from zero to one — a beta distribution would be appropriate for that, and that beta distribution represents my belief state at time one. Then at time two I observe again and have a new belief: now I believe the probability of being in a good state is 0.9, based on what I have observed. And it turns out that if we fix this belief state, which encodes both the state uncertainty and the observations, then we have again a Markov decision process — not in terms of the original problem, but in terms of this belief state b — and then we can go back and use the solution methods for the Markov decision problem. That is it in a nutshell; of course it takes quite some time to understand and implement all of this, but that is the basic concept of the POMDP. What is the downside? The belief state can become very large. If my state here is binary, zero or one, the belief state becomes a continuous random variable from zero to one, which I have to discretize;
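For the two-state case just described, the belief really is a single number p, and the transition from one belief to the next is just Bayes' rule applied to p. A minimal sketch, with assumed observation likelihoods:

```python
# Two-state POMDP belief update: the belief state is the scalar
# p = P(system in "bad" state). Likelihood numbers are assumed.

# P(observation | state) for an imperfect inspection (assumed values)
like = {"bad":  {"alarm": 0.8, "clear": 0.2},
        "good": {"alarm": 0.1, "clear": 0.9}}

def belief_update(p_bad, obs):
    """Bayes' rule mapping (belief, observation) to the new belief."""
    num = p_bad * like["bad"][obs]
    den = num + (1.0 - p_bad) * like["good"][obs]
    return num / den

p = 0.5                            # assumed prior belief
p = belief_update(p, "alarm")      # -> 0.4 / 0.45 ≈ 0.889
p = belief_update(p, "clear")      # a second, contradictory observation
print(p)
```

The key property is that the next belief depends only on the current belief and the new observation — which is exactly why the process is again Markovian in b, and the Bellman recursion applies in belief space.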
and if the state itself is a continuous random variable, my belief state becomes infinite-dimensional in principle. There are special cases: assuming normality and so on, if this is a linear Gaussian process, then the posterior distribution at every point in time is a Gaussian distribution, and a Gaussian has two parameters, mu and sigma. So the belief state only needs to represent a distribution over mu and sigma — that is still doable, and linear Gaussian processes are a field where this is quite easily applicable; in fact, that is what my student used in her analysis. But in general cases this might not be just one random variable: as we saw before, our problem can have a hundred or a thousand random variables, and then the belief state is obviously going to be incredibly large. So until recently there was not much one could do to apply this to more realistic problems, beyond making these simple assumptions of normality and so on. More recently, in computer science — the reinforcement learning community and related communities — people have become very interested in this; it is not just our problem, it is the general problem of solving planning problems in very broad areas. They have come up with very powerful algorithms — I will not go into the details — that are again approximations to the solution, but that can deal with higher-dimensional problems. That is something you can explore. [Question from the audience.] Yes, please interrupt. ... Yes, the information only comes through the observations. You have to think of it like this: this thing here becomes b1, my belief state, then b2, and so on, and what you are pointing out is that I then have my actions and so on. So basically what I do is — I will just
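The linear Gaussian special case mentioned here amounts to a Kalman-filter recursion: the belief stays Gaussian, so tracking (mu, sigma) is enough. A scalar sketch with assumed noise parameters (not the student's actual model):

```python
# Scalar linear Gaussian belief: the belief state is just (mu, var),
# because prediction and Bayesian updating both preserve Gaussianity.
# All noise parameters are assumed for illustration.

def predict(mu, var, drift=1.0, q=0.25):
    """Deterioration step: mean grows by `drift`, process noise var q."""
    return mu + drift, var + q

def update(mu, var, z, r=1.0):
    """Measurement z with noise variance r; posterior stays Gaussian."""
    k = var / (var + r)                     # Kalman gain
    return mu + k * (z - mu), (1.0 - k) * var

mu, var = 0.0, 1.0                  # assumed prior belief
mu, var = predict(mu, var)          # -> N(1.0, 1.25)
mu, var = update(mu, var, z=2.0)    # inspect, measure 2.0
print(mu, var)
```

This is why the linear Gaussian assumption keeps the belief state finite-dimensional; once the problem has hundreds of non-Gaussian variables, no such two-parameter summary exists, and the approximate algorithms mentioned above take over.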
it's helpful to show this to answer the question. Okay, sorry, there's no arrow here in this example; the arrow goes to the... okay. So basically I build a reduced network, and I'm skipping the utility nodes here, that then looks like this. And you're right: I still have to consider that there is actually a theta underneath, and that can be high-dimensional. But the first step is to establish this network over the belief states, and to establish it you just need to apply Bayes' rule, and you can do that locally. First you take B1, and B1 is then the posterior; in the case of the two-state binary example, you take the posterior distribution of this probability p. Then you need B2 given B1 and given the decision. That means you say: okay, this is now your prior, and with this prior you have to think of all the possible observations you can make here, and you do a Bayesian updating again for each of them, to get this conditional on this. So if my initial belief is that the probability is 0.05, then there is a certain distribution of the posterior belief that I will get here. As I said, I'm trying to explain it fast, but you essentially preprocess this: you do a number of Bayesian analyses to construct the distribution of what you might believe here, given what you believe here, and this you can do by a sequence of such updates. Yes; and these high-dimensional problems are then solved again with sampling or other techniques, and I cannot go into this. Software is available; it's actually on the next slide here. So, I think the guy
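The preprocessing step just described, constructing the distribution of the next belief given the current one by enumerating the possible observations, can be sketched as follows for the binary-state case; the test likelihoods are again invented numbers.

```python
def belief_transition(p_bad, p_detect_given_bad, p_detect_given_good):
    """Enumerate the possible next belief states and their probabilities.

    For each possible test outcome we compute its probability under the
    current belief (the 'prior' for this step) and the posterior belief it
    would lead to. This is the local Bayesian analysis used to build the
    network over belief states B1, B2, ...
    """
    outcomes = []
    for detected in (True, False):
        if detected:
            lb, lg = p_detect_given_bad, p_detect_given_good
        else:
            lb, lg = 1 - p_detect_given_bad, 1 - p_detect_given_good
        p_obs = lb * p_bad + lg * (1 - p_bad)   # probability of this outcome
        posterior = lb * p_bad / p_obs          # belief after observing it
        outcomes.append((posterior, p_obs))
    return outcomes

# Starting from a belief of 0.05, the two possible next beliefs:
for belief, prob in belief_transition(0.05, 0.9, 0.1):
    print(round(belief, 3), round(prob, 3))
```

A sanity check on such a construction: the expected next belief must equal the current belief, since before observing anything no information has been gained.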
who knows most about this in our field (not in general, but in our field) is Kostas Papakonstantinou; he's in Pennsylvania. He wrote a number of papers on this, and he's really the person who knows this stuff. There is also a paper, I'm not sure which one exactly, where he compares the different software solutions that are available. They all come from the field of computer science and so on, so they are not tailored to what we do; they have limitations, and the problems we have might not always fit within those limitations, so it's a bit tricky. We had some students trying out some things, but in a six-month Munich master's thesis, students could implement simple problems; complicated problems you can't implement in just a few months, even if you have the software. So this is something that is interesting to explore, but if you go down this path, don't just say "okay, this looks like a cool thing" and dive in. Try it out for a week or two, but before you spend a year of your PhD on this, speak with this guy. All right, that summarizes this. Any questions that you might have on this type of technique? Okay, very good. Yes... yeah, that's if you have a non-stationary process, basically an inhomogeneous process; that's one of the issues. So, okay, you can also speak with this guy about that. Anyway, this is one approach, and I'm going to show you a second approach. We have been using the first one a little bit on simple problems, because we have another approach, and to compare our approach to it we have applied the POMDP to simple problems, but not to the big problems. This second approach is what we call the heuristic approach, or, in the more general computer science community, maybe
known as direct policy search. I'll illustrate it with an example, something we did in my PhD thesis. Some people yesterday mentioned that they have read it; very good, so sometimes people actually read those things that you do, it seems, if you're lucky. Anyway, what is the idea? This is a reduced problem: I try to calculate just the optimal times of inspection. We assume we have one component, not a system, just a component, and that component deteriorates, similar to the type of problem we considered with Jochen. We can do inspections at any point in time, and based on the inspections we decide to repair or not to repair. The question is simply when we should do inspections; only the "when" question is answered here, everything else is fixed. Still, it is an exponentially growing problem, difficult to solve. The very simple heuristic that we used (and it's not my idea, I should mention; these heuristics were given to me by my supervisor in my PhD and had been used by him and others before) was to say: we calculate at every point in time the probability of failure of the component, and whenever that exceeds a certain threshold, we do an inspection. This seems reasonable from an engineering point of view: whenever your reliability decreases below a certain limit, you do an inspection. Then we just vary this threshold, and the times of inspection follow directly from it. If the threshold is lower, I get more inspections; if I put the threshold higher, I get fewer inspections. So instead of having to decide at every point in time whether to do an inspection or not, I have a single parameter that completely determines my inspection strategy. That's one option. The second option, which I also considered in
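A minimal sketch of this threshold heuristic; the exponential deterioration model and its rate are invented placeholders, not the model from the thesis.

```python
import math

def inspection_times(threshold, horizon, rate=0.05):
    """Threshold heuristic: inspect whenever the failure probability
    exceeds the threshold. Toy deterioration model (an assumption):
    P_f = 1 - exp(-rate * t) in the time since the last inspection,
    and an inspection with repair resets the clock."""
    times, last = [], 0
    for t in range(1, horizon + 1):
        pf = 1 - math.exp(-rate * (t - last))
        if pf > threshold:
            times.append(t)
            last = t
    return times

# A lower threshold yields more inspections:
print(inspection_times(0.2, 40))  # -> [5, 10, 15, 20, 25, 30, 35, 40]
print(inspection_times(0.3, 40))  # -> [8, 16, 24, 32, 40]
```

The single threshold parameter thus determines the entire inspection schedule.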
my thesis, was to say: instead, and more practical from an operational point of view, we fix the time intervals between inspections. We just say we do an inspection every five years or every ten years; instead of fixing the threshold, we fix the interval between inspections. That's the second type of heuristic rule, and again it is just one parameter, the time between inspections, which is even discrete, if you prefer discrete optimization. So I can reduce the big problem to a simple problem. What happens then is that once I fix that parameter, this whole large decision tree actually turns into what I would call an event tree, because no more decisions are made: I fix my parameter, say I inspect every five years, and once that's fixed, there is no more decision to be made; it's just an event tree. The simplest event tree looks like this: I have failure or survival up to the first inspection; the component might fail or not, and this is the probability of failure before the first inspection. At the inspection I can have different outcomes, but ultimately I'm interested in whether I'm going to repair or not. Okay, there is a decision here, but that decision is again fixed: we say that every time we find something, we repair. This is the second decision rule, which I somewhat suppressed: the first heuristic is that we inspect every delta-t years; the second is that every defect that is found is fixed, or every defect above a certain threshold is fixed. So I have some probability of finding something larger than that threshold, in which case I repair, and the simple idea here is that once I repair, I go back to time zero; you could also use a different model there, it's not so important. Or, if I don't repair, if I find something smaller, I continue, and I have failure or survival again, and so on. So this tree could have many, many branches, potentially infinitely many, and in the problems we're looking at
they can get to an infinite number of branches; this problem still has a limited number, but in general it is infinite. However, because there is no decision left here, I could just use crude Monte Carlo to solve this. If the probability of failure is small, that is not going to be efficient, and I'm going to show an alternative approach; but since there is no decision in the tree, there is no optimization involved anymore inside it. The optimization is taken outside of the tree: the sequential problem is replaced by just deciding on a set of parameters defining my heuristic. I can then calculate the expected cost associated with this particular strategy, and this is what we do: we just vary the parameters. Actually, this figure, I'm not even sure; I think it is not from us; I should have put a reference, and I'll try to remember to add it before I upload the slides. This is from Jenny Nielsen, from Aalborg; I guess some people here are from Aalborg, so you probably know her. She is also somebody who knows a lot about POMDPs, less than Kostas I guess, but still a lot, so if you're in Aalborg you can ask and speak with Jenny. She compared the POMDP approach with this probability-threshold approach, to compare the total expected cost, that is, the optimality of the strategies. Now, the POMDP in principle should give the optimal strategy, so the fact that the threshold approach that I showed here gives a lower total expected cost shows that the POMDP solver has actually not found the optimal solution; if you solve the full POMDP problem exactly, it should be optimal. But anyway, we also see here that the total expected costs of the optimal strategies are
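The Monte Carlo evaluation of one fixed strategy over the event tree can be sketched like this; the yearly failure model and all cost figures are invented assumptions.

```python
import random

def expected_cost(interval, horizon=40, c_insp=1.0, c_fail=100.0,
                  fail_rate=0.02, n_sims=5000, seed=0):
    """Crude Monte Carlo over the event tree for one fixed strategy:
    inspect every `interval` years, and every inspection repairs/renews
    the component (the suppressed second decision rule)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        cost, last = 0.0, 0
        for t in range(1, horizon + 1):
            # toy model: yearly failure probability grows linearly with
            # the time since the last renewal
            if rng.random() < fail_rate * (t - last):
                cost += c_fail
                last = t          # failed component is replaced
            elif (t - last) % interval == 0:
                cost += c_insp
                last = t          # inspection-and-repair renews it
        total += cost
    return total / n_sims

# Varying the single parameter then gives the cost of each strategy:
for dt in (2, 5, 10):
    print(dt, round(expected_cost(dt), 1))
```

The optimization over the heuristic parameter then happens entirely outside this simulation, exactly as described above.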
very similar, even though the distribution of costs is rather different: the POMDP accepts a higher risk of failure and gives higher inspection cost and lower repair cost, whereas the threshold approach gives higher repair costs and lower inspection costs; but the total cost is about the same. And this is the comparison by Jesus, where he also compared the POMDP (he used a LIMID solution, which is another variant), and you see it again: this is the solution we get from periodic inspections. We just assume inspections every year, every two years, every three years, every four years; a certain number of inspections distributed over the lifetime at equal intervals, and this axis is the number of inspections. What we see is that the optimum is very similar, and also that the curve is very flat around the optimum: if you are a bit off, it doesn't really matter. There are a number of other such results that show that, for this type of simple problem, the simple heuristic gives almost as good results as the exact solution. The heuristic has the advantage, for our purposes as engineers, that it also makes it possible to incorporate operational constraints, or rather engineering considerations. For example, the operator might say: it is not practical for me to do inspections at random points in time, I want a fixed schedule of inspections. That is taken care of here, because we fix regular inspection intervals, say every five years. So there is a benefit also from the practical point of view in using these heuristics, and because the optimum is so flat and we get almost the same solution, there is, from my point of view, not too much benefit in using the POMDP. Now, we cannot translate that to the general case; it doesn't
mean that it is always like that; there can for sure be cases where simple heuristics fail. So now I come to what is called direct policy search in the general literature. What is a policy? Sebastian already explained it on the very first day, but I think he used different terms; these terms are mostly what is used in the reinforcement learning, computer science and electrical engineering communities, and also in the influence diagram community. The policy is basically a decision rule, in Sebastian's terms. It says that the decisions that come here are not free; free in the sense that you can decide whatever you want. You are bound by your decision rule. A decision rule here would be, for example: whenever the probability of failure is above some value, I do an inspection. In the full POMDP view we are completely free: based on whatever we know now, we decide what we want to do. Here we limit that by prescribing certain decision rules, which in principle represent a subset of the possible solution space; therefore they are not optimal, or not in general optimal, but they make our problem much, much simpler. So this is the policy, and the strategy is just the set of policies. Here the policy is the same for every time step, but in principle I could also say that, as I go towards the end of the service life, I might accept a higher probability of failure: if I'm in the last year of service, even if the probability of failure is a bit higher, it doesn't make much sense to inspect, because next year I'm going to take that structure out of service anyway; I might just hope that it will not fail in the last year. So it's possible to have different policies in different years. The trick is just to make sure that the policies can be described by a limited set of parameters, the threshold or the times of inspection, and then
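As a small illustration of such a parametrized decision rule, here is a time-dependent threshold policy; the linear form and both parameter values are my own assumptions, used only to show the idea of accepting more risk near the end of the service life.

```python
def policy(p_fail, t, horizon, base=0.05, end_relax=2.0):
    """Decision rule: inspect if the failure probability exceeds a
    threshold that rises towards the end of the service life.
    base and end_relax are the two parameters a direct policy search
    would optimize over (assumed functional form)."""
    threshold = base * (1.0 + end_relax * t / horizon)
    return "inspect" if p_fail > threshold else "continue"

print(policy(0.06, t=0, horizon=20))   # -> inspect   (threshold 0.05)
print(policy(0.06, t=20, horizon=20))  # -> continue  (threshold 0.15)
```

The same failure probability triggers an inspection early in life but not in the final year, which is exactly the end-of-service argument above.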
you do an optimization over those. Yes? [A participant describes current practice: whenever there is a problem with a wind turbine, people are sent out anyway, and they can then also check the support structure, the tower; so the inspections are opportunistic. If a POMDP, or any method that can find a separate schedule for each single component, produced a list, one could send people directly to the locations that need inspection, which would be suitable for practice.] Yes; I mean, it is probably possible to model that as a POMDP, but it might be a bit challenging to do. You asked whether, using the heuristic approach, you need to fix the interval: no, you don't need to fix the interval; you can come up with any rules that you like. What I will show are just examples of rules; you can implement completely different ones. You can implement a rule that says: whenever the sun shines, I'm going to inspect. That is also a rule you can implement. The only point is that, instead of leaving the decision completely open, I am parametrizing the decision, ideally based on the observations that I have. The interesting thing is that this fixed-interval heuristic is actually not based on any information I have, whereas the threshold heuristic is based on information: the probability of failure depends on what I have observed previously, so it takes the observations into account. The fixed-interval heuristic doesn't take any information into account, because it just says, a priori: every five years, inspect, whatever the outcome of what you have seen previously. Now, in your case you can say: okay, yes, whenever there
is a problem with the upper part, with the actual turbine, you do an inspection of the support structure. That is exactly a possible policy or strategy, and you can directly implement it in this framework, because it is a very simple heuristic: whenever there is a problem with the wind turbine's upper part, you go and inspect the support structure. You then also have to model the failure rate of the turbine itself when you check whether that strategy is optimal or not. And then the question is of course how you select the joints; there you have to add additional heuristics to select the joints, and that is something we are working on. But this actually shows that you often have constraints, or rather engineering considerations, that already prescribe to some degree what you should do. You don't want to be completely free, and you don't want an algorithm that tells you that you should inspect in year 7.5 and then inspect again in year 9, because that is not practical. The practical thing is to go and inspect when you have to go anyway, because the ship, the transport, is what is expensive; so you go at those points in time anyway. That is already a heuristic, and you could directly run with it. Other comments or questions? No? I mean, basically, if you have the Markovian property, the dynamic Bayesian network will just result in a POMDP in the end; and the most effective solutions for the POMDP are the POMDP solvers, so there is not a more effective solution for the influence diagram.
Well, you can always make it Markovian by the type of trick that I showed you, but of course you increase your state space, so you pay for that, and there is a question whether that is actually feasible or not. But that is the only way I see; I mean, as I said, this heuristic approach is an alternative, but in terms of solving the influence diagram in an exact manner, the POMDP is the way to go. As I said, there is also this thing called the LIMID, the limited memory influence diagram, and there are others, but they are similar; I wouldn't say that any of them is the solution. Okay, so this is how it works, but as you see, I was not able to write it all down for you. I actually developed this on the whiteboard last time, but there is no whiteboard here, so I just copied it from the last time I made it. Let me show you an example; there are two papers, one is published, and the one that describes it nicely will be published, hopefully soon. These ideas of using heuristics have been used many times; as I said, this is not new. But it has not actually been analyzed in a more general sense; that is what we try to do here, in particular in the second paper. I'm going to show you, from another presentation, a little bit of how we use this on a more complex problem, because that is ultimately what we want to do: solve more complicated types of problems.
This is from the PhD work of Jesus Luque; these slides are his. Basically, we have a type of problem where we start by modelling the deterioration as a DBN, a dynamic Bayesian network, and you know that now; it is actually the framework that I showed you before. This is very generic here, but the idea is that we have time-invariant parameters, time-variant parameters, and the deterioration function, so we have a larger number of parameters that describe our problem. That is still for a component, so that alone would not be a big problem, and then we have possible inspections here. But then we have a system problem: we are not interested in only a single component; we are actually interested in the structure as a whole. If you want an example, think of the two types of structures that we considered in this work, on the left-hand side and the right-hand side. The left-hand side is a very idealized structural system, and the right-hand side is the type of offshore structure you mentioned: a support structure where you might have fatigue problems. This one is not a wind turbine, as you might realize; it is an oil and gas platform. But you have all these joints, and the question is not only when you go and inspect but, if the schedule is fixed like in your case, where you go: should you inspect this joint, or this one, or that one? Maybe you have to send divers, which is very expensive, or nowadays probably unmanned vehicles, robots, that go down there, but it is still very expensive. So you really want to optimize where you go and check. Just saying that every time the probability of failure of the system is above ten to the minus four we inspect is not going to be sufficient, because you still don't know where you should go and inspect. So in principle you could have this enormous decision tree, which
doesn't work. We can try to make an influence diagram; we make an influence diagram, but that already becomes a bit sketchy here. The idea is that we have these different components, and maybe there are 50 components or 100 components, a series of them, and we make a simple model of the dependence. What we assume is that there is dependence between the different components: if the fatigue is larger than expected in component one, then it is also likely that it is larger in other components, and the same happens with the material parameters. We consider that by introducing this hierarchical model here, which says that there are some common factors; it is a simplified model to represent dependence, but it represents more or less the information we have. We could make a more sophisticated model, but we don't actually know the dependencies in more detail. So there is dependence between the components here, and then each component has a certain performance, fatigue in this case (or corrosion), which translates into a component state; the component can here be either working or not working, and that then translates into the system state: the system is failing or not. Here the environmental load also enters, which is not shown, but we have the environmental loads that then decide whether the structure fails in that year or not, and ultimate failure is of course what we want to avoid. But this is huge, and now we have many decisions: each component can in principle be inspected in every year. This will not be possible to solve; we spoke with Kostas and other people, and a problem of this size probably cannot be solved unless we make some assumptions that are, I would say, really not realistic. So that doesn't work. Anyway, this is the model; I'm not going to explain it in detail. So, this is the thing we want to do,
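The hierarchical dependence model mentioned above can be sketched as a common-factor model; the Gaussian form and the factor loading are illustrative assumptions, not the model used in the actual work.

```python
import math
import random

def sample_component_damages(n, alpha=0.6, seed=0):
    """Common-factor dependence: each component's standardized damage
    variable shares one common factor U, so learning about one component
    is informative about all the others. The pairwise correlation
    between components is alpha**2 (here 0.36)."""
    rng = random.Random(seed)
    u = rng.gauss(0.0, 1.0)  # common factor shared by all components
    return [alpha * u + math.sqrt(1.0 - alpha ** 2) * rng.gauss(0.0, 1.0)
            for _ in range(n)]

print(sample_component_damages(3))
```

Because of the shared factor, inspecting one component shifts the posterior for every other component, which is what makes system-wide inspection planning worthwhile.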
and what we now need is heuristics, because we are going to use this heuristic approach. These are the heuristics we use in this example, but of course you can use different heuristics; you could say, for instance, that we go and inspect whenever the turbine is having problems. Here, first, we perform inspection campaigns at regular intervals, every five years, say; that interval is a parameter. Second, we fix the number of components that are inspected in each campaign. Again, that might not be optimal; it might be that in the beginning you want to inspect more and later you want to inspect less, but we fix it. The third one is how we select the components: let's say we inspect ten components in each campaign; how do we select those ten? We say that we try to pick those components that give the highest value of information, by considering what I learn about this component and about all the other components. In principle, that would mean I have to make a value of information analysis within the value of information analysis; that would be far too expensive, it doesn't work. So instead we use a proxy, and I will say in a moment what that is. Then we also add this fourth rule, because there might be cases where we end up having a too high probability of failure: we add an additional threshold and say that, for safety reasons, if that threshold is exceeded, we have to do an additional campaign. It can turn out that the outcomes of a campaign are terrible, and this might lead to an additional campaign. And finally, repairs are carried out whenever we exceed some repair criterion. So we have, in the end, five parameters that are open decision parameters; but once I fix those five parameters, then, for a given state of knowledge at every point in time, it is exactly determined what inspections I have to perform and what repairs. So I
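The five open parameters of the heuristic can be collected in a single small structure; the field names are mine, chosen only to mirror the five rules just listed.

```python
from dataclasses import dataclass

@dataclass
class InspectionStrategy:
    """Once these five parameters are fixed, every inspection and repair
    decision over the lifetime follows deterministically from the
    observations; the optimization is over this small vector only."""
    campaign_interval: int    # years between regular inspection campaigns
    n_inspected: int          # components inspected per campaign
    voi_weight: float         # weight between importance and information
    pf_threshold: float       # system P_f that triggers an extra campaign
    repair_criterion: float   # damage measure above which we repair

s = InspectionStrategy(5, 10, 0.5, 1e-4, 1.0)
print(s)
```

This is the key reduction: the sequential decision problem collapses to an optimization over five numbers.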
take all the decisions out: once I fix those five parameters, the decisions are made, and I can just simulate. One option would be to use Monte Carlo simulation: we just simulate, with Monte Carlo, the whole history of the structure. I'm going to explain why that is not exactly optimal, but just think of Monte Carlo simulation for now; we simulate the whole history, and that is at least feasible, unless you have a complicated, very expensive finite element model. So that would be the evaluation. And now, what is the proxy; how do I select the components? For the first study, the first structure, we look at this structure here. This is the Daniels system, not named after me, but after Daniels, who studied it long before my time; it is a very simple structural system that is often used to study the effect of redundancy. The assumption is that I have n identical but independent components, so they are not distinguishable, and the load is carried by all of them together; they can have brittle or ductile material behavior. The point here is that, in the model we have, they are subject to deterioration. Now, which ones should we inspect? In the first inspection campaign it of course doesn't matter which ones we inspect, because they are all the same: if we have fixed that we inspect ten components, we can pick ten components at random, because there is no difference between them. This is the simple example, and I'll go to the more complicated one afterwards. But in the second round, the second inspection campaign, we have already inspected some components, so they are not all the same anymore, and at that point we have to decide which components to inspect. And we said: we should pick those components that provide the highest value of information with respect to the total system. Now, one can kind of show,
not exactly, but in an approximate sense, that how much I can learn is a function of the probability of having deterioration in that particular component. I did some studies like this already in my thesis, so you can also find more information there. So, ultimately, we can use the probability of the component being in a damaged state as a proxy for the value of information: the higher the probability of that component being in the damaged state, the more we can learn from inspecting that component. That is the proxy we use here: we just take the probability of failure of each component, pick the ones that have the highest probability, and those we inspect, because those are actually the ones we also learn most about. Yes; in this case it turns out that higher means we learn more; the correct point is that a higher probability means more information, unless the probability of failure is close to one, in which case we are on the other side, but that doesn't happen here. Now, when we go to the other structure, here, it is not like this, because the components are different. First of all, they have different functionalities, so we can't just use the probability of failure alone; they are parts of a structure and they have different functionalities. So, additionally, we consider in this heuristic not only the probability of failure but also the importance of the component for the structure. We say: okay, one thing is that we want to learn the most about the structure as a whole, and it is the components with the highest probability of failure that give us the most information; but it would be ridiculous if those were typically components that have low importance, because those can have a high probability of
failure just because they are maybe tertiary members that don't cause a lot of damage when they fail, while the primary members, the ones that are critical, are maybe quite safe; so I might not learn a lot about the rest of the structure by inspecting them, but I will learn a lot about the safety of the structure where it matters. So we have to somehow combine the two and say: we should inspect those components that are important for the structure in terms of its functionality, and those components that give us the most information. And how do we do that? We have an indicator for the redundancy of the system with respect to the failure of a component, which tells us: if this component here fails, what is the effect on the system failure probability? We have a measure for that, and we have the component failure probability, the actual failure probability. So we have these two things, and then we have one parameter that gives a higher weight to one or the other. We have to weigh these two attributes, and the weight we should give is something we don't know: maybe it is more important to inspect important components, or maybe it is more important to inspect components that provide more information. So we just let this parameter be an optimization parameter. This is just a proxy for the value of information, but it gives a reasonable heuristic, we believe. Anyway, you can come up with any other; there is a lot of freedom here, and you can come up with different heuristics for how to pick components in the system. We implemented this, and I'm not going into the details here. Okay; now, this is not the latest version, this was a few years ago, and we have made some changes since; that is why I showed you the whiteboard version before. But this still gives the idea of what is going on. So, first of all, what we want to do is:
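Putting the two pieces together, selection by a value-of-information proxy plus an importance weight, a minimal sketch could look like this; the linear scoring form and all numbers are assumptions of mine, not the formula from the actual work.

```python
def select_components(p_damage, importance, k, w):
    """Pick the k components with the highest combined score:
    w weighs component importance for the system against the component
    damage probability (the value-of-information proxy); w itself is
    one of the heuristic parameters to optimize."""
    score = [w * imp + (1.0 - w) * p
             for p, imp in zip(p_damage, importance)]
    ranked = sorted(range(len(score)), key=score.__getitem__, reverse=True)
    return ranked[:k]

p_damage = [0.02, 0.30, 0.10, 0.25]   # probability of being damaged
importance = [0.90, 0.05, 0.10, 0.05]  # effect of failure on the system
print(select_components(p_damage, importance, 2, w=0.0))  # -> [1, 3]
print(select_components(p_damage, importance, 2, w=1.0))  # -> [0, 2]
```

With w = 0 the rule ranks purely by the information proxy; with w = 1 purely by importance; the optimization decides where in between to sit.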
we want to calculate, first, for a given strategy, that is, a given heuristic with fixed values of the parameters, the total expected cost, the lifetime cost of the structure. As I said, you could just run a straightforward Monte Carlo analysis, and essentially that is what it says here: S is the strategy, which we fix, and initially we also fix Z, the set of inspection outcomes. The dimension of Z is a bit tricky, because it depends on the decisions you make, so it is not a fixed dimension; it is large and it has varying dimension. So we first simulate some inspection outcomes, with our strategy fixed (every five years we inspect, every inspection campaign has ten inspections, and so on). Then we run the dynamic Bayesian network, actually the algorithm that we have to solve the dynamic Bayesian network using discrete exact inference, and that gives us the probability of failure of the system at each point in time, conditional on what we have observed and conditional on what we do. Then we multiply that with the discounted cost of failure, using the discount rate, and sum up; there is a little subtlety behind that which I don't want to explain now, but we sum up the risk. So this is the risk. Then we have the cost of repairs, that is, how many repairs we have to do, and how many inspections we have to do, and that is also determined rather straightforwardly by the strategy and the outcomes. So that is that. Now, however, this all assumes that I already know the outcomes of the inspections, which I don't, so I have to do an additional integration, and as I said, Z is very high-dimensional and of varying dimension, so that is a bit tricky to do. For this we use Monte Carlo simulation, and as I said before, Monte Carlo simulation is not very effective for small probabilities; but the good thing is that, at least with this
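The inner summation, risk plus discounting given one simulated history of inspection outcomes, reduces to a few lines; the discount rate and cost figures below are placeholders.

```python
def discounted_risk(pf_per_year, c_fail, r=0.02):
    """Sum the discounted yearly risks, given the yearly system failure
    probabilities computed by the dynamic Bayesian network conditional
    on one simulated inspection/repair history."""
    return sum(pf * c_fail / (1.0 + r) ** t
               for t, pf in enumerate(pf_per_year, start=1))

# e.g. a constant yearly failure probability of 1e-4 over 40 years:
print(round(discounted_risk([1e-4] * 40, c_fail=1e6), 1))
```

The inspection and repair costs are accumulated analogously, each discounted to the year in which it occurs.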
With this approach, the probability of failure has already been computed, exactly, with the Bayesian network, and the risk has been computed from it. So in the outer loop we are not dealing with a small-probability event; the small probabilities are taken care of inside, and what remains are just the costs. If you know Monte Carlo: when you only want to estimate the mean of a quantity that is not a small probability but a relatively robust number, 100 or 200 simulations are enough. If instead you wanted to simulate the failure events themselves, with failure probabilities of the order of 10^-5 or so, you would need millions of samples. Here we just need 100 or 200. You could also drop the Bayesian network and directly simulate everything straightforwardly; then you need many more samples, but it is a possibility too, and compared to the exponential growth of the direct exact solution, even a million samples is nothing.

Then we find the optimum by optimizing over the strategy parameters. This is a stochastic optimization, because the objective function itself is computed with Monte Carlo, so there is noise in the objective function. We therefore typically use a stochastic optimization method, something like the cross-entropy method.

Okay, this figure is from Jesús; it is actually an older version, but it shows how it works. The idea is that we first simulate inspection results over time; the Bayesian network is used to do this for a given realization of observations. Here is the first inspection campaign; nominally there is an inspection campaign every ten years, but here the failure probability exceeds the threshold we have set, so an additional campaign is triggered.
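Because the objective is itself a noisy Monte Carlo estimate, a stochastic optimizer is used. A minimal one-dimensional sketch of the cross-entropy method, with a generic objective standing in for the actual cost function, could look like this:

```python
import random
import statistics

def cross_entropy_minimize(objective, mu=0.0, sigma=5.0,
                           n_pop=40, n_elite=8, n_iter=25, seed=1):
    """Cross-entropy method, 1-D sketch: sample candidate parameters
    from a Gaussian, keep the elite (lowest-cost) samples, refit the
    Gaussian to the elite, and repeat until it concentrates."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        pop = [rng.gauss(mu, sigma) for _ in range(n_pop)]
        pop.sort(key=objective)                  # noisy evaluations are fine
        elite = pop[:n_elite]
        mu = statistics.mean(elite)
        sigma = statistics.stdev(elite) + 1e-9   # avoid collapse to zero
    return mu
```

In the actual application the objective would be the Monte Carlo cost estimate as a function of the heuristic's parameters (interval, number of inspections, threshold, weight), and the method tolerates the sampling noise in those estimates.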
It turns out that when we do a first inspection and the inspection result is not good, the failure probability goes up. For a single component that does not happen, because whenever a component's inspection result is bad, we fix it. But for the system this does not work: say I find some defects in the system; I can fix those defects, but my system failure probability may still increase, because the other components are likely in bad shape too. So what happened here is that there were some bad results, so additional inspections were needed, marked here as CND, and then it continues; here again an additional campaign is needed, and this campaign here, and so on.

These are the reliabilities of the individual components; for each component you also get the probability of failure. You see that some components have high failure probabilities and others low ones, and those are then inspected, and so on.

Then you get the costs. These are the costs over time for this example: the failure costs, or actually the risks, the expected failure costs, which are low in this case; then the inspection campaigns, where you have a mobilization cost, which is high and goes down a little over time because of discounting; the cost of inspecting each individual component, which is a function of the number of components inspected; and the repair costs. You sum up those costs and you get the cost of the strategy for this one history. We simulate two or three hundred histories like this; these are the system failure probabilities associated with those histories. We repeat this for hundreds or thousands of histories and take the average of those costs.
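The cost bookkeeping for one simulated history, with discounting, mobilization, per-component inspection, repair and risk terms, can be sketched like this; the unit costs and the event format here are invented for illustration only:

```python
def history_cost(events, rate=0.02):
    """Discounted total cost of one simulated history.

    events: list of tuples (year, mobilized, n_inspected, n_repaired, risk)
        where `risk` is the expected (probability-weighted) failure cost
        accrued in that year, as delivered by the Bayesian network.
    """
    C_MOB, C_INSP, C_REP = 10.0, 1.0, 5.0       # assumed unit costs
    total = 0.0
    for year, mobilized, n_insp, n_rep, risk in events:
        d = (1.0 + rate) ** (-year)             # discount factor
        total += d * ((C_MOB if mobilized else 0.0)
                      + n_insp * C_INSP + n_rep * C_REP + risk)
    return total
```

Averaging `history_cost` over a few hundred simulated histories gives the Monte Carlo estimate of the expected cost of the strategy.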
For example, for this very simple Daniels system, we check the optimal number of components to inspect per campaign. It turns out that the optimum is three if we use a five-year interval, but a ten-year interval gives lower costs overall, so that is what we should do, inspecting maybe four, five or six components. You also see that the curve is not smooth, as we would expect it to be; that is Monte Carlo error. And, not surprisingly, you should inspect more components if you inspect less frequently, and so on; I could go on and on.

The point is that this approach does not give you the guaranteed optimal strategy, because you do not consider everything possible. If you look at these histories, you can quickly identify that some of the things they do are pretty stupid; for example, you see inspections in two consecutive years, which does not make sense. So the policies we describe in this way are not optimal. But the thing is, this allows us to actually calculate the expected cost of a strategy, and you can always come up with another strategy, compare, and take whichever is better. Since it is not possible to get an exact solution for this problem, we have no reference solution; we cannot say definitively how well it works. The only thing we can say is that we have a way of checking what a particular strategy gives us in terms of expected cost, and we can compare it with any other strategy we, or you, might come up with, and we pick the one that performs best.
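Since there is no reference solution, the only defensible procedure is the one just described: estimate the expected cost of every candidate strategy and keep the cheapest. A trivial sketch, with the estimator left abstract:

```python
def best_strategy(candidates, estimate_cost):
    """Evaluate each candidate strategy with the (noisy) Monte Carlo
    cost estimator and return (cost, strategy) for the cheapest one.
    Differences smaller than the Monte Carlo error are not significant."""
    results = [(estimate_cost(s), s) for s in candidates]
    return min(results, key=lambda r: r[0])
```

In practice `estimate_cost` would be the Monte Carlo lifetime-cost estimator above, and the candidates would be heuristic strategies with different parameter values.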
At the moment an exact solution is not possible for this type of more complex problem. For a simple problem we could compare, and we saw that the approximate and exact solutions give similar results; here we don't know, but we do know that what we calculate is the actual expected cost, plus or minus a Monte Carlo error.

[Question from the audience.] Okay, that is a good question. Basically, these are different things: the probability of failure of the system given failure of a component. In the examples we have here, there are only on the order of 20 components. There are more members in the structure, but we can reduce the number: for example, if you have three welds within the same element, you do not have to do three analyses, because you know that failure of one weld is equivalent to failure of another. So we can reduce the number of elements that we actually have to take out of the system, and we are on the order of less than 20. What we do is take all possible combinations of failures and run a structural analysis for each; these are simple nonlinear pushover analyses, and then we use a simple reliability approximation.

The real problem is that it is not sufficient to consider only the effect of the failure of one component at a time. If that were the case, we would just take out each component in turn and run a reliability analysis for it; even a hundred reliability analyses would not be a big problem nowadays. But to be correct, and this is something I investigated, you can make a big mistake if you think the effect of a component is captured by removing only that particular component. Often you have two or three components that work together, and maybe taking out one of them is not
critical, but taking out two of them together is completely disastrous. So in principle you have to consider all possible combinations of element failures. With 20 elements that is still okay, because 2^20 is around one million, but with 40 elements you end up with about 10^12 combinations, which is no longer feasible. So here we simply reduced the number of components we deal with. Otherwise, there are approaches, by Junho Song and others, that can be used here to estimate the system failure probability conditional on component failures; for larger systems that is a problem in itself.

Okay, let me go back to my main presentation, but that is also more or less the end of it. I promised to look at one more example, but I know it is time for the coffee break, so I will more or less stop here. What is shown here is more or less what I explained; we have just changed the notation a bit and corrected some errors, and it is described here. Again, this is an idea you can find in computer science: this type of policy search is something people there have been using, at least in recent years. But of course the heuristics we have to come up with are really specific to our problems, so we cannot take them over directly from computer science. That is what we describe here.

Good. Questions or comments, or more coffee? Okay, so we have the coffee break, and afterwards we have presentations, four of them, by some of you on specific things you are dealing with. Let's see if we manage before lunch; otherwise we continue after the lunch break.
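As a footnote to the combinatorial point discussed above: enumerating all joint component-failure scenarios grows as 2^n, which is why the number of elements has to be kept small or the enumeration truncated at a maximum failure order. A small illustration, assuming a truncation parameter of my own invention:

```python
from itertools import combinations

def failure_scenarios(n_components, max_order=None):
    """Yield every non-empty subset of components whose joint failure
    would need to be checked with a pushover analysis. Truncating at
    max_order keeps the count manageable: the full enumeration has
    2**n - 1 subsets (about 1e6 for n=20, about 1e12 for n=40)."""
    if max_order is None:
        max_order = n_components
    for k in range(1, max_order + 1):
        yield from combinations(range(n_components), k)
```

For 20 components the full enumeration has 2\*\*20 - 1 = 1,048,575 scenarios, while truncating at pairs (`max_order=2`) leaves only 210, illustrating the trade-off between correctness and cost that the answer above describes.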