Hi everyone, welcome to the SmartQuest Seminar. Today our speaker is Dr. Panos Moutis from CMU, and he's going to talk about grid security. Let me remind everyone that our next seminar is next week, when we'll have a speaker from CMU talking about the control of distributed energy resources. Dr. Panos Moutis has been a special faculty member with the Scott Institute for Energy Innovation at CMU, where he also did his postdoc. His recent grants include one from the national system operator of the Marshall Islands for the development of a transmission expansion plan, and another from Google X for an electrical grid engineering tool. Between 2018 and 2020 he held a research professorship in Switzerland. Since 2007 he has contributed to more than a dozen big projects funded by the European Commission. He received his diploma and PhD degrees in electrical engineering at the National Technical University of Athens, has published more than 30 papers, and has contributed to five of those projects. He's a senior member of multiple IEEE societies, chairs an IEEE communications committee, and is editor-in-chief of an IEEE publication.

Good afternoon, everyone, both in the room and remotely. I'm very glad to be here. I want to thank CMU and Liano for their kind invitation, and I hope the presentation today will be interesting to all of you; we're going to have an interesting discussion following it. I will skip the formalities of the introduction, which did a very good job retracing the progress I have made from Athens all the way to the US over the past few years. I would like to give a different introduction of myself: what keeps me up at night, and that is the wider penetration of renewables into electrical grids, especially of those resources that are of a particularly volatile nature. To address that, we certainly need a wide range of tools and methods (optimization, control, and so on), and beyond research we also need to think about standardization and other kinds of policy initiatives. If I may start with a take-home message for you: if you get the opportunity to participate in voluntary activities of the IEEE, of NASPI, and so on, please jump on them and join. Your contributions are definitely going to be valued and to propagate in the long term across the industry.

The preliminary question with which I always open up this discussion is whether we do need artificial intelligence and machine learning in power systems. The thing is that, as we have been progressing through the various challenges and the new paradigms of energy operation, we're getting a more complicated electrical grid and a more complicated power system operation and market. At the same time, we come to the realization that no matter how many efforts there have been and how many different tools have already been used, we still find ourselves in catastrophic situations: the 2021 Texas crisis, which led to practically non-rolling blackouts, or the California wildfires, which were either caused by electric power system equipment or, in order to avoid fires being caused by that equipment, led to the electrical grid being preemptively disconnected and not serving tens of thousands of customers.
At the same time, we are coming to the realization that data from metering infrastructure, and in power systems in general, is reaching huge quantities: the 2012 figures for the US amounted to more than 100 terabytes of available information. The electricity industry is now also considering very seriously how to incorporate artificial intelligence and machine learning into power system operation, and not just into predictive analytics at the non-operational level. This is the EPRI initiative that started not more than two or three months ago; it tries to see what the path forward is and what landscape we should be considering as we try to digitalize the electrical grid more practically and more aggressively.

I'm going to start my presentation with one of my earlier works that is now doing the rounds again, mostly because we're talking about how reliable renewable energy resources are or can be. As a very short intro, I will just remind everyone what energy and active power are. Energy is whatever moves, heats, charges, or elevates any device around us. Active power, the differential part of that, is the derivative of energy over time: what happens at any given second. At the same time, we need to be thinking of the electrical grid in terms of voltage levels and congestion limits. Voltage levels means that all our devices (if you check the plug, it says so) need to be operating at some specific voltage level; this cannot be guaranteed just by itself, and naturally we need proper control actions and operation. Congestion falls on the side of the system where we need to consider how the existing resources we have (lines, cables, transformers) are getting overloaded or are at a good level of operation. These brief definitions are important, and especially, I would say, the dilemma between energy and power. Having the available energy to cover the demand is not enough; we need to have the available power. At any given second, generation must equal demand plus losses; otherwise we're seeing increased wear and tear, damage, and system collapse. That is what happened in the 2021 Texas freeze, which you see on the lower right part of this slide: for a brief few minutes more, maybe another one or two minutes, the whole Texas power system would have gone into a complete collapse, because the generation-equals-demand-plus-losses equilibrium was not met consistently for quite some time.

This is where the role of firm capacity comes in. Capacity is essentially a market structure: the long-term promise of energy availability at some point in time. Generating companies and investors are handed money to build new power plants on the promise that they will be delivering the energy of that plant after it has been built. So there is an underlying window of guaranteed active power output. And this is where it gets tricky for volatile resources such as wind generators, photovoltaics, and the like, because resources characterized by volatility cannot by default be considered within this framework of capacity markets, and hence of firm capacity. And this is only by definition, because practically speaking, and with the research advances of the past 20 years or so, we know that we can handle the volatility of renewables in very tangible and very practical ways: first of all, we can make the resources fully dispatchable, fully controllable; and we can also integrate them into hybrid paradigms, or even bigger than hybrid paradigms (virtual power plants, microgrids, and so on), and thus smoothen out, if I may, their volatility.
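To restate those two recurring definitions in compact form (a standard-notation aside, not taken from the slides):

$$P(t) = \frac{dE(t)}{dt}, \qquad \sum_{g} P_g^{\mathrm{gen}}(t) = \sum_{d} P_d^{\mathrm{load}}(t) + P^{\mathrm{loss}}(t) \quad \text{at every instant } t.$$

The second relation is the equilibrium that, in the Texas example above, failed to hold consistently.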
We know that, and you have seen that. The question is: okay, technically, in the broader sense, we can do that; practically, how does this pan out? What is the way to take the volatility of such resources, turn it on its head, and make it firm in its capacity ratios? We need to think of the hybrid renewable energy system: we combine various resources, with or without volatility; it is up to us, and up to the probability analysis we can build on them. And on this hybrid we start with a very straightforward, simplistic approach (we can make it way more complicated, but let's start simple): the n-minus-one criterion. We have a hybrid of n resources, and we know the maximum injection of that hybrid at any given moment. Let's assume that this maximum injection is lost. How do we handle this loss with the other resources? We need to re-dispatch the other resources to cover that loss. Simple as it may sound, it comes with a ton of complications, because we need to make this re-dispatching of our resources economically feasible. If we're selling our energy for six cents and the re-dispatching makes it 12 cents, well, okay, technically we are re-dispatching nicely, but if we do that at double the cost, no one is going to be happy; essentially, the one who owns that hybrid energy source and wants to make a profit. And on the electrical power engineering side of the problem, we need to be respecting voltage limits and congestion limits and to be prompt in such a dispatching: if we lose a resource, we cannot wait for hours until we dispatch the other resources to cover the loss. We need to be prepared. All of this creates quite a complicated landscape for addressing the problem.

Let's break it down into aspects. For the aspect of the electrical engineering challenges, which is of course the status of the grid in terms of loading, voltage limits, and so on, we need to rely on some kind of forecasting, which will come with uncertainty. For that reason, we need to account for multiple scenarios and treat volatile resources in a non-conservative way; otherwise we're going back to square one, curtailing whatever is volatile to avoid any challenge, which makes no sense in this new paradigm we're trying to discuss. As for the optimality of cost, which is an NP-hard problem for the electrical grid due to its non-convexity, we need to think in a smarter way and tackle the matter from another perspective. So, as we think of these challenges: what can be done, first of all, to handle the grid uncertainty is to be robust to that uncertainty. When we account for multiple scenarios, we need to explore the feasible space of all those scenarios and plan the best possible dispatch for as many of them as possible. So we are not just looking for one dispatching, where we lose one source in the hybrid and commit the others to cover the loss; just one solution might not suffice. Why? Because if we have other volatile resources that may also fail, we need backup plans for them. As you hear this, you're probably thinking: okay, so we're essentially doing a domino effect backwards through the hybrid. (A small sketch of the basic n-minus-one re-dispatch follows.)
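To make the n-minus-one re-dispatch idea concrete, here is a minimal sketch. The function, names, and numbers are all hypothetical, and a real implementation would also run a power flow to verify voltage and congestion limits, not just capacity headroom and cost:

```python
# Illustrative sketch (not the speaker's code): lose the largest injection in
# a hybrid plant and try to cover it with the cheapest remaining headroom.

def n_minus_1_redispatch(dispatch, p_max, costs, price_cap):
    """dispatch: current output per unit (MW); p_max: capacity per unit (MW);
    costs: marginal cost per unit (cents/kWh); price_cap: max average cost
    at which the re-dispatch is still considered economical."""
    worst = max(range(len(dispatch)), key=lambda i: dispatch[i])
    lost = dispatch[worst]
    new = list(dispatch)
    new[worst] = 0.0
    # Fill the lost injection from the cheapest remaining headroom first.
    for i in sorted(range(len(new)), key=lambda i: costs[i]):
        if i == worst or lost <= 0:
            continue
        add = min(p_max[i] - new[i], lost)
        new[i] += add
        lost -= add
    if lost > 1e-9:
        return None        # not enough headroom: the loss cannot be covered
    avg = sum(c * p for c, p in zip(costs, new)) / max(sum(new), 1e-9)
    return new if avg <= price_cap else None   # the economics check

# Example: a 4-unit hybrid where a 1 MW wind unit is the largest injection.
print(n_minus_1_redispatch([1.0, 0.5, 0.4, 0.3], [1.0, 1.0, 0.8, 0.6],
                           [0.0, 6.0, 7.0, 12.0], 8.0))
```

Note that this produces only one backup plan; the decision-tree approach described next is what yields describing rules covering many fallback dispatches at once.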
You'll see, in the way this is eventually solved, that it is not as reliant on volatility as it sounds; it kind of solves itself through the structure of the solution. As for the optimality challenge, the best trick we can think of here is to avoid solving the actual optimization and to resort to fast numerical solutions, power flows and so on, and work with those. So instead of solving multiple different optimizations for multiple different losses, we solve, I would say, a relevant problem in the vicinity of the previous status of the problem, to make it fast to solve; and building on that, we also want to make it optimal under various situations.

And here is where binary decision trees come in. They come in, in the sense that they can take a set of data that are marked as true/false, zero/one, binaries. By "decision" we mean that we take these binary-marked data and try to understand how the other attributes affect the binary categorization. The example in this case is food poisoning at a banquet. We have people who had some starters, some desserts, some main dishes, and through this decision tree we're trying to understand what combination of dishes led to poisoning-like symptoms. If you want to read this decision tree, for example on this part, it says that those people who didn't have dessert one and didn't have dessert two, but may have had starter one, were definitely not sick: only 4.1% showed symptoms. And if you read this part of the decision tree: all of these are really called rules; when we read a decision tree from a leaf all the way to the root, that is called a rule of the decision tree. So this rule, starting from this leaf, says that those attendees who had dessert one, had main two, had dessert two, and had starter one showed symptoms of food poisoning at 98.8%. So we have a dataset with multiple attributes, which can be continuous or discrete, it doesn't matter, but the categorization we care about is a binary one: in this case healthy or with symptoms, zero or one, true or false. No matter how complicated the other attributes are, the decision tree will try to extract a verdict: a rule describing this zero/one binary categorization in terms of all the other attributes. The most typical way to develop a decision tree is through the so-called entropy, the Shannon entropy. I'm not going to go much into the math, because I'm also giving a link to the papers if you want to read them in more detail. The Shannon entropy criterion means that I'm trying to split the initial set into subsets that are as non-diverse as possible. At every step of the way, you will notice that the true/false proportion of the previous step increases on the one side and decreases on the other. This is what the Shannon entropy does here: it is a metric of the non-diversification of the data, used to get to as clean leaves as possible. This is what the decision tree is, just as a reminder to anyone who hasn't heard about it in a while or is seeing a decision tree for the first time. As with all machine learning, of course, decision trees carry the risk of overfitting, and that's why it is advisable to use either post-pruning or pre-pruning, with various methodologies, to keep the model from overfitting.
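As a minimal sketch of the entropy-driven splitting just described: the banquet rows below are made up for illustration, and only the mechanics (Shannon entropy and the information gain of a split) follow the standard definitions:

```python
# Shannon entropy and information gain for binary attributes and a binary
# target (0 = healthy, 1 = symptoms). The data are hypothetical.
from math import log2

def entropy(labels):
    """Shannon entropy of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def information_gain(rows, labels, attr):
    """Entropy reduction achieved by splitting on a binary attribute."""
    left = [l for r, l in zip(rows, labels) if r[attr]]
    right = [l for r, l in zip(rows, labels) if not r[attr]]
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

# attributes: had dessert 1 / dessert 2 / starter 1 (six hypothetical guests)
rows = [{"d1": 1, "d2": 1, "s1": 1}, {"d1": 1, "d2": 0, "s1": 1},
        {"d1": 0, "d2": 0, "s1": 1}, {"d1": 0, "d2": 1, "s1": 0},
        {"d1": 1, "d2": 1, "s1": 0}, {"d1": 0, "d2": 0, "s1": 0}]
sick = [1, 0, 0, 0, 1, 0]
for attr in ("d1", "d2", "s1"):
    print(attr, round(information_gain(rows, sick, attr), 3))
```

The attribute with the highest gain becomes the split at the root; recursing on each resulting subset grows the tree, and reading a leaf back to the root yields exactly the kind of rule quoted above.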
Now let me show you the idea of how we solved the challenge of firm capacity from a hybrid renewable energy source in this setting. What you see here on the right, which looks like an octopus, is actually the ACOPF, the problem that underlies the dispatching of resources in a power system, projected onto two axes for your ease. You need to think of this octopus in the n-dimensional space, as the non-convex, non-linear feasible space that you see there. The global optimum, for our case the most economical dispatch of our hybrid resource for a given load we are serving, is this blue dot. This is where we are before we lose any resource. So let's assume that in our hybrid power plant we have a wind turbine of, say, one megawatt, and the rest of the hybrid renewable source amounts to, I don't know, maybe another three or four megawatts, but in smaller portions. If we lose our wind turbine, the old operating point will be here, and the new operating points are going to be in this new feasible space. To give you some intuition about why the feasible space grew bigger: it grew bigger because we lost one resource, so all the others need to commit more. So the feasible space of the problem expands, and we are trying to find the new global optimum, which is the yellow point here.

So what do we do to approximate the new global optimum? We use decision trees, in this way. We sample the feasible space of the problem with power flow solutions, not an optimization; we just try random dispatches that are in the vicinity of the previous dispatches. So in this part of the feasible space, I have sampled various possible dispatches that work for my load (check) and that work for my voltage and the other constraints (check), and I call them true. Then I'm going to mark some of these dispatches as false. How do I name some of them false? Based on their economics: all the feasible points in here that are more expensive than this ellipsoid I call false, and the rest I call true. Then I take a decision tree, I train it, and I learn from the decision tree how these active power set points are described. You can see here, for example, that I have to be at least in this part of the active power range for generator i, and at least in this part for the dispatching of generator j. Then I do the exact same thing again in the new feasible space: I find new samples, call some of the previous ones false because they're not economical, and step by step I am converging to the global optimum. For those of you who have some background in complexity and algorithms, you might have already guessed that what I'm doing here is essentially a binary search. I am doing a half-space search through the feasible region, and the trick I'm using is that, because the space of the problem is non-linear and non-convex, I use a decision tree to slowly narrow down to the global optimum. That's how I have solved the problem in this case. Here I have an example from one of my papers, where I assume that on one of the islands in Greece I have installed an almost ten-unit virtual power plant. What happens if I lose this wind park on bus 43? Well, this is what the decision tree tells me. It practically tells me: don't turn on the diesel units, because, as you will see, the cost of the diesel units is higher; and if I don't turn on the diesel units, I can cover the 70 kilowatts with the rest, which are cheap enough for the purpose. So this is how it practically works.
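A toy caricature of that sample-label-learn loop, under heavy assumptions: the AC power flow is replaced by a simple feasibility test, the dispatch cost by a made-up function, and the recentering is done directly on the best sample, whereas in the talk the tree's half-spaces are what narrow the search region. All names and values are hypothetical:

```python
# Caricature of the iterative decision-tree-guided dispatch search.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
LOAD = 5.0                        # MW the hybrid must keep serving

def feasible(x):                  # stand-in for a power flow + limits check
    return abs(x.sum() - LOAD) < 0.3 and np.all(x >= 0) and np.all(x <= 3)

def cost(x):                      # stand-in for the dispatch cost
    return x @ np.array([1.0, 2.0, 4.0]) + 0.1 * (x ** 2).sum()

center = np.array([2.0, 2.0, 1.0])     # dispatch right after losing a unit
radius, tree = 1.5, None
for step in range(8):
    # 1. sample candidate dispatches in the vicinity of the current point
    X = center + rng.uniform(-radius, radius, size=(400, 3))
    X = X[[feasible(x) for x in X]]
    if len(X) < 20:
        break
    # 2. label the cheaper half "true", the expensive half "false"
    c = np.array([cost(x) for x in X])
    y = c <= np.median(c)
    # 3. a shallow tree describes the "true" region in set-point terms
    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    # 4. recenter on the best sample and shrink, like a binary search
    center, radius = X[np.argmin(c)], 0.6 * radius

print("approx. optimum:", np.round(center, 2), "cost:", round(cost(center), 2))
if tree is not None:              # the rules that describe the cheap region
    print(export_text(tree, feature_names=["P_i", "P_j", "P_k"]))
```

Each pass trains a shallow tree whose printed rules are the set-point descriptions of the economical region, in the spirit of the generator-i and generator-j thresholds mentioned above.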
I have essentially explained this through the example, so I don't want to bore you with the details. What I want to focus your attention on is the third bullet, which says that there are some recent proofs implying optimality guarantees for the method. I think it is a classmate of yours here at Stanford (I forget his name; I don't know if he's a PhD student here) who has offered the proof that if there exists an optimal classifier for a classification problem, then there also exists a binary decision tree, recursively induced, that realizes it. This is big, from the aspect that if you know there is a perfect classifier, then you have the proof that there also exists an induced binary decision tree that solves the same problem. So you can actually solve it in a practical way, of course at the cost of some error and so on, but this is the idea. So what we have to show here, and this is hopefully one of the figures across my upcoming papers, is that there can be an optimal classifier for a non-linear, non-convex problem, and based on that, we can have a binary decision tree solving the exact same problem. I have a few papers on this subject; you can read through them at your own pace, and please feel free to reach out to me for any clarifications or follow-ups.

The second part of this presentation is going to focus on artificial intelligence applications. Machine learning is one thing: it has to do with identifying, estimating, a function. Artificial intelligence, on the other hand, usually means tools that try to optimize a problem: particle swarm, genetic algorithms, and so on. What I want to say with that is that, as you must be guessing, in most cases machine learning and artificial intelligence come in two flavors, function estimation or optimization. However, not all practical problems are either/or; we have problems that have nothing to do with optimization and nothing to do with function estimation. And this is why we need to reconsider AI tools that are specific to industries or to non-typical problems. Otherwise we are always forced to fit every industry problem into either an optimization problem or a function estimation problem, and clearly this will not work in all cases.

And here is where the discussion of voltage comes in. The reason we care about voltage in power systems is Ohm's law. When we have to transmit electricity over lines with impedance, this impedance will cause a voltage drop. So the simplest answer to the question "how can I overcome the challenge of voltage drops?" is to limit the flow of energy over lines, over impedances, and bring the energy generation as close to your load as possible. And it is exactly on this gist that I come and conceptualize a new view of distribution systems. Distribution systems are usually radial; medium and low voltage are radial. So what I come and say is: okay, I'm going to forget that the radial feeder has these branches; I can lump them onto this part, and I can reimagine this distribution feeder as a rod, as a physical object. I can also model the loads as point masses, point masses on this rod. I will then use the impedance of the feeder up to every point along it and call this my axis system. So instead of using distances in miles, I'm using distances in ohms. I can rethink the feeder as a rod, with an axis system oriented in impedance, and the loads and the generation as point masses.
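A minimal sketch of that re-coordination on a hypothetical four-bus feeder: each bus is placed on a one-dimensional axis at its cumulative impedance from the substation (using the scalar magnitude |Z| here is itself a simplification, since impedance is complex):

```python
# Place each bus of a radial feeder on an "ohm axis": its coordinate is the
# cumulative line impedance from the substation. The topology is made up.
feeder = {            # bus: (parent bus, |Z| of the connecting line in ohms)
    1: (0, 0.4),      # the substation is bus 0, at coordinate 0.0
    2: (1, 0.3),
    3: (2, 0.5),
    4: (2, 0.2),      # a lateral branch hanging off bus 2
}

def ohm_coordinate(bus):
    """Distance of a bus from the substation, measured in ohms, not miles."""
    z = 0.0
    while bus != 0:
        parent, z_line = feeder[bus]
        z += z_line
        bus = parent
    return z

for bus in feeder:
    print(f"bus {bus}: {ohm_coordinate(bus):.2f} ohm from the substation")
```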
With that, I can conceptualize the center of mass of the loads. The center of mass will be given as a distance, in this case a distance in impedance from the substation. So instead of having, as in plain physics, a distance of so many meters from one end of the physical object, I describe it in impedance. Now, why is this important? It is important because I can think of the loads as one kind of mass and the generation as a complementary kind of mass. I can separate the two and say: okay, if the center of mass of the modelled loads is counterbalanced by the center of mass of the modelled generation, I am essentially reducing the length, in this case the impedance, that the power flows have to go through. What did I tell you three slides ago? The closer I get the generation to my load, the fewer the voltage drops. This model does exactly that: you model a whole feeder of loads as a center of mass, a whole feeder of generation as the opposing, complementary center of mass, and you try to counterbalance them. How do you do that? You just move the generation around. So the generation that was near bus 150 can be reduced, and we can increase the generation, or start up a generator, or discharge a battery, near bus 9. You see how, with this adaptation of the generation near bus 150 and of the generation or battery or whatever we have near bus 90, the center of mass of generation has moved closer to the center of mass of the loads. You thereby reduce the length of the lines that the power flows have to go through, and thus reduce the voltage drops. The idea is broadly as I described it, but those of you who are a little more observant might ask: okay, do you stop once you have moved the center of mass of generation close enough to that of the load for the whole feeder, or do you keep going within parts of the feeder? The answer is the latter. You first counterbalance the centers of mass between generation and loads for the whole feeder, and once that has been counterbalanced, you move on to the next part, and you do that for a couple of rounds until all the voltages get within limits. So what you see here is that the initial state of the voltage levels is the bottom line in this graph, which falls below the acceptable level of 0.9 per unit. In this feeder I have re-dispatched the active power of generation, and maybe used some demand response or whatever else I had, so that the centers of mass of generation and demand are counterbalanced across the feeder and then in smaller parts of it, until, as you can see, the voltage levels have moved completely above 0.9 per unit. You see how, at every step of the re-dispatching, I was improving the voltage by a tiny bit, all the way to above the 0.9 per unit, as we would like. This is the publication; I haven't worked more on it, because I haven't thought of another way to improve it. As I said in the opening slide of this second part, it is neither an optimization problem nor a function estimation problem, so the improvement I can make has only to do with how the conceptualization might be improved.
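In symbols, the center of mass on the ohm axis is $Z_{cm} = \sum_i Z_i P_i / \sum_i P_i$, which matches the calculation the speaker spells out in the Q&A below. A minimal sketch of the counterbalancing step, with hypothetical coordinates and kilowatt values:

```python
# Centers of "mass" for load and generation on the ohm axis, and one greedy
# re-dispatch that moves the generation center toward the load center.
def center_of_mass(points):
    """points: list of (ohm coordinate, power in kW); returns ohms."""
    total = sum(p for _, p in points)
    return sum(z * p for z, p in points) / total

loads = [(0.4, 100.0), (0.9, 300.0), (1.2, 200.0)]
gens  = [(0.4, 250.0), (1.2, 150.0)]       # adjustable units

print(f"load CoM {center_of_mass(loads):.3f} ohm,"
      f" gen CoM {center_of_mass(gens):.3f} ohm")

# The load CoM sits deeper in the feeder than the generation CoM, so shift
# 50 kW of output from the unit near the substation to the unit deeper in
# (back one unit down, discharge a battery at the other bus, and so on).
shift = 50.0
gens = [(0.4, 250.0 - shift), (1.2, 150.0 + shift)]
print(f"gen CoM after re-dispatch: {center_of_mass(gens):.3f} ohm")
```

Repeating such shifts, first for the whole feeder and then within its parts, is the "couple of rounds" the speaker describes for pulling all voltages back above 0.9 per unit.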
And now let me gather some points for you, and we can ponder them together. The reasons for using artificial intelligence and machine learning in power systems are by now crystal clear. The energy portfolios are very diversified, and they have a lot of stochasticity in them: we used to have stochasticity only from loads, and now we consider stochasticity from renewables and from diverse stakeholders. There is not one utility; there are now multiple entities in the electricity market. And we also have large amounts of data that, in most cases, are just sitting there. For artificial intelligence and machine learning tools per se, there are the opportunities of optimal control: as you are already aware, the ACOPF is a non-linear, non-convex problem that we have yet to truly solve, even with the most advanced relaxations and other mathematical structures. The stability problem is a problem of local extent that can expand all the way to a full-blown grid collapse, and power quality is becoming an aspect as well, as we move to more and more power-electronics devices.

On the path forward, there are challenges and considerations. One challenge, a very hot discussion at the moment, is whether we need AI and ML tools to be interpretable. Interpretability means that we can take a tool and, as we're discussing it, say: this is mathematically provable, one and one equals two, this is not a black box. We would desire AI and ML to also have that kind of open knowledge structure, to not be black boxes. But anyone who has worked with AI or ML, even in the simplest setting, understands that beyond some size a model almost becomes a black box due to complexity, not because we actually want it to be a black box. Then there is the lack of standardization: software engineering lacks the notion of the standard or the patent. You can copyright software, and you can describe it in broader mathematical frameworks, but it is more difficult to standardize, and that standardization will be a very interesting challenge to address. I'm saying that because a good friend of mine, Spyros Chatzivasileiadis, is trying to offer tangible data and proofs to show, for example, how neural networks can be translated into linear functions. If you can give this analogy, between a tool that is in the broader context a black box and saying "okay, stop, it is just linear functions and this is how you interpret them", you offer a helping hand to utilities and operators to better use these tools. We also need crisper guarantees of robustness and optimality. Because we still consider electricity, like water and other things, a basic social-welfare product, we cannot just say, okay, let the market mechanism work this out; we need to have the lights on, and we need to have water in our taps. So robustness and optimality are a necessity there. We also need to open academic research up more to practice and to industry; there is some siloing there that has been creating, at the very least, frictions and, at worst, difficulties in understanding each other's starting points. And of course, support initiatives such as that of EPRI and, as I said in my introductory slide, join the voluntary activities of the IET and the various other engineering and computing bodies around the world and across the sciences. And with that, I want to thank you for your attention and thank you again for having me here. I hope I have planted some seeds of thought in all of you, and we can have an interesting discussion going forward. Thank you very much.
I was trying to understand the second half of the presentation, about mapping the geographic, physical grid, like the branches of a feeder, to the resistance-based, one-dimensional one. You said that it wasn't an optimization problem, but then I was wondering: what was this shifting things around until the voltage got above a certain level? How is that similar to, or different from, optimization? I mean, I get the concept of the mapping; that's really cool and valuable by itself.

So, this is not optimization in the sense that the underlying problem, the voltage problem, is a problem of locality: you can solve it by locally counterbalancing, by locally compensating the losses caused by the flow of power, the flow of current. In the broader sense it cannot be expressed as an optimization problem, because we go back to non-convexity and so on. So, in that sense, the voltage problem is conceptualized as a problem of local resource matching, which is to say, modelling local resources as if you were solving multiple different optimization problems: you want every locality to be solved, one optimization problem here, one optimization problem there. However, if you think about it, the localities, since there is impedance connecting all of them, are linked as well. So if you wanted to translate this into an optimization problem, it would be, in the best-case scenario, Pareto optimization, or even worse, a game: you need to make sure that everyone is happy and no one is falling behind. This is why I was saying that in such kinds of problems, where the optimization modelling might exacerbate the size of the problem, you need to think of the root of it. What is the voltage drop problem? A local grid problem. That's why we hear about, for example, capacitive compensation along the feeder and so on: local generators at the other end of the country can send reactive power, but in the meantime you have losses all over the overhead lines. So I'm not rejecting the view of the problem as an optimization, but you would be making it an optimization problem while at the same time making it very complicated to actually solve in a meaningful way. These steps that you see here, for example, take less than a second to solve. And if you consider that the resources in the distribution system are relatively fast (photovoltaics, batteries, demand response: just turn off the switch and that's it), one second of solving the problem, combined with the fast dynamics of the resources, amounts to no more than a few seconds, not even more than a dozen seconds, to bring such a system within the proper quality level. That's also why we try to avoid lengthy solution procedures in power systems: they reflect on the operation, which needs to follow along.

So, just to make sure I'm understanding: you're saying that this conceptual framework allows us to look at the problem as many separate local problems, as opposed to one big global optimization, and that's the role of the conceptual mapping? Exactly. It's like a very simple optimization in a small neighborhood. Exactly. Got it.

There are some questions online. Sure, I can read them, don't worry; I'll show them to everybody. Okay: "How do you mean to describe an artificial neural network as a linear function? That is completely false, because the beauty of NNs is exactly the nonlinearity of the activation functions and of filters like convolutions; please clarify that." Again, this is not one of my works; I refer you to the work of Spyros Chatzivasileiadis at DTU. What he has been trying to show is exactly how some neural network structures can be modelled as linear functions, and with that to offer a premise for power system operators to have, let's say, friendlier feelings towards machine learning tools. So I do understand your objection, but I cannot expand more; this is not my work.
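For context on that exchange: the property such work leans on is that ReLU networks are piecewise linear, so within one activation pattern the network is exactly an affine function. A tiny self-contained illustration, with made-up weights, not taken from the referenced work:

```python
# A ReLU network is piecewise linear: fix the activation pattern around an
# input and the network collapses to A x + c there. Weights are made up.
import numpy as np

W1 = np.array([[1.0, -2.0], [0.5, 1.0]]); b1 = np.array([0.2, -0.1])
W2 = np.array([[1.0, 1.0]]);              b2 = np.array([0.0])

def relu_net(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

x0 = np.array([0.3, 0.4])
active = (W1 @ x0 + b1 > 0).astype(float)   # activation pattern at x0
# With the pattern fixed, the net is the affine map A x + c:
A = W2 @ (np.diag(active) @ W1)
c = W2 @ (np.diag(active) @ b1) + b2
x1 = x0 + np.array([0.01, -0.02])           # nearby point, same pattern
print(relu_net(x1), A @ x1 + c)             # identical outputs
```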
The next question: "The load-mass technique as given in theory does not seem implementable in real operation; the calculation of its integral is not easy." If I understand correctly, the objection is that the calculation of the center-of-mass model is difficult. I agree on the aspect of the integral if we were going to solve this as a problem of physics, with distances and weights. But if we model the electrical grid on a distance axis of impedance, and we tag the loads and the generators with their total impedance from the infeed at the substation, it is easy: essentially, and I can even pull up my paper for this, you calculate impedance multiplied by load, and then you divide all of that by the total load of the feeder, and with that you approximate the center of mass in impedance. So I do agree that taken in the physical sense it is not easy, but if you use the physical intuition to project it onto a new system of axes, it is not actually difficult at all. If there is a follow-up, please feel free to unmute yourself or put it in the chat, and I would love to follow up on this. In the paper that I pulled up, you can see all the details of the calculation of the center-of-mass model.

Another question showed up: "Do you assume a radial network for computing the center of mass?" As I said during the presentation, the example I had in the paper, and that I also calculated, was a radial system. I would take a distribution feeder and lump up the branches: I would first run a depth-first search to find the path of longest impedance in that feeder, and that would be my modelling of the feeder as a rod; all the other branches would be lumped, with their loads and everything, onto this main model of the feeder as a rod, as a physical object, and I would work on that (see the sketch below). From calculations and preliminary tests I did at the time, I was able to show that we can also go to two dimensions: instead of considering the distribution feeder radial, it might also be a loop, a meshed network. At that point, though, we would have to go to a two-dimensional model of the feeder, or of the grid, as a physical object, attach the point-mass models to it, and do the counterbalancing between generation and demand there. Any more questions? Thank you very much. Thank you very much.
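A minimal sketch of the depth-first rod construction from that last answer, on a hypothetical five-bus tree; a real implementation would carry complex impedances and the lumped loads along:

```python
# Depth-first search from the substation for the longest-impedance path
# (the "rod"); the other branches get lumped onto their attachment buses.
children = {0: [(1, 0.4)], 1: [(2, 0.3)], 2: [(3, 0.5), (4, 0.2)],
            3: [], 4: []}     # bus: [(child, |Z| of the connecting line)]

def longest_impedance_path(bus):
    """Return (total ohms, path) of the impedance-deepest route from bus."""
    best = (0.0, [bus])
    for child, z in children[bus]:
        z_sub, path = longest_impedance_path(child)
        if z + z_sub > best[0]:
            best = (z + z_sub, [bus] + path)
    return best

ohms, rod = longest_impedance_path(0)
print(f"rod: buses {rod}, length {ohms:.1f} ohm")
# Bus 4 is off the rod here; its load would be lumped onto bus 2 before
# placing the point masses on the ohm axis.
```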