Really happy to introduce our next speaker for this morning, Professor Ram Rajagopal. He's Associate Professor in the Department of Civil and Environmental Engineering. Ram also directs the Stanford Sustainable Systems Lab here at Stanford that focuses on large-scale monitoring, data analytics, and stochastic control for infrastructure networks. Ram has also been involved with many of the Bits & Watts projects, which is one of the initiatives of the Precourt Institute for Energy. I'm very happy to have you here, Ram, to talk to us about grid modernization and data science as it relates to it. So the floor is yours. Thank you for the introduction, Arpita. I've always loved this program. It's an opportunity to meet all the new students and also for myself to learn from the other faculty in some of the other sessions. So what I wanted to do today is to start with this concept of a learning grid and then illustrate two ways in which we are exploring this in research here at Stanford in my group. And that will help you see how data science can be useful for grid modernization. So what is the learning grid? Just to start off, if you think about the grid as a system, and we will see in a moment details about the system, a system that can be learned or managed from learning is a system where data is collected about the system. Then you apply some learning on that data, typically to infer the state of that system. Now that I have the state and a goal for my system, say, minimize cost while scheduling power, that's a typical objective on the grid, I can take my model and my predictions and use them in an optimization framework. The outputs of that optimization then need to be implemented in the real system.
This can happen because we have operators, like in our independent system operators there are actual operators who make the decisions and implement them, or it could happen more automatically, like in a storage system where a computer calculates all these numbers and automatically decides all the settings. And then the key is, once you take the actions, the system responds to all these actions, and then you can again collect more data, learn better, optimize, and act. It's very, very important to have this loop in any such learning system, because what you want to be able to do is, by making the system go through different behaviors, improve the quality of your model. So in the beginning, your models are not going to be very good, but as you continue in this loop and persist through it, the models will typically surpass any kind of traditional modeling, which is static. So I had a question there, which is: of these four activities, collecting data, learning, optimizing, and acting, which is the hardest to do in our electricity grids today? You can imagine a utility, or Tesla, or even CAISO, the California Independent System Operator. There you go. So 28% of the people said collecting data, 13% said learning, 40% said acting, 19% said optimizing. This is a very, very good answer here. The two hardest steps are really collecting the data and taking actions. And the reason for this goes beyond the technology issue. Typically the data that you need in order to do effective learning involves data from multiple organizations. So let's say you are an owner of a collection of distributed storage systems, or you're a Tesla, let's say. In that case, you need the meter data from your customers, but the meter data is owned by the utility, and it's hard for the customers to share that right away, even though they have the rights to do that.
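The collect, learn, optimize, act loop just described can be sketched in code. This is a minimal toy in which every name and dynamic is invented for illustration, a one-number "system", a running-average "model", and an "optimizer" that simply matches generation to the prediction; it is not an actual grid control scheme:

```python
# A highly simplified sketch of the collect -> learn -> optimize -> act loop.
# Every name and model here is an illustrative stand-in, not a real grid
# control algorithm.

def collect_data(system_state):
    """Measure the system (think meter readings, line flows)."""
    return {"demand": system_state["demand"]}

def learn_model(history):
    """Fit a trivial model: predicted demand = average of past observations."""
    demands = [obs["demand"] for obs in history]
    return sum(demands) / len(demands)

def optimize(predicted_demand):
    """Schedule just enough generation to meet the predicted demand."""
    return predicted_demand

def act(system_state, generation_setpoint):
    """Apply the setpoint; the real system responds (here, a toy dynamic)."""
    system_state["demand"] += 0.1 * (generation_setpoint - system_state["demand"])
    return system_state

state, history = {"demand": 100.0}, []
for step in range(5):
    obs = collect_data(state)              # 1. collect data
    history.append(obs)
    prediction = learn_model(history)      # 2. learn (model improves with data)
    setpoint = optimize(prediction)        # 3. optimize
    state = act(state, setpoint)           # 4. act; the system responds, repeat
```

The point of the loop structure is exactly what the talk emphasizes: each pass through `act` perturbs the system, which enriches `history`, which improves the model on the next pass.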
So these institutional barriers and organizational barriers are typically a big issue around data collection. Of course, when we scale the management of our energy system, as I'm going to show in a moment, there will be many, many more data streams, and figuring out how to pull all of that together in real time is going to become a challenge as well. So there's going to be a technology challenge too. In terms of the action, the reason for the challenge is that in the grid, reliability is the key motivation. Cost comes second. You necessarily want a grid where electricity is available for you whenever you want to use it. If you can do it at a low cost, great. But if not, the priority is the reliability. And because of that, it's hard to implement strategies that incorporate learning, because learning might produce models that don't make such good decisions when put into the optimization, and that could affect the reliability. Unlike a web application, where if the wrong ad is targeted to you the consequences are not so bad, just an annoyance, in the grid it can have some serious consequences. So typically the way this is dealt with is that transforming any kind of learning, modeling, or data science insight about the grid into actions requires you to do demonstrations at various scales and levels over time. So now let's see why we need to worry about such an idea of the learning grid. This already happens today, and it's accelerating more and more by automating some of these steps. But why do we want to do that? You must have seen in some of the other talks what the grid is. So here I just have a summary. The traditional view of the grid is that we have centralized, large generation that produces power. These generators basically produce enough power to meet the demands. So basically, generation follows load.
The demands are figured out by the utility, who reads meters from customers, goes into a wholesale market, and buys the power from the generators. And the system operator makes sure that demand and supply match while meeting transmission constraints. The distribution network is managed by the utility, just to make sure there's enough reliability so you get the power. There was not much visibility into the customers. And this has worked extremely well for the last 50 to 100 years, if you think about the time of Tesla and Edison. But this grid is changing. We have the addition of renewables, storage, and so on. That means generation now is of different types, and it's not necessarily controllable. It's variable. At the same time, the consumer is becoming far more sophisticated, with a lot of different resources, like EV charging, storage, the capacity to generate power themselves. And there is this idea now that in this changing grid, the only way to get a cost-effective solution for the balance of power, given the reliability constraints that we have, is not to have generation follow load, but to actually have generation and what we call load, which is everything that's behind the meter, behind the distribution network, meet each other somewhere in the middle. And that opens up a host of questions and issues. The big challenge is, how do we do that? Because today, we don't necessarily know the preferences of people on how they want to charge their cars. We don't really have good models of the distribution network, which means that if I want loads to participate in matching demand and supply, say by reducing my load when generation is lower, I need to make sure the reliability of the distribution network remains the same, even when power is flowing back and things like that. And then in the markets, we are seeing a dramatic change with the participation of all these different resources.
And building these solutions and scaling them up, designing marketplaces, algorithms, etc., is the main focus of a lot of the research that goes on. One critical challenge is how we get all of the information and inputs needed to design these solutions. If you look at the requirements of the future grid, there are many different ones, but I feel personally that these are the three critical requirements you're going to face. First of all, we will have to move to much more autonomous operations, because you're going to have a larger number of assets interacting in such complex ways that it's going to be very hard for a system operator or a human to make those decisions the way it happens today. The next thing you need is far more complex modeling. Today the models are mostly based on physics alone, with a little bit of economics to design the markets. But now, if homes are trading power with each other, and they gather together to participate in the wholesale market, and generators are making bids, all of this needs to be modeled. And this happens at multiple scales. And it happens through novel data, because we now have data through the cloud for all these devices, and we need to be able to use that to build these models. The last piece, I think, is going to become more and more important. As we are seeing on the grid right now due to COVID, we need some adaptivity in all of these systems. I can't create rules of thumb and stick to them, because as things change, my rules of thumb fail. My traditional forecaster may make a mistake. So now we need much more adaptive forecasting. In the same way, if resources are going in and out depending on conditions in the real world, we need some adaptation to that as well. So these are the requirements. And the challenge is, okay, how do I address these requirements?
Well, the opportunity is really that we don't have to use just the traditional grid data anymore. What is traditional grid data? You would be surprised to learn that the traditional grid data are sensors at transformers, some market data, and some monitoring equipment on the transmission lines. So this is the transmission network data and the distribution network data. But actually now we have a lot more information. We have meter data from homes, we have data from EVs, both while they're charging and while they're driving, there's traffic data, there are all kinds of weather information, satellite data, etc. And to build that learning grid, the real way to scale any solution is figuring out how to combine these different types of data and then doing those learning and optimization steps. It's great to have mathematical models, but unless you can actually tailor them to each one of the individual scenarios, you can't really scale solutions. So where do we use this data in this future grid? And even today, but especially in the future grid. I made a little list here, which you can look through in the slides. In my mind, you can group it into a few categories. The first one, preparing the data, I will skip; you all know what that is about. But the real first thing that we want to do is to learn about the different devices, the network, and the dynamics of the system. There is a lot of that that needs to be done, and it's right now one of the hot topics of research in this area. The next thing you can do is, if I have the ability to learn all the inputs that go into my control, optimization, and market scheduling, I can also start learning how to do better optimization. So I learn all these inputs, and maybe I also learn about decision making under uncertainty over time.
So that learning to optimize in the context of the grid, with the constraint of resiliency or reliability, is a little different problem than what you traditionally see in reinforcement learning and so on. There will be a lot of research that needs to be done around that. Because of climate change particularly, and the introduction of a lot of new technologies whose cost curves are changing so fast that they are hard to predict, we also need to learn how to plan. Planning was not learned before. We had rules of thumb, and we followed them to build a plan in terms of what investments we need in generation, the network, and so on. That is going to change. And then, more in terms of the operations of the system itself, we need to do better predictions of the state. That means knowing how power is flowing. Where is it going? What are the voltages around the network? Those measurements need to be incorporated by all the different systems that operate today, and that has not happened at all yet at the scale that needs to happen. And finally, we will need to greatly improve our ability to detect change, both in the short term and in the long term. In the short term are things like: okay, there is a forest fire, some parts of the grid were taken away, did it also trip some switches somewhere? I need to be able to detect that. And of course, even predicting those faults in transformers and so on that might have caused the forest fire. So all of these are places to use the data. And the last thing I want to do in this introduction, before showing you some actual examples, is to give you a tip if you are interested in this area: there are infinitely many problems. You can go from modeling a storage system from data in a very high level of detail, all the way to modeling the whole grid, all the way to planning for that whole grid over a horizon of 10 years. So those are very, very different scales.
But whenever you want to do some data science or data-driven research, I have noticed a few rules of thumb that are very, very useful. First of all, don't worry about methods. The exact method you're going to use, linear regression or neural nets and all that, is less important, because you will be able to develop it once you have the problem statement set up and all the data that you need. So in fact, the first thing you need to do is pick the problem you want to solve, and identify, for that problem, what the data is. Typically, interesting problems will involve new data. That doesn't necessarily mean new sensors; it could be sensors that were used for one purpose being used for another, like smart meter data, collected for metering, being used to schedule and dispatch storage. That's an example right there. So figure that out, and I'll give you a second tip about data. If you go through journals and presentations, you will notice that all the interesting data science projects link multiple data sources together. Even when you're doing projects around reinforcement learning, optimization, and so on, that is also good advice. So spending time on this step really is key to being able to solve problems. The next thing that you need to figure out is, for the problem you're solving, do you have supervision or not? What do I mean by supervision? A lot of times we collect operating data from a system, and we don't have labels to tell us what was actually happening with the system. If you have those labels, there is a whole set of problems around how you build solutions and optimization; if you don't have those labels, it's a whole other set of problems. So typically, that will determine the roads you need to take. As much as you can, and you might find it silly, label data; it will prove to be your best friend.
A lot of the projects we did in my lab, but also in Inês Azevedo's group and many others, had students spend time doing this labeling, and they got a lot of benefits from it. And the third thing is, once you have figured out the question you want to answer and the data, and built a little model for it, it's not enough to show that the model performs well. These days, that's not so interesting. We know that you will eventually find a good model. The question is really, can the model give you compelling insights or significant performance gains in a somewhat practical setting? Those are my three tips on how to quickly get into an interesting project. Just taking traditional data, going for a very complex neural network or deep learning model, and saying, hey, this is the innovation, is not going to give you a lot of interesting solutions these days. That's my personal thinking on this topic. Okay, so now I want to show you two projects that illustrate somewhat these issues that I talked about. First, we did a project in the last four months in my group where we were looking at the impact of COVID-19 restrictions on the grid. How do the restrictions impact electricity demand? That was the key driving question. Do they reduce electricity demand, or do they increase it? Because it's not completely clear: you don't consume so much in the office, but maybe you're consuming more at home because your home is less efficient, as an example. The second question came later; when we started this off, we just started with that first question, but over time we figured out that it's very interesting to understand the relationships between demand and mobility. And one of the reasons we started this project was that several of the students were doing lab projects, and for many months they were not able to access the labs.
And so we were figuring out, maybe they wanted to do something. They said, you know, maybe we can work on some questions related to COVID. We had been reading statements in the news about how the load shapes for the grid have changed, but there were many contradictory statements. And we thought, what if we got the data and did a proper statistical study? That's how this started. Let me show you what we learned. First of all, what approach did we take? You want to understand how the grid demand is changing, which means you need a counterfactual. I only observe the demand after the change; I don't know what would have happened in the absence of the change, which is what I need in order to actually calculate the change. So the first thing that you need to build is what's called a counterfactual model. It's a predictive model that predicts the electricity consumption every day, every week, or every month based on historical data from before the incident you're analyzing. It allows you to say: if nothing had happened, and my predictor is very good, I can believe that this is what would have happened. You take the difference between those two; that's the change in demand. That's what we call the baselining in the demand prediction, and you can then estimate the percent demand change. Once you have that percent demand change, we can correlate it with many different things. The first thing that we correlated it with was the confinement index. We have all these different confinement policies around the world, and there is no way to do an analysis unless you codify all those policies. Luckily for us, some groups around the world have done the painstaking work of codifying lots of different restriction policies, in terms of intensity, into some categories. Whenever you have this type of work, that is a necessary piece. And then you can do things like understanding the results.
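The baselining idea, a counterfactual prediction followed by a percent demand change, can be sketched with made-up numbers. The per-day-of-week average used as the predictor below is a deliberately simple stand-in for the counterfactual models described above, and all values are synthetic:

```python
import statistics

# Illustrative sketch of counterfactual baselining (not the study's actual
# model). Baseline: average demand for each day of the week over the
# pre-restriction period. Counterfactual = that average. Percent demand
# change = gap between actual and counterfactual.

# Synthetic history: 40 pre-restriction weeks; weekdays use 100 MWh, weekends 80.
pre = [(100.0 if d % 7 < 5 else 80.0) for d in range(280)]

# Counterfactual model: mean demand per day-of-week, fit on history only.
baseline = {dow: statistics.mean(pre[d] for d in range(280) if d % 7 == dow)
            for dow in range(7)}

# During two weeks of restrictions, actual demand is 15% below normal.
actual = [0.85 * (100.0 if d % 7 < 5 else 80.0) for d in range(14)]
counterfactual = [baseline[d % 7] for d in range(14)]

pct_change = 100 * (sum(actual) - sum(counterfactual)) / sum(counterfactual)
print(round(pct_change, 1))  # prints -15.0
```

The key discipline, as in the talk, is that the model is fit only on data from before the event, so the prediction really is "what would have happened".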
So we collected data for 60 countries and regions in the world. This took us about a month to do. We went through different websites, contacted people, etc. And we now have a database of hourly electricity consumption from 2015 to today for all these places, plus some level of corresponding weather data, which could be improved. But here's what we got. First, let's see what happened to the grids. I'm just going to skip the counterfactual model; you can look later in the slides or read the paper that's on arXiv. Here's the first result that we got. In the colors, darker means deeper change: deep red means the demand decreased by 20%, versus yellow, which means no change. And you can see all the different countries and regions in the world for which we had data. The hatching indicates the restriction, the confinement level. Confinement level three was the strictest restriction, with shelter in place, so people have to stay at home, etc. We created the hatching to show whether, in that month, less than 50% of the month was at level three, between 50% and 100% of the month, or the entire month. Since this picture is pretty small, let me just zoom into a few here. Here are Europe and Asia. You can see that in the first period, which is February, the first month, there was not much change. But in China, there was already a demand change, and they were already in level three restrictions. You can see there that their demand dropped by 20%. And if you go through the plots here, one after the other, you can see that in some way, the restrictions do seem to correlate with deeper drops in demand. So that was the first insight. But the more surprising insight for us was that, for example, if you look at this third panel here, Italy and India had a very deep demand change, the same order of magnitude as China.
But the other regions did not. And why did that happen? Is it something to do with the intensity of shelter in place or the number of days? Or was it something else? It could be that in some grids, a lot of the electricity consumption goes into manufacturing, for example, and if manufacturing stops, the demand decreases a lot. Unfortunately, we don't have the data broken down that way. So that was an important question. And it had nothing to do with the number of COVID cases; we didn't do a deep analysis on that, but it didn't seem to correlate with it. The other interesting thing is that in different regions of the world, with different types of people and different cultures, you could see similar changes in demand. And places with the same culture, for example the whole of Europe, are not homogeneous. So there's something interesting going on here. So what we decided to do was, since everything is varying everywhere, why not take the demand change time series and use clustering to find out if there is any commonality. And to our surprise, we found four clusters, which we label here extreme, severe, moderate, and mild. You can see here the different countries and regions that belong to each cluster. The US is mostly moderate. Europe is mostly in the severe category. Just so you understand: in the 2008 recession, electricity consumption in the United States decreased in some places by 2%, and it was a big alarm. Now think about 7%. That's gigantic. Or look at Italy and India, where demand decreased by 26% at its peak; that's a monster amount. All of these numbers are unheard of. So the next question we had is, okay, this is really cool, there are some clusters, so there are not infinitely many ways the demand has changed. And you can see here there's a decrease and then a plateau. We haven't looked at the recovery; that would be the data from June, July, etc.
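A minimal version of that clustering step might look like the following: a tiny hand-rolled k-means on synthetic demand-change trajectories. To keep it short there are only two groups here rather than the four clusters the study found, and every number is invented:

```python
import random

# Minimal k-means sketch for grouping demand-change time series.
# Purely illustrative: two synthetic groups instead of the study's four.

random.seed(0)

def make_series(depth):
    """A 12-week trajectory: demand change ramps down to `depth` %, then plateaus."""
    return [depth * min(1.0, w / 4) + random.uniform(-1, 1) for w in range(12)]

# Six regions: three deep-drop (-25%) and three mild (-5%) trajectories.
series = [make_series(-25) for _ in range(3)] + [make_series(-5) for _ in range(3)]

def dist2(a, b):
    """Squared Euclidean distance between two series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(data, k, iters=10):
    # Simple init: spread the initial centers across the data set.
    centers = [list(data[i * (len(data) - 1) // (k - 1)]) for i in range(k)]
    labels = [0] * len(data)
    for _ in range(iters):
        # Assign each series to its nearest center, then recompute centers.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in data]
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(col) for col in zip(*members)]
    return labels

labels = kmeans(series, k=2)
# The three deep-drop series share one label, the three mild ones the other.
print(labels)
```

In practice you would use a library implementation and choose the number of clusters with a validation criterion, but the mechanics of grouping whole trajectories by their shape are the same.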
We then wanted to figure out, can the confinement index level actually explain this? In order to do that, we used a machine learning model to regress the demand change against the number of days at each confinement level. And basically, what we found is that the relationship is stronger for those groups where the change was more extreme. I'm going to skip all of that. In terms of load shapes, we found that, indeed, weekday load shapes have become like weekends, particularly in regions where the demand change was stronger. There are a lot of next steps in this analysis, including building models for sectoral impact, using the same methodology to understand climate change impact, and developing a global resiliency monitoring system. Since I will not have time for the second half of the presentation, I'll just open up for questions now. We have a question from Karen. Karen, go ahead and ask your question. Hello. Thank you very much for the insightful presentation. I have a question on your point of having generation and load meet each other. What is your opinion on variable electricity prices to the end consumer as a mechanism to incentivize end consumers to adapt and shift consumption based on the increasingly volatile electricity production? Yeah, this is an excellent question. Right now we have time-of-use prices, which are not real-time adaptive, even though they call it real-time pricing. It's more like prices which you might set every year or something like that. People talk about maybe setting prices monthly. Some customers, like Stanford, participate in the wholesale market, so they do see the prices and respond to them. The challenge in designing smart pricing is really understanding the elasticity of the demand. You need to learn that, because unless you know it, your pricing algorithms might work poorly.
Another reason is that in the grid, most of the dispatch is decided in advance, maybe 80 to 90% of the power. So there needs to be some predictive ability in doing all of that. Thank you, Professor. Very fascinating. My question is quite simple. In terms of the work being done on grid balancing, grid analytics, and grid learning, who are the main actors working on this? Is it startups with utilities and transmission companies as customers? Or is it in-house at the utility and transmission companies? Because it is such a whole infrastructure; we're just trying to work out who's actually working on it. So, the next project I was going to talk about is called the Global Energy Atlas, where we use satellite data and more fancy machine learning to actually map the entire grid, the solar panels, and so on. And that work is actually being adopted by startups and companies outside of the utility. The utility normally does not have the capability to do all of these advanced analytics. But companies like Siemens, who offer products to the utility, might want to incorporate such solutions, and then they compete against startups that build on these ideas. That's what we are seeing. Whether that's effective or not is a separate question. In my view, the biggest challenge is how you get the data that you need to do this. Consider the two examples I have today. For one, we use public satellite data. Let me maybe just quickly show you one photo. Here is what we did: we mapped all the solar panels across the whole of the United States just from satellite data, and that's really public data. And the first project, which we are now transforming into a resiliency estimator, is also based on public data. So if you're building a startup, I think you need to have access; you need to build solutions that leverage public data.
If you're in the utility, you have access to your own data, but then you don't necessarily have the human capital that can do all the analysis. And just a related follow-up, if I may. Presumably these models are system-specific. So if you were to go and apply them in a different country or a different part of the transmission grid, I presume your models would have to be recalibrated or even potentially redone with a different data set. Is that correct? And is that one of the challenges? Yeah, that is a challenge, although creating models today is very easy and fairly quick. In the COVID example, we actually built 20-plus models and fit them to every country, and one of those 20 is going to be the best one for each country; we use that as the counterfactual. For the machine learning here, what we learned is, we have also now run this DeepSolar project in Germany. You do need to retune the algorithms to run in Germany versus the United States, but we managed to map the whole of the U.S. and now several states inside Germany as well. So there is a lot of tuning. I would say that the biggest obstacle, if you're thinking about starting a company on this, is going to be getting the data that you need to answer a question that's useful, and who your customer is going to be. If it's the utility, there are all kinds of reliability questions they're going to ask you, which take time to answer. If we have the time, we'll take one final question from Catherine Burner. Go ahead, Catherine. Thank you. My question is pretty similar, so maybe it can be a short answer. I was just wondering if there are any other ways to monetize these ancillary services you're talking about, other than directly looking for payment from the utility? Are there any other ways? Yeah, in the industry right now, there is this idea that if you gather enough resources, you can participate in the wholesale markets.
So that's independent of utilities in some of the system operator plans; in other plans, it depends on the utility. And there are companies doing that. Here in California, a company that came out of Stanford is OhmConnect; they bid into the market and basically send signals to your air conditioning and so on and so forth. The question there is really, how fast can you scale such a solution? I feel right now the biggest opportunities, if you are thinking about doing some entrepreneurship in this area, are around storage and EV charging and all the analytics that go around that, because those resources are much more predictable. You can have a few large or moderate-sized resources, control them, and really earn some real money. We have a project called PowerNet that explored a lot of this and deployed it. And right now we are actually looking for someone who is interested in understanding commercialization, to investigate certain application areas and so on. So if you're interested in that, send me an email. There's a group of five or six faculty involved in that project. And that's our next step.