Welcome, everyone, to today's seminar. Our speaker today is a PhD student in the Department of Civil and Environmental Engineering. He's going to talk about a very interesting project in which they use satellite images to determine the locations of distributed energy resources, in particular solar installations, and grid infrastructure. And just a quick reminder: our next seminar is next week, when Professor Ines Azevedo is going to talk about just energy transitions. After that, Shoban is going to talk about EV charging behavior. That should be very interesting. So our speaker today is Zhecheng Wang. He's a PhD student in Civil and Environmental Engineering, with a minor in computer science. His supervisors are Professor Ram Rajagopal and Professor Arun Majumdar. His research interests are in machine learning, sustainability, and social science. In particular, he aims at developing AI-driven methods to provide solutions for building sustainable urban energy systems. He received his Bachelor's degree in energy and power engineering from Tsinghua University in 2016, and a Master's in Mechanical Engineering from Stanford.

Hey, everyone. Thank you for being here today. I'm Zhecheng Wang from Civil and Environmental Engineering. Today, I'm going to talk about the energy atlas, which is machine learning-based mapping of distributed energy resources, as well as their interactions with infrastructures and communities. An atlas is a collection of maps, and the background picture in this slide shows an example of an atlas. It is called the Theatrum Orbis Terrarum, and it is considered the first modern atlas of the world; it was published in 1570. So we can see how people had already done an excellent job of mapping our Earth even four centuries ago.

So why do we need another atlas today? In particular, why do we need an energy atlas? Because our Earth is undergoing a drastic change, and not in a good direction. The climate is warming, and there is abundant evidence that human activities are a major cause of the climate change over the past century. As we get faster transportation and cheaper energy, we also get higher temperatures. Climate change can harm humans in turn, either directly or indirectly through our infrastructure, because our infrastructure was not originally designed to adapt to such climate change. Climate change will increase the frequency of extreme weather, which will put severe pressure on infrastructure like the power grid. There will be more power outages and more wildfires ignited by power lines.

To tackle climate change, we are trying to decarbonize our energy sector by introducing more renewable resources. Solar and wind have been growing rapidly over the past decades as a share of global electricity generation, and to reach the climate goals, they are projected to grow even faster in the future. However, our world is far from homogeneous. In the wealthy parts of the world, clean energy technologies such as rooftop solar, electric vehicles, and home batteries are rapidly penetrating people's lives. Meanwhile, there are still around a billion people in the world lacking basic access to electricity.
Even in the US, which has rich energy resources and essentially 100% electrification, over 30% of households report some kind of energy insecurity, like difficulty paying energy bills or suffering frequent power disconnections. Take California as an example, a state with a relatively high average income level, strong solar radiation, and good policies for solar installation. Still, in a lot of locations there are significant barriers for communities to install solar, and the barriers come not just from their economic ability but also from their local infrastructure. The hosting capacity of the power grid for solar PV there could be as low as zero, or less than 1.5 kilowatts. That means that even if the local people want to install solar and can afford solar, the local power grid does not have enough capacity to host it. That's why we need to zoom in to a highly granular level to investigate the heterogeneity across places and across communities. That is the first step before we can figure out an inclusive and equitable pathway for every community.

Besides the spatial variation, temporal change also matters a lot. Here, we show how different technologies have been adopted among the population over time, like the telephone, the refrigerator, and the internet. They are usually adopted by a few pioneers at the beginning, then the adoption speed ramps up and later ramps down, and finally adoption saturates. We also want to know what these kinds of curves look like for renewable energy. We are interested in who adopted first, and even more concerned about who is left behind. So temporal change is also very important.

Besides the spatial and temporal variation, the energy transition also poses technical challenges to the power grid, because our power grid was not initially designed to adapt to this transition. A critical rule of the power grid is that supply must equal demand at all times. This works well in the era of conventional energy, but when a large portion of it is replaced by renewable energy, things change. Renewable energy is intermittent and variable, highly dependent on weather; when it is connected to the power grid, it can introduce instability into the entire system. So we also want to gain transparency into renewable energy and its connections to the power grid. However, unlike conventional power plants, renewable energy resources are more decentralized and distributed. Solar PV can be installed on numerous residential building rooftops, or even behind the electricity meters of individual households. It's challenging to know where these systems are, what their capacities are, and how they have been deployed over time.

On the energy distribution and transmission side, there are also challenges. The transmission grid is the trunk of the power system; it carries power from power plants to substations, and we have rather comprehensive information about it. The distribution grids, on the other hand, are the branches of the power system. They carry power from substations to individual consumers. They are more distributed and are managed separately by different utility companies, and even the utility companies themselves may not exactly know information like their locations, their connectivity, and their status, such as whether lines are underground or overhead. The reason we care about whether lines are overhead or underground is that overhead power lines are one of the major causes of destructive wildfires.
They cause a lot of acres to be burned every year. And wildfires can harm power grids in turn, either by reducing line capacities or by completely damaging the power lines. Unfortunately, all this information is kept in a large number of data silos. For example, in the US there are thousands of utility companies, solar companies, regulatory agencies, and other organizations. Each one has its own data, which is not accessible to others. So is there any way to break such isolation?

One way is to let those organizations contribute their own data. Berkeley Lab has a project called Tracking the Sun, which maintains a large-scale solar PV registration database covering 30 states in the US. The data is based on voluntary data contributions by different agencies and utilities, or by PV systems participating in incentive programs. So what's the limitation? If a system did not participate in any program, or the program does not report its data, then the system will not be recorded in the dataset. We can also see that the dataset has very good coverage for regions with many incentive programs, like the West Coast or the Northeast, but for regions with no programs or very few programs, the coverage is not good. Yet we also want to know how solar adoption is happening in those places.

Given such limitations, we are wondering whether there is any way to obtain this highly granular spatial and temporal information in a less labor-intensive, scalable, and generalizable way. Fortunately, there is increasing availability of heterogeneous spatial data and remote sensing imagery, including satellite and aerial images of different resolutions, timestamps, and spectral bands. We also have street view images and a lot of publicly available geographic information for roads and buildings. On the other hand, machine learning has achieved great breakthroughs in recent decades. It can be used to automatically extract the information we are interested in from the raw data. So we hope it can eventually help us construct comprehensive spatial maps and capture temporal changes.

So in this project, we propose to construct the energy atlas, which is a comprehensive and dynamic interface for distributed energy resources, as well as their interactions with infrastructures and people. It consists of geospatial layers from the supply side, to the transmission and distribution side, and finally to the demand and people side. In this project, we mainly focus on the spatio-temporal solar PV mapping on the supply side, the fine-grained distribution grid mapping on the energy distribution side, and the characterization of people on the people side. Here is the outline. Let's first go to the supply side: the spatial mapping of PV, which is called DeepSolar, and the temporal mapping of PV, which is called DeepSolar++.

As we know, for solar PV to work, it needs to be exposed to sunlight, so there is a good chance of capturing it in the top-down view of remote sensing images. A straightforward way is to frame this problem as an image classification problem. That means we can label each image with a binary label: positive, meaning it contains solar panels, or negative, meaning it contains none. By providing such binary labels for a lot of images as supervision, we can train a classification model, a convolutional neural network, for classification.
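As a rough illustration of this classification setup, the following is a minimal sketch of fine-tuning a pretrained CNN with binary labels; the backbone, loss, and hyperparameters are illustrative assumptions rather than the exact DeepSolar configuration.

```python
# Minimal sketch: binary "contains solar / no solar" image classifier.
# Illustrative only: the backbone, optimizer, and data pipeline are assumptions,
# not the exact DeepSolar setup.
import torch
import torch.nn as nn
from torchvision import models

def build_classifier() -> nn.Module:
    # Start from an ImageNet-pretrained CNN and replace the final layer
    # with a single logit for the binary (solar / no solar) label.
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = nn.Linear(backbone.fc.in_features, 1)
    return backbone

def train_step(model, images, labels, optimizer):
    # images: (B, 3, H, W) satellite tiles; labels: (B,) with 1 = contains solar.
    criterion = nn.BCEWithLogitsLoss()
    optimizer.zero_grad()
    logits = model(images).squeeze(1)
    loss = criterion(logits, labels.float())
    loss.backward()
    optimizer.step()
    return loss.item()
```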
And actually, in practice, this can achieve good classification results, with over 90% precision and recall, even though in the real world the number of negative samples is more than 100 times the number of positive samples.

But given that an image has already been identified as positive, how do we further estimate the size of the solar panels in this image? To provide full supervision, we would need to annotate the solar panel size for every panel in a lot of images to train the model, but such annotation is super labor-intensive. Is it possible to do it in a more efficient way? Actually, we can do it in a weakly supervised way. Weak supervision here means that we still only provide a binary class label for training the classifier, but the model gains the size-estimation ability at the same time, during the same training process.

Let me illustrate this with a simple example. Here, we regard most of the layers in the neural network as a black box; we will open it later. For now, let's just look at the intermediate result at the very end of the network. It is a set of matrices called feature maps. Each feature map can be regarded as the result of multiple stages of filtering and transformation, and each feature map focuses on a specific type of visual feature in the image. For classification, we apply an average operator to each feature map to get its average, multiply each average by a weight, and add them up to get a final number. These weights, marked in red, are learned during the training process. For classification, the final prediction is based on this final number: if it is greater than a threshold, the image is identified as positive.

But what if we remove this average operator and directly apply these weights to the feature maps after the model has already been trained? What do we get? Actually, we get a weighted combination of the feature maps, which highlights the areas of the objects we are interested in. This is called a class activation map, meaning that the pixels highlighted in this map are the most indicative of the class we are interested in, the solar panel.

Can we improve it further, for more accurate size estimation? Now, let's open the black box to see what's inside. Actually, the result after each layer in the neural network is also a set of feature maps, and each downstream layer extracts features from the output of its upstream layer to derive its own feature maps. We can also apply the same procedure to each set of feature maps to obtain a class activation map, a CAM, at a different depth of the network. What we find is that from upstream to downstream, the extracted features become more specific, more indicative of the class, but of lower resolution; from downstream to upstream, the features become noisier and less specific, but of higher resolution. So there is a tradeoff: to accurately estimate the size of the solar panels, we want the class activation map to be both class-specific and of high resolution.
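For reference, here is a minimal sketch of the basic class activation map computation just described, with a toy thresholding step for area estimation; the shapes, normalization, and area step are illustrative assumptions, not the exact DeepSolar procedure.

```python
import torch

def class_activation_map(feature_maps: torch.Tensor,
                         class_weights: torch.Tensor) -> torch.Tensor:
    """Compute a class activation map (CAM).

    feature_maps : (C, H, W) feature maps from the last convolutional layer.
    class_weights: (C,) weights of the final linear layer for the positive class.

    For classification the network would take the spatial average of each
    feature map and then apply these weights; here we skip the averaging and
    apply the weights per pixel, yielding an (H, W) map whose high values
    indicate pixels most indicative of the class (solar panel).
    """
    cam = torch.einsum("c,chw->hw", class_weights, feature_maps)
    cam = torch.relu(cam)                 # keep positively contributing regions
    cam = cam / (cam.max() + 1e-8)        # normalize to [0, 1] for thresholding
    return cam

# A rough area estimate could then threshold the CAM and count pixels, scaled
# by the ground area each pixel covers (an illustrative assumption, not the
# paper's exact size-estimation step).
def estimated_panel_area(cam: torch.Tensor, m2_per_pixel: float,
                         threshold: float = 0.5) -> float:
    return float((cam > threshold).sum()) * m2_per_pixel
```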
So we try to break this tradeoff by using a training paradigm called greedy layer-wise training. After the classification model has been trained, we freeze it and add a new convolutional layer at an intermediate point of the network. We choose a rather upstream position because we want to leverage the higher-resolution features extracted there. We train this newly added layer for classification, again with only the binary label as supervision. After it has been trained, we can generate a class activation map using the features there. Compared to the original one, this new class activation map has less noise, but we still don't use it directly. We further add another convolutional layer right after the newly added layer and also train this layer for classification, and now we get another class activation map, which is even better than the previous one. With this greedy layer-wise training, we are forcing each newly added layer to greedily extract features from the previous layer to get a better, cleaner, and more complete representation of the objects. We can see that the class activation maps generated with greedy layer-wise training have better quality than the ones without it: less noise, while keeping the completeness of the objects we are interested in. We can use these maps for estimating the sizes of solar panels in the image, achieving a mean relative error of around 3%.

We finally deployed this model on over one billion image tiles across the contiguous US to construct a nationwide solar PV database containing around 1.47 million solar PV systems. For each system, it contains the geolocation, size, and subtype information. Here we visualize the solar deployment density across the US from the state level all the way down to the system level. We also developed a web-based platform to let public users visualize and analyze this big solar dataset. It provides a visualization interface that enables users to compare two variables at the same locations simultaneously, from the state level down to the system level. It also provides a data analysis interface that enables users to correlate solar deployment with other variables, like median household income, while controlling for a third variable like solar radiation.

So DeepSolar reveals the static patterns of solar adoption. But we also want temporal information, so that we can understand PV adoption trajectories over time, predict the future growth of PV, and conduct causal inference to analyze the intervention effects of different solar PV policies. To do this, for each detected solar panel, we backtrack the historical imagery at the location of that panel and obtain a sequence of historical images. One challenge is image resolution: many historical images have such low resolution that even a human can hardly tell whether there is a solar panel in them. But we know that eventually there will be a solar panel deployed at this location. So we can use the latest image as a reference and compare it with each of the historical images. With this comparison, we can tell whether the same object also exists in each historical image, and with this approach, we can identify the installation year of each solar system.

To implement this comparison-based strategy, we developed a Siamese network, which means two branches sharing an identical architecture. Each branch takes one image as input, the features extracted at different positions are compared between the two images using some modules, and the comparison generates a similarity, which is further used to derive the final prediction.
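A minimal sketch of such a two-branch comparison model is shown below; the shared encoder, the simple concatenation-based comparison, and the decision rule at the end are illustrative assumptions rather than the exact DeepSolar++ architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

class SiameseChangeDetector(nn.Module):
    """Sketch of a two-branch (Siamese) comparison model.

    Both branches share one CNN encoder. The reference image (latest, where a
    panel is known to exist) and one historical image are encoded, the two
    embeddings are compared, and a binary head predicts whether the same panel
    is already present in the historical image. Details are illustrative.
    """
    def __init__(self, embed_dim: int = 512):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        backbone.fc = nn.Identity()          # use pooled features as the embedding
        self.encoder = backbone
        self.head = nn.Sequential(
            nn.Linear(embed_dim * 2, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, reference: torch.Tensor, historical: torch.Tensor) -> torch.Tensor:
        z_ref = self.encoder(reference)      # shared weights for both branches
        z_hist = self.encoder(historical)
        pair = torch.cat([z_ref, z_hist], dim=1)
        return self.head(pair).squeeze(1)    # logit: panel present in historical image

# The installation year can then be taken as the first year in the sequence for
# which the predicted probability of "panel present" stays high afterwards.
```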
With this model, we can actually achieve very good performance in estimating the installation year of solar PV systems: for 86% of systems, the predicted installation year equals the actual installation year. We used this model to construct a spatio-temporal PV installation dataset covering 420 counties across 46 states in the US. This dataset is called DeepSolar++, and its coverage is still growing. Later on, I will show how we use this spatio-temporal dataset to analyze solar adoption over time.

Now let's move on from the supply side to the energy distribution side. In this section, I will introduce distribution grid mapping by combining multi-modal data, which we call DeepGrid. There are two major streams of work on distribution grid mapping and modeling. The first is distribution grid topology estimation based on measurement data like voltages. These works assume the nodes in the distribution grid are already known and that measurements at the different nodes are available; those measurements are obtained from smart meters deployed on the consumer side. The final goal is, given those measurements at different nodes, to estimate the power line connections between the nodes. But this cannot be extended to the open world, because smart meters are not deployed everywhere, and in the open world we cannot have prior knowledge of where those nodes are. The other stream of work is open-world distribution grid mapping: directly mapping the geolocations and connectivity of the distribution grid completely from scratch, relying on imagery data like aerial images or nighttime light maps. The problem is that in many places, the resolution of aerial imagery is not high enough to detect the utility poles or the power lines, and the resolution of nighttime light maps is also low, only around one kilometer per pixel, which makes fine-grained distribution grid mapping impossible.

In comparison, street view images are widely available with rather homogeneous resolution, so a method developed on top of them can be more generalizable. And instead of using the horizontal view of street view images, we use the upward view, because it has a simpler geometric relationship, which greatly facilitates the estimation of power line directions as well as utility pole orientations. So, given those street view images, how can we derive the final distribution grid map containing not just the overhead part, but also the underground part? We know that the underground part of the distribution grid cannot be captured by street view images, so we integrate other modalities of data, like road networks and building maps. Now, let's see how we can derive the final grid map by integrating all these data.
The street view images are processed by two neural networks: one is a utility pole detector, and the other is a power line detector. They are used to estimate the utility pole orientations and the power line directions in each image. Then, by intersecting the orientations estimated at different street view positions, that is, by intersecting the rays along those orientations, we can localize the utility poles on the map. In this way, we construct a map of the geolocations of the utility poles.

Our next step is to predict whether there is a power line connection between different poles. To do this, we integrate the road network with the line directions and the pole locations, and derive features for classification: whether the two poles are on the same road, whether they are next to each other along the road, whether any power line is detected between them, and the power line angles estimated by the power line detector. With these features, we train a machine learning classifier to predict whether there is a power line between two poles. In this way, we construct a map of the overhead grid consisting of both the utility pole locations and the power line connections.

Our final step is to derive the underground part of the grid. Our approach here is heuristic and based on two assumptions. The first is that buildings are more likely to be connected to the nearby grid than to grid segments far away from them. That means we can overlay the estimated overhead grid with the building map and identify the buildings that could be connected to the estimated overhead grid within a certain distance. To do this, we virtually dilate the estimated overhead grid with a certain radius to cover those buildings; the covered buildings could be connected to the estimated overhead grid within that radius. The second assumption is that all buildings are connected to the grid, which means the electrification rate is 100%; this holds in many countries. Under this assumption, the buildings that are not covered by the overhead grid must be connected to the underground grid in some way. So to estimate the underground grid, we just need to find paths connecting those unconnected buildings. We use a graph algorithm, Dijkstra's algorithm, to connect the buildings not covered by the overhead grid, and we use the resulting paths as the estimate of the underground part of the grid. In this way, we obtain an estimate of the entire grid map, including both the underground part and the overhead part.

To evaluate the performance, we use two metrics. Precision is the fraction of the estimated distribution grid for which the actual distribution grid can be found within a radius R; recall is the fraction of the actual grid that can be detected by our estimate within a radius R. For the five test areas in Northern California, with R equal to 30 meters, the precision is 89% to 98%, and the recall is around 80% to 90%.
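As an illustration of this radius-based evaluation, here is a minimal sketch using shapely; representing both grids as line geometries in a projected (meter-based) coordinate system is an assumption made for illustration, not necessarily the paper's exact implementation.

```python
# Sketch of the radius-based precision/recall described above, using shapely.
# Assumes both grids are line geometries whose coordinates are in a projected
# CRS with meter units, so buffers and lengths are in meters.
from shapely.geometry import MultiLineString

def coverage_fraction(grid_a: MultiLineString, grid_b: MultiLineString,
                      radius_m: float) -> float:
    """Fraction of grid_a's length that lies within radius_m of grid_b."""
    covered = grid_a.intersection(grid_b.buffer(radius_m))
    return covered.length / grid_a.length

def precision_recall(estimated: MultiLineString, actual: MultiLineString,
                     radius_m: float = 30.0) -> tuple[float, float]:
    # Precision: how much of the estimated grid is near the actual grid.
    # Recall: how much of the actual grid is near the estimated grid.
    precision = coverage_fraction(estimated, actual, radius_m)
    recall = coverage_fraction(actual, estimated, radius_m)
    return precision, recall
```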
This model can also be extended to other regions of the world, like Sub-Saharan Africa. We know that the electricity infrastructure in many of these places is not in good condition, and we do not have much data about it, so our approach may be helpful there for electricity infrastructure management and planning. We find that without any retraining or fine-tuning, our model maintains very good precision on the test areas in Sub-Saharan Africa, although the recall drops from the 80% level to the 70% level.

Now let's move on from the energy distribution side to the people side. In this section, I'm going to introduce Urban2Vec, which is multi-modal representation learning for communities. We need the characteristics of communities because they are important for decision-making and policy-making, and it is essential for us to get this information to enable a people-centric energy transition. But how do we capture the characteristics of an urban neighborhood, of a community? The common ways include the census and surveys. In the US, the census is conducted every 10 years, and it costs around 15 billion dollars, which is very high. Another approach is the American Community Survey, also conducted by the Census Bureau. For smaller geographic units, the data is usually only available as five-year averages. It is based on sampling, and the cost is also very high, around 250 million dollars per year. So we are wondering whether there is any way to obtain such highly granular information in a more cost-efficient and generalizable way, by directly extracting the information from open data.

Now let's look into communities or neighborhoods. A community or neighborhood can be regarded as a container of the local environment and the local businesses, like restaurants and stores. The local environment can be captured by street view images, and the information on local businesses can be captured from textual data on open platforms, like their ratings, categories, and customer reviews. But how can we represent such a container? As we know, an important pillar of natural language processing is to represent a word as a vector, as in word2vec. That means that just by reading the vectors, the computer can know the semantic meaning of the words themselves. For example, given a document, the model will project words that are semantically similar to positions next to each other in the vector space. In this document, Yale occurs before the word university, and the word Stanford also occurs before the word university, so the model will project Yale and Stanford close to each other in the vector space. In other words, if two words are semantically similar, their corresponding vectors are also close to each other in the vector space. This has been widely applied in many applications like search: given a query, it can retrieve content that is semantically similar to the query.

Can we apply a similar idea here? Given a neighborhood, can we search for its top 10 most similar neighborhoods, like in a search engine? We know that a neighborhood contains street views, and according to the first law of geography, everything is related to everything else, but near things are more related than distant things. So, for example, street view 1 is more likely to be similar to another street view near it than to a street view that is far from it. So, in the vector space, the model will push the vector of street view 1 closer to the vector of a nearby street view, and away from the vectors of street views that are far from it. The model also forces the vector of the neighborhood to be close to the vectors of the street views it contains. In a similar way, since a neighborhood also contains local businesses, we can establish correlations between the neighborhood and the words describing the local businesses inside it. For example, suppose a customer used the word "expensive" to describe a restaurant in neighborhood A, while another word, "dirty", is not associated with any local business in neighborhood A. Then the model will push the vector of neighborhood A closer to the vector of "expensive" in the vector space, and away from the vector of "dirty". In this way, we can derive a vector representation of every neighborhood that incorporates both the street view information and the textual information.
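As a rough illustration of this idea, here is a minimal triplet-loss sketch in which neighborhoods, street-view features, and review words share one embedding space, and each triplet pulls a neighborhood toward things it contains and pushes it away from things it does not; the encoders, dimensions, and margin are illustrative assumptions, not the exact Urban2Vec setup.

```python
import torch
import torch.nn as nn

class NeighborhoodEmbedding(nn.Module):
    """Toy shared embedding space for neighborhoods, street-view features, and words."""
    def __init__(self, n_neighborhoods: int, n_words: int, dim: int = 64,
                 img_feat_dim: int = 512):
        super().__init__()
        self.neighborhood = nn.Embedding(n_neighborhoods, dim)
        self.word = nn.Embedding(n_words, dim)
        # Street-view images are assumed to be pre-encoded by a CNN into
        # img_feat_dim-dimensional features, then projected into the same space.
        self.img_proj = nn.Linear(img_feat_dim, dim)

    def triplet_loss(self, anchor, positive, negative, margin: float = 0.2):
        # Pull the anchor toward the positive, push it away from the negative.
        d_pos = (anchor - positive).pow(2).sum(dim=-1)
        d_neg = (anchor - negative).pow(2).sum(dim=-1)
        return torch.clamp(d_pos - d_neg + margin, min=0).mean()

model = NeighborhoodEmbedding(n_neighborhoods=1000, n_words=20000)

# One illustrative step: neighborhood 0 contains a nearby street view and the
# review word "expensive"; a far-away street view and the word "dirty" serve
# as negatives (indices and features below are placeholders).
nbr = model.neighborhood(torch.tensor([0]))
sv_pos = model.img_proj(torch.randn(1, 512))   # placeholder CNN features
sv_neg = model.img_proj(torch.randn(1, 512))
w_pos = model.word(torch.tensor([17]))         # e.g. "expensive"
w_neg = model.word(torch.tensor([42]))         # e.g. "dirty"
loss = model.triplet_loss(nbr, sv_pos, sv_neg) + model.triplet_loss(nbr, w_pos, w_neg)
loss.backward()
```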
This vector representation can be used in a lot of downstream tasks with just a simple regression model on top, like the prediction of demographics, the prediction of real estate prices, and the prediction of the solar adoption rate, such as the number of systems per household. It can also be used to retrieve the neighborhoods most similar to a query neighborhood. For example, for a neighborhood A in Chicago, the model can retrieve the most and the least similar neighborhoods in another city, like New York, just by leveraging the distances between vectors. We find that neighborhood A and its most similar match show similar demographic features, like average household income and median age, while the least similar neighborhood has quite different demographic features.

Now we have the methodology to construct the different spatial layers, but we also want to overlay one layer on another for real-world applications, to solve technical challenges or to distill socioeconomic insights. For example, by combining the solar PV map, the distribution grid map, and the solar radiation map, we can estimate the solar energy injected into the power grid to facilitate renewable energy integration. By combining the solar PV map and the characteristics of communities, we can analyze solar adoption patterns to inform policy design. And by integrating the solar PV maps with electric vehicle charging demand, we can analyze the integration of EV charging with solar.

Sorry, you only have a top view of the solar PV panel, so how are you estimating how much power it is able to supply? Because you also have to think about the angle at which it takes power, all that sort of stuff. Yes, in another work, we have integrated 3D building data to estimate the rooftop and panel tilt to get a better estimation of the capacity of the solar PV.

Also, by combining the distribution grid map with natural disaster maps like the wildfire risk map, we can estimate the grid's vulnerability to disasters. In this project, we mainly focus on two applications. One is to combine DeepSolar, DeepSolar++, and other spatial layers to understand solar adoption patterns across space and time. The other is to integrate DeepGrid with wildfire risk to estimate the grid's vulnerability to wildfires. Now let's look at the first one.
From the DeepSolar data, we can establish correlations between the solar deployment rate, characterized by the number of solar systems per thousand households, and different demographic variables, like average household income and population density. We can see that the solar deployment rate increases with average household income but then saturates at higher income levels. The solar deployment rate also shows a nonlinear relationship with population density: it first increases, peaks at around 1,000 people per square mile, and then decreases.

Those correlations are static; we also want to know the temporal variation. So we analyze this from the perspective of the technology adoption life cycle. We use a classic adoption model called the Bass model to characterize the adoption trajectories over time. The Bass model is governed by a differential equation, which I will not go into in detail here. Generally speaking, the Bass model can be used to segment each cumulative adoption trajectory into four phases. The first is the pre-adoption phase, which means there is no adoption at all. Then there is a ramp-up phase, from the adoption onset to the peak of the growth rate; then the ramp-down phase, from the peak of the growth rate to saturation; and finally the saturation phase.

By grouping communities into different income levels, like low income and high income, for each level we can plot the fraction of communities in each of the four phases. We find that in 2016, over 60% of high-income communities had already started adoption, meaning they were no longer in the pre-adoption phase; by contrast, this number was only 30% for low-income communities. In other words, the fraction of communities that have not yet started adoption is larger among low-income communities than among high-income ones. If we exclude this pre-adoption part from the plot and focus only on the communities that have already started adoption, we can see that among low-income communities in 2016, over 40% had already entered saturation, while this number was only 30% among high-income communities. We can also run a multivariate regression of the time of adoption onset and the saturation adoption level against different demographic features. Median household income shows a negative correlation with the time of adoption onset, which means that higher-income communities adopt earlier, and it also shows a positive correlation with the saturation adoption level, which means that higher-income communities generally have higher saturation adoption levels. To summarize, compared with high-income communities, low-income communities not only started adoption later, but are also more likely to have reached saturation, and at a lower saturation level.
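For reference, the standard Bass model can be written as dN/dt = (p + q N/M)(M - N), where N is the cumulative number of adopters, M the market size, p the innovation coefficient, and q the imitation coefficient; below is a minimal simulation sketch with purely illustrative parameter values.

```python
# Minimal sketch of the Bass diffusion model used in the adoption analysis above.
# Parameter values are illustrative only.
import numpy as np

def simulate_bass(p: float, q: float, M: float, years: int) -> np.ndarray:
    """Integrate dN/dt = (p + q*N/M) * (M - N) with a simple yearly Euler step."""
    N = np.zeros(years)
    for t in range(1, years):
        growth = (p + q * N[t - 1] / M) * (M - N[t - 1])
        N[t] = N[t - 1] + growth
    return N

# Example: a community of 1,000 households with illustrative p and q.
trajectory = simulate_bass(p=0.01, q=0.4, M=1000, years=30)
growth_rate = np.diff(trajectory)
peak_year = int(np.argmax(growth_rate)) + 1   # boundary between ramp-up and ramp-down
```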
We can also combine the DeepGrid distribution grid map with the wildfire risk map to investigate the vulnerability of the grid to wildfires. As discussed before, overhead power lines can ignite wildfires, and they are also vulnerable to wildfires. So why not bury all those lines underground? Because the cost of underground power lines is very high: it is at the level of a million dollars per mile to bury power lines, an order of magnitude higher than the cost of overhead power lines. So we want to know the status quo of line undergrounding across different places, to see whether power lines are buried at the locations where people need it the most. To do this, we apply the DeepGrid model to the territories of the two largest utility companies in California, PG&E and Southern California Edison. We can then leverage our data to examine different correlations with the undergrounding rate. We can see that the undergrounding rate shows a positive correlation with median household income, conditioned on different wildfire risk levels: even in high wildfire risk regions, lower-income communities have a lower fraction of power lines underground. The undergrounding rate also shows a positive correlation with the wildfire risk; but for low-density regions, the undergrounding rate is generally low and insensitive to the wildfire risk, which suggests that wildfire risk was not fully considered in line-undergrounding decisions.

This is also reflected in the current policy on line undergrounding, Rule 20, made by the California Public Utilities Commission, which defines three different types of projects. For Rule 20A projects, the cost is shared by all utility customers, but the eligibility criteria primarily focus on aesthetic or convenience purposes, like whether the overhead grid affects traffic on the streets. For Rule 20B and 20C projects, the cost is fully or partially paid by the project applicants: homeowners, local governments, and developers. So what do we find? If people want to bury lines because of wildfire risk, they can only go through Rule 20B or 20C, which means low-income communities would need to pay the cost out of their own pockets, and they may not be able to afford those costs by themselves. Yet low-income communities in wildfire-prone areas need the cost sharing the most, while neither income level nor wildfire risk is considered as an eligibility criterion for Rule 20A cost sharing. We hope that in the future these factors will be considered in cost sharing, to make protection against wildfires more equitable across different communities. Our dataset can also be used to localize hotspot zones that should be prioritized for investment or projects: by combining the household income map and the per-household undergrounding cost, we can localize regions with a low income level and a high per-household cost for burying the power lines that actually face high wildfire risk. The local communities there may not be able to afford the undergrounding costs by themselves, so they need priority for investment.

In summary, the energy atlas is a comprehensive map containing geospatial layers on different sides. On the supply side, we saw the spatio-temporal mapping of solar PV with DeepSolar and DeepSolar++. On the energy distribution side, we saw distribution grid mapping by combining multi-modal data. And on the people side, we saw representation learning of urban neighborhoods.
We also saw two applications. By combining DeepSolar, DeepSolar++, and the demographics, we saw how we can understand solar adoption patterns across space, across time, and across communities with different demographics. And by combining DeepGrid with wildfire risk, we discussed the non-uniform grid vulnerability to wildfires.

Looking forward, it is very promising to see the marriage between two different fields: machine learning and sustainability. There is a very good opportunity to leverage machine learning, one of the most exciting tools developed in the past decade, to tackle climate change, one of the most significant challenges facing human beings. Machine learning can be applied to a variety of tasks in climate change and sustainability research, like mapping, prediction, simulation, and causal inference. And solving problems in climate change and sustainability research can in turn spark insights for machine learning research, from physics-informed machine learning to multi-modal learning to machine learning fairness. We also know that training large machine learning models generates a huge amount of carbon emissions, so if we can apply machine learning to solve critical problems in reducing carbon emissions, then machine learning itself can become carbon neutral or even carbon negative. I hope you are as excited as me about exploring this cross-disciplinary integration to create a more sustainable and intelligent world.

Finally, I would like to acknowledge my advisors and the two research groups, Professor Ram Rajagopal's group and Professor Arun Majumdar's group. I also want to thank my collaborators and the funding agencies and institutes that provided the data and supported my research. And that's it. Any questions?

Is the code, the software code, available somewhere so that students can use it? The source code? Yeah. For example, for DeepGrid, it is publicly available on GitHub, and we have an open platform, a website, which also provides the different resources, including the data and the code for this project.

Are you able to estimate where the substations are, that type of thing, within the DeepGrid system? In DeepGrid, we haven't estimated the locations of substations yet, but that is possible by leveraging remote sensing images; I think you just need to learn the visual features of substations. That's cool.

Other questions? Just out of interest: I can understand how this is a lot more scalable in terms of being able to get the information, but it seems like, at the crux of it, this information should already exist somewhere. If I'm putting in a PV system, I have to get a permit to do that installation. So have you been able to find that data and correlate it with what you're finding? How are you doing that, I mean, correlating? Like, you might think that the solar in this area has a capacity of 100 kilowatts, but actually it might have a capacity of 80 kilowatts. How are you confirming what you're estimating, comparing it to an estimate of the mismatch? Yeah, exactly. Do you have any ground truth? Yeah, the ground truth.
Yeah, we compare them with manually labeled data. First of all, at the image level, we estimate the performance, the accuracy of predicting whether there is a solar panel in the image, and compare it with the ground-truth locations. For around 90% of the systems, they can be detected by our model.

So, any more questions from the students? Yeah. Can you map the number of homes per transformer in different areas, and the remaining capacity to electrify more homes? Yeah, we can, because we have the grid maps in different areas, so we can also combine them with the building maps in different locations, and we also have the solar maps, so we can estimate the remaining capacity in different places, to see whether more homes could be electrified, or how much hosting capacity for solar PV there is across different places. So those could be downstream applications of the maps generated by our models. Okay, good, thank you.