data science problems that we have in my research group in atmospheric science. If I step back from what's going on in my research group to what I think are some of the really big problems in atmospheric science, here are two of them. One is what's happening with climate: how do we understand the physics and patterns of climate today, and how do we predict them into the future? The other has to do with exposure to toxic chemicals in the atmosphere. We mostly worry about people, but if you're trying to feed the earth, then the damage to crops from toxic air pollution is significant, billions of dollars. The same is true if you're trying to keep a herd of cows, or a million or a billion of them; there's 20 billion or something on earth. In the Central Valley alone, there are more cows than people.
And really what I want to talk about today is the data science question that we take on in an attempt to answer those: how do we find the right mix of observations of the earth's atmosphere with a model that's capable of synthesizing them, so that we can learn something new and have better predictions? So we're looking at this question of how we take observations and go to some useful prediction about how the world works.
And you're all familiar with one example of data science in the atmosphere, weather prediction. The earth is two billion square kilometers, and to solve the problem of weather you really need to solve the whole earth. But what we care about is the weather here. We don't care about it in Walnut Creek. And we know from just sitting here that across those 10 kilometers from here to Walnut Creek, or even from here to Orinda, the weather is different. So there's this incredible need for really high spatial density, high time resolution information in the context of this large-scale problem.
And the large-scale flows, the 1,000-kilometer stuff that you see here (here's a cloud spanning the whole length of the Pacific Ocean), interact with what's going on at the small scale that we feel. If we're looking at a weather forecast and they say it's going to rain here and it rains in Napa Valley, we feel like the weather forecast failed, right? But the weather guys had an incredible success. They got something on the scale of the whole globe right to a fraction of a percent of the dimension of the globe.
So part of what goes on in atmospheric science, in many ways more than in other fields, is this effort to think about how we communicate to the people around us. That's both because of these interesting weather issues, where you can get the problem almost exactly right and everyone thinks you screwed it up, and because of the frustration of climate scientists with the fact that we understand climate really, really well and people think we're totally confused and don't know anything. So there's a lot of effort, much as there is in evolution, to understand how it is people come to learn something, and there's much more of that in climate science than among my friends in chemistry who work on materials and solar energy.
So in my own work on this problem, we do three things. I'm going to show you an example from remote sensing, where we look at the composition of the atmosphere from space. I'll show you an example where we put a lot of sensors in a small space. And then I'll show some strategies for combining those in some sort of data-model strategy.
So let me just give you a sense of what we're doing. If you look at the Earth from space with the instruments that I work with, there's a satellite called GOME-2, which has a pixel at the surface of the Earth that looks like this red square and goes overhead at 9:30 a.m.
There's another one, OMI, that goes overhead at 1:30. That's the one I like the most, because that's the highest spatial resolution we have. This is about 13 kilometers north-south and 24 kilometers east-west. And here's the one that will be launched five years from now, which I'm most excited about. The first two are in what's called low Earth orbit, so they're going overhead once a day and they cover the entire planet. This one, with the highest resolution, will be in geostationary orbit, so it'll sit above North America and it'll get that every hour that the sun is shining.
I've drawn a single pixel for each of them. In each of those pixels, we're measuring the full spectrum of reflected sunlight. So each pixel is getting the radiance of the Earth, the light reflected back from the Earth after passing through the atmosphere twice, once on the way in from the sun and once on the way back out. And then we're taking the ratio of that to the light at the top, the irradiance that's coming directly from the sun, to learn about what's in the atmosphere. So each pixel, and there are zillions of pixels, has a spectrum, and we're recording that and then analyzing it to measure half a dozen different molecules along the line of sight.
I just want to show you one. This one is the molecule NO2. NO2 is a brown gas; if you look across the bay in the morning and it looks brown as you're looking toward San Francisco, that's what you're looking at. And this is what NO2 looked like between the surface and about 10 kilometers above the Earth 10 years ago. You see every major city, you see a big power plant, another big power plant, and there's a little dot there that's hard to see; that's a little power plant. And that's what it looked like six years later, eight years later. We got rid of a tremendous amount of the NO2 that goes into the atmosphere by putting better catalysts on passenger cars and controlling coal-fired power plants, despite the VW debacle.
There aren't enough of those cars in the United States to make any difference in this, and there's hardly any diesel. On the scale of the globe, there's really different stuff going on. If I had the before and after picture here, this would be way lower 10 years ago. This was higher, Europe was higher. You can see there's a big range of what's going on over the surface of the Earth, and all of it is interesting, both at these very, very large global scales and at the local scale. And that's the space that we're in: how do we learn from this global mapping spectrometer something about what's going on in your neighborhood?
And this is where I think the really interesting stuff about atmospheric science comes in. I'd like to say that all the really interesting questions are ones where the chemical time scales and the dynamical time scales, the weather, are the same. So here's an example where I made a model of NO2 in a simple plume. Imagine this is a city. If the winds are slow going out of that city, the lifetime of the molecule, the amount that's there divided by the amount that's emitted per unit time, would be 11 hours. If the winds are fast, it's 6. And that's because of some strange non-linearities in the chemistry, where when it's in the green color, the chemistry is faster. So the rate at which it goes away is controlled both by the amount of NO2 and the winds, and the intersection of those determines the overall lifetime of some source putting stuff into the atmosphere.
And that plays out. Here are 12 examples of a day from that satellite. Here the winds are slow. Here the winds are fast. You can see there's a lot more here, if you just integrate that with your eye, than over here. And if you put those all together, where we rotate all the winds to be parallel to each other, here's the winds slow, here's the winds fast, and you get a really different overall lifetime for the emissions from that city. We can calculate that at really high resolution. Here's an example of why.
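As a rough illustration of the lifetime idea above, here is a minimal Python sketch. It assumes simple first-order loss, which the talk says the real chemistry does not follow, so this is a simplification and not necessarily how the figures were produced. Under that assumption the plume decays downwind as exp(-x / (u * tau)), and at steady state the amount present divided by the emission rate equals the same tau. The function name, the wind speed, and the synthetic numbers are all made up for the example.

```python
import math


def fit_lifetime_hours(distances_km, line_densities, wind_speed_km_per_h):
    """Estimate an effective lifetime from the decay of a plume downwind.

    Assumes first-order loss, so the amount per unit distance falls off
    as exp(-x / (u * tau)). A least-squares line through ln(density)
    versus distance then has slope -1 / (u * tau).
    """
    n = len(distances_km)
    logs = [math.log(value) for value in line_densities]
    mean_x = sum(distances_km) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_km)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_km, logs))
    slope = sxy / sxx  # units: 1 / km
    e_folding_distance_km = -1.0 / slope
    return e_folding_distance_km / wind_speed_km_per_h


# Synthetic plume with made-up numbers: true lifetime 6 h, wind 20 km/h.
true_lifetime_h = 6.0
wind_km_per_h = 20.0
distances = [20.0 * i for i in range(11)]  # 0 to 200 km downwind
densities = [math.exp(-x / (wind_km_per_h * true_lifetime_h)) for x in distances]

estimated = fit_lifetime_hours(distances, densities, wind_km_per_h)
print(round(estimated, 3))
```

With real data the decay would not be a clean exponential, and the fitted value would be an effective lifetime that mixes chemistry and wind, which is the point being made in the talk.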
Here's the NO2 from the largest power plant in the United States. It's at Four Corners. There are two boilers in this model, and we can see the difference between those two. We can't see that from space. And then we can see that this OH is what controls the lifetime. In the middle, where the concentration is high, the lifetime is really long. And then you get to the edge, which is that green color I showed you in those other images, and the OH is really high and the NO2 crashes. So that's one example of how we can generate a lot of data, and then we need to think about it in the intersection with models in different ways.
Here's another one. We have a network of sensors, 20-odd sensors. They're spaced roughly two kilometers apart. Two-thirds of them are on the roofs of middle schools and high schools. The idea was to tie a science experiment to a classroom education experiment. So my grad students go out to the classrooms in these schools. We intended to go talk about climate, but actually what the teachers most want us to talk about is: what does it mean to be a scientist? What path did you take? How did you get there? It's great when ordinary-looking people go in. None of us really looks like a movie star. And so that's what we mostly do. Sometimes we bring those people to Berkeley and take them through labs and show them what it's like to be a grad student, and try to get people who've never thought about being a grad student to think about that.
But the science is really to make a measurement at every one of these spots and think about what we can learn about the atmosphere from a much, much higher resolution, more comprehensive network than has ever been laid out before. So here's a picture of San Francisco at night on the sort of spatial scale I was just showing you. We don't have many sensors in San Francisco. We have two at the Exploratorium, and here's one of those on the Exploratorium. The sensors are all in a box like that.
What's in them is a CO2 sensor. That's what drives the cost. And then there's a particle sensor and sensors for air pollution gases. So each of these boxes has those things, and they're reporting back to us over the phone network with data at one-minute time resolution: 20 of them, every minute, all day, all the time.
So here's a year, for example, from half a dozen of those CO2 sensors. You see all the kinds of things that we know about the atmosphere, and then we have an opportunity to learn. If you look carefully and follow along the bottom of this trace, you see that it goes lower in the summer. It goes lower in the summer because green plants take CO2 out of the atmosphere and make leaves, so there's less CO2 in the global atmosphere in the summer than in the winter. You'll also see that the excursions are higher in the winter than they are in the summer. The excursions are all at night, when the atmosphere isn't mixing. Even though the emissions are down by a factor of 10 at night, the mixing in the atmosphere is down by a factor of 100. So you get higher concentrations at night, even though your emissions are lower. It's totally counterintuitive, but that's how it works. And you see that in the winter it's even more stable. You're familiar with that. If you walk around in the winter and someone's got their fireplace going, you smell it. It's right there. You can see the plume wafting. It's hardly mixing. That's what we're seeing here in this network.
So I'll just give you a couple of examples of how we think about this. Here's an example of a sensor that's particularly sensitive to the Bay Bridge. You can see in black what it looks like on the average weekday, and in gray on the average weekend. And you can see that we drive about the same amount on the weekdays and the weekends. We just drive at a different time of day. You see we don't have an evening rush hour on the weekend.
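The weekday-versus-weekend comparison described above comes down to averaging a sensor's readings by hour of day, separately for the two kinds of day. A minimal Python sketch of that bookkeeping follows. The function name, the record layout (timestamp, CO2 in ppm), and the sample values are assumptions made for illustration, not the network's actual data format.

```python
from collections import defaultdict
from datetime import datetime


def mean_diurnal_profiles(records):
    """Average CO2 by hour of day, split into weekdays and weekends.

    records: iterable of (datetime, co2_ppm) pairs from one sensor.
    Returns {"weekday": {hour: mean_ppm}, "weekend": {hour: mean_ppm}}.
    """
    sums = {"weekday": defaultdict(float), "weekend": defaultdict(float)}
    counts = {"weekday": defaultdict(int), "weekend": defaultdict(int)}
    for timestamp, co2_ppm in records:
        # datetime.weekday() returns 5 for Saturday and 6 for Sunday.
        kind = "weekend" if timestamp.weekday() >= 5 else "weekday"
        sums[kind][timestamp.hour] += co2_ppm
        counts[kind][timestamp.hour] += 1
    return {
        kind: {hour: sums[kind][hour] / counts[kind][hour] for hour in sorted(sums[kind])}
        for kind in sums
    }


# Made-up readings, only to show the call.
records = [
    (datetime(2024, 1, 1, 8, 0), 420.0),   # Monday
    (datetime(2024, 1, 1, 8, 30), 440.0),  # Monday
    (datetime(2024, 1, 6, 8, 0), 410.0),   # Saturday
]
profiles = mean_diurnal_profiles(records)
print(profiles["weekday"][8], profiles["weekend"][8])
```

Plotting the two resulting hourly curves on one axis gives the kind of black-and-gray comparison described for the Bay Bridge sensor.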
I'm always shocked by that, because I've been stuck at that bridge for hours trying to get into the city on the weekend. But on average, at least, that must not be true. I must always be late to whatever I'm going to, so I feel it worse.
The other thing is that we closed this bridge to make room for a new one. And if you compare the Sunday when we closed it to the average Sunday, there's a lot less CO2. I had a grad student here from Harvard visiting for three months, and he was building a model I'll show you next. He had no idea the bridge was closed, and he discovered it in the data. I'll show you how he discovered it; it was cool.
The way he discovered it was he took a weather model and he built a full inverse to calculate the emissions, starting from a model of the emissions he made at one-kilometer resolution for the whole Bay Area. So he took that model and he built the inverse. And this was his result, summarizing a tremendous amount of work that he did. The result was that on the day the bridge was closed, there were no emissions there. Blue is a lot less than the initial guess in the inverse.
I think that's cool, but that's something we could get from looking at one sensor. Where the network really comes in is that not only were we not over there as a community, but we were over here and over there and over there. And that's what's really exciting about this: this network of data is telling us not the obvious, that we weren't on the bridge when it was closed. There's 100 ways we could have figured that out. But there's no easy way to figure out where all those people went, at least in the sense of their emissions.
So I just want to give you this brief introduction to a couple of data science applications that we use to think about the chemistry of the atmosphere and its connection to health and climate. And it's really that we have these chemical observations from space that are really a new opportunity to think about things.
And in the examples I showed you, there's another thing that I didn't mention as I flew by: if you looked at the wind vectors on all those plumes from Riyadh, not all of them were parallel to the chemistry. I think that's telling us that the chemistry can fix the weather models in ways that we really don't know how to do yet. That's an interesting avenue that we're pursuing as one possibility. So that's point number two.
And then we have this idea of dense networks. There are lots and lots of people thinking about how to produce orders of magnitude more observations about the Earth than we have now, and then synthesize them in some interesting way. And it's this question of how we understand what each of those instruments is, and understand the calibration, so we can actually talk about all the data in the same conversation: instead of a thousand individual instruments, it's actually one collective instrument. Learning what that means from a measurement point of view and a data science point of view, and interpreting it, is really the challenge before us. A thousand individual instruments isn't going to help us, but a thousand that are all on the same page somehow is really going to change things.
So like most faculty, I don't do anything. These are the folks who did this work. Alex Turner did that model. Alexis Schusterman is putting this network together, and Josh is working on the satellite remote sensing. So it's a team of us thinking about this from various perspectives. And thank you for coming and spending your...