We're excited to talk to you about big data because we're undergoing a revolution in the way we can use data to imagine the world that we live in and to have conversations about the changes that we see coming in the world. I want to go back to 2009, when Barack Obama was inaugurated, because this fairly famous picture of the inauguration is an excellent example of very early big data. That single image wasn't just an image. It was an interactive medium, something with more than a billion pixels that could be explored. In zooming in to President Obama, you're able to show the places you care about in greater detail and attention, and by exploring it, the person driving the interaction was able to become more in touch with the image, more intimate with the data. Here we see Michelle Obama with President Abraham Lincoln's Bible, the original Bible of Lincoln, which is quite exciting for us to be able to find in the picture.

Our relationship to data is changing because the interactivity we have now is due to new advances in storing information, in being able to use graphics processors on computers to make this possible, and in being very smart about using internet bandwidth. The ability to do this to a static image has recently been supplanted by the ability to do this to any moving image. So, for instance, now I can take the question of the biology of plants and present plant behavior to a biology student in a way that was not possible even a few years ago. For instance, I can take a billion-pixel picture of Brassica rapa every 15 minutes for a month. When I do this, the student can zoom into the plant at 15-minute intervals, seeing the plant growing. It's a time-lapse, but it's a time-lapse with an unbounded amount of resolution detail behind it, so that the learner can watch the plant falling over and trying to get up, and they can see gravity at work and circumnutation at work as gravity and light fight an epic battle for the plant's future.

Fundamentally,
the fact that we can now take any exploration of data through space and time reveals to us the possibility of thinking about data very differently than we did 10 years ago, when we had to put it in a table and look for trends over time.

We can do this in the very large case as well. One very nice example is our own Sun. We think of it as a yellow ball in the sky, but the Solar Dynamics Observatory that is circling the Sun allows us to look at the Sun, again with nearly a billion pixels per frame of resolution, and see it rotating. But we can do more than see it rotating. We can interactively zoom into the Sun and watch the actual processes on its surface. For example, we can look at small coronal ejections on the surface of the Sun, and again, by doing this we take the child who's learning astronomy and give them a new appreciation and understanding of the Sun itself. One of my favorite things to do with this image is to show the origin of a solar flare. If you go to the right spot in this particular day of Sun data, you can see the beginning of the coronal ejection inside this deep rift in the Sun. Take a careful look here and you'll see the plasma starting to eject, and as it ejects you'll see it follow down the rifts and eventually end up as a solar flare. As your children would enjoy knowing, that solar flare is much, much larger than the planet Earth, so it's a good thing we're so far away from that particular body.

Now, the idea that we can show data and interact with it through space and time applies to any quantitative data source, even ones that were not made with a camera like this but were made in simulation. One of the interesting questions that cosmologists have always faced is: what do the gravitational lines of the universe look like, and how did we go from a homogeneous Big Bang
to a very heterogeneous set of gravitational lines? For the first time, in the last few years, we're able to take billion-pixel supercomputing simulations of gravity and show the formation of galaxies, star clusters, and superclusters. The reason this is interesting is that the same cosmologists who could only look at this quantitative data analytically, using tools like MATLAB and Excel, can now visually understand the data. This is important because you're talking about trillions of pixels of data. (By the way, the green circles are black holes forming, and the white dots are solar systems and galaxies.) When you think about the most efficient technique we have for taking billions of pixels of data and providing them to the human body, it is our eye. It's the best possible input.

So the fact that we can take any data set and create from it quantitative information that can be visually intimate to us is a new revolution in our understanding of the Earth itself. When I take all the data sets that we have on Earth and consider the ways in which we can apply the same techniques to understanding Earth processes, well, that's where our conversation with you begins today.
This is a beautiful picture of lights on the Earth at night, but it's a static picture, and what we do now doesn't need to be static any longer. This satellite is showing you all fires across the planet Earth over time. You can zoom in interactively, for instance on Saudi Arabia, and this is freely available on the internet now, and you can see oil extraction flames. You can see Bakken and Marcellus crude oil extraction flames in America, where drilling and mining operations are causing flaring. You can take any amount of Landsat information over the last 30 years and composite it to create interactive demonstrations of mining, of valley mining in Australia. These images over 30 years allow us to see human processes at an Earth scale and allow anybody to see them interactively. That's mountaintop removal in West Virginia. Here you see farmland becoming a massive shale gas field. You can see urban development. Here you see Lake Urmia in Iran disappearing because of damming to create agricultural land during a drought over the course of 30 years. These are the kinds of changes that big data can visually enable for us, and to show you one very important part of this, which is deforestation, I'm very happy to present Matt Hansen.

Thank you a lot. The theme that I'll be talking about is one application of using Earth observation, or satellite imagery, to track a particular dynamic, and that is forest cover change. This is a similar sequence of images to the one Illah just showed, from the Landsat sensor. Different versions of it have been up in orbit since the early 1970s. Here we are in Brazil, with 30 years of record. You see a fine-grained deforestation pattern occurring in the appropriation of rainforest converted to pasture land and row crops. Initially a lot of these clearings are very fine-scale: colonizers from the south of the country coming up to establish a very modest kind of subsistence lifestyle. When we look at this pattern in Rondônia, this is the famous fishbone
pattern of forest cover change by individual landholders. Later on we start to see big clearings in Mato Grosso, which are related to agro-industry: big soybean fields, big industrial cattle ranches. As we zoom out to the continental scale, we see what we call the arc of deforestation along the front of the Amazon rainforest, going from the coast up in Pará all the way around Mato Grosso to Rondônia and Acre state. This is just an incredible record of human change on top of a landscape.

The biggest thing that I want to convey is how we move from this type of data to a thematic output, so I'll scoot over to this picture. When we show the time-lapse sequences, those are raw pictures. They're images, and we need to turn those into quantitative estimates. For example, when you look at the changes across Acre state in Brazil, how much forest was lost? How much forest grew back? We have to have very clean inputs of imagery to turn that into a biophysical estimate of forest cover. This is an example of a cloud-free global image made using big data processes, where we start with a million images, filter through all of them, throw away the clouds, throw away the smoke, and try to examine only the land surface. By tracking the really good pixels of the land surface, we can turn this into a measurement of forest extent and change.

I want to say, very importantly, that big data and its use for societal good is based on really progressive data policies. The Landsat sensor has 40 years of data in the archive, and it's available to anyone on the planet. So I can make my maps, and the European Space Agency can make their maps, with Landsat data. It's very important that providers have this type of mentality, where they're tasking these instruments, storing the data, and then letting the data free. If we do that, we can engage everyone from civil society to private industry to government to come and look at the data and come up with a consensus understanding of
what's happening to the planet. I can't stress that enough, because we move basically from research, playing around with these data and demonstrating different capabilities, to operational records. I would like you to picture having 50- or 100-year records of every patch of ground on the Earth. How often was it planted as soybean? How productive was it? How long has it been a city? When was that turned into an impervious surface? How much does it flood? We have this capability right now, but it does depend on technology, and it does depend on progressive visions of data.

Anyway, we start with a million images. We can take this clean image and a time series of it and turn it into a biophysical product. Here we have, in green, tree cover that didn't change over a 14-year period, and in red, tree cover that was lost. So if it's red, it's largely deforestation. Blue is gain, which means trees were planted or they naturally regrew. And if you see pink, it's both loss and gain. These are typically forestry land uses, where trees are treated as a crop, so over time we see the trees coming and going; we might see the trees disappear and never come back. Furthermore, in the time domain we can disaggregate this to a trend. So we look at the colors now.
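
The two steps described here, discarding cloudy observations and then labeling each pixel as stable, loss, gain, or both, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method behind the maps shown in the talk: the observation format, the 50 percent tree cover threshold, the per-year median, and the function names are all hypothetical.

```python
from statistics import median

# Assumed threshold: percent tree cover at or above which a pixel counts as forest.
FOREST_THRESHOLD = 50.0

# Color legend as described in the talk.
COLORS = {"stable": "green", "loss": "red", "gain": "blue", "both": "pink"}


def clear_sky_composite(observations):
    """Build a yearly series for one pixel from cloud-free observations only.

    observations: iterable of (year, tree_cover_percent, is_cloudy) tuples.
    This tuple layout is a hypothetical format chosen for the sketch.
    Returns {year: median tree cover of that year's clear observations}.
    """
    clear_by_year = {}
    for year, cover, is_cloudy in observations:
        if not is_cloudy:  # throw away clouds (and, by assumption, smoke)
            clear_by_year.setdefault(year, []).append(cover)
    return {year: median(values) for year, values in clear_by_year.items()}


def classify_change(composite):
    """Label one pixel's yearly series as stable, loss, gain, both, or none."""
    years = sorted(composite)
    forest = [composite[year] >= FOREST_THRESHOLD for year in years]
    # Compare each year with the next one to find forest/non-forest transitions.
    lost = any(before and not after for before, after in zip(forest, forest[1:]))
    gained = any(after and not before for before, after in zip(forest, forest[1:]))
    if lost and gained:
        return "both"
    if lost:
        return "loss"
    if gained:
        return "gain"
    if forest and forest[0]:
        return "stable"
    return "none"  # never forest, or no clear observations at all


if __name__ == "__main__":
    pixel = [(2000, 80, False), (2000, 10, True), (2001, 75, False), (2002, 5, False)]
    label = classify_change(clear_sky_composite(pixel))
    print(label, COLORS.get(label))
```

Taking a per-year median of the clear observations is one simple way to suppress leftover noise; a real pipeline would need a proper cloud and smoke mask and a calibrated tree cover estimate per pixel.
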
We're looking at only forest loss in this color bar of yellow to red, with blue highlighting this past year. We can see big fires in particular years in the far north of the boreal forest. We can see different reds and oranges, meaning more recent clearing of the Chaco in Argentina. One of the big findings of this particular data set, and it wasn't a finding so much as a confirmation of what we know, was that the big deforestation country, Brazil, actually slowed the rate of deforestation through a policy initiative that included civil society, industry, and government. From 2006 to the present it went down by 70 to 80 percent. In this color bar you can see the yellow colors dominate. The yellow colors are in the first five to six years of the period, and that's what we see in the arc of deforestation. This is really the only policy intervention in terms of slowing deforestation that we have to date. They get a lot of credit for this, and in fact the proof of their policy success is the satellite record.
Nobody can refute that. The other side of the coin is that all the other countries in the tropics combined drown out Brazil's signal: increases in Chaco loss in Argentina, Paraguay, and Bolivia; Tanzania, Angola miombo forest; insular Southeast Asia, all of the Southeast Asian countries. Deforestation, forest cover loss, is increasing over the same time period to the point that it drowns out the Brazil signal. It's a statistically significant increase in forest loss in the tropics. But again, this record lets us track that, and if we're going to make a policy intervention, we can measure the success or otherwise of that policy intervention.

We'll zoom into Indonesia to take it down to another scale. One of the things that we like about the satellite is that as it orbits the Earth it is calibrated consistently, so we have a globally consistent picture with which we can make apples-to-apples comparisons of what's happening. We can drill down and look at individual countries, even parks, and say this is what's happening at a local scale. That's another really powerful part of this big data story. We look at Indonesia, the beautiful cloud-free composite. This is raw imagery. We turn the raw imagery into the tree cover extent, loss, and gain picture, and you can see the forest land use transition in Indonesia. Going from west to east, you start in Sumatra, then the Kalimantan portion of Borneo, then Sulawesi, all the way over to Papua. Sumatra is the most mature stage of forest appropriation and conversion to higher-order land uses, largely palm estates but also forestry, acacia plantations. They have five-year cycles on some of these acacia plantations. So you look at Sumatra: it's almost done.
There are a few protected areas left. Borneo is in the next stage. By the time you get out to Papua, it's only logging roads. Logging roads lace the landscape, and agro-industry is just starting. So this is a nice little demonstration of humans taking a natural environment and converting it to a higher-order economic purpose. This is the annual forest loss, and when you again track it annually through 2012, after the fall of Suharto there was a decline in forest clearing, which then increased all the way through 2012, to the point that Indonesia cleared more primary forest than Brazil in that year, and they have a quarter of the forest of Brazil. So they're going in opposite trends: Brazil is going down, Indonesia is going up.

Now we're going to look at the island of Borneo. What two countries are on it? Malaysia and Indonesia. Here's the image. Here is the green tree cover, red loss, blue gain, pink both. As we look right here, we see a very clear line. This is a transnational boundary effect. You see economics, policy, and governance very clearly in the satellite image across administrative boundaries, whether they're parks within a country or between two countries. This is the border between Malaysia and Indonesia. On the Malaysian side there is intensive conversion of lowland forest to palm estates and the like, and as you go up into the higher topography of interior Borneo, intensive logging. Then you cross into the territory of Indonesia and it's mostly protected areas. This is very useful information to understand the land use. And then here's the annual change.

So basically the point is, if we have progressive policies, if we have good observational data sets... and I like to describe these as public goods. You know the GPS that we all use? What if you had to pay every time you used one of those signals? There would be very low participation, right?
It's just there, and GPS spun out this huge suite of industries. Earth observation, like weather satellites, should be in the same domain, so that we have regular, publicly available time series, and the value added comes from their use: the characterization of land dynamics and the downstream applications of understanding what it means for carbon emissions, what it means for development and human health. With that, I'll just pass it back to Illah to show an example of urbanization.

Thank you. These same tools, as you can see from the way that Professor Hansen speaks, are powerful when the visualization is coupled with narrative, with storytelling by somebody who is a content expert. That's the critical bit here. How do we create big data products that are highly interactive but mated to strong content knowledge, so that everybody can make sense of the information and become an active participant in civic discourse?

The satellite imagery that lets us see deforestation lets us see many different effects, and here is just a prelude to some of the different kinds of effects that you can see when you do Earth time-lapse. One example here: Shanghai over 30 years lets you see land use and the changes in land use from farmland to urban land areas, and again, this allows you to understand both scale and categorization over time. Lights at night let you see, in red, places that have developed massively greater electricity infrastructure in the last 20 years, and you can see that the whole area we're in has seen a very significant increase in electricity usage and urban infrastructure.

Now I want to present to you two other very different techniques for big data visualization. When we create time-lapses from an Earth view, you're able to appreciate changes on the Earth over time. But we can also use the same graphics tools, the same computer vision tools, to take many, many dimensions of data on human behavior and visualize them over time. One interesting example is human commute
patterns, the way we move from home to work. The example that I present to you here is just for the state of Pennsylvania. These are every commute of every car in the state of Pennsylvania. The visualization allows you to see everywhere people work and everywhere people live in the state, and then, by animating between those two points and giving you the ability to zoom in and out of the image, you can start to see patterns in agglomeration, patterns in the behavior of how people live and work. By color I see income level. Red here represents low-income jobs; green represents high-income jobs. So you can start to see the way people commute in terms of the suburban lifestyle that they have, the amount of carbon that they release in going to work, and the number of locations they work at. The low-income workers in fact work in a highly disparate set of places. Those are the malls, whereas the high-income workers work in the financial district in the city center. These are animation tools that can literally take trillions of pixels of data and push them to your screen at the same time, so that you can start to see patterns and trends that were very difficult to understand non-visually before.

Demographic data writ large, demographic data about all of us, is a very big challenge. There are 50 dimensions in the US census, everything from gender to how much money we each make to the education level we have and what type of work we have. But now we can take the exact same tool that lets us see Landsat images on the Earth, and we can visualize arbitrary demographic data.
This shows all census data for the United States. Red represents high-income jobs in housing blocks; green represents low-income jobs. So right away you can see disparities in wealth across the United States. But zoom in and play with time, because remember, we can go back and forth in time. For the city of Seattle, look where it is green and where it is red, and look, as we slide time, at how the green is overcome by red. That's gentrification. That's why people working in low- and middle-income jobs in Seattle have to drive for an hour to get to work: they can't afford to live near their workplace any longer. Now if I do the same thing for Detroit, Michigan, that area has had a ten-year recession, and when you play the same transitional video over time, there is no change in gentrification at the block level. Because Detroit has had a recession for ten years, there has been no conversion of wealth.

These are the stories that you can create demographically, but the real power comes when you add more dimensions. We can take the wealth picture, green for low-income jobs and red for high-income jobs, and I can add race in the third dimension. So let's add the proportion of people who are of Hispanic country origin. Now when I take the map and manipulate it in three dimensions, the spikes I see, the peaks, are where Hispanics live in the United States. And what color are they? Green. That's bad news. There's a very strong correlation between density of Hispanics in this example and lack of wealth. I can do the same thing for Asians by going in the US census data to all Asian demographics, and what you see is completely different. The spikes that you see developing are on San Francisco, Seattle, and along the eastern coast, and they're red, which shows you a high correlation to wealth. For a depressing picture, choose African-American, and then what you see is massive densities of African-Americans dispersed throughout the east and the southeast, but with no wealth except in Washington, DC.

What
big data manipulation buys us is the ability to change the relationship we have to the data, because we can ask questions we could not have asked before and visualize them. I'm going to take all that data that was on the map and simply put it on x-y axes instead. It's the same tool; I'm simply not using the map anymore. Now I'm showing you male-female ratio along this line and number of jobs along this line, and I've scattered all possible jobs onto that plot for every housing block in the US. If I zoom in, what you'll see right away in green is that it's centered at 50/50. That is, in the places that have the most jobs, half of the people doing the jobs are men and half are women. But this red is still African-American; I didn't change the color code. What this shows you is that it's well left of the middle. Why are there tens of millions more blocks with women working than men in America who are African-American? Because of the incarceration rate. We have so many African-American men in jail, or out of jail and having difficulty getting a job in the US, that we have a massive bias toward female employment in the African-American community but not in the non-African-American community. Those are the trends that we can see now as scientists, as social scientists, and as demographers. These trends are powerful.

But the real power of big data, and this is the last part of my presentation, comes when you add narrative and social sharing. Here we have a community in Pittsburgh that has a coke plant. This is a coke oven that makes coal into coke for steel mill operations. They have one in their backyard, and they have major health problems with asthma and lung disease. So they have their own gigapixel panorama taking pictures every 10 seconds, all the time, 24/7, on their own web page. Mated to that, they have real-time wind speed. They have real-time reports from the federal government of air quality in their neighborhood, as well as from three of their own houses, so you can see real-time PM2.5, and they report
their own smells and asthma attacks. So they've taken big data and created an interactive site that allows governments, municipalities, and the public all to take the same data, as Matt was saying, and use it as ground truth to understand the health consequences of local industrial action. This empowers the local citizenry, because now they have the ability to be at the same table with municipal leaders and with industrial leaders, together trying to solve a problem by starting with the common ground of the same language. And that language is interactive big data.