Hello everybody, my name is Haye Biddle. I'm a data engineer at Trase at the Stockholm Environment Institute, and I'm presenting Trase today, which stands for Transparency for Sustainable Economies. It's about tropical deforestation and its relationship with global trade. I feel like I'm saluting this again. I'm going to give you an introduction to Trase the project, then we're going to do a little bit of a workshop, and at the end we'll have a discussion. This is what I'm talking about: commodity-driven deforestation. This is a picture of Kalimantan in Borneo, Indonesia. On the left you see Borneo rainforest, and on the right land has been cleared in preparation for a palm oil plantation. Over 95% of deforestation is driven in some way by the planting of commodities. That can be directly planting a commodity; it can also be a land grab. Actually, something I was going to mention is that there's a hypothesis at the moment. We're pretty sure that currently in Brazil there's been a very sharp uptick in deforestation because people are trying to grab land while Bolsonaro is still in power, because the expectation is that if Lula is elected then regulations around deforestation in Brazil will be tightened up. So people are basically just buying land while they still can and then clearing it, so that they've got that status, and then they sell it later. Well, they might sell it, they might not sell it. So land speculation is also a huge part of commodity-driven deforestation, and at the moment it is very much concentrated in the tropical regions. So this is data from the University of Maryland, based on NASA satellite images, and it's showing for each 10 km square what the main driver of tree cover loss is in that square. Tree cover loss simply means that the trees were there and then the trees weren't there. It's not necessarily the same as deforestation, because the trees could be planted again, as in the case of forestry in the north.
But what I'm interested in today is the red cells, which are commodity-driven deforestation. So what we're saying here is that in that 10 km square the main driver of deforestation is commodities. It's not that the whole square has been deforested; it's just the main driver. And you see here in Latin America it's very much soy and beef, and those two sectors are quite interlinked, because people will clear forest, put cows on it, and then a few years later put soy on it and use the soy to feed the cows. So the soy and beef sectors are very much intertwined. And then here in Southeast Asia the main drivers are palm oil, wood pulp and paper. By the way, if you're interested in reading more about how difficult it is to be transparent about what's causing deforestation and what the drivers are, there's a really wonderful paper published a couple of weeks ago with my colleagues, in Science. It is called... I've forgotten what it's called, but I have a link to it at the end, and it shows just how complex the drivers of deforestation are. So deforestation, in tropical regions in particular, is a big driver of global warming. I've seen estimates from 7 to 30% of global CO2 emissions, depending on how you count it. But a direct carbon comparison isn't so useful, because there are also a host of other benefits that forests give the climate: local cooling effects, how they affect the weather, the value of biodiversity and so on. Now, if you look at soy in Brazil, most of the soy that's produced in Brazil is exported to other countries, and there are very, very strong economic pressures from the global economy that make clearing forest much, much cheaper than conserving forest, which is a bit of a crazy scenario. So soy from Brazil is exported a lot to China and a lot to Europe, and demand from those countries creates very strong economic incentives to deforest. So if we're to understand tropical deforestation, it's very, very important to look at global trade.
And of course we have millions of ships, container ships. There's a great site, which is I think global shipping watch. You can see all the container ships live, and it's just crazy. There's just trade bouncing around all over the world. And if we're going to make progress on this issue, we need good data about what's happening. And there have been a few ideas proposed for how we can bring data to the system in order to understand it and understand how to change it. One proposal is called Farm to Fork. That's where you trace a product right back from where it was grown to the person who's buying it. So you can imagine you go into a coffee shop and you buy a bag of coffee, and it says this coffee was grown by Mr so-and-so just outside of wherever. There have also been proposals to bring blockchain to the problem. You can kind of imagine it: you record a transaction every time that coffee changes hands, and then you have a kind of immutable record and you can trace things back. The problem with most of these proposals is either that they cover too small a sector or too small a slice of global trade, since a lot of the trade we're interested in is trade that happens outside of the conscious consumer path, or that they just come too late. And we need data on this now, because it's a very pressing problem. Also, the spatial component within, say, Brazil is obviously very important. A cow that was reared on land at the frontier of deforestation is very different from one that's been reared on land cleared decades ago. So it's very, very important to get a spatial picture within the country. So Trase's proposal is that we already have plenty enough data to be able to map the middle part of the supply chain to a good enough accuracy that we can take action on it. So I'll explain what that means. This is a simplified model of the supply chain for beef.
So we have a cow that's reared on a farm, and that farm is located in a municipality in Brazil. When the cow is big enough, it's sent to a slaughterhouse. The meat is then exported out of Brazil. There's a change of ownership from the exporter to the importer. It's imported into a country. It goes into a supermarket and ends up being eaten by someone who buys it in a shop. So that's one example of a fairly simple supply chain. Now, we have good data on parts of this. Between the two ports, the port of export and the port of import, there's UN trade data, which tells us exactly how much volume is going from which country to which country. We also often have customs records within the country, for taxation reasons. So we know who owns each container, how much it's worth, which port it came from, who's responsible for taxing it and that kind of stuff. So we have pretty good data on the trade part. We also have records of slaughterhouses in Brazil: the Brazilian government publishes who owns each slaughterhouse and where the slaughterhouses are located. We also have estimates per municipality of how many cows in total are slaughtered, and we have estimates of production, of how many cows there are. So we have bits of data on parts of this, and it's enough to be able to map this middle part. We don't really have good enough data to go right down to the farm level, but we can get down to the municipality level. We also don't really have data beyond the port of import, because this is typically a private supply chain, and private companies want to hold on to their data for competitive advantage. So that data is hard to come by. But on this middle part we can do a good job, and our argument is that this is enough to be able to take action on the problem. Now, since we have the municipality, and for reference, a municipality in Brazil, for example, is about 15,000 square kilometres. So that's about a 120 by 120 kilometre box.
Once we know the municipality, we can combine that with satellite data and we can say, okay, what's the deforestation in that municipality? And then we can attribute the deforestation to all of the goods which are exported. And because we're doing this for the whole volume of the supply chain, we can say, okay, we know that from this municipality 10% went to Russia and 10% went to Germany and that kind of stuff. So we can take that deforestation and apportion the risk among all of the exports from the municipality. Another way of thinking about this is: if you have a container ship of beef arriving in a port, we can label that container with which municipality the cows were raised in and how much deforestation there was in that municipality. We can also add things like water scarcity, other environmental indicators, and whether there is evidence of forced labour. So we can attach a bunch of metadata to the container that arrives. So what kind of things can we do with this data? This is just one example; we have many, many more on our site. This is imports of Brazilian soy, filtered down to just those imports that went to China, and this is data from 2018. What you're seeing here is a heat map of deforestation risk, measured in hectares of deforestation that we've apportioned to China for 2018. I'm not showing you the volume here, but this red region up top is called Matopiba. That region is responsible for only 9% of the total volume of soy, but it's responsible for 80% of the deforestation risk. Whereas if you take a region like this south region here, that's responsible for 35% of the volume and less than 1% of the deforestation risk. So this is very typical. We find that our data follows this kind of power law, where it's actually a relatively small part of the supply chain that's responsible for a lot of the deforestation risk.
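The apportioning step described here can be sketched in a few lines of Python. This is only a minimal illustration, assuming risk is split in proportion to export volume; the function name, the destinations and all the numbers are made up for the example and are not Trase's actual method or data.

```python
def apportion_risk(deforestation_ha, export_volumes):
    """Split one municipality's deforestation across export destinations.

    Assumption for this sketch: each destination receives a share of the
    deforestation proportional to its share of the exported volume.
    """
    total_volume = sum(export_volumes.values())
    if total_volume == 0:
        # Nothing was exported, so no risk is attributed to any destination.
        return {destination: 0.0 for destination in export_volumes}
    return {
        destination: deforestation_ha * volume / total_volume
        for destination, volume in export_volumes.items()
    }


# Illustrative numbers only (hectares of deforestation, tonnes exported).
municipality_deforestation_ha = 1000.0
exports_tonnes = {"China": 800.0, "Germany": 100.0, "Russia": 100.0}

risk_ha = apportion_risk(municipality_deforestation_ha, exports_tonnes)
for destination, hectares in sorted(risk_ha.items()):
    print(f"{destination}: {hectares:.1f} ha")
```

Summing these per-municipality shares over every municipality that supplies a given country gives that country's total deforestation risk, which is the kind of quantity the heat map shows.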
And that tells us, okay, we can just concentrate our efforts in one place. And if you look at Matopiba, for the deforestation risk associated with that region, 75% of the risk is associated with just five exporting companies. So immediately the problem's been reduced there. Yes, question? 80%, 80%. Exactly, yes. It's quite striking. And we've reproduced this in many other supply chains around the world. So this is Brazilian soy. We have modelled, I think, 11 different commodities in about seven different countries. So we're covering about 60% of the trade in forest-risk commodities at the moment. The other 40% is, well, diminishing returns for our work, but we're covering the majority of trade. And we see this kind of distribution in lots of other countries and lots of other commodities as well. So here's another example of what we've done with this data. This is a report that was written in May, and it was done in collaboration with GIZ, the German Development Agency. The report assesses tropical deforestation in Germany's agricultural commodity supply chains. It's part of the preparation for a bunch of laws which are being proposed at the moment in the EU, the UK and the US for deforestation-free supply chains. The law would place requirements on companies based in Germany, or based in the EU, to audit their supply chains and check that they are deforestation-free. We're part of the advisory group for those laws, and as part of that we produced this report, which actually showed that Germany's deforestation risk is one of the highest among the five major signatories to this law. It's also one of the most intense, in the sense of deforestation risk per kilogram consumed. So you can read that report; I'll put a link up at the end. Another example is here on the right. This is some work we did with the French government recently.
It's an interactive dashboard, and it's targeted at public and private actors. The idea is that you can go on it, and this is the deforestation risk associated with soy imports to France. You can see these kinds of numbers again here. I think this is 20% of the volume being responsible for 80% of the risk; I'm not quite sure. But you can go on to that and then you can filter down. So if you're a company and you happen to know who your importer or supply chain is, you can filter down to that importer and look at your own risk. So these are just two examples. I included this one because it's an example of a dashboard we've built off our data, whereas this one is just an example of an analysis we did. But what I'm more interested in today is Trase as a data transparency or open data initiative, and this is where you come in. I'm hoping that there will be someone in the audience who either wants to use Trase data, or has an idea of how to use Trase data, or has experience with open data transparency initiatives. I should say that all of our data is free, so you can just go online and access it. I'm hoping to learn a bit from you, and this is where the workshop comes in. So I am a data engineer, and the challenges I think we have in Trase at the moment, in particular for the open data portion of it, are these. Firstly, how do we best give users access to our data? We target an awful lot of people. We're really trying to get our data out there and in use. We're targeting civil society, journalists, directors on the boards of soy trading companies, hackers, techies, researchers. So we have quite a wide range of users, and we're trying to give people the access that makes most sense to them. On the one hand we have a website where you can just point and click and see our data, but on the other hand we also have an API where you can access it and do things with it.
So hopefully there's something for every level of user. And then, how do we shorten the journey between data appreciation and actual use? We have tried to make our website look appealing and show how useful our data is, but we need to make the jump from people saying "oh, that's kind of cool" to people actually using it and doing things with it. And finally, how should we engage with the open source community? All of our data is public and open, but at the moment our methods aren't, and I do wonder whether we should open source our tools and our methods, whether that's useful, and how best to go about doing that. So I haven't really shown you the data itself yet, and that's kind of on purpose, because I was hoping we could discuss those points, but I thought they might make a little bit more sense once you've seen our site and seen our data. So my proposal, and this is what I'm calling the little workshop part, is that we spend about 10 minutes using one of these two tools to answer a question. You can do this on a laptop or you can just do it on your phone. Then afterwards hopefully there will be some more questions or some feedback and we can have a little discussion, or I can expand on any of the points, if that sounds good. So the two tools I have are the Data Explorer and the Data API. The Explorer is this website that we've built recently where you can just click around and you get some prepackaged graphs.
The Data API is where you can actually submit an SQL query, so if you're more of a programmer looking to use SQL, that might be good. Otherwise, if you're just on your phone, you might want to use the first tool. And I have two tasks which you could do if you wanted to. The first is: can you find out which three commodities are most responsible for Germany's imported deforestation risk? And the second is: which biome in Brazil has the biggest soy-driven deforestation risk? Hopefully they'll make a bit more sense when you go on the site. I'll give you ten minutes. Feel free to pair up with your neighbour, and we'll come back in ten minutes. Let me know if you have any problems accessing the site as well. I don't know if anybody is using the Data API. If you go on that site you'll see a bunch of tables, and what you need to do is click on a table, and then there'll be a tab which is "query", and there you can type in your SQL query. I'm not hearing anybody making any complaints, so I hope everything's working. Do you have a question on the whole page? Cross-related. Which one is better? Choose whichever one you want. Choose whichever question you want and then whichever tool you want. I would probably use the Explorer for both, and they're just examples of things you could do. Also feel free to explore the site if you want. Yes. So we have the importing countries, which are at the bottom, and then the top section is the exporting or the producing countries. Did someone say something?
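For readers following along without the site, here is a rough sketch of the kind of SQL query the first task calls for. Everything in it is an assumption made for illustration: the table name `flows`, its column names and the placeholder rows are hypothetical and do not reflect the real Trase schema or data. It runs against an in-memory SQLite database rather than the Data API.

```python
import sqlite3

# Hypothetical table: one row per trade flow, with the deforestation risk
# (in hectares) that has been apportioned to that flow.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flows ("
    "importing_country TEXT, commodity TEXT, deforestation_risk_ha REAL)"
)

# Placeholder rows only; these are not real figures.
rows = [
    ("GERMANY", "commodity_a", 120.0),
    ("GERMANY", "commodity_b", 300.0),
    ("GERMANY", "commodity_a", 80.0),
    ("GERMANY", "commodity_c", 50.0),
    ("GERMANY", "commodity_d", 10.0),
    ("FRANCE", "commodity_d", 900.0),
]
conn.executemany("INSERT INTO flows VALUES (?, ?, ?)", rows)

# Sum the risk per commodity for one importing country and keep the top three.
QUERY = """
SELECT commodity, SUM(deforestation_risk_ha) AS total_risk_ha
FROM flows
WHERE importing_country = ?
GROUP BY commodity
ORDER BY total_risk_ha DESC
LIMIT 3
"""

top_three = conn.execute(QUERY, ("GERMANY",)).fetchall()
for commodity, total_risk_ha in top_three:
    print(commodity, total_risk_ha)
```

The same shape of query, grouped by biome instead of commodity and filtered to Brazilian soy, would address the second task, assuming the real tables expose comparable columns.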
Sorry, I think I've actually forgotten the answer to the first one. You can divide up a country into political boundaries, or you can divide it into ecosystem boundaries. So a biome would be the boundary of a particular ecosystem. For example, a biome might be a wetlands area or a forest: the Amazon forest would be a biome, and the mangrove forests in Ecuador would be a biome. I think on one of the pages, the pages for Brazil, we break down the deforestation risk by various kinds of regions. One of those is political regions, and if you scroll down then one of them is biomes. If I want to know which of these are responsible for Germany's deforestation risk, presumably the thing that has the biggest driver of Germany's deforestation risk, can I assume that's globally? Because I can find which ones are exported and which are producing the most. I found that. It's globally? It's just for Germany. So then to this interesting question of what's the biggest deforestation risk for the whole world: that's not something you can answer on our site at the moment, because you can only filter imports to Germany, or say I'm just going to look at Brazilian soy. So I can't then say what's Brazilian soy globally compared to Paraguayan beef globally; that's not possible. You could answer it through the API, but I think not through the Explorer. Are you actually... I'm going to write it down. I tried a bit off around. I think that's why I selected the regions production, if we go in and release and they can't select anything first. Is it just... Is it Safari as well? Is it also Safari? How do you need Safari?
I was wondering what browser it was. Opera and Chrome. They're both the same. It always did that, or it just started doing it? Thank you very much. You can't answer it; I will note that down. We're about at 10 minutes anyway. I presume it's enough time for everyone to have had a little bit of a play. I can't exactly remember the answer to the first question. I think it was definitely soy, and then I think it was Brazilian beef, Paraguayan beef, and I can't remember the next one. Paraguayan beef. Paraguayan beef. And what's the next one? And there's no, no false one. We'll not see the answer for that. And then the other question, which I do remember, is actually the Cerrado, which is responsible for a lot more deforestation risk than the Amazon. That was a surprise to me, because I had never heard of the Cerrado before a few years ago. It's a very biodiverse savanna ecosystem, and there's a lot more deforestation happening in the Cerrado than there is in the Amazon at the moment. Yup. Well. Exactly, yeah. Exactly, yeah. Exactly, yeah. So for the deforestation risk we do count back, because it can be that there's land speculation, where somebody moves in, they clear the forest, they leave it for a year, then they sell it to somebody, then somebody puts a cow on it, and so the commodity kind of happens a few years later. I think it's five years that we measure, if I remember rightly, so we look at deforestation in the past five years. Exactly, yeah. Yeah, I mean, you can. I think that site is only showing our very latest data, but we do this analysis for every year, usually from 2010, so you can see the trends over time. But yeah, of course, when the forest is gone it's gone, and there's not much you can do about it. Well, you can plant it again, of course. Cool. Great, so I hope that's enough time for everyone. I was hoping to have a small discussion or any feedback from you. I'm interested firstly in whether the tools and the data are easy to understand, or if you have any questions
about it. I'm particularly interested in whether you think you could use Trase data in one of your projects. And then I have these three more open-ended questions from before. If anybody wants to say anything, put your hand up, and feel free to introduce yourself; otherwise we'll finish early. Ah, there's a microphone there.
You said you have a SQL API. Will you also have a REST API? Because that is something that is coming.
Yes, the SQL is SQL over REST, so it's a REST API where you can submit an SQL query. It's designed for web apps, so we're actually using that API in this dashboard and in the site as well, and it's intended for people to build web apps on top of it. We're not actually managing the API ourselves; we're using a company called Splitgraph. It's the kind of place where you upload the data, and it's kind of like Postgres over the web: you can connect to it with a Postgres client, or you can use a web client and send the SQL query over REST. Good question.
Thank you, my name is Karsan Gabro from Tanzania. I'm interested because the region I come from is mostly nomadic pastoralism as well as movement-based agriculture. How can you trace that? First of all, given that it's a very rural area, and most of the subsistence agriculture is what drives the economies, and there is a lack of literacy as well as of the infrastructure for digital resources. Even though open data can actually create a meaning for approval, how do you bridge that gap so that I would be able to use Trase data in a community such as that?
Yeah, so the first question, about movement: that's obviously usually more of an issue with animals. Taking cows, for example, we do have some movement data on cows from Brazil, particularly if they move across borders to slaughter, but to a certain extent it's a bit hard to capture the movement of the individual cows before they're entered into the system, and it's one of the reasons why our model is at the resolution of a municipality,
so of course there's some inter-municipality movement, but mostly it's within the municipality. So that's one reason why we can't go to an even finer resolution than that. Crops are a bit easier because they don't move. I think in our model we assume that crops are grown, that they're moved somewhere for storage, then they're moved somewhere for processing, and then they're moved again to the port. So we model about three movements of the crops, and we hope that that's good enough to mostly capture it in the model. To your question about communities: that's a really great question, and I don't really know. I haven't really thought about how Trase data could help the communities there. I'd be interested to think about that, but I haven't really thought of that before.
Maybe a mechanism for how we could do it?
Yeah, for sure, absolutely. Maybe we can talk after. Questions are always on the other side of the room.
Thanks for your interesting insights. I wanted to suggest some things about how to integrate your project with the open source or open data community. Do you know Open Food Facts? That's an initiative in Europe, I think. They are collecting information about food, processed food mainly, and it's crowd-based: they're collecting crowd-based data, and maybe they have an interest in integrating the deforestation risk of food there. But I'm not quite sure, because you are mainly tracking, for example, beef from Brazil, right, or soy from Brazil, but those are mostly delivered directly here to the supermarket, right?
No, sorry, that may be a little bit confusing. We're just tracking all goods that are imported into Germany, and we don't really model what happens to them after that. One example is that in our report we talk about leather. Leather is a byproduct of beef that's imported into Germany and then used to make leather car seats in Germany's car manufacturing. So it's all imports. Open Food Facts, but
might be a good place to... Open Food Facts.
Open Food Facts, that's quite a nice initiative. Most of the data is from France, I think, and Southern Europe, but they are crowd-based, and my little start-up is dealing with the question of transparency for the supermarket consumer, I would think. So I'm quite interested in getting more data, more qualified data on deforestation risk, so I will really try your API out and take a look at whether we can integrate something there.
I noticed you nodding when I said that it was difficult with the private data.
Totally, because there's no transparency in processed food, right? Some small ingredients printed on the packaging, but you don't know where the soy is coming from or where the beef is coming from. That's the biggest problem from a consumer perspective, I think.
What's the name of the start-up?
Inno Co.
Inno Co.
I'm doing a talk this afternoon.
I'll come along. Thank you.
Hello, Chris Adams from the Green Web Foundation and climateaction.tech, an online community of techies doing this kind of stuff. I'm curious about what people would use if they did not have access to the Trase data when they're trying to campaign for changes to, say, deforestation and things like that. Because the argument we've seen most of the time is about groups who are trying to maybe win a political fight and who don't have access to data that another, well-funded group can have. I don't know enough about this field to know, if there is something like the Global Canopy project, what they would have been using before having access to this data visible here, for example. Basically, is there an expensive source of data that people otherwise have to pay for, that you are making more widely available?
Okay, okay, okay. No, so as far as I'm aware we're the only data source that's doing this kind of global-level tracking of imports to deforestation risk. I'm not aware of any private companies doing anything similar, but there might be, I'm not sure, and we
don't currently have any different tiers of data. We only have one dataset, and it's all there and it's all public, if that was the question.
Yeah, the context I'm asking this question in is that when people have pushed, say, the IEA to publish more data to inform some of the campaigning, or academics, or people who are not in a very, very small number of very, very rich organisations, they basically say that by making this data more widely available you're more able to have a wider set of people feeding into the policy-making process. So my question was basically whether there are proprietary forms of data that people would use, which mean that other groups or other policy-making organisations have been excluded previously. Because it may be that there are groups who are trying to do this but don't know this exists, and who currently either have to pay for data or don't have access to any data that they can base any kind of argument on.
Yeah, for sure. Yeah, we're really trying to get it out there and in use. It's been picked up by the Guardian a few times for some articles. Where did you say you were from? The Green Web Foundation?
The Green Web Foundation.
Okay, yeah. For sure. I'm trying to get that data out there. Yeah. Great. If there are no more questions, I'll say thank you very much for your time. These are the links I promised. So that was the name of the paper I couldn't remember: "Disentangling the numbers behind agriculture-driven tropical deforestation", in Science. It's a really, really great paper for breaking down and understanding the drivers of tropical deforestation. This is a link to the report. Is there a typo in that? "Deforestation-free imports"? Not sure. This is a link to our blog post and a link to the report about Germany.
There's a long URL or a short one. And then this is the link to the French dashboard. It's kind of similar to the Data Explorer that you used, but it's in French. There's a long link and a short link, and we have Twitter and LinkedIn and all sorts as well. Thank you very much for your time.