Marceline from Kenya and Jacob from Ghana. After that we're going to take a short break from area estimation and dive into drivers: Anne from the FRA team will give us a sneak peek at the publication that was launched just today on the drivers of deforestation related to different scales of agriculture. Then we'll move into the third part of the side event, the open discussion, and I would like to keep most of the session for that, trying to answer those questions or others that arise. So please keep your questions for the last part; we will try to address them at that moment. Without further ado, if everybody is fine with it, I will let Naikoa give us an introduction.

Yes, this is good, thank you. Okay, for those of you who don't know me, I am Naikoa Aguilar-Amuchastegui. I work with the BioCarbon Fund at the World Bank, coordinating the support that we provide on MRV capacities to the Latin American and Caribbean countries as well as the Francophone and Lusophone African countries. What we're going to be talking about today is basically a clear example of why I say that MRV is a learning process. This is a journey that started ten years ago, when we all went into the remote sensing work towards tracking deforestation and forest degradation, and we learned from failure and from the difficulties we were having in understanding what it was that we were seeing in our data sets. We became very good at detecting change, but although we were seeing things from the heavens, we realized that we don't have a god's-eye knowledge of things, and context is a very relevant matter. One of the other things we realized is that by looking at maps and producing statistics from them, we were using biased estimators. So here came the idea of combining the maps, the geospatial data, as we're going to see in the presentations today, with sampling approaches that would guarantee we don't have that bias, which is part of what the IPCC guidelines ask us to do. I don't see Sandro here, but he has been hammering the world with that: bias is one of the most important things we need to take care of. Of course, we started doing stratified random sampling based on strata defined by change maps, and now we're into what I would call another evolution of that process, because sampling intensities need to be high for us to capture what is in many cases a very rare phenomenon in the landscape, deforestation, which means that sampling efforts become huge, which is not always practical. So we need a means of prioritizing where to pay attention, and of developing response designs that are really thorough and capitalize on as much evidence as is available for each of the locations we're going to look at, so we can try to infer what happened. As we all know, that is not always straightforward; there are still challenges in some of those locations and we end up scratching our heads, but that is part of the journey we are on. With that, I pass it along to Remy. Thanks.

Thank you very much, Naikoa. So yes, based on the idea that we do need specific sampling designs to address everything we're trying to measure, I would like to invite Andreas Vollrath to join us and present his slides. I hope you can see that — can you see it on Zoom? Just give me a second, sorry. Thanks, Remy. This should disappear, right?
Thanks, Remy, for the introduction. My name is Andreas Vollrath, I'm a remote sensing expert working here at the FAO Forestry Division in the SEPAL team. I'm giving this talk on behalf of the National Forest Monitoring team, presenting an approach that was collectively developed over the last year, called ensemble sample-based area estimation, which tries to address exactly what Naikoa just presented. As the NFM team at FAO, one of our main mandates is to support countries in developing and advancing their national forest monitoring systems. There are many different reasons to monitor your forest, but one of the drivers is carbon finance, and for carbon finance we need reliable estimates of forest and forest change — keyword high-integrity data — which also need to be consistent over time. As Naikoa said, we are moving to sample-based estimates to get unbiased estimates of forest area and forest change, but the problem is that, with the small areas we are dealing with, we end up in a lot of cases with high uncertainties. The solution is basically to intensify your sampling, but this can end up in numbers that become impractical — sorry, no, I don't see the presenter. Okay, that's better. So yes, we end up with a number of points so high that it becomes impractical to look at each point. What I'm going to present today is basically how we are thinking this issue can be addressed. Next slide.

If you start from scratch, you basically start out with a very dense sampling grid, say one or two kilometres, and the advantage of having such a grid is that you can run multiple algorithms on it. So the first step is that we extract time series from satellite data. We run several different types of change algorithms, such as BFAST, CCDC and so on. We create some statistics on the time series data. We also add global data, for example the Global Forest Change (GFC) data set from Hansen, so we get tree cover, tree height, and land cover and land use information. So instead of classifying satellite imagery, we basically classify the output of all of these different algorithms, assuming that some of them are right and some of them are wrong, but that as an ensemble we get better results overall. For this we use training data coming either from the field or from visual interpretation. Now, when we do this classification we don't classify many different classes; we just have a change class and a no-change class, and in this way we can get out a probability of potential change. What we finally end up with is what you see on the right: a kind of distribution over the land of potential change. And you can see that a lot of the area — you will see this later on the map as well — has a very, very low potential of being changed, because it is either outside forest or in the core forest. Now, what can we do with this stratification? Next slide please.
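A minimal sketch of how such an ensemble change-probability layer could be assembled follows; the feature names, the placeholder labels and the random-forest choice are illustrative assumptions, not the exact SEPAL implementation.

    # Sketch: turn per-point outputs of several change algorithms and global
    # products into a single change probability, trained on change / no-change
    # reference labels. Feature names and model choice are illustrative only.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    # One row per grid point; columns hold what each algorithm / product reports.
    points = pd.DataFrame({
        "bfast_magnitude": np.random.normal(0, 1, 1000),    # BFAST break magnitude
        "ccdc_break":      np.random.randint(0, 2, 1000),   # CCDC detected a break (0/1)
        "gfc_loss":        np.random.randint(0, 2, 1000),   # Hansen GFC loss flag
        "tree_cover_2000": np.random.uniform(0, 100, 1000), # percent tree cover
        "ndvi_trend":      np.random.normal(0, 0.05, 1000), # simple time-series statistic
    })

    # Placeholder labels purely so the sketch runs; in practice these come from
    # field data or visual interpretation of a training subset.
    train_idx = points.sample(200, random_state=0).index
    labels = points.loc[train_idx, "gfc_loss"] | points.loc[train_idx, "ccdc_break"]

    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(points.loc[train_idx], labels)

    # Probability of the "change" class for every point: the layer used for stratification.
    points["p_change"] = rf.predict_proba(points)[:, 1]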
The easiest way to approach this is: we lay out this systematic grid, we basically trust the algorithms on the lower end of the change probability, and we expect higher variation on the higher end, which is where we're going to look. So we just look at that subset of points, but treat everything in the same way. This could, for example, be the case in high forest cover, low deforestation (HFLD) countries, where you have huge areas of forest and you know that in the core of the forest you probably won't find any change. If many algorithms detect nothing, and you look into the time series and apply certain thresholds, you can be quite sure that nothing is happening. Next slide please.

For countries that may already have established a stratified random sample and have existing maps, but still end up with high uncertainties, they would need to intensify to lower those uncertainties, which usually happens in the stable class. You could think of running the same kind of analysis on their existing points, or on an intensified grid, and using the same logic you would not have to look at all of the points from the intensified grid, just at the ones with the highest likelihood of change. Next slide.

The second usage scenario is to use this whole approach as a stratification, and this is what we piloted in Kenya — Marceline will tell us more about the results. In terms of methodology, on the left-hand side you see these probabilities on the map, and you see that a large part of the country, basically one third, has a very low probability of change, simply because it is rangeland where you don't see any forest. The higher probabilities of change you find, of course, in areas where there is forest, because forest change happens in forest. The point is that when we stratify this, we use Neyman allocation: we still place a few samples in the low-probability stratum, but basically just to check that the quality of the map is acceptable, and we rather look at the high-potential-change strata to actually get our estimates of change. Next slide please.

If you look at the strata displayed on this map, you can see that when you create maps for stratified area estimation you usually get very narrow classes of change that are actually change. In our case the stratum is wider, and we don't call it a change stratum but a potential-change stratum. The idea is that within this stratum we capture all of the change, so that we avoid omission errors, and this is actually how it played out in Kenya: we could not find any omitted deforestation in the lower strata. Now, you still have a lot of variability, so maybe in your first round of interpretation you still end up with uncertainties that are not acceptable. But if you then want to bring those uncertainties down, you only have to intensify in the high-likelihood stratum and do not have to look at the other strata anymore. Next slide please.
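To make the allocation step concrete, here is a small sketch of Neyman allocation across change-probability strata; the stratum sizes and standard deviations below are made-up numbers for illustration only.

    # Sketch of Neyman allocation: distribute a fixed interpretation budget across
    # strata in proportion to stratum size times expected standard deviation.
    # Stratum sizes and standard deviations below are hypothetical.

    def neyman_allocation(total_n, stratum_sizes, stratum_sds):
        """Return the number of samples to interpret in each stratum."""
        weights = [N_h * s_h for N_h, s_h in zip(stratum_sizes, stratum_sds)]
        total = sum(weights)
        return [round(total_n * w / total) for w in weights]

    # Example: low / medium / high change-probability strata (sizes in grid points).
    stratum_sizes = [120_000, 25_000, 5_000]
    # Expected standard deviation of the change indicator per stratum:
    # near zero in the stable stratum, highest where change is most uncertain.
    stratum_sds = [0.05, 0.30, 0.50]

    print(neyman_allocation(4_000, stratum_sizes, stratum_sds))
    # -> [1500, 1875, 625]: most effort goes to the strata where change is plausible.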
The third scenario is again for when you already have data and want to quality-control it. You might not want to go through all of the points again, so you need some kind of prioritization strategy. Again, you can use the same approach: run all these time series algorithms, train a classification with the data you already interpreted, and see where the algorithms agree with the interpreter and where they don't. Where they don't agree with the interpreter, you look at those points again, and in this way — I think Ghana did this for their quality control — you again reduce the number of points you have to look at. Next slide please.
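A minimal sketch of this third scenario — prioritising quality control by flagging only the points where the interpreter and the ensemble disagree; the column names and the probability threshold are assumptions for illustration.

    # Sketch: flag the points where visual interpretation and the ensemble
    # change probability disagree, so QA/QC effort concentrates on them.
    import pandas as pd

    points = pd.DataFrame({
        "plot_id":     [1, 2, 3, 4, 5],
        "interpreted": ["change", "stable", "stable", "change", "stable"],
        "p_change":    [0.10, 0.85, 0.05, 0.90, 0.70],  # from the ensemble classifier
    })

    THRESHOLD = 0.5  # hypothetical cut-off between "algorithm says change" and "stable"
    algo_change = points["p_change"] >= THRESHOLD
    interp_change = points["interpreted"] == "change"

    # Reinterpret only where the two sources disagree (in either direction).
    to_review = points[algo_change != interp_change]
    print(to_review[["plot_id", "interpreted", "p_change"]])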
For now this is still work in progress, but we have notebooks on GitHub that can be pulled into SEPAL and that streamline these workflows: basically completely automated extraction of all the time series data and running of the change algorithms, based on a combination of Google Earth Engine and geospatial Python libraries. Just to give you an idea of how long it takes: for 300,000 points you run it for about two days to extract the time series and run the change algorithms, and for us at SEPAL, in terms of the processing budget we pay, the cost is about ten dollars. Next slide. So, to wrap up, we are addressing the question of how to avoid bias and get to low uncertainties without having to look at all of those points. Next slide. For this we propose to use a combination of algorithms — a kind of convergence of evidence from all the information we can get, between global data, time series data and algorithms — to avoid huge interpretation efforts in areas where we know there is not a lot of forest or not a lot of forest dynamics. We are basically here to gather the feedback from you, the experts, so I'm looking forward to the discussion. Thank you.

Thank you very much, Andreas. So let's keep in mind how we save everybody's energy so that we don't have to process hundreds of thousands of samples, how we avoid biases, and how we make robust and precise estimates. I would like to call now on Marceline to give us the second presentation. I will try to do something about the resolution; give me a second, Marceline, and in the meantime, if you want to introduce yourself.

Thank you very much, Remy. Good afternoon. My name is Marceline Ojola, I come from Kenya and I work with the Directorate of Resource Surveys and Remote Sensing. This is an institution that works very closely with the Kenya Forest Service in monitoring forests and generating the activity data for REDD+. I'm here with my two Kenyan colleagues, Faith and Peter from the Kenya Forest Service, and I'm going to take you through how we applied the ensemble model — how we used the sample-based approach — in updating our FRL in Kenya. Sorry, it's coming; we all know that you can't wrestle with technology. If we can see the middle of the screen, that's good enough. Thank you, I'll proceed from there.

In my presentation I'll start by giving a brief introduction of what we did, then I'll go into how we applied the sampling approach in our work, then I'll talk about the tools we used for the online interpretation, and then I'll have a slide on the results. Kenya submitted its FRL in the year 2019, through a project supported by JICA. Our reference period for the FRL was 2002 to 2018, it was reported at national scale, and we focused on four REDD+ activities. We had emission factors from a pilot national forest inventory, because in our country we have not yet undertaken a full NFI; we had a total of 121 plots spread across the different forest strata in the country, so the sample was representative of all the forest strata and therefore the calculation of the emission factors was also representative. For the activity data, we had a project in Kenya supported by the Australian government called the System for Land-based Emissions Estimation in Kenya (SLEEK), and through this project we were able to generate time series data sets for the country from 1990 to 2014; then, through support from UNDP, we were able to add two more years, up to 2018. So we had a wide range of time series data sets from which we could get different data points for the FRL. For the activity data in the FRL we used the map subtraction method, and therefore, when the UNFCCC technical assessment team reviewed our document, we received a number of comments. One of them that is relevant to this session was that we needed to improve the methods of our activity data estimation, and that led us to the IMPRESS project, supported by UK PACT. For capacity building, FAO came in to build the capacity of the technical team working on this, and we had two sets of training, both online and physical. The online training focused mainly on the use of SEPAL; the key focus was time series analysis, looking at the different algorithms Andreas just talked about, how they are applied and what each of them is looking at. The total number of people trained in this session was 32 participants, 47% male and 53% female, drawn from the different stakeholders working on activities relevant to this process. The physical training was on Collect Earth Online; it took place in Nairobi, with the FAO partners coming physically to train us on how to use Collect Earth Online to interpret the sample points, and this training narrowed down to the specific technical team working on the whole process of activity data collection. So what formed the reference period for this update of the FRL? There was an MRV gap assessment, and this informed the reference period to be 2013 to 2017 and the crediting period to be 2018 to 2022; the activity data were to be collected through sample-based area estimation. Under the response design, we maintained the land cover classes reported in the former FRL, but there was a recommendation by the technical assessment team that we needed to separate one of our forest strata — the coastal and mangrove forests that we had combined — into coastal alone and mangrove alone, because there was a feeling that the emission
factors in these two forest types were different. We also separated another forest stratum that we had combined: the montane and the western rain forest. In our interpretation in Collect Earth Online we had a decision tree guiding every point being interpreted, all the way up to the particular REDD+ activity. For instance, for a point in 2021 you ask: is it forest, yes or no? If it is forest, you go all the way up to where it tells you whether it is sustainable forest management, degradation, stable forest or deforestation, and the same for the non-forest category. Under the IMPRESS data collection for the activity data, just as Andreas explained, we had a two-kilometre grid for change validation, and this grid gave us a total of about 150,000 points — to be specific, 149,460 sample points. After the algorithms had identified the change, we had a total of 7,313 points of potential change that were then to be interpreted visually by the technical team. I know you've seen this diagram several times, but it shows how the sample-based approach worked in practice in our country. We started with the 150,000 sample points, which through time series analysis and the various algorithms we narrowed down to the 7,313 samples; the algorithms were looking at the probability of change for each of the sample points. For the points that were relatively stable there was no interpretation done, but the points with possible change were interpreted, and those points were stratified into different strata. As you can see on the map, we have the first stratum, which has very few points, then the next stratum in light blue, and then the third stratum, which covers the high-potential areas of our country where various activities are going on. The visual interpretation was done on the 7,000-plus points, and the whole process is iterative: when we reach the visual interpretation stage and a point seems to need another round of analysis, it can go back to the probability analysis stage and then return to visual interpretation. For the interpretation we had a sample design developed for our country that took our forest definition into account: we had a 70 by 70 metre square plot to cater for our 0.5-hectare forest definition, and inside this plot we had 49 points interpreted, which catered for the minimum canopy cover in our forest definition, which is 15%. During the interpretation we also had a set of additional datasets helping us to interpret the points. For example, in the Collect Earth Online grid we had Planet data — monthly composites from 2016 to 2021 at 5-metre resolution; we were also able to use Sentinel data, a monthly composite from 2015 to 2022 at 10-metre resolution; and we could access Google Earth Engine, which also has time series datasets available, as well as Google Earth Pro, which has very high resolution imagery for the interpretation.
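A sketch of how such a 49-point plot design translates the forest definition into an interpretation rule; the function and the way the threshold is applied are an illustrative reading of the design described above, not Kenya's official decision tree.

    # Sketch: a 70 m x 70 m plot (~0.49 ha, matching the 0.5 ha forest definition)
    # holds 49 interpretation points; if at least 15% of them fall on tree canopy,
    # the plot meets the canopy-cover part of the forest definition.
    # Illustrative reading only, not the official protocol.

    def plot_is_forest(tree_hits: int, total_points: int = 49, min_cover: float = 0.15) -> bool:
        """Return True if the interpreted canopy cover meets the forest threshold."""
        return tree_hits / total_points >= min_cover

    print(plot_is_forest(7))   # 7/49  ~ 14.3% -> False
    print(plot_is_forest(8))   # 8/49  ~ 16.3% -> True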
We also had another special support, GeoDash, where we had different graphs showing NDVI and NDFI trends as well as Landsat composites, and later on SAR data was also added to aid the interpretation. Alongside this process we also had quality assurance and quality control going on, with a number of issues as listed there. I also wanted to show you part of the results: the analysis is still being finalized, but this is part of what we already have after using the sample-based approach to update the activity data for our FRL. With that, I say thank you.

Thank you very much, and thank you for sticking to the time. So we saw one example, Kenya, using the ensemble approach that Andreas introduced. Now we're going to see a slightly different approach, and I'm going to invite Jacob from Ghana to tell us about their experience using sample-based approaches for forest monitoring and, actually, payments. Can we see that properly online?

Good afternoon, my name is Jacob, I work with the Forestry Commission of Ghana in the Climate Change Department, and I'm responsible for REDD+ MRV. I'll give you some background on what we've done so far, from the FCPF process to now entering ART TREES. Our landscape is very, very dynamic; those who have been to Africa, or West Africa and particularly Ghana, know we have a very mosaic landscape — cropland, forest land, everything mixed up — and it is difficult to tease out the particular land use at the mapping unit you are working with. For this we tried to use change maps to stratify the points we had to collect, but there was very low agreement between the different algorithms — we tried GFC, BFAST and LandTrendr — and, as you can see, the accuracies were very low, so we could not base our stratification on these change maps. So we compared a systematic sample with stratified area estimates, and we saw that the more the points were intensified, the better the accuracy got, and there was no real gain from using the land-use strata as the basis for the classification; rather, using the sampling and intensifying it was better. As the first speaker said, intensification needs to be high so that we can capture the very small changes within the landscape, so we had to intensify all of our plots, and you can see that with intensification the accuracy improves and the precision improves as well. We already had a QA/QC process in place, with people doing a double-blind assessment of the points, and we had 79% interpreter agreement. These are the different plots spread across the Cocoa Forest REDD+ Programme area for the FCPF. Now, for ART TREES, more than one interpreter must analyse the reference data, there must also be agreement with the algorithm, and the majority agreement must be used for the final reported data. This is where the extra confidence comes from: it is not just the result of people interpreting the images they see on the screen, but also the algorithm supporting us, telling us whether there is agreement between what people have seen and what the algorithm is saying. Because of that, we are able to identify the discrepancies between the remote sensing data and what people identified in the plots.
This also reduces the number of plots that need to be reinterpreted, and that is the essence of the ensemble. It has greatly helped Ghana's process. As you can see from the results, there are a lot of stable plots that are also confirmed by the algorithm, and the disagreements concern what we classified as forest change versus what the algorithm stated: 114 plots we classified as stable but the algorithm flagged as change, compared to 143 disagreements where we classified change and the algorithm said stable. This brings down the number of points we have to reinterpret — without it we would have gone back over all of these points — and this is where the algorithm becomes essential to the process: out of over 2,000 plots we only have to reinterpret about 3%, with about 97% agreement already between what we interpreted and what the algorithm said. Ghana is now also assessing very high resolution imagery, from the University of Maryland, to review the reference data, in case the historical imagery we were using was not strong enough to give us good accuracy; my colleagues back in Ghana are already starting the re-analysis of the 281 plots. You can also see that, of the plots classified as non-forest, only 17 of all the stable plots were identified by the algorithm as having a high forest-change probability, which is a good indication that a good job had already been done. These are the comparisons we've seen between the FCPF process, the reference level we had, and this algorithm coming in to help us reassess the plots — the average deforestation area for the TREES crediting level. And the good news is that all of these processes, supported also by FAO, have yielded positive results for Ghana: for the first time, we have gone through third-party verification and submitted to the FCPF under our first, premier programme, the Ghana Cocoa Forest REDD+ Programme, which has yielded about 4.8 million dollars for the country for reducing about 972,000 tonnes of CO2 equivalent. So thank you, and thank you to FAO and to all the people behind the algorithms for the good job done.

Let's keep that in mind. We are actually on time, and we've been keeping up, so thank you very much for being quick. Like I said before, I would like us to keep the questions about the sampling designs and about the proposed ensemble approach for the discussion, but if there is any quick clarification for the countries — oh, now we have about four questions in the room. Go ahead, you first.

So, Andreas, you said at the beginning — this is more on the methodology — how many stable points should you ideally check? You said you check a couple, Kenya did not sound like they checked many, and in Ghana it was some. So my question is: do we have an indicator of how many points, as a percentage or a number, it is a good idea to check among these stable sample points, so that we are sure we are not checking too few points in the stable class?
We're going to take those questions and answer them later; there are going to be plenty of similar ones, I know they're coming, so I don't want to answer this right now, but we'll note the questions down and answer them, I promise. So there was Murray, Danilo, the gentleman over there — well, it was Murray first. Let's take the questions and then we'll answer them.

Thank you. A question for Kenya: you said you identified the low-confidence points, the points that were classified with low confidence. How did you then use that information in your analysis design? And then something I didn't understand about Ghana: you said you reinterpreted the points where they disagreed with the map. I didn't understand, because usually the points should be the validation points, the ones that refer to the truth, so when they disagree with the map it is probably the map that is wrong — but maybe I didn't understand correctly. Thank you.

A question for Andreas and Marceline: when we did the study on dryland forests, one hotspot of deforestation was actually on the border between Kenya and Somalia, where, due to the humanitarian crisis and the conflict, there are millions of people moving and cutting down the dry forest, which is actually in Kenya. On the map that you showed, these forests do not show up at all — you were reporting them as rangeland — and among the 7,000 points you validated visually, none was there. How do you handle a situation like that, in a country like Kenya where you have humid forest and dry forest, which from a methodological point of view require two completely different approaches?

Thanks for the questions; yes, we'll take them later. Go ahead — now I want to take all the questions, because we're going to have more. Go ahead.

Thank you for your presentations. My question is for Andreas and Marceline, on Kenya: you tested the ensemble of machine learning algorithms, but individually the algorithms have their own strengths and weaknesses. Did you look at what is achieved by each algorithm individually, and at the agreement between those algorithms, before going into the ensemble?

Okay, thank you. My question is for Marceline, for Kenya. I think you assessed degradation, deforestation, all these changes — the gains, deforestation, reforestation or restoration. If that is the case, how did you arrive at the area of each category, for example degraded forest — how much area is it, and what is the uncertainty for each of these? The other thing is that in Ethiopia we always face a problem of interpretation with confusing land uses, for example a change from annual cropland to grassland — that is always problematic when we interpret, and vice versa — and also forest land into shrubland or shrubland into forest land. How did you manage such changes in your Collect Earth Online analysis? Thank you.

Last one, from Sandro, and then we're going to have another round of questions. I may not have understood the presentation well, so it's a very simple question about the sampling. When you do the sampling and the reinterpretation, my understanding is that you correct the statistics that you derive from the map in terms of area changes — you don't change the map, you just adjust the statistics. That is the question: yes or no?

That was a great question, because it has an easy answer. So I'd like everybody to keep in mind the questions we have: how do we quantify the number of stable points that we need to interpret; how do we deal with different
landscapes, where we have a mix of dry and humid forest requiring different methodologies; what do we do about agreement between interpreters; and how do we manage tricky transitions to grasslands or other types? Let's keep those in mind. I'd like to take a break from these statistical and methodological points on how we do our estimation — I promise we will keep forty-five minutes for that afterwards — and give the floor to Anne, because otherwise the discussion would eat up the time. We're just going to have a quick eight-minute presentation on the study. Can we see that properly? So, just a little breather, and then we'll move on to those things.

Hello, good afternoon to all. My name is Anne Branthomme, I work at FAO in the Global Forest Resources Assessment programme. I must warn you — some people might be frustrated about it, others relieved — that I will not go into technical details in this presentation, but I will present to you, for the first time, the results of a study that we conducted with many colleagues in house. It is really a joint effort and collaboration between different divisions across FAO, including the agriculture and forestry divisions, and I must also acknowledge that behind all the data I will show you there is the tremendous effort of the countries in the interpretation of the data for the global Remote Sensing Survey 2020. This study is also about global sample-based area estimation — so we are still on topic — but it is an application at the global level to understand the drivers of deforestation, and to go more in depth into the agricultural drivers of deforestation. The study, which asks how large-scale and small-scale farming contribute to global deforestation, was implemented in the framework of the Global Forest Resources Assessment and also under the initiative on turning the tide on deforestation; here you can also see the main authors of the study. In this study we expanded on the work conducted during the Global Forest Resources Assessment 2020 Remote Sensing Survey, which some of you may already know. It is a global survey of forest cover changes looking at two periods in time, 2000 to 2010 and 2010 to 2018, based on 400,000 samples across the world, interpreted using a Collect Earth classification. It was a stratified sampling design based on the Hansen tree cover data to assess tree cover changes, with a further stratification based on ecological zones and the probability or possibility of tree cover change. It involved 800 national experts from 126 countries doing the interpretation, and the results were released at the World Forestry Congress last year, in 2022. Here you can see the cover of the publication, and I encourage you, if you have not done so yet, to consult it on the FRA website, because there are a lot of data on how forests are changing over time, by biome and ecological zone and at the global level. So what we did for this study: we did an assessment of the drivers of deforestation using Earth observation satellite imagery, and we looked at all the areas that were forest in 2000 or 2010 according to the satellite imagery and that were then converted to different land uses — okay, I think this is not the right version of the presentation, but that's fine. Basically, where the land use in 2018 was cropland, the deforestation driver was recorded as cropland expansion; forest converted to grassland was considered to be impacted by
livestock grazing; forest converted to settlements was attributed to urban or infrastructure development; where forest was converted to water, the driver assessed was dam construction or a change in water course; and all other conversions of forest to another land use were classed as other drivers, mainly degradation affecting natural resources. Just as a reminder: a conversion of forest to another land use is, for FAO, considered deforestation. So these are the results of the survey in terms of deforestation drivers between 2000 and 2018. The results allowed us to assess the main drivers of deforestation and showed that agriculture was by far the main driver, accounting for almost 90% of deforestation worldwide since 2000, which is much more than previous studies suggested. Of these drivers, about 50% was due to cropland expansion and 38.5% to grazing; 7% of this conversion of forest to other land was due to the expansion of oil palm. This global figure of 88.5% can hide a diversity of situations leading to deforestation, so it was very important to understand the land use dynamics behind this 88%, in particular with a view to designing more effective policies. So what we did was try to divide this 88.5% into the different processes of deforestation leading to agriculture: for both the conversion of forest to cropland and the conversion of forest to grassland, we split the process between small-scale farming and large-scale farming. We know that several studies have already assessed, for instance, the impact of commercial or industrial agriculture, but in this case we also tried to clarify the terminology and use categories that would be relevant from a remote sensing point of view — from what we see in satellite imagery. The large-scale and small-scale farming categories were defined mainly according to criteria related to the investment, the production capacity and the extent of the area: large-scale farming covers all agricultural activities involving industrial and medium- to high-technology production processes, covering large areas and involving significant capital investment, machinery or infrastructure, while small-scale farming covers farming activities that apply non-industrial methods and low-technology production processes, extend over limited areas, and have labour as the main investment. The methodology was applied to each sample of the FRA Remote Sensing Survey 2020 where we had identified deforestation since 2000: for each sample showing a conversion from forest to cropland, or to grassland for livestock grazing, we identified a set of criteria — geospatial characteristics — in order to quantify how much was associated with large-scale farming and how much with small-scale farming. Those criteria are the landscape context, looking for instance at the proximity to big roads and cities and at the fragmentation of the landscape and the forest; the speed of clearing; the field size, which is of course very important; whether the field boundaries are regular or not; the shape, whether the plots are contiguous or circular; the field patterns, also an important criterion; and the presence of infrastructure, buildings and farming infrastructure.
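To illustrate how geospatial criteria like these can be combined into a per-sample call, here is a toy scoring rule; the criteria names, thresholds and weights are hypothetical, chosen only to mirror the kind of evidence listed above, and do not reproduce the study's actual classification rules.

    # Toy sketch: combine landscape criteria into a large- vs small-scale farming
    # call for one deforestation sample. All thresholds and weights are hypothetical.

    def classify_farming_scale(field_size_ha, regular_boundaries, clearing_speed_years,
                               near_major_road, farm_infrastructure):
        score = 0
        score += 2 if field_size_ha > 20 else 0          # large contiguous fields
        score += 1 if regular_boundaries else 0          # straight / regular field edges
        score += 1 if clearing_speed_years <= 2 else 0   # fast, block-wise clearing
        score += 1 if near_major_road else 0             # landscape context
        score += 1 if farm_infrastructure else 0         # buildings, pivots, etc.
        return "large-scale" if score >= 4 else "small-scale"

    print(classify_farming_scale(35, True, 1, True, False))    # -> large-scale
    print(classify_farming_scale(1.5, False, 6, False, False)) # -> small-scale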
The study gave two main types of results. The first is which criteria are more relevant for distinguishing large-scale from small-scale farming in each region and for the different production systems, livestock or cropland. The second is the share of large-scale versus small-scale farming contributing to deforestation, globally and by region. So what are the results I'm going to present today? First, worldwide, during the period 2000 to 2018, most of the conversion of forest to cropland or grassland occurred in the context of small-scale farming: small-scale farming accounted for 68% of the conversion of forest to agricultural land, of which 40% was linked to small-scale cropland and 28% to small-scale livestock. In terms of hectares, the estimates gave 103 million hectares of forest loss associated with small-scale farming practices. If we look at the distinction between cropland expansion and livestock grazing, the study found that cropland expansion into forest is associated with small-scale farming in 71% of cases, while for livestock grazing small-scale farming was also the predominant cause, associated with 64% of the area deforested for agriculture. This is just to give you the distribution of where this happened, and what we see is that the results vary from region to region: small-scale farming was linked to most agriculture-driven deforestation in all regions, but to different degrees. It represents, for instance, a share of 97% of agriculture-driven deforestation in Africa, 65% in North and Central America, 59% in Asia and 52% in South America. The highest shares of forest loss associated with large-scale farming were found in South America, where 30% of agriculture-driven deforestation was associated with large-scale livestock production systems, and in Asia, with 38% linked to large-scale crop production, mainly oil palm plantations. The conclusions we draw from the study are that the methodology is efficient and replicable — we did a quality assessment and found that about 90% of the samples were interpreted in the same way by different interpreters. We also found that adding parameters beyond field size to define large-scale versus small-scale agriculture is very useful: we need to combine different indicators and spatial characteristics to be able to really separate these two agricultural categories, and this was particularly important in the case of livestock. We found that these results align with data from other studies showing the important contribution of small-scale farming to food production: some studies show, for instance, that 70% of crop production is linked to smallholders, as well as 73% of coffee and 20% to 30% of palm oil, so our findings are in line with these. The study also shows that we need to strengthen efforts to address the weaknesses of current production systems when designing strategies against deforestation, and that we need to consider the strong development component — in particular food security, decent income and security of rights — when addressing the deforestation issue. Therefore these findings can support more tailored policy making, and this kind of approach can be used to develop more fine-tuned approaches at different levels. The study is released today, so you can download it from the FAO website, and I would like to
thank you for your attention.

Thank you very much, Anne, for this great presentation showing how this approach can also be used to derive more qualitative information on drivers — that was great. So check out the link, and maybe we'll share it in the chat. I've seen that there are also some very interesting comments in the chat, and maybe we will read them out, but before that I would like to check whether Steve Stehman, from the State University of New York, is online to give us a brief wrap-up of these different approaches. Steve, are you online?

I am, can you hear me? Yes, we can; the floor is yours, and we can see your slides. Okay, let me see if I can get to the slide show. I will try to keep this very brief so we can get to the important discussion questions. This is a bit of a transition from the good applications you've been hearing about into the discussion part. We've heard about many different decisions regarding sample-based area estimation through the applications presented earlier, and I think it is helpful to have specific criteria to guide our decision making and to attach these decision criteria to specific components of the methodology. The criteria can be statistical and practical — and in a sense statistical criteria should also be practical. We've already heard about the components of sample-based area estimation, so just a quick review: there is the sampling design and the response design, where the response design provides our best available determination of the condition, and we want to connect these components to the different criteria I listed earlier. For example, a decision we would want to evaluate today is the use of the ensemble classification: it can be used in the sampling design in the form of stratification, but we also heard about it being used as part of the response design, possibly to provide the reference data, and also as a quality assurance device. So when we are evaluating decisions such as the role of the ensemble classification, we want to connect these features of the sampling design and response design, at the top of the slide, to specific properties of the area estimators — in this case the statistical properties: what are the sources of bias and variance coming in from the two different components, the sampling design and the response design? That will help us think about the decisions we are making in sample-based area estimation. Let's start with bias, and this is a review of what the potential sources are. From the sampling design we usually don't worry about bias, because we can rely on sampling theory to give us unbiased estimators, or ones with small bias; the issue of bias essentially comes out of the response design — non-random errors in the reference classification or, another way to think about it, an imbalance between omission and commission errors. So we would like to have ways of assessing that bias; the question already came up, for example, of how many points you would have to check in the stable class to make sure you're really not still having omission errors — essentially, we want to do what would be an accuracy assessment of the reference data. In terms of variance, there are two sources. It can come from the sampling: different samples would yield different area estimates, and we've talked about reducing the sampling variance either by increasing the sample size, which we know is costly, or by using map information through stratification or model-assisted estimation.
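For reference, the standard stratified estimator and its sampling variance in the usual notation — a sketch assuming simple random sampling within strata, where $W_h$ is the area weight of stratum $h$, $\hat{p}_h$ the proportion of change among the $n_h$ interpreted samples in stratum $h$, and the change area is $\hat{p}$ times the total area:

    \hat{p} = \sum_{h=1}^{H} W_h \,\hat{p}_h ,
    \qquad
    \widehat{V}(\hat{p}) = \sum_{h=1}^{H} W_h^{2}\, \frac{\hat{p}_h (1 - \hat{p}_h)}{n_h - 1} ,
    \qquad
    \mathrm{SE}(\hat{p}) = \sqrt{\widehat{V}(\hat{p})} ,
    \qquad
    \hat{p} \pm 1.96\,\mathrm{SE}(\hat{p}) \ \text{(95\% confidence interval)}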
Model-assisted estimation has already come up in the chat, from Christophe. We also have variance coming from the response design: that type of measurement or response variance enters, for example, when two interpreters look at the same sample unit and disagree. That is another source of variability, and we would also like to reduce it as much as possible — that is what a lot of the training and coordination efforts are designed to do. We need to assess or estimate the variance, because the square root of the variance is the standard error, which goes into the confidence intervals we need. We can think of the total variance as the sum of the sampling variance and the response variance; generally, the only part we estimate is the sampling variance, for which we have standard formulas. To this point we are not incorporating response variance, and that opens up another very big issue — we do have methods with which we could start to do that, but it is not part of current practice. So, as we proceed to the discussion and the challenging issues that will come up there, I think it is helpful to keep in mind that we have certain criteria to think about, and that these criteria associate with specific components of the sampling design and response design: typically, sampling design and estimator issues will focus on the variance criterion, and response design questions and issues will focus on the bias criterion. We are trying to find good, but not perfect, solutions to these problems, realizing that there are always trade-offs in the choices we make, specifically in terms of these criteria. I hope that helps as a lead-in to the discussion, and I'm sure a lot of interesting questions will come up. Thank you.

Thank you very much, Steve — great comments and points of wisdom regarding the fact that we are tackling a difficult subject and that trade-offs are necessary; I really appreciate that. We don't have forty-five minutes for questions, but we do have twenty-five minutes if we stay within time, so I would like to open the floor. We'll take a first set of answers and then another couple of questions — I can also see the hands — but at least I want to start with the questions that were asked earlier, maybe the first one on the necessary number of stable points. Andreas, do you want to take that?
Yes, I can take that, and maybe I'll group it with Danilo's question and with the point Steve Stehman just made. I remember earlier this week there was a slide from Maria Sanz in the introductory session saying that you basically need to find a trade-off between accuracy and precision, and this is what we are trying to tackle; as Steve pointed out, the question is what the criteria are, and we also have to take cost and resources into account. So if you are asking for the number of points you should have in this stratum, I couldn't give you a single answer — it really depends on the case. That's also why we presented these usage scenarios: it depends on the country, on the circumstances of the project, and on the resources you have available. In terms of the Kenya example and the drylands: the map you saw is actually the two-kilometre grid, but it is so dense that you cannot see every single point; some points were included in those areas and have been interpreted — I cannot tell you how many right now. On the other hand, we used Landsat data, and it is known that Landsat has problems detecting forest in these areas, but we are willing to take this method forward, for example using Sentinel-1 data, which can detect woody vegetation, and integrating that. I think one advantage of this approach is that, since you keep the same grid, you can also go back in time — you stay consistent — and you can update your old results when new data or new methods come in, so you stay consistent over time.

Thank you. There was a question for Ghana — do you want to take that one, Jacob? Thank you. So, on the disagreement with the map and which points we reinterpreted: the answer is that we reinterpret the points where there are disagreements, both change and stable — points where the interpreter said it is forest change but the algorithm says it is not, and points we said were stable but the algorithm says otherwise. From the presentation you can see that we had 143 points that we had classified as change but the algorithm said were not, and 114 points that we said were stable but the algorithm said otherwise, plus the 17 from the other classes that we had to add. So it is all of the disagreements between interpreter and algorithm that we reassess. Thank you very much.

I believe there was a question for Kenya regarding the different vegetation types and the border between Somalia and Kenya. Marceline, Andreas, do you want to take that, either one of you? I already addressed it — I said I'd group it — but do you want to add something?
Thank you very much, Danilo, for the question. As Andreas said, it's not that we didn't have any points to interpret in the rangeland areas — we also had some points to interpret within that region, just not as many as in the high-potential areas. Then there was also the question from Ethiopia about the interpretation of land uses that are easy to confuse, and the answer I would give you is yes, that was a challenge for us too in the first activity data exercise, because we used the map subtraction method, and map subtraction is always a challenge: you will never find a perfectly classified map, and when you subtract maps the errors tend to be compounded. That is partly why we moved to the sample-based approach — as we all know, a number of standards are now very particular about accuracy and precision, and those key factors are partly addressed by the sampling approach. There was also a question on how we arrived at the area of each REDD+ activity: if I remember right, I showed a decision tree that we used to arrive at each REDD+ activity, and in our sampling design the square plot used for interpretation was finally what allowed us to derive the area of each activity, looking of course at all the strata that we were interpreting. Thank you.

Thank you, Marceline. I'm not sure now whether your question was actually answered — you are the best ones to say. There was also the question from Terje; go ahead, Terje. So, you were asking about the importance of the different single algorithms versus the ensemble — what is improved by using the ensemble. I think it works differently than you might expect: you don't really look at the accuracy of the individual algorithms. Usually, with these change algorithms, you need to find a threshold on your own — some model is fitted to a time series, you get some deviation, and you try to set a threshold. In this case it is the machine learning that learns from the training data where to set that threshold, and then you assess your final accuracy. But since you are working with a probability, you don't get a discrete classification — everything is then handled by the sampling at the end. So it's not that you get a real map whose accuracy you assess in the classical way; it's just a way of stratifying, basically.

Okay, Sandro, then Luca, then Ethiopia. Sandro, do you want to repeat your question? It's a very simple question: I understand that with this sampling, with these plots and data, you adjust the statistics that you derive from the map — you don't adjust the map. Am I right, or am I misunderstanding? Okay, so maybe we should say we don't even have a map. In the approach proposed here, which Kenya has used, which Ghana has also used, and of which we have examples in Côte d'Ivoire — and there is actually quite an extensive comment from Christophe, so maybe we can read that out afterwards — the database consists of points. We have all these algorithms and all this information collated together and used, but we don't change any of it; the change probability that we use for stratification, for instance, is just a temporary measure and it
doesn't show up in the end: we have the point system and we make the statistics based on the point system. Go ahead, Naikoa.

Perhaps I can shed some light, as someone who went through the difficult journey you are now embarked on. A little bit of context: we are working with one of the FCPF countries; they presented their first monitoring report, it had huge uncertainties, and we tried to work with the country and deliver support to see how those uncertainties could be tackled. Then we realized that the country was working with FAO and its team on this methodology, and we didn't want to generate parallel work streams — the idea is to reduce the amount of work the country has to do in order to deliver for whichever reporting framework it is working with. What that entails is that we need to understand what is going on, so the first thing was understanding what this probability distribution is about and how it is put together. The first thing you understand is that there is no map: what you are doing is a total-evidence approach. You may have maps derived from different algorithms, but you also look at all kinds of remote sensing-based change data you can put your hands on, and based on that you generate a probability distribution of change — basically you try to identify where change is more likely and where it is not. Andreas, correct me if I'm wrong, because this has been my journey of realizing it. Then you do a cluster analysis: you split the landscape into strata of low, moderate and high probability of change, and that informs your sampling design. One thing that goes back and forth is which allocation approach you use — a statistically based allocation approach. In this country specifically we had money for 4,000 points; the homogeneous systematic grid had 46,000 points, and there was no money and no time to analyse 46,000 points, so you have to prioritize: okay, I have money for 4,000, how do I allocate my sample size across the three strata? That's what we did, and then, with the support of other consultants and the work team, the 4,000 samples were assessed with a two-round assessment for quality control. Based on the interpretation of those samples you estimate the area. There was a question about how you know how much area is represented by a sample: when you do the cluster analysis of the probability map, you have the size of each stratum, so if a sample falls in the low-probability stratum you know how much area each sample there represents, and likewise for the moderate and the change strata. What this approach gives you is a denser sampling assessment in the change stratum.
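As a worked illustration of that area bookkeeping — only the 4,000-point budget is taken from the example above; the stratum areas and the interpreted change counts below are hypothetical numbers:

    # Sketch: expand interpreted sample results to area estimates per stratum.
    # Stratum areas and observed change counts are hypothetical.

    strata = {
        #  name           (area_ha,    n_interpreted, n_found_change)
        "low_prob":    (8_000_000,      500,            2),
        "medium_prob": (1_500_000,    1_500,           60),
        "high_prob":   (  500_000,    2_000,          700),
    }

    total_change_ha = 0.0
    for name, (area_ha, n, n_change) in strata.items():
        p_change = n_change / n           # proportion of change found in the stratum
        change_ha = p_change * area_ha    # each sample "represents" area_ha / n
        total_change_ha += change_ha
        print(f"{name:12s}  p={p_change:.3f}  change={change_ha:,.0f} ha")

    print(f"Estimated change area: {total_change_ha:,.0f} ha")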
We tested it for this country and, with well-estimated uncertainties and so on, we got results that we feel comfortable with. Are they perfect? No. That's it, that's our adventure with this methodology. I'm not going to disclose the country because the report is not out yet, but there you go.

Thank you very much for the reality check. Okay, is that an answer to your question, or do you want to go ahead?

So, assuming that in the stable stratum I can do my sampling, or pre-sampling, I don't know how you call it, and let's say 5% of the points turn out not to be stable, then what do I do with this 5%? That's what I didn't get, because you don't change this.

You can propagate it, yeah. You just propagate this 5%, or the other thing you can do is use the information from the interpretation of the points to improve your clustering.

To do the clustering again, to do the clustering again. Yes, thank you very much.

You're welcome. What he describes is cheating, let the world know. I think it's a good terminology, and one we could talk about a lot, but it's not cheating, it's a trade-off. There is going to be a trade-off to be found between the allocation of resources, the precision that is necessary, and reality checks on different landscapes. So there was a question from Luca, I saw Alfonso's hand, I saw Javier's hand, and I saw one over there too, and in theory we have another five minutes, so please keep your questions answerable.

I have many questions, but I will do one. It's a question about transparency, because with systematic sampling we know it is not perfect, but we know exactly why each point was selected. If you put yourself in the shoes of a person who has to assess the transparency and accuracy of this data, what type of data should a country provide to allow that assessment, for example why a given area was or was not sampled? Is that something you think is easy? Can the algorithm and the change probability map be shared? Just to get some reflection on that.

One quick answer to that. This is something we thought about previously and it is part of the learning process. The solution we found in one exercise that we did and supported was to use a random seed for the systematic grid, so you randomize the starting point of the systematic grid. Of course you are a statistician, more of a statistician than I am, and there are a lot of statistics gurus here, so if you come up with better solutions we can work with that.

Just to add, you can reproduce everything: at every step where there is a random element you set a seed, and then you can reproduce everything. You can also see how the single algorithms behave. Of course this would be a lot of work for a reviewer, but you can also get the final map of change probability and see, as Danilo for example pointed out, that maybe there is an area missing, and in the next round maybe add new data. Because if you think of the ensemble approach, right now we are using just one data source with a bunch of algorithms, but ideally you would want more data sources. For Kenya, for example, we went back to 2013 I think, so we could only use Landsat, but maybe in the future we can add Planet data or Sentinel-1 data, so that the input data coming in also contributes to this ensemble.
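As a minimal sketch of the two ideas just mentioned, a systematic grid with a randomized starting point and a fixed seed so a reviewer can regenerate it exactly, here is one way it could look. The spacing, extent and seed value are illustrative assumptions; a real implementation would typically work in a national projection or on a discrete global grid.

```python
import random

# Sketch: systematic grid whose origin is shifted by a reproducible random offset.
# Publishing the seed lets a reviewer regenerate exactly the same grid, while the
# offset keeps the design "randomly started" rather than fixed by convention.
SEED = 42          # hypothetical seed, published alongside the report
SPACING_M = 1_000  # nominal 1 km spacing (illustrative)

rng = random.Random(SEED)
dx = rng.uniform(0, SPACING_M)  # random x-offset of the grid origin
dy = rng.uniform(0, SPACING_M)  # random y-offset of the grid origin

# Illustrative study-area extent in projected metres (xmin, ymin, xmax, ymax)
xmin, ymin, xmax, ymax = 0, 0, 50_000, 30_000

points = [
    (x, y)
    for x in range(int(xmin + dx), int(xmax), SPACING_M)
    for y in range(int(ymin + dy), int(ymax), SPACING_M)
]
print(len(points), "grid points; first point:", points[0])
```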
And maybe on the aspect of cheating, or trade-off: the way it is thought of is as an iterative approach. You would start out, if like Niko you have 4,000 points, in this case the way it played out, with maybe 500 points that you use to train your algorithm and get this change probability. Then you look at your map and ask: am I happy with that map? Maybe not, maybe I should add some more points, because I know that here and there it doesn't capture what I actually want to see. So I add new points to the training until I am happy that the stratification that comes out represents what I want, and only then do you start to select the samples that you really use for the final estimation step. From there on you cannot change the boundaries of your stratification anymore; from there on you can only intensify.

So we are coming to an end. There are plenty of comments in the chat; I would just like to underline one, that maybe the approach shown by Ghana and Kenya, and recently implemented in Cote d'Ivoire, could be further developed to combine the change probability maps in a model, which is maybe something we will have discussions about in the coming months. What we will do is gather all those questions and share them somewhere in the GFOI proceedings, or wherever they will be, but we will make sure we make them available. Is there any final, really one-minute question to be answered? Otherwise I suggest we bring it to the coffee break, because we will have another hour and a half, like before, until the next plenary. We can go overboard a little bit, go ahead please, but feel free to escape if you think you have had enough.

I think this method for quality assessment and quality control is really good, and it is something that a lot of the countries we work with would be very happy to use. I just wanted to point out that, for the reproducibility and transparency of the process, I would use a systematic global grid that you can pick plots from and apply a systematic approach all the time, without any randomness in the process, because otherwise you will be open to the kind of comment or criticism that you are maybe not playing straight with the data.

Maybe I can answer that one as well. You have the Cochran formula to look at the necessary sampling density given a priori information, and we also do reality checks on this using global products, and basically every time the change that is currently mapped at 30-metre resolution gets captured by a 1-kilometre grid. So maybe we should just follow your advice, go for a global 1-kilometre grid and forget the rest. But just to say that we are probably not completely off, because we are basically using that everywhere, and when we redo the calculation we keep capturing the same things.
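The formula referred to is presumably Cochran's classic sample-size formula for estimating a proportion, with an optional finite-population correction. A minimal sketch, with illustrative a priori values rather than the numbers actually used in the exercise:

```python
import math

def cochran_sample_size(p, margin_of_error, z=1.96, population=None):
    """Cochran's sample size for estimating a proportion p to within a given
    margin of error at (by default) 95% confidence; if a finite population
    size is given, the finite-population correction is applied."""
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# Illustrative a priori values: change expected on about 2% of the area,
# to be estimated within +/- 0.5 percentage points at 95% confidence.
print(cochran_sample_size(p=0.02, margin_of_error=0.005))  # about 3,012 points
```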
Can I add one thing to that, Alfonso? What we suggest in these Jupyter notebooks that are up is basically to use DGGRID, which is the same grid FRA is using, with standard parameters, and it even has the advantage that if you figure out later that you need to intensify, it is hierarchical: it is not a completely new sampling design, you just intensify it, and the points you have already interpreted stay in. So that part is standardized. There are actually a lot of random elements inside the full process, but you can always set a random seed so that it is completely reproducible from start to end; once you have these notebooks you just run them again and you end up with the exact same results.

Last question? Last, last question. Is this working? Fantastic work on the drivers. The question is not about statistics, but for countries that want to better understand their drivers of deforestation: do you see any guidance forthcoming on how countries can adapt these methods to some of the nuances of how drivers might be defined, or be at a different scale, in their country versus a global assessment?

Thank you for the question. I think this is feasible. Currently we have some guidance and e-learning that were used for the FRA 2020 remote sensing survey, and there will probably be a new methodology for the next global remote sensing survey, but what we can also do is find a way to provide the tools to derive those estimates and customize them according to national needs. I think that is something we have in mind: to provide the tools and the means to customize both the sampling and the questionnaire, so that all this driver analysis can also be done at the country level or at a finer level.

Before we close, Aurelie, do you want to very briefly complement this one, and then I will close?

I was going to say that we have a global methodology, well, a global methodology being piloted in the Congo Basin in the six Central African countries, basically using similar methods to identify specific drivers. We have identified nine drivers that we can see in Planet time series data, so that when we see change we can tell whether roads or agriculture or mining are implicated, and we also look at the overlap and combinations of these drivers. Similar results: 90% of disturbances are related to small-scale agriculture, but we also pull out degradation as well. The plan now is to apply these methods here, take that same approach but apply it on a systematic grid, get those drivers out, and give that same, or a similar, relative contribution of drivers. Sorry, Faw, yes, I just walked up here.

All right, we could go on for a very long time. Thank you very much for the interesting questions, a big round of applause for the panelists, take a break, and see you in the plenary to see if we can bring the discussion to the next stage.