Global Flood Partnership doesn't have a hard deliverable defined. It's a community coming together out of interest in floods and, hopefully, building some activities, yes, but there's no mandatory hard deliverable. And if you sign up to this, you're probably going to sign up for some work, but a lot more meetings like this, to define a clear deliverable and to deliver to GEO, to David.

So let's get started, to keep up, on this last day. Hopefully it will be as exciting as the days before. We have three talks this morning, hopefully a bit shorter each talk, so you can make the coffee break. So without further ado I introduce Jurjen Wagemaker. He's from FloodTags, so some of you may know that company, and I'm very glad to also know that he's part of the GFP as well, and he agreed to give a talk here. Can I say you're part of GFP?

You can, yes, exactly, so thank you for asking.

Last meeting in Delft... I'm glad to see he could make it here. And yes, please.

Thank you very much, Guy. Thanks a lot. So, my name is Jurjen Wagemaker. I work for a small company named FloodTags. We analyze social media and online media for flood and drought management and disaster management, and in this presentation I would like to share with you the use cases that we've been working on and the technical approach that we're using in these use cases. But I want to start with some slides on how we got to this idea and why we think it's important. And now the batteries are coming; this is also working, okay.

So, I'm a water engineer. I studied in Delft, and for the past 20 years I've been working on a large number of flood projects throughout the world. One event I wanted to share is New Orleans 2005, Katrina obviously. I worked as a consultant at that point and was sent there for an insurance firm to really get to the bottom of what the events were, the facts that led to the eventual damage and casualties in New Orleans. We did that together with Delft University of
Technology, and we were using all kinds of sources that we had: of course some remote sensing data, some gauge stations, and a lot of designs from the US Army Corps were shared with us, so that we could complete the picture of the events in New Orleans. But it wasn't enough; we couldn't really capture the sequence of events. So then, you know, you start walking around New Orleans. We were there one month after the hurricane, and these pictures were taken at that time: an embankment with scour on the other side. For the engineer it's really nice to see how important embankments really are and what happens if they overflow. But it wasn't enough to complete the picture. So we started talking and doing a lot of interviews in New Orleans, and collecting images like the one at the bottom, which we got from an operator of an electricity firm on the other side. Someone had taken the picture at the right time, and so we knew from this picture at what time there was an overflow and what happened in the Inner Harbor Navigation Canal. It was with this kind of content that we completed the whole picture of the events, which we later delivered to the insurance firm. This was, if not my first encounter, a good example of the on-the-ground observations that we were collecting from people. Only it cost us a lot of time, because we went there and went chatting with the people and collecting photos like that. But I found it really interesting.

Later in my career I worked in Jakarta. I lived in Jakarta for five years, a city that has a tremendous amount of flooding coming to it. In 2013 there were big floods, and I was hired by PT Astra. PT Astra is a big car reseller in Indonesia, the biggest in Southeast Asia actually, selling Toyotas. The Sunter polder, a specific polder within Jakarta, was flooded again, and they asked: why is it flooded again, and why are our offices wet again? So
then again I've got the gauge stations and the remote sensing, et cetera, but then you go walking around and you see these kinds of pictures: beautiful flood walls here along the Kali Sentiong, but obviously there's a coupure here that's not supposed to be there. They made it because otherwise you can't get to the other side with a motorbike, but obviously there's a problem there. We started doing interviews, getting content again, and collecting these kinds of photos, where you see how the water is coming in from the Kali Sentiong, but also how the rest of the wall has holes in it. It was again this kind of content that allowed us to make such an overview of the floods that had happened, really by also getting the ground content, because without the ground content you can't really comprehend the situation there.

Last example. In the same year I worked on a flood management information system together with Deltares, in the Delft-FEWS flood early warning system. It was an unluckily chosen title, flood management information system, because actually it was a flood early warning system. And the early warning worked: floods were coming, and we sent out the alert, or, to be precise, we sent it to the Met Office and they sent out the alert. And then we got calls, and they said: well, you built this wonderful flood information system, so where are the floods? Oh, I have no idea, because we can only forecast the floods. When the floods are happening we are completely lost; we don't know anymore. So it was silent on the other side of the line, because it had been an expensive flood early warning system. But at the same time we saw a huge number of tweets coming in, with all sorts of information about rescue activities, about the water depth, et cetera. And there were not just the eight that I'm presenting here. We started counting them later on, and here you see daily totals of tweets, up to, well, this is 300,000. So at a certain event in 2014 we got 300,000 tweets, not
without retweets, in a single day, up to 10 tweets per second that were coming in. And I thought, well, this is something nice. So we started looking more carefully into those tweets: what are they going to tell us? Well, they say quite a lot. When you look at this tweet, it gives a time, which is the time of the flood happening there; there's a banjir, a flood; we have about 50 to 60 centimeters, very precise; di depan, in front of, Kem Chicks, Kemang Raya. Kem Chicks is a shop on Kemang Raya. So it was really very detailed. We get these kinds of detailed messages, and on one day we got 15,000 individual observations that we could track with a geo-reference. So that is a huge amount of information compared to the flood early warning system, where we were using 14 gauge stations. So, okay, that's interesting.

Obviously breaches of defenses are always in pictures, and repair activities are mentioned here. And important in Jakarta is that there are a lot of NGOs working, very small NGOs, big NGOs, a lot of them, and the whole coordination is really complex. But every self-respecting NGO is taking a photo of its aid and putting it on Twitter. So we got the question from one NGO: can we somehow find out where there's severe flooding but nobody's helping yet? Well, actually we can, because we can see all the aid being delivered, so we can also see where nobody is yet.

Okay, all this information. So I was thinking about starting a company. I thought, I've got to do something with this; this is the chance. But before I went on to do that, I had to convince myself, of course, because my whole business model would be on Jakarta only, which is nice but perhaps not sufficient. So I started looking around, and we did a quick inventory of all the floods in one particular year. We saw here in Syria a big peak of 30,000; South Korea, 35,000; the Philippines, almost 50,000;
Detroit, 8,000; and also smaller events, like inundations in the Netherlands. One was not all that severe, up to the knee, but we find that really annoying: 15,000 tweets. Thunderstorms in Essex: 2,000 tweets, just thunderstorms, no inundations happening. Why, I don't know. But we also found, for 2013, that there were 28 big floods, and that 75 percent of the top 20 disaster-prone countries are also in the top 20 Twitter countries. I don't know why, but I thought, that's an opportunity. So I quit my job and I started FloodTags.

What we do is we collect data, public online media data, usually user-generated content, mostly from Twitter, but also from blogs and from news sites, and we're experimenting now with messengers like Telegram; I'll come back to that later. Obviously everything has to be publicly available information. Many people also ask us about Facebook. People on Twitter want to share everything with everybody; on Facebook you want to share it with your friends only and keep it there, so we can't use that information. However, there are also public Facebook pages, and posts on these public Facebook pages we actually can use. In some countries this is a really great way forward, also with citizen observations.

Next, we analyze the content, which basically is the event detection, further information extraction, geoparsing and real-time monitoring. We're doing that with a number of partners; we're all about collaboration. We're collaborating a lot with universities like Radboud University and Vrije Universiteit Amsterdam, and with Deltares, making sure that the PhDs working at these universities are really helped with our data, so that they can do the research. We also share how our software works, so that the scripts they're producing can feed back into our software, and that's how we grow the product. And lastly we share
the results via a dashboard and via an API; we'll come back to that. Here's my movie, that doesn't work. Okay, try not to look at the table. Oh, okay, that's a shame. Okay, these are the daily floods that are happening in the world. Every day we've got, what is it, 20 to 30 floods happening somewhere in the world, big floods and small inundations. We're doing this on the basis of 12 languages on Twitter, monitoring them: anomalies in the tweet statistics set off the detection.

Of course, then the big question is: how accurate is it? We've got this table here. It depends on how sensitive we make it, but the number of events detected between July 2014, when we started monitoring, and January 2018, when we did the study, is between 18 and 13 and a half thousand individual flood events. Okay, how many are correct? Let's look at this one: if we use an additional threshold of 20 non-duplicate tweets, we have a precision of about 83 percent, and up to 87 percent is where we are now; we're increasing that precision as we speak. We have to relate that to something, but there's no big database of flood events in the world; of the large floods, of course, but not of the medium and smaller floods as well. Just to get at least some feel for it, we compared it to the Munich Re database, the NatCatSERVICE. So we see, of the correct events, 56 percent, 54 percent are not captured in the NatCatSERVICE. Actually, the NatCatSERVICE only has two percent of this total number of events, and obviously that is because our database has all the floods, also small inundations.

So this is our starting point. The only thing is that this is a global flood monitor, and a global flood monitor doesn't have customers. So how can we bring that to use cases? Oh no, first this one: the geo-reference is really important as well, of course.
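
As an illustration of the detection step described above (anomalies in the tweet statistics, plus an additional threshold of 20 non-duplicate tweets), a minimal sketch might look like the following. The function name, the seven-day baseline window and the z-score cut-off are assumptions made for this example only; the talk does not say how the FloodTags software computes its anomalies.

```python
from statistics import mean, pstdev


def detect_events(daily_counts, window=7, z_cutoff=3.0, min_tweets=20):
    """Return the indices of days whose tweet count looks anomalous.

    daily_counts: non-duplicate flood-related tweets per day for one location.
    window, z_cutoff: hypothetical baseline length and sensitivity setting.
    min_tweets: additional threshold (20 non-duplicate tweets in the talk).
    """
    events = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        mu = mean(baseline)
        # Floor the spread so a very quiet baseline does not inflate the score.
        sigma = max(pstdev(baseline), 1.0)
        z = (daily_counts[i] - mu) / sigma
        if daily_counts[i] >= min_tweets and z > z_cutoff:
            events.append(i)
    return events


counts = [2, 3, 1, 2, 4, 3, 2, 15, 120, 300, 40]
print(detect_events(counts))  # → [8, 9]
```

In this sketch, lowering `z_cutoff` or `min_tweets` makes the detector more sensitive: it reports more events, at the cost of precision, which is the trade-off behind "it depends on how sensitive we make it".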
Let's look at geo-referencing quickly. The precision at the first-order administrative level, which is like counties, regions, provinces, is 97 percent, and one level lower, sorry, at county and town level, it is 92 percent. This is ongoing work of ours; there's a publication on our website for those who would like to read it.

This is the fifth floor of the Red Cross in Manila. It's the operations center. Their task is to collect all the information from the entire Philippines and spread it out to the other departments where the response is happening. The slogan of the Red Cross, I believe globally but at least theirs, is "always there, always first", and the "always first" they take very seriously, obviously. So they said: can we find a way to inform us better about any new floods coming in? That was question number one. Number two was: when floods are happening, we are scrolling Twitter and trying to, you know, get all the information; can we structure that somehow? And so we did.

Here's the flood dashboard that we built for the Philippine Red Cross, where you've got all the statistics, all the photos, and all the locations within the Philippines, with dark blue marking thresholds where something severe is going on. Each time we have a new event, and very small peaks like this are also events, we send out an alert to the Philippine Red Cross, and they can check exactly what's happening. We've been running this for two years. We've got 650,000 tweets, of which 240,000 with a location. There's a large number of small events, hundreds and hundreds of small events, and a few really large events, like here, Typhoon Yagi just now, on 11 August.

Interestingly, the Red Cross is very averse to false negatives, but false positives are quite okay. False positive? Okay, sure, because they're going to call the chapters, call the volunteers: not true? Okay, no, they go on with
their business. They've got six people there; of course they're busy, but this is their work. False negatives they don't know anything about: it's gone, and later on, maybe a day, maybe two or three days later, they hear that something has been going on. Not "always first". That's where Twitter is really valuable, because we can get to a very low number of false negatives.

Some tests that we did there are interesting to tell. Here again are all the tweets, but then positioned. We didn't have a good DEM or anything of the Philippines, but together with Deltares we made 10 flood scenarios. So these are flood scenarios, and in the interface you can click on them, select or unselect them, and then see what you think is a likely flood extent. This is helpful because then you can see that there will be no floods here, no floods here, or that here there can be severe floods. What severity can that be? Well, then I can go to my maximum scenario, et cetera, and match them with the tweets that I've got.

A different example: this is Tanzania, Dar es Salaam. They were interested in the same thing, to support their immediate flood response, so to know where to go when a flood is striking, and as a basis for new emergency fund requests via the DREF. If a national society can't cope with a flood, they can ask for more funds from Geneva via a DREF, but they need to build their case in this DREF. So can we get that information? Of course this was a real challenge for us, because Twitter in Tanzania is really not that big, so what's going to happen? But we went out monitoring it anyway, and in a 10-month period we got eight and a half thousand tags. Besides Twitter we also found this blog, JamiiForums it's called, where people are also sharing information about floods. We took that as well, making a total of eight and a half thousand. I found it small, because I was used to
Indonesia and the Philippines, and Renato Makara, the director of the Red Cross, said: well, this is wonderful, because it's so much more than we had before. Only, he had this question, because he said Tanzania is a real WhatsApp country; everybody's WhatsApping constantly. Well, but how do you get some information across then? Well, I hear something in this group and then I forward it to this group, and then someone forwards it to that group, and everybody's connected in that way. Okay, so is that a problem? Well, that is a problem, because it's not structured. Later on I talked to the municipal disaster manager of Kinondoni, and she said: I have to get up at nine o'clock in the morning to read all my app messages and write them down, so that I can report at nine o'clock to my manager what's happening in the city. So if you can do something with that, that'd be great. Okay. Now, WhatsApp is off limits, but what we're going to do now is a trial with Telegram, to see whether we can move the volunteers to Telegram groups, where it actually is allowed to monitor, in full transparency to the participants of course, get that information and treat it the same way as the tweets, so that we can have an overview of what's happening in the city or even the country.

Okay, that was two use cases. Back a little bit to the technical approach. I already showed several of these sources. Again, Twitter is really valuable for real-time events and very popular in large parts of Asia, Europe and North America. For Facebook, only the public pages are really relevant. There's online news, and there are the messengers. Actually, I've already touched upon all of these. The analysis involves mostly information extraction on the basis of natural language processing; geo-referencing, really important; event detection on the basis of anomalies; and there are various enrichments and combinations that can be done. Let me go through them one by one,
just to sketch a little bit the work that we do. We've got linguists in our company. What they do is look at all these kinds of sentences; then we break the sentences up on the basis of a number of features, and then we ask our users to annotate them and say, well, this "Typhoon Hagupit", is that indeed a typhoon? Yes, it's a typhoon. "500,000", what is that? Oh, that's damage, or that's people, et cetera. We train the system, and from it we can say: this sentence is actually built up of this identifier, this location and that time. That's it in a nutshell.

When we go to geo-referencing, I get a lot of questions about Twitter: are we using the geo-reference that's inside the tweet? I found very early on that people in Bandung were tweeting about their brother in Jakarta being flooded. So then I've got the geo-reference of Bandung, while the actual flood is happening in Jakarta. So we said: let's abandon the whole geo-reference of the mobile. I don't want to know where the device is; I want to know where the flood is. So we went on to look into the body text. But in the body text it says, for instance, "Oh, flooding houses, hashtag bostonflood". Boston is in the USA, in the Philippines, and also in England, so I don't know where it's going to be. So then we look for more of those tweets. Perhaps one says "Boston UK"; perhaps someone mentions some church in Boston. What we finally have is a lot of dots around Boston, UK, if that's the location, and some scattered dots in Boston, Australia, Boston, the Philippines, and some churches in England. Now we do a vote, and we know it is Boston, UK. In a nutshell, this is the geo-referencing approach that we're taking. We've got a nice paper on it; it's called TAGGS, by Jens de Bruijn, and it's on our website as well.

Then we wanted to do
something with flood inundation maps. I already showed that we've got some 15,000 water level observations in Jakarta. We put that into FEWS, Jakarta FEWS, and here you see all the kelurahan. The kelurahan are the subdivisions of Jakarta; this consists of about 105 kelurahan. And here's the Karet barrage. The Karet barrage is a barrage over here that shows the amount of water in the channels. We thought: can't we make this into a real-time flood map? And so we did. We're taking all these 15,000 observations, we're putting them on top of a DEM, and we're producing this flood map on the basis of a HAND, height above nearest drainage, approach. You can read about it in this article by Dirk Eilander of Deltares. It's really interesting and really inspiring, because from this you could get to actual real-time flood maps for those areas that have a DEM.

So we tested it for another area as well, in York. In York there was a flood in 2013, and a lot fewer tweets; we only had 8,000 tweets in total. So we were also asking: how many tweets are we going to need to produce this? 8,000 tweets only. We had the DEM and did the research, also using the HAND approach, coming to this output. This is York again, and in green you see the flood extents that we got from it that are actually correctly flooded. The light blue and the orange are where we're off. So the similarity is really striking. It's in a paper by Tom Brouwer, my colleague, available on the website as well. For areas with DEMs this is really interesting.

Another analysis that we can do with it is to really dive into the data. We had this use case in Pakistan: big floods in 2014. We were talking there with the Red Cross as well, and they said: could we have known that this flood was going to be there before we were actually informed about it? And then we did something. We said: well,
this is the river: the Jhelum, Chenab and Ravi rivers. They have a lot of dams and reservoirs, and we figured: let's look at all the tweets about these dams and reservoirs. And so we found that for Kashmir we already have one peak here, for Punjab there was a peak here, in Jhelum there's a peak here, and so we could follow the propagation of a flood to where it finally happened. This is the flood happening in the upper regions of Pakistan, and here, somewhat later, the disaster managers being informed about it. So could they have known? Probably they could have, if the people here had informed them. They hadn't; they put it on Twitter. And looking at the statistics you see that something is coming. Interesting.

On a different note, this is the breach of a dike, also in Pakistan. It's a long story, but there were a lot of tweets about a possible dike breach. Finally the dike was breached, only the people behind the dike were not informed. Of course, whoever was breaching the dike knew they were going to breach it, and the other people could have known that something really bad was about to happen, from all the signs of people who had been saying something about it, expressing their worries, et cetera. There's also a paper about it, by Brenden Jongman of VU Amsterdam.

Okay, so to wrap up the analysis: what we produce on the basis of this information, our final output, is the identifier (is it a flood, heavy rain, a typhoon?); the location at various admin levels; the time, which is start and end time; the number of people affected, evacuated and deceased; damage to houses, embankments and crops; and all kinds of other circumstantial information that you can get from the media. We can place that on a map, and if we're lucky we can even produce flood extent maps.

How do we share that data? Obviously via a front end, via a table, but most of all via our API. And this is our collaboration model; this is how we see the world of
data suppliers collaborating with each other, and how we do it with our partners as well, I suppose. This is FloodTags. We have some end users to whom we deliver our content directly, and we've got an API going to another organization, which has its own end users and an API going to another organization, et cetera. I think this is a really useful and effective way of collaborating with each other. So you see the data suppliers collaborate in diverse setups to deliver content to end users, and I think a good collaboration is one where it doesn't matter which party has the contract with the end user. I'm working a lot with Deltares, and it shouldn't matter if I've got the contract with the end users and I'm hiring Deltares, or vice versa. The data flow is in the API, and if you've got a good business concept there, then you should be able to meet all your end users really flexibly.

Okay, back to the use cases. The Nature Conservancy came to us. They do evidence-based advocacy for mangrove restoration in Semarang. They wanted to compare flood management time series with nature development time series. They said: what are flood management time series, and can you help with that? We're getting that from Twitter in a similar way, with precision up to 80 percent. We're delivering this data to them via our API, because they've got their own websites and dashboards to look at it.

Then we're coming to Tanzania again. We had been working on Twitter, and of course we were also asked at that time whether we could do the same for news media. The use case here was that TMA, the Met Office in Dar es Salaam, wants to move to impact forecasting, and we had the Red Cross, who want to know more about the possible impact at certain forecast levels. So how can we match the hydrometeorological conditions to what's actually happening? What we did is look up news articles. We collected 145,000 news articles from various sources within
Tanzania, and using our approach we were able to detect 175 flood events over a 10-year period, with columns like the location (which province), the start time, the identifier, the damage, the number evacuated and the number of people killed. So we have this history of floods. Then Deltares came and said: let's look at the history of hydrometeorological conditions during that exact same time. So they did, and then we analyzed the relation between those hydrometeorological time series and our data. Then Deltares created a simple web app where you can fill in the hydrometeorological forecast, and what you get is the situation in the past 10 years that looks most similar to it. So now the Red Cross can see: oh, this looks a lot like 2012, and then this and this and this happened. I don't know if it's going to happen again, but at least I've got some grounds to build my response on.

Yesterday there was a lot of talk about parametric insurance. The World Bank is doing a feasibility study on whether parametric insurance could also be introduced for floods in Myanmar and Lao PDR, and they asked us whether online media could play a role there. I wasn't really enthusiastic straight away, because Myanmar is really a country where there's no Twitter, and internet penetration is still relatively low. But it was a challenge to see how much value we could get from that. So we did the same, for one event, July to August 2018. We found 228 articles about floods, and here you can see them placed on a map. Oh, I've got to tell you: we wanted to compare that to the actual flood extent, and the actual flood extent for Myanmar is also really difficult to get. We got a dataset from [unclear] that we started to use, and here we see in green everything that
[unclear] detected and we also detected; in dark blue, what we detected but [unclear] did not detect; and in light blue, what we missed, although we do have a lot of observations in the light blue as well. So this is our part of the analysis: how we can make these areas green as well on the basis of these dots. In the end, where we are now is that 71 of 109 locations that were flooded were indeed captured by the online media. We're working on the geoparsing of the Myanmar names, which was really setting us back a lot, because that's where the uncertainty is.

How much time do I have left? Okay, cool. All right.

So we've been working on floods for quite some time; it's getting mature, and we were asked to start working on droughts as well. This is about Mali. Mali Météo is our counterpart. They want to do drought forecasting, or rather they're doing it and want to improve it, but they don't have a sound history of droughts and drought drivers in Mali. So they asked: can we do something with online media, together with remote sensing, to get a large database? So we're working with [artificial] intelligence on remote sensing imagery, and then the question is, of course, how droughts are perceived on the ground, because if it's dry it doesn't need to be a drought. And then what's happening exactly? Does it mean crops are lost, does it mean that water wells are dry, et cetera? That's content that we can capture from the news media, creating a full database and helping Mali Météo improve their forecast.

This was my last use case for now. I've got one slide about another project that we've done. I don't know if anybody recognizes this? Perhaps Simon or Albert? No? Okay. So, a bit of history of Holland. We had a big flood in 1953; big means 1,600 people killed. During the flood we had caissons to close the embankments.
Rijkswaterstaat, Public Works, went on to sink caissons to the bottom to make sure the flood didn't proceed. In these caissons there is now a museum. If you're ever in Holland, do visit the museum. It's a museum about the history of floods and how the Dutch have been working on floods. As you approach the very last room, in caisson number four, you will find these transparent sheets with tweets on them about the daily floods, about what's happening somewhere in the world. These tweets are coming from us, and I'm really proud that we're in a museum. It's just really nice, because it shows, after going through this whole museum that's about history and how terrible it is, that every day there are about 20, 30, 40 floods happening, and that it is really important that we do something about it.

Summarizing: there's a lot of public online media available. The features of this information are that it's really timely; Twitter, for instance, is out there immediately, so you can use it for detection within minutes. It's from the ground, so it's facts, and it's also how things are perceived on the ground. It's very effective especially in urban areas; in rural areas as well, but the spatial resolution will be a little lower there. There's historic information available, at least for the news media; it really differs, but on average it goes 10 years back. The availability of the media and the accuracy that we can get is really variable per region. You can see that here for Jakarta and York, where we have an amazing accuracy, and Myanmar, where we are still improving the accuracy. And finally, it can be used for real-time alerting, for instance detecting false negatives of an existing system; for real-time confirmation, taking out false positives of an existing system; and, if you have the history, for historic analysis, like threshold setting for impact forecasting, which we do, and forecast-based financing,
which we did in Uganda; model validation; baseline studies; trend analysis; problem analysis. This is what we're doing, and if you have any more ideas about that I would love to hear them, and perhaps we can chat about that in the next session, the breakout session.

That was FloodTags. Thank you so much. This is truly amazing. I mean, to me, amazing; I say that because I'm aware of that technology, but I had no idea you got this far in so little time. So great, thank you. Are there any questions? Maybe one or two quick questions. Yes?

Yeah, we've got to build up the use case. Then we've got to meet and see what the use case is exactly, and then we need to see how much we already have. For the UK we've already got something. For the Philippines it would be easy. For America there is a lot of data, but we need to set it up, and then we can. South America, now, it depends; it really depends on the use case. If you want real-time information, then for instance Colombia is wonderful and Bolivia is not so wonderful. If you want historic information, then anything goes, because the news history for all those countries is pretty good. So it really depends on the use case that you have, but then we can have a look, prepare the software, and then, you know, via an API serve out the data.

Yeah, right, you had a quick question.

That never happens... no, that happens a lot, of course: the fake news. We've got three steps of verification. First is the natural language processing. For a single tweet, we look at whether this is a tweet that sounds logical. If someone purposely wants to bring in fake news, it won't help you, but if someone says "oh, someone is flooding my timeline, ha ha ha", then we can discard that tweet. So we're looking at the natural language processing for a single tweet; then we have a probability of it being true. Then we look at other tweets, of
course, because we never have only one tweet about an event; there are always tens, hundreds, and usually thousands or tens of thousands. So we look at whether the information is confirmed by other independent sources. That's a really important step. The third thing that we do is combine it with external information. I didn't get around to telling it, but we're also correlating the location with the precipitation that we have at that location or further upstream, and obviously, by combining with remote sensing, we can also give a final validation there. So these are three steps. The fourth step is that someone is actually going to look at it manually, as the Red Cross does in the Philippines: a call to the chapters to say, well, I've got this really strange news, only one tweet, only two tweets, what is this about? And they need to validate it like that.

Quick question?

Okay. So our business model is mainly... there are two sides. There's the development part: we are being paid to develop software and configure the software for new use cases. And we've got the hosting and maintenance side, where we do the hosting and maintenance for a fee. The software itself, if you're a user, you can get for free, so you can also do the hosting and maintenance yourself, or you can take a contract with us to do the hosting and maintenance.

Great, thank you. Okay, and if there are more questions, I think you could take that, yeah, at the coffee break today. Okay, cool. And Jurjen is also leading one of the breakouts, but don't all just go to his breakout; just go to the breakout you signed up for, right? So the next talk, to move on quickly, is Jim Michael Luciano, Department of Transportation. I'm very glad