Thank you. Hi, everyone. I'm Raymond, from DataKind. I'll be talking about the work we do at DataKind, together with the other volunteers here with me. This isn't our day job; we actually have those. I'm from Lazada and Wei Yang is from NUS. We do this outside of our regular work. Okay. So today I'll be talking about using pro bono data science to make the world a better place. I'll start with a quick introduction to what we're going to talk about today: first, what DataKind does; then a little about data privacy and data ethics; and then we'll go through two or three projects. The first is O'Joy; then Bagi SG, if we have enough time; and the last one, which Wei Yang will present, is the Raffles Banded Langur project. Okay, let's start with what DataKind is. Basically, DataKind is about using data science for social good. What we actually are is a community of data scientists who volunteer their time to help social organisations with their data problems, and what we try to do is make sure they can use all the data they have to improve their social impact. It will become clearer once we actually go through the projects. Okay. So DataKind is not just a Singapore thing. We're actually headquartered in New York City, and the organisation has been around for seven years. We also have a chapter in the UK; in Asia, we have Bangalore and Singapore; and in the US there are also the San Francisco Bay Area and Washington, DC chapters, which do work there as well. Everyone outside New York is a volunteer working on these projects, and I'm one of them. We provide our services for free. HQ does project work too; they actually have paid staff, unlike the rest of us, and they run funded projects in the US. But outside the US, the work is done more or less entirely by volunteers.
So, the way we work: in Singapore, we mainly run what we call DataDives. Before a DataDive, there's a data preparation phase where we work with the organisation on their data: cleaning it, anonymising it, and doing everything that needs to be done to make the data ready for analysis, because that is usually the bulk of the work with any dataset. We do this over the two to three months beforehand. Then we have the DataDive itself, which runs over 2.5 days. It's like a hackathon, but without the competitive part: we come together as a community with the prepared data and work through the questions raised during data preparation. After that, we package up the outputs and hand them back to the organisation, and hopefully they can use data in their day-to-day operations to improve their social impact. Finally, we have DataCorps, of which we've only done one so far. This is a much longer process. The DataCorps we'll talk about today ran over roughly three months, where we worked with a smaller team of volunteers, at about five hours per week, to go deep into the data problem and come out with a much more in-depth solution. Wei Yang will talk about that in detail later.

Okay, so before the projects, I want to take a short detour to talk a little about how we actually approach our data analysis, because, as you can imagine, data like this is very sensitive. So whenever we do data analysis, we take data privacy and data protection very seriously. This is our standard checklist when we do our data work, and we always go through it so that every volunteer understands the risks involved when we handle the data. The first item is consent and purpose. We take this very seriously: we don't want data to leak out of the engagement, and we want to make sure the data is really used for the purpose it was collected for, with consent. For example, when people sign up for a service, you don't want to go and take their email data and use it for something they never agreed to.
Actually, that matters a lot to us; we want everyone's data to be used only for its intended purpose. The next idea is that we want our data work to be, more or less, programmatic, so that it can be re-run: we can redo what we did and update it. And I'll say this again and again, because it is very useful. When an organisation comes back to us with more data, if our data work is repeatable, they get more information out of it, and they can come back again with even more data. We want that to happen over and over. So for us, reproducibility is a very important principle, and we try hard to make our analyses reproducible. So, as a volunteer-run organisation working with non-profits, we want to address all of this. Essentially, the aspiration is this. We want all analysis to be reproducible: you have a record of every step of how you actually arrived at the analysis, all the transformations, all the data cleaning, all the models you used. We want the results to be reproducible, so that you don't go off and do analysis that the organisation can't reproduce. We need to protect privacy, so whenever we do data analysis, we actually review the analysis together with the organisation to make sure that none of their personal information leaks. And finally, the last two items are isolation of data from models, and independent verification. The idea behind these is that we can take a model and establish clearly what is going on inside it. We can check the model for things like bias, and all the different issues that can have consequences downstream, and we can also inspect the data separately if we need to. Or, if the data is too sensitive, we can isolate it and make sure it stays isolated. So we can verify these two pieces independently to make sure everything is in order. And of course, if anyone here has a stronger approach to handling data ethics in data work, please come and tell me too. But this is what we always do, and we brief our volunteers on it so we are all on the same page.
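To make that aspiration concrete, here is a minimal sketch of what "every step recorded and re-runnable" can look like in Python; this is not DataKind's actual tooling, and the file and column names are invented for illustration:

```python
import pandas as pd

def drop_free_text(df: pd.DataFrame) -> pd.DataFrame:
    # Free-text fields can hide personal details, so drop them early.
    return df.drop(columns=["remarks"], errors="ignore")

def parse_dates(df: pd.DataFrame) -> pd.DataFrame:
    # Normalise the (hypothetical) referral date into a real datetime.
    df = df.copy()
    df["referral_date"] = pd.to_datetime(df["referral_date"], errors="coerce")
    return df

# Every transformation lives in this list, in order, so anyone can
# re-run the exact same cleaning on a fresh export of the data.
PIPELINE = [drop_free_text, parse_dates]

def clean(path: str) -> pd.DataFrame:
    df = pd.read_csv(path)
    for step in PIPELINE:
        df = step(df)
    return df
```

The point is only that the cleaning lives in code rather than in manual edits, so the organisation can hand over a fresh export later and get the same analysis back.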
So, let me go through a quick example from one of the projects we did. This is O'Joy. What O'Joy does: they are a community health intervention unit, which basically takes patients from the community and provides counselling for mental health. So, as you can tell, the data that comes in is going to be fairly private: you're talking about the mental health issues of patients. So, what we did for them: firstly, we got the data pre-anonymised. But because many organisations are not super data savvy, we actually went through all the data again to make sure there was no PII, personally identifiable information, in line with the PDPA, in any of the fields provided. We did this programmatically, to make sure that names were not there and IC numbers were not there. Then we removed the additional fields that actually had this kind of information. For example, in this case we had things like remark fields, which potentially contained, not so much dubious, but personal information. We removed things like that to make sure that none of this information got out. And then we basically opened the data up to the community under a fairly lenient NDA so that people could analyse it. So this is our usual anonymisation slash personal privacy pipeline.
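To illustrate the kind of programmatic check just described, here is a small sketch; it is not the exact script used, and the file and column names are made up. It scans every field for Singapore NRIC-style IC numbers; a real pass would also check fields against a roster of client names:

```python
import re
import pandas as pd

# Singapore NRIC/FIN numbers look like S1234567A: a prefix letter,
# seven digits, and a checksum letter.
NRIC_RE = re.compile(r"\b[STFG]\d{7}[A-Z]\b", re.IGNORECASE)

def flag_pii(df: pd.DataFrame) -> pd.DataFrame:
    """Return (column, row, value) for every cell matching an NRIC pattern."""
    hits = []
    for col in df.columns:
        for idx, val in df[col].astype(str).items():
            if NRIC_RE.search(val):
                hits.append({"column": col, "row": idx, "value": val})
    return pd.DataFrame(hits)

df = pd.read_csv("referrals.csv")   # hypothetical export
print(flag_pii(df))                 # anything flagged gets masked, or the field dropped
```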
So, the end result is that using that data, we could create a dashboard to facilitate the running of the organisation, the idea being that we could do things like reduce the number of poor referrals, with the final goal of improving the wellbeing and mental health of both the patients in the institution and the helpers and caregivers who were with these people. So this is the final result. Unfortunately, as I mentioned previously, this was actually done in Tableau, which is unfortunately one of the worst tools for reproducibility, because the data is stuck inside the analysis. We really dislike doing that, but because of time constraints, and the fact that when we showed the prototype the nonprofit was like, wow, this is so wonderful, they kind of glossed over all the reproducibility issues. So what happened is that we did the analysis for them, and they just wanted to look at the dashboards and not do anything reproducible. But nonetheless, you can get quite a bit of insight. For example, rapid dropouts were one of the main things they liked a lot: which of the sources of referrals were actually causing issues. A rapid dropout meant that the patient didn't turn up, or turned up for a single session and then fell out of the system, when you would expect them to come for maybe about six sessions. If patients just drop out, it means the referrals are not really matching the organisation, and you want to reduce issues like that, because each case does take a fair bit of time. Okay, I think I'm going to skip over the next section and pass it over to Wei Yang. Oh, and a quick thing just before we leave this: these are the people who were involved in that particular project. Kevin, and Jeremy, who is here, helped a lot with organising the entire dataset. Paul is around, doing all his Docker work and trying to get things reproducible. And on the non-profit side, the sponsor, Jin Cat, was very, very keen on just getting the dashboards out. Okay, so I'm going to skip over my next project and pass it over to Wei Yang.

Okay, so I'm going to talk about the Raffles Banded Langur project that DataKind did with what we call the Raffles' Banded Langur Working Group, the RBL WG. So, this is a langur. You may not be very familiar with it; it is not one of the most common species that you see in Singapore, the very common ones we don't really care about too much. But this particular subspecies of monkey, the Raffles' Banded Langur, is estimated to have only around 40 to 60 individuals left in Singapore, so it is critically endangered here. In order to tackle the conservation issues with these langurs, the RBL WG was set up to try to understand the biology of the langurs and also to protect their habitat. So, what happened was that the RBL WG contacted DataKind to understand a few things about what they could do with their data. The first question: how can data and technology be used to derive insights and tools for conservation? The kind of data they provided to us was, for example, sighting times, locations, and activities, that is, what the monkeys were doing, and also the tree species they were sighted in. They wanted tools, or other kinds of analysis, to understand how the langurs behave: whether there is any relationship between the monkeys and particular tree species that they like, and the locations they are in. The other question they wanted to answer: how can langur photos be used to estimate the langur population using machine vision? That was a much, much harder problem, probably one of the harder ones DataKind SG has faced, which is why we decided to escalate it to a DataCorps, an intensive two-to-three-month engagement. Because of the short time frame, the volunteers came down for about five hours per week to tackle this problem. What we had was essentially 1,000-plus photos taken by volunteers, and we had to figure out whether we could extract any useful information from them. To estimate the langur population using photos, you have to identify individual langurs out of these 1,000-plus photos, which is a very difficult problem to tackle; I will talk about those issues separately. The way we approached this was, first, to design a geo-visualisation dashboard: a simple tool from which we can derive insights and statistics that the RBLWG can use to understand more about the langurs. We also developed a langur zoning method, which we used to estimate the langurs' ranges and where they move around. And in terms of the photos, we eventually decided to do up a web app, which we affectionately call Tinder for Monkeys, to essentially crowdsource the labelling of the monkey photos. So, this is the dashboard that we built for the RBLWG. We have a map here with the various langur sighting locations, coloured by the activities the langurs were doing in the different areas: some of them may be feeding, mating, or defecating in certain areas, or in certain tree species, perhaps. Using these few simple features, we can already identify their favourite locations and what they are doing there.
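The heavy lifting here is done by the dashboard tool, but the statistic underneath is simple. A hypothetical pandas equivalent, with invented file and column names, would be:

```python
import pandas as pd

sightings = pd.read_csv("langur_sightings.csv")   # hypothetical export

# Count sightings per (location, activity) to surface the favourite
# spots and what the langurs were doing there.
summary = (
    sightings
    .groupby(["location", "activity"])
    .size()
    .reset_index(name="n_sightings")
    .sort_values("n_sightings", ascending=False)
)
print(summary.head(10))
```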
So, the tool that we used was Power BI, because one of the constraints from the organisation was that all the tools had to be free, and Power BI is one of the tools that allowed us to do this quite effectively with free software. Of course, the locations shown here are all anonymised. So, what kind of insights can we draw from this? Things like road crossings: some of the langurs are in areas where roads pass through the forest, so there can sometimes be roadkill, and the organisation really wanted to prevent such issues. Because of that, once we visualise this, for example if you see langurs crossing a road here (of course, this is not the real location, they are not in Lim Chu Kang), then when they see that the langurs are frequently spotted around areas with roads and fast-moving cars, they can start to think about how to tackle the roadkill issue. There are only about 40 to 60 left in Singapore, so this is very, very essential to inform policy design. In the future, we intend to add more data points and more kinds of information to the dashboard, like the tree types and other behaviours.

The next thing that we did was to estimate langur zones. Based on the data points we collected, we wanted to estimate the hotspots: essentially, the zones that we really want to conserve. Based on the sightings, we used R to do a kernel density estimation. Using the KDE method, we estimated the probability of the langurs roaming around each area. The yellow patches that you see here are areas with a high probability of observing langurs, and the blue ones have a lower probability. Some of the insights we can get out of this: we realised that there are several distinct hotspots, so we actually have different groups of langurs, not just one group roaming around the forest patches. We can also further break down the zones by activity. For example, they like to feed in certain areas, and when they are travelling, you sight them in various other locations, so maybe they do one activity in one place but need to travel to do other things. Their playing areas are also different. So they actually have preferences for certain areas.
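The zoning itself was done in R, but the same KDE idea can be sketched in Python for readers who want to see it run; the coordinates below are invented, and scipy's gaussian_kde stands in for whatever R package the team actually used:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical sighting coordinates (longitude, latitude).
points = np.array([
    [103.78, 1.38], [103.79, 1.38], [103.78, 1.39],
    [103.90, 1.41], [103.91, 1.41],
]).T                      # gaussian_kde expects shape (n_dims, n_points)

kde = gaussian_kde(points)

# Evaluate the density on a grid; the high-density cells correspond
# to the "yellow" hotspot patches on the zoning map.
xs = np.linspace(103.75, 103.95, 100)
ys = np.linspace(1.35, 1.45, 100)
gx, gy = np.meshgrid(xs, ys)
density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)
print(density.max(), density.min())
```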
Now, the last problem that we faced was creating the web app, which was worked on quite intensively by our volunteers. We call it Tinder for Monkeys. It is not exactly like Tinder: we are not matchmaking people with monkeys, but trying to matchmake two different photos. If two different photos are of the same monkey, you can click on it and indicate whether you are very sure that these are actually two photos of the same monkey. Essentially, by doing that, we crowdsource the labelling of the photos, and then hopefully, after a round of data collection, we are able to make use of machine learning to build a model to classify the photos of these monkeys. So this is the link here, rbl-classify.herokuapp.com; if you are interested, you can check out this web app that we built, go over there and label the monkey photos, and help us to do monkey classification. Some of the design considerations we went through: should we show two monkey photos at a time to collect the labels, or maybe up to ten, so that the volunteers can classify multiple monkeys at the same time? The volunteers who help us do this classification are also tight on time; this is volunteer-based, and we want to extract as much information as possible within a short time frame. If you recall, we have about 1,000-plus photos, and classifying 1,000-plus photos pairwise means roughly half a million pairs that we actually want to matchmake (1,000 choose 2 is 1,000 × 999 / 2, which is 499,500 pairs). So, depending on the volunteer base and their interest, we wanted to optimise this process. Including more photos per screen is better for data collection, but we eventually decided to show two photos, because we realised that if you show ten pictures of langurs, they start to all look the same, and that becomes not so good for our data collection. The other point was whether we should include a confidence scale: on a scale of 0 to 10, how confident are you that this pair is actually the same or different? We eventually decided that we would not include a confidence scale, because of bias: a five might mean something for you but something else for a different person, so we did not want to mess up the data with that. And then, of course, the technical challenges included having to use all-free platforms. So we used Power BI for our dashboard, R for the zoning problem, and for this we decided we would use Heroku for hosting the web app. Heroku actually provides the first web app free, and for the database there is a PostgreSQL database which is free for 10 million rows; if we exceed that, then we just pay $7 per year, which is okay. Then we have a Vue.js web framework and Django for the microservices. So essentially, it is a very free, very low-budget way to develop a whole web app for pro bono services. Then we had other technical challenges, like user experience: photos are heavy on communication cost, so we had to scale down the photo sizes and resize everything so that we get a better user experience. For most of our volunteers, this was their first time building a web app too, and they also had time constraints; we built all of this within a two-month period. So, finally, this is the group that helped us achieve everything. Thanks to the very enthusiastic crowd here: we have the dashboard team; then the group that did the zoning, including Jeremy, and Raymond of course; and then the other team here that did up the web app within the very short time frame of two to three months. Thanks to them, we were able to do all this analysis. And finally, these are our sponsors. For any of you here who are interested in volunteering your data analysis skills for pro bono work with NPOs, we are looking for product managers, data scientists, and data engineers who are enthusiastic and able to help, so check out our website.
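For a feel of that matchmaking setup, here is a hypothetical sketch of how candidate pairs might be sampled for a labelling session; this is illustrative only, not the app's actual code:

```python
import itertools
import random

def candidate_pairs(photo_ids, k=20, seed=None):
    """Sample k photo pairs to show a volunteer in one session.

    With ~1,000 photos there are ~499,500 possible pairs, far too
    many to label exhaustively, so each session sees a random sample.
    """
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(photo_ids, 2))
    return rng.sample(all_pairs, min(k, len(all_pairs)))

print(candidate_pairs(range(10), k=5, seed=42))
```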
Q: Looking at the photos in the app, they show just the monkey itself. Did you have to crop the monkey out of each photo?

A: No. What we found was that many of our photos were taken with the langur as the subject, so they were already centred. All we needed to do was resize them, make them smaller, and apply a centre crop on the web side. That worked out quite nicely for us.

Q: Doesn't the crop risk cutting the monkey out of the photo?

A: I don't think the crop actually loses the langur; it just crops towards the centre, and the langur is already the main subject of the photo.

Q: How varied are the photos?

A: Well, of the photos I have seen, they look like this one, or like this one, so they look fairly similar.

Q: So you just crop about 20 per cent off each photo?

A: Yes, you just trim a bit from each edge.

Q: You just mentioned that the people labelling the photos are volunteers, so the labels are crowdsourced. They might mislabel some pairs, and that will feed into the AI model.

A: Yes, that is a risk; we will have to see whether we can still build the model on top of it. Maybe. And more photos will also keep coming in.

Q: How many photos do you have? Do you have enough?

A: Honestly, no, we don't have enough photos yet, because we have only been collecting them for a while. So we will have to see. Thank you.
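For reference, the resize-and-centre-crop described in that answer might look like this with Pillow; the target size and quality are assumptions, not the project's actual settings:

```python
from PIL import Image

def center_crop_resize(src: str, dst: str, size: int = 400) -> None:
    # Crop the largest centred square, then shrink for faster page loads.
    img = Image.open(src)
    w, h = img.size
    side = min(w, h)
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((size, size), Image.LANCZOS)
    img.save(dst, quality=85)

center_crop_resize("langur_raw.jpg", "langur_web.jpg")   # hypothetical filenames
```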