My name is Clarissa David, and I am presenting on behalf of a relatively large team of researchers based in both UP Diliman and UP Los Baños. This research is supported by the EIDR grant of the OVPAA, and a portion of it was also supported by the outright grant of the OVCRD of Diliman. The title of our research is Science and Society Applications in Public Education.
What we were really interested in is the use of data and how we can make it more relevant to the constituencies of public issues, as well as promoting the practice of data-driven analysis and the practice of science research, because it really needs public support. What we need to do is create a larger constituency for data-driven problem-solving as well as data-driven policy-making. I guess in the beginning I had assumed that everything that government does is data-driven and, yes, there are laughs around the room, I was immediately awoken to the problem of data availability. One of the main issues in working with government policy problems is the availability of data. So what drives a lot of our research is mostly what is available.
Scientists are in the best position to change the world when their work is accessible to the public and policymakers. So part of the challenge for scientists is how we translate the work that we do to make it more relevant, because we know it's relevant but other people look at it and don't really see how it's relevant. Whose job is it to translate all the technical material into something that people in the field can really apply? In the academic research community also, we're used to waiting for all data to be analyzed and papers sent out for review. We wait several months, it's published, and before it goes public, so to speak, this can take anywhere from one to five years depending on your field of study.
If you're in the hard sciences, it could be as short as one year from the time you start an experiment to the time you see it in print. If you're in the social sciences, it can take you up to five years because the review process for academic journal publications is really, really long. The issue for me, when you're engaged with people who are actually out there in the field trying to fix problems, is that the world turns while we're waiting for journal publications. The field will move forward, and the longer we wait, the harder it is for us to make our research relevant, because sometimes, by the time our research is published, it's not relevant anymore. So we need to adjust to the pace of the world in order to keep ourselves and our work relevant.
And whose responsibility is it to bridge the gap between the technical studies and the interested publics? For people in the communication field, which is where I am, it's really the job of journalists and science communicators. Not all scientists should be spending their time communicating to the public, because what they do is science. We shouldn't distract them too much from doing the actual science. We have a layer of communicators whose job it is to understand what the technical studies do and then communicate it to important stakeholders. So, in this case, I will be describing how we do it in the evolving program of work that we're pursuing under the EIDR grant.
We started by looking at where we can find the most data, and we had a special interest in working on issues in education. Who has the best and most organized data sets? In public issues and working with government data, the vast majority of government data is not ready for analysis. You can't just ask people for data and expect that you won't have to do any processing, cleaning, et cetera. Sometimes it comes to you as photocopied papers and you have to encode it.
So, in order to move ourselves forward faster, we looked across agencies and found that the Department of Education has the best, most organized data sets. The administrators currently in DepEd are interested in collecting and using data for planning, and they have a really progressive open data policy. You just ask them and they give it to you, and that's very rare to find. So we created a relationship of cooperation with the agency to explore mutually beneficial working conditions, where we are able to do analysis and publish research and we provide them with outputs that are actually useful for their work.
By way of background, the Department of Education has a massive scale of operations and, as a result, a massive scale of data. There are over 46,000 schools and over 21 million students at any given point in time, and there's over half a million staff including teachers, administrators, managers, school nurses, et cetera. And unlike other social sector agencies, the Department of Education is still centralized. The Department of Health is devolved, the Department of Social Work is devolved. So the data that are available for those areas in the social sector are devolved, but in DepEd they're centralized. We just go to the national office, the central head office, and they have all the data. They collect it electronically over the web, it's automatically encoded and goes into a database, and it's a matter of querying the data. So it allows us to move more quickly into data analysis, and what we were interested in is looking at analytics, mapping, and how we can connect it to policy.
The research program of the EIDR grant includes three groups in a very multi-disciplinary, collaborative environment. We have one project that looks at external pressures on the education system. Our colleagues in the College of Science, led by Dr. Rene Batac of NIP, are doing an analysis of DepEd data, looking at disasters and their impact on school performance.
And what we're trying to do there is figure out the best predictors of school performance. In the College of Mass Communication, we have a data journalism aspect to it, which is using data-driven journalism to develop students' skills in analytics and reporting, but also doing data analysis in the service of public interest reporting. Journalists are used to looking at data and seeing where the stories are, where the issues are in particular areas of the country that we should be focusing on because they have particular governance problems, et cetera, and it becomes a journalistic story. So it's a completely different lens. In UP Los Baños, we're working with the College of Development Communication, and they're doing science communication with the same dataset. They aim for evidence-based, science-based policy and program decision-making in local offices such as division offices of the Department of Education, sometimes as local as working with principals in schools. What they're trying to do is find the best ways to apply scientifically determined solutions through the system in more local settings.
We decided to focus. The data are massive: we have data for all the inputs that go into the schools, the number of teachers, number of students, books, toilets, electricity, water, et cetera. We have all of that data there, but we knew that we had to focus because otherwise it would take us too long to get from swimming around in the data to something that's actually useful. We decided to focus on understanding disaster risk reduction (DRR) in public schools. In particular, what are our vulnerabilities? Where are they? How can this information be useful for actual planning purposes? The country, which is not news to any of us, is vulnerable to many disasters and different kinds of hazards. And we have a large public school system that's historically strained for resources. We are also geographically unique.
Because we are an archipelago, we have portions of the country that are more vulnerable to floods and typhoons. We also have portions of the country that are more vulnerable to volcanic eruptions. And then we have issues of supply chain management, for example, of bringing inputs to certain areas of the country that are mountainous. So the challenges are varied because of how geographically dispersed we are.
We wanted to understand how vulnerable schools are to disasters and whether these have impacts on performance. We looked through the published literature and found that there really is not a lot written on it, because the data have not been available. A lot of what has been written is based on surveys of students. The studies focus on issues such as the impact of, say, Hurricane Katrina in the US on the mental health of students, or of the Fukushima disaster on the mental health of parents in schools. So this is a unique opportunity to look at administrative data at the school level for a large data set and see if we can connect exposure to hazards to school performance. So this is our initial question.
We started by looking at three sources of data from the Department of Education. The first is the Basic Education Information System (BEIS), which is a questionnaire that takes two days for all principals to fill out. This is where they tell DepEd how many students they have, in which grades, how many are girls, how many are boys, how many books, how many chairs, how many rooms, etc. It's all reported here. But in the same system, in BEIS, they also started asking about schools' experience of disasters. So the data for hazards that we have are reported by principals. They're asked, for the last school year, how many times were you flooded? How many times were you hit by typhoons, armed conflict, etc.?
So it's a battery of questions about different kinds of hazards, including man-made ones. We also have data on National Achievement Test scores by school. The National Achievement Test (NAT) is a nationwide test that's conducted for students at grade three, grade six, and fourth year high school. Maybe some of you still remember taking the National Achievement Test. But this is also school-level data. So we have 47,000 test score results at the school level.
At the same time, this administration of the Department of Education started collecting location data for public schools. At the time that we were asking them for data, they had 80% mapped. Now they have more than 90% mapped. We haven't updated our data, but that's where they are. So imagine, it was only in the last five years that they thought about getting coordinates for schools. In other words, before these last five years, people had no map of where the schools are in the country. They were looking at enrollment figures and shortages in inputs but hadn't really looked at what it looks like on a map. So they have started doing that, to the credit of the Department of Education administration.
So the first thing that we did was to map the schools. If you map all the schools, the entire map of the Philippines would be covered in dots. So we're only showing you maps of schools that reported whether they experienced particular kinds of hazards. There are some things that are very clear from the beginning, which is that the geographic picture will first tell us where the most vulnerable areas and the most vulnerable schools are located. In areas where there are a lot of dots, that means a lot of schools reported being used as an evacuation center at least once. In terms of flooding, you see Pangasinan, Metro Manila, and Laguna, which are often flooded, and portions of Albay and Leyte.
So just by mapping it by different kinds of hazards, you can see that if we map armed conflict, the dots are in the south and North Luzon. If we map floods, you see that schools in highly populated areas are more flood-prone. And if you compare the maps of flooding and evacuation centers, you'll see that the schools that are often flooded are often not used as evacuation centers, because evacuation centers are the ones that need to be safe from the hazards. So that was the first step.
The next step is to look at whether experience of hazards is related, at least on the surface level, to school performance. The school performance outputs that we have are NAT scores. If we look at the national level, we don't see any correlations, because there are 47,000 schools. Just by scale, even if 10% of them experience a hazard, when you run a correlation it will get drowned in the data. So we look at it sub-nationally. Also, there are areas of the country that are just not prone to any hazards, and different kinds of hazards are important to different areas of the country. So it's important to look at it at the sub-national level, in our case the provincial level. For example, for Cavite we found that schools that are more exposed to each hazard generally have lower national achievement scores than the rest of the country. And in the data we can actually see, for example, that in Cavite alone there are six schools that had six floods in one year. Buhay na Tubig Elementary School, Buhay na Tubig is the name of the school, is always flooded. We have data that's this specific, down to the individual elementary school. And then we have two schools in Cavite alone that were used as evacuation centers five times in one year. That was a policy issue for us, because DepEd actually has no power over which schools are selected as evacuation centers. It's the mayor. It's the local government unit (LGU). And DepEd really can't say no.
So it is a problem for the school division to try to lobby with the LGU: if you have to use our school so often as an evacuation center, maybe we should start building an evacuation center for this particular purpose. It's true that after you have a group of people evacuated into a school for maybe three or four days, they really do use the chairs as firewood, and then the principals have to put the school back in working order. It really does disrupt student performance, or at least that's our hypothesis.
At the sub-national level we can also map and look more closely at a specific area. So for Cavite this is what it looks like for floods. We generated these maps at the provincial level and gave them to DepEd, and they used them for their DRR officers' workshop across the country. So while we don't have papers to publish yet, in the interim, while we're doing the research, if there are opportunities to provide DepEd with outputs that will help them do their work, that's what we do. We had the team generate provincial-level maps that are high resolution. They can zoom in and see exactly which schools, by name, reported these hazards, so that if you are in the division you can really look at which schools need attention for DRR.
We do the same for Laguna. For Laguna the maps are interesting because you can see Laguna Lake, and around Laguna Lake those are the schools that were flooded. We don't show the dots of schools that are there but were not flooded. These are just schools that experienced at least one flood. So if you look across the country, you'll find there are geographic features that really indicate hazard vulnerabilities. In this case, if you're next to Laguna Lake, you're more likely to be flooded.
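To make the provincial-level comparison described earlier concrete, here is a minimal sketch of running the hazard-versus-score correlation within each province instead of across the whole country. It is not the team's actual code. The field names (`province`, `floods`, `nat_score`), the helper functions, and the toy records are all invented for illustration; no real DepEd values are used.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # No variation (e.g. a province where no school reported a flood).
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def correlation_by_province(schools):
    """Group school records by province and correlate floods with scores."""
    groups = defaultdict(list)
    for school in schools:
        groups[school["province"]].append(school)
    return {
        province: pearson(
            [row["floods"] for row in rows],
            [row["nat_score"] for row in rows],
        )
        for province, rows in groups.items()
    }


# Invented toy records (hypothetical provinces "A" and "B").
TOY_SCHOOLS = [
    {"province": "A", "floods": 0, "nat_score": 80.0},
    {"province": "A", "floods": 2, "nat_score": 70.0},
    {"province": "A", "floods": 6, "nat_score": 50.0},
    {"province": "B", "floods": 0, "nat_score": 75.0},
    {"province": "B", "floods": 0, "nat_score": 65.0},
]

if __name__ == "__main__":
    for province, r in correlation_by_province(TOY_SCHOOLS).items():
        print(province, r)
```

In the toy data, province "B" has no floods at all, so its correlation is undefined and comes back as `None`. That mirrors the point that some areas are simply not prone to a given hazard, which is one reason a single national correlation can be uninformative.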
So the general observations confirm that vulnerability to disasters is associated with coastal locations; that use as evacuation centers is concentrated in a handful of schools, repeatedly within a year; and that road network and accessibility are relevant to use as evacuation centers. Evacuation centers tend to be in areas where there's high road density. Consequences on schools might be minimized by the density of the school network itself, and this is a paper that we're still working on.
We are still working on making the causal link. We have associations between exposure to hazards and school performance, but it is not yet clear if we can say that X causes Y. So that is the long-term project that we're trying to pursue here. One of the first steps is to look at gain/loss scores. Because we have panel data for schools, we can see, for a particular school, whether its score goes up or down in a year when it experiences a lot of hazards. It's a more powerful test because it controls for factors such as self-selection of students into schools, community features, etc. It allows us to focus our analysis so that we don't have to keep adding controls in the models. The nice thing about DepEd data is that it's a census, meaning all the schools report, so we really don't have to worry about statistical significance because all the schools are in that data set. It's very powerful because it also has a lot of data points and a lot of variables.
That analysis goes on mostly in UP Diliman, but each of the teams is doing its own particular analysis. The science communication team in UP Los Baños has been looking at trends in the data related more to development issues, and they've been looking at the data in a way that will allow them to identify local offices that they can target for science communication knowledge products.
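As an aside on the gain/loss score mentioned above, here is a minimal sketch of how a per-school change score could be computed from panel data. It is an illustration under stated assumptions, not the team's actual method: the field names (`school_id`, `year`, `nat_score`, `hazards`), the function `gain_scores`, and the toy panel are all hypothetical.

```python
from collections import defaultdict


def gain_scores(panel):
    """Return changes in score and hazard count between each school's
    successive observations (assumed to be consecutive school years)."""
    by_school = defaultdict(list)
    for row in panel:
        by_school[row["school_id"]].append(row)

    gains = []
    for school_id, rows in by_school.items():
        rows = sorted(rows, key=lambda r: r["year"])
        # Compare each observation with the one before it for the same school.
        for prev, curr in zip(rows, rows[1:]):
            gains.append({
                "school_id": school_id,
                "year": curr["year"],
                "score_gain": curr["nat_score"] - prev["nat_score"],
                "hazard_change": curr["hazards"] - prev["hazards"],
            })
    return gains


# Invented toy panel: two hypothetical schools observed in years 1 and 2.
TOY_PANEL = [
    {"school_id": "S1", "year": 1, "nat_score": 70.0, "hazards": 0},
    {"school_id": "S1", "year": 2, "nat_score": 64.0, "hazards": 3},
    {"school_id": "S2", "year": 1, "nat_score": 60.0, "hazards": 1},
    {"school_id": "S2", "year": 2, "nat_score": 63.0, "hazards": 1},
]

if __name__ == "__main__":
    for gain in gain_scores(TOY_PANEL):
        print(gain)
```

Because each school is compared only with itself, anything about the school that stays fixed between the two years drops out of the difference. That is the sense in which a gain score controls for stable school and community features without adding them as model controls.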
They were able to find, for example, that there are schools that were often used as evacuation centers that don't have water or electricity. So what happens to those schools when people are evacuated there, if they don't have facilities for power? What they're trying to do is identify the areas where you have these issues, then package, write, and feed information back to the division offices and the LGUs to help them make decisions. Part of the discipline of science communication, and the way that they publish their papers, is to really look at how information is best packaged into knowledge products that will change practice on the ground. So they are looking at the data with a completely different lens. They're not looking at publishing information out of the raw data analysis, but at publishing information that will be most useful to practice on the ground.
There is also the data journalism team, and these are all projects that are going on at the same time. The data journalism team is composed of Yvonne Chua and Evelyn Katigbak in the College of Mass Communication. We have shared the data with an undergraduate class. The students are grouped, and they pursue investigative projects based on what they find in the data. Here are some of the projects that they're pursuing now. One is the impact of a school's use as an evacuation center on its operations. They're focusing on schools near the Tullahan River that are identified through the data. So data journalism means using the data as either a starting point or supporting information to paint a fuller picture of the context, which includes historical context, humanistic context, and political context. You really have to go to the field. You have to have these students and journalists go to the field to understand what it looks like to be an evacuation center for 10 days.
Interview the principal, interview the LGU, interview the students and the parents, because we can't understand the conditions on the ground unless we actually talk to the people who are on the ground. So that's what they're trying to do, and then write journalistic stories that have some political impact later. Other projects that are ongoing are: disaster preparedness of NCR schools for strong earthquakes, using our mapping data of schools and hazard features; whether public schools have proper facilities for DRR, focusing on data on earthquakes and typhoons; and public schools in NCR used as evacuation centers, estimating the magnitude of impact based on population density. As part of their class exercise, the students will also bring in new data. DepEd doesn't have data on population density, but the students know where to get population density data and how to add it onto the existing DepEd dataset. So moving forward, the datasets grow, because various people in the project are adding external data to our database. And then we have a team investigating the frequency of disaster-related disruptions in Quezon City and how these affect the performance of students, looking at it from the point of view of teachers in particular.
We also have parallel investigations for academic journal publications. Some of these, which I'll just go through really quickly, include modeling of nearest-neighbor networks as well as looking at aspects of the distribution of schools that are demand driven, meaning the schools are clustered together around more populated areas. How can we help in the modeling to identify areas where more schools are needed and what kinds of schools are needed? There are aspects of the distribution of DepEd schools, for example, that are not random, because the establishment of schools shouldn't be random anyway. Where there are many people, there should be more schools, right?
But there are also aspects of the distribution that look random, where it doesn't look like there's any population there but there's a school, and that is largely a function of DepEd's policy of one school per barangay. It doesn't really matter if there are no students there. We need to put a school there because the policy is that all barangays should have at least one elementary school.
A different team is doing classification exercises using artificial neural networks, and we have a few things in the pipeline, including modeling resiliency tests for the impacts of disasters on schools. What we're hoping is that by doing more analysis on DRR and hazard vulnerabilities of schools, we'll be able to find schools whose test scores are still high no matter how many disasters they encounter. From there we can use that information to figure out the factors that determine whether a school is resilient. Resilience is defined for us in terms of the outcome performance of the school, which is test scores.
We are also looking at the predictors of school performance using NAT scores, including external data sources indicating features of the surrounding communities. So for example, we wouldn't be surprised if NAT scores are higher where the municipality is wealthier, the LGU gives more money to the school, and the school works with a bigger budget than other DepEd schools. These are the kinds of things that I really thought were already being done, but the more we looked at the data, the more we figured out that not a lot of people are connecting the testing data with the input data. So this is really the first time that people are looking at it systematically.
We also have a special interest in looking at gender disparities in retention rates and test scores. For this we're awaiting gender data, particularly for test scores. This is continuing work with Dr.
Sheryl Monterola in the College of Education, where we found that across the country, girls are outscoring boys on every single subject of every single test and on every single measure of outcome: dropout rates, enrollment rates, cohort survival rates, graduation rates, and test scores, including test scores in science, math, English, Filipino, and social studies. Girls outscore boys on every single indicator. The disparity gets worse the older they get, so it's much worse in high school than in elementary school.
This is not a problem that DepEd doesn't know about. It's not a problem that teachers don't know about. The problem is that many people don't think it's a problem, and that has been our main advocacy: how do we convince people that gender disparities that disadvantage males rather than females are a problem, because they really are. It's also a multi-factor issue. One factor is that boys are put to work earlier in their lives because they can generate income earlier compared to girls. But there are also other things that we need to understand. For example, is there a possibility, and we think there is, that parents have started giving more attention to the schooling of girls because there's a cultural belief that girls always do better in school? So they support it. And then there's an economic imperative to support girls through schooling, because over the years overseas labor market demand has been skewed toward women. We are no longer exporting more welders and engineers. It's not that we don't have engineers, we have fantastic engineers, but the market has now skewed towards teachers, nurses, etc. So if your family wants to send out an overseas Filipino worker (OFW) to gain the benefits of OFW labor, then it's better to make greater investments in the education of girls.
So again, it's a very complex area, and we still need to triangulate our data sources. There's also a special interest, and maybe UP would be interested in this, in looking at science achievement test scores over time.
We have a slew of communication activities that go on as we do the work. We don't wait for journal publications before we publicize our findings. We are currently building two websites for knowledge products. One of them, based in UPLB, will focus on knowledge products directed at stakeholders that are doing policy work and field work, meaning DepEd itself and hopefully other agencies later. The other website is really more for data journalism, whose audience is the public and politicians. The broader aim is for us to communicate with stakeholders to promote public understanding of science, and to communicate science and data-driven policy-making for public engagement. So far we have one piece that was published in Rappler. We started drafting some pieces, and we hope that as we move forward we can release more, one by one. We're going to be exploring the possibility of having a regular space within either Rappler or one of the other online news agencies, to see if we can post things there and get a broader audience.