I was hoping that I can, I think the first one is mainly similar to the first part can be happening in time. This presentation is just five years job of practicals today. So most of the things we can do. There's a little change that happens in the morning. We are going to fix this schedule in the afternoon. We have a fixed schedule for this room, but the other two rooms are quite free to host sessions. These are the three main halls. This is the main hall. The food court is behind me. There's a meeting hall next. All the morning sessions will happen here. And the platform of room sessions will happen here. The food court is behind me, where lunch and tea are served, and it's also open for sessions in the morning. If you don't want to be in the morning sessions here, you can go to the food court and have your own session. The meeting hall is just behind; you go around and you're there. It, again, is free in the morning. You can host your own sessions if you want. Just ask one of the volunteers. In the afternoon, we have fixed sessions for the three rooms. This is about the fixed sessions; I just put it up for your information. It's on the site too, and it's on the board on the right. So this is the split in the afternoon. The fixed sessions are in the afternoon, but in the mornings the meeting room and the food court are free for you to use for any of your own sessions. There's again one free slot if somebody wants it; you can block and schedule it. And there's a feedback slot between four and five. It's not just feedback; it's time to get together and meet other people and whatever you want to have. And then there's the standard barcamp format if you want to have your own conversations and other problems in the morning. You can just block a slot and limit it to 20 plus five minutes, so everybody gets a chance.
So again, in the afternoon, as it's there, try to use the tags so that you can find those on Twitter or Facebook or whatever, and you can read up some time there. That's it. If you want any help, you can talk to them or you can email them. Numbers are probably on the sign, so you can find them. And that's the Google group which is organizing this camp, and they can communicate as well. With this, I think we'll go to the first session, which is a panel discussion, and we shall have a look at it.

Welcome, everyone. Organizing this has been an overwhelmingly positive experience. When we first thought of having this, we thought, oh, good, people will come and everyone will be talking about technical things. And then we got such an overwhelming and great response that we knew this was going to be a good event, and that we were going to have a lot of diverse people here. People really wanted to talk about this.

Also, just a quick note. If you want to present an online or computer-based data tech tool, since we have limited space, we can do a quick-fire tech tool round in the cafeteria in the afternoon. Basically, we'll set up stations and you can present your tech tool for 10 minutes to groups who want to see your tool online and things like that, just so you have a chance to show the work you've been doing and the kind of tools that you have been creating, and we don't have to fill up the panel space. If you want to do that, just come talk to me and we'll set up some space in the cafeteria.

We wanted to open the camp with, let's talk about what's happening in India, what different people are working on in terms of open data, and what it means in this context: the negatives and positives, and also how people measure impact by working with data.
So I've asked Anand, who does data visualizations, who works on Report B and just launched a company called Gramener that does online visualization tools, to talk about working with data and making use of it. He is also going to do a talk later. Zainab, who has worked with CIS and done a lot of research. I believe you just finished your presentation? Yeah, we're not talking about that. On data and using data open this up in private teaching also and if you were on this discussion and they have been working on how to create impact on the ground using data, and how to get data to people so that they can use it to improve their lives. So I'll ask them to come up, and we'll have them present for about 5-10 minutes and then do a Q&A, and that should be our way of doing it.

Flip, what I was just giving you a few quick thoughts on data availability and data use section I will see in a second. Those are two parts of the same equation. You've got data sitting there that you need to find for people to get out in the open, and then once you've got it out in the open there's a question: what do you want to do with it? My observation on data in India is that you can have all the data that the government data wants to do but it's not necessary to do that. I said it exists. On a number of occasions when I tried to find data I was trying to put it all together and we were trying to see whether we could get the data from that visitator so this guy was interested in whether the data I was trying to put it all together and we were a bunch of weeks by government data and there's this great insight into India and I used to promise that we want to go ahead with the information piece. It was surprising, but that data, are you giving to the mic?
Yes. So that data existed. Then at some point we started looking for energy data: can we find energy supply data, what's the capacity of the government data? And in every single case, after about 2 or 3 days, we find some site that has some information. Every 15 minutes, the current production of every single reactor in every single state that I've searched for exists online. Crime data: the NCRB data is available online for the last 50 years. For every single state and every single IPC code, you have a breakup of the number of cases registered, the number of cases that have been processed, what state they are in, and how many convictions. So I haven't come across a case where I needed data and it didn't exist. In the few occasions where that's happened, maybe I just don't know where it is. So findability may be a problem, but availability may not be. And findability is a fairly big issue. But beyond findability there is also extractability: where it's available, it's often available as PDFs. It may as well have been printouts; in fact, in some cases they are printouts that were scanned, which is a problem. Partly because, for instance with the NCRB, let's just take the effort and get all of this data on to NCRB and we got one way to help it out to the critical needs to type out whether we can choose it on the site, but still that's a fair bit of effort. And in most cases, even with the data available, it's not easy to use, and people shy away from it. That's part of the problem. And that's the worst-case scenario. Sometimes it's available on a web page as a table and you're trying to scrape it. The last time we had a problem, we were trying to scrape the data over the related data and it wasn't easy. Some of these sites on SharePoint are just not something that you can easily scrape. So pulling the data out, even though it's available, is a pain. Now, the other pain point that we have is the licensing of this data. For instance, we
have pushed NCRB and said, let's get the textbooks out of the way. They're quite, they're completely over there; the data is available. Now what worried us, though, was that there's a clause that says that all content, all the textbooks of NCRB are called NCRB, and if you ever do anything to it then you'll come after doing this quite strongly. But let's look at the head of NCRB: it's completely over there. The intent is to prevent those people who are trying to make investments out of money, who are trying to deceive it. So if you, for instance, know they're not going to come after you, in fact could we do that with our NCRB effort itself, and we'll get to see it out of the way. So the reaction for everyone is the same for you simply for the maps. The licensing today on any India-related maps is at least true, but again, nobody believes that the government would come after you simply because the NCRB is going to come after this. You put it up with an open source license, and he's got this thing on the side saying, the reason I'm doing it is anyone who takes the data from my side, if you know that the source is me and you won't be blamed for it, because I put up the open source license; if the government has come after everybody they will come after you. But still, it does prevent people from embedding some of that data into their projects, and the subsequent licensing of that becomes an issue. So on data availability, look at the problem: data is available, it's not that easy to find or to use, and you're not entirely sure of the license. If those problems are resolved, we'll be in better shape. But the flip side of it is data usage: who uses the data, where does it get used, how do they use it? I believe this data can tell lots of interesting stories. You can reach out to people in a way that tells them something that they do not know otherwise. There are two kinds of people that are broadly interested in doing things with this data. One is the community, domain first,
which could be either NGOs, government activists, journalists or academicians who have an interest in the community. And then there are the geeks, who have the ability to make use of this data. The overlap between these two communities is somewhat different. Hopefully one of the things that a forum like this will do is to get together both these communities and have the ability to have the knowledge to use these techniques to build the data but not necessarily an interest in, for instance, that's it, if you take quarter heads, I'm not someone who has a strong interest in data, I wouldn't be involved, but if there's data you just want to take, put it in a school, do something like this. And that's happening on a number of occasions, like in the US, where there are a number of hackathons, there are a number of contracts, where there are a number of agencies where schools are important here. And there are a few smaller projects that keep coming up where you put together the data, you put together people that know how to read the story and do the analysis, and come up with something interesting. My hope is that while we've got the data sitting out there somewhere, we should have this community be able to try to expose it better. So we would like to sit back and discuss it speak in the part.

My name is Nithya Raman, and I run a project housed at IFMR called Transparent Chennai. We aggregate, process and actually create maps, data and research about neglected issues in the city of Chennai, and we also work with residents to help them create data that can support them. We are basically a group that's focused on empowering residents to try and hold government accountable for service quality. But this has really been an uphill task, especially because I think the quality of city-level data is much more problematic than much of the national-level data that we have. I want to just highlight three lessons, or maybe three points, that might be useful for people here who are really focused on
the question of data for data's sake, and I want to talk about our experiences collecting public toilets data in the city. Public toilets is something that repeatedly came up in our discussions with groups of low-income workers. It's an under-addressed issue in the city, so we wanted to pull together some data about this. And actually our work documenting this issue has had some impact: because of the reports that we've published and the ensuing publicity that this issue has received in the press, the State Planning Commission has actually included reports specifically on public toilets in the final plan that they're putting together right now, which I think is going to be a big change in terms of the way that toilets are planned in the cities.

I want to tell you just a little bit about the process of data collection. We started by just trying to get a total count of toilets in the city. For that we called the Corporation and tried to find out which department managed public toilets, and this was actually very difficult to find out. We were passed around on the phone quite a bit, hung up on, and then finally we visited, and we were told that the buildings department was the agency in charge of the toilets. Now when we called the buildings department, they told us that they maintained no list for the city at all. So there's no count of toilets at the Chennai Corporation headquarters, and they told us that we would have to go to all the zones for this information. At that point there were 10 zones in the city of Chennai. We had an intern, a researcher, and I sent her out to all the zones to collect this information. At each zone she faced a slightly different reception and a slightly different set of problems in terms of getting access to this list of toilets and their addresses. She would go there with a letter of introduction from us. Most of the addresses for the zone offices themselves were not given very clearly, so she spent a lot of time searching. You know, in certain zones
things were very easy: she would go in, give them the letter of introduction and ask for the list of toilets. They would usually photocopy her request letter, the executive engineer would approve it and then put, let's say, a couple of engineers to writing a list of toilets, and often she would come back with a handwritten list of toilets and their addresses. But in a couple of other zones she faced a lot of problems. For example, in one zone she met with the assistant commissioner after quite a long wait there. He told her to come back after a week because the executive engineer was out of town. Then she came back after a week, again waiting to speak with the executive engineer. She gave him the letter, and despite it having been approved by the assistant commissioner, he was very suspicious and didn't want to give this information out to the public. It seems like a fairly innocuous piece of information, so we were a little bit surprised. So Meryl, our researcher, pushed him. She said, we've gotten this information from all the other zones. He said, no, I can't give it out to you without explicit permission from the Corporation Commissioner. That's the highest level of the bureaucracy of the city; that means he wanted a written letter from the Corporation Commissioner to give out a list of toilets in one zone. So Meryl pushed him again and said, look, every other zone has given this to us, please give it to us. And then he called the executive engineer from zone 9, asked him whether he had actually given the list, and then, satisfied that this was safe to give out to a member of the public, he finally gave the list.

So when we collated this list of toilets from all the zones, what we found was that there were 572 toilets in the city, which we thought was an exceptionally small number, given that only 600,000-odd houses out of 800,000 total census houses in the 2001 census actually had toilets within the house. 572 public toilets seemed like a vast under-provision. So then we said, okay,
releasing this data will get us in trouble regardless of what we do. Even though we don't necessarily want this to be a fight with the government, let's just file an RTI so we have a paper trail, so that if they ask us where we got this data, we can say that we actually did file an RTI. So then we filed an RTI. Again, following up on the RTI actually took much longer than a month, which it often does. We file RTIs all the time and this happens frequently. And then when we finally did get the list of toilets through the RTI from each of the zones, we found that every single zone had listed a different number of toilets than they had originally given us voluntarily. Every single zone. In some zones they had actually listed fewer toilets, and in some more; so in some zones they had listed imaginary toilets in their original count. We had no idea what was happening. So then we decided to map toilets in one zone, just to get an idea of what was actually happening on the ground. What we found was a further problem with the data, which was that many of the toilets that were listed were either completely dysfunctional and hadn't been functional for years, or they were impossible to find, or they were located in different locations. Anyway, we put together that report, we released the findings, it had quite a big impact in the city and we had good outcomes from it, and we continue to lobby on the issue of public toilets and how they should be planned.

Now why did I think that this story was important for this audience to hear? I just want to draw from this three points which are important. One is that just passing a law on open data is not enough. We had the RTI, and despite the RTI, every piece of data that we have to get from the government is an uphill task. It's a Sisyphean task, every piece of data, and we spend most of our time trying to get this data from the government. The other thing is that there's a culture in Chennai of fear among
bureaucrats. Lower-level bureaucrats fear that they'll be penalized for giving out data, and this is despite the existence of the RTI, right? So if we really want a future in which data from the city is actually publicly available for people to use, we need to ensure that we can support lower-level bureaucrats to participate in this culture of openness without being penalized by their superiors. So we need more than just a law; we need advocacy and support for citizens and for good bureaucrats who are actually supporting advocates in their work.

The second thing is that I think we really have to be careful not to fetishize government data, because of the number of holes that we've seen in almost every data set that we've put together. It has holes for many different reasons. In the public toilets data set, what we suspect is that many of the unlisted toilets, or non-existent toilets that were listed, were actually used as money generators for corporators who gave out contracts for maintenance of those toilets. In other places, the lack of data actually helps the government to be more responsive to people, in situations where the laws or the kind of bureaucracy needed to make changes are very, very difficult. So for example, we found that there were huge holes in bus routes data in the city, and one of the reasons why we think this was the case is something that a government official told us, which is that changing a bus route officially takes quite a long time. There are requests to change bus routes from residents, from politicians, from powerful people and just from people who want bus routes to be changed to accommodate their needs, and so not writing it down and not officially committing to a particular set of stops actually enables them to be more flexible and more responsive to people. So there are different reasons why this data is full of holes, but the point is that it's often full of holes, and so we have to be careful not to just take a
data set and take it as the gospel truth.

And then the final thing that I wanted to point out is that just having open data, or a law for open data, or data sets that are publicized, does not necessarily mean accountability. The two are not the same. So for example, in Chennai, if we had just relied on a list of toilets from the government, what we would have had would have been a list of toilets. And yes, this list showed that there was a vast under-provision of toilets in the city. But when we did the mapping, what we found was something even worse, which is that almost none of the toilets were located in the places of great need. They were not located in the informal-sector market areas, where there were obviously no toilets for customers to use. They were not located in big bus stops. They were not located near most of the informal settlements, which had not been recognized by the city; that means they were not on any government map, they were not on any government register. So in order for the government to actually improve its performance in providing toilets to people, we needed much more data, data that the government would never have collected. Data on informal settlements that are unrecognized is never going to be collected by the government; it's just not going to happen. Same with informal market areas. Maybe we know about recently just started doing some stuff with informal market areas, maybe this is something that will happen in the future, but the point is that just government data is not going to guarantee a means for us to push the government, at least at the city level, in terms of providing local services. So I think that's all I had to say from my experience.

I'm going to take off from some of the issues that Nithya raised over here. In my experience, I actually talk to a lot of frontline bureaucrats, the clerks, the accountants, the engineers, and I'm actually very interested to understand the way in which they make decisions, and then what gets
recorded as written and what is not recorded in the first place. So let me start with toilets data, for instance. In Bombay, sometime around the early 2000s or the late 1990s, the World Bank decided to implement this slum sanitation project, where they decided that they would fund the construction of public toilets, and an NGO decided to implement these toilets. But in order to be able to implement the toilets in certain places, you had to first find out what the existing toilet infrastructure was, where more toilets were needed, in which settlements they were needed and how they had to be positioned. So for all these decisions on planning and infrastructure allocation, a survey had to be conducted. Now there was an organization named YUVA which decided to conduct the survey in the first place, and it was a very tedious process because, like we've pointed out, the data that the government gives you is not accurate, it's not necessarily complete, it's got loopholes. And so YUVA decided to do this survey. Right now, with this whole idea of open data, it's really to do with the fact that we do not know what the ground-level situation really is, and therefore what the starting point is that one needs to start with. So YUVA did this whole survey for a year and found out about where toilets are, etc., and that was supposed to serve as a baseline for SPARC to be able to go and construct the toilets. Now constructing a toilet is not just about using the data. It's also about being able to generate consensus in different localities, in different segments, and being able to say, okay, the toilet should be positioned here, there should be so many toilets for women, there should be so many toilets for children, how should the toilet be constructed, etc. Now those decisions on the ground, that whole consensus process, are not necessarily aligned with the data that is collected, because the needs on the ground are completely
different from what the data necessarily shows. It is at variance oftentimes, particularly in this kind of infrastructure. And there are also different interests. So one would ask, what are the kinds of interests that somebody would have in having a public toilet? It's the local leaders in some segments who use the toilets to be able to generate revenue out of them. And so if there is already an existing toilet and they are already capitalizing on it, then implementing a toilet from a developer perspective completely cuts into their own interests. Shankar was sitting here, and I am hoping that he will be able to talk more about what was interesting for us to understand: what are the different kinds of interests at stake over here, even in the construction of public toilets. So when it comes to infrastructure, there is clearly a lot of interest at stake, and it's not necessarily only the corrupt corporator, or the corrupt clerk or the accountant or the engineer. The interests span across different levels. I mean, in my own research on rehabilitation and resettlement allocation for infrastructure projects, one recognizes that corruption pervades at the level of the chief minister, pervades at the level of the development authorities, but oftentimes what is talked about, what is exposed, is really this petty corruption with frontline officials. So the issue that I want to point out over here, taking from Nithya's presentation, is that we do not necessarily know completely and clearly how it is that frontline officials make certain kinds of decisions when they have to allocate a water connection or when they have to approve a plan. And I am not saying over here that one has to treat the frontline officials as holier than thou, but one has to understand what kind of pressures they face, not just from the applicant but also from the whole structure of the bureaucracy. So it's completely true, as Nithya
pointed out, that frontline officials, clerks and bureaucrats and engineers are extremely skeptical of giving you anything in writing until permission is sought, because the question is, if they give it to you in writing, where will the accountability be? Will it be them that will be penalized? Oftentimes it is them that is penalized when they are not necessarily the ones who have been responsible for those decisions in the first place, and therefore what they have recorded and given to you as data.

Taking from here, I think the second point that's been of interest to me in my own research is understanding this relationship between publishing something as data and this whole act of writing and documenting, and therefore this whole relationship between open data and law. I believe that what is very critical is that the moment something is published online or is given to you in writing, it acquires a certain kind of validity, a certain kind of truth value, and it has a certain kind of evidence value at some point. So somebody can say, I got this in writing. And one can also trace how the RTI validates the information that you get: it has to be stamped, sealed and signed and given to you, so that tomorrow in a court of law or in any dispute you can use it and say, well, it was officially sealed, stamped and signed and given to me. Now this relationship between open data and law is something that's been of interest to me. What happens when you decide to cull out certain information, represent it as data and publish it in a certain way? What kind of legal architecture now develops around it? How does the court recognize it? How do you use that data in the process of a dispute? I do not have clear answers to this. This is also my discomfort with this whole idea of blanket open data: if you decide to go and publish information online or in writing or document it in any other form, how does it tend to get used, how is it represented, what kind of
precedent is it really setting in terms of law? So I think this is a question that we should always be aware of, because it's not just a question that affects an informal settlement. It's also a question that affects me as a property holder, as a recipient of government services, as somebody who has a particular register of identity, as a Muslim, and things like that. And following from this complex relationship between open data and law is also a question of how, through this process of representing data or culling data in a certain way and putting it in writing, we are really also reconfiguring the identities of citizens in certain ways. I'm not saying this from a notion of paranoia, but I'm trying to also trace back from the British period itself how different acts by the British were also categorizing certain kinds of land types, or categorizing certain population groups as a particular caste or a certain socioeconomic identity. How do historical precedents now play out in the present? So in a certain way, if one has to make a very strong argument, the current move even towards open government data can be looked upon as a continuation of the British project to govern subjects and to be able to create certain categories, which is my entire, let's take the example of crime data, for instance. I've done some research in Johannesburg in South Africa, and Johannesburg, if one has been there, is probably the most dangerous place to be in the world. I don't know what this means, but everybody wants to warn you and say how you will get raped and how you will get murdered and how you will get mugged at every point in time. Now what's interesting to me in a city like Johannesburg is this: there's been a whole history of racial discrimination. In the 1990s, with the coming of the African National Congress, this discrimination does not necessarily go away, because it has a whole history; it's
ingrained in your everyday stereotypes. Even for me, when I'm walking in a street, my first fear if I see a black person coming is, is this guy going to mug me? Who knows. So this is also the kind of consciousness we carry in ourselves in terms of marking who is criminal and who is not criminal. And then if one talks about open government data in the context of crime data, what are the areas that they have marked as crime spots? Are these the areas that have been historically discriminated against as black, as poor? I mean, my own research in Soweto, for instance, showed how the whole mapping process really ended up reproducing certain parts of Soweto as being criminal. And so my concern really also is that when we are talking about representation, when we are talking about putting information in a certain way, are you then continuing this historical practice of marginalizing various communities? I think there's a lot of interesting historical literature which talks about how communities have been marginalized historically by the British, by different governments, and whether it continues in the present as a result of open data, I think, is another question that one has to stay with.

That then leads me to the other issue, of the data itself. I mean, Anand pointed out this thing that data can tell stories, that meanings can be made of data. The question also is, who is making those meanings about data? I think that's also a question that we have to stay aware of. And following from here, therefore, is the issue of whether one can talk about open data only in the context of equality. I think it's not about open data and equality, but about recognizing what kind of story is told, and therefore what kind of leverage or position it gives to a certain group in society as a result. So if one looks at politics really as this kind of shifting dynamics, where I am able to generate a certain meaning out of a story and therefore use it in a certain way at a certain moment in
time, then I think it's interesting to then understand and even assess whatever we want to call the impact of open data. So I think this notion of trying to see is something that we really now need to move away from, and look at how different groups are really using this data, how they are accessing this data, and how they are then using this data to represent themselves.

To give a very quick example, before just the last couple of points: one of the research projects that we did was to understand how ICTs are positioned now in helping people to access information, and we ended up researching this tribal group which had a whole history of being nomadic. So they would keep travelling, and for them communication was a very critical thing: how do you communicate and pass information to each other when you are travelling at different points in time? And it's interesting how they've taken to mobile phones as a result of this. They now use mobile phones as part of a historical practice of being able to transmit information. They're currently embroiled in a sort of land-related conflict, and for them, using a map to be able to represent and tell their story to the world is also a way in which to push it with the government and say that this is our situation, and to be able to build that kind of support. I'm not necessarily talking here in terms of a certain positive impact, but just making the point that there are certain ways in which different groups decide to use data to get certain kinds of visibility in certain moments of conflict, in certain moments of negotiation with the state.

So following from here, just a few last points. I think that one has to be aware of the fact that data is not necessarily neutral. Data is political. Even putting up information about land-related values has certain kinds of economics at stake, has certain kinds of politics at stake. And I know that a lot of people often talk about how we need to make
this information about land values clear. What happens when you make the information clear in the first place is that you reduce the perception of risk, and that escalates the value of the land on the ground even if there's not necessarily that kind of infrastructure in place over there. So information is political in the sense of who's using the information and how they're using it. When I say information is political, I don't mean to say, okay, now we need to stop all this and end the camp over here. That's not the case. But it's a recognition of the fact that at no point in time is data neutral. The second point that I want to make is whether it's really a question of accountability, which Nithya also pointed out, in terms of how data is represented, who gives you this data, how you use this data. I think the question is not necessarily one of accountability, because the accountability question is skewed here, in the sense that if somebody gives you something in writing, is it they who are responsible for what has been given to you, when the decisions are not necessarily made by them? I believe the question is really one of responsiveness, in terms of how you can now use this information to petition different avenues to get what you think is your claim, or for people to get something which is their entitlement. So I believe there's a question of responsiveness, and a question also of trust, in the sense of: will more open data really enable a climate of trust? I'm not sure at this point. I think it's a question that we need to ask. And the final point really has to do with this whole life of data. I believe that there's a short life of data and there's a long life of data; that is, what the data reveals to you immediately, in the present, may not necessarily hold true in the long run. Let me give you a very quick example. I worked with an organization in Bombay called Praja, and we developed this online complaint management system where we would collect a lot
of data about complaints that were registered around civic issues. In the short run, every month we would download our MIS reports and send them to NGOs, to advocates and to citizens, and all that it showed was that the municipality is not responsive, they are corrupt, they are not responding to complaints, they are saying complaints are a hazard and it's closed, and this and that. So in the short run it ended up perpetuating the conflictual relationship between the citizens and the municipality. In the long run, that is about 3 years later, when we started to look at the trends in the data, one thing that we found, interestingly, was that a few departments in the municipality had actually learnt through this data to arrest a problem before it became chronic. So, for instance, there was a pipe that was repeatedly leaking. Over a period of time they realised that this pipe keeps leaking, that there is a problem, and that this problem can become chronic, in the sense that there could be a pipe burst, etc. We found that some of the departments and officials in the municipality were actually using that data to arrest a problem before it became chronic in the first place, and that we could only determine about 3 years later. I'm also hoping that Anand can respond to this from his own experience of this long life and short life of data. So, with this, thank you.
Thank you guys so much, that was great. I want to hear from the group if you guys have any questions or anything.
All these issues that you described in the beginning about data, what kind of impact it has in terms of the argument you make with the data, and, if it becomes public, whether somebody can create a story around it that may or may not be true: doesn't all of this already apply to journalists, or pseudo-journalists doing blogging? I mean, all you need is a small piece of evidence. So data just becomes another argument that
must be dealt with almost exactly the same way.
Let me try and answer these together. I have largely worked with open government data, in very specific experiences and very specific sectors. I believe that there are other lives of data. For instance, one of the things that fascinated me was how certain kinds of location data are now used, for instance what JustBooks is now doing with their data, not just in terms of what the best books are but also what kinds of books people really want, and not necessarily featuring the best set of books, or even in terms of finding out, for different areas, what languages are used more in the neighborhood and what books a library has to stock. So I believe that there are different lives of data, not just in terms of short life and long life, but different kinds of data that are circulating around at this moment. My own familiarity is with open government data, which is why what I want to do now is to understand data in different spheres, in different aspects, because there are also businesses that see a kind of opportunity in data and in what they categorize as data. So I want to flag this issue here: what we are talking about here is one sphere of data, but there are other spheres in which data is being generated, and that can have different kinds of implications for what I have just said. But I'll speak only to what applies to open government data. The second issue really is about what you pointed out in terms of who's making meaning, framing it as evidence and presenting it out there. That remains an issue, and clearly at this point I don't have a response in terms of how we should deal with it, but one way will be the same way as we deal with, say, a blogger taking a small piece of this information and creating a story from it. So I think really at that point it's also a
matter of who is using that information, and in what particular ways. So, for instance, somebody sent me this presentation about the Map Kibera project, where she talked about how this information is actually being taken out from Kibera, which is a slum in Nairobi, and is represented online to create this data in a certain kind of way, but that information is not necessarily generated by the community out there. That is what she said: it's not that the people in Kibera are participating in creating that information, but the information is out there, and there are ways in which you can read it and I can read it and think about Kibera in certain ways. So that remains, and I don't necessarily have a response to your question. I think what we need to stay with and understand is whether what is projected is really evidence in the first place. That is why I'm raising these questions about how we really don't know how decisions are made on the ground, and therefore how do you read that data against the grain of the ground? So I think that's really where I come from.
This is one of the differences between data stories and journalistic stories, which is that the data can be made open, which allows other people to use the same data to come up with their own stories. For instance, I've seen one story on migration data which identified something as being because of migration, but then further analysis showed that it had long been inclined that way. Now the latter is something that came out of subsequent analysis of the data, which was not carried in the original story. And then somebody else would have the opportunity to say, oh, if you break it up by age, then that really is confusing correlation with causation. There are problems with data analysis, and a lot of people misinterpret the whole thing: because two lines look the same, they assume one is a cause. But if the data is provided, that's the opportunity to take a little
bit of pressure.
Regarding this thing about open data and issues of profiling and labeling, I can also think of a concrete example: the entire discourse on ghettoization in Europe, and how the public data which was available allowed scholars to question this concept of ghettoization, down to the level of multiculturalism that exists in ghettos, and the outside being more ghettoized than the inside of ghettos. So that's one point. The second thing is that in the entire discussion we have had, we do mark out the state, in terms of the state being the repository of data and the struggle that we have with the state. But just a month back a private corporation like Elsevier came under severe censure from the entire scholarly community; academics boycotted Elsevier because of the way they were blocking access to journals. If you try and access a journal paper on Elsevier, it is nothing less than $35 a paper, and that's a hell of a lot of money for anybody who wants to access it. And some of this data is much more reliable, much better, than the government data we are all dying for. So this entire discourse of the private data that's there, we have to get to immediately.
This is related to how do we, let's say, predict some of this. So there are three aspects, I guess: one is collecting the data, the second is analyzing, and the third is visualizing. I think you guys are very good at it. We are currently working on a POC with water leakage data. Using this data, I think it is possible, by using some of the analysis techniques, to predict some of this stuff, like where there could be a leakage, along with some associated data which comes with the pure leakage data: the type of the pipe, the age of the pipe, and so on and so forth. So I think all this can be put together to basically do some predictive analysis.
Just to have a point of response: I was recently taking classes on the economic history of India, which kind of tries to understand the whole pattern of
industrialization. One thing that I find could be very useful in open government data advocacy is to be able to trace the whole history of data. I don't mean all data, but if you are working with water data, or land data particularly, it is extremely historical: how has it been in the past, and how have the different regulations changed, as a result of which certain kinds of officials and bureaucrats take certain kinds of positions, and therefore what has really come to be in the present? I find that this historical perspective is strongly missing at this point in time, and the reason why I feel it is important to go back to it is to be able to understand how, through the past, the present has come to be the way it is, because there are certainly continuities from the past. I was raising this question about economic history. One of the ways in which you present your research there is by doing very thick work in archives, where you look at a lot of past material and then talk about how, at different points, different groups and individuals and institutions made certain kinds of decisions. Now, if one has to compare this with the methodology in economics, how would economics use the history of the past to represent the present? So I think there is a methodology issue here as well. I didn't flag it, because I didn't want to keep going on in different directions, but this is what I have also been thinking about, in terms of what the methods are by which you read the contemporary, read the present, and also how you assess what you think is impact. Because clearly it's not necessarily just one form of impact; there are different groups that are using the data in different kinds of ways. For instance, in the research that we did on ICT kiosks on the outskirts of Bangalore, there was no one way in which people had benefited through the
digitalized land records. It depended on where people were in terms of their economic location, and on whether the data was correct. If the digital information has recorded someone's name incorrectly, or his measurement incorrectly, he then has to use this information to make certain kinds of changes in the record, and for him the new digital system is an injustice. So I think it's also useful, maybe here as well, to have a certain kind of discussion around what you see as being the impact of this open data on the ground, and what criteria you therefore use to assess what you see as impact. I think this is going to become a critical issue as time goes on, because then there will be questions of sustainability of open data projects: why should you fund a certain kind of open data project if it's not producing a certain kind of impact? I think these are related issues.
Yeah, on the point that you mentioned about predictive analysis: I'm trying to see if they can predict what the energy usage will be. Some of this of course requires a bit of sophistication that will, in general, be beyond most people. So, for instance, we found that, after trying all of the basic models, the simplest predictor for tomorrow's energy usage at 2 o'clock is simply to see what today's usage at 2 o'clock is. That gets you a predictive accuracy with an error rate of less than 2%, which is something that they didn't know for a long time. But what we are looking at is trying to see if we can take a subset of that data, open it up as a competition, and see if anyone can come up with a better predictive model. That's what we are trying to do on websites, for instance, that allow people to publish data sets. There are enough educational institutions that want their students to work on such data sets, and people in the utilities and in the government are willing to open up some of these data sets to make this sort of thing possible. I
suspect that it's partly about sharing what can be done and how it can be done.
Okay, we are on time, so I would like to continue that trend. But I just wanted to say thank you, because they raised a lot of really good questions that we should think about through the day as we talk about the work we are doing, but also what is behind the work and what is the point of the work. And I really want to emphasize the idea of the informal versus the formal in this country: the informal world here is very strong, and data is a formal thing. So there is this discrepancy that exists here: how can data help, and how can data not help? I think that's a really good topic to think about as we go through our sessions. Also, data is not a silver bullet for a lot of things, and I think that has been a good thing to note. But it's a good point to start at, and I think what Anand was saying, and Nithya and Zeno said, is that you have to not take it as the golden truth, but use it as a point to say, okay, what's actually happening, and how do we figure out what we need in addition to this data to find out how to make an impact and improve the lives that we are living? So I'm going to close this panel. If there are any questions, please save them for the end. We'll take a short break and then go to the next panel.