And I'm really glad to be here. I'm sure all of you are aware that the Census of India is the largest census in the world. That's not just because we count 1.21 billion people, but also because of who does the counting: 2.7 million functionaries, which is probably larger than the population of a lot of countries in the world. Plus, we have two full rounds. The first round is the houselisting. The second round is the population enumeration. These are two full rounds in which enumerators go house to house, all over the country. And the entire data collected in these two rounds is fully processed: 100% of the questions are processed to produce the census results.

It's not just the sheer size. It's also the complexity of the work that we do. We have 35 questions in the houselisting and 29 questions in the population enumeration. And because the country is so large, the schedules have to be printed, filled, and processed in as many as 16 languages, and the training manuals are made in as many as 18 languages.

So how do we go about doing this? How come we are able to give you so much population data so soon after completing the census? Of course, we have a long history. We've been learning over time; the first time we did the census was in 1872. But there's something more to how we do it. It's the basic principles on which we do the entire work that make sure we give you good data, clean and reliable data.

The most important thing that we have is, of course, the Census Act. The Census Act, 1948, and the Census Rules give us the legal backing we need to make sure that the census is really complete, because by law every person living in the country has to answer all the questions in the census. But it's not just that we're forcing you to answer the questions. It's also that because it's a mandatory thing, we have to ask you the questions and you have to give us the answers.
So what happens is that the data that comes in is neutral; it's not colored by any imagined benefit. If you do a survey asking people whether they have item X in their house, and people believe that if they say "no, we don't have it" the government is going to give it to them, you can imagine that the data is not going to be very clean. But the census is just something that is simpler: you just have to do it. So no one is going to give you wrong answers if they don't have a motivation to do so. That is the most important advantage of having legal backing.

So while we so-called force you, while we mandatorily ask you questions and ask you to give us the answers, at the same time we give you a complete guarantee of protection of your privacy. Under the Act, all the answers given in the census are confidential. What does this actually mean? It means that no individual's data is ever given out by us. It's not given out to the public. It's not given to anybody who asks us. And it's not given even if a court of law asks us. So that data is completely safe with us. And any data we do give out is anonymized. We remove all the names. We make sure that the granularity of the data we give out is such that, whether through geographical details or through other personal details, there is no way you can make out an individual person's data.

In the census, it's understood that everyone has to be counted. What we say is that everyone has to be counted without omission and without duplication. But it isn't as easy as it looks, because there are always some people, some groups, which are more difficult to count than others: very remote settlements, remote islands in the Andamans, remote villages in Arunachal Pradesh. And they may not be very remote at all. They may be people living right in front of you. There may be a servant living in a house. The enumerator goes to the house. The servant gives the enumerator water.
The enumerator forgets to ask whether this person lives in the house or not. So there are many reasons why people are vulnerable and why they are difficult to count. For us, the first priority is to ensure that we are aware of who all the vulnerable groups are, and that we know how to bring them to the center of our consciousness throughout the entire planning and preparation process.

We are greatly helped in this because we have a very open way of preparing for the census. There's nothing secretive about the way we do it. About three or four years before the census starts, we start asking data users, demographers, and planners what questions they want us to ask and what kind of data they want us to give them. Without that, we don't finalize the questions. Secondly, in the entire training process of the enumerators, in the publicity we do to reach out to the respondents, and in the field monitoring we do to make sure that the work is going on properly, we always involve the sectoral NGOs: NGOs dealing with houseless people, NGOs dealing with gender, NGOs dealing with persons with disability. They're all part of our census process. And there are the UN agencies, because the UN agencies prescribe the standards of the census for the entire world, and we always work very closely with them.

We can't just go and ask the question and get the data, because there are always biases that you and I may not even be aware that we hold. These are biases that society just carries, and we have to be very careful that the enumerators don't have these biases and that, if the respondents have them, the enumerators know how to overcome them. They need specifically to ask each and every question for each and every person living in that house. So suppose the head of the household is a Hindu. It is not to be assumed that all the members of the family are Hindu. You can't just write Hindu, Hindu, Hindu, Hindu.
We have to ask for each person. That's part of the training. And similarly, for the respondent, you have to ask in such a way as to overcome any bias that person may be showing.

The time for which the census work goes on is very limited. It's three weeks for the field work and then five days for the revision round after the census time. So if we find out a month later that we've messed up, there's no way we can go back and set right what's gone wrong. So we must have perfect and continuous feedback loops. For the enumerators, we have a whole system of supervisors and their superior officers who are always in touch, so that if the enumerator has a problem in the field, the problem is solved immediately. And if the enumerator has a problem with skills, if he or she is confused about some question to be asked or doesn't know how to fill in a form, then she has access to the person who trained her, the master trainer. She can talk to that person right there on the spot, on the phone, and find out how to fill in the form.

Similarly, we need feedback from the public, because if any area is left out, if any person is left out, if there are rumors floating around, or if real mischief is happening somewhere, we must know about it as soon as possible. For that, we have a very broad-based system of feedback. We had the toll-free number this time, which was highly publicized, and we got a lot of response on it. We had help desks in the census offices and in the rural areas, and this year especially, email, Facebook, and Twitter were a fantastic source of information and feedback, because of which we could get better data than we otherwise would have.

Now, I'd like to talk a bit about the processes we follow. The first step when we have to count everybody is to make sure we know where they live. And that means we have to know where every house is, not only where it is physically, but also which administrative unit it falls in, because that's how we need to report the numbers. And for this, we first update all our maps.
So we have with us shapefiles of all the villages in the country, and all the administrative maps are built up as aggregates of those shapefiles. Also this time, in 33 capital cities, we had a special project of building-level, satellite-based GIS mapping. The maps were obtained, surveys of all the roads were done, all the building numbers were applied, and on that basis the enumeration blocks were carved out in these urban areas.

Apart from that, we have a very old-fashioned but extremely useful method of mapping. An A3 sheet is given to each and every enumerator, and the enumerator, while doing the field work, makes a hand-drawn map of the area, showing the landmarks and each and every structure. This turned out to be extraordinarily useful in Uttarakhand, because in the last two or three weeks the disaster management authorities have been asking us for the maps of the affected villages, to see what structures were actually there earlier. That is very helpful today.

On the basis of these geographical and administrative units, the latest maps, and the enumeration blocks we carve out, we create this code, and this code is the key for the entire system of maintaining the data. It is a hierarchical code which goes right down to the household: the state, the district, the sub-district, the town or village, the ward in the case of a town, the enumeration block, and the household. So each and every household in the entire country has a unique code. This is how we ensure that data doesn't go astray from one place to another.

Regarding the questionnaire, there are two aspects. One is the design of the questions themselves: how are you going to phrase the question, and what kind of data do you want from the answer? The other is how you design the schedule itself, which the enumerator is going to fill in.
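
A minimal sketch of the hierarchical location code described above, assuming fixed-width numeric fields. The talk gives only the order of the levels (state, district, sub-district, town or village, ward, enumeration block, household); the field widths and the function names `build_location_code` and `parse_location_code` are illustrative assumptions, not the census office's actual layout.

```python
# Order of levels follows the talk; the widths are ASSUMED for illustration.
FIELD_WIDTHS = {
    "state": 2,
    "district": 3,
    "sub_district": 5,
    "town_or_village": 6,
    "ward": 4,               # 0 can stand for "no ward" in a village
    "enumeration_block": 4,
    "household": 4,
}


def build_location_code(**levels: int) -> str:
    """Concatenate zero-padded level numbers into one unique household code."""
    pieces = []
    for name, width in FIELD_WIDTHS.items():
        value = levels[name]
        if not 0 <= value < 10 ** width:
            raise ValueError(f"{name}={value} does not fit in {width} digits")
        pieces.append(str(value).zfill(width))
    return "".join(pieces)


def parse_location_code(code: str) -> dict:
    """Split a code back into its levels, so data can be rolled up by prefix."""
    if len(code) != sum(FIELD_WIDTHS.values()) or not code.isdigit():
        raise ValueError("malformed location code")
    levels, start = {}, 0
    for name, width in FIELD_WIDTHS.items():
        levels[name] = int(code[start:start + width])
        start += width
    return levels


code = build_location_code(
    state=5, district=12, sub_district=3, town_or_village=456,
    ward=0, enumeration_block=17, household=89,
)
print(code)
print(parse_location_code(code))
```

Because every level occupies a fixed position, all households in one district share a common prefix, which is one simple way a record can be checked against the enumeration block it claims to belong to.
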
Regarding the questions, the most important thing is that we have a whole string of censuses behind us and the data has to be comparable with the old censuses. Even if we ask a new question, as far as possible, if the same data is being collected both times, the questions should at least be compatible. Even if you're asking something extra this time, you shouldn't be losing out on the data which was collected earlier. The questions should be in a logical sequence: the least personal questions come first and the most personal questions come last. That is why, in the population enumeration questionnaire, the personal questions come right at the end.

The phrasing of the question is very important because it has to be gender neutral and class neutral. For example, if you go into rural North India and ask "how many children do you have?", there is a good chance the man who's responding will simply forget the daughters, because that's the way the language is in those parts; he'll think only of the boys. So the enumerator has to ask, "how many sons do you have, how many daughters do you have?" That's how we ensure that we get the right answer.

Here is an example of how we design a question. In the last census, we got several thousand different responses to the question on religion. There were all kinds of denominations, all kinds of strange names, the names of people's gurus, sometimes their own surnames; all sorts of responses came in for religion. Nevertheless, 98% of the data fell into one of the six major religions. Now, our data processing guys are happiest when they only have to look at codes. You have male, female, other; gender one, two, three; just add it up and there's the result. So the more codes you have, the happier they are. And the more codes you have, the more unhappy our social studies guys are, because then you're losing out on all the complexity of that particular subject. So this is the compromise we arrived at.
The enumerator is asked to write the name of the religion in full, in longhand. If the religion is one of the six major religions, the enumerator also provides a code. If it is not one of these six, then the code column is left blank and only the name of the religion is given. How the two, the codes and the longhand answers, are processed I'll explain in a bit.

The design of the schedule itself is very important. The most important thing is that it should be clear and simple even though we're asking 29 questions; the enumerators, after all, may be at the level of a primary school teacher. The enumerator has to fill it in, and the respondent, as far as possible, should be able to read it and understand it without much difficulty. So it has to be designed to be easy to read and clear. The paper has to be of very good quality, because it has to go to every part of the country, be filled in over three weeks, go around in the heat and dust, come all the way back, and then go into the scanning. After that it's even stored for 10 years before it's destroyed. And then the design has to be right for scanning and intelligent character recognition, which is the way we do our processing.

So this is what the schedule looks like, and a lot of the design aspects are very interesting. The first thing is that we use a color dropout. In 2001 we had done the scanning with the usual black-and-white form with black boxes, and the lines of the boxes interfered with the marks made by the enumerator. So this time we used a color that drops out in scanning, so that only the marks made by the enumerator are visible in the final image. The unique location code is right there at the top, where it's very easy to scan, so that the forms are handled properly according to the code.
Side A and side B are marked as clearly as possible, and there's a little cut at the upper corner, just like in a SIM card, so that when you stack the forms and just look at them you can see if you've stacked any of them upside down. That was very useful in packing and also later on at the scanning stage. Incidentally, this design was evolved with the help of the National Institute of Design, and frankly I think they did a great job of the form.

The intelligent character recognition that our software does is only of the numerical characters, and this is how the numerals have to be written by the enumerator. This is in fact printed right at the top of the form so that the enumerator remembers it. For example, the one should not have a kink, otherwise the software will not recognize it correctly.

So the first step in the processing, after all the forms have come back, is making sure that they're stacked correctly, in the right areas and not upside down, for the scan. The scanning is done with high-speed duplex A3 scanners. The paper is exactly A3 in size [unclear]. And there's a barcode-based control system for making sure that all the scanning happens correctly.

This is what the color dropout form looks like after the scanning is done. Before the actual character recognition takes place, each of those images has to be matched to a template, so that the right data goes to the right column in the ICR. And this is how the ICR is done. If it's a long number, it is first separated into individual digit images. Then each digit goes into the software; the name of the software is eFlow. It's very good software, and we used it in the last census also. Each and every character goes through the recognition. After that, all the characters in that batch which the software has read as, say, a one are put together into this kind of tiled screen. And our data entry operators, who are highly trained for this work, just look at these screens one after the other.
So this entire thing has already been processed by the software, and it is looked at again by the operator. So even if a numeral is written like this, which the software may well take to be a one, it is picked up by the operator and sent off to the exceptions. All the ones, all the twos, in fact the entire data goes in front of the operators also. It's not left completely to the machine. After all the exceptions are cleared, the error rate is less than one percent.

Now I come to the other kind of processing we do, because, like I said, everything cannot be coded. This is what we do when we have a longhand answer. In the last census, like I said, we had thousands of religions. All the religions we found last time were given particular codes, so that library of previous codes is already available. When the operators are looking at a longhand reply, a drop-down menu is available. A few letters are typed in, and from the menu you can select the entry that matches what is there on the screen. If it fits, then the code already available is applied to the response. Sometimes it's possible that it's something totally new which nobody has ever heard of. In that case, the operator gives it a new code, and all such new codes go back to the field offices to be verified, because the operator may not have full knowledge; maybe the senior people know better, or the field people know better. So all of it goes back to them to find out if there really is something like that in the field or if it is some kind of mistake. After that kind of data cleaning, this computer-assisted coding is completed.

So when I say that the religion data and the mother tongue data are going to take a lot of time, please don't think that I am trying to make excuses. This takes a lot of time. And the most difficult of all of these longhand answers is the classification of the workers.
That's because the NIC and NCO classifications have something like 60 categories, and it's very difficult to know which one applies. For example, take a journalist. Somebody writes journalist, somebody writes media, somebody writes press, somebody writes reporter, and the operator has to work out that all of them fall into the same category. So this coding is really hard. [unclear]

After that is done comes the time for cleaning the processed data. The software used is IMPS, which is DOS-based, or CSPro, which is Windows-based. This is software from the US Census Bureau; it's free to use, but it's not open source. And these are the checks we have to make. The first is the coverage checks, to make sure that no record has gone into the wrong enumeration block, that no area is left out, and that no area has been counted twice. It's a very tedious job. It has to be done through the code, and it also has to be done manually.

Then come the consistency checks, which are again a very lengthy exercise, because there are all kinds of edit rules we have to apply, and these edit rules have to be evolved out of experience and logic. For example, if somebody is recorded as illiterate, then in the education level you definitely cannot have an M.Sc. All these things have to be written down. Below a particular age, there is no likelihood of a person having a child. If the person is male, the person cannot have something written in the fertility columns. All the edit rules have been developed in this way. There are hundreds of them, and new ones have to be developed all the time. On that basis the consistency checks are done.

Then comes the issue of imputation. The one thing we don't do is unit imputation. If a person has not been returned in the form, if an area has not been covered by the enumerator, we just leave it alone.
And we declare that that area has not been covered. For example, in 1981 we could not do Assam; Assam was disturbed. In 1991 we could not do Kashmir. In 2001 some sub-districts of Manipur gave such bad results that we just had to abandon them. And I'm very happy to report that in this census, after all these years, we have covered practically every single inch of the country; only a very few villages in some [unclear] districts have been left out. So we do not do unit imputation.

On the other hand, we do have to resort to item imputation often. Suppose gender is missing for a particular person. Then there are three or four ways in which we can decide what gender to put in there. One is that there may be other relevant columns; for example, gender is repeated in the literacy column. So if it has been filled in in the literacy column, you can pick it up from there, or we can make a guess from the fertility columns. If it's not available at all, then there are two ways of doing item imputation. One is called hot deck, one is called cold deck. In cold deck, you look at previous data. You can look at the data of the previous census, or you could look at the data of a batch which has just been completed, and on that basis you do the imputation. So if an area clearly has 70% literacy, then all the missing entries can be imputed in the ratio of 70 to 30. Or you could do hot deck imputation, which is picking up a value from within the matrix of the same batch. All these edits are the longest and most tedious part of the work, I can tell you.

Then comes the tabulation. The tabulation plans for the census, which say which tables we are going to release, are always published in advance, because it is on the basis of the tabulation plan that the questions are designed.
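
The two item-imputation approaches described above can be sketched roughly as follows. This is an illustration only: the function names, the use of `None` for a missing entry, and the "nearest preceding record in the batch" donor rule for hot deck are assumptions, not the census office's actual procedure.

```python
import random


def cold_deck_impute(values, literacy_rate, rng):
    """Fill missing entries using a rate taken from EARLIER data.

    `literacy_rate` would come from the previous census or a completed
    batch (e.g. 0.70), so missing entries are filled roughly 70:30.
    """
    filled = []
    for value in values:
        if value is None:
            value = "literate" if rng.random() < literacy_rate else "illiterate"
        filled.append(value)
    return filled


def hot_deck_impute(values):
    """Fill missing entries from a donor record WITHIN the same batch.

    Assumed donor rule: the most recent preceding record that has a value.
    A missing entry with no earlier donor is left as None.
    """
    filled, donor = [], None
    for value in values:
        if value is not None:
            donor = value          # this record becomes the current donor
        filled.append(value if value is not None else donor)
    return filled


batch = ["literate", None, "illiterate", None, None]
print(hot_deck_impute(batch))
print(cold_deck_impute(batch, literacy_rate=0.70, rng=random.Random(0)))
```

Neither function touches records that already have a value, which matches the distinction drawn in the talk: missing items are filled in, but a person or area that was never enumerated is not invented.
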
So in the Indian census, every single question has its own univariate table. Those tables always have T/M/F, that's total, male, female, and wherever applicable there is also T/R/U: total, rural, urban. Apart from all the univariate tables, we also have some multivariate tables: age by literacy, education level by literacy, and so on. Apart from that, we're always ready to develop customized tables at the request of data users. And we also recast the data of the previous censuses on the basis of the administrative boundaries of the new census, so that all the data are comparable.

These are the final tables that we publish. For the houselisting tables there are two series: H for houses and HH for households. For the population enumeration, these are the series: A is population, the numbers; B is workers; C is the social and cultural tables; D is about migration; F is about fertility; and then there are the SC and ST series. Apart from that, we have some beautiful atlases, thematic as well as administrative.

I think a lot of you would like to know exactly what our data dissemination policy is, and about the copyright. You are free to use our data as long as you give us acknowledgement. On the other hand, please don't try to resell or redistribute our data unless you're doing some value addition to it. That's our data use policy. I'm sure most of you have seen our website. If you have not, please do see it. There are lots and lots of things on it.

There's another product that I'd like to talk about, called CensusInfo. It has been developed by DevInfo, an organization which works with UNICEF. What they've done is create software which you can freely download and which also contains the data sets themselves: all the data that has been released till now, plus all the shapefiles. On this basis, you can develop your own tables, your own graphs, and your own maps. You can even make your own maps and superimpose your own data. It's just fantastic. Please do look at it.
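
A rough sketch of how a univariate table with the T/M/F and T/R/U breakdowns described above could be built from anonymized person records. The record layout (the `sex` and `area` keys and their values) and the function name `tabulate` are assumptions made for illustration.

```python
from collections import Counter


def tabulate(records, variable):
    """Count records by (value of variable, area, sex).

    Each record contributes to its own area and sex cells and also to the
    "Total" margins, giving Total/Rural/Urban by Total/Male/Female.
    """
    table = Counter()
    for record in records:
        value = record[variable]
        for area in ("Total", record["area"]):
            for sex in ("Total", record["sex"]):
                table[(value, area, sex)] += 1
    return table


# Hypothetical sample records, not real census data.
records = [
    {"sex": "Male", "area": "Rural", "literacy": "literate"},
    {"sex": "Female", "area": "Rural", "literacy": "illiterate"},
    {"sex": "Female", "area": "Urban", "literacy": "literate"},
]

table = tabulate(records, "literacy")
for key in sorted(table):
    print(key, table[key])
```

The same function works for any question on the schedule by changing `variable`, which mirrors the idea that every question gets its own univariate table.
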
If you need any kind of data that is not available on the website, you can contact our Directorate of Census Operations. There's one in every state capital. Plus, we're doing something different this time. In around 15 universities across the country, we are going to set up workstations this year in which the entire data sets are available. Microdata at an anonymized level is also available there, so you can make your own tables and so on. Two are already set up: one at JNU and one at Punjabi University, Patiala. And I'm sure that one will be set up here too.

These are our print publications. The new ones are all already available online, so maybe they're not that interesting. But I think the very old ones are very interesting, because we have these books from 1872 onwards. Now we're preparing the digital archive of these books on microfilm, and they should be available online too within a few months.

These are the three data releases that have already happened: the provisional population totals, the houselisting data, and the primary census abstract. And soon we'll be coming out, one by one, with age groups, disability, religion, migration, mother tongue, and many more things too.

I had a question on the funnel: a comment asking me to talk about the utility of census data. Well, everybody knows this. The elections are held on the basis of constituencies, and the constituencies are drawn on the basis of the census data. The delimitation takes place on the basis of the census enumeration blocks. The reservation of seats also depends on the census, so this is really important. Almost all other surveys base their sampling frames on the census enumeration blocks, and that's why, I think, new sampling frames are expected very soon. The Planning Commission, of course, depends entirely on our data for projections, for resource requirements, and to check whether the goals have been achieved or not.
And there are three small, specific examples of the impact of census data that I'd like to talk about. There is this map here, showing the child sex ratio in Punjab in 1991 and 2001. As you can see, in 1991 it was middling; it was somewhere in the eight-seventies. It was bad, but not horrible. In 2001, it was less than 800 in all these districts. So it was really bad, and I think the government also had no idea that it had become so bad. If we hadn't done the census, they wouldn't have come to know. But on the basis of that, they did a lot of micro-planning, they worked very hard, they took pains to act against the misuse of ultrasound and so on, and they managed to improve the child sex ratio from 798 to 846 in ten years.

Manual scavenging is something that most government agencies would rather say does not exist in their area. But when small but definite numbers for manual scavenging did come out in the census data, the government made a special policy to ensure that manual scavenging is rooted out.

One more example I'd like to talk about is the electrification of villages. In the 1991 census, a lot of villages were shown as electrified. When the data was examined, only one house in those villages was electrified. Then it was asked: if only one house has electricity, how can the village be considered electrified? And the definition of electrification was changed to ensure that at least 10 houses have electricity; only then is the village considered electrified. So this is how census data helps at a very large scale, and it can also help at a very small scale, to make decisions.

[unclear]

Yes, I have a small question about slums. In one of the initial slides, you showed us that the enumerator actually draws a map [unclear].
So [unclear] is there a policy for that, or are they completely left out?

We have to count each and every house, wherever people stay. Of course all the slums are counted in great detail. In fact, this time we had a special policy. The normal enumeration block is around 125 houses, but this time we gave a special instruction that if there was a slum, even if it had something like 40 or 50 houses, it had to be counted as a special enumeration block, done separately. In fact, we are releasing separate tables on slum data for all these different areas.

You mentioned that there are some 2.7 million enumerators who do the census. And you also mentioned that no data goes outside the census organization. Are there any policies or checks and balances to ensure that the...

To ensure that the data doesn't go out, is that it?

True. No, I want to know how you ensure that the data inside the census organization is also sort of anonymised, because you have such a huge number of people, as you mentioned.

Yes, the 2.7 million functionaries are not our own employees, obviously. They are other government functionaries. We have very high security protocols within our office, in all our offices. All of them are under discipline. All of the forms are always moved securely within the country.

Hi, ma'am. When you have birth registration and death registration, why do we need a census?

The Registrar General of India and Census Commissioner is also supposed to be the registrar of births and deaths for the whole country. But he doesn't, of course, directly control the work. The work is done by the municipal bodies and the district authorities in the states. And unfortunately, birth registration is very far from 100%. There are three or four states where it is nearly 100%, mainly the southern states.
But in a lot of the major northern states, it's like 30% or 40%. So it will take time. It will take time to make sure that the data is very good. Yes, the National Population Register is also something that we are doing. We are ultimately hoping to have birth and death registration also covered within that, and maybe over time that data will be perfect enough for us not to need such a detailed census. On the other hand, the census is not just about the counting. It's also about a lot of ancillary things which you will not come to know unless you actually do it. So overall, I think doing it once in 10 years is not that bad an idea.

I will come to you next.

Hi, back here. My question is, are there also plans for doing a sort of mini-census, which you can do not by sending out a person but maybe by calling or sending an SMS or something? I understand that, for the utility of the data, there would be a lot of noise. But between censuses, is that kind of thing a good idea?

The SRS and the AHS are things that we do continuously. SRS stands for the Sample Registration System. As the other gentleman asked, birth registration is not all right in the country, but we need the mortality figures and the birth rates, and we need them all the time. So the SRS is a survey that we do which is based on sample enumeration blocks all over the country; the sample is carefully selected and the data is collected twice a year. Similarly, the Annual Health Survey is something that we've been doing for the Ministry of Health for the last couple of years, in which we do a lot of detailed nutrition-based and health-based surveys. So yes, there is a method of sampling the important data on a continuous basis. And regarding picking up things through the telephone and so on, we will definitely be moving towards that sometime in the future, at least for some kinds of surveys.
I think in the light of all this, all our data seems small. But I just had a question. You mentioned these interesting cases from the policy perspective. Could you speak specifically about the kinds of challenges, or the protocols that you have seen work, when you are actually using this data to push policy makers to take some decisions? Because that is the value of collecting more data, right? So if you could add something on that.

We just bring out the data and put it out there for everyone to look at. [unclear] So the pushing has to come from [unclear], but we have in fact always had a remarkably good response when we come out with the data. I'll give you another example, which was the Delhi government. In the Delhi data, it was found that the number of households using kerosene is very, very small [unclear], and so the Delhi government came up with a scheme under which people would get subsidized LPG connections. So I'm sure if you do it one year later, you'll find that Delhi has no kerosene at all. So the policy makers do take our data very seriously when they can see a serious problem showing up in the data. That's been our experience so far. [unclear]

Hi. Two questions, actually. One: is there any connection with Aadhaar in this sort of [unclear]? Two: are there any corporate users for this? I can see that people would want that sort of data. Is there a policy for that?

We don't have a separate policy for corporate users. There's a certain granularity at which we are willing to give you data in any form you want it. Anybody who asks for it is going to get it. That's not an issue at all. And regarding our connection with Aadhaar, I'll try to say it very briefly. We have this project called the National Population Register. That's not under the Census Act, where everything is under strict confidentiality. That's under the Citizenship Act.
But we base it on the same identification code. And as you saw, that code goes up to the household. In the case of the National Population Register, that code will go down to every member of the household. So everybody has that, and we know that a particular area has been fully covered and we have the data of all the people living there. Of course, none of the data is really confidential in nature. It's the kind of thing you'll find in the electoral rolls or anywhere else. So the privacy issue doesn't come in there. The privacy issue does come in when we talk about the biometrics we collect with it.
So what the National Population Register mandates is that we first collect the database, which is already done with the first round of the census, and then we go around collecting the biometrics for all those people. The purpose of the biometrics is simply to de-duplicate our own database. And because the UID was already there, this task of de-duplication was to be done by them. There was no point in setting up a wing for it separately. So this task was given to them.
Unfortunately, what happened is that they started doing this field work of biometrics simultaneously, and that has caused a lot of confusion in the field. It seems to people that we are already doing it, and when we say it has not really been done, that doesn't seem to work. So we have to do it. But be that as it may, right now, wherever Aadhaar has already gone and done a lot of biometrics, we are going back into the field behind them and picking up the Aadhaar numbers from the people, so that the biometrics don't have to be collected twice. The National Population Register is a legal entity and it has to be made, and it is of a degree of perfection which the Aadhaar database can never reach. So we will...
Yeah, this is a small suggestion on the phone.
I believe that the census collection can be made more technology savvy. The volunteers who go from home to home to collect the data could probably be provided with some kind of technological equipment, like a laptop or a tablet, so that the collection of data and the quality of data can be handled more seamlessly. And this could also be done in conjunction with the UID and Aadhaar, to make that part more seamless in the office.
So, the Socio-Economic and Caste Census: even though the word census is there, it wasn't done by us really, and it's not really a census. It's a complete survey based on the census frame. And as for the tablets they use, it was we who recommended the tablets, and that has in fact been going on for the last two years. But the census has to be done in a fixed time period. So we have 2.7 million functionaries; we need 2.7 million tablets. That's not cheap. Secondly, 2.7 million functionaries have to be taught how to use them. They haven't reached that level of literacy yet. But maybe next time. Next time it's very much possible that it would be done on tablets.
Hello ma'am, it was a very useful talk about the census. I have a couple of questions. How much data are you storing here? The census has data on about a billion people, so it has to be huge data. So how is it stored, and is there any plan to use the latest technologies to understand the data, to understand correlations, such as big data technology?
What we do is that we mainly use software such as CSPro or IMPS. And of course the archives are huge in volume: the scanned forms take up a lot of space, and the actual data is all kept as it is, because it was done over all those years, and it's a few hundred TB. It's big data in that sense. And that's one reason why I was very keen to attend here, very much to look into the data,
and of course, more importantly, I would love to disseminate the data as well, so that people like you actually use it and, you know, come out with correlations, whatever.
You said you don't make all the data available; you keep the granularity such that nobody can pinpoint the person. So is there any set of data that you would like to be made open to people and researchers, so that they can make good discoveries, but you cannot do that because it can be misused? Is there any such set of data?
Any kind of data is prone to misuse. With everything in the open, you never know what people want to do with it. But one regret I certainly do have to mention is that the shapefiles are all protected by the Survey of India. So we use them, but we can't release them. There are a lot of restrictions on putting out the Survey of India's data in public. If you had that, I think you could do a lot of things with it, but we can't give it to you.
For more questions, please take them offline. Thank you.