who is here with us today. He is a professor in the Department of Computer Science at IIT Ropar, and he is one of the most, I would say at the least, enthusiastic faculty we have on NPTEL. How many of you have actually seen, or taken, the course called Social Networks that we have? Anybody here? OK. So, Social Networks is a course that he has been offering for the last two semesters, and in the coming semester we are getting two more courses from him. One is going to be a 12-week course on Python programming. It is going to start from level 0, or minus 1 probably, and then get you to actually code in Python. So that is something we are starting, and another course, on discrete mathematics, is coming up, which will be followed by a detailed course, I do not know, levels 0, 1, 2, 3, 4 probably, in data science. That will be coming in January 2019.

Why I say he is one of the really enthusiastic ones is because he changed our notion of how we have to create videos too. If you have seen our NPTEL videos, we used to follow a certain method, and that was what we were offering. Last year he came in and said, let us do this in the background, let us write this, I want to do 6-minute videos. He brought in a lot of changes in how we do things, he also taught us quite a few things, and that is what we have now. He is also awakening the faculty at the newer IITs; he gave us a contact, so faculty from there are also offering courses with us now on the NPTEL platform. So we are very happy to have him here with us today, and he is going to take the next session.

So, hello everybody. I am Suresh Chauhan, from IIT Ropar. My interaction with IIT Madras has grown; I come here more often than I thought I would. It goes back to 2017. It started with a phone call in the middle of the hospital office.
It continued with the fact that the moment I called, they said: come here, make sure you can offer your course. I was just contemplating offering a course, and they said, come and offer your course. And I would say the happiest few moments I have had as a teacher have come from offering this course. You are seen as a better teacher outside the compound of the premises where you teach. There is a saying, and I am sure there is such a saying in your respective languages as well. It means: the plant in the backyard of your house, we think, has no medicinal value, but if it grows somewhere in the Himalayas, we think it can awaken the dead, right? Somehow I realized that the same sort of thing is happening in my case: outside, I connect well with you. With that I even experimented with taking some interns, and with some of them we connect really well; the greater the distance the intern comes from, the better the connection. So I am partial to people from this region, probably because you are far away from my place. You can plan a visit: we have a very dynamic internship program over there. It is a complete package, a very holistic approach towards training interns, which goes more like a proper B.Tech program: you will be taking modules of programming courses, you will be taking part in many research projects, and there will be a whole lot of extracurricular and co-curricular activities.
So, let me jump into the session that we will be having today. I have been experimenting with giving talks on different types of topics, and I have observed the following two things. First, a talk that does not involve a lot of technicalities is safe, because I have also tried crunching data live, bringing my laptop, taking data, and something goes wrong, as you saw many things going wrong in the previous talk. Second, I am much more comfortable without PowerPoint slides. I remember one of my professors who would say: when you use PowerPoint, the point of power is the PowerPoint; when you do not use one, the point of power is you. But then people tend to mistake me for being unprepared, so what I do is flash my slides first, all the slides, say that I am not going to use slides, and then switch to the board. So I will be using the board completely for the whole of today's talk. I am going to keep it very, very low on the technical side, so you are free to ask me questions. In fact, most of the things that I will be discussing will be posed as questions and problems.
Let me start with a nice question. Try giving me an instance of some data which was seemingly useless but turned out to be very useful. It could be anything, from your life or something that you have observed so far: data that you thought was useless but which turned out to be very useful.

A Facebook profile? I think it is very useful. If you are a teenager and the profile is that of someone of the opposite gender, I am sure it is very, very useful. Let us look at it as a spectrum: a Facebook profile is probably somewhere here. All right, let us go looking for something that you think is completely useless. ATM data? I think that is very useful, no? You may see no use in that data: useless for me, very useful for them. Give me an example of data which looks like it is completely useless. I am going to come to the level where I give you an example of data which we think is completely useless but which resulted in a 450-billion-dollar industry today. Whatever you have said so far is actually on this part of the spectrum. Any data that you think is completely useless?

The time of our waking up? You say there is a tea-making bot: whenever we wake up we feed the time into it, and it makes tea for us half an hour after that. OK, so assume you have a bot and it knows your waking-up time. That looks like it is locally useful, between you and the bot. Give me some other information, whatever you think of. I will still say this is somewhere in the same part of the spectrum as Facebook or LinkedIn.

The number of words in a web page? That sounds like something I may not be able to make sense of. Still, I would say the statistics of the kind of words used in a web page might denote the quality of the content or the grammatical prowess of the writer. But yes, it is indeed towards the completely useless side. That is probably a good example.
Sir, the number of steps taken in a day? The number of steps taken in a day, I have to say, probably denotes your fitness levels or your activity levels; maybe that is correlated with your BMI and things like that. The people behind all the fitness gadgets that came and went, Fitbit and such, might want to make use of that. I want something completely useless. The size of my shoes? Is that useless? The average of that will be very useful for a shoe manufacturer. I am just playing devil's advocate, trying to tell you that it is actually useful. Call records? Phone calls, group calls? The number of hairs on my head? There is a nice study which says immunity levels are correlated with the number of strands on the head at a particular age. I am in my 30s, so it may not be very relevant for me, but for someone in their 20s it is very relevant. The length of my jump from the ground? I think you are a sportsman; you are talking a lot about activity-based things. I think this is all very important. Maybe a couple more points: waiting time in traffic, OK.

So let me try giving you a couple of instances. The first instance does not look like data, but it had a very powerful implication. People observed that crime had all but vanished in New York in the late 90s, and they did not know what had just happened in the place. When something positive happens, many people come and take credit; when something negative happens, people run away from it or point fingers. There was practically zero crime in New York, and people were wondering why exactly that was. People had hypothesis after hypothesis. Some said it is because cops are patrolling again and again; the cops said that all the criminals are behind bars; and the churchgoers said that it is God's own
country, and so on. But nobody knew the exact reason. At this time there was an economist who studied the place and understood that something else was going on. When he studied criminals, he observed that they have two properties. Property number one: they are generally teenagers, late teenagers. Property number two: they are mostly orphans, without parents caring for them. So he goes around looking for orphaned late teenagers, who are the likely criminals, and he sees that they are missing in the late 90s. He realizes it is not just criminals who are missing; in fact a superset of criminals, namely orphans who are late teenagers, is missing. Very important data: it is not just the criminals but the two characteristics of criminals that are missing. And then he observes that in the early 1970s abortion was made legal in New York. Because abortion was legal, people could go and get it done; otherwise it was really difficult. Before that, assume someone had an unwanted pregnancy: they had to give birth to the kid, and they did not take care of the kid. So look at how we backtrack with this data point. We look at two properties of criminals: they are orphans and they are teenagers. Do you think all of them are criminals? Not really, but a majority of them probably are. Then you look at what happened to that group, and you arrive at a conclusion. It looked like seemingly irrelevant data, but you reverse-engineered it and found the answer.

Now coming to the second point. In the first case it was just point 1, point 2, point 3, and we got an inference. Now look at the following fact: there are so many web pages in the world, a whole lot of web pages. Finding a web page would be like finding a needle in a haystack, correct? How do you find what you want? There is one beautiful method, and it requires a quick analogy. Let me help you understand the analogy well. Where are you all from? I mean, where are you studying? OK, so most of you know each other?
Maybe not. Let us do the following experiment. I will come to your college. Which college are you from? SLM? Assume I come to your college, take a random person's phone and check his address book, and then take another random person and check his or her address book. Do you think they will have each other's phone numbers? Very rarely they might. It is a big college; probably some students will, right? But then I want to ask the following question: do you think you might have on your phone someone's number whose phone book in turn has this person's number? That is a little more probable than the previous question. I will take it to the next level, you know what I mean, friends of friends: I will check all your entries, go to all their phones and check all their entries. Do you see that eventually I will find this person somewhere, unless this person is an introvert and does not have anybody's number, or nobody has his number? Correct.

OK, a big question that people are trying to ask is: can we crawl the web? Can you guess what crawling the web is? Crawling the web is finding every single web page present on the web, right? This looks like it is not possible, right? What if I create a blog somewhere and nobody links to that blog? So the question is this. Assume you have a whole lot of pages; I am again talking about seemingly useless data. A whole lot of pages: let us say there is a page here on diabetes and another page there on the Bible. These two are unrelated, correct? There could be yet another page somewhere here on self-help. All right, how do I find all possible pages? The question is this: if someone asks me to give them all possible information on cancer that is available online, I would like to give it to them. Imagine we are in the 90s, the early 90s; there is nothing called Google around, right? So how would we solve this problem?
Firstly, we should collect all the information, keep it somewhere, and search through it, going by first principles, correct? How do I collect all the information? It is like getting everybody's phone number: I do not have access to all of it, so what I do is go to him, take his address book, look at everybody's number there, go to their phone books, take their numbers, and so on and so on. That is how your Truecaller works. How many of you use Truecaller? Truecaller is basically an application on your phone which will show you who is calling you even if you do not have their number. That is because, if you have installed Truecaller, you give it the right to take all the mappings from numbers to names; everything is there with them. Similarly, I can probably access the entire world's phone numbers through this nexus. I might leave out a few people, I do not know, but I have a feeling that I can probably access everybody's phone number.

Similarly, in the case of pages, if I want to get all the pages present on the web, I go to one page, look at the links from this page, go to those pages, look at the links from those pages, and so on and so forth. With this I will be collecting seemingly all the pages. This is exactly analogous to your phone book. Links: I write a blog and put in a link to your blog, in your blog you have a link to, let us say, the Prime Minister's page, the Prime Minister's page has a link to some Swachh Bharat initiative, and so on and so forth. So we get all the possible pages; I collect them and keep them.

Now the big question is what to do with them. I have collected all the pages, the collection is done, I have all the pages, let us say a million of them in the 90s. If I type the word Aishwarya there... what does Aishwarya mean?
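The collect-by-following-links step described above can be sketched in a few lines of Python. This is only an illustrative sketch on a made-up, in-memory "web", not how any real crawler is built: the page names, the `TOY_WEB` dictionary and the `crawl` function are all hypothetical, and a breadth-first order is simply one reasonable choice.

```python
from collections import deque

# Hypothetical toy "web": each page name maps to the pages it links to.
TOY_WEB = {
    "my-blog": ["your-blog"],
    "your-blog": ["pm-page"],
    "pm-page": ["initiative-page", "my-blog"],
    "initiative-page": [],
    "unlinked-blog": [],  # nobody links here, so the crawl never finds it
}

def crawl(links, seed):
    """Follow links outward from `seed`, collecting each page exactly once."""
    seen = {seed}
    collected = [seed]
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:      # skip pages we already collected
                seen.add(target)
                collected.append(target)
                queue.append(target)
    return collected

pages = crawl(TOY_WEB, "my-blog")
print(pages)
```

Note that `unlinked-blog` is never collected: like the person whose number is in nobody's phone book, a page that no other page links to stays invisible to this kind of crawl.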
Prosperity, or fortune. So for Aishwarya it should probably give me the meaning of the word. But what exactly do people expect when they type Aishwarya? Obviously they expect to see pages on, let us say, Aishwarya Rai and the like; she was popular back in the 90s too. So how would you give me the best possible page, the one that I am looking for? That is a tough problem. There were search engines like Yahoo, AltaVista and all. What they would do is create huge databases of pages, in the way I told you, by crawling: collect them, and then map out which ones are better. For the word cancer you will get some thousand pages; which are the top 10 pages that we must be displaying? This approach was not really a great one. What if a new page comes up that is better than these pages? So the question of ranking these pages came up. How do you rank these pages? It is like looking up someone by the name Ramesh: you will get 200 people named Ramesh, and then I want to know which Ramesh I am looking for. Say Ramesh is an engineer, a computer science person whom I could probably take as a student: is he really a good student? The question of ranking comes in here.

And this again is where I would say, on the spectrum, the data of which page refers to which page looks completely useless. What use could that be? What use is it for me to see that? There is a key point, though. By looking at your phone book I can say how many relatives you have, how many friends you have, whether you are an outgoing person. Can you say something similar by looking at the references across pages? You may not be able to answer any basic question with that. But this little bit of reference data is what resulted in a 450-billion-dollar industry, and that is what I am going to explain now. So, Google: Larry Page and Sergey Brin were at Stanford then, and they asked a very simple question, which turned out to be old wine in a new bottle; the idea was known back in the 70s, in the 50s itself. They asked
the following question: given all these pages, how do we find, when someone types, let us say, the word bank, which page to display? When someone types Aishwarya, which page to display? They said, let us try formulating a definition for popular. What is popular? OK, let us go by first principles. What do you think is popular? What do you mean by popular in English? Something that is influential? OK, but that is not quite the ingredient. I would say, and this is very true, someone is popular if someone says he is popular. You may not agree, but you will soon agree once I tell you how this very small recursive definition is used everywhere. You know the expansion of GNU? GNU's Not Unix. And there is GNU again in there; what does that stand for? GNU's Not Unix. It is a very recursive definition; there are many such abbreviations. So who is popular? That person who is being called popular by popular people. Let me write this down. A definition of popular: the popularity of a person I am going to define as the number of people who call this person popular, and they must themselves be popular. What do you mean by popular there? Again, they are popular if popular people call them popular. It sounds like a circular definition, absolutely useless. This definition is what created today's Google PageRank. Can someone try figuring out what one means by this seemingly useless, makes-no-sense definition? OK, fine. I am just talking about pages here: a popular page is a page which is being pointed to by popular pages.

There is a very comical cinema dialogue, which I am sure many of you will understand, though it may get complicated. There is a popular movie you may have seen where Vivek, the comedian, is talking to someone and says, "The DIG? I know him." The moment he says he knows the DIG, people are impressed, because knowing is usually mutual: "I know him" suggests that he also knows you. But then later on
he backtracks by saying, "But he does not know me", and then he gets a slap. So if the DIG knows you, then you are popular; "you know him and you are popular" really means his popularity is high. If, let us say, a VIP comes and says you are great, a bigwig like a prime minister comes and says, here is a student who has done so many things, you immediately gain popularity, because a popular figure is calling you popular. But then what makes that person popular? X is popular because Y and Z have called him popular. How can we capture this? How can we use this to show that, look, this page is probably more sought after than that page? It does not look like this definition will be of any help. Trust me, this definition does not just help; it is what made Google big. And also, if you go to Spotify or even YouTube (YouTube is not a good example, Spotify is a good example), it has a really good music recommendation system. Guess how it recommends music. It looks at what you have heard, and based on that it suggests a new song. That looks easy, and you can write a piece of code which does exactly this, but how do we ensure it is state of the art? It simply copies this definition. I will get there after 10 or 15 minutes, and I will get to how exactly Spotify does it.

But getting back to this, let us see how the definition can be used to show which pages are popular. I am writing down a nice example for us to look at. Here is an example with a few pages, let us say A, B, C, D and E. What are these? Web pages: page A, page B, C, D, E. It is easier on your mind if you imagine these as people rather than pages; A pointing to B stands for person A recommends person B, person A vouches for person B. So A has a link to B, which means A is a web page, and on that web page there is a reference to B: you put a hyperlink there saying, here is a cool page, please click on the link. From B you have one to C, from C you have one to, let us say, D, and from D one to C. I am going to ask you a very cool question; keep observing the links properly.
Then from C to A, from D to A, E to A, D to E, and B to E. OK, my question for you all: if this is the kind of recommendation that you have, who is the leader in this team? Are you sure? How about a different answer? Someone is saying B. How about C and D, who is more popular? Equally popular, because C is saying D is good and D is saying C is good? Are they equally popular, or differently popular? Can you rank all these nodes? Which comes first for sure? Someone said B also. OK, let me write this down. Most of you say A; I will go by the majority. All right, what is next? If there is a tie, how will you break it? C, and then E, and then B, and finally of course D. OK. The actual ordering, which I am going to explain very soon, is: A and B are exactly the same, then C and E are exactly the same, and finally it is of course D. You got the first and the last right, but A and B are exactly equally popular.

Now how do we come up with a metric for this? It sounds like the definition is not usable; it does not seem to work here, right? Let us try to quantify "the number of popular people who call this person popular". Look at A: many people are vouching for him. But A says B is good, and B says C is good. My intuition says: when I am popular and I say he is good, and he in turn says someone else is good, then he gets a good amount of my popularity and that someone else gets some amount of my popularity. Note the words "good" and "some". I say you are good, you say he is good: you get a good amount of my popularity, he gets some amount of my popularity. My popularity is sort of transferred to him through you, but it is slightly less than what it was for you. Take a rich dad's son who is my friend: it is very advantageous to have a rich dad's son as my friend, but the advantage he has from his dad is slightly more than the advantage I have from him as my friend. So you see, the effect is transitive. OK, the algorithm is the best part. But a quick story before we go further, very relevant to what we are talking about. In 1922
there was a very eye-opening research article written on what are called simultaneous inventions. The authors observed that almost 150 of the top inventions and big ideas were all simultaneous. Inventions in applied technology: simultaneous. Calculus: simultaneous. Darwin's theory of evolution: simultaneous, within roughly a year. The television: four people simultaneously invented it. Oxygen: two people simultaneously discovered it. Now in this case, whatever we are talking about right now was available in, let us say, maths, physics and economics long, long back. It is called Markov chains. How many of you have heard of it? Markov chains. But then in the 90s a different community took a look at this problem and solved it very, very easily. They gave a very straightforward solution that even a 5-year-old can understand.

The solution is this. When I describe it, imagine not just 5 nodes but 500 nodes; it will be easier on your mind to understand the algorithm. The algorithm is very simple. Put a fellow on A, let us say, or wherever you want, your choice. Do you observe that the figure is such that you can go from anywhere to anywhere? So put a person on A and get him drunk. Yes, I said get him drunk; he should be sloshed. What he does next is called a drunkard's walk, also called a random walk. What he must do is start from A and go to some other node that he is allowed to go to. From A it is one-way traffic: he can only go to B. Then from B he can go either to E or to C. From C he can go to A or to D; let us say he goes to D. Then from D he can go to A, C or E; let us say he goes to A once again. Once you come to A you always have to go to B; whoever comes to A goes to B. Then from B he goes to E, comes to A, goes to B, then C, then A, then B, E, A, and so on and so forth. Imagine 500 nodes with me running this drunkard's walk. Can you guess what I am getting at? Where am I
going? So imagine I am a professor and I go to the lab often to see my PhD students. I can build a distribution from how many times I randomly visit the lab and who is there. Based on that I probably have an indication of who comes to the lab often and who does not. I go in at random and make a note: I give one point to each person who is there. The next time I come, two people are there, and I give a point to each of them. This is the way my mind works. Then I call one person and say, you are always in the lab, excellent, things are going well for you. To the other person, who is never in the lab, I say, you are probably regularly in the mess but not in the lab; wake up.

So, similarly, do you see what I am getting at? This drunkard who walks is looking for the most visited node. Not just that: whenever he goes to a node, he increments its value. If it is B, he does B++; every time he visits he adds plus 1 to that node. He keeps doing it, and finally you observe the counts. You can do this yourself; a small simulation, a straightforward program, will tell you that A accumulates about 0.285, which means if he walks a million steps, about 285,000 times he will be on A. B is again another 285,000, then C will be so much, E will be exactly the same as C, and D will be the least, about 85,000, if he walks for a million steps. I am just giving you the distribution of visits.

What is the story so far? You see who is pointing to whom, then take a random walk on it, and now you are able to say who is popular and who is not. I will give you a nice intuition for what is happening here. When you take a car and drive around Chennai city at random, if you keep visiting one road very often, more often than other places, that road matters more. Makes sense, right? There are several other parameters to consider, but with all of those set aside, this is a cool way of saying that since this road comes up more often, it should rather be commercially exploited.
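The "small simulation" mentioned here might look like the sketch below. The edge list is an assumption: it is my reading of the links as they are spoken in the talk (A to B; B to C and E; C to A and D; D to A, C and E; E to A), chosen because it reproduces the quoted figures. The function name `drunkards_walk` and the fixed random seed are likewise just illustrative choices.

```python
import random
from collections import Counter

# Who points to whom, as reconstructed from the talk (an assumption).
LINKS = {
    "A": ["B"],
    "B": ["C", "E"],
    "C": ["A", "D"],
    "D": ["A", "C", "E"],
    "E": ["A"],
}

def drunkards_walk(links, start, steps, seed=0):
    """Walk `steps` random steps and return the share of visits per node."""
    rng = random.Random(seed)        # seeded so repeated runs agree
    visits = Counter()
    node = start
    for _ in range(steps):
        node = rng.choice(links[node])   # stagger to a random outgoing link
        visits[node] += 1                # the "+1" left on every visited node
    return {name: visits[name] / steps for name in links}

scores = drunkards_walk(LINKS, "A", 1_000_000)
for name, share in sorted(scores.items(), key=lambda item: -item[1]):
    print(name, round(share, 3))
```

For this edge list the long-run shares can also be worked out by hand: 2/7 (about 0.286) each for A and B, 6/35 (about 0.171) each for C and E, and 3/35 (about 0.086) for D, so the simulated values should settle close to those numbers.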
You should put a petrol pump there, a hotel there, and so on: facility location, right? Now I have a serious question: how can we use this to rank web pages? Simply take what is called the web graph: you see who is pointing to whom. If my page has a link to your page, it means I am recommending your page. If the White House web page has a link to my home page, then I am very, very popular, obscenely popular, correct? So you look at who is pointing to whom and keep running a random walk on it. You know the net is very dynamic, right? Some pages come, some pages go, but you keep walking and maintain a huge list. Whenever someone types Aishwarya, or let us say India, you go to that list, search for the word India in all these pages, fetch those pages which have India, look at the drunkard's-walk points of these pages, sort them in descending order of points, and display them. This is precisely what Google did, and the search was really, really fast, extremely fast. Google simply maintained all the pages that it crawled (it is crawling even now, right?) along with the drunkard's-walk index, and based on that it showed the output.

Do they do that even now? Yes, but this is just the core idea; they have gone beyond that as well, as you heard in the previous talk, right? They also look at, for people who searched for a keyword, which link they clicked on, etc. We can only guess; we do not properly know. But the fundamental idea is basically this.

Now I want to do one more experiment before I conclude in the next 10 minutes. Can you think of some other places where you can use this as well? Firstly, notice that the data was seemingly useless: which page points to which page is definitely more useless than whatever we spoke about earlier, correct? And from that we worked out which page could be the most important, by using the seemingly useless definition of popularity: the popularity of a person is the number of popular people who call him popular. So it matters: 100 people calling me
popular versus just 2 people calling me popular who are themselves popular, correct? Now you tell me if you can imagine a situation where you have a problem like this and you want to apply the same idea. What is the situation, and how can I use this there? Twitter? Good point, you are right, I will come there. On Twitter, who follows whom is a network; on that network I take random walks and then I show who is popular. But how will that tell me whom to follow? OK, more guesses. Good, I will come there. More guesses? When you have many options to choose from? OK, so you can use this to go for the more popular one, to see which one is popular. Exactly.

"In our college we have to choose our teachers for every subject, but we do not know the teachers by name. Our seniors know which teachers are good, so they suggest some names, and we ask two or three seniors who the best teachers are. That is how the entire teacher selection works." So you can say there is an algorithm in place there, a recommendation for who would be the next teacher for which subject. "And there are some teachers about whom they simply say, if you take this one, your life will be good."

Good. So that may be a recommendation system, but then how would I run an algorithm like this on it? The question I want answered is this: give me a network, a graph like this one, on which I can run a drunkard's walk and make use of it. What is that? Right, it is sort of similar to what he said; I will come there. This is how Spotify works; I will get back to you on that. But looking at a network and a drunkard's walk, can you pick an application? I will start giving you applications one by one. OK, so PageRank is one algorithm, on the web, but the algorithm opened up a brand new gateway for more applications. If you look at a cell in your body, OK, look at it: what does it contain? A cell
actually has a bunch of proteins interacting with each other, and the interactions, if you see, actually form a network. Which protein interacts with which protein is a friendship network like this one, inside a small cell, right? Now, if you come up with a drug which attacks the whole cell, the cell dies completely, but with such a way of killing this cell, other cells might get killed too. How about targeting one protein instead? It is like a bunch of people, a crowd, a mob, creating a lot of ruckus. Maybe you want to go to a couple of people over there and console them, convince them, or rather get them out of the crowd, and the crowd might then disperse. It is more like a detonator: how do you switch that off, correct? Now, it is observed that in your cell a high-PageRank node plays that role. Understand what I am saying: let us say you have a cancerous cell. This is your cell and these are the proteins interacting with each other; there will be roughly a few hundred of them. You run this PageRank algorithm there and you will find that this one is the most important protein. Now create a drug which will target only this protein. Then it is as if you kill the entire cell. So the PageRank of a node is known to say a lot more about the entire structure than just about that node.

A second example, from a huge experiment: if you look at the neurons that are firing in your brain (neurons are all connected, correct?), the neuron that fires a lot generally has a high PageRank, roughly. A lot of work goes through it, and if you go and kill that one, a lot is affected. That was the second example. The third example, probably the most important of the examples I have given you so far, is how your Spotify works. This is actually the problem: you have heard a bunch of music, you have heard a few songs. So on
the left side I have all the people, and on the right side I have all the songs that you have heard. Of course it is a billion-dollar question, recommending which music to listen to. That is what Netflix does too, right? Recommending a new movie for a person: the more movies you watch, the more money they make. Which movie should I recommend to you on Netflix, which song should I recommend to you on Spotify or any other music service? Let us give it a thought. Now again, it is a very recursive argument; you will find it is like the previous argument for a person's popularity.

So let me look at this. Here is a bunch of people and here is a bunch of songs. Pretty straightforward, absolutely no technicalities, quite commonsensical. This person has heard this song and this song. And this fellow has heard this song, this song, and one other song. Commonsensically speaking, here is a very trivial situation: don't you think that this should be the song recommended for this person? These two people have heard the same bunch of songs, but this fellow has heard one more song; maybe that should be recommended to the other.

Now let me capture this idea in a straightforward English-like statement. I like the song, and the song is liked by me. It is like grammar in Sanskrit, active and passive sentences, but here there is actually a huge difference: you can give the song a few points for being heard by me, or you can give me a few points for listening to this song. If I have heard this song and he is like me, then you should recommend this song to him. How do you say he is like me? Look at the kind of songs we have heard. Songs will
tell you who is alike; that is how it works. Songs will tell you who are alike: have you both listened to the same kind of songs? Then you both are alike. And if you both are alike and one of you listens to a new song, that song should be recommended to the other. A person is popular if a popular person says they are popular: a pretty straightforward idea, and it became a super-hit algorithm. It is called the HITS algorithm, you can go take a look at it; it is also known as hubs and authorities, where you look at pointers from this side to that side. Again, today this is a billion-dollar industry called recommender systems.

Then there is the question of who could be your next friend. That is how Facebook has built its huge infrastructure; it is called link prediction. More friends means more stickers, more messages, more time glued to the monitor; and the more you are there, the more data you are adding, and you get other people interested as well. Everything boils down to you having a lot of friends.

Back in the 1990s, if I remember right, there was a nice piece of research which said the number of friends a person can have is about 130. That is called Dunbar's number, you can take a look: D-U-N-B-A-R. It is proportional to the size of your neocortex: bigger the brain, more the friends; smaller the brain, smaller the number of friends. People observed this. But it is only about 130, and online we have a whole lot more. Facebook has cashed in on this idea of friend prediction: it shows who could be your next friends, and they do it in a way that you cannot resist clicking on a suggestion and adding that person; at least one in three you will definitely add. And more friends means more data and more distraction, and more distraction means you stay glued to Facebook.

So, the very idea of PageRank; let us just recap whatever we did. The very idea of PageRank originated from looking at what I call dustbin data. I call it dustbin data because
how on earth did we create this data? On purpose? No, it just happened. Who links to whom: I did not link on purpose, I just thought it would be good to link to this Wikipedia page. And that gave me access to all possible things. That is number one. Number two: I now had all the pages, and I asked the question, which page is important? It was not easy to answer this question. Then I formulated a definition: a page is popular if popular pages say this page is popular. And I tried formulating a mathematical problem out of it. I said the drunkard's walk helps me give points to each node, and I can see which node has more points and which node has fewer. I can rank all of them properly, something that we could not do just by looking at an example. With A, B, C, D, E we got it wrong, we got it sort of messed up: although we got A and E properly, in between you had shuffled the order. Correct? But the formula gave us exact answers. And then we got PageRank, and beyond that, examples where PageRank is taken to the next level.

All right. Today I would easily say there are more than 100 applications of PageRank, even beyond whatever I mentioned: protein interaction networks, or Spotify as a recommender system, or finding which of the firing neurons has a high PageRank. They include something like text summarization. Let me give you a quick insight into this and then we will stop the talk.

Text summarization: I want to know everything about Indira Gandhi. What I do is go to a news repository and take everything that was ever written about Indira Gandhi. What do I get? I take a sentence and look at what are called the stop words, the small prepositions and so on, and remove all of them; they don't say anything, correct? Stop words and non-stop words: remove the unwanted words and retain only the significant words. Then I take one sentence and another sentence and put a link between them if they have a few words in common. What did I do? Let me start from the
beginning. Take all the news articles about Indira Gandhi and see them as sentence 1, sentence 2, sentence 3, and so on. Cull all the unwanted words, and then look for commonalities across sentences: if two sentences have more than, say, three words in common, I put a line between them. I create a network, and there I run PageRank. I take the top 10 ranked sentences and I say this is the summary of all the news articles about our prime minister of the 1970s and early 1980s. I can summarize in just half a page when you actually had 10,000 lines of information. How? Exactly what we learnt from PageRank: just run a drunkard's walk over the sentences, and you have a very ingenious way of picking the top 10 sentences. Correct?

There is also some wisdom to learn from PageRank. The algorithm says you are popular if popular people point to you. We probably have to build our contacts by knowing who is popular and who is not, and be pointed to by them; it matters who tells you that you are popular. There is a nice saying: along with the flowers, even the thread ends up in the garland. The garland is made up of flowers, and the flowers are what is important there, but with them comes the thread, because the thread is associated with the garland. So this association is very important: if you associate with the right people, even few in number, your PageRank will probably be very, very high.

Okay, with this I end my talk. The summary is: data that was considered seemingly useless became very, very valuable. We do not know which data hides what gold mine; we just have to give it a second level of thought. Anyone who is interested in the maths part of this Google PageRank algorithm, please talk to me after the talk; I can give you some pointers. Thank you very much. Any questions?
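The summarization recipe from the talk (drop stop words, link sentences that share enough words, rank the sentences with PageRank, keep the top few) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the speaker's implementation: the stop-word list is a tiny stand-in, the names `significant_words`, `build_graph`, `pagerank` and `summarize` are hypothetical, the toy sentences are made up, and the damping value of 0.85 is the commonly quoted default rather than anything stated in the talk.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to",
              "is", "was", "for", "by", "with", "at"}


def significant_words(sentence):
    """Lowercase the sentence and keep only the non-stop words."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOP_WORDS}


def build_graph(sentences, min_common=3):
    """Link two sentences when they share at least `min_common` significant words."""
    words = [significant_words(s) for s in sentences]
    neighbours = [set() for _ in sentences]
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if len(words[i] & words[j]) >= min_common:
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours


def pagerank(neighbours, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected sentence graph."""
    n = len(neighbours)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / n] * n
        for i, nbrs in enumerate(neighbours):
            # A sentence with no links spreads its points over every sentence.
            targets = nbrs if nbrs else range(n)
            share = damping * rank[i] / len(targets)
            for j in targets:
                new_rank[j] += share
        rank = new_rank
    return rank


def summarize(sentences, top_k=10, min_common=3):
    """Return the `top_k` highest-ranked sentences in their original order."""
    rank = pagerank(build_graph(sentences, min_common))
    best = sorted(range(len(sentences)), key=lambda i: rank[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]


# Made-up toy input, only to show the call.
toy = [
    "The minister opened the new railway bridge today.",
    "The new railway bridge connects two districts.",
    "Farmers welcomed the railway bridge and the minister.",
    "Rain is expected tomorrow.",
]
print(summarize(toy, top_k=1))
```

In the toy input, the first sentence shares three significant words with each of the next two, so it collects the most points; the unrelated weather sentence ends up with the fewest.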
I heard a few of you had questions, things to be clarified, which we can ask now.

On the definition of popular: can the word popular be replaced by any other alternative?

Okay, right. People have done this work: there is a crush network in a classroom. They look at the crushes that teenagers have had, and there popularity as such does not even make sense, right? But still you take a random walk along who has a crush on whom, and you will see the person who is most important there. That need not be the person with the most crushes. Say 40 boys have a crush on a couple of girls, and these two girls have a crush on you; then you are very popular there. You see, the random walk will come here and then go to this side. Correct? That is a good question. So popularity can in fact be replaced by another word, like hatred, and then it gives the most hated person; in a crush network it gives the most likable person, not necessarily the one with the most incoming links, but in a very different sense. Good question.

You said that the links number something like 30 million, correct? So won't this algorithm get more complicated at that scale? How will it scale up?

Okay. The details are very cute mathematics, but it is straightforward. If you have n objects in a well-connected network, then by taking a random walk you can explore all the objects in about n log n time. n log n is the complexity; pretty straightforward. In n log n time you can go and find all the nodes. So firstly, you find all the nodes. Secondly, the idea of computer algorithms, especially the applied ones, is the tolerance they have for errors. I don't care if a couple of nodes are missed or removed; eventually I will reach all the nodes, and it is still n log n. When I am building a new database I am okay with not being perfect there. That is the idea of what is called an approximation algorithm. An approximation algorithm is all about this: if you want to
be perfect, it will cost you 1 crore rupees; if you can afford some 1% of imperfection, it will cost you 10 rupees. An approximation algorithm is like that. When you keep taking a random walk you are not perfect, but you get to do a whole lot of things, and the cost is very low. Still, a very good question.

If we take the inverse of the concept of popularity that you described, is it possible to find a ranking for that?

Good question; that is what I was referring to as the hatred network. If you replace popularity by hatred, you will get the most hated person.

But then there is the dark web, the pages which are not referenced by any other page. Google will not show them because it does not even know they exist. This network probably comes under that umbrella, if I am right: they are isolated, nobody points to them.

That is the inspiration for reservation. I don't mean only our system in India; I mean reservation in general. If on the random walk you don't reach a few nodes, you take some jumps here and there: you teleport yourself, vanish from a node and then appear at some other node. That is what the Google algorithm does: every now and then the walker vanishes, appears at some other node, and then moves ahead. And as it moves it also uses a typical income-tax mechanism. Once it has populated all the nodes with drunkard's-walk points, it penalizes everyone by, say, one percent of what they have accumulated, takes that, and distributes those points equally to all the nodes. That is probably why we need income tax in general. In developing or slowly developing countries there is a lot of reservation, and income tax is very intuitively seen as an analogy.

In the example you showed, with scores like 1, 1, 1, 2, 3, A and C have the same popularity. How do you decide between them?

It is very unlikely. With 5 nodes there was a tie; if you have 500 nodes this is mostly not possible. But how can you
break that tie, even at that level? If you have some 3 nodes you will have a tie, with 5 nodes you will have a tie; with 5 million nodes a tie is very improbable.

In the sentences example, a verb and a noun can look the same, and the link will still be placed between them. For example, Ball can be the name of a person, and a ball can also be a thing.

Yes, word-sense disambiguation is a very popular concept. Look at the sentence "I like you because you are like me": the word "like" is used in two very different senses. So word-sense disambiguation is a huge research topic, and it does a very good job of disambiguating what is what. If it is a proper noun, I don't think we will face any problems. But in a common sentence there will be a "like" here and a "like" there which are not used in the same sense. We all know the word "love" is used in many, many senses; there will be some 10 connotations of the word "love", and we cannot simply say it is a common word and go ahead. But still this works, because the noise kind of cancels out over a large amount of data. That is the interesting part. Very good question.

Can you find the 10 least popular ones?

Absolutely. You have already told me the algorithm: we were taking the top view; take the bottom view. In fact, say I want to do charity, I want to give some money to some organizations: I want to find the bottom ones and help them out. I look at the first three in ascending order, not descending order. They are the ones which don't have visibility.

How will that work? Let's do one thing: I will take the railway network of India and look at the three lowest-ranked railway stations. You will observe that this is a place where connectivity is kind of poor; you cannot reach there very well. And you ask what the reason could be: it is a very populous city, but still there is a very low PageRank for
the stations here. Maybe I should put a new connection here. This is called link farming, if you can actually do that, and people are doing it. A popular person says you are popular: come on, you call me popular and I will call you popular; you scratch my back, I scratch yours. It is called link farming. And how do we avoid that? Two blogs cannot cross-link each other; it can only be one way, not both ways, and we can detect if someone is doing that, the "I will help you, you help me out" arrangement.

So, in general, the last two questions: one is about low PageRank, and the second is about link farming, where we may have to put in a new edge or remove a few edges. I am available offline; maybe we can take more questions there.
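The "teleport and income tax" picture from the Q&A, together with the bottom-view ranking, can be sketched as follows. This is a rough illustration under assumptions rather than Google's actual procedure: the function names `taxed_pagerank` and `lowest_ranked` are hypothetical, the four-node graph is made up, and the tax rate of 0.15 is the commonly quoted figure (the talk mentions one percent only as an illustration). Each round, every node passes its points along its outgoing links, then a fraction `tax` of all points is collected and shared equally among all nodes, which is what lets a page that nobody points to still receive a score.

```python
def taxed_pagerank(out_links, tax=0.15, rounds=100):
    """out_links maps every node to the list of nodes it points to.

    Every target must itself be a key of out_links.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    points = {v: 1.0 / n for v in nodes}
    for _ in range(rounds):
        passed = {v: 0.0 for v in nodes}
        for v in nodes:
            # A dead end has nowhere to go, so its walker teleports anywhere.
            targets = out_links[v] or nodes
            for t in targets:
                passed[t] += points[v] / len(targets)
        # Income tax: collect `tax` of everyone's points, share the pool equally.
        pool = tax * sum(passed.values())
        points = {v: (1.0 - tax) * passed[v] + pool / n for v in nodes}
    return points


def lowest_ranked(points, k):
    """The bottom view: the k nodes with the fewest points, in ascending order."""
    return sorted(points, key=lambda v: points[v])[:k]


# Made-up example: nobody points to D, so it lives only on redistributed tax.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = taxed_pagerank(graph)
print(lowest_ranked(scores, 2))
```

Here D has no incoming links, so its whole score comes from the equal share of the tax pool, and it comes out at the bottom of the ranking. Adding an edge that points to D and rerunning the function is one way to see how a new connection changes a low-ranked node's score.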