I am Venkat Ramana from Achille, Hyderabad. In the next half an hour, we will be discussing various machine learning techniques that can be applied to real-world data. The emphasis here is on social networking sites, because many of us use them and spend a lot of time on them. This talk includes a demo, and it is about how you can build your own learning algorithm for setting your privacy preferences for your friends on social networks.
So, real-world data: there are lots of places where you can say the data is real. Blogs, forums, all of these are data. What concerns us about our own data in these places is privacy, right? How are we going to share data, and how vulnerable are we as persons and as communities, and so on. Social networking sites are already widely available, MySpace for example, and we use them especially for networking. And we are concerned in a lot of ways about the information that we share on them.
The aim of this talk is to create an environment where you form your own rules, right? Because different people have different views on social media. Some want to share everything on the web, some like to hide everything. It depends on how you set your privacy on social networks. And all this happens because of some intuition that you have in your brain, right? You can't explicitly say: these are my rules, so according to this rule I share some information with friends x, y and z, and according to some other rule I don't share some other information. These rules are intuitive, and the question is how you are going to make them explicit yourself.
So we are going to use a technique called active learning for this. The setting in which it is useful is when you have a few labeled instances, meaning a few examples of how you classify the data.
And there is a lot of other data that is unlabeled, so the data set is only partially labeled. For example, you might know a few of your friends on your social networking sites very well, right? So you can set your privacy settings for them, whether you share everything or you don't. But how about the rest of them? Suppose an average Facebook user has 150 friends, and you might know maybe 10 or 20 of them very well. So you might set some settings for these friends, but you don't know much about the others, right? This is one scenario in which you can use active learning. As the slide states, we have a few labeled friend instances, and most of them are unlabeled. You have to learn the privacy settings for the rest of them. In active learning you use a person, what we call a human oracle, to make this work.
There are different ways that you can start learning using active learning. The first one is membership query synthesis. This starts with almost no examples at all, right? Right from the first training example, it is a kind of questionnaire: you pose a question and ask for some input, such as how do you want to share your data with so-and-so friend? So in the first approach you have no information at all to begin with. But we are going to concentrate on the third approach, which is pool-based active learning, for our demo. (The second approach is stream-based selective sampling.)
These are some of the questions that we can ask here. For example: what do you have in common with this friend? The answer can be that he is my classmate, or this is a family friend, or anything else. You may also discard the question altogether, or you can specify something of your own as an answer.
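As a rough sketch of the pool-based setting just described (a few labeled friends, a large unlabeled pool, and a person answering questions), the loop might look like the following. Everything here is an assumption for illustration: the three group flags used as features, the 1-nearest-neighbour stand-in classifier, the rule of querying the friend that is farthest from any labeled example, and the `oracle` function that simulates the user's answers. None of it is taken from the demo's code.

```python
def hamming(a, b):
    """Number of positions where two feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def predict(labeled, x):
    """1-nearest-neighbour stand-in: return (label, distance to nearest example)."""
    feats, label = min(labeled, key=lambda ex: hamming(ex[0], x))
    return label, hamming(feats, x)

def active_learn(labeled, pool, oracle, rounds=2, per_round=1):
    """Pool-based loop: each round, ask the oracle about the least familiar friends."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        # Assumed query strategy: farthest from every labeled example goes first.
        pool.sort(key=lambda x: predict(labeled, x)[1], reverse=True)
        queries, pool = pool[:per_round], pool[per_round:]
        for x in queries:
            labeled.append((x, oracle(x)))  # the user answers the question
    return labeled, pool

def oracle(friend):
    """Hypothetical user rule, used only to simulate answers: hide from SRM friends."""
    project, srm, sshg = friend
    return "hide" if srm else "share"

# Features are hypothetical (project, SRM, SSHG) group flags.
seed = [((1, 0, 0), "share"), ((0, 1, 0), "hide")]
unlabeled = [(1, 0, 1), (0, 1, 1), (0, 0, 1), (1, 1, 0)]

labeled, remaining = active_learn(seed, unlabeled, oracle)
guesses = {friend: predict(labeled, friend)[0] for friend in remaining}
print(labeled)
print(guesses)
```

After two rounds the labeled set has grown by two friends, and the remaining friends get their settings from the stand-in classifier. Any classifier and any query strategy could be swapped in; the shape of the loop is the point.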
And we are going to use this labeling information as an attribute for the training data, in what is called the feature vector.
So this is the pseudo algorithm. L is what we call the known data, the training data. And we have a lot of unlabeled instances; that set is called U. We use a classifier, and this classifier can be anything: you can use a decision tree, a naive Bayes approach, or neural networks. That's your choice. The classifier is first trained on the initial labeled set. Once the classifier has learned something, it asks you questions by choosing unlabeled examples, and you answer those questions. This happens in rounds. After the first round, you have provided answers for some more friends, and these are included in the L set. The classifier is then trained on all of these instances, so it becomes more accurate, because it has more training data. Then there is another round, or you can stop; for example, you can stop the algorithm after two rounds. And you can use what it has learned to set the privacy preferences for the rest of your friends.
So you can form friend groups from the answers that you gave. The features are extracted from what you yourself provide. For example, you label a friend as a family member; that becomes one of the features in the feature vector. After some questions, you stop, and you use these labeled examples to train the algorithm.
There are some concepts we need to know. The first is Gini impurity, and I will take this example to explain it. In this data set, when a user comes to the site, you can use historical data about them, such as the referrer, like Slashdot.
The referrer on that slide is Slashdot, for example. This is all information that you can collect over time: where the user has come from, what they have read, and so on. And ultimately there is the service the user has chosen on the site, such as premium or basic. If you look at the premium rows in the table, there are only three among 15. So if a new user comes and you want to classify them among these subscriptions, and you put them in premium, there is a one-fifth chance that they are placed correctly. So 80% of the time, you will be wrong. This is what Gini impurity expresses. Entropy is more or less the same, but it uses logarithms. The difference between entropy and Gini impurity is that entropy penalizes mixed sets more; the score it gives grows more clearly in magnitude. And we will be using classification and regression trees, so a decision tree algorithm, for our training.
Our application data might look like this. These IDs are nothing but your friends. For example, friend number 123 is assigned to the first group and is not in group 2, and you have chosen to share your data with that friend. So this is some of the sample data that we are going to use.
Okay, so these are some of my friends, which I am fetching using the OpenSocial API from Orkut, actually. And I have already set some values for these friends. You can see these are all the personal profile items that you may want to share or not share. I want to share my religion and email with this friend. I also have a feature called rating here; I am setting a rating of around 70 to 75. And I am assigning the project group. And this field here indicates that this friend is used for training. So this is training data that I am going to use for classification. And there is another friend, and I have also set the sharing settings for this friend.
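To make the Gini impurity and entropy discussion above concrete, here is a small sketch. It uses the three-premium-out-of-15 figure from the talk; how the other 12 users split between the remaining subscriptions is not stated, so the 6/6 split below is purely hypothetical, as are the function names.

```python
from collections import Counter
from math import log2

def gini_impurity(labels):
    """Chance of mislabeling an item when labels are guessed at random
    following the set's own label frequencies: 1 - sum(p_i ** 2)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Same idea, but measured with logarithms: -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# 3 premium users out of 15 (from the talk); the basic/none split is hypothetical.
services = ["premium"] * 3 + ["basic"] * 6 + ["none"] * 6

# Always guessing "premium" is right 1 time in 5, so wrong 80% of the time.
premium_error = 1 - services.count("premium") / len(services)

print("error when guessing premium:", premium_error)
print("Gini impurity:", gini_impurity(services))
print("entropy:", entropy(services))
```

A set containing only one kind of label scores zero on both measures, and both grow as the set becomes more mixed. A tree builder uses either one to pick the split that leaves the purest subsets.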
And these are the things that I want to share. And he is assigned to the SRM group. And I am going to take one more friend and mark him for training, with settings like whether we share this item or not. So we have three friends, and we are going to train an algorithm with these as the training data, and see whether other friends who have similar attributes get their privacy preferences set. For example, I am going to take this friend and add him to the project group, not for training, because he should be classified automatically and his privacy preferences should be set. So I take this set of training data, with groups assigned, and I train the algorithm.
Let's see if it can classify the friends who are not marked for training as well. I am going to classify this friend. Suppose this is a new friend whom you have just added on the social networking site, and you have assigned him to a group of friends. Then, since you have the trained algorithm, it is going to classify him. It says to share your phone. The algorithm has applied the same preferences that I set for another friend. It is just one-to-one here, because I have used the same group for both, so it looks like a direct match. But it can be more complex as well: you can add one or more groups for each person. And I am not using the rating or ranking parameter as a feature in my learning algorithm; you can use that as well, so your algorithm might be more robust.
And I am going to classify another friend. Check. Yes, because these are the same things that I have shared with another friend. These are the settings that we have got.
Let's see the results. These are the groups that I have used: project, SRM and SSHG. These are the features of the feature vector that we were talking about. And a decision has to be made on each of your profile settings.
Like date of birth, zip, religion, and so forth. These are the records that I have generated, one for each of the three training friends, and this is the training that is actually going on. We are going to use these three records, and from them we have formed this tree.
The tree makes its decision on 1: is the value of feature 1 yes? What is 1? It is the index of the feature. So this is zero, and this is one, the SRM group. If the friend is assigned to the SRM group, you follow this branch, and you are not going to share. The number here is the count of training records that reached that leaf. Otherwise you follow the other branch, and you share; that happened once. So if you have a new friend, you just check: if he is in the SRM group, the algorithm is not going to share your data.
The same goes for zip as well. This is the tree: for zip, we are not sharing at any time. So this single node is all we have: not share, three times.
We have another tree for religion. We share it if feature zero is yes, and zero is the project group. So whether a person is in the project group decides whether we share our religion with them.
So these are the results. This is the feature vector that we were talking about; read the indices zero, one and two as the three groups. The goal is to decide to share or not to share for each of these items, so there is a different tree for each of these profile settings. For date of birth, this is the training data for the tree, and this is the tree that we have got. And for zip as well, this is the tree. This is the end result. We can also use a classifier other than decision trees, like neural networks, naive Bayes, or any other algorithm for this purpose.
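The idea of one small tree per profile setting can be sketched as follows. This is a minimal illustration and not the demo's code: the three training records, the share/hide labels, and the function names are hypothetical, chosen so that the resulting trees resemble the ones described above (date of birth splits on feature 1, zip is a single "hide" leaf, religion splits on feature 0). Splits are chosen by Gini impurity, as in CART.

```python
from collections import Counter

# Feature vector per friend: [project, SRM, SSHG] group flags (hypothetical records).
friends = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# One list of labels per profile item, aligned with `friends` (hypothetical).
decisions = {
    "date_of_birth": ["share", "hide", "share"],
    "zip": ["hide", "hide", "hide"],
    "religion": ["share", "hide", "hide"],
}

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels):
    """Return a leaf (Counter of labels) or a node (feature, yes_branch, no_branch)."""
    best_gain, best = 0.0, None
    for f in range(len(rows[0])):
        yes = [i for i, r in enumerate(rows) if r[f] == 1]
        no = [i for i, r in enumerate(rows) if r[f] == 0]
        if not yes or not no:
            continue  # this feature does not separate anything
        weighted = sum(len(idx) / len(rows) * gini([labels[i] for i in idx])
                       for idx in (yes, no))
        gain = gini(labels) - weighted
        if gain > best_gain:
            best_gain, best = gain, (f, yes, no)
    if best is None:
        return Counter(labels)  # leaf: counts of training records per label
    f, yes, no = best
    return (f,
            build_tree([rows[i] for i in yes], [labels[i] for i in yes]),
            build_tree([rows[i] for i in no], [labels[i] for i in no]))

def classify(tree, row):
    """Walk the tree and return the majority label at the leaf."""
    while isinstance(tree, tuple):
        f, yes_branch, no_branch = tree
        tree = yes_branch if row[f] == 1 else no_branch
    return tree.most_common(1)[0][0]

# A separate tree for each profile setting.
trees = {item: build_tree(friends, labels) for item, labels in decisions.items()}

new_friend = [0, 1, 0]  # a new friend placed in the SRM group only
settings = {item: classify(tree, new_friend) for item, tree in trees.items()}
print(settings)
```

With only three training records the trees are tiny, which is why a new friend in the same group gets what looks like a direct copy of an existing friend's settings. More groups per friend, or extra features such as the rating, would give the trees more to split on.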
Most of the examples in this talk are from the Collective Intelligence book. And some of the ideas are provided by a paper.
I am not posing any questions here, right? I am explaining the second stage of the process. The actual learning is how you are going to train your algorithm; that is what I am explaining. The classification part of active learning is what I am explaining. So you can create a UI and pose some questions.
No, no, no. That is why I am taking this example specifically. It is about your own algorithm. You are going to train on your own data, right? So I leave it to you to come up with different rules. For example, in a real-world situation it is not often that you get a new friend every minute. Maybe once in two days you get a new friend request. But you already have an algorithm that is trained on the past data, the past friends. So when you get a friend request, you just put the friend in a group. That is the only thing you do, and you accept the friend request. The algorithm automatically sets the privacy settings for you.