Welcome to the third day of this workshop, somewhere near the midway point. I hope you have been learning useful things, if not all the time, at least half the time. So let us continue that journey. Today I have planned a one-and-a-half-hour lecture on security assurance from a system administrator's perspective. I can do it without a network connection and without live demos, because those are always a risk; everything is in the slides. But luckily, since everything is working, I will also switch once in a while and show you live demonstrations. I have divided the talk into three parts. The first two, the good and the bad, are just to give us perspective and set the stage; hopefully we can be done with them in about ten minutes. The third part is what we call the ugly. Everything looks good from far. The ugly shows up as you go closer and closer: how do you defend, how do you build your services on the network so that security is not an afterthought? Security has to be there while you build the design. And the most important message is that building a good, secure architecture for your network and services is not enough. What happens after that? You have to watch: vigilance, audit, react, and not later, not after the horse is gone. That is the focus of the lecture. Assurance means that I want a guarantee, not just best effort; IP is best effort, at the level of the IP datagram. So security assurance can be given. You have to assume along with me that today we are all system administrators whose job it is to do this. Although we are all faculty and teachers, put yourself in the place of a system administrator whose management, the director or the dean or somebody, keeps asking all these questions. You have to answer. That is the role. How will you give that answer? How will you give that confidence that what is going on today in IIT Bombay's network is secure, is safe, is stable?
So we will go into that. Okay, for the third part, and I am sorry I didn't put those two words on this slide, there are two hopefully new things which you may not have seen before and which I would like you to take away from this lecture, in terms of learning and of teaching in a course. The broader name for the first one is firewalls, but the tool I am going to use is called iptables. It is not a quiz, just a quick show of hands: how many of you have used or worked with iptables in some form before? Okay, a few, that is good. So please bear with me if it is a little slow or introductory, but it is an important part, where we say how to configure the network in a secure way, to restrict access and control who can do what. iptables is like a Swiss army knife: you can do so many things, and we will give a flavor of that. That will come about 60% into the talk. For the next thing I would again like a show of hands. The broader framework is centralized logging and log management, but the software that I will introduce to you is called rsyslog. Has anybody used that? Syslog, you all know syslog? So this is, instead of network-based intrusion detection, a system or host-based intrusion detection. The setup for that means you have to know what is happening on the network, what is happening on your servers, what is happening with your users, who is doing what. Those are the two pieces of software whose power and scope I would like you to take with you. When we teach a course, exploring them in more detail will be good. One last remark before I proceed: tomorrow's lab is not on these two, because it is very hard to set them up on a single PC and give you access to the whole environment. I will give a few demos today. In tomorrow's lab you will work with two other pieces of software which are subpieces of this and which help to build a bigger and more secure system.
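To give a flavor of the "who is doing what" idea behind host-based log watching, here is a minimal sketch in Python. It is not how rsyslog or any real intrusion detection tool works internally; it only counts failed SSH logins per source address in a list of log lines. The log lines, host names and addresses are made up for illustration, the threshold of three is an arbitrary assumption, and the exact wording of sshd messages can vary between versions.

```python
import re
from collections import Counter

# Typical shape of an sshd failed-login line (assumed; wording varies by version).
FAILED = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+")


def suspicious_sources(log_lines, threshold=3):
    """Return {source_ip: failures} for sources with at least `threshold` failed logins."""
    counts = Counter()
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            counts[match.group(2)] += 1  # group 2 is the source address
    return {ip: n for ip, n in counts.items() if n >= threshold}


# Made-up sample lines: hypothetical server "srv1" and private addresses.
SAMPLE = [
    "May 19 02:11:01 srv1 sshd[811]: Failed password for root from 10.5.6.7 port 40122 ssh2",
    "May 19 02:11:04 srv1 sshd[811]: Failed password for invalid user admin from 10.5.6.7 port 40123 ssh2",
    "May 19 02:11:09 srv1 sshd[811]: Failed password for root from 10.5.6.7 port 40125 ssh2",
    "May 19 02:12:30 srv1 sshd[902]: Accepted password for alice from 10.9.9.9 port 51000 ssh2",
    "May 19 02:13:02 srv1 sshd[915]: Failed password for bob from 10.1.2.3 port 42000 ssh2",
]

if __name__ == "__main__":
    print(suspicious_sources(SAMPLE))
```

In a centralized setup the same kind of check would run on the one machine that collects logs from all servers, which is the reason for bringing the logs together in the first place.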
That is tomorrow morning's lab. One of them is called AWStats, which is for log analysis in a real-time manner, as events happen and generate logs. The other one is called OSSEC, which is a host intrusion detection system. It gives you the ability to give the guarantee that if something bad happens, then with 99% probability I will know within two minutes. You can't give 100%. But you can at least say that I have set it up that way. And what is the meaning of something bad? Let us start with that. In the IIT Bombay campus network, what is the meaning of something bad happening, from the network services and security perspective? So many things, right? Your web service stopped working and people cannot access your web page, your mail stopped, somebody broke into your main computer with the academic office grades, or somebody stole your cash. So many things. Everything we talk about has security problems or misuse: students bringing down some server, or hostels doing something bad. So you understand the scope is vast. When we say something bad, this is where we have to start, and it does not need any computer knowledge. This is where the director comes in. Put on the cap of the director or the dean. Something bad is something that the director will say should not happen. He will not tell you to use iptables, AWStats, OSSEC and all that. He will say: if something like this happens, if a student in the hostel logs into this server in the night or tries to break into that server, I want to know. I want to know the next morning, on my desk. So can you do it? I won't build this up too much, because if I do, you'll ask me, are you doing it? We are trying, is all I can say. But we are an academic institution, so we do not take this so seriously.
But if you are a bank, or a government body like atomic energy or the railways or so many others, then you had better have that confidence that your network and servers are safe, and you had better put that demand on your system administrators. Is this clear? This is the perspective on which we will spend the next one and a half hours. I don't want to make too many disclaimers in the beginning, but it is very difficult to pace such a lecture and finish exactly on time. So I request you to bear with me if it is sometimes slow, sometimes fast, or if we rush at some points, because I'm not trying to teach you everything in 90 minutes. I'm trying to give you the flavor and hopefully spark the interest in you to go and do these things, along with the next main course that we offer, with the other participants and with your students. So let us start. This is just 10 seconds. We all know this; I won't read it out. This entire topic of security is like this: everybody has their own perspective on what security is. Today I'm giving a sysadmin's perspective. There is a network perspective, there is, like I said, the management perspective, and so on. This is called the blind men and the elephant. This next one is a little more serious. I would not claim it is 100%, but I would say 90% of all the topics that have to be covered in security are somewhere in this mind map. In this course, I don't know which parts you're touching and which parts you're not; that is for you to figure out. Basically there are threats, there are vulnerabilities, there are requirements, and in the left corner there are mechanisms. A lot of what you are learning in this course, authentication protocols, SSL, are all mechanisms. For what? For doing this security.
And today's focus is this green one on the right side called assurance: we want to monitor, we want to do emergency response, and we want to recover after a disaster in case it happens. We have to be ready and prepared for that. The rest is of course all important, and without it you cannot do this, but this gives you the landscape, and today's focus is somewhere here. So I won't spend too much time on the other parts, which the rest of the course would have covered. This again is just a few minutes, like I said, perspective only. The olden days are on the left side, when we had books and libraries and paper and microfilm stored in dusty rooms where nobody goes. Today everything is digital, everything is online; information explosion, all those phrases that you have heard. What I would like you to focus on is keeping up. There is this standard thing they say: information, knowledge, wisdom, moksha. I think we should position ourselves at wisdom, right? It's not that you need to know everything, all the raw data and information. Information is everything on the internet. You don't even need all the knowledge. You don't need to be a Cisco CCNA or a Microsoft IIS expert. That I would call knowledge: how to use a specific tool or a specific application. Wisdom is what? Hopefully that is the role we should play as teachers and researchers: what to learn, how to learn, what to check, so that as technology changes and as data and information change, we are still able to make some impact. Of course, the last one is just a joke. In our philosophy, we don't even need wisdom. What do we need? Moksha means what? Zero, mind blank. Has anybody heard that joke? The Americans were doing some excavation, and at 100 feet they found a lot of copper wires. So they said what? Look at our ancestors.
They used networking; they knew LANs and WANs. The Russians went a little further down and found optical fiber. They said: we were using fiber, you were using only copper. What about India? We went one mile, two miles, and nothing came. So we said we were using wireless. So that is moksha: without anything, you can still be happy. It's a state of mind. Again, just 10 seconds on this. The apple falling on Newton we all know. Ramanujan writing everything in one or two notebooks; have you heard of Ramanujan's notebooks? You know, even seventh or eighth standard students do all this now: one plus two plus three, and it's done. These guys are getting so much credit for being brainy and scientists and all that. They had everything in one notebook or one page. Can we do it like that today? We are very, very unlucky, you know. We need collaboration. We need gigabits of data. We need computers. We need Hadoop clusters. All the simple things, F equal to ma, the sum of a series, these guys have already done and got awards for. So today no researcher in any field can live without computers; that's all I want to say, and we all know it. Again, I don't want to waste too much time on this. You know all this. Just as mathematics was the supporting science for all discoveries, whether in physics or electronics or mechanical or civil engineering, computers are becoming a core now. Without computers you cannot advance the state of knowledge, whether it's biology or diseases like malaria, you name it, or today even sociology: who will vote for whom? Who voted for whom? Why did they vote that way? Previously, with just one page, one guy would act like a boss and tell you everything. Today, there also, you need data, big data, analytics; you've heard all these words. So the good part is just that without computers nothing can happen.
Of course, slowly I want to introduce the bad stuff also: real-time internet security, real-time fraud monitoring, real-time screening, for whoever wants to do that, of Facebook and Google posts and social media. Remember, there was a rumor because of which people from the North East started fleeing Bangalore in trains to Assam, because they were going to be beaten up and all that. They blamed social media for that. Anyway, we won't go into politics today, not the right time. So, the last slide on the good part: we cannot deny it, we cannot live in denial. We need computers, we need the network, we need the internet. I don't know whether we need Facebook or WhatsApp, but let's say we need Facebook and we need WhatsApp. Mathematicians have found that they can't live without it. Bankers have found that nobody is using bank branches anymore; everything is online, payments, this, that, everything, more or less. It's an exaggeration, especially in the Indian context, with the digital divide and all that. But you understand what I'm saying: this is where we are heading. All transactions, PayPal, Bitcoin; you've heard of all those things? Security is important there also, but we are moving that way regardless. And the last one is us. Please don't take all this too seriously, especially the one about Aakash, the tablet that will cure all the education ills of our country. Yes, no, maybe? Let me at least get a maybe. That's what we're doing here. So that is unfair criticism. We should only say that it will not cure all ills, but it can help. It's one of the factors, and there are many other factors: without the teachers, and without initiatives like training a thousand teachers, ten thousand teachers, it cannot work. We cannot do without the human in the loop. But there's no denying that our students do not think of us as the repository of knowledge and wisdom. They can access any lecture they want on their own. Do they need teachers?
Do they need us? See, now it's getting harder, it's getting closer to the bone. I can tell you that on security, take any topic, say authentication protocols: the world's best lectures, let me not say the best, the top five lectures, are available. And if I come and do it, I may be better than the top five, with very slight probability, but it doesn't matter much; I can't be that much better. There are five very good lectures, yes or no? Take any topic you want, from the best professors. Okay, you're better than them, all that we can concede. I'm not saying you're not better, but they're not bad either, right? They're also okay. So why do the students need us? Answer that later at your convenience. So now, let me do a live demo. The next five slides are a canned demo in case the live one doesn't work. There's a site called atlas.arbor.net, and I'm going to go to that site live so that I can skip the slides later. Let me start with the home page. You can see it later in detail. All I'm showing is that this is a site anybody can access, no special privileges. You go to the summary, and I hope you can see: attacks, botnets, DoS attacks, fast flux bots, global activity maps, phishing, scans. All these words make sense to you now? Phishing is? Yeah. And what's a botnet? Shall we see what they're saying about botnets? So I go here, and it will show you; we just have to wait a bit. Here are the number of servers and the percentages; for botnets it is not showing a map. It shows which countries are hosting botnet servers. Botnet servers are the servers which control the bots, which have spread over and infected many computers all over the world. And when they want, they can activate them to do what? Bad things. So there's some Trojan hiding in your computer. Not yours, but whoever is insecure; those who have done this course obviously will not have any such problem. So let's go to something else: activities, attacks. This says Sunday because it is US time.
It's actually Monday here, and for them it's 10 o'clock at night. I will not read out everything. It is telling you which types of attacks are happening: VNC, which is Virtual Network Computing, a remote terminal; SSH brute-force login attempts; a Microsoft Windows trans header attempt; some PHP access; DNS version probes. These are the types of attacks that are happening. They are also plotting attacks per subnet, and whether that has increased or decreased since yesterday. And you can see this live; anybody can see this live. If you go further, you can actually see it by country, and India is also figuring. So let's go to India: attacks from India are happening. So our students are doing this, not faculty, okay? This is India, and you can see that they're scanning: telnet, TCP ports, this port, scanning for open ports on other servers all over the world. So attacks like that, recent activity, rank. Which is the biggest threat and all that, we will not worry about. You can see that later; whether it is China or the US or Iran, every country now has hackers who are serving their country. Here it is: the US is first, then Chile, Canada, and so on. Which are the threats? You can read this at leisure, and you can go further and find out. So let me go back to the talk. This is showing which sites are in India and in Bombay. This one lists the phishing websites, which are trying to act like State Bank or Citibank and trying to steal passwords; people have set up servers like that. Who has found out all this? How did they find out? Should they find out? If they found out, why has nobody been arrested? All these questions you have to answer yourself. Then, top threat sources. I am sorry in case it's not fully visible, but the website is mentioned in the title: atlas.arbor.net. In the lab, if you are bored and you already know everything that we are asking you to do, go to the site and browse.
Or go and see later, okay? So for India, and this is slightly dated data, it is again saying the same thing, and the next slide may have more information. Who is scanning? It is even telling you, and I hope you can read this from there, that a Tata.com subnet has generated 20% of the attacks, and that attacks are emerging from subnets owned by BSNL and Tulip. So let's take just 10 seconds. How did they find this out, and who found it out? Let me take one attack: the scan. You know what scanning is, right? Suppose I'm at home and I have an internet connection. Can I run software that goes to, which site do you want to scan? Let us take a site in the US, some bank, citibank.com or whatever. Suppose I start writing a script or a program, or using one of the tools that do all this; Nmap is an example, and there are many others. And I start scanning. What do I scan? What is the meaning of scan? If you want to break into VMCC and steal this projector, and you're a thief coming from outside, will you walk in through the main door first? What will you do? For one day you will walk around, right? You will see: are there doors open, windows open? On the side, is there a way to climb? Is there a tree branch nearby? If you break in without doing all this, okay, then you are an amateur thief. And amateur thieves are what this site is catching: they're simply using tools without masquerading or doing anything to hide their identity or source, and they're sending out packets. Is this port open? Is that port open? And each packet has to flow through various routers. If you want to reach the US from India, your packet can't jump at the escape velocity of Venus or whatever, as some of our politicians have said. It has to go through some wires and some routers. And many of the routers are controlled by whom? Different countries. And they're able to tap that traffic. Wireshark, yes?
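The "walking around the building looking for open doors" step can be sketched in a few lines of Python. This is a minimal illustration of a plain TCP connect scan, assuming nothing beyond the standard library; tools like Nmap do far more, and do it far more cleverly. The function name `scan_ports` and the port list are made up for this example, and the demo only touches the local machine. Scan only machines you own or are authorised to test; as the lecture says, unmasked scans are exactly what gets observed and attributed.

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return the ports from `ports` on which `host` accepts a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise,
            # so a closed or filtered port does not raise an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Demo against the local machine only: SSH, SMTP, HTTP, HTTPS.
    print(scan_ports("127.0.0.1", [22, 25, 80, 443]))
```

Every one of those connection attempts is a packet with the scanner's source address on it, which is why the routers along the path, and the target itself, can see who is knocking.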
Much more sophisticated tools are needed when you're a core router on the internet. But you can analyze the pattern and say that it is coming from this IP. And then you can find out who owns that IP. Can you find out or no? The internet is still not chaos, right? IPs are registered. There's APNIC; I'll show you some of that. So then you can show data like this. So now we're in the bad. This is happening today, and it's happening from all over the world. So you set up your college server. Will anybody attack it? That's why we're living in comfort: nobody cares. The IIT Bombay server: there are very few people who want to attack it. But the State Bank of India server is very different, right? So we have a lower threat perception, a lower risk perception. But as teachers, we should teach our students to defend against the best criminals, who are for hire from various countries and who know all this. So let's move on. Again, it is saying that in India there are people hosting phishing sites. They have created their own website, which looks like the State Bank site or the Citibank site or some other site where there's a username and password, and they're hoping people will come there. How will they come there? They will get an email from the Reserve Bank, saying you have $1400,000; go log into this site and give your account information, and we'll transfer it to you. From the Reserve Bank Governor. Who is the governor? Raghuram Rajan. And who doesn't want money? So they'll go there and click it. And what will come on their screen? State Bank of India, username, password, everything will come. Unless they look with a microscope, or they look at the URL, they won't see that it's just a phishing site. So he has sent an email and trapped them. Now, how many people do you think will do this? What percentage? It's less than 0.1%. So he sends a million mails; even less than 0.1% of a million is still a reasonable number. If 100 people fall for it, that's enough for him. But he has to send a million mails for that.
And what are those million mails called? Spam, or phishing mails. So he's a nuisance to society. He's preying on the weakness of a small segment of the population and causing trouble to the whole community. So these people are trying to find that out. Now that they have found out, they don't have the legal authority to take it down. This is where CERT-In comes in. What I am showing you is public information. Please do not assume it's 100% correct. It is possible to doctor this information. It is possible to make it look like attacks are coming from some place. Why would anyone want to do that? Have you seen the opinion polls and exit polls in our news? It is possible to give any number, to make it look like this fellow or that fellow is winning. So take it with a pinch of salt. CERT-In has the authority; they have special status, they get more access, and they can actually track it proactively, or our cyber crime cells can track it proactively, provided there is some legal understanding, cooperation, willpower and all that. And they have to tackle it according to their priorities. For us this may be the most important threat in India, but for them it is not; they have so many other crimes to worry about. Anyway, this is already too much time on the bad. Let's come to the ugly. Then there are malicious servers and so on. Malicious servers means what? In their definition, it means that the webpage has embedded cross-site scripting attacks or worms, or some other data gets downloaded. If you go to that website, you risk infecting your computer with a bot. So they have found out which servers are hosting such bad binaries and so on. We'll conclude the bad by talking about what you must have seen in various ways: why are all these bad things able to happen? It can be application-level security: whoever wrote the application made mistakes, SQL injection, this, that and so on.
It could be host security: services are not configured properly, ports are open, passwords are weak. Or it can be transmission security, network security: somebody is able to modify, block, or change the traffic, a man-in-the-middle attack; that's why we have SSL and all that. We won't be solving all of this. All I want to say is that you have been asked to spend five days learning all this, but the people who are doing bad things need only five minutes. Why? Do they need to learn all this? Nmap, ports, the SSL handshake, this protocol, that encryption, elliptic curve cryptography? They don't learn any of that. What do they do? Double click, triple click, type the IP, attack that IP. These are called cyber crime toolkits, okay? I won't go into them, but you can search for that. This slide is borrowed from training material at CERT-In. CERT-In releases a lot of useful material, and this slide has a lot of value, so I left it in. Please don't try to read everything there. In a nutshell, it gives the attack timeline starting from the 1980s. The first one I remember was by a poor student; the Morris worm, it is called. Does anybody know about that? Yeah? At Cornell, one student. Yeah, what was it? Something typed in at a terminal, you think? Slightly different. He found Sendmail; have you heard of Sendmail? It is an MTA, the one that actually routes mail across the internet. Not the client, not the front end; the part that has to make the mail reach the other side. He was able to find a hole in that, and he wrote a mail that would multiply itself and send 20 copies, 30 copies to other people. He just wanted to test it, and he wanted to do it once, but unfortunately an error in his script caused it to blow up exponentially, and it never stopped. It took 10 days for the sysadmins of the internet to clean it up. And he was a bright student, so the courts let him off.
So that was the type of security problem then. Today it's very, very different, okay? Today we have information warfare, as I told you: countries fighting with each other, military. Stuxnet; I think Professor Phatak was mentioning the example about Russia attacking Estonia or something. Have you heard of Stuxnet? Anybody? S-T-U-X-N-E-T? What was that? The US implanted the Stuxnet worm into the facilities of, I think, Iran. Iran, yeah. So the nuclear centrifuges of Iran were manufactured in Germany, by Siemens. And they have firmware. What is firmware? Washing machines have software today, you know that, right? Are your washing machines on the internet? Should they be on the internet? Why should they be on the internet? Say I want to hang up the clothes; this is not a true story, okay? When I am driving back home in the traffic, I press a few buttons and my washing machine starts; when I enter, it ends, and I take the clothes and put them in the dryer. I don't want it to run in the daytime; I want it to finish just before I come. That's one reason. Another reason: there's fuzzy logic 2.0, which makes my clothes 90% white. Now there's fuzzy logic 2.3.1, which makes my clothes even better. So the firmware has to be upgraded. Should Videocon send an engineer to every person to change the software on that machine? No; if it's on the internet, you can press upgrade, ta-da, and your clothes will be brighter. Anyway, coming back after telling all that: the centrifuges in the nuclear facilities in Iran, the ones for making the heavy water or extracting the uranium, had software controlling them, and the attackers managed to tamper with that software to make sure that it didn't work properly. And it took a long time for Iran to figure it out. So anyway, don't worry, all countries do this. And the next slide says that attacks are no longer by a single person, like a student. Attacks are orchestrated. So, denial of service attack, you've heard of it?
I don't want it from one machine, because then I'll be caught. How do I want a denial of service? I want machines in different parts of the world to bombard simultaneously; that's the botnet, which we've already seen. So these slides are giving you the perspective of the bad part. First I said it's good: without the internet, teachers can't be there, mathematicians can't be there, banks can't be there. Now I'm telling you that on the internet, many, many people are able to attack easily. And then there are all the requirements that my director will want: there should be confidentiality, nobody should see anybody else's mail; there should be integrity and authentication, nobody can send mail on my behalf. And what is non-repudiation? If I send my students a mail saying that quiz three is canceled, then I cannot later say I didn't send it and give a fail grade to everybody; they should be able to prove that I sent it. The other way also, okay? Then there is availability, and all this. And the last line is the punch line: we have to assure all this on the internet, using a protocol called IP, which is like a postcard. Everything that you write there, the to, the from, the data, is visible to everyone unless you add SSL or encryption. So we know this is a very hard job, and other parts of the course are telling you how cryptography is the key, the sine qua non; without that, nothing. After all these protocols and mechanisms are set up, concealing the information, making sure integrity is there, hashing functions and so on, there is the other job, and that is what we take up today, like I said. We are coming to the ugly part now. I have about an hour for the ugly part; the good and the bad are over. So this is our critical... is it critical national infrastructure? What is this? Does anybody care if we just slip into this lake or that lake? We are situated between two lakes.
So suppose something comes and we fall into this lake or that lake, will anybody care? Yes? No? So this is a joke, because critical national infrastructure usually means banks and atomic energy and space, ISRO and all that. This is how US Homeland Security and others have classified it. So just for a joke, I'm saying that you are asked to protect IIT Bombay's network, on the grounds that this is the most important resource for India, and that if our server in that main building gets broken into, India will face a big loss of prestige; therefore protect it, this is your job. How will you do it? Our solution is very easy. In Powai Lake we have crocodiles. In Vihar Lake we have leopards. And let me tell you, it's a more effective way than what we are doing for cyber security, okay? Because think of fiber cuts. If anybody wants to cripple IIT Bombay, all they have to do is go to that main road. I won't tell you where. There are two or three yellow fibers coming out like this; just cut them. Nobody's guarding them. Cutting is a two-minute job. You don't even need a power hacksaw; a normal hacksaw may do. The only reason we are surviving is that nobody knows where that wire is coming from. And they don't need to come through the main gate and show their badge or card. They can just swim across the lake, right? That's why we put crocodiles in Powai Lake, okay? But that's not the point. So let's spend two minutes on what an attacker will do. Let's say the attacker is not allowed to come near IIT Bombay. We have done the magic: he has to sit at least in Pune, or in China, or in Pakistan. He can't come within a 100-kilometer radius. Can he still do damage? That's the type of attacker we are worried about, not somebody who comes here with a hacksaw blade. So what will he do? I won't do the live demo. There's an example, dnsstuff.com.
You go to a site like that, type iitb.ac.in, and it will tell you so much information about IIT Bombay. What is their name server? What is their IP address? And luckily for us, I did this yesterday, and we have zero failures. We have set it up properly. There are five warnings, but we have passed 30 of the tests, and five more are extra information items about us. All this information is public, and you can do this for any domain. You can check out a lot of information. The second thing it says is that we have passed the tests on the mail MX records. This is important. Do you know what an MX record is? It says, if somebody wants to send mail to us, which IP they should connect to. Which means what? That the mail relay, 10321, 125, 126, is what? A public internet IP. And it should run all these commands. They are actually also trying the reverse; there should be a reverse lookup, which I will not explain. There should also be a mapping from the address back to the name. Otherwise it's an untrusted IP. But if you have both, then you know that this is your mail server. And if it is a public internet IP receiving mail, it should allow anybody to connect to port 25. If you don't allow them to connect to port 25, they can't send you mail. Yes or no? So we have to be running some service. And once you run some service, you are using some software. This is Sendmail or qmail or Postfix. Is all this software written by you? No. Microsoft Exchange: is that written by you? Can you see the source? Again, please, I'm not anti-Microsoft, although I don't like them too much. But is it conceivable that, if you're using the MS Exchange mail server software, there is a subroutine in there which says that on May 23rd, you please do this: send a copy of all mail going to the director at IITB to postmaster at microsoft.com? Can there be something hidden like that? What is it called? Rootkit, Trojan, hole. Yes or no?
Can it be there in sendmail or Postfix? You know the difference between sendmail, Postfix, qmail and this other one, MS Exchange. The first three are what is called FOSS or FLOSS, which means what? Free. Free means what? Free has two meanings. One is zero cost. Free also means unrestricted: you can use it without licensing limits like 10 users, five users, three MB and so on. Free, open source. Open source means what? The code is available; you need not install only binaries. So if you have the time and energy and you have the knowledge, you can read every line of code, you can compile the code yourself, get the binary and run it. And if you find bad code like that, saying send mail to postmaster at Microsoft, you can remove it. Not only you: anybody in the world, all of them are seeing the code. If anybody finds any such bug, they will tell the others. Microsoft is binary only.
So okay, we can use trusted binaries because it is faster and better. But in principle, the attacker can now probe, see which software is running and whether you are using vulnerable software. Even good software can be vulnerable; even OpenSSL had Heartbleed. Did you know about that? So it is not that open source has no errors or faults. It is just that it is open, more people are fixing it, and so on. So he can find out from these IPs what services are running, and he can try to mount the attack. And the site now tells him that there is a web server, www.iitb.ac.in, that it is on a different IP address, and that it is SSL enabled. This is the certificate that IIT Bombay is using for its web server. I didn't put the whole thing on the slide, but it tells him all that. Now if you are using a compromised certificate, or there is some problem with it, he can again attack. All I'm saying is that a wealth of information is coming to him just by using a public website to check.
He can trace the route to try to find out what the WAN links are and which service provider they go through. The traceroute ends up at a Bombay VSNL site, and beyond that we are not responding to pings. We are trying to mask some of the data, because we don't want to unnecessarily give extra information to others, and very sophisticated tools are available to attackers. That is not part of today's lecture, but I'm sure you will see it in the rest of the course.
So let us get back to defense. This is the problem statement: the big bad wolf is not only on the internet. They are there in hostel five, hostel six. They are there in the main building. They are there in the residential network. We have to allow access to our network and resources to vendors, to alumni, to outsiders, to those who want to apply. So that's the top line, saying that there are so many users and so many applications. There is research, Moodle, our web pages, our CDEEP lectures. We have to interact with the world. That's our mandate. And there is administrative software. People want to pay fees online. Parents want to see grades in a secure way. There are so many other requirements. Then there is so much system software that we need to install. Then there are the OS-level firewalls, LDAP, some of which we'll see today, and databases.
Then of course there is the part that we won't see today: the PCs, the printers, the modems. Suddenly power goes, something burns. More important, the fiber also. The UPS, AC, cables. In my 20 years here I have heard so many times that a rat has eaten that important optical fiber. I don't know why, but they like all those places where we make these important connections; the rat goes and eats it. Biggest security threat. Cannot be solved. But anyway, we have to do all this. And here are the requirements: we have a huge academic area, we have hostels, we have the residential network. Okay, we need performance, and we need at least mail and web browsing.
We need so many other things: Skype, video conferencing, NKN and all that. And then we have users and management, where the nightmare begins. Misuse, and so on and so forth.
Okay, so without further buildup: the next 10 to 20 slides are not live demos. I hope I'll have time to show you today's network, today's setup, which is, let's say, 25% different from what I'm going to show in the next 10 to 20 slides. These slides show what it was about five years back. The reason is that I don't want you to attack our network. So you work with old data. It is more or less similar, but we have made changes, we have improved, we have added newer software and so on. But this will be a good case study.
And here is an example of an open source router. For everything that I am describing today that uses software, there are equivalent commercial products. There are boxes, routers which are very compact and sleek, but we are using a Linux box with five network cards, connecting different parts of the campus. Why is the residential network coming through a separate router? More control. That's one more place where we can monitor. And why is the residential network separated from the rest of the network? So that different policies can be set: at which times they can access, what they can access, how much speed and bandwidth they can get. The IPs they use have to come via this router, so IP tables can be configured, and I'll tell you how, to allow better performance when you're working in the department and worse performance elsewhere. I'm not saying that's a good policy. I'm saying that if you want to have different policies for different parts of your network, then your architecture should support that. So in some sense, this is the ResNet architecture. And on this picture, I'll spend a few minutes.
Again, I apologize in case it's not big enough, but you will all get the PDF slides later and you can study it in a little more detail. This again is not today's setup, but basically this is what is called the demilitarized zone. We are not going to focus on the end points, on what is in the desktops, workstations, department-level servers and so on. We are going to look at the central facilities, which we will call the demilitarized zone. What is below this is the IIT Bombay LAN, so there is one internal firewall. Have you heard of the term demilitarized zone? The Pakistan Army is on that side, the Indian Army is on this side, and in between is the demilitarized zone. Usually they would fight each other, but if spies are coming from this side, they will block them too. So the internal firewall is there so that our internal users also cannot do bad things to our servers. An external firewall is there, at the top, which prevents the big bad internet from doing bad things. These two firewalls have to work in cooperation with each other, and in between, and again I'm sorry the picture may not be very clear to you, there are all these servers.
So I will try to say just a few things which are interesting. Here we have what we have called LUM1, LUM2. Load balancer is the official name of software like this. What is a load balancer? When I want to browse the web, there are now 10,000 students, okay? In total 20,000 to 25,000 people have access inside IIT. Should they all try to go via the same proxy server? No, so we have a cluster of several proxy servers, and this redirects and balances the load, either in round-robin fashion or on a load basis. So we need that. Similarly, on the other end we need load balancers for people coming in to send mail to us. We can't have only one machine receiving mail. Just a quick guess; the answer, for five years back, is coming later. How many mails come into IIT Bombay from outside every day?
So think about it for your institute, then think about it for IIT Bombay. Okay, so five years back it was 60,000 mails per day. Now I'm sure it is at least 10 times that. This is not guesswork. You have to know this, right? You can't just guess. Why should you know this if you're interested in security? Why should you know how many mails come? My slides are there and I'll explain, but let's just try this one interaction. Is the number of mails coming into IIT Bombay every day something that a system administrator should track? "When there is an attack, he would be able to detect that." Excellent, give him a chocolate, an eclair, whatever, okay?
So this is the first principle of security. If you don't know what is business as usual, if you don't know what is normal and what is abnormal, you cannot be secure. If you are getting 50,000 mails every day and one day you get three lakhs, are you just going to say, okay, today three lakhs, tomorrow again 50,000? Or suppose every Thursday you get three lakh mails and on other days you get 50,000; are you going to keep quiet? Is there a security implication? It may be harmless. It may be that on Thursday somebody is very interested in praying, I don't know which god is famous on Thursday, and he is sending all these mails saying Jai Hanuman or something like that. It may be harmless, but the probability of it being harmless is less than the probability of it being harmful. So this is what is called knowing the status, knowing anomalies: anomaly detection. Okay, so we are going to see that.
The load balancer makes our hard job harder, because all the mails are not coming to one place. They are coming across many machines, and they are coming across many WAN links. So where is this information? How do we consolidate it? How do we analyze it? That's the goal, okay. That's where the next two things are going to help, including rsyslog. Okay, I won't explain everything.
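As a minimal sketch of the anomaly-detection idea just described (know what is normal, so that the abnormal stands out), the following adds up daily mail counts reported by several relays and flags days that are far from the recent baseline. The relay names, the counts, the 7-day window and the 3-sigma threshold are all assumptions made for illustration; they are not figures from IIT Bombay's setup.

```python
from statistics import mean, stdev


def total_per_day(per_server_counts):
    """Consolidate: sum the daily counts reported by each mail relay."""
    return [sum(day) for day in zip(*per_server_counts.values())]


def find_anomalies(daily_totals, threshold=3.0, baseline_days=7):
    """Return (day_index, total) pairs far away from the trailing baseline.

    A day is flagged when it differs from the mean of the previous
    `baseline_days` days by more than `threshold` standard deviations.
    """
    anomalies = []
    for i in range(baseline_days, len(daily_totals)):
        window = daily_totals[i - baseline_days:i]
        mu, sigma = mean(window), stdev(window)
        if sigma == 0:
            sigma = 1.0  # avoid a zero threshold on a perfectly flat baseline
        if abs(daily_totals[i] - mu) > threshold * sigma:
            anomalies.append((i, daily_totals[i]))
    return anomalies


# Hypothetical per-relay counts: about 50,000 mails a day in total,
# with one day (index 7) at three lakhs.
counts = {
    "relay1": [24000, 26000, 25000, 25500, 24500, 26500, 25000, 150000, 25200],
    "relay2": [25000, 25500, 24000, 26000, 25500, 24500, 26000, 150000, 24800],
}
totals = total_per_day(counts)
anomalies = find_anomalies(totals)
print(anomalies)
```

Note one weakness of this simple approach: once a spike enters the trailing window it inflates the baseline, so the days right after it are judged against a distorted normal. A real deployment would need to handle that, for example by excluding flagged days from the baseline.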
We have machines for sending mail, and we have machines for proxying, for internet access; Squid, it is called. Then we have machines for the reverse proxy, the virtual host. I'll explain that again, and maybe the slides will be easier to follow if you just do this thought experiment with me. You went to the lab, and you tried to find out the IP address of www.cse.iitb.ac.in, the CSE department's web server. What do you think the answer will be? No, but what should it start with? Guess. 10 point something, right? 10.105 is the CSE department subnet. Okay, so 10.105 is the CSE server. EE will be 10.107.u.v. Civil engineering will be 10.101, or something like that. We have decided to split our subnets like that. Each department gets a 10.something.something. Yes?
Now you are outside, you are in Japan. What should the IP address be when you ask for www.cse.iitb.ac.in? It cannot be 10. Why? "Only one IP address." "Firewall IP address." "Network IP address." So there are many ways to do this, but it has to be a public IP address. It has to be an address which is reachable by the rest of the world. That can be the firewall, if you use what is called NATing. But it can also be, as in our design, what is called a virtual host. It's a reverse proxy. It's a real machine. It's running Apache software, or something similar. NGINX is what we are running currently; earlier we used to run Apache itself. It will accept requests, but it will see which domain the outside person in Japan is asking for. If he's asking for www.cse, it will go through the internal firewall, get the data from the server and give it out. So it does not hold any data of any department. It is acting as a reverse proxy. A proxy we all know, right? We send the request to the proxy, and the proxy gets data for us. This is the reverse: an outside user is sending a request to this machine. And that is the machine shown on the left, as a cluster of machines. That cluster is also important to analyze.
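The core decision the reverse proxy makes, looking at which domain the outside request asks for and then fetching from the matching internal server, can be sketched roughly as follows. This is only an illustration of name-based routing, not the actual Apache or NGINX configuration; the host names and the 10.x addresses in the table are hypothetical values chosen to match the subnet scheme described above.

```python
# Hypothetical mapping from public host name to internal department server.
# Only the reverse proxy has a public IP; these 10.x addresses are private.
BACKENDS = {
    "www.cse.iitb.ac.in": "10.105.1.10",
    "www.ee.iitb.ac.in": "10.107.1.10",
    "www.civil.iitb.ac.in": "10.101.1.10",
}


def pick_backend(host_header, backends=BACKENDS):
    """Return the internal address for a request's Host header, or None.

    The header is normalised first: surrounding spaces are dropped, case is
    ignored, and an optional ":port" suffix and trailing dot are removed.
    (IPv6 literals in the Host header are not handled in this sketch.)
    """
    host = host_header.strip().lower()
    host = host.split(":", 1)[0]
    host = host.rstrip(".")
    return backends.get(host)


print(pick_backend("www.cse.iitb.ac.in"))
print(pick_backend("WWW.EE.IITB.AC.IN:80"))
print(pick_backend("unknown.example.com"))
```

Returning None for an unknown name lets the proxy refuse the request itself instead of forwarding it inside, which is one of the places where the central machine gives extra control.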
Why is it important to analyze? Who is visiting? What are they visiting? What are they trying? Which type of request are they sending? Are they sending payloads which are malicious? What's a payload? In the request you can put all these cross-site scripting attacks or many other things. So what if you don't watch that activity, and that activity is not visible on your own server, your www.cse? Sometimes it's better to do the control there, centrally, rather than let every department control and analyze everything. So centralization, the power of centralization. First, of course, IP addressing forces you to centralize. Second, doing it at the application level means more logging, more control. If you do it at the firewall level, with NATing and translation, you get less control. So these are design choices. So this is where I wanted to give you a flavor.
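To give a flavor of what watching that activity could look like, here is a small sketch that scans access-log lines from the reverse proxy and flags requests carrying suspicious payloads. It assumes the logs are in the common log format used by Apache and NGINX; the three patterns and the sample lines are illustrative assumptions only, far from a complete detection rule set.

```python
import re
from collections import Counter
from urllib.parse import unquote

# Illustrative patterns only; real rule sets are much larger.
SUSPICIOUS = [
    ("xss", re.compile(r"<\s*script", re.IGNORECASE)),
    ("traversal", re.compile(r"\.\./")),
    ("sqli", re.compile(r"union\s+select", re.IGNORECASE)),
]

# Common log format: client ident user [time] "METHOD target PROTO" ...
REQUEST = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)')


def scan(lines):
    """Return (client, label, target) for every suspicious request found."""
    found = []
    for line in lines:
        match = REQUEST.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        client, _method, target = match.groups()
        decoded = unquote(target)  # payloads are often percent-encoded
        for label, pattern in SUSPICIOUS:
            if pattern.search(decoded):
                found.append((client, label, target))
    return found


# Hypothetical log lines using documentation-only IP addresses.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2015:13:55:36 +0530] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2015:13:56:01 +0530] "GET /search?q=%3Cscript%3Ealert(1)%3C/script%3E HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2015:13:57:12 +0530] "GET /../../etc/passwd HTTP/1.1" 404 210',
]

hits = scan(SAMPLE_LOG)
per_client = Counter(client for client, _label, _target in hits)
for client, label, target in hits:
    print(client, label, target)
```

Because every outside request passes through the same cluster, one scan like this covers all departments' sites at once, which is the point about centralization made above.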