Welcome to IIT Bombay. It's a pleasure to see so many of you here. The last time I conducted this workshop was three years ago, and we had a much smaller number of people: about 40 people in a small room. This time the scale is much bigger, and going out from here, the scale again is much bigger than the last time I taught it. I am pleased to see that there are a lot of new faces, because to some extent I am reusing the material from the last time I offered this, but updated. So before I get into the technical parts of this week's course, let me again emphasize the philosophy behind what we are doing this week and what we will be doing in the main workshop another week down the line. The first thing, as Professor Phatak mentioned, is that this week's workshop is heavily practical oriented. We spend very little time on lecturing and a lot of time in the labs. My assumption is that most of you here have already taught a database course. So let's first take a quick show of hands. How many people here have already taught a database course? Good, pretty much all of you. How many of you here have not taught a database course? Those of you who have not taught a database course, I hope you have gone through a database course, so you are familiar with the material.
So what I am going to do this week is focus on two parts. In the afternoons we have labs, which are all hands-on labs. Now let me again get a show of hands. How many of you have conducted hands-on labs as part of the database courses which you have taught? I see a good number. I am happy, but it is not quite the same as the previous number. How many of you have conducted essentially weekly labs, where the students come in and do exercises all through the semester? Good, that is again a good show of hands. So some of the material which we cover here, maybe you know already. So what is our goal here?
Our goal is to have a kind of standardized offering which can be conveyed to another group of teachers, and which can form a basis for a reasonably uniform syllabus for the practical side of what we are doing. Now, one of the things we noted some years ago, and I am glad to say it has been changing for the better, concerns the students who come here to IIT. They are amongst the top; they pass an extremely competitive exam, GATE, to get into the M.Tech program of IIT, and you have probably taught many of those students. What we noticed is that many of these students had very theoretical knowledge. You ask them something about theory, and they will be able to answer those questions. When I say theory, I do not mean deep mathematical theory; that is different. I mean they have read textbooks, they understand what is there, and they can tell you about it. But if you ask them to go build a program, if you ask them what is the largest program they built in their undergraduate years, the answer more often than not is a 200 or 300 line program, a really small program built as part of a small course project. That is not enough.
Now, my department here is called the Department of Computer Science and Engineering. There may be historical reasons for why it is named that way, but the way I interpret this name is that we need to do the science, we need to understand the theory, but we also need to know the engineering. A lot of people think that computer engineering is only the electronics part of it. My view is very different. The engineering part of computer science is how to go build systems. This is where there is value. This is why our students get hired. This is why there is a booming job market for our students, and a lot of the colleges where you teach might have come up because there is a corresponding market for the students. So it is really important to do this practical side of things.
So coming back: in this week's workshop, the second half of each day is entirely lab based. The first half is a mixture of lectures, which are going to be very brief. They are just recaps, essentially a refresher. And I want to use them as a way of getting feedback from you. You can ask questions. You can ask about parts where you always wondered why things are like this. So I will try to set aside some time for that. I won't use up all the time just putting up slides. In fact, I will try to avoid using slides as far as possible. I do have slides because there is a lot of material, but I will try to minimize it. The goal is to have good discussions about the topics that we look at. And we should also have a little bit of time, not much unfortunately, to discuss how to teach these things.
Professor Phatak said that you are teaching associates and I am the teacher. In fact, I would submit it is the other way around. All of us use textbooks to learn a particular subject. Would you say the textbook is a teacher? No. Would you say the author of the textbook is a teacher? No. The person who taught you is the teacher. Now if somebody is coming on a video and saying something, are they the teacher? Probably a little bit more than if they wrote the textbook which you read. But ultimately, who are the students interacting with? All of you. So you are the teachers, and I would submit that I am the teaching assistant for this course. Not this week's course, but the main course which you go back to. You will be conducting the tutorials. You will be conducting the labs. You are the teachers. I am just going to be remote over here, assisting through video. And of course you are again teaching teachers, who are going to be the real teachers when they go back. So our goal is to bring about a change in how subjects are taught: to bring in the emphasis on the engineering, and to have students who can learn the skills and then go use them.
Now databases are at an interesting juncture. When I learned databases long time ago, databases were more of a theory subject. We hardly had access to database systems. They were expensive, ran on mainframes. We couldn't really use them. But today everything has changed. Every one of your laptops today has a database. Every one of your laptops can be the infrastructure on which you can build a major database, database backed application. And every one of us today uses these applications on a daily basis. How many of you use Facebook? Quite a few. How many of you use a lecture management, a learning management system? Like Moodle. We are all going to be using Moodle here. How many of you use Moodle back in your college? So all of these are database applications. They are very large database applications and they have been built by large groups of people. So one of the important things when you build a large application like this is to document what you are doing. So an important part of a course like this is to have a project where students learn how to build a large system, go through the steps of documenting what they want to build, how they designed it, what they built. So part of this course which we will start off with probably from tomorrow is to have a project component. How many of you in your database courses have a project which students have to do? I'm sure quite a few have it but let's have a show of hands. Not a bad number but I want it to be 100%. Every person who learns databases, if you do not do a project and do not understand how to go through the steps of planning a project and building it, you have not learnt what are probably the most important skills in doing it. And the other nice thing about having a project is that it motivates a lot of the stuff which we do. So I ran the course in a traditional way where I said learn all these things then start the project around half way down the course. 
I had a colleague who flipped it. He taught the same course and he started the project much earlier. There are other reasons, but it turns out that students liked his course a lot better than they liked mine. So that's an important lesson to take home: students should be motivated from the beginning to go through some of the intricacies of what we do.
It was different some time back. Today's students know that information is available at their fingertips. They can go to YouTube or something. If they have a textbook, they can go to the website, get the slides and read them. They don't even have to read the textbook in many cases. So what is it that we add here as faculty? The problem is that students realize that all this information is there, and they don't necessarily pay attention to the lectures that we offer. Many of us have gone through this situation. Some of us make attendance compulsory. Others don't. You say: come if you want to. Come to learn. Don't come because I'm taking attendance and you must be present. You students are no longer small kids. You should know what you're doing. And guess what happens? A lot of students don't come. And guess what happens in IIT? The same thing. Are we bad teachers? Maybe. I won't claim that I'm the greatest teacher. There are amazing teachers out there. But this is a phenomenon which has happened in many places across the world.
All of you have heard of massive open online courses, MOOCs. How many of you have heard of them? Many of you haven't. For those of you who haven't heard of them, these are courses which are offered to tens to hundreds of thousands of students across the world, completely online. How can you have a course like this? In such a course, you cannot have human TAs sit and correct 100,000 papers. It is possible; the coaching classes for entrance exams did manage it at that scale. But that's a lot of people, and it's not scalable to many different courses.
So the way it's done is that everything is automated. So it is possible to do these kinds of things. So the question which everyone is asking is: why, what is the reason to conduct labs today? In fact, it happened the other way. These massive online courses came partly as a reaction to the fact that students were no longer attending classes. There were these faculty at Stanford. I was talking to one of them some years ago, and she said that the reason she turned her course into an online course was that half the students didn't attend her class. This is a professor at Stanford. And it's the same phenomenon in IIT and other places. So she flipped it. In fact, there was a colleague of hers who had done it earlier. That colleague had already done this for the AI course, and this professor did it for the database course. And then others followed. They flipped it: they recorded lectures and turned their classrooms into discussion groups, where students view a lecture and come prepared to the class. And then there are discussions and tutorials and so forth.
Now this works very well in a place where students actually have access to mechanisms where they can view it beforehand, and where they are committed: they actually view it and come prepared, as opposed to coming unprepared. But even if we don't have all that, as Professor Phatak said, it's possible to have lectures, and it's possible to have online lectures. What is crucial in these classes is that you have an interactive element, which keeps students motivated. So in the little time we have here, I'm going to have some interactive element. Unfortunately, when we teach this course with 10,000 people, the scope for interaction is a lot less. So it is going to be more or less slide-based. But the scope for interaction comes in the sessions which you people will be teaching.
So some of the labs and the tutorials are designed such that you can interact with the other teachers who come there and motivate them to see what is happening and contribute to the lecture. So I think I have given reasonable overview of the motivation and the philosophy. So let me do two things. The first is, let me show you the structure for this week's lectures and labs. And then I'll go into the other part. So here we go. You see the structure of the lab. Afternoons, we have the labs, which are in the old computer science building. There are two different buildings where we have labs. So you'll be split up. You'll be told who has to go where and so forth. So the afternoons, the labs are all over there. In the morning, we have lectures and tutorials. So today's program is the first half before the tea break. We are basically just having the introduction and then a little bit of introductory material. And then two hours from 11 to 1, we'll have a very quick tutorial on SQL. And after that, we have a lab on SQL. Tomorrow, again, we will have the same kind of thing, but with different topics. We will have a tutorial on ER design over here in the morning and then a lab continuing on SQL in the afternoon. Day after, we will continue on to database design and normalization and again, a tutorial on normalization here, followed by lab, which is JDBC. Now, how many of you have covered application development using Java in your courses? So my assumption is the rest of you have covered SQL. Students have learned SQL, but not built applications. Now, I know in some universities, there's a separate course which covers application development. So maybe somebody else taught the course, but here it is a single course which covers. In fact, our course is called Database and Information Systems. So the information systems part of it is building the whole application with the database at the core. 
So an important part of this lab is to familiarize those of you who have not really used Java JDBC with that technology. Then, in the following lab, which is on Thursday, we will be building small web applications using servlets. So again, a quick show of hands. How many of you here have done some programming, at least in Java? Good numbers. Some of you haven't. Now, many people these days are building web applications in a variety of languages which are not Java. People are building them in Python, in Perl, in Ruby on Rails. There are any number of languages for .NET. There are any number of platforms. Some of you may have been using some of the other platforms. In this course, we are standardizing on Java, partly because it's a more mature technology. It's a little bit more work to build small applications, but it actually has a lot of benefits for building large applications. In some of the scripting languages, you can build small things easily; after that, you have to be very careful. Java protects you from a lot of mistakes. So we are going to stick to Java in this, and then give an introduction to building web applications.
Now, the project which I told you about: the goal is to build an entire application. It has to have a back-end and a front-end. The front-end we will be using is a web front-end built with Java servlets, which is what pretty much everybody uses these days. However, there is one other front-end which many of us use these days. I'm sure many people here already use smartphones. How many of you have an Android or iPhone or any equivalent phone, a BlackBerry or Nokia smartphone? Quite a few. So we interact with a lot of information systems today not just using a web front-end, but using the apps which run on these phones. So when I teach the course here, I leave it open to my students: you can build a web front-end if you want, or if you want to learn something new, go build an Android app.
That should be database-backed, because this is a database and information systems course. I'm not asking you to go build a game or something. It could be a game, as long as there is some data management aspect at the back-end of it. So that is another good option which students get motivated by. So although we cover some of the technology here, I want to suggest to you, when you teach the course, and similarly you should tell the others, that they can use other technologies. You have to keep up with the world. The world changes. Java web technologies are more than 15 years old, maybe close to 20 years old now, and new things evolve; you have to keep up with them. So in the project, I even give scope to students to pick any language which they want. If they want to do it in some other language, by all means, go ahead. I'm not going to teach you that language, but you can pick it up and learn it. And I find a lot of good students are motivated by exactly this.
In fact, this helps me indirectly, or in fact even directly, because they go learn about new technology. There are a lot of new technologies out there. I don't have time to look at each of them. So these students go build something, and then they come back and show it to me. And they teach me. I teach servlets in the course, but I first learned about servlets from my students. When servlets were really new, when they first came out, I was teaching a course, and one of my students found out about servlets. One group built a web app and came and showed it to me. And that's when I learned about it. So any new technology which comes up, you're likely to learn about it from your students. So anyway, the lab on that day is on servlets. And the last day's labs are on internals. Now again, a while ago, students in the country did not cover database internals. Today, quite a few do. So again, a show of hands.
How many of you over here teach some amount of database internals, meaning concurrency control, query processing, recovery, and so on, in your course? Please raise your hands if you do. That's a good number. But again, maybe 50%. Sometimes these have been taught in two separate courses, one on the externals and one on the internals. Science grows and there are new topics. Data mining didn't exist as a course some years ago; it is a course now, and you have to make space for all of these. Today, we fold both of these into one course, and many universities have been doing this. So the goal is that when you come out, you may take another elective on internals. In fact, I do offer a separate elective which goes deep into the internals of databases. But the first course does cover some amount of internals.
Now, why does this matter? Because if you unleash programmers out there who don't have any idea what goes on inside a database, they tend to do very bad things without knowing that they are doing bad things. They build applications that are unsafe, with security problems. They build applications that run very slowly, and they have no idea why that happens. They build applications where, when there is a failure, something goes terribly wrong, and then when the database starts up again, things go wrong. People are not able to do stuff correctly because they don't know how to do these things. You need to understand the internals to appreciate this and to use it when you build your application. So we make it a point to cover internals these days.
I have two labs on internals. The first lab, which is on Friday morning, is listed as the internals lab, and it covers query plans, to understand what is going on behind the scenes. This is very important for efficiency. Most people build small applications, then they launch them, and then things go haywire because they did not understand what happens when you have thousands of users and millions of records, or even more.
And the second part, which is Friday afternoon, is on transactions and concurrency. Again, these have been treated more as theory, where you just read up something on it. But there are actually small lab exercises which you can do to understand what is going on. So that is the last lab in this course. So that's a quick summary of this week's plan.
When I do the main course, it's going to be a little bit different. There, the mornings are all going to be theory, with some tutorials which you will conduct, but mostly lecture. The afternoons are primarily labs, which you will be conducting. So our role here at IITB during those two weeks in the afternoons will be to have people here who can answer questions. And maybe on some of the days we will have an extra lecture at the end, which covers common questions and helps answer things which maybe not everybody has the resources to answer. But the bulk of it will be done by you.
There is one other thing. So far, all of these are traditional things. They focus on things which you will go teach your students for sure. They are all part of the standard syllabus today. But things evolve. New topics come in. So I made some space in this workshop for new topics. A lot of people who came here for the past workshop said: hey, I came to IIT to learn something new. You are teaching me how to teach, and I am getting bored. Unfortunately, we do not have a lot of time to do a lot of new stuff. But we have set aside some time. All of you have heard the term big data, I am sure. You cannot read a newspaper or maybe even watch TV and miss that term. But how many of you here are familiar with some of the technologies? How many of you know about Hadoop? So for the rest of you, and maybe some of you who raised your hands also, there is the 5:30 to 7:30 evening slot.
On two of the evenings I have a lecture and a lab on Hadoop and a few other associated technologies, which I think should become part of any database course going forward from here. It is not there in detail. There is a little bit of it in my textbook, for example, in the current edition, but things have evolved, and this is something which we need to cover a little bit more of, because more and more people are using it. It is not yet as important as the other material for run-of-the-mill applications today, but there are many people who are using these technologies today. So we will be covering that in the evening.
The other evenings, if you see, it says lab practice. The idea is that there are a lot of assignments and stuff here, but the lab time is short, so you get extra time to sit down and try things out with people around that you can discuss with. People around meaning others who are sitting here, but also there are going to be about 20 teaching assistants from IIT. They are mostly masters students, and maybe a couple of PhD students, who will be helping you out with whatever practical problems you come up with here. If you are trying to solve a problem and you are stuck somewhere, something is going wrong, you don't know how to do it, you can take their help. They will be around both in the afternoon labs and in the evening practice hours. So that's a facility which you should make use of if you need it anytime. Good. Any questions?
My question is: this week we are going to talk about PostgreSQL. I belong to a university in Chennai, and most of our syllabus is based on Oracle and products of that type. I think that after getting some input from these lab sessions, if we utilize the labs in the proper manner, I hope I can introduce these open source technologies. That is my main objective in coming over here.
I am very happy to be in front of you, because for the last few years we have been following your books, for example for database management. My point is that if you give us some input in the form of lab sessions, it would be very useful to us. Whether or not we are familiar with these technologies, we see ourselves as beginners with this new type of technology. We are expecting those things from you.
So that is a good point. In this workshop, we are using open source technologies exclusively, and for many applications, this is the right thing to do. However, universities often have Oracle as part of their syllabus. Should it be like that? I don't think so. I think the syllabus should be open. Why should we favor Oracle over, let's say, Microsoft or IBM or something else? However, the job market does demand people who know specific technologies. So there may be a motivation for some of you to continue using Oracle, because that's what students are expected to know when they graduate. That's not a decision that we make. Our goal is to focus on the basic technical aspects and expose you to tools which are members of a particular class of tools. If you learn using PostgreSQL today, going to Oracle tomorrow should not be a big deal. And if your university insists you should do it in Oracle, you can use Oracle. It's not going to be impossible. If you don't know Oracle, you can pick it up. There are differences, and we don't want to recommend any one particular thing. But the market might demand it, and whether you follow the market is up to you. Any other questions?
Okay. So before the break, I am just commenting more on this course. You indicated that Oracle or similarly named products are there in a university syllabus. First of all, any university in the world has no business specifying a particular commercial technology as part of the database course.
It's grossly wrong, and before somebody goes to the high court with a writ petition, get your university to change that. You may use Oracle. You may use Oracle in actual labs. Absolutely no problem. You can use any database. That is my first point. The second point is about the level at which the first course in databases is taught. You can use the same Java. You can use the same JDBC calls, and you can use Oracle with at most one percent or two percent of changes in the code to make that application work. So first of all, get this out of your mind, that my course is Oracle based and this course is based on something else. This is the biggest nonsense that we perpetuate in our minds, and it does not matter. I am guaranteeing you it does not matter. Whatever the technology, SQL is still the same. That is the reason why there is an international standard on SQL, and even Oracle is required to follow that. It does not; that is a different story. But as far as our first course is concerned, it is more than that. That is the point. If you actually want to build systems with a million lines of code on a specific technology, of course your students will have to be completely familiar with that technology. But that is a later part. We are still talking about the core course. As far as the core course is concerned, rest assured that whatever you learn here will work. But we appreciate your point that our goal, partly, is to propagate open source technology, because you can use it legally without having fees associated with it. And in cases where the university says Oracle or DB2 or Microsoft SQL Server, there is no harm. The same knowledge can still be used.
I am just thinking aloud here. The teachers who will come will be building a course project after the workshop is completed. I don't know if you are familiar with that or not. When the actual teachers come, 10,000 teachers, they will form themselves into small teams.
And they will then be building an application for two weeks after the close of that workshop and submitting it, which is part of the certification requirement. As an incentive, Professor Sudarshan suggested this: suppose we judge the team assignments which those teams submit to you. There are 50 teachers at each college, each center, so there will be approximately 12 teams. There will be 12 team submissions after the two weeks of the main workshop. Would it be worthwhile to find the best submission at each remote center and recognize it? We are of the opinion that we would like to provide some kind of cash incentive award for the team which puts in the best submission of the course after the two weeks. But this will have to be coordinated by you, because you will be there. As he said, and in fact, thank you very much for giving me the right perspective: you are the teachers, and he will be the TA for the main course. We fully appreciate that. That means your responsibility is far greater. You don't have to look like Sudarshan. You have to be Sudarshan, and he has to look like you. Interesting. Yeah, I think we can have a short break.
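The remark in the discussion above, that the same Java and the same JDBC calls work against Oracle with only a percent or two of changes, can be illustrated with a small sketch. This is not code from the workshop. The class name, the method names, and the `student` table with a `dept_name` column are assumptions made up for illustration. The idea is that the connection URL is the main vendor-specific piece, while the query code goes through the standard `java.sql` interfaces.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch: all names here are illustrative, not from the workshop material.
class PortableJdbcSketch {

    // Vendor-specific part: only the JDBC connection URL differs.
    static String jdbcUrl(String vendor, String host, int port, String database) {
        switch (vendor) {
            case "postgresql":
                return "jdbc:postgresql://" + host + ":" + port + "/" + database;
            case "oracle":
                // Oracle thin-driver URL, with the last part naming the service.
                return "jdbc:oracle:thin:@//" + host + ":" + port + "/" + database;
            default:
                throw new IllegalArgumentException("Unknown vendor: " + vendor);
        }
    }

    // Opening a connection needs a running server and the vendor's driver on the classpath.
    static Connection open(String url, String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    // Vendor-neutral part: standard SQL through the standard JDBC API.
    // Assumes a table student(dept_name, ...) exists; this schema is an assumption.
    static int countStudents(Connection conn, String deptName) throws SQLException {
        String sql = "SELECT COUNT(*) FROM student WHERE dept_name = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, deptName);
            try (ResultSet rs = stmt.executeQuery()) {
                rs.next(); // COUNT(*) always yields exactly one row
                return rs.getInt(1);
            }
        }
    }

    public static void main(String[] args) {
        // Print the two URLs; no database connection is attempted here.
        System.out.println(jdbcUrl("postgresql", "localhost", 5432, "university"));
        System.out.println(jdbcUrl("oracle", "localhost", 1521, "university"));
    }
}
```

In a lab, one would pass the chosen URL to `open` and then call `countStudents` unchanged for either system. Differences between SQL dialects do show up in larger applications, which matches the point made above that deep vendor familiarity matters mainly beyond the first course.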