I would like to thank Professor Phatak deeply for his opening words. I should mention that I am here today because of Professor Phatak in more than one sense. I am here today because Professor Phatak started this wonderful program of reaching out to teachers across India through the National Mission on Education through Information and Communication Technology. And Professor Phatak is always one step ahead of everybody. When we were thinking of what to do with technology for our courses here at IIT, Professor Phatak was thinking about how we can use technology to reach out to teachers. And when I am here reaching out to teachers, thanks to Professor Phatak, he is thinking of the next step, which is online open courses. And what is the next step after that? Well, we will find out from Professor Phatak, perhaps. I also have to thank Professor Phatak for my being here at IIT Bombay, because he is the person who recruited me here, and the kind of enthusiasm which he has shown throughout his career, including as head (I was recruited when he was head of the department here), is a source of inspiration to all of us here. And I hope some of it has rubbed off on me, and I hope some of it will rub off on you through your interaction with Professor Phatak today and, I am sure, later in the course.

So, before we begin the technical part of the course, I want to spend a little bit of time on the administrative details, starting with the very basics. Why are we all here? What is this course about? That is the very first question which one should answer. Professor Phatak, let me use the whiteboard. So, what is this course about? Many people have different views on what a course like this should be about. There are teachers out there who have taught database systems for many years. They are experts in this field and they are probably looking for a course here which talks about the latest advances in database systems. There are others who have taught courses and are probably doing a Ph.D.,
who are looking for the latest advances in database systems to help them with their Ph.D. But then there are many, many more who have probably taught this course not at all, or once or twice, and who would like to get more help on how to teach the course. So this overall program, the National Mission, I believe, is focused on helping teachers to teach students. There is a different component, which is helping teachers to advance themselves, which is not really part of this program. I am sure many of you are looking forward to it, and I am sure there will be other programs coming forth in the future which will cover these aspects. In fact, those things should be done not necessarily by a 10-day program like this but by weekend programs and periodic updates, maybe including beaming of lectures given by leading researchers at conferences in India, and so forth. So those are things we will work on, to make the latest advances accessible to you near where you are, if not exactly where you are.

But this particular course, as I said, is focused on how to teach database systems to students. Now this is a very tough task, as most of you who have taught this course know. There are different aspects to database systems: there is theory, fairly intricate theory which students struggle over in some cases, and then there is practice, which is how to build systems. Now how do we cover all of this in one course? There is no one universal solution, but based on our experience here at IIT Bombay, one thing I can say for sure is that many students really like the hands-on aspect of a course like this. They like courses where they can do something new, not just read something from a book and reproduce it. And the database systems course is interesting in this respect because it gives the foundation for building applications. In fact, one of the parts of this workshop itself is to do a project which builds a small application. The goal here is to build a fairly small application because time is limited.
But when you teach this course to students, as when we do this course over here, students have a good amount of time. They have a whole semester, with maybe four or five courses in parallel with this one. And one of the things our students do is develop really excellent projects which have a database system underlying the whole thing. But then there is a lot of stuff on top: designing what should be done in the project, what is the goal, who are the users, how should it be designed, what should the interfaces be. There are a lot of very interesting design decisions here, and this is something which really motivates students. I have had students who have done wonderful projects. The weak students may do a basic project and learn the basics of the course, but the advanced students go and come back and usually end up teaching the teacher something new which we did not know. And I think a course is really successful when a student shows me something which I do not know.

Now, one of the participants in the workshop coordinators' course held a week ago over here was asking me: "I have some really good students who come and ask me questions to which I do not know the answer. What do I do to improve myself so that I know the answers to all the questions?" And that is fantastic. I mean, to hear from a teacher that they are always trying to improve themselves to be able to answer questions is really what teaching is about. But that does not mean that you should know everything; we are not omniscient. [inaudible] In, say, 15 [inaudible] to a librarian and done anything like this? I would suspect no one. What is it that all of you use these days? You go to a search engine, you go to Google or Bing or Yahoo or whatever else it is, type a few keywords and get back your answer. In those days teachers would provide this kind of a service for the specific course that we teach. Today students have a different avenue: they can go search, and more recently not only can they search, they can go take a whole course online.
What is the goal of a teacher here? It is clearly impossible for the teacher to know everything, and it is absolutely fine for the teacher to tell the student, "Look, I do not know. I am very happy that you are asking me questions about things I do not know, because this is an opportunity for both of us to learn, you and me." And in fact, every time I have taught this course, some students have gone out and found something new, often as part of their projects, and come and shown it to me, and I realize, hey, that's a good idea, maybe we should take a closer look. Take some of the things which we cover in this course, for example servlets. When servlets were absolutely brand new, I had not yet heard about them. Some students found out about them and did a project using servlets. And that's when we realized that servlets are a powerful technology which maybe we should take a closer look at. So this is one part of what a teacher should be doing: helping a student to find out more, being happy when a student asks questions which the teacher doesn't know the answer to, and then going out and finding out the answer.

So the next thing is that database courses in many places in India are taught as a theory subject with a little bit of practice writing a few SQL queries. It's very important that a course like this be done differently, with a lot of hands-on work, and half the time in this workshop is going to be spent on hands-on activity. Every single afternoon is going to be spent on hands-on activity, most of it involving writing SQL queries or writing programs, and part of it doing tutorials which are on paper, not on the computer, but still done on your own. That is a very important part of this course. And as Professor Phatak said, there are assignments and there is a project which will help you get familiar with all this material. And some of you are already familiar with it. For you, it could be a refresher and a chance to ask questions which you have always been curious about.
I had a few workshop coordinators come and ask me questions: "There's something which has been bothering me for a while. Can you answer it?" I could answer it in some cases; in some others maybe I couldn't. But do ask such questions.

Now, coming to asking questions. Today's stream has been entirely one way. I have been talking, and before this Professor Phatak was talking. And that's not how we teach courses at IIT Bombay. And I'm sure that's not how many of you teach courses. It has to be two-way. You should be able to ask questions. Now the A-VIEW software which is used for this course has a very nice system for interaction, where we can hand the mic to other centers. People can ask questions, everyone can see the question, and then we can answer it. However, practically speaking, with 250 centers and problems with audio in various places, we have found that there's a lot of overhead to asking questions, and the number of questions that can get asked, especially during the lecture, is limited. So we have evolved another mechanism which we are using in this workshop: whenever there are questions at a remote center, you can write down the question and hand it over to the coordinators or the support staff over there, who can type it into the chat window in the A-VIEW system. We have a person here who is monitoring these chat sessions. And whenever there is a question about an ongoing topic, this person will highlight that question and show it on a screen which I can see. And I will stop periodically to answer these questions which people have been asking. So I really want it to be two-way. In addition, we will have time at the end of each morning, just before lunch, where we can have more interactive sessions, with maybe video from other centers and so forth. But during the lectures, please do ask questions through the chat mechanism.

So now let's look at the schedule for the course. So if you can read the small print here, it's good.
If you're not able to read it, don't bother. But let me explain the overall schedule of the course. So we have two sessions in the morning, of one and a half and two hours: that is, one and a half hours, and then one and a half hours plus half an hour, which are lecture-based. And from 2 to 5, we have labs or tutorials every day. And from 5:15 to 6, we have scheduled sessions on several of the days. The other days are currently left blank. Today is definitely left blank. But we do have an opportunity to have other things going on in the evening if people are interested and willing to stay back and listen.

So as an example of the evening sessions, tomorrow evening we will have a session on Linux system administration, which is not technically part of this course but is useful for all of you who are setting up labs for students. So that is something which we want to emphasize in this course also. All of you are teachers, many of whom are in charge of labs back at your university. So you are in charge of setting up software and setting up systems, managing them and so on. So the purpose of this particular session tomorrow evening is to assist you in setting up software on Linux in particular. And I'm sure some of you use Windows as your platform. We don't have any workshop for Windows specifically, but if there is a demand, it could be taken up as part of some other workshop. And the next scheduled evening session is on Friday, which is on setting up Moodle. I'll come back to Moodle in a moment.

And you can see the schedule of the topics which we cover, with the first three days being SQL. And along with these, we have lab sessions. Now some of you may be using other textbooks. Some of you may have used earlier editions of the textbook which we are going to use for this course, which is Database System Concepts, co-authored by Professor Silberschatz, Professor Korth and myself.
And in the last edition of this book, we made a significant change, which was to move SQL up front. We do SQL really early, so that there is a parallel hands-on laboratory where people (in this case you, but when you go teach the course, students) can write SQL queries. And if you continue down the workshop schedule, you will see that ER design comes after SQL, which is the opposite of how it was done earlier. What we found is that students don't really appreciate what ER modeling is, or what database design is, at the word go. But once they have seen relations, once they have written SQL queries, they get an understanding of the power of relational database technology. And then they can understand the more subtle aspects of database design through the ER modeling process and the issues of normalization, which is why we do it after we cover the basics. And as you can see, the afternoon sessions on the last two days of this week, Friday and Saturday, are tutorials on ER design and normalization. Now these two sessions will be in the same room as the lecture sessions, so that there can be interaction. In particular, we will have an interactive session at the end of the tutorials, led from here at IIT Bombay, on both these days. On the other days, the lab sessions are entirely local. There will be no interaction during the lab session.

Now moving on to next week's schedule. On the first day, we are going to focus on building web applications in the first session. And then we will go into the internals of databases after that. So as you can see, a significant part of this course is on database internals. Now there are courses which might split this, with the database externals, such as SQL, the relational model, ER modeling, normalization, and so forth, including application development, as one course, and database internals as a second course. In fact, we did it that way at IIT Bombay.
But there is an ever-increasing number of topics in computer science. Back when we did this, data mining was unheard of, or it was a new subject; it was not an undergraduate subject. Today, many people need to cover data mining, if not as a core course then as an important elective. So the database part has been compressed into one course which covers both the externals and the internals. And I believe most universities have also been covering internals as part of their course. And this is a good trend, because when people go out there and use databases, there are a whole lot of things they can do without knowing internals. But sooner or later, they hit a wall. And unless they know what is going on underneath, they cannot appreciate why something is not functioning the way it is supposed to function, why things are not working out. And it's very important that people have a basic understanding of what is going on behind the scenes in databases. So the next three days over here are focused on internals, including storage, indexing, query processing and query optimization, transactions, concurrency control, and recovery.

But in between, we have one new topic, on next Wednesday, which is big data. Now, this is a term which many of you will be seeing all over the place, in newspapers, on websites, wherever. It's become a very, very big buzzword. It's overhyped to a large extent today. But there is a core to this which is very important. Parallel processing is now extremely affordable, and many, many companies are using it. So one of the new things in this course is that we are going to cover a little bit of parallel processing, in particular the big data aspects of parallel processing. And we will cover MapReduce and other big data systems, which hopefully will be novel to some of you. So even if some of the other material is something which you are familiar with, we hope there will be some new things here.
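As a rough illustration of the MapReduce idea just mentioned (this is an editorial sketch, not material from the lecture and not the Hadoop API), here is a minimal single-machine word count in Python. The function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, chosen only for this example, and the steps run sequentially; a real system would distribute the map and reduce calls across many machines.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence over a list of documents."""
    # Assumption: everything runs in one process. In a distributed setting,
    # each document (or block of a file) would be mapped on a different node.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data about data"]
    print(word_count(docs))
```

The point of the structure is that `map_phase` looks at one document at a time and `reduce_phase` looks at one key at a time, so both can be run in parallel on independent pieces of the data.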
As part of the big data topic, we also have a lab next Thursday, titled Big Data and Hadoop, which will expose you to these topics, which are currently very hot in the market. And finally, on the last day, we will have a quick coverage of data mining and information retrieval. Now, some of these are covered as separate courses, but I thought it was useful to have a quick introduction towards the end of this course, maybe covering some of the highlights of these topics.

And we also have labs on query processing and on transactions and concurrency, which is not something which even we at IIT Bombay had traditionally done earlier. We covered these more as theory subjects in the basic course, and as something which people could learn and then go and modify the internals of a database system, in our case PostgreSQL, as part of an advanced course. But we have started having labs on query processing and concurrency control as part of the basic course, so that students can see what is going on inside. And luckily, for both of these, you don't have to go look inside the database system's code. All database systems have made it possible to look at query plans, and we can understand what is going on under the hood. And similarly, for concurrency control, we can set up a situation where things are happening concurrently and see what happens when you do these kinds of things. So this kind of hands-on work is pretty useful to students. So, that's a quick view of the schedule for the course.

Now, coming back, let's see the Moodle site for this course. All of you hopefully have logged into Moodle. If you haven't, make sure you do so later today. And you will see all the course materials available through Moodle. We have the slides which I cover. We have extra information. So, as I said, we have all the course material on Moodle. It's accessible to you once you log in. And that includes this first topic, which is on software setup and lab administration.
As I said, many of you will have to go back and set up the labs; if not the whole lab, at least the database software required for the lab at your college or university. And we have provided a lot of material to help you in this, some of which you will actually be doing as part of the workshop, and some of which you may not be able to do as part of the workshop, but the information is there so that you can go back and try it out. Now, when you do these things, you will run into issues. What do you do in such a situation? One of the nice things about a workshop like this is that there are many people around you. There are workshop coordinators who have come here and done the same things, and they have an idea of what goes wrong and how to solve it. And then there are your colleagues who are doing this course, some of whom may have done these kinds of things before and can help you out. So it's really important that you go through all these steps and try them out, and I will pretty much guarantee that all of you will run into problems somewhere along the way. Go ahead and take the help of others to resolve them, but understand what went wrong so that when you go back, you will be able to help others. So that is an important part of this afternoon's lab, which is setting up various systems.

And this afternoon we have the labs which involve, as I said, setting up PostgreSQL and pgAdmin, then getting familiar with these tools, and then writing some simple SQL queries on this. Now some of you may wonder, why are we using PostgreSQL here? Many people have Oracle as part of their university syllabus, and the university doesn't really give you the flexibility to go run a lab in PostgreSQL. Now I don't think it's a good idea for a university to tie down the syllabus to a specific product. Yesterday Oracle was dominant; tomorrow something else may be dominant. We don't want to train people just for one product. However, when we have to do labs, we have to use some product.
You can use Oracle, you can use PostgreSQL. We chose PostgreSQL here because it's open source. It's very easy to set up, a lot easier than Oracle. And most important, if you want to go beyond externals, if you want to look at internals, it's all open source, it's available. You can download the code, you can make changes, and indeed our advanced course, covering internals, is based entirely on mucking around in the internals of PostgreSQL. And PostgreSQL is a very nicely written piece of software, in which it is relatively easy for students to get in and muck around. I'm not saying it's easy; it's very complex. It's hundreds of thousands of lines of code, and it's not an easy job going around and doing things inside it. So it's certainly not something for the introductory course. But it is something which students might want to do as projects, or if you have an advanced course on internals, it is something that you may want to do. So we find PostgreSQL to be a very nice product, very well written. It has very rich coverage of SQL features. It's very good to learn from. That's why we use PostgreSQL.

Now you will also notice that there are assignments which require a submission. If you can see the Moodle page, assignment zero over here does not really involve a submission; it's just something you need to do. But assignment one has a submission, where you write SQL queries and submit them. So Moodle is very nice for allowing people, you in this case, or students later on, to submit assignments. And many of you would want to use Moodle back in your college, which is why we have a session on how to set up Moodle. It's pretty easy to set up, actually, and we will help you to do this. We would strongly encourage you to go back and make use of Moodle or some other equivalent tool to manage assignments. It's been a big improvement for us over the old days, when we had students submitting printed assignments which had to be graded manually. This is a lot better.
And one of the nice things about submissions which are online is that it is actually possible to check for plagiarism. This is a big problem, with students copying things. Now how do you detect it? If it were printouts which somebody corrects and returns, it's hard to do this. Once things are online, there's a lot of software out there which can help you detect plagiarism. There are subscription-based services which you pay for, and then there are open source tools too. And later on we will discuss some of these things. But once things are online, it is possible to run these tools and check for plagiarism. This serves as a strong deterrent to students and forces them to learn how to do things. And this is something which, incidentally, we will do, maybe not during the course but at the end of the course: we have all your submissions online, so it is eminently feasible for us to run plagiarism detection software on them.

So as Professor Phatak said, I would urge all of you to please try to do these things on your own. Now there are times when you may be stuck during the lab. What do you do? You don't know how to proceed. It's perfectly fine to ask somebody for help. You can ask your colleagues who are attending the course. You can ask the coordinator at your remote center. You can ask the teaching assistants that they have at the remote center for help. And take their help, but you have to write the queries or write the program on your own. It's not that you just take the code from them and then submit it as your own. You have to write it on your own. That's very important. It doesn't mean you struggle and struggle with no idea what to do. You can take help, but please do the final thing on your own, and don't take help immediately. Do the first steps yourself. Try to make some progress. If you are stuck, by all means take help. We want you to learn. And similarly, when you conduct the labs for your students, the same thing applies.
You want your students to learn. Not every lab is a graded assignment. But it is important to have grading to encourage people to do some work. So what I do is have some ungraded assignments, but even for assignments which are graded, I do allow students to talk to their friends and take help. All that I ask is that at the end, you report that you took help. If you took help from someone, say who you took help from and what help you took. But the important thing is to learn. Grading is secondary. It's important, but it's secondary. Unfortunately, many of our university systems emphasize exams at all costs. Learning has become a secondary aspect. And I know many teachers are very distraught, very upset by the fact that students seem to focus on the certification at the end of it rather than on learning. Sometimes teaching can be a thankless job, when you have a lot of students who don't really care about learning. But invariably, when you have hands-on work, students are quite enthusiastic about it. Not all students, but the good students are. And it's often a pleasure to see students learning hands-on. And I personally find labs to be very enriching, even more so than the theory aspects of the course.

Okay, now coming back to the tools. Moodle is one of the important tools we will use. Moodle has news forums which could be used for discussion. But in fact, there is another very nice tool called Piazza. It's an online service. Let me open up Piazza in a new window. So let this load and then I'll tell you what Piazza is about. It takes a minute. So what is Piazza? It was started, incidentally, by a person from Mumbai who, while she was a student doing an MBA at Stanford, wanted to be able to interact with other students although she was not physically on the campus, in a hostel or whatever. And she couldn't meet them physically to discuss things about the course. So she went out and built a system and then built a company around it.
And her name is Pooja Sankar. She did her BA in Mumbai and then her master's and her MBA later on in the US. And this is a really nice tool. It's enormously successful. It's being used at all the major universities. It's being used at Stanford, at MIT, and we have been using it at IIT for a couple of years now. And I'm not sure why it's not loading right now, but you can try it out later on. What Piazza allows is for you to post questions. And when you post a question, other people can answer. I can answer it as instructor. Other people here, the TAs for the course, can answer it. But most importantly, other students can answer these questions; in this case, other teachers who know the answers. And that's really beautiful, because you get really fast turnaround if everyone is enthusiastic about this. It's a feedback system where, if everyone is enthusiastic, it becomes a really, really useful tool: you post a question and within minutes you get answers. It's not that you have to wait until the next day for the instructor to find time to answer questions. So I encourage all of you to participate both by asking interesting questions and by answering other people's questions.

And one of the nice things about Piazza is that you can edit other people's answers. You will always find someone who answers something where the answer is actually wrong. Now, how do you correct it? If you are sure the other person's answer is wrong, you can go edit it. No problem with that. I'm fine with that. Now, sometimes your edit may be wrong. Someone else may come and edit it. It's okay. Eventually, things will settle down, hopefully to the right answer. Wikipedia has been very successful with this model, where anyone can go in and edit anything; well, with some restrictions on controversial topics. So Piazza provides these kinds of features, which will let you ask questions and get answers.
And this is something which you can continue using beyond the timeframe of this course, going forward into the next semester or more, while you're teaching courses. You can come back and ask questions. If students ask you a tough question which you don't know the answer to, come back and ask it on Piazza and somebody will answer. So that's the second tool.

The third tool which we will be using is a clicker. Now, I hope that most of you have been issued an Aakash tablet with the clicker software on it. It is likely that not all centers have it, but many of you have it. And one of the nice things about clickers is that there is instant feedback. In a class where I can see all of you, I can ask for a show of hands and say, how many of you know the answer to this question? Unfortunately, I can't see all of you. If I have 10,000 of you, each of you would be maybe 20 pixels on my screen. So I can't really see it, but we can have a clicker and ask questions which you can answer, with multiple choice in this case.

On the course outline, I think I have covered all the basic points. Let me see if I missed anything here. So let me just wrap up this administrative session by saying: in the labs, please use the time to do all the exercises sincerely. Even if you do not submit 100% of the answers, it's okay. You can actually go back in the evening and do it offline. If you have access to the web, you can log in to Moodle and submit your answers later in the evening. And all the assignments have a deadline, but we have not prevented late submissions on any of them. So I urge you to take extra time, until the solutions are put up. Now, once the solutions are put up, we don't want you to submit anymore. You would have seen the solution. But until that point, if you're late, it's okay. You can still try it out, but do make every effort to complete it.
But if, for whatever reason (some of you travel long distances, or you may have other commitments), you're not able to do 100% of the assignment, it's okay, as long as you give it 100% commitment during the three hours of lab that you have every afternoon. So please stay for the entire period and give your 100% in that time. And as long as you do that, it's okay if you don't complete 100% of the assignment.