Good morning everyone. My name is Udayan Jain, and I am part of the edX development team. Before starting, I would like to give my sincere thanks to Professor D. B. Phatak, our project mentor, and Mr. Nagesh Karmali, our project in-charge. Now, let me introduce my team: Gagan, Yush, Rupal, Sonam, Sohan, Surbhi, and Zil. The topics I will be covering in this presentation are listed here. We will cover a brief introduction to the platform and a demonstration of what we have developed in the last two months.

What is edX? edX is a non-profit organization that is a collaboration of MIT and Harvard. edX was developed last year. They have now released the code as open source, and the open-source platform is known as Open edX. In the last two months, we have built a MOOC prototype named IIT BombayX on the Open edX platform.

This is the overall overview of edX. There are various parts. One is the LMS, the Learning Management System. Another is the CMS, the Course Management System. This is the discussion forum, where students and teachers can discuss. The other part is the grading part, in which external grading is done for open response assessments or other essay-type questions.

Now, what is the CMS? The Course Management System, also known as Studio. This is the instructor's side, where the instructor can create courses, create different events, and upload videos related to the course. One of the major features of the CMS is adding videos. On the edX platform, the instructor has to go to YouTube, upload a video, fetch its ID, and put it here. What we have done is add a button that uploads directly to YouTube. The instructor can click on that button, and a new page will open. He can fill in his credentials, upload the video, and he will get the ID.
There is no need to go to YouTube, upload the video, get an ID, and put it here. This is the editing part of the video, if you want to add some attributes to it.

Now, the course team. The instructor can also add other team members to his course as his course team. If he is authoring a course and wants a team of 5 or 10, he can add them; there is no upper limit.

Next is the grading policy. By default, the grading policy is pass or fail. It is a scale: above 50 is pass, below 50 is fail. If a teacher wants to divide the grades into four parts, A, B, C, and D, he just has to click on this plus button and the grade range will be divided. You can drag and drop the boundaries, which means you can move the limits, for example to 30 or 40. The instructor can also define a grace period for the students. If the instructor has created an assignment and wants the students to submit it within 2 or 3 hours, he can define the grace period here. The grace period, or the deadline, starts from the time the assignment goes online.

Assignments are the major and most important part of any course. The instructor can define various kinds of problems, which are listed here. There are basic problems such as multiple choice questions, text input, and single-word answers. There are also advanced problems: you can add a circuit to define your question, Python-evaluated input questions, and various kinds of image input questions. These are the advanced questions the instructor can add to his assignments.

Now, what is the LMS? The Learning Management System. It is the student side of the system, where the student has to register himself. After registering, he can view the courses, search for courses based on his field, and enroll himself.
After successfully completing a particular course, he will be awarded a grade or a certificate based on whether he has passed or failed, as the instructor decides. Now, the basic feature of the LMS is self-paced learning. The student can watch the videos any number of times. The most important feature of watching videos is that, as you can see, there are subtitles running at the side. So if you want to watch a particular part of the video, not the whole video, you can click directly on that subtitle and the video will start from there. This is the major feature.

This is the student dashboard. We have added one feature here: as you have seen, there is an area of interest, and it shows computer science. Why is it there? On the edX platform, when you register, they never ask for your preferred field or area of interest. What we have done is ask the user at the time of registration what his area of interest is. Based on that area of interest, recommended courses are shown to him, but only those courses in which he is not enrolled. Here there are already two computer science courses and he is enrolled in both, so no courses are showing now.

This is the discussion part, where all discussion between students and teachers takes place. The student can add various kinds of posts and can search based on date, words, and comments. Now, the Django cache system will be explained by my colleague.

Good morning one and all. As Udayan spoke about the edX platform: the edX platform is built on Django, and Django is a web framework that follows the model-view-controller pattern. It has a wide range of features, and a major one is a good cache system. So I am going to speak about the Django cache system. Its main features are that it is a very robust cache system and that it saves dynamic web pages. The major feature is that it supports several levels of cache granularity.
What I mean by cache granularity is that you can cache a part of a web page, a view, a part of a template, and so on. These are the types of caching provided by Django. We will go through them one by one.

First comes Memcached. With Memcached, a part of RAM is allocated just for caching. The main advantage, and the reason it is used, is that it can run on multiple servers. Many popular websites like Facebook, YouTube, and Wikipedia use Memcached.

Next comes database caching, where a table is created at the back end and used for caching. The advantage of database caching is that multiple databases can be used. If there are, say, cache read requests and cache write requests, a cache router can manage read requests and write requests separately. The disadvantages are that managing multiple databases is difficult and that there is the overhead of SQL parsing, that is, the SQL syntax has to be checked.

Next comes file system caching, where the cached content is stored in a particular folder on the file system. By default it is stored in the /var/tmp folder, and the user should give read and write permissions on the directory.

Next comes local memory caching, where you do not use an external daemon service like Memcached and cache only in your local memory.

We did an experiment to test the various cache systems available in Django, and the testing tool used was JMeter. In it you have to create a thread group and specify the parameters. Here I have shown example parameters. The number of users is 10. The ramp-up period is the time within which all threads should be started; that is 20. The loop count is the number of iterations for which each user repeats the request, and it is 5. These are the results obtained using JMeter. From these you can see the preference order: Memcached takes the least time and local memory caching takes the most time.

The edX database will be explained by Govind.
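As a rough illustration of the four cache backends just described, a Django settings fragment might look like the sketch below. The server addresses, table name, and cache name are assumptions made for the example, not values from the project, and the Memcached backend class shown is the one tied to the python-memcached package in older Django releases (newer releases ship other Memcached backends).

```python
# Hypothetical Django settings fragments, one per cache backend described above.
# In a real settings.py only one of these would be assigned to CACHES.

MEMCACHED = {
    "default": {
        # Backend built on the python-memcached package (older Django releases).
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        # Several servers can be listed; together they act as one cache.
        "LOCATION": ["10.0.0.1:11211", "10.0.0.2:11211"],  # assumed addresses
    }
}

DATABASE = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        # Assumed table name; the table is created with `manage.py createcachetable`.
        "LOCATION": "cache_table",
    }
}

FILE_BASED = {
    "default": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        # The web server user needs read and write permission on this directory.
        "LOCATION": "/var/tmp/django_cache",
    }
}

LOCAL_MEMORY = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "LOCATION": "example-cache",  # assumed name, only needed to tell caches apart
    }
}

# Pick the backend to test; swapping this line is enough to switch caches.
CACHES = MEMCACHED
```

Switching the last line between the four dictionaries is one simple way to repeat the same JMeter run against each backend.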
Good morning everyone. Now I will be explaining the edX database. edX currently uses two database systems. One is SQLite, the relational one, which is used to store user data and user information. The other is MongoDB, a non-relational database, which is used to store the course content. In a production environment, however, SQLite is replaced by MySQL.

As SQLite is a relational database, it stores the user profile, registration data, course enrollment data, and test and grading data all together. In IIT BombayX there are 86 tables: 85 belong to the edX platform itself, and one we added during development. This is the categorization of tables in the edX platform. On the left you can see user and group information, which includes most of the Django legacy database tables. On the right are the course enrollment tables, in which the data about course enrollment is stored along with the student ID.

MongoDB is used for storing the courseware content. It is a NoSQL database that is document-based. So it has no predefined schema; unlike in a relational database, there is no schema for relations, and you can modify the structure according to your need. It is used to store the course policies and courseware content. A distinctive feature of MongoDB is that it stores data as key-value pairs, which helps in optimizing the retrieval and appending of data. In the edX platform there are two MongoDB databases, xcontent and xmodule. xcontent stores the file system content, and xmodule stores the metadata.

Two database systems are used because MongoDB was not used previously. An XML file system was used, and the transition was made from that to MongoDB. But the relational database already existed, so they have not migrated it to MongoDB.

Now edX ORA will be explained by Surbhi.
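To make the "no predefined schema, key-value pairs" point concrete, here is a minimal sketch that uses plain Python dictionaries to stand in for MongoDB documents. It does not talk to a real MongoDB server, and every field name and value in it is hypothetical rather than taken from the actual edX collections.

```python
# Plain dicts standing in for MongoDB documents; all names are illustrative.
course_a = {
    "_id": "course-a",
    "category": "course",
    "display_name": "Introduction to Databases",
    "grading_policy": {"pass": 0.5},
}

# A second document in the same collection may carry different keys:
# no schema has to be declared or migrated first.
course_b = {
    "_id": "course-b",
    "category": "course",
    "display_name": "Programming Basics",
    "videos": ["video-1"],
}

# A collection modelled as a mapping from document id to document.
collection = {doc["_id"]: doc for doc in (course_a, course_b)}


def find(docs, **criteria):
    """Return the documents whose keys match every given key-value pair."""
    return [
        doc
        for doc in docs.values()
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Appending data to one document does not affect the shape of the others.
course_b["videos"].append("video-2")
course_a.setdefault("tags", []).append("sql")

all_courses = find(collection, category="course")
db_courses = find(collection, display_name="Introduction to Databases")
```

In this sketch `find` plays the role of a simple equality query: documents are matched on whatever keys they happen to have, and a missing key simply fails to match.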
I will be covering the next three sections: edX ORA, XQueue, and EASE. When talking about any MOOC, one of the important things to cover is assignments. In our IIT BombayX, apart from the general multiple choice and objective questions, we have a provision for open response questions. ORA is used for grading those open response questions. It supports three types of grading: self, peer, and machine. This is the basic flow, which explains how the interaction runs. When a teacher creates an assignment, he can set which assessment is to be used, or a combination of them. When he creates an assignment, he also creates a rubric against which the students' answers should be graded. In self-assessment, the student grades himself based on the rubric. Peer assessment goes through a calibration round: the student first grades himself, and then he goes to the calibration round, in which a few professor-graded answers are kept. He grades them, the result is compared with the professor's grades, and only if it is found satisfactory is he allowed to grade others. Coming to AI, it is still a work in progress. They have enhanced it a little: they have made the EASE module, the Enhanced AI Scoring Engine, which I will cover in a later section. The AI algorithm has been developed as I said. For peer grading, first you have to grade your own essay, then go to the calibration round with already-graded essays and compare your results, and only then can you grade your peers.

Coming to XQueue: we have discussed the LMS and we have discussed edX ORA. The LMS is where you submit your assignment and ORA will grade it, but how do these two interact? For the interaction there is a component called XQueue. When you submit an assignment from the LMS, an HTTP POST request is created with two URLs: one identifying where the request is coming from, and another where the final graded assignment is to be sent. As the name XQueue suggests, these assignments are handled in a queue.
So the incoming submissions are enqueued, and they are dequeued when the graded ones come back. This is the basic flow.

The EASE module is used for AI grading. It can be used for both numeric values and free text. I tried testing EASE with a set of essays. I created a prompt about extending the duration of our college program by two years, from four years to six, something like that. I collected a set of around six essays. There is a function in EASE where you submit essays and, based on synonyms, it creates additional essays. With those essays and the additional essays, I tried training a model. What I found was that the feedback was not being generated properly, whereas as far as grading is concerned, the scores were fairly reasonable. For example, they had mentioned that if you give the prompt itself, the given question, as your essay, it will grade it as zero. So I tried giving the prompt as the essay and it gave a score of zero. Then I tried a very rough essay that was completely off topic, and it gave a low score based on grammar, since the grammar was correct. But since I did not have a large collection of essays for training and testing, this still has to be studied in more depth.

This is basically how it works. After creating the model, it dumps the data into a JSON file, which can be used later for other purposes. It selects the algorithm: there are two basic types of algorithms and it selects one of them. Then it creates an essay set object and adds the essays to it, extracts features, gets the classifier, and so on. There is a Python library called scikit-learn that already has machine learning implemented, so it uses that. We will now go to the demo, which will be given by Zeele.

Very good morning to all. Today I am going to show the features that we have created in the edX platform, in the LMS and CMS. These four are the features. One is the YouTube upload that Udayan described.
The second is course popularity: previously the edX platform did not show how many students had registered for a particular course, so we developed this. Another is keyword search: you can search on various keywords. Another is recommended courses, which depend on the student's area of interest. So I will show the demo. This is the LMS after the student has logged in, which shows the courses he has enrolled in, and here is the area of interest. Right now it is computer science, so two courses are shown. If I change the area of interest to humanities, the courses will change accordingly. The server is down, so we have to start the server.

Did you have a server-class machine to experiment on, or did you load it on your laptops and PCs only?

When we started, we started with our laptops, and later we got a machine.

When you tested the performance using JMeter, did you use the desktop machine or the server machine?

The server machine. When I used a normal machine, it was not showing the expected results. So I used the server machine, and then the results were coming.

The issue is one of determining the scalability of the software architecture. Strictly speaking, what you should have done is first use only part of the resources of that server, by logically shutting down some processors or some memory, then increase them and rerun the same test, to know whether you get better performance if you add hardware or not.

That test has not yet been performed.

But the JMeter scripts that you have written are all there as part of your submission?

Yes sir, we submitted those. I wrote all about it in the report.

So if somebody has to repeat that test, it will be possible to do that.

Yes, I wrote the procedure steps and all.

This testing of essays, by the way, is an extremely open-ended issue. A final answer has not been found. For example, I asked Dr.
Anant Agarwal, when he said that essays can be evaluated. I said we have collected about 2,400 essays written by teams of about 10,000 teachers. Can they be graded? He said of course yes, without even asking me what the essays were about. Each essay was supposed to come up with some idea on how to use Aakash effectively for education, and the education was to span everything from school education to higher education. In such a situation, practically every essay, or if not every essay then the majority of them, is expected to dwell upon a completely different idea. How do you evaluate that?

For evaluation it uses things like: given a prompt, how many related topics and words you have used, the length of the essay, grammar mistakes, and spelling mistakes. It uses the WordNet dictionary.

So that is the whole point. In general, automatic evaluation of qualitative parameters is a very hard problem compared to automatic evaluation of quantitative parameters. Grammar mistakes and spelling mistakes are quantitative parameters. Length of the essay is a quantitative parameter. The creative idea, the sensibility of that idea, whether the idea is implementable or not: nothing of that sort can yet be automated. So that is okay, that is the restriction we live with.

Talking in the context of courses, why do you mention YouTube? Is that the only place where the video resources are kept?

Sir, when they released the Open edX platform, the only feature they provided at present is that the instructor first has to go to YouTube, fetch an ID there, and submit that ID. There is a provision that you can use your own servers, or deploy your own servers where you have stored the data. But for that you have to change the complete code.

Why?
Because there is no provision here. If I am using some other server, like an Amazon server, and I want to upload a video to the Amazon server, there is no option here. They are not providing any option for that.

It is not a question of option, but why do you say that the entire code will have to be changed?

Sir, there is a particular block for the video component, like the XModules.

Agreed, but in the entire edX, let's say they have standardized on YouTube. At how many places in the code will any specific reference to YouTube ever come? So there you are. Why should it be?

No, but we have to change that XBlock which is related to the video component.

You don't have to change that XBlock. You have to modify it. Now, how many lines of code do you have to change is the question I am asking. It is my humble suggestion that there should not be more than 10 lines of code to be changed, because if it is more, the code is obviously not written in a modular fashion.

Here we have just appended the button code. Nothing else.

How long did it take for you to do that? How many lines of code?

50 lines, not more than 30 lines.

What is the total lines of code of edX?

3,860,000, as of two days ago. 3,860,000 something. Two days ago.

There are about 3.8 million lines of code. That is a lot of code. And there are only 5, 6, 7, 8 of you. So how many lines of code have you actually been able to look at, if any? It is written in Python. Have you all become experts in Python now? How many of you knew Python before coming here? Oh, not bad. Lots of them. Is that how you picked up this project, or did that just incidentally happen?

Python and Django were not prerequisites. We had to learn.

Python and Django were prerequisites. So those who knew about it.

How many of you have studied Django earlier? Normally. But in general, you would have studied the notion of caching.
Of the caching that you described, how much is on the server side and how much is on the client side, if any?

There is server-side caching and there is client-side caching.

So is most of the caching that you mentioned on the server side?

Yes, sir. We tested on that. All of that is server side.

And that helps. Is there any provision to arbitrarily increase the memory allocated to the cache?

Sir, usually you can use Memcached, but there is no requirement to increase memory. With Memcached you can use multiple servers with the same cache. There is an entry called LOCATION in the settings file.

The question I am asking is slightly different. We expect about one lakh students to register on this MOOC. Suppose one million students register. That means the number of people who would go online simultaneously, particularly around the deadlines of quizzes, will be very large. That is when your performance requirements will come up. During that time, if I wish to allocate more memory to the cache, for Memcached let's say, can I or can I not do it?

The configuration. Memcached is used through a Python binding. It comes as a package, python-memcached. Maybe you have to change the Memcached configuration files, and then it might work. So it is basically configuration-based.

Good job. The demo is ready. Let me go through it.

It is getting delayed, but that's okay. We will cut down on the tea time.

Okay. This is the dashboard of the LMS. Here are the courses that are registered in the CMS. If I search for database, it will show the results: it shows the database course. Another feature is the number of students enrolled, which shows the popularity of the course. We have just created it. This is the course page, where all the details of the course are visible. And the other is the YouTube feature. This is the CMS. Here port 8000 is for the LMS and 8001 is for the CMS. I will quickly show the YouTube upload feature.
If I want to upload a video, I have to specify the details. Here I am using the YouTube Python API for uploading videos, so I have to enter the title and the other information. The main restriction here, for security reasons, is that all videos should be in the same folder. That is because the browser's File API does not give the whole file path, as a security measure. If it did, you could see all the files on the computer system. So it gives just the file name. To use this YouTube feature, we therefore have to put the videos in a particular folder. This will give the YouTube ID of the video. It will just take one minute.

Now we will continue with the features. edX has tried to make the MOOC platform complete in itself, but there are still some features left to implement. The search could be refined with category search or school search. There is no provision for rating a course, for example out of 5 stars. The AI grader needs to improve on creative or qualitative analysis, as he suggested. There is also access control for staff: all the members of the course team have equal rights, so there is no provision to add something like TAs to manage a huge course. Also, a user interface to create programming assignments, so that inputs and outputs could be uploaded as files, could be implemented. For programming assignments, they have provided a sandbox in which you are able to run the program. But to make it easier, we need to provide an interface such that you can upload an input file and an output file so that the output can be compared.

These are the references used by our team. Thank you.