the open source assessment module. So what is edX ORA? edX ORA is developed for grading open assessments, meaning essay-type questions where no automatic checking can be done. The strategies ORA implements are grading using machine-learning algorithms (at a certain cost of grading), peer grading, self-grading, "none", which means no grading, and basic checks, which cover normal spelling and grammar checking. We tried to modify this system, and we made PEAS, which stands for Peer, Expert, Auto-grader, Self Evaluation System. The components present in PEAS are peer, expert, auto-grader, and self, as already mentioned.

Next slide. This is a flow diagram of how the peer grading works. We studied a lot of methods and tried to enumerate all the problems a grading system has. In a peer grading system, the basic problem is that every peer tends to follow a particular strategy when grading others, and to cope with all those strategies you need ways to mitigate those actions. For example, suppose I am a student: I may give my peers very high marks in the hope that, in turn, I will also get good marks. Or, under relative grading, I may give others fewer marks so that my own grade comes out higher. To handle all of this, we designed an algorithm, and initially, when the system starts, there is a calibration phase.

So what is calibration? Calibration is done when no user profile is available, when we have no statistics about a student's ability as a grader, so that we can estimate how well the student can grade. The calibration models we support are peer, expert, and none. In expert calibration, the instructor gives an assignment with rubrics describing how to answer it, answers it himself, and attaches a model grade. Every student is then asked to grade that same model answer, the student's scores are compared against the instructor's, and based on how closely they match, the student receives an incentive score. This incentive score captures the student's intrinsic capability as a grader. If the instructor does not want to grade an answer himself and only wants to supply the question, we provide peer calibration, where the comparison is done between your score and the mean score given by your peers. Calibration ensures that the very first time, when we have no history and no user profile for a student enrolled in a MOOC, we can still build a profile and use that data so the evaluation strategy and the evaluation system perform well.

How do we use calibration? Every assignment has some number of peer graders; suppose, for instance, three. We then divide the whole set of students, say 300, into three groups of 100 each, ordered by their incentive scores. This ordering means the first set of 100 are the best evaluators, the second set is medium, and the third are the weakest. When an assignment is peer graded, it goes to three different peers, one chosen from each group. This ensures uniform grading across submissions, and it also helps when assigning weightage to the peer grades: instead of uniform weighting, the system can recognize that a particular grader is simply an average peer whose judgement it trusts less, and weight that factor appropriately.
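(To make the calibration and grouping step concrete, here is a minimal Python sketch. The function names, the assumed 5-point scale, and the exact closeness formula are illustrative assumptions, not the team's actual implementation.)

```python
import random

def incentive_score(student_marks, reference_marks, max_mark=5.0):
    """Agreement between a student's calibration marks and the reference.

    Expert calibration: reference_marks are the instructor's model grades.
    Peer calibration: reference_marks are the mean marks given by peers.
    Returns a value in [0, 1], where 1 means perfect agreement.
    """
    gaps = [abs(s - r) for s, r in zip(student_marks, reference_marks)]
    return 1.0 - sum(gaps) / (max_mark * len(gaps))

def split_into_groups(students, score_of, n_groups=3):
    """Order students by incentive score, then cut the ranking into
    n_groups bands: best evaluators first, weakest last."""
    ranked = sorted(students, key=score_of, reverse=True)
    size = -(-len(ranked) // n_groups)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def pick_graders(author, groups):
    """Choose one grader from each band, excluding the submission's own
    author, so every assignment gets a strong, a medium, and a weak
    evaluator."""
    return [random.choice([s for s in band if s != author]) for band in groups]
```

With 300 enrolled students and three graders per assignment, this yields three bands of 100 ordered by incentive score, and each submission is judged by one peer drawn from each band, which is the distribution described above.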
Apart from this, the calibration ordering is done for fair evaluation, but in spite of all of it, students still manage to give wrong grades or to grade according to a strategy. These are the cases present in the system. The normal case is when you give grades honestly. The over-generous case, as I said, is when a student rates other peers over-generously in the hope of getting a good grade in return, or simply as part of a strategy; a just system has to handle this case, because otherwise a peer who did not perform well still receives good grades. Second is the creative-accounting case, the opposite of over-generous: the grader gives very low marks even though the peer performed well. And the penalizing case is when three or four peers mutually decide to make one person a target and give him low grades. This is generally not present in a MOOC, but if a MOOC offers group projects and the like, this case does come up.

So how does the normalization process work? Next table. This is a sample grading table showing what the peers have given: there are four students, and each of them, Peter, John, and so on, has marked the other three. From it we calculate the IER, the individual effort rating, which is the sum of the scores a grader has given out. The AER, the average effort rating, is the sum of all graders' IERs divided by the number of graders. From these we get the individual weighting factor, or bias factor: how biased you were when grading, computed as your IER, the ratings you gave to others, divided by the AER. And from that comes the normalization factor, which is the inverse of the bias factor.

For example, in the first case the average AER was 51 while Peter had given a total of 60 to others, an over-generous case, so his bias factor came to about 1.17, and the corresponding normalization factor became 0.85. Applied to his scores, the 5 he had intentionally given to everyone was reduced to a normalized 4.27. In the normal cases, such as the honest grader's column with an IER of 49, the scores instead rose slightly, a 4 edging up and a 5 becoming 5.23 after normalization. This is all purely based on normalization. And the process is recursive: it continues until every bias factor falls within the range 0.98 to 1.02. Once that range is reached, the scores are normalized, and these are the scores you would actually expect to get.
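(Here is a minimal sketch of the normalization loop just described, assuming raw scores are stored as a grader-to-peer mapping; the names are mine, not the team's. In this simplified form the loop settles almost immediately; the talk describes the real process as recursive, with the bias check repeated until every factor lands in the band.)

```python
def normalize(raw, lo=0.98, hi=1.02):
    """Scale each grader's given marks by the inverse of their bias factor
    until every bias factor lies inside [lo, hi].

    raw: {grader: {peer: mark}}, i.e. the sample grading table.
    """
    scores = {g: dict(marks) for g, marks in raw.items()}
    while True:
        ier = {g: sum(m.values()) for g, m in scores.items()}  # individual effort rating
        aer = sum(ier.values()) / len(ier)                     # average effort rating
        bias = {g: ier[g] / aer for g in scores}               # bias factor
        if all(lo <= b <= hi for b in bias.values()):
            return scores                                      # normalized scores
        for g, marks in scores.items():
            nf = 1.0 / bias[g]                                 # normalization factor
            for peer in marks:
                marks[peer] = round(marks[peer] * nf, 2)
```

With the table's numbers (Peter's given total of 60 against an AER of 51), the bias factor is about 1.17 and the normalization factor about 0.85, pulling his uniform 5s down to roughly 4.25, in line with the 4.27 on the slide (the exact figure depends on the full table and the recursive passes).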
Apart from normalization, which is one technique we use to get a fair grading system, we also use an incentive mechanism. The incentive mechanism compares the score you give a peer against the average normalized score the class gave that same peer. Suppose I give my peer 5 while the class gave 3: the system then knows I intentionally gave more marks than the class consensus. This produces an incentive adjustment, and that adjustment reflects both in my own grades and in my rated capability as an evaluator. So whereas the first grouping used the calibration scores, in the final process we reuse this ordering by incentive to group the students.
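(A hedged sketch of the incentive adjustment described above; the penalty shape and the tolerance are illustrative assumptions.)

```python
def incentive_penalty(mark_i_gave, class_avg_normalized, tolerance=0.5):
    """Compare the mark I gave a peer with the class's average normalized
    mark for that same peer; grading far from the consensus lowers my
    standing as an evaluator and, in PEAS, feeds back into my own grade."""
    gap = abs(mark_i_gave - class_avg_normalized)
    return 0.0 if gap <= tolerance else gap - tolerance

# Example from the talk: I award a peer 5 while the class average is 3,
# so gap = 2 and I accrue a penalty against my evaluator rating. The
# final grouping of graders then orders students by this incentive
# rating in place of the initial calibration score.
```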
Now I want to show what is different between edX ORA, the module that was released only on the first of June, and the PEAS system. Both edX ORA and PEAS provide a calibration mechanism, but the difference is that ORA makes calibration compulsory while PEAS does not; we give the instructor the flexibility to choose which calibration model he wants. edX ORA also states that it provides calibration only to teach students how to grade. Our system does the same thing, but in addition we use the calibration score to obtain an optimal distribution of evaluators. There is also, I think, a bug in ORA: if a student fails the calibration test six times, he is promoted to peer grading irrespective of his score, so a student can simply click through six times, which does not help anyone. ORA also has no normalization technique of the kind I just explained, while PEAS follows a full normalization technique so that the generated scores are what the student actually deserves. Similarly, ORA has no incentive mechanism and no optimal distribution, that is, no grouping based on incentive and calibration scores. Still, both provide a good peer review mechanism. Now I will invite the next speaker.

Good afternoon, everyone. Proceeding with the other modules our PEAS system has: the second module is self-assessment. As we all know, in the traditional grading system the instructor's main focus is providing grades, and not much emphasis is given to students' learning. Through this self-assessment module, an instructor who thinks that enhancing learning and creativity is the main purpose of a course can employ PEAS's self-grading. In the self-assessment method, students grade their own assignments based on an evaluation form, or rubrics. When a student grades himself against these rubrics, he tends to reflect on his or her weaknesses, and in the process learning is enhanced. When students grade their own assignments in this mode, the marks they give themselves are the only ones considered for their final score.

The next module we have is expert evaluation. This is in accordance with the traditional process: the instructor has the option to choose expert evaluation, and he can opt for it when a small number of students is enrolled in a course. Here, when students answer a particular assignment, the solutions are forwarded to the instructor, who grades them and provides the scores. edX ORA also has an expert evaluation module, but there are slight differences. In both systems, expert evaluation is not scalable for a MOOC, as it is not possible for an instructor to evaluate thousands of students' copies. In both edX and PEAS it is used for grading and as a training data set for the machine-learning auto-graders. In our case, expert evaluation is also used for resolving discrepancies: if an assignment was peer graded but a student feels he has not obtained the correct score, he can report it to the instructor, and the instructor can then opt to evaluate it himself and resolve the conflict.

The last module we have is the auto-grader. In the auto-grader we will be using edX's discern API, which includes the ease module, and we will also be integrating various compilers for code checking.

Apart from using the four individual modules (we are at three minutes exactly, so I will start wrapping up), the instructor also has the option of choosing any combination of the four, such as peer plus self, or auto-grading plus self-grading, and he can assign a particular weightage to the modules he has chosen. The main hybrid module is self plus peer evaluation. If the instructor wants the grades in a course to matter while also enhancing learning, he can opt for self plus peer and assign a particular weightage to the self and peer parts. In this process, when a student answers the assignment, it is forwarded both to the peer module and to the student himself, who rates his own work. We then have two sets of scores, one that the student has given himself and one from the peer grading module. These scores are compared: if the difference lies beyond a certain range, the scores are normalized and the final scores are calculated from the weighted combination; if the difference stays within the range, only the score the student gave himself is counted toward his final grade. A short sketch of this rule appears below.

Now I will call Nikit. Okay, there are some differences between the self-rating module in PEAS and edX ORA. edX ORA does not use self-evaluation for grading purposes, while in PEAS we can use it as a grading technique as well; as I explained, we can use self plus peer, so it serves grading as well as learning. And there is no incentive normalization in edX ORA, while we have incorporated it in PEAS. So these are the differences between ORA and PEAS; now I will call on Nikit to continue.
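(A minimal sketch of the self-plus-peer combination rule; the weights and the agreement threshold are instructor-chosen parameters, and these particular values are just placeholders.)

```python
def final_score(self_score, peer_score, w_self=0.3, w_peer=0.7, max_gap=1.0):
    """If the self-assessment agrees with the peer result within max_gap,
    the self-assigned score stands; otherwise fall back to the weighted
    combination of the two (computed after normalization)."""
    if abs(self_score - peer_score) <= max_gap:
        return self_score
    return w_self * self_score + w_peer * peer_score
```

So a student whose self-score of 4 meets a peer score of 4.5 keeps the 4, while a self-score of 5 against a peer score of 2 falls back to the blend, here 0.3 * 5 + 0.7 * 2 = 2.9.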
These are the various stages we encountered in the process of completing our project. Initially we started with the study of various research papers, and we came up with our method, PEAS, which has just been explained. We then built a prototype of this method in Joget, which is a workflow engine. However, we faced the problem that this prototype could not be integrated with an existing MOOC, that is, edX, which was our long-term goal.

During that time the edX platform code was released as open source, so we shifted to it and started studying edX ORA, and we tried to make enhancements on top of it. But edX ORA has not been successfully installed till date, so it was not a practical option, and so during the last three days of our internship we built a standalone web application of our entire system in the Django framework. The future scope of our project is to integrate this system built in Django with the edX platform, and we can also build an API for that purpose. Next. These are a few slides on educational applications that we have included for two of our group members who have already left; they are on shapes and matrices, educational mathematical applications. And now we would like to show these screenshots of our evaluation system. This is a screenshot of our feedback system; sorry for the UI, it could only be built in two days. This is the instructor side. Just a minute.

Why are you showing a system which was built in two days? Don't show it.

Because it is functional, sir.

I agree; I am not saying you have not made it work. Better not to show a system that is not finished. Wrap up with whatever you have learned, whatever you have done, the challenges; the future work could be to build the system. The question I want to ask you is this: you have implemented a lot of mathematical algorithms, as I can see. How do you check a mathematical algorithm?

Sir, some of the mathematical background we took initially from the research papers we referred to, and we tried to do some simulations and also tried to get some practical confirmation. But the challenge we faced is that if we propose some strategy, we do not know how to get it confirmed.

Exactly, that was my question. That has nothing to do with your software.

Yes, sir. We always tried to run simulations and small-scale trials, but the problem was that any experimental setup we could arrange would cover only around 100 students, while we are talking about MOOC scale. Almost any strategy from the papers will work for 100 students, because at that size you just do not see the particular corner cases; the problem was that no experiment we could run was scalable. We did run experiments on about 100 students, and when we had feedback forms from internal meetings we took the scores and normalized them. That was done, but nothing could be done at the scale of a MOOC, so we tried to stick specifically to the research papers we had and to combine them.

Yes. So again, that was my next question: how do you scale up? How do you use this peer evaluation system? I have got one lakh students, as Dr. Phatak said, in the first course, and he wants ten lakhs in the next course. We are in India; we talk in lakhs. So how are you going to implement peer evaluation for ten lakh students? What is the strategy?

Sir, the strategy is basically that at that scale, only two types of evaluation system can survive when you go for a MOOC.
Either you opt for machine learning or you opt for peer evaluation. And when you opt for peer evaluation, you need to work on and focus on two or three things. The first is how you distribute the papers among evaluators: if I attempt an assignment, the main question is which three or four peers should judge my paper. That is the main component.

Do you know that in edX, if one lakh people start a course, how many people finish?

How many, sir? Very few; a thousand, ten thousand, something like that.

Correct. So, coming back to this question: have you ever thought about how you will monitor peer evaluation? Suppose I have assigned your paper to these three gentlemen over there and they say, "I am no longer interested in the course." Why would they evaluate it?

Sir, suppose we have an end date, and when that date passes some evaluations are still left. First of all, we assign evaluations only to those who have themselves attempted the assignment, so the probability of inactive people being asked to review goes down. But still, some people will drift away from the course, so we go for a second round of evaluation, and in that second round we use only the expert people, that is, the first of the three categories we divided the students into.

Is that algorithm built into the system you have done?

No, sir, that one is not there.

So it is in your head only?

No, sir, it is in the report.

Okay, it is in the report. That is the major part of peer evaluation as I see it. What you have done is all the mathematical machinery, et cetera, which works for a small group. The extension to one million, with this peculiarity of people dropping out at the drop of a hat: we are talking about free courses. I mean, my son enrolled me in one Coursera course, because I am in online education, and I did not attend a single thing; I plan to have that experience sometime when I have got time. But the danger here is only one: if you do not run the course properly...

Sir, even if you run it properly, students will still drop out.

I am not sure. Ideally, like Dr. Phatak mentioned, given the type of student I am, I would like to take a course and attend only the final quiz.

So that you can do anything without doing any peer evaluation, and let the others evaluate.

No, sir, that is the point.

There have been professors in IIT who wanted to not give me an A in the subjects I liked. They did not succeed. So the monitoring of peer evaluation is a major challenge.

Yes, sir.

And the actual algorithms: as you admit, testing is a major thing for any mathematical algorithm. Proof of concept is very difficult.

Sir, consider also that edX itself only started peer evaluation this June, and has implemented it in just two courses.

But wait: edX is much better placed, because they have got a lot of history behind them. An algorithm with no history is very difficult to prove. Okay.