We, the team members of the IIT BombayX platform upgrade and test automation project, i.e. Hershoni, Prachi Pandey and myself, Shreya, are here to present the work we did during the course of this internship. Let me give a brief introduction of our project. IIT BombayX is an online platform that offers massive open online courses to individuals, and Open edX is an open-source learning software platform on which IIT BombayX is built. Right now IIT BombayX uses the Ficus version, and our task was to migrate the test data from Ginkgo to Hawthorn to Ironwood. The motivation behind this migration is to utilize the new functionalities provided by Open edX. Now Hersh will explain what data migration is.

For moving from one platform to another, we need to migrate the data from the old release to the new release. The data migration process is done in four steps. As mentioned in the slide, the first step is selecting: the selecting process selects the two releases between which we are migrating, from the old release to the new release. Suppose we are migrating from Ginkgo to Hawthorn; then the old release will be Ginkgo and the new release will be Hawthorn. Once the selection is done, we need to take a dump of the old release so that we can restore it into the new release. In the preparing phase we take the dump of the old release and keep it in a compressed archive so that we can copy that file to the new release and extract it there. In the extracting phase, those dumps are extracted and restored into the new release, giving us an exact copy of the old release in the new release. Then we can apply the transformation steps so that the data conforms to the new release. We do this first from Ginkgo to Hawthorn and then from Hawthorn to Ironwood. These versions have several new features, which Prachi will explain. The motivation of our project is to utilize the new features provided by the new versions of Open edX.
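The prepare/copy/extract phases just described can be sketched roughly as follows; the paths and file names here are illustrative assumptions, not the team's actual scripts:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the prepare/copy/extract phases between two releases.
set -euo pipefail

OLD_DUMP_DIR=/tmp/old_release_dump      # assumed staging directory on the old release
ARCHIVE=/tmp/old_release_dump.tar.gz

# Preparing: collect the database dumps from the old release into one archive
mkdir -p "$OLD_DUMP_DIR"
echo "-- placeholder for mysqldump/mongodump output" > "$OLD_DUMP_DIR/all_databases.sql"
tar -czf "$ARCHIVE" -C /tmp "$(basename "$OLD_DUMP_DIR")"

# Extracting: on the new release, unpack the archive before restoring the dumps
NEW_RELEASE_DIR=/tmp/new_release
mkdir -p "$NEW_RELEASE_DIR"
tar -xzf "$ARCHIVE" -C "$NEW_RELEASE_DIR"
```

Restoring the extracted dumps and applying the release-specific transformations would then follow as separate steps.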
Starting with the features of Ginkgo: course navigation, i.e. you can view the sections and subsections within a particular course; HLS video playback, i.e. whenever a video other than a YouTube video is uploaded to the cloud, it is played using HTTP Live Streaming; bulk email, so if you want to send an email not to a particular individual but to a group of people, you can use the bulk email feature; and drag-and-drop questions, a new variety of question that has been introduced. Coming to the features of the Hawthorn release: whenever a learner has completed a particular section of a course, it is marked with a green completion check mark. To insert a new file, drag-and-drop options are available. Suppose you are a learner enrolled in a course; every week the main highlights of the course will be sent to you via email. In your profile on the IIT BombayX platform you can also add links to other social media accounts like LinkedIn, GitHub, etc., and whenever you post a question on the discussion forum and someone replies to it, you will get a mail which redirects you back to that reply. The new features introduced in the Ironwood version are: Studio login via the LMS. Earlier, in the previous versions, the LMS and CMS used to have separate logins, but with the advent of Ironwood there is now a common login for both; in order to log in to the CMS you need to log in to the LMS first. Checklists: checklists are made from the point of view of the staff managing a particular course; suppose you have completed a section of a course and don't want to add any other content to it, you can mark it with a checklist. Public courses: this is a very fascinating feature; suppose you are not logged in with your ID, or you don't have any account on IIT BombayX. If a course is public, you can still visit its sections, subsections and all its contents.
Now the next phase, the starting phase of our project, the analysis of the different databases present in the various versions, will be explained by Shreya. The analysis was done to compare the databases of various versions of Open edX. We used the information_schema database of MySQL, which is a virtual database that provides information about all the other databases: the names of all the tables present in them, their schemas and their rows. We prepared an Excel sheet for each of the versions, one for Ginkgo and one for Hawthorn, and using advanced Excel features we directly compared the two versions, so that you know which tables are present in Ginkgo, which tables are present in Hawthorn, what is missing in Ginkgo and what is present in Hawthorn. Now let us begin with the actual migration part. The migration is divided into four sections. In section one, we prepared a schema report of the old release. The second section comprises the Django migrations, which we'll discuss in the further slides. The third section is the same as section one, but it contains the schema report of the new release. In section four, that is, after migration, to confirm whether the migration was successful or not, we compared the reports generated in section one with those from section three. Now I hand over the mic to Harsh.

In the report generation phase, we get the data entries from the version we are working on, the old release, and store them in a 'before' folder. We use shell scripting to get the rows in the tables; this way you can get the entries present in each table. We also need to compare the schema of each table, so for that we use the column definitions and the constraints of the table and include them in the report. After the report has been made, we perform the actual migration steps mentioned here.
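The table-level comparison described above can also be done directly in the shell. A minimal sketch, where stand-in table lists take the place of what a query against MySQL's information_schema would return (the file names and table names are illustrative, not the project's real data):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare the table lists of two Open edX releases.
# In practice the lists would come from information_schema, e.g.:
#   mysql -N -e "SELECT table_name FROM information_schema.tables
#                WHERE table_schema = 'edxapp'" > ginkgo_tables.txt
set -euo pipefail

# Illustrative stand-in data for the two releases (comm needs sorted input)
printf '%s\n' auth_user student_course course_wiki | sort > /tmp/ginkgo_tables.txt
printf '%s\n' auth_user student_course completion_block | sort > /tmp/hawthorn_tables.txt

# Tables present only in Ginkgo (dropped in Hawthorn)
comm -23 /tmp/ginkgo_tables.txt /tmp/hawthorn_tables.txt > /tmp/only_in_ginkgo.txt
# Tables present only in Hawthorn (newly added)
comm -13 /tmp/ginkgo_tables.txt /tmp/hawthorn_tables.txt > /tmp/only_in_hawthorn.txt
```

The two output files directly answer "what is missing in Ginkgo and what is present in Hawthorn", which is the same comparison the team did in Excel.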
We take the dump of the old release using the mysqldump and mongodump commands, after stopping all the services of the edX platform. We use a shell script to take the dump of all the databases present in MySQL and Mongo. This whole dump is stored in a tar file so that we can copy that tar file to the new release. After the dump has been taken, we restore all the MySQL and Mongo databases using the commands mentioned above. After this restoration phase comes the transformation phase. We go to the edx-platform and drop the djcelery tables. Why do we drop these djcelery tables? Because we need to stop all the asynchronous processes that are happening in the platform. Since we have already stopped the old release, it is safe to drop them, as these tables don't contain any data, and dropping them stops the asynchronous processes. We drop those tables using these commands. After the tables have been dropped, we need to perform the actual migration on the new release. First we make sure that our VM is connected to the internet, which we check using the links command. After that, there is a native.sh file in the platform. Earlier, in the Ficus version, we used to have sandbox.sh, which has been renamed to native.sh from Ginkgo onward. So since Ginkgo we only have native.sh, and we use that file to migrate all the changes according to the new release. native.sh runs several playbooks which download the needed packages and run the tasks so that the migration process can be completed. So we run it. This is the end of the core part of the migration. If any error comes up during this process, we need to perform all the steps once again to make sure that our migration is successful. Now the migration has been done, and we need to analyze whether it was successful or not.
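The dump, restore and transformation commands described above might look roughly like this. The djcelery table names, file paths and database names are assumptions, and a DRY_RUN switch is added so the sketch can run without live MySQL/Mongo services:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the migration commands; not the team's actual script.
set -euo pipefail
DRY_RUN=${DRY_RUN:-1}
LOG=/tmp/migration_commands.log
: > "$LOG"

run() {  # log the command instead of executing it when DRY_RUN is set
  if [ "$DRY_RUN" = 1 ]; then echo "$*" >> "$LOG"; else "$@"; fi
}

# 1. Dump everything on the old release (after stopping the edX services)
run mysqldump --all-databases --single-transaction --result-file=/tmp/mysql_all.sql
run mongodump --out /tmp/mongo_dump
run tar -czf /tmp/old_release_dump.tar.gz /tmp/mysql_all.sql /tmp/mongo_dump

# 2. Restore on the new release after copying the archive over
run sh -c "mysql < /tmp/mysql_all.sql"
run mongorestore /tmp/mongo_dump

# 3. Transform: drop the (empty) djcelery tables so no stale async state remains
for t in djcelery_periodictask djcelery_taskstate; do
  run mysql edxapp -e "DROP TABLE IF EXISTS $t;"
done
```

Running native.sh to apply the new release's playbooks would follow these steps.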
We need to compare against the expected result, which comes from the analysis. Section three was to run the report script for the 'after' folder: we do the same as in section one, but for the new release, changing the parameters accordingly. Now we need to check whether our migration was successful or not, and this will be explained by Prachi. After running our scripts on the old release and the new release, we compare the differences between them. If the differences are as per our expectations, the migration is successful. But if the differences are not as per our expectations, then we have to trace back to the problems we faced during the migration that caused it to fail, and repeat the entire process again.

Now the next part, automated testing using Selenium, will be explained by Shreya. This part does the front-end testing. We use Selenium, which is a portable framework for web application testing: an open-source, web-based automation tool for testing. We wrote Python scripts; Selenium accepts the commands and passes those commands to the browser. We did this automated testing on the Ironwood version. Here are a few examples. The checklists, as they are only available to the admin and the staff managing a course, were available only in the CMS part and not in the LMS. For public courses, we tried to access them without being logged in and were successful in doing so. We tried to log in directly to the CMS, but the attempt failed; then we first logged in to the LMS and then to the CMS, and we were successful. The technologies that we learned during the course of our project are working with virtual machines and Ansible. Since Open edX is a Django-based platform, we learned Django migrations, and shell scripting. Since the LMS databases are in MySQL, we worked with MySQL, and the CMS databases are in MongoDB.
So we worked with MongoDB as well, and did automated testing using Selenium. These were the non-technical learnings: collaboration, communication, information gathering, resilience, execution and adaptability. The future scope will be explained by Shreya. We did the migration on test data; now the same steps can be applied to do the migration on the production data, and as new versions of Open edX keep on releasing, there's always scope for future enhancements. The shell scripts that we use for migration remain almost the same, with slight changes in the parameters, and the feature testing can also be fully automated. Thank you.

What scripts have you written? Sir, we have written the Selenium scripts; the shell scripts were already provided to us. So the shell scripts, who had written them? They were in the Open edX documentation, and some were provided to us by our manager. So you have done Ginkgo to Hawthorn and Hawthorn to Ironwood, not from Ficus? No, no, we have tested only from Ginkgo. These migration scripts are given on Open edX? Yes, sir. Right? Yes, sir. Given by Open edX. Okay. And what problems did you face while doing this? Sir, when we were deleting those djcelery tables, there was a foreign key constraint that was failing every time. Okay. So we couldn't trace back that error. So did they mention it in their documentation, in the migration? Yes, they mentioned to drop those tables, but we were not able to trace back that error. We also searched a few Google Groups, which we joined, but we didn't find that error. And what is the reason for dropping those tables? Sir, because of the asynchronous processes: we want to stop all the asynchronous processes during the migration steps so that they cannot hinder, should not interrupt, our migration. Okay. And then it will be created? It will be automatically created after running the native.sh file. native.sh. Okay. And then you did something, a fine diff? Yes, sir. Okay.
So what was that? Sir, before migration... So that is also provided by Open edX, or have you written it? No, no, that was provided to us by our mentor, Sarita. Mentor. Yes. So you have written that. Okay. So what does it do? It compares all the databases in MySQL and creates a report about them: how many entries are in each table, what constraints are used on those tables, and a list of all the tables present in a database. The same is done after the migration, and we compare those two results, so we get to know how many... How do you compare? Using the diff command, or we can also compare using tools like Meld. Okay. So we get how many tables were upgraded, how many tables were deleted. No, I think I'd be interested in seeing the output, not how the script is written. Okay. So, okay. Is there an output? So when you did that diff, what did it give you? This was the output of the fine diff. What does it tell you? It tells about the changes in the database that have occurred: how many entries have been added, how many entries have been deleted. Okay. Thanks. Thank you.
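The before/after comparison discussed in the questions above can be sketched as a simple diff of the two generated reports; the report contents here are made-up stand-ins for the real schema reports:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: diff the 'before' and 'after' schema reports.
set -euo pipefail

mkdir -p /tmp/before /tmp/after
printf 'auth_user: 12 rows\ncourse_wiki: 3 rows\n' > /tmp/before/report.txt
printf 'auth_user: 12 rows\ncompletion_block: 0 rows\n' > /tmp/after/report.txt

# diff exits non-zero when the files differ, so allow that under `set -e`.
# A successful migration shows only the expected table additions/removals.
diff -u /tmp/before/report.txt /tmp/after/report.txt > /tmp/migration_diff.txt || true
```

The unified diff marks dropped tables with a leading `-` and added tables with a leading `+`, which is the kind of output the examiner asked about; a graphical tool like Meld shows the same differences side by side.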