Hello, I'm Nathan. I'm an M.Tech first-year student working as an RA for Fortexer. These are my fellow RA students. We'll be mentoring some of you for your summer internship projects, and we, in turn, will be mentored by Nagesh and Firuza. They are at the very back. So we'll just dive into the projects. But before that, you may not be familiar with some of the technologies that we'll be mentioning here. Don't be disheartened by that. You're not expected to know a lot as of now. If you don't understand any of the technologies, that is fine. You will have time to catch up and to learn new technologies.
So the first project is IPMI-based hardware infrastructure management for the cloud, and the mentor will be Minali. Cloud infrastructure management is basically the maintenance and monitoring of the cloud, making sure that it stays up and running for as much time as possible. We don't want any downtime. We don't want any resources to go down and we don't want any of the drives to fail. Basically that is what cloud maintenance or cloud management means. Most of the management tools focus on migration, elasticity, load balancing, and of course system failure detection. If any of the resources fail, we want them up and running again as soon as possible. System administrators already have tools that alert or notify them whenever a particular system resource goes down, for example, a hard drive failure. The motivation behind the project is that any downtime on a cloud is very expensive in the industry. Here also, any downtime on the cloud will hamper the user experience by a lot. So we don't want any downtime. We want to make sure that the cloud is up and running again as soon as possible. Even if there is a failure, we want to reboot the cloud, we want to reboot the drive, as fast as possible. To achieve this, we cannot simply assign system administrators and ask them to go and manually check the drives, because the server rooms can be quite huge.
We cannot manually check each and every single drive to make sure that it is running. So we want a tool that lets us reboot the disks using software, basically. There is already one such tool. It is called Nagios. It is quite old and it is very popular. What Nagios does is keep checking the resources, and whenever there is a failure, whenever there is a change, it notifies the administrators or takes a set of predefined actions. The drawback with Nagios is that it operates at the operating system level, which means that if the entire system goes down, Nagios cannot do anything about it. It can only notify. It cannot reboot a system that is not responding, and it cannot reboot a system that is already shut down. Therein comes the importance of IPMI, which is the Intelligent Platform Management Interface. It is a protocol that provides several management capabilities for the cloud, independent of the CPU, BIOS and OS. It gives a set of interfaces using which the administrators can monitor and manage cloud activities. So the best thing about IPMI is that even if a system is down, you can reboot it or manage it using this protocol. The aim of this particular project, IPMI-based hardware infrastructure management, is to create a tool using the IPMI protocol and provide it with a user interface for use in the large-scale cloud setup that we have here. At the end of this project, we need a fully functioning tool with a functioning user interface. These are the references.
Moving on to the second project. This is to build crowdsourcing platforms. The project mentor will be Himanshu Singh. I think Dr. Mukta has already mentioned crowdsourcing in her talk. Crowdsourcing is basically a technique of obtaining information or input from a large number of users all over the internet.
Typically they are unpaid, like, for example, how Wikipedia works or how Quora is maintained. So there is no single authority. Generally, users all over the world contribute to crowdsourcing. For this project, we'll be using either Python and Django or PHP and Drupal. You may already be familiar with Python; it's a general-purpose scripting language. Django is a web framework that can be used for rapid web development. PHP is of course a server-side web scripting language, and Drupal is something that is widely used here. It is a CMS, which is a content management system. We'll talk about Drupal in the coming slides. You'll also be making use of REST APIs. REST is basically a set of standard conventions for communication between systems on the web. If you create an API that follows these conventions, you can call it a RESTful API. The motivation behind this project is the same as the motivation behind crowdsourcing, which is that a large number of users can take part in the process. A large number of users with diverse experiences can contribute to the crowdsourcing activity. It allows people with different browsers, different machines, different systems and different platforms to test out your product, your website or whatever it is that you're building. That is much better than using only a core team to test out your product. And of course it is free of cost. Some of the likely crowdsourcing applications that we can make include a Clean India application, with which you can take pictures of waste dumps in your surroundings that need to be handled and upload them, and others can comment on them or upvote them to give them a higher priority, so that the authorities can take action or the community itself can take action. Another example is data collection for supervised learning. You may be familiar with supervised learning.
It's a type of machine learning in which you need huge quantities of already labeled data. These labels have to be provided manually, and this manual labeling can be done by crowdsourcing. You may have done a Google CAPTCHA in which you are given a signpost or a road sign and you are asked to type in what the sign reads. This is an example of a similar activity. So you can give a user a text, a video or an image and ask the user to tag it or provide a description of it. Using this description, we can feed the data into a machine learning algorithm and use it for classification tasks. These are the references for the crowdsourcing platform.
The third project is a portal for collaborating communities, and the mentor will be Nivea. This project is already under development and you will have to keep contributing to it. It involves efforts from users in a single community to create, edit and publish OER. OER is Open Educational Resources. There is a presentation coming up on OERs. It can be anything like audio, video, images, text or textbooks. The entire community will create, edit, manage and then publish these OERs. So we have to create a portal for collaborating communities, basically. The portal will be built using Drupal 7, and it will use other technologies like PHP, MySQL and REST APIs. The major components are content publishing, to publish the basic content; then localization support for Indian languages, that is, local languages; and a user reputation component. The user reputation component is there to make sure that there are different levels of users. That is, a user can gain reputation by posting content, by commenting on content or by creating more content. Users can have different levels of reputation and can gain more reputation in the system, and this has to be done automatically. And of course, content versioning and event logging.
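The reputation component described above can be pictured with a small sketch. This is only an illustration in Python, not how the Drupal portal implements it: the action names, point values and level thresholds below are all made-up assumptions, since the talk does not specify any.

```python
from dataclasses import dataclass, field

# Hypothetical point values and level thresholds; the talk does not give any.
POINTS = {"create_content": 10, "edit_content": 5, "comment": 2}
LEVELS = [(0, "member"), (50, "contributor"), (200, "moderator")]


@dataclass
class UserReputation:
    username: str
    points: int = 0
    history: list = field(default_factory=list)

    def record(self, action: str) -> None:
        """Add the points for one action; unknown actions are rejected."""
        if action not in POINTS:
            raise ValueError(f"unknown action: {action}")
        self.points += POINTS[action]
        self.history.append(action)

    @property
    def level(self) -> str:
        """Return the highest level whose threshold the user has reached."""
        current = LEVELS[0][1]
        for threshold, name in LEVELS:
            if self.points >= threshold:
                current = name
        return current


user = UserReputation("example_user")
for _ in range(5):
    user.record("create_content")
print(user.points, user.level)  # 50 contributor
```

Because the level is computed from the points every time it is read, a user moves up automatically as soon as a threshold is crossed, which is the "done automatically" part of the requirement.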
Content versioning and event logging have a separate project of their own that we'll discuss. So this part of the objectives has already been achieved. The following objectives are yet to be achieved. We need full integration with the content publishing workflow. We have to build an effective event logging system to keep track of all user events. Event logging means that whatever a user does has to be logged. If a user clicks on a particular link, if a user watches a video, if a user comments on something, every single thing a user does has to be logged. The reputation system, again, means that when users do something positive in the community, they have to gain some reputation, and whenever they reach a particular level, they have to get some badges and some more authority. This has to be done automatically, so that we have, by default, some administrative levels, some moderator levels and the like. The next thing is integrating the quiz module with the reputation system, so that we don't have to worry about the results of the quiz and how points are earned and lost. These are the links for the portal for collaborating communities.
The next project is performance improvement of Open edX. I will be the mentor for this project. I think you have already seen a presentation on Open edX. You already know what Coursera and Udemy are. These are MOOCs, basically massive open online courses, or websites that provide massive open online courses. edX is one such site, and Open edX is a platform using which you can create a website that hosts these courses. In IIT, we have created such a website, IITBombayX; you have seen a presentation on it. The aim of this project is to improve the performance of Open edX by combining several technologies. RocksDB is an open-source database engine built by Facebook to improve their database storage. It performs exceptionally fast on flash storage.
MyRocks is a MySQL storage engine built on top of RocksDB, so that we get the best of both worlds. The motivation is to improve the performance of IITBombayX so that the users get a better experience. As for the objectives, you'll have to spend a lot of time familiarizing yourselves with the technologies to be used, like the Open edX platform. Open edX is a very complicated platform which makes use of MongoDB, MySQL and several other technologies. You'll have to familiarize yourselves with MyRocks and RocksDB also. Then we'll have to identify the strengths and weaknesses of the existing system, then integrate RocksDB with MySQL and MongoDB. Then we'll have to check for the performance improvement in the new system. We'll have to design several tests for that and document the entire process. These are the references.
The next project is the event logging and content versioning system. I've briefly talked about event logging already. It's a mechanism by which all events associated with a user, all clicks, all video views, all image views, all search queries, everything that a user does in a system, are logged on the server side. The point of logging all these events is that you can run data analytics on the data. You can learn what a user wants and what large numbers of users are searching for. For example, if a large number of users are reporting a particular item, you can take note of that. Or if large numbers of users are tagging something, or have trouble navigating something, you can take note of that. This has already been done on existing systems like Open edX. So we want to build a new event logging system. As for content versioning, a lot of you might have already used it. It basically keeps track of all the changes to the content. All your content and all the changes that are made to the content will be kept track of.
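As a rough illustration of the server-side event logging described above, here is a minimal Python sketch that writes each event as one JSON line and then counts search queries. The field names (`time`, `username`, `event_type`, `detail`) and the event types are assumptions made for this example; they are not the Open edX event format.

```python
import io
import json
from collections import Counter
from datetime import datetime, timezone


def log_event(stream, username, event_type, detail):
    """Append one event to `stream` as a single JSON line."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "username": username,
        "event_type": event_type,
        "detail": detail,
    }
    stream.write(json.dumps(record) + "\n")
    return record


def count_events(stream, event_type):
    """Count how often each detail value occurs for one event type."""
    counts = Counter()
    for line in stream:
        record = json.loads(line)
        if record["event_type"] == event_type:
            counts[record["detail"]] += 1
    return counts


# An in-memory stream stands in for a log file on the server.
log = io.StringIO()
log_event(log, "user1", "search", "drupal modules")
log_event(log, "user2", "search", "drupal modules")
log_event(log, "user1", "play_video", "lecture-01")
log.seek(0)
print(count_events(log, "search").most_common(1))  # [('drupal modules', 2)]
```

Keeping one self-contained JSON object per line means the log can be appended to cheaply and analysed later line by line, which is what makes the "what are many users searching for" kind of question easy to answer.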
And you can see all this history and all the contributions by a particular user. There are various tools available, like GitHub, CVS and SVN. By combining event logging with a content versioning system, we can have very powerful data analytics mechanisms. For example, an event log cannot capture how much a single user has contributed to a project. But by looking at a user's history in the content versioning system, you can see how important that user has been to a particular project. So this can be used to rank a particular user based on their contributions. In the other direction, an event log can tell you what type of contributions a particular user has made, which the content versioning system cannot tell you. This is a screen grab from an Open edX event log. You can see how events are captured as JSON. In the second and third figures, you can see how they are parsed into different formats that are much more readable. This is a paper that discusses event logging.
Next is the OER repository platform. OER, as I've already said, is Open Educational Resources. So any text, any textbook, any image, any video, it could even be entire courses, that are shared under Creative Commons licenses, free art licenses or GNU public licenses. All such courses and all such content that are openly and freely available are called OER. So the point is to make a platform for an OER repository. We will be making use of PHP. It's a server-side scripting language that can be embedded into HTML, of course. And REST APIs: as I've said, REST is a standard for communication between different systems on the web, and you'll have to create APIs that follow the REST standard. DSpace is open-source software that is widely used to create open digital repositories. You can add content and provide access to this content for any number of users. The content can be anything from text to images, any type of content. And Drupal is content management software.
When you think of Drupal, think about something like WordPress. Using Drupal, you can create web pages, and you can add functionalities to your web pages using modules. There are several modules already available because it's open source and people have already contributed a lot to the Drupal community. You can add functionalities using these modules, and you can create new modules, modify modules and contribute back to the community. The motivation behind the OER repository platform is the motivation behind open education, which is to say that by improving open access to information and education, you can make high-quality education available to everyone. You can bridge the gap between universities and the general public, and you can remove formalities like admission criteria. These are not required in an open education platform. It's freely available for everyone. The current objectives are these: we have to explore the existing metadata formats for open educational resources, and we have to create a platform for uploading and searching these resources. Then you'll have to add metadata standards to the DSpace repository that we'll be using. Then you'll have to design REST API communication between the DSpace digital repository and the Drupal front end. I think these are the references for the OER platform.
And finally, a word on what we expect from you. Obviously we'll need the finished project, which follows all the proper coding standards and has proper documentation. If possible, you'll have to contribute back to the open-source community if you'll be using open-source software. And if possible, you'll have to write a research paper describing in detail the work that you have done. And of course, you have to learn as much as possible about the topic that you are working on. These are the six projects that we are working on.
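To give a feel for the REST communication mentioned for the OER repository project, here is a small Python sketch of REST-style request handling without any network layer. It is not the DSpace or Drupal API: the `/resources` path, the metadata fields and the in-memory `RESOURCES` store are all hypothetical stand-ins chosen for illustration.

```python
import json
from urllib.parse import urlparse, parse_qs

# In-memory stand-in for the repository; a real setup would talk to DSpace.
RESOURCES = {
    1: {"title": "Intro to Algebra", "type": "textbook", "license": "CC BY"},
    2: {"title": "Cell Biology Lecture", "type": "video", "license": "CC BY-SA"},
}


def handle(method, url, body=None):
    """Map an HTTP-style request to a (status_code, JSON string) pair."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if not parts or parts[0] != "resources":
        return 404, json.dumps({"error": "not found"})

    if method == "GET" and len(parts) == 1:
        # Collection: an optional ?q= filters on the title, ignoring case.
        query = parse_qs(parsed.query).get("q", [""])[0].lower()
        hits = [dict(id=i, **r) for i, r in RESOURCES.items()
                if query in r["title"].lower()]
        return 200, json.dumps(hits)

    if method == "GET" and len(parts) == 2 and parts[1].isdigit():
        # Single item addressed by its id in the URL.
        item = RESOURCES.get(int(parts[1]))
        if item is None:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps(dict(id=int(parts[1]), **item))

    if method == "POST" and len(parts) == 1:
        # Upload: store the submitted metadata under a new id.
        new_id = max(RESOURCES, default=0) + 1
        RESOURCES[new_id] = json.loads(body or "{}")
        return 201, json.dumps({"id": new_id})

    return 405, json.dumps({"error": "method not allowed"})


status, payload = handle("GET", "/resources?q=algebra")
print(status, payload)
```

The idea is that each resource has its own URL, the HTTP method says what to do with it (GET to search or fetch, POST to upload), and both sides exchange plain JSON, so a front end only needs to know these conventions and not the internals of the repository.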