Welcome back to the learning analytics tools course. This week we will talk about descriptive analytics. Before we jump into what descriptive analytics is, let us look at data types. There are different data types, such as categorical, ordinal, and ratio, but let us use only two types here: categorical and numeric.

What is categorical data? Categorical data falls into either this category or that category. For example, if you collect gender information, it is female or male; there is no order as such here. Or if you collect responses to a survey questionnaire and the response is yes or no, then you have categorical data. That is the type of data we mostly collect in qualitative analysis.

Now let us look at the numerical data that we collect from students' interaction with the learning environment. This can again be classified into two: discrete or continuous. Discrete data is an integer value, a simple value at a particular moment, say a student's performance in the maths exam: 75, 76, that kind of value. Continuous data can take any value within a range, for example the time spent on a page.

So the important thing is what to report. These are the data types we can collect from students' learning environments. So what do we report? We have to report statistical values just to make sense of the data. What is the minimum value? What is the maximum? What is the average, the median, or the mode? Or you can report skewness, something like that. You can report the statistical values of the data in a particular form of representation. We can also report the data distribution: how a particular variable changes over a period of time or within a student, how a particular user behaviour is changing, how a student's affective state is distributed across one session, or how the emotional states of all the students in the class are distributed for a particular hour, something like that.

Pick any feature or data, and consider that you would like to show it to one of the stakeholders.
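Before continuing with the stakeholder activity, here is a minimal sketch of the statistical values mentioned above (minimum, maximum, average, median, mode, skewness). The marks are made-up illustrative values, the `describe` helper is a hypothetical name, and skewness is computed as the population moment coefficient, which is one of several common definitions.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    # Population (biased) central moments, used for the moment
    # coefficient of skewness: m3 / m2 ** 1.5.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "skewness": skewness,
    }


# Hypothetical exam marks for ten students (illustrative values only).
marks = [75, 76, 62, 88, 75, 91, 54, 75, 68, 80]
summary = describe(marks)
print(summary)
```

A negative skewness here would suggest a longer tail of low marks, and a positive one a longer tail of high marks.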
You remember the stakeholders. Say you are the researcher here, and you want to report to other stakeholders, such as the one who is funding the students, or the teacher, or someone who is developing educational learning environments. Consider any stakeholder and write down the answers to the questions below. Which stakeholder are you picking? Why did you select this particular data? Why do you want to show this data to that particular stakeholder? The data you want to show will change based on the stakeholder you pick, right? So what data do you want to show, what properties of the data do you want to show, and how do you want to represent it: as a table, in text format, or in plots or graphs? List down your answers to these questions. After listing them down, resume the video to continue.

Descriptive analytics is the first kind of analytics in learning analytics that we discussed. Descriptive analytics deals with answering these main questions. The reference for this part of the course is Learning Analytics Dashboards by Joris Klerkx, Katrien Verbert and Erik Duval, from our textbook; please refer to it.

Let us start with another activity. Consider a TEL (technology-enhanced learning) or online learning environment. What data can we collect in the environment to represent in a dashboard? Imagine there is a dashboard you want to create for the online or TEL environment you are building, and you want to represent students' interaction in the dashboard. What data would you like to collect and show to students or teachers? Think for a moment and list down the data you would like to collect from the students' interaction that should be shown in the dashboard.

In a dashboard for students, we can show this kind of data.
For example, you can take the students' performance in tests: you can show performance in a test, performance in their final semester marks, or performance in subtopics. All this information can be shown. You can also show the time spent on each task, how much time they spent overall in each session, and how their time spent on a particular task compares to that of other students. You can also list whether they are spending most of their time on reading resources or on watching videos.

This data can be shown to students and also to teachers. You can show teachers that their students are mostly spending time watching videos and not reading, so the teacher can motivate the class, saying, "Hey, you should spend more time on reading." Similarly, content developed by the user, that is, how many forum posts they are making or how many assignments they are submitting, can be shown to teachers.

The time spent will help both students and teachers. Students will understand, "I am putting more effort into particular tasks than into other tasks." For teachers, it might indicate a particular task in the course where students are having problems, which is why they are spending more time on that task.

So by creating descriptive analytics we can infer a lot of information, such as which particular task students are having trouble with. Students can evaluate themselves against others in the class, or notice for themselves that they are reading too many resources and not spending any time on watching videos, on creating a complex part of a solution, or on assessing their solutions. So self-evaluation happens, and the teacher can infer which task or concept students are having problems with. All this information can be obtained through descriptive analytics. This dashboard can be for each individual student.
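The comparison of one student's time on each task with the rest of the class can be sketched as follows. This is only an illustration: the log structure, the student and task names, and the helper functions are all hypothetical, and a real learning environment would export its interaction logs in its own format.

```python
from statistics import mean

# Hypothetical log: minutes each student spent on each task (illustrative values).
time_spent = {
    "student_a": {"reading": 30, "video": 50, "quiz": 20},
    "student_b": {"reading": 10, "video": 70, "quiz": 40},
    "student_c": {"reading": 20, "video": 60, "quiz": 60},
}


def class_average(log):
    """Average minutes per task across all students in the log."""
    tasks = sorted({task for per_student in log.values() for task in per_student})
    # A task missing from a student's record counts as 0 minutes.
    return {t: mean(s.get(t, 0) for s in log.values()) for t in tasks}


def compare_to_class(log, student):
    """Per task: the student's minutes minus the class average."""
    avg = class_average(log)
    return {t: log[student].get(t, 0) - avg[t] for t in avg}


averages = class_average(time_spent)
diff_a = compare_to_class(time_spent, "student_a")
print(averages)
print(diff_a)
```

A positive difference means the student spends more time on that task than the class average. The same averages, without the per-student view, are the kind of value a teacher-level dashboard might show.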
The dashboard can also be for teachers: teachers can look at abstracted values for all the students and compare them. If you present this dashboard to stakeholders at a higher level, for example the school principal or the district head, you may not need to show all this information. You might abstract it and show a different set of information. Please keep in mind that when you create a dashboard, or collect data for a dashboard, you have to think about who you are creating the dashboard for.

Let us go to the next activity. Assume you have collected data from a classroom for the last few years. Say you have been teaching the same course for the last 5 years; we started with this same problem in the first week of this course. Consider that you have collected the data for the last few years and now you have the freedom to select any research question you would like to answer from this data. For example: how do students' performance and attendance correlate over the years? Something like that. You can select any research question from the data. And if that is your question, how would you like to represent the data to answer it? Take a moment and think about it. The task is to come up with a research question, list down what data is needed to answer it, and decide how to represent the data in a way that helps you answer the question easily.

So processing data, that is, deciding what data to collect and how to represent it, is very, very important. Once you have decided on the research question, you should know where you can collect the data from. For example, say I want to know how students' attendance and performance are correlated, or I form a hypothesis that higher attendance leads to higher marks in the exam. In that case I need to collect data such as their attendance and performance over the years. So based on your research question, you will know which data to collect and how to collect it.
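As a rough illustration of the attendance and performance question, the sketch below computes the Pearson correlation coefficient between two lists. The attendance percentages and marks are invented for the example, and `pearson_r` is a hypothetical helper written out by hand so that the arithmetic is visible.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: attendance (%) and exam marks for five students.
attendance = [60, 70, 80, 90, 95]
marks = [50, 58, 65, 79, 85]

r = pearson_r(attendance, marks)
print(round(r, 3))
```

A value of r close to 1 would be consistent with the hypothesis that higher attendance goes with higher marks, but a correlation on its own does not show that attendance causes the higher marks.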
If it is pass percentage and performance over 3 years, I can collect the data from the mark sheets or from the performance in the semester or mid-semester exams. After collecting data, as usual, step 2 is processing the data: cleaning up missing values, removing outliers, and making sure there are no errors in the data. Then you have to prepare the data for representation. That is, you have the data, and now you have to compute the average pass percentage per year for each course so that you can show the data overall. For example, you have collected all the data in Excel sheets, and now you have to compute the average pass percentage for 2015, 2016, 2017 so that you can show it in a graph. That is important: not just data collection but also this data conversion is connected to the research question you are asking. Similarly, I hope you have answers for the research question you decided on, and a similar set of steps: how to process the data, what data to prepare, and how to represent it.

So the most important part is why we need data visualization. Why? For example, if I want to show the students' pass percentage in a class over the years, we can show it using a trend graph, such as a simple bar graph or line chart, showing the pass percentage in a class from 2015 to 2020. Or take attendance in the class for the last 30 working days: you can plot the working days on the x-axis and the attendance on the y-axis, and see whether attendance varies, is falling, or is rising. That kind of trend shows up here. Or, more fine-grained, there is the time spent on each resource page.

In the first example we talked at a more abstract level: with pass percentage we are considering a group of students, not a single student. We are considering students over several years, so n, the number of students, will be large.
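The cleaning and preparation steps for the pass-percentage example can be sketched like this. The records, the pass mark of 40, and the 0 to 100 valid range are all assumptions made for illustration; dropping out-of-range marks is used here as a simple stand-in for the error and outlier handling mentioned above.

```python
from collections import defaultdict

PASS_MARK = 40  # assumed pass mark, not taken from the lecture

# Hypothetical (year, mark) records; None marks a missing value.
records = [
    (2015, 35), (2015, 62), (2015, None), (2015, 48),
    (2016, 71), (2016, 39), (2016, 150), (2016, 55),
    (2017, 44), (2017, 80), (2017, 90), (2017, 20),
]


def clean(rows, low=0, high=100):
    """Drop rows with a missing mark or a mark outside the valid range."""
    return [(year, mark) for year, mark in rows
            if mark is not None and low <= mark <= high]


def pass_percentage_by_year(rows, pass_mark=PASS_MARK):
    """Return {year: percentage of students with mark >= pass_mark}."""
    counts = defaultdict(lambda: [0, 0])  # year -> [passed, total]
    for year, mark in rows:
        counts[year][1] += 1
        if mark >= pass_mark:
            counts[year][0] += 1
    return {year: 100 * passed / total
            for year, (passed, total) in sorted(counts.items())}


cleaned = clean(records)
result = pass_percentage_by_year(cleaned)
print(result)
```

The resulting dictionary has one value per year, which is the shape a bar graph or line chart of the yearly trend would need.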
In the attendance example, it boils down to a particular class. Say your class has a strength of 50 students: you have 50 students' data over 30 days. It is one class, so n is smaller compared to the first example. Now we come down to one particular resource page. For example, for a resource page, say the PDF on the introduction, what is the average time all the students spent on this introduction PDF? This data is again collected from the TEL environment, a MOOC, or Moodle. This is the time on one particular page, the introduction, and it can also be collected for each student. So the data can be collected at different levels.

Or, if you want to go for academic analytics, such as the student-faculty ratio, it might help the institutes or the authorities take decisions. Enrollment in a course also helps in academic analytics to decide which course to offer next year and which course not to offer. So the data can be at different levels, based on what the research is, what the research questions are, and whom you are going to share the data with. It also depends on whether you want to collect data on individual students and provide recommendations at the individual student level. This data can identify why students are not able to attend classes, or why the pass percentage increased in one particular year and what teaching strategy was used in that year. These kinds of questions can be answered by collecting this kind of data.

So what is a data dashboard? I used the word dashboard a couple of slides ago. The data dashboard is not new to us: we interact with dashboards and see them everywhere, in newspapers and in TV advertisements. It is a way of representing complex data in user-friendly graphics so that you can understand it easily. Dashboards typically use charts and graphs, which can be static, or interactive if you use an online or computer-based system.
More recently, data visualization has been differentiated from infographics. We are not concerned with infographics, or with what type of infographic to use, in this course. In this course our aim is to understand what the data is and which plot or which chart to use so that we can make inferences from that data. For a researcher, the purpose of a dashboard is to extract inferences from the data. When you have plotted the data in charts and graphs, the researcher's aim is to extract the information. It is also to communicate your insights and findings to other stakeholders or to publish them to the general public.

Examples of dashboards are the product reviews we see in newspapers, on TV, in magazines, or on websites. A lot of advertisements use these dashboards, comparing two different sets of data to promote a product. They appear mainly in the domain of marketing, and also in the media, health care, finance, and education. So the data dashboard is not new to us. What we are trying to see in this course is whether we can create a simple dashboard based on the data we collect in a learning environment such as a TEL environment, a MOOC, or a classroom.

So this particular video is just to motivate you: we have to look at the data in a graphical way, and we have to start telling ourselves that it is not just about collecting data and finding the research question. It is also about representing the data to other stakeholders, or creating a dashboard that will be useful for researchers and other stakeholders to make inferences. Thank you.