[inaudible] Now, for those who don't know me, I'm not just the AV guy. I do lots of different things, and one of them is dabbling with data and analytics, in particular learning analytics. I hope all of you have that phrase buzzing in your ears, with PVCs asking you questions like: what should we be doing about learning analytics?
In the context of MOOCs, this is a slide from Simon Buckingham Shum. [inaudible]
[inaudible] Adam Cooper [inaudible] actionable insights: taking something and producing something that we can act on as a result of it. Breaking this down into a very simplistic model: you've got some data, you do some analytics, you get some insight. But when you start looking at this in more detail, who is this insight for? There are various actors within this, and it was quite nice to see Simon Nelson this morning talking about the self, the learner: the learner getting some insight about where they should take their own direction of study.
Also you have the tutor, and at the tutor level the insight is quite often: how do I make this better, how do I make this design better? Where are the points at which the students are struggling? For the institution, the insight might be: who are the students that we want to convert onto our courses? [inaudible]
[inaudible] the five categories of social learning analytics: network analysis, discourse, content, disposition, context. One of the reasons I'm interested in learning analytics is that it draws in so many different disciplines, so many different ways that you might want to analyse data. And I think it's worth reminding ourselves that it's not just about learning analytics in this context: administration is another aspect of analysis that's quite important.
And then the data. I was thinking about what data there is, and in the current climate of FutureLearn, Coursera and edX it's very much a platform-as-a-service model, so there's data being collected by the platforms. [inaudible]
[inaudible]
[Video clip] [inaudible] one-off answers where students just had their own particular misunderstanding. But, for example, that big X at the top left is where 2,000 students had the exact same wrong answer. Now, if you have two students in a class of a hundred that have the exact same wrong answer, you would never notice. But when you have 2,000 students, it kind of jumps out at you. And so Andrew and his TAs went in and looked at that assignment and understood the misconception that lay at the root of it. Then they constructed a targeted error message, so that other students whose answer fell into that particular bucket would get that targeted error message as opposed to just "you're wrong". That gave a much more personalized and useful experience to the students, because it put them on the right track in terms of what they needed to fix in order to get the right answer. And so this effectively is a much more personalized experience that you can provide by utilizing this large amount of data that we have.
[inaudible] The questions I'd like you to consider are: where did that data come from? Was it in a log? Obviously Coursera is collecting it, so that's the other thing: they have to be collecting this data so that they can query it. Then someone's got to query it and come up with a graph. Do the platforms that you're using for open courses provide that for you?
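As a rough illustration of the "bucket" idea in the clip above, here is a minimal Python sketch that counts identical wrong answers and attaches a targeted message to the common ones. The submission format, the function names and the sample data are all assumptions made for illustration; nothing here reflects how Coursera actually stores or processes answers.

```python
from collections import Counter


def common_wrong_answers(submissions, correct_answer, min_count=2):
    """Return (answer, count) pairs for wrong answers shared by at least
    min_count students, most frequent first.

    submissions is a list of (student_id, answer) pairs (hypothetical format).
    """
    wrong = Counter(
        answer.strip()
        for _, answer in submissions
        if answer.strip() != correct_answer
    )
    return [(answer, n) for answer, n in wrong.most_common() if n >= min_count]


def feedback_for(answer, correct_answer, targeted_messages):
    """Pick a targeted message if the answer falls into a known bucket."""
    answer = answer.strip()
    if answer == correct_answer:
        return "Correct"
    return targeted_messages.get(answer, "Incorrect, please try again")


# Made-up sample data: three students share the same wrong answer "24".
submissions = [("s1", "42"), ("s2", "24"), ("s3", "24"), ("s4", "7"), ("s5", "24 ")]
buckets = common_wrong_answers(submissions, correct_answer="42")
messages = {"24": "It looks like the digits are reversed; check the order."}
```

With real data the interesting part is the step the code cannot do: a person still has to look at each large bucket and work out which misconception produced it before writing the message.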
So data access and data shape are both quite important.
[Audience question, partly inaudible] ... how that's a multiple choice question? [inaudible] How did you come up with that multiple choice question?
One of the other examples of this was that they were actually using it for error detection, to find the questions that they had posed incorrectly. So a tutor had created a question and he himself had got the wrong answer. So they're using these techniques to identify bad tutors, bad questions, which is another way to look at it. It shows that there's a disconnect there, and then you look into the reasons. [inaudible] And also, thinking about this: it is a very simple x-y plot, but if you wanted to query that data on demographics (was it gender biased, location, age), how would you do that with your data?
This is quite a recent one. I should probably have paid more attention to this before I did the video with Jonathan. It's looking at the length of videos in open courses that are most effective, most effective being that people got to the end and didn't drop out. These are groupings of video length. I should say this is from edX, this is edX data. And this is the median time spent watching the video. So in the 6 to 9 minute range you've got a peak: they watched six minutes of it. So the conclusion is six minutes. Make your videos six minutes, not a minute longer, not a minute less, six minutes. Think about the data that's behind this graph. They've recorded student activity in terms of when students clicked the play button and when they stopped. And this is a summary of, I think, about 20,000 videos or 20,000 students. So how would you get to this graph? I have answers for these later if you'd like to hear them. This one's quite interesting.
It's getting more advanced. This is Coursera data. This analysis is looking at engagement and disengagement patterns within Coursera. They're using the assignments as an indicator of whether or not a student is still active within the course, and they came up with a categorisation. A is auditing, which I think we would describe as something like lurking: legitimate peripheral participation. B is behind, so they're still doing the assignments, but out of sequence. T is on track and O is out. And this was a summary of, I think, about three or six classes. What's also quite interesting about this analysis is that it was produced by students. Stanford has the Lytics Lab, where some of their students have been given access to Coursera data, and this is what they're coming up with. Interesting analysis, I think. To come up with these clusterings they're using K-means, which is a fairly standard analytical tool. One of the reasons they were interested in this was that they wanted to make comparisons between courses: were there characteristics of one particular course that suited another? So having a way to pull out the data and analyse it allowed them to start identifying differences. But again, they've had to pull out the data, and (you can find the paper online) they've had to do several iterations to get to this point, and they say there's still work to be done to refine this further.
So I've kind of been prodding you about this whole data thing: data, data, data. Hopefully that is the question that you keep asking FutureLearn and Coursera: data, data, data. But it has to be the right data. If you want to do analysis of videos, you need that data, so that data needs to be recorded in your system. And it's not just about the data, it's getting access to it.
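As a rough illustration of the K-means step mentioned above, here is a small pure-Python sketch that clusters weekly engagement trajectories built from the A/B/T/O labels. The numeric encoding, the squared Euclidean distance, the hand-picked starting centroids and the sample trajectories are all assumptions for illustration; the Stanford paper should be consulted for the method actually used.

```python
# Hypothetical numeric encoding of the weekly labels described in the talk:
# T = on track, B = behind, A = auditing, O = out.
ENCODING = {"T": 3, "B": 2, "A": 1, "O": 0}


def encode(trajectory):
    """Turn a string such as 'TTBO' into a list of numbers, one per week."""
    return [ENCODING[label] for label in trajectory]


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def kmeans(points, initial_centroids, max_iterations=50):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids, and stop when the assignment no longer changes."""
    centroids = [list(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # leave an empty cluster's centroid where it was
                centroids[c] = mean(members)
    return assignment, centroids


# Made-up trajectories: one label per week for six students.
trajectories = ["TTTT", "TTBT", "TBTT", "AOOO", "OOOO", "AAOO"]
points = [encode(t) for t in trajectories]
labels, centroids = kmeans(points, initial_centroids=[encode("TTTT"), encode("OOOO")])
```

Here `labels` says which cluster each student landed in. On real data the number of clusters and the starting centroids both need experimenting with, which fits the speaker's remark that it took several iterations.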
Even within an institution, there can be terrible issues with getting access to the data that you need. Even if you're using Blackboard, getting the data out of there is a nightmare. Who has permission to give out that data? Google Analytics, which I'll come to in a second, is a fantastic tool for getting, summarising and analysing data, but usually it's just the web admin that has access to it. How is that being distributed through your institution? So you have to get the data, and you have to have the data accessible.
The other thing is the shape. I was fortunate to be given access to data from one of the new platforms. I can't say which; I'm under a non-disclosure agreement. It was just a MySQL table dump, a database. So I'm talking to people from... is Amy still here? Is that what you get from Coursera? Just a database dump, and you're kind of left to your own devices to work on it. I think you can provide data by more useful means. I quite like CSV. FutureLearn sent me my data as a big, scary spreadsheet, but also in a nice [inaudible] platform. For the non-data-literate, I find that very nice. But I guess as I get better at this, it won't answer the questions that I want to ask in that form. Still, I've got the data to interrogate, so that's quite helpful.
Which hopefully, if I remember my slides, takes us to flows: data flows. It's about creating recipes, so that you can get a dump of data out of your provider and quickly turn it into something that's useful to you, that provides the particular insight that you need. And there's a whole list of tools that you can use to do that. I think within open courses there's a great opportunity to actually share some of the expertise around this. Whether or not the people that have the dollar signs in their eyes will permit that, I hope they do.
So I just wanted to highlight some quick wins, easy alternatives that you might not be aware of, but that might stimulate some of your thoughts in this area. So here again we have the video length graph [inaudible]: 25,000 students, several weeks' work.
This is YouTube. I don't know if you can see this: this is the audience retention graph on a YouTube clip. This is one of my videos. This is about who we are. [inaudible] Well, no one liked it: just about 20% of people got to the end. What was that about? So this is there on YouTube, and YouTube do provide a data export, to a degree. We can see it go down nicely and gradually. That video is a nice, fluffy visualisation video. This is a different video, an instructional video. Look at these things. And this is the nice thing about this: when I play this back, I can see the video, so I can see the point where people are hanging on. So this is an interesting point in terms of cognition, misconception maybe. This video has, I think, 2,000 views, so you've got quite a decent data set there of what's going on. And there are other things within YouTube analytics about demographics, location and gender, which might help you. You can specify the date range.
[Responding to an audience question] Oh, so, yeah. It's not one I've dug deeply into. Don't know, would be the honest answer. And whether or not that information is public is another question, which is another consideration if you were to go down this route.
So, the MCQ test. What you can do in Google Analytics is event tracking. If you've got an MCQ and you've got Google Analytics, you can track which response people made to that MCQ, and you can tag it as correct or wrong. Within Google Analytics itself, you can get a summary chart. Google Analytics also allows you to break this down [inaudible]: I could do age, gender.
I can run experiments with Google Analytics: A/B testing. So I could be tweaking the platform design, running tests and getting data out of Google Analytics. Earlier I mentioned the issue that quite often Google Analytics is held by your web team and they don't hand out admin rights or admin access. What you can get out is CSV. That's a manual process, but there are automated ways of doing it as well. Google now has a bit of code that lets you proxy the data. So you can create a segment, an analysis or a slice of data and then just make that available on a server behind authentication. So you can give people access to the raw data in a controlled way.
So the next one. How am I doing on time? This is Canvas. Have you come across Canvas, a VLE made in the States? Canvas have Canvas Network, which is basically a platform to provide open courses. You can see you have the standard kind of VLE tools: discussion forums, announcements, assignments. The really nice thing about Canvas is that it has an API. APIs? Basically, your platform will have lots of data locked away in it, or processes or functions, and you have something you want to do with that data or those processes or functions. So you write a bit of code that talks to their server, and that server gives the data back to you. The Canvas one is very well documented. So we can take a discussion forum within Canvas and, with a bit of API magic, get it into a spreadsheet. This was one I did for the learning analytics open course. Once it's in the spreadsheet you can just play around with the visualisation tools in there. This is a Google spreadsheet, so I can share it with anyone. I shared it with the course so that we could compare how we were doing performance-wise, and tutors could see how we were doing performance-wise. Having access to an API really helps get the data out, so that anyone who has access to it can slice it up, dice it, and try to find something interesting or useful.
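As a rough sketch of the "discussion forum into a spreadsheet" step described above, the Python below flattens a nested discussion thread into flat rows and writes them out as CSV, which any spreadsheet can open. The JSON shape (entries with `id`, `user_id`, `created_at` and nested `replies`) is an assumption for illustration and should be checked against the Canvas API documentation; fetching the data, authentication and paging are deliberately left out.

```python
import csv
import io
from collections import Counter

FIELDS = ["id", "parent_id", "user_id", "created_at", "depth"]


def flatten_entries(entries, parent_id=None, depth=0):
    """Walk a nested list of discussion entries and return one flat row per post.

    Each entry is assumed (hypothetically) to be a dict that may contain a
    'replies' list of entries with the same shape.
    """
    rows = []
    for entry in entries:
        rows.append({
            "id": entry.get("id"),
            "parent_id": parent_id,
            "user_id": entry.get("user_id"),
            "created_at": entry.get("created_at"),
            "depth": depth,
        })
        rows.extend(flatten_entries(entry.get("replies", []), entry.get("id"), depth + 1))
    return rows


def to_csv(rows):
    """Serialise the flat rows as CSV text, header first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample thread: one top-level post with a reply and a reply to that reply.
sample_thread = [
    {"id": 1, "user_id": 10, "created_at": "2013-10-01T09:00:00Z", "replies": [
        {"id": 2, "user_id": 11, "created_at": "2013-10-01T10:00:00Z", "replies": [
            {"id": 3, "user_id": 10, "created_at": "2013-10-01T11:00:00Z"},
        ]},
    ]},
]
rows = flatten_entries(sample_thread)
posts_per_user = Counter(row["user_id"] for row in rows)
csv_text = to_csv(rows)
```

Keeping `parent_id` in each row means the same flat table can feed both a simple posts-per-person chart and a who-replied-to-whom network.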
There's a link up there for more information about that Canvas example. You can go further. This is taking it into social network analysis, so we're looking at the individual discussions. One of the reasons to go into social network analysis is that it provides insight into how the group are doing, and into particular characters within the group.
Those were all dealing with other people's platforms. At ALT we've actually experimented with our own platform. We ran ocTEL, an open course that ran early in the year. Not huge numbers: 1,400 students registered. It was in WordPress, so we had full control over the platform, which was a godsend because we could create our own data APIs and get out the data that we wanted, which we found useful. It was a connectivist course, so with this one we wanted to identify all the students who had just created a blog, made their first post and hadn't had any comments. That was a way to give some insight to the tutor team, to target resource and say: hey, go and give this guy some love and affection. Go and tell him whether what he's doing is good, wrong, right. So it's an example of taking the data out and doing something useful.
One of the issues within the open context, a truly open context where you're not platform specific, is that people have different profiles online, and there's a danger that they become analytically closed. But it was quite an eye-opener for me when I came across a site called Full Contact. Basically, you provide a list of email addresses and it goes off and searches those email addresses against various databases, and it comes back with a hit rate of, I think, around... So I put in 250 email addresses and it came back with 178 people. [inaudible] It told you who they were, and it comes back with all this information: which Twitter profiles they have, which blog, whether they're on Klout, their gender, which is quite scary.
So even within an open context, you can actually find out quite a lot about someone and then use that data. For good! For good, people! And we're going on, at ALT, to do something with the MOOC Research Initiative to look at some of this open data and the analytically closed. And this was the last slide. This was, from Simon again, the idea of sharing some of these recipes, sharing some of these tools, because the field is very broad and I think we could learn a lot together. Okay, thank you very much.