Lucinda Bromfield, who has been working with BPP and myself at Portsmouth. From time to time we run these Tech Thursday sessions, to be precise every last Thursday of the month. There are times when we skip one because of holidays, or because people are busy, things like that. But we are a bunch of friendly people looking for more friends, so that we can do things in the HE and FE sectors and bring about change that we are all passionate about. Like-minded people. So if you would like to be part of this group you can of course go to our webpage, which is over here; if you just search "All South" it should come up. If not, if you go to the old website, you will find us through the groups section. My colleagues can put the links in the chat, and you can contact Lucinda via her email if you want to join; I'm sure she'll put a note about that in the chat. So what do we do? As I said, we meet regularly and talk about good practice and the challenges in our institutions related to learning technologies, usually on the last Thursday of the month from 12 to 1. But this time we have three sessions coming up, because we're launching a series on assessment, in particular inclusive assessment. Today, though, we're going to talk about innovative techniques and technologies that are bringing about some turbulence, some change or disruption, as technology often does. About the coming two events: as I said, we're launching a working party around effecting change in assessment practices within HE and FE. This is our scope and objectives, and if you'd like to be part of it I'll give you a link, a call to action, so you can join. This is the list of events we have planned. So next week, on Thursday at the same time, we have Dr.
Sarah Broadbury from NTU, who is going to be talking about open and innovative assessment techniques that were used during the pandemic. There are some ideas we thought were worth sharing with the rest of the community, so we're bringing in Sarah, who will also be helping us with the working party alongside other members of the All South group. The week following that, it will be myself and a colleague from my department, Jeb, talking about techniques to engage students during exams and exam revision, that kind of thing. So assessment, and preparation for assessment, will be the focus there, and how to make that more open and inviting. Then we have other speakers, listed here, all on assessment. We'd like to have more people involved, so if you have something to share around assessment, please do contact us separately to be part of this. Now, that is the link I was talking about: if you want to join the working party we are launching from this week onwards, here is the link to complete and join us, and we'll keep you updated on our activities and obviously involve you in that work. Now, what is today about? We are going to talk about assessment, and in the regular style of these sessions I asked ChatGPT "what is assessment?", and this is what it came up with: it's about evidencing knowledge, skills, abilities and so on. You all know this anyway, but I was just checking whether ChatGPT understands it, and yes, it does. Then I asked it what the future of assessment is going to be like with ChatGPT around, because obviously we're all worried about assessment offences and academic integrity.
It came up with these suggestions: use open-ended questions in exams, use scenarios, project-based learning, critical thinking. These, it says, are obviously not as badly affected as some other assessment formats, but again, take this with a pinch of salt, because it's always developing. Then I asked it what it thinks about the knowledge it holds about the world, science and technology. It says: I'm highly knowledgeable but I have limited scope, limited contextual knowledge, limited multimodal understanding and so on, and I can be inaccurate as well, but I'm being regularly updated to include things like empathy, cultural awareness, security, intent, law, science, etc. So it's constantly upping its game. On its own skills, it gave this list of things, and the interesting part at the end was that it can mentor, coach, exemplify concepts, clarify, and signpost resources. These are the kinds of things you do with your students, so it's claiming to be able to do those things. Then I asked it about its own emotions: what do you feel right now? It says: I'm unable to feel; lack of empathy, impersonal tone, lack of emotional intelligence. These are the things it came up with: I can't be surprised, shocked and so on. Then I asked how experienced it is at doing the things it claims to be doing. It says: I'm not perfect, I'm trained, not experienced, tested and evaluated against benchmarks and datasets widely used around the world. So instead of saying it's experienced, it says it's tested and popular, used around the world.
Then I asked about my colleagues, my academic colleagues who are aware of it being there: what do you think about their knowledge and its use in the future with you around? It came back with things like: that knowledge is irreplaceable, it's always evolving, it's needed to do jobs. Then I asked it about the experience of people. It says it is valuable and essential, gives a rich and nuanced understanding of the world, critical in arts, literature and philosophy but also in STEM and R&D. So it's pushing people towards the areas where it can't do research and development, come up with original things, or draw on lived experiences, which are nuanced, and so on. Then I asked about the skills it sees humans having which will remain useful since its arrival, and in the future when things like generative text are commonplace. It says creativity, intuition, trial and error, emotional intelligence and so on, and at the end it also put safety and ethics, and interpretation of data. I'm thinking some of these are already its forte, but some of them are perhaps still frontiers for it to become expert in. Then eventually I asked it about the emotions people are having at the moment, and I'm sure you will be in one camp or the other, or both; perhaps say in the chat how you're feeling and why, it would be nice to see that when we collect the messages from you all at the end. Excited: people want to use it, say, to give personalised feedback; it lists some ways in which you can use it, the new opportunities it offers, and so on. But people might also be concerned about jobs, ethical concerns, privacy, unintended consequences, and scepticism about accuracy. Are these the feelings you're having about ChatGPT? Perhaps they are.
What it did not mention immediately was academic integrity. Then I asked it eventually: okay, if you're going to be around for a long time, what can we do as humans to overcome our negative feelings about you? It says: embrace me, develop me, use me as a tool to improve effectiveness, efficiency and inclusion. Now, I don't know why it picked those words, because in my thesis I talk about technology being used and useful for improving effectiveness, efficiency and inclusiveness as well. But anyway: use inclusive language, learn more about inclusive best practice and legal requirements, stay informed as I evolve, and value human connection. Don't forget that. So with all of that as backdrop, I want to invite our speaker for today, Manjid Durkant, who has very kindly agreed to come on and talk to us about the silver lining, perhaps, in this cloud: the use of AI in assessment for supporting us, for helping us do our tasks. So over to you, Manjid; I'll make you the presenter, or you may already be a presenter. Over to you then.

Thank you very much. Are you able to see my screen now? I'm sharing my screen; I'm not sure if you can see it. There we go. Great. Welcome, and thank you for coming. Thank you very much for that, Manish, and I really liked your introduction as well. It's quite a novel little thing, getting ChatGPT to answer the questions that you might have of it, which is quite interesting. I think this is one of those things that happens once in a generation, where everybody coalesces around a topic and finds it really quite interesting. We actually ran a roundtable yesterday and had over 600 people sign up just to discuss ChatGPT, AI and higher education.
There were a lot of interesting concerns and a lot of forward-looking potential that we all saw and discussed, so I'm looking forward to seeing what this group thinks too. The short title of this talk is that things are changing, and we really do just need to be ready. The first question I think a lot of people have is: what is artificial intelligence? Now, I take quite a sceptical look at that phrase; I personally think it's pretty much a catch-all term from a commercial perspective. I know plenty of companies out there who use it to get lots of funding, or to try to sell their platforms more effectively, and all that sort of thing. But from a technical perspective, I always think of artificial intelligence as primarily rooted in statistics. The idea is that you use often quite sophisticated mathematical techniques, implemented with really sophisticated technology and data techniques, to classify items and occasionally generate content. You'll see in my diagram on the right-hand side that if you have your input, you get some sort of classification out; the training data, which is often large, gets combined with some sort of statistical understanding, and that looks like AI. This is how we've thought about AI for a very long time: I give it a character or some handwriting and it can identify what character it is; I give it a picture of a person and it can identify which person it is. We've seen our phones classify which people are our family members, and classification of images and written text has been commonplace for quite a while.
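The "training data plus statistical understanding gives you a classifier" idea from the diagram can be sketched in a few lines. This is only a toy illustration: the feature vectors and labels below are invented for the example, and real systems use far larger datasets and far richer models than a nearest-centroid rule.

```python
# Toy sketch of the train-then-classify loop: distil training data into
# a simple statistical summary (per-label centroids), then classify new
# inputs by closeness to those summaries.

def train_centroids(samples):
    """Average the training examples for each label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc]
            for label, acc in sums.items()}

def classify(centroids, features):
    """Assign the input to the label whose centroid is closest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], features))

# Invented "handwriting-like" features: (ink used, stroke symmetry).
training = [((0.2, 0.9), "1"), ((0.3, 0.8), "1"),
            ((0.8, 0.2), "8"), ((0.9, 0.3), "8")]
model = train_centroids(training)
print(classify(model, (0.25, 0.85)))  # a thin, symmetric stroke -> "1"
```

The same shape, at vastly greater scale, underlies the image and text classifiers mentioned in the talk.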
These algorithms have also been used in social media to show us advertisements and posts that generate more engagement and give us more time on the platform, allowing these companies to generate more advertising revenue. But transformer models are quite a different beast. The idea is still the same, you're still using large training data and statistical understanding, but the technology itself is very powerful. It has existed for quite a few years now, and it's what's used in ChatGPT and other large language models. The reason transformer models are different is that they are so powerful at generating content, which I think for a long time people thought was a uniquely human trait, and that's what has been challenging people recently since ChatGPT came out. Now, most people weren't interested in GPT until the chat part arrived. GPT stands for Generative Pre-trained Transformer. The GPT models have existed for years; GPT-3, which is the core of ChatGPT (well, technically GPT-3.5, not to get into too many nuances), has existed for over a year now. People have been using it for marketing and for coding for ages. In fact, we at Graide ourselves have been using GPT to help generate marketing content and blog posts, or at least paragraphs of blog posts here and there, to speed that sort of thing up, and GitHub Copilot has been used to help people generate code as well. It's interesting how the chat component, added recently, has made it the fastest-growing platform ever. Now, GPT itself is trained on the internet, trained on vast text data, and has billions of parameters, but at the end of the day, from a layperson's point of view, it just predicts the most likely next word. From a computer science point of view I should say token, but I'm going to use the word "word" for simplicity's sake.
The idea is that it's like the autocomplete on Gmail, or the little autocomplete on your phone when you're texting, but supercharged beyond belief, because it takes everything that was typed before and runs it through itself to generate the next best token very sophisticatedly. In some sense it's actually kind of magical that this technology works at all, but it does, as we can see. The chat component is slightly different: it's an agent that sits between you and GPT. It takes what you've typed, converts it into a prompt that GPT can use, and then autocompletes the answer. Because of the way this agent has been created, that actually causes quite a few issues. Let me just open the chat so I can see any comments; please feel free to ask questions there and I'll try to answer them as we go. Now, the question most people have is: how powerful is it really? In short, incredibly powerful. It can code: I've talked a bit about GitHub Copilot, which has existed for a while, and people have been using ChatGPT and Copilot to write platforms and scripts from scratch. It can write: marketers have been using tools like Jasper AI to write copy, blog posts and social media posts, and to refine them. It can even pass exams: it has passed the Wharton MBA exam, the US medical licensing exam, and law school exams. It is genuinely really powerful. So should you be worried? Well, yes and no. Immediately, as educators, we're quite worried about students using it to cheat. The survey we ran yesterday found that, of the cohort who came, around 77% of higher-education educators are worried about students using it to cheat.
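The "supercharged autocomplete" description above can be made concrete with a toy next-word predictor. This is only a sketch of the principle under a big simplification: GPT predicts tokens with billions of learned parameters, not a bigram frequency table, and the corpus here is invented for the example.

```python
# Toy autocomplete: predict the most likely next word from bigram counts,
# the lay version of "predict the most likely next word".
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, which word follows it and how often."""
    following = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

corpus = ("the model predicts the next word and the next word "
          "depends on the words typed before")
model = train_bigrams(corpus)
print(predict_next(model, "next"))  # -> "word"
```

Scaling this idea up (longer context than one word, learned rather than counted statistics) is, loosely, what makes the real thing feel magical.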
But on the other side, detectors are being built, and we're taking a reflective look at what assessment even means, why we do particular types of assessment, and gearing it so that we do types of assessment that can't just be plugged straight into ChatGPT. This type of technology will be an industry soon; in some sense it already is. One of the things I really advocate for is teaching AI literacy, not only to our students but to our peers, because shutting our eyes to this technology being available is never going to help. Now, can it be used to reduce workload? Yes. I've seen examples of this technology being used to create lesson plans, and to summarise vast quantities of content into lecture content, or at least the basis of lecture content, which can then be finished with your own finesse. People have done something quite interesting: they've used ChatGPT to create example answers to the questions they usually ask and then had students critique them, which is quite an interesting learning tool. But at the same time, it has limitations. For example, it can't do maths, or physics, or subjects where you require strong but subtle links between topics. It's a confident liar: because GPT is just predicting the next best word, the next best token, it will lie to you, because it's creating things out of thin air. This is where the alignment problem in the training of the chat component comes in, which I'll get to soon. Its creations can also feel uncanny, which again comes back to the alignment problem with the chat component. We've got a couple of questions and comments here I'd like to address. One is that Microsoft is embedding its own AI into Word and, it seems, Outlook too. Yes, that's very important to think about.
It's already in Bing. In fact, the more powerful version of GPT is in Bing, which is in beta and which I've had access to. It is slightly more powerful and quite interesting, but there's no remarkable difference in terms of a step change. On assessing how good people are at using AI in applied fields and in practice: that's a very good point, using this technology and incorporating it into our education as a way of assessing how good we are at using it. It sounds a little ridiculous when we say it now, but at school and GCSE level we have calculator exams and non-calculator exams to assess how good we are with those tools. Liz has a good question about opening up a divide between people who can pay for these services and those who can't; I've got a slide about that coming up, so we'll get to it. And yes, exactly: Linda said that ChatGPT doesn't always get things right, and that's due to the nature of the predictive-text element, and students have found that out the hard way. I heard a great story from Professor Alison Davenport about a student who used it to submit some work, and it had clearly used a technique that was outmoded and incorrect, because it was just autocompleting from this large source of data. Then a couple more things about reducing workload: Ruth has a really good point that in every response she's had from ChatGPT, the sources provided simply don't exist; the links don't exist, they're broken. Again, this goes back to the central point: it is just generating the next best token, and it's not doing anything more sophisticated.
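Ruth's point about invented sources suggests a simple first sanity check: pull the URLs out of an answer and try them. The sketch below is only illustrative; the example answer and its links are invented, the URL regex is a rough heuristic rather than a full spec, and the HTTP check needs network access, so it is kept separate from the extraction step.

```python
# Pull link-like strings out of a ChatGPT answer, so each can then be
# checked against the live web (fabricated citations will not resolve).
import re
import urllib.request

URL_PATTERN = re.compile(r"https?://[^\s)>\]\"']+")

def extract_urls(text):
    """Collect everything that looks like a link, trimming trailing punctuation."""
    return [u.rstrip(".,;") for u in URL_PATTERN.findall(text)]

def link_resolves(url, timeout=5):
    """Return True if the URL answers an HTTP request (network required)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except Exception:
        return False

answer = ("See Smith (2019), https://example.com/made-up-paper and "
          "the review at https://journals.example.org/fake-doi.")
for url in extract_urls(answer):
    print(url)  # when online, follow up with link_resolves(url)
```

A dead link doesn't prove fabrication (pages move), but a reference list where nothing resolves is the telltale pattern Ruth describes.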
Now, I do want to be clear here: GPT-4, when integrated into Bing, can provide sources, because it can cite the web pages it pulled information from, so there is a slight difference there, and as these tools become more sophisticated they may handle references more effectively. So let's move on to the next slide. I think a lot of people are worried about detection: if people are cheating, how do we actually detect AI-generated content? In some sense we were all worried about people using Wikipedia and other online sources, but then tools like Turnitin and other plagiarism checkers came about and there was less worry. So the knee-jerk reaction a lot of educators have is: let's detect this, so we can at least use it more effectively as a learning tool. Now, I want to be very clear about this: detection isn't going to be a catch-all. In some sense it's an arms race between the tools generating this content and the tools detecting it. At least initially, we can use signals within the output. There are tools like GPTZero and GPTKit, and we at Graide are working on tools on the same premise: there are different parameters you can analyse in language, using NLP, to work out whether or not a text was written by a human. For example, GPTZero uses something called perplexity and burstiness. Perplexity is effectively the randomness of the words used; AI-generated content tends to be less random in its word usage. Burstiness, as GPTZero defines it, is the variability in sentence length; human-written content is quite bursty, with sentence lengths that vary quite a bit. But as these tools get more sophisticated, and as we prompt the generators to write in particular ways, we may be able to get around these signals. It's not going to be a
100% catch-all, because at the end of the day it's a statistical measure of the text it's looking at. Now, on the other side of things, yes, exactly: Edna has a good comment that GPTZero failed to recognise a GPT poem for them yesterday, and that really highlights the problem. As I understand it, GPTZero uses GPT-2 to analyse perplexity and burstiness, but if we're already on GPT-3, GPT-4, GPT-N, let's say, it's an arms race, and do we really want to be getting into that arms race? The next thing here is watermarking, or fingerprinting. This image is from Kirchenbauer et al.; it's a very readable paper and I recommend reading it. The idea here sits on the model side rather than the user side. If we split the word space, the token space, into red words and green words, such that on average normal human language has around 50% red and green words, we can bias the model slightly towards green words and slightly away from red words, such that the output isn't noticeably affected, but text generated by the AI will be remarkably more green than red, giving a statistically significant indicator that it was written by artificial intelligence. Now, this obviously depends on the people providing the models: OpenAI have said they're going to work on a watermark, but other tools, and open-source tools, may not, and we can't fully rely on this. So I guess one thing I want to say is that detection is here, but is it necessarily the thing we should be focusing on, and is it something we should be relying on? In my personal opinion, no, but it is a tool we can use to iron out the absolutely
ridiculous cases of cheating. There's another comment here from Sheila, very well noticed, that GPTZero can produce many false positives, and that's also very important to note: detecting AI-generated content is one thing, but if we accidentally say a student's work is AI-generated when it wasn't, that's going to cause a lot of issues too. Now, I know everybody says this a lot when it comes to education, but whenever I think about incorporating new tools into education I often come back to Bloom's taxonomy, at least as a pulse check on where to go. One could argue that since the invention of Google and Wikipedia, the "remember" band of the taxonomy has largely been rendered less relevant in the workplace; that's not to say it's not used, or not important for education. The question we need to ask ourselves is: where does AI affect the rest? AI can be used to explain ideas and concepts. It's not that good at applying them yet, but as AI evolves it may start to apply things a little more as well. Analysing is quite difficult; as I said, a lot of the content is uncanny and surface-level, and it struggles to draw connections that are subtle but rigorous, as in physics and maths. Evaluating, justifying and defending is quite interesting: if you ask it to defend itself it will, but it will often produce garbage as a result, and that comes back to the whole uncanny aspect. As for creating new and original work, when it creates poetry or prose, or answers questions as we did at the beginning of this talk, it might feel like it's creating new and original work, but there's an aspect in which it's recycling old work, and there's always a
balance. How do we look at how AI enhances these components of the taxonomy, but also how we can use it to assess our students' ability to leverage the technology in the first place? So here we go: Clement has a comment that there are also paraphrasing tools that can take outputs from ChatGPT, make them sound more human, and effectively remove the watermark, if it even existed. That's an interesting point. Of course, the word colours can't just be a big open-source library of which colour each word is, otherwise you could just use a synonym tool to completely remove the watermarking; there has to be a seeding process, which would be a whole thing in itself. And even if we do have really good detection tools, they're just not going to be perfect, because at the end of the day they're based on statistical measures. There are also additional issues with generated content. I've talked about this alignment problem quite a bit now; what actually is it? ChatGPT is trained by training an agent to produce good prompts. Normally (sorry, I lost my train of thought there for a second) you would ask humans: hey, can you create some good examples for us, so that we can use them as a training set? But in practice it's quite difficult to get people to create that type of data. Instead, what OpenAI did, which was to be fair quite smart, was to have people review the output and use that to train the trainer, effectively a reinforcement-learning-based system within this AI system. What that means is that you can thumbs-up or thumbs-down the output; if you've used ChatGPT you'll have noticed the thumbs up and thumbs down in the corner, still there from that training process. But the problem is that not all people know what quality
output looks like in all domains. I'm not a poet; I know the basic ideas about poetry and rhyming words, and iambic pentameter is a phrase that was uttered when I did GCSE English, but I don't understand poetry very deeply. So when we're talking about the agent, it's actually incentivised to lie: if the choice is between an output that says "I don't actually know the answer" and a relatively convincing surface-level lie, somebody who doesn't know what quality output looks like is more likely to approve the subtly convincing lie than the admission that something is simply not known. At the same time, you have the problem of middle-of-the-road, surface-level content. The reason is the way the reward model works: it gets just as much reward for creating really basic material as for creating in-depth material, because if it produced a very sophisticated poem, I as a layperson would not know whether it's actually a good poem or not, and I'd be more likely to give it a thumbs down than a thumbs up. So it's actually incentivised to create more surface-level content. Then there's a huge problem when it comes to embedded bias. These large language models are trained on the internet, and the Bing AI is also connected to the internet so that it updates itself, and the bias of that content is therefore embedded within the model, and wide-scale usage of these tools propagates that bias more effectively. It's like shining a mirror on ourselves, but with a magnifying glass; it can be absolutely horrific, and I don't think there's been enough oversight of the bias that exists within this data, and the bias that will exist in future large language models. Then there's the aspect of financial access as well: ChatGPT Plus is $20 a month, and
it's quite likely that the paid models are going to be better than their free counterparts, which immediately makes me worry about the financial inequality gap: giving people with means access to technology that supercharges the difference between those two groups, and that social inequality. Now, when we discussed this yesterday, a provost said that if these tools show value, institutions are quite likely to procure them for all students at university level. But the university market and the school market, from a business point of view (this is my commercial background), are quite different markets, so even if that might be true at university level, it might be way too late by the time it's actually done. So I'm just going to go through the comments again. Clement is talking about how reciting pre-existing work is a lot of what creating new knowledge is. There is a common saying, "there's nothing new under the sun", or "everything is a remix", that's right, and there is certainly an aspect of that. I think we're going to have to see how much creation of new knowledge and new content is really recycling versus creating new things, and that can be a very interesting time with ChatGPT. Manish has made a comment regarding the arms race I mentioned: the idea that tools like this need to develop student knowledge and metacognitive skills, and can only be used to improve higher education and make it more inclusive. I think that's really true: if we have equitable access to these tools, there's a future out there where we have very inclusive education. We've talked to students, and they say they're using ChatGPT to explain topics and ideas to them from the point of view of a professor or a
student, to get different viewpoints and different ways of understanding things, which can be really quite interesting in this kind of time. Then Liz Avery has made a comment regarding the human cost of the workers used to train the models, and the bias that can introduce as well; I haven't read that particular article, but I'm sure it's quite a good read. "It's not yet truly capable of creativity and intuition; it's limited to working within the parameters of the data it's trained on; the definition of the word 'new' is important here." That's actually a very good point, and again, Tony, I think we're really going to see how much of what's "new" is recycled versus never seen before. Clement has a comment regarding chess players: they never thought a machine could do their job, and looking back we can explain why chess is easy for a machine, but it's always easy to explain things that have already happened. You know, I'm actually a really big chess fan, and looking at the history of Garry Kasparov versus Deep Blue, I believe it was, and the way things worked then, I think an interesting point here is that there was a limit of explainability when it came to chess and algorithms. For example, Stockfish 12 or 13, something like that, used algorithms based on calculating point values for particular pieces and doing in-depth tree search, looking forward into the future to find moves. But when AlphaZero came out, and now that those neural-network techniques are embedded in newer engines, explainability of why those chess engines make their moves is really hard. We went from a system of "moving a pawn two spaces forward is going to lose you a knight in this one specific line", which is super specific, to: I don't know, that's what the
computer said". And explainability is a huge part of artificial intelligence that a lot of current systems are missing. Edna has a comment: does it create a sense of mistrust in the classroom, and how do you counteract this? For me, this is about AI literacy. I remember being a student when Wikipedia was new, and every teacher would tell me, "you can't just use Wikipedia, it's an open set of information that anybody can edit", and in some sense that's exactly its power, but we were also taught: use this as a starting place, find the references, do your own research. If we have a good grounding in AI literacy and AI ethics, and understand the bias that could be in there, we should be able to counteract these problems. I'm scrolling through the other comments here. One says that sooner or later every educator will have a tool like this, like any other lab tool. That's quite an interesting comment: when we look at mathematics and physics, we have tools like Mathematica which can solve really complex equations and systems analytically, exactly, and we don't say this is going to absolutely ruin everything. We still teach students how to solve those types of equations, because the point of those equations is not necessarily to get the answer; the point is to teach them a way of learning, that metacognition that's been mentioned multiple times, that critical thinking. And yes, there are comments about tripping up GPT, which is also true: currently it doesn't go past 2021. That's correct for ChatGPT, but it's not true for Bing's GPT, which is integrated into Microsoft's search; that is live, connected to the internet, and will always be up to date, unless they change anything after this talk. Okay, so I'm going to go into the next slide here. The next section of this talk is really about an AI
assessment assistant. This is where I sit, and where Graide comes in. We built Graide while we were postgraduate students, and we noticed a few things. We noticed that assessment design and delivery occurs in segmented systems: you'd be creating content in a PDF document, giving it out through your learning management system, students would be doing it on paper and uploading a PDF, and then you'd be printing it off or marking it in some awkward way and copying and pasting the marks into an Excel spreadsheet. It was just a nightmare. And often the marking was very repetitive and time-consuming, and thereby expensive, where we were using postgraduate students to mark that work. And when we looked at the NSS scores, assessment feedback was often the lowest-scoring education category. The idea of Graide is to solve all of those issues: to design and deliver assignments from one place, to use technology and artificial intelligence to give feedback faster, and to improve the rubric system to improve consistency. You create your assignment on Graide, the student attempts it on Graide, and you mark it in Graide, which allows you to mark work quicker and more consistently, and that allows you to reduce turnaround time. Now, we've been talking a lot about AI, so let's talk a little about how Graide's AI actually works. The idea is that when a student response comes in, we basically ask the AI a question: has this answer been seen before, within the set of answers that have previously been marked for this particular question? If it hasn't, that's when it's given to a teacher; they mark it and give it high-quality feedback, and we use that for learning, such that when the next answer comes in we can look at the difference between those answers and at the feedback that you gave to the first set of answers
and provide partial or complete automation, depending on the thresholds that are set. Now, I appreciate that this is just a lot of words, so I want to actually show you what it looks like. Has it changed tab? Okay, I think... yeah, we can hear you. My share... okay, is that better? It should come up. There we go. Okay, apologies for that. No problem, a technical hiccup there. So the first thing I want to highlight here is this rubric. The idea is that this rubric is shared amongst all markers on the same question, to allow for consistency. If I edit any one of these pieces of feedback, the edit is retroactively applied to all other students who received the same feedback, to ensure consistency. I can rearrange this rubric without bothering other markers and have things set up in my own particular way. The way I've given feedback here is that I've highlighted the relevant area and given it the feedback I want to give; this is pre-written feedback, just to save some time. I've only marked two questions here, but already the AI has started to learn: in this answer, you can see that the AI is suggesting feedback based on the way I've marked those previous questions, and it's highlighted this for me. There's a 50% confidence here, and this is based on the previous data. If I accept this and move on to the next script, those percentages update as we go. If we go into different questions, we can see the different confidence levels the AI has for these particular types of question. And it's not only for mathematics; it's also for, let's see if this broke it, no, it's switched, it's also for short-answer text. So here we've looked at the idea of understanding natural selection, and it's understanding a few key differences between
these answers, which affects the confidence percentage; but if there's nothing to say, then it won't pick anything up, and it has different areas that it will adapt to and understand. Now, here the difference is quite low. If this is for any reason incorrect, I can just delete the feedback, and the platform will learn the differences for those particular types of question and understand the nuances of the way you mark work. We've got a hand up from Manish. "Yes, I just wanted to remind you that some people may wish to leave at five to, because they may have a meeting physically somewhere else, but please continue, and you've been taking questions in the chat anyway, so thank you for that. I'll leave the floor back to you." Thank you, sorry. I've only got a couple of slides left anyway. In terms of using Graide, we've estimated that, depending on how much marking you do and how much you pay for it, you can save around a quarter of a million pounds a year, depending on all those types of factors. We've found that you get faster, more consistent feedback: we've reduced turnaround time, in cases where it's been used, from three weeks down to something like three days, and the consistency of feedback has increased. Now, we haven't fully tested this yet, but the hypothesis is that improved feedback should result in improved learning. In terms of a case study, at the University of Birmingham, even where markers didn't use the artificial intelligence, they still found it significantly easier and significantly faster to provide feedback. And when we looked at the artificial intelligence component for mathematical responses, we found that you could reduce the time it took to mark work by up to 89% and increase the amount of feedback by up to seven times. And we're always building this community of educators using Graide: we've partnered with JISC, for example, and other universities as
well, and our algorithms are always improving: for example, we're working on an essay- and report-marking solution as well. I'll take some questions now from the comments; people can raise their hands or type in the chat, whichever is convenient, given that we have eight minutes or so. "Well, if people want to leave at five to, do complete the survey that I've posted, giving feedback about the session, and if you want to join the group for making a change in inclusive assessment practices, please do that. Over to you." "Can Graide work with handwritten submissions?" Yes, Liam, it can; you just don't get the artificial intelligence component, but you do get the workflow benefits. For example, with a PDF, the idea is that you draw a region over the area you want to give feedback on and you link the feedback, and we're working on connecting the artificial intelligence component into this engine with optical character recognition, which we already have in the platform for students. "What sort of solution do you suggest for long-form written assignments?" At the moment we have a solution that is still in beta, so it's not fully released yet, but the idea is that you can grade essays and reports as well: you can see the reports, you can give inline feedback like you would with notes in PDFs, but you also get this direct feedback in a compressed rubric system where you can see all the statements you want, and if you want to, you can expand it to see the particular nuances of the points you've defined. It allows you to give feedback faster and more clearly, all under one roof. We don't have AI here just yet, but we're using this alongside a potential partnership to improve the AI capabilities for long-form content. "Can I ask a question as well? Have you seen any improvements in the NSS scores wherever your system is being used at the moment?"
Yes, that's a great question. It hasn't been implemented long enough for the NSS scores to be measured as a result of where it's been implemented, but we do ask questions similar to the NSS, and we find that students find the feedback more consistent, they find the turnaround of feedback better, and they find the quality of feedback higher, where quality was defined as applicable, relevant and so on, drawing on Bloom's taxonomy. Thank you. If anyone's interested, feel free to go to our website; you can book a session and chat further. Thank you. "People are sending you lots of thanks, and they're very appreciative of the time you've spent with us. Thank you so much as well from the team. If people would like to get involved and work with us on making assessments more inclusive, please join the group and leave us some feedback, which I can pass on to Manjitha later on; we have the chat messages anyway. And we shall see you next Thursday, unless you have any more questions for anyone of us, including Manjitha, our guest speaker. Thank you." Thank you very much, everybody. I can pause the recording as well. Just a second... stop recording.
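The marking loop described in the talk, in which a new answer is compared against answers already marked for the same question and the matching feedback is suggested above a confidence threshold, can be sketched roughly as below. This is a toy illustration, not Graide's actual model: the string-similarity measure, the threshold value, and the data shapes are all assumptions made for the example.

```python
# Toy sketch of a similarity-based feedback-reuse loop (not Graide's real AI).
from difflib import SequenceMatcher

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff for automated suggestion


def similarity(a, b):
    """Crude textual similarity; a stand-in for the real learned model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def mark(answer, marked):
    """Return (suggested_feedback, confidence) for a new answer.

    `marked` holds (previous_answer, feedback) pairs for the same question.
    Below the threshold, feedback is None and the answer would be routed
    to a human marker, whose feedback then extends `marked`.
    """
    if not marked:
        return None, 0.0  # nothing learned yet: human marks this one
    prev, feedback = max(marked, key=lambda p: similarity(answer, p[0]))
    conf = similarity(answer, prev)
    return (feedback if conf >= CONFIDENCE_THRESHOLD else None), conf


# One answer already marked by a human for this question:
history = [("Natural selection favours advantageous traits.",
            "Good: mentions differential survival of traits.")]

# A near-identical new answer gets the stored feedback suggested:
fb, conf = mark("Natural selection favours advantageous traits!", history)
```

In the real system the similarity judgment, the per-question confidence percentages shown in the demo, and the partial-versus-complete automation thresholds would come from the trained model rather than from simple string matching.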