So welcome to today's session on artificial intelligence and evaluation. We have a panel discussion, so thank you for joining us. It's being put on by the South Australian branch of the Australian Evaluation Society. Before we begin, I'd like to acknowledge the traditional owners of country throughout Australia and recognize their continuing connection to lands, waters and communities. I'd like to pay our respects to Aboriginal and Torres Strait Islander cultures and to elders past, present and emerging. And I know that we're on at least several countries, based on just the location of the South Australian members who are here today. I'd also like to thank our Melbourne colleagues. Today's presentation was made possible by a conversation between myself and the Melbourne branch convener, Ruth, and she recommended one of our panel members, Christy, for today's presentation. So thank you to our Melbourne colleagues. If you're not a member and you're interested in learning more, then perhaps now's a good time to join the AES. You can grab the link on the screen there. It's pretty easy, www.aes.asn.au, and you can find out all the details about becoming a member. So let's get into our discussion. I'm going to facilitate it. My name's Mark McKay. I convene the South Australian branch, along with many of the committee members who are here today. AI has taken off with, I guess, a bang in the last year or so. It's been around a long time before that, though. We're seeing more and more headlines, and the speed at which generative AI in particular is emerging is, I guess, interesting, challenging, and leads to lots of opportunities. But we need to think about it, particularly as evaluators. I've got some of the headlines that have interested me. Algorithms are pushing AI-generated falsehoods at an alarming rate. How do we stop it? I don't know if anyone's used GPT.
I'm sure you have, but if you ask it to generate references and then check those references, often they'll be rubbish. We've got Google, we've got other platforms also launching into health, and health is an area where we deal with sensitive data. So who's accessing that? The systems learn. What happens to the data when they do learn? How can we be sure the results are useful? We've got the crime side of things happening. In the bottom left-hand corner of my screen, AI has been used to generate the voices of family members and then extort funds for their supposed release. No family members were involved in the crime; their voices were picked up and copied, and it was made to sound quite real, and obviously people pay out. There's also the opportunity to send doppelgangers to university classes, where if it's an online session, a stand-in can be present, answer the lecturer and engage, but the student isn't there. So what happens? I guess also, what happens if we do outsource our work to AI, where does our skill base go? So all sorts of things are happening. Even in the creative space, people are talking about the end of photography, or how is it going to change? So lots and lots of interesting things are happening, and I'm sure you've heard radio talks and podcasts, seen it in the news, it's everywhere. So today we've got a panel of people to speak about it. I'd like to say thank you to them first, and then let them off the hook to some extent. This is a new space. There's not going to be a right or wrong answer. There's going to be some opinions, and I'm pretty sure I'm on side with the opinions that they have, but it should be interesting. So Ian Walton will be speaking first. Ian is a longtime colleague and friend of mine. He is a former data quality manager at Medicare Australia and has also been involved in the education of postgraduate students studying health care administration for many years.
Nigel Bean is also a friend and colleague, a former professor of mathematics now working in the consulting space. Nigel has had a lot of experience using synthetic data, understands probability, and can also talk about AI. So he'll have some interesting views as well. Christy is an associate director of ELT at Grosvenor, sorry, Grosvenor Performance, and she has an interest especially in the ethics of AI as it pertains to evaluation. So let's get on to it. The structure for the session: each panel member will have five minutes to talk about something on generative AI. We'll start with Ian, then move to Nigel and then to Christy. Then we're going to put some questions to the panel, then open it up for questions and answers. If you've got any comments or burning questions, write them in the chat and we'll get to them in section three; that would be great, just so we don't lose time with interruptions. At about the four minute, four and a half minute mark, you'll hear my timer go off. So that gives you a warning, Ian, Nigel and Christy, that you're running out of time. Okay, let's get underway. Let's have Ian talk. I'll hand it over to you, Ian. Just stop sharing the screen. Thank you, Mark, start the clock now. Welcome everyone, and thank you, Mark. I guess I have a strong bias here with regards to data quality. I regard it as the foundation stone for future AI. After 40 years of working in just about every possible aspect of the health sector, and working on two data warehouses, I have a very strong appreciation of the concept of garbage in, garbage out. So my proposal with regards to AI is to build the AI house on rock, not sand. What do we mean by data quality? Well, in my 40 years, I've always used this formula: fitness for purpose. The example I used to give my students is, would you expect a world-class swimmer to be able to hold their own in a boxing ring with a world-class boxer? And the answer is no.
So therefore, when we look at data and we look at information, we should be looking at fitness for purpose. Now, data emanates from a business system process that quite often involves policy, information technology, and frontline staff. It's a very simple model that I use to explain complex processes and to highlight the power of silos. So let me give you an example. Let's say the organization wants to create a field called date of birth so they can register their clients. Policy says, I want a date of birth field, please. And IT says, give me the specifications and I'll put a date of birth field together. And they'll say, it's not a problem, we do it all the time: day, day, month, month, year, year, year, year, everything's fine. And the frontline staff will say, just give us the policy and the business rules associated with the registration process, and everything's hunky-dory, we'll produce you an outcome of client registration. What I did in one of the organizations I worked at was bring the policy people, the IT people and the business operations people into the same room to look at all of the fields, one of which was date of birth. Policy said, I don't know why you're reviewing this, we have a policy for date of birth, it's done. IT said, we don't know why you're reviewing this, because everybody in the world in IT knows that date of birth is day, day, month, month, year, year, year, year. And the business ops people said, we don't have a problem either, so why are we reviewing it? And I said, well, here's a chart of the dates of birth for the country. And there was a massive spike on the 1st of January and the 1st of July. So I asked the policy people, how could that be? And they said, well, there must have been a lot of people born on the 1st of July and the 1st of January. The IT people said, don't ask us, we just put the field there; it's the business operations people that have filled it in. The business operations people said, oh, we can explain that easily.
There are an awful lot of people who don't know their date of birth, so we just use the 1st of July or the 1st of January. That completely destroys the integrity of the process. Now, why do I give you that particular example? Well, that's a simple thing like date of birth. There are many, many, many others, and you really need to know what you're dealing with when you're dealing with data from organizations. So to solve the problem of date of birth, we needed transparency of process. We brought all the people into the one room and looked at the system process to work out what was going on. We also identified that the data source must be identified, along with the system processes. Now, when we move to AI, the very first thing I'd be asking is, where did you source your data from? And what was the integrity rating for that data? I've also learned strongly over the years that we require technical metadata, which is the IT view of the world, and business metadata. Now, metadata is information about data. In the example I gave you just now, the technical metadata would say date of birth is day, day, month, month, year, year, year, year. And if you looked at the business metadata, it would say, and when we don't know it, we'll use the 1st of July or the 1st of January. That's what I mean by business metadata. How many people collect business metadata? Very, very, very few, in all of the major organizations I've ever been associated with. So let's go back and look at the example I just gave you. We've got transparency and accountability as being the issues in these silos: policy, IT and business operations. What sort of problems do we hit? Well, the same problems all silos have. The lack of communication between silos: it's not my problem, go and see the IT or the frontline operations people. The problem of what's in it for me: IT and policy may say, well, we're not interested in date of birth, there are other bigger fish to fry. You've got 30 seconds, Ian.
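The profiling check Ian is describing, scanning a date field for implausible spikes on known placeholder dates, might be sketched in a few lines. This is an illustrative sketch only; the function name and the placeholder dates are assumptions, not code from any system Ian mentions:

```python
from collections import Counter
from datetime import date

def placeholder_dob_rate(dobs, placeholders=((1, 1), (7, 1))):
    """Return the fraction of dates of birth that land on suspected
    placeholder (month, day) pairs such as 1 January or 1 July."""
    counts = Counter((d.month, d.day) for d in dobs)
    flagged = sum(counts[p] for p in placeholders)
    return flagged / len(dobs) if dobs else 0.0

# Toy register: two records on 1 July, one on 1 January
records = [date(1980, 7, 1), date(1975, 3, 14),
           date(1990, 1, 1), date(1962, 7, 1)]
print(placeholder_dob_rate(records))  # 0.75, far above chance; ask why
```

A rate well above the roughly one-in-365 you would expect for any given day is exactly the spike Ian's chart showed, and a prompt to go and talk to the business operations people.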
We have no time and no resources. So what I'm saying is, we need incentives and penalties with regards to this process. We need to educate people as to the value of data integrity. Otherwise it's garbage in, garbage out. And I'll just quickly finish with the Bee Gees, who had a hit song once, How Deep Is Your Love, which includes the lyric, we're living in a world of fools breaking us down. Perhaps in relation to AI, we should rename the song How Deep Is Your Trust. In closing, John Farnham got it right when he sang You're the Voice; I would add, use it or lose it. Thank you. Thanks, Ian. Okay, moving across to Nigel. Now Nigel, I know you want to use a slide, so I'll just bring it up for you. Thank you, Mark. Okay. So thanks, Mark, for organizing this, and welcome everyone. I wanted to just start with a little bit of history of artificial intelligence. It's been around for a long time, but historically its main use was to support decision-making. And as Ian's just talked about, garbage in, garbage out: if you're training your decision-making algorithms on data that isn't worthwhile or isn't reliable, then you're going to get unreliable decisions coming out of them. AI has a difficulty in that many of the methods do not provide any explainability of their decisions. And there are a number of areas where there are either regulatory or ethical reasons why you need to be able to explain decisions, and that can be quite problematic for some of the AI methods. And obviously over the last, let's say, 10 years, there's been a massive increase in the complexity of these decision-making algorithms. For example, most of you probably opened your phone with a piece of facial recognition software. That's exactly AI. The recent excitement over the last year or two has been in generative AI. And as Mark said, that's what we're going to predominantly talk about today. With generative AI, there is no explainability. And that could be a problem. It's trained on the whole internet.
And I'll leave you to decide whether that's garbage in or not. I'm a person who made my first post on LinkedIn a few weeks ago at Mark's request, and have no other social media, so I think you'd know my opinion on that. But whatever your opinion is, if it's trained on the whole internet, has it been trained on, does it know about, the things that matter in the application that you're using it for? So for example, health data. How successful is Dr. Google? I'm not sure that it's a great way of running your medical life. How generative AI works, from a technical perspective, is that it is able to predict the most likely next phrase. And I'm deliberately being nonspecific there. It could be a word, it could be a collection of words; it can be a whole lot of different things depending on the actual algorithm. But when it's writing text, it is predicting what text should follow the previous text. That's fundamentally how it works. Now, in its default mode, it will put the most likely next piece of text. But some of the algorithms, or some of the interfaces, allow you to increase the temperature. What that does is allow it to choose more randomly, but still from the more likely pieces of text to follow. And that's how it's working. It has very impressive performance in terms of language and grammar. When you look at something that is written by AI, it's usually written pretty well. Certainly a lot better than previous generations of these sorts of techniques. And a number of my friends say it writes better than I do, so why not use it to improve your writing? So it's very impressive in that area. But one of the points that I think is misunderstood, at least partly because of that impressive ability to write, is that there is actually no, and choose your favorite word here, intelligence, understanding, rationality. It is just predicting what words should follow the previous words.
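The mechanism Nigel describes, taking the most likely next token by default and letting a higher temperature flatten the choice toward other plausible candidates, can be sketched like this. It's a toy illustration with made-up scores, not any vendor's actual decoding code:

```python
import math
import random

def sample_next_token(scores, temperature=1.0, rng=random):
    """Choose the index of the next token from raw model scores.

    temperature <= 0 : always the single most likely token (the default mode
                       Nigel describes)
    higher temperature: flatter distribution, so less likely tokens are
                        chosen more often, but likelier ones still dominate
    """
    if temperature <= 0:
        return max(range(len(scores)), key=lambda i: scores[i])
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - m) / temperature) for s in scores]  # unnormalized softmax
    return rng.choices(range(len(scores)), weights=weights)[0]

# Made-up scores for four candidate continuations; index 0 is the favourite
scores = [2.0, 1.0, 0.5, -1.0]
print(sample_next_token(scores, temperature=0))  # 0: greedy, most likely token
```

At temperature zero the same text comes out every time; turning temperature up makes the output more varied while still favouring the likelier continuations, which is all "creativity" means at this level.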
And my favorite example of that, which no doubt has been corrected by now, was when a version of ChatGPT was asked the question: I have two jugs, a five liter jug and a two liter jug. Measure for me three liters of water. So there's a fairly simple solution: fill the five liter jug, use that to fill the two liter jug, and you'll be left with three liters in your five liter jug. What ChatGPT came back with was an 11-step algorithm, and the final sentence was, you will now have three liters of water in your two liter jug. I find that a beautiful example, or demonstration, that there is no rationality. It's written out well, the algorithm all seems sensible, but it's willing to conclude that there's three liters of water in a two liter jug. So it doesn't understand those things. You've got 30 seconds, Nigel. Thank you. And then of course we've all heard about hallucinations, so I won't go into that. One other quick comment: Mark said I use a lot of synthetic data. What that's for is that the data we collect is literally one instantiation of life. And if you're trying to test new methods, you want to be able to test them against other reasonable representations. And so for that we create new data sets, but we're very, very careful in replicating the key features, or at least the key features we understand, and we know what we've replicated. If you use AI to create synthetic data, what's it based on? How do you know what features it's decided to keep and what features it hasn't, or where it's created that data from? Thank you. Thanks Nigel. Okay, Christy, over to you. Brilliant, thanks Mark. Thanks Ian, thanks Nigel. I just wanted to start with a bit of context as to how I'm approaching this session, because that will inform the answers and the responses that I give. So I'm not a techie, and I am terrible at digital adoption. You know, I don't know my SaaS from my XaaS, all that sort of thing. But if you take one thing away from my segment here, it's that AI capability is something you can build.
It's something I've built, and it's something that's essential for us as humans in general, but as evaluators specifically. We really need to run alongside this technology and not let the horse get too far out of the gate, the way we did about 10 or 15 years ago with cybersecurity. The other thing I think is really important to pick up here, sort of reflecting on Ian and Nigel's points, is that artificial intelligence, AI, is so broad. So I just want to be really clear that AI does not equal ChatGPT. There are many other different solutions out there. So far we've spent a lot of time talking about ChatGPT, but there are the large language models, the LLMs, and there are also the multimodal ones as well: Bard, Baidu, Copilot, so many others that I've forgotten. So the deficits of one solution may not be present in others, because the technology is evolving so rapidly. I am a cynic. So like the other panelists, I tend to be quite conservative in nature and I tend to look for the risks everywhere. But I've deliberately decided tonight to try and be optimistic, so let's see how I go. I was thinking about the examples that Mark provided and sort of set the scene with. And I think it's easy for all of us to be doom and gloom on AI, and I know I am; I'm just waiting for our robot overlords to come along, so I always say please and thank you to my laptop when I'm using it. But I think we've had several different industrial revolutions over time and the world hasn't ended. The printing press was giving access to the written word to the commoners: oh no, they'll rise up and kill all the landlords. The personal computer was democratizing the internet and knowledge for everyone: oh no, the world didn't end. I think it's the same here. There'll be some shifts. The rate of technological advance is happening at an unprecedented pace, and the impacts it can have are also unprecedented. But hopefully the world won't end just yet.
With my positive hat on, I think there are opportunities for us to harness this technology to answer some of the questions we've looked at for a long time. How do we have work-life balance as a human race? How do we advance science and knowledge to have unforeseen insights? How do we focus on work that we enjoy? You know, the other reflection I had was, because it is a technocratic space, sometimes we feel like we're not capable of engaging with it, or we just accept that it'll pass us by. But I remember growing up with, you know, you had your cameras everywhere, and you went down to the pharmacist and you got your 799 photos, you know, if you waited seven days for them. And Kodak had heard about the digital camera coming along and thought, just a fad, won't disrupt us. Where's Kodak now? So I just encourage us all to not be Kodak. We need to actively engage with this technology and build our capability to grapple with it, understand it and evaluate it right now, before we become obsolete like the manual camera back in the day. I think also that we as a species are built for adaptation. Evolution has brought us a range of different advances as a species, the written word and our ability to pass knowledge on through generations being one. We saw with the advent of the personal computer that our human brains evolved at that point in time. The human brain's working memory went from about seven plus or minus three bits of information down to about three or four plus or minus three bits of information. Because what happened was, the way our brain functioned, we no longer needed to remember large tracts of poems or essays or religious stanzas; instead we needed to know where to find that information. So our brain adapted in that space. I think our brain will adapt here. We're very resilient.
I saw a video today, sent around by one of our techies, that showed what happens when ChatGPT and large language models are implemented in an android body. So if you want to look at something really cool, go to YouTube and look up ChatGPT and Figure, and it'll come up with the test demos of this robot. The dexterity displayed by the android, its language processing, its response to the situation, are unprecedented. And so there are unprecedented applications for this technology. So it will be used; the horse is out of the gate, lawsuits about how the training data was acquired unethically notwithstanding. We as a race aren't going to turn the clock back on this. I also ponder to what extent AI poses new risks or just extends existing risks that we've always had. We as humans have always loved shortcuts. We've always loved rushing analysis under pressure. We've always loved taking credit for each other's work. For the person that's using ChatGPT or similar to write their essays and analysis, isn't that similar to back in the day of the professors taking credit for their research assistants' original thinking, but publishing under the professor's name? About 20 seconds, Christy. Brilliant, thanks. And so from all of this, we need to really engage with this technology, and that's what I urge every person on this seminar tonight to do, because there are two perspectives we need to consider: how AI can help our practice as evaluators, and how do we evaluate AI-enabled programs and services. So my top four tips. Number one, upskill yourself now; this does apply to you. Number two, when you're planning evaluations, consider the type of technology that's being used, its model, its strengths and limitations, because they're all different. Number three, look at the training data, the model design and how it was actually implemented; that's critical in this space. And number four, look at the transparency and the outcomes of the decisions being made.
So if you follow those four cornerstones, that's how you can improve your practice as an evaluator in dealing with this brave new world. Thanks. Thanks, Christy. If you could post the name of the video in the chat, I think people would appreciate it; there are a few questions about it. I'm going to put some questions up on the screen, so I'm just going to share my screen again. So, questions for the panel. I came up with some, and the panel have had the advantage of a little bit of time to think about them. A number of people have touched on the source of the data; that's one question. What if the models are reinforcing existing societal biases and we use them in our work? So either as evaluators, or where the people or organizations we're evaluating have used the models to determine things like who we should serve, if we're evaluating some sort of social service program. What about the fact that generative AI can create documents, videos, photos, all sorts of artifacts? How do I know, if an artifact is given to me, that it is real? Is the evidence no longer useful? And there's a video you can check; it's using ChatGPT. Yes, there are others as well, with Excel. You can now pair the two together. You don't have to have any skills in Excel to generate the output. Just type in a line and it'll complete the report, give you the graphs, the tables, et cetera, that you need. Yes, it's quick, it's cheap, it's easy to use, and I could imagine, as someone who worked in academia once, that students would take to it like ducks take to water. I don't have to learn this technology and spend time learning it; I can just generate the answer to the question. Is that good? Are we lowering our skill levels and therefore our understanding of what's produced? And is there a framework we might need? Now, I'm not going to ask the... okay, got it... the panel to answer all five questions; rather, pick one. So, which panel member would like to kick off?
Pick your favorite question and go for it. Christy? I will have a go. So I will have a go at generative AI lowering the bar for tools. Full disclosure to the group, I was half concentrating on the chat, so let's see how I go off the top of my head. Something really cool about AI is that it can democratize knowledge. Provided you've got an internet connection, which is increasing across the developed world and in parts of the developing world, you can, as Mark was saying, do programming without having a coding background. You can access all sorts of technology. It could help level the playing field between your cover letter and someone else's cover letter. So it can create more equitable opportunities for all, where your own capability can be demonstrated and assessed, rather than you being held back by your grasp of perhaps verbal or written language, or cultural norms. So I think that's one thing. The second thing is, one application I have heard of being trialled in this space is customer-oriented sort of access to websites. That sounds really fancy, but the example that I heard was a government department trialling having all its information able to be searched through an AI solution. So let's talk about ChatGPT as the main model people seem to be aware of. You can hop on and say, how do I get my fishing licence? But really importantly, you can type it and get the answer back in your native language. So Vietnamese or Afrikaans or Indonesian. So information that may have been blocked from you, by either your digital literacy or your grasp of the local language, is now more accessible. So I think there's a really, really exciting application here. The flip side, and here's my conservative side, is that it does increase the risk.
So we've now got an incredibly powerful tool freely available to millions and billions of people around the world who don't always understand concepts like hallucination and temperature, how to make sure that sources are correct, or that ChatGPT's training data only goes up to September 2021, for example. So accessibility is at an all-time high, but our educated use of the tools may not be matching it. So that's the risk to be aware of. Thank you, Christy. Ian or Nigel? I'd like to build upon where Christy has gone. Well, I hope to build upon it. If I go to question two, what if the models reinforce existing societal biases? Anybody who watched Q&A this week on the ABC heard them say, no, you'll use AI to get rid of the societal biases. In other words, you'll bias society with AI against its own biases. You'll educate it, so to speak. Good luck. I noticed today, just before coming to this meeting, that there's an article about the European government building a framework for AI, and I'd like to come back to Christy's point. It's like asking how many grains of sand there are on the beach; there are just so many different ways that this can be used. So I've only got one criterion, and my criterion is this: let any AI application state its source. Provide transparency and accountability on the AI function that's performed, and one of two things will happen. Everybody will get educated at the same time, or AI will disappear quicker than it started. Because once you bring in accountability, the people who are making money, and to the best of my knowledge there are only four or five people in the world who are running this and providing the frameworks for it, they have total control at the moment. And it keeps coming back to transparency and accountability for me, and having a framework that provides incentives and penalties to get rid of the fake news, if I can use that term. I also noticed that in the chat comments, somebody asked whether we are using AI at the moment.
I would come back to Christy's point. AI is being used successfully in a number of areas, but who's keeping tabs? Who's working out which is the good AI and which is the bad AI? Where's the framework? Where are the controls? Where are the penalties? If you're led up the garden path, and while the dollar runs this, well, those people who control the structures will be more concerned about making a dollar than, as in the example I gave earlier about data quality, making sure they got it right. That's it. Thanks, Ian. Nigel? Yeah, I particularly like the Excel question, about lowering the bar for using these tools. One of the things I think is really interesting in the ChatGPT implementation of this, and I've only seen a couple, so I'm not universally aware of what's going on, is that the ChatGPT version includes an explanation, in its own writing, of exactly what it's done. So when you ask it a question around an Excel spreadsheet, or you ask it to create some table from some spreadsheet data or whatever, it explains exactly what it's done, right down to the sort of logic that it's used. And then you can look in the Excel cells to actually see what calculations it's done. So in this case, and certainly in the hands of an expert user, it's very powerful, and you have that ability to check and make sure that it's doing what you want and has done it correctly. So I think that's really quite powerful. I've seen another implementation where it just produces the answers, and I would be a lot more concerned about that. I think it's a really good question, though, about lowering the bar for access to tools. So much of, I guess, training and education is getting you to have enough of a background base in an area to eventually become an expert in that area. And so lowering the bar is obviously a good thing from a societal point of view, of enabling more people to contribute, more people to become involved and use these tools.
The question is, well, the risk is, around people using it without knowing what they're doing. And although GPT will do what you ask, or the other programs will similarly do what you ask, whether you're asking the right questions, whether that's a legitimate conclusion, all those sorts of things are potentially swept under the carpet. I agree with Ian's comments around societal biases. Given that we're training these models on pretty much a uniform, whatever's-out-there basis, and they're so large, it's going to be very difficult to undo societal biases using these sorts of tools. Could I quickly add to Nigel's point, Mark? We all accept that students are using these sorts of tools to finish essays and assignments, et cetera. One of the things I used to do with my students, when I was aware that this was occurring, was call them into the office and say, well, look, can you explain the principles for me? And nine times out of 10, all I had to do from that point on was just hand them the tissue box. So at the end of the day, critical thinking is what could be lost here. And I don't even think critical thinking has actually got started properly in the universities and teaching organizations. So there could be a heavy swing to the universities in that regard, to start upping the ante with regards to the critical thinking processes. And could I also finish by saying, I think AI, just like atomic energy, which was meant to create endless power, can be used for a lot of wonderful things and a lot of very, very bad things. Thank you for those comments. Before I open it up, I think there's one other observation I'd like to throw into the mix. And that's one of the things you drew my attention to, Ian, and that was the post office software implementation problem in the UK, where a lot of people, small proprietary-type business owners of the postal outlets, lost money because the software was giving incorrect answers.
Yet it was all kind of swept under the mat, and it's ruined a lot of people's lives. So software has been around a long time. We use it all the time; we're using it now just to have a Zoom. However, if we don't pay attention to the problems, I think there can be pretty grave consequences. If you add to that the power of Twitter, and the number of shows that have been on TV about the effects on young kids of just using chat on Twitter, et cetera, and having suicides as a result of it, we really, really, really have to take this very, very seriously and get some framework and structure around it. And, sorry, just to add my two cents there as well, that's where regulation and policy come in. So the Commonwealth Department of Industry has been doing some great work in this space. Check out their rapid review and their interim guidance on the use of AI for Commonwealth government. The New South Wales government also has a policy and framework out there. But the EU Act was just introduced last week, I think, which outright bans some uses of AI. So there are frameworks in this space. Yeah, I think the rest of the world has to follow the EU; they tend to do it first. I don't think it's fully passed as yet; it's pretty close. But yeah, that will certainly have some implications for where AI goes. Can I make a quick comment, Mark? I noticed in the chat there's a bit of discussion around expert users. And this is really interesting. So my son is obviously of the generation that's most used to this. He's trained very seriously in maths and data science, and so knows exactly how these things are working and understands them very well. He tells me about some of the advanced prompt engineering that is being used with these tools. He's doing it within his business life.
And there are some really creative, really interesting processes where you can use really good prompt engineering to manage much better what you get back from AI and really increase the quality of what it's doing. So for expert users in that space, there's a lot of potential, I think. Okay. Absolutely, and that's where, in my own view of the world, you'd scratch out the word "expert" and write in "responsible", because I don't know if any of us can call ourselves experts when the technology is moving in such leaps and bounds. But I think a responsible user of this AI technology would understand the sorts of ethical conundrums and challenges that can appear. So we've touched on selection bias, or bias in the training data. We've talked about monitoring and assuring that the outcomes are actually fair and that decisions are traceable and transparent. I think the best way to do that is to have a good governance regime, evaluate the tech to make sure it's doing what it's expected to do, and continue to upskill yourself and remain capable in the space. Thanks, Christy. I've noticed some questions coming through in the chat, so what I propose we do now is turn our attention to those — but if someone has a particular view they'd like to share beyond our panel members, please stick your hand up and I'll try to get to you. One of the questions I think is interesting — I just went past it; I think it's from Caroline Henwood — Christy, is there a third perspective here? How are evaluators using AI in evaluation, and what does this mean for comparisons and synthesis across evaluations? And I guess one of the risks I see is in the evidence people give us. There was a piece in The Conversation just this week about a long-term study that had been collecting qualitative data: respondents are now using ChatGPT or other tools to fill it in, and the researchers have noticed the quality of the responses has changed and is wrong.
So that's dubious data. But what about us as evaluators using it to make the process easier, or potentially to get greater returns if we're trying to maximize our income? What are your thoughts? I'll start a bit tongue-in-cheek with the classic evaluator answer, which is: it depends. When you talk about use, what do you mean by use? I would say using it to write your whole evaluation report is unethical — that's not an ideal use. Using it to help you sketch a writing plan would be a wonderful use if people are unfamiliar with that; using it to get a summary sense of the literature on a particular topic, or to understand who the prolific authors in an area are; even using it to do summaries of swathes of stakeholder consultation information or survey information, as an initial high-level analysis — that would be an appropriate use. So I think it's about breaking down what we mean by use. If we outsource our job entirely to the machines, we might as well resign and go to McDonald's, right? So it's about how we work in tandem with the technology. That's why I love the term "AI-enabled": how do we enable our work through this technology? But there are perils. For example, at Grosvenor we are looking at Copilot and whether that's a solution for our business to help us speed up things like conducting reviews or synthesizing large amounts of data. But we hit a snag. Last week the person in charge of investigating it found out that this technology actually wants, and takes, access to your whole SharePoint — not just the specific, limited databases you granted access to. So your data's out there forever. We've actually just put a pause on that, because we never want to share data that isn't anonymized or de-identified — private data.
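Christy's concern about private data leaving the organisation is one place where a small technical safeguard can help. As a minimal, illustrative sketch only — the patterns and replacement tokens here are my own, and real de-identification needs far more than a couple of regular expressions — a pre-send redaction pass might look like:

```python
import re

# Illustrative patterns only; real de-identification needs much more care.
PATTERNS = {
    "[PHONE]": re.compile(r"\b0(?:[ -]?\d){9}\b"),            # Australian-style 10-digit number
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),   # simple email shape
}

def redact(text):
    """Replace obvious personal identifiers before text leaves the organisation."""
    for token, pattern in PATTERNS.items():
        text = pattern.sub(token, text)
    return text

comment = "Contact Jane on 0412 345 678 or jane@example.org about the survey."
print(redact(comment))
# -> Contact Jane on [PHONE] or [EMAIL] about the survey.
```

Note that the name "Jane" survives: catching names, addresses and quasi-identifiers needs proper entity recognition or manual review, which is why a pause and a governance review, rather than a quick technical fix, is the right instinct.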
So the context within which you're using it matters: whether you're using the free version online or a private enterprise version, and what you're using it for. But I do think it's an area where we need to understand more about what's happening in the field so that we can learn from it and share better practices. Thank you, Christy. Ian or Nigel, did you have any other insights you might want to add? Okay, no, that's okay. I noticed Bill Wallace from the society has posted a link to another learning opportunity, one coming up on April the 16th: another seminar, Artificial Intelligence 101 for Evaluators. So if you're interested, that could be an opportunity to have a look. Are there any questions people want to ask online — you know, using your voice, sharing your video — just while I have a quick look? Can I just step in for a second, Mark? It's Bill Wallace, who's CEO of the AES. Thank you, everybody, for presenting today — really interesting — and thank you all for attending. I just want to bring to your attention that we're currently in the midst of organizing a festival during May that will be dedicated to AI. We'll be sending around a save-the-date shortly. Thanks, Bill. John Pilla, Mamello — I will get to you in one sec. I'll just start on John's question, so won't be a second. John asked: are these risks any different to a novice using Excel to undertake a statistical analysis without an understanding of the assumptions embedded in the different statistical methods? Nigel, you might have particular views on this one. Yeah, I was actually in the middle of trying to type an answer. Yes — in many ways the risks aren't really any different. What you're talking about is a technology that has enabled people to apply sophisticated methods very easily, and that's not a bad thing.
I mean, if you go back to a 1960s statistics education, a significant component was how you calculated the variance, because it was a really intensive calculation. So part of the education, the training, was how you could efficiently calculate the variance. You don't teach that in 2024. But what it meant was that to be able to use any of those tools in the 1950s or 60s, you had to have that education; otherwise it wasn't practical. So I think the risks are very similar in nature. However, in scope I think they're significantly greater, because the entry barrier, the entry threshold, is lower — you don't even need to know basic Excel; you can just write and ask questions. And I think the complexity and scope of the potential applications is much greater as well. I mean, I have no concept of how broadly this could apply, way beyond anything I'd ever understand. So yes, it's a very similar risk, but a more significant one, purely for those two reasons. Thanks, Nigel. And having seen students who feared numbers — Ian has seen the same sort of student, as opposed perhaps to Nigel's, who I hope would have liked numbers if they were in a maths course — I have no doubt that students will rush to this and adopt it very, very quickly. The biggest risk is understanding. If it's easy to access, easy to do, and you get results, that's fine — except the people who then go out from training institutions into the workforce are getting results they don't understand and couldn't derive themselves. And that, I think, is a problem, particularly when it all happens so much faster than it did previously. It was easy enough to go and buy an assignment, but you could potentially get caught through other techniques. Now catching it is going to be a lot harder — a lot, lot harder.
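Nigel's variance example, and John's Excel question, can be made concrete. Even "the variance" quietly embeds an assumption: whether your data is the whole population or a sample of something bigger (Excel exposes this as VAR.P versus VAR.S). The tool will happily compute either, and a novice may never notice which one they asked for. A small Python illustration:

```python
import statistics

data = [4, 7, 9, 10, 15]

# Same data, two answers, depending on an assumption the tool never questions:
pop_var = statistics.pvariance(data)   # divide by n: data is the whole population
samp_var = statistics.variance(data)   # divide by n - 1: data is a sample (Bessel's correction)

print(pop_var)    # 13.2
print(samp_var)   # 16.5
```

On this tiny dataset the two answers differ by 25 per cent, and which is "right" depends entirely on a modelling assumption — exactly the kind of understanding that a low entry barrier lets people skip.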
Interestingly, just on that, Mark, I was having a conversation with someone today who shared that her daughter was pulled up by her teacher for using AI in an assignment. We talked a little about that example, and one of the reflections I had was: where should people learn how to use AI, if not in school? So, okay, using it for an assignment isn't great, but isn't school exactly the place where we need to be teaching people how this technology works and how to operate it responsibly, as we've been talking about, so that when they do hit the workforce, they know all those implications and can navigate through them? So I think it's important to address it head-on. Yeah, I agree. The challenge is we've already seen what's happening to student grades: if you're not adopting the use of AI, a good student will be surpassed by students who in the past would not have been as good as them. So the pressure is on for everyone to adopt AI, and I think that's disappointing, but it's a risk and we're going to have to prepare for it. I know, Mamello, you've had your hand raised for a while, so do you want to ask your question, please? Hi — am I coming through? I think I am. My question is a follow-up to Christy's response regarding the use of AI in evaluation, though it's directed to all the panelists. I'm interested to know: if they have used AI in evaluation, how did they use it, and what was the outcome? Well, I'm happy to jump in there. Surprisingly, even though I read about and talk about this topic a lot, I haven't actually used it that much, so I have more of a theoretical understanding and interest in the space. I'm also incredibly old school — the first thing I teach new consultants who come to us is: do your analysis with paper and pen, put the keyboard away.
So I think it'll take a while for me to move past my Luddite-like tendencies and integrate it into my own work, despite knowing very well the opportunities and the need to do so. The thing I am looking at for our business, though, is like that analogy with students that Mark set out: are we about to be in a race to the bottom? If our competitors can undercut us by using AI technology to write their reports and do substantive data analysis, and we continue to price for human effort, what does that mean? So that's the risk we're looking at over the next five years. But no, I haven't used anything yet, mostly because I love writing, I love creative writing and getting into the flow — but that's just me. Okay, if I could add to that: I haven't used it at all, but two things quickly come to mind. On the Q&A show on the ABC this week, the question was asked: what happens when you have rubbish information coming through AI? The answer given by the panel was that everybody will revert to trusted sources. Nobody ended up asking what the trusted sources were. And secondly, Mark, I came across an article just recently confirming something I've been suspicious of for some time: because of time pressures, money, and the need to get your accreditation up, we've got a lot of university people publishing papers where they've used AI to help put the papers together. A further study has found that a lot of that research could not be replicated on review, which raises some serious questions about where research is going. So I know that's different to the question you asked, and the answer was I haven't used it. But the last comment I'd like to make is that Christy again mentioned how wide this field is, so that when we talk AI, each one of us turns around and says, oh, they mean such-and-such — we make an assumption, which came out in John Pilla's question. I use the word assumption, which I love.
The bottom line here is: what do we mean? We could probably go back to Bill, for instance, if we had enough time, and say: what do you mean by what you're going to offer at your session in April, Bill, with regards to AI? So that people can say, oh, it's ChatGPT I'm really interested in, or whatever it may be. Sorry, Mamello, that's all I've got. Thank you, that was good. One other comment from me, which isn't directly about evaluation, because I don't do that, but I have used it to try to write software. My general conclusion would be: if it's something relatively standard, it's very good. But if you're trying to do something out of the box, it can't have seen that before, right? So it ends up pulling things together, and it sometimes pulls together things that actually can't go together. So it will never be a solution — not just "you've got to tweak a few things"; it will never be a solution. I think that's another thing for people to be quite aware of in terms of those uses. The comment earlier about your cover letter — absolutely great for those sorts of things. But the more creative you are, the more you're at the edge of knowledge, the less and less reliable it will be. Just to pick up on Ian's point as well, and to take Bill out of the hot seat: the 16 April session is me again, along with Gerard Atkinson of ARTD — a giant, gigantic brain. That's AI 101 for Evaluators. We'll be going through common terms and terminology and some of the theory and history, and then Gerard will talk through some of his applied work as well. So that's what that one is about. Thanks, Christy. I guess as the facilitator of the panel, I can say yes, I have used ChatGPT with a client, but it wasn't for evaluation purposes — it was more of an innovation and research space client. And we got some very interesting results.
If you weren't an expert in that space, you would certainly have been led up the garden path, around the corner, and probably through quite a few rabbit warrens as well before you realized it was giving you rubbish. Some of it was good, but some of it was putting things together where you'd go: I don't think so. And it would give us references for them, and we'd hunt the references down and just wouldn't find them. So yeah — remember, it's probabilistic. It was putting stuff together that looked really, really good, except it wasn't. It didn't just give titles — it gave journals, authors, the whole lot, and we went looking for these things and they don't exist. You're saying it made them up? It's a probabilistic model — it's creative, it's generative. So, any last questions? Because I know we've got about a minute to go. Does anyone want to jump in? I wanted to give a little plug for Christian Tremonga: for anyone who's interested, he's doing a master's research project on the use of AI in evaluation. The details are in the chat — if you're able to support his master's research, that would be amazing. Okay, thank you. And I did notice someone — oh, it was you, Christy — made reference to the CSIRO National AI Centre. Awesome resource, so check that out as well. So, I know we've reached the time to end. Thank you to everyone who came along today and posted questions in the chat. Especially, I'd like to thank our panel members: Ian, Nigel and Christy. We appreciate your time, your comments, and your thinking before today's session so you could talk to us all. I wish you all a very good evening and look forward to seeing you at a future event.