Hi and welcome, everybody, to our afternoon seminar. I'd like to begin by acknowledging the traditional owners of the lands on which we meet, the Ngunnawal people, and pay my respects to Elders past, present and emerging. My name is Ruth Nichols and I'm one of the members of the AES local Canberra committee, and today we're really pleased to welcome one of our committee members, Scott Bailey, to give us a presentation on why programs fail.

Before we get into the presentation this afternoon, just a couple of housekeeping things. First of all, I think most of you are on mute already, and if you're not, could you please put yourself on mute. We will have opportunities for discussion as we go through the seminar today, and that will be a point at which you can come off mute. You can also put questions in the chat as we go along, and when we get to those discussion points we'll have a look at what's come up in the chat box and use that as an opportunity to support our learning. The other thing I wanted to let you all know is that our session is being recorded today, so that's just something to keep in mind. The AES has a YouTube channel where we publish recordings of seminars. So I think, without further ado, I'd like to welcome Scott Bailey for his presentation. Thank you, Scott.

Thank you, Ruth. Hello, everyone. I'm doing double duty here, trying to admit people into the session while paying attention to what I'm going to say, so we'll work it out. I want to talk today about the symptoms and causes of underperformance in government programs and what that might mean. And I'm waiting for this PowerPoint slide to move forward. There we go.

So I want to have a little bit of discussion about examples of failure. I'm going to share some of mine, and I'm going to ask if you'll be willing to share some of yours. I'm going to talk about the common symptoms of program underperformance and what they might mean; what the main causes of underperformance might be in government programs; and the reasons why an impact evaluation might conclude that a program is ineffective. And then, if we have any time left, I'll talk a bit about key challenges for evaluators in assessing social programs. I'm just going to pause for a second and let some people in.

Thanks for doing that, Scott.

Okay. This is very slow, this internet. So here are some examples from my own experience of different cases of underperformance. The exact details are disguised to protect the guilty, so to speak, but these are all real-world examples. Some colleagues and I did a study on mental health services for people in crisis. We pulled 900 patient files, and each of those files was meant to have a plan for the patient, the client. We found only about 40% of them actually had the plan that was required. Now, the department would say, well, everyone had a plan, we just didn't put them on the file. That may or may not have been true; I never really knew. But one thing was certain: their quality control fell down. That's without doubt.

Another piece of work I did somewhere else looked at responses to child protection allegations. These would get prioritized as high, medium or low. What we found was that over a 12-month period, only 30% of the high priority allegations got a response within the department's own timeframes. So that's a lot of kids, notionally at high risk, who didn't get investigated.
Another example of mine was a national grants program I looked at. As part of the grants process, a community agency would get a grant, they'd spend the money, and then they were supposed to have an audit report back to the funder, and then the grant would be acquitted, that is to say, signed off. What we found was that over a three-year period, only about 50% of these grants were ever actually acquitted. And this was a program with hundreds of millions of dollars every year. So those are some illustrative examples from my own experience.

I was wondering, though, if people would be willing to share some examples from their own experience of what looked like underperformance in a government program. I'd invite you not to identify the agency or the clients, but just give a rough flavor of the nature of the underperformance. Would anyone like to offer an example or two? I know it can be tricky sometimes to think of examples, but I can see a hand from Harry. So go for it, Harry.

Good day, Scott and Ruth, can you hear me?

Yes, we can hear you.

Yep. So look, just to get things kicked off, I can talk about one that we've published, so we're not too embarrassed about it. We worked on an app to support university students who might get discouraged early in their studies, to help them stick it out at university. It was something we did with funding from the Department of Social Services as part of their Try, Test and Learn Fund. BETA, the Behavioural Economics Team based in Prime Minister and Cabinet, spent a lot of time developing the app and thinking about what would help students, and we did a lot of user testing. We did a pilot rollout and then the eventual rollout, and in terms of student retention, which was the main thing we were looking at, we didn't see any improvement. We also had a look at grades, and we couldn't see much improvement there either. And it looked like, although we'd spent a lot of effort trying to make the app sticky so that people would keep using it after they initially signed up, a lot of the problem was a big drop-off in usage after the first week or two of downloading the app. So there wasn't much prospect of it impacting on retention or results if students weren't using it much. That hurts, because a lot of time and effort and imagination was poured into that for seemingly little return. That's our example. Thanks.

Oh, thank you, Harry. That's a great example. Would anyone else be willing to share one of theirs? You're welcome to put your hand up like Harry did.

I know, Scott, in my experience doing independent evaluations, I've certainly seen some programs being implemented in a very short period of time and expected to show impact results within less than a year, while the program is still really just trying to get itself organized. That's been an interesting phenomenon I've seen on a couple of occasions. Kim?

Yes, hi. Not quite sure how my camera is going, but hello, Kim Gray here. I was musing over an example that I'm not going to name, and I think the big challenge with this one was that there was a big difference between the program's expectations, the settings and the way it was supposed to work, and what it actually was like and what the participants felt was of value to them. So their experience of the program was not very satisfying.
And therefore, data that could have been used to talk about its performance wasn't as informative as it was thought to be. So I put that down to a very big difference between the way the policy designers thought the program worked and the way the participants experienced the program.

Okay, so we've heard some examples about implementation, about design, about strategies not working. Do you have your hand up, Julie?

Yeah, I can't find the little reaction buttons, so I'm using my little hand. I've got one that you might even remember; it's affectionately known as the roads program, and it was about improving governance. So it's a real issue of how you define success, or failure, but basically you're applying a simple solution to a very complex problem, and then it's likely going to fail. The one I have in mind was a sort of anti-corruption process, and I can't remember the details of the middle bit, but the theory was: if we build roads, we will improve governance and reduce corruption. And that was not a success. I think that would be a failure. Does that ring any bells, Scott?

I'm laughing because that's one of my favorite examples. "It's going to work." What a wacky idea.

Hey, that's my best one.

Excellent. All right, let's move on a little bit, if the IT will cooperate. We're having lots of IT problems here. I'm going to talk a bit about symptoms. Some common symptoms include quality problems, client complaints, staff turnover, insufficient outputs, wastage, inefficiencies, ineffective programs, programs that are not responsive to their client group, limited external political support for the program, adverse publicity, inadequate reporting, excessive waiting times, and criticisms from external watchdogs like parliamentary committees or the Auditor-General's office. A whole series of things can happen.

And these are important because, without the perception that there's a problem, performance improvement has no starting point. That is to say, program managers and public officials need to feel some kind of performance pain. The question is, where does it hurt? Performance pain in public programs can come from several interrelated sources. These include inadequate production of goods and services: the program just doesn't produce enough. The program's goods and services are of insufficient quality. Too many resources are being consumed by the program. The program is ineffective; it fails to fulfill its intended purposes, for example, unemployment for youth isn't being reduced. There's the issue of client dissatisfaction and complaints, and the mirror image of that, of course, is staff dissatisfaction and turnover. There are conflicts with other related organizations and coordination problems; that's quite common between federal and state agencies. There's inadequate adaptation or innovation in the program, a failure to respond to changing client needs or external circumstances. Performance reporting can be a problem. And I mentioned lack of political support, adverse publicity and criticism from watchdog agencies.

But all these problems are actually symptoms of underlying causes. So figuring out where it hurts sets the scene for performance improvement, perhaps in three ways. It determines whether there is a perceived need for change to improve performance. It helps to identify the sources of performance pain. And public sector managers also benefit from understanding who cares about a performance problem, and who's willing and able to do something about it.
These potential underlying causes include the program's mandate: the mandate could be unclear, the program lacks authority, or there's confusion of roles across agencies. There's the issue of strategy: the program's theory of change is faulty, the assumptions are untenable, or the strategy does not fit the program's environment; that's another common one. Perhaps it's structure: the program's structure does not fit its environment or its strategy, and there are inconsistencies and tension. One of my favorites is the topic of performance leadership: the leadership style is incongruent with the program, or there's inadequate governance and accountability, or a limited focus on using performance feedback to drive continuous improvement. Culture can be an issue, if the culture and incentives do not support a focus on continuous improvement and achieving results. The organization's systems can be an issue, in terms of policies, systems and processes failing to support effective program management and service delivery. And finally, the topic of resources: the level of resources, financial, physical, people, technology and operational capacity, is inappropriate for the program's design and systems.

So what I'm trying to argue is that we have some very common symptoms, and most of us have bumped into them at one time or another, but these symptoms are manifestations of underlying causes that are the more fundamental drivers of the problem. I'd be interested in hearing from you at this point: how does my list of symptoms and underlying causes fit with your own experience? Does it make sense? Is it logical? Is it something you've observed in your employment? What's your reaction?

Scott, it's Peter Graves here. I can't quite put up my hand, but can you hear me?

Yes, Peter, thank you.

I'd like to support you, quite literally, on the lack of political support as a reason why government programs fail, and their reforms too. Some of my comments are quite relevant to what's going on with the current government bringing back program evaluation. I always think about climate change, which the Prime Minister of the day thought was the greatest challenge of our time. We had a climate change department formed in 2007 or thereabouts, and it all fell in a heap when the Prime Minister changed and Prime Minister Abbott thought that climate change was crap. So having senior leadership that is actually continuous is extremely important. In that regard, some of my recent research has been about the management reform of managing for results. It never went on long enough; while it ran for 13 years, it stopped because the support at the top changed, and it is now being brought back. So there is a factor of extended time in a lot of what you're talking about.

So Peter, this lack of political support, perhaps that relates in some way to changing governments from one political party to another?

No, I think it also involves changes of ministers. I think of Martin Bowles when he was head of Immigration, trying to bring in an evaluation culture there. He brought in Wendy Southern at a very senior deputy secretary level, and when he moved over to Health he brought Wendy with him to bring in a similar evaluation culture. And when he had a falling out with the health minister of the time and left, all of those attempts at bringing in an evaluation culture in Health fell over, because there wasn't one to begin with.
So yes, unfortunately, very senior people literally have too much influence on the evaluation of APS programs.

I've certainly observed the emphasis on evaluation wax and wane in the Commonwealth over the years, and in the shorter term depending upon the emphasis of the leadership team of a particular department. Without mentioning particular departments, I can think of three or four where either the focus went onto evaluation with a secretary change and then went off the boil, or the exact opposite happened. And it sometimes disturbs me that a focus on managing for results depends upon the personalities at the top, and that it's not more systemically grounded in our systems and practices. That's just my issue, a little bit. Would anyone else like to offer some comments on my list of symptoms and causes and their own experience?

I just want to let you know we've got a comment from Lou, who doesn't have a microphone. Lou says: we've experienced a range of these with the programs we oversee; a lack of consideration of evaluation in the establishment phase has been very problematic for us.

I can think of a number of programs that were established, funded and run for a number of years with no baseline data collected and no M&E frameworks put in place. Then I can think of some where, 15 years into running them, it was, oh, maybe we should do an evaluation of this to see how we're going. It intrigues me how things can go on for that long. I'm wondering under what circumstances a problem is amenable to a solution that's called a program, a programmatic solution if you like. Because maybe it's unrealistic to think that programs can address or ameliorate all sorts of social circumstances that are less than desirable, and so programs overreach what they're really capable of doing in the real world. And in government, the real world is that there's an election cycle, that ministers come and go and each have their favorite priorities, that secretaries come and go and each have their own biases and prejudices, because that's how people work. What are the limits to programmatic solutions?

I'm attracted to your idea in a couple of different ways. On the one hand, I think it's fair to say we've got some great examples of important policy success. The reduction of teenage smoking in Australia: I think what's been accomplished there is nothing short of amazing. But then I can think of a couple of examples at the other end, like the Australian aid program in Papua New Guinea, with investments designed to reduce corruption. And gee, that's a real stretch. We have small sums of money; what's our political influence, what's our understanding of the country, what's our linkage with key decision makers, what's our ability to influence the incentives of Papua New Guinean officials? That whole area makes me think it's a huge stretch on our part. I'd be interested in other people's views.

Scott, it's Peter Graves again, just supporting you on that PNG comment. Someone in a place to know told me that the PNG people don't work out the results of their own programs when governments change, because the new government doesn't have an incentive to know the results of what the previous government achieved. And it's all a different course of events.

Thinking closer to home, I think the topic of youth unemployment has been a thorn in government's side since, oh Lord, at least the 1980s; I can think that far back. What do other people think? Kathy?
I used to work in Jobs Victoria in the Victorian government, and it's quite interesting how you pointed out youth unemployment was a challenge as well. I remember us trying to set our goals in the initial program design too. You can set those goals, but there are also other factors at play, other social issues. It's kind of similar to a wicked policy problem, where you might define the problem and be able to identify the goals in the initial program phase, but when you actually have to implement, you're dealing with other complex challenges as well. And so that's where evaluation can be quite challenging: you're working towards implementing the program, but then other problems arise as well.

Thank you. That's a good point. Other comments or observations?

Kim here. I'm musing, hi Scott. Yes, I'm musing over the slide you have up with symptoms and causes. And I like the way you've talked about needing to feel pain, needing to actually have a reason to explore something, some signs that there's a problem. But they're not the cause, and then you've got a smaller list of causes. So I just wondered what you'd say about the relationship between symptoms and causes. Do you see any relationship, or is this a much more complex question?

Yeah. I've mulled that one over, and I think it's fantastically complicated and probably context dependent. A particular symptom could be related to one or more different causes. Pick something: staff dissatisfaction and turnover. Maybe that's because they're overworked, that is to say, they have inadequate resources, so maybe that's why they're dissatisfied and they turn over. Maybe they feel they don't have the mandate or authority to do their work, so they're being held accountable for things outside their control, and that's the source of their dissatisfaction and turnover. Or maybe they're unhappy with the leadership of the agency, and that's the source. So I can't imagine there's necessarily a direct one-to-one correlation where a symptom is always related to the same cause. It's perhaps more diagnostic: well, I've got this cluster of symptoms, and now I'm going to have to work backwards to try to identify the underlying causes. I'd be interested in your thoughts, but my initial reaction is it's not going to be a straight one-to-one relationship.

Yes, I'm sure that's the case, Scott, sorry, I was just responding briefly. I'm sure that's the case. And I would also argue that you need to explore more than the symptoms you've been able to observe. I think there's a lot more to unpacking the causal drivers than what you can necessarily observe; there may be things you need to do more research into to understand what could be going on. I think looking for symptoms is a great way of thinking about what you're doing when you're linking monitoring data and monitoring indicators through to unpacking an evaluative question. I thought that was quite a nice way to think about what you're offering here: finding out where it hurts is a question more like monitoring. Does it hurt anywhere, let's just check in. Not finding anything doesn't mean that it's necessarily going well; you just may not have enough information yet.
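To make the diagnostic idea in that exchange concrete, one way to picture "working backwards from a cluster of symptoms" is as a many-to-many mapping from symptoms to candidate causes. The sketch below is purely illustrative: the symptom labels, cause labels and the mapping between them are hypothetical placeholders, not anything presented in the seminar.

```python
# Minimal sketch, assuming hypothetical symptom/cause labels.
# A many-to-many mapping: one symptom can implicate several causes,
# echoing the point that there is no one-to-one correlation.
SYMPTOM_CAUSES = {
    "staff turnover": ["resources", "mandate", "leadership", "culture"],
    "quality problems": ["systems", "resources", "culture"],
    "client complaints": ["strategy", "structure", "systems"],
    "coordination problems": ["mandate", "structure"],
}

def candidate_causes(observed_symptoms):
    """Work backwards from a cluster of symptoms: count how many of
    the observed symptoms each underlying cause could explain."""
    tally = {}
    for symptom in observed_symptoms:
        for cause in SYMPTOM_CAUSES.get(symptom, []):
            tally[cause] = tally.get(cause, 0) + 1
    # Causes consistent with the most symptoms are worth probing first.
    return sorted(tally.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    cluster = ["staff turnover", "quality problems"]
    print(candidate_causes(cluster))
    # e.g. [('resources', 2), ('culture', 2), ('mandate', 1), ...]
```

The only point the sketch makes is the one made in words above: a single symptom rarely pinpoints a cause, but a cluster of symptoms can narrow down where to look first. Scott picks the diagnostic thread back up below.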
Yeah, and I had the poor judgment once to apply some diagnostic work to symptoms in an organization I was working in, which I won't name, where we had some real quality problems. I did some five-whys analysis on their quality problem and I also did a problem tree analysis, and that raised some very sensitive issues very quickly. And it wasn't received well in senior management, at the SES Band 2 level and upwards. I guess I was pointing at things that were (a) incredibly problematic, (b) not necessarily amenable to easy problem solving, and (c) not helpful to particular individuals and their reputations. And so, whilst I thought my work was reasonably insightful, it didn't go anywhere, because politically there was no support for it. That was one of my comments a moment ago: it's one thing to identify symptoms, but another is who's interested in the symptom and who's willing to work on it. And I think those are perhaps the really important diagnostic questions, rather than a more mechanical "this symptom might be a reflection of that cause". That's just an initial reaction from me.

I just wanted to mention that Raul has posted a couple of observations in the chat box as well.

I've seen those. He raises the idea of the brain drain from the public sector impairing public sector capability on various fronts. Yes, it's interesting that our current government has spoken about the need to rebuild public sector capacity. That was something raised in the Thodey review of 2018, and the Commission of Audit, I forget what year that was, 2012 maybe, raised this issue about public sector capacity too. I agree that it's an issue, but I honestly don't know, at least at the Commonwealth level, what the government is going to do about it in such a tight budget environment. One of the things governments are conscious of is the number of public sector FTE positions and the associated salary budget, and there are actually incentives for Commonwealth governments to employ consultants, because they're treated differently on the balance sheet. And so it's been an issue for a number of governments: they can say the total number of public servants has been flat in the Commonwealth for the last number of years, and that's almost a point of pride, while they spend more money on consultants. But of course at the same time capacity is being outsourced, and corporate knowledge is being outsourced as well. So Raul makes a great point, but I honestly don't know what the solution is, especially in a really tight budget environment.

But having said that, maybe I'll contradict myself. This was the argument that Bob Hawke and Paul Keating used in 1986, when the budget was a bit tight. They said, you know what, we've got to do better at spending our money where it counts, and they pushed evaluation policies and made evaluation mandatory. Paul Keating was trying to identify programs that worked and didn't work, so they either had to be fixed, or they'd be cancelled or wound back, and that way he could generate additional money to spend on other programs that worked better. Which makes great sense, and personally I'm really supportive of it. There was only one problem, though: they got the incentives a bit messed up. Programs realized fairly quickly that if they did an impact evaluation of their own program and criticized it, next time around they got a budget cut. And there were about one and a half rounds of that before every department doing a self-evaluation said, hey, we're great, everything's good.
So the incentives got distorted. Other countries do it differently; Chile is an example. There, the big major impact evaluations aren't under the departments' own control: their equivalent of Prime Minister and Cabinet leads them. They let line agencies do implementation studies, but the big strategic, high-visibility impact evaluations are actually managed by the centre of government, usually done by consultants. Other comments or questions?

Scott, if I could come in on that, again bringing in an international example that is ahead of Australia, and it is in America. Through a piece of law actually signed by Donald Trump, the Foundations for Evidence-Based Policymaking Act, black letter law in America requires every agency to have a chief evaluation officer and to have an annual evaluation plan. That is a good start. What happens to the reports afterwards is a separate issue. But in America it is now mandatory, obligatory and legally required to do program evaluation.

Yes, the AES input into the Thodey review recommended that all departments have a chief evaluation officer or chief performance officer, equivalent to finance and HR, trying to locate responsibility for that function. I'm not sure there's much support for that, although we do have the evaluator-general argument that's been topical for the last couple of years here, but I'm not fully sure how that's going to play out in our space just yet. Julie's there, Julie's there.

Hi Scott, I'm just wondering where you would include a lack of stakeholder involvement in the framing of the situation of interest. Would that be an early symptom, if they weren't involved in framing the situation of interest and defining what success and failure look like? Or maybe it's a pre-symptom, or do you have another word for that?

Well, I think that's certainly an issue. Whether it's a symptom or a cause I'm not sure off the top of my head, but I think you raise a brilliant issue. It's old school, but I've always liked Joe Wholey's five steps to program management. He was the chief evaluator in the federal US health department some years ago, and he'd say: step one, engage with your stakeholders and try to develop a consensus about what the problem is and what needs to be achieved. Step two, design programs that potentially might work to address this problem. Step three, implement the program while you evaluate it from a whole range of different value perspectives. Step four, use your performance feedback to drive continuous improvement. And step five, communicate back to your stakeholders, your funders, government and the public about what you've achieved. That was his idea of the continuous improvement cycle, certainly in my work in government. On step one, developing a consensus with stakeholders about what the problem is and what a good outcome looks like: that always comes back to bite you if you don't get it right up front. And I understand why sometimes we don't, because there's pressure to act and spend money and be seen to be doing things for political reasons, and I get that. But that kind of problem never goes away if you don't sort it out; eventually, whether it's five, ten or twenty years, it will come back to you again. And I see Kim holding up her hand. Yes, Kim?

I was thinking maybe this idea of a lack of stakeholder involvement at the start is something that we see working in Indigenous affairs, where we're thinking about whether the right people are involved.
We think a lot harder about that now that the Closing the Gap partnership is requiring more work on engagement and partnership. So I would argue maybe that it's a risk indicator rather than a symptom. Maybe there's a set of risk indicators that could be correlated with signs of potential program failure.

I hadn't thought about it like that before, but I kind of like your idea; I think there's something good in that. Yes, Julie?

I just found my hand-raise, so thanks. I was going to say that's an interesting idea, Kim, risk indicators, signs that something's wrong. And I think that might help us decide what is amenable to a program solution and what needs much more systemic change. I think one of the issues that might come up is that some failures are always going to be failures as a program, because they're long term, and as you said, Scott, they're always going to pop up later on. We might kind of squash them down, but if the systemic conditions that are actually holding them in place are still there, then nothing really is fundamentally going to change.

Other comments, observations, experiences you're willing to share?

Yeah, I was just going to say, Scott, there was a comment from Lou I just want to acknowledge. Lou put in there that their section was established through an NPP specifically, and that was to enhance program governance, conduct evaluations and develop internal capability.

An interesting trifecta there. It does intrigue me that not all new policy proposals will have a budget allocation for evaluation, for example. It doesn't seem to be a fundamental precondition for setting yourself up to track how you're going as you go along. Yes, Julie?

Well, Scott, you're assuming that the intention is to have success and change something. The intention might be just to be seen to be doing something. Don't pretend to be naive here, Scott.

You're quite right. What's his name, Duncan Fraser, wrote a series of articles in the evaluation news and comment back in, oh Lord, the early 1990s I think it was, and he argued that programs were of different types. One type was the "I'm here to solve a problem" type. There was another that was "I'm just here to give the illusion that we're doing something, because it's basically uncontrollable". He had a couple more types that I just can't remember off the top of my head, but yes, some were functional problem-solving programs and others were more political and symbolic in their intention, so you make a good point. Another one I've just thought of: it could be "I'm just biding time until I come up with the solution", you know.

Oh yeah, or a whole kind of mix of those two that you suggested. I'm sure it would be a really fun thing to write that list.

Oh, and another one of his I've just remembered is to divert resources towards your political supporters. It wasn't that we were actually trying to solve very much, but we were diverting grant funds towards an area of the economy where we like those people, so we'll make sure they get a bit. Never seen that one in practice, no.

Comments? If not, I'll move on, willing or not. I'd like to suggest, and this is a related topic, that if you employ someone to do an impact evaluation, there are four, and only four, explanations why the impact evaluation might conclude that the program is ineffective.
And I mean it; for the life of me, I can only think of four. The first one is strategy. I'll use a silly example, but you'll get the point: if you're teaching farmers to burn incense to improve their crop yields and thereby raise their income, it doesn't really matter how much incense they burn nor how well they burn it; it's just not going to work. Sometimes the fundamental strategy is just wrong, and it doesn't matter how much you do it or how well you carry it out. So that's one reason your impact evaluation can find your program doesn't work.

The second one is that your strategy is fine, but you're not actually able to implement it with integrity, that is to say, the intended services aren't getting to the right people. There was a really amusing evaluation I read some years ago in the US. These people were supposed to go door to door in inner-city urban areas in Chicago and teach people how to maintain their homes, basically repair them. The program was funded, it was evaluated, it ran for 10 years. There was only one problem: the program as designed was never implemented. They never actually did the things the program said they were going to do. The program staff decided the original plan was silly, so they did other things instead. Rather than teach home handyman skills to these inner-city residents, how to fix their doors and repair windows, they went into advocacy mode and taught these people how to be community activists and lobby government officials. I'll come back to you in a sec, Julie. So in that sense the program as designed was never implemented, although it did get evaluated, interestingly enough.

The third reason is that the program, its design and its implementation are quite reasonable, but something in its external environment changed, so what used to work doesn't work anymore. An example I've seen in international development is where you're partnering with an overseas government to do something in their country. They're highly supportive, Australia's doing certain things and the overseas government's doing complementary things, but then their priorities change and they stop doing their part of the agreement. So what used to work doesn't work anymore.

And the fourth reason on my list is that the program actually works just fine; it's the evaluation itself that is faulty. When I hear someone say an impact evaluation found that a program is ineffective, I'm always looking for these as the potential reasons: the strategy is wrong, implementation doesn't work, something in the environment has changed, or the evaluation itself is wrong. We never like to admit that last one, but it's a possibility. Julie, you had your hand up a moment ago.

Yes, Scott, just a little comment. I like your four ideas around this, but the example that you gave, the people door-knocking to help people maintain their homes: I thought that was so blatantly a theory failure, or a strategy failure, that everybody working on it could see it, and that's why they implemented something else.

Well, perhaps it's just my bias, but before I'd argue the strategy hadn't worked, I'd want to know the program was implemented first. It might be obvious up front, but before I declared a clear strategy failure I'd still want to know something about implementation. But I can think of other examples. Immunization, say.
We know immunization works; the science of that is well established, so we don't have to test that. All we have to be clear about is the implementation. I've seen examples in some overseas countries that I won't name where Australia was delivering drugs for immunizations to community health centers that were never used, never administered to children. In that sense the strategy of immunization was okay, but in implementation there was a delivery failure. Maybe that's a better example than my silly incense-burning one. What has been your experience, everyone? I'd be interested in examples, whether you think this makes sense or maybe I'm missing something.

Scott, we have an example from Lillian in the chat box about timing: knowing when to conduct the evaluation, too early and it may not show much in terms of outcomes, that's certainly my experience too, Lillian, and understanding the complexities of evaluation work, which sounds like it speaks a bit to the appropriateness of methods too.

Yeah, I like that one, it's a really good one. I do a bit of consulting work with Commonwealth Health, and one thing they do with their program logic models is they talk about short-term outcomes, medium-term outcomes, long-term outcomes and impact, and they make the program areas put timeframes on those. So, program area X, in what timeframe after the delivery of your outputs are you expecting to see the short-term outcomes, is it one to three years, five to seven, whatever? The same again for the medium term, and the same again for the long term. And that can be really helpful for two reasons: it forces you to think about what the trajectory of outcomes is, and, if you're an evaluator and you see this, it's a way of flagging what the expectations are at different points in time. So thank you, Lillian, I think that's a grand idea. Any other comments or observations?

Kim here. I just back up that comment, noting that a lot of the time you see a program logic or theory of change saying that short term is one year, medium term is up to three years and long term is longer. I really like the way you flagged that areas would need to work out what their expectation is, which I'm not sure people can really do very accurately, frankly, but I guess that's better than an arbitrary approach.

Yeah, I think you make a great point. I'm not sure we often really do know what the likely trajectory is, but it's a good thing to wrestle with and facilitate a good discussion about.

Over in the chat, Hannah's made some contributions. Hannah says: I work in the international environment and often see stakeholders in the countries we work in that have variable priorities. These might be determined by their government of the day or other external factors. It can make it interesting managing expectations, both when designing and evaluating programs. To add to that, at times stakeholders may not have the high-level insights to decide what is a priority and why, and this can impact what they tell us.

Yes, I agree with all of that. At the same time, programs are political creations, and ideally they should be reflecting, to the extent it's possible, a bit of a consensus amongst that political process and the stakeholders. I mean, sometimes governments just do things because they have to act urgently and they don't have time, responding to COVID could be an example, but it's not always like that. And we certainly have programs that have been running for decades.
And maybe the community hasn't really had much opportunity to have a say about a program and how it's being implemented or what it's trying to achieve. And maybe they have, but we just have to have the same discussion again. I've got lots of friends who are teachers. I don't have a background in teaching, but I've got friends who do, and the role of public sector teachers comes up every five years or so, and it has done forever, and I suppose it will continue to do so: the extent to which the public education system has responsibility for the narrow instructional aspect of students versus broader life skills, and there are issues like sexuality education and this sort of thing. So where the demarcation of responsibilities lies in the public education system seems to be a topic that's open for discussion every five years. That's been my impression.

All right, I might move on a little bit. This is my home stretch here. I'd like to make some comments about the key challenges for evaluators seeking to assess social programs. I got a lot of this from a former boss of mine, Pam Williams, back around 2000 when we worked together, and also from some of my own experiences in a range of different sectors. I think there are some common challenges.

One of these is: how are you going to decide on the evaluation's area of focus? Options include the changing nature of government policies over time; funding levels, desired outcomes and strategies in a particular sectoral area can change massively. So what are you going to focus on? You could assess the volume of outputs delivered, service coverage, adequacy of access, things like waiting lists. Service quality, sure. There's the impact on stakeholders, whether it's the immediate client, the family or the broader community, both intended and unintended effects, and interactions with other programs. These are all options. Personally, and I include myself in this, I think we've underdone the area of unintended effects; I really don't think we've given it great justice. Interestingly, I was working with a fellow in the ACT government, one of Raul's colleagues, and I won't name the program, but this person developed a negative program theory. That is to say, he had a positive theory of how the program would work based on various assumptions, and then he asked: what if every one of my assumptions didn't hold and the exact opposite outcome happened? That was the negative program theory. I'd never seen one before; I thought it was sheer genius, myself. It was a horror story if it all happened, but it was a great thing that he was aware of it.

Something else that comes up a lot is the equity of funding allocations across geographic regions or across different client groups. And then there are performance monitoring, evaluation and accountability systems. These are all topics that an evaluation could potentially focus on, and there's no immediate right or wrong to it. But gee, there's a whole range of issues there, more than you could possibly hope to cover in one evaluation, which means you have to make choices. That's really the essence of my message.

And then there are the systemic challenges. I see this a lot: the department hasn't developed a program logic model or theory of intervention about how a program is supposed to work, and any theories of action are really implicit rather than explicit.
If you interview people, they can explain it to you and they'll have a rationale, but it's very much in one or two people's minds. It's never really been codified, which means it hasn't ever really been tested either.

A common systemic challenge I see a lot is that the target group cannot be serviced at agreed standards within the current resourcing envelope. That means the department creates waiting lists, queues; they unofficially raise the eligibility requirements or narrow the intended target group. That's just their way of trying to manage with not enough money to service everyone in need. Allied to this is the issue of a free service having near-infinite demand, often without a clear target to measure against, so demand management, which is sort of government weasel words for demand reduction, becomes management's focus; not meeting demand becomes the new unofficial goal. And sometimes we don't have clear measures of successful outcomes or appropriate service provision, that is to say, we have unclear criteria and standards for service delivery. So, as an evaluator, do you want to open that can of worms? You could potentially suggest some, or try to work with your stakeholders to develop criteria and then evaluate against those.

We also find that workforce issues are common: staff shortages, the need for training, high turnover, low morale. If you've ever worked in a program area where 25% of the staff turned over every year, and I have, that means in four years 100% of your staff have turned over, and that has huge implications for learning and corporate memory. You can often find large amounts of administrative program data collected, but the data is not very good quality, only limited ad hoc analysis is undertaken of what the data actually means, and the client files can be incomplete. That's not uncommon. And decentralized service provision, coupled with unclear roles and responsibilities between the central office and regional offices, makes for some really difficult evaluation work. More and more we'll see service delivery being contracted out, with perhaps limited contract and quality oversight.

As if all that wasn't enough, particularly in areas like health and education, each level of government has limited control but overlapping roles. The federal government has policy and funding; the state has responsibility for ensuring good outcomes. At the same time, alternative service providers are also operating in the sector. What you find in some of these national sectoral areas like health and education is that government's role often becomes not the efficient and effective delivery of a service, but one of plugging gaps in available services without creating perverse incentives. That's often what happens in practice, and that has some interesting implications for what it means to do an evaluation.

And if all that wasn't enough, I'd raise the issue of methodological problems: limitations in available data, IT systems that are often not user friendly, privacy and ethical considerations, and trying to identify all the relevant outcomes for a client group. Evaluators often have a need for independent expert advice on substantive matters, and that can raise issues, with the evaluator being accused of second-guessing clinical judgments. I've been accused by judges of trying to interfere in legal matters, when I was doing an assessment of their management of deceased estates.
I was accused by doctors in emergency departments in one state of trying to second-guess their clinical judgments in their services to suicidal youth. I got around that because Australia back then had clinical standards for the treatment of suicidal youth in emergency departments. So I had the two doctors who wrote the standards join my evaluation team, and I paid them to go to 11 hospitals with me, where they pulled patient files and assessed them against the standards they themselves had written. Gee, what a difference that made: from "Scott, you're not one of us, you wouldn't know", through to "Oh, here's George and Fred who wrote the standards, come in guys, you want our files, we'll give you anything you want". So there are ways of working around this issue of "you don't have the expertise" or "you're not one of us".

There's also the issue of decentralized service provision. The practical consequence of that is very time consuming and expensive evaluations with large samples. Do you want representative samples from 26 different sites around the country? I've done evaluations like that in the health system; it gets quite expensive fairly quickly. Or do you want to pick five of the 26 sites and just study them? But then how are you going to generalize from the five back to the 26? Or, if some of the sites are sort of typical, you could try a typical-case qualitative sample, as Michael Patton would say. The other issue I would raise is that where service delivery has been contracted out, and that's increasingly common, evaluators need the authority to examine the contracted operations, that is to say, the ability to follow the money and see where it all goes.

I'm conscious I've been talking a lot, and I'd like to stop there. I've outlined a whole series of challenges under three headings for evaluators looking to assess social programs, and I'd like to invite any comments or suggestions; feel free to disagree or point out something I might have missed. Thank you.

And Scott, I just noticed that we're coming up to the end of our time, so I'm not sure if people might need to go soon, but does anyone want to ask Scott any questions before we wrap up, or make any comments?

It's great to say thank you, Scott, for such real-world comments on the practice of the public service in Australia.

Sorry, Harry, I think I talked over you.

No, I was just saying thanks to Scott. Enjoyed the seminar. Thanks.

Thanks, Scott. Everyone in the Canberra committee would like to thank you very much for presenting for us today. It was such an interesting seminar. I really enjoyed being able to hear about the examples, especially the incense sticks; I'll take that one as a note for program design. Thank you, everyone, as well for coming and joining us today, and we look forward to seeing you again.