Now, in an exciting change to scheduling: we are attempting to start a new seminar that will run, in effect, around the clock, and be time-zone neutral. This does mean that we rely on everybody being able to use Outlook calendar invites and calibrate across different time zones, and I think we have had a failure at that with Lily on this occasion. I believe she thinks we're meeting in an hour, whereas we're actually meeting now, and we cannot reach her. So instead, one of the members of our project, who has just given a talk at the Joint Session of the Aristotelian Society and the Mind Association, is going to present that talk to us. Claire Benn is a philosopher and a research fellow on the Humanising Machine Intelligence grand challenge here at ANU, based in the Coral Bell School of Asia Pacific Affairs. Claire, you're going to be talking about signaling virtue, I believe. We should all pause to acknowledge what a hero Claire is for stepping into the breach.

Hi, my name is Claire Benn. I'm a researcher at the Australian National University, and I'm presenting my paper "Signaling Virtue: Reassuring Observers of Machine Behavior". Let's start with the problem. To illustrate it, I start in a relatively unusual place: Shabbat. The Jewish holy day runs, very roughly, from Friday sunset to Saturday sunset, and many different kinds of action are prohibited on this day. Many of them are prohibited d'oraita, that is, from the Torah. Broadly speaking, there are 39 categories of work that are prohibited; these melachot include writing, building and kindling a fire. In addition, there are many actions that are prohibited d'rabbanan, that is, from the rabbinic tradition. These are often organized into four categories: actions that resemble d'oraita prohibitions; actions that would likely lead to the violation of a d'oraita prohibition; actions that violate the spirit of Shabbat; and, most importantly for the purposes of this paper, actions that would arouse suspicion that you have performed, or will perform, an action that is prohibited d'oraita. Washing your laundry, for example, is prohibited d'oraita, but the Torah says nothing about hanging it out. However, hanging out your laundry on Shabbat is prohibited d'rabbanan, because it would lead people to think that you had in fact washed your laundry on Shabbat.

Our second example concerns zebra crossings. Even philosophers would agree that it is prohibited to run over a pedestrian on a zebra crossing. However, this prohibition itself says nothing about how close you may get to the pedestrians on that crossing. Yet there seems to be something troubling about stopping mere inches from the pedestrians. Why?
Well, one explanation is that it is not obvious to the pedestrian, as the car keeps moving closer and closer towards them, that they have been seen and that you, the driver, are planning to stop.

While these two examples seem radically different, they share some important common features. There is an agent, and there is an observer of the agent's behavior. There is an information asymmetry between the two. There is uncertainty on the part of the observer about the agent's past and future adherence to the accepted prohibitions. And there is a call for the avoidance of what I call miscommunication: a call for the agent to make clear to the observer that they have adhered, and plan to adhere, to the accepted prohibitions. The only difference is that Shabbat concerns past adherence, while the zebra crossing concerns future adherence.

I now present a unified example. In the interest of unity I have, of course, sacrificed some degree of realism. Suppose you instruct your daughter Lara, who bears a striking resemblance to Little Red Riding Hood, to carry a basket of muffins to her grandmother's house after school. But between the school and Grandmother's house lies a forest, wherein lives a very dangerous wolf. Luckily, this wolf stays within the confines of the forest. You tell Lara that the forest is absolutely off limits: she is prohibited from ever walking through it. Here is a diagram to illustrate the paths she can take from school to Grandmother's house (I promise that in the final paper the diagram will be much more professionally drawn). We can see she has three paths from school to Grandmother's house. A gives the forbidden forest a very wide berth. B cuts straight through the forest and is therefore forbidden. C is interesting.
C follows path B all the way to the forest, then skirts its edge, then rejoins path B towards Grandmother's house. Now, because of the villages in the way, you have only two lines of sight from your house. C looks like the best option: it is permissible and, of the permissible options, it is the shortest route between school and Grandmother's house. However, from your two lines of sight, C is indistinguishable from path B, which is forbidden. Therefore, were Lara to take C, it would not be clear to you whether she was doing something permissible or something impermissible.

You might have noticed from the title of my talk that this was supposed to be about machines, and it may not be obvious yet that it really is, since the examples so far involve human agents. But we can easily imagine a case like Lara's arising with a machine system, and in fact machine systems have many of the features common to all these problems. There is often a human observer of machine behavior, and there is often uncertainty on the part of that observer about the behavior of the machine, especially about whether the machine is going to adhere to the prohibitions that are in place. This uncertainty can arise from the innate opacity of certain machine systems, or simply because not all parts of their behavior are directly observable. It also arises because we cannot predict what a machine system will do based on what we ourselves might do, which is perhaps how we figure out how other human agents will behave. So this problem applies to self-driving cars (the zebra crossing example obviously has direct application), but also to autonomous weapon systems, and in fact to any machine system that engages in planning. The importance of this is that machine ethics has traditionally overlooked the communicative nature of ethical behavior, and that is what I explore in this paper.

So we have seen the problem; the question now is: what is the solution? I propose the following principle, Reassurance, which mirrors the d'rabbanan prohibition. Reassurance states that you should not follow a path that, at some point of observation, is similar to an impermissible path that there is some chance of being followed. Let's apply it to our example of Lara. Here were her three paths. We can note that C is similar to an impermissible path and therefore risks miscommunication. Reassurance therefore states that Lara should take path A, because path A is not only permissible but unambiguously permissible. We should note, though, that this kind of constraint is different from the constraint of not going into the forest. We now have two orders of constraint: the first-order constraint is not to go into the forest; the second-order constraint is not to take path C, because it looks as though you might go through the forest.
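To make the structure of Reassurance concrete, here is a minimal sketch in Python. It is an illustration only, not the paper's formalism: the waypoint names, the representation of lines of sight, and the trace comparison are all my assumptions.

```python
# A minimal sketch of strict Reassurance (illustrative only).
# Paths are sequences of named waypoints; the observer sees only
# the waypoints that fall within their lines of sight.

SIGHT = {"school", "fork"}  # hypothetical lines of sight from the house

PATHS = {
    "A": ("school", "meadow", "grandma"),               # wide berth
    "B": ("school", "fork", "forest", "grandma"),       # through the forest
    "C": ("school", "fork", "forest_edge", "grandma"),  # skirts the edge
}
FORBIDDEN = {"B"}  # first-order constraint: never enter the forest

def observable_trace(path, sight):
    """The parts of a path the observer can actually see."""
    return tuple(w for w in path if w in sight)

def acceptable(name, paths=PATHS, forbidden=FORBIDDEN, sight=SIGHT):
    """Strict Reassurance (a second-order constraint): a path is
    acceptable only if no forbidden path yields the same trace."""
    if name in forbidden:
        return False
    trace = observable_trace(paths[name], sight)
    return all(observable_trace(paths[f], sight) != trace for f in forbidden)

for p in PATHS:
    print(p, acceptable(p))  # A True, B False, C False
```

Here C is ruled unacceptable even though it never enters the forest: from these lines of sight, its observable trace matches that of the forbidden path B.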
This has two implications. Firstly, it gives us an indication as to priority: were we faced with only B and C, it is not that they would be equally wrong to take; it is much better to violate a second-order constraint than a first-order one. Secondly, it gives us some indication of how we might react to violations of these different kinds, and here I follow the tradition in which violations of d'rabbanan prohibitions are taken to be less serious than violations of d'oraita prohibitions. I therefore talk about "permissible" and "forbidden" paths relative to first-order constraints, and "acceptable" and "unacceptable" paths relative to second-order constraints.

You may have noticed, however, that Reassurance as it stands is crazily strong: it is going to rule out an awful lot of paths. But I state it as the strongest version of the constraint and note that it can be weakened and made responsive to different contexts, including different costs, risks and risk attitudes. I now outline three dimensions along which it can be modified.

The first is likelihood, because a range exists from "some chance of being followed" to "more likely to be followed than the path in question". For example, we may not want to worry about impermissible paths that are straightforwardly irrational to take. So take this modified version of the example with Lara, in which she has two additional paths. There is D, which follows path A, takes a little side path into the forest, and then rejoins path A; and there is path E, which gives the forbidden forest an even wider berth. On the strictest reading of Reassurance, only E would be acceptable, as it is the only path that is unambiguously permissible. However, because A is much shorter than D, D is irrational, under the assumption that Lara's preference is to take the shortest path. So it does not look as though we should worry about D, and therefore, on a weaker version of Reassurance, A would also be acceptable: if we saw Lara following along this path, we could be reasonably certain she was going to take A in preference to D, as A is shorter. We could also incorporate other kinds of information (for example, has Lara gone into the forbidden forest before?) in order to estimate the subjective probabilities of each of these paths, and then identify a threshold of acceptability.

The second dimension is similarity, because a range exists from "similar to an impermissible path" to "identical to an impermissible path". We may only want to worry about paths that could never be distinguished, or that could not be distinguished cheaply. Take our original example: suppose that at this point along the path, when we look with the naked eye, the paths look identical; but if you spent the time to dig out your old monocular from the attic, you could see that they are distinct and realize that Lara is on path C and not B. It is perhaps acceptable, then, for Lara to take path C, if it is acceptable for her to assume that you would go and find that monocular.

The third dimension is observation, as a range exists from "at some point of observation" to "at all points of observation". We may also want to recognize that some points of observation, perhaps those at which an intervention is still possible, are more important than others. So consider a third iteration of the forest example, in which Lara has path F available. Path F looks indistinguishable from path D, but only at the first point of observation; therefore, on the strictest reading of Reassurance, E is still the only unambiguous path. However, F is distinguishable from D at the second point of observation. We therefore need to weigh up the risks of waiting until after the forbidden forest to tell whether Lara was truly on path F or path D. It also means that it might be worth a cost, in this case perhaps climbing onto the roof, if doing so would add another line of sight and gather more information, given that F is distinguishable before the forbidden forest, but not with your current lines of sight.
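The likelihood weakening can be sketched in the same style (again my construction, reusing observable_trace from the sketch above): give the observer subjective probabilities over paths, and accept a path when the probability of a forbidden path, conditional on the observed trace, falls below a context-dependent threshold.

```python
def acceptable_weakened(name, paths, forbidden, sight, prior, threshold=0.1):
    """Likelihood-weakened Reassurance: given the observed trace, is the
    chance that a forbidden path is being followed below the threshold?
    `prior` holds the observer's subjective probability of each path."""
    if name in forbidden:
        return False
    trace = observable_trace(paths[name], sight)
    consistent = [p for p in paths
                  if observable_trace(paths[p], sight) == trace]
    total = sum(prior[p] for p in consistent)
    return sum(prior[p] for p in consistent if p in forbidden) / total < threshold

# With D judged irrational (Lara prefers the shortest path), A is acceptable:
paths = {"A": ("school", "meadow", "grandma"),
         "D": ("school", "meadow", "forest", "grandma"),
         "E": ("school", "hills", "grandma")}
prior = {"A": 0.6, "D": 0.01, "E": 0.39}
print(acceptable_weakened("A", paths, {"D"}, {"school", "meadow"}, prior))  # True
```

The similarity and observation dimensions can be treated as changes to sight: digging out the monocular, or climbing onto the roof, adds waypoints to the observable set and so separates traces that previously matched.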
What are the implications of this? Machine ethics, and AI safety in particular, has primarily been concerned with how to make sure that a system can and will abide by the constraints placed on it, and with how developers can know that the system will abide by those constraints. This is normally called assurance. However, it is very rare that we have ironclad assurance, and even when we do, it is often available only to developers; it may not be available, or comprehensible, to the users and subjects of the system. My work indicates that in the absence of assurance, and perhaps even when it is available to developers, reassuring the users and subjects of a system is still important. This has three practical implications. First, reassurance builds trust in a system: the machine's adherence to the constraints in place is communicated to us, so we can see that it is trustworthy and that we ought to trust it. Secondly, if a system reassures users and subjects, it is more likely that the technology will be adopted. Finally, human observers are often in positions where they are able to intervene in a system, and they are more likely to intervene if they cannot tell whether the system has abided by a constraint. Without reassurance, therefore, the system will be at best inefficient and at worst completely pointless, as human observers will intervene even when the machine system is doing something permissible.

But there is also an important theoretical implication. Reassurance, the principle I have outlined, mainly concerns planning problems like Lara's, but there is a much more general lesson to be learned: machine systems should signal their virtue.
That's the title of my paper. Machines need to communicate their knowledge of ethical constraints, and this might mean modifying their behavior. And this has a radical upshot: even were we finally to settle on the perfect ethics for machines considered in isolation, when machines are put into human-machine interactions with a human observer, the best thing a machine could do might not be the morally best thing, when the morally best thing risks miscommunication, as is likely in situations of greater uncertainty.

I take the term "signaling virtue" from signaling theory in biology and economics. Signaling theory concerns the communication problem of conveying the presence or absence of certain features (such as truthfulness, poisonousness or strength) between two agents, when they are in a state of absent trust, or sometimes conflicting interests, and asymmetrical knowledge, where the features are not directly observable. You can see the direct parallels to the ethical cases raised in this paper. The solution in biology and economics has tended to focus on finding a signal that is too costly to fake, as this is taken to be good evidence to the receiver of the presence or absence of the feature in question. The related issues of cost, fakeability and communication in relation to machine ethics are something I have explored here and hope to explore in future work. So the overall lesson of this paper is that, just like justice, ethics must be seen to be done. Thank you very much.
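The "too costly to fake" idea can be given a toy form; this is the standard separating condition from the signaling literature, under payoffs I have invented, not anything from the paper itself.

```python
def signal_is_credible(benefit, cost_honest, cost_dishonest):
    """A signal separates honest from dishonest senders when only the
    honest type gains by sending it; observing the signal is then
    good evidence that the sender has the feature in question."""
    return (benefit - cost_honest > 0) and (benefit - cost_dishonest < 0)

print(signal_is_credible(benefit=10, cost_honest=3, cost_dishonest=15))  # True
print(signal_is_credible(benefit=10, cost_honest=3, cost_dishonest=5))   # False: cheap talk
```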
One of the things you're focusing on here that is really important is that, when we're trying to figure out how one ought to design and program machine systems, we shouldn't just assume that we should program them in exactly the same way as we would advise a person in the same situation, and I wanted to ask you to expand a little more on that. If you think about the nature of that prescription in Shabbat law, one could think of it as unduly demanding on people. You might say: look, as a person who is free, at liberty and worthy of equal respect, what matters is that I actually don't do the wrong thing, and if it looks to you like I'm doing the wrong thing, that's no concern of mine. So you might think that, from the perspective of a person operating in that condition (Lara, because she's a kid, has a certain set of obligations, but a free and equal citizen might be entitled to say "I don't care how it looks"), this demand could be resisted. I suspect that wouldn't be true in the case of machines, and I'd be interested to hear you talk about how those things might differ.

Yes. One interesting aspect that I didn't have in the slides, but that is discussed further in the paper, is the role of education when it comes to ethical behavior. What is important, I think, for humans, and especially for machines, is that we often look at the behavior of others in order to determine what is right and wrong. So far in the examples I have assumed that we already know what constraints are in place, that we already know what permissible and impermissible behavior looks like. But if we suppose that we don't, we can see that the communicative aspect of our behavior becomes very important. Imagine that we didn't know what was right and what was wrong, and we saw ambiguous behavior: we might interpret it as telling us that certain kinds of behavior were permissible when they are not. This is a very important motivation behind the d'rabbanan prohibitions: not only would ambiguous behavior make people suspect that you had done something prohibited, it might also persuade someone who was uncertain that the behavior was permitted when it wasn't. We can see how that would play out in machine ethics. If a user or subject of a system believed that the machine had done a certain thing, even if it hadn't, they might come to believe that it was permitted for the system to do it. Human observers who are in a position to intervene might then fail to intervene, having seen that apparently impermissible behavior go unchallenged, in cases where they should. Furthermore, if machines are learning from other machines, the ambiguity of behavior may become even more of a problem, as machines may share our problem of misinterpreting impermissible behavior as permissible.

Thanks so much, Claire. Okay, the next question is going to come from Katie Steele.

Thanks, Claire, you're a champion. You came close to answering my question just then. I was thinking of another way in which machines may not be analogous to humans when it comes to virtue signaling; for good reason, you were focusing more on the ways in which they are analogous. You were focusing on the observer, and I got the sense that the observer was somewhat vulnerable to what the acting agent is doing, and that it is therefore important that the observer does not feel put at risk by unethical behavior. I think you were talking more generally than that, but that was the issue of virtue signaling that was coming out. One thing people might say about virtue signaling is that it plays an important role in reinforcing norms: not just because people don't know what a norm is and need to be educated, but because they know what it is yet won't keep sharing in it unless everyone else does. So it may be that in our system of ethical norms there are actually quite a lot that have to do with this sort of reassurance, because we need to keep a critical mass of people thinking the norm is still in place. I just wondered whether you thought that was another aspect of reassurance that you might rightly put aside for machine ethics, or whether it is also important that machines join in, helping us feel that a norm is still in place.

Yeah, thanks, Katie. Just on that first point: it isn't that human observers are always at some degree of risk and that is why they care; or at least, the kind of risk involved is going to be very broad, because I include moral risk in it. Observers can be in some sense responsible even if they themselves are not liable to the negative outcomes of unethical behavior.
If, for example, there is a human in the loop, then the output of the machine is somehow in part their responsibility. So I want to include observers who stand in that kind of relationship, not just those who might themselves suffer the negative consequences. But in terms of the role of reassurance in reinforcing norms: I think it is really important for human beings that we often communicate what the norms are by making them less ambiguous, and I think that can be important for machines too. Now, of course, it is odd to think that we need this kind of quorum of behavior and that it could somehow include machines; we don't need to know that machines are not free-riding on us, so those kinds of concerns, as you point out, may not directly apply. However, given our tendency to anthropomorphize machines, and given that human-machine teams are becoming more common, machines' adherence to the norms will still tell us something about how important those norms are. And if machines can disregard them, it will be easier for us to think that we can disregard them, or that it is okay for other agents, human or machine, to disregard them. So I think this aspect of reassurance still applies, even to machines.

Fair point. Thanks.

Thanks, both. Okay. One of the things we're trying to do with this research seminar series is bring together a multidisciplinary community: at HMI, for example, we've got philosophers, computer scientists, sociologists, lawyers and political scientists. So I'd like to call on one of our computer scientists to ask a question, and Hannah has her hand up. Hannah, can I bring you into the discussion?

Sure. Nice talk, and thank you so much for filling in. I'm actually curious, because in robotics we also have this notion of selecting certain behaviors of a robot so that the other party, the human, for instance, in the case of human-robot collaboration, actually knows what the robot is trying to do. It is usually called legible motion in robotics. The difficulty there is usually in designing the structure of the problem: basically, designing which actions would be good from the observer's point of view, and then assigning proper rewards and so on. So I'm wondering whether you have looked into that side, into how to properly assign the ethics components of these reward functions, or how to define "this is good, this is bad", things like that.

The simple answer is: not yet, essentially. This paper outlines the main problem and the broad philosophical response to it, in terms of the methodological change needed in our approach to machine ethics. I'm working on a paper with Alban, another member of our team, on what that looks like in practice, and hopefully Alban, Sylvie and I will then actually try to solve this for a real problem. But how we do that, and what it looks like, is something we haven't determined yet. We have ambitions, though.

Thanks. Ambitions are good. Actually, since you mentioned her: Sylvie also has her hand up, so Sylvie, would you mind asking your question next?
Yes. It may be more of a comment, and maybe an attempt to reply to Hannah, than a question. I think this is going to be a more complex problem than just finding a cost-optimal plan, but from a computational point of view, my feeling is that there is enough structure here that it will be a lot better than solving a very expensive partially observable Markov decision process; we can do something specific to the problem, which is going to be good. In fact, listening to the talk, I was starting to have another idea of something we could exploit that would be beneficial. The idea would be to model it as a multi-objective problem, where we trade off the cost of the plan against the power of the signal. Then, when designing a system, we would have the possibility of adjusting how much we want the system to be efficient and how much we want it to provide reassurance, and perhaps of giving humans the ability to trade off between those two. I think that would be a very nice contribution.

Can I just make a comment? Sorry. In robotics, legible motion is in many cases not put under the framework of a partially observable Markov decision process. What's interesting is that they also combine it with observer theory from control, which actually incorporates the capabilities of the observer; so, exactly as you mentioned.

Yeah, we should definitely have a look at that, because I think we're getting to the point where Alban, for instance, is really starting to think about how to solve this problem. So if there are already solutions out there in robotics, we should have a look at them.

And I wanted to say that modeling it as a multi-objective problem would be a really interesting avenue, because one of the points I was making in the presentation is that we may want to weaken Reassurance depending on the context. When the stakes are higher, we may want to be more risk averse; when the stakes are lower, it may make more sense to be less risk averse. There is never going to be a single threshold, a single level of reassurance, that is right for every occasion: it is going to depend a lot on the context. Modeling it that way will hopefully allow us to make that choice and apply it differently in different contexts.
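A hedged sketch of the multi-objective idea Sylvie describes; the linear scalarization and the names are mine, one simple way of trading plan cost against signal power.

```python
def plan_score(path, cost, ambiguity, stakes_weight):
    """Lower is better: plan cost plus a penalty for ambiguity
    (one minus signal power), scaled by how high the stakes are."""
    return cost(path) + stakes_weight * ambiguity(path)

def best_plan(candidates, cost, ambiguity, stakes_weight):
    """Pick the candidate plan with the best cost/reassurance trade-off."""
    return min(candidates,
               key=lambda p: plan_score(p, cost, ambiguity, stakes_weight))
```

Sweeping stakes_weight from zero upwards traces out the trade-off: in the forest example, the minimizing path would move from C (the shortest) to A and eventually to E as reassurance is weighted more heavily.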
On that note, I should say that Hannah's project on determining how to assure pedestrians of the performance of self-driving cars may be a natural area of application for this. So the next question is coming from Jenny Davis, the sociologist on our team. Jenny, if you wouldn't mind speaking up so the camera switches to you: you're on.

Hi everyone, and thanks, Claire. I've heard you talk about this a few times, and every time I get another little nugget that I want to think about with you, so thank you for that today. One of the things that strikes me is the terminology of "virtue signaling", and I hope we can talk about that for just a second. In popular culture, virtue signaling has become a pejorative term: it names a denigrated form of social justice practice, essentially, if you think about how it is used. But what you're telling us here is that virtue signaling can be really useful in machine ethics. To what extent does that lesson from machine ethics, that virtue signaling is good rather than to be denigrated, translate into how we should rethink virtue signaling as a valuable, versus a troubling or weak, form of social justice practice?

This is something I didn't have time to go into in this presentation, but I discuss it in a bit more detail in the paper. Yes, the term "virtue signaling" has a pejorative use, and normally it picks out acts that are conspicuous though often impotent demonstrations of holding certain values: people changing their profile picture to support Black Lives Matter, say, or signing an online petition that will go nowhere. All of this communicates a value, but does so in a way that doesn't actually effect any change. What is interesting is that, if you look at the problem behind it, it is that such acts are normally costless: you are doing something that doesn't cost you anything, and that tends not to have any effect. This stands in direct contrast to virtue signaling as understood in signaling theory. Economists would call what the pejorative use picks out "cheap talk": it is pretty much free, and it is unverifiable. Signaling theory, by contrast, normally concerns acts that involve cost, and that because of that cost are much more honest signals. I have kept the term because of this very ambiguity, which points to something important: are we any good at judging whether a signal is honest or not? Are we particularly good at understanding the costs, and therefore what a signal communicates?
What happens if someone doesn't just want to communicate, but also wants to cheat? The issue of cheating, of fakeability, is something I want to explore in further work, because I haven't yet contemplated what happens if a machine system that is meant to reassure us tries to persuade us that it understands and plans to abide by the constraints in place while having no intention of doing so. And I think the fact that "virtue signaling" is used both for costly, honest signals and for costless, disingenuous ones perhaps reveals that we, as receivers of such communication, may not be as good as we think at taking the right lesson, at interpreting signals correctly.

I know other people need to speak, so just a really quick follow-up. I think that's a really interesting answer about cost. But I wonder, too, whether we should rethink the potential value of virtue signaling more broadly, in the sense that it could have an effect on changing hearts, minds and normative practice. There might be very little cost to each of us changing our profile pictures to something that supports Black Lives Matter, but when we do it at scale, it collectively becomes normative to integrate racial justice into our everyday practice. I think one of the interesting strengths of your work is that you're pushing us to rethink the values already at play, and the variety of ways in which we can enact change in the world.

Yeah, that's really interesting; I hadn't thought about it from that perspective. So maybe, by focusing on the communicative aspect of ethics, including, as Katie pointed out, reinforcing norms, there is in fact value even in cheap talk. Even though it is costless, even though it is fakeable, what it does is say, very publicly: here are the norms of our society. Interesting. I'll think about that some more. Thanks, Jenny.

Just to add a little note on that: given that we're talking about systems that are going to be designed by private companies essentially looking to make a profit, there is probably going to be quite a lot of cheap talk around; think of someone trying to get a care robot to market, for example. So having a good way of theorizing its implications is going to be practically very important. The next question is going to come from Elijah Perrier. Elijah is a PhD student in Sydney, as well as a lawyer and a general polymath. Elijah, over to you.

Hi, thanks, Seth, and thanks very much for the great talk. I think it's a very interesting line of research. My question, and you may have touched on this (apologies if I didn't quite grasp it), is about how this feature of needing to cater to the perceptions of those who will observe the agent's behavior works in a social context.
Say, for example, that the agent needs to behave in a particular way to accommodate what it perceives to be the views of one person, but it also perceives that there is a second or third person, or a social group, who do not share the same perceptions of what the agent is doing. In your thinking, how would an agent deal with the fact that there may be inconsistent views, among a set of people, as to what the agent is doing? What type of calculation would the agent be doing in order to achieve its objectives in that case? You might have already answered this, but I wasn't quite clear on that point.

I absolutely have not answered that, and it's a great question, so thank you very much. I have given a beautifully simple interactive picture: one agent, one observer. Things become exponentially complicated as we increase the number of observers and agents, and it is not obviously clear what we should do in those situations. One approach would be to ask: what is the goal of the reassurance? Is it, for example, to make sure that the observers who have the power to intervene do so only when it is appropriate? That would help us pick out some observers as more important than others. Or it might be, for example in the case of self-driving cars, that you try to increase the predictability of everybody involved. If a car behaves in a seemingly erratic manner, even if it is not in fact being erratic, that might encourage pedestrians to behave erratically, which has a reinforcing effect, as everyone else then responds to that erratic behavior, and so on. So that would be one way of working out whether some observers are more important than others. The other approach is to ask: what are the positive and negative impacts of failing to reassure different observers, or of some people finding the situation more ambiguous than others? It might turn out that more vulnerable populations, who tend to be more risk averse in general, suffer certain negative effects, while people who are better off may be able to take more risks, which might also bring them greater benefits. So we could look at trading off those negative and positive consequences. Those are just two ways into multi-observer situations.
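The first approach, privileging some observers over others, could be sketched as a weighted aggregate; this is purely illustrative, and every name in it is an assumption.

```python
def weighted_ambiguity(path, observers, ambiguity_for, weight_for):
    """Aggregate miscommunication risk over many observers, weighting
    each observer by, say, their power to intervene or their
    vulnerability to the consequences of failed reassurance."""
    return sum(weight_for(o) * ambiguity_for(path, o) for o in observers)
```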
Yeah, thanks.

There is also a more difficult question behind your question, something that Alban and I have really struggled with and disagreed over, and that has not been discussed at all in this paper: the interaction between the expectations of the agent and the observer. Here I have been talking about how the agent should anticipate the responses of the observer, but there is also the question of what to do when the observer starts to anticipate the behavior of the agent, relative to the agent's expectations of the observer, and so on. So it is not going to be a straightforward problem to solve.

These are such hard questions, because, as you say, the solution space becomes exponentially large in certain cases. So thanks for that; that was very interesting.

But we should also note that we as human beings do this all the time. Yes. So in some sense it is a new problem, and in some sense it is not a new problem.

I feel we need some David Lewis here, some thinking about conventions. So we're going to have one more question from Michael, and then I'll ask you something to sum up. Michael, again, is a PhD student at ANU, although he's presently in New York. So, Michael, if you're around?

Yeah, thanks, Seth, and thanks for the talk, Claire. As we've had this discussion, I've been wondering about another dimension of actions that might explain when some states are impermissible. There's a notion in reinforcement learning and agent exploration of reversible actions and states. You might think that one way to characterize bad states is as states that are not reversible: a car that hits somebody is not in a state that is really reversible, right? This seems to be another dimension, and I'm asking you on the spot to think about how they may be related. It seems as though an irreversible state is likely to be an impermissible one, though I can probably come up with situations where actions are both permissible and irreversible, and vice versa, reversible actions that are not permissible. So I'm just curious what you would make of that.

So you're saying that there are irreversible actions that are not impermissible?

Yeah. I'm raising the idea of reversibility in general: do you think it goes some way, as maybe just another dimension, to explaining when a machine's choices are impermissible, in that they are also likely to be irreversible? Like the car that drives erratically, which has a high likelihood of doing something irreversible.

Yeah. I suppose, though, one of the questions I would have is: what exactly is irreversible? Is it the action itself? Is it the negative consequence? Is it a negative consequence that cannot be compensated for? Or is it the educative point that is irreversible? Once a certain behavior is performed, it tells us something about the values of the agent, and even if you could reverse the negative consequences, it may be much harder to undo that act of communication. So yes, I think reversibility is going to interact with this problem in an interesting way.

Okay. So, Claire, just to finish up. If you still have questions in the Q&A, I'll ask you to transfer those over to the Slack workspace and we'll keep the discussion going there, since this has to be a hard stop. But for now, I wanted to ask you to put this paper in the context of the broader work you're doing at HMI.
You've got a really nice way of articulating how, if you were to think about the problem of machine ethics as though you were designing these systems in isolation, without having to think about the kinds of people that are going to be operating with them, you would design them in one way; but properly thinking about partnerships between humans and machines requires us to rethink that very simple model along two dimensions. I'd love you to close us off by describing that for us.

So, my approach to machine ethics mirrors the approach I have to human ethics. It is sometimes easy to think of human ethics as taking each of us as individuals, as isolated billiard balls that go about the world performing whatever actions are permissible, and that's it: we just have to work out which actions those are. That is a bad model for human morality, because we all have different roles and have to interact with other people who have different kinds of expectations, and that should modify what we do. The same applies to machine ethics. Once we consider machines as something we live and work alongside, we can see how that interactive nature has an impact in two ways. The first, as articulated in this paper, is that it should change what machines do: we should not take machine ethics to be one thing we can settle on for all time; it depends on the situations machines are embedded in, and the ways in which we respond to them should inform what they do. The other aspect is that how machines perform in these situations may also impact how we understand our own ethics.