Hello, and welcome to episode eight of the Robustly Beneficial podcast. Today we will discuss the paper by Lê titled "A Roadmap for Robust End-to-End Alignment."

Yeah, so this is a paper I started to write a bit more than a year ago. The reason for this paper is that I felt that in alignment research and AI safety research there were a lot of ideas, but they were local ideas, patches you can put here and there to solve this or that problem, and I found it a bit frustrating that there was no global picture of everything that needs to be done. I think this is very important in terms of safety, because when you're trying to design a safe algorithm or a safe process, any part of this process is potentially a vulnerability, and if it was not designed carefully, the whole system can be compromised. So one of the key motivations of this work was to try to include everything, end-to-end alignment of the algorithm, everything that needs to be done. That was one of the main motivations. The other main motivation was to try to cut it down, to decompose it into a large number of small problems that can hopefully be tackled independently, so that anyone could say, "oh yeah, this problem sounds interesting to me, I can work on this," and their work would be extremely useful even if they have no idea of the global picture. So yeah, that was the motivation.

And so the decomposition you have is in five main parts, and would you say that approximately every part has between five and ten sub-problems that can be worked on?

Yeah, that's roughly how the paper is written. I don't think any of this should be taken too rigidly, too seriously. It's more like a picture to have in mind, a roadmap. You don't have to follow it exactly, but I think it's useful to have it in mind so that you know what the different parts are, and also which parts are the most neglected. Some of them correspond to research that is already ongoing, but arguably there are parts that have been neglected so far, and there's not enough research on them. The global idea of the roadmap is that it applies to algorithms that are maximizing something. Typically what I have in mind is mostly reinforcement learning, in a very general sense. The last podcast was about reinforcement learning, and we also saw that reinforcement learning is not that different from supervised learning on longer time scales, and the roadmap definitely applies to supervised learning too; it's not specific to reinforcement learning. More generally, the problem is that if you have an algorithm that does maximization — and in machine learning these days we essentially only have maximizing algorithms — then the question of what this algorithm maximizes becomes critical. The whole point of alignment, and especially of robust end-to-end alignment, is to make sure that what is optimized by our algorithms is something that should be optimized, something that will benefit mankind by being maximized.

So what you are saying sounds very much like the usual way of discussing the alignment problem: computing the right objective function, knowing what humans truly want to optimize. And actually this is just one of the components in the roadmap you propose, the central component. I think we can use the names of the components.
Yeah, so the basic idea is to try to give the different components names that are easy to remember — though, from my experience, they're not that easy to remember in the end. It's Alice, Bob, Charlie, Dave and Erin, so ABCDE. Alice, Bob and Charlie are very classical names in computing, so I added Dave and Erin, I guess. Alice is the main maximization algorithm: it's the thing doing reinforcement learning or supervised learning, the maximization algorithm, what most people are working on these days. What we want to make sure of is that it's maximizing the right objective function, and the computation of the objective function, of the rewards to be given to Alice, is done by all the rest: all the rest can be thought of as the reward system, the thing that computes the rewards. I think it's important to keep in mind that in complex applications these rewards are going to be difficult to compute. It's not like a human can come and give the rewards — especially if you think of YouTube, which has to receive rewards billions of times per day — so it has to be done algorithmically. The whole idea is that this algorithm that computes the rewards has different important parts, and those parts are Bob, Charlie, Dave and Erin. Perhaps confusingly — I don't know if I would do it differently if I had to do it now — it actually goes the other way: it starts from E, from Erin, then goes to Dave, then to Charlie, then to Bob, which finally feeds Alice with the right rewards.

So, the different components. Erin is in charge of data collection: it needs to get data in. If you think about it — and we can talk about this again — if you want to compute relevant rewards, even for a kid today, you need to know what the world is like to know what a good reward would be, so data is critical. That's Erin's job. But then you have to realize that data is usually partial, incomplete, biased; there are all sorts of problems with data even if you have very good data collection. So you still need to do some world-model inference from the data — you cannot just read the state of the world off the data. That's Dave's job: Dave tries to represent the world, or at least all the aspects of the world that are relevant for giving rewards. So Dave does world-model inference, and then he gives the state of the world to Charlie, whose job is to compute how desirable the current state of the world is. Typically, if the current state of the world is great, then Charlie should say so and give higher rewards to Alice.

So that's what I was saying before: Charlie, in your roadmap, is the one solving the problem of learning the values of humans.

Yeah. And then there's Bob, which I think is the most fascinating, the most neglected and, in the end, the most important aspect of this roadmap. What Bob does is tweak, modify the rewards given by Charlie so as to include some incentives for Alice — essentially so that Alice takes care of, and even upgrades, the whole reward system. You have to imagine that especially early implementations, but even later implementations, of this reward system are going to be imperfect.
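[Editor's note] To make the structure concrete, here is a minimal sketch of the Erin → Dave → Charlie → Bob → Alice reward pipeline as described above. The function names, signatures and placeholder values are my own illustrative choices, not an implementation from the paper.

```python
# Illustrative sketch of the roadmap's reward pipeline (hypothetical names).
from typing import Any

def erin_collect() -> Any:
    """Erin: gather (partial, possibly biased) data about the world."""
    return {"observations": ["..."]}

def dave_infer(data: Any) -> Any:
    """Dave: infer a world model / world state from Erin's data."""
    return {"world_state": "inferred from data"}

def charlie_score(world_state: Any) -> float:
    """Charlie: score how desirable the inferred state of the world is."""
    return 0.7  # placeholder desirability score

def bob_adjust(raw_reward: float, world_state: Any) -> float:
    """Bob: tweak Charlie's reward to add incentives, e.g. for Alice
    to protect and improve the reward system itself."""
    reward_system_health_bonus = 0.1  # placeholder incentive term
    return raw_reward + reward_system_health_bonus

def reward_for_alice() -> float:
    """One pass of the reward system feeding Alice, the maximizer."""
    data = erin_collect()
    world_state = dave_infer(data)
    raw = charlie_score(world_state)
    return bob_adjust(raw, world_state)

print(reward_for_alice())
```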
And there's a real risk of counterproductive over-optimization of some measure that's only an approximation of what we truly want to maximize. There's actually a name for this: it's called Goodhart's law. We'll probably talk about it at some point on this podcast, but I think it's something very, very important for alignment and safety. It's the observation, made by Charles Goodhart, who was an economist, that once you optimize for something — even if that something seems very correlated with what you actually want to optimize — it can work very well early on and seem very productive, but at some point over-optimization can become counterproductive, if not catastrophic. The example he had in mind was about monetary policy, but you can think, for instance, of GDP. GDP is typically very correlated with how well people are doing, with the welfare of people, so high GDP is strongly correlated with happier people, and for a long time, as it was optimized, people were actually getting happier and life was getting better. But arguably, lately, if you keep optimizing for GDP, it can have very negative side effects; it also leads to politics, to strategies that essentially hack the measure, and this can be counterproductive. Another example given by Goodhart is that school systems are now full of rankings and scores, and this incentivizes students, for instance, to work for the exam and then forget everything, because that's the lowest-effort solution to get a good grade. If you think about what people actually remember from school, it's a bit catastrophic. It also happens at a different level: the Shanghai ranking has completely changed the way universities are structured in some countries whose universities had a low ranking, like France. This has led to a lot of mess; when a French scholar tries to give their affiliation, it sometimes changes from one year to the next, because there's a lot of politics involved, and arguably it's not that productive for research or for teaching in these universities.

So somehow this is the role of Bob in your roadmap: because the values learned by Charlie will never be fully perfect — even though we can expect them to approximate our true values better and better over time — then, to avoid Alice, the maximization algorithm, over-optimizing the imperfect objective function computed by Charlie, Bob transforms the reward that Charlie computes and gives Alice a different reward. You also mentioned that Bob gives other incentives to Alice.

Essentially, it's about making sure that Alice will take care of the reward system as a whole, making sure that the data collected by Erin are plentiful but also reliable. I think this is a very important challenge for AI safety: making sure that we have reliable data, especially for large-scale algorithms. For instance, ask yourself what a good video recommendation about the coronavirus would be these days. Let's take a very present-day example: right now we are in the midst of the coronavirus outbreak — by the time you listen to this, maybe you'll have forgotten, but right now it's a big deal and a lot of people are talking about it — and it's not easy to know what recommendations should be given about the coronavirus. Typically, you want people to be prepared for it, but you don't want people to panic, and the level at which they should be worried depends on how risky it is for them, and that is actually very hard to know.
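[Editor's note] As a brief aside, here is a toy numerical illustration of the Goodhart effect described above. It is my own construction, not an experiment from the paper: the proxy equals the true objective plus heavy-tailed noise, and the harder you select on the proxy, the more the winner's impressive proxy score is explained by noise rather than by true value.

```python
import numpy as np

# Toy Goodhart illustration (hypothetical): the proxy is the true utility plus
# heavy-tailed "hackable" noise. As optimization pressure grows (more options
# searched), the winning option's proxy score explodes while its true utility
# stays close to average.
rng = np.random.default_rng(0)
for n_options in [10, 1_000, 100_000]:
    true_utility = rng.normal(size=n_options)
    proxy = true_utility + rng.standard_cauchy(size=n_options)
    best = int(np.argmax(proxy))          # "optimize the proxy"
    print(f"{n_options:>7} options: proxy of winner = {proxy[best]:9.1f}, "
          f"true utility of winner = {true_utility[best]:5.2f}")
```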
Right now, for instance, at EPFL — I've just had this discussion; we regularly have these discussions with different people — it's: how seriously should we take this? Should we stop public events? Should we do this podcast? Just you and me, I guess two people is fine, but what is the threshold? Some people are just staying at home out of caution; you had better know what we should and should not do. The stock markets have plummeted because a lot of things are not happening due to prevention measures, so asking people to stay at home also has a cost. So what should be done is really a difficult question, and the key to what should be recommended, typically by a YouTube algorithm, strongly depends on what the world is really like. If you could know who is actually carrying the coronavirus, it would be so much easier to give the right advice to different people. And to do this you need quality data, and having reliable data about the coronavirus is very difficult. What's quite remarkable is that it's not completely impossible: the WHO is trying to organize a lot of the information about it, and the Wikipedia page on the coronavirus outbreak has at least the number of cases per country, and they try to have more localized data and so on. So I think it's quite remarkable that we have access to so much data, but it's nowhere near the best kind of data we could have. More generally, to give good recommendations, to have robustly beneficial algorithms, data collection is really, really important, and we should think about how to collect the best possible data to make our algorithms beneficial.

So not only should Erin, the data collection, be in the pipeline of end-to-end robust alignment, but we also expect that Alice, who has the possibility to influence the world because she's the one taking decisions, should make an effort to improve the quality of the data collection process.

Yeah. One other thing I had in mind in this design is that essentially Alice would be the only one actually acting on the world, sending messages, in a relatively unrestricted manner. As opposed to this, all the other components — and again, I don't think what I propose should be followed to the letter, not at all, but the idea I had in mind — is to make sure that Erin, Dave, Charlie and Bob are just doing their jobs and are not doing things that were not planned in this roadmap, because if they started sending messages all over the place as well, they could be causing harm that we did not anticipate. So the idea is also to restrict the risk of dangerous propagation to just Alice. Of course, this means that Erin cannot self-improve, in a sense. And if it is to be improved — actually it should not "want" anything — then Bob should be telling Alice that Alice should improve the whole reward system, typically by giving Alice higher rewards if the reward system gets improved, if better data are collected in a more reliable manner.

So is it that in the data collected by Erin there is also data that describes how Erin is collecting data?

Yeah, I think that would be the way to see it — and not only that, I guess, but also data about how the whole reward system is functioning. The abstraction is to collect data about the world, but also about the whole reward system, and maybe even about Alice too.
Okay, this makes me think again about the discussion we had on black boxes. Even if Erin, Dave, Charlie and Bob are collecting data about the world and are completely aware of the state of Erin, Dave, Charlie and Bob, should Alice be aware of that? Meaning that if the whole pipeline is fully transparent to Alice, somehow I would expect her to be much better at hacking it in ways that are undesirable.

So that would be Bob's job: to make sure that, if there is a potential hack of the system, Bob gives incentives to Alice to patch it. So, intuitively, the idea is that Bob sees that there is some gap and immediately gives very bad rewards to Alice because of this gap in the reward system, together with the promise of higher rewards later... Maybe there would be ways to do this, but when I say it like that, it sounds hard to make it computable as a reward.

Yeah, I also don't really understand: when we talk about a superintelligent system, if there is a gap, then we expect the maximizing algorithm to find it and receive high rewards because of the gap. So how can Bob realize that there is a gap and patch it? I sort of understand that if Bob is able to patch the gap, then there is no gap.

There is no gap, right. So if Bob detects a gap, it cannot fix it directly, again because Alice is supposed to be the only one actually doing something. But what it can do is promise higher rewards, and the way I have in mind to do this concretely is to also send a gradient of the rewards with respect to different things in the world, and in particular with respect to the reward system itself. Typically, there would be a gradient saying: if you slightly patch the gap, then I will give you higher rewards, because the gradient in that direction is high. So that could be a way to formalize this idea of "I saw this gap, you should fix it."

Yeah. This also shows that Bob needs to be, in some sense, better at predicting things than Alice itself, or something like that.

Yeah. When I first wrote this, especially, I was not even sure that a solution to this problem exists. It sounds extremely difficult, and by no means does the roadmap solve the problem — not at all; it's just a way to decompose the problem. It still seems extremely hard, especially designing Bob. But I have grown slightly more confident in at least the existence of a solution, because the whole pipeline, and especially Bob's job, seems to me very similar to what Julia Galef calls intellectual honesty. There's a video on YouTube — I put it in the description — where Julia Galef discusses this concept and argues that it's extremely important, and I'm really convinced by it. Typically, if you give more scientific education to Democrats in the US, they tend to believe more in climate change, which sounds good; but if you do the same with Republicans, they actually tend to believe less in climate change. This really shows that scientific education is maybe relevant, but it's definitely not sufficient, and it can even be counterproductive in this case. The reason for this, Julia Galef argues, is that maybe the Republicans in this particular case were lacking intellectual honesty: they got better at reaching their own goals, and that's what a lot of scientific education teaches in practice — maybe it's not what it should teach, but in practice that's what it teaches. She argues that if you're more capable because of education, you're also more capable of lying to yourself, of self-deception, and if you want to prevent that, you need, in a sense, intellectual honesty.
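[Editor's note] Before continuing with intellectual honesty, here is a toy sketch of the "gradient of the reward with respect to the reward system" idea mentioned a moment ago. It is only an illustration under assumptions of my own: the health features, the weights and the linear bonus are all hypothetical, not part of the paper.

```python
import numpy as np

# Hypothetical sketch: Bob adds to Charlie's reward a bonus that depends on
# measurable "health" features of the reward system (e.g. data reliability,
# fraction of known gaps patched), and reports the gradient of that bonus so
# Alice can see which repairs would raise her future rewards.
def bob_signal(charlie_reward: float, health: np.ndarray, weights: np.ndarray):
    bonus = float(weights @ health)
    gradient = weights  # derivative of the (linear) bonus w.r.t. the health features
    return charlie_reward + bonus, gradient

health = np.array([0.9, 0.2])    # [data reliability, fraction of known gaps patched]
weights = np.array([0.5, 2.0])   # Bob weighs gap-patching heavily
reward, grad = bob_signal(1.0, health, weights)
print(reward, grad)  # the large second component tells Alice: patching gaps pays off
```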
Intellectual honesty means making sure that the data you collect are actually unbiased, making sure that the data will be reliable — that's Erin's job. It means making sure that the world-model inference you do from the data you've gathered is done in an unbiased manner, in a good manner, close to Bayesian — that's Dave's job. And it means making sure that whatever you try to do afterwards is motivated by your true volition and your true preferences, and not by the thing you want right now because you're a bit tired and will regret it later on. So in the end, intellectual honesty seems to be exactly the problem that Bob is trying to solve, namely making sure that the whole reward system is performing well. And to achieve intellectual honesty there's essentially only one way, which is giving yourself the right incentives, and there are two kinds of incentives, two kinds of rewards, you can give yourself. One of them is social: you can surround yourself with people who value intellectual honesty, and I think this helps hugely. The other one is internal rewards: making sure that you reward yourself whenever you access better data, even if it's bad news, and whenever you make better reasoning, even if it leads to conclusions you don't like. We had a bit of a problem during the recording; what I meant to say is that you also reward yourself for acting according to your true preferences, your true volition — the things that you really, deeply prefer — and not things that you would regret later on. So I guess this is part of intellectual honesty, and to get there you really need to make sure that all the components, Erin, Dave and Charlie, are performing as they are supposed to, and Bob's job is to give the right rewards in order to get there.

So Bob makes sure that the reward system is functioning well — but how can he score the performance of the reward system? Can he score, for instance, the performance of the data collection done by Erin?

I think this is a really critical question, and the key to it is certainly going to be data certification: making sure that the data are certified as they should be, in particular using cryptographic signatures. With signatures you can, to a large extent, guarantee the traceability of the data and make sure that a piece of data comes from a trusted authority. The reason why we trust some text is because it's signed, and I think the same thing must be done for videos as well: they should be cryptographically signed. So I think having a whole pipeline of certification and signatures from trusted entities is really important. If you think about the data on the coronavirus, it's mostly because it's signed, in this sense, that I trust Wikipedia or the WHO, the World Health Organization. But the problem of data collection is much harder than this: you have all sorts of biases and things like that. Somehow we humans are somewhat able to judge how biased things are; I think we need to reflect on how we do this, and make sure that algorithms do it well too. And then there are the other parts of the pipeline, for instance Dave, who does world-model inference: you need to be able to assess how good he is at it. One way we do this, at least for humans, is to give them tests, or to ask them to explain how they changed their minds. I think this is also a problem that is posed for humans: how do you make sure that a given individual has a good world model? Especially for controversial topics, this becomes very important and tricky.
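[Editor's note] As an aside, here is a minimal sketch of the kind of cryptographic signing mentioned above, using the Python `cryptography` package with Ed25519 keys. The "trusted authority" and the data payload are placeholders; the point is only that a recipient holding the authority's public key can check that the data were not tampered with.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# A trusted authority (e.g. a health organization) signs the data it publishes.
authority_key = Ed25519PrivateKey.generate()
data = b"confirmed cases per country: ..."   # placeholder payload
signature = authority_key.sign(data)

# Erin (or anyone else) can verify the data against the authority's public key.
public_key = authority_key.public_key()
try:
    public_key.verify(signature, data)
    print("data accepted: signature matches the trusted authority")
except InvalidSignature:
    print("data rejected: tampered with, or signed by someone else")
```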
So you mean: when humans agree on the data that they have observed but still don't agree on what they infer about the world, how will they discuss with one another? And if you are an external observer and you see how they make their inferences, how do you determine which one is doing the best inference?

I think this is a difficult problem. There's one trivial answer, I guess, which is to just redo all the computations they were doing, or apply Bayesian reasoning to everything yourself, but it would not be desirable for Bob to do this, because essentially he would have to redo the whole pipeline himself, and then he would himself become a liability, and so on. So I think it's important to have algorithms that are able to assess whether other algorithms are doing good world-model inference without redoing everything. In some cases there are people we tend to trust because we see them changing their minds, explaining themselves — displaying intellectual honesty, typically. These are ways to gain trust in a world-model inference that don't involve redoing all of its computations.

So you think that the research in machine learning on interpretability fits in here?

Yeah, I think in the end interpretability is going to be important, and I think it's even more crucial for Charlie. Charlie is trying to infer human preferences, and maybe even human volition. So imagine you have an algorithm in front of you — Charlie — and Alice tells you, "trust Charlie, it has your preferences." Well, you probably shouldn't trust it blindly, and what you're probably going to do is try to interact with it. There's a really nice paper about this, called Rebuild AI — we'll probably talk about it — where this kind of interpretability is really critical to gain trust in the system. But this works reasonably well for preferences; it's already harder for so-called social choice. Social choice means that the algorithm has taken my preferences into account, but does that actually affect its decisions? Its decisions should be affected by what everyone prefers, so it has to aggregate the different preferences, and here you have much the same problem as making sure that your vote has been taken into account in a presidential election, for instance. If you really think about it, that's not easy to verify: in the end you sort of trust that there are people doing their jobs and checking one another, but you don't know any of them, and in some countries that can be a bit of a problem — I'm not naming any country. And these cases are still relatively easy compared to the interpretability of an algorithm that computes your volition, because you probably don't even know your own volition that well, and maybe the algorithm is going to conclude that if you thought longer you would actually prefer this, even though right now you don't prefer it. It's like talking to your parents: your parents supposedly know you better, and maybe they do, but they still need to convince you that they know you better, and that can be very, very hard. So here interpretability is key, and it's going to be very difficult, because you need to convince humans of things that maybe they don't want to believe, and that can be very tricky. It's not a small part of the challenge.

Yeah, I 100% agree with that. So that's the overall picture. So maybe in this case, instead of trusting the result of the computation, or redoing the computation yourself, you can rather try to gain trust in the algorithm itself: what are the key steps of the algorithm that computed your volition?
And if, for instance, the algorithm explains that it took your inputs, your preferences, but that it has observed that people who share your preferences and who thought about it longer actually changed their minds — and that's why it anticipates that you would change your mind as well — then maybe it starts to be a bit convincing that the algorithm is actually making the right prediction, especially if it can give examples of people you know, or something like that.

Yeah, but this sounds like changes over a short period of time, and I also expect that a lot of mind changes are not in that category; sometimes everyone changes their mind at the same time. For example, compared to three months ago, I think a lot of people have changed their minds about how important it is to be prepared against pandemics. So if three months ago the algorithm had told us "you would all prefer that the world were more prepared against pandemics," when people were not very worried about it at all — in that case there isn't really any example of people who previously changed their minds, because everyone changed at the same time.

Yeah. And sometimes — if you take the example of educated Republicans who deny climate change — it's going to be very hard to change their minds even if you tune the YouTube algorithm well; to me it seems nearly impossible to change Republicans' minds just like that. Maybe the YouTube algorithm can rely on repeated exposure, but it still needs to do this very, very well, and in the process there's always the risk that the user stops trusting the algorithm and says, "I think this algorithm is just trying to manipulate me into changing my mind or my behavior."

Yeah, this just shows how difficult the problem is, and it has a lot to do, in this case in particular, with the interface between algorithms and humans. We already talked a lot about this, about psychology: I think psychology is critical for designing robustly beneficial algorithms, to see the impact of the algorithms on humans, but maybe also to draw inspiration from the humans who are particularly good at intellectual honesty when designing these algorithms — especially Bob.

I also see that psychology is very relevant for learning human preferences. We had this discussion about a previous paper on preference learning: if we don't know how irrational the agents we are observing are, then it's extremely hard to know what their true preferences are, because irrationality is precisely behavior that does not reveal your true preferences. A common example I like to mention is Kasparov playing chess: he plays imperfectly, so a naive, perfect chess engine could conclude that Kasparov does not want to win, because he keeps playing the "wrong" moves. But this is quite silly: if you know human psychology, you know that Kasparov is trying his best to find the right moves to play, and anyone could know this — it's quite obvious.

Well, I would guess that the algorithm could still infer that, even though he's not exactly trying to win, he's at least trying not to lose. Yeah, maybe not anymore — but that's clearly a problem for other questions, more controversial or more complicated ones, like social justice or climate change, where people sometimes advocate for ideas that are not clearly productive for their own goals, and an algorithm that tries to do something like inverse reinforcement learning on this may not conclude well; it may conclude that the human has preferences quite different from their actual preferences.
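[Editor's note] As an aside, one common way to handle this in the preference-learning literature is to assume the observed human is only noisily rational — for instance Boltzmann-rational: the probability of an action grows with its value to the human, but suboptimal actions still occur. Here is a tiny sketch of that idea; the two candidate goals and the numbers are made up purely for illustration.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Two hypotheses about what the player wants, with the value each hypothesis
# assigns to three available moves (made-up numbers).
action_values = {
    "wants to win":  np.array([1.0, 0.6, 0.1]),
    "wants to lose": np.array([0.1, 0.6, 1.0]),
}
beta = 3.0                 # rationality parameter: high beta = near-optimal play
observed_actions = [0, 1]  # the player picked move 0, then the imperfect move 1

# Posterior over hypotheses under a Boltzmann-rational action model.
posterior = {}
for hypothesis, values in action_values.items():
    likelihood = np.prod([softmax(beta * values)[a] for a in observed_actions])
    posterior[hypothesis] = likelihood
total = sum(posterior.values())
for hypothesis, p in posterior.items():
    print(hypothesis, round(p / total, 3))   # "wants to win" dominates despite the slip
```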
So yeah, that's one additional challenge, I guess. One last challenge, which is the last section before the conclusion of the paper, is the need for distributed algorithms. As I presented the roadmap, especially for pedagogical reasons, you would have only one Erin, one Dave, one Charlie, one Bob and one Alice, and I think it's easier to think about it that way. But in practice it's very dangerous to have only one of each, because each one is then a single point of failure: if Dave suddenly stops working, the rewards can no longer be computed based on a model of the world, so the rewards can be completely unaligned and the algorithm could become dangerous because of this. So you want to make sure that every component, especially of the reward system, is at least duplicated, and that all of this still works well — hopefully it's even resilient to attacks, because you have to imagine that there are going to be malicious people; the internet is full of them. So you have to make sure that each of the components is designed as a decentralized, so-called Byzantine-resilient system, which means it's a network such that even if parts of the network crash, get hacked or get blocked, the system as a whole still performs at least roughly as we want it to perform. That's the whole problem of decentralized machine learning, which is an additional problem on top.

Another topic you have talked about is robustness, and it applies to each of the components of the system.

Yes, robust statistics are critical at every step along the way. Robustness is going to be critical, and it's not easy, because we expect that there will be adversaries in the world, so the data collection could be poisoned. Even though we try to do good-quality data collection, there might still be some poisoned data coming in, so Dave's job would be to still compute a correct world model even though Erin has not provided perfectly clean data. We talked about this with the coronavirus: I guess there are, hopefully, not too many malicious users trying to hack the different trustworthy systems so that misinformation spreads on the big platforms. But consider, for instance, the Wikipedia page about the coronavirus outbreak. I haven't really looked at the figures, but from what I remember there were something like 10 contributors responsible for 40% of the page, and I'm guessing maybe 100 or 200 people at most are responsible for 90% or 95% of the page. So that's fairly distributed, I guess — it's not one individual, and that's better — but it's not that many people either. And you can imagine that someone who really wants to spread misinformation — maybe the coronavirus is not something people are sufficiently motivated to put misinformation into, but imagine the page on Google, or the page on Donald Trump, pages that are very controversial — there are going to be incentives to hack these pages. If there are only about 100 people maintaining them, and people are really, really motivated, then malicious users might eventually resort to illegal things like blackmail or threats to get a modification of Wikipedia, especially if the profiles of the Wikipedia contributors are unknown, and this would allow them to hack one of the most reliable sources of information that many people rely on.
So you're saying that if it's only 100 or 200 contributors, you would consider that not yet robust?

Well, I think it's actually quite robust. The problem is that if you have more people, then within the Wikipedia community there can already be malicious people, so you don't want to go too big, but you don't want to be too small either. The more contributors you accept, the more vulnerable you are to malicious users among the contributors; but the smaller you are, the more every individual is a single point of failure. So there's a trade-off. I think Wikipedia is performing very well, especially these days, but it's not a given that it will always perform as well as it does. I would still predict that it's going to be fine, but if there are increased tensions throughout the world, the internet can become even more adversarial, with more and more people pushing for their own ideas, and in that case — well, I hope Wikipedia will never fall, because it's arguably the most reliable source of information we have in the world these days, and I quite believe it will be resilient to this, but it's something to ponder. Especially if you think of the time before the advent of Wikipedia: it wasn't a given that something like Wikipedia would exist. It's the same thing for the World Health Organization: at some point, some people wanted to create the World Health Organization because they wanted better global health and also to fight potential pandemics, but it was not a given; you needed people who were motivated to do this. And I've heard through back channels — the WHO is not that far from us, it's in Geneva — that financially they were not doing that well last year. I'm guessing they'll do better this year because of the coronavirus, but the resilience of such structures is also important and something to think about.

So you think we could take inspiration from how Wikipedia gets humans to collaborate? Even though the people who participate in Wikipedia still disagree on a lot of issues, the product aggregated from them is somehow very reliable, neutral and not full of fake data. So we could take inspiration from how Wikipedia stays quite robust to all these adversaries trying to manipulate the platform, and still produces a good result, and do similar things for each of the components of the roadmap?

Yeah, I think it's something to draw inspiration from.

One case would be a distributed Dave: many agents, many servers, all trying to infer a world model from the data collected by Erin, but maybe no individual server gets to see all the data — they each see only part of it — so they would compute slightly different, or very different, world models, and then we would still need to aggregate these world models to decide what to pass on to Charlie.

Yeah, so that's actually a paper I'm currently working on: trying to do decentralized learning in the presence of malicious users, malicious nodes — participants who can behave adversarially within the system. It's a difficult challenge. The remarkable thing is the blockchain: the blockchain, and especially Bitcoin, has produced a remarkably resilient system. If you think about Bitcoin, it has never been down in over a decade of operation, and it's quite unique in that regard, and I think there is inspiration to be drawn from this.
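[Editor's note] As an aside, here is a minimal sketch of the kind of robust aggregation rule used in Byzantine-resilient distributed learning, the topic mentioned just above. The coordinate-wise median is only one of several such rules, and the setup (a handful of workers, one of them malicious) is entirely made up for illustration; this is not code from the paper.

```python
import numpy as np

# Each worker proposes an update (e.g. a gradient or a piece of a world model).
# One worker is malicious and sends an extreme vector to derail the average.
honest_updates = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.9, 2.1])]
malicious_update = np.array([100.0, -100.0])
all_updates = np.stack(honest_updates + [malicious_update])

naive = all_updates.mean(axis=0)           # dragged far away by the attacker
robust = np.median(all_updates, axis=0)    # stays close to the honest updates
print("mean:  ", naive)
print("median:", robust)
```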
But on the other hand, Bitcoin has a lot of flaws: it requires an extreme amount of computation, it's not that fast, and so on, and there's still a lot of research to be done and lots of improvements to make. For instance, there's a professor working here who worked on something called AT2, where you can get a lot of what a blockchain provides differently. So I think there are ideas there, and I'm currently working on this: trying to do these things for machine learning algorithms as well, trying to design them in a robust manner. But you also still want performance out of it, because nobody is going to implement something that's extremely costly energetically, and that's a big challenge as well.

Cool. So, it's been a bit more than a year since I first thought about these ideas, and I'm happy to say that I still think they are not completely useless. It's always a challenge to see how such ideas grow old, but I think they are useful.

Has there been similar work, but taking another direction than what you've done?

I don't know of any alternatives. I think the ideas are quite reasonable, sensible and natural solutions. So yeah, I hope this can help people think better about the problem of alignment, and also realize that it's very difficult. I think this is kind of a theme of this podcast: AI safety is not easy — it's really, really not easy. There are lots of potential pitfalls and risks, the technical challenges are extremely hard, and you need a lot of brilliant people working on them. What I also tried to do in this paper was to show that there are interesting problems here, problems that I think are interesting for their own sake. The work I'm doing these days on distributed Byzantine-resilient learning is also mathematically really quite nice — I hadn't done such beautiful mathematics in a long time, and I was very happy to do it. And I think there are lots of problems like this in the roadmap: if you think about them, there are ways to pose them as really nice problems. I hope we're also doing this in the podcast — in every episode we try to suggest research ideas — and hopefully this can inspire people to tackle some of them.

Very good. So I hope you enjoyed this podcast, and I hope we'll see you next time. Bye!