Cool, so welcome to our panel discussion today about the Legend, Morphir, and Bosque integration. I think you've heard a lot about it already in Gab's intro remarks and also in the one from John Madsen today. I'm Bica, a vice president in data engineering at Goldman Sachs and a product manager for the Legend stack. I have the pleasure today of moderating an awesome panel of amazing people and technologists: Mark from Microsoft, Pierre from Goldman, and Steven from Morgan Stanley. Without further ado, let me hand it off to Pierre to start introducing himself.

Hi, I'm Pierre De Belen. I manage the platform team within data engineering at Goldman Sachs. I've been at Goldman for 15 years, four of them as managing director. During this time I moved across different areas of the business, from compliance technology to operations technology to core engineering. But while I was doing that I always worked on data, and on how to make sure people in the firm can access and retrieve data more easily. Moving around the organization gave me a really good way to understand the pain that people have, the different use cases and problems they face, whether they're on the operations side, in the front office, in the back office, or in compliance. That helped refine the platform that we built over time. One of the highlights of my career was last year, when we decided to open source the platform and contribute the code to the FINOS Foundation. So that's me.

Awesome. Thanks, Pierre. Steven, why don't you introduce yourself next?

Hi, Steven Goldbaum from Morgan Stanley. I have a similar trajectory.
I have been in a lot of different positions and a lot of different business units across Morgan Stanley, and even before then, so a very similar experience, a little bit more on the application side in terms of dealing with the business users: giving the business users confidence that we're programming what they expect us to program and that the application is doing what they expect it to do. So I got a lot of experience across traders, across data, and with people who need to manage data. That all culminated in Morphir in many ways, not just through myself but through the experience of the rest of the team as well. And again, very similar to Pierre, a highlight was when we open sourced it and contributed it to FINOS.

Great. Thanks, Steven. And then Mark, last but not least, do you want to introduce yourself real quick?

Yeah, my name is Mark Marron. I'm from Microsoft Research. Slightly differently from the first two panelists, I have been on the academic track for my entire career, in the compiler and runtime space.
I've done a lot of work on optimization, garbage collection, debugging, and program analysis tools. Recently I got really interested in digging into programming language design and verification methodologies. Really, the crux of the problem is how to make sure that your program is actually doing what you think it's doing: verifying that that's the case or, if it's not, which sometimes happens with developers, coming up with an input that shows you why your program has a bug and allowing you to fix it quickly. So that's really what I was interested in doing, and it aligned really well with the concerns of both Morphir and Legend. It was really exciting to be able to come and work in an industry where they really care about correctness and reliability, and to work with some great people on some really cool technology. So I was really excited to be able to come and participate in this.

Great. So all of you have already hinted a little bit at the various projects that you either work on or brought into the FINOS community. Let me start by asking Pierre to tell us a little bit more about the Legend project.

Yeah. Well, it's going to be hard for me to do a presentation after the great presentation that you guys did before, explaining the system and why it's cool. So I'm just going to speak a little bit about the name, because Legend is kind of a tricky name. It sounds pretentious in some ways. And each time there is a legend of Legend,
I hear a lot of weird stuff. So here is why we took Legend. As you may have seen in the previous presentation, the core and the purpose of the system is to define more semantics about data: to build a map, a cartography of information, that helps data produced in different parts of the organization be brought together, breaking down the silos and bringing out the relationships between the pieces of information that we have. We found that having this map of information, which everyone can contribute to, helps with finding, retrieving, and navigating information, for sure. It also improves data quality, as we saw a little bit earlier, and removes duplication, because you have one central place where you can find information. So Legend is meant to evoke this kind of description of information that helps people understand the data they're working with, the same way a legend helps people understand the symbols and notation of a map or a chart. The product is this kind of guide that helps you understand what's happening in the context of your organization.

Yeah, very interesting. Could you tell us a little bit more about the Pure language and its capabilities in more detail?

Yeah. Because the system is really meant for collaborating on the definition of data and on the way people work with data, when we came to Pure we had to look for a type system that helps communication. So what we did is take the UML language, because it has a graphical representation that really helps people collaborate and work with each other. But with UML, as much as you can define constraints with OCL (which is a possibility of the language, though not really used a lot), we found it was not really possible to define data derivations, data queries, or
data transformations. So we built our own language on top of UML to address these concerns. Obviously you can refine the semantics of the model with constraints, the same as with OCL. But you can also define derivations, and when we speak about derivation, it's how to build data on top of data: how can I create a little bit more information out of information that has already been modeled in the system? We can also define data transformations, for when people want to build one model and transform it into another, all using the same language. And you can do queries: quickly find information in your model, and filter and retrieve it really fast.

So that's the why: why we built the language and did our own thing. But the cool part, the mind-bending part (and everyone makes fun of me when I explain it), is that we build our model using the model itself. Pure, the language, is defined using Pure itself, in a recursive fashion. So it's metaprogramming that we have in the environment. And why is that important? Because Pure is modeled in itself, you can ask anything in the system what it is, and the system will tell you it is built this way, up to a recursion point where things are built with themselves. That gives us a fully reflective language, which can be used to transform the language you express information in into other languages, let's say SQL or GraphQL or Protobuf or other things. So this capability of modeling the language with itself helps transform the language into other languages and start to enable execution on the different parts of the systems we work with in our organizations.

Sounds very interesting again.
Thank you, Pierre, for giving us a better overview of Legend and the Pure language. So Steven, handing it over to you to explain a little bit more what Morphir is and how it relates to data [unclear].

Sure. So Morphir is a project centered around sharing business logic. What I mean by business logic is calculations, rules, anything that the business thinks of when they think of an application. They're not thinking about execution and data flow and stuff like that. They're thinking: this is what the application should do. So that's what I mean by business logic.

The natural question is, well, why would we want to share business logic? I think probably the most obvious answer is risk. Every time we have to write business logic in code, we're creating risk: risk that we're doing it wrong, that we've missed something, or that we're rewriting a bit that is less well known. So every time we touch the business logic, there is risk. When we talk about sharing, we're talking about sharing it across systems, but also across future technologies, so that if we do want to migrate technologies, we don't have to touch the business logic and take that risk. We can maybe code-generate to those technologies and make it execute that way.

And then we're also talking about sharing it across people. One of the things we want to do is increase the knowledge of what the application is supposed to do and what it is actually doing in production. So we're talking about visualization and anything we can do to make the users comfortable and understand what the application is doing and why it's coming to the conclusions that it's coming to.

So the question might be: how are you doing that? You'll probably hear us talk a lot about an IR, which is an intermediate representation. That came up when we were thinking about how to go about doing this.
We were thinking: well, we want to put logic in a data format so that we can share it. We can create that ourselves, or we can look to computer science, where they've done this for decades and have mastered it. So we should probably use that instead. When we talk about an IR, that's what we're talking about: a data format for logic.

And then finally, I want to say that Morphir is focused on integration. It doesn't define a language itself. It's happy to use other languages in the front end. We use an existing language, we've now got Bosque, and we've got Pure, and we're happy to have any other languages map into that IR, just as long as we're saving that business logic. We also want to integrate on the other side: we're not going to dictate what the execution platform is. We want to support different execution environments and paradigms. Really, the goal of Morphir here is to integrate all that and give it a single target, so that we're not writing all kinds of tools across all kinds of different languages. We've centralized on something.

Great. Thank you for setting the stage on further integration that we can explore. So Mark, last but not least again, tell us more about Bosque and what it can do for quality assurance.

Yeah.
Yeah, so the Bosque language project started out as just a broad question. I work in a group that has some really excellent people who have done amazing work in theorem proving, program verification, abstract interpretation, model checking, everything you could imagine about compilers, everything you can imagine about trying to reason about what a program does, whether it's to make it go faster, to find bugs, to get better memory usage, all sorts of things. It was really interesting over the years talking with people when they tried to write these techniques. You had a beautiful core that really made sense, and we were like, this should really work well. Then we'd go and try to apply it to JVM bytecode, the intermediate representation for Java, or CLR bytecode, the intermediate representation for C#, and things would suddenly bog down. You weren't getting as great results as you would really expect or want.

As we talked with people, the same sorts of things seemed to come up over and over again that were causing these problems and preventing us from really getting the kind of great developer tooling and checking results that we wanted. It stemmed from the fact that these intermediate representations were designed in a world where you were going from a source language like C# and compiling down and down and down until you got to assembly code. So they were really designed to make it easy to take these desugaring steps from a source language down to an intermediate representation that was easy to optimize for x86 or ARM or anything else. And now we were trying to build tools on top of a platform that wasn't really designed for it. So there were a bunch of things that make sense if you're writing an optimizing compiler that don't make sense if you're trying to do program
verification. So the question we had, and this was really a research question at that time, is: is there a better way to do this? If we went and designed this intermediate representation for developer tooling rather than for compilation, could we make different choices? Could we build a different representation, one that actually supported the needs of these tools much better? That was the first part: reimagining this particular component and what it should look like for that purpose.

The second part was: suppose you built this intermediate representation and you could do amazing analysis on it. Can you then connect it back to a surface programming language that developers would want to use, or is it just so hideous and painful that no one would ever write code that targeted it? So we had two components to this project: what does this intermediate representation look like, and how do you build a programming language that targets it and is designed to make verification and checking easy? We spent a couple of years working on this. It was great, again, working with the Morphir folks. We bounced some ideas off them.
We tried various iterations. We had some thoughts that were good ideas and some thoughts that were bad ideas. As it was research, there was a fair bit of experimentation. But in the end we've really come out with something that I'm very excited about, in terms of the IR that we have and how we've been able to integrate it with Morphir, and then with Legend as it goes through the Morphir stack. That allows us to do some amazing things in terms of program reasoning, as we'll see in this demo. It really allows you to have assurance that your software is doing what you expect it to do, and to actually prove that that is the case. I really liked the statement in the intro that in this domain getting the correct answer is important, and having confidence in your answer is important. That's what we really want to enable: not just confidence because you passed unit tests, or confidence because you haven't had a bug report, but confidence because you have a principled logical theory about why an error doesn't occur or why your application does what you think it does.

Awesome. Great, thank you, Mark, for telling us more about Bosque here. And just real quick, because I cannot wait to see the action happening in the demo in a few minutes: Steven, I'm curious to learn more about how this collaboration between the three of you came about. Could you tell us a little bit more about that?

Yeah, it really came directly out of open source.
So Morphir and Legend and Pure were contributed to FINOS at roughly the same time. We immediately saw the synergies and were pretty eager to work with each other, and FINOS made that very easy to do, where normally it would not have been easy to do at all. Then there's the fact that both of these applications are completely open source, so we can all go in and look at everybody else's code. And it's the same story with Bosque. As soon as we saw the open source announcement from Microsoft and saw the synergies, we contacted Mark. Because that is purely open source and our stuff was purely open source, that just made the whole thing work very easily.

Pierre, anything you want to add to that?

Yeah, FINOS is definitely a big player here in helping us work together. It's really cool to be able to work across banks and Microsoft in a totally open fashion. One other thing is that when we started to speak to each other, we found we are really like-minded in the way we think about how to define information, how to define models, and how to use an intermediate representation to communicate and work on code. The complementarity was really interesting too, because as much as we really focus on data on the Legend side, Morgan and Morphir are really focusing on business logic and the integration of more business representation. So it was really clear that we could make them work together to have a complete offering that can address cross-compilation and cross-execution on different targets, including data and business logic together.

Cool. I don't know about you, but I want to see this integration in action now. So at this time, if you don't mind playing the video.

Hi. Welcome to the demo for the Legend, Morphir, and Bosque integration project, where we enable compile-time theorem-proving feedback and logic visualization for Legend Pure functions,
powered by Morgan Stanley Morphir and Microsoft Bosque. My name is Stanley Zhang. I'm a software engineer at Goldman Sachs on the Legend team, which developed the integration.

In the first part of the demo, we would like to show you the metamodel diagram of Morphir IR's type system, modeled and created in Legend. Here, IR stands for intermediate representation. For Morphir, the IR is in JSON format, which is defined and published as an Elm language package. This was an essential step of the integration project because it enables us to generate Morphir IR from Legend Pure code, which is needed to visualize functions and receive feedback on the function logic from Bosque. Coming back to the diagram: each node is a Legend Pure model or class and represents a type in Morphir IR's Elm model. Each arrow represents inheritance, and each line represents an attribute relationship. This diagram is therefore a clear picture of the structure of any target Morphir IR into which we transform Legend Pure objects.

Pierre, do you want to comment on what we've just seen?

Yeah. So it illustrates a lot of the metamodeling capabilities that I was speaking about before. The first thing we do when we integrate with a technology in Legend is to model the technology itself. The same way we expect people to model business concepts, trades and others, at the technology level we have exactly the same strategy of modeling the technology, what it represents and what it is. Another part which is really interesting here is that we use the transformation language to go from an IR to an IR.
So the same way we have the Morphir IR, which we saw just before, we have our own IR, and we have a transformation language that can say: OK, how can I go from one to the other? The language is really terse, actually. The code that you need to perform that is fairly light behind the scenes. One thing I would add: as much as we have Morphir, which you just saw, if you go into our GitHub code you will see the same thing for Java, and you will see the same thing for GraphQL or SQL. So eventually the project will end up having a repository of technologies expressed in a uniform fashion, so that you can compare them or quickly understand them, because they're all expressed in the same language and in the same fashion.

Great. So now let's look at the second part of the demo video, where we introduce Morphir.

For the second part of the demo, we'd like to show you how we generate Morphir IR from a Legend function and visualize its logical paths. Let's start by creating the function in Legend. Please note that I'm using a local deployment of the integration stack, built from code available publicly on GitHub. We're also working on deploying this into the FINOS environment. For clarity, I created a new Legend project and workspace ahead of the demo to save us some time.

In the new workspace, let's define a function representing decision-making logic that helps a rental company anticipate how many items they will be able to rent out. We can build this in the user-friendly form mode or in text mode. Let's toggle to text mode to write down the function. Within the function, we define the local variable maximum allowed, which is decided based on the ratio between the requested amount and the availability. We then decide the final rented amount based on the maximum allowed variable. The logic is quite easy to follow looking at the function described in Legend Studio. However, if users wish to visualize the
function and the logic, there's now an opportunity to do so with the integration of Legend and the Morphir visualizer. Let's click on the generation dropdown and select the Morphir type. This allows us to generate valid Morphir IR JSON code from the function previously defined in Pure. You can see the result displayed here on the right-hand side. Now we click on the Visualize Generated IR button. We're taken to the Morphir IR visualizer, where we can view the type structure and the logic of the function.

Let's explore by first setting availability to zero. We see the eventual outcome will be zero regardless, due to the first conditional check. Now if we change availability to 60, we see the ratio between requests and availability is below 0.5, and thus we return the amount of requests as the outcome. And if we change availability to 40, maximum allowed becomes half the availability, and depending on whether we allow partial rentals, we return the maximum allowed amount, or zero since we can't fulfill the requests. As we can see, Morphir enables easy visualization of not only the function but also the logical paths that can be taken with different inputs.

Steven, do you want to comment on what we've just seen?

Yeah, so this is a great example of what we were talking about in terms of confidence. This is a custom-built UI, and the goal of this UI was to shorten the feedback loop between IT and the business experts. This was generated completely, so there was no extra coding to make this happen. You can imagine a technologist sitting with the business expert while they're describing the logic, and the coder is coding it down, displaying it in real time, and saying: is this what you meant?
And let's try out some values and see if this is actually what you meant. This has proven to be useful, not only in setting up and writing the code, but also in auditing the code afterwards. If the system comes to a result and the users wonder, how did we get there, why did we calculate that result, we can go back, introspect, and use these same kinds of tools to find out why the application made a decision at any point in time.

Cool, great. So I think there's one last part of the demo video, where we now introduce Bosque.

In the final part of the demo, let's see how the integration with Bosque enables theorem-proving feedback for Legend functions. Let's start by typing in a function with an issue. This time we create the function using the user-friendly form mode. Will Bosque be able to help us identify what the issue is? Again, we generate the Morphir IR. This time we click on the View Bosque Feedback button, and we're taken to a web app called Linta that we created to display Bosque theorem-proving feedback. On the right-hand side, we have the generated Morphir IR. In the upper left corner, we have the corresponding Pure source code from which we generated the IR. And in the lower left corner, we have the feedback from Bosque, which indicates a zero-division error in our function. Wow, now we know the issue. We can also see that in between the lines of the Pure code, we highlight the code section with the error.

To give a better glimpse of what we can do with Bosque feedback, let's redefine the function, now with implicit zero-divisions. First, let's create a function with a problematic expression to be evaluated. Again, we see the highlighted code with the potential zero-division detected by Bosque. Let's also try adding one other variable and placing it in the divisor. We can see that Bosque gives feedback on the potential zero-division for the variable. Now, as a responsible Legend modeler, knowing we have a potential issue, let's fix the zero-division. Let's redefine the
function with a conditional at first, checking whether the divisor is greater than zero. Now, if we ask Bosque what it thinks, we see there's no problem found. Hooray!

Just to close the loop, let's see how we can use Bosque with the previously defined rental example, but with an issue in the function. We start with the exact same rental function defined previously, but remove the zero check for the divisor. Let's check out the Bosque feedback. We see that a potential zero-division is detected and highlighted in our Pure code.

That concludes all three parts of our demo. As we can see, Legend Pure just became even more powerful with the integration with Morphir and Bosque, which enables easy visualization of the logical paths and compile-time theorem-proving feedback. Thank you for your attention.

So I wanted to give Mark one more chance to comment on what we've just seen.

Sure. Well, that's dangerous, because this is where I get really excited. I can talk a lot about that.

Yeah, exactly. You have about 30 seconds.

The quick one, maybe 45. So yeah, this is really cool. What we do, just at a very high level, is take the semantics, or the meaning, of the Morphir IR and convert it into a formula in first-order logic. Then we use Z3, a state-of-the-art theorem prover developed at Microsoft Research, to look at this logical representation of the program and build a proof that that div-by-zero error can never happen. Now, you might be wondering, isn't this kind of like using a sledgehammer to crack a walnut?
If it's div by zero, there are other techniques you can use that are less heavyweight than this theorem prover. But I think the rental example is a great one, because this technique works not only for standard runtime errors like div by zero or invalid cast exceptions. If you start wanting to assert the correctness of your program and add user-defined assertions, as in the rental example, I might want to say that if I have a rental order come in, my response will only be accept if I have enough surfboards, or whatever, in my inventory to fulfill that rental order. With that user-defined check, we can still encode it in first-order logic, we can still put it through the theorem prover, and we can still guarantee that your code implements it correctly and that the assertion will never be violated. So this really allows you to start thinking about what you mean in your code, and then verifying that that actually holds or, as we saw before, finding what the bug is if there is one, so you can quickly fix it. You have confidence that not only did you do code reviews and unit tests, but you really got this deep, fundamental assurance that your software does what you expect.

Yeah, great. So to wrap up the panel discussion today, maybe in one sentence from each of you: how do you see this collaboration going on in the future? Steven?

I see this as part of an ecosystem that will just keep building and growing.

Mark?

For me, well, we have a lot of engineering work to do to make sure things work beautifully all the way up. Like I mentioned, this started as an academic project. We're really excited that it works, so we're switching to engineering mode, and it's time for me to fix bugs.

And Pierre?
Yeah, we'll fix bugs too. But what's really interesting for me is that many times when we deployed Legend at Goldman in order to source data, people came and asked: what about writing more business logic, in the context of some regulation and things like that? I see this as an opportunity to start writing that business logic with all the power of theorem proving and all the power of the Morphir capabilities, using code paths to help users go beyond fetching data and actually write business logic in production. So I'm really excited about that.

Yeah, me too. Awesome. Then thank you, all of you, for joining us, and of course thank you to the audience.
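
For readers who want something concrete to go with the demo narration, here is a rough Python sketch of the rental logic and of the kind of user-defined assertion Mark describes. It is not the Pure function shown on screen, which the transcript does not capture. The names `rented_amount` and `find_counterexample`, the use of integer halving, and the rule that the full request is allowed below the 0.5 ratio are all assumptions. The check is a bounded brute-force search over small inputs, which is a much weaker stand-in for the proof that Bosque builds with Z3.

```python
from itertools import product


def rented_amount(availability: int, requested: int, allow_partial: bool) -> int:
    """Decide how many items to rent out (reconstructed from the narration)."""
    # Zero check on the divisor; the last part of the demo removes it
    # to provoke the zero-division feedback.
    if availability <= 0:
        return 0
    ratio = requested / availability  # safe here because availability > 0
    # Assumption: below the 0.5 threshold the whole request is allowed,
    # otherwise at most half of the availability (integer halving).
    maximum_allowed = requested if ratio < 0.5 else availability // 2
    if requested <= maximum_allowed:
        return requested
    # The request exceeds the cap: rent the cap if partial rentals are allowed.
    return maximum_allowed if allow_partial else 0


def find_counterexample(max_value: int = 50):
    """Bounded exhaustive check of a user-defined assertion (not a proof)."""
    values = range(max_value + 1)
    for availability, requested, allow_partial in product(values, values, (False, True)):
        result = rented_amount(availability, requested, allow_partial)
        # Assertion: never rent a negative amount, more than was requested,
        # or more than is available.
        if not (0 <= result <= requested and result <= availability):
            return availability, requested, allow_partial, result
    return None  # no violation found within the bound
```

With a hypothetical request of 25 items, `rented_amount(0, 25, True)`, `rented_amount(60, 25, True)`, and `rented_amount(40, 25, True)` walk through the three paths described in the visualizer part of the demo. If the first guard is deleted, calling the function with an availability of zero raises `ZeroDivisionError` at run time, which is the class of error the Bosque feedback reports before the code ever runs.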