Okay, so now it's 10 o'clock and we can start with the next session. We have Elias Mistler, who is going to join us and give a talk about how to write multi-paradigm code without making a mess. Elias is a principal machine learning engineer from Previz. I hope I pronounced that correctly. He is doing a lot of machine learning and he is going to talk about using both object orientation and functional programming together and how to make those two play nicely. So let's head on to Elias. Elias, can you please start sharing your screen?

Of course, of course. One second. Right, right. Yeah, yeah.

Excellent, I'll mute.

Okay, well, thank you Mark for the introduction and thank you everyone for joining what is now the first talk here. So, as Mark already introduced, this is about multi-paradigm programming in Python. I've got an introduction here. As already said, Previz, the company I work with, is an invoice finance company, and we use machine learning on large corporate datasets to predict whether invoices will be paid in the future, and then to finance invoices and improve the cash flow of small and medium enterprises. I myself am a principal machine learning engineer, and my main responsibility is integrating our machine learning algorithms into our invoice processing platform, but I also do all sorts of other data integration pieces and operational tooling around the company, just making sure everything works front to back. From these integration approaches, I work with different people with very different mindsets, working in different paradigms. We've got Clojure engineers who are very much on the functional programming side, and I picked up a lot of things from them that I found really useful, but obviously there's a lot of object orientation going on in Python as well. I just want to share some of my learnings from bringing those together and what I found particularly useful.
There are four main things that I want to go into. They all overlap a little bit, because you can't fully separate them from each other, but they are: code structure, then data structures, how we deal with state handling, and how we deal with multiple implementations for the same concept, the same problem. As some introduction, and we've basically said it to a degree already: Python itself is a multi-paradigm language. I mentioned Clojure on the functional programming side; Java is a good example on the fully object-oriented side. Python is very pragmatic, very much in between. It doesn't dogmatically stick to one side but brings it all together and lets you decide. An important point that I want to make before we get started is that those two paradigms are concepts. They're not a matter of syntax. So just because something is written inside of a class doesn't mean it's properly object-oriented, and just because it's written in a standalone function, or even a procedure, definitely does not mean it's functional programming.

So, just quickly trying to catch everyone up on the principles. Object orientation, from my point of view, revolves around the idea of mutable data structures, that is, things that we change in place, that have a state that we change. It typically deals with that using a rich type system of classes that are interconnected, in particular in class hierarchies, and the principles of object orientation, which are inheritance, abstraction, encapsulation and polymorphism. As much as object orientation is about mutable data structures, functional programming is about immutable data structures, and it typically relies on simple data types and uses pure functions. A pure function is a function that has no side effects, uses no global variables, is encapsulated in itself and is idempotent.
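As a minimal sketch of the difference between an impure and a pure function (all names here are illustrative and not taken from the talk's code):

```python
# Shared mutable state that lives outside the functions.
counter = {"calls": 0}


def impure_increment(x):
    """Reads and changes state outside itself, so results drift between calls."""
    counter["calls"] += 1
    return x + counter["calls"]


def pure_increment(x, step=1):
    """The result depends only on the arguments; nothing outside is touched."""
    return x + step


first = impure_increment(10)   # uses counter["calls"] == 1
second = impure_increment(10)  # same input, but the hidden state has moved on

same_a = pure_increment(10)
same_b = pure_increment(10)    # same input, same output, every time
```

The impure version returns different values for the same input because it depends on `counter`; the pure version can be called in any order, any number of times.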
So if you call the same function with the same inputs, you can always expect to get the same outputs, no matter the state of any other part of the system. That's the core principle there, and often these things are evaluated lazily, so you can nest your function calls before they're even executed. Those are the two sides of the principle. Because this can be a bit dry, I thought I'd use an example running through the presentation. I've chosen Sudoku, which is a sort of number crossword riddle, if you like. It's a nine by nine grid with numbers from one to nine, and each row, column or three by three block, so each of these here, should contain the numbers one through nine. What you get here on the left side is typically the problem that you get, and then you start filling in numbers until you've solved the whole thing, and that is the solution to the Sudoku.

Now, with that as a background, let's dive right in. The first thing I want to talk about is code structure, and the way I'm going to do this is to compare the OO and the FP side to one another and then find a sensible middle ground, maybe try to take the best of both. We're going to start from this string here, which is a Sudoku definition as per this Open Sudoku website. Just to be clear how we understand this: the first nine digits are the first row of the Sudoku riddle, the second nine the second row, and so forth, so this fully describes the original state. I'm just going to run this, because all the code in this thing is run live. I do have some more code files in the background, just to keep the implementations short and narrow here, but that's all shared along with the slides in its own repo, so you can have a play around with it yourself later on if you like.
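To make the string layout concrete, here is a small sketch. The puzzle string is an illustrative one, not the one shown on the slides, and using `0` for an empty square is an assumption about the format.

```python
# An 81-character puzzle definition, written row by row for readability.
# Assumption: "0" marks an empty square.
raw = (
    "530070000"
    "600195000"
    "098000060"
    "800060003"
    "400803001"
    "700020006"
    "060000280"
    "000419005"
    "000080079"
)

# Characters 0-8 are the first row, 9-17 the second row, and so forth.
rows = [raw[start:start + 9] for start in range(0, 81, 9)]

for row in rows:
    print(row)
```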
Okay, without further ado: one OO way to implement this would be a factory function. I'm going to create the Sudoku class here, and for now I'm just going to put a grid on there, an array, which is the nine by nine grid. We're going to change that later. Then I'm going to add this from string function as a static method, or as a class method, onto this. So I'm explicitly putting this function onto this class in an object-oriented way, and then I can use it like this with the example that we've just seen, and what I'm getting is obviously an instance of a Sudoku where the grid is this nine by nine array. This is fairly explicit and high context, which makes it very easy to find and use. Everyone who's seen this kind of thing knows how to use it and how to even start, because most IDEs allow you to just type Sudoku dot and you find that functionality.

Now, in the functional world, how would you do it? You would actually isolate that function and write something like this. I'm going to go into the implementation a little bit more just after this, but it just stands on its own in an idempotent way, and you can use it by just applying it to your input. It gives you a grid, but it's just the grid; it doesn't know it's a Sudoku. We'll go into that a bit further down the line as well. So this is entirely free of any assumptions about the use case, and because of that it's really easy to reuse or generalize. I could parse completely different things from a Sudoku with that exact same function, because it doesn't know its context.

So how do we bring that together? A way that I like doing this is saying: we have this function here, and we keep the pure function with all of its benefits. In fact, I even went as far here as to say I'm going to generalize this to any square matrix and be able to parse any square matrix with it. But we can also create our Sudoku class and essentially just use this function on the class to have this high-context use case as well. So
what we end up with is both. It makes our code very tidy and reusable because it's nicely chunked up. It generalizes really well, as I've demonstrated here by just generalizing this function; it works in any context, it works for any user. And because you still have the high-context class, it's very easy to use and explore as well. So from my point of view this is a good way of bringing together the best of both worlds.

Okay, let's look at the implementation. You can probably guess it from the title: the object-oriented approach, or really it's more from a procedural world but following the mutable data principle, would be that you create an empty array at first for the outputs, then you iterate over the inputs that you have and append to your outputs for every digit that you have in the input. Works fine, but this quote here comes to mind for me: "I would have written it shorter, but I didn't have the time." That is because I find this very easy to write, but it can be a bit tedious to read and to reconstruct, looking at it, what the high-level intention actually was, what was meant to be done. And because you're fairly explicit about the variables and the appending, this can be quite error-prone. If you write this kind of thing a lot, you probably know what I mean.

But what's the alternative? That's a piece that I really like from the functional world, the idea of mapping, because what we really wanted to do there was take our inputs and apply this integer function to each element in the input sequence. A little disclaimer: int is actually a class, it just acts like a function, which kind of proves the point that Python is multi-paradigm, but I'm going to leave that aside for now. So what are we doing here? We're mapping. I'm just going to call int a function: we're mapping the int function over our inputs, and because, like I said earlier, this is a lazy operation, we just
force it to be a tuple, which makes it non-lazy, and then we've got our output values here. If the order of execution is a bit unclear there, there is a trick that I like to use from the toolz library: we can force this to read left to right with this thread_last function, which essentially creates a pipeline from left to right, saying take our raw example, map the int function over it, and then turn it all into a tuple to collect the outputs. So this is a comparatively concise way of expressing the same thing that we had before, and it's really much closer to the intention, because what we wanted to do was just to apply this integer function to all of our inputs, which is exactly what we're saying here. We're not telling it to loop. I think that makes it really easy to read. You might disagree at this point, which I did as well when I saw it for the very first time, but once you're used to the syntax this becomes much, much easier to read than those tedious for loops. It can take a little longer to write, though, because you have to be a bit clearer about the actual intention and have to try to abstract things a bit more, but I would argue that's a good thing and that it makes you a better software engineer.

Just as a side note, Python has this great syntax of list comprehensions as well, which is a bit of a mix of both worlds, and it works really well too. It's easy to read and write, but you really have to be careful: never use lambda functions inside of list comprehensions. Also never define functions within a for loop, by the way, because those won't behave as you want them to. I'll leave that as it stands. And if you do things like that, just make sure they don't get too long and too complicated, because you do want it to be easy to read. So pull any more intricate logic out into its own function and then use the function inside the list comprehension or inside your pipeline from earlier.

All right, just another quick example on this:
the opposite of parsing, or not quite the opposite, but similarly, when we want to take our internal object and just display it. In the object-oriented world we would implement this representation function, which I've done here in this class that I'm loading, and once we've done that it's straightforward: we just let this thing display. There's nothing further that we need to do. This is a built-in, it behaves cleverly, so that makes it really easy to get a nice representation. The functional way is a bit more explicit, in that we would say, well, to really get this Sudoku all the way, and I'm using the pipeline again: we're taking our raw input, we're parsing it to the internal format, then we're formatting it for display, and then we're printing it. I explicitly kept printing outside of there, because printing per se is a side effect, so I don't want it in this function, to keep this function pure. That being said, the output is exactly the same. So the multi-paradigm answer is always: why don't we just do both, or take the best of both? And here it's a similar approach: just define the function as a pure, reusable function and then enhance your Sudoku class by using that function. So keep the class a bit shallower and keep your reusable chunks of logic outside of it.

Okay, that's as much as I want to say about the first part. Let's look a bit more at data structures. Going forward we're going to use that Sudoku grid. Up till now I just modeled it as an array, but let's see what we can actually do. In an object-oriented context you might want to do something like this: you might want to model each small square with a digit as this square class, then you might want to create an abstract square collection which just ties a number of these together, typically nine for rows, columns and blocks. But if you want more, well, a Sudoku is really a collection of 81 squares as well, nine by nine. But to make it all nice and
explicit, you would say a Sudoku also knows its other square collections: it knows its rows, its columns, its blocks. So you've got a very explicit structure there that you can work with. That gives us this nice instance here, where we can say, just give me your eighth row, or give me a particular square and all the information about it. In this case we know where the square is, what the digit is that is currently filled in, and whether it's locked, so whether this was part of our original input or maybe something we filled in later on. That assumes certain usage patterns, because you're putting a lot of context into it, and that in turn makes it very intuitive to explore, but it's also fairly rigid and it requires a lot of boilerplate. Proof is here: the implementation of this was 120 lines just for defining the classes, setups, et cetera.

So what's the alternative? Functional programming is all about simplicity, so here we just say, you know what, this thing is a grid of shape nine by nine and of data type int. Done. So three lines instead of 120, very simplistic. We can use it in the same way as before: after we've parsed our input, we can validate it with this thing. Again, this schema is multi-paradigm in its own right, which is why we have an instance method here, so you can always do your chain with those as well, as long as they're immutable. Well, we'll get to that. So this is a very minimalist approach, obviously with zero or close to zero boilerplate. There is absolutely no structure, no context on the data structure itself beyond "this is a nine by nine grid", which can make it a bit harder to explore, but it also makes it easier to reuse, as I pointed towards before.

But what's the best approach? I can obviously only give my very opinionated answer, but here we go: why don't we create a Sudoku class that gives the whole thing some context, similar to what I did before, but
actually underneath that there is just the grid. So we're using the simplicity there, and we don't model all these rows, columns and collections explicitly. That's implementation detail, and we can do that in a very elegant, simple way. But we then do tie it together on this class, maybe have some nice functionality on here, which adds context and brings it all together. I like to call it a shallow wiring class or something like that. Yes, I've said that here as well. So it saves a lot of code, but it does have the context for the user, and at the same time it comes with all the benefits of being able to take small chunks of your logic, take them out, and make them reusable and concise.

Okay, on to the next part, about state handling. We have the Sudoku now, we have a grid, we've parsed it from an input source, but now we want to start filling things in, because at the end of the talk we want to have a solver that automatically solves that Sudoku. So let's think about the next step: how do we fill in the digits? For this I'm using a multi-paradigm implementation right away, inspired by pandas and its inplace concept, because I think that showcases very nicely what mutable and immutable mean, and also that it's not necessarily tied to a certain syntax. It's all on classes here. So I'm just going to create a blank Sudoku here, just a blank nine by nine grid, so I can show you a few operations on that. The mutable way of interacting with this is to change it in place. I've made it explicit here by saying inplace equals true, and we're setting at x, y zero, zero, so at the top left corner, a seven, and then, well, the seven is in this Sudoku. Done, that's it. So we've changed it in place. That seems natural: like, duh, I changed it, now it's changed. It means there is no way back, no history; we're changing everything as we go along.

But what is the alternative? The alternative is the immutable way of saying inplace false. So
here we're saying we're setting this digit, but actually not on this Sudoku. Instead, what we're getting back is this new Sudoku here with the digit four set. The big difference probably only shows up when I then go and say, wait, but what was the Sudoku? That's still as it was before, where we have the seven filled in but we don't have the four filled in yet. So we now have two different versions of it, and that makes it really easy to parallelize this. It's very efficient and avoids concurrency errors, because you simply have to synchronize less state between nodes; that makes it efficient and also helps with the errors. I could probably give a whole talk on why that's the case, but let's leave that aside. Because you have this before-and-after picture as well, that gives you some natural versioning, and like I've shown before with pipelines, it lends itself very well to that. But in this multi-paradigm syntax with classes it also lends itself very well to method chaining, and that's one thing that I want to highlight: now you can do something like this, say Sudoku, set digit, set another digit, set another digit, and what you're getting back is obviously this changed thing with these digits set, but we've also not changed the original one. So you could from there try to solve the whole thing and then go, ah, maybe I did something wrong, and backtrack to the original version or to a version before. You can save versions. This can be really handy in different application contexts.

So my recommendation is this. I learned a lot from bringing more functional programming into my code and by that making it more multi-paradigm, and the idea of immutable data structures really helps with a lot of applications, I find. So try out things like a frozen data class, or named tuples, which kind of act like dictionaries and like tuples at the same time, but they're immutable. frozendict and persistent map are probably the equivalent,
yeah, they are probably the same, which is a frozen equivalent of a dictionary. Then use mutable data structures in immutable ways as well. If you have a dictionary, then maybe still don't set things in place but rather use something from this toolz library, which I've mentioned before because it's my favorite library in Python. There are assoc functions, for example, where you give it your input dictionary, a key and a value, and what you get back is a new dictionary with that additional key and value pair set. Also an idea: try to keep functions pure and idempotent, try to pull them out of your classes and make them very reusable, and then use classes where configuration and state are required or desired, to wire it all together.

Just one more example of how this works in pandas. I'm just going to create a data frame here, just a couple of random numbers, but then I can do a nice method chain here, saying assign a new column, assign another new column based on that one (you work a lot with lambdas on these), and then maybe drop some rows, and that gives me this output here. Don't worry too much about the content, it's just a dummy example. But we also still have the unchanged original data frame, and that I found extremely helpful when dealing with Jupyter notebooks, because people tend to jump around in Jupyter notebooks and not just execute them front to back, top down. If you want to go back and you've changed a data frame somewhere, it can get really messy really quickly, whereas if you treat your data in a more immutable way, then you can go back and just re-execute any cells and it won't really make a difference. You can jump around more and you have more self-contained pieces of logic, which again makes the whole thing more reusable, and because it's more self-contained and more reusable, it's also closer to production ready. I've noticed this a lot. I've talked to the wider
data science team and got them to basically write all the Jupyter notebooks for new algorithms and new data analysis pieces more in this immutable style, and since then my job of taking these things and putting them into a production system has got a lot easier, because there are just more things that are easier to take out and use.

Okay, the last main part is how we deal with multiple implementations, and here we're going to look at actually solving the Sudoku. One thing that I thought about for solving this was a deterministic solver based on a mask. We would create a mask that only shows you the fields where you could, in theory, put a given digit, and that per digit. So you create these masks, then fill any unambiguous ones, where the mask has just one possible place where the digit could be within its row, column or block, and then you just keep repeating that. That alone works well for very easy Sudokus that have a clear solution, but it's actually insufficient when you get to the harder ones, because they can have multiple solutions, so a deterministic solution is just not going to get you there fully. So I created a random solution as well. It uses the same concept of a mask, because we still want to solve it by the rules, right? But then we just fill in a random digit, and we repeat that and keep doing it. The problem is that we often back ourselves into a corner with that approach, so we have to backtrack and rerun the whole thing, and you have to rerun it so often that it makes it prohibitively slow. I think I tried with a hundred thousand runs, which was definitely not enough, and a million tries took too long for my patience. So what I did instead was bring it together into a combined approach, and here we're just saying: run it deterministically as long as you can, and once you run out of moves, try a random step, then go back to deterministic, and just keep iterating that.
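The combined approach can be sketched roughly as follows. This is not the speaker's implementation: instead of per-digit masks it uses a simpler per-cell candidate set, it assumes `0` marks an empty square, and all function names are hypothetical.

```python
import random


def parse(raw):
    """Turn an 81-character string into an immutable 9x9 grid (assumes 0 = empty)."""
    digits = tuple(map(int, raw))
    return tuple(digits[i:i + 9] for i in range(0, 81, 9))


def candidates(grid, r, c):
    """Digits that can go at (r, c) without clashing in its row, column or block."""
    row = set(grid[r])
    col = {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    block = {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - row - col - block


def set_digit(grid, r, c, digit):
    """Return a new grid with one digit set; the input grid stays unchanged."""
    new_row = grid[r][:c] + (digit,) + grid[r][c + 1:]
    return grid[:r] + (new_row,) + grid[r + 1:]


def empty_cells(grid):
    return [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]


def deterministic_step(grid):
    """Fill the first cell with exactly one candidate; None if no such cell exists."""
    for r, c in empty_cells(grid):
        options = candidates(grid, r, c)
        if len(options) == 1:
            return set_digit(grid, r, c, options.pop())
    return None


def random_step(grid, rng):
    """Fill a random empty cell with a random candidate; None at a dead end."""
    r, c = rng.choice(empty_cells(grid))
    options = sorted(candidates(grid, r, c))
    if not options:
        return None
    return set_digit(grid, r, c, rng.choice(options))


def combined_step(grid, rng):
    """Run deterministically while possible, otherwise take one random step."""
    result = deterministic_step(grid)
    return result if result is not None else random_step(grid, rng)


def solve(grid, step=combined_step, max_tries=100, seed=0):
    """Retry from the unchanged original grid until one try fills every cell."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        current = grid
        while current is not None and empty_cells(current):
            current = step(current, rng)
        if current is not None:
            return current
    return None  # no solution found within max_tries
```

Because every step returns a new grid, restarting after a dead end is just a matter of going back to the original `grid`, which is the backtracking benefit of immutable data mentioned earlier in the talk.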
You might still need a few tries because of the randomness, but this is actually a really effective solution. It's still fairly simple, but it's effective. The big question is how we get those together: how do we organize our code to reflect that? Let's look at the object-oriented way first again. We would probably create something like this, a hierarchy of solvers. There's the abstract idea of a solver at the top. Then we probably want something like a step-based solver that handles iteration, retrying, and running steps again and again, that sort of thing. Then one abstract subclass of that is a mask-based solver, which implements our logic of creating a mask and then running the step, so it really is a mask-based, step-based solver. Then our actual implementations of the deterministic and random solver can just sit underneath, and those essentially now just implement the step that is being taken. And then the combination would just use those two classes, so it knows those mask-based solvers and combines them. The use case could look something like this: we instantiate our Sudoku as we did before, we instantiate a solver, we tell the solver to solve the Sudoku that we gave it, and then we have a look at the solution, and there it is. Let's try the combined one. Yeah, works the same. Good.

So what is the idea of that? We have mutable data access, with similar issues as before. It can make things a tiny bit faster, but it's usually not worth it unless you're really in a performance-critical environment, and even then the difference can be minute. I do find that when you organize your code in a way like this, you end up with a lot of single-method classes, because you just have to split the functionalities apart in order to be able to reuse them. But if you have a single-method class, why
is it not just a function? I struggle to understand the logic there, because all you're doing is adding boilerplate. So it seems to me just a complicated design for simple functionality. But again, similar things apply as before: it does have some use for the user, because it's a straightforward way of using it. And just a word count for proof that it has a bit of boilerplate.

What's the functional approach to that? Well, we just create these functions and then worry about wiring them up later. We have a function that creates a mask, we have these individual step functions, the combined step function (I forgot the arrows there) actually uses the random and deterministic steps, and we've got this solve function that iterates over them. So it's a similar idea of splitting the functionality up, but just broken down into the actual pieces of logic. The use case looks a bit different, and you need more context there to be able to run this. We can pre-wire a function called solve combined by saying it's solve with this combined step, and then we can run it like we did before with our pipeline: take a raw example, parse it, solve it, format it, print it, and there we go, same output.

And now I'm getting some feedback here on the talk back channel. Okay, brilliant, thank you. So, let's see. This makes the responsibilities very clear per function. It's a very simple, pragmatic design. Once you know how to use it, it's very easy to introspect and combine in different ways as well, and the code itself is much more concise. On top of that, we don't actually have any of the base module with the boilerplate code, so we are really saving a lot of code writing here. But like I said, you have to know a bit more about the library to do your wiring.

So where do we go from that? What's the multi-paradigm solution? Here you could opt for a class like we've done before to do the wiring, but I kind of just
wanted to showcase that you can do it in a slightly different way this time, leaning a bit more to the functional side than the object side. I've created a solving function here that does the wiring mainly in a functional way, but we're still using the Sudoku, the higher-context class, to represent the actual Sudoku that we're running this solving function over. So we've basically flipped it around: rather than putting a pure function on a class, we're now using the class as an input, and the pure function is the top-level entry point. And yeah, take my word for it, this runs perfectly fine. So this brings together the simplicity and clarity of functional programming, but it also makes sure you have all the high context of a Sudoku, which makes it really easy to explore the actual Sudoku itself, and then you just call solve with the instance of that class you've prepared as input.

If you really want, you can create what I'd call a solving configuration, where you say we've got a step function here, we've got a maximum number of tries, and then we're just going to use this solve function that I just showed you and put it on a class. I'm making this callable here as well. I wouldn't necessarily recommend this; I just want to showcase that you can blur the lines between what's a function and what's a class with this callable syntax. Here we can say, take our raw example, this time parse it with the from string method on our multi-paradigm class, and then thread this instance into a solving configuration using the combined step and trying a maximum of 100 times. That also works perfectly fine, and like I said, this is just giving you some ideas on how you can combine things.

Right, those are the main parts. I just want to give some key takeaways and then open up for questions. I'm going to do this fairly quickly, though, because we're running short on time. So, object
orientation, just some observations: it's typically a fairly top-down design. You create larger, very topical structures and you're quite explicit about high context. You bring functionality and data together in a topical way. It leads to very intuitive use cases, and that makes it all very explorable. Functional programming is very much the other way around, going more in a bottom-up fashion. It tries to simplify everything as much as possible and put it into small chunks, functions that you can actually reuse and that are entirely separate from the data. So there's high isolation and low context. That typically leads to very reusable things and tidy, concise code, I find, if you do it right, and the use cases are a bit more flexible. Just a quick side note: to keep your code tidy and concise, you do need to work a bit more with modules than you would have to in a purely object-oriented way, but there are enough ways and means to structure your code.

So what does that mean for multi-paradigm? It's a pick and mix of both worlds. Warning: you can pick and mix the worst of both worlds. Try not to do that. But I think I've shown you some useful ways of combining them. Use pure functions in a mutable context, so just bring that all together. The good thing is: no side effects, no problems. You can always use pure functions in a mutable context and you shouldn't get any problems. The other way around, when you use something that works with mutable data in an immutable context, you want to make sure you're explicit about it and use a copy-and-modify pattern, just to make sure you're not changing your inputs, and by that keep it immutable for the context that requires it. So ideally you end up with use cases that are both intuitive and flexible, and with something that's explorable and has very reusable components. Yeah, I like to...

You have four minutes left.

Okay, perfect.

Including Q&A, so if you want to do Q&A then we
should switch to...

Okay, thanks Mark. So, favorite approach: iterate with a REPL. And that's where I'm going to wrap up, so we have some time for questions.

Okay, excellent, thank you very much. Yes, that was a very interesting talk, and sorry I had to cut you a bit short. We do have a couple of questions, and we have three minutes left for questions, so I'm just going to read them from top to bottom. First question: how do you recommend organizing your helper functions in a module?

That depends very much on what the context is. I would generally start writing them in the one module where you use them, and then start breaking them out when you realize, hey, this is something that I could reuse. Maybe put it into a utility library or into a topical library, or even just into a submodule. Pull them out when it starts to look like this is not just for this particular use case and could or will be useful in others as well.

Okay, great, thanks. Next question is: why data class and not named tuple?

I use data classes here just to get rid of some boilerplate. I showed you the 120 lines; that was already with using data classes. So I used them mostly for the object-oriented context here and then just kept going, to keep it similar throughout the talk. Frozen data classes and named tuples: really, if you just use them this way, there is no considerable difference. I've definitely used named tuples as well, with some functionality on them.

Okay, great. Let's see, one more question: why should you not use lambda inside list comprehensions? Just for readability?

No, not just for readability. I don't know what the correct term is, but there is a problem with context, generally, in that a for loop doesn't have a context of its own. As a slight tangent, if you define a function inside a for loop and then memorize that function for
later use, you're actually going to override the function and only be left with the one from the last iteration of the loop, so you can really get yourself into some unexpected behavior there. That same principle translates to lambda functions inside the loop. If you directly execute something in a list comprehension, you don't actually need a lambda, but as soon as you start to do something, say, with pandas inside a loop, with a lambda df, and then use something on the data frame, it will not do what you want it to do. So just try to avoid it.

Okay, thank you very much. I think that was the last question that we can take. I would like to ask the attendees who had more questions to go to the talk channel that we have for this talk, which is talk write multi paradigm code, and then you can ask additional questions there and Elias can answer those. So thank you very much, Elias, for the nice session. Let me try to run a short applause for you.
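The behavior described in the last answer is usually called late binding of closures: a function created in a loop looks up the loop variable when it is called, not when it is defined. A minimal sketch with illustrative names:

```python
# Lambdas created in a comprehension all share the same loop variable.
late = [lambda: i for i in range(3)]
late_results = [f() for f in late]      # [2, 2, 2]: each call sees the final i

# The same happens with functions defined inside a for loop.
funcs = []
for n in range(3):
    def show():
        return n                        # n is looked up at call time
    funcs.append(show)
loop_results = [f() for f in funcs]     # [2, 2, 2]

# One common workaround: bind the current value as a default argument.
bound = [lambda i=i: i for i in range(3)]
bound_results = [f() for f in bound]    # [0, 1, 2]
```

Calling the function immediately inside the loop avoids the problem; it only shows up when the functions are stored and called later, as with lambdas passed to a lazily evaluated pipeline.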