Hello, everyone. I hope you're having a fantastic conference. I wish I could be there with you. I think I'll just start, because we have a lot to talk about. Today I'm going to talk about writing secure code in Python.

Let me start with a quick introduction to explain my motivation behind this talk. I'm sure everyone who has worked with Python, or even just played with the language a bit, has heard that programming in Python is easy. That is something I don't disagree with. I think it's one of the core strengths of the language. But I believe people sometimes fail to understand that, while it's really easy to create a Python program that runs and executes successfully, it's not always so trivial to write quality code that is both Pythonic and secure. Today, of course, I'm going to focus on the security side. Over the years working as a Python developer, I got the chance to see a few patterns in Python code that we as developers sometimes don't think too much about and that may end up becoming a security risk or a security vulnerability. So I made a list of these things, and that is what I'm going to show you today. Some topics may seem a bit obvious to those of you who are more experienced, but overall we have lots of cool things to talk about. So let's start.

The first thing I want to talk about is the function eval. The topic is: eval is really dangerous. For those who aren't familiar, eval, which is short for evaluate, is a built-in function that evaluates a Python expression and returns its result. I added a few examples on the side of how the function can be used. The first two examples are simple mathematical expressions. In the third example, we can see that with eval we have access to built-in functions like sum. And in the last example, we see that with eval we can even access variables declared outside of eval.
So I declare the variable x, and I can access this variable inside of eval. The function can receive two optional parameters, globals and locals. They're basically dictionaries that define which global and local variables are available inside eval. We're going to use these parameters; you'll see how.

The danger with this function begins when a user tries to run a malicious expression, such as this simple expression that removes all the files from the computer using os.system. eval evaluates this expression and executes the code. But considering we can control the global variables with the globals parameter, maybe we can manage to run this function securely. If we pass an empty dictionary as the globals argument, that seems to work: we get a NameError, because os was a global variable and we cleared the global variables, so os is not defined anymore. The problem is that Python automatically inserts the builtins even when we use an empty dictionary as the globals argument. So we don't have access to the imported os module, but we can import it ourselves using the __import__ built-in function.

But then again, we control the globals, so we can specifically clear the builtins. Maybe then we can create a kind of secure eval. And that, again, seems to work: instead of passing an empty dictionary as the globals argument, we pass a dictionary with the __builtins__ key set to an empty dictionary. That clears the built-in names that Python automatically inserts, and then the __import__ function is not available anymore. But maybe we can work around it some way. We don't have any builtins and we don't have any global variables, but we can still create Python objects using their literal form.
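The three situations described so far (plain eval, an empty globals dictionary, and a globals dictionary with the builtins cleared) can be sketched as follows. This is a minimal illustration, not the code from the slides: the helper try_eval and the harmless os.getpid() call are stand-ins chosen here so that nothing destructive runs.

```python
x = 10

# Plain eval sees both the builtins and the variables of the calling code.
print(eval("1 + 1"))
print(eval("sum([1, 2, 3])"))
print(eval("x * 2"))


def try_eval(expression, scope):
    """Hypothetical helper: evaluate with the given globals and
    return either the result or the name of the exception raised."""
    try:
        return eval(expression, scope)
    except Exception as exc:
        return type(exc).__name__


# Empty globals: the variable x (or an imported os module) is gone...
empty_globals_result = try_eval("x * 2", {})

# ...but Python silently adds __builtins__, so __import__ still works.
# os.getpid() is a harmless stand-in for the malicious os.system call.
import_result = try_eval("__import__('os').getpid() > 0", {})

# Clearing the builtins explicitly removes __import__ as well.
no_builtins_result = try_eval("__import__('os')", {"__builtins__": {}})

print(empty_globals_result, import_result, no_builtins_result)
```

Even the last variant is not a sandbox; it only removes the most direct route to an import.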
So instead of creating a tuple by instantiating the class tuple, I can create one using the literal form, which is an opening and a closing parenthesis. So what if I create a tuple with the literal form and access its __class__ attribute? Then I get the class tuple. And if I access the __base__ attribute, I get the object class. We know that everything in Python is an object, so if we call the __subclasses__ method of the object class, we basically get all the classes loaded in our program. Right now we're looking for one specific class, BuiltinImporter, because with this class we can import whatever we want. So what we need to do is iterate through these object subclasses, look for the one called BuiltinImporter, instantiate it, and call its load_module method. Then we can import the os module and call the system function with the malicious code. But we need to do that in one line, because we're calling it from inside the eval function. We can do that payload with a list comprehension. This code is from netsec.expert; you can check it out later, it's a good source. So the conclusion is that eval is always really dangerous: we can't really create a secure eval.

So what are our alternatives, since we can't, or shouldn't, use the eval function? Well, we have literal_eval from the ast module. With this function we can create a Python object, but only from its literal form. I have some examples here: we can create any kind of number, and we can create a tuple with different kinds of objects inside. But we can't really evaluate an expression. For example, a mathematical expression such as one plus one will not run; it raises a ValueError.
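Both halves of this argument can be checked without running anything harmful. The sketch below first walks from a tuple literal back to the object class inside an eval that has no builtins, stopping at listing class names rather than importing anything, and then shows ast.literal_eval accepting literals and rejecting expressions. The names NO_BUILTINS and safe_literal are made up for this illustration.

```python
import ast

NO_BUILTINS = {"__builtins__": {}}

# With no builtins and no globals, a literal still leads back to object.
tuple_class = eval("().__class__", NO_BUILTINS)
object_class = eval("().__class__.__base__", NO_BUILTINS)

# From object, every loaded class is reachable in a single expression.
subclass_names = eval(
    "[c.__name__ for c in ().__class__.__base__.__subclasses__()]",
    NO_BUILTINS,
)
print(tuple_class, object_class, len(subclass_names))
print("BuiltinImporter" in subclass_names)


def safe_literal(text):
    """Hypothetical helper: parse a literal, or report why it was rejected."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        return type(exc).__name__


print(safe_literal("42"))
print(safe_literal("(1, 2.5, 'three', [4], {'five': 5})"))
print(safe_literal("1 + 1"))             # an expression, not a literal
print(safe_literal("__import__('os')"))  # a call, not a literal
```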
And if we need something more complex, we can parse the string and implement the logic ourselves, which will of course be more secure. So when should we use eval? We have this function available as a built-in, so when should we use it? My answer is that we should use eval only when there is no other viable way to accomplish a task. And when you have worked with Python for long enough, you realize that this basically means never: you should never use eval. It's not useful enough to justify the security risk.

Okay, so let's jump into topic two, which is about arbitrary code execution with pickle. As I said, we have many topics and they're quite different from each other, so let's jump into it to manage our time. The pickle module is a way to store a Python object. We serialize the object to a sequence of bytes, and we can load this serialized object later when we need it. We serialize an object with the dump function and load it again with the load function. On the right I have an example of that: I serialize a set of numbers and save it to a file, which is a binary file, and then I load these bytes from the file and they are deserialized into our Python object, the set of numbers. Again, the dump function has optional parameters we can use. One is the protocol argument, an integer denoting which protocol is going to be used for the serialization. We currently have six options for protocols, which go from zero, the oldest and basically the human-readable one, to five, the newest, which is available since Python 3.8.

Okay, so what if we need to customize how a class is serialized?
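Before getting to customization, the basic dump and load round trip just described can be sketched like this. The file name and the particular set of numbers are placeholders chosen for the illustration; the slide's actual example is not reproduced in the transcript.

```python
import pickle
import tempfile
from pathlib import Path

numbers = {1, 2, 3, 5, 8}

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "numbers.pickle"  # placeholder file name

    # Serialize the set to a binary file, choosing the protocol explicitly.
    with open(path, "wb") as f:
        pickle.dump(numbers, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Deserialize it again. Only do this with files you created yourself.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# Protocol 0 is the oldest, text-based format.
text_form = pickle.dumps(numbers, protocol=0)

print(restored == numbers)
print(text_form)
```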
Python usually knows how to serialize all kinds of objects, but if we need to customize how an instance of a class is going to be serialized, we can do that using the magic method __reduce__. This method should return either a string or a tuple containing a callable and its arguments. We're going to focus on the second option, the tuple containing a callable and its arguments. So we can create a class here that implements the __reduce__ method. As the callable I'm returning os.system, which I'm using as an example, and as the arguments to pass to this callable I'm returning the rm -rf command to delete all the files. We can serialize it normally, and it's going to be serialized to bytes. But when it is loaded, when we call pickle.loads to deserialize the bytes into a Python object, the callable is going to run, and if the interpreter has the right permissions, it's going to delete all the files. So what __reduce__ returns is going to be executed.

We can use pickletools, which is another module available in the standard library, to read the pickle and understand what it's doing behind the scenes. So let's understand how to read a pickle, so that we can maybe create a more complex one. Using the pickletools.dis function, we can read the raw pickle, and I'm going to go through the pickle code. The pickle starts with the c character, which is used to import a function. So we have the c character, followed by posix, which is the module, a line break, and system. To import a function from a module, we have c, the name of the module, a line break, and then the name of the function, in this case system. So we're importing this function into our code. Then we have this open parenthesis.
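To follow the walkthrough with a real disassembly in hand, here is a harmless sketch of the same mechanism. It assumes a made-up class Demo whose __reduce__ returns shlex.quote instead of os.system, so loading the pickle only quotes a string and never starts a shell. The call to pickletools.optimize is an addition made here to strip the memo opcodes that pickle.dumps emits, which keeps the listing close to the hand-written pickle discussed in the talk.

```python
import io
import pickle
import pickletools
import shlex


class Demo:
    """Hypothetical class: unpickling it calls shlex.quote("echo hello")
    instead of rebuilding a Demo instance."""

    def __reduce__(self):
        # (callable, arguments): the callable runs when the pickle is loaded.
        return (shlex.quote, ("echo hello",))


# Protocol 0 is the text-based format read opcode by opcode in the talk.
data = pickletools.optimize(pickle.dumps(Demo(), protocol=0))
print(data)

# Disassemble: expect an import (GLOBAL), a MARK, a UNICODE string,
# a TUPLE, a REDUCE and a STOP.
listing = io.StringIO()
pickletools.dis(data, out=listing)
print(listing.getvalue())

# Loading runs the callable; the result is a quoted string, not a Demo.
result = pickle.loads(data)
print(repr(result))
```

Swapping shlex.quote for a function with side effects is all it takes to turn this into the attack described above, which is why pickles from untrusted sources should never be loaded.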
So here, where it says 14, that's column 14, it's a mark: the argument mark. Here we start to provide the arguments that are going to be passed to the function we just imported. The first argument, which is the only argument, is a Unicode string. To denote a Unicode string, we use the uppercase V character followed by the string itself. In this case, the string is the rm -rf command to delete all the files. We then close the arguments: in column 14 we opened the list of arguments, and in column 24 we use the t character to close it. So the only argument we're passing to the function we imported is the Unicode string with rm -rf. Then in column 25 we have the uppercase R, which is the reduce itself, and it executes the function with the arguments we specified. And in column 26 we have the dot, which signals that the pickle has ended. So that's basically what this pickle means.

With that, we're going to try to execute arbitrary code. In the example I showed before, we had a simple os.system call. But what if we want to run any kind of code, like a reverse shell? We have a function here for a reverse shell. The import is inside the function because we want all the necessary code to be inside one function. With a reverse shell, the attacker can control the victim's shell from their own computer. How can we run this arbitrary code with pickle? First we need to serialize the code. The problem is that pickle can't really serialize code, so we have to use marshal, another module in the Python standard library, to serialize it. marshal has the same dump, dumps, load and loads functions that pickle has. We can serialize the function's code with marshal.dumps, and we then have a sequence of bytes. To make it more readable, we can encode it with base64.
And then we have a base64 string, which is basically the code of the function. If we need to run the function again, we just reverse the process: we decode with base64, we load with marshal, and then we instantiate the FunctionType class with the result of marshal.loads. Then we can call the function as if we had implemented it in the code.

So let's create the malicious pickle. I just explained how a pickle works, so let's go through it. We have the c denoting that we are importing a function: from the module types, the function FunctionType. We import FunctionType first because what we want to run in the end is the result of FunctionType. So what are going to be its arguments? On the right we have the disassembly. In column 20 we have the mark to open the arguments, and in column 21 we have another import, which is marshal loads. In column 36 we have another mark for arguments, this time for the marshal loads function. For its argument we have another import, which is b64decode from base64. Then for b64decode we have another mark in column 55. Now we're not going to import anything else; we're just going to pass the value, which is a Unicode string. So in column 56 we have the uppercase V denoting a Unicode string, which is our base64-encoded function. In column 81 we have the t to close the arguments to b64decode, and in column 82 the uppercase R to execute b64decode. In column 83 we have the t to close the arguments for marshal loads, and in column 84 we execute marshal loads. Then in column 85... Oh, I'm sorry. I said we didn't have any more imports, but we do have the builtins globals import, because, as I showed in the previous slide, we have to pass the globals to FunctionType.
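The marshal and base64 round trip, including the globals argument that FunctionType needs, can be tried with a harmless function. In this sketch greet is a made-up stand-in for the reverse shell function, and note that marshal serializes the function's code object (greet.__code__), not the function object itself.

```python
import base64
import marshal
import types


def greet(name):
    """Harmless stand-in for the reverse shell function from the talk."""
    return "hello, " + name


# Serialize the code object and make the bytes printable with base64.
encoded = base64.b64encode(marshal.dumps(greet.__code__))
print(encoded[:40])

# Reverse the process: base64 decode, marshal load, then wrap the code
# object in a function. FunctionType needs a globals dictionary as well.
code = marshal.loads(base64.b64decode(encoded))
rebuilt = types.FunctionType(code, globals())

print(rebuilt("EuroPython"))
```

The marshal format is specific to the Python version that produced it, and like pickle it should never be used on data from an untrusted source.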
So we use globals as the second argument for FunctionType. globals is a function, so we have to call it: in columns 106 and 107 we have just an empty list of arguments, and in column 108 we use the uppercase R to call globals. And in column 111 we close the arguments for FunctionType... no, I'm sorry, we close the arguments for marshal loads. Yeah, I got lost. But in the end, in column 115 we execute FunctionType, and in column 116 we stop the code. So we have our malicious pickle.

I have a small demonstration to show how it would work. On the right, we have the server code running and waiting for the connection. On the left, we are going to open the malicious pickle. When it loads the pickle, the reverse shell is connected, and we can execute any command we want from the server.

So how can we prevent that? We can sign the pickle with a cryptographically secure keyed hash like HMAC. Here on the left we are creating a digest for the pickle with HMAC. If we save this digest, then when we want to load the pickle we can check whether the digest is the same. If it's the same, we know the pickle has not been tampered with and we can trust it. We also have alternatives: instead of using pickle for storing objects, we can use a safer serialization format like JSON, which is really simple. I have an example here, which is just a string.

Okay, so let's go to our third topic. I'm going to try to be a little fast, because we still have a few things to talk about. Now we're going to talk about the power of the pip install command. First we have to understand what happens when we run pip install. I divided it into four things that happen. It's more complicated than that, but basically, first we have the identification of the base requirements and the given parameters.
Then we have the resolution of dependencies and the determination of what will be installed. The third thing that happens is the determination of the installation method. Then, after determining which installation method to use, the package is installed. We're focusing on the third step, the determination of the installation method. Again, I simplified what really happens, but basically the logic is this: if a wheel is available in the repository, pip will download the wheel and install from it. If a wheel is not available, pip will download the package source code. If it's possible to build a wheel from the source code, it will build the wheel and install from it. If that's not possible, it will install from setup.py.

We're going to focus on the case where the wheel is not available, so the person who uploaded the package to PyPI didn't upload a wheel. When pip downloads the package source code and either tries to build the wheel or tries to install from setup.py, in both cases it will run setup.py. And the thing with setup.py is that we basically have dynamic metadata. Here is an example of a setup.py file that just reads a long description from a README file and calls the setup function. But it's Python code, so we can have things a bit more complex than that. On the right I put the commands that would be called: to install the package we would have setup.py install, and to create the wheel we would have setup.py bdist_wheel. So setup.py exists to create dynamic metadata, and the way to execute arbitrary code during package installation is just to add code to the file. It's a Python file, so we can add whatever code we want. So I'm adding the same reverse shell code I showed before.
The only difference here is that instead of subprocess.run we're using subprocess.Popen, because we want to create a separate process instead of using the same one. Otherwise the installation would freeze, and the user would notice that it froze. And again, I have a demonstration. We have the server on the right, and we're installing the malicious package, which is something I uploaded to PyPI and have since deleted. You can't install it anymore, but if you could, it would execute the setup.py, because I didn't upload a wheel. And we get the reverse shell again.

What is the real-life risk of this? You may think that you're not going to install any package you don't know, that you're just using standard packages like Django or Requests. But we have a real problem, which is typosquatting. Let's say I'm trying to install the requests package and I mistype the name, and someone has uploaded a malicious package to PyPI with requests spelled wrong. Then the code will be executed because of that. And this is a real problem. In 2021, almost 4,000 libraries were deleted from PyPI because they were malicious. Here on the upper right we have statistics from 2017 about people trying to install packages named after modules that are available in the standard library, such as json, os, sys and platform. Someone uploaded packages with these names to PyPI, and they can be malicious packages. So if a beginner sees an example that uses json and thinks, oh, I need to install it, they try to install it, but they're installing the malicious package.

Another risk is the --extra-index-url flag, because with this flag we can use another repository in addition to the standard PyPI. I don't have time to go deep into it, but I added a Medium article about a guy who made a lot of money by exploiting this. So okay, how can we prevent this?
We can use the --only-binary flag with pip install to only download a package if it has a wheel. We can use the --require-hashes flag to verify the hash and know that we are downloading the right thing. And we should never install a package with sudo or as admin, because if for some reason you install a malicious package, it will have admin permissions and can really do some damage.

Okay, so let's go to the fourth topic, which is about outdated dependencies. Basically, vulnerabilities are found all the time. I have here a screenshot from, I think it's the latest, or no, I think right now we have 4.0.7, but in Django release 4.0.6 a security issue was fixed, a potential SQL injection. And this happens all the time. Everyone is always finding new vulnerabilities and fixing them. So it's important to keep up with the releases of the packages we use, and it's good if we can keep up with the CVE vulnerability list, so we know about, I don't know, zero-day vulnerabilities that are found.

The fifth topic I want to talk about is basically a continuation of what I just talked about, but specific to Python itself: outdated Python. Again, vulnerabilities are found all the time. This is a list of some of the vulnerabilities found in Python and when they were disclosed. And these are currently unfixed vulnerabilities, so vulnerabilities that still exist. It's important to understand the status of the Python versions, so we don't end up using an unsupported one. We have this available in the docs, so we know when the version of Python we're using will stop receiving security updates. Right now, versions from 3.7 on are receiving security updates, up to 3.11, which is receiving bug fixes, and 3.6 and earlier have reached end of life, so they don't even get security updates.
So if we find a security bug, it will be fixed for 3.7 onward but not for 3.6. And this is going to continue: one day security bugs won't be fixed for 3.7 either, and so on.

You should also be cautious with deprecated functions. If you search the docs, you'll find some. I added an example from the tempfile module, the mktemp function, which is obsolete. You shouldn't use it anymore, because it can introduce a security hole due to a race condition. We have better functions to do what this function used to do, but the function still exists, so if you call it, it will work. So you should be cautious with functions that are deprecated and look into them.

The sixth topic is about pseudo-randomness, the problem with randomness. I saw this tweet a while ago. I'm not going to show the user, and I think the tweet is deleted now, but it was talking about how to create a secure password generator in four simple lines of Python code. And this is the code. We have a sequence of characters to choose from, we import randint and choice from the random module, and we use random choice to create the password. This is the code this person presented as secure. It's a problem because of the random module and what I call the infamous seed. Almost all the functions in the random module use a seed, and we can determine what the seed will be with the seed function. Here in line four, I call random.seed with the seed EuroPython. When we then generate the password with the same method that person used, we will always end up with this result. So if you get your computer right now and run this code, even though it uses the random module, which you would think is random, it will always end up with the same result because of the seed.
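The effect of the seed is easy to reproduce. The sketch below is not the code from the tweet; it is a reconstruction under assumptions, with a made-up alphabet, password length and seed value, and it adds secrets.choice from the standard library as a contrast, since that function does not depend on random.seed.

```python
import random
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # assumed character set


def weak_password(length=12):
    """Password built with the random module: reproducible from the seed."""
    return "".join(random.choice(ALPHABET) for _ in range(length))


def strong_password(length=12):
    """Same idea with secrets.choice, which ignores random.seed."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Any fixed seed works; this particular value is only an example.
random.seed("EuroPython")
first = weak_password()
random.seed("EuroPython")
second = weak_password()

# Anyone who knows or guesses the seed gets the same password.
print(first, second, first == second)

# Reseeding has no influence on the secrets-based version.
random.seed("EuroPython")
print(strong_password())
random.seed("EuroPython")
print(strong_password())
```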
We have lots of alternatives to generate secure passwords and things like that. We have the secrets module, which is great and simple to use. It has some of the same functions that the random module has. On the right, the first example I show is token_hex, with which we create a token and specify the number of bytes. We then use secrets.choice to do the same thing that person was doing, but in a secure way, because with secrets the result is not determined by a predefined seed. We can also use the urandom function from the os module, which is a little more complicated, but we can do that. And there is one class in the random module that is secure, which is SystemRandom. It is the only class there that uses a cryptographically secure random number generator. So the last example on the right uses the SystemRandom class to generate the password.

Okay. The seventh topic I want to talk about is bomb files. So, watch out for bomb files. We have... yeah, we're running out of time; we have five minutes left, so I'm going to speed up to finish this section. The standard XML functions are vulnerable to this, and we can use the defusedxml package to protect against it. We also have a problem with tar bombs. If we don't inspect the tar file we are extracting, we can end up with a path traversal problem. We can prevent this by inspecting the files. Sorry, I have to speed this up.

The last thing is about the assert keyword. What is the purpose of assert? We use assert for testing with pytest and things like that. Basically, we check whether a condition is true, and if it's not, we get an AssertionError. People sometimes try to save a line with assert. So we have these two pieces of code. On the left, I raise a ValueError if the password is not the predefined password. On the right, I use assert for this.
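The two versions might look like the sketch below. The function names and the hard-coded password are invented for the illustration. To compare their behaviour without restarting the interpreter, the sketch compiles the same trusted source string twice using the optimize argument of the built-in compile function, where level 1 corresponds to running Python with the -O flag.

```python
# Fixed, trusted source text: both checks compare against the same
# placeholder password.
SOURCE = '''
def check_with_raise(password):
    if password != "s3cret":
        raise ValueError("wrong password")
    return "access granted"


def check_with_assert(password):
    assert password == "s3cret", "wrong password"
    return "access granted"
'''


def load_checks(optimize):
    """Compile SOURCE at the given optimization level (0 = normal, 1 = -O)."""
    namespace = {}
    exec(compile(SOURCE, "<checks>", "exec", optimize=optimize), namespace)
    return namespace


def outcome(check, password):
    """Return the check's result, or the name of the exception it raised."""
    try:
        return check(password)
    except (ValueError, AssertionError) as exc:
        return type(exc).__name__


normal = load_checks(0)
optimized = load_checks(1)

for label, checks in (("normal", normal), ("optimized", optimized)):
    print(label,
          outcome(checks["check_with_raise"], "guess"),
          outcome(checks["check_with_assert"], "guess"))
```

In the normal build both versions reject the wrong password, while in the optimized build only the explicit raise still does.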
But the problem is that these are not equivalent, because assert actually checks the __debug__ constant, and this constant is controlled by the -O flag. If we pass -O, the optimize flag, then __debug__ will be False and the assert will have no effect. So we shouldn't use assert for this in our code.

I also want to talk briefly about auditing our code. We have lots of options, lots of programs to audit our code and check it for security vulnerabilities. The one I'm showing as an example is Bandit, which is really simple to use. In this example, it's showing that the code has the random number generator problem I talked about before. And that's it.

To finish the talk, I want to give you five key points to never forget. If you want to take something away from this talk, just remember these five points. The first one is to never trust user input. I think three of the topics I talked about, eval, pickle and pip install, are basically about user input. The second point is to avoid running Python code with sudo or as admin, so we can limit the damage. Third, keep your system and your packages up to date, so we have the fixes for security vulnerabilities as they come out. Fourth, read the docs. I know the docs are not always the friendliest way to learn something, but in the docs we have the warnings for dangerous modules and the warnings for deprecated functions, and this is why I think it's important to read the docs sometimes. And fifth, use a static code analysis tool such as Bandit; you have many options to analyze the code and search for security vulnerabilities.

And that's it. I think I managed to finish on time. Thank you. This is my contact information. Thank you, guys.

I think we have time for one or two questions. Is there anyone in the audience who would like to ask a question? If so, please use the mic. Do we have any remote questions? No. Well, again, thank you, Jan.
Another round of applause for Jan, please.