Okay, so I'm back again. I will be chairing the next session. Unfortunately, the speaker couldn't attend the conference, so he sent in a recording, which we're going to play. The talk is by Michael Wodinski, and he's going to talk about Python security best practices and go through a lot of the things you have to consider in everyday programming to make sure that your code is secure and that you avoid at least the most common mistakes. So I think this will be a very interesting talk, something that everybody will basically have to watch and adhere to. Unfortunately, the speaker is also not going to be able to join the Q&A, so we are just going to play the recording. You can still put questions into the chat. I will record those, and perhaps we can find some way to get the speaker to answer them, maybe in the breakout room later on, or maybe I can answer some questions myself. Okay, let's take it away.

Hello, my name is Miha Wodinski. I'm a Python developer, and today I will be presenting a talk about Python security best practices. I will not be attending EuroPython live, so I've made a video of my talk. If you have any questions, just send me an email at this address, Defend Programming EuroPython 2021 at protonmail.com, and I will answer your questions when I have some time. Okay, so let's begin the talk.

A few words about me. I'm currently working at a graphite, which is a project related to creating recommendations for the gamers of the hazard games. On a daily basis, I'm interested in security, especially in Python, and in good quality. Currently I'm working only with AWS and Python, and I've made many projects related to machine learning on this platform.

A few words about the agenda. I will say a few words about hackers. Then I will briefly talk about the OWASP Top 10. We'll be talking about input injection and how to prevent it.
We'll also see issues with reading files like XML, pickle, and YAML, and then I will tell you about assert statements and some unexpected behavior of Python. Then I will talk about issues with temporary files and about tools that can detect those issues. Okay, so let's begin.

A few words about hackers. In previous decades, a hacker was a person who was good at programming. Now it can be anyone. It can be just a server, or some young person who is testing Kali Linux or Shodan.io and just checking what they can get from the results. So everything has changed. The aim of the hacker can be just hacking the website, getting the algorithm, or maybe getting some metadata from a Word document, which stores information about where the file was previously stored. That can be a good start for the attacker to know where to find certain documents. Sometimes it can also be done by social engineering techniques, but that's not the topic of this lecture. And that's all about hackers. Sometimes bugs happen by accident, because every so many lines of code we introduce some bugs. So it can happen just by mistake.

Okay, so let's move to input injection and an overview of OWASP. In this talk I will be going through, let's say, issues that can be found here in the OWASP Top 10. Input injection: generally it can be SQL injection, it can be XPath injection, shell injection. There are many types of injection. We also see that XML issues are near the top, as is insecure deserialization, which I will be talking about today. I will be talking about the most recent edition, because the OWASP Top 10 is not updated every year. It's updated from time to time.

So let's go to input injection. Here we have a function that compresses a file, and it is written in a way that accepts everything from the user. And this is the kind of thing we want to prevent: printing the hashes of the passwords, which can then be taken to a rainbow table to recover the plain passwords, and then mailing them.
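As a rough illustration of the kind of function described here, the sketch below builds a compression command from a user-supplied file name without running anything. The function names, the `tar` command line, and the sample input are assumptions made for this example, not the code from the slide. It contrasts pasting the input into a shell string with quoting it via `shlex.quote` and with building an argument list that needs no shell at all.

```python
import shlex

# Hypothetical malicious "file name" supplied by a user.
MALICIOUS = "notes.txt; cat /etc/shadow"


def build_command_unsafe(filename):
    # DANGEROUS: the input is pasted straight into a shell command line,
    # so the shell would treat everything after ';' as a second command.
    return f"tar -czf archive.tar.gz {filename}"


def build_command_quoted(filename):
    # shlex.quote wraps the input so the shell sees it as one literal word.
    return f"tar -czf archive.tar.gz {shlex.quote(filename)}"


def build_argv(filename):
    # No shell involved: each list item is passed to the program as is.
    # "--" tells tar that what follows is a file name, not an option.
    return ["tar", "-czf", "archive.tar.gz", "--", filename]


if __name__ == "__main__":
    print(build_command_unsafe(MALICIOUS))
    print(build_command_quoted(MALICIOUS))
    print(build_argv(MALICIOUS))
```

The argument-list form can be handed to `subprocess.run` without `shell=True`, so the shell never gets a chance to interpret the file name.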
In this case the injected command can easily be run, because there is no quoting or validation. So the solution is simply never to trust what the user is giving us. We always have to make sure that we are doing exactly what we expect and not what the user expects us to do. So here we can manually quote everything we get from the user and validate it. Or, for shell commands, we can use the shlex library, which has a quote function for quoting what the user gave us.

Okay, let's move to XML. XMLs are very tricky because of the flexible way of providing them and writing them, and because of all the features that are related to XML files. I will shortly describe all of them. Generally, XML files can bypass the firewall and read resources that would otherwise not be reachable from the internet. They can cause a denial of service on the server, or on other services it communicates with, simply by exhausting resources, basically memory. They can be used to learn the IP address of the machine where the XML is actually opened. And sometimes they allow sending emails or other dirty things.

Okay, so let's go through the particular vulnerabilities. The first thing is billion laughs, or exponential entity expansion, which is an attack through entities: it basically creates one small entity and then repeats it again and again through nested entities. When that is loaded into memory, it just kills the application. Another way of doing this is to create one big entity and repeat it many times. That is called quadratic blowup entity expansion. There is also the option to point to a certain entity remotely or locally, which is a feature of XML. So we can point to some website, take the XML, and load it with the entities it defines. It can also happen on the server side, where the attacker somehow simply uploads such an XML file. There is also the option of DTD retrieval, which is pretty much the same as the previous one.
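To put numbers on the two expansion attacks just described, here is a small arithmetic sketch. The function names are made up for this example, and the parameters follow the commonly cited form of the billion laughs payload (a three-character base entity, nine nesting levels, ten references per level), which is an assumption and not taken from the slides.

```python
def billion_laughs_size(levels, fanout, base="lol"):
    """Characters produced when the outermost entity is fully expanded.

    Each level references the previous entity `fanout` times, so the
    base text is repeated fanout ** levels times.
    """
    return len(base) * fanout ** levels


def quadratic_blowup_sizes(entity_len, references, ref="&a;"):
    """Return (approximate document size, expanded size) in characters.

    One entity of `entity_len` characters is referenced `references`
    times, so the document grows linearly but the expansion grows as
    entity_len * references.
    """
    document = entity_len + references * len(ref)
    expanded = entity_len * references
    return document, expanded


if __name__ == "__main__":
    print(billion_laughs_size(levels=9, fanout=10))
    print(quadratic_blowup_sizes(entity_len=50_000, references=50_000))
```

With these assumed parameters, a document of under a kilobyte expands to about three billion characters in the first case, and a document of roughly 200,000 characters expands to 2.5 billion in the second.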
A retrieved DTD can also contain any of the previous attacks. Also, in XML parser implementations there are some issues related to namespaces and name recognition. Generally, an algorithm with O(n²) runtime is used, but some parsers use hash tables to do this in linear time. Those, however, are exposed to collision attacks, and the performance can go back to O(n²). So be aware of that. There are also processing instructions, which can run other XMLs.

Even worse is that XMLs can be compressed using gzip or LZMA, which compresses better than gzip and so can do more harm. In this case, xmlrpclib decompresses such content, so it's vulnerable. lxml is well designed and deals with large compressed input, but it's not protected from compression bombs. So be aware of that. And the sax library is the safest. I will discuss all the parser libraries a few slides later.

We also have an issue with injection where we can pass an XPath. So again, we have to validate what the user is passing to us, by quoting or by using the function appropriately. So not by string formatting, but by passing a parameter, and the library will do the validation. Also, we can include other files, not necessarily XML files. We shouldn't do that if we are pointing to untrusted sources. libxml2 supports XInclude, and there is no way to restrict access to allowed directories. XML can be combined with a schema, and in this case we can have all the previous issues: entities, local or remote XMLs, DTD retrieval, basically all the cases.

Even worse is that we can use the XSLT language, which is used for transforming XML to XML or XML to HTML. This language can read and write files, access Java files, or we can script in Jython, for example. And here's an example of code that runs cmd on Windows and just runs some command. So this is a very serious case. Okay, let's go to the summary.
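The compression-bomb risk mentioned above can be illustrated with the standard library alone. This is a minimal sketch, assuming the compressed XML arrives as bytes and that a fixed size limit is acceptable; the function name and the limit are hypothetical and not part of any parser's API. It decompresses at most one byte more than the limit and refuses anything larger.

```python
import gzip
import zlib

MAX_XML_BYTES = 1_000_000  # assumed limit, chosen only for this sketch


def bounded_decompress(data, limit=MAX_XML_BYTES):
    """Decompress gzip or zlib data, refusing output larger than `limit`."""
    # MAX_WBITS | 32 asks zlib to auto-detect gzip and zlib headers.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 32)
    # The second argument caps how many bytes are produced in this call.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit or decompressor.unconsumed_tail:
        raise ValueError("decompressed data exceeds the allowed size")
    return output


# A tiny "bomb": 20 MB of repeated text compresses to a few kilobytes.
bomb = gzip.compress(b"<a>" + b"A" * 20_000_000 + b"</a>")

if __name__ == "__main__":
    print(len(bomb), "compressed bytes")
    try:
        bounded_decompress(bomb)
    except ValueError as error:
        print("rejected:", error)
```

The point of the design is that the limit is enforced during decompression, so the oversized output is never fully materialized in memory.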
These points refer to the table that I'm showing later, so we can check which number goes with which cell in that table. So, lxml is protected from billion laughs attacks. libxml2 and lxml are vulnerable to gzip compression bombs, although they are well designed. xml.etree raises a ParserError if an entity appears. Also, minidom doesn't expand entities. And Genshi also doesn't support entity expansion. Then there are the libraries marked with six and seven. Six is the XInclude feature, and such a library may be vulnerable through it. And there are also other features that can be exploitable.

So let's go here, and it's not looking very good. So be aware of that. If you are planning to process XML files, it is your responsibility to choose the library that is appropriate for your case and to rule out the cases that you are not using. The safest library in this case is the Genshi parser, which is safe for almost everything. But in some cases you can also use other libraries. You can just go through this table and see which library is exposed to which attack. Genshi, I think, is exposed to one thing, XInclude. So be aware that in this case you will have to deal with that.

But I also know that many of you are already using some of these parsers, and you chose them for some reason. So you can basically add defusedxml, which is a library for fixing those parsers. In the case of ElementTree, it just replaces the object and makes it safer. I will briefly show it in Python. So in this case, I will be loading this file. It is quite big. I think I will not be breaking my own computer. This is one of the largest, let's say, entities that I can handle in the current situation, with presenting and recording. So if you look at this, it will take some time to show, I guess a second. Okay, we have it.
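The idea of refusing entity declarations outright can be sketched with the standard library's expat bindings. This is only an illustration of the general approach, not the implementation of any particular library; the exception class and function name below are hypothetical names chosen for this sketch.

```python
from xml.parsers import expat


class EntitiesForbidden(ValueError):
    """Raised by this sketch when a document declares an entity."""


def parse_text_without_entities(xml_text):
    """Return the character data of a document, rejecting entity declarations."""
    parser = expat.ParserCreate()
    chunks = []

    def forbid_entity(name, is_parameter_entity, value, base,
                      system_id, public_id, notation_name):
        # Called by expat for every <!ENTITY ...> declaration in the DTD.
        raise EntitiesForbidden(f"entity declaration not allowed: {name}")

    parser.EntityDeclHandler = forbid_entity
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return "".join(chunks)


SAFE_DOC = "<d>hello</d>"
ENTITY_DOC = '<!DOCTYPE d [<!ENTITY a "aaaa"><!ENTITY b "&a;&a;">]><d>&b;</d>'

if __name__ == "__main__":
    print(parse_text_without_entities(SAFE_DOC))
    try:
        parse_text_without_entities(ENTITY_DOC)
    except EntitiesForbidden as error:
        print("rejected:", error)
```

Because the document is rejected at the first declaration, no expansion ever takes place, which is why this style of check works against both the exponential and the quadratic variants.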
So in this case, if I run it, you see that it's processing and processing, and it's not even a very big file. So if you compress it or do some other dirty things with it, you will make the parser break. But with defusedxml, we simply receive EntitiesForbidden. So be aware of that: make sure what exactly you want to do with your XMLs, use defusedxml if you're already using some of these libraries, and you should be safer. Okay, let's go back to the presentation. If you have any more questions, there is the defusedxml project, which is where I took this table from, and you can read more about it there. Also, you can check the Python 3 documentation about these issues.

Okay, let's move on to pickles. Pickles are generally binary objects in Python for storing data. But to be honest, you can store everything in a pickle. It can even be a whole project or a small program, whatever, and it can break your code. There are many ways to do that. So we have the pickle library, which basically can do what I already mentioned. We have the shelve library, which is a basic library for storing binary data in a key-value way. There's the marshal library, which is an alternative to pickle, and it's older. Without going too deep into details, the difference is that pickle saves data in a different, more optimized way, and that's the main difference. There is also jsonpickle, which is basically for storing data as JSON, but to be honest, you can store everything with jsonpickle. So it is an alternative to pickle.

Okay, let's see how we can create an exploit by using pickle. A pickle is composed of commands like these. In the first line, we see that it's loading builtins. The next command evaluates the command that is below it, in this case printing. And the last command just calls what we had previously pushed to the stack. And that's all. This will be executed on your side, and if you are loading data science pickles or other pickles, you can just break your program.
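To make the command-by-command structure concrete, here is a hand-written pickle in the old text protocol. It is a sketch for illustration and not the payload from the slide: it deliberately calls the harmless built-in `len` instead of anything dangerous, and the variable names are made up. `pickletools.genops` is used to list the opcodes without executing them.

```python
import pickle
import pickletools

# Protocol 0 pickle written by hand:
#   c builtins / len   -> GLOBAL: push the callable builtins.len
#   (                  -> MARK: start collecting arguments
#   S'abc'             -> STRING: push the argument 'abc'
#   t                  -> TUPLE: pack everything since MARK into a tuple
#   R                  -> REDUCE: call the callable with that tuple
#   .                  -> STOP: the call's result is the unpickled value
HANDWRITTEN = b"cbuiltins\nlen\n(S'abc'\ntR."

# Inspect the opcodes without running them.
opcode_names = [op.name for op, arg, pos in pickletools.genops(HANDWRITTEN)]

# Loading the pickle performs the call: the "data" is really a program.
result = pickle.loads(HANDWRITTEN)

if __name__ == "__main__":
    print(opcode_names)
    print(result)
```

Replacing `len` with a function that runs commands is all it takes to turn this into an exploit, which is why loading pickles from untrusted sources is unsafe.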
Another way to do that is to implement a class with the __reduce__ magic method, which is run when the load method is executed. The point is that this __reduce__ method returns the function that we want to run, together with its module, and the parameters for that function. In this case, it basically runs, let's say, a remote shell or something like that on the server side that you can connect to. There are some ways to investigate a pickle if you experience something like that. We can use pickletools, which will show you what exactly is in the pickle and what it will be doing, step by step. And there is also the small library fickling, which is basically for creating bad pickles and for checking what is inside a pickle. Currently, its check for whether a pickle is okay is not working here. Still, it is interesting, because it makes it very easy to create a pickle.

So how can we deal with that? We cannot trust pickles. This is the first rule. The second rule is to sign the pickles that we want to trust and to check the signature when we are loading them.

Okay, let's quickly show this in the terminal. So I've prepared this. Let's run the bad server, which does not check the signature. And I've prepared some pickles. I will be using this send-email pickle. What it actually does is take the netstat output and send it to Mailtrap, which is a provider for testing emails. So let's run it and make a request. Method POST, okay, a second. Okay, I will just go back. Yeah, file path, there is a new file path. Okay, so I will just send it. Let's wait for a second, because the netstat command takes some time. Okay, it succeeded. And if you go here, we have some precious information. Okay, I will show you in the browser. If we go here, in the inbox, I have my precious information. I will not be showing that, because I'm doing this on my own computer. So believe me, the netstat output is there.
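The __reduce__ mechanism behind a pickle like this can be sketched harmlessly. In the example below, the class name, the helper function, and the set of flagged opcodes are assumptions made for illustration; the payload only calls the built-in `len`, whereas a real attack would return something like a command runner. The scan uses `pickletools.genops`, which reads the opcodes without executing them.

```python
import pickle
import pickletools


class Payload:
    """Hypothetical class whose pickled form runs a call when loaded."""

    def __reduce__(self):
        # (callable, args): pickle stores this pair, and loading calls it.
        # A harmless built-in is used here in place of a dangerous one.
        return (len, ("this ran during unpickling",))


# Opcodes that import or call objects; flagged by this sketch as a hint only.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}


def suspicious_opcodes(data):
    """List flagged opcode names found in a pickle, without loading it."""
    return [op.name for op, arg, pos in pickletools.genops(data)
            if op.name in SUSPICIOUS]


blob = pickle.dumps(Payload())
flagged = suspicious_opcodes(blob)
loaded = pickle.loads(blob)  # not a Payload: it is the result of the call

if __name__ == "__main__":
    print(flagged)
    print(type(loaded), loaded)
```

Note that legitimate pickles of ordinary class instances also contain these opcodes, so a scan like this is an aid for inspection and not a safety check.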
I will basically show you what this send-email pickle looks like. It's just a big piece of code that sends all of the necessary information. It just gets dumped into the pickle and loaded into the memory of the server.

Okay, so going back to the terminal, we can see that the good server will simply not accept such a pickle, because here I'm checking the digest. So let's take a look here. The bad server was just loading everything and was happy with whatever it got. The good server, when it accepts something, computes the digest and compares it with the digest that was created previously by the GET request. So in the case of GET, it will work. Let's show it. Let's get a good pickle, which will be signed. Okay, we will receive that. And let's send it. Yeah, success. Start processing. So it basically checked the signature, and it was okay. So it did what was expected.

So let's look a bit deeper into the pickle. Here we can use pickletools, and let's look at the send-email pickle. We see here that something bad is happening, something that will be loaded, and it's Base64 encoded. If we use fickling, it will show the same. Okay, also in the case of the NC pickle, it will show what will be happening in the __reduce__. So it's more convenient for seeing what will happen.

And finally, I will show you how to create a new bad pickle with fickling. Okay, so we have it. Let's look at it with fickling here, and here with the printing. And now we can also put something bad in here. We can inject a new thing into this new pickle, and it will be injected. So besides the print, we get netstat running inside. So there are many tools for doing that. Be aware of that. Okay, so that's all for pickles. Let's go back to the presentation. Okay, the topic of pickle is very big.
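The digest check used by the good server can be sketched with the standard library. This is a minimal example of the general technique (an HMAC over the pickled bytes), not the code from the demo; the key, the function names, and the layout of signature followed by data are all assumptions made for this sketch.

```python
import hashlib
import hmac
import pickle

# Hypothetical shared secret; in practice it must be kept out of the source.
SECRET_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def dump_signed(obj):
    """Pickle an object and prepend an HMAC-SHA256 signature."""
    data = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return signature + data


def load_signed(blob):
    """Verify the signature first, and unpickle only if it matches."""
    signature, data = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("pickle signature does not match, refusing to load")
    return pickle.loads(data)


if __name__ == "__main__":
    blob = dump_signed({"user": "alice", "score": 10})
    print(load_signed(blob))
    tampered = blob[:-1] + bytes([blob[-1] ^ 1])
    try:
        load_signed(tampered)
    except ValueError as error:
        print("rejected:", error)
```

The important ordering is that verification happens before `pickle.loads` is ever called, so an unsigned or modified payload is never executed.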
If you want to go deeper and see some even worse things, you can read all of the sources, the documentation, and so on, to see what else can go wrong with pickles.

Let's switch to YAML. With YAML, it is not so bad. Two years ago, there was a problem because the library was loading everything that was inside the YAML. YAML files are constructed in such a way that they can run every command that you put there. So for example netstat, sending emails, et cetera. Until version five, the PyYAML library was loading YAML files insecurely. But after this version, load uses the safe load, so there is no problem. So the only thing you have to remember is to update your libraries. And that applies not only to PyYAML but to all the libraries that you are using, because developers are fixing many security issues. But also be aware that you may have some case where you use unsafe_load, which will cause a lot of issues, because it will load everything. Maybe you will need some commands from the YAML, et cetera. So be aware of that. If you want to read more, here is the PyYAML project and also some old issues related to YAML.

So let's go to assert statements. In some cases, like contract programming, some programmers use assert statements to ensure that everything in the function goes as expected. So it's checking the input and then checking the state after the execution of the function. And this is quite good. But the problem with asserts is that many Python projects are run with optimization, which can happen in the cloud or in your own projects. And here is the issue. In optimized mode, -O, Python will just skip all asserts. So nothing will be checked. If you want to do contract programming, you have to use some other way, maybe if/else statements. Use asserts only in tests. And here is the screenshot of this case.
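The behavior can also be reproduced with a short script. The sketch below is not the code from the screenshot; the snippet, the helper, and the `set_discount` function are hypothetical examples. It runs the same one-line program in a child interpreter with and without `-O`, and shows an explicit check that survives optimization.

```python
import subprocess
import sys

# A tiny program whose assert always fails when asserts are enabled.
SNIPPET = "assert False, 'input rejected'\nprint('assert was skipped')"


def run_snippet(flags):
    """Run SNIPPET in a child interpreter with the given command-line flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


def set_discount(percent):
    """Validate with an explicit check, which -O does not remove."""
    if not 0 <= percent <= 100:
        raise ValueError(f"discount out of range: {percent}")
    return percent


if __name__ == "__main__":
    normal = run_snippet([])
    optimized = run_snippet(["-O"])
    print("normal exit code:", normal.returncode)
    print("optimized output:", optimized.stdout.strip())
```

Raising an exception explicitly costs one extra line compared with an assert, but the check stays in place however the interpreter is started.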
So we have a function that takes something from its argument and checks it with an assert. If we run Python this way, it will just raise an AssertionError, which is okay. But if we run Python with the optimize option, it will skip the assert statement completely, and everything will keep running as if nothing had happened. So be aware of that.

Also, there are some issues related to temporary files. Basically, temporary files are used by some applications to store something and read it back. So changing these temporary files can cause problems, because it can change the behavior of the application completely. In such a case, we should use mkstemp, so making a secure temporary file. In that case, the file created by the application gets restrictive permissions. On Linux, that basically means it can be read and written only by the user that created it and not by any other user. So this is how we can handle that.

We have some tools that we can use for checking for vulnerabilities. I recommend Snyk.io and Bandit. Snyk.io is a website where you can find many vulnerabilities and check your repo against the vulnerabilities that you are facing. And there is also the script Bandit. What it basically does is check your code for the appearance of certain modules and constructs, like XML, assert, and so on, to show you that you are using something that is dangerous and that you have to make sure it is used in a safe way. If you have made sure that it's okay, you can just skip it in Bandit, and if it happens again in some other place, Bandit will tell you about it again.

Okay, so that's the end of my presentation. If you have any questions, you are welcome to write to my email, Defend Programming EuroPython 2021 at protonmail.com, and I will answer after the conference. And be secure and enjoy programming.
Thank you for your attention, and I hope you enjoyed my presentation. Here is some further reading if you want it, and see you at EuroPython next year.

Okay, so that was the talk on security best practices. He covered lots and lots of things that are very interesting and that you definitely need to pay attention to when writing code. On the topic of XML: you always have to consider XML to be dangerous, or at least potentially dangerous, content. You really have to trust the sources you are getting the XML from. Nowadays, it's typically better to use other formats like JSON. If you want to send any questions, I've posted the email address into the chat, so you can write directly to Michael and ask. I will also pass on to him the few questions that were raised in the chat, so maybe he can send back some answers, and I will post them somewhere in this chat or maybe in the hallway. Okay, so thank you very much for listening, and thank you very much to Michael for giving the talk. Next will be the keynote by David Beazley, and we will take a short break until then. Thank you.