We're really lucky to have her with us, so please give her a warm welcome. To give a brief bio of Menci: in her career, she's been involved with breaking, defending, and building secure applications. She has researched various languages and technologies, found insecure usage in customer code, and suggested automation measures for finding vulnerabilities in Veracode's binary static analysis. She's an avid traveler with the motto, "If not now, then when?" So please, once again, let's give a warm welcome to Menci Sheth.

Thank you for allowing me to talk to you about cryptography. Before going ahead, I'd like to introduce myself. I am Menci Sheth. I work as a security researcher for a leading static analysis company called Veracode. We have over 2000 customers across different business verticals, using all the different kinds of languages and technologies available right now. My job is to be on top of that and, even more importantly, to be on top of the latest and greatest happenings in the application security domain, and to map these two areas of expertise into helping find anti-patterns in customers' code bases. Cryptography is a huge aspect of anything security. I'm a huge crypto enthusiast. I've spent a reasonable amount of time understanding different crypto systems, their internal workings, and different language implementations.

So why cryptography? Well, if you just stretch a little and look around yourself, you might be using half of these applications right at this moment, and everything is based on having a secure crypto system in the underlying layer. You might be logged on to your company's VPN portal, or you might be on the website you are currently using to watch this talk. You might be doing some Bitcoin mining under your desk to keep yourself warm, who knows, or doing some e-transactions, or browsing some photographs stored somewhere.
All these systems need our data to be confidential while it is roaming around on the internet or stored somewhere on some cloud. We also expect the crypto systems to provide us with integrity and authenticity services for our e-transactions or bank transfers while using our e-wallets. Crypto is important. It's used everywhere. It's not only you; at this moment it's being used by your family and loved ones as well. It's a couple of decades old in modern computers, and it's an age-old technology used even by ancient humans, as we all know.

So why do we see so many crypto failures on a monthly basis, if not sometimes on a biweekly or weekly basis? Well, I think crypto is hard. Not more than two dozen people fully understand every aspect of every crypto primitive available, how it can be broken, or how secure it is. On top of that, whenever a developer is building a crypto system, he or she relies on the actual implementation provided by the library he or she is using. Well, these libraries are good, and that's what you should be using, but they have their own set of pitfalls and leave things to be desired. For example, the APIs are not intuitive enough, there isn't good enough security guidance in them, or they are loaded with insecure code snippets. And even a well-meaning, security-conscious developer cannot be on top of the latest and greatest things in the crypto field and understand these implementations well enough to actually develop a foolproof, secure crypto system. I firmly believe most of the crypto bugs we come across are not because of poor implementations, but because of poor understanding and usage in the application. And that was my biggest motivation for this talk: to help ease that burden a little.

Well, before going too much into detail, I want to give some crypto disclaimers. Never, ever roll your own crypto.
No matter how unique you think your application is, please use one of the well-regarded algorithms or primitives out of a good implementation you have access to. Next, you may think your crypto system is solid and foolproof, which is great, but that does not give you a license to ignore all the other application security issues you might be encountering. So you still have to mitigate your SQLi and XSS injections as required. Lastly, I understand not everyone has the liberty to choose their language, implementation, or library, but if you do, I would highly suggest using the libsodium library for any crypto needs. It is developed by cryptographers and very senior security developers. It is a somewhat opinionated library, which means most things are secure by default, out of the box. Very few choices need to be made by the developer, which makes it less error prone. So I would highly recommend using libsodium. It also has wrappers in most of the modern languages. I definitely know there are wrappers available for libsodium in Python and PHP, so I would suggest using those.

First, we are going to talk about the basic crypto building blocks. The first one is the cryptographically secure pseudo-random number generator. I'm going to call it CSPRNG going ahead, as it's quite a mouthful. Then, obviously, your encryption and decryption, and your hash functions. These three make up the most basic primitives of crypto, but they are very rarely used in isolation. They usually take part in building up a bigger crypto application. So next, we are going to touch upon different crypto applications for which there are library implementations available right out of the box. Two of them are symmetric key applications and one is a public key application.

Okay, first thing: CSPRNGs. They produce basically random-looking numbers. But what more do we expect out of them? Well, a lot more. The output should exhibit these three properties.
The first is that, looking at a random number output, we should never be able to identify a pattern in it. Next, it should be completely unpredictable, meaning that looking at the current bit, we should never be able to guess what the next bit is going to be, or even what the previous bit was. And lastly, it should never be reproducible: the generator should never produce the same random number more than once.

Now, how do we generate these numbers? Well, at the center of generating a highly secure CSPRNG output are two aspects. One is the actual algorithm which is used in generating the output. But that alone is no fun; knowing the algorithm, anyone could predict the output. What is crucial is how this algorithm is seeded, or what level of entropy is provided to that algorithm to churn and produce this output. So what are the different entropy sources? Well, it would have been awesome if we could get some non-deterministic source of entropy from outside your laptop, but that's not practical. So what is the next best source of entropy? It's the one which your operating system provides. Your operating system can leverage different kinds of I/O interrupts, like your keyboard or mouse, or your timing cycles, or sometimes even sources in kernel space. It collects all those different sources, generates a source of entropy, and provides it to the algorithm whenever it requests some, and then that algorithm churns it further and gives you a nice-looking output.

Where is this thing used? It's used in practically any crypto system you're going to encounter. It's used for your key material; you need it for your encryption, your MACs, or your digital signatures. It's used for your nonces and initialization vectors. It's used for salting when you are going to store your secure information. Basically, it's everywhere.
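As a minimal sketch of drawing key material, nonces, and salts from the operating system's entropy source, the snippet below uses Python's standard `secrets` module. The variable names and the specific sizes (a 256-bit key, a 96-bit nonce, a 128-bit salt) are illustrative assumptions, not something prescribed in the talk at this point.

```python
import secrets

# secrets reads from the operating system's CSPRNG, so no manual seeding is involved.
key = secrets.token_bytes(32)           # 256 bits of symmetric key material
nonce = secrets.token_bytes(12)         # 96-bit nonce / initialization vector
salt = secrets.token_bytes(16)          # 128-bit salt for storing a secret
api_token = secrets.token_urlsafe(32)   # text token, e.g. for an API key

print(len(key), len(nonce), len(salt))
```

The point of the design is that the application never picks a seed at all; it only asks the OS-backed generator for bytes.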
It's almost safe to say the security of your crypto system is directly, not even indirectly, proportional to your source of entropy. So it's extremely important to pay close attention to that. So again, always choose the source of entropy provided by your operating system.

Now, your operating system provides two different sources. One is a blocking source and one is a non-blocking source. What does that mean? A blocking source will provide entropy only when it has sufficient information in its bucket before giving it to the requesting algorithm, and a non-blocking source will just give whatever entropy it has at that particular instant to the requesting algorithm. Well, for most applications, a non-blocking source of OS entropy is good enough. If you think you don't fit in that 80% category, sure, go for a blocking source.

Second, I'd just like to point out that whenever you're trying to generate your CSPRNG output in a virtual machine, the entropy is provided by the guest operating system and not the host, and the guest does not have that level of high entropy. And again, when you snapshot a system, its entropy state is snapshotted along with it. So just be aware of that phenomenon. And definitely don't use any hard-coded seeds for entropy; even timestamps are not good enough. There have been a lot of attacks because of that. One which comes to my mind is the Sony PlayStation attack, where digital signatures were generated based on a hard-coded source of entropy. So please don't do that. We have already seen enough mishaps around that.

Talking about the actual algorithm: always use an algorithm which is based on a good block cipher like AES, or use a hash-based or MAC-based algorithm. There are algorithms based on weaker block ciphers like 3DES, or even on a dual elliptic curve. We all know what happened with the NSA backdoor there.
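To make the warning about hard-coded seeds concrete, here is a small sketch in Python. The function names and the seed value `1234` are hypothetical; the example only shows that a general-purpose generator given a known seed reproduces its output exactly, while the OS-backed `secrets` module takes no seed from the application.

```python
import random
import secrets

HARD_CODED_SEED = 1234  # hypothetical value, standing in for any fixed or guessable seed


def insecure_token(seed: int) -> str:
    # Anti-pattern: a non-cryptographic PRNG seeded with a known value.
    rng = random.Random(seed)
    return rng.getrandbits(128).to_bytes(16, "big").hex()


def secure_token() -> str:
    # Entropy comes from the operating system; there is nothing to seed.
    return secrets.token_hex(16)


first = insecure_token(HARD_CODED_SEED)
second = insecure_token(HARD_CODED_SEED)
print(first == second)  # the "random" token repeats exactly
```

A timestamp seed fails the same way, because anyone who can guess roughly when the value was generated can replay the candidate seeds.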
Don't use Math.random for entropy, or as a CSPRNG; there is nothing cryptographically secure about it. So please don't do that. I have seen innumerable instances of just that. I'd like to point out some code snippets which, again, I've seen very commonly. Here, a timestamp is used to explicitly seed a pseudo-random number generator in Java. Here, a hard-coded seed is used. Please don't do these things. And I had to put this one in there: don't use Math.random as a CSPRNG. Even if you think you are using a secure hash, the output of a hash is not a CSPRNG either. So don't use that for any of your CSPRNG needs. I've seen this in the top two Stack Overflow posts, where a hash output is used as secure keying material. Don't do that.

Next, let's start talking about the most famous crypto primitive: encryption. Well, we obviously expect our data to be confidential whenever it is in transit or at rest. It should only be accessed by people having access to the symmetric key. And most of our data is going to be on the internet at some point, and for that to happen, we also need some level of integrity and authenticity services out of a good encryption scheme. Why is that so? Imagine there is a sender and a receiver, like in any encryption scheme, and there is an adversary sitting in between, tampering with the ciphertext, sending it to the receiver, and getting information out of a typical padding oracle attack. We all know what happened with the different unauthenticated encryption schemes being used in TLS, most notably the BEAST and POODLE attacks. So it's extremely important for any modern encryption scheme to have authenticity as well as integrity services built in.

Now, traditionally, what used to happen is that to get these two services, we used to have our typical unauthenticated cipher, and then we used to use a message authentication code to provide us with all the services.
Well, it worked, but having too many crypto primitives involved made it slower and much more error prone, with less cryptanalysis behind it, and it was just not working out very well. And most importantly, when most applications need these services, they should not be confined to a few important protocols and protocol implementers. So having such a crypto scheme out of the box was extremely important, and that's what we should be using going ahead.

So how does it actually work? Well, we should be using authenticated encryption with something called associated data, and what that is, I'll explain. At the core of it, it is still a traditional encryption scheme where you have your plaintext, your symmetric key, your initialization vector, and your good encryption algorithm. When all of this is passed through that algorithm, we get the ciphertext, and life is good. And a similar thing happens on the decryption side. Now, for authenticated encryption, we did a little bit more than that; "we" meaning the crypto community. We also give out an authentication tag, which is what provides the integrity and authenticity checks on that cipher scheme. In addition to this, there is something called associated data, which is passed in plaintext from sender to receiver. It takes part in the encryption, authentication, and ciphertext generation, but it is still passed in plaintext. Imagine an IP address: it still needs to be in plaintext, but the payload needs to be encrypted. So this is the scheme which should be used out of the box going ahead, and this is how encryption works. And on the decryption side, again, it works very similarly to a traditional decryption: you give your decryption scheme your ciphertext, your authentication tag, your associated data, your symmetric key, and your IV.
And once all of this matches, the corresponding plaintext is given to you, obviously along with your associated data as well.

So what do these authenticated encryption with associated data schemes look like? It is still a block cipher. Use one of the most famous block ciphers, AES. There are a few other block ciphers out there, outside the United States, which are used; they are good enough as well. The most famous block cipher mode of operation is GCM. It is not protected by patents in any country. It's fast enough. Use that. Most of the languages have an implementation of AES-GCM out of the box; just use those implementations. Any block cipher will need a good padding scheme. Use PKCS#5 or PKCS#7; sometimes the implementations just swap the names based on the underlying workings of it, basically.

Key sizes: the world is not going to fall apart if you just use 128 bits, but I highly recommend using 256. The key should be generated with a very strong CSPRNG, and it should be top secret and stored safely. Next, nonces. The nonce that AES-GCM works with is 96 bits, so that much is enough. It should be unique; we are going to talk about that a little later. Never reuse your key and IV pair for more than one piece of message. Again, it should be generated with a very strong CSPRNG, and be secret as well. The authentication tag size should be at least 128 bits, obviously to avoid any brute force attacks. So this is a template on which any authenticated encryption scheme based on a block cipher should be used, and you should be good.

I also want to draw your attention to this new kid on the block, based on a stream cipher called Salsa. The more modern version of it is ChaCha, and it has its own underlying authentication scheme called Poly1305.
A lot of languages and libraries, Java, even cryptography.io, have started having implementations of it out of the box. If for some reason, after some time, AES-GCM turns out to have some flaws, the internet cannot just stop working. Our cryptographic forefathers have already foreseen that situation and are trying to push this out. There is decent enough cryptanalysis done on it; it's safe enough. And most importantly, it's much faster compared to AES, so for certain IoT devices or Android phones, those kinds of things which might not be able to handle an AES encryption scheme, this is a great option. Even Google and Cloudflare have actually started using this in their TLS protocol. So knowingly or unknowingly, even right now, 20 to 30% of internet websites are actually using it. So don't shy away from this if you have to use it, but there's still nothing wrong with AES.

Okay, I just want to point out the things you should just ignore right now; have a selective ignorance for all of this in your typical crypto library. Most of the libraries still support legacy algorithms. They have been deprecated, they have been broken, and there are much better alternatives available now; just avoid them. These are some of the things still supported by .NET in the latest 3.1 version as well. Lots of unauthenticated block cipher modes are still available, again for legacy reasons. These modes are not necessarily going to leak plaintext, but they are still unauthenticated, so there is no point using them anymore. For padding schemes too, there are many available; just ignore all of them. OAEP padding is more of a public key RSA thing, which we'll talk about later, but just focus on using the right PKCS#5 or PKCS#7 padding going ahead.

Next, let's talk about hash functions.
These are one of the simplest crypto primitives, but one of the most important. Any kind of integrity check through message authentication codes or digital signatures uses them. They're used for any kind of file integrity check you might need, or hard disk encryption integrity checks. How it works is that you have this whole blob of plaintext data, it passes through the hash function, and it gives out a fixed-length output. It has no keys, and that's the great part. It's not just that there is less key material involved; there's no key material involved at all. The output can be called a tag, a checksum, a hash, or whatever; those are a few of the words I can think of right now.

What are some of the key services we expect out of a simple hash function? It should be collision resistant, so no two plaintexts should have the same tag. It should be one-way. Given one input, it should always generate that particular hash output. And finally, it should be unpredictable. So those are some of the key properties we expect out of a hash function. And most importantly, it should have a strength of at least 120 bits. What I mean is that the output should be greater than 120 bits to avoid any kind of brute forcing attempts on the hash.

I'll just keep it simple. Always use the SHA-2 or SHA-3 family of algorithms. BLAKE and SHAKE have been recently approved by government authorities. They're seen in a few implementations, not all, but these are the only safe ones. Yes, there are applications where SHA-224 would be acceptable, but it could be deprecated in a few years, or in a few places it is not allowed. I would say just keep it simple. Unless you absolutely know what you are doing, that's the only time to use it, but you have better alternatives available anyway, so why bother?

Next, let's start talking about some of the symmetric key based applications. As we spoke about earlier, all these primitives very rarely work in isolation.
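Going back to the hash function properties described above (no key, fixed-length output, same input always giving the same output), a short sketch with Python's standard `hashlib` module might look like this. The helper name `sha256_hex` and the sample inputs are illustrative assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    # SHA-256 is a member of the SHA-2 family; note that no key is involved.
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Both digests are 64 hex characters (256 bits), regardless of input size.
print(len(short_digest), len(long_digest))

# The SHA-3 family is available through the same module.
sha3_digest = hashlib.sha3_256(b"a").hexdigest()
```

Because the output length is fixed by the algorithm, picking SHA-256 or SHA3-256 comfortably clears the minimum output size mentioned above.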
These primitives are usually part of an application. The first one is the message authentication code. For any of your integrity or authenticity needs, these are usually used. One of their most famous uses right now is for your API keys. So how does this work? You again have a sender and a receiver. Both of these parties have access to the same symmetric key. On the sender side, the plaintext is run together with the symmetric key through a MAC algorithm, and a particular MAC tag is generated. The actual plaintext and this MAC are passed on to the receiver side, where the same computation is repeated, since the receiver has access to the symmetric key and so can generate the MAC. If the incoming MAC and the MAC generated on the receiver side match, you have your integrity and authenticity checks.

So again, what should you be worried about? Always base your MAC algorithms on the SHA-2 or SHA-3 family of algorithms. The security strength of a MAC is based on a few different internal factors; for your case, just make sure the key size you use is greater than 128 bits. Always protect your symmetric key. That goes without saying, but I still have to mention it because I've seen many mishaps around it. And lastly, there have been constructions of MAC algorithms based on block ciphers. I would say just avoid those. A few libraries, like Bouncy Castle, still support them. There are just too many keys involved and less cryptanalysis, and they are much more fragile than the ones based on hashes, the HMACs. So just use those.

The next application is storing passwords, or storing any secrets which are important to your business. Traditionally, we have been hashing, salting, and peppering all these passwords or secrets before storing them, and we never store the plaintext, which is all great. But with today's hardware advances, those things are pretty trivial to break.
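For the message authentication code flow described above (sender computes a tag with the shared key, receiver recomputes and compares), here is a minimal HMAC-SHA-256 sketch using Python's standard `hmac` module. The function names and the sample message are hypothetical; the constant-time comparison via `hmac.compare_digest` is an addition not mentioned in the talk.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    # Sender side: HMAC over the message, based on a SHA-2 family hash.
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    # Receiver side: recompute the tag with the shared key and compare
    # in constant time so the comparison itself leaks nothing.
    expected = make_tag(key, message)
    return hmac.compare_digest(expected, tag)


shared_key = secrets.token_bytes(32)  # 256-bit key from the OS CSPRNG
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                        # genuine message
print(verify_tag(shared_key, b"transfer 900 to account 42", tag))  # tampered message
```

Note that the message itself still travels in plaintext; the tag only provides integrity and authenticity, not confidentiality.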
So for storing passwords, I would say use this category of algorithms called key derivation functions, KDFs. There are two categories of them. One is adaptive functions, which are just based on repeating the crypto operations thousands and thousands of times and deliberately slowing things down, basically nullifying brute forcing attempts. But these are not powerful enough for current needs and for future-proofing your application. PBKDF2 and bcrypt are the important adaptive functions right now, and only PBKDF2 is government approved. So if you have to use it, I would say have a work factor of at least 200k, 300k, or even 500k if your hardware can support it. If you don't have to abide by that, I would say use a more memory-focused KDF called Argon2. What it does is, in addition to repeating the underlying crypto function, it also uses a lot of configurable memory and populates it over and over again. That reduces the success of offline cracking attempts by a huge margin compared to adaptive functions, and by way more compared to other traditional methods. So the key takeaway: use Argon2 if you can, and if you have to use PBKDF2, use the highest work factor your systems can support.

Okay, let's briefly talk about asymmetric key cryptography, or public key cryptography. Unlike symmetric key cryptography, there is a key pair involved; it's not a single key here. One piece of the pair is supposed to be kept private, called the private key, and one is public information, called the public key. Now, over the years, there have been different ways this key pair has been generated, one of the most common being using prime numbers for the famous RSA algorithm. Well, over time, the RSA algorithm has proved to be more fragile than initially thought, due to the simplicity of the math involved or the padding oracles around it. It has just become more and more fragile with time.
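Returning to the password storage advice above: Python's standard library has no Argon2, so this sketch falls back to the PBKDF2 option from the talk via `hashlib.pbkdf2_hmac`. The function names, the 16-byte salt, and the choice of 200,000 iterations (the low end of the range mentioned) are assumptions for illustration; a real deployment that can use Argon2 would reach for a dedicated library instead.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 200_000  # work factor; raise it as far as the hardware allows


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    # A fresh random salt per password, drawn from the OS CSPRNG.
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    # Re-derive with the stored salt and compare in constant time.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Only the salt, the iteration count, and the derived value are stored; the plaintext password never is.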
Because of that fragility of RSA, the cryptography industry is trying to promote elliptic curve based key pair generation and algorithms based on that. Briefly, how elliptic curve cryptography works is that there is this curve drawn with its equation, which obviously has its coefficients, and based on a number of points on that curve, the key pair is generated. Only one piece of information in this whole equation is private, which is the actual private key out of the pair. The public key is obviously public, and the coefficients are public. And luckily, most of the public key based algorithms have a counterpart elliptic curve algorithm available, to be used directly out of the implementation.

Well, elliptic curve cryptography is not a novel concept. It has been in existence since the mid-1980s. Over time, a lot of curves have been proposed and are in wide circulation. I would say just use the Edwards curves, especially the 25519 and 448 curves. NIST has approved a lot of curves; not all of them are sufficiently secure or have sufficient security strength. I would say if you have to pick a NIST curve, pick one which has a security strength of at least 112 bits and ignore all the other curves.

To mention the public key applications and the corresponding APIs for them: for digital signatures, we have Ed25519 and the corresponding 448 one. For key exchange, we have the same curves in use. And you also have an asymmetric encryption algorithm, ECIES. It's not as common or as widely used, due to its limitations on the amount of data it can actually encrypt.

Yeah, that's what I wanted to talk about today, about all the different aspects of the primitives. It's again very important for us to keep things secure, and the responsibility lies with all of us to make this world a more secure place. Finally, I'd just like to point you to my GitHub repo, where the prescriptive ways of using all the crypto primitives are described, and