So, modern cryptography covers three quite important areas. The first, and probably the one everyone thinks of first, is message privacy. That means ensuring that any communication between two parties can only be read by the sender and the intended recipient. Another facet of modern cryptography is being able to verify a message. That's ensuring that the message you've received is the message that was actually sent, and that it hasn't been tampered with between you and the sender. The final main area it deals with is identity verification: ensuring that the message you've received did actually come from the person who claims to have sent it, and isn't a fake from someone trying to play a bit of a joke on you.
Now, if you've been to any other talks on application security, maybe at this conference or another one, you may have seen this talk. I think this is from last year or two years ago. It looks a bit like this. Cryptography is really quite a hard thing to get right, especially if you start diving into designing your own algorithms and things like that. So the main purpose of this talk is to take a brief journey through the evolution of cryptography, right from the very beginning all the way up to the algorithms we're using today, and try to give you an appreciation of quite why it's so hard and why it's important to get it right. I'm going to start with some historic ciphers, which you can actually do with pen and paper if you want to.
And then I'm going to move on to some of the inner workings of things like AES and RSA, cryptosystems which are in use today.
So, first on to the historic ciphers. The first one we're going to look at, you've probably all heard of: the Caesar shift. It's probably about the simplest cipher you could think up. The basic idea is that you take a message and, to encrypt it, you shift each character in the message up or down the alphabet by a fixed amount. That looks a bit like this. You can see we've got an alphabet at the top, and to apply the Caesar shift to a message we shift each letter, so an A becomes a D, and so on and so forth.
It would be a very short talk if this were the state of the art in modern cryptography. So what's actually wrong with this cipher? Why shouldn't you use it? Well, it turns out that for a given alphabet there's a very small number of possible keys. You've only got 26 letters in the English alphabet, and that gives you 25 different shifts that you can use to encode a message. If you use a shift of 26, the message just encodes back to itself, which isn't very useful. Even if you were to apply a Caesar shift to binary ASCII bytes, you're still only looking at 255 different possible shifts. That means anyone who wanted to read a message you'd encoded with the Caesar shift could simply write a little script, try every possible shift, and see which one produces a message that makes sense. It would be really easy for them to run through the keys and decode your message.
This is quite important to us because it illustrates a really important property of a strong cipher: it has to have a large number of possible keys. That prevents an attacker from just iterating through them all to decode your message.
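The "little script" mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the function names `caesar_shift` and `brute_force` and the sample message are made up for the example, and it handles the 26 letters A to Z only.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text: str, shift: int) -> str:
    """Shift each A-Z letter by `shift` places, wrapping round; leave other characters alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Return one candidate decryption for each of the 25 possible shifts."""
    return [(shift, caesar_shift(ciphertext, -shift)) for shift in range(1, 26)]


# Hypothetical sample message, encrypted with a shift of 3 (A becomes D).
ciphertext = caesar_shift("ATTACK AT DAWN", 3)
candidates = brute_force(ciphertext)
for shift, guess in candidates:
    print(shift, guess)
```

With only 25 candidates to print, a person can pick out the readable one by eye, which is exactly why such a small keyspace is a problem.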
So, let's move on to another straightforward cipher, a bit of an evolution, a step up: the substitution cipher. This works in a similar way, but instead of just moving the letters up the alphabet, we randomly shuffle them around. You swap each letter in your plaintext for a different, randomly chosen letter. It looks a bit like this. In this one I've picked an encoding which changes A to Z, B to N, C to Q, et cetera, and you do that for the entire alphabet.
How does it hold up? This one is actually significantly better than the Caesar shift. Using just the letters of the English alphabet for your substitution, there are 26 factorial possible keys: 403 septillion, 291 sextillion, 461 quintillion, 126 quadrillion, 605 trillion, 635 billion, 584 million. It's quite a big number. With such a large number of keys, you'd think this is a really strong cipher, right? Why aren't we using a substitution cipher to store people's bank details? Well, it turns out the weakness of the substitution cipher doesn't come from a low number of keys. Its keyspace is equivalent to about an 88-bit key. That's not as strong as modern ciphers with 128 or 256 bits, but it's still pretty strong, and a key of that size is only just about within our computational capabilities to brute force.
Instead, the substitution cipher falls foul of probably the greatest nemesis of cryptographic algorithms, which is statistics. Because the cipher is so simple, it fails to hide any underlying patterns in the data you've encrypted with it. That means that if you've got a message someone has sent and you want to recover the original plaintext, you just need to look for the patterns. If the plaintext is English, this is the pattern to look for: a graph of the letter frequencies in the English language.
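The key count and the "about 88 bits" figure quoted above can be checked with a short calculation. This is a sketch that assumes a plain 26-letter alphabet, where the number of substitution keys is the number of ways to arrange the letters, 26 factorial; the variable names are mine.

```python
import math

# Every ordering of the 26 letters is a possible substitution key.
substitution_keys = math.factorial(26)

# Equivalent key length in bits: the power of two that gives the same number of keys.
equivalent_bits = math.log2(substitution_keys)

print(f"{substitution_keys:,} possible keys")
print(f"roughly {equivalent_bits:.1f} bits")
```

Compare that with the 25 usable keys of the Caesar shift, which is under 5 bits.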
So, what you can do is count up all the letters in the ciphertext you've intercepted and plot them on a graph like this. Then you can take a bit of a guess and say, well, the letter that occurs most often is probably E, because that's the most common letter in the English language. Once you've done that, you can maybe get a few others. There's a peak around S and T, and at some of the other vowels, A and O. You can start to decipher bits, and once you've got little bits you can guess at words and say, well, that looks like the word "the", which is one of the most common words. That gives you a couple more letters, and you can carry on decoding. Bit by bit, you piece together the full text of the message. It takes a little bit of work, but you can automate it with a script quite easily, and you can even do it on paper. So it's not really very strong.
The substitution cipher is quite an old cipher, and during the 1500s and 1600s people sought to improve upon it, because they realised it wasn't hiding the patterns very well. They came up with a cipher called the Vigenère cipher. This is one of a group of ciphers known as polyalphabetic ciphers, so called because rather than using just one possible encoding for each letter in your plaintext, they use several different encodings. That helps to disguise some of the underlying patterns in the plaintext.
To encode a message with this cipher, you first need to pick a key. In this simple example, I've picked the key KEY. Probably not a good idea to use that as your actual key, but it works quite nicely as an example. Using the letters of the key, we take a Caesar shift that maps A to the first letter of the key, K. We do the same for E and the same for Y, and then we've got several different shifts that we use one after the other.
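The scheme just described, one Caesar shift per key letter, used one after the other and repeated when the key runs out, can be sketched like this. The function name `vigenere` and the sample plaintext are my own choices for illustration; only the key KEY comes from the talk, and I'm assuming non-letters are passed through without using up a key letter.

```python
from itertools import cycle

A = ord("A")


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Apply the Caesar shift named by each key letter in turn (A=0, B=1, ...)."""
    sign = -1 if decrypt else 1
    key_shifts = cycle(ord(k) - A for k in key.upper())  # repeat the key forever
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = sign * next(key_shifts)
            out.append(chr((ord(ch) - A + shift) % 26 + A))
        else:
            out.append(ch)  # spaces and punctuation are left as they are
    return "".join(out)


ciphertext = vigenere("ATTACK", "KEY")
print(ciphertext)                                  # KXRKGI
print(vigenere(ciphertext, "KEY", decrypt=True))   # ATTACK
```

Notice that the two Ts in the plaintext come out as different letters (X and R), which is the property that flattens the frequency graph.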
That looks a little bit like this. We've got a message that we want to hide. It's a pretty secret message; you don't want people knowing that you're about to blow something up. You take the key and encode the first letter of your message using the first Caesar shift, the one represented by the letter K, the second letter with E, and the next one with Y. Once you run out of letters in your key, you repeat it, and it looks a bit like that. We continue reusing the key, and we get the ciphertext at the bottom.
Anyone want to guess, is this cipher secure? Nope. Absolutely not. It did take a while longer to break, and the break is credited to Charles Babbage, who is quite a well-known historic figure. However, it wasn't until 1985 that this was actually recognised, because when he did it the British government kept it a bit secret; they didn't want anyone to know they'd broken it. Until then someone else, Friedrich Kasiski, had been credited with the discovery. He discovered it a bit later, and it's him that the technique is actually named after.
Unlike with the simple substitution cipher, you can't directly use frequency analysis on a ciphertext encoded with this. If you were to try, you'd probably see a fairly flat graph. Provided the key is long enough, you'll find the letter frequencies are fairly similar to each other, and there's nothing you can easily pick out. However, although the multiple encodings stop you doing frequency analysis directly, what you can do is look for repeated sequences of letters in the ciphertext. What these repeated sequences possibly tell you is that the same piece of plaintext has lined up with the same part of the key. So you can count the distance between the repetitions, and here you'll notice that they correspond to the same word being encoded 12 letters apart. So, this suggests that the length of our key is a multiple of 12.
Oh, sorry, 12 is a multiple of our key length. So the keyword that's been used to encode this text could be 2, 3, 4, 6, or even 12 characters long. That helps us narrow down what the key might be. What you can then do, and I'm cheating a bit because I know how long the key is, is take a guess that the key is three letters long and split the message into three groups: the first letter, the fourth letter, and so on into the first group; the second letter, fifth letter, and so on into the second group; and the third letter and so on into the third group. You can then perform frequency analysis on each of those groups, and each should produce a graph that looks similar to the one a few slides back. Once you've done that, you take the graph you're looking at and try to line up the most frequent letter with E, and that helps you work out the key letter at that position. Once you've done that for every position, you'll be able to recover the key and decode the whole message.
Okay. Now, when I first gave this talk, I put together this little cipher challenge. It uses a slightly more difficult cipher, but it's a very, very similar process. So if anybody fancies having a go at breaking one of these themselves, there's a link there. I've got it at the end of the talk as well, so I'll put it back up if you don't manage to copy it down right now.
Let's move on to a different cipher, one that I would hope everyone in the room has heard of: Enigma. It was about another 100 years after the Vigenère cipher was broken. World War II saw several attempts to mechanise cryptography, moving it from algorithms you could perform with pen and paper to using machines to perform cryptographic operations. The most famous attempt was the Germans' Enigma machine.
During the war, though, all the major powers, including the Americans and the British, were using machines that worked in a fairly similar way to encode their text. The American and British ones were a little bit more secure than Enigma, but based on the same underlying principles.
An Enigma machine looked like this. It's made up of four parts. There's a keyboard for typing your message in. There's a set of rotors at the back, each of which acts as its own substitution cipher. There were commonly three rotors in a machine, but some machines had as many as eight rotors that could be interchanged. The military machines the Germans used came with five different rotors, which they could use in any combination, so they could use rotors one, two, and three one day and three, four, and five the next. Later in the war they expanded this to a set of eight rotors, which provided a little more security. The final component, at the front here, was a plugboard. This swapped pairs of letters using wires. You could use up to 13 different swaps, but usually only about 10 were used. The ability to swap these letters is where quite a lot of the Enigma machine's security came from.
This is how it looks inside. It's an electrical machine, so when a typist presses a key on the front of an Enigma machine, an electric current completes a circuit which goes through the plugboard and through each of the rotors, getting mangled up along the way. At the back of the machine there's a reflector, which is literally just a load of wires that sends the current back through each of the three rotors and back through the plugboard, and finally it illuminates a lamp on the top of the machine.
Then the radio operator would send that letter, and the rotors would move round, so that the next time you pressed a key the current took a different path through the machine and came out as a different letter. It stepped the first rotor one position at a time; once that had gone all the way round, it stepped the next rotor on by one, once that had gone all the way round it stepped the next one, and so on. That meant the same letter could be encoded in a vast variety of different ways each time it was used. Because the mapping between plaintext and ciphertext was constantly changing, it was really difficult to do any sort of frequency analysis on the letters, and the cryptanalysts at Bletchley Park were stumped for quite a while on how to actually analyse these messages.
Enigma is quite famous, but one of the things it's famous for is having been broken. A team of British cryptographers at Bletchley Park, led by Alan Turing, who's also quite a famous guy, I've heard, designed machines to help them break the mechanical ciphers. The breakthroughs the team made weren't particularly based on weaknesses in the cipher and the algorithm itself, but mostly on operational errors made by the Germans. Examples of those errors included choosing bad keys, like AAA as the initial setting for the rotors, which made messages fairly trivial to decode, and having predictable message structures. For example, the first message most German military units would send in the morning was a weather report, which would contain the German word for weather, "Wetter". This was one of the weaknesses that the device designed by Alan Turing seized upon: it looked for the word "Wetter" in the decrypted text.
It then stepped through all the possible combinations of rotor and plugboard settings to try to find it. They had whole rooms full of these machines that Alan Turing and his team designed. Whenever they got a message in first thing in the morning, one of the machine operators would rush in and put the message into the machine, it would whirr away for a couple of hours, and it would find the key. The Germans changed the keys once a day, so once the team had found the key for the day they could decrypt all the messages the Germans sent that day. That gave the British and their allies quite a big advantage in the war, because they knew exactly what the Germans were up to. So this break was quite a big thing.
We're going to leave the historic ciphers there. We've gone all the way up to about halfway through the last century with those machines, and now we're going to look at some of the algorithms that are actually useful to us today for storing data securely.
Modern-day cryptography, as I was alluding to at the beginning, can be broken down into several different problems that we need to solve in order to communicate securely. The first of those is confidentiality. We need to ensure that people other than the intended recipient can't read our messages. There's a wide variety of algorithms that people have built for doing this, but probably the one you want to be using at the moment is AES, the Advanced Encryption Standard. Most of these algorithms are symmetric; that is, they use the same key for both encryption and decryption. There are two main classes: stream ciphers, which work on continuous streams of data, and block ciphers, which break the message up into separate blocks and encrypt each one separately. I'm going to show an example of each in a few minutes.
Another thing that we need to solve is key exchange.
Obviously, if I want to communicate with you securely, we both need to have a key that we can use to encrypt messages. One of the ways we do this is with asymmetric ciphers. These are ciphers that use a different key for encrypting than for decrypting. You can give me the encryption key fairly safely. I can use it to encrypt a message to you, but I can't then decrypt any messages encrypted with that key. Only you can, with the other part of the key. That's quite an important thing to be able to do.
Another thing we can do is verify the identity of a sender. Again, it works in a similar way to key exchange. You sign a message using a private key that you keep secret, and you publish a public key and say, this is my public key. Anyone who receives a message from you can use that public key to verify that it was signed with your private key. That's known as a message signature. Again, I'm going to go into detail on this later on in this section.
Another thing we need to do is authenticate a message: make sure it hasn't been tampered with. For this we generally use cryptographic hash functions, such as SHA-256. When you receive a message, you can compute its hash and compare it to one that's been sent along with the message, maybe signed with the sender's private key. If it doesn't match, you can reject the message and say someone's messed with this, tampered with it, send it again, or whatever. Obviously you need to combine that hash with a secret key of some form. Otherwise, someone who tampered with the message could just recompute the hash.
A final thing that quite a lot of people maybe don't realise is part of modern cryptography is the ability to generate random numbers. A large number of secure protocols rely on being able to generate random numbers that are actually random.
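Two of the points above, combining a hash with a secret key and generating keys from a proper random source, can be sketched with Python's standard library. The talk doesn't name a construction, so using HMAC-SHA256 here is my assumption; it is one common way of keying a hash. The helper names `make_tag` and `verify` and the sample messages are made up for the example.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute a keyed hash (HMAC-SHA256) of the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# The key comes from the operating system's secure random source,
# not from a predictable generator.
key = secrets.token_bytes(32)

message = b"Meet at noon"
tag = make_tag(key, message)

print(verify(key, message, tag))          # True: message is untouched
print(verify(key, b"Meet at nine", tag))  # False: message was altered
```

Without the key, someone who alters the message has no way of producing a matching tag, which is the difference from sending a bare hash alongside it.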
One example is a public key cryptography system. You might generate a random key to use with a symmetric cipher, encrypt that key with the recipient's public key, and send the whole lot along. Now, if someone can predict the number that came out of your random number generator, they can guess what that key was, forget about trying to break the algorithm, and just decrypt the message. So being able to generate secure random numbers is really important.
Symmetric ciphers. As I mentioned previously, there are two classes of symmetric cipher: block ciphers and stream ciphers. All of these algorithms are really only useful for dealing with message confidentiality. There are no symmetric algorithms you can really use for key exchange at the moment, so they don't solve that problem.
I'm going to start by looking at stream ciphers. A stream cipher produces a constant stream of pseudo-random output bytes, and the secret key you're encrypting the message with is used as a seed for this generator. The bytes produced by the generator are then XORed with your plaintext to produce the ciphertext. You send that along, and the person on the other end can produce the same pseudo-random stream of bytes and XOR it with the ciphertext to recover the plaintext. It's actually fairly similar to how Enigma works, in a way; stream ciphers sort of evolved from the Enigma machine.
There are several stream ciphers in use today. Probably the most well known is RC4, which is used in WEP and SSL. But I've chosen a slightly different one, which is called A5/1. You've probably never heard of this algorithm, but I can almost guarantee that every single one of you in this room is using it, because it's used to protect voice and SMS data on mobile phones.
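The general shape of a stream cipher, a key-seeded generator whose output is XORed with the data, can be sketched as below. To be clear about the assumptions: the keystream generator here is a toy of my own (SHA-256 over the key plus a counter), not RC4, not A5/1, and not something to use for real data. It is only there to show that the same XOR operation both encrypts and decrypts.

```python
import hashlib
from itertools import count


def keystream(key: bytes):
    """Toy pseudo-random byte generator seeded by the key. Illustration only."""
    for i in count():
        # Each hash call yields 32 more keystream bytes.
        yield from hashlib.sha256(key + i.to_bytes(8, "big")).digest()


def xor_stream(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; running it twice restores the input."""
    return bytes(d ^ k for d, k in zip(data, keystream(key)))


key = b"not a real key"
plaintext = b"hello, stream ciphers"

ciphertext = xor_stream(key, plaintext)
recovered = xor_stream(key, ciphertext)

print(ciphertext.hex())
print(recovered)
```

Because the receiver regenerates exactly the same keystream from the shared key, applying `xor_stream` to the ciphertext cancels the XOR and gives back the plaintext.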
This algorithm has actually been broken, and it's no longer considered secure. It was state of the art a few decades ago, and now it's really not. But it's an interesting one to look at as a stream cipher example. It works a bit like this diagram here. It's got a big state machine in the middle consisting of three registers, each of a different size. Each time you want a new bit out of your generator, it takes the top bit of each of these registers and XORs them together, and that's the output bit. Once it's done that, the registers are shifted to the left depending on these bits here, which are called the clocking bits. It compares the clocking bit of each register and takes the majority, so if they're all zeros the majority is zero. Any registers whose clocking bit matches the majority are clocked, and any that don't match are left alone. So if you've got, say, one, zero, one, the majority bit is one, so the registers that match it, the first and the third, get clocked and move to the left by one.
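The majority clocking rule described above is small enough to sketch on its own. This covers only the decision of which registers move, not the registers themselves or their feedback, and the function names `majority` and `registers_to_clock` are mine rather than anything from the A5/1 specification.

```python
def majority(a: int, b: int, c: int) -> int:
    """Return the bit value held by at least two of the three clocking bits."""
    return (a & b) | (a & c) | (b & c)


def registers_to_clock(clock_bits: tuple[int, int, int]) -> list[int]:
    """Indices (0, 1, 2) of the registers whose clocking bit matches the majority."""
    m = majority(*clock_bits)
    return [i for i, bit in enumerate(clock_bits) if bit == m]


# The example from the talk: clocking bits one, zero, one.
print(registers_to_clock((1, 0, 1)))  # [0, 2]: first and third registers shift
print(registers_to_clock((0, 0, 0)))  # [0, 1, 2]: all agree, so all shift
```

One consequence of the rule is that at least two of the three registers move on every step, so the registers drift in and out of step with each other in a way that depends on their contents.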
In order to generate a new bit for the back end here, it takes the bits that are coloured in blue, XORs them together, and puts the resulting new bit on the end. Let's look in more detail at how it actually works when it clocks. This is the uppermost register from the previous diagram; you can see it's got some values in there. We clock it once and everything shifts left, we XOR these bits and put the result back on the end. Next cycle we do the same thing: XOR, put it back on the end. That gives a really long sequence of different bits, and we've got three of these registers, so you get quite a long run of pseudo-random data coming out of it.
Now, stream ciphers are quite useful, but there are a few things you need to keep in mind when you're using them. The first is that keys must not be reused. For a given key the cipher will always produce the same output bytes, and because of the way that output is combined with the plaintext using XOR, if you encrypt two different messages with the same key, somebody can use those two messages to start recovering parts of your output stream, and therefore decrypt your messages. To guard against this, a lot of stream ciphers include what's known as an initialisation vector, or IV, which is combined in some way with the secret key. You send the IV along as part of your message, the recipient uses the same algorithm to combine the key with your IV, and that makes sure you're effectively using a different key for each message. WEP is actually vulnerable because of this. Although it uses an IV in the initialisation of the RC4 algorithm, the IV they picked was too small, which means that if you're sending lots of Wi-Fi packets back and forth, eventually the IVs will repeat, and not just the secret key. Once someone detects two messages using the same IV, they can use those messages to work out the output stream, and at that point they can recover the
key for the network, connect to your Wi-Fi network, sniff your traffic, and so on, which is why we've phased out WEP in Wi-Fi these days.
It's also fairly easy for an attacker to modify a message. Let's say you're downloading an HTML page for a website and it's encoded with a stream cipher. Anyone who can guess that there's maybe a JavaScript file referenced in the header of that page can compute the XOR of what they think is in the message and what they really want in the message, and XOR that into the ciphertext. That will replace it in the ciphertext, and it will decode to what they want rather than what was sent. This means that when you're using a stream cipher you need to make sure you've got some sort of message authentication, such as a keyed message hash, to prevent this tampering.
The last point is not so much a security concern as a practical one. Most stream ciphers have to be run all the way through from the start to decrypt a message, and you can't arbitrarily seek into the stream. Some have been designed to allow this, but most have not. This means that if, say, you've encrypted a huge 50 gig database backup using a stream cipher, and you need to recover one table's worth of data from that backup, and you know it's 20 gig in, you're still going to have to decrypt the first 20 gig to get to that table data. You can't just seek into it and decrypt the bit you need. So that's something to be aware of.
The next thing we're going to look at is the block cipher. The key difference between a block cipher and a stream cipher is that where a stream cipher basically produces a pseudo-random stream which you use to encrypt, a block cipher works on a block of plaintext directly and applies various mathematical transformations to it. The size of the block differs depending on the algorithm, but it's usually much shorter than any message you might want to send. AES, for example, uses 128-bit blocks;
older ciphers tend to use 64 bits. That's obviously a lot shorter than any message you're going to want to send, so you need to break your message up into blocks and encrypt each one separately, and that's where the name block cipher comes from.
The most famous one, which you'll have heard of, is AES, and that's the one you'll be using for a lot of your day-to-day encryption needs. AES was the result of a cryptography competition to find a replacement for an older encryption standard, and it was eventually won by a slight variant of the Rijndael algorithm. You might know, if you've been using the Mcrypt extension, that there's a different flavour of AES that's just called Rijndael, and if you use the wrong constants with Mcrypt you get that one instead, which is not quite what you want, so it can catch you out there. Fortunately Mcrypt has been deprecated, but if anyone's still using it, it's something to look out for.
So, how AES works. You start off with your message and apply the AES round function to it repeatedly for a number of cycles; how many cycles you run through depends on the key length. Each round of AES consists of four distinct phases: substitute bytes, shift rows, mix columns, and then it adds a portion of the secret key to the data. Then it repeats the loop, so each of those phases is applied every time you go round.
Looking at it in a bit more detail: AES works on a block size of 128 bits, and the substitute bytes step is effectively a substitution cipher with a fixed key. That key has been chosen specifically to provide defence against a number of cryptanalytic techniques that were used against the previous standard, DES. What happens is it just takes the current byte in the state, looks that byte up in the substitution table, and replaces the byte in your current block with the byte from the
substitution table. The next thing that happens is that each row in your block is rotated by a different number of bytes, with the bytes that come off the front wrapping round to the end, so it looks like that diagram. The third step is a complicated mathematical operation, but it's effectively a multiplication over each column, and it's there to mix up the data in a different way. In the final round of the cipher this step is skipped, but it happens in every other round. Finally, it XORs each byte of the round key with the corresponding byte in the block. Once you've gone through all those steps a number of times, 10 rounds for a 128-bit key and up to 14 for a 256-bit key, you output a block of ciphertext. It's quite an efficient algorithm, and it's implemented in hardware in quite a lot of processors these days, so it's very fast to compute. That's how it works.
Now, with a block cipher you're obviously going to want to encrypt more than 128 bits at a time, so we need a way to apply a block cipher to a message that's longer than that. There are quite a lot of different ways in which you can do this, and these are known as modes of operation.
One mode is known as electronic codebook, or ECB. It's probably the one you'd think up first: you literally encrypt a block of text, then the next one, and the next one, and just append them. You've got a plaintext block and a key, you do your block cipher encryption, and you get an output block; you do the same with the next block, and you just append the results. Believe it or not, this is a really bad mode which you shouldn't use. Although a block cipher produces output that looks really random and is really difficult to reverse, any block of plaintext that's the same, the same 128 bits, will come out as exactly the same ciphertext. That gives you the same problem as with a substitution cipher, where someone can
actually look at statistical patterns in your data. Obviously it's a bit more difficult, but take a look at this penguin, Tux, the Linux mascot. And that's Tux encrypted using AES in ECB mode. You can quite clearly see that Tux is still there. If this was an image that you didn't want people to be able to see, having it like that probably wouldn't be your desired result. So you should never really use ECB mode, unless you just want to make cool pop art images like this. That's a legitimate use, I guess.
The first attempt to fix that problem was called cipher block chaining, or CBC. You take an initialisation vector, similar to a stream cipher, and feed that in for your first block: you XOR it with your plaintext, pass the result through the block cipher encryption, and get your ciphertext. For the next block, you feed that ciphertext back in, XOR it with the next block of plaintext, and pass that through. This adds a randomiser to your plaintext, so even if you've got two plaintext blocks that are the same, because this and this aren't going to be the same, they're going to come out differently. It's a bit similar to a stream cipher in that you need an initialisation vector and you should not reuse it, but that sort of half solves the problems of ECB.
Another mode that's used is called CTR mode, or counter mode. This is quite a cool mode in that it turns a block cipher into a stream cipher, with a few advantages. It differs completely from the previous two modes we've looked at, as we no longer directly encrypt our plaintext. Instead we take a nonce, or initialisation vector again, together with a counter which we start at zero; we encrypt that, and XOR the result with our plaintext, which produces the ciphertext. For the next block you increment the counter, encrypt that, and XOR it. You keep going until you've got enough bytes to encrypt your whole message. Now, due to the random nature of the
output of a block cipher, it effectively turns this into a stream cipher. Now, obviously, with CTR mode you need to take the same precautions as with a stream cipher, such as not reusing the IV, and ensuring that you've got some message authentication to make sure it's not been tampered with. Something else that this gives you that a stream cipher doesn't is that you can actually seek into your encrypted text. If you want to get block 557, all you do is take your nonce, set the counter to 557, pass it to the block cipher, and you can decrypt that block without having to decrypt everything up to that point.

Another mode, which helps alleviate the issue we had with CTR mode, the fact that it's a stream cipher without authentication, is Galois/Counter Mode (GCM). What this does is combine counter mode with an authentication tag, which helps verify that the message hasn't been tampered with. How that works is, first of all, you've got the counter mode going on up here, but you also have some auth data here, which goes through a multiplication function using a value derived from your key. You XOR in each of your ciphertext blocks in turn, chaining on a multiplication after each one, and then finally add on the lengths of your auth data and your ciphertext, and that produces an authentication tag. When the person you send the message to receives it, they can verify that authentication tag before decrypting the message, and make sure it hasn't been tampered with. That's how that works.

So that's covered off a lot of the symmetric ciphers. All of the ciphers we've looked at so far, including even the historic ones, used a single key for both encrypting and decrypting. That leads to a really hard-to-solve problem, which is key distribution. Imagine for a moment that you're an undercover agent. Alice is undercover, deep in an enemy operation, and needs to
get a secret message back to Bob at HQ, to send mission reports and tell Bob what the evil overlord is up to. They obviously need to do so without the agents of the enemy, Mallory and Eve, eavesdropping on and modifying the messages. This is where asymmetric ciphers come in. Because they use different keys for encryption and decryption, we don't need to pre-share keys, so I can just keep on sending encrypted messages to you without having to worry about keeping my key secure. And even if I send a message out and it's intercepted and the enemy capture it, they won't be able to search my room, find the key that I used to encrypt it, and read it, because that's a public key, which can't decrypt it again.

So, looking at public key cryptography, quite a good way of thinking about this is to think about a padlock. So there's a padlock. Now imagine that I want someone at the back of the room to send a message to me. What I could do is give them a nice sturdy metal box and my padlock. They could write a message on a piece of paper, put it in the box, attach the padlock, and pass the box from the back of the room, past all you people who we don't really trust, back to me. None of you, unless you've got some bolt cutters (brute force), are going to be able to undo that padlock. But as soon as it gets back to me, I've got the private key, which I'm keeping over here, and I can unlock it and read the message. That's kind of the same idea that public key cryptography uses.

One of the first and oldest public key systems is RSA. Despite the huge improvements in processing power we've made since its invention, it's still quite a secure algorithm. It relies on a mathematical problem which is quite easy to compute in one direction but really difficult to reverse. If you've just been set this mathematical problem, it's really difficult to
reverse, but if you happen to know a secret that was used when constructing the problem, it's really easy for you to reverse. We have actually come up with better public key algorithms since, but only by virtue of the fact that they're more efficient to implement in code and use fewer CPU cycles to run. They're not actually much greater in the level of security they provide; it's just that you can use shorter keys with them.

So the mathematical problem that RSA is based on is exponentiation in modular arithmetic. The idea is that you can find three rather large numbers, E, D and a modulus N (the product of two large secret primes), such that when you take any number and raise it to the power of E and then raise it to the power of D, it equals itself modulo N. So it's sort of cyclic. Now, if you've got a message that's been encoded like this and you only have E and N, it is really, really difficult to figure out what D is mathematically. However, if you already know D, you can actually reverse it. So you can make N and E publicly available, and you can pose this as a mathematical problem to the world, safe in the knowledge that it's really, really hard for them to solve it.

In order to actually use this for a cryptosystem, you've got my public key, which is E and N. You take the message M that you want to send to me, raise it to the power of E, take it modulo my N, and that becomes the ciphertext. You can send that to me in the knowledge that only the person with D can do this calculation here, reverse the process, and turn it back into the plaintext M. A slight issue with this scheme is that the message M must be smaller than the modulus N. So usually what you'll do with RSA and similar public key algorithms is generate a random string that you use as the key for a symmetric cipher like AES, and then you'll use RSA or another public key algorithm to encrypt just the key for the symmetric cipher, and then
you'll send the ciphertext from the symmetric cipher and the encrypted key to your intended recipient. They can then use their private key to recover the random key, and then they can decrypt the message. That solves the key distribution problem.

Another thing we need to solve is identity verification. Although Alice and Bob can send messages to each other securely without Eve being able to eavesdrop and recover those messages, how can they protect against the mischievous Mallory, who likes to tamper with messages and change them? It turns out you can use a similar thing with RSA. If I take a message, say a hash of the ciphertext that I'm sending to you, and I raise it to the power of my private key D, I can send that as a signature. When Bob back in the office receives that message, he can raise the signature to the power of my public key, and if it was signed using my private key it will return the value of the hash. He can then hash the message, check it matches, and then he knows that it was sent by Alice. If it doesn't match, then something's gone wrong, and he knows that the message has been tampered with or didn't come from Alice in the first place. Again, in practice the value being signed must be short, so the signature is usually computed over a cryptographic hash of the message, like SHA-256.

I've covered a selection of the algorithms in use today that solve quite a lot of the problems in modern cryptography, so in this last five minutes or so we're going to have a look at how you'd actually go about implementing cryptography in applications, if that's what you need to do. The first bit of advice is: don't. I hope you've got an idea from this talk of just how many things you need to keep an eye on and how difficult it is to do these things securely. So if you've got a need to encrypt data, as much as possible don't try and do it yourself. It's very easy to introduce vulnerabilities into applications through things like
side channels. Even if you take AES, which is a secure algorithm, and implement it, if you don't take care of things like how long it takes to encrypt the data, someone might be able to retrieve information about your key or your plaintext just by measuring how long it takes to encrypt messages. So the best bit of advice is to use an existing implementation. Everything I've gone over today is battle-tested and hardened, it's been pored over by cryptographers for at least a decade, and there are well-known, well-tested, well-used implementations out there. That's a good first starting point. All the major web servers have support for TLS, so use TLS for your connections between servers. Don't try to manually implement some sort of "encrypt, send it over HTTP, decrypt on the other side" in your application; just use HTTPS. It's well tested and it's going to work.

Another option, if you've got, say, two remote data centres: you could use a VPN between the two, or maybe an SSH tunnel. Again, all of these protocols and algorithms have been well tested. There are lots of people going over the source code all the time, patching security vulnerabilities, and you're benefiting from the knowledge of people who do this all the time.

If one of those situations doesn't fit your use case, and you do actually need to implement some cryptography, bring in an expert. Bring in someone to audit your code and make sure you've not made any mistakes. Cryptography isn't a skill that most developers are really tip-top on, so although you can probably implement it, bring in an expert, an outside consultancy, and get them to make sure you've done it right. It might seem expensive to bring someone in to do that, but it's nothing compared to the costs if you get hacked and become the next Ashley Madison or Sony and get all that data spewed all over the internet. Especially with the GDPR
coming in now, that can be quite costly if you make a mistake with this stuff. So bring in an expert; it'll save you money and worry in the long run.

Obviously this is a PHP conference, so what should you do if you actually need to encrypt and decrypt stuff in PHP? There are a lot of libraries out there. I've reviewed a lot of them, and there are quite a few that default to using pop-art penguin mode and other insecure options. If you were to just download one with Composer and be like, "Yep, that's a library, I'll use that," you'll end up with a few problems, like people being able to view pop-art versions of your images.

So top of my list of recommendations is a library that Scott Arciszewski has written, which is Halite. Halite's a wrapper around libsodium, which is a library written by cryptographers to limit the number of choices that developers are given. You've got one implementation of things, and that's a secure implementation; that's the design goal of libsodium. Halite just provides a high-level interface for that, which is really straightforward. It looks a little bit like this. That would be how you would encrypt something using Halite, the wrapper around libsodium. It's that easy, and that will be secure. So if you need to do something, that's your best option. Libsodium is in PHP 7.2, and you can install the extension from PECL for versions before 7.2. If you're in an environment where you can't install it, Scott's also written a polyfill in PHP which you can include, and it's got all the same algorithms written in PHP. So you've got a fairly good range of options there.

If for some reason you can't use libsodium, maybe because you need compatibility with a legacy system or something like that, Defuse PHP Encryption is another good library that's implemented in PHP. It uses more of the older-style algorithms, such as AES, for its cryptography, and it tries to implement those in a secure way, deferring to things like OpenSSL if you've got that
installed, so it'll still be quite performant. So that's another option. It's got a fairly similar API to Halite, so it's pretty much Defuse encrypt and Defuse decrypt, which makes it easy to get right.

So that's pretty much close to the end of the talk. I've got a few links now for anybody who's interested in finding out more. There's a wealth of really interesting stuff on cryptography. One of the best, if you're interested in the historic side of the talk, is Simon Singh's The Code Book. It's got the ciphers I went over and a few other ones, and there are also some really interesting case studies on what happened when cryptography went wrong. It's a really good resource. He's also got a website where you can try out some of these ciphers and encode and decode messages.

Another good resource, if you're interested in modern cryptography, is Bruce Schneier's site. Bruce Schneier's a well-known cryptographer who has worked for a lot of leading companies on security. He also created Blowfish, the cipher behind the bcrypt hash function, which you're hopefully using for your passwords. If you do feel like starting to mess around creating your own algorithms, he's got a self-study course on his website, which gives you algorithms of increasing difficulty that you can have a go at breaking yourself. That's because, in order to become a good cryptographer, you need to be able to prove that you can break other algorithms. That way, if you're really good at breaking algorithms and you can't break your own algorithm, it's probably pretty good. But if you're someone who's just turned up and said, "Yeah, I've just invented this thing," how can we trust that it's any good? If you've got a proven track record of being able to break cryptography and you say it's good, we can probably trust that. So if you're interested in getting into that side, it's pretty good. The
final link there is a library called PHP Crypt. Do not use this in production, absolutely not, but it includes a pure PHP implementation of quite a lot of interesting ciphers. So if you want to have a look at the code implementation of some of these, including the historic ones (I think it's got Enigma in there and a few others), it's quite a good one for looking at the source code and seeing how they tick underneath. But obviously none of those are suitable for anything other than your own personal interest in cryptography.

Right, I hope everyone learnt something from this talk and found it interesting. My Twitter handle's "give up already". I've also got GitHub with a few projects and things like that. One interesting post on my blog, if anyone's interested: I broke an algorithm that someone implemented themselves, and the post shows why you shouldn't do these kinds of things. And finally, there is a joind.in link if you want to rate this talk and let me know if it's any good.
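
A footnote to the block cipher modes section of the talk: the ECB problem (identical plaintext blocks give identical ciphertext blocks) and the CBC fix can be seen in a few lines. This is only a sketch, in Python rather than PHP, and it makes a big assumption: the Python standard library has no AES, so a toy four-round Feistel cipher built on HMAC-SHA256 stands in for the block cipher. The names `toy_encrypt_block`, `toy_decrypt_block`, `ecb_encrypt`, `cbc_encrypt` and `cbc_decrypt` are made up for illustration, there is no padding, and none of this is suitable for real use.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # bytes per block, the same block size AES uses


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    # Hypothetical round function: HMAC-SHA256 keyed per round, cut to 8 bytes.
    return hmac.new(key + bytes([rnd]), half, hashlib.sha256).digest()[: BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy Feistel network standing in for AES. NOT a real cipher.
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, round_fn(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Runs the same rounds in reverse order to undo toy_encrypt_block.
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, round_fn(key, rnd, left)), left
    return left + right


def blocks(data: bytes) -> list:
    # Assumes len(data) is a multiple of BLOCK (no padding in this sketch).
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # ECB: every block is encrypted on its own.
    return b"".join(toy_encrypt_block(key, b) for b in blocks(plaintext))


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # CBC: XOR each block with the previous ciphertext block (or the IV) first.
    out, prev = [], iv
    for b in blocks(plaintext):
        prev = toy_encrypt_block(key, xor(b, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for c in blocks(ciphertext):
        out.append(xor(toy_decrypt_block(key, c), prev))
        prev = c
    return b"".join(out)


key = secrets.token_bytes(32)
iv = secrets.token_bytes(BLOCK)
message = b"ATTACK AT DAWN!!" * 2  # two identical 16-byte blocks

ecb = ecb_encrypt(key, message)
cbc = cbc_encrypt(key, iv, message)
print("ECB blocks identical:", ecb[:BLOCK] == ecb[BLOCK:])
print("CBC blocks identical:", cbc[:BLOCK] == cbc[BLOCK:])
```

The repeated block is what lets the outline of Tux survive ECB encryption; chaining in the previous ciphertext block is what hides it under CBC.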
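
A footnote to the counter mode section of the talk: the "seek to block 557" trick is easy to show in code. Again this is a hedged sketch in Python, not a production implementation. The assumption to be aware of is that HMAC-SHA256, truncated to 16 bytes, stands in for "encrypt the nonce and counter with the block cipher", because the standard library has no AES. The names `keystream_block` and `ctr_xor` are invented for this example, and there is no authentication tag here, which is exactly the gap that GCM fills.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # bytes of keystream produced per counter value


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for E(key, nonce || counter); a real system would use AES here.
    data = nonce + counter.to_bytes(8, "big")
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]


def ctr_xor(key: bytes, nonce: bytes, data: bytes, first_block: int = 0) -> bytes:
    # Encryption and decryption are the same operation: XOR with the keystream.
    out = bytearray()
    for offset in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, first_block + offset // BLOCK)
        chunk = data[offset:offset + BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, ks))
    return bytes(out)


key = secrets.token_bytes(32)
nonce = secrets.token_bytes(8)  # must never be reused with the same key

# 600 blocks of 16 bytes each, labelled so we can see which one we recovered.
plaintext = b"".join(f"block {i:09d} ".encode() for i in range(600))
ciphertext = ctr_xor(key, nonce, plaintext)

# Seek straight to block 557 without touching blocks 0 to 556.
wanted = 557
chunk = ciphertext[wanted * BLOCK:(wanted + 1) * BLOCK]
recovered = ctr_xor(key, nonce, chunk, first_block=wanted)
print(recovered)
```

Because each keystream block depends only on the key, the nonce and the counter value, any block can be decrypted independently, which is also why reusing a nonce with the same key is so damaging.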
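
A footnote to the RSA section of the talk: the "raise to the power of E, then to the power of D" idea, and the signature trick, can be tried out with Python's built-in `pow`. This is textbook RSA as a sketch only. The assumptions are mine, not the speaker's: the two primes are small, well-known Mersenne primes chosen so the numbers fit on a screen, E is the commonly used 65537, the hash is reduced modulo N to keep the toy small, and there is no padding, so this must not be used to protect anything real. The function names are made up for illustration.

```python
import hashlib

# Two small, publicly known primes. Real keys use secret primes of 1024+ bits each.
p = 2**31 - 1
q = 2**61 - 1
n = p * q                 # public modulus N
phi = (p - 1) * (q - 1)   # only computable if you know p and q (the secret)
e = 65537                 # public exponent E
d = pow(e, -1, phi)       # private exponent D (needs Python 3.8+)


def encrypt(m: int) -> int:
    # The message must be smaller than the modulus, as noted in the talk.
    if not 0 <= m < n:
        raise ValueError("message must be in the range 0..n-1")
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


def digest(message: bytes) -> int:
    # Toy reduction of SHA-256 into the range 0..n-1; real schemes use padding.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    # Raise the hash to the private exponent.
    return pow(digest(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    # Raise the signature to the public exponent and compare with a fresh hash.
    return pow(signature, e, n) == digest(message)


# Hybrid use: RSA protects a short symmetric key rather than the message itself.
session_key = 0x1234567890ABCDEF  # pretend this is a random symmetric key
wrapped = encrypt(session_key)
print("key recovered:", decrypt(wrapped) == session_key)

report = b"mission report"
signature = sign(report)
print("genuine message verifies:", verify(report, signature))
print("tampered message verifies:", verify(b"mission rep0rt", signature))
```

Anyone holding only E and N can run `encrypt` and `verify`, but producing D requires `phi`, which in turn requires the two primes; that is the secret used when constructing the problem.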