Sometimes people use the terms encryption and encoding as if they mean the same thing. That is a common mistake. You may also see hashing confused with an encryption mechanism. Confusing these concepts may lead to misunderstandings in the way security is implemented. In this video, you are going to see a high-level overview of these concepts and understand the differences, to help you make your applications more secure.

Let's begin with encoding. You can define encoding as a technique to transform data from one format to another so that it can be understood and consumed by different systems. Basically, encoding has to do with information representation. When you have some information, say the name of the mineral that weakens Superman, you can represent it through letters, as in kryptonite. This is a handy representation for humans, like me and you, but not so easy for computers to manipulate. What usually happens in this case is that the sequence of characters is transformed into a sequence of bits. When you take the word and the sequence of bits, you have two representations of the same information. The letter-based representation is usually understood by humans, like me talking to you right now, and the bit-based representation is more suitable for computers. Commonly, you say that the sequence of letters has been encoded into the sequence of bits. So encoding is just a transformation from one data representation to another, keeping the same information. Usually, it involves a conversion table, such as our dear ASCII table that was used in the kryptonite example, that maps the representation of an item in one system to the corresponding representation of the same item in the other system. You can find several encoding mechanisms out there, apart from our dear ASCII. You have Unicode, which allows you to represent more complex items than letters, such as emojis and other symbols.
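To make the kryptonite example concrete, here is a minimal Python sketch of encoding text into bits and decoding it back. The helper names `to_bits` and `from_bits` are hypothetical, made up for this illustration, and the gem emoji is just an arbitrary symbol that ASCII cannot represent.

```python
def to_bits(text: str, encoding: str = "ascii") -> str:
    """Encode text to bytes, then show each byte as 8 binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))


def from_bits(bits: str, encoding: str = "ascii") -> str:
    """Reverse of to_bits: parse the binary digits and decode the bytes."""
    data = bytes(int(group, 2) for group in bits.split())
    return data.decode(encoding)


word = "kryptonite"
bits = to_bits(word)
print(bits)             # ten groups of 8 bits, one per letter
print(from_bits(bits))  # kryptonite

# An emoji is outside ASCII, so a Unicode encoding such as UTF-8 is needed.
gem = "\U0001F48E"
gem_bits = to_bits(gem, encoding="utf-8")
print(gem_bits)         # several bytes for a single symbol
```

Nothing here is secret: anyone who knows the encoding name can go from letters to bits and back again without losing information.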
You have Base64, which lets you represent binary data, such as an image, through text. And you have URL encoding, useful to represent arbitrary data in a URL, where some characters are reserved or cannot be used. Think of spaces and colons, for example. Consider now JSON Web Tokens, JWTs, for example. The three parts that compose a token are encoded using Base64url, a URL-safe variant of Base64. That is your JWT. This encoding mechanism allows the token to be easily passed around in HTML and HTTP environments without the fear of clashes with reserved or unrepresentable characters. If you want to learn more about JWTs, you can download the JWT handbook linked in the description of this video. Encoding ensures interoperability between systems. Yes, I know, it is a big word. But it simply means that encoding allows systems that use different data representations to share information. Encoding has no security purpose. Anyone who knows the conversion algorithm can encode and decode data. The conversion algorithm is not kept secret. On the contrary, it is public in order to facilitate interoperability between systems. Finally, encoding is a reversible process. You can transform one piece of data from one representation to another and then go back to the original representation without information loss.

Encryption, on the other hand, is a technique that makes data unreadable and hard to decode for unauthorized parties. So basically, encryption is a mechanism that transforms data into a different representation so that prying eyes cannot understand it. And you may be wondering, isn't this transformation the same as encoding after all? How can a human being understand that this sequence of bits represents the word kryptonite, other than me telling you it does? In fact, this question is not so far-fetched. In a way, encryption is a form of encoding. It transforms data from one representation to another.
For this reason, people sometimes tend to use the terms encryption and encoding as if they mean the same thing. However, the purpose of encryption is different from the purpose of encoding. The encryption technique aims at making data unreadable and hard to decode. This is the opposite of the purpose of pure encoding. Encoding aims at making the data as understandable as possible across systems, while encryption tries to make the data undecipherable unless the reader is authorized to read it. The main goal of encryption is to ensure data confidentiality, that is, protecting data from being accessed by unauthorized parties. So while encoding makes its conversion algorithm as public as possible, encryption should keep such algorithms private, right? Actually, it's not really like that. Relying on secret algorithms is not the best choice to protect data in the long run. Better solutions rely on well-known algorithms whose data transformation is based on sequences of numbers or letters called keys. Please do not create your own encryption algorithm unless you are a mathematics expert with long experience in the cryptography field. The best mechanisms to encrypt data are based on mathematical problems that can be solved only with possession of a key or with enormous computational power. There are two families of key-based encryption algorithms. The symmetric-key algorithms use the same key to encrypt and decrypt data, like the AES algorithm. The asymmetric-key algorithms use a pair of different keys, one to encrypt and another to decrypt data. The keys in the pair are bound by a complex mathematical relationship, and RSA, for example, is an algorithm of this family. Like pure encoding, encryption is a reversible process as well, although just for authorized people. Authorized people are the ones in possession of a decryption key.
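The idea that the same key both encrypts and decrypts can be sketched in a few lines of Python. This is only an illustration and not something shown in the video: Python's standard library does not include AES, so the sketch swaps in a one-time pad (XOR with a random key as long as the message), which is a well-known symmetric scheme rather than a home-made one. The function name `xor_bytes` is hypothetical. In a real application you would use a vetted library that implements an algorithm like AES.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad).

    Assumption for this sketch: the key is random, exactly as long as the
    data, kept secret, and never reused for another message.
    """
    if len(key) != len(data):
        raise ValueError("the key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = "kryptonite".encode("utf-8")
key = secrets.token_bytes(len(plaintext))  # the shared secret key

ciphertext = xor_bytes(plaintext, key)     # encrypt with the key
recovered = xor_bytes(ciphertext, key)     # decrypt with the same key

print(ciphertext.hex())                    # unreadable without the key
print(recovered.decode("utf-8"))           # kryptonite
```

The algorithm is completely public, and only the key is secret. That is the property the narration describes: whoever holds the key can reverse the transformation, and everyone else sees only noise.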
The challenge, when it comes to authorized versus unauthorized people, is to make decrypting data without the key as hard as possible. This leads to applying a mix of precautions, such as using a complex mathematical relationship between the keys, keeping them secret, changing them frequently, and so on.

Now that you understand the differences between encryption and encoding, let's take a look at hashing. Basically, hashing is a technique to generate a unique fixed-length string, the hash, strictly depending on the specific input data. Since the generated hash depends on the specific input data, any change to the input data, no matter how small or big, generates a different hash. So think of it like this. Having the hash of a given piece of data, you can verify whether the data has been altered by calculating its hash again and comparing it with the hash that you had before. In other words, hashing ensures data integrity. A hashing algorithm must satisfy the following five assumptions. One, the resulting hash has a fixed length. Two, the same input always produces the same output. Three, multiple different inputs should not produce the same output. Four, it must not be possible to obtain the input data from the output data. And five, any change to the input data implies a different resulting hash. If you think about the third assumption, the one that says that multiple different inputs should not produce the same output, it seems that while you should get different hashes for distinct input data, this can't be guaranteed. After all, there are unlimited possible inputs but only a limited number of fixed-length hashes. Actually, this point is what makes the difference between hashing algorithms. For example, MD5 was a very common hashing algorithm in the past, but in 2008 it was deprecated because collisions were detected. The same happened to some early algorithms of the Secure Hash Algorithm family, the SHA family.
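Some of these assumptions are easy to observe with Python's standard `hashlib` module. The following is a minimal sketch, assuming SHA-256 as the hashing algorithm (the video does not prescribe one); the helper name `sha256_hex` is hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_hash = sha256_hex("kryptonite")
same_hash = sha256_hex("kryptonite")
changed_hash = sha256_hex("Kryptonite")        # only one letter differs
long_hash = sha256_hex("kryptonite" * 1000)    # much longer input

print(len(short_hash), len(long_hash))  # 64 64 -> fixed length
print(short_hash == same_hash)          # True  -> same input, same hash
print(short_hash == changed_hash)       # False -> tiny change, new hash
```

Checking integrity works the same way: store the hash, recompute it later, and compare the two values. If they differ, the data has been altered.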
It is also worth mentioning that the fourth assumption, the one that says that it must not be possible to obtain the input data from the output data, implies that hashing is not a reversible process, unlike encoding and encryption.

As you've seen today, encoding, hashing, and encryption have specific purposes and features. Confusing their capabilities and roles in your system may lead to disastrous consequences. For example, you may think that encrypting passwords is the most secure option, but in reality, it's a very bad idea. That's what Adobe engineers learned in the data breach in 2013. The attackers who got access to their user database could break the encryption. Remember that encryption is a reversible process after all. Even if attackers don't have the decryption key, they may have enough time to guess it. Adobe reset the users' passwords as a countermeasure, but you and I know users. We use the same password for multiple services. So even if access to Adobe services may have been safe, access to other websites was potentially compromised. To avoid this type of breach, they should have used hashing instead of encryption to store users' passwords in a secure way. You know that hashing is not a reversible process. Attackers cannot determine the password from which the hash was generated. But also, simply relying on hashing is not the best option, as the LinkedIn breach from 2012 taught us. If you want, there are two articles linked below where you can learn more about storing passwords using hashing and how to use salt to store passwords properly. Keep in mind, this video was a high-level overview. So if you are craving a more technical comparison between encoding, encryption, and hashing, I also have a link for you in the description of the video. I'm looking forward to your comments, and I'll see you soon. Bye.