Hello everyone. Welcome to this talk, this recording of Secure Code Development and Lessons Learned from the etcd Security Audit. My name is Sahdev Zala. I am a senior software engineer at IBM, and in the context of this talk, I am one of the etcd project maintainers. And I have Hitoshi Mitake with me. He is a site reliability engineer at Indeed, and he's also one of the project maintainers for etcd. All right. So in this talk, we will briefly cover a couple of best practices, like identifying the areas where you should be paying more attention as far as the security of your project or software is concerned. We'll talk about code analysis. Then we will have Mitake talk about the etcd security model, and we'll talk about some vulnerability examples, real-life examples that we ran into while doing the etcd security audit. Then we'll talk about working with GitHub to publish CVEs, and then we'll conclude the talk. We will not go into the full software development life cycle and security; we will pretty much focus on the coding aspect, and other things are out of scope for this talk. All right. So there are a couple of things that we really want to emphasize here as far as secure coding practices are concerned. The first one is: identify the high-security areas that you want to pay special attention to. How do you do that? Well, refer to your project architecture; your software architecture will help you identify those areas that need special attention. We'll talk more about those checklists in the next slide. Then we'll talk about the most important part, the code assessment. There are two main methods: automated code assessment using tools, and doing the code review manually. You want to make sure that you define the role of the assessment team up front.
That will help you as you go forward with the process. You also want to make sure that you plan how you will address the findings: in the short term, as soon as possible, and then for the longer term, something you can apply so that in the future you can prevent the kinds of issues you ran into during your code review. And then, depending on your project and your software, you may want to plan on publishing CVEs. You may not need to; that's for you to decide. But CVEs typically help you disclose any security flaws in your software publicly, and they help the consumers of your software and your project stay up to date. All right. The things we'll talk about apply to pretty much any development environment and any programming language you are using. But for this talk, we will use Go and the etcd project as an example. Okay. All right. So I mentioned earlier that it's very important to identify the areas that you think are critical, crucial for your project's security. Here's a list of some of these areas. You can have more than what I'm going to mention here, but these are some of the things you definitely want to keep in mind while you worry about the security of your project, your software, and your application. As you know, software security starts with authentication. For external users, for internal users, regardless, make sure you have a good authentication method in place. And make sure you are doing things like not just accepting any password: have some validations there, password length and that kind of thing. The follow-up step is authorization.
Once users get authenticated, make sure they only have access where they are authorized, for example by using role-based access control. TLS certificates: you want to make sure that you have encrypted communication and that you don't allow man-in-the-middle attacks. Input or data validation: validate all input parameters, whether it's data type, length, range, whatever you can think of; do the proper validation. File permissions: that's another important area. Make sure that while creating new files, and while working with any existing files and directories, you are checking and handling the permissions, and pay special attention to any third-party libraries or tools that you are using for file management. We will actually show you an example later in this talk. Error handling: that's another important area for the overall health and security of the project or software. Make sure that you're not accidentally leaking any information that can help an attacker, for example, crack a password. And then logging: anything that you think can be helpful towards identifying security issues or alerts, you should be logging it. That can help you with auditing purposes later on, like where a request is coming from and who is sending it. Data exposure: make sure that you are not exposing any sensitive data. So keep that in mind. And last but not least, if you are using third-party tools, then make sure you use their latest releases and keep an eye on any CVE advisories. All right. So code analysis: as I mentioned earlier, that is critical for the overall security of your code, making sure that you catch things up front during development or after your code is in place. Make sure you do the assessment.
And there are two main ways: an automated way using tools, or manual code review. When we talk about automated tools, you can use static tools that catch mistakes which can, if not directly then indirectly, impact the security of your code. For example, variable shadowing or unreachable code. Most programming languages have tools specific to the language; for Go, you actually have a whole list of tools available. I have provided a link here; you can take a look to see the whole list. But tools like go vet and staticcheck you should definitely be using. And then there are dynamic tools, like fuzzers, which can help you find bugs by automatically injecting random data and uncovering different kinds of bugs based on that data injection. The second thing is manual code review. Tools are great; they can look at the code automatically and point out possible issues. But a human definitely needs to verify those issues and make sure they are real issues. And the development team should be doing a thorough code review of all the areas we talked about and making sure everything looks good to them. Keep in mind that there is no alternative to manual code review; it is critically important. All right. So with that, before we talk more about the security model, and before we talk about some of the example vulnerabilities that we ran into as part of the audit and how we addressed them, let me just briefly introduce etcd. You might already know that etcd is an open source distributed key-value store.
It provides consistency and high availability, and it is used to store the critical data of a distributed system. For example, Kubernetes stores all its cluster data in etcd. The diagram here shows a single-node etcd cluster. As you can see, etcd uses the Raft consensus algorithm to maintain the replicated state, and the data is persisted on disk. Typically in production, you would use a three- to five-node cluster. etcd is a CNCF project, and you can learn more about it on etcd.io. The etcd community welcomes new contributors; that helps the growth of the project and makes the project better. So if you're interested, please contribute. You can find more details on the contribution guidelines, the development environment, and other things in the etcd GitHub repo. All right, with that, I will hand it over to Mitake to talk about the etcd security model and some of the examples. Mitake, over to you.

Okay, let me introduce the security model of etcd. etcd clients and servers communicate with gRPC over TCP. It's also possible to use the gRPC proxy for multiplexing clients. etcd servers communicate with each other using a specialized HTTP-based protocol named rafthttp. The TCP connections used for the gRPC and rafthttp protocols are encrypted and authenticated with TLS if a cluster follows the recommended configuration. We also have a component named gateway. This is a component which only relays TCP connections between clients and servers; it only helps discovery of the cluster. Next. And if a user needs more fine-grained access control over keys, etcd provides role-based authentication and authorization as an optional feature. With this feature, we can grant permission to read and/or write a specific key range to users. For a cluster using this feature, the servers have two options for authenticating and authorizing users.
One is using the Authenticate gRPC method. In this case, clients invoke the method while initializing the client object. Information about the authenticated user is stored in a JWT token issued by the server. The other option is using the Common Name field of the TLS certificate. In this case, no password-based authentication is required. This mechanism also supports limiting special administrative operations like membership changes: the special user root is the only user allowed to execute such operations. Could you go next?

Let me introduce the security audit project. Trail of Bits performed the security audit last year, and CNCF supported this third-party audit of the project. You can download the full report of the audit from the URL. We will introduce some findings from the report. Could you go next?

The first example is an issue related to logs. etcd had a problem of inadequate logging related to failed authentication attempts for users which can only be used with Common Name based authentication. As I mentioned earlier, etcd supports multiple user authentication mechanisms for login: the username and password based approach, and the Common Name of TLS certificate approach. For the latter, etcd supports creating users which only support Common Name based authentication and do not support the username and password based approach. When a client tries authentication with a password for such a user, the etcd log wasn't useful for understanding whether the client gave a wrong password or gave a password for a no-password user. That is harmful for auditing failed attempts from the logs. We fixed the issue by adding a new error code for representing such a failed attempt, and made the log clearer.

The next example is related to the gateway. The gateway relays TCP connections between the client and the server. The component doesn't validate that TLS connections are accepted at its endpoints; it only checks TCP reachability during initialization.
So if the endpoints are misconfigured and point to malicious servers by accident, those servers can receive data sent by the clients if the clients don't use TLS. Fundamentally, the gateway is a component which only helps discovery of the cluster and doesn't terminate TLS connections; using TLS is the responsibility of the client. We fixed the ambiguity of this behavior in the documentation.

The last example is related to metrics. The metric for the total number of database keys compacted was never changed. This is because the compaction code lacked a statement to increment the variable for the metric. The issue can result in a wrong understanding of the resource usage of etcd. At the time of the audit, the issue was already fixed in the master branch, but the fix wasn't ported to the stable release. Okay, could you go next? I'll let you handle it.

Sure. You want me to talk on this, Mitake? Yeah. Okay, thank you. Hey, thank you so much again. So, all right, a couple more examples. Mitake explained a few of them; I'll talk about a couple more here. So, a very small snippet here, which is basically doing nothing but creating a new directory by calling MkdirAll. Do you see any security issue in this snippet? Well, I wish we were talking in person and I could see the raised hands, you know, besides Mitake smiling. But we're recording, so I will go ahead and show what's wrong here. The problem is this: if the provided directory path already exists, then MkdirAll does nothing and returns nil. So then the flow continues; with nil, you are continuing the flow. That's what the project was doing, and that's not something we were expecting. Because what happens if there's an existing directory with, say, 777 permission mode? Somebody could have created the directory up front with extra permissions there.
And with some malicious intent, that could eventually hurt the project running etcd. So the fix was to make sure that if the directory exists, then it has the desired permissions; if not, then we consider that an error case. That's how it was fixed. It was a pretty quick, small fix, but it could have been a security problem if not handled.

Here is another small snippet; we tried to put some of the smaller issues here. Data validation: remember, I mentioned earlier that input data validation is very important, and you basically want to validate everything. If you don't, that could hurt the running project. So, the parseCompactionRetention function: do you see what's wrong here? Any security issues? Well, again, I wish we had the raised hands, but let me go ahead. The problem, as you can see, is the string conversion function: its return value can be negative, and that's not an error case. It can return a negative value without an error. And as you can see, the project wasn't handling that; we were just checking the error, and if that was nil, we would just go ahead and continue the flow. So the security issue here was that someone who has that access, that role, could misconfigure the auto-compaction retention by setting it to a negative value, maybe even accidentally. But if it's set to a negative value, that could impact the etcd process; it may not work as expected, because that could mean something like compacting forever, filling up the disk space, and so on. So the solution again was simple: just don't accept any negative value, and handle it properly.
Another area I want to cover is documentation. We need to make sure that you have proper documentation for the software and in the code. You may not see that as directly impacting the security of the project, but three of the reported issues that we had in the etcd security audit were related to documentation and the naming of functions. We had a couple of functions with misleading names; again, that happens accidentally, and it's something that was not caught as part of code review on those pull requests. And then we had a couple of misleading documentation descriptions. As I said, that ended up as three issues reported by the auditing team. A lack of documentation, not enough coverage, or poor naming of functions can create confusion during the code review process, and that can impact productivity and also mislead users. So you want to make sure that you have good documentation.

The other thing I want to cover, before I hand it over to Mitake again, is something we learned as part of the auditing process. As I mentioned earlier, CVEs are important; they basically help you disclose security flaws and issues publicly, so that the users of your project stay up to date. The process can be confusing, but we learned during the security audit that we can use a relatively new feature, an integrated part of GitHub, to request and publish CVEs and advisories. We just tried that, and we loved it. It was easy to use, and it can be very handy for you if your project is hosted on GitHub and you want to publish advisories and CVEs.
Because now everything you're doing is on GitHub; you're not going to the MITRE website or anywhere else to work on CVEs, the publishing, and other stuff. If you have admin access to your GitHub repo, then under the Security tab you will actually see an option to draft a security advisory. And the other great thing here is that you can privately discuss issues and fix them with your code changes; it's all private, and then you publish after having reviews from your internal teams, a team of maintainers, for example, or interested parties. And then you basically push those changes to your repository. It also has a really good template: it shows what kind of vulnerability it is, the one that you are interested in publishing, who is impacted, any problem that has been patched, what version the users of the software should upgrade to, and whether there's any workaround. So the template allows you to put in good detail that can help users understand the CVE. We wanted to mention that because, as I said, this is something we learned as part of the security audit process. Okay. With that, Mitake, would you like to conclude the talk?

Yeah, let me conclude this talk. Writing secure code is challenging but possible. For that purpose, code review, various tests, and various tools like static analyzers and fuzzers are very helpful. And also, although it costs time and budget, an audit is helpful for checking the status of the project. Also, making good documentation and logging is difficult; it's not trivial at all. And creating and publishing CVEs with GitHub is easy; let's utilize the feature for monitoring security issues. And also, less popular features can hide some problems.
In the case of our security audit, multiple issues were found in the gateway, which isn't used heavily. We need to be careful about such features and think before adding new features, for making the entire project secure.

Thanks, Mitake. Let me see, this is the last slide we have. So, we want to thank many folks here: all the contributors to etcd and all the maintainers. Especially, I want to thank Xiang Li, Gyuho Lee, Jingyi Hu, and Brandon Philips. They are maintainers, and they were an integral part of working on the security issues that we resolved coming out of the etcd security audit. We want to thank CNCF for sponsoring the audit. I want to thank Trail of Bits for conducting the audit; they have great materials and blog posts that can be a really good learning resource for understanding security. We also want to thank OWASP; they have a lot of good practices and other materials, and we definitely referred to them. We definitely thank our employers, IBM and Indeed. And I want to thank all of you who are watching this video. All right, well, thank you so much. With that, I think we will end the talk, Mitake. That sounds good.