Zero trust: lessons learned building a BeyondCorp SSH proxy. Please give James Barclay a warm TourCom welcome. Thanks. So my name is James Barclay. I'm a senior R&D engineer at Duo Labs, which is the security research division of Duo Security. Today I'm going to talk a little bit about some of the lessons we learned building a BeyondCorp-inspired SSH proxy at Duo. We have quite a few slides to go over, but to give you a quick idea of the agenda: I'll start with a really quick BeyondCorp 101, then we'll talk about a specific aspect of the BeyondCorp vision, the access proxy. Then I'll talk about how we're able to proxy SSH traffic through the access proxy, and I'll go over the client and server implementations of that. And if we have time, I'll take questions. So BeyondCorp is a zero trust security model developed by Google that re-envisions the idea of using the corporate network perimeter for access control. Rather than relying on the network perimeter to gate access to your infrastructure and services and so on, trust shifts to individual users and devices. At the core of the BeyondCorp vision is the idea that these perimeters or walls don't work. So, like I already mentioned, we replace trust in the network with trust in the device and the user. It wouldn't matter whether you were working from a coffee shop, a plane, or your corporate headquarters; you would get the same checks regardless. As an example, to access the company lunch menu, which we don't really care about, maybe you just have to have a managed device, which could be determined by the presence of a certificate. But to access source code, the crown jewels, you would need more than that: not only a managed device, but maybe the latest security patches and so on. So BeyondCorp is a complex system.
It's been discussed at length by Google, most notably in their research papers. Today I'm just going to be talking about one part of it, the access proxy, which is responsible for gating access to the applications or services behind it in your corporate network. Before I get too far into this, some terminology. In our case, the access proxy was a web server running nginx, and it is responsible for authorizing requests and then proxying them to the back-end services. A service, in our case, is just anything that sits behind the access proxy, whether that's an SSH server, RDP, VNC, whatever. To give you a little more clarity: the access proxy is a web application that determines whether a user is authorized to access a service, and the reverse proxy, in our case nginx, communicates with this application to determine whether a request should be allowed to pass. A service, in our case, was basically a DNS name; that's how we identify it in our back end. It has both an external name and an internal host name or IP, and the access proxy is responsible for determining what services you have access to. The first thing we decided to tackle when we went about creating our own BeyondCorp, inspired by Google's, was web applications. That's arguably the easiest part, and in many cases the most important. For us, every employee needs to access at least a handful of web applications that are hosted on-prem, and we wanted a way for our employees to be able to access those web applications without a VPN. So that's what we solved first. This is an overly simplified diagram, but I think you'll get the idea: basically, we have external clients, or maybe internal in some cases, that just want to access on-premise stuff, and the access proxy is what gates that access. So next, I'm going to talk about the authentication flow we used to handle this.
So nginx has a pretty cool feature called auth_request. This auth_request directive can be used so that when a request comes in to nginx, it is passed on to a subrequest; in our case, that's a Cyclone web app. If that subrequest determines that the user is authorized to access the service, so maybe they go through an SSO flow, they do 2FA, whatever, then it returns a 200, and nginx treats the request as authorized. If it returns a 401 or 403, it's denied, and any other status code is considered an error, and access won't be allowed. So the application will check for the presence of a valid session cookie for that particular service, like wiki.example.com or whatever. If the user is authorized to access it, we return a 200. If not, we return a 401, which tells nginx to redirect to a login handler, for example if they just didn't have that cookie. And then once authorized, nginx will proxy the request to the back-end service. This is simplified, but it's basically the gist of how you would configure nginx to work with an application like this. We point auth_request at a /verify handler, which is an internal handler. If that request returns a 200, then nginx will pass the request along. If not, we set the value of a variable, for example $access_proxy_check, to the login URL, and then set the error page to that custom login handler. So when it gets a 401, nginx will say, OK, this is your error page, kind of like a custom 404 or something like that, but a little different. So that's the overall architecture of how this is handled for web applications, but next I'm going to talk about how we solved this for SSH. And why did we care about SSH? For us internally, it was really second only to web applications, and we really wanted to set our VPN on fire, so that's what we solved for next.
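Before moving on, here's roughly what that nginx setup can look like. This is a sketch only: the hostnames, ports, the /verify path, and the X-Login-URL response header are all invented for illustration, not our production config.

```nginx
server {
    listen 443 ssl;
    server_name wiki.example.com;

    location / {
        # Every request is first authorized by the internal /verify subrequest
        auth_request /verify;
        # Capture the login URL the auth app hands back (header name assumed)
        auth_request_set $access_proxy_check $upstream_http_x_login_url;
        # A 401 from /verify sends the user into the login flow instead
        error_page 401 = @login;
        # 200 from /verify: proxy through to the internal back end
        proxy_pass http://wiki.internal.example.com;
    }

    location = /verify {
        internal;
        # The auth web app checks the session cookie for this service
        proxy_pass http://127.0.0.1:8888/verify;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
        proxy_set_header X-Original-URI $request_uri;
    }

    location @login {
        # The custom "error page": redirect into the SSO/2FA flow
        return 302 $access_proxy_check;
    }
}
```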
At the start of this, we had a few tenets that we wanted to stick to. The first one is that it must be easy to add new services behind the access proxy; that's something Google stresses is pretty important in one of the BeyondCorp papers. Another is that we didn't want to write our own SSH tooling for macOS or for anything else. We wanted to use the existing tooling, but also support Chrome Secure Shell, because we do use Chrome OS pretty heavily at Duo. And this last one: we wanted to be able to keep the exact same authentication flow for SSH, or any other protocol, as for web applications. What enabled us to do that was using a browser-based authentication flow. Another thing that turned out to be quite important was that the back end, the SSH server or whatever, shouldn't know that traffic is passing through this access proxy. It should be completely transparent. And for non-HTTP protocols, we were able to use WebSockets to do this. So this is from one of the BeyondCorp papers, BeyondCorp Part 3. Google talks about wrapping SSH traffic in HTTP over TLS, and they talk about this ProxyCommand thing, which I'll get to in a bit. They call it easy. I generally disagree with that, but maybe it is for Google. So when we were trying to figure this out, we knew we wanted our solution to work with Chrome Secure Shell. Fortunately, Chrome Secure Shell is open source. We knew about this relay options text field, and if you look at the Chrome Secure Shell source, one file in particular is responsible for handling that, and it actually talks about Google's internal HTTP-to-SSH relay. And although Google says in this file that the source code isn't available and that they don't have any public relays, there's enough information there for you to probably create one of your own, which we wanted to. Yeah.
And that's an actual GIF of my face when I figured that out. So after this, I'm going to talk about how everything works behind the scenes, but before we get there, I'll give you a quick idea of what it looks like in Chrome Secure Shell. The user clicks Connect, and the user will authenticate. Once we've determined that the user is authorized to access that resource, we just redirect back to the Chrome extension URI, and at that point bytes are being tunneled through the access proxy and passed on to the back-end service. All of that, in and out, is going through the access proxy, all those bytes. For macOS, for example, we just use the existing OpenSSH: the user types ssh user@whatever, like they normally would. We launch the browser and go through the exact same authentication flow that we use for web applications or for Chrome Secure Shell. The user authenticates, and at this point we actually redirect the credential back to a local HTTP server that we stand up through that native on-demand proxy. That way, when we establish the WebSocket connection to the access proxy, we're able to provide those credentials, because unlike on Chrome OS, the client doesn't inherit them; it's not a browser-based application. So in a nutshell, this is the flow for Chrome Secure Shell: the client is a WebSockets client, those bytes flow through the access proxy, and the access proxy handles taking that data and passing it on to the back end. With the native SSH tooling, I mentioned we didn't want to write our own client, so we took advantage of this ProxyCommand directive in SSH that Google also mentions in the BeyondCorp papers. What that does is, for specific hosts, it'll launch your on-demand binary, and then you can do whatever you want with that traffic. It passes the SSH traffic as standard input to your program, and we tunnel that in WebSockets.
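As an illustration of that ProxyCommand wiring, an ssh_config entry might look something like this. The flag names here are invented for the sketch; %h and %p are standard ssh_config tokens that OpenSSH expands to the requested host and port.

```
Host *.internal.example.com
    # Run the local on-demand proxy instead of opening a direct TCP
    # connection; ssh feeds it the SSH stream on stdin and reads stdout
    ProxyCommand nashville --relay=%h.relay.example.com --host=%h --port=%p
```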
And then it makes its way to the back end eventually. So this is a somewhat simplified data-flow diagram of how it works with Chrome Secure Shell, and then a slightly more complicated one for OpenSSH. As you can see, it all starts with the user typing ssh user@whatever.com. ProxyCommand launches our external program, and at that point we launch the browser, stand up that local HTTP server, catch that authentication cookie, and then establish the WebSocket tunnel to the access proxy. The access proxy then just tunnels those bytes back to the service. So this nassh relay protocol, I mentioned the relay options in Chrome Secure Shell; I'm going to go over how this works in a nutshell, the specific handlers that you would need to implement in your access proxy to get something like this working. Like I mentioned before, one of the cool things about implementing the nassh relay protocol is that it just works out of the box with Chrome OS. So WebSockets. What the hell are they? WebSockets are defined in RFC 6455. It's basic message framing layered over TCP, and it's designed for browser-based applications, for opening up a persistent connection from a web application to a back end without having to open up multiple HTTP connections. One of the interesting things about using WebSockets, and Google actually talks about this in the papers: with SSH in particular, the credential is inherently portable, so we wouldn't be able to, for example, tie a device identifier to an SSH certificate. With WebSockets, we were able to just completely separate the two: is the user authorized, and is the device authorized? Then you can use whatever credential, whether it's a password, an SSH key, or an SSH CA; all of that works completely separately from the device authentication. So WebSockets actually start out as HTTP.
The client handshake is an HTTP upgrade request, and once the connection is established, messages are just passed over this persistent connection, like a lightweight wrapper over TCP. Both the client and server are able to close the connection by sending a close control frame, which just has a specific opcode that is understood by the other side. So this is what a WebSocket client handshake looks like, and this is the server responding to that with an HTTP 101 Switching Protocols. At that point, the persistent connection is established. I'm not going to go over all of this here, but this is what the WebSocket frame looks like. We'll talk a little bit about the opcodes, but most of what we care about is what's actually in the payload, which could be anything. I've mentioned nassh; I just want to go over it here. It's basically synonymous with Chrome Secure Shell. If you look at the readme, it says that it's the Chrome app that combines hterm with a NaCl build of OpenSSH. So what's NaCl? NaCl is Native Client, which is supported in Chrome and allows you to run compiled C and C++ code in the browser. And hterm is just an HTML terminal emulator. So, an overview of the nassh relay protocol: it's an HTTP-to-SSH relay, and it's supported in Chrome Secure Shell. It defines a series of HTTP handlers, and if you implement them in your access proxy, you'll be able to tunnel that traffic. At its core, it's just the regular, old SSH traffic with a custom ACK prepended to it. It uses WebSocket binary frames, as opposed to UTF-8 frames, with one exception: there's an optional ACK latency message, where the client is able to request ACK latency from the server, and if you want to do that, that one uses the UTF-8 frames. So this payload here, the ACK plus whatever the SSH payload is, would be contained within the overall WebSocket frame.
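That framing is simple enough to sketch. I'm assuming here that the four-byte ACK is a big-endian unsigned integer; check the nassh source for the exact encoding and wrapping rules before relying on this.

```python
import struct

def frame(ack: int, ssh_bytes: bytes) -> bytes:
    # Build a binary WebSocket payload: 4-byte ACK + raw SSH bytes
    return struct.pack(">I", ack) + ssh_bytes

def unframe(payload: bytes):
    # Split a received binary payload back into (ack, ssh_bytes). The
    # ACK must be stripped before forwarding, or the stray integer
    # corrupts the SSH stream.
    (ack,) = struct.unpack(">I", payload[:4])
    return ack, payload[4:]
```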
So this ACK, what is this ACK thing we're talking about? The client and server keep track of the bytes that are read and written. When the WebSocket connection is established to the access proxy, the client reports its ACK in a query string, and what the server will do is trim its retransmission buffer by that ACK offset. If it's a new connection, the client just reports 0 for its ACK. And this retransmission buffer is also defined in the nassh relay protocol: the server needs to keep track of the bytes received from the back-end service, and we trim that retransmission buffer whenever we receive an updated ACK from the client. So when there's a new connection to the access proxy, the ACK will be in the query string, because the connection starts out as an HTTP request. Once the WebSocket connection is established, it's the four-byte integer that is prepended to the SSH payload, and we use that to trim the retransmission buffer. So let's talk a little bit about the server implementation. Real quick, these are the three lines you would need to add to your config file, if you were using nginx for example, to support WebSocket connections. And I mentioned earlier that we use Cyclone. Cyclone is a Python web application framework built on top of the Twisted framework, and it had support built in for WebSockets, which was nice because we didn't need to add any additional dependencies to our access proxy code. There was a minor modification we needed to make in order to get binary WebSocket messages working; it was just about four lines of code in a single file. So these HTTP handlers that you would need to implement in an access proxy are defined in that nassh_google_relay.js file. There are five of them: /cookie, /proxy, /read, /write, and /connect. But we really only cared about three of them, which is nice. It's less work.
And the reason is that /read and /write are plain HTTP handlers, /read for reading bytes from the access proxy and /write for writing new bytes from the client. Those would be useful if, for example, the Chrome Secure Shell you were using didn't support WebSockets, or for some other reason you couldn't use WebSockets. But /connect is the single WebSocket handler, so we don't need to implement /read or /write if we're using that. So what does /cookie do? It handles authentication and authorization, and then we redirect to the Chrome extension ID or to localhost once we've determined that the user, and the device for that matter, is authorized. /proxy, if you look at the protocol in that file, it tells you that this handler is responsible for opening a TCP connection to the back-end service. So when the client hits /proxy, we open a TCP connection to the back end, and we keep track of it by generating a UUID and returning that in the response body. And /connect is the actual WebSockets handler; that's what's used for bidirectional communication between the client and the server. The way it works is that the client will have received the cookie from the access proxy during the /cookie step, and it gets that UUID during the /proxy step. So when it actually goes on to make the WebSocket connection, it provides both of those, so that the access proxy is able to say: OK, I've determined that you're authorized to access this resource, and I've kept track of that TCP connection to the back end. So next, I'll talk a little bit about the clients. The first one is just the regular old Chrome Secure Shell. That's what it looks like.
The bit below user@example.com, that text field, those are the relay options that you can use to configure Chrome Secure Shell to communicate with a nassh relay rather than making a direct connection to the server. For the standard SSH tooling, we wrote a local on-demand proxy in Go, and we point to it using the SSH ProxyCommand directive. The local proxy itself understands WebSockets; even though WebSockets was designed for browser-based applications to communicate with servers, it's of course not limited to that. And ProxyCommand, if you look at the documentation on it, will take the output from SSH, those bytes, and pass it as standard input to your program, and you are responsible for getting data back to the terminal by writing it to standard output. So this is what an SSH config file might look like to get this working. You specify a host and tell it to use a ProxyCommand; ours we cleverly called nashville, because nassh. The host and port are just the SSH host and port that automatically get passed to the ProxyCommand. And then we have this relay option, which I'll talk about next. So, the relay host: of course, we need some way to tell the client to communicate with a specific access proxy, and the way we do that is by providing it in that switch there. But what's interesting is that we actually have a different relay host name for every service that we put behind the access proxy. And the reason, it's actually kind of a shitty reason, is that there's a limitation in the nassh relay protocol: the first thing it does is hit the /cookie handler, and that's the only time you're able to determine whether a request is authorized, but at that point we have no idea what server the user is trying to connect to.
So if we wanted to be able to, say, have different policies for different servers, not just, OK, you proved that you are an employee at Company X, you get access to everything; we wanted to be able to say, OK, you have access to these specific things. The way we worked around that was by using a different host name, which just resolves to the IP of the access proxy, for every single service that we put behind it. That way, we're able to encode: this server and port map to this relay host, and set policies accordingly. I briefly mentioned a local HTTP server. Again, that was something we did to catch the authentication cookie from the server. So when our local on-demand proxy runs, it launches the browser, hits the /cookie handler, and then we redirect to localhost on some ephemeral port, and that's used for the remainder of the session. So now I'm going to go through a step-by-step of how this all works with OpenSSH plus this nassh relay, this access proxy. I'm not going to go over how it works with Chrome Secure Shell, because basically, if you get it working with OpenSSH, it'll work with Chrome Secure Shell; we just pass a few different query parameters in the local proxy versus what Chrome Secure Shell sends by default. So the first step: the user types their SSH command. ProxyCommand is going to be responsible for launching our local on-demand proxy. This is kind of cool, too, because we didn't want to have a daemon or something running constantly; it just gets run whenever a new SSH connection is established. The local proxy will open a browser to this URL, whatever the relay host is, /cookie, and we pass a few query parameters in there. Then, once the user and the device are determined to be authorized to access that resource, we redirect to localhost.
This is different from Chrome Secure Shell, where we would redirect to the Chrome extension URI scheme. So next, the local proxy will hit this /proxy handler. The server, once it receives that request, is going to create a UUID and tie it to a session object, something we can keep in memory to keep track of these connections. Then we establish the TCP connection to the server and keep track of that connection in the session object, and we have callbacks that fire whenever we get new data from the server. Once we're done with that request, we return that session ID, the UUID, in the response body, so that the client knows: from now on, when I make these WebSocket requests, use this session ID to identify myself. So this is just a quick, simplified example of what that session class might look like. Sorry, not sorry, if you don't know Python or hate it. But basically, we need some way to keep track of the TCP connection to the back end and the WebSocket connection to the front end. This is also where we store that retransmission buffer, and the read count and write count, which are basically the ACKs. Sessions, for example, could just be a global variable, and when a client hits /proxy, we create a new session object and tie it to that UUID. So we're later able to look up that connection when the client provides that UUID, and we'll see, OK, this session exists or it doesn't. So, /connect. At this point, once the client has hit the HTTP handlers and we've determined that the user is authorized to access the resource, we start the WebSocket connection to the access proxy. The server will respond with that Switching Protocols message, and at this point we just have callbacks that fire when we get new WebSocket connections or when we receive WebSocket messages.
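A stripped-down sketch of that session class, plus the /proxy-style registration, might look like this. The names here are illustrative, not the production code, and error handling is omitted.

```python
import socket
import uuid

class Session:
    """Per-connection state the access proxy keeps between the /proxy
    step and the lifetime of the WebSocket tunnel."""

    def __init__(self, tcp_conn):
        self.tcp_conn = tcp_conn           # TCP connection to the back end
        self.ws_conn = None                # WebSocket connection to the client
        self.retransmission_buffer = b""   # back-end bytes the client hasn't ACKed
        self.read_count = 0                # total bytes read from the back end
        self.write_count = 0               # total bytes written to the back end

    def trim(self, ack):
        # Drop bytes the client confirms it has read. The buffer's first
        # byte sits at stream offset read_count - len(buffer), so the
        # delta from there to the client's ACK is what we can discard.
        confirmed = ack - (self.read_count - len(self.retransmission_buffer))
        if confirmed > 0:
            self.retransmission_buffer = self.retransmission_buffer[confirmed:]

# Global registry, populated when a client hits /proxy
SESSIONS = {}

def new_session(host, port):
    """What the /proxy handler boils down to: open the back-end TCP
    connection, file it under a fresh UUID, and return that UUID."""
    sid = str(uuid.uuid4())
    SESSIONS[sid] = Session(socket.create_connection((host, port)))
    return sid
```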
And then, like I mentioned previously, the local on-demand proxy takes standard input from SSH and passes it to the access proxy, and the data it receives, the WebSocket messages, gets displayed on the terminal; we just write those to standard output. So when a new WebSocket connection is made, we look up the session ID that was provided by the client. They would have gotten that during the /proxy step, and when the WebSocket connection is made, they provide it, so we can point back to that session object I showed, and we keep track of it there. Then we update the read count and write count. Next, we trim the retransmission buffer. The client reports how many bytes it has read, so we can discard the bytes it is confirmed to have read. And then we just send a WebSocket message to the client with the contents of the retransmission buffer, which, if it's a new connection, will just be an empty payload. I'm not going to go over all of this here, but later, if you want to look at these slides and you're interested in building something like this yourself, maybe this will be helpful. This is, for example, what you could do when a new WebSocket connection is made. Now, when we receive messages, what we do is update the write count in that session object by the length of the message, how many bytes we received, minus the four-byte ACK. And then we send the message that we received, the SSH payload, on to the back end, but strip the four-byte ACK, because it would break the SSH protocol if we included some random 32-bit integer at the beginning of all of our SSH payloads.
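Both receive paths of the relay can be sketched in one standalone approximation. This is not the actual Cyclone handler code: the field names are illustrative, and the big-endian ACK encoding is an assumption on my part.

```python
import struct
from dataclasses import dataclass

@dataclass
class RelayState:
    # Minimal stand-in for the per-connection session object
    write_count: int = 0                # bytes forwarded to the back end
    read_count: int = 0                 # bytes read from the back end
    retransmission_buffer: bytes = b""  # back-end bytes the client hasn't ACKed

def on_websocket_message(state, frame, send_to_backend):
    """Client -> back end: strip the 4-byte ACK, forward the bare SSH
    payload (a stray integer would break the SSH protocol), and trim
    the retransmission buffer by what the client confirms it read."""
    (ack,) = struct.unpack(">I", frame[:4])
    payload = frame[4:]
    state.write_count += len(payload)
    send_to_backend(payload)
    dropped = state.read_count - len(state.retransmission_buffer)
    confirmed = ack - dropped
    if confirmed > 0:
        state.retransmission_buffer = state.retransmission_buffer[confirmed:]

def on_backend_data(state, data, send_to_client):
    """Back end -> client: bump the read count, buffer the bytes until
    they're ACKed, and send them with our write count prepended."""
    state.read_count += len(data)
    state.retransmission_buffer += data
    send_to_client(struct.pack(">I", state.write_count) + data)
```

Wiring these up to real WebSocket and TCP callbacks is framework-specific; the `send_to_backend` and `send_to_client` callables just stand in for those writes.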
And then we trim the retransmission buffer using the ACK that the client provided; in this case, since we're receiving a WebSocket message, it's contained in the payload, similar to what we did when we got a new connection and it was provided in the query string. Next, again, I'm not going to go over this: message received, what it might look like if you wanted to implement that, for when a new WebSocket message is received on your access proxy. And then data received: when I receive data from the back-end service on that TCP connection, I update the read count that we're keeping track of by the amount of data we've received, so we're able to say, I've read this many bytes from the server. Then we concatenate our write count with the data we're receiving and send that as a WebSocket message to the front end, to the client. And here's what that might look like. So, other stuff. There were some gotchas. One thing we noticed was that when Chrome Secure Shell receives a close control frame, it doesn't really honor it; it'll just try to reconnect, probably by design. If you want to permanently close the connection, the way you can do that is by sending an empty payload with a negative ACK, or, I guess, probably any payload with a negative ACK. And the next one here caused me to bang my head against the wall a couple of times. The retransmission buffer turns out to be useful mostly during new WebSocket connections, because we separate the step of creating the TCP connection to the back end from the actual WebSocket handler. If you've ever typed telnet some-ssh-server 22, whatever you get back is an SSH version string; that's how the SSH protocol starts out. The client and server exchange version strings.
So if we received that before the client had gotten a chance to send its own version string to the back end, we would encounter this race condition where the client would say, I'm this SSH version, and then the server would say, here's my key exchange list, or whatever the hell it is, the next step in the SSH protocol, which of course broke everything. And that only happened when we were testing on local VMs, making an SSH connection to a VM running on our machine, because we would get a response back pretty quickly. If, for example, there were a few hops in between, that wouldn't happen. So the retransmission buffer is useful where we know the client hasn't seen that data yet: when the WebSocket connection is established, we just send whatever data we've already received, which in this case would just be the SSH version string. So I mentioned RDP, VNC, other things we might want to put behind the access proxy. How would we do that? Obviously, RDP doesn't come with this cool ProxyCommand feature that allows us to launch an external program. I don't even know how that would work, but it doesn't exist. So what can we do there? We've already tested this, so we know that it works if we just use a local socket: rather than an RDP client connecting to rdp.example.com or whatever, we connect to localhost on some ephemeral port and have our proxy handle tunneling those bytes through the access proxy and eventually to the back-end service. So that's about it. These are some references; if you're interested in building something like this yourself, they may be helpful. And this link here: I'm going to make my slides available, probably later today. I haven't done that yet, but that bit.ly link is reserved: beyondcorp-ssh-proxy. And that's it. Yeah, so right now, we actually just have one, but we've thought about it. We just haven't exactly determined how we're going to handle that. Any other questions?