I'm Michael Smith, a software engineer at Puppet. I've spent a lot of my career at Puppet working on command-line tools for sysadmins, such as Facter, Puppet, and Bolt. This talk is going to be a bit about designing command-line experiences, a bit about cool Linux tech, and a bit about cloud infrastructure. I'm going to refer to a lot of examples in Google- and Amazon-hosted cloud infrastructure, but it generalizes to anything that manages infrastructure behind an API, for example OpenStack and Kubernetes. A lot of us probably spend a majority of our time working in command shells: bash, zsh, fish, whatever. I certainly do. I can get a lot done quickly on the command line, particularly with tab completion, selecting just the concepts I need to describe the specific task I want to complete. But being effective on the command line requires a lot of learning. You need a working model of each tool you're going to use so you can quickly select the concept you need as a command, subcommand, or option. There are a huge number of tools available to us. I want to talk about a particular type of command-line tool: those built for interacting with cloud infrastructure. When using those tools, there are a number of idiomatic operations you might do. You might want to see what's running. What are my containers, VMs, S3 buckets, volumes? How do I figure out if my S3 bucket is public? Or who created that EC2 instance? We need to select the right description: are they volumes, persistent volume claims, cloud storage? Then is it just ls, or did they come up with some new way to, say, list this thing? This image is an example of Netflix traffic among microservices in one availability zone. It illustrates the scale of complexity we're dealing with in our infrastructure. You might look at logs for different services.
You need to remember how to see logs with Kubernetes or Docker, or SSH into an EC2 instance to look at something in its log directory. What if you want to see some logs written to a persistent volume? Or how do I see what's happening in an AWS service? What messages are coming in on the Simple Notification Service? How about executing a command? With containers, you probably don't have sshd running, so you need to remember the options to use for exec. To connect to a VM, you need to find your instance's public IP address or DNS name, or hop over to a bastion host so you can then SSH to an instance with only a private IP address. You may need to stop some services: shut down a container or application stack, remove a Kubernetes deployment, stop a Google Compute instance, or delete other things that are costing you money. These examples are all concepts you're likely familiar with. They have simple shell commands. What if we could use the same commands across different cloud providers and APIs? It'd be nice not to have to learn a different command-line tool just to do the same sorts of things. Taking this further, what could we build if we could rely on those commands to abstract across different infrastructure for us? We could have a live view of processes and usage across a fleet of VMs. We could aggregate logs across different containers, VMs, and services. I've previously worked on a data ingestion service that takes in data, submits it to Google Cloud via Pub/Sub, transforms it in Dataflow, and submits it to BigQuery. I've always wanted to be able to look at the logs and messages across all those systems in a single stream, but I don't want to run my own ELK stack or other log ingestion service. Given the right tagging, it becomes pretty easy to find and clean up resources that are no longer used.
These are the ideas behind Wash, a shell for constructing abstractions across cloud-native resources and exposing them with a set of powerful and familiar tools. The name is short for "wide-area shell". I want to show a little bit of it right now; we're going to take a live tour through it. When I start up Wash, it gives a helpful suggestion to look at some documentation. This is available consistently throughout the experience of using Wash. Wash enters a shell and gives me some ideas of familiar tools to use, so I can see there are a number of plugins available right from the start: AWS, Docker, GCP, and Kubernetes. I can take a look at docs for AWS. That gives me a little more on how it's configured, what things are available to look at, and how to deal with MFA, or multi-factor authentication. It also suggests looking at stree. Wash plugins implement a schema that describes what kinds of things we'll find in them; stree is a way to look at that schema. I can see that under Kubernetes, I'm going to have things categorized by context and namespace. I'll have pods and persistent volumes. Additionally, I'll have logs, and I'll have something called "fs" that we'll get into later. Docker has a similar view. GCP shows a bit more. That's useful for understanding what things are available to me, and we also use that schema to optimize some ways of searching through all of these pieces. Some regular commands, such as tree, just work, because this is backed by a file system. You can use commands you're familiar with. I can use ls to look at projects under GCP. We also have built-in help, which describes a bunch of the commands that are available. I'm going to work with find for a minute. find is modeled on the Unix find command. It's for selecting resources, files, etc. based on a set of queries that I put together. One of the ones I most commonly use is searching based on metadata.
Most cloud resources have some metadata associated with them; often it's how they're configured. If I look at AWS EC2 instances, I can see that one example has a bunch of metadata about it. I'm going to look for just instances that are running, so I query on the state name being "running". I have two instances. Wash also has a concept of common attributes, and we can see what common attributes are available on those things. We have a set of actions that you can take on them; those map to the commands you can execute. There are other things like creation time, modification time, and whether a login shell is available on those systems. As you can see, just piping to xargs works fine — this is just a normal shell. That only counted AWS instances, but I know I have some stuff running in GCP, so let's take a look at what metadata looks like there. We have a different set of metadata. Wash has some common attributes, but a lot of stuff is still described as unique metadata. As we find commonalities among a bunch of different resources, we try to promote them into attributes to make queries like this simpler. I'll look for all the stuff that matches this now, and I find I have two instances running in different projects in GCP. find has a full grammar available to it, modeled on how Unix find lets you string together multiple queries. I can say: give me either status "running" or state name "running". There's a whole set of operands, and there are extra primaries unique to Wash dealing with things like the schema. You can specify a kind: I can say give me just things that mount a file system, this "fs". That's using the schema declared on these plugins to identify what to search for — otherwise this could take forever as we dig through the file systems of a bunch of these computers. Since this is just a file system, I also have access to things in storage. I can just pull up Finder and look at files hosted in S3. That's just made available immediately.
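To make the mechanics of those metadata queries concrete, here's a small Ruby sketch of what such a query has to do: walk a dotted path into nested metadata and compare the value it finds. This is purely illustrative — `meta_match?` is an invented helper for this talk writeup, not Wash's implementation.

```ruby
# Walk a dotted path like ".state.name" into nested metadata and
# compare the value found there -- roughly what a metadata query
# such as "state name is running" has to do under the hood.
def meta_match?(metadata, path, expected)
  keys = path.sub(/\A\./, '').split('.')
  value = keys.reduce(metadata) do |node, key|
    node.is_a?(Hash) ? node[key] : nil
  end
  value == expected
end

instances = [
  { 'name' => 'web-1', 'state' => { 'name' => 'running' } },
  { 'name' => 'db-1',  'state' => { 'name' => 'stopped' } }
]

running = instances.select { |i| meta_match?(i, '.state.name', 'running') }
puts running.map { |i| i['name'] }
```

The same predicate composes with "or" and "and" operands to build up the fuller query grammar described above.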
A similar thing works for GCP, but with GCP at the moment we've implemented a few other sets of resources, such as looking at documents in Firestore. We have storage in GCP as well; I can look at files, and I'm hosting some containers there that live in Google Storage. I can also interact with Pub/Sub in GCP. I'm just going to tail a topic, and that attaches a new subscriber to that queue. Then if I send a message, I get that response back: that pushed a message onto the queue, and my subscriber received it and displayed it. With Docker, we have containers listed. We also have volumes. Accessing Docker volumes is not always straightforward — the most common pattern is to mount them in a new container, which is what we're doing in the backend, but this makes it a lot simpler to interact with them. I can look at what my Redis swarm has been up to. I can also delete files: this "foo" file doesn't need to be here, so let's get rid of it. Additionally, we have a command that aggregates across different VMs and containers for looking at what processes are running. That's a little too wide for this display, but I can take a look at processes running in my containers. A couple of them didn't work because they're very streamlined containers that don't have a shell in them. And last, getting back to one of the other examples I wanted to be able to do, I can tail logs across several different containers running for different purposes. Here I have MinIO, Postgres, and a service all running together, and I can see messages from those services interleaved as they come in. That all works really simply via commands we're already familiar with. Looking at a few of the others, I can see a history of what I've run. I can also see some details about what kinds of operations were happening in the background for each command — there's some debugging information available here. Returning to the slides...
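The interleaving that tailing does across sources can be sketched as a merge of timestamped streams. This is an illustration of the idea only — Wash's actual tail follows live streams rather than merging static lists, and the function name here is invented.

```ruby
# Merge already-timestamped log entries from several named sources
# into one ordered stream -- the batch version of what interleaved
# tailing across MinIO, Postgres, etc. looks like.
def interleave(sources)
  sources.flat_map { |name, lines|
    lines.map { |ts, msg| [ts, "#{name} | #{msg}"] }
  }.sort_by { |ts, _| ts }.map { |_, line| line }
end

logs = {
  'minio'    => [[2, 'request served'], [5, 'bucket scanned']],
  'postgres' => [[1, 'ready to accept connections'], [4, 'checkpoint']]
}

interleave(logs).each { |line| puts line }
```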
Recapping: in AWS there are EC2 instances and S3 buckets, and you have ways of viewing them, accessing files and looking at their output, and querying what's there via metadata, attributes, and other queries. Same with GCP, plus some extras that I've added. A similar approach works with Docker and Kubernetes. Additionally, Wash has a simple external plugin interface, which a co-worker used to write up a view of Spotify playlists. Exploring Spotify playlists via the CLI may not be what you need, but it shows how simple it is to adapt an API to Wash's plugin system. I've shown a little of what Wash can do. It supports more commands that provide powerful ways to find things based on their metadata, a consistent way to execute commands on compute instances, and most of the other examples of common operations I talked about earlier. If you're interested in seeing more, the short link is on the bottom right: pup.pt/wash. I want to spend the rest of this talk looking at how it works. But first, why is it a shell rather than just a set of CLI tools? Let's examine what properties we expect to have in a shell. A shell is a REPL, a read-eval-print loop, so you can quickly test things out. It's all text-based, so we can execute it, share it, and rerun it. What's happening is explicit: you can see all of the behavior, share it easily, and rerun it. This is the vsh project, which provides a shell for querying and configuring secrets in HashiCorp Vault. It seems the demo is not working exactly, so I'll pull that up on the live screen. I wanted to use this example because it shows off some of the other properties that I see in a shell. It's designed around making it easy to write complicated operations quickly.
Short command names, tab completion, simple access to environment variables. You can navigate through a hierarchy representing how the secrets are stored and organized, and the context of where you are in that hierarchy feeds into the commands you run to read or update them. Wash provides these things for cloud resources. It builds on an existing shell using familiar commands, represents your infrastructure as an organized hierarchy you can navigate, and its commands operate on the objects of that hierarchy. Now, standard shell tools are built on system calls. On Linux, those are a small set that enable access to processes, memory, networking, and the file system. The OS then exposes more information through the file system, for example via /proc. So what are our cloud-native system calls? The cloud consists of file objects in hierarchical organization, VMs and containers, databases, and service appliances. We should have create, read, update, and delete operations on each of these things, plus access to logs, configuration, and labels, and links to related resources. For general-purpose compute instances, we need remote access to see what's happening in the OS. We need to send data around, such as publishing messages on a message queue. Wash has defined a set of primitives that represent abstract versions of many of those operations: the ability to list what's there and get configuration or metadata about it; to read data or get a stream of updates; to update a file or a configuration; to execute a command; to delete a resource; or to signal it to trigger some sort of state change — the most common example right now being restarting a VM or a service. While Wash's primary focus is on cloud-native infrastructure, it's designed to be pluggable: each resource it interacts with can choose to implement any or all of these primitives, and commands in Wash will work with anything that implements the required primitive.
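As a rough illustration of that pluggable primitive model — with all class and method names invented for this sketch, not taken from Wash's source — it might look like this in Ruby:

```ruby
# Hypothetical sketch of the primitive model: each entry implements
# whichever primitives make sense for it, and commands probe for
# support before acting.
class Entry
  def supports?(primitive)
    respond_to?(primitive)
  end
end

class Bucket < Entry
  def list
    ['logs/', 'backups/']    # enumerate children
  end

  def read(path)
    "contents of #{path}"    # fetch object data
  end
  # no exec: you can't run a command "on" a bucket
end

class Instance < Entry
  def list
    ['fs/', 'metadata.json']
  end

  def exec(cmd)
    "ran `#{cmd}` over SSH"  # remote execution
  end

  def signal(name)
    "sent #{name}"           # e.g. restart the VM
  end
end

# A command like exec works with anything that implements the primitive.
def run_exec(entry, cmd)
  return 'exec not supported here' unless entry.supports?(:exec)
  entry.exec(cmd)
end
```

The point of the sketch is the last method: a Wash command only needs the primitive it depends on, so it works uniformly across every resource type that implements it.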
Wash occasionally adds new primitives as new use cases demand them, but by keeping the list small, I hope to make it easy to add new resources to Wash. To map these primitives to the ways you use Wash, we mount plugins as a FUSE file system. So ls becomes a call to the list primitive, and opening a file becomes a read. This lets existing tools — less, an editor, or the Finder as I showed earlier — access remote data transparently. Some primitives, such as exec, don't map to file system operations. So Wash runs libfuse with plugins connected to it, and for the things we can't do through the file system, Wash also runs a local daemon that shell commands talk to over a socket. That handles things like exec and signal. The main Wash process also maintains a cache so that repeated interactions can be quick rather than reaching out over the network again. It caches the results of primitives we expect to be repeatable — primarily list, metadata, and read. And when you're doing reads and writes, you're operating on a local copy until it actually saves and uploads it. But to build Wash quickly, we're actually doing a bit of sleight of hand. Wash reuses your existing shell for the scripting layer and customizes it with its own defaults and a Wash rc file. You're calling aliased Wash subcommands to do some of the more complex interactions with Wash, and we provide a customized prompt to give a sense of place within Wash's hierarchy. Commands like wash ls go through the file system, while wash exec and wash ps go through the socket; you're invoking all of this as aliases to those commands via your own shell. While the plugins distributed with Wash are written in Go and compiled into Wash, we wanted to make it really easy to add new plugins. New plugins can be written in any language by creating a script or executable that responds to specific command-line arguments.
Such a script mirrors the primitives we defined earlier, with some additional arguments to help manage state between runs of the script. I also have a Ruby gem that simplifies most of this to defining classes that reflect your data hierarchy. I want to show two different examples of this. First, I'm going to restart Wash with a new plugin loaded that lets me browse my Goodreads library. I've got different categories for things: books I'm reading, books I've read. Most of the weird behaviors you're going to see here are because the Goodreads API is all XML, paged, and kind of slow, but after this first operation finishes, it's pretty fast. So I can look at some metadata about Zoe's Tale: who's the author (John Scalzi), get a description, get all sorts of information about it. And I can look at what I'm currently reading. The plugin that implements this is a fairly straightforward Ruby script, if you ignore the XML parsing. We need to authenticate against the API; we'll come back to those methods once we use them. We implement an init method that's called to load the plugin; it handles authentication and returns a JSON object that describes the plugin and caches any additional state — in this case, a user ID. Then whenever I list a directory, we get a list argument. We pull out that state to get the user ID, and depending on what path we're invoked with — whether we're at the top or looking at a bookshelf — I return a list of bookshelves or a list of books in that bookshelf, serialized to JSON. Implementing things like read is similarly simple: a read request comes in on a path, and the response is written to standard output. So developing these is pretty quick, and you can change them live in Wash.
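The shape of that script might look roughly like the following sketch, in the spirit of the Goodreads example. The JSON field names and invocation shape here are simplified assumptions for illustration, not the exact Wash external-plugin protocol, and the shelf data is made up.

```ruby
#!/usr/bin/env ruby
# Sketch of an external plugin: the script is invoked with a method
# name and a path, and answers in JSON on stdout. Real plugins also
# handle auth in init and stash state (like a user ID) for later calls.
require 'json'

SHELVES = {
  'currently-reading' => ["Zoe's Tale"],
  'read'              => ["Old Man's War"]
}

def handle(method, path)
  case method
  when 'init'
    # Describe the plugin root (and, in a real plugin, cached state).
    { 'name' => 'goodreads', 'methods' => ['list'] }
  when 'list'
    if path == '/goodreads'
      # Top level: one entry per bookshelf.
      SHELVES.keys.map { |s| { 'name' => s, 'methods' => ['list'] } }
    else
      # Inside a shelf: one readable entry per book.
      shelf = File.basename(path)
      SHELVES.fetch(shelf, []).map { |b| { 'name' => b, 'methods' => ['read'] } }
    end
  end
end

puts JSON.generate(handle(ARGV[0] || 'init', ARGV[1] || '/goodreads'))
```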
The other example I wanted to show is what the structure looks like when you're using the Wash Ruby gem. In this case, everything is declared as a set of classes that implement methods, and all the extra serialization is handled by the helper. This one is a plugin for connecting to targets that you've declared in a Bolt inventory — it ties into Puppet Bolt. It has a root plugin class that implements its own list behavior to show groups; those groups have targets, and those targets implement ways to execute commands on them, among other things. I've worked hard to make extending Wash via plugins as simple as possible. Wash has been out in the wild for a year now, so I want to talk about some thoughts and feedback based on that time and what that can lead to in the future. One of the things that really clicks with people is the same thing SSHFS provides: flexibility in working with files on remote systems. That might be taking a diff of files on different systems or copying from one system to another. Wash is like having any system you have access to automatically mounted. It also works over WinRM, the Docker and Kubernetes APIs, and Puppet Bolt transports, and it's easy to extend to new things. The only thing missing right now is a primitive to create new files and directories, which I'm hoping to add soon. Running in a shell is great at orienting you in a new context. I also dislike running extra background services unless I'm using them, so booting into a shell provides a clean context for saying "I'm working on a different problem now" and ensures everything is cleaned up when you exit. But it also adds some overhead to interacting with the system. If I want to view files in an IDE, why would I start up a shell first? Wash has ways to run as just a daemon that would keep a file system persistently mounted, but the configuration to run that as a service isn't there yet.
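Going back to the gem-based Bolt example for a moment, its class-per-level structure might look something like this — plain Ruby standing in for the gem's helpers, with all class and method names assumed for illustration (the real gem handles the JSON serialization and protocol plumbing for you).

```ruby
# Sketch of the hierarchy the gem encourages: a root lists groups,
# groups list targets, and targets implement exec.
class BoltRoot
  def initialize(inventory)
    @inventory = inventory # e.g. parsed from a Bolt inventory file
  end

  def list
    @inventory.keys.map { |name| Group.new(name, @inventory[name]) }
  end
end

class Group
  attr_reader :name

  def initialize(name, targets)
    @name = name
    @targets = targets
  end

  def list
    @targets.map { |t| Target.new(t) }
  end
end

class Target
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # In a real plugin this would run the command over a Bolt transport.
  def exec(cmd)
    "#{name}: would run `#{cmd}`"
  end
end

root = BoltRoot.new('webservers' => ['web1.example.com', 'web2.example.com'])
```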
Starting a shell is also a lot of overhead for doing small things, so especially as I get more practiced with them, I still find myself reaching for docker cp or kubectl logs. Wash could be improved so that subcommands can be run by themselves and support auto-completion — you could do `wash tail` and refer to things by Wash identifiers without having to start a shell. While Wash principally sticks to standard POSIX shell commands, some dedicated tools improve on those patterns in cool ways. Stern is one I came across recently: a tool for viewing Kubernetes logs across multiple containers that supports really simple selection via pattern matching. I've found myself using it a lot because it's so simple to select what I want to see. I'd be really interested in adapting it to work with Wash's plugin system, to make it simple to view aggregate logs across a wide set of systems with very simple ways of matching what I want to view. Folks have written plugins for GitHub, PuppetDB, and Goodreads. With a simple way to extend your shell and file system, what would you add to it? Thanks for listening. I'll be here for any questions. Now for a slight change of pace: there are a couple of questions up already that I'll respond to live, and we have a bit of time for any more that people want to throw in. The first one was: are there any plugins for Azure yet? The answer is no. There's a lot of surface area to cover, and I've been experimenting with different models rather than trying to build out breadth of support yet. Azure would be cool to add, but it's not there at the moment. Another question: how do I configure Wash to discover the metadata of different services? Right now, Wash looks at the defaults for how the different command-line tools for those services are configured.
So right now it's going to pull in anything that you've configured via a native command-line tool — gcloud, aws, kubectl, or docker, primarily — and just start with those automatically. Other external plugins tend to have their own methods for adding configuration, and I understand that those CLI workflows are often not obvious. It'd be really cool to enhance this with better login workflows for each of those, but it's not something I've tried yet. Another question: is it possible to write Wash scripts, i.e. for batch processing? Yes. In the help, you might have noticed that there's a script argument, so you can say `wash` and provide a script, and it'll run it non-interactively. Well — it'll run it; it chooses whether it's interactive based on whether it has an interactive file descriptor to work with. It also has a -c argument, so you can just run a short command and have it exit immediately. The slides will be posted; I believe the conference has been collecting slides. I'll double-check that those are going to be publicly shared, but I'll be uploading them after this talk. Another question: can I use this to create resources in AWS, or is it just to get information about them? It's currently targeted at getting information about them and more operational interaction. One of the areas that would be easy to add is the ability to edit configurations of something that already exists; that's a fairly common pattern already for Kubernetes resources. One of the next things I do want to add is creating new things, initially focused on files and folders, to make it easier to copy stuff around between different systems. That design should be broad enough to handle other resources, though it may not provide an easy interface, because there are a lot of parameters unique to each of the systems.
One thing that would be interesting to explore is having some simple defaults for people who just want to spin up something they're using and plan to throw away. I don't think I see any more questions at the moment. There was one about whether Wash is more like a shell for the cloud services — EC2, S3, containers, et cetera. I'm not sure I fully understand the question. It is that, but as I said at the beginning of the talk, it's designed around a plugin framework that tries to be useful when interacting with anything driven by an API. One way of thinking about this is as a pattern for quickly adapting tools like Wash to browse resources that are represented by an API. Rather than having to curl it and all that, it's a short step to build a plugin that brings a bunch of extra capabilities: being able to navigate around it, use your own tools to browse and edit things, and use find as a query language over it. Another question: any thoughts on how configuration management can be done or implemented? I'm not really trying to make it a configuration management tool; it's orthogonal to something like Terraform at the moment. It's more targeted at filling in all the gaps you might run into where you need to do more manual operations. I don't know that it's well suited for configuration management per se. It does take some similar concepts around abstracting across different resources, but we're targeting different modes of operation than creation, updates, and deletion, which is what Terraform, in particular, specializes in. Another question: if you were to use Wash for navigating Kubernetes, would you also need to install the kubectl client separately? No.
You will need a config of some sort that grants you access, but it doesn't rely on any particular CLI tools being installed. It's a self-contained binary with the SDKs for those services embedded in it. All right, great. Thanks a lot.