Hello, welcome to Devly, a multi-service development environment. I am Eric Hodel. I work at Fastly in the developer engineering department on our development environment, which I'll be describing for you today. I've written a lot of Ruby code, some of which you use every day. I'm Ad EZKL in most places on the internet. I've been writing software of one kind or another for more than 20 years now. We currently work within Fastly's site reliability engineering organization, focusing on improving the internal engineering experience. Fastly, for those that don't know, is a content delivery network and edge cloud provider. We serve traffic for GitHub, New Relic, Spotify, and many other popular websites and services. We also provide service for all Ruby and RubyGems downloads, and we do the same for many other open source projects free of charge. Ask us after the talk if you're interested in using Fastly for your open source project. We have servers all over the world which serve more than 14 trillion (with a T) requests each month. This constitutes more than 10% of all internet traffic, which still kind of blows my mind, because it was not always that large. We also employ the owners of 100% of the world's best dogs; we've got dogs all over the world. We are currently hiring Ruby application engineers. If you're interested, please find Eric or me after the talk, and we can provide you with details. A quick note before I continue: I currently live in Portland, Oregon, but I was born and raised in a small town called Meadville, about two hours north of here. This is my first public talk, so I'm excited to be giving it so close to home. Thank you for the opportunity. So today I'd like to discuss a problem that we believe impacts organizations of all sizes.
To help us illustrate this problem, I'd like to tell you a story about the evolution of Fastly's API. This is a rough approximation of the service architecture that backed the Fastly API circa 2012. The original Fastly development environment consisted of a copy of each component of the Fastly API running on each engineer's laptop. Soon after, the early team decided that virtual machines should be employed to provide a degree of operational uniformity and parity between development and production. Another attribute of Fastly in the early days was that all the engineering work was being done by a very small group of people. Changes to the systems were easily introduced and distributed through source control, which allowed the teams to rapidly develop and deploy changes. Another side effect of the small size of the company was that focused discussions were possible, and this made decisions easy to communicate. Let's zoom back in and step forward in time a few years. Fortunately for us, the company was successful during this period of time, and that success opened doors for new opportunities to expand the business by adding additional functionality to our API. In some cases, when we added new functionality, we added new supporting systems. When we as software engineers, like everyone in this room, add new functionality and dependencies to our systems, we introduce complexity. I don't mean to imply that complexity is necessarily a bad thing. To the contrary, we would argue that complexity is an unavoidable side effect of growth. There is something else that I haven't mentioned yet that complicates matters even more, which is that we like to use the right tool for the job, and many of our services were written in entirely different languages with very different workflows. So despite the increase in the number of languages and services, our development environment stayed much the same. Moreover, the gaps between each group's development workflows grew considerably.
This became increasingly problematic as our engineering department doubled in size every six months for a number of years. As a result, our original development environment became increasingly unreliable, and established processes to communicate changes broke down. Maintaining any single engineer's development environment was problematic. We have engineers working on everything from code that runs in the Linux kernel to code that runs in the browser, and the needs of each team in those different areas are dramatically different. Our original development environment was unable to meet the needs of one team without compromising the needs of another. This growth continued regardless of our development woes. A company will continue to grow; that's just what it's going to do. Writing and scaling software is complicated, and there are many moving pieces and things to keep in mind while you're doing it. As an industry, we've established and continue to improve upon strategies that help us direct our time and energy. I believe this is due in large part to our ability to observe software systems in isolation. Organizations, on the other hand, are far more complex and much harder to observe in systematic ways. But by introspecting on our own experiences and listening to our coworkers, we were able to find some themes and common frustrations, of which these are some examples. "Here's your laptop. We'll see you in two weeks when your development environment is running." "Does anyone know why the API gateway crashes in a loop?" "I updated the rest of my development environment and now nothing works. What happened?" "I can't do my work today because I need to rebuild my development environment." It's clearly untenable and becoming worse over time, increasingly problematic. How many people out there have actually had a development environment like this or experienced these things? I'm sorry, but I'm glad it wasn't just us.
Our friends and coworkers were becoming increasingly frustrated with the situation. So what to do? During the same period of time, a lot of new tools arrived on the scene, but none met all of our needs. So, through observation, research, and a lot of discussion with our friends, coworkers, and peers at other companies, we arrived at a few important themes. We believe these themes embody the traits of desirable developer-focused productivity tools. A development environment must be reliable. I should be able to run a small number of commands to get what I need running. I should not have to know how every system works to do my job. I should be able to easily see the local health of systems I rely upon. And I should never, ever have to spend a day rebuilding my environment. A development environment must be accessible. Maintainers of systems must be allowed and encouraged to maintain their development environments collectively. I should be able to build and test new changes across systems owned by different teams easily. A development environment that spans multiple teams and workflows must be maintainable by the community of folks that are using it. Managing changes in source control illuminates past and present ownership, even with many components. Structure and form should be encouraged through convention, documentation, good tooling, and feedback loops rather than enforced by gatekeepers. A development environment must be composable. We want our development environment to be able to run these services together as a composable unit. It should be really easy to try new supporting systems and swap things in and out without having to worry about writing a bunch of Chef code or doing a bunch of other things like that. A development environment must be reproducible. We need the ability to determine and apply the last known good state of all systems. Source control or similar mechanisms should allow us to determine how we arrived at this known good state.
And we should be able to leverage existing tools like Git, RubyGems, Perl's CPAN, Python's pip, and Go's dep to arrive there. So through the rest of this talk, we hope to show you how we started to meet the needs of our coworkers at Fastly by applying these themes to a tool we've been building together for the last year. We call the tool Devly. To tell you more about Devly, I'd like to hand things off to my friend, close collaborator, and the lead engineer on the Devly project, Eric Hodel. Thank you, Zeke. I will talk about Devly and some of its components and features. As Zeke covered, Devly is designed for developers. Devly builds images from your repositories, it uses those images to manage containers, and it enables communication both within and across teams of developers. Devly is distributed for macOS and Linux. We provide a standalone executable built with Ruby Packer, and we provide packages for macOS and Debian. Devly lets you configure all of your services. It helps you build images from your repositories using Dockerfiles. It allows you to configure those images to run as services, and it runs groups of services together as part of a rack. An image contains the files necessary to run a service. The audit log image uses Ruby, so it has a copy of our application code. This code requires some libraries like Rails, Sidekiq, and a JSON parser, so the image contains those installed gems. And the JSON parser requires a C library, so we install that with the OS package system. In our repository, there is a Dockerfile that contains the instructions for building this image. Images can contain applications for any language. Our stats service is written in Go; its image contains a Go binary compiled from the stats application code. The web app our customers use is written in Ember; its image contains a copy of the application code ready to run. We share all these images across all the teams by uploading them to and downloading them from the Google Container Registry.
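As an illustration, a Dockerfile along these lines would produce the kind of image described here. The base image, package names, and paths are assumptions for the sake of the sketch, not Fastly's actual Dockerfile:

```dockerfile
# Hypothetical Dockerfile for the audit log service image.
FROM ruby:2.5

# The JSON parser needs a C library, installed via the OS package system.
RUN apt-get update && apt-get install -y libyajl-dev \
 && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install the gems the application requires (Rails, Sidekiq, a JSON parser, ...).
COPY Gemfile Gemfile.lock ./
RUN bundle install

# Copy in the application code itself.
COPY . .

CMD ["bundle", "exec", "rails", "server", "-b", "0.0.0.0"]
```

Copying the Gemfile and running bundle install before copying the rest of the code is a common way to let Docker cache the installed gems between builds.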
And this allows us to be sure we're always using the latest images and the latest source code. A Devly service is a runtime configuration for an image. Here we've created the audit log service using the audit log image. The service runs a command. Since the audit log service provides an API for managing event data, it runs a Rails server to provide the HTTP interface for events. Our audit log service needs to be accessible to other services so they can read and write events. To allow other services to communicate with it, we expose port 8888. And if you use a development framework like Rails that supports live development, you can mount your repository on top of the files in the image. This allows you to work in your favorite editor from your favorite OS. You can change a file on your host OS and see the changes in your browser. This service runs the audit log API, but we also have some Sidekiq background jobs to run. To make it easier to read our logs, let's use a separate service to run those background jobs. Since the background jobs use all the same models and databases as our application, we can use the same image. We create the audit workers service, but we run the Sidekiq command instead of the Rails server command. Then, when we start up the audit log API service, it only runs the Rails server, and when we start the audit workers service, it only runs the background jobs. This separation helps make development a little more accessible because the logs are separate. We can also test our audit workers in complete isolation from the API. We'll create a few more services for our applications, including the authentication API, the configuration API, and some databases they use. If we're going to work on the configuration API, we don't want to start up the services that we don't need; the same goes if we're working on the authentication API.
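A sketch of what these two service definitions could look like in a Devly library. The YAML schema, image name, and paths here are illustrative assumptions; the talk later mentions Devly borrows large parts of the Docker Compose schemas, so the real format likely resembles this without matching it exactly:

```yaml
# Hypothetical service definitions in the Devly library.
services:
  audit_log:
    image: gcr.io/example/audit-log          # image name is an assumption
    command: bundle exec rails server -b 0.0.0.0
    expose:
      - 8888                                 # so other services can read/write events
    mount: ~/src/audit-log:/app              # live development against local files
  audit_workers:
    image: gcr.io/example/audit-log          # same image, different command
    command: bundle exec sidekiq
```

The point of the sketch is that both services share one image; only the runtime configuration (the command, and which ports are exposed) differs.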
We create a rack for developing the configuration API that only contains the services it needs. We need a MySQL database, the audit log service, and the config API service to do our work. A rack can customize a service. Since we want to access the services running in the rack for development, we expose ports for a few services to the host OS. This allows us to connect to those ports from our browser. You can also set environment variables or mount different files to change the behavior of the service. Devly allows you to configure multiple racks. The authentication team needs to work on its services, which include a Postgres database and the authentication API. The authentication development rack also uses the audit log service, just like the configuration team's rack. When we start these racks, they use independent containers to run their services. This allows the teams to have different configurations and software versions for the audit log service that won't collide with each other. For example, you can start up both racks at the same time and isolate bugs that span multiple services. Using common configuration to replicate services across teams makes sharing your work easier. The configuration for the images, the services, and the racks lives in the shared Devly library repository. At Fastly, we allow any developer to make changes to the Devly library and have them discuss the proposed changes with the people that develop that service. The authentication, configuration, and audit API teams all have racks that use the audit log service. When the audit dev team proposes changes to the audit log service, all of those teams need to be able to discuss them. By tracking the connections between teams and services through the Devly library repository, they become more visible, which improves the maintainability of your services and the communication across your teams.
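The two racks just described might be sketched like this. Again, the schema, service names, and ports are assumptions for illustration, not Devly's real format:

```yaml
# Hypothetical rack definitions in the Devly library.
racks:
  config-dev:
    services:
      mysql: {}
      audit_log: {}
      config_api:
        ports:
          - "8888:8888"          # exposed to the host OS for the browser
        environment:
          RAILS_ENV: development
  auth-dev:
    services:
      postgres: {}
      audit_log: {}              # an independent container from config-dev's copy
      auth_api:
        ports:
          - "9999:9999"
```

Note that audit_log appears in both racks: each rack starts its own container for it, which is what lets the two teams run different versions side by side.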
Now that we've had an overview of the components of Devly and how they combine, I'll show demos of some common development tasks using Devly from the perspective of developers on the various teams we've just seen. We'll run through some workflows like getting started with development, sharing changes within and across teams, and setting up some convenience tools that will make development easier for ourselves and our coworkers. We'll start at the beginning by setting up Devly as a first-time user. We run devly setup and give Devly a Git repository to pull a Devly library from. This downloads the Devly library repository and the other repositories for our services. Along with checking out the repositories, setup performs some additional checks, including the Docker version and your Google Cloud SDK version. The setup command will try to fix things it can, or give you a message to help you fix it if it can't do that by itself. This step takes no more than a few minutes to fetch your repositories and perform the necessary checks. Once setup completes, we can run devly info to see what racks and services are available to us. Devly will give us a list of the racks and services in our Devly library. We can retrieve information for a rack, which includes the services it starts. And we can retrieve information for a service, which includes the image, the repository, and metadata for ensuring the image is compatible with the files in the repository that we've mounted and is up to date with the image in the registry. Now that we have completed setting up Devly, let's start a rack and perform some basic development tasks, like viewing logs, using our service, and making a small change. The devly up command starts a rack. Since we don't have all the necessary images downloaded from the registry, first we see Devly pulling one of those images. Once all those images are downloaded, Devly creates a network to isolate this rack and starts all the containers.
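A first-time setup session along the lines just described might look like this. The command names come from the talk; the repository URL and the rack and service names are hypothetical:

```shell
# Hypothetical first-time session; devly is Fastly's internal tool.
$ devly setup git@example.com:ops/devly-library.git
$ devly info               # list the racks and services in the library
$ devly info config-dev    # the services this rack starts
$ devly info audit_log     # image, repository, and compatibility metadata
```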
When containers aren't dependent upon each other, Devly can start them in parallel to speed up startup. Now let's check to see if everything is running okay. We run devly status to see which racks and services are currently running. We can see that the two API services and the database are running, and we can see that the two API services are accessible to the host OS on ports 8888 and 9999. We can view the logs for the rack by running devly logs. This command will continue to follow any new logs until we exit with Control-C. Since everything seems to have started for real, let's try out the configuration service by switching to the browser. The configuration service was running on port 8888, so when we load it, we see the main page for the configuration API. Now we're triple sure that the rack is working. Let's switch back to the terminal and view the logs from this request. devly logs shows our HTTP request from viewing the config API main page. Everything is definitely working, so let's do some work by opening up the main page in our favorite editor, vim. We open the source for the main page from the host OS and add some text: "This is an example service." And because no one has ever figured out how to exit vim, we only save the file. Let's switch back to our browser and see if our change worked. Reloading the configuration API shows that the text "This is an example service" has appeared. Our change was successful. Let's check the logs to make sure it wasn't a fluke. Of course the request from the refreshed page appears in the logs, and since it has a 200 response with a different page size, we definitely loaded new content. Now that we are done with our work, let's shut down the rack by running devly down. This stops all the containers we had running. This might save me a little bit of CPU power, but normally our services at Fastly are lightweight, even when a rack has a dozen services running.
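The basic loop just demonstrated, shown as a hypothetical session. The command names come from the talk; everything else is assumed:

```shell
# Hypothetical day-to-day loop with a rack.
$ devly up config-dev      # pull any missing images, create a network, start containers
$ devly status             # which racks and services are running, and on which ports
$ devly logs config-dev    # follow the rack's logs until Control-C
$ devly down config-dev    # stop all of the rack's containers
```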
When we work within teams, we'll be pushing and pulling changes to our repositories. And when we work across teams with Devly, the other teams will push images for their services when they have a new set of features ready. For this workflow, the audit log team has updated the audit log image to add a source field for events, and we need our services to use this new feature. First, let's see if we already have the source field by loading the audit log service in the browser. I see the user ID, the timestamp, and the action fields, but no source field, so we're using an old audit log image. Let's switch to the terminal and update our image. Now that we've verified that we don't have the source field, we'll pull the latest image (and sorry about the lack of output, it's a bug). Our running audit log service is still using the old image, so we need to shut it down and start a new one from the updated image. We can do this with devly restart, which will replace our audit log service with a new one running the updated image. Let's switch back to our browser to see the updates. We refresh the audit log service in the browser and see that the source field has appeared as we expected. Now that our audit log service is running the latest image, we can continue updating our service to use the source field in the audit log. So far we've worked outside the container, but sometimes we need to run commands from inside the container where all our dependencies are loaded. So let's pretend now that we're on the audit log team, and we'll go back in time a bit. We're now working on adding that source column to our database, and to do this, we need to run the migration we've just finished writing. We can't do this from the host OS because none of our application's gems are available; they're only installed inside the container. So we need to run the migration from inside the audit log service. Let's start with the browser again.
We view the audit log homepage, and of course we don't have the source field, because we haven't run the migration yet. devly exec lets us run commands inside the container. We don't exactly remember the image layout, so we start a bash shell so we can explore. After the shell is open, we remember to change to the audit log source directory. Then, to check that we're in the right place, we run rake -T (capital T). Then we can run the migrations. We see that the migration said it added the source column, so let's go back to the browser and check it. Reloading the page in the browser shows the source column migration is complete. Now that we've remembered where the rake tasks live, let's run the command directly so we can use our shell history in case we need to roll back and retry the migration if there was an error. So we switch back to the terminal and run our migration using devly exec with a complete rake command line. Now, this is a little better, but it's really only okay for this one task this one time. When we share this work with our other teams or team members, how will they remember how to run the migrations? What we've done is not very reusable, and it would be nicer if the migrations ran automatically when we started a rack so we can get to work right away. We can automate running the migrations at rack startup using a post build task. The post build tasks run for a service after the rack starts, to perform any extra setup tasks you might need, such as the migrations we just saw, or seeding data. This lets users who are unfamiliar with the service get to work right away. Post build tasks live in the Devly library and are written as rake tasks that devly up runs. The tasks live in the post_build namespace, and each task is named the same as the service it will run for. The rack argument allows Devly to run migrations on the correct service if you have multiple copies running in multiple racks at the same time.
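A post build task like the one described could be sketched as a rake task. The post_build namespace, the service-based task name, and the rack argument come from the talk; the exact devly exec command line, its flags, and the audit_log service name are assumptions:

```ruby
require "rake"
include Rake::DSL # make `namespace`/`task` available outside a Rakefile

# Post build tasks live in the Devly library, in the post_build namespace,
# and are named after the service they run for. The rack argument lets
# Devly target the right container when the same service runs in
# multiple racks at once.
namespace :post_build do
  desc "Run database migrations after the audit_log service starts"
  task :audit_log, [:rack] do |_task, args|
    # Hypothetical devly invocation: run the migration inside the
    # audit_log container belonging to the given rack.
    sh "devly", "exec", "--rack", args[:rack], "audit_log",
       "bundle", "exec", "rake", "db:migrate"
  end
end
```

Because the task is keyed by service name, devly up can discover and run it automatically after starting any rack that includes that service.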
Here we'll run devly exec on the audit log service, just like we saw when running the migrations earlier, and we'll use the same rake command line to run the migrations. After shutting down the rack, we can start it up again with devly up. We go through all the steps we saw before in the "start a rack" section. Then, at the bottom, we see devly exec run our migrations, including the migration output. Now, whenever someone starts our service, the migrations will run automatically, so they won't have to look up or ask what to do. Of course, we still need to run migrations during development. The next time we want to change the database schema, shutting down and restarting the rack takes several seconds, and we don't want to spend all that time. To make this easier, we can save the long devly exec migration command as an easy-to-remember command. We don't want to have to remember, look up, or type this long command to run migrations; let's give it a friendly name that's easy to type and remember. The saved commands live in a Devly YAML file for the repository we are working from, here the audit log repository. Each repository can have its own Devly YAML with custom commands. Let's zoom in and look closer. The run commands are a collection of the friendly command names that we want to run. I chose migrate as the name of the command which will run the migrations. The command runs on the audit log service, and the command line is the one we've seen earlier that runs the database migration task. We can also define a test command that runs the tests inside the service. This way, anyone can run the tests where all the dependencies are up to date. So now we can run devly run migrate from the audit log directory and we see the migrations run. Or we can run the tests. Since these tests run inside a container which is part of a rack, they may communicate with other services in the rack.
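The saved commands described might look like this in the repository's Devly YAML file. The exact schema and keys are assumptions; only the command names migrate and test and the idea of per-repository saved commands come from the talk:

```yaml
# Hypothetical Devly YAML in the audit log repository.
run:
  migrate:
    service: audit_log
    command: bundle exec rake db:migrate
  test:
    service: audit_log
    command: bundle exec rake test
```

With a file like this in place, devly run migrate or devly run test from the repository directory runs the saved command inside the service's container.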
You can have separate racks, where one is configured to run unit tests that don't talk to other services, and a larger rack with more services that runs integration tests. Either test suite could be started from a saved command. Sometimes we have to work on a service together with another team, and Devly has a workflow for cross-team development. The audit log team is working on some new high security features. They aren't complete yet, but the team wants our feedback before they continue and make something that's too difficult to use or integrate. To give them feedback, we need to work with their work-in-progress branch. We were told that if we went to the audit log page, we would be running the correct code if a high security logo appeared. We go to the audit log page and see the same page as usual. No high security logos anywhere, so we'll need to switch to their branch. The high security branch may have new dependencies that our image doesn't have, so mounting a copy of the updated branch on top of our existing image isn't enough, because updated gems and code won't be there. We'll need to build a new image to be sure everything will work. To build the new image from the high security branch, first we need to tell Devly to use a repository we control. We use devly link to tell Devly about our copy of the repository. This will let us build an image from the correct branch. We see that the audit log repository is now linked inside of the Devly library. We go to our repository copy and check out the high security branch. Then we use devly build to create a new image for the audit log service. Now that our new image is built, we can restart the audit log service. We use devly restart again, like we did when we pulled the audit log image that had the source field. Now when we reload the browser, we can see we're using the high security branch, because the high security logo is present. We can now do some test integration with the new code to give feedback on the high security features.
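The cross-team workflow just walked through, shown as a hypothetical session. The command names come from the talk; the paths and branch name are assumed:

```shell
# Hypothetical cross-team branch workflow.
$ devly link ~/src/audit-log     # point Devly at our copy of the repository
$ cd ~/src/audit-log
$ git checkout high-security     # the other team's work-in-progress branch
$ devly build audit_log          # build a new image from this branch
$ devly restart audit_log        # replace the running service with the new image
```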
As adoption increases, you'll want to centralize image building through continuous integration so you always have up-to-date images in your registry. By running your tests through Devly, you have a more consistent environment, because the image, service, and rack are all built and configured the same way in both continuous integration and local development environments. Regular devly setup may take too long in a CI environment, as it pulls the full library history. The CI mode for devly setup reduces the history and repositories saved, to save time. The CI environment has our repository checked out to the correct commit already, so we can use devly link to use the correct source files; the overwrite flag makes sure we replace any existing files. The new code we're testing may have new dependencies, so we need to build a new image. Finally, we start the correct rack for testing this service. Then everything will be ready to run tests, same as our local environment. The next thing to do is run our tests. This uses the saved command we saw earlier. If the tests pass and we're on the master branch, we can then push our new config API image to the registry to share this image with all the teams. Adopting Devly has given us a common way to start more and more of our services. Once you have a sufficient set of services, teams can build on top of this capability beyond the workflows I've demonstrated. All the workflow demos I showed were for Ruby applications, but they are no different for developing a Go application, which runs a compiled binary inside of an image. With a Go app, you edit the source code, create a new image with devly build, then test it with a saved test command. This makes the development process more accessible, as you don't need to learn as many new things when working on different languages. I've already shown the basics of CI and Devly, but with a common way to start services and run their tests, you can go beyond running tests for a single service.
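Those CI steps might be scripted roughly like this. The setup, link, build, up, and run steps are described in the talk; the flag spellings and the final push command are assumptions for the sketch:

```shell
# Hypothetical CI script for the config API service.
$ devly setup --ci              # reduced history and repositories, to save time
$ devly link --overwrite .      # CI already checked out the correct commit
$ devly build config_api        # the new code may have new dependencies
$ devly up config-ci            # start the rack used for testing this service
$ devly run test                # the saved test command from the Devly YAML
# On master, after a green build, push the image so every team gets it:
$ devly push config_api
```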
The services you build are composable into larger racks. The more services in a rack you have, the closer to a real deployment you come, so the easier it is to run integration tests or end-to-end tests across your services. By ensuring your images, services, and racks are reliable at every level, you can more easily move your containers from development toward deployment. The image at the base of all your services makes the contents of any application accessible to various security scanners. This allows you to run internal compliance processes, run vulnerability scans of the libraries you're using in your images, or find issues through static analysis. You can perform enhanced testing, such as a rack for fuzzing, where bizarre inputs are sent to your service to try to break it. You can get started with chaos engineering from an isolated, stable environment. You can build separate staging environments for groups of services, or run integration tests. Now Zeke will share with us a few things we've learned while building and collaborating on Devly with our coworkers. One of the first things we learned is that finding early adopters is key. We were fortunate enough to have a really diverse group of early adopters with varying degrees of experience who were willing to provide us with constructive criticism early on. We had a few early adopters who were able to provide us with feedback on containerization and orchestration strategies, which was really, really helpful. We also had a few early adopters who were relatively new to the company and who had little or no previous container experience. All of our early adopters made Devly a better product early on. Especially with the folks who were new to the company and had little to no container experience, their perspective was key, because accessibility is one of our desired traits.
On top of the feedback of our early adopters, their advocacy helped us increase our internal adoption from 5% of product engineering groups to over 50% in a little over six months, which was pretty fast. With that kind of adoption rate, we learned the importance of building and sustaining an open and supportive community within the company very early, so we talked very openly about our plans, successes, and especially our failures, acknowledging the fact that we're making mistakes and we're learning with everybody as we go along. As of today, our Devly library repository has 30 contributors and includes people from almost every team in Fastly engineering, which is pretty rad. All of this fosters a sense of shared ownership and togetherness that has been really important in the development and adoption of Devly. Establishing feedback loops with the community you're developing is key. The more heard people feel, the more likely they will be to talk, ask questions, and provide feedback, and that's ultimately what we're trying to do: get people to talk more. It's kind of interesting that as software engineers building tools we tend to think about everything as code and requests going between things, but really, communication is what this comes down to in some ways. One of the ways that we're working to establish these feedback loops is to acknowledge, discuss, and ticket bugs that our users are finding very, very quickly. And when we fix them, we let the people who reported them know, and we try to get them involved in the review process to make sure that we've actually solved the problem that they're seeing. That's been really important. Documentation is really important, too. Getting-started docs can reduce friction, and they absolutely need to be maintained.
One of the things that we did early on that I think was really impactful was to have separate per-operating-system getting-started directions, so that people can just say "go to the docs" and the docs are actually accurate. That, in combination with the packaging that we're doing, lets people say, "Here's your laptop; install this package and go through these directions," and they actually have a working environment within a relatively short period of time. Our desired target was to go from multiple days for the time to start up a new development environment to 15 minutes, and we've been achieving that for three or four months now, which is pretty phenomenal. Also, because we're building this community, we want our users to know that they're free to update the documentation if they want to. We don't just tell people to go update the docs themselves; we'll do it as well, but we want it to become a shared resource that everybody's using to learn from and to teach their peers. Another thing to document, and this is a major learning that Eric helped enforce and that I was not doing a very good job at, is the administrative tasks and release processes that we were using to get these packages out. You'll be very glad you did that, because the last learning is to automate everything that you can as part of your internal tooling. It turns out that QA automation for cross-platform CLIs is really, really hard. I think QA in general is really hard; I find it really hard, but doing it when you're dealing with multiple operating systems and a lot of moving pieces makes it incredibly hard. So despite the Devly CLI having 97% test coverage, we find bugs in weird edge cases all the time, or I guess our users do.
And so the more automation we have, especially around our release process, the better; it's proving to be a real time saver if you're going to be releasing a lot of new versions rapidly and delivering them to your end users. We've also started to extend both the tooling and the test harnesses so that we're able to check for and report common issues around things like the rack schemas. We do some automated testing against those configurations to see whether all of the services are exposing the right ports, or whether they're doing anything that might be untoward for the environment, which gives us a way to inject new conventions into our CLI process. We regret to say that Devly is not yet open source; supporting our users at Fastly and preparing for this conference did not allow us the time to prepare Devly for an open source release, but we're very close. Watch us on Twitter, and probably Fastly as well; we'll post something on the blog, and we'll certainly link to it once it's open source. Once again, we are hiring; if you're interested, talk with us. We're both very nice. We'd like to thank our reviewers, and there are a lot of them, for helping us bring you this talk. We'd also like to thank the Devly users for helping us make a great tool for development; it legitimately took a village, and we have one at Fastly with a lot of really supportive folks. Credits for the logos and other things that we used, and thank you. Yes, thank you for your time.

The first question was: how are we managing configuration for services? The Devly library describes how the application code gets shared within Devly, and it's a reflection of how the service will run in production. Inside the service, whatever ports it needs in order to connect to other services are handled internal to the image.
So those live in the repository; anything that connects two services together, like exposing ports or setting environment variables to run a certain way, lives in the Devly library.

The next question was: how do we manage two projects sharing the same service? Many of our racks share services; I think the most-used service is used in 15 racks or so, and for that one the configuration inside of Devly is very small, because there's been a convention established by talking with all the teams that use it: we're going to run on this port, and we provide this API. There's regular communication that Devly facilitates between the teams that use it, so Devly doesn't so much manage the service as focus on managing the communication that needs to happen to share that information. The service configurations can also be overridden at the rack level if you want to do something different, and we provide local environment overrides, as well as branches and those sorts of things.

The next question was: this looks a lot like Docker Compose; what's the relationship? We started by using Docker Compose, but we found that Docker Compose's features seemed more production-focused, and we wanted more control over what commands our users had to type and what error messages we would give. Because Docker Compose is another command-line tool, we'd have to parse its errors and convert them into something we could tell our users: this is what you do to fix it. Because of that, we're borrowing large parts of the Docker Compose schemas, but we provide all of our own Docker interactions; we have our own client for that.
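To make the override mechanism above concrete, here is a minimal Ruby sketch. This is not Devly's actual code: the service definition, the rack-level override, and all key names are hypothetical, loosely modeled on Compose-style keys. It shows a shared service definition from a central library being merged with one rack's override, plus the kind of simple convention check (does the service expose a port?) that the automated testing mentioned earlier could perform.

```ruby
# Hypothetical base definition, as it might live in a shared service library.
BASE_SERVICE = {
  "image"       => "registry.example/billing-api:latest",
  "ports"       => ["8080:8080"],
  "environment" => { "RACK_ENV" => "development" },
}.freeze

# A rack (project) that wants to run the same service differently:
# remap the host port and add a feature flag.
RACK_OVERRIDE = {
  "ports"       => ["9090:8080"],
  "environment" => { "FEATURE_FLAGS" => "new-invoices" },
}.freeze

# Merge the two: rack values win for scalars and arrays, while nested
# hashes (like "environment") are combined key by key.
def merge_service(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? old.merge(new) : new
  end
end

# A tiny convention check: every service must expose at least one port.
def missing_ports?(service)
  Array(service["ports"]).empty?
end

merged = merge_service(BASE_SERVICE, RACK_OVERRIDE)
```

After the merge, `merged` keeps the library's image, takes the rack's port mapping, and carries both environment variables, so the rack gets its customization without copying the whole definition.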
We did a hackathon project to kick this whole thing off: I sat in a room for three days, wired up Docker Compose, Docker, and a bunch of rake files, and made a monorepo. We did it that way at first, and it pretty quickly became clear it was going to become unmanageable. It was also a really good opportunity, I think, for both of us to learn a lot more about the Docker APIs and what's sitting under the covers, and as we look at using containers in production as well, it's been really valuable to actually have a pretty good idea of how they work. Thank you for your time, and you can speak with us afterwards outside.