The first of those last two talks is being presented by Sharon Tateron and Michael Lang, who are from a newspaper that you might have heard of. Please make them welcome. So hello, I'm Sharon Tateron. I'm a software engineer on the New York Times photo team. Michael Lang is also a software engineer on the same team. We're really happy to be here today to talk about how we build sustainable systems powered by Python. So first, some background about us and our team. As software engineers on the New York Times photo team, we build applications that support ingestion and management of the photos received by our systems. This includes photos from staff and freelance photographers as well as external services. We usually get in around 15,000 photos per day, but sometimes up to 45,000 photos a day for big events such as the Super Bowl or the Oscars. As part of our infrastructure, we have developed and continue to develop a suite of Python-based applications and microservices that we refer to as Photon. This includes microservices that are responsible for downloading images from FTP and photographer cameras, stamping image metadata, and moving images into the photo archive. It also includes APIs and shared libraries. The responsibilities that the Photon apps support continue to grow. So what do we provide? For end users, we provide resilient, highly available systems that quickly resume processing when faced with unexpected, yet inevitable, errors and crashes. And for developers, we provide flexible, maintainable systems that can be enhanced with ease; systems that really empower developers to develop. What we've found is that a better developer experience often leads to more reliable systems. When developers can build, patch, and iterate upon existing code with ease, and more easily identify failures, new features and bug fixes can be released consistently and safely. This ultimately benefits the end user. We've also found that maintainable systems are sustainable systems. 
Systems that can be built upon and sustained over time as teams change and new features are added. To achieve this, our systems are powered by Python, both built-in Python modules as well as open source Python libraries. So why Python? I'm going to take a little diversion here. Earlier in the year (maybe you could flip the slides for me) I went on a trip to Jordan and to a place called Petra. And this is the entrance to Petra through the Siq. Petra is an old city, 2,500 years old. Here's a picture of what it looks like. It covers a lot of territory, and so my wife and I decided that we needed some transport to get around. Vehicles were not practical there. So we had a choice. We could take a camel or maybe a horse. But ultimately, I decided on a donkey. And the donkey could do everything. It could go up and down stairs. It could go all day. It didn't need to stop for any reason. And so that's why we use Python: it gets the job done. And we use a lot of Python. And Sharon will take you through more of our presentation. So what we're going to cover today is easy installation and configurable runs with setuptools and Click, code quality and type checking with Black, Flake8, and MyPy, and feature addition with Pluggy. We're also going to talk about resilient runtime with fail-fast and start-fast approaches, and automated documentation with Sphinx. All the tools that we talk about today are either built into Python or are modules that are easily installable with pip. So, installation and running. First we'll talk about easy installation with setuptools. Setuptools enables building and distributing Python packages as well as namespace packages, which lets us build and distribute sub-packages separately. Namespace packages are often used for loosely related packages and to indicate an organization name, application family, et cetera. In our case, our namespace is Photon. 
With namespace packages, each sub-package can be its own project and have its own repository, while also contributing to a shared namespace. In our architecture, each Photon microservice or shared library is independently installable within the Photon namespace. This allows us to locally install and develop one package at a time, deploy each microservice separately, and distribute each common library to our internal package repository individually, all while maintaining a shared namespace. This is an example of how imports look from a shared namespace: common and demo-util are separate projects, but both contribute to Photon. So one issue that we faced early on was whether we should use a monorepo or a multi-repo. Monorepos have some advantages, which you can see there, among others. Multi-repos also have some advantages. What we ended up with is kind of a work in progress, but we have a lot of apps and we want them to be able to share, and we also have a lot of developers and we want them to be able to work on things independently. So we combined mono and multi, and maybe you can guess what we call it. Yes, it's the Monopython repo structure. Sharon's going to give some examples of that as we go on. So, in order to use setuptools for easy installation, first we pip install setuptools, then we create a setup.py file, and then we run pip install in editable mode in the project directory. This is an abbreviated example of setup.py and the setup function. After importing setup from setuptools, we pass options such as name, description, and version, as well as entry points, packages, and requirements, among others. In the packages option, we define the namespace packages, each prefaced by photon to establish the namespace. For entry points, here we're defining a console script such that running a command that we've named invokes a Python function within our project. 
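An abbreviated setup.py along the lines just described might look something like this. The project, package, and command names here (photon-demo-util, photon.demo_util, photon-demo) are hypothetical stand-ins for illustration, not the actual Photon code:

```python
# setup.py -- a minimal sketch of one sub-package contributing to a
# shared namespace (hypothetical names, not the real Photon project)
from setuptools import setup

setup(
    name="photon-demo-util",
    description="Demo utilities contributing to the photon namespace",
    version="0.1.0",
    # each package is prefaced with "photon" to establish the namespace
    packages=["photon.demo_util"],
    install_requires=["click"],
    entry_points={
        "console_scripts": [
            # running `photon-demo` on the command line invokes the
            # cli() function inside photon/demo_util/cli.py
            "photon-demo = photon.demo_util.cli:cli",
        ]
    },
)
```

With a file like this in place, `pip install -e .` in the project directory installs the sub-package in editable mode, and the `photon-demo` console script becomes available on the path.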
In our case, that function exposes a command line interface to the application and related utilities. The command line interface is based on the Click library, which we'll discuss next. Click stands for "command line interface creation kit," and it allows us to build command line interfaces out of composable pieces, out of the box. In our architecture, each Photon application package has a command line interface entry point, as defined in setup.py, for running the application as well as supporting utilities. This enables us to pass arguments to the application at runtime so that different configurations and scenarios can be explored with ease. So here's how we set things up: first we pip install click, then we define the commands and invoke the commands. We've set things up such that individual commands are loaded from a commands directory into an implementation of Click's MultiCommand class. If we run the entry point defined in setup.py, we invoke the MultiCommand class, and we see that the subcommands listed correspond to the files in the directory. This allows commands to be lazily loaded, and any new commands can be added by simply dropping a new file into that directory. So here's an example of how we use Click function decorators to define a command as well as the options for that command. Next, we'll talk about how we do code quality and type checks with Black, Flake8, and MyPy. Black is an automated Python code formatter, and it follows PEP 8 style conventions. It aims to produce the smallest diffs possible, which means the fewest number of line changes. This tool adds productivity during development and during code reviews: developers can really focus on just the content being added. It's effective for onboarding and knowledge sharing, since consistently formatted code is more readable across many developers. And it also allows us to standardize and format existing repositories with one command; no manual reformatting is necessary. 
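A Click command defined with function decorators, in the style described above, might look like this. The ingest command and its options are hypothetical examples for illustration, not actual Photon commands:

```python
import click


@click.command()
@click.option("--source", default="ftp", help="Where to ingest photos from.")
@click.option("--limit", type=int, default=100, help="Maximum photos to process.")
def ingest(source: str, limit: int) -> None:
    """Ingest photos from a source (a hypothetical command, for illustration)."""
    click.echo(f"Ingesting up to {limit} photos from {source}")
```

In the setup the talk describes, a command like this would live as its own file in the commands directory and be picked up lazily by the MultiCommand entry point, so adding a new subcommand is just a matter of dropping in a new file.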
So, in order to use Black, we pip install Black and then we run it. Here's an example. We move the method call to two lines instead of one, we accidentally add a semicolon, and we change the double quotes to single quotes. When we run Black, we'll see that everything has been reformatted: the semicolon disappears, the single quotes change to double quotes, and the method call goes back to one line, since Black defaults to a max line length of 88 characters. Next, we use Flake8, also for code quality. Flake8 is style guide enforcement and is actually a wrapper around three other Python tools. This includes Pyflakes, which checks for unused imports, variables, et cetera; pycodestyle, which checks for PEP 8 style conventions, though we don't need to worry about that since we're already using Black; and McCabe, which checks for code complexity. This is another tool that really allows developers to just focus on the content of the code. So, in order to use Flake8, we pip install Flake8 and we run it with optional configuration. In this example, we'll see that we add an unused import, and then we remove a used import. Once we run Flake8, we'll see that we get the expected errors: there is an unused import, and there are also undefined names. When we switch back, we'll see that everything has been resolved. So, we learned a lot about typing and MyPy earlier as well. To recap, we use MyPy for type checks. MyPy is a static type checker, and it allows us to add type annotations and receive type hints if there are inconsistencies in what's being passed to a function or what's being returned. There's built-in Python support for type hints with the typing module, which supports types such as Any, Union, Optional, et cetera. It helps to identify common type-related bugs. So, in order to use MyPy, we pip install MyPy and we run MyPy with optional configuration. 
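As a small illustration of the kind of annotation MyPy checks, here is a hypothetical function (not from the Photon codebase) annotated to accept a Path and return a string:

```python
from pathlib import Path


def archive_image(image_path: Path) -> str:
    """Move an image into the archive; return the new path (hypothetical example)."""
    # real archiving logic omitted; we just build the destination path
    return f"archive/{image_path.name}"


# MyPy would reject archive_image("photo.jpg") with an incompatible-type
# error, because a str is not a Path. This call passes the check:
archived = archive_image(Path("incoming/photo.jpg"))
```

Running `mypy` over this file passes as written; change the argument to a plain string, or the return statement to a boolean, and MyPy reports the mismatch statically, before the code ever runs.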
Annotations are effectively ignored by the standard Python interpreter at runtime, so typing does not interfere when running the application. In this example, we change the parameter type from Path to string, and we get errors for incompatible types. Then we change the return statement to a boolean, and we'll see that we get errors, since no return value is expected. So here we go: we run it again, and once we change everything back, we'll see that MyPy has passed. In addition to running the three tools locally, Black, Flake8, and MyPy can be added as part of pre-commit hooks or as continuous integration build tests. For example, in our pipeline, a pull request must pass these standards in order to be merged in. So, this is a simple example of how we organize our applications. These are the server-based applications. We use the threading model pretty extensively; in effect, each thread becomes its own little microservice within the application. And it's kind of a split of input, process, output, which is an old model in computing and one worth remembering. Sharon's going to show in a little more detail how this is organized. So, another tool we use is Pluggy, for feature addition. Pluggy is a plugin management tool. It enables plugins to hook into program execution. Pluggy is used internally by pytest as well. Pluggy is particularly useful when defining divergent paths and when specific, and sometimes lengthy, logic is needed for particular use cases. It helps to avoid ever-expanding if-else conditions and allows conditional business logic to be modular and self-contained. It's also useful for defining sequential steps in a modular way. Pluggy is really great for productivity as well: developers can focus on just the one feature being added by only interacting with the relevant plugin. 
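A minimal sketch of this Pluggy pattern, using hypothetical image-transformation plugins; for brevity the plugins are registered directly here rather than loaded from a plugins directory as the talk describes:

```python
import pluggy

hookspec = pluggy.HookspecMarker("photon_demo")
hookimpl = pluggy.HookimplMarker("photon_demo")


class TransformSpec:
    """Hook specification: the contract every transformation plugin fulfills."""

    @hookspec
    def transform(self, image):
        """Apply a transformation and return the transformed image."""


class ResizePlugin:
    @hookimpl
    def transform(self, image):
        return f"resized({image})"


class SmoothPlugin:
    @hookimpl
    def transform(self, image):
        return f"smoothed({image})"


pm = pluggy.PluginManager("photon_demo")
pm.add_hookspecs(TransformSpec)
pm.register(ResizePlugin())
pm.register(SmoothPlugin())

# Calling the hook runs every registered implementation and returns
# a list of their results, one per plugin.
results = pm.hook.transform(image="photo.jpg")
```

Supporting a new transformation means writing one more small class with a `@hookimpl`-decorated method; removing one means deleting that class, with no if-else chain to touch.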
And supporting a new use case can be as simple as defining a new plugin. So, in order to use Pluggy, we pip install Pluggy, we import Pluggy, and we define a hook specification as well as hook implementations, which are the plugins. This is how we set things up for plugins. Similar to the commands directory that we looked at earlier, we define a plugin directory from which plugins are loaded. Function decorators are used to define the hook spec and the hook implementations. When the plugin manager's hook method is called, it invokes each of the implementations and returns a list of the return values from each plugin. In this example, we're using Pluggy to define separate image transformations, such as resizing or smoothing an image. A new transformation can be added by creating a new plugin class, specifying a function as the hook implementation, and dropping the file into the plugins directory. It can be removed just as simply by deleting that one file. More implementation details of how we use Pluggy are available within our demo repositories. So, moving on to a different topic, we're going to talk about thread-based fail-fast and start-fast approaches for a resilient runtime. We can always count on failure, so we must prepare for it and design our systems to bounce back, even in unstable times, even when external services are throwing errors. When we handle failures through retries, sometimes it's difficult to keep track of the state or bring the application back to a healthy state, and sometimes failures are masked. In addition, the robust libraries we use in our applications, such as the Google client libraries, tend to already have retry mechanisms built in, so we don't have to. The fail-fast approach works such that the app encounters an exception, logs the exception, and fails quickly. This allows the application to shut down and resume processing, while providing developers with instant visibility into errors. 
The Photon applications are set up to attempt up to 40 retries over 15 minutes before entering a fatal state. To accomplish this, we define an Event object for the threads to share. When a failure occurs, we set the event, and we use the event as a signal for all threads to exit. Then we use a process manager, Supervisor, to restart the application once it has exited. On the other side of things, we use start-fast. When we restart, we want all processing to start simultaneously. This eliminates differences in startup time between threads and prevents errors that may occur when all threads are not ready. To accomplish this, we define a Barrier object for all the threads to share. We specify the fixed number of threads to wait on, and reaching the barrier count signals all threads to release and begin processing simultaneously. We're going to go through a visual example of how start-fast and fail-fast work. First, we initialize the main thread, and we instantiate and initialize the modules. Then we start up the daemon threads, and we see that one thread is ready and waiting. Next, there are a few more threads waiting at the barrier, but we haven't reached the fixed number of threads yet. Now all the threads are waiting at the barrier, so they can all be released simultaneously. Now all threads are processing, and they're all waiting on a shared failure event. One thread encounters a failure; it propagates the failure and cleans up, setting the fail-fast event that all the other threads are waiting on. Now all the threads fail, and they clean up as well. This indicates that the application is ready to restart, and we're back at the beginning with a clean slate. That takes a few seconds; we're pretty quick on our restarts of our services. One interesting thing about these services, I think, is that we arrange them in a hierarchy, like this. 
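The start-fast and fail-fast mechanics walked through above can be sketched with Python's built-in threading primitives. This is a simplified toy (three threads, one forced failure), not the actual Photon implementation:

```python
import threading

NUM_WORKERS = 3
start_barrier = threading.Barrier(NUM_WORKERS)  # start-fast: all begin together
fail_event = threading.Event()                  # fail-fast: one failure stops all


def worker(name: str, should_fail: bool) -> None:
    # start-fast: block until every thread has reached the barrier,
    # then all threads are released and begin processing simultaneously
    start_barrier.wait()
    try:
        if should_fail:
            raise RuntimeError(f"{name} hit an unrecoverable error")
        # normal work loop: keep processing until someone signals failure
        while not fail_event.wait(timeout=0.01):
            pass  # a real worker would pull and process an image here
    except RuntimeError:
        # fail-fast: set the shared event so all other threads exit too
        fail_event.set()


threads = [
    threading.Thread(target=worker, args=(f"worker-{i}", i == 0), daemon=True)
    for i in range(NUM_WORKERS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All threads have now exited; at this point a process manager (the talk
# uses Supervisor) would restart the whole application with a clean slate.
```

Each thread waits at the barrier until the fixed count is reached, then all release at once; when one thread fails, the shared event brings every other thread down promptly so the process can exit and be restarted.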
In effect, we have a hierarchy of microservices. We consider our applications to be microservices, but internally, they're also organized as microservices. This allows us to make our applications either simple or relatively complex, yet self-contained, and lets developers focus on them. I'm going to abstract a little bit from that. I think an interesting thing I've run into in my career (actually, we're both very experienced; on average, we have about 25 years of professional experience) is this: if you consider complexity and think about component count, and I don't want to dive too deep into what complexity is or what the components are, but in general, if you draw a line of constant functionality, you want to build your application somewhere in that sweet spot. You don't want it to have too many components, you don't want it to be too complex, and you want to be in that sweet spot. The sweet spot is actually quite wide. When microservices first came out, I dove right in, and I think I built applications with way too many components. Since then, I've pulled back and looked at ways to simplify that architecture, and that's pulled me a little bit more to the left side of the graph. Now, we could, of course, describe this graph a little more rigorously. Oh, actually, those are Maxwell's equations; but then, you can describe everything with Maxwell's equations. Or it could all be summarized, as we all know, as the number 42, which is significant to me this year because it's my wife's and my anniversary, and so we went to IMAX. That's the end of that non sequitur. So, last, we're going to briefly touch on automated documentation with Sphinx. Sphinx generates documentation from code comments, parameters, types, et cetera. In order to use Sphinx, we pip install it, we go into the docs directory, and we run make html; then we can open the index page. 
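For a sense of what Sphinx consumes, here is a hypothetical function (not from the Photon codebase) with a field-list docstring of the kind that Sphinx's autodoc extension renders into HTML pages:

```python
def stamp_metadata(image_path: str, credit: str) -> str:
    """Stamp photographer credit onto an image (hypothetical example).

    :param image_path: path to the source image file
    :param credit: photographer credit to embed in the metadata
    :returns: the path of the stamped image
    """
    # real metadata stamping omitted; we just hand the path back
    return image_path
```

Given docstrings like this throughout the code, `make html` in the docs directory produces browsable pages describing each function's parameters, types, and return values.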
This can be done at the subproject level, but we also set up our development environment to generate docs from the top-level directory, such that documentation can be generated cohesively for multiple Photon applications, common libraries, and other utilities all at once. This is an example of how the resulting documentation looks. So, we comment our code pretty extensively so that we can produce this documentation. And this is after the donkey ride, relaxing in the Bedouin tent. So, we've made available three repositories, which provide examples of how we set things up for our applications. You can find them on GitHub and interact with them. If you have any questions, comments, or feedback, feel free to add an issue to a repository as you explore it. Also, we're hiring at the New York Times, so if you have any questions about that, feel free to find us and ask them. Thank you. Thanks. Thanks, Sharon and Michael. So, if you code Python, you too can be in an award-winning newsroom. Hey, we're going to swap over our presenters. So, this is our last talk for the day, but before we get started, I want to remind you that our government-mandated extra hour of sleep is tonight. Daylight saving does indeed finish this evening, which means that if you do not adjust your clocks, or your clocks do not automatically adjust for you, you will show up at this venue one hour before all of the organizers show up, and you'll be in a cold room with not much to do for an extra hour. So please use your government-mandated extra hour of sleep wisely and show up here at 10 a.m. Pacific standard time, not Pacific summer time. You knew what I meant well enough to correct me, so you didn't need to correct me. Okay. So, for our featured presentation: in 1989, our featured presenter wrote a programming language and named it after his favorite TV show. He's recently started writing a new parser for it, and he wanted to talk about it.