Good morning, good evening, or good night, depending on where you are. It's such an honor to be the first to open this series of outstanding talks here at Dapr Day. For the next 20 minutes or so, it will be my pleasure to tell you a story and share some tips about running Dapr in production applications. My name is Alex, and I am what Microsoft calls a Regional Director and a Most Valuable Professional. But don't be fooled into believing that I work for Microsoft; I most certainly do not. I am based in Romania, or Transylvania to be specific, and I run a company of highly talented software developers called Key to Get Solutions. Among other things, I'm also one of the Global Azure admins, part of a worldwide community of Azure experts who organize the yearly Global Azure event. If you're new to Azure, make sure to search for a Global Azure chapter near you. Or, if you're someone who loves to bring people together, put a pin on the Global Azure map and fellow Azure enthusiasts will come together at your location. Last year, Global Azure was organized across six continents and brought together over 483 speakers to deliver around 500 sessions, all during the same three days. Now that's what I call community. These are the people who deserve the kudos, friends; it's thanks to their hard work and passion that Global Azure is such a massive event and that Azure receives so much love from communities worldwide.

But enough about that. During this session, I'll tell you a story about an application which worked perfectly fine, and then we decided to make it not work: we decomposed it into microservices and we Daprized it, but it failed the moment we provisioned it in the pre-production environment. So here's the deal. The application was initially comprised of an API serving not one, not two, but four different UI layers targeted at different audiences. Additionally, there was a second API which powers an application with completely different logic that must share the same data set. And lastly, there are numerous background services which are either triggered by, or are polling, the same API to retrieve the information they use. For instance, there's a notification service which gets triggered each time something happens in the API: upon data entry, when data expires, and so on.

Now, what initially seemed like a good idea, because we would only have to write the authorization and logging logic once, quickly turned into a bit of a nightmare because of the highly volatile deployments and the many teams collaborating on the same code base. It was therefore kind of a no-brainer that we had to quickly go back to the drawing board and reconsider the design. Thankfully, we were already experienced enough with microservices as well as with Dapr, so enhancing the application with Dapr's capabilities was definitely a no-brainer. I kid you not, it took less than a sprint to get everything properly set up and redesigned with zero technical debt, on the architectural side of things at least.

But before we get our hands dirty, let's cut to the chase. Developing microservices should not be done just because monolithic applications are "the worst". In fact, they're not bad at all, despite the word monolith having gained a negative connotation. You know which apps I mean: those classic three-tier, four-tier, n-tier applications.
Specifically, one layer for presentation, like a web app, which consumes the output of another layer running the business logic, which we usually refer to as an API, which in turn consumes the output of yet another layer, the data layer. And here's the deal: these monolithic applications are not that bad at all. Seriously, we've got to give them credit for running mission-critical workloads for the past few decades. In all fairness, the world as we know it today was built on these kinds of applications for so many years. Of course, we didn't have Netflix and Spotify decades ago, but these were applications that were perfectly fine; they work and they get the job done.

What's more, microservices are anything but easy to develop. On one hand, the very idea of segregating work into single-responsibility APIs within so-called bounded contexts, and segregating data sets into independent data structures, is an extremely easy thing to say but an incredibly difficult thing to achieve. Even assuming we knew exactly how to develop microservices, we would still hit a few problems. For starters, if we implemented microservices but our basic understanding of the SOLID principles wasn't that strong, we would end up with a ton of integration points. You know what I mean: caching, messaging queues. These are all particularly fun to develop, because once you get your hands dirty with asynchronous and reactive programming you pretty much never want to stop, and you start seeing the whole application as a chain of events that can be written in an asynchronous style. This includes third-party APIs, secret repositories and whatnot, and they all must be integrated, and they are all different: different SDKs, different APIs. To make things even worse, each one of these technologies comes with its own SDK and its own specifics, which makes things really chaotic. Each one has its own release cadence and its own lifecycle; new features can improve your application, but they can also dramatically limit its functionality. And when you start integrating all of these different technologies, APIs and libraries, you quickly realize that you're writing boilerplate code at an exponentially higher rate than you typically did in the monolithic world. Due to the lack of the right tools, a feature that might take a sprint to develop the monolithic way might actually take several sprints to get properly integrated with the rest of the services in a microservices-based app.

So why did we even choose to split the API into smaller components? Well, here's the deal. Number one, we had way too many moving parts in a single project. We had way too many contributors to a single project. We had way too much code written, even for a monolithic application. We had way too many dependencies and zero clear ownership of responsibility, even though we had a lot of contributors. Effectively, we had a ton of friction.

Now, I trust most of you already know what Dapr is, because you're at Dapr Day. As you know, Dapr stands for Distributed Application Runtime. It is a portable, event-driven runtime that makes it easy for developers to build resilient, stateless and stateful microservices that run on the cloud and edge, and it embraces the diversity of languages and developer frameworks.
Dapr codifies the best practices for building microservice applications into open, independent building blocks that enable you to build portable applications with the language and framework of your choice. Each building block is independent, and you can use one, some, or even all of them inside your own app.

You've got to remember the goals Dapr has. Number one, enable developers using any language or framework to write distributed applications, and make it easy for them. Second, solve the hard problems developers face when building microservice applications by providing best-practice building blocks. Be community-driven, always open and vendor-neutral. Gain new contributors in an organic fashion. Provide consistency and portability through open APIs. Be platform-agnostic across cloud and edge, so on-premises as well. Embrace extensibility and provide pluggable components without any vendor lock-in whatsoever. Enable edge and IoT scenarios by being highly performant and very lightweight. And last but not least, be incrementally adoptable inside existing code, so there's no need to write greenfield applications to replace the ones that have been running real business for the past decade or so.

Now, what sold us on Dapr, you might ask? Well, first and foremost, all the big selling points you've heard before. But we also loved the ability to have pluggable technology stacks. For instance, in our dev environment we decided to use RabbitMQ, and we went with that, but we knew RabbitMQ wasn't necessarily the right fit for our production environment, and we didn't yet know which technology we were going to use for the message queues in production. So having the ability to swap components was simply blissful (more on what such a component definition looks like in a little while). Then, from an organizational standpoint, each team was responsible for a component, and we usually had smaller, heterogeneous teams, meaning each team member has expertise in a lot of different areas. Using Dapr also meant that we were decoupling things.

Okay, enough intros, let's get into the good stuff. Dapr's support for Kubernetes is aligned with the Kubernetes version skew policy. This might sound very technical and difficult, but here's the catch. In a highly available cluster, the newest and the oldest kube-apiserver instances must be within one minor version of each other. As for the kubelet, the agent that runs on each node in your cluster: the kubelet must never be newer than the kube-apiserver, and it may be up to three minor versions older than the kube-apiserver; the caveat is that on Kubernetes older than 1.25 it may only be up to two minor versions older. Similarly, kube-proxy must not be newer than the kube-apiserver, and kube-proxy, which as you know is responsible for the networking side of things, may be up to three minor versions older than the kube-apiserver. So three minor versions is kind of a magic number in Kubernetes, and it works much the same way with Dapr as well. Kube-proxy may also be up to three minor versions older or newer than the kubelet instances it runs alongside, and again, prior to 1.25 it may only be up to two minor versions older or newer than those kubelet instances.
The same principle applies to the kube-controller-manager, the kube-scheduler and the cloud-controller-manager: they must not be newer than the kube-apiserver instances they communicate with. Now, you're not expected to memorize all of these version numbers; the key takeaway is that Dapr follows the same version skew policy as Kubernetes itself.

However, you must keep in mind that resource settings are always just a starting point. The numbers you see here are good starting points, but in my opinion you should always perform your own testing to find the right values for your environment, because they might differ. When you're installing Dapr using Helm, one thing to keep in mind, and we learned this in the trenches, is that there are no default resource limits. Effectively, installing Dapr through Helm means Dapr could, in theory, cannibalize your entire infrastructure and your entire set of resources. It should never happen, but misconfigurations do occur. There are also some optional components, like placement, Sentry and the dashboard, which can be super useful: placement especially if you're running actors, and the dashboard if you want to get more operational things done.

You should always set resource assignments for the Dapr sidecar using the supported annotations: sidecar CPU limit, sidecar memory limit, sidecar CPU request and sidecar memory request. If they're not set, the Dapr sidecar runs without resource settings, which may lead to all sorts of issues (I'll show these annotations in a pod template sketch in a moment). For production-ready setups, it's strongly recommended that you configure them. Along the same lines, set soft memory limits on the Dapr sidecar. With a soft limit in place, the garbage collector frees up memory once usage exceeds the limit, instead of waiting for the heap to grow to double the live memory from its previous run. This may again sound very technical, but it's important: if you rely on the default behavior of Go's garbage collector, you can end up with OOM-kill events on your containers, and obviously on your pods as well.

Now, obviously we applied all of these production-ready deployment tips, and we went further to make sure we had mutual TLS (mTLS) enabled. Dapr has mTLS on by default, and you can learn more about bringing your own certificates. We also made sure that app-to-Dapr API token authentication was enabled, because this secures the Dapr API from unauthorized access; sometimes attacks come from within your own infrastructure. We also made sure that Dapr-to-app authentication was enabled for the API we wrote. This is also very important, because it lets your application verify that it's talking to an authorized Dapr runtime, using the same token-authentication principles. We had component secrets configured in a secret store, and we never hard-coded any secrets in our component YAML files. We also had the Dapr control plane installed in a dedicated namespace, for instance dapr-system, which is commonly used, rather than in the default namespace or the application namespaces. And Dapr also supports scoping components to certain applications; even though it's not required, it's a good practice.
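To make those last few points concrete, here's a minimal sketch of what such a component definition could look like: a RabbitMQ pub/sub component (the broker we used in dev) with its connection string pulled from a Kubernetes secret rather than hard-coded, and scoped to specific app IDs. The component name, secret name and app IDs are illustrative, and the exact metadata fields depend on the component and its version.

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: orders-pubsub          # illustrative component name
  namespace: production
spec:
  type: pubsub.rabbitmq        # swapping brokers later means changing this type and its metadata, not the app code
  version: v1
  metadata:
  - name: connectionString
    secretKeyRef:              # secret reference instead of a hard-coded value
      name: rabbitmq-secrets   # illustrative Kubernetes secret
      key: connection-string
auth:
  secretStore: kubernetes      # resolve secretKeyRef entries from the Kubernetes secret store
scopes:                        # only these app IDs may use the component
- orders-api
- notification-service
```

Swapping RabbitMQ for a different broker in production then comes down to replacing this one file; that's exactly the pluggability I mentioned earlier. Now, here's the deal.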
We did all of that, and this is good. Everything seemed fine, especially in the dev environment. But once the refactoring of our application was done and it was fully Daprized and production-ready, we deployed it to the pre-production environment, only to learn that no Dapr sidecar containers were being injected into the application pods. Now clearly, such a situation usually suggests there might be an error reported by the sidecar injector, right? Here's the log we saw. Reading through the logs, the only fishy part seemed to be a message along the lines of "unable to get the tekton-pipelines controller UID". It's a tongue twister, I'll give you that, and a bit of mumbo-jumbo even. I believe the very next thing we did was the typical thing anyone else would do as well, namely Google (on Bing) for "unable to get the tekton-pipelines controller UID", only to come across the first result of the query, a GitHub issue titled "Dapr sidecar not injecting into the pod". See how we didn't even mention Dapr in the query, and yet the first result referred to this very same problem?

Sure enough, the issue was closed, which usually is a good sign. As we scrolled all the way to the bottom of the issue (and it was a very long thread, with a lot of people blaming a lot of different culprits), we learned that, at least for the people who commented on that issue, an extra label turned out to be involved. Hmm, should we have had any additional labels? Everything was working in the dev environment, and clearly we were using the same labels everywhere, right? I mean, right? Time to compare and contrast. Here on the left-hand side, what you see is everything, in terms of logs, from the sidecar injector running in the dev environment. On the right-hand side, the sidecar injector logs from the pre-production environment. By the looks of it, the sidecar injector isn't even getting triggered to do its job. Either that, or we have another pre-Satya, typical-Microsoft situation where a product only logs successes and fails to even log errors. You know what I'm talking about.

So what should we do next? Well, time to compare some more; I trust everyone else would take the same approach. What you have here is the pod's description. On the left-hand side, the description clearly shows two containers in the pod. On the right-hand side, there's a single container in the pod. So yet again, this suggests that there is literally no sidecar container injected. Even more so, the missing labels are usually an indication that the sidecar injector failed to properly see the annotations and respond to them. This made us think: either the annotations aren't triggering the sidecar injector because the injector isn't reachable for some obscure reason we hadn't figured out yet, or the annotations are completely wrong. But the annotations seemed fine. Wait, what if the annotations were applied before Dapr was installed? As it turns out, if your pod spec template is annotated correctly and you still don't see the sidecar injected, you must make sure that Dapr was deployed to the cluster before your deployment or pods were deployed. If that's the case, usually restarting the pods will fix the issue.
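For reference, before we get to what we tried next, here is a minimal sketch of what a correctly annotated pod template looks like, including the sidecar resource annotations mentioned earlier. The app name, image and values are illustrative, and the GOMEMLIMIT line is my assumption of how you would pass a soft memory limit to the sidecar through the dapr.io/env annotation.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api                     # illustrative app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders-api
  template:
    metadata:
      labels:
        app: orders-api
      annotations:
        dapr.io/enabled: "true"        # tells the sidecar injector to act on this pod
        dapr.io/app-id: "orders-api"
        dapr.io/app-port: "8080"
        dapr.io/sidecar-cpu-request: "100m"
        dapr.io/sidecar-cpu-limit: "300m"
        dapr.io/sidecar-memory-request: "250Mi"
        dapr.io/sidecar-memory-limit: "1000Mi"
        dapr.io/env: "GOMEMLIMIT=900MiB"   # assumption: soft memory limit for the sidecar's Go garbage collector
    spec:
      containers:
      - name: orders-api
        image: example.azurecr.io/orders-api:1.0.0   # illustrative image
        ports:
        - containerPort: 8080
```

If everything is wired up correctly, describing the pod should show two containers: your app and the daprd sidecar. In our pre-production environment, we kept seeing just the one.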
So I guess it's time to reinstall. And you know what, let's even reinstall Dapr. So first, we removed all the charts related to our application. Then we reinstalled Dapr, just to be sure, and then installed the application again. And... nothing. The sidecars were still not injected. I guess it's time to Google some more. Aha, another issue, and it even says "on prod environment". We're getting closer, aren't we? The issue here suggests that a firewall might be preventing the sidecar injection. Of course, this makes sense. However, we didn't have such a firewall; we even checked on the agent nodes and the master, just to make sure. We even went as far as reinstalling everything from scratch again, and this included Dapr and all of its dependencies. And again, there was no error on the sidecar injector pod, which we would normally expect to tell us why something was failing with the injection. So it was not a firewall issue. Should we maybe blame the CNI? Weave, in this case. I should have told you this before, but this was a vanilla installation of Kubernetes: no distro, no flavor, no easy installation process whatsoever. It was built through the trenches. So let's reinstall everything, leave Weave out and replace it with Calico. And as you'd expect, still not working. All ports were properly configured, but still not working.

Now here's the deal. In Kubernetes, there's this concept of an admission controller, which intercepts requests to the Kubernetes API server prior to the persistence of the object, after the request has been authenticated and authorized. These controllers may be validating, they may be mutating, or they may be both. Mutating controllers can modify objects, and controllers can also validate a request to decide whether it should go through at all. Admission controllers can also block certain custom verbs, such as a request to connect to a pod through the API server proxy. They do not and cannot block requests to read objects, but they can block things from getting created or changed, which would be the case for the injection of the sidecar. They proceed in two phases: in the first phase, the mutating admission controllers run, and in the second phase, the validating admission controllers run. If a controller in either phase rejects the request, the entire request is rejected immediately and an error is returned to the end user. (I'll show a rough sketch of how a sidecar injector plugs into that mutating phase in a moment.)

Now frankly, all of this is getting me a bit anxious, because we're talking about some seriously deep, level-400 Kubernetes stuff here. Let's think about this once more. Could it be that the request is somehow getting blocked somewhere and never makes it to the sidecar injector? There's no firewall, so what could block the request? Hmm, maybe it's not really an admission configuration problem at all. But wait: Dapr, much like a service mesh, relies on Kubernetes to get its components triggered to do their work. I know what you're about to say: Dapr is not a service mesh. Even though they offer some overlapping capabilities, Dapr is not a service mesh, I know. A service mesh is, by definition, about networking, and unlike a service mesh, which is focused on networking components and concerns, Dapr is focused on providing building blocks that make it easy for developers to build applications and microservices. Cool. Dapr is developer-centric, whereas a service mesh is not. But even though conceptually they're different, the inner workings are, to a very large extent, the same. So you know what, let's go back to the docs and see if they can help, because, in all fairness, we should have read those docs first.
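Before we get there, here's the rough sketch I promised of how a sidecar injector plugs into that mutating admission phase. To be clear, this is illustrative and not Dapr's actual manifest; the names, path and failure policy are assumptions. The mechanism is the point: when a pod is created, the kube-apiserver has to call out to the injector's webhook service, and if that call never arrives, no sidecar gets injected.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: sidecar-injector-example        # illustrative name
webhooks:
- name: sidecar-injector.example.com    # illustrative webhook name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]              # fire when pods are created
    resources: ["pods"]
  clientConfig:
    service:                            # the kube-apiserver calls this service to mutate the pod spec
      namespace: dapr-system
      name: dapr-sidecar-injector
      path: "/mutate"
    caBundle: "<base64-encoded-ca>"     # placeholder
  failurePolicy: Ignore                 # if the call fails, the pod is still created, just without the sidecar
```

Note the failure policy: if it is set to Ignore, a blocked call to the injector fails silently and pods simply come up without the sidecar, which matches exactly what we were seeing. Anyway, back to the docs.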
Hey, wait, this is not the docs. This might be promising, though; let's crawl some more. Uh-huh. That didn't really help either, because it's all about things we had already done; there's nothing new in it. But it got me thinking: there's one place we hadn't looked into yet, namely the API server logs. Let's read through those. As it turns out, the API server logs had everything we were looking for. They told us the API server couldn't call the sidecar injector, and that it was getting a 403 back. After some detailed network troubleshooting, we even learned that the 403 was actually generated by a proxy which the request went through. This was because an HTTP proxy environment variable had been picked up from the host when Kubernetes was installed, and the host's environment variables had been persisted into the API server pod. Thus, the internal requests were going out to the proxy, which had no idea how to route them back to the sidecar injector, and there was no route back whatsoever. Sure enough, removing that extra environment variable, which we discovered by running kubectl describe pod, and restarting the API server got us back in business. It may have taken me only the past five or ten minutes to talk about this, but the level of frustration we accumulated, given the lack of insight into someone else's infrastructure, was almost too high to put into words. What did we learn, though? API server logs are priceless, and people always forget to look at them, expecting all the answers to come from other places.

Since we've talked about logs: Dapr uses the OpenTelemetry and Zipkin protocols for distributed traces. OpenTelemetry is the industry standard and is the recommended trace protocol to use. Most observability tools support OTel, and this includes Google Cloud Operations, AWS X-Ray, Azure Monitor, New Relic, Datadog, Zipkin, Jaeger, SignalFx, and many more. The diagram you see here demonstrates how Dapr, using the OpenTelemetry and Zipkin protocols, integrates with multiple observability tools. These scenarios can be used for so many different things, like tracing service invocation and the pub/sub APIs, and you can flow trace context between services that use these APIs. There are commonly two scenarios for how tracing is used: either Dapr generates the trace context and propagates it to another service, or you generate the trace context yourself and Dapr propagates it to a service. Either approach works; what matters is that the trace context gets propagated.
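To make that concrete, here is a minimal sketch of a Dapr tracing configuration, assuming a Zipkin-compatible collector is reachable inside the cluster; the configuration name and endpoint address are illustrative, and on recent Dapr versions an OpenTelemetry collector endpoint can be configured in the same tracing section instead.

```yaml
apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
  name: tracing                 # illustrative configuration name
spec:
  tracing:
    samplingRate: "1"           # trace everything; dial this down for busy production systems
    zipkin:
      endpointAddress: "http://zipkin.observability.svc.cluster.local:9411/api/v2/spans"   # illustrative endpoint
```

Applications then opt in by referencing this configuration through the dapr.io/config annotation on their pod template.

Now, if you have any questions, I'll be more than happy to take them, and I'll be here to assist. This was definitely something super, super interesting, at least in my opinion, to learn by going through the trenches. Friends, I'll be hanging around to answer questions, but please make sure to also join the Discord channel. Dapr's community is super wide and super open, and they're more than happy to assist with whatever questions you might have. Happy Dapr Day, and let's keep the conversation going. As I told you, there's an active community on Discord supporting you, helping you and listening to your pain points, requirements and ideas, whatever they are, so I urge you to join Dapr's community on Discord. Thank you all for streaming in, and happy Dapr Day. Thank you.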