Welcome to Community Versus Enterprise Open Source: Which Is Right for Your Business? My name is Lindsay Hooper, and I'm one of the conference organizers and your moderator for this webinar. Today, we're gonna explore patterns in enterprise edition add-ons, look at some concrete examples such as Confluent versus Kafka, and ensure that you are better prepared to decide where you spend your open source license dollars. I'm here with Justin Reock, chief architect at OpenLogic by Perforce. Justin has over 20 years' experience working in various software roles and is an outspoken free software evangelist delivering enterprise solutions and community education on databases, integration work, architecture, and technical leadership. So welcome to you, Justin. With that, I'm gonna hand it off to Justin. Take it away. Thanks so much. So I work for a company called OpenLogic, owned by our parent company, Perforce. And just so that you understand some of my background, we mostly work in helping enterprises adopt and consume free software, open source software, mostly software from the community. We provide support services for a very wide range of technologies, mostly stuff you've heard of, mainstream technologies. And because of that, we've gotten a chance to get a good topographical view of some of the patterns that are emerging around businesses in their consumption and use of open source. So that's the closest thing to a sales pitch you will hear from me in this talk. What are we discussing today? Businesses really do differ in the way that they consume open source, and that's because businesses have specific needs. Obviously, businesses in various verticals have specific concerns: security, privacy, things like that. Other businesses have critical systems with a real need for high-nines uptime. And because of that, businesses are gonna change the way that they consume open source.
And so we've seen some opportunistically emerging patterns in the industry that have risen to meet some of these demands. I think it's fair to say that open source is still a fragmented space. There's a lot of good work being done making sure people understand roles and things like that, but we still see different patterns emerge. And one of these patterns is what we call an open core pattern, where we have a community open source project, and a company will come and brand that code, or provide add-ons to that code, as part of an enterprise edition or commercial version of that code. So ideally, a vendor is gonna be able to benefit from these patterns of open development that we know work very, very well, while hopefully and presumably providing some real value back to the community. What we really look for, in terms of a healthy relationship between a community and an enterprise company, is this kind of non-zero-sum symbiosis. We're a little troubled by some companies that maybe don't give back as much, and we're delighted to see companies that really do. Certain companies, I think, very much succeed in establishing this, while there are certainly companies that have some opportunities for improvement too. In these patterns themselves, though, when we look at the products that are being offered, there are some very distinct patterns that we see emerging. In other words, when a company comes along and offers an enterprise edition of their software, they're usually adding on one of six overall categories of benefits that we'll see in a second.
And I wanna point out, too, that in this presentation we're gonna be looking at pros and cons, we're gonna be looking at some of the concrete value offered by these enterprise companies, but in all of these cases, I really hope it's understood that there are gonna be inherent benefits to running the community upstream versions of these: things like freedom from license cost obligations and freedom from lock-in. So although our company is very much on the side of the community, and we really try to encourage people to use upstream community releases of things, this presentation is very much meant to be an unbiased look at what some of these enterprise companies are offering. And what we see are usually offerings in one of the categories that you see in front of you here. We see the addition of some sort of HA solution, or maybe a DR solution, some additional DR tooling. We see a bit of enhanced security functionality offered as an add-on, or something that helps with interoperability. We see a lot of this: add-ons that actually help whatever that technology is replace another piece of technology, or work well with another piece of technology. Obviously, performance is one that we'll see, and monitoring and deployment management. So these categories are important; they're gonna guide the rest of this discussion. We will see how the community provides these particular bits of functionality versus the enterprise equivalents. Specifically, the comparisons we're gonna look at today are: Postgres versus EDB. I thought I would just get that one out of the way first, and we'll take what is hopefully a good, unbiased look at what the community upstream of Postgres offers versus EDB, who I'm a big fan of, by the way. Cassandra versus DataStax. We'll also look at Kafka versus Confluent, as we mentioned in the intro. ELK versus Elastic.co.
We'll look at Puppet versus Puppet Enterprise, and then we'll look at Jenkins versus CloudBees. Okay, and again, this isn't really meant to take one stand or another. I'm very open about my philosophies regarding free software, very much a believer in the community upstream model, but I don't want this to come across as me saying one is bad and the other is good. Hopefully I'm just laying out some objective details of the differences between these technologies. All right, so let's dive into looking at Postgres versus EDB. Again, our high-level feature comparison covers the six categories that we mentioned. HA for EDB and Postgres: both obviously available. With EDB, you get that out of the box, whereas you need some third-party tooling to get it with Postgres community upstream. Very similar with DR. We do have additional security functionality that comes with the EDB solution, but of course, as we all know and love about Postgres, it is highly secure, and we get a lot of security through just upstream Postgres. One wonderful feature of EDB that is not currently matched in the community is EDB's native support for Oracle code right out of the box. This is an excellent feature. We are very much about encouraging folks to move away from some of the larger database providers and move to solutions like Postgres, and whatever helps people do that is, in my mind, a very good thing. So cheers, definitely, to EDB on what they've done with that work. There's good performance optimization out of the box with EDB, and of course, as we all know if we've administered Postgres, there's plenty of good tooling for achieving really good performance from the upstream edition. And then for monitoring, deployment management, and observability, the Postgres Enterprise Manager helps us with that in the EDB solution, but we do have a lot of good third-party solutions for monitoring and deployment management.
I'm not gonna read off all of these; we really don't have time, but I'm gonna hit some of the highlights here. At the very end of this talk there's a QR code to pull down these slides, so you may of course have them, and you're welcome to reference these things. I will say that one of EDB's most compelling features is the Postgres Enterprise Manager. It is the best-in-class tooling right now if you're gonna manage wide distributions of Postgres: a very wide feature set, well documented on that feature comparison slide that you saw before, which I've linked to here at the bottom. Similar with DR: the EDB Postgres Enterprise Manager gives us a nice HA/DR solution here. And EDB gives us benchmark standards for security, with things like FIPS compliance, the ability to do row-level access control, and some additional SQL injection protection. Hopefully you've managed this at other layers, but you do have it here. In terms of performance optimization from EDB, we get nice query optimizer hints: very fine-grained control over the way the query engine plans things and interacts with the database. There are nice metrics available to us at the database session level, which is very helpful. Anybody who's debugged widespread client issues with a database knows how helpful it is to be able to get session-level metrics easily; not that you can't get them from Postgres. Canned and calculated metrics give us nice visibility into the health of the database engine. So there's a theme of accessibility in what EDB provides, along with this nice support for Oracle out of the box. Is there anything here that you couldn't do with something third-party? No; minus the Oracle support, we admittedly don't have something like that upstream in the community yet.
But beyond that, mostly you can do this stuff with third-party tooling; it's just very convenient, very nice, and well put together with EnterpriseDB. All right, so let's talk about what we can get from the community upstream release instead. We do have numerous options, but among the most common ones we're seeing people use is Bucardo, which is a nice multi-master solution. Bucardo actually is, as a lot of you probably know, a community in and of itself that provides a lot of different tools beyond just the HA tool set. Of course, Pgpool is still here and works great. Patroni is not on this list, but we've seen a lot of folks recently interested in or actively adopting Patroni. That's a very interesting solution; I recommend you look into it. Don't be thrown if you see so many references to Kubernetes: it was definitely envisioned for running Postgres inside of a Kubernetes environment, but you do not have to. The mechanisms actually work just fine outside of Kubernetes as well, so it's worth taking a look. For DR, we have WAL streaming replication. The active-passive pattern is very useful for true DR environments. We know that the write-ahead streaming is happening pre-commit, so it's very safe, and the streaming model is better than taking big, massive, monolithic backups and things like that. And of course, we can always script backup and restore, and we can DR our storage volumes. We've evolved many, many ways of dealing with DR in the Postgres community edition. For enhanced security, I really wanted to give a nod to sepgsql. This is a great project that I'm pleased to see folded into top-level postgresql.org. SELinux is a project that got a bit of an unfair start. It was sort of not accessible, and I think a lot of people just got in the habit of turning it off. Fast forward to now, and there are a lot of good reasons to turn it back on.
There's a lot of really good tooling that makes it a lot easier to deal with. And sepgsql allows us to extend SELinux semantics and labeling into table-level resources in Postgres, which helps us further unify our host security. So it's a really interesting project. Also pgAudit, which is very good: more verbose audit detail than we get out of standard Postgres session logging. Is this everything? No, there are lots of different security libraries, but these two I wanted to call out specifically. Of course, performance optimization is a big subject in Postgres, beyond our normal layers for optimization, so network, storage, those types of things. We do have some great tooling available from the community as well. If you've not used pgBadger, go grab it. This is a great tool. You can turn up the query logging in your Postgres logging, and it will ingest all of those queries and give you a beautiful HTML dashboard that tells you everything you need to know about how those queries performed. It'll give you clues and hints as to where you need to start making changes or optimizations in your queries or in your system. Whenever we go in to do performance optimization for one of our customers, this tool is in our tool set. We also have, as you know if you're familiar with Postgres, a lot of optimization settings, like checkpoint segments, work memory settings, and things like that, that are just part of Postgres's standard config. So obviously they exist in EnterpriseDB, but they're right there in upstream Postgres as well, as is the powerful, powerful EXPLAIN ANALYZE functionality that we get in upstream Postgres too. For monitoring, there's lots of good monitoring now. It's a great time for free software monitoring solutions. Prometheus and Grafana, I think, have really been a little bit meteoric. They work great with Postgres, pulling pg_catalog details.
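To make the pgBadger idea concrete: pgBadger itself is a Perl tool, but the kind of aggregation it performs over Postgres query logs can be illustrated with a small Python sketch. This is a toy, not pgBadger's actual parser; the log line shape is the `duration: ... ms  statement: ...` format Postgres emits when statement duration logging is turned up, and the function name is made up for this example.

```python
import re

# Matches lines Postgres emits with statement duration logging enabled,
# e.g. "LOG:  duration: 123.456 ms  statement: SELECT * FROM users"
DURATION_RE = re.compile(r"duration: ([\d.]+) ms\s+statement: (.*)")

def slowest_statements(log_lines, top=3):
    """Sum total time per statement text and return the worst offenders."""
    totals = {}
    for line in log_lines:
        m = DURATION_RE.search(line)
        if m:
            ms, stmt = float(m.group(1)), m.group(2).strip()
            totals[stmt] = totals.get(stmt, 0.0) + ms
    # Highest cumulative duration first, like a pgBadger "top queries" table.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]

sample = [
    "LOG:  duration: 250.0 ms  statement: SELECT * FROM orders",
    "LOG:  duration: 10.0 ms  statement: SELECT 1",
    "LOG:  duration: 300.0 ms  statement: SELECT * FROM orders",
]
print(slowest_statements(sample))
# [('SELECT * FROM orders', 550.0), ('SELECT 1', 10.0)]
```

pgBadger does far more than this (normalizing query text, histograms, the HTML dashboard), but the core loop is the same: parse durations out of the log, aggregate per statement, rank.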
There's already an exporter for Postgres that's officially supported by the Prometheus community and works very, very well. Grafana, obviously, is very good at visualizing a lot of those metrics. And we still have plenty of folks using Nagios and Zabbix; I hesitate to call them legacy, but I think they are starting to move in that direction when we look at what time-series metric gathering is doing for us now. And then, of course, for our deployment management, there are so many great things. We're seeing, obviously, a lot of Ansible, with many, many Ansible modules for Postgres; obviously not the only way to deploy, but certainly one thing worth illustrating there. So really, I look at EnterpriseDB as a turnkey database solution. It really comes down to what you get out of the box: you have solutions for HA, DR, security, and monitoring that are just included. We do have similar solutions for Postgres, but in the form of those third-party solutions. EDB gives a lot back to the Postgres community at large. I very much consider EDB to be one of the open core or enterprise open source companies that does it right. I do feel like they have a very non-zero-sum relationship with the community at large, and I always wanna make sure that I bring that up, because not every company does that, as we know. I just wanna point to where we sit in the DB-Engines rankings. You can go back and review the algorithm if you'd like to challenge what you see on the next slide, but I think it is interesting the way that these metrics are gathered: basically a way of trying to determine the popularity of various database engines. And we do see here that Postgres is number four on this list in the DB-Engines rankings, just under Microsoft SQL Server. I'm hoping to see that change. Then Oracle and MySQL, of course; I would love to see those numbers change too. And EDB is sitting down at 104.
Now listen, it's a commercial solution, and it's a niche solution. And to EDB's credit, with the exception of the Oracle compatibility layer, they do a very good job of being seamless with the code that comes out of Postgres. But you have to pay for it, so it's not a lot of surprise to see it further down on the list. I just wanted to give some perspective there. All right, continuing on: Cassandra versus DataStax. Again, looking at our categories, you see a lot of stuff just supported out of the box with Cassandra around HA and DR. That's not surprising, because if you really look at one of Cassandra's goals, it was to be just bulletproof: to be able to take so much ingress in a really nice, widely distributed way. So it's not surprising that it's a highly, highly resilient technology; it was very much built that way. DataStax Enterprise does give you some additional security out of the box, which we'll look at. You do have some additional platform interoperability that comes out of the DataStax solution, such as some enhanced Spark integration. Again, you can use plain top-level Apache Spark with Cassandra and it works wonderfully, but this does give you a bit more of a runway for it. For performance optimization, there's in-memory processing, and DataStax Enterprise claims to be faster out of the box, and it has some very compelling data to show that. Now, with Cassandra, of course performance is important, but it achieves its ingress heaviness through horizontal scale. So there's a good argument that any bang for the buck you get from in-memory processing you could also get from horizontal scaling, depending on what patterns you're trying to implement. And again, hopefully you're using this as Cassandra was intended: as a very resilient, ingress- and write-heavy NoSQL solution.
And then for monitoring and deployment, we get OpsCenter from DataStax, and of course third-party options are available to us for monitoring and performance optimization from the community. There's a full comparison available here as well. So looking at the DataStax side, you really get some more granular patterns in HA. You get to be able to tell Cassandra, not only do I want replicas of these nodes, but I want a specific number of replicas to live in this data center. So I can get a little bit more granular in the way that I configure my replication. There is some difference in the immediate consistency that's available: it's decided per write, as opposed to a blanket setting for the entire data set. In other words, when I'm a client writing to DataStax Enterprise, I can say that I want this one write to achieve consistency this way, versus with upstream community Apache Cassandra, where you have to make it a blanket setting for the whole data set coming into the system. So you get more granularity there, and there is a good explanation of this available here. There's some additional security functionality offered out of the box with the Enterprise edition: encryption at rest, which is very important to a lot of people, although I am a big believer that by the time your data has gotten to a database, it should already be encrypted by the application. That's my opinion; we can argue about it another time. And in-flight encryption as well. There's out-of-the-box authentication for LDAP/AD and Kerberos, plus row-level access control and, of course, auditing, which is very helpful. And then the DataStax Enterprise Graph project allows real graph concepts inside of the Enterprise edition, which is helpful for doing any type of interoperability with something that needs more of a graph data model.
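The consistency idea above is worth making concrete. For a replication factor RF, a quorum-style level needs a majority of replicas (RF // 2 + 1) to acknowledge a write. Here's a toy Python sketch of that arithmetic; the function names and level set are invented for illustration and are not the Cassandra driver API.

```python
def required_acks(consistency, replication_factor):
    """Replica acknowledgements needed for a write at a given consistency level."""
    if consistency == "ONE":
        return 1
    if consistency == "QUORUM":
        return replication_factor // 2 + 1   # majority of replicas
    if consistency == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {consistency}")

def write_succeeds(consistency, replication_factor, live_replicas):
    """A write is acknowledged only if enough replicas are up to meet the level."""
    return live_replicas >= required_acks(consistency, replication_factor)

# RF=3: QUORUM needs 2 of 3 replicas, so one node can be down...
assert write_succeeds("QUORUM", 3, 2)
# ...but not two:
assert not write_succeeds("QUORUM", 3, 1)
# A relaxed choice of ONE for that same write still succeeds:
assert write_succeeds("ONE", 3, 1)
```

This is the trade the speaker is describing: choosing the level per write lets a client decide, write by write, where it wants to sit between durability guarantees and availability, rather than fixing one answer for the whole data set.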
And then we have lots of connectors that are built into the development tools here, including integration with Docker and Kafka right out of the box. So this is helpful; the Kafka integration especially works well with the DataStax Enterprise solution. There are claims of two to four times better performance, and on paper, that CPU optimization should perform at that level, if you factor out poor performance in other areas of the database. There's better ownership and affinity, basically, and so theoretically the CPU should perform a lower number of context switches when it's working with data sets. The claims do seem to prove out. It's basically a trick of allowing processors to spend time working on grouped or similar data, so that they're doing similar actions and not context switching as much. OpsCenter is a good monitoring solution, although, as is gonna be the case in all of these, you really have to wonder: do we need another monitoring solution when we have centralized monitoring, when we have things like Prometheus, like we just talked about? We also have OpsCenter for deploying and managing DataStax Enterprise clusters. So then on the community side, how do we answer some of those same concerns? Well, I don't wanna harp on this too much, but again, Cassandra really was designed for HA and DR. Through the snitch and gossip protocols, we're already doing a good job of having nice multi-data-center redundancy. DR works the same way. We have all these options that allow different consistency behaviors and different replication behaviors, and, at the cost of performance, you have the ability to tweak those settings and make the system as DR-ready as you need it to be. And then, of course, everything else that you already know about DR applies: availability zones, replicated storage, all those things. There are some nice third-party solutions for enhanced security.
You can encrypt on write and decrypt on read for encryption at rest. Again, I still think it's bad practice to ask the database to encrypt in the first place; I think it should be an application concern. Third-party solutions are available for a lot of authentication engines, so if you wanna hook Cassandra up to LDAP, Kerberos, those types of things, there are third-party solutions available. RBAC is just native; row-level access control is only included in DataStax Enterprise, so that is a consideration. And as of Cassandra 4.0, auditing is just standard out of the box. We've got lots of ways to interact with Cassandra. One of the best ways is to use something like Apache Camel or Apache Kafka to stream data in and out of the system, or Camel if you wanna build more sophisticated enterprise integration patterns into Cassandra. It doesn't have graph or document support, but, talking a little bit about that fragmentation that we see in open source, I think that's a good thing: Cassandra is doing a good job of being what it is. Cassandra itself is written in Java, so when we talk performance optimization, we've got all of our standard tooling: JVisualVM, Flight Recorder, these things. And of course, there's no replacement for a good data model. There are lots of monitoring solutions; I'm not gonna go too deep into these again, as they're much the same as we saw before. Just note that Cassandra is written in Java, so there are lots of JMX metrics available to you, which makes it very easy to use the Prometheus JMX exporter to export those metrics and get them right into a Prometheus-based monitoring solution. We do have a lot of good tools for managing Cassandra deployment as well. We have Reaper for health management of Cassandra, which will go and find nodes that are broken, bring them down, and bring them back up. We have a good Ansible role as well to help with our deployment. Cassandra deploys in a very logical way.
You have nodes that are arranged into racks, and those racks are arranged into data centers, logically, and it lends itself really well to using deployment automation tools like Ansible. I would argue that, because of the number of nodes you need if you're using it for the right reasons, you can't really do Cassandra right without deployment automation. So there's some real value here: for the right customer, DataStax does provide real value for those who want a solution that just works off the shelf, but almost all of this functionality could be replicated with community open source projects. If you want extreme fine-tuning, you will benefit from the Enterprise edition's consistency options. And OpsCenter provides a lot; it does a lot. And again, DataStax, like EDB, is a company that I think really does open source the right way. And it's one thing I really like about DataStax Enterprise: they provide an awful lot back to the community, and not just code. They provide a lot of free knowledge. They have a lot of really good Cassandra training sessions that work just fine on the community edition of Cassandra. The last time I checked, there's about eight hours' worth of good training information that's totally free and will help you use just community Cassandra. So again, I do like the way that they do things. The DB-Engines rankings put Cassandra at number 10, with DataStax down at number 43. Okay, questions before we move into Kafka versus Confluent? I'm gonna have to pick up the pace a little bit here. Yes, we have a question. All right. Doug says: this could possibly be a question for any section, but what about the offerings from AWS, Azure, et cetera? That is a long conversation, and if we have some time at the end of this, I would like to delve into it. It's a really good point, and it's part of a big discussion right now.
If we take a piece of free software, and a cloud provider takes that code and figures out their own means of deployment to run that open source as a service, but is also housing that code behind DRM, behind litigious blocks, have they made that code not free according to Free Software Foundation standards? I'm not gonna answer that right now, but that is the way it's being thought about. It's a great question, and if we have some time at the end, we can talk about it some more. So, Kafka versus Confluent. Okay, this is one of the ones where you're gonna see, I think, some of the most similarity between what you get from Confluent versus what you get from the community. You see a full comparison here from Hortonworks. You do have HA provided out of the box with both Kafka and Confluent; again, not surprising, as Kafka was designed to be highly available. We actually have improved DR in Kafka with some third-party community tooling that's been made available, versus what you get with Confluent, at least right now; I'm sure that situation will improve. There's basic security out of the box for both of these. It's messaging, it's streaming technology, and for most of these types of technologies, not just Kafka but other ones like RabbitMQ, ActiveMQ, any of these MQs, as we know, federation is often more important than security, at least in the design principles. I'm not saying that's right; it's just what ends up happening. For platform interoperability, Confluent does have a series of adapters that they provide with their solution that make it helpful to integrate Kafka, and we do have third-party clients and connectors available from the community as well. So those are comparable. There's really good performance in both of these editions; the messaging core is the same. And then there's monitoring and deployment out of the box for Confluent versus third-party solutions for Kafka, similar to what we've seen.
The Confluent Control Center is kind of the flagship, and it can help manage a Kafka cluster's health, but again, Kafka really was designed to be healthy and highly available in the first place. Similar with DR: the Confluent Replicator does provide a streaming replication function, but likewise, Kafka has built-in replication strategies and was already designed for distributed data integrity at scale. We do have ACLs in Confluent that can be mapped directly to AD and LDAP groups, which is helpful for enhanced security. And Confluent Hub is where developers can publish various connectors into Kafka; for instance, we have a REST and an MQTT proxy that help keep things interoperable with other platforms. For performance optimization, Confluent does ship with an interesting feature called the auto data balancer. This makes sure that when new subscribers or producers come into a Kafka broker network, partitions are rebalanced according to those fluctuating numbers of data consumers. Anybody who's ever worked with messaging before knows how important it is to be able to achieve that balancing when our consumers change. The Control Center provides some monitoring specific to Kafka, but I'd make the same point there as I always do: do you really need another monitor? And then the Confluent Operator, which is a Kubernetes operator, is going to automate the deployment of Kafka inside of Kubernetes, which is already possible with existing community tooling and Ansible. This may have actually just released; I have to go check. But I know it's either coming very soon or it's already here. On the community side, again, Kafka's HA was meant to be distributed: it uses ZooKeeper for quorum election routines, and ZooKeeper is also open source free software.
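The rebalancing idea mentioned a moment ago, spreading a topic's partitions across however many consumers are currently alive, can be sketched in a few lines of Python. This is just the round-robin concept, not Confluent's auto data balancer or Kafka's actual group protocol, and the names are invented for illustration.

```python
def assign_partitions(partitions, consumers):
    """Round-robin a topic's partitions across the live consumers, the way a
    rebalance re-spreads load when consumers join or leave the group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

parts = list(range(6))
# Two consumers: each takes half the partitions.
assert assign_partitions(parts, ["c1", "c2"]) == {"c1": [0, 2, 4], "c2": [1, 3, 5]}
# A third consumer joins; a rebalance spreads the same partitions across three.
assert assign_partitions(parts, ["c1", "c2", "c3"]) == {
    "c1": [0, 3], "c2": [1, 4], "c3": [2, 5]
}
```

The point of the sketch is the before/after: the partition set never changes, only its ownership, which is why consumer churn is the event that triggers a rebalance.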
DR is achieved in Kafka through some third-party tools like MirrorMaker, which was kind of the de facto tool until Uber, of all people, came along and released uReplicator, which is a free and open source replication solution that's definitely on par with Confluent Replicator. So it's a very, very good replication solution; it's good enough for Uber, and they've made it free and open source. We have some enhanced security: you do have RBAC, auditing, and encryption out of the box. Again, I'd argue about whether a messaging system should care about encryption. LDAP and AD support is provided through native Java security. Again, Kafka is Java, and so Java's built-in security functionality is still at play; specifically, we can implement the LoginModule interface, which, if you've worked much with extending Java security, should look familiar to you. For platform interoperability, we have all these nice connectors, and that's all well and good, but Kafka integrates seamlessly with Apache Camel, and once you have that, you're good. Just go and look at the number of Camel components that are available, the hundreds of Camel components that are available to you. It's a fully normalized message routing system, and it integrates seamlessly with Kafka. So once you've got this, Kafka can integrate with really anything, down to the raw TCP packets that it needs to, using Netty. For performance optimization, Kafka was built with performance in mind and written in Java, so all the standard tooling applies there as well. For monitoring, it's the same conversation: Prometheus and Grafana are becoming ubiquitous. In fact, I will say with Kafka, we're starting to see these together almost always now. It just makes sense, because it's easy to expose the JMX metrics out of Kafka using the Prometheus JMX exporter, and very easy to pull those into Prometheus. Deployment is well supported with Ansible, Kubernetes, and other platforms like Chef, Puppet, and Salt.
Kafka works really well inside of Kubernetes, versus other messaging platforms, which don't usually do as well inside of containers; Kafka actually does very well. So there are lots and lots of ways to do your deployment management with community features. I would say that Confluent is really mostly about commercial support. When we talk to folks who have bought the product, and I'm generally not gonna make a blanket statement, but in general we are seeing that people are mostly just interested in getting the support, which is great support. Confluent is a company that was built by the original inventors of Kafka when they were working at LinkedIn, so you're not gonna get much better support than that. Kafka is enterprise-ready already, and organizations that seek enterprise support are gonna look at Confluent as an organization that has the right, relevant experience. So if you really want a zero-touch streaming platform, Confluent could do that for you, but again, we mostly see people having an interest in the support, not so much the features. ELK versus Elastic.co. So first, make sure that we understand that there are a lot of "elastic" companies out there. When you go searching for the commercial arm of Elastic, you may come across elastic.io; that is not the same thing. Elastic.co is the commercial edition of Elastic we're talking about here. The community edition, the ELK stack, as you've seen, is separated into three distinct technologies; I know this is review for some of you. The community edition is three separate products: Elasticsearch, which allows that searching of key-value data and log data, and the Logstash processor, to actually ingest log data and send it into Elasticsearch.
And then Kibana, to do our visualization of log data, similar to Grafana but really geared toward text-based searches and visualizing in a way that makes more sense for looking at log data. In fact, you'll often see people now just building iframes for Kibana right into their Grafana dashboards; they're just using both. The commercial Elastic.co product actually just takes all of those and makes them into a single stack.

So out of the box, from the commercial edition of Elastic.co, you get your HA, DR, and enhanced security. Platform interoperability: these are text logs, so that wasn't even really necessary, I don't think. Whether you can access the logs through whatever mechanism isn't really gonna have much to do with Elastic; it's gonna have more to do with the way you set up your storage volumes. Performance optimization: you do get some additional scaling out of the box with the commercial edition, not really included so much in the ELK community edition. In fact, pipelines in Logstash are starting to help with this problem, but as we'll see, that source code is in sort of a strange state right now, so it's hard to tell if it's really open or not. We'll look at that.

And then monitoring and deployment management: out of the box with the commercial edition, versus third-party tools from the community. We have a better master election process with Elastic.co, which tends to work a little bit better, but again, that begs the question of whether you should even be using the software to manage the masters in this way, given that these are basically scripts. There's cross-cluster replication provided by X-Pack for DR, and we have the ability to do frozen indices so that we get nice long-term storage of logging data if we need it. So that's helpful. A lot of additional security is provided by Elastic.co.
So RBAC, IP whitelisting, LDAP, AD, SSO, and FIPS support, all through the X-Pack Shield plugin, which I really think is probably the most compelling feature. I know people really love pipelining, and I don't blame you (it's cool), but there are other ways you can achieve that pattern. This security, though, is stuff that's actually really hard to do by hand in the community edition, so this is very helpful. Then we have SQL APIs for interaction with Elasticsearch. We do have a larger quantity of backend data sources that we can use as well if we don't want to use Elasticsearch, so you can use things like Cockroach or Azure Storage if you'd rather. Query profiling in Elastic.co allows for more detail during queries. And allegedly, machine learning is now being used to identify the types of application problems coming into Elasticsearch, so it's sort of providing some taxonomy, which we all know AI is good at doing. Monitoring: again, additional stuff is provided by X-Pack, but you probably have a single monitoring solution already. And you do get a nice upgrade assistant for deployment management, as well as centralized management of Logstash and Beats at the higher tiers of the product.

On the community side, we have excellent HA options out of the box for Elasticsearch already; it's just built to be distributed, so it already works pretty well that way. Logstash and Kibana are highly portable, so it's easy to stick them in a container or whatnot. Each individual technology has DR capabilities through third-party solutions. On enhanced security, X-Pack does provide very comprehensive security for ELK, but I will say, too, that basic architectural security practices are gonna help you here. Why would you want an internal monitoring system available publicly over the open internet?
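As a quick illustration of the SQL API mentioned a moment ago: it rides on Elasticsearch's regular REST interface through the _sql endpoint. A query against a hypothetical log index might look something like this (the index and field names here are made up for the example):

```
POST /_sql?format=txt
{
  "query": "SELECT level, COUNT(*) AS hits FROM \"app-logs\" GROUP BY level ORDER BY hits DESC"
}
```

For teams that think in SQL rather than in Elasticsearch's query DSL, this alone can lower the barrier to adoption considerably.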
I mean, right there, keeping these systems off the open internet takes care of almost all of your attack vectors, unless you've been compromised on your internal network. The security of monitoring systems themselves is another large discussion, because so much sensitive data gets put in there; there are a lot of practices, and it deserves a longer discussion than we have today. We've only got another 15 minutes and I've still got two more technologies to get through.

On the ingress side, again, log data is ubiquitous text, so ELK can consume it from just about any system. In terms of exposing that data, Elasticsearch also integrates with Camel, so it's the same story as with Kafka there. Most of the platform's performance here really comes from the performance of Elasticsearch; Kibana is mostly executing in the browser. The pipelining now allows us to horizontally spread out some of the ingest of those logs. The standard community options apply for monitoring, like we saw before: Prometheus, Nagios, Zabbix, and Grafana will have options for this stuff. Third-party tooling can help us with deployment of an otherwise fragmented stack, and things like Ansible can help us come up with a single image that we deploy with, which eases some of those pains.

With this one, you kind of want to take the good with the bad. Without getting too in-depth into it: if you go look at the X-Pack licensing, not all parts of it are listed under free software licenses anymore. Some of them are listed under the Elastic commercial license, but they're done in a strange way. The source code is still available, and you can even contribute to that source code, but you can't compile and run that source code in production without paying for a license. So be careful. I think it's an interesting model; we'll see what happens with it. I do find it a little bit troubling, though, that there's not something built into their model that allows their success to carry over to the broader community.
There's nothing built in; they're allowing their code to be worked on by broader community development but not necessarily providing those benefits back, because you can't even run that code in production without paying for a license. So anyway, we'll see how it goes.

All right, Puppet versus Puppet Enterprise, then. I think we see one of the more textbook examples here. Puppet really came onto the enterprise circuit early; they were one of the first open source technologies to start really playing with this open core model, and I think they learned a lot of lessons. One of the things they did was what I see as the textbook move for enterprise open source: taking another community project, in this case ActiveMQ, and folding it into the Enterprise Edition. That became MCollective, which is now deprecated; Puppet Orchestrator is what they're calling it now, and it was the main selling point of Puppet Enterprise. So they basically subsumed another open source project, branded it, extended Puppet's functionality with it, and called it an Enterprise Edition. At the time, there weren't a lot of businesses doing that.

Out of the box with Puppet Enterprise, you get HA and DR through the Puppet cloud solutions and third-party tools, plus enhanced security. Puppet is already very highly interoperable (it needs to be), so there's not really a big difference there, and not a big difference in performance. Puppet Orchestrator can help with deployment, but Puppet Orchestrator is just ActiveMQ, so you can actually build a very similar pattern to Puppet Orchestrator just by using topics in ActiveMQ. HA is provided with Puppet Orchestrator, which creates an active-active HA model with Puppet Master servers, but multiple masters are already supported in the community edition; Orchestrator just makes them easier to deploy. In many ways, Puppet Enterprise is really betting on the cloud.
So the DR solutions that you see available are mostly through cloud mechanics, though Puppet Orchestrator does provide some replication solutions. The Enterprise Edition does have easier LDAP and AD integration, but it doesn't really interact with or automate anything that Puppet community can't automate, so there's not a whole lot of extra provision there for platform interoperability. The performance is really far more dependent on the systems being controlled by it. Horizontal scaling is usually the biggest concern, and that's very much the point of MCollective, or Puppet Orchestrator, but achieving that kind of performance isn't really specific to Puppet itself; it's more just achieving horizontal scale of Puppet Masters. Puppet Enterprise has a dashboard that'll let you look at individual processes, but as always, other monitoring solutions are probably gonna be superior there. There are two nice tools for deployment automation, Code Manager and r10k; both of these will let you deploy full Puppet environments, and Code Manager looks like the one that's really shaping up to be the most feature-rich.

On the community side, again, Puppet Orchestrator is really just ActiveMQ, so it's easy to create that master-of-masters pattern you see, using topic subscriptions in ActiveMQ; that's really what's happening with Puppet Orchestrator. It's just an application, so standard DR and cloud functionality also just work on the community edition. Architecturally speaking, from a security standpoint, you definitely want to lock down your Puppet Masters. I think we've probably all read that Stack Overflow story about the administrator who ran an rm -rf / against 200-something servers, via a Puppet manifest that built the path from a variable he forgot to populate, so it just ran rm -rf /. I'm not making that up; you can look it up on Stack Overflow.
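The master-of-masters idea described above is just topic fan-out: one controller publishes a command once, and every subscribed master receives its own copy. Here's a toy, in-memory sketch of that shape; a real deployment would use JMS topics on an actual ActiveMQ broker, and the class and topic names here are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Toy, in-memory stand-in for the ActiveMQ topic pattern: a "master of
// masters" publishes one command, and every subscribed Puppet Master
// receives its own copy. Real code would use JMS topic subscriptions.
public class TopicFanOut {
    private final Map<String, List<Consumer<String>>> topics = new ConcurrentHashMap<>();

    // A downstream master "subscribes" to a command topic.
    public void subscribe(String topic, Consumer<String> subscriber) {
        topics.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(subscriber);
    }

    // The master of masters publishes once; all subscribers get the message.
    public void publish(String topic, String message) {
        topics.getOrDefault(topic, List.of()).forEach(s -> s.accept(message));
    }

    public static void main(String[] args) {
        TopicFanOut broker = new TopicFanOut();
        broker.subscribe("puppet.run", cmd -> System.out.println("master-1 got: " + cmd));
        broker.subscribe("puppet.run", cmd -> System.out.println("master-2 got: " + cmd));
        broker.publish("puppet.run", "trigger agent run");
    }
}
```

The point of the sketch is that the coordination pattern itself is small; what Orchestrator adds is mostly packaging and integration, which is why community ActiveMQ can get you to a very similar place.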
So you want to be very careful, obviously, because if somebody does own one of your master servers, they can do massive infrastructure damage very quickly. On performance optimization, the performance really depends more on those downstream systems. We've got plenty of other monitoring tools we can use, and Puppet itself is about managing deployment, so the platform itself is pretty easy to deploy out of the box. Orchestrator and MCollective make that even easier, but as discussed, we can use community ActiveMQ to do that. There's also the Puppet Ansible project, which allows you to orchestrate Puppet Master servers using Ansible.

So, okay. I think this is another case where you see most people interested in Puppet for the support; it's just kind of what we're seeing. Puppet Enterprise doesn't add a whole lot of features that you can't get from something like ActiveMQ, but Puppet manifests can be difficult to write, organizations need training and support, and Puppet Enterprise also just has a wealth of content and services that help with implementation and maintenance. So I think support is really what they're chiefly providing.

All right, now we're going to talk about Jenkins versus CloudBees, CloudBees being the commercial arm of the Jenkins package. And we've been here before. If you've been in the world of Java development for a long time, you might remember Hudson, the earlier incarnation of this project that tried to go commercial; Jenkins was the open source fork that carried the community core forward. I think we all know what happened next: most people have not even heard of Hudson. CloudBees, then, is taking a bit of a different approach and is now definitely billing itself as the enterprise arm of Jenkins. What can we learn from some of that? Let's go through it in our final installment here. Before we talk about the technology, though: CloudBees has made a whole business out of helping people deploy code.
So I want to make sure it's understood that they're not just the enterprise arm of Jenkins; they do a lot of other stuff too. They're really an organization that has made a business out of helping people deploy code, which is a very noble pursuit. None of their solutions are things that couldn't be implemented by a customer, given a little bit of legwork. But they do have the right philosophy, I think, when it comes to managing code, they've employed some very smart people, and they've helped businesses really change the way they deploy their infrastructure. They are absolutely a big part of the modern CI/CD revolution, the idea of constantly deploying code to production, every few seconds; CloudBees has been a big part of that. So I might beat up a little bit on their technology stack here, as you'll see, but I want to make sure it's understood that as an organization, they do a lot of remarkable things beyond just Jenkins.

So out of the box, you get everything. It's a cloud-based Jenkins deployment, so HA, DR, all of these things are provided by those cloud mechanisms. And on the community side of Jenkins, you can get just about everything too, through third-party tooling; by the way, you can just deploy Jenkins in the cloud yourself. CloudBees' business model is based on running Jenkins in the cloud, and you can sort of think of it as a managed Jenkins service that you just don't have to worry about. DR is the same story here; it's just a managed service. CloudBees does secure their hosting of Jenkins, and several employees of CloudBees are actually members of the Jenkins core security team. So actually, all of Jenkins security is done in a way that I very much like, in that it's all handled upstream. CloudBees certainly does work on and helps provide fixes and patches and things like that, but it's all managed upstream.
In other words, you're not gonna get better security in one product or the other, minus the fact that CloudBees is managing all of your cloud security for you. Dozens and dozens of platforms and languages have been integrated in the form of Jenkins plugins, and these are available in both CloudBees and the community edition of Jenkins: very interoperable solutions no matter which way you go. CloudBees doesn't do a lot to optimize the performance of their fork of Jenkins beyond just tuning it appropriately for the way they're hosting it, so our normal infrastructure scaling rules just apply. We do have a monitoring plugin that's compatible with the CloudBees version; it's meant to expose data for third-party monitoring, so it's really more like an agent that exposes data for you to plug into an existing monitoring solution. And of course, CloudBees is gonna shine in deployment management; that's how they built their business.

What do we get from the community on the HA side? Jenkins is so ubiquitous at this point that it's been deployed on every platform imaginable, in every direction imaginable. Jenkins X is a really cool project, a native open source implementation of Jenkins built for Kubernetes, all in the community; you should take a look at it. DR is kind of the same story here: Docker clustering, native clustering, Kubernetes clusters in Jenkins X. Jenkins ships with very deep security features out of the box, again from that core upstream security team, and beyond that, CloudBees is presumably deploying in the cloud with good security standards. Jenkins has dozens of language and platform plugins, so it's very interoperable. Performance optimization, again, has more to do with how your CI/CD pipelines are designed, the kinds of jobs you're running, and how long those individual tests take to run.
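For reference, those pipelines are typically just a Jenkinsfile checked in alongside the code, and the same definition runs on community Jenkins or on CloudBees. A minimal declarative sketch (the stage contents here are placeholder commands):

```
// Minimal declarative Jenkinsfile; stage commands are placeholders.
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'make build'   // hypothetical build step
            }
        }
        stage('Test') {
            steps {
                sh 'make test'    // hypothetical test step
            }
        }
    }
}
```

How you split work across stages, and how long each stage's jobs take, is where most of the tuning actually happens.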
As is normally the case, your performance optimization here is gonna be achieved through due diligence, right? CloudBees is gonna pre-optimize for their cloud platform, but only for their platform. Standard community options exist for monitoring, like we've talked about before, and you have a lot of open source solutions that can help you, like Jenkins X, Ansible, Docker, that sort of thing.

Okay, so I would say that CloudBees seems to be doing better than Hudson did. Hudson failed, I think, because it was swimming upstream against this current of open source development. But we still have just as much of a need for strong CI/CD now as we had back then, pre-cloud revolution, when Hudson came out. So I think CloudBees' approach of treating Jenkins as a managed service is going well. In the meantime, though, according to market rankings (these are IT Central Station rankings), you still see Jenkins pretty far outpacing CloudBees.

All right, some closing thoughts. Thoughtfulness really pays off when you're making these considerations for your own business. As more and more vendors emerge here, we wanna be really mindful of that balance between freedom and convenience. Some of these vendors provide really valuable solutions, like we've seen, and add-ons that are going to legitimately improve your experience with the software. Other vendors have invested very heavily in support resources, like Confluent and Puppet Enterprise, and can provide some augmentation to your existing infrastructure. But in the end, I really would suggest: do the research, like really do the research. Don't make it your knee-jerk reaction to go to the vendor, to go to the enterprise solution, and just accept that lock-in. Really scrutinize that vendor.
Make sure that this is a company you wanna keep doing business with, and make sure that they're giving back to the communities they benefit from. That says a lot about what your experience is gonna be like. If you have a company that really gets it, and they're in with the community and respected by the community, that experience is gonna flow downstream to you too. Consult with experts also and get their opinions, including me, if you'd like; do feel free to reach out. I really do like people. I'm mostly on the road, out in front of people at live events, doing a lot of evangelism and speaking. I can't do that right now, obviously; none of us can. So our virtual presence is a big deal. Feel free to reach out if you have any more questions that you didn't feel like asking today. The slides, as I promised, are available up on SlideShare; this QR code will get you right there. And that is it for today. Okay, so with that, I think I will let you all go. Justin, thank you so much for all of this information. My pleasure. Attendees, thank you so much for joining us, and we hope to see you at future Postgres Conference webinars. Thanks, everybody.