Hello everybody, and welcome to another OpenShift Commons briefing. This time we're going to talk about some of my favorite things, one of which is operators and data portability. We have with us Josh Mintz, Will Holley, and Mike Breslin, all from IBM, and they're going to talk about using the Apache CouchDB operator for data portability. I'm going to let them introduce themselves. There's a bit of a demo here, and at the end of this session we'll have time for Q&A, so if you want to ask questions in the chat, please do so. As always, we will post this video and the slides on the OpenShift.com blog and on our YouTube channel within a day or so. So with that, Josh, take it away and tell us all about Apache CouchDB. Thank you, Diane. Hi, everyone. My name is Josh Mintz. I'm a product manager at IBM Cloud, and I sit in Boston alongside Mike Breslin. We're also joined by Will Holley from the UK. Today we're going to be talking about the operator for Apache CouchDB. The whole team here is part of the organization that delivers IBM Cloudant, which is a database as a service in the IBM Cloud. That database as a service is built on Apache CouchDB, and we took that experience of running CouchDB at scale in a public cloud environment and transformed it into the operator paradigm, so people can take our lessons learned and easily run CouchDB in their own OpenShift cluster. Apologies if you hear any squeaking with the work from home going on; I'm here with my 16-week-old puppy, who has just gotten a new toy. She's very cute, I promise, so thank you in advance for understanding. So, a little picture to go with the names; you'll also get a picture of Mike on here later, beautiful face. There's Will and I. If you have any questions or concerns about the presentation, or want to talk to us, we hang out in the Apache CouchDB Slack, and I have a link at the end of the presentation to join us there.
So we're definitely down to talk on Slack, or on the phone, or in the open source at any time; we're here to help. Before getting to the operator, I want to give a bit of background, i.e., why you should trust our opinion and the opinionated design that goes into the operator pattern. At IBM Cloudant, we're the data backbone of the IBM Cloud, across 50 data centers all over the world, with petabytes upon petabytes of data under management. And again, Cloudant is, at its core, CouchDB. Part of running it as a service is the years of experience we have operating, monitoring, and scaling these systems for hyperscale use cases. Cloudant is fully compatible with Apache CouchDB. There are some API differences that you might expect between a public cloud service and a piece of software you'd run on a server or a Raspberry Pi, but you can use them interchangeably for the most part, and there's lots of information on the web about the minor differences between them. So instead of focusing on Cloudant, I want to focus on Apache CouchDB today. We're going to talk about the operator and do a little bit of a live demo from Will. But before we get into that, I just want to cover the basic high-level feature set that you get when you use Apache CouchDB. For people who have been around the database community for the last decade, CouchDB was one of the first NoSQL data stores to really carry that movement forward. CouchDB and MongoDB were very popular a number of years ago, and CouchDB has continued to improve and become even more reliable and feature rich. It's still there under the Apache Software Foundation as an open-source project, governed by its PMC and those standards. So at a high level, it's a JSON document store with an HTTP API, so it speaks the language of the web, for ease of use in web and mobile application development. It places a premium on data durability.
It uses structures and paradigms, in the way it sets up clusters and deals with crash failures, to make sure the gold of the database, the data, stays as safe and durable as possible. CouchDB 2.0 and 3.0 use multi-master clustering, a master-master architecture that allows you to scale up and out very easily: start with one node and add another, and another, and another. A similar paradigm would be Apache Cassandra. The other best part about CouchDB is its ability to sync data. There's this thing called the CouchDB replication protocol, and it allows you to very easily move data wherever you need it to go, whether that's the public cloud, a store, a point-of-sale device at the edge, or an oil rig out in the middle of the ocean, as long as there's internet connectivity. It lets you do things like active-active setups with data replication; you can go in a single direction or bi-directionally between regions and between locations, like your private data center and the public cloud. There's also a continuous changes feed, which is useful for eventing off changes in the database. And CouchDB 3.0 has full integration with a full-text search component, which is Lucene under the covers, so you don't have to pass data that you want to search or facet on off to another engine and try to keep those states in sync; you can just use Lucene through the CouchDB API. There's also PouchDB, which is API compatible with CouchDB and which Will is very familiar with, so if we have folks on the call that want to talk more about that, I'm sure he'd be happy to answer any questions there. It's software for running the CouchDB replication protocol in mobile apps or on small devices. RxDB is a newcomer to this space; it's for JavaScript applications, I think running on your phone as well, and it's also API compatible with CouchDB and Cloudant.
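Since the replication protocol mentioned above is driven entirely through CouchDB's HTTP API, a replication job is just a JSON document written to the special `_replicator` database (or POSTed to the `_replicate` endpoint for one-shot jobs). Here's a minimal sketch in Python; the server URL and database names are placeholders for illustration, not from the talk, while `continuous` and `create_target` are standard CouchDB replication options:

```python
import json

def make_replication_doc(source, target, continuous=True, create_target=True):
    """Build a document for CouchDB's _replicator database.

    Writing a document like this to _replicator (or POSTing it to the
    _replicate endpoint for a one-shot job) asks CouchDB to start replicating.
    """
    doc = {"source": source, "target": target}
    if continuous:
        doc["continuous"] = True      # long-lived job: keep syncing new changes
    if create_target:
        doc["create_target"] = True   # create the target database if missing
    return doc

# Pull a remote database down to a local one, continuously.
doc = make_replication_doc(
    "https://user:pass@couch.example.com/movies-demo",  # placeholder remote URL
    "movies-demo",
)
print(json.dumps(doc, indent=2))
```

POSTing this document to the server's `_replicator` database with admin credentials starts the job; deleting the document cancels a continuous replication.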
And lastly, as we discussed earlier, IBM Cloudant is compatible with Apache CouchDB. A lot of the folks I work with are also contributors to the open-source project, some on the PMC. It's an awesome community, and we'd love for people to come say hi and learn more if you're interested in joining the fold and helping develop Apache CouchDB; there are a few people in the community who would love to steward your involvement and answer any questions you may have. One of the last cool things about CouchDB is that it scales down to small devices like Raspberry Pis, but we also run CouchDB, i.e. Cloudant, as databases in the public cloud that have many, many terabytes in them, so it scales up and down very nicely. If anyone has any questions, feel free to pause me throughout the session; I know people drop in and out, so no problem at all if you need me to cover something again or go back. One of the cool things you can do with Apache CouchDB and Cloudant, because of that data replication protocol, is this open hybrid multicloud architecture. I recognize the jargon there, but it's pretty descriptive of what we're actually trying to deliver, and of a common use case for the people we work with. That's partially why we did the development on the operator for Apache CouchDB: to take what we learned in the public cloud and let people use that knowledge, through the Operator Framework on OpenShift, wherever they want to run. CouchDB's strong replication protocol allows you to, for example, run the operator for Apache CouchDB on an on-premises OpenShift cluster in your own data center. Then, if you punch the right holes in your firewall and do all the IP whitelisting, because if the CISO is listening, I'm sure they would want everyone to do that, you can easily replicate bidirectionally or one way to any other environment.
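To make the bidirectional setup just described concrete: because each CouchDB replication is one-directional, active-active between two sites is simply two continuous replication jobs, one pulling and one pushing. A sketch with placeholder URLs and credentials (not from the talk):

```python
# Placeholder URL/credentials; in a real setup the remote side would be the
# cloud-hosted CouchDB and the local name a database on the on-prem cluster.
remote = "https://admin:secret@cloud-couch.example.com/movies-demo"
local = "movies-demo"

# Pull: remote -> local. Defined on the on-prem side, since it can reach
# the internet but typically is not reachable from it.
pull = {"source": remote, "target": local, "continuous": True, "create_target": True}

# Push: local -> remote. Also defined on the on-prem side, for the same reason.
push = {"source": local, "target": remote, "continuous": True}

# Together the pair gives continuous, bidirectional sync between the sites.
for job in (pull, push):
    print(job["source"], "->", job["target"])
```

Each dictionary would be written to the on-prem cluster's `_replicator` database to start the corresponding job.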
That might be a managed database as a service, like Cloudant on the IBM Cloud, where you don't really have to worry about running the environment; you just pay for provisioned throughput and off you go. You can also build your apps on OpenShift in the IBM Cloud, because we have a managed OpenShift offering there as well. And then you can take that data that's been replicated to the IBM Cloud and also replicate it over to Azure, say you have a footprint in Azure where you're running Red Hat OpenShift and want to use the operator pattern again. So basically, we want to let people get their data wherever it needs to be. We are big believers in Kubernetes and OpenShift at IBM, and those help dramatically with application portability. One of the things we've seen as a problem for our customers is data portability: you can now move the application seamlessly between clouds, but it's not super easy to move the data. We feel strongly that Apache CouchDB as a technology is well suited for that, given its data sync technology and the fact that it's open source under the Apache Software Foundation. If you'd like to use it as a managed service in the public cloud, it's there on the IBM Cloud; but if you want our expertise in the operator pattern, feel free to use the operator for Apache CouchDB. The more the merrier here. One finer-grained example of this use case is something we see in the Apache CouchDB community around deploying it for retail, just as a way of drilling down into how you might actually use this in your day-to-day job. You might be running Apache CouchDB in the cloud, and you have stores all over your continent, whether it's the US or Europe, and you need to replicate data from the cloud into those stores, or there are devices at those stores that need to get data from the cloud.
And you want to make sure you're able to do that even when there's no internet connectivity, which CouchDB handles very well: it's able to understand that it's lost connectivity, and when the connection comes back up it resumes replication and carries on as if nothing much had happened to the network infrastructure. So you can run it in the public cloud, or anywhere you want, and replicate that data to your stores, where you might also be running Apache CouchDB; and at those stores you can then, if you want, replicate to point-of-sale devices, maybe an Android phone or an iPad, using something like PouchDB running on the devices, which speaks the Apache CouchDB replication protocol. If there are any questions on that, I can pause; hopefully that was a useful use case to help couch, no pun intended, how you might use this technology. And with that, I will pass it along to Will for what you're probably all here for: the demo, and how to get it going and used in production. So Will, do you want to add anything, or take it away? Yeah, thanks Josh. I haven't got any slides; I'm going to demo the operator, but I was just going to talk a little bit more about CouchDB first. There are a few nice things about CouchDB that make it well suited to deployment in Kubernetes. As Josh mentioned, one of the big things is that you interact with CouchDB entirely through its HTTP API, so exposing the database, load balancing the database instances, and reaching the database are all fairly straightforward, because load balancing HTTP applications is a very common use case for Kubernetes. The data durability story is another thing that makes it well suited, because CouchDB uses a copy-on-write storage engine which is extremely fault tolerant.
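A small illustration of why the HTTP-only interface helps in Kubernetes: health checking needs no database driver, because CouchDB (2.x and later) exposes an HTTP `/_up` endpoint. The probe snippet below is illustrative, not taken from the operator's actual manifests:

```yaml
# Illustrative readiness probe for a CouchDB container: Kubernetes can
# health-check the database with a plain HTTP GET, no client library needed.
readinessProbe:
  httpGet:
    path: /_up          # returns {"status": "ok"} when the node is up
    port: 5984          # CouchDB's default port
  initialDelaySeconds: 10
  periodSeconds: 5
```

The same property is what makes Services, Routes, and ordinary HTTP load balancers work against CouchDB without any protocol-specific glue.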
As long as you've got a POSIX-compatible storage back end, it's very tolerant of database instances just being stopped abruptly, so it copes well with the Kubernetes scheduler moving things around, which also massively simplifies the deployment. The main weak points in CouchDB historically, especially when it comes to clustering, have been around setup and administration. So our approach with the CouchDB operator was really to focus on that: try to make it as easy as possible to set up and operate a CouchDB instance, using the configuration best practices we've learned at Cloudant over the last 10 years or so. With that background, I'm going to share my screen and walk through running the operator on OpenShift. Okay, so I guess the first thing is where you can get the operator from; it's published to two locations. If you're not using OpenShift 4, you can go to OperatorHub.io and install the operator from there; that will work with vanilla upstream Kubernetes, or with OpenShift 3 if you install OLM directly. It's also a Red Hat certified operator, so if you're on OCP 4 you can go into the operator catalog in OCP directly and find the CouchDB operator there. I've just installed it into a new project called couchdb-demo. If I go here, we'll see it has installed a number of custom resource types. The main one, well, the only one that you really care about as a user, is this CouchDB cluster resource. The other resources are internal resources that the operator uses, and we'll see those as we go through the demo, but essentially the CouchDB cluster resource is the only thing that you as a user would interact with.
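For comparison with the console form Will fills in next, the same cluster can be declared as YAML. The `apiVersion`, `kind`, and field names below are approximations for illustration, not copied from the demo; check the operator's generated example (or `oc explain`) for the exact schema of your installed version:

```yaml
# Illustrative CouchDB cluster resource; verify field names against your
# installed operator version before using.
apiVersion: couchdb.databases.cloud.ibm.com/v1
kind: CouchdbCluster
metadata:
  name: example-couchdb
  namespace: couchdb-demo
spec:
  size: 1            # one node for CodeReady Containers; 3 for production
  storageClass: ""   # empty = ephemeral storage; set a persistent class in production
  resources:
    cpu: "2"         # CPU per CouchDB node
    memory: 2Gi
  disk: 10Gi         # disk per CouchDB node
```

Applying a resource like this with `oc apply -f` is equivalent to creating the instance through the console.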
So I'll create an instance, and this has been filled out with some basic defaults: CPU per CouchDB node, disk per CouchDB node, and storage class, which I'm going to leave empty for now, which just uses ephemeral storage; in production you'd expect to specify some persistent storage, depending on your environment. I'm going to specify a one-node cluster, because this OpenShift deployment is on my laptop using CodeReady Containers and there's only one worker node available. The default configuration of the operator is to allow only one CouchDB node per OpenShift or Kubernetes worker, so I wouldn't be able to have a three-node CouchDB cluster here; it would fail to deploy. You can override that using a dev mode flag for development use cases where you want to try a three-node cluster, particularly because some of the consistency properties in CouchDB differ between single-node and multi-node deployments. But in production, the default is that we try to spread the nodes across workers using anti-affinity rules. So I'll create that, and it's going to go off and create my CouchDB cluster. Under the hood, what that does is create a resource called a formation, which is the piece that does the real work. That's a pattern extracted from our IBM Cloud database offerings, which use Kubernetes extensively. The CouchDB cluster is actually just a thin wrapper over all of these other resources, and the formation is the thing that does the work. If I go into my CouchDB cluster and look at the YAML, we see it's got a formation generation, which is basically the generation of the formation being created, and this observed generation gets updated as the formation is expanded into its sub-resources. When these two numbers match, when the observed generation equals the formation generation, it means the formation is fully representative of the CouchDB cluster resource here.
It takes a little while to flesh everything out, so if I go into the pods view, we'll see it's created a CouchDB pod here, and that's going to take a little while to initialize. It will also create a secret for my CouchDB cluster. The operator generates cluster-wide secrets, which is typically a pain point when using tools like Helm to deploy a CouchDB cluster, because there are certain secrets that need to be unique per CouchDB cluster but the same on every database node within the cluster. The operator generates those for you and ensures they're synchronized across all pods that are database nodes. We have a similar thing with the config map: if I go into the config map, it's generated some of those common configuration options which need to be propagated across all nodes in the cluster. If I go to the secrets, you'll see I've got an admin password, other secrets, and a cookie which is used to communicate across CouchDB nodes in the cluster. Hopefully that pod has now come up, so we can have a look at the pod. The pod has two containers in it: the CouchDB node, and a sidecar container which is essentially an agent for the operator. The sidecar container waits for instructions from the operator and then executes them against the CouchDB database node, which allows us to run rich cluster-wide operations, things like coordinating upgrades where maybe you have to stop all the instances of the database, run some command, and then restart them. That pattern is what allows us to do that.
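The coordination just described, an operator instructing an agent sidecar in every pod, can be sketched abstractly. The names here (`run_recipe`, the action strings) are mine for illustration, not the operator's actual API:

```python
def run_recipe(recipe, agents):
    """Sketch of the coordinator pattern: the operator walks an ordered
    'recipe' of actions, and for each step instructs the agent sidecar in
    every pod before moving on to the next step."""
    for action in recipe:
        for agent in agents:
            agent(action)  # in reality an RPC/HTTP call to the sidecar container

log = []
def make_agent(pod):
    # Stand-in for a pod's sidecar agent; records what it was told to do.
    return lambda action: log.append((pod, action))

# A cluster-wide upgrade: every node must stop before any node upgrades,
# and every node must upgrade before any node restarts.
run_recipe(["stop", "upgrade", "start"],
           [make_agent("couchdb-0"), make_agent("couchdb-1"), make_agent("couchdb-2")])
print(log[:3])
```

The point of the step-by-step ordering is exactly what a plain per-pod controller can't easily express: a barrier across the whole cluster between each action.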
If I go into networking, I can see that it will have created me a cluster IP service. There's an internal service here, which you can ignore, but it's got this cluster IP service here, and it's also integrated with the OpenShift service certificates, so it's exposing an HTTPS service on port 443 which uses those OpenShift certificates and can be validated by any clients within the OpenShift cluster. The other neat thing is that it makes it very easy to expose to the outside world if you want to. So I can go in and create a route for CouchDB, select my service here on 443, secure the route with re-encrypt termination, and create it, with insecure traffic set to redirect. That's created me a route that I can use directly from outside of the OpenShift cluster. I launch that; it's going to give me a warning, because I'm running CodeReady Containers, so it's using a self-signed certificate, but I can go past that, and it's given me my CouchDB instance. The base URL for CouchDB is just a JSON endpoint, so if I go to the _utils endpoint, that will give me the dashboard, and here's my CouchDB. I can log in with the credentials which we specified when we created our CouchDB cluster resource, and you can see it's created a couple of system databases, the _replicator and _users databases. I'm not going to fully go through the dashboard, but we can see from here that the CouchDB setup is configured for production usage as a cluster node, and if we go to the configuration, we can see all of the configuration options in CouchDB that the operator has set up for us. We're basically ready to replicate data now. As a follow-on from the use case that Josh was talking through earlier, I thought it would be good to show how we can replicate between, essentially, a private cloud, which is my laptop, and a public cloud instance. I've got an OpenShift instance deployed to IBM Cloud as well, which also runs a CouchDB, so I can show you how to replicate
between the two. I'll launch my IBM Cloud OpenShift instance here, lovely URL; this is OCP 4 as well, and I've got a route to a CouchDB instance somewhere here. So, under networking, routes: here is my CouchDB. Same thing: if I go to the base URL it's the same JSON endpoint, but I can log in, and I've got a demo database here, which is a movies database, so if I run a quick query here, I get a match back. I can replicate that to my local CouchDB. So in my local CouchDB I will set up a replication. The source is a remote database, the one we just saw, movies-demo, and I'll put my admin password inline, which you probably shouldn't do in production, but it's fine for a demo. I'm going to ask it to create me a new local database, movies-demo as well, and I'm going to set this up to run continuously, so this database will continuously sync over HTTP across the internet. Now, I have to run this replication from my local deployment, because my local deployment, whilst it can pull data from the internet, isn't exposed to the internet directly; I wouldn't be able to access this instance externally from outside my network. So I'll start that replication, and that should go off and pull all that data from the OpenShift-deployed CouchDB in IBM Cloud. Let me see, if I go here, you see it's started; it's created a movies-demo database and it's going to slowly populate it, so we're up to 1,314 documents. This CouchDB instance in OpenShift is hosted in Dallas and I'm in the UK, so it's probably going to take a little while to replicate that over. The other part of replication that's nice: CouchDB replication by its nature is one-directional, so this is just pulling data from that external CouchDB instance hosted in the cloud. If I want, I can also create a second replication to make it bi-directional. I'll take that source and make the source my local movies database; the target can be that existing remote database, and that's continuous as well, so that's
basically how you would configure an active-active deployment across multiple clouds, or multiple regions within the same cloud: bi-directional replication, one job that pulls data and one that pushes data. So here, I think my database has now fully populated, and if I run a search here, let's say the same one, I get my search results now. If I add another document here, 'Trolls World Tour', which is what my kids are watching, and save, then shortly, if I go to my cloud-deployed one, you see it's just turned up there. Equally, if I go and add one in my cloud-deployed CouchDB, I'll add 'Frozen 2' and create that; now I should be able to do the search here, and there we go, 'Frozen 2' has turned up. So that's basically it; it's that simple to deploy CouchDB in multiple clouds, and these OpenShift instances could be in IBM Cloud, obviously, or in AWS or GCP, or on your own machines as I've done in my demo. It's that straightforward. That was awesome, Will, thank you. So I have a question. Hi, this is Shana. When you set it to continuously sync or replicate, what is the time interval in between, or does it just recognize that there's new data and automatically replicate? What does that mean compared to when we do incremental replications, like per day? When you say continuous, what's the context around that? So there are a few different ways you can do it, depending on what your requirements are for latency. CouchDB does support a fully continuous replication where it basically maintains a connection over HTTP; it's a sort of long-running request, so any data that gets added on one side will get transferred as quickly as HTTP works, though a lot will depend on the load your database is under and how many writes it's processing. Another way of doing it would be to schedule replications, so you'd have to do that
programmatically: have something that replicates, say, at certain times of the day, or just periodically, if the data isn't changing that frequently. That's a way to do it with less load on the cluster, but the continuous replication is probably the most common way. Let's see, okay. I'm thinking, if an application is doing active-active on multiple clusters, how do we guarantee the data integrity on both ends while writes are going on and we're copying the data? Right, so the replication isn't transactional in that sense; I mean, CouchDB doesn't have transactions anyway, in that sense. There are patterns you can use to get close to it, but ultimately the way most people work around that is to have their own monitoring that will test it: you could have a document that gets updated, say, once a minute on one side, then test that it has propagated on the other side and figure out what the lag is, that kind of thing. If you need a hard guarantee that the data is replicated across regions, that is difficult with CouchDB; the way I've seen it done before is, instead of using replication for that, to just have your application make writes to two different databases at the same time. What CouchDB does do for you, and the CouchDB clusters deployed by the operator will do this, is maintain multiple replicas within the cluster. You've already got three replicas of the data by default within your CouchDB cluster, so if you're deploying to a cloud or an OpenShift cluster that has multiple fault zones, then your CouchDB nodes within the cluster will be placed across those fault zones, and it will attempt to distribute the replicas across them as well. So you get some redundancy within a region, or within a deployment anyway, and you get that without having to do anything special in your application. It's the
multi-region part that's really nice. Yeah, it's nice that you can easily replicate the way you had it, very simple, thank you. So Will, I have a quick question, if you're at the end of your demo; are you at the end of your demo? Yeah. I know that you are one of the people who wrote the operator itself for CouchDB, and I was wondering if you could talk a little bit about that process and any lessons that you learned; as you mentioned earlier, it was an opinionated version, so how did that go, if you could share. Yeah, so really the CouchDB operator is standing on the shoulders of giants. We have been using operators at IBM for a long time to support our IBM Cloud Databases offering, and part of the work for the CouchDB operator was to take the patterns established in doing that work and bring them into a framework that could be used for standalone deployments like this. The main one, which I alluded to earlier, was this idea of an agent sidecar which the operator can interact with: every pod has this service that sits alongside it and understands how to coordinate rich operations across all of the components of the cluster. The operator has what we call a recipe, a series of actions it needs to perform to get to a state, and it can instruct the sidecar to perform each action on every replica of the database nodes. The complexity of that depends on the database you're deploying, but we've used that pattern successfully for things like Redis, Postgres, MySQL, MongoDB, and now CouchDB as well. Our operator work largely predates the Operator Framework, so we don't use that directly, and I guess a lot of the work we did for the CouchDB operator was trying to bridge the gap between what we had done for our internal needs and what is expected by the community, and the consistency of operators for OpenShift and Kubernetes generally, through the Operator Framework
and OperatorHub. But I think that pattern is the main thing I haven't seen much in other operators, and it's significantly different from something like a Helm-based operator; it's what allows us to do those kinds of rich operations, the same thing on every replica. Well, that's good insight, and interesting to hear how long IBM has been using the operator pattern. I think it's something that has been really useful and it continues to evolve, so thanks for the insights on that. Josh, did you have anything else? I did. One data point that people might find interesting is that we use this pattern that Will spoke of, standing on the shoulders of giants, to run many tens of thousands of databases at IBM, so it has scaled extremely well for us and has been a good framework for delivering databases as a service for our team. With this presentation I've also included a resources slide that links directly to the operator, the documentation for the operator for Apache CouchDB, a short video that one of our colleagues put together with a lightboard to describe what Apache CouchDB is and why you might want to use it, and also links for folks to peruse the CouchDB website, join us on Slack, or help contribute on GitHub. I'll also provide to Diane contact information for myself and Will.