Hi everyone, I am Shilpa from Red Hat. I work in the storage team as a software developer, and today I will be talking about the sync modules of the Ceph RADOS Gateway. Here's the agenda: a quick introduction to Ceph and the RADOS Gateway, then how object multi-site replication is done, along with an introduction to sync modules, which is the main part of this talk.

So, Ceph. Ceph is open-source, software-defined storage. It runs on commodity hardware with absolutely no vendor lock-in, and it is distributed and highly scalable in nature. The architecture is designed so that there is no single point of failure. Since Ceph clusters are massive, with thousands of nodes, there is bound to be failure somewhere in the system, but the clusters are designed to be self-healing. Ceph provides a unified interface for object, file, and block storage.

RADOS Gateway. The RADOS Gateway is an object store client to the Ceph cluster that exposes RESTful S3 and Swift APIs. It is built on top of the librados API. The RADOS Gateway handles things like user accounts, ACLs, and buckets. It supports two HTTP frontends, Civetweb and Beast. Data is stored in buckets in the form of objects. Buckets have a flat namespace, unlike a file-system-like hierarchy. Objects are immutable: once they are uploaded, the data cannot be modified in place; it can only be re-uploaded as a new object. Think, for example, of video and audio files. The gateway supports most of the Amazon S3 features, like multipart uploads, object versioning, lifecycle policies, object encryption, compression, static website hosting, and a lot more.

So what is object replication? It provides geographical redundancy with asynchronous data replication, and it is eventually consistent. In other words, it is an asynchronous copy of data between multiple clusters that are running the RADOS Gateway.
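As a small sketch of the S3-compatible interface mentioned above, this is roughly how a client would create a bucket and upload an object against a RADOS Gateway endpoint. The endpoint URL, bucket name, and credentials here are illustrative placeholders, and the commands assume a running gateway:

```shell
# Point the AWS CLI at a (hypothetical) RADOS Gateway endpoint instead of AWS.
export AWS_ACCESS_KEY_ID=demo-access-key
export AWS_SECRET_ACCESS_KEY=demo-secret-key

# Create a bucket and upload an object through the S3-compatible API.
aws --endpoint-url http://rgw.example.com:8000 s3 mb s3://media
aws --endpoint-url http://rgw.example.com:8000 s3 cp video.mp4 s3://media/video.mp4

# Objects are immutable: "modifying" video.mp4 really means re-uploading it,
# which replaces the object (or, with versioning enabled, adds a new version).
aws --endpoint-url http://rgw.example.com:8000 s3 cp video.mp4 s3://media/video.mp4
```

Any S3 client (s3cmd, boto3, and so on) works the same way; only the endpoint differs from stock Amazon S3.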
The main use case is disaster recovery for remote sites, but the advantage, owing to its eventual consistency and asynchronous nature, is that it unlocks a lot of other use cases, such as backing up object storage to an external cloud cluster, custom backup solutions, indexing of metadata in Elasticsearch, providing notifications, et cetera.

Going into a bit of what object replication looks like: the main logical entities involved are the realm, zone groups, and zones. Data is replicated across zones within a zone group. The realm enables the Ceph Object Gateway to support multiple namespaces and different configurations on the same hardware. We have a master RADOS Gateway zone, which is the source of truth for all metadata changes.

The sync process of multi-site replication involves two types of changes: metadata changes and data changes. Metadata changes are mostly bucket operations, such as create and delete or enabling and disabling versioning, and user operations. Metadata changes are not made very often, but data changes, such as object updates, happen quite frequently. The thing to note is that metadata changes are always synchronous, while data changes are asynchronous. Only the master can execute metadata changes, and all requests from the other zones are forwarded to the metadata master. So an update to metadata originating in a different zone is first forwarded to the master, which updates the metadata log and performs the change locally, and then pushes the changes to the other zones. The other zones read the metadata log and apply those changes locally.

As for data changes, each zone also maintains its own data log, and there are two phases: an init phase and a sync phase. The init phase is where we fetch the list of buckets and bucket instances, and the sync phase is where we check whether each bucket exists.
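The realm/zonegroup/zone hierarchy and the metadata and data logs described above are managed with the radosgw-admin tool; a minimal sketch, with illustrative names and endpoints, on a running cluster:

```shell
# Create the realm, zonegroup, and master zone (all names are examples).
radosgw-admin realm create --rgw-realm=movies --default
radosgw-admin zonegroup create --rgw-zonegroup=us \
    --endpoints=http://rgw1:80 --master --default
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east \
    --endpoints=http://rgw1:80 --master --default
radosgw-admin period update --commit

# Inspect the logs that drive replication: the metadata log maintained by
# the master, and the per-zone data log for object changes.
radosgw-admin mdlog list
radosgw-admin datalog list
```

Secondary zones are created the same way (without --master) and then sync from the master once the period is committed.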
If it does not exist, we create the bucket and then sync all of its contents; otherwise it is an incremental sync, where the bucket already exists and we just push the data changes.

Sync modules are the main topic of this talk. These plugins are built on top of the multi-site replication framework, and they allow forwarding data and metadata changes to a different, external tier. With plain multi-site replication we would only be able to transfer data between zones, but with these plugins we can push the changes to external data solutions. A sync module allows a set of actions to be performed whenever a change in the data occurs. The list of sync modules is: the Elasticsearch, Cloud Sync, PubSub, and Archive sync modules.

The first sync module is Elasticsearch. Elasticsearch is a distributed, scalable analytics and search engine built on Apache Lucene. It provides simple REST APIs and quite easy configuration. The motivation behind using Elasticsearch with the RADOS Gateway is that the gateway already stores rich metadata with its objects, and with Elasticsearch we can enhance search and queries to give an overview of object storage trends. The RADOS Gateway does have a native admin API that provides some query-based metadata, but it does not help much with aggregation and analysis. For example, we might want to run usage reports on certain buckets, find out how many videos have been uploaded by a certain content creator, or get the average size of images of a certain type, et cetera. This is where Elasticsearch helps. Another reason is that we can get notifications upon creation of buckets, objects, user accounts, et cetera, and all of this is quite trivial with Elasticsearch.

The way it works is that we already have metadata being forwarded from the master to a secondary zone.
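Setting up such a dedicated Elasticsearch search zone looks roughly like this; a sketch where the zone name and the Elasticsearch endpoint are assumptions:

```shell
# Create a zone whose sync tier type is Elasticsearch (names are examples).
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-es \
    --tier-type=elasticsearch

# Point the tier at the Elasticsearch instance; num_shards/num_replicas
# control the Elasticsearch index layout.
radosgw-admin zone modify --rgw-zone=us-east-es \
    --tier-config=endpoint=http://localhost:9200,num_shards=10,num_replicas=1
radosgw-admin period update --commit
```

After this, the metadata of objects replicated into that zone is indexed in Elasticsearch rather than the objects being stored a second time.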
So we dedicate a RADOS Gateway server to feed the Elasticsearch instance by configuring the Elasticsearch endpoints. One thing to note is that, for security reasons, we do not expose the Elasticsearch endpoint to the end user; it is accessible only to the storage administrator. This way the RADOS Gateway just acts as a proxy that pulls data from the other zones and feeds it into Elasticsearch, and authentication for the end user is taken care of by the RADOS Gateway itself. So that's Elasticsearch.

The next one is the Cloud Sync module. The goal of this module is to enable syncing data to multiple cloud providers that support S3. It may be used for redundancy or compliance use cases, and here too it is required to have a dedicated zone that defines the sync tier type. One issue is that it is not possible to preserve the original object modification time and ETag, so what the Cloud Sync module does is store these metadata attributes as part of the destination objects. Minimal configuration is usually required: we have to provide connection details and ACL mappings, and each ACL mapping has a type, a source ID, and a destination ID. These define the ACL mutations that will be done on each object pushed to the external tier. An ACL mutation basically allows mapping a source user to a destination identity on the external tier.

The next one is PubSub. As the name suggests, this sync module provides a publish-subscribe mechanism for object store modification events. Based on the triggering of an event, desired actions can be taken at the endpoint.
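The Cloud Sync connection details and ACL mappings just described are supplied as tier configuration on the dedicated zone. A hedged sketch, where the zone name, remote endpoint, credentials, and user IDs are all placeholders:

```shell
# A zone whose tier type is "cloud", syncing to an external S3-compatible target.
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-cloud \
    --tier-type=cloud

# Connection details for the remote S3 endpoint, plus one ACL mapping that
# rewrites a source user ID to a destination ID on the external tier.
radosgw-admin zone modify --rgw-zone=us-east-cloud --tier-config=\
connection.endpoint=https://s3.example.com,\
connection.access_key=DEST_ACCESS_KEY,\
connection.secret=DEST_SECRET_KEY,\
acls.1.type=id,acls.1.source_id=local-user,acls.1.dest_id=remote-user
radosgw-admin period update --commit
```

Each `acls.<n>` group is one mapping of the kind described above: a type, a source ID, and a destination ID.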
For example, say a notification for an image upload is triggered; you could then invoke custom code at the endpoint, something like a Lambda function in AWS, to classify the image or run some analytics on that object.

The module can work in two modes: push mode and pull mode. Events are published into predefined topics. These topics mostly contain the definition of an endpoint, and notifications for a topic are created for a specific bucket. It also provides the ability to filter which types of events we want to be notified about. If a topic contains an endpoint, the events will be pushed to that endpoint; the supported endpoints are HTTP, Kafka, and AMQP. If a topic does not contain an endpoint, the events are stored as a subscription within Ceph, and they can be pulled from these subscriptions. That is the pull mode. The PubSub module also requires the creation of a dedicated RADOS Gateway zone, which defines its tier type as PubSub. The common notifications involved are things like object create and object remove, with their associated APIs.

So that's PubSub. Next, the archive zone. The Archive sync module is fairly simple: it is basically a read-only zone that stores a history of versions of S3 objects, and it is not meant to serve users directly like the other zones. The functionality is used in a configuration where many non-versioned zones replicate data and metadata, providing high availability for users, while the archive zone captures all the data and metadata updates and consolidates them as versions of S3 objects. So in the event of needing an older version of an object, we can always retrieve it from this zone. That is the purpose of the Archive sync module.
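The dedicated PubSub and archive zones described above are created the same way as the other tier types; a sketch with illustrative zone names:

```shell
# A PubSub zone for event notifications, and an archive zone that keeps
# versioned history of S3 objects (names are examples).
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-ps \
    --tier-type=pubsub
radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=us-east-archive \
    --tier-type=archive
radosgw-admin period update --commit
```

Topics, subscriptions, and per-bucket notifications are then managed through the PubSub zone's own REST API rather than through radosgw-admin.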
And that brings us to the end of this talk. Here are some resources for reference, one for the RADOS Gateway and one for the sync modules. That's it. Thank you. Here's my email ID, and we are also available on the #ceph and #ceph-devel IRC channels. Thank you very much.