Hi, I'm Hugo Guerrero, and today I'm going to talk about data gateways and how to bridge between legacy, monolithic databases and microservice architectures. I hope you enjoy the following minutes; thank you very much for joining, and let's get started. We will cover a couple of topics in the agenda for this session. We will talk about architecture evolution: how applications have been changing in the way they use their components and how they handle the cloud-native nature of distributed systems. We will also talk about microservices and data: the problem that API gateways solved when they started to rise a few years ago, and how to handle data in an architecture that focuses on microservices. Toward the end we will talk about data gateway capabilities, how they work, the idea behind having a data gateway, and some of the types that are currently available. Finally, we will look at different open source data gateways, what options we have available, and we will focus on one specific project called Teiid, which is sponsored by Red Hat. Let me introduce myself. My name is Hugo. I'm Mexican and currently based in the Boston area in Massachusetts, here in the United States. I'm currently working with Red Hat as an API, messaging, and event-driven specialist. I have also been an open source advocate since I first started working with the JBoss application server around 2004, so it has been a long journey working with open source software. I'm also a history and travel enthusiast, so I really enjoy traveling, as well as discovering new food, snacks, and street food around the world. Here's my Twitter handle; if you want to follow the conversation, you're welcome to follow me or tweet, and I'll continue to post more information on these topics.
So let's talk about the evolution that applications have gone through since we started moving away from the traditional three-layer architecture: one single application, one single deployment (or multiple deployment files, but all running on the same infrastructure). First came 12-factor applications. We started to decouple certain characteristics of the design of our applications, moving away from hard-coded configuration toward externalized injection, and keeping separate source repositories to have different paths and ways to handle our applications. That led us to microservices architecture, with all the benefits of cloud resources that look infinite and allow us to have several deployments of our applications in different locations, simply by requesting resources from a cloud provider. The next step in this evolution is what is arriving right now: serverless, along with functions as a service. Serverless is the capability of having these microservices, containers, or functions scale to zero while idle, and only be activated when there's an event, a call, or a request addressed to that specific service, container, or function; it then starts up, does the processing required, and scales back to zero. Functions as a service is sometimes treated as a synonym for serverless, but it's really a specific way of handling the code, the packaging, and who actually manages those components; it does, of course, get the benefit of scale-to-zero as part of the function-as-a-service infrastructure. So this is how we have been moving away from the traditional single block of code in one single deployment as an application.
However, this has mostly been a trend at the processing and network layers, because this kind of development (microservices, serverless, functions) is very dynamic; that's how it achieves its scalability and portability. But sometimes we drift away from the stateful side of the architecture, a component that we cannot neglect: data. You have probably faced this kind of problem when dealing with your applications. You have this beautiful microservices architecture that is pristine, ideal, and ready to be deployed. However, behind you there is this monolithic database, this single source of data that has not been refactored or reworked to make it fit your microservice architecture. You think you're already there, just moments from crossing the finish line, but there's this big monster behind you, just waiting for you to be distracted so it can strike. I'm totally sure you have seen this kind of scenario before: microservices that are actually sharing a database, dragged back by the way the services become limited and coupled the moment they share data with other services. So what can we do? We have seen that the traditional architecture has some challenges, but also benefits, when all three layers sit together but access just one single data source or one single type of storage. You certainly get consistency: all the logic is in the same language, and access to the database usually goes to one single store, so once we make a decision about the storage, we're going to be using that storage for a long time.
There are, of course, the challenges of how frequently you can release: if you make a change in one of the layers, you will certainly need to be sure that the change does not affect the rest of the architecture, and it's hard to isolate resources if you want to focus on problems in a specific component of your architecture. Those are the things we are trying to get away from. That's why we have distributed architectures, and with them come new challenges to deal with when switching from the previous monolithic, traditional deployment to the distributed one. The first has to do with microservices being a distributed system: a system that isn't running in one single place, but needs to be deployed across different resources, which could be servers or even different clouds. That makes communication between those services critical and vital, because no service lives in a vacuum and can exist alone. Most of the time, your microservices will need to connect and communicate with each other or with other services. So we still see challenges in this kind of architecture, but they have moved from the previous monolithic challenges to two new kinds of challenges. The first is related to the network: as we mentioned, a distributed architecture needs to communicate, it needs to connect.
That's the network challenge: you need to know where your services are deployed, so you need to do discovery; you need to balance the load across calls to your services; you need to handle what happens when one of your services is suddenly unavailable; and you need to decide how you share information, whether through simple connectivity, calling the services directly, or something like APIs that give you governance over your communication, plus monitoring and tracing to know what is happening at the communication layer. The second type of challenge in a distributed architecture is at the data layer. As we said when looking at that monster behind you, data access also becomes a challenge when we talk about microservices. Why? Because the focus is on decoupled services that can be managed independently, and the idea is for each one of those microservices to own one of the stores and be able to manage it, to take ownership of that storage. But you still need to be able to share that data between your services. If you have a customer placing orders and you want the items and the payments, you need to communicate with all these systems and share that information. So you will certainly need an abstraction layer to move that data in a simple way between all these different services. You also want the services available at different points of your system, and you want to handle them in a homogeneous way, where they can communicate easily and everyone is able to understand them. Those are the kinds of challenges you can face when building a distributed architecture with microservices.
When we talk about the network challenge, one of the ideas that emerged once we realized we needed to handle this kind of connectivity between services was the rise of API gateways. Suddenly the load balancer is just not enough, and having a security device at the perimeter of your application or network, just shielding you from the outside, is not enough either. You need to make more decisions about how to apply certain controls and governance over the way you share information between your systems and how your applications consume those services. This is how the gateway helps us cover access at this new layer. The API gateway first gives us an abstraction layer, so you don't need to know exactly which system answers which request. It also lets you apply certain policies at a single point of access: for example access control, rate limiting, or even an additional security layer that you then don't need to implement in each one of your services. By taking your applications' access through the gateway, you have one place to enforce those policies. And because you are now inside the trusted zone of the network, the gateway has enough intelligence to know which service should handle each of those requests. On the data side, as we mentioned, when we talk about microservices we are talking about services that each require an independent database, because this helps us tackle the coupling problem.
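To make the rate-limiting policy mentioned above concrete, here is a minimal sketch of the token-bucket algorithm many gateways use for it. The class and parameter names are my own for illustration, not taken from any particular gateway product:

```python
import time

class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens proportionally to the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=2)  # 1 request/second, burst of 2
results = [bucket.allow() for _ in range(3)]
print(results)  # the burst lets two through; the third is rejected
```

The point of enforcing this at the gateway is exactly what the talk describes: each backend service stays free of throttling logic, and the policy can change in one place.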
If we have shared storage, with services working on the same data, and we change one, that change is obviously going to hit the other systems. So the idea behind how microservices should handle data is, first, owning their own database. Owning the database can mean two different things. The first is to be literal and have two totally independent database instances: one instance on a server for microservice A, and a second instance with the data and the access just for microservice B. The other option is to have the same instance but separate schemas, if the database supports schemas or groupings of tables. A set of tables is owned by one microservice, and the only way to manage or access that data is through the service that owns it; the other service only ever touches its own tables. There are different ways to handle this, depending on how independent you really want them to be, but the point is that exactly one microservice owns the data; you are not sharing that information at the storage level. The second way microservices tend to shape how we handle data is through the heterogeneity of the services. When we build our services in different languages with different tooling, we will often end up with a different persistence layer for each of those services, because the data really is different: some data needs to be treated differently and stored in a different way. It's not always possible to force everything into one single type of storage just to make it compatible with the rest of the data.
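The "one microservice owns the data" rule above can be sketched in a few lines. Here I use two separate SQLite databases as stand-ins for the two independent instances, and a function as a stand-in for the owning service's public API; all the names are illustrative:

```python
import sqlite3

# Each service gets its own independent store; neither can see the other's tables.
orders_db = sqlite3.connect(":memory:")    # owned by the orders service
payments_db = sqlite3.connect(":memory:")  # owned by the payments service

orders_db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
payments_db.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL)")

orders_db.execute("INSERT INTO orders (item) VALUES ('keyboard')")

# The payments service cannot reach into the orders tables directly:
try:
    payments_db.execute("SELECT * FROM orders")
except sqlite3.OperationalError as e:
    print("payments service blocked:", e)  # no such table: orders

def get_order(order_id):
    """Public API of the orders service -- the sole access path to its data."""
    row = orders_db.execute(
        "SELECT id, item FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    return {"id": row[0], "item": row[1]} if row else None

print(get_order(1))
```

The schema-per-service variant mentioned in the talk works the same way; only the physical separation differs.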
In this case, we will certainly have some information that needs to live in a relational database, where we need highly consistent data; other information may go in a document database, where you can access it easily and store it in a more flexible way; and sometimes you just need an in-memory, key-value type of storage, because that data is different again. Microservices allow this kind of variety in how we handle data, and that obviously represents a challenge when you need to deal with polyglot persistence. So, as we can see, we have different types of solutions for challenges that are similar in kind to the ones we mentioned before. On one side we have the API gateway, which implements the abstraction of communication with services, focusing on the contract through API-first design and development; it is in charge of load balancing and network resiliency when accessing our services, and it allows you to implement certain access controls over how your applications call the microservices. In a similar way, that's what we want from data gateways. We still want an abstraction layer on top of the implementation details of the data store, and a federated approach where we can access and address data in the same way, independently of how it is stored. From the application perspective, I just don't want to deal with all the implementation details of how the database is managed and implemented.
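The federated access just described can be illustrated with a tiny facade over two very different stores: a relational database and a key-value store. SQLite and a plain dict are stand-ins here, and the function name is mine, not from any gateway product:

```python
import sqlite3

# Two very different backing stores...
relational = sqlite3.connect(":memory:")  # stand-in for a relational database
relational.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
relational.execute("INSERT INTO customers VALUES (1, 'Ada')")

kv_store = {"1": {"status": "gold"}}      # stand-in for an in-memory key-value store

# ...hidden behind one federated lookup, so consumers never see which is which.
def get_customer(customer_id):
    """Federated view: combines both sources into one uniform answer."""
    row = relational.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    extra = kv_store.get(str(customer_id), {})
    return {"id": row[0], "name": row[1], **extra}

print(get_customer(1))  # one shape of answer, regardless of where the fields live
```

A real data gateway does this at the query-engine level rather than per-function, but the consumer-facing effect is the same: one uniform way to ask for data.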
It also allows us to boost performance when we suddenly need to scale. Data gateways can help us implement mechanisms like caching or materialized views, so that more services and applications can access data that is kept on infrastructure that perhaps cannot scale, say, a very old database, or a mainframe that cannot grow beyond its current capacity. Basically, the data gateway is just like an API gateway, but instead of focusing on the network layer (reaching the microservice you want and talking to those services under the same terms), it focuses on the data layer: it knows the data implementation details, connects and talks to each specific resource, and then offers that information in a homogeneous way to any consumer. That's basically how data gateways resemble API gateways. So what are these data gateway capabilities we're talking about? We mentioned a couple of them on the previous slide. We're looking for a piece of software that provides the abstraction layer, decoupling the service that requests the data from how the data store is actually implemented. It hides the implementation details: I just don't need to know if the data comes from a document database, an in-memory database, or even an API service; I can access it in exactly the same way. It hides implementation and abstracts the physical source of the information, and it also lets us add a security layer, where we take control of access to the data through the modeling of the data itself.
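Here is a minimal sketch of the caching idea: a read-through cache in front of a slow backing store, with the invalidation a change-data-capture feed would trigger. The dicts and function names are illustrative stand-ins, not the API of any real CDC tool:

```python
# A read-through cache in front of an expensive backing store.
slow_backend = {"customer:1": "Ada"}  # stand-in for a mainframe or legacy database
cache = {}
backend_hits = 0

def read(key):
    global backend_hits
    if key in cache:            # fast path: served from the gateway's cache
        return cache[key]
    backend_hits += 1           # slow path: one trip to the backing store
    value = slow_backend[key]
    cache[key] = value
    return value

def on_change_event(key, new_value):
    """What a change-data-capture feed (e.g. from Debezium) would trigger:
    the source changed, so drop the now-stale cache entry."""
    slow_backend[key] = new_value
    cache.pop(key, None)

read("customer:1"); read("customer:1")   # second read never touches the backend
assert backend_hits == 1
on_change_event("customer:1", "Ada Lovelace")
print(read("customer:1"), backend_hits)  # fresh value, one more backend hit
```

The legacy store is hit once per change instead of once per request, which is how the gateway lets capacity-limited infrastructure serve far more consumers.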
So instead of thinking only about accessing resources, as we do with an API gateway, we can focus on implementing policies around specific data, for example at the row or table level, so you get more fine-grained control over the database without having to implement all those policies in the actual database. You have a layer at the data gateway level where you implement all these policies, without having to go down to the actual implementation. And with mechanisms like caching and materialized views, you can scale your infrastructure to handle more load: instead of hitting a mainframe that is perhaps limited in capacity, you hit it once, populate the cache, and the cache then serves the responses requested from the data gateway. With the use of other components and patterns, such as change data capture through tools like Debezium, you can invalidate or update the cache whenever there's an update in the persistence layer, and keep these scalability features in your gateway. We also talked about federation: being able to access the data in a single way even though the implementation details differ across the sources. One approach that is very useful for applications is to keep a well-known language for querying the information, and a schema-first design of the contract, through standards like SQL.
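The row-level control described above is often implemented as a query rewrite: the gateway appends a policy predicate to the caller's query before it ever reaches the database. A minimal sketch, with SQLite as the stand-in store and an illustrative, made-up policy table:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)",
               [(1, "us", 10.0), (2, "eu", 20.0), (3, "us", 30.0)])

# Row-level policy: which region each caller may see. This lives in the
# gateway layer, not in the database itself.
POLICIES = {"us-team": "us", "eu-team": "eu"}

def gateway_query(caller, where=""):
    """Rewrite the caller's query, appending the policy predicate for them."""
    region = POLICIES[caller]
    predicate = "region = ?"
    if where:
        predicate += f" AND ({where})"
    sql = f"SELECT id, total FROM orders WHERE {predicate}"
    return db.execute(sql, (region,)).fetchall()

print(gateway_query("us-team"))  # only US rows come back
print(gateway_query("eu-team"))  # only EU rows come back
```

The database never needs per-caller accounts or row-security rules; the gateway enforces them uniformly across every backing store it fronts.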
So we have talked about the different capabilities of the data gateway; now let's go over some examples, because there isn't one single solution. There are different shades of data gateways. First there's the classic data virtualization layer, where you can find solutions like Composite or Denodo; these still have the same centralized, single-deployment approach that we used to have with the ESB back when we were talking about networking and SOA, doing this kind of virtualization where everything runs on the same infrastructure. You can also have federation within the database itself, with databases like Postgres that provide engines allowing you to offer access to external data through standardized connectors. There's also a rise in the use of GraphQL bridges to access data sources in a simple, unified way; this is pretty common for mobile devices or front-end applications that need to make several changes in the data store at the same time, and a bridge like GraphQL allows you to apply all those different changes through a single endpoint.
There are also cloud-hosted data gateways: for example AWS Athena, or AWS Redshift, where you have a single point of access to a data store that can be backed by S3 buckets or other components more proprietary to the cloud provider, but which still serves you through standards like SQL. And there are approaches that rely more on the network level: data proxies that just do secure tunneling to reach the backend services. You can use something like Google's Cloud SQL Proxy, or Skupper, a project based on the Qpid Dispatch Router, which lets you reach certain services on your network, for example in your Kubernetes cluster, through a proxy that can apply certain policies and forward the connection to your actual database; however, you miss some of the capabilities we said we expect from data gateways. Finally, there are open source data gateways available out there. Apache Drill is a project with a schema-free SQL query engine for NoSQL databases. There are other solutions like PrestoDB and PrestoSQL, a project started at Facebook that provides such an engine targeting mainly big data use cases, but still open source and available in the field. And finally there's the project sponsored by Red Hat that I mentioned at the beginning, called Teiid. It has been around for a while, but the current focus is on being able to deploy these gateways in a Kubernetes-native way.
When talking about Teiid: Teiid has different components, and the new focus of the project is based on the operator pattern, a Kubernetes operator that deploys these gateways as microservices. They act through connectors to establish communication with the data sources; there are different types of connectors for different types of data stores, so they can talk to relational databases, non-relational databases, APIs, or even object storage. They can also interact with policy control planes, where you define things like the type of security you want to apply and the role-based access control. The gateway then exposes that information through different types of endpoints: you can start a JDBC endpoint and connect with a JDBC driver to query information using traditional ANSI SQL; you can access the information through a REST endpoint; or, for applications that require a different type of access, there's an endpoint that can be used through ODBC. So you have different ways to interact with the gateway, ways that most applications are already prepared for, with the policies applied as part of the gateway pattern before reaching the data store. This is a way for data engineers to design access to their data, their data pipelines, data lakes, or data ponds, and to share that information in a secure, abstract, and homogeneous way with other applications. To finish, here's a great comment from a colleague of ours at Red Hat, Bilgin Ibryam, from the first part of the article he wrote about data gateways: data has gravity, it also requires control, it's hard to scale if you're using a traditional approach, and it's sometimes difficult to move between cloud infrastructures. So the data gateway is one of the components that makes it even clearer that we need to be able to deploy not just in one single way, not just in one data center or only in the cloud; what's emerging is the requirement for this kind of component at the hybrid cloud level, which is actually becoming a necessity. The data gateway is a pattern you can use to overcome the problems and challenges of the data layer, just as we did with API gateways at the network level. I really appreciate you spending these moments with me, and I hope you liked this session. It was just a brief look at how data gateways work, what capabilities we expect, and the different projects you can follow to see how to implement this kind of software component. You have some examples there; if you want to take a look, visit the Teiid project. It's still a work in progress, and they keep updating the different options; you can see some examples on their website. I hope you liked this session. I really appreciate the opportunity to be with you, and thank you very much. See you, bye.