Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager at DATAVERSITY. We would like to thank you for joining this DATAVERSITY webinar, which today is a case study, "Gordon Food Service Delivers Fresh Data to the Cloud," sponsored by Qlik. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A panel, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DATAVERSITY. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just note that the chat defaults to all panelists, but you can switch it to all panelists and attendees so you can chat with each other. To open the Q&A or chat panel, you can find the icons for those features in the bottom middle of your screen. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar.

And to introduce our team of speakers for today: Kyle Partlow, Tom Majewski, Adam Mayer, and Eric Thornton. Sorry, you guys, I'm getting tongue-twisted on your names. Kyle has led Gordon Food Service's data engineering team for the past two and a half years as part of the organizational goal of becoming an insights-driven organization. Kyle's team set out to build their revolutionary data platform to enable access and analysis of all enterprise data. Tom is an experienced technology leader with a broad range of experience, with recent interests including cloud and data. Three years ago he led an effort to research and develop a successful operational metrics concept hosted on GCP utilizing Qlik Replicate. Eric is the North American head of SAP customer engineering at Google. He has 20 years of experience in retail, tech, and SAP in varying formats. He spent nearly 15 years with SAP focusing on technology, retail, and account management. And Adam is on the global product marketing team at Qlik, covering the entire Qlik product portfolio. He is responsible for delivering the company's Internet of Things and GDPR go-to-market strategy, with a strong technical background in computing underpinned by an extensive engineering perspective. Adam is an avid follower of new technology and holds a deep fascination with all things IoT, particularly on the data analytics side. And with that, I will turn the floor over to Adam to get us started.

Hello and welcome. Hi, thank you very much for that great introduction, Shannon. So good morning, good afternoon, and good evening, wherever you are. Data is the lifeblood of any organization, and now more than ever it's really important to be able to keep that data circulating so that you can make important decisions on the freshest data available. So as Shannon said, I'm Adam from Qlik, and just to give you a quick overview of today's session: I'm going to talk to you about the challenges in moving data out into the cloud, particularly how you can overcome them and how we can help you get data into Google Cloud specifically, and then I'll hand over to Eric, who will tell you more about all the wonderful services that are available in Google to help you get more value from the data. And then we are going to hear from two of the greatest minds at Gordon Food Service on why they chose Qlik and Google to really enable and lead their enterprise data strategy.
And all the great improvements that they made from that, and then we'll wrap up and open the floor up to questions.

So moving data to the cloud is certainly not without its challenges, and there are a number of ways that you can go to build out your data pipelines. One way is the traditional ETL approach, extract, transform, and load, and that commands quite a high skill set, involves a lot of manual scripting and manual coding, and can take a lot of time to deliver. It can also be prone to human error. Over time, because of this, it doesn't tend to scale well, particularly as and when things change; not just changes at the source of the data, but also as you start to need to add new data sources. It can work quite well for relational databases, particularly when the target is in the same format as the source that you want to replicate, but it does get more complex when you want to add more data sources, especially those mission-critical systems such as your mainframes, Oracle, and SAP. And over time, that can start becoming prone to brittleness. So this can result in quite frustrated business users from slow delivery of data, essentially creating a workplace where reporting runs on out-of-date or, worse, inaccurate data, and that can lead to untrusted data. This is what's driving many organizations now to modernize their data integration approaches and shift towards more of a zero-coding environment as much as they can, which we've got on the right-hand side. This can really help you to simplify your most complex processes, particularly through automation, which can really help to remove the human error and start providing that scalability that you're looking for. And this can ultimately start to deliver more timely, accurate, and therefore trusted data into your data pipelines, even while you keep the lights on, and then it allows you to focus on those much higher-value tasks that are always going to keep coming in.

So in terms of the key drivers, what's driving modernization and particularly why modernize your data integration approach? Well, first off, starting from the left-hand side and moving across, the demand for real time keeps growing, and it's not only coming from within your organization but probably from outside as well; we're all used to instant results these days. And if the pandemic has taught us anything, it's that need for quick change, to pivot on a dime based on what's happening now as opposed to yesterday. So the fresher the data you can be working on, as near real time as possible, the better the data decisions you can make. Next, we're living in a hybrid world; there's a need to seamlessly move real-time data between many heterogeneous systems. Many of those are on premises, and then there's a need to connect those on-premise systems out into your cloud environments as you start to modernize, and even move data between clouds as well. And then there's a need to do that in as automated a way as possible, which can really help take the heavy lifting out of data ingestion, things like your mappings and creating data models, particularly for your data warehouse, and data updates out in the cloud as well. And you want to do that at scale as well, and start to embrace the fast-paced and ever-changing tech landscape.
So it's quite important then to work with partners who have deep expertise in data integration, particularly to the cloud, as well as modernizing out in the cloud, when it comes to your important data and analytics initiatives. And this is really where Qlik comes in. I'll just give you a very quick overview here. Many of you may know Qlik as an analytics and BI company; however, we are a data and analytics company now, and we're able to offer truly independent data integration solutions as well. We have had acquisitions such as Podium Data, as well as, more recently, Attunity, which joined the Qlik family back in 2019. So this means that Qlik is now really the only vendor to deliver a truly open, end-to-end platform that enables that data value chain that you see at the top there: the ability to free and find your data, and then to understand it and, more importantly, take action on it as well.

On the left-hand side we have our data integration platform. This is really what allows your DBAs and your data engineers to generate, deliver, refine, and merge data from all sources on the fly and then land that data where the business needs it to be. At the very foundation of this data integration platform is our log-based change data capture technology. This is the CDC streaming part that you see there, and that's what delivers real-time data from the wide breadth of sources depicted at the bottom: your databases, mainframes, on-premise systems, cloud systems, Oracle databases, and SAP systems, wherever they are, and then allows you to deliver that out to multiple targets. And then we also have two other solutions there, data warehouse automation and data lake creation. This is all about automating the process of managing and creating data warehouses and data lakes. So the Qlik data integration platform can support replication and migration of data out into a cloud environment like Google, and then you can use it with any BI tool of your choice, so it doesn't have to be Qlik. Of course, we believe that Qlik delivers the best analytical experience, and on the right-hand side you see that full range of analytical capabilities, from self-service analytics to reporting and alerting. And in the middle, just to call out, we've got our cataloging capability. This really sits between the two portfolios, and it actually bridges the world between IT and the business; it not only allows you to connect to data from the data integration capabilities, you can also bring in the source endpoints directly and then feed that into your analytics environment, whatever it is. So this flexibility and adaptability is ultimately the basis for why our customers and tech vendors alike are consistently choosing Qlik as their strategic partner of choice.

Now, today we're going to focus in on the data integration side, specifically that CDC streaming solution. Incidentally, that consists of two products, Qlik Replicate and Qlik Enterprise Manager, but we'll just focus in on the Qlik Replicate product. Now, this was formerly known as Attunity Replicate, coming from the acquisition. And I just want to pop the hood, if you like, underneath the architecture and briefly walk through how it works, because this is quite paramount in what Gordon Foods are going to be talking about. So Qlik Replicate is a web-based data replication tool typically configured as a middle-tier server.
What you can see there are the data sources on the left-hand side, and these can be on premises or out in the cloud. Qlik Replicate allows you to capture data from those sources and then automatically apply lightweight transformations, things like filtering, in flight before propagating the data to the wide range of targets we can support, depicted on the right-hand side. Now again, that could be on premises, but most commonly now it's moving data out into the cloud, like Google, for data warehouses and data lakes, as well as streaming and messaging platforms like Kafka. So with Qlik Replicate you can seamlessly replicate data from a full load, or batch load as it's known, and then it will automatically switch to capturing and replicating just the changes in the data source as and when they occur. And we do this wherever possible by allowing you to use agentless, log-based change data capture, so ultimately there's minimal impact on your source systems. From a transformation perspective, we do as much as we can in memory. That would be things like standardizing date formats, for example, or filtering capabilities, for example filtering multiple years down to a single year so you only move the most recent transactions, or filtering by region, or even things like obfuscating sensitive data such as PII during the transformations as well. Incidentally, additional, more complex transformations can also be applied further upstream using the rest of the Qlik data integration platform if you wish, such as the automatic creation of data marts and data vaults for your data warehouse and data lake creation, by taking advantage of the two other solutions I mentioned earlier, data warehouse automation and data lake creation, which consist of other products as well.

But going back to Qlik Replicate, there's a lot of flexibility in the data flows that you can configure. You could replicate data from one source out to many targets, or even vice versa, and you can migrate data from multiple disparate systems out into the cloud. And just before I move on, a key point about that persistence store you can see at the bottom of this diagram: this is not about storing the data that's actually being replicated. It's storing the configuration and the state. Each replication source and target that you configure in Qlik Replicate is seen as a single task, and it's the task configuration that we store here: your metadata fields and the tables you've selected, as well as the transformations that you want to apply. We save the state of the last replicated task here as well, so you can pick up where you left off in case of any interruptions, such as errors or pauses in the replication tasks. So you can think of that persistence store as storing bookmarks. Now, all of this can be done at scale, and we have many customers running hundreds of tasks in production systems, automating their data pipelines just like you're going to hear about from Gordon Food Service, and we will hand over to them very shortly. But before I do, this is just a very quick snapshot of all of the heterogeneous sources that we can support, here on the left-hand side, well over 40 endpoints, and that is growing, and we maintain that, you know, keeping the versions up and all this kind of stuff.
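To make the kind of lightweight, in-flight transformation Adam describes a little more concrete (standardizing dates, filtering down to the most recent transactions, masking PII), here is a minimal, purely illustrative Python sketch. This is not Qlik Replicate's actual interface, which is configured per task through its UI rather than in code, and the field names here are assumptions made up for the example.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical field names; a real source table would have its own columns.
PII_FIELDS = {"customer_email", "customer_phone"}


def transform(record: dict) -> Optional[dict]:
    """Illustrative in-flight transform: standardize a date, filter old rows, mask PII."""
    # Standardize the date format (assume the source sends MM/DD/YYYY strings).
    order_date = datetime.strptime(record["order_date"], "%m/%d/%Y").date()
    record["order_date"] = order_date.isoformat()

    # Filter multiple years down to the current year only.
    if order_date < date(date.today().year, 1, 1):
        return None  # drop the record from the replication stream

    # Obfuscate sensitive fields before the record leaves the source environment.
    for field in PII_FIELDS:
        if field in record:
            record[field] = "***MASKED***"
    return record
```

In Replicate itself the equivalent rules are defined declaratively as part of the task configuration rather than coded, but the effect on each record flowing through the stream is the same idea.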
And then on the right-hand side, you can see we support propagating to the full Google environment as well, so you can take full advantage of all those many Google services and deliver real-time data with that best-in-class CDC, change data capture, technology. So this is a good point then to hand over to Eric. We're proud to be a partner of Google, and you can see we've built out that extensive integration into Google to solve those data integration challenges that we discussed. So, Eric, over to you.

Great. Thank you so much, Adam. And again, like you, I'm thrilled to be here. I'm really looking forward to Tom and Kyle and their content. I do want to take a minute, though, and talk a little bit about what Google Cloud is doing as a hyperscaler supporting the needs of the enterprise. I run a group of people here in North America focused exclusively on SAP, but most often we're talking to SAP customers about migration and then how to handle data. So this is just a wonderful opportunity to talk about what our customers are experiencing, and also give some background on why data is so important to us. You can see big data is in our DNA. We have nine products with over a billion users, that's billion with a B, so when you think about YouTube and Gmail and Maps and Cloud and Photos and Android and Chrome, the Play Store, etc., we definitely have scaled over the last few decades, and we've maintained a commitment to managing that data, serving it up very quickly, and preserving the security of it. So those are examples of some scalability in the past.

Let's look at some of the innovations, which I'm really excited to be a part of. In addition to my background with SAP, I did spend some time in big data with Hadoop, as well as AI and ML, and so I certainly used Kubernetes and some of the other tools, for lack of a better analogy. And you can see the commitment over the last few decades of Google delivering white papers and projects that were donated to open source, things like Bigtable and MapReduce and Dremel, Spanner, and Kubernetes. So innovation is very, very key to the deliverable from Google Cloud, and many of our customers look towards migrating and managing data and then taking advantage of that innovation opportunity. So that's an exciting opportunity and a great journey that we like to take with our customers.

Let's go ahead and go to the next slide and talk about our mission statement. It is important to focus on this because it keeps us true to our North Star. What we want to achieve is accelerating organizations and their ability to transform through data. You can see how it highlights there: we want to provide innovation and the infrastructure and platform, and there's been a big focus lately on industry solutions. We're spending a lot of time focusing on the industry-specific needs of our customers and helping enable the business processes that are unique to those industries. So that is a commitment that we make, and it is carried throughout our deliverables to our customers and partners.

So thinking about what SAP does with Google Cloud on the next slide, what we see many SAP customers doing is migrating into the cloud, moving into a hyperscaler, taking that historically on-premise-resident application, which really is the nerve center of their organization. And in order for them to achieve success there, they need security. They need to manage a hybrid landscape. They certainly want to manage that, where appropriate, with serverless.
And like I mentioned before, in addition to migration, they want to start taking advantage of intelligence: artificial intelligence, machine learning, big data, visibility, KPIs, metrics, all the things that are helping organizations drive their success in the marketplace. Many times we'll find their roots back in an SAP system, but we also have to make sure that we're accommodating non-SAP and unstructured content. We'll talk about that in a few minutes, but I did want to emphasize why Google Cloud is so important for SAP customers who are looking at migration into a HANA system or moving into an ECC on HANA system. So we're very proud of the work we've done here, and we're very happy to see customers benefiting from that as they go live with their SAP system in Google Cloud.

Let's look at the next slide then and think about where the data resides. So as we see customers with transactional systems, and as we see SAP landscapes that are driving innovation and driving visibility, etc., there's a need for other unstructured content, things like social media, news, events, weather, all those different data signals that can support the specific decision a customer needs to make to change a business process or to meet the needs of their specific industry. Retailers reacting to the demands of a consumer, or changing the supply chain routing based on weather events. Google BigQuery gives us a fully managed, cost-effective, highly scalable enterprise data warehouse, so it's a great opportunity to put a data lake right next to your SAP BW system. And what we're excited about is the improvements here. Back in 2016, if you queried a petabyte in BigQuery, you would get results in about four minutes. Not horrible, but certainly not responsive and certainly not meeting the needs of a business professional wanting to make a decision. When the paper I was referring to was written, back in 2019, the results were down to 12 seconds, and now we're down to four seconds. Querying a petabyte gets you information at your fingertips, and it also opens up the opportunity to blend that in with algorithms and predictive responses, fully enabling a business professional to get full information for making a business decision. So we're very proud of the work that customers are doing with BigQuery and seeing how they can innovate based on that.

Let's move to the next slide then and think about how, in conjunction with the big data, we can start applying algorithms. Google has long been focused on artificial intelligence, and many of us use this in the language section there in the center. If you've ever taken a foreign trip, you may have downloaded the Google Translate app, and what's brilliant about that is you can speak in English, convert to a different language, and carry on a conversation. Real innovation, and very compelling and very powerful. That's using a very large trained deep neural network running in the Google Cloud that you can tap into through APIs. That continues through Natural Language and Translation and AutoML. That's just on the language side. Over on the right then, with conversation, you can take that and start to work on text-to-speech or speech-to-text, or work in the Dialogflow component. The end result is being able to automate communication and automate processes that are so important for keeping consumers engaged and providing feedback on systems. What's really exciting on the left there is the sight side of things.
Visual intelligence and AutoML Vision and AutoML Video. The Vision API is very interesting, and it's opened up a lot of doors for us and for customers who are applying AI algorithms to business processes. Let's go ahead and take a look at the next slide, which is a drill-down into the Vision API. In our manufacturing segments, we see aerial inspections and asset management: being able to send a drone up to a very remote or very out-of-reach system and do inspections. Or in healthcare, providing a second opinion on radiography or X-rays, and here you can see a scan with healthcare and medical image analysis there on the right-hand side. Really exciting ways to tap into that API and that artificial intelligence algorithm and combine it with the needs of a business running their organization: being able to track what the customers are looking for, what consumers are demanding, which supply chain routes need to be adjusted. A really exciting use of the Vision API, and one example of what consumers are doing in the marketplace and what customers are doing to adapt to their specific needs with the Google Cloud environment.

Let's move to the next slide then and talk about where this all connects. We're going to hear from Tom and Kyle shortly. They've got a lot of different backend systems, including SAP. They're using some data lakes, they're applying some machine learning where appropriate, but at the end of the day, this is what we see customers take as a model to pursue. They're migrating their SAP and integrating that in with data lakes and then applying machine learning and artificial intelligence to get to that end result: a very proactive answer to a specific business question that helps them innovate and helps them drive their business forward. Again, that's petabyte scale, data analytics, accessible machine learning, and it's all available at your fingertips with this Google Cloud environment.

Looking at the next slide then, you can see most customers, specifically SAP customers, have said they do have standard pains. They have challenges they're working with. The most common survey result on the question of what their top analytics pain is: data integration and how to resolve it. When we work with customers, we spend a lot of time looking at what their landscape definition is, what their plans are for migration, and how they're going to achieve the needs that they have, and integration is a top part of that discussion. Having gone through the background of what Google can provide as an infrastructure to this entire conversation, I'm pleased to transition over to Kyle and Tom, who can take us through the next steps in talking about what Gordon Food Service has done in this environment. So with that, I'll send it over to you guys.

Thank you very much. Thank you, Eric, and thank you to our partners for providing the opportunity to share the GFS story here today. As the slide says, I'm Tom Majewski, leader for the data and application shared service groups here at Gordon Food Service. I'll be joined later in the presentation by my colleague Kyle Partlow, but first I'll be taking you through a little background on our organization and the history of the technology relative to our discussion today, and then Kyle will drive through how Qlik and Google have helped realize our data strategy over the last couple of years. Why don't we jump ahead? Two slides. Perfect.
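As a brief aside before the Gordon Food Service story: the Vision API Eric describes a moment ago is exposed through ordinary client libraries, so tapping into it can be as small as the sketch below. This is only an illustration; the bucket and image path are placeholders, and label detection is just one of the annotation types the service offers.

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Placeholder image location; any Cloud Storage URI (or raw bytes) works.
image = vision.Image()
image.source.image_uri = "gs://my-bucket/inspections/frame_001.jpg"

# Ask the pretrained model what it sees and print the labels with confidence scores.
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, round(label.score, 2))
```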
Before we dive in, let me give you just a quick background on Gordon Food Service. First of all, we're a North American food service distributor. Our roots trace back over 100 years to a horse and buggy peddling butter and eggs. Technology back then was pretty basic. Geographically, we operate nationally across Canada and primarily east of the Mississippi River in the US, wrapping around through Texas and the South. We serve many food service industry segments throughout our customer base, including things like education, healthcare, independent operators, chains, entertainment, hospitality, and several more. If I call your attention to the images at the bottom, you might note that our industry, Gordon Food Service included, has been embracing change and innovating solutions for decades. We started with a horse and buggy, there you can see a gas pickup truck, and eventually we're on to automated distribution centers servicing multi-state regions. Data is just one of the next opportunities for us. Could we move on to the next slide?

Prior to 2005, the majority of our data resided on our mainframe. By 2015, 10 years later, it had spread throughout hundreds of Oracle and SQL Server databases within our data center, as well as to many SaaS offerings out in the cloud. Managing the sprawl had become a challenge for data architecture, as well as a technical opportunity when needing access to disparate data sources for reporting or other reasons. Fortunately, we had the foresight to recognize a coming need for data consolidation to support the growing demand wave for analytics. Can we get the next slide?

Let's talk for a moment about the needs as we saw them for managing data replication in 2015. At that time, we had a lot of custom replication scripts that moved data from one place to another to try to bridge the gap we had with operational or reporting systems that needed data they had no easy way to access. And as the many sources of data grew, so also grew the challenge of bridging that gap. We knew that we needed a solution that could be managed within our existing data services team, which was composed of DBAs and data architects. We did not want to grow the size of our staff to manage more products, so there were some things that were important to us. We believed, first, that an intuitive user interface that abstracted the complexity would be foundational. As the quantity of tasks and the volume of replicated data grew, we felt that access to the platform's metadata for incorporation into our own repository would be essential to decision-making and answering questions in an efficient manner. While we didn't know for sure where our journey would lead, we believed that the platform would need to have a vision for supporting just about any data endpoint we could ask for. We were concerned about the possibility of creating performance impacts within the source data stores, so we needed a minimal footprint, which included log-based replication versus the use of triggers. We also knew that in order to make the platform fit into our own internal processes and controls, we would need access to an API. Could we go to the next slide?

In 2016, our research had focused in and we were ready to select a partner. Our final candidates included Attunity and, we'll call them Candidate B, a large red technology company known for their databases and a sailboat. We found that the base replication capabilities were essentially equal between Replicate and Option B.
However, Attunity, now Qlik, represented a vision and a consolidated product management direction that aligned with our vision for how GFS would like to manage data replication. Attunity was ultimately selected, and the partnership began with an initial goal to simplify and increase the reliability of that on-premise Oracle database replication that I spoke about. We felt that would be the fastest way to show immediate value and also build muscle with endpoints we were already familiar with. Other endpoints, such as SQL Server, Postgres, and eventually GCP services and SAP HANA, were prioritized later. Along the way, we developed a number of tools and integrations, as well as built the knowledge necessary to manage a diverse installation. The API available through QEM, Qlik Enterprise Manager, was helpful in building a lot of extension services. One of the first things we did was build a password change process for endpoint service accounts. Later, as non-administrative teams began creating tasks, it became more essential to have a process that permitted promotion of those tasks through pre-production environments, similar to change deployment for software. The API was helpful here as well. We have also built a metadata extraction process that gives us visibility into objects and tasks and, ultimately, the overall data lineage through our replication tasks. Next slide, please.

Let's take a look at an early timeline for the GFS adoption of replication technology. In 2015, we began our research into data replication vendors, and in 2016, we began our partnership with Attunity as well as replacing our existing custom replication processes. In 2017, we successfully used Replicate to migrate databases from our legacy SPARC Solaris platform to x86 Linux platforms, minimizing the downtime of dependent business applications during that process. In 2018, we executed a GCP-based CRM insights initiative using Replicate to source the data from many data sources within the GFS data center and move it out to Cloud SQL as well as Kafka for event triggers. This consolidated data source in GCP made it really easy to perform operational analytics that could alert sales to opportunities right within their CRM platform, or provide information not previously integrated into an easily accessible location. While not shown on this slide, in 2019 we began sourcing Replicate data from our SAP S/4HANA instance, which is, incidentally, living out in Google. You will see more of this in the next half of the presentation. There will be a number of options available for your use case, and your organization will need to understand the trade-offs of each. Our partners here are well-versed, they helped us a lot, and they'd be able to help you as well. Now, I'm going to turn things over to Kyle, who will take you through the next part of the GFS story. Kyle?

Yeah, just confirm everybody can hear me. Yep, we can hear you, Kyle. Excellent. Hey, thanks, Tom. Thanks, Google and Qlik, for hosting us. So, like Tom mentioned, I'm kind of picking up where he left off in regards to our Attunity/GCP journey at Gordon Food Service. About two and a half years ago, we undertook our data transformation strategy, and one of the first decisions we needed to make was who was going to be our cloud provider. I think it's pretty obvious which three organizations we evaluated, and I think we pretty quickly eliminated AWS, because who doesn't compete with AWS nowadays.
And then when it came down to it, we believed that five years from when we did the evaluation, 90% of what cloud providers provide would be table stakes. Everybody can do it. Everybody can do it at the same speed. But where the cloud providers will differentiate themselves is how well they can do machine learning and advanced analytics, and we saw that what Google had was far and above what other providers had, and our hypothesis has just become more and more true over the two and a half years we've experienced with them. So, the first decision was to go with Google Cloud Platform. Shortly after that, we started looking with Tom's team at how we could leverage the Attunity Replicate tool to get data from our systems into the data lake. As Tom mentioned, we have a long history of applications; both fortunately and unfortunately, we are not digitally native. So, we have a lot of systems where the only way to get data is through database replication, as much as we'd prefer to get it through APIs or other means. So, through the evaluation with Tom's team on Attunity, we chose to go forward by using those two technologies to kick off our data transformation strategy.

So, we spun up the team. We started with two and we've only grown since. I'm going to go over 2020, when we turned on our Attunity Replicate pipeline, which I'll talk more about on the next slide. Shortly after that, in late 2020, we ingested our first SAP table, which was a marquee day for us; we had been doing Oracle and SQL Server for a long time. We chose to go with Google instead of SAP BW for just the same reason: we wanted to leverage the best-in-breed machine learning and AI, and we wanted all of our data to be in the same place; we didn't want to spread it across two technologies. And then, last, I'll go over some of where we're at today with this pipeline. So, advance the slide. Excellent.

So, I'm going to start with where we are now, and then I'm going to go into how we got here. The picture, some of you may be wondering what that is. Personally, I have never actually seen Willy Wonka, but apparently it's one of my data engineers' favorite movies. If you have seen it, I guess there's this guy named Augustus Gloop who eats a lot, which I think is probably the easiest way to describe it. So, when we got around to naming this Attunity Replicate/GCP data pipeline that we built, the team came up with Willy Wonka and named it Gloop. And surprisingly, it has gotten a lot of traction all the way up the organization and gives the team a little bit of leeway to make some fun names. We actually, coincidentally, have some scripts that we run to maintain our Gloop pipeline, and they call those the Willy Wonkas, I don't have the name quite right, the Oompa Loompas, I'm sorry. And again, spare me, because I've never actually seen Willy Wonka, but they call them the Oompa Loompas because they help Augustus eat all of the data.

So, where are we now? We can now replicate data from any on-prem database, HANA, Oracle, or SQL Server, into the data lake, which is hosted in GCP in BigQuery. I think it's important to note that we went after structured data first, which is why we're able to leverage BigQuery; as we get into more unstructured data, we'll leverage other technologies like GCS in GCP.
One of the most exciting things is that we have, and this isn't completely true, but for the most part, sub-second latency from the time a change happens in the source database to when that change is reflected in the data lake. We also automatically handle schema updates, which was a big deal for us. Like Tom mentioned, we have hundreds if not thousands of databases. If we had to take things down, adjust the schema manually, and then spin it back up, we would have a team of data engineers doing only that. So, our schema updates happen all the time and our pipeline automatically processes them: Replicate sends that event, we make the change in BigQuery, and then data continues to flow from there. We also have a pipeline that's self-healing and can automatically scale with demand using Dataflow as well as Pub/Sub. And so, I can talk a little bit more about some of the statistics we have around just how little it takes us to maintain a solution that provides so much value to the organization. The other key component here is that we have all change data records for all databases. What that means is that we have a record for every time a row changes. So, we can tell you the exact state of the database at any point in time since we turned on Attunity Replicate for that particular table. We can also tell you, for a database that is very update-heavy, how often an attribute of an item changes, even if the application itself doesn't store the audit records for all of those changes. So, advance to the next slide, please.

Okay, so this is a slide that has gotten a lot of traction at GFS, which is kind of talking about our data modernization strategy. On the left, I think anybody in a non-digitally-native organization can relate to this kind of spider web of integrations: an operational data store, sharing data across a bunch of different applications, and all of these tertiary places outside, like our EDW and vendors and customers that need data, which has put us in a position where we don't even know what we have, we don't know who's calling what, who's using what data. Where we're moving towards is this data hub, data platform concept. The idea is we have all of these source systems that we have internally, as well as external systems that we use to run our business. All of that data funnels into one place through data pipelines, through APIs, through files, and then we provide that data to downstream consumers through our API layer. It could be other internal applications, customers, vendors, our North American EDW, and even third-party consumers. In this presentation I'm going to specifically talk about the piece that I drew a red dotted box around and where we're going; it would certainly take a whole other hour to talk about the things on the bottom half of this chart. Advance to the next slide, please.

Okay, so, Gloop is born. We actually had multiple versions of this, and I tell you what, I'm super glad that we kept going and kept innovating and kept iterating on this solution. So, first, Tom's team, with all the legwork they'd done in the three years prior to the data strategy kicking up in earnest, already had Qlik, Attunity Replicate at the time, connected to many of our on-prem databases. So, thankfully, that work had been mostly done.
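To make the change-record idea Kyle describes above a little more concrete: with one row stored per change, the current version of a table can be reconstructed by a view that keeps only the latest change per business key. The sketch below is purely illustrative and is not GFS's actual schema; the project, dataset, and column names (row_id, op, change_ts) are assumptions.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical change-log columns: business key row_id, CDC operation op
# (I, U, or D), and change_ts taken from the replication header.
view_sql = """
CREATE OR REPLACE VIEW `my-project.data_lake.customer_current` AS
SELECT * EXCEPT (rn)
FROM (
  SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY row_id ORDER BY change_ts DESC) AS rn
  FROM `my-project.data_lake.customer_changes`
)
WHERE rn = 1
  AND op != 'D'  -- hide rows whose most recent change was a delete
"""
client.query(view_sql).result()
```

Adding a filter such as change_ts <= '2021-01-01' inside the inner query would give the kind of backdated, point-in-time view Kyle mentions on the next slide.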
So, during the build of this, not only were we trying to build this, we were also trying to stand up our platform, introduce governance, and train up engineers that hadn't traditionally worked in GCP, as well as build the team. We also use a tool called Kafka in order to connect with Pub/Sub. But essentially, Attunity Replicate monitors the database logs; every time a change happens, it sends it through Kafka and then subsequently Pub/Sub. We use Dataflow to automatically process that. And one of my favorite dashboards is our Dataflow monitoring dashboard in GCP, which basically monitors how many messages are in the Pub/Sub topic and then automatically scales the Dataflow job accordingly. And it does it without us knowing. So, it's really cool to wake up, look at that dashboard, and see that we got a bunch of changes last night: Dataflow calmly scaled up, calmly scaled down, and then moved on to the next day. We then store all of the change records in BigQuery, and I defined the change records on the previous slide. This is where the data lake starts for us. We also have views that we set on top of the change data logs, which basically, if you query the view, will just give you the current version of the database. And we also have the ability to backdate that view if someone is interested in looking at the database as of a particular date. We then use the EDW, I'm sorry, Composer, which is essentially managed Airflow in GCP, to do our ETL. And then we also host our North American EDW in GCP, in BigQuery, as well. All right, next slide.

Okay. So, like I mentioned, that initial slide was just Oracle and SQL Server data. GFS is undergoing a large-scale SAP implementation, and this happened right at the beginning of our data strategy. One of the key reasons that we chose both Google and Attunity Replicate, or Qlik Replicate now, is because they had the ability to connect to SAP. And like I mentioned earlier, we chose not to go back and use SAP BW because we wanted everything to be on a common platform, which is GCP, and Attunity gave us the ability to have a native connector to SAP to get that data. So, because we're going through a build, we have, I'm not sure how many, but at least six SAP environments through the build process, and we're able to connect Attunity to all of them; our friend Gloop can process those and then put them into either our pre-production or production data lake. And because we're going through a build of SAP as well as our new North American EDW, it's really important to have the delineation between those environments. The other important part here is that Attunity Replicate, I'm sorry, Qlik Replicate, actually makes the change record look exactly the same from source system to source system. So, an SAP change record looks exactly the same as Oracle change records, which look exactly the same as SQL Server's, which makes it much easier on our side to build one thing that can then process data from any data type that, I'm sorry, Qlik can connect to. Okay, next slide, please.

Okay, so today, we do have multiple different pipelines, and kind of the calling card for our data engineering team is to build as few data pipelines as possible to cover the maximum number of source systems. Gloop is by far our largest. It was also our first, so it has a special place in my heart, and it handles all of our database replication. It handles the largest volume of data. Using other technologies in GCP, we also built both an API pipeline and a file pipeline.
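For readers who want to picture the streaming leg of Gloop that Kyle just walked through (Replicate to Kafka to Pub/Sub to Dataflow to BigQuery), here is a minimal, hypothetical sketch using the Apache Beam Python SDK, which is what Dataflow runs. It is not GFS's code: the subscription and table names are invented, and the real pipeline also handles the schema-change messages Kyle mentioned.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Hypothetical names; substitute your own project, subscription, and table.
SUBSCRIPTION = "projects/my-project/subscriptions/replicate-changes"
CHANGE_TABLE = "my-project:data_lake.order_changes"


def run():
    # streaming=True makes this a long-running streaming job.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadChangeEvents" >> beam.io.ReadFromPubSub(subscription=SUBSCRIPTION)
            | "ParseJson" >> beam.Map(lambda message: json.loads(message.decode("utf-8")))
            | "AppendToChangeLog" >> beam.io.WriteToBigQuery(
                CHANGE_TABLE,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
            )
        )


if __name__ == "__main__":
    run()
```

When a job like this runs on the Dataflow runner in streaming mode, the autoscaling behaviour Kyle describes (workers scaling up and down with the Pub/Sub backlog) comes from the service itself rather than from anything in the pipeline code.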
So basically, our API pipeline can do one of two things. It can be scheduled to say, hey, call this API once every 10 minutes, and it will get whatever information is necessary and then store that data in BigQuery, and it can natively handle the JSON message there. And then we also have a file pipeline, which also uses other GCP technologies, where basically any source system can drop a CSV into a GCS bucket; our data pipeline can then process that and store it in BigQuery. So the big advantage, and the specific way that we built all of these pipelines, is that all of the data is in the same place. And I think when we think back to before our data strategy really kicked off, that was a huge challenge: just getting the data in the same place and using analytical tools like BigQuery and MicroStrategy and Data Studio to analyze it, instead of analyzing in Excel, which I think most people on this call can probably relate to.

Some of the key statistics around Gloop: from the first time we turned it on in production, we went over a year without a single production issue. We never had to take the Dataflow job down, or the Pub/Sub topic. We've ingested 54 subject areas. We have 929 tables. We've got 12.7 terabytes of data. We also have access to essentially all of our systems through Gloop that are on-prem or on HANA. And what we found is that by having our CDC pipeline plus the API and file pipelines, we cover 98% of any source system that we need to get data from, whether we own the application or not. It takes us less than a day to add new data, which is one of the key reasons we spent so much time and money building it the way that we did: we knew we were going to get hundreds and hundreds and hundreds of data requests, and we needed to be able to fulfill those quickly. We also have 21 data labs, which are basically sandboxes for users to interact with this data, and we've built a team of 20-plus data engineers over the past two and a half years. Next slide, please.

So where are we headed now? We really spent the first 18 months of our data strategy getting data into the data lake and introducing some degree of governance so that people can use it. Now that we really have our arms wrapped around the data pipelines, and I have to add that any data engineer on my team that was here would slap me on the wrist for saying the data pipelines are done, so I always remind them that they are functional and we will continue to improve them. But now that we've got our arms wrapped around that, we're really turning our focus to how we get value out of the data that we've ingested into the data lake. Fortunately, we had the backing of the Gordon family as well as the executive team to build things the right way, to scale 100x, not just do one analytics project, which I think is a major key to our success and a differentiator of GFS's data strategy. So right now we're going to do that in one of two ways: one, through our enterprise data warehouse that we're working on building now, building one place where we can basically evaluate the performance of the business and have our executives have one version of the truth, where that is a major struggle for GFS today. We're also working on building our advanced analytics teams, and as we build our data lake, we have continued to think about the different personas that are going to be using our product, and data scientists were certainly one of them.
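Going back to the file pipeline Kyle describes just above (a source system drops a CSV into a GCS bucket and the data ends up in BigQuery), the core step can be as small as a single load job. The sketch below is illustrative only: the bucket, file, and table names are made up, and in practice the load would normally be kicked off by the bucket's object-creation event rather than run by hand.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical bucket path and target table.
uri = "gs://my-landing-bucket/vendor_feed/items_2021-06-01.csv"
table_id = "my-project.data_lake.vendor_items"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # skip the header row
    autodetect=True,       # infer the schema from the file
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Load the file straight from GCS into BigQuery and wait for completion.
client.load_table_from_uri(uri, table_id, job_config=job_config).result()
print(client.get_table(table_id).num_rows, "rows now in", table_id)
```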
So as that team builds out, we're going to continue to get feedback from them and adjust the way that we build the platform. Next slide, please. Okay, that is it. I really appreciate Google's and Qlik's time, and I'm glad that Tom and I were able to share our story and how we're able to use these technologies to drive our data strategy.

Awesome. Thanks, Kyle. That was amazing. Love that, and thank you for your and Tom's time as well, taking time out of your busy schedules to tell us that amazing story. I'm loving Gloop. I'm amazed you haven't seen Willy Wonka and the Chocolate Factory, so after this call I'm going to send you a link on Google movies and you can pick which one you want to watch, whether it's Gene Wilder or Johnny Depp. Showing my age, I'm a Gene Wilder dude, but that was awesome. Thanks, Adam. Thank you.

Just to recap then, as we start to wrap up. So yeah, as you can see there, the Qlik data integration platform can really help you unlock your most valuable data from many heterogeneous data sources, whether it's a database, a SQL database, a mainframe, that big red company that has the sailboat, or SAP. Whatever it is, you can deliver that data from those mission-critical systems and then continue to build those automatic pipelines to deliver the freshest data in a reliable way out into Google Cloud. In addition to that, we can help you make it analytics-ready as well, if you want to automate the data models and data marts for data warehouse and data lake creation out in the cloud. And then that gives you access to all of those Google services that we talked about today, and you can apply whatever analytics tool you want to use on top. So you heard a great example and a good story there from Gordon Foods, and we have many others that are able to scale out their data and analytics workloads onto Google Cloud.

So with that, I just want to end with an offer that we built from having a very deep strategic partnership with Google. We've got this joint offering that is essentially a free proof of concept that not only includes software but also the expertise to help you accelerate the management and delivery of one of your most important data sources, your SAP data, out into Google BigQuery. It's called the SAP Jumpstart solution. It supports a wide range of SAP application data sources, whether it's a legacy SAP environment, HANA, or application servers. And the program is designed to help you accelerate and simplify your SAP data delivery for real-time analytics on Google BigQuery. As you'd expect, it utilizes our Qlik data integration platform to help provide that real-time data pipeline to ingest and automate the delivery of analytics-ready data out of the SAP systems and into BigQuery. So there's a link there to help you get started; all you have to do is fill out a form and we will take care of the rest for you. So with that, I will wrap it up. I'll leave the links there. You will get a copy of the presentation, as Shannon said. And once again, a big thank you to all the speakers: thank you, Eric, from Google; Kyle and Tom from Gordon Foods, we really appreciate you sharing your story there and are really looking forward to seeing all the great things that come out of your team. And I'll say thank you to everybody on the call for your time. And we can open the floor up to questions, Shannon.

Thank you all so much for this great presentation and for tying it all together. There's some great stuff here, so let's get to the most commonly asked questions.
Just a reminder, I will send a follow-up email to all registrants by end of day Thursday with links to the slides and the recording. So, diving into the questions here in the Q&A section, feel free to type them in there. Let me just start with this: given the impact of COVID-19 on business, especially with regard to the supply chain, how has the transition to the cloud facilitated your organization's ability to manage essential supplies as well as the use of forecasting tools?

You want to try to attack that one, Kyle? I mean, my take on that is that's not really a piece that we've attacked yet when it comes to managing essential supplies or forecasting tools within the cloud. Yeah, I apologize, which question was asked? Given the impact of COVID-19 on businesses, especially with regard to the supply chain, how has the transition to the cloud facilitated your organization's ability to manage essential supplies as well as the use of forecasting tools? Yeah, I think I'm probably not going to quite answer your question, but I think, like many organizations, COVID has really accelerated many digital transformations that may have already been in motion. So while I'm not a supply chain expert, what I can say is that we've recently gone through our long-range planning, and we came out with many of what we call must-win battles. And I think the realization the organization had was that almost every single must-win battle had a dependency on the data must-win battle. So I think that just reiterates, like Tom was saying, that data is the next frontier in terms of GFS's transformation, and the supply chain and forecasting and using data to make better supply chain decisions and operational efficiency are a key part of that. Yeah, so maybe it's highlighted the need. The cloud has not yet facilitated our ability; it's driving us there. I love it. Thank you.

So on slide 29, how do you do automatic schema updates? And how do you do data updates? Replicate has the option of sending schema updates within the data payload, so we ingest that information on the Gloop end of things, and then there's magic in the code to apply those changes. Yep, Tom's absolutely right. Our friend Gloop, when it gets a schema message, pauses the data ingestion, processes the schema message, adds the column or deletes the column, and shout out to Google for adding some additional DDL capabilities there, and then continues processing the data. And we don't even know it happens. I'm not quite sure, what was the other part, how do you do data updates? Was that related to the schema update question? Yeah, so how do you automate the schema updates? Yeah, I think that's what we covered. Okay. Sorry, it's Adam here, really quickly on the data updates, and this might just be stating the obvious, but one thing Replicate can do is automatically switch to change data capture. That's what the question is referring to: as and when the data changes at source, Replicate sends those changes out into the target, and it switches to that automatically through configuration.
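A rough illustration of the schema-update flow described in that answer: when a schema-change event arrives, the pipeline pauses ingestion for the affected table, applies the corresponding DDL in BigQuery, and then resumes. The sketch below shows only the DDL step, and the event format, field names, and table names are assumptions for the example, not the actual message Replicate emits.

```python
from google.cloud import bigquery

client = bigquery.Client()


def apply_schema_change(event: dict) -> None:
    """Apply a hypothetical schema-change event, e.g.
    {"table": "orders", "column": "route_code", "type": "STRING", "action": "ADD"}."""
    table = f"`my-project.data_lake.{event['table']}`"  # assumed dataset name
    if event["action"] == "ADD":
        ddl = f"ALTER TABLE {table} ADD COLUMN {event['column']} {event['type']}"
    elif event["action"] == "DROP":
        ddl = f"ALTER TABLE {table} DROP COLUMN {event['column']}"
    else:
        return  # ignore event types this sketch does not handle
    # Wait for the DDL to finish before resuming ingestion for this table.
    client.query(ddl).result()
```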
Perfect. So, would you elaborate on the buy-in process at all levels of the organization? We get this question a lot: how do you get the C-suite to buy in and understand this is important? What was the most challenging aspect of getting that?

This is a great question, and I understand why people ask it. Fortunately for us, I think we're in the minority, and we were lucky, because we forecasted a need for these kinds of tools and that the demand was coming several years ago. It really came from our C-suite a few years ago that we needed to pursue this angle. So, to his credit, our CEO was interested in the transformation to digital business, and so we didn't have to get the buy-in at the senior leadership level; they pushed it downward. Yeah, I totally agree with Tom. We've talked to many other companies where, number one, it's from the bottom up, and number two, it's IT-driven, whereas we had the exact opposite: like Tom mentioned, it was business-driven and it was from the top down. One of the other key notes I'd add as far as getting buy-in is, number one, a huge shout out to the Gordon family. It wasn't so much a, hey, this sounds like something we should do; it was, if we want to be around for the next 125 years, then we need to be leaders in this space and we need to double and triple down. The other key advantage we have is that Gordon Food Service is a privately owned company, and I will bet we are making more five- and ten-year bets than most companies in this space for that reason. The last thing I would add is, like I said, we started out with two engineers, and Tom's team, I think, had three at the time that kind of started up this data transformation. We started small. We got value. I think what we did a really good job at is telling the value story and saying, we did this small thing; if we were able to 10x it, imagine the value we'd get from there. So every step along the way, every piece of value that we got, the story was told, and the company continued to plow more money into our strategy to build the team and build the platform. Adam, anything you want to add to that, that you've seen from people, on how to get buy-in?

Yeah, definitely. I mean, it's great when you get an organization like Gordon Foods that's really forward-thinking and you can have it from the top down, because that's really how that kind of transformational change needs to happen. And in forward-thinking organizations, that's what's driving the need for the role of the CDO coming in. One thing that we've seen a lot, and from personal experience as well, is to try and get a senior stakeholder. You don't necessarily have to go to the top if you can't get there, but if that is one of your challenges and one of your hurdles, depending on how your organization is structured, get a senior stakeholder involved, prove the value exactly as Kyle was saying there with something that's going to resonate and hit home, find your pain point or lowest-hanging fruit, and then use that senior stakeholder to work your way up, as it were. And when people see the value, that's when you can start creating that buy-in and that kind of knock-on domino effect and get more people in.

Yeah, I think Adam makes a great point, and piling on, what's interesting about the executive mandate is that it's not that uncommon. Most organizations are driving towards an executive mandate, so having a vocal CEO, like in the case of Gordon Foods, or a chief data officer or what have you, there are going to be executive mandates to live up to.
When that's not available, though, it's always helpful to have a digital journey: break up the journey into achievable segments with easily measurable outcomes and add innovation. Especially with an SAP migration, many times it's looked at as a simple lift and shift when there are huge opportunities to add value at every single step. Don't just bring over an ECC system and add transactional reporting; bring that over and add AI, add some visualization, add some metrics and outlooks. So yeah, I think everybody on the phone has made a great point about getting buy-in by showing steps along the way of that digital journey, and it can be very powerful. And at the end of the day, it's an impact to the business. It makes our lives better, not the other way around. So it's a very exciting time to be a part of those discussions.

I love it. Well, I'm afraid that is all the time that we have for today. I will get the remaining questions over to Qlik. But thank you all for this great presentation. Thank you to Qlik for helping make these webinars happen. And of course, thanks as always to our attendees, who are so engaged in everything we do; we love the questions that have come in. Just again, to remind everybody, I will send a follow-up email with links to the slides and the recording of this session by end of day Thursday. So thanks, everybody. I hope you all have a great day. Thanks, guys. Thank you. Thanks, everyone. Thanks, everybody.