From around the globe, it's theCUBE, with digital coverage of enterprise data automation, an event series brought to you by Io Tahoe. Hi everybody, we're back. This is Dave Vellante, and we're covering the whole notion of automating data in the enterprise. And I'm really excited to have Paula D'Amico here. She's the Senior Vice President of Enterprise Data Architecture at Webster Bank. Paula, good to see you. Thanks for coming on. Hi, nice to see you too. How are you doing? Yeah, so let's start with Webster Bank. You guys are kind of a regional bank, I think New York, New England, I believe headquartered out of Connecticut, but tell us a little bit about the bank. Yep, Webster Bank is a regional bank, Boston, Connecticut and New York, very focused on Westchester and Fairfield Counties. It's a really highly rated regional bank for this area. It holds quite a few awards for being supportive of the community, and it's really moving forward technology-wise. They really want to be a data-driven bank, and they want to move into a more robust group. Well, we've got a lot to talk about. So data-driven, that is an interesting topic, and in your role as Senior Vice President of Enterprise Data Architecture, you've got a big responsibility as it relates to transitioning to this digital, data-driven bank. But tell us a little bit about your role and your organization. Right. Currently today, we have a small group that is just working toward moving into a more futuristic, more data-driven data warehouse. That's our first item. And then the other item is to drive new revenue by anticipating what customers do when they go to the bank, or when they log into their account, to be able to give them the best offer.
The only way to do that is if you have timely, accurate, complete data on the customer, and know what's really of great value to offer them, whether that's a new product, or helping them continue to grow their savings or grow their investments. Okay, and I really want to get into that, but before we do, and I know you're sort of partway through your journey, you've got a lot to do, but I want to ask you about COVID, how you guys are handling that. I mean, you had the government coming down with small business loans and PPP, a huge volume of business, and data was at the heart of that. How did you manage through that? We were extremely successful, because we have a big, dedicated team that understands where their data is, and we were able to switch much faster than a larger bank to offer the PPP loans to our customers at lightning speed. And part of that was that we adapted Salesforce very quickly; we've had Salesforce in-house for over 15 years. That was the driving vehicle to get our PPP loans in, and then we developed the logic quickly. It was 24-7 development: roll it in, get the data moving, help our customers fill out the forms. And a lot of that was manual, but it was a large community effort. Well, the thing about that too is the volume was probably much, much higher than the volume of loans to small businesses that you're used to granting. And then also the initial guidelines were very opaque. You really didn't know what the rules were, but you were expected to enforce them. And then finally, you got more clarity. So you had to essentially code that logic into the system in real time, right? I wasn't directly involved, but part of my data movement team was, and we had to change the logic overnight. So on a Friday night it was released, we pushed our first set of loans through, and then the logic coming from the government changed.
And we had to redevelop our data movement pieces again, redesign them, and send them back through. So it was definitely kind of scary, but we were completely successful. We hit a very high peak. I don't know the exact number, but it was in the thousands of loans, from little loans to very large loans. And not one customer who applied through the right process and filled out the right data failed to get what they needed. So that is an amazing story, and really great support for the region, New York, Connecticut, the Boston area. So that's fantastic. I want to get into the rest of your story now. Let's start with some of the business drivers in banking. I mean, obviously online. A lot of people have joked that many of the older people who kind of shunned online banking, and would love to go into the branch and see their friendly teller, had no choice during this pandemic but to go online. So that's obviously a big trend. You mentioned the data-driven data warehouse; I want to understand that. But at the top level, what are some of the key business drivers that are catalyzing your desire for change? The ability to give a customer what they need at the time when they need it. And what I mean by that is that we have customer interactions in multiple ways, right? I want the customer to be able to walk into a bank or go online and see the same format, have the same feel and the same look, and also be able to get the next best offer for them, whether they're looking for a new mortgage or looking to refinance, whatever it is. They have that data, we have the data, and they feel comfortable using it. And that's an untethered banker attitude: whatever my banker is holding, and whatever the person is holding on their phone, it's the same, and it's comfortable.
So they don't feel that when they've walked into the bank they have to fill out different paperwork compared to just doing it on their phone. Yeah, you actually want the experience to be better. And it is in many cases. Now, you weren't able to do this with your existing, I guess, mainframe-based enterprise data warehouses. Is that right? Maybe you can talk about that a little bit. Well, we were definitely able to do it with what we have today, the technology we're using. But one of the issues is that it's not timely. You need a timely process to be able to get the customers to understand what's happening. You need a timely process so we can enhance our risk management, so we can apply it to fraud issues and things like that. Yeah, so you're trying to get more real time. I mean, the traditional EDW is sort of a science project. There's a few experts that know how to get at it. The demand that lines up is tremendous, and then oftentimes by the time you get the answer, it's outdated. So you're trying to address that problem. So part of it is really the cycle time, the end-to-end cycle time, that you're compressing. And then there are, if I understand it, residual benefits that are pretty substantial from a revenue opportunity: other offers that you can make to the right customer, that you may know through your data. Is that right? Exactly. It's driving new customers to new opportunities, it's enhancing risk management, it's optimizing the banking process, and then obviously creating new business. And the only way we're going to be able to do that is if we have the ability to look at the data right when the customer walks in the door, or right when they open up their app. And by creating more near-real-time data for the data warehouse team, that's giving the lines of business the ability to work on the next best offer for that customer as well. Paula, where are we with data sources these days?
Are there data sources that you maybe had access to before, but the backlog of ingesting and cleaning and cataloging and analyzing was so great that you couldn't tap some of those data sources? Do you see the potential to increase the data sources, and hence the quality of the data, or is that sort of premature? Oh no, exactly right. So right now we ingest a lot of flat files from our mainframe-type front-end system that we've had for quite a few years. But now we're moving to the cloud, from on-prem to off-prem, moving off-prem into like an S3 bucket, where we can process that data and get that data faster by using real-time tools to move it into a place where Snowflake could utilize it, or we can give it out to our marts. Right now we're still in batch mode, so we're doing 24 hours. Okay, so when I think about the data pipeline and the people involved, maybe you could talk a little bit about the organization. You've got, I don't know if you have data scientists or statisticians, I'm sure you do. You've got data architects, data engineers, quality engineers, developers, et cetera, et cetera. And oftentimes practitioners like yourself will stress about, hey, the data's in silos, the data quality is not where we want it to be, we have to manually categorize the data. These are all sort of common data pipeline problems, if you will. Sometimes we use the term DataOps, which is sort of a play on DevOps applied to the data pipeline. Can you just sort of describe your situation in that context? Yeah, so we have a very large data ops team, and everyone who is working on the data part of Webster Bank has been there 13, 14 years. So they get the data, they understand it, they understand the lines of business.
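An editorial aside for readers: the flat-file-to-S3 staging Paula describes, mainframe extracts landed as semi-structured data a warehouse like Snowflake can ingest, might look something like this minimal Python sketch. The fixed-width layout and field names here are invented for illustration, not Webster's actual record formats.

```python
import json

# Hypothetical fixed-width layout for a mainframe customer extract:
# (field name, width). A real layout would come from the source system's copybook.
LAYOUT = [("cust_id", 6), ("name", 10), ("balance", 8)]

def parse_record(line):
    """Slice one fixed-width record into a dict using LAYOUT."""
    rec, pos = {}, 0
    for field, width in LAYOUT:
        rec[field] = line[pos:pos + width].strip()
        pos += width
    return rec

def to_json_lines(flat_file_text):
    """Convert a flat-file extract to newline-delimited JSON, a format
    object stores like S3 hold easily and warehouses like Snowflake load."""
    return "\n".join(
        json.dumps(parse_record(line))
        for line in flat_file_text.splitlines() if line.strip()
    )

sample = "000123Jane Smith00150000\n000456Raj Patel 00022500"
print(to_json_lines(sample))
```

In a batch pipeline, the resulting JSON-lines text would be written to an S3 key and loaded from there; the actual upload and load steps depend on the tooling in use.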
So right now, we have data quality issues just like everybody else does, but we have processes in hand where that gets cleansed, and we're moving towards that; and there was very much siloed data. The data scientists are out in the lines of business right now, which is great, because I think that's where data science belongs. What we're working towards now is giving them more self-service, giving them the ability to access the data in a more robust way, and it's a single source of truth. So they're not pulling the data down into their own Tableau dashboards and then pushing the data back out. They're going to, I don't want to say a central repository, but a more robust repository that's controlled across multiple avenues, where multiple lines of business can access that data. Does that help? Got it, yes. And I think that one of the key things I'm taking away from your last comment is the cultural aspect of this: by having the data scientists in the lines of business, the lines of business will feel ownership of that data, as opposed to pointing fingers and criticizing the data quality. They really own that problem, as opposed to saying, well, it's Paula's problem. Right. Well, my problem is I have data engineers, data architects, database administrators, right? And then traditional data reporting people. And some customers that I have, business customers in the lines of business, they want to just subscribe to a report. They don't want to go out and do any data science work, and we still have to provide that. So we still want to provide them some kind of, you know, regimen where they wake up in the morning, they open up their email, and there's the report that they subscribed to, which is great, and it works out really well.
And one of the reasons why we purchased Io Tahoe was so I would have the ability to give the lines of business the ability to do searches within the data, and to read the data flows and data redundancy and things like that, and help me clean up the data. And also to give it to the data analysts. They just ask me, they want this certain report, and it used to take four weeks: okay, well, we're going to go and look at the data, and then we'll come back and tell you what we can do. But now with Io Tahoe, they're able to look at the data, and then in one or two days they'll be able to go back and say, yes, we have the data, this is where it is, this is where we found it. This is the data flow that we found also, which is what I call the birth of a column: it's where the column was created, where it went to live as a teenager, and then where it went to die, where we archived it. It's this cycle of life for a column, and Io Tahoe helps us do that. And data lineage is done all the time, and it just takes a very long time, and that's why we're using something that has AI and machine learning in it. It's accurate, it does it the same way over and over again. If an analyst leaves, you're able to utilize something like Io Tahoe to do that work for you. Did that help? Yeah, got it. So a couple of things there. In researching Io Tahoe, it seems like one of the strengths of their platform is the ability to visualize the data, the data structure, and actually dig into it, but also see it, and that speeds things up and gives everybody additional confidence. And then the other piece is that essentially infusing AI or machine intelligence into the data pipeline is really how you're attacking automation, right? And you're saying it's repeatable, and then that helps the data quality, and then you have this virtuous cycle. Maybe you could sort of affirm that and add some color perhaps.
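An editorial aside: the "birth of a column" idea, tracing a column from where it is created to where it is archived, is at heart a walk over a lineage graph. Here is a toy Python sketch of that idea; the table and column names are made up, and real lineage tools like Io Tahoe discover these edges automatically rather than having them hand-declared.

```python
# A toy lineage graph: each edge maps an upstream (table, column)
# to its downstream (table, column). Names are purely illustrative.
LINEAGE = {
    ("core_banking", "cust_nbr"): ("staging", "customer_id"),
    ("staging", "customer_id"): ("warehouse", "customer_id"),
    ("warehouse", "customer_id"): ("archive", "customer_id"),
}

def trace_column(start):
    """Follow a column from where it is born to where it is archived,
    returning the full path through the pipeline."""
    path = [start]
    while path[-1] in LINEAGE:
        path.append(LINEAGE[path[-1]])
    return path

for table, column in trace_column(("core_banking", "cust_nbr")):
    print(f"{table}.{column}")
```

Doing this by hand across a 15-year-old warehouse is the "very long time" Paula mentions; automating the edge discovery is where the machine learning earns its keep.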
Exactly. So let's say that I have seven lines of business that are asking me questions, and one of the questions they'll ask me is, we want to know if this customer is okay to contact, right? And you know there are different avenues. You can go online and say, do not contact me. You can go to the bank and say, I don't want email, but I'll take texts, and I want no phone calls. All that information. So seven different lines of business ask me that question in different ways. One says okay-to-contact, another one says customer one, two, three, all these. And each project, before I got there, used to be siloed. So one customer would be a hundred hours for one analyst to do that analytical work, and then another analyst would do another hundred hours on the other project. Well, now I can do that all at once. I can do those types of searches and say, yes, we already have that documentation, here it is, and this is where you can find where the customer has said, no, I don't want to be contacted by email, or, I've subscribed to get emails from you. Got it. Okay. And then I want to come back to the cloud a little bit. So you mentioned S3 buckets. You're moving to the Amazon cloud, at least, and I'm sure you're going to have a hybrid situation there. You mentioned Snowflake. What was sort of the decision to move to the cloud? Obviously Snowflake is cloud-only; there's not an on-prem version there. So what precipitated that? All right, so I've been in the data and IT information field for the last 35 years. I started in the U.S. Air Force and have moved on since then. And my experience with off-prem was with Snowflake, working with GE, with GE Capital. And that's where I met up with the team from Io Tahoe as well. So it's proven. So there are a couple of things. One is, Informatica is known worldwide for moving data, right? They have two products, the on-prem and the off-prem.
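Stepping back to the okay-to-contact example for a moment: the consolidation Paula describes, seven lines of business each recording consent differently, comes down to merging per-system preferences into one canonical answer. A minimal Python sketch, assuming each system's record is a simple channel-to-boolean mapping (the field names and the "most restrictive wins" policy are illustrative assumptions, not Webster's actual rules):

```python
def merge_contact_prefs(lob_records):
    """Merge per-line-of-business contact preferences into one answer.
    Policy assumed here: most restrictive wins, so an opt-out recorded
    in any system blocks that channel for everyone."""
    merged = {}
    for rec in lob_records:
        for channel, allowed in rec.items():
            merged[channel] = merged.get(channel, True) and allowed
    return merged

prefs = merge_contact_prefs([
    {"email": False, "text": True},   # branch system: customer opted out of email
    {"text": True, "phone": False},   # online banking: no phone calls
    {"email": True},                  # marketing platform (stale opt-in)
])
print(prefs)  # {'email': False, 'text': True, 'phone': False}
```

The point of centralizing this is exactly what Paula says: one search answers the question once, instead of seven analysts reconciling it a hundred hours at a time.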
I've used the on-prem and the off-prem. They're both great, very stable, and I'm comfortable with them; other people are very comfortable with them too. So we picked that as our batch data movement. We're moving to probably HVR, it's not a total decision yet, but we're moving to HVR for real-time data, which is change data capture: it captures the changes and moves them into the cloud. And then, envisioning this in the future: you're in S3 and you have all the data that you could possibly want. And that's JSON, all that, everything sitting in S3, ready to move through into Snowflake. And Snowflake has proven stability. You only need to learn and train your team on one thing, and AWS is completely stable at this point too. So all these avenues, if you think about it: S3 is your data lake, even though it's not a traditional data lake that you can touch, like a Hadoop. And then from there into Snowflake, and then from Snowflake into sandboxes, so your lines of business and your data scientists can just dive right in. That makes a big win. And then using Io Tahoe, with the data automation and also their search engine, I have the ability to give the data scientists and data analysts a way to get accurate information, completely accurate information about the structure, without needing to talk to IT. It would be right there. Yeah, so talking about Snowflake and getting up to speed quickly, I know from talking to customers, you can get from zero to Snowflake very fast. And then it sounds like Io Tahoe is sort of the automation layer for your data pipeline within the cloud. Is that the right way to think about it? I think so. Right now I have Io Tahoe attached to my on-prem, and I want to attach it to my off-prem eventually.
So I'm using Io Tahoe's data automation right now to bring in the data and to start analyzing the data flows, to make sure that I'm not missing anything and that I'm not bringing over redundant data. The data warehouse that I'm working off of is on-prem, it's an Oracle database, and it's 15 years old. So it has extra data in it. It has things that we don't need anymore. And Io Tahoe is helping me shake out that extra data that does not need to be moved into my S3. So it's saving me money when I'm moving from on-prem to off-prem. And so was that a challenge prior because you couldn't get the lines of business to agree on what to delete, or what was the issue there? Oh, it was more than that. Each line of business had their own structure within the warehouse, and then they were copying data between each other, duplicating the data, and using that. So there could possibly be three tables that have the same data in them, but used for different lines of business. Using Io Tahoe, I've identified over seven terabytes in the last two months of data that has just been repetitive. It's the same exact data, just sitting in a different schema. And that's not easy to find if you only understand the one schema that's reporting for that line of business. So that's been, yeah. More bad news for the storage companies out there. So far it's cheap. That's what we were telling people, you know, store it, it's cheap. And it's true, but you still would rather not waste it. You'd like to apply it to drive more revenue. And so I guess let's close on where you see this thing going. Again, I know you're sort of partway through the journey. Maybe you could describe where you see the phases going, and really what you want to get out of this thing down the road, midterm, longer term. What's your vision for your data-driven organization?
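A brief editorial aside before the closing question is answered: the redundant-data hunt Paula describes, the same rows copied into different schemas for different lines of business, can be sketched as content fingerprinting. This is only an illustration of the idea, not how Io Tahoe actually works; table names and data are invented, and a real tool would also handle renamed columns and partial overlaps.

```python
import hashlib
import json
from collections import defaultdict

def table_fingerprint(rows):
    """Hash a table's contents in a row-order-independent way:
    serialize each row with sorted keys, sort the rows, then digest,
    so identical data in different schemas yields the same fingerprint."""
    canonical = sorted(json.dumps(r, sort_keys=True) for r in rows)
    return hashlib.sha256("\n".join(canonical).encode()).hexdigest()

def find_duplicates(tables):
    """Group table names by content fingerprint; any group with two or
    more members is a candidate for consolidation before migration."""
    groups = defaultdict(list)
    for name, rows in tables.items():
        groups[table_fingerprint(rows)].append(name)
    return [names for names in groups.values() if len(names) > 1]

tables = {
    "retail.customers":  [{"id": 1, "nm": "A"}, {"id": 2, "nm": "B"}],
    "lending.customers": [{"id": 2, "nm": "B"}, {"id": 1, "nm": "A"}],  # same data, reordered
    "hsa.accounts":      [{"id": 9, "nm": "Z"}],
}
print(find_duplicates(tables))  # [['retail.customers', 'lending.customers']]
```

Shaking out duplicates like this before the move is where the migration savings Paula mentions come from: every redundant terabyte skipped is storage and transfer not paid for.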
I want the bankers to be able to walk around with an iPad in their hand, access data for that customer really fast, and give them the best deal that they can get. I want Webster to be right there on top, able to add new customers and to serve our existing customers, who have had bank accounts since they were 12 years old and now are multi-whatever. I want them to have the best experience with our bankers. That's awesome. I mean, that's really what I want as a banking customer. I want my bank to know who I am, anticipate my needs, and create a great experience for me, and then let me go on with my life. So Paula, great story. Love your experience, your background, and your knowledge. Can't thank you enough for coming on theCUBE. No, thank you very much. And you guys have a great day. All right, take care, and thank you for watching everybody. Keep it right there. We'll take a short break and be right back.