Hello, everyone. Can you hear me? Yeah. Well, thank you all for coming. It's really an honor to be talking in Madrid. I'm from Madrid, but this is the first time I've talked here. I've been presenting at data conferences in different parts of the world, and I've been really fortunate in that. I learn a lot from other people at this kind of conference, so I feel it is my responsibility to give back a little to the community and explain my experience. I started working in the big data area around four years ago. I was asked to build the data platform, let's say the modern data platform, in Santander UK around September 2014. I left everything I was doing and started on my own, got a few people from the organization to start with me, and this is my last talk as part of Santander because I'm moving on to a new challenge. I'm leaving behind a team of 140 people that serves practically the whole organization: hundreds of analysts and more than a thousand people on our Hadoop platform. So we've moved quickly. We've broken a few things, but nothing major in the last four years, and we've learned a lot. So I wanted to share with you some of the things that I learned. This is my own personal take on the journey, obviously. I represent what we do in Santander UK, but it's pretty much my own opinion. So we operate mainly in three countries in the group, the three bigger banks in Santander: Spain, Brazil and the UK. I work in the UK only, OK? The Santander group right now is a little bit disaggregated. Each country goes on its own. In Spain, obviously, Santander is a really big bank. In the UK, we are the fifth bank in the market, which gives us enough size, but also a certain position to present ourselves as challengers. So we want to disrupt dynamics in the market and defend the customer. OK? This is what Santander is all about: creating value for customers and helping customers and companies prosper.
That's how we define our purpose. And in that journey, we decided around four years ago that we wanted to become a different company and transform into a data-driven company and a digital organization. OK? So that's a little bit of what I'm going to talk about. First, I'm going to start with our journey to give you some context of how we've moved. As I said, we've been around four years in the data transformation. A bit earlier than that, we started experimenting with new technologies and created an innovation area. And then we started testing the technology that was in the vanguard at the time. And since then, we've also embarked on an agile transformation and what we call a digital transformation. OK? So lots of transformations happen in parallel, which for me means that the bank realizes that we cannot just change at a normal pace. A transformation is something that changes more quickly than you would expect. It's not normal evolution. It's accelerated evolution. And so it creates new challenges. I'm going to talk about how it feels like a revolution, and what happens in revolutions. OK? So around four years ago, we started testing Hadoop. It was in the news. It was in the Harvard Business Review: data scientist was the sexiest job of the 21st century. And the technology was mature enough. It was open source, but there were also enterprise-grade distributions like Cloudera. So that's what we chose. We started testing it. We saw its potential to change significant dynamics that we had in the organization around data. We tested performance, scalability, things like that. And we decided to bet on it and start investing in it. And we started doing some analytics projects. OK? The analytics projects that we started doing were quite successful, even though we didn't really have tools on top of the platform. We were doing basically Impala queries, SQL, simple visualizations.
But we partnered very closely with the business. We partnered very closely with people that had the knowledge about the problems that we were trying to tackle. We applied some design thinking. We got into a room. We painted. We looked. We were very pragmatic about what we had available to us. And we created value very quickly. And that created a very good showcase for what analytics can achieve. And then a few months later, in July 2015, we were launching Apple Pay in the UK. So we created a real-time streaming application that goes to the customers. It is customer facing, it's called Spendlytics, and it gives analytics to the customers. It was a very advanced architecture at the time. It was a lambda architecture using the streaming technology of the time, Flume and Kafka, using HBase and exposing data to customers, giving the power of the data, or the value of the data, to the right owners, because the data belongs to the customers. That was our understanding at the time. And obviously, GDPR has validated that. It was very important and it was very well designed. So it became the showcase of what data can do, which is not just analytics. It revolutionizes the way you offer data to customers and what you can do with data, including in the channels that we offer to them. So from here on, once we had demonstrated that the technology can be put in production, can be maintained, can be scalable, can be replicated, highly available, and so on, we started opening up to the rest of the organization. That is what I call broadening. For the next few months, we started implementing a new operating model in which we were opening up the platform and the data for self-service, which is a very different challenge from implementing data for a closed area, for a closed analytics team that is very well defined, in an organization like Santander UK.
We are 20,000 employees, and we are aiming for everyone to have access to data at the right time, with the right information in their hands. So we ended up finding the most friendly business areas and partnering with them, getting their feedback, and building the foundations for the platform. And a lot of it was around data governance, which the previous talk was about. So definitely finding the right metadata in your organization is gonna be key. Putting security in as a first step, as a foundational step, is also key: without security, everything can break down. That was very important at the time and we have continually reinforced it in the foundations. And then we started scaling up. That's when we started opening up for the business to start doing things by themselves. And we had the challenge of growing more than 10 times in the volume of data that we were processing and the projects that we were doing. But obviously the number of people that we had could not grow 10 times. That is impossible. Basically, the talent is really scarce. So if you guys are doing big data, you are very rare in the market. So ask for more money from your companies. We had to be very effective so that we could scale the use of data and the value that the company was creating while keeping the headcount in the team as low as possible. Then we started democratizing. And democratizing for us means that as people start using the data by themselves, they need to start giving back to the project, to the platform. So we have our own tools so that when people are creating insights, they can publish them and they are searchable. Other people can discover them. And as they find issues in the data, they can report them and those reports are made available to other people. They have the right to use the data, but they also have responsibilities, okay? This is quite a different concept.
We call it data citizenship and it's very difficult to make work. And honestly, we haven't made it work properly in Santander UK yet, okay? The goal was to become data driven, and we haven't got there yet. If you see, I've put the phases year after year, and I am now beyond the 48th month of the journey and we haven't achieved this, because it requires a significant cultural change that we haven't been able to make. So data is a big asset. We are using it at scale, but we haven't got to the objective that we had, which was to change how people perceive data and what they do with it, okay? And that's the next part of the conversation today. So where we are is represented in a few slides. This is a view of the bank, okay? We split our organization into domains and we are representing here how many data sets we have in each of them. These are data sets that we typically bring to the platform every day. Some of them are parameters, so we bring them weekly or monthly or something like that. And we have more than 4,500 data sets now in the data platform. That's more data than we had in all the other data warehouses that we had in the past combined, okay? More data than ever, basically. But people still keep asking for more data because they don't seem to find what they want. Curiously, okay? Around 300 of those data sets are being streamed now, which is a big part of the evolution of the platform, the streaming architecture. I'm gonna talk about that a little bit. And we have implemented BI on top of the platform, very tightly integrated, MicroStrategy in this case, with very tight security, so we don't need to replicate data and we don't have any leaks. More than 500 analysts are now working in MicroStrategy and more than 1,000 users on Hadoop. As you have seen, we are serving the data to all our channels now.
So the app that I showed before, the online banking, the main mobile application, the branch systems, all of them receive data from the platform, but there are also around 1,000 users using it for operational purposes internally, okay? And in this drive to create a cultural change, I've been trying to explain to the organization what we do in the platform, and it's really hard. I don't know if you have experience with that, but people don't really understand data at all in our organization. So we talk about ingesting, we talk about streaming. Most people don't really understand any of those concepts, okay? They can hear about it a hundred times but they don't place it in their heads. So we ended up talking about these simple verbs, okay? And, more or less, this has helped in the conversation with many parts of the business. So we just say that we acquire data, we store it, we do computation on it, and we exploit it. Obviously we manage all of that. And, very critically, the data is nothing unless you understand what it means, right? So understanding the data is a key initiative that we have, and then when we find something of value, we serve it to anyone that wants to consume it. We expose APIs with it, okay? So that's what we call our conceptual architecture. And we open up all of those concepts into smaller verbs, as you can see here. So, streaming, obviously here we talk about ingestion, but we also talk about access, monitor, support, and in this part we talk about transforming and organizing the data in a common model. We analyze it, we iterate on the data, we create a model, et cetera, okay? And we also use this to show what technologies we use, in this next slide. So you can see, on top of the conceptual architecture, the technologies that we use.
Most of it is open source based, although we use the Cloudera distribution of Hadoop and we use some components, Kafka particularly, from the Confluent community version. And obviously we have some products that we buy. So we do use Ab Initio for metadata but also for change data capture, and MicroStrategy, and we have built our own products in some cases. Right, so we have a whole data platform now that we can use for analytics, for operational reporting and for regulatory reporting, and we use it to serve data to our channels, okay? So now the next stage, which we are in the middle of, is to move to cloud. We have already implemented, and I'm gonna show you now, the streaming architecture in a private cloud model using Docker and Kubernetes, and the journey in Santander UK is to try to do as much as possible in a cloud architectural paradigm. Using Red Hat OpenShift, by the way, okay? So this is our streaming architecture. At the end, in this top corner, we have the APIs that we serve to our channels, built with microservices on top of Apache HBase and Apache Phoenix, and we wanted to do that at scale. So we ended up buying Ab Initio because it's the most effective change data capture that exists now. We use Ab Initio to take the data out of our core banking system, which is an IBM mainframe, and as things happen in the mainframe, we stream them out of the mainframe, put them in Kafka and create a raw topic for each of the tables that we stream, so around 300 now. Then we transform those streams and merge topics, and all the processing that we do, instead of doing it in Kafka, we do in Red Hat OpenShift with Docker and Kubernetes.
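To make the flow just described a bit more concrete, here is a minimal sketch in plain Python. It is not the bank's implementation and it does not use Kafka or Ab Initio at all: the "topics" are just in-memory lists, and every name in it (`ChangeEvent`, `raw_topic`, `publish`, `latest_state`, `enrich`, the `ACCOUNT` and `CUSTOMER` tables) is a hypothetical stand-in. It only illustrates the idea of one raw topic per source table, fed by change events, and a downstream step that merges two topics.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChangeEvent:
    """One change captured from a source table (hypothetical shape)."""
    table: str  # source table name in the core system
    key: str    # primary key of the changed row
    op: str     # "insert", "update" or "delete"
    row: dict   # changed column values (empty for deletes)


def raw_topic(table: str) -> str:
    """One raw topic per source table; the naming scheme is an assumption."""
    return f"raw.{table.lower()}"


def publish(topics: Dict[str, List[ChangeEvent]], event: ChangeEvent) -> None:
    """Append the event to the raw topic of its table (stand-in for a producer)."""
    topics[raw_topic(event.table)].append(event)


def latest_state(events: List[ChangeEvent]) -> Dict[str, dict]:
    """Replay a topic in order and keep the latest row per key."""
    state: Dict[str, dict] = {}
    for event in events:
        if event.op == "delete":
            state.pop(event.key, None)
        else:
            state[event.key] = {**state.get(event.key, {}), **event.row}
    return state


def enrich(topics, fact_table: str, dim_table: str, foreign_key: str) -> Dict[str, dict]:
    """Merge two raw topics: attach the matching dimension row to each fact row."""
    facts = latest_state(topics[raw_topic(fact_table)])
    dims = latest_state(topics[raw_topic(dim_table)])
    merged = {}
    for key, row in facts.items():
        dim_row = dims.get(row.get(foreign_key), {})
        merged[key] = {**dim_row, **row}
    return merged


# Small demonstration with made-up data.
topics: Dict[str, List[ChangeEvent]] = defaultdict(list)
publish(topics, ChangeEvent("CUSTOMER", "c1", "insert", {"name": "Ana"}))
publish(topics, ChangeEvent("ACCOUNT", "a1", "insert", {"customer_id": "c1", "balance": 100}))
publish(topics, ChangeEvent("ACCOUNT", "a1", "update", {"balance": 80}))
account_view = enrich(topics, "ACCOUNT", "CUSTOMER", "customer_id")
```

In a real deployment the replay and merge would run as a stream processor in containers rather than over lists, but the shape of the logic, raw topics first and merged views derived from them, is the part this sketch is meant to show.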
So it's all streaming to our private cloud and streaming down again, and then we load it into the different databases. We have a polyglot architecture in which different databases are used for different purposes: we use HBase for real-time access, we use something called NuoDB, which is a distributed relational database and is gonna be key for our core banking, and we use Apache Kudu, or Cloudera Kudu, for streaming analytics. And this makes up the backbone of our future core banking. The core banking on the left is monolithic, so we have channels that contain the presentation layer, business logic and application layer all in the same application. On the right-hand side, we are building a reactive microservice architecture using APIs that serve the different channels and that can be reused, okay? And all of that is based on gradually adopting a reactive architecture using streaming, with Kafka as the database. For analysts, we are implementing what we call data service. The point is, if you want to implement self-service and you have hundreds of analysts, let's say we have 500 or so, and you have a single team doing data governance, as we were talking about before, or one single team doing data architecture, those things become bottlenecks, right? So, as a central center of excellence team, you need to remove yourself from the process and make it all automated and as a service. That's the idea. So, starting from the top over there, we have a portal where people are supposed to search first, to see if what they are trying to do already exists. If it does, you use it. If it isn't exactly what you want, then you are supposed to iterate on it and improve it. If it doesn't exist, then you find the data, right? That's another portal that we have, where we've published all the data that we have. We have documented it with metadata.
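The "search first" rule for analysts described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the two portals are modelled as plain Python lists, the matching is a naive keyword search, and all names (`CatalogEntry`, `search`, `next_step`, the example entries) are hypothetical rather than anything from the actual tools.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CatalogEntry:
    """A published insight or a documented data set (hypothetical shape)."""
    name: str
    description: str
    tags: List[str] = field(default_factory=list)


def search(entries: List[CatalogEntry], query: str) -> List[CatalogEntry]:
    """Return entries whose name, description or tags contain every query term."""
    terms = query.lower().split()
    hits = []
    for entry in entries:
        text = " ".join([entry.name, entry.description, *entry.tags]).lower()
        if all(term in text for term in terms):
            hits.append(entry)
    return hits


def next_step(query: str,
              insights: List[CatalogEntry],
              datasets: List[CatalogEntry]) -> Tuple[str, List[str]]:
    """Apply the search-first rule: existing work, then documented data."""
    found = search(insights, query)
    if found:
        # Something already exists: use it, or iterate on it and improve it.
        return ("reuse_or_improve_insight", [e.name for e in found])
    found = search(datasets, query)
    if found:
        # Nothing published yet, but the data is documented in the catalog.
        return ("build_from_datasets", [e.name for e in found])
    return ("nothing_found", [])


# Made-up catalog content for the demonstration.
insights = [CatalogEntry("card_churn_model",
                         "Churn propensity for credit card customers",
                         ["churn", "cards"])]
datasets = [CatalogEntry("mortgage_payments",
                         "Daily mortgage payment records",
                         ["mortgages"])]
decision = next_step("churn cards", insights, datasets)
```

The ordering of the two lookups is the design point: checking published work before raw data is what lets a central team step out of the loop without analysts rebuilding the same thing twice.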
And now you can search in that portal and find the data and how one data set relates to another. So we built a graph, which is kind of an ontology or a knowledge graph, although it doesn't go as far as that, of course. But it starts opening up the data for people to understand it by themselves without asking any expert or anything like that. Assuming you have the data, then you can use it. You can put it together, you can transform it. If you don't have the data, we are building what we call ingestion as a service with my friend Guillermo Almonathy-Rans. And the idea is that if you have the data in a different database that is not in the Data Lake, but you have the metadata of that data, then you can bring that data into the Data Lake automatically, because we have all the knowledge that we need to make the decisions about security, personal information, PII for GDPR, data architecture, and so on. So the user can bring the data automatically without talking to IT. Well, so now we have the data in the Data Lake. You can put it together according to the data model, and we are gradually building the data model of the bank in an open source model. So as projects create a data structure for a certain part of the bank, we plug it into the common data layer. And you can transform the data, then visualize the data, potentially model the data, and everything that you do, you publish back so the next person can find it, okay? And overall, where we are is that data is now considered a critical business capability, okay? I'm gonna talk a little bit about that in the next section. This relates as well to the presentation about the Data Owner that we just saw, if you were in this room. We have defined 12 data capabilities. They define what we do around data in the organization. And we now have an area called Data Services in Santander UK, which manages all of this.
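Going back to the ingestion-as-a-service idea mentioned a moment ago, where the metadata alone is enough to decide how a new data set comes into the Data Lake, a minimal sketch could look like the following. It is an assumption-heavy illustration, not the real service: the metadata shape (`ColumnMeta`, `DatasetMeta`), the `ingestion_plan` function, and the specific rules (mask PII columns, restrict access when PII is present, place the table under its domain) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ColumnMeta:
    """Metadata for one column (hypothetical shape)."""
    name: str
    data_type: str
    is_pii: bool = False  # flagged as personal information


@dataclass
class DatasetMeta:
    """Metadata for a data set that is not yet in the lake (hypothetical shape)."""
    name: str
    domain: str
    columns: List[ColumnMeta]


def ingestion_plan(meta: DatasetMeta) -> Dict[str, object]:
    """Derive ingestion decisions from metadata only, with no manual review step."""
    if not meta.columns:
        raise ValueError("cannot ingest a data set without column metadata")
    pii_columns = [c.name for c in meta.columns if c.is_pii]
    return {
        # Data architecture decision: place the table under its business domain.
        "target": f"{meta.domain}.{meta.name}",
        # Privacy decision: columns flagged as PII are masked on the way in.
        "masked_columns": pii_columns,
        # Security decision: restrict access when personal data is present.
        "access": "restricted" if pii_columns else "domain",
        "schema": {c.name: c.data_type for c in meta.columns},
    }


# Made-up example of a data set described only by its metadata.
example_meta = DatasetMeta(
    name="branch_visits",
    domain="channels",
    columns=[
        ColumnMeta("visit_id", "string"),
        ColumnMeta("customer_name", "string", is_pii=True),
        ColumnMeta("visit_date", "date"),
    ],
)
example_plan = ingestion_plan(example_meta)
```

The point of the sketch is the precondition: the function refuses to do anything without metadata, because metadata is the only input from which the security, privacy and architecture decisions are derived.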
Of those 12 capabilities, privacy and information security are managed by independent areas. The 10 on the left are managed by Data Services, and basically six or seven of these are managed by my current team. And this helps us organize the maturity of the capabilities in the organization and where the investment needs to be. We put a person in charge of each capability, and each of them works kind of as a separate startup within the organization, in that they have to provide value and provide a service to the rest of the organization and make their customers happy, okay? So how have we grown to the place where we are today in four years? First, we take the business capability, and I kind of say there is a part of it that enables learning and a part of it that enables value creation, okay? You are gonna create value with the data, the process and the technology, but obviously all of that needs to be operated and informed by people. So for us to be able to grow the capability, we need to grow both the learning and the value. Starting with the value, what you always hear in these conferences is, well, start small, create value and so on, yeah? Which is fine, it's true, and this is what you should be doing: find value, find something that is useful for your organization, fix that, show improvement in that, and someone will advocate for your initiative, and then you can keep solving more and more problems, okay? If you don't have enough people, well, find other people to help you who are committed and interested in solving that problem. If you give them the right tools and the right training, people start solving problems by themselves. That's the whole point of self-service, right?
So you keep growing by finding a use case. That use case typically has other uses in other parts of the business, so you can take the same use case and with a small change apply it to several problems, and you keep multiplying like that, almost like a tree that keeps opening up more branches, okay? Except that you need to be careful, because if you have very slim foundations and you keep opening more and more branches in that tree, there is a moment when the tree starts shaking because it's not strong enough. So as you grow, you need to continuously reinforce the foundations and improve your platform, okay? The other thing is that you need to keep learning, because no matter how effective you are, as you grow to the scale of a 20,000-person company with 40 million customers, you are going to end up needing to grow your team, and finding the right skills is really difficult. So this is the model that we use for a data science team, in which we bring together people from three different backgrounds, data scientist, software engineer and data domain expert, and we try to put them together so they learn from each other. If they have enough common knowledge, they can learn from each other. Typically, they don't have it, so then they don't learn, okay? We've tried this for two years and it's really difficult. So now we try training, we try collaboration with universities, we try software to help them, and we try metadata tools so people don't need to have so much knowledge of the data, because it's documented. But anyway, the whole point is that you need to keep growing the knowledge of your team, okay? If you manage to do both things, then you can grow by applying this model that we have used to some extent, which is mitosis. Think about a cell in biology: if it has enough sun and enough food, then that cell opens up in two.
There's enough cross-pollination, so it splits in two, and then we can put one part of the cell in one domain and another part in another domain, and you give them the opportunity to grow, right? So you've taken maybe six people, they have learned from each other for six months, and now they have enough knowledge to separate into two teams of three. You put three new people in each of those teams and they go their separate ways and start creating value in the organization in that way, okay? So basically, you started with six people and now you have 12, because you've plugged in other people that were already in the organization or that you brought from outside. The other way we've been growing is by creating product teams and thinking about them as startups. If you think about it, you have a capability, as I painted before: the capability has people, they have software and technology, they have data, and they manage some processes, right? So a capability and a product team are very similar concepts. What we do in a product team is we separate certain multi-functional people, we give them a challenge, we give them a value goal or they define their own value goal, and they try to deliver that value for the organization by themselves. And if they do it well, they create value for the organization and the demand that comes to them keeps growing, right? As they keep growing, they get feedback from the consumers, from the users of their service, and they keep growing their service and making it better. And there is a point at which there is no more value to be created. In that process of creating value, they may come up with more services and split up into several cells as well, okay? And at the end of the day, the organization ends up being hundreds of small different teams, which obviously creates the challenge of ensuring that all of them go in the same direction and with the same goals. So scaling agile is a big challenge.
Each of these teams should create a successful proposition, which is not necessarily easy, but we think about it in this way. Obviously, starting with customer value, it needs to be of value for the company as well. We try to be forward-looking and go past existing mindset restrictions into what the future will need and what the actual customer need is, not just what the customers declare, let's say. Think about customer experience. The market is talking about, instead of minimum viable product, minimum lovable product or something like that. So you need to add experience to the product so people really feel that the product you are creating has the value that you aim for, and try to demonstrate value quickly and reduce uncertainty, ideally differentiating the company from the rest of the competition. If you do that, you get some successes, and then you try to make sure that the successes are spread across the whole organization. So you get someone from the top, I don't know, an executive member, to sponsor your initiative and give you some money, but that's not enough. If that person leaves the organization, then potentially your initiative is finished. So you need to create support and advocacy at all levels, ideally in different business areas, among the managers of the data-intensive teams and the analysts, right? And hopefully you get some feedback from them and you learn from them. And if you have the ability to adapt to what they need, then you can continuously create a product that they value, and they are gonna help you in the journey by selling the successes for you. Okay. And then going back to biology, if you do that, then you have the ability to partner with the business.
So you can put some engineers from your team in a product team. Maybe in the first stage of that product team you need to put in three or four engineers, but then, if the business users are technology savvy, they start learning from your engineers, and you can start removing engineers and dedicating them to something else, until the product team is independent or they have kept maybe one of your engineers. That way you can multiply the number of product teams by partnering with the other areas of the organization, which is basically what we aim for and have somewhat achieved, although not to the scale that we would like, okay? Because there are challenges. The first challenge I already mentioned: you create quick value, but sometimes you grow too quickly and then your platform is not as robust as you would like. So you need to be worried about that and always be trying to make it as robust as possible. I would almost say also as simple as possible, because we tend to think about improving the platform by adding more complexity, and in many cases the best solution is to remove complexity and make it simpler, okay? Technology is evolving really quickly. So if you are looking for a tool and you don't make a decision within a year, likely by the next year there's gonna be another tool that is better than the one you thought you were gonna get. You need to be conscious that you start using, I don't know, Spark, and maybe in a few years you're gonna need a different tool. So you need to be ready for that and make the organization ready for that, because people are gonna feel that you are wasting money by investing in something when, a few years later, you need to invest in something else in the same space that is much better.
And obviously I think it's very important to always be looking at what the industry is doing, coming to these events, looking at the Apache community and other open source communities, and trying to experiment with the tools that are coming out, so you can select the best for you, okay? One key aspect for me is the people, right? It's really difficult to find good people, and you can have the best technology and whatever, but if you don't have the right people, nothing matters much. So it's really complicated to keep growing and creating value quickly while keeping people in a peaceful state, okay? I kind of joke that we've been running for four years in sprints, right? You can imagine Usain Bolt running a sprint of 100 meters, or you can imagine Mo Farah running a marathon, but you don't imagine Usain Bolt sprinting for 40 kilometers, right? And sometimes we are trying to do that, and the team gets burned out and you lose them. So there is a very important balance between working intensely and then having some break, or working on a certain special project that gives you more adrenaline maybe, but then working on another project that is slower paced, okay? You can have the best engineers in the world, but everyone ends up cracking under pressure if you apply too much. And then, as I said before, this is a transformation on top of another transformation on top of another transformation. So really it's a revolution, and in a revolution there are people that are gonna suffer. There is gonna be conflict. Sometimes you want to avoid it, but sometimes you need to seek it, because, I mean, there are very few revolutions that have happened without conflict, right? There are gonna be people at the top that are not needed anymore. There are people at the bottom that don't have the right skills and are not gonna get the right skills.
And the people that are driving the revolution, well, you are at the forefront of the revolution and sometimes you get shot, joking, of course, metaphorically shot, and you also leave the organization, because when you are driving the transformation, you also get burned in the process, right? So not everyone will survive it. If this is a real transformation, well, you are very young people, most of you, so maybe you are not gonna suffer it now, but maybe you suffer it in 10 or 20 years. Some people are not going to survive the transformation, okay? And the sooner we realize that, the healthier it's gonna be for the whole company, in my opinion. This is personal opinion, okay? And all of it is a cultural change. So it's very good to be creating value, but what are you trying to do? Are you trying to create value or are you trying to change the company? Or maybe you are trying to change the company through the creation of value, definitely, right? But if you are trying to change your company, you need to think about slightly different things, okay? And the last idea for me, which is something I added yesterday because of some things that I kept hearing, is about a few of these comments that you hear all around, okay? Everything should start with the business. Technology is not important. Big data, machine learning, artificial intelligence are just technologies. I mean, I don't know if you hear these things. I heard some of these in a couple of sessions yesterday. People don't mean exactly this, okay? But this is what they say. Everything should start with the business. It's not that everything should start with the business. It's that everything should start with the customer needs, okay? That's what they mean. But is the business in your organization tuned to the customer needs? Really? Do they really understand the customer needs? Do they need data to understand the customer needs? Or do they think they understand them when they don't? Because I don't think they do, right?
So not everything starts with the business. Everything starts with the customer, okay? That's a different thing. Technology is not important. I mean, seriously, it's not important? Everything is driven by technology, right? I mean, there are big oil companies because there is an engine that requires oil. That's technology, right? The internet is technology. I mean, the wheel is technology. It's a misunderstanding of what technology means, right? Technology is not a computer. That's a piece of technology, of course. But technology drives the evolution of the human being. So of course technology is important. And in the digital era, data is also very important. So I think what they say is that technology is not important, and what they mean is that it's not sufficient. Well, technology is not sufficient. You can have the best technology and use it for nothing. And if you don't create value, it's not gonna help you. Same with data. But customer value is also not sufficient. Nothing is sufficient in this world. You have to combine all of these things: have a strategy, add some innovation, differentiate your product, and definitely be efficient. Hopefully you think about the overall good of society and create value for your customer, and then maybe you get somewhere. But definitely bring technology and data into your strategy. Try to get that into your organization as much as you can, because that's the only way organizations are gonna survive, in my opinion. And that's it. Thank you very much.