Hello everyone, and welcome to our next EDW session, titled Disciplined Self-Service BI Enables Advanced Analytics, which will be presented by Marius Moscovici, the CEO of Metric Insights. All audience members are muted during the session, so please submit questions in the Q&A window on the right of the screen. Our speaker will respond to as many questions as possible at the end of the talk. The Q&A will be hosted by Natalie, who you can also see on your screen. Please note that there's a linked form at the bottom of the page titled EDW Conference Session Survey. This is where you can submit your session feedback, and we encourage you to do so. So let's begin our presentation now. Thank you, and welcome, Marius.

Thank you. Thank you, Jim, and welcome to our presentation. Today I hope to go through self-service and what's involved in really making it successful, and hopefully leave you with some specific examples, tips, and ideas that you can implement within your environment, irrespective of what technologies or processes you choose to use. So let's start by talking about what self-service is. Intuitively, in the enterprise you've achieved self-service if you, as a consumer of information, can get that information without having an intermediary between you and the information. But that really breaks out into two different definitions, depending upon who you are within the organization. For the analyst, self-service is about data: how do I get to data that I can trust and generate analysis that is correct and accurate, that can be shared out with the business? But for consumers of the information, for the business users, it's really about insights: how do I find the answer to my question, find the right dashboard, find the right report, and get my question answered without having to go and bother an analyst to dig and figure this out for me?
So it's important to recognize that there are these different constituencies, and for each of them self-service means something very different. Let's break that down a little further. If you think about your business users and the challenges they have with self-service, the biggest issue is typically that they're overwhelmed by a whole slew of irrelevant dashboards and reports. If you go inside almost any mature BI environment, there are lots of reports, multiple BI tools, and even within a given BI tool there are often two, three, or four versions of something. Some things are obsolete, some are current, some have correct business definitions, some do not, and it's a huge challenge for the casual consumer of information, somebody who just wants to spend a minute or two and quickly get an answer, to figure out what they should be looking at. Whether they're digging through bookmarks or through emails, these are all huge barriers to self-service in the enterprise. Then, if you flip to the other side of the coin and look at the analysts, they have two challenges. First, they're spending a lot of their time answering questions from business users who are frustrated, don't want to try to figure out the information themselves, and find it easier to just send an email or pick up the phone and ask. But then on their own, when analysts are doing analysis to find answers, they need to figure out: what is the data that they can trust? Which dataset do I use? Does this contain the right data? Is the data current? Do I have the right information to be able to pull this data out correctly, with the right logic? And so these challenges are there for both consumers and producers.
We've come up with a general governance life cycle that I want to share with you, one we've found to be very effective at serving both of these constituencies. But first I want you to consider the fact that when we think about building content in business intelligence, we're using a very limited paradigm. Typically we use the builder's paradigm: figure out what needs to be created, design it, implement it. That makes sense in the context of a specific report or dashboard that needs to be built, but it doesn't really work in the context of the overarching ecosystem that represents your BI infrastructure, because there it's much more of a garden, an ecosystem with different plants, different species, all co-existing together. You need a solution that addresses the fact that there's a continuous motion required to refresh and revive that ecosystem so that it stays relevant to your audience. So let's talk about what that means. Everything starts with understanding usage. None of us start with a blank slate; there's content already out there. As you think about enabling self-service, the first place to start is to ask what content is being used across the various touchpoints that are out there, understand that, and then use that information to build and deploy new content that fills critical gaps. Once you've done that, measure the engagement. It's not enough to put something out there, say "I built it, great," and move on to something else. The key is to ask: what kind of engagement do I have with this content that was created? How does that engagement change over time? Is there sustained engagement, or is it a blip where people look at something and say, "oh, this is new and interesting," but once they've looked at it they move on to something else?
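That sustained-versus-blip question can be reduced to a simple heuristic. Here is a minimal sketch in Python, assuming a hypothetical setup where each piece of content carries a list of view dates pulled from your BI tool's usage logs; the function name, the 30-day window, and the half-of-launch-volume threshold are illustrative choices, not something prescribed in the talk:

```python
from datetime import date, timedelta

def classify_engagement(view_dates, launch, today, window=30):
    """Compare views in the first `window` days after launch with views in
    the most recent `window` days, to separate sustained adoption from a
    launch-time novelty blip."""
    early = sum(1 for d in view_dates
                if launch <= d < launch + timedelta(days=window))
    recent = sum(1 for d in view_dates
                 if d > today - timedelta(days=window))
    if early == 0 and recent == 0:
        return "unused"
    # Call it sustained if recent usage holds up to at least half the launch volume.
    if recent >= max(1, early // 2):
        return "sustained"
    return "novelty blip"
```

A real implementation would likely count distinct users rather than raw views, but the shape of the decision, comparing a recent window of activity against the launch window, is the same.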
Then, based on that information, you need to inform a whole process by which you promote and purge content. If you've got poor engagement but you believe the content is actually something useful, go ahead and promote it. If engagement isn't sustained even after it's been promoted, then it may have missed the mark and either needs to be modified or should be purged, so that the space can be decluttered and all the content that's shown is relevant. You then use that content optimization cycle to optimize resource allocation. Resources are, first, the licenses: if you're spending hundreds of thousands or millions of dollars on your licensing, take a look at where those licenses are being used and who is using them, and make sure they're allocated in a way that's based on usage, that's rational, and that optimizes resources; that's a key part of the process. Then have the content optimization process feed into the way the teams get managed. Your most valuable resource, of course, is your BI team, so you need to make sure that team is aligned in such a way that they're building content that is effective, that is useful to people, and that creates sustained usage for your community, not one-time builds that people forget about and that become part of the clutter in the environment. Following this continuous improvement process, an almost infinite cycle, and having a rigorous process by which it happens, is what enables a true self-service environment. Let's talk a little further about that. You know, there's that expression that ferns can't grow in the desert, and that's obviously true, because a fern requires the proper context to grow: it needs the right climate, it needs the proper amount of water, and that's not present in the desert. And the
same is true for self-service infrastructure: there are some things that have to be in place for it to thrive. There needs to be easy and very rapid consumption of content, so a user can quickly find the content they're looking for, whether they're an analyst looking for data or a business user looking for a particular analytic. They need to know that they can trust that particular piece of content. I cannot self-serve if I'm looking at a visualization but have no idea whether it's really a visualization I can trust, with correct data, or whether it uses some business terminology or definition that's not correct. Similarly, as an analyst, I'm not going to be able to build a new analytic off of a table unless I know that it's data I can trust, with the proper definitions. That really translates into correct context: understanding this data, what definitions go into it, how the data gets in there, all the contextual information that gives me the comfort that it is accurate, trustworthy, and timely, so that I can make the right decisions with the data on my own. All of this has to be in place for an effective self-service infrastructure. So let's look at some examples of that. When I come in as a consumer of information, I need to have a portal, a single place where I'm going to have all my content. This is critical: if I have to go one place for our R Shiny applications, another place for Tableau, another place for MicroStrategy, each with its own organizational structure, search algorithms, and ways to find the data, you've already lost me from a self-service perspective, because I need to remember what kind of report it is, where to look, and all these different paradigms for finding things. These need to all be in one place, all tagged and organized and
searchable in one coherent organizational paradigm. It also needs to be possible for me as a user to come in and go from a low-fidelity to a higher-fidelity view in a progressive manner, so having the ability to preview is really important. I might start with a list of things, maybe a small number of visualizations as small thumbnails, and I need to be able to quickly transition to a larger visualization and see contextual information that tells me who the owners of that particular piece of content are: who is the data steward, the technical owner, the business owner? If there's tagging associated with it, does this contain PII? What is its data classification? All those things need to be available contextually, right there, for me to use. For the information to be useful, it needs to be published in such a way that there is governance around it and a proper certification workflow. It doesn't just magically show up in the ecosystem, useful, with the right tagging, information, documentation, and certification; there needs to be a process by which we can move things through the various stages to get them certified. And that's not a one-size-fits-all solution; it's going to vary based on the type of content that you have. It could be as simple as an engineer creating a piece of content, publishing it, and then an authority, a business stakeholder, certifying that it is correct. Or it could be something multi-stage, where it's reviewed by a data steward or a data stewardship team and moves through various stages of certification before it's published and made available to the business users. Whatever the right process is for that particular piece of content, it needs to be followed, so that you then know it's been promoted in a way that it
supports effective discovery. And then all of that tagging and context needs to be available to the consumer of information when they're viewing a dashboard. So, for example, take a look at this screenshot. I'm looking at a Tableau dashboard, but there's context wrapped around that dashboard. Up at the top I can see that this data has actually been delayed, so I know immediately not to get the wrong idea and jump to the wrong conclusion because I'm looking at yesterday's data or last week's data; there's a delay, it's going to get resolved, and I should come back and check later. Furthermore, if you look at the next slide down, I can see, underneath the name and description, little tags that tell me which KPIs and enterprise terms are relevant to this. If I click on one of those, I see a pop-up that gives me the context: oh, this has enterprise churn; well, here's the definition of enterprise churn, here's the business owner, here's the technical owner. And below that I can see there are a number of other visualizations available that use the same terminology. That fosters self-service, because if this analytic doesn't have what I need, I can click on one of the other ones, or go into my search and find the item that maybe does. So this kind of guided process, whereby I wrap the particular visualization with context and, as a consumer of the information, see that contextual data together with the actual visualization, gives me the comfort and the confidence to know that I can use this and make decisions on my own, whether from the analyst perspective or even from the perspective of a business user who is a little more sophisticated and really knows something about their ecosystem. Another key piece
of context that's necessary for self-service is understanding lineage. For a particular visualization, maybe I'm looking at this Tableau dashboard: where does the data come from? It comes from this particular Tableau data source, and where does that Tableau data source come from? Maybe that's coming from Snowflake tables, or some database, or some CSV, or some combination thereof. Understanding, when I'm about to consume something, where that information comes from is incredibly useful. If this dashboard is coming directly from Salesforce, I know that I'm looking at raw data, information collected directly from my CRM. But if it's coming from my enterprise data warehouse, and specifically from my cleansed area, then I know there's a whole bunch of cleanup that's been done: the data is standardized and has gone through some data quality checks. If it's coming from some raw area of the system, then I know it's perhaps more unfiltered, raw data. That context tells me how to use the information, which then enables the proper level of data discovery. The other aspect of self-service, from an analyst perspective, is that I need to be able to discover the data sets themselves. When you find a data set, it's not just about the name of the data and the columns it contains; I need to know what analytics use it, whether it's certified, and who the owners of that data set are, so I know who to ask if I do need to figure out something about the data. And for those columns, what specific KPIs do they measure? Oh, I see a column in here measuring enterprise churn; I should be able to go back and see that definition. Now I know, from an analyst perspective, that I can use this particular data set for my analysis, because it has the definition that's relevant to my analysis. So all of this has to be
leveraged together and consolidated into some kind of view that's accessible to me as an analyst, so that I can use that information to make sure a particular analytic is the one that I need. Now I want you to think for a moment, if you can go back this far, to pre-pandemic times, when you might have been going into the office, and perhaps you are someone who builds analytics and reports. Consider for a moment the serendipity that would happen in our day-to-day experience working in the office. Perhaps you're working on a particular analysis and you go to take a break, to get a cup of coffee, and in the break room you run into Joe, another analyst. You strike up a conversation and mention to Joe what you're working on, and Joe says, "oh, you know, I was just talking to Alice the other day, and she has an analysis just like this that she created last year; you might want to check with her and see if what she's built is actually useful to you." You follow up with Alice, and sure enough, she's built something, and that coffee break you took saved you a day of work, because you didn't have to go reinvent the wheel. Not only that, but the fact that you stumbled across something already useful saved the end users from having two objects out there where, six months from now, they have to figure out which analysis to go with. There isn't duplication of content, because of that serendipity. So one of the key things to think about as you build out a data discovery platform for your users is how you generate serendipity by design. How do you avoid the duplication of effort and content that happens so often when people build three or four similar versions of the same thing, making it impossible for anybody other than the original analyst to figure out who should use a
particular piece of content to get a question answered? And how do you make sure that business users stumble across that content as they need to find the information? In other words, it's not something tucked away in some data governance tool or on some wiki page that somebody has to go look for. Because people are in the stream of their workflow, going through the day-to-day process of doing their work, they have to find these things naturally, through that regular workflow. If you can build that serendipity by design into the system, then implicitly you are fostering a sense of governance and a sense of data discovery. Take a look at this as an example. You need to have a universal search engine built and deployed across all of your analytics, so that irrespective of the type of content the user is looking for, there's one place where they can find it. And it needs to be more than a SharePoint site with a bunch of links, because that's just overwhelming. I need to be able to go in, search for something, and in the process of that search discover other content, whether that's content tagged with the same terms or different types of objects. I should be able to search by popularity, because if something is more popular, clearly it's going to provide potentially more useful information, and as a consumer of information I should be able to find it. All of that needs to be available and accessible in the search paradigm I'd be using to find the information I want, so that I can discover content. And furthermore, I need to have a class of content available that is discoverable even if I do not have access to it. If you think about the types of
content that you might have in your enterprise, there are obviously the things that a particular user has access to; they should just be able to find those and use them easily. Then there are the things that are highly confidential, highly restricted, where you don't even want users to necessarily know that they exist unless they're in a privileged group. HR reporting on who received a bonus last quarter, for example, should probably be visible and available only to the HR team. But then there's this vast set of content in between, where it's not going to be open to everyone; it's going to be accessible to a specific set of people, because there's security that should govern it properly. However, you want to make it discoverable, because let's face it: when we create a piece of content and assign it to a group of users, sometimes we're not perfect about what that access control should be. Maybe there's another ancillary group of users that would be interested in it, but I'm not sure if I should give them access, so I don't. And if those users are not aware of it and cannot discover that content, they're going to go ask analysts to create it, and then there will be duplicative content created. So to minimize that kind of duplicative effort, and to build serendipity into discovery, you should be able to create content and put it out there, make it discoverable so people can search for it, but when they find it, maybe they see a blurred image, maybe they just see the metadata around that content, and when they click, they're told, "ah, I don't have access to this, and here's a process by which I can go and request access." I think that's a critical part of ensuring discoverability. And finally, instrumentation is very important. Whatever you've done, whatever judgment calls you've made about
what content to publish and what to make discoverable, the only way to know if you've been successful is to measure your success; that's obvious. You need to understand whether you are succeeding or not, you have to understand the user journey with that content, and then you need to figure out where the gaps are and fill them. What do I mean by that? Well, first, you need some way to understand what content is useful. In that overall garden you've created, the whole ecosystem of visualizations, what are people looking at? What is most popular? What's increasing in popularity, what's decreasing, and what's not being used at all? That gives you a snapshot of what's useful, and that's critical. Secondarily, it's obviously important to know whether something is useful, but even more so, you need to understand the usage patterns. You need to know: for a visualization that was created three months ago, does it lack active users because people were using it a month ago and then stopped? Or is it really sticky, with people coming back and using it every single day? This is an example from our application, an animation where the circles you see represent the number of days since somebody used a piece of content. For this particular piece of content, I can see the users that used it just a day ago, and those that used it 30, 60, or 90 days ago as they drift out. If you play the animation, those circles move out from the center as a user stops using that content, and they stick close to the center if the user continues to use it frequently. So you know the level of stickiness involved, and that's critical. And
then finally, understanding the content that's there is only part of the equation, because the second part of the equation is understanding what's missing. On that search page we talked about, the place people go to find content, whether that's data from the analyst perspective or visualizations from the user perspective, we need to be able to track what people are looking for that they are not finding. What are the unsuccessful searches? Because that tells you either that the content is there but needs to be tagged better, described better, made more accessible to people, or that the right content is missing and the thing people are looking for needs to be built. And that informs the proper alignment of resources in the organization. So let's review for a moment the pieces we need to put in place to make data discovery work, and then we can go to questions. First, we need to create a clean and engaging content space. That's about more than just creating yet another dashboard or report and putting it out there; it's about looking at and carefully assessing what's there, making sure that only the things that are actually useful are deployed, that things that don't have usage are either promoted, demoted, or removed, that there isn't duplicative content out there, and that it's all accessible, searchable, and visible to users. Then we need to make sure there's a certification and lineage process whereby, as a user, when I discover content I know what is certified, what can be trusted, and where the data comes from, so that I have the proper context for making decisions about how to use it. I then need to make sure that the environment balances security and discoverability. It can't just be about locking everything up; it needs to be about completely securing your data, of course, and making sure that things that
should not be discoverable stay hidden, while things that can be discoverable are enabled, so that users can have those serendipitous moments where they go, "oh, look at that, somebody already created that analysis; let me go take a look, it may have my answers." And then, finally, there has to be a monitoring and usage analysis component built in, such that you can iterate through the cycle, because nobody's implementation is going to be perfect. You need processes in place so that every month you evaluate what's going right and what's not, and then iterate. So with that, let's open it up for questions.

Yes, we do have a couple of questions from the audience, with just about five minutes left in our session. The first question is: is the tagging and linking of visualizations automated, or does it need to be done manually in your tool?

Well, that's a great question, and I just forgot to mention one thing before I answer it. If you're interested in any of this, we have created a resource for you that summarizes and provides a lot of the information shared here, along with a lot of other useful information. It was created by the IIA, the International Institute for Analytics, and we collaborated with them on it. It really lays out a general framework for establishing a self-service program. There's a URL there for our website for anybody who wants to check it out, and if you'd like more material around self-service and how to achieve it within the enterprise, I invite you to go ahead and download it. So, going back to your question around tagging to enable discovery, there are multiple ways in which that can be done. The information can be inherited from the BI tool: for us, when we onboard a visualization from, let's say, Tableau, we can bring the tags that have been set in Tableau directly into the portal so
it's there, available for search and inclusion. It can also be set up within Metric Insights, so somebody can come in and manually set those tags up as part of the governance process by which content is onboarded. And finally, it can be imported from data governance tools: if you're using Alation or Collibra and you've got tagging information there for that content, we have hooks that allow you to pull that information in and automatically apply it. So the idea, our philosophy at least, is: wherever it's most comfortable to create those tags, that's where you should do it, and then pull the content in. The key is to make sure the tags are accessible all the way through, not just in the source system, but at the point where an analyst is looking for that information.

Thank you. We have another question: if we build so much governance infrastructure for self-service, such as data set discovery, certification workflows, lineage, et cetera, is there still a need to promote the data set to a production platform managed by IT, with SLAs, et cetera?

Well, it's a great question. How governance infrastructure gets implemented in organizations varies widely. Oftentimes, for proper governance, there is some IT or central group that helps set the standards and processes, and then they're managed and supported by the business units, so that the business unit is actually making the day-to-day decisions about what is available and what is not. So it depends; not knowing much about the particular questioner's infrastructure, it's very hard to give you something prescriptive, but I would say that both parties have a role to play here. The centralized data governance team should be establishing high-level standards, and then
those standards should be mindful of literacy and how to promote data literacy, because governance and literacy go hand in hand; they don't stand by themselves. And then they have to be in partnership with the business units and the folks who are actually building the analytics for the business users, because none of this can be successful in a vacuum. You have to have somebody who's intimately familiar with the data set providing the right tags, defining the lineage, doing that work. So think of the centralized team as technology enablers, and the folks close to the business, close to the creation of the visualizations, as the people who are really pouring in the context, and there should be a close partnership between those two for the highest level of effectiveness.

All right, I think that just about wraps us up, so thank you, Marius, for that wonderful presentation. We do want to note again that there is the linked form at the bottom of the page where you can submit your session feedback; that's very helpful to all of us, so it would be great if you could do it. This wraps up our session. Everyone is encouraged to continue networking inside the SpotMe app, and don't forget to check out the sponsor section for information about the tools available to support your data management programs. Sponsors will be in their virtual exhibition booths until 1:30 p.m. Pacific. Thank you again, and we look forward to seeing you in today's other sessions. Thank you. Thank you.