Hello and welcome. My name is Shannon Kemp and I'm the Chief Digital Manager of Data Diversity. We would like to thank you for joining the latest installment of the Data Diversity Webinar Series, Data Insights and Analytics, brought to you in partnership with First San Francisco Partners. Today, Kelly O'Neill will be discussing big data analytics. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A panel in the bottom right-hand corner of your screen. Or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag #DIAnalytics. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me introduce our speaker for today, Kelly O'Neill. Kelly is the founder and CEO of First San Francisco Partners. Having worked with the software and systems providers key to the formulation of enterprise information management, Kelly O'Neill has played important roles in many of the groundbreaking initiatives that confirmed the value of EIM to the enterprise. Recognizing an unmet need for clear guidance and advice on the intricacies of implementing EIM solutions, she founded First San Francisco Partners in early 2007. And with that, I will turn it over to Kelly to get today's webinar started. Hello and welcome. Hi, good morning, good afternoon, and good evening. I hope everyone is doing really well. I will be doing this webinar solo today. John Ladley is on a well-deserved vacation right now, and although I will do my best to channel John Ladley, there are no guarantees it will be effective. Anyway, I'm really looking forward to this webinar. It is filled with trends and interesting new developments that are occurring in the world of analytics, and what they mean to you and the way that you are managing your data. So today we're going to go through three core topics. We will spend most of our time talking about new directions and trends in big data analytics, so there will be lots of good information in here about what is happening and what it means to you. We will briefly review our big data analytics architecture, leveraging the First San Francisco reference architecture to show how the trends manifest themselves from an architectural perspective. We'll talk a little bit about tools and capabilities to leverage additional data types, and then some key takeaways. So with that, what we want to do is start with our polling question: what data types are you analyzing? And Shannon, correct me if I'm wrong, but we can do a multiple-choice answer here, correct? That is correct. Let me open the poll here. There we go. I will just do a brief explanation here. Row and column is your traditional relational database type of data. Free-form text is things like email, Twitter streams, etc. Geospatial is location data; sometimes this is generated by telemetry applications. For images, we're considering still images or photos. Audio is sound files, recorded phone calls, etc. Video would be moving images or other sorts of video, and then you could of course choose all of the above. All right, and the poll is just about to close. There it goes. It is sorting, and it tells me that in about 15 seconds I can push the results live. All right, and the winner is... Shannon?
Here we go. There are the results. Great. So by and large the majority are still analyzing row and column data, which is frankly no surprise, right? That's what we've been doing for many years, so why would that change? But of course we're continuing to add to it, and the winner among the added categories is in fact free-form text. So people are complementing row and column with free-form text. We do have about 10% of you that said all of the above, which is really cool and probably very different from what we would have seen a year ago, let alone five years ago. So that's great. Geospatial also got 11% there, and geospatial being highly complex, that's really a great accomplishment for those of you doing geospatial analytics. Excellent. Okay, so let's go ahead and get started with our presentation. What we're going to talk about in these new directions and trends in analytics is really the fact that a lot of the change happening within analytics is because some capabilities are not necessarily new, but they are more available, more relevant, more attainable. And therefore people are using them, and you should consider how you should be using them within your environment. So just so that you are prepared for what's coming up next, we've got some different perspectives about what's been happening in the industry, drawing on a tremendous amount of investigation. I'm going to call out some of the resources as I go through the presentation, but thanks to several of our business partners, as well as some internal assistance from our marketing department and others, for helping out with this whole process. So, big picture trends. What's going on here? Well, the good news is, we are getting better at analytics. There are more people in organizations looking to leverage analytics in their work so that they can make their own data-driven decisions without having to go to a separate analytics group. This is whether they're embedding analytics into operational processes and workflow, potentially via bots; whether they're getting analytics at the front line, potentially via edge analytics (I'm going to go through these different capabilities in more detail later; this is just the big picture); or whether this is getting greater access to all data of all kinds. And so the point is that more people want and need the data to excel at their jobs and in their personal lives. There is greater adoption of personal analytics as well, in which, for example, I can leverage information coming from my devices to improve my own health, financial well-being, et cetera. So we're getting better at this, which is great. And, using the royal "we," we are still expensive. Data scientists, people who are the gurus in artificial intelligence, these roles are still hard to come by in the market, and the people are expensive. As a result, there's a lot of effort to determine how we accomplish some of these artificial intelligence or data science tasks in a more automated and repeatable way. So a lot of the trends in technology are addressing the fact that there's a cost implication here. And a lot of different references identify that a barrier to adoption for artificial intelligence technologies is the heavy services component and heavy services costs.
I mean, just this morning there was an article on Information Management that talked about the adoption of AI in healthcare, and it referenced high implementation costs as being one of the primary barriers to adoption. So it is still really expensive, primarily from a services perspective, and therefore software and hardware need to catch up; and of course they are catching up. Machine learning automation, or smart things, is a very common theme. There's a trend beyond traditional visual data discovery to this concept of smart data discovery, or what Gartner is now calling augmented data discovery. This enables business users and citizen data scientists (so, not our expensive data scientists) to automatically find, visualize, and narrate relevant findings, and look at things like correlations, exceptions, and other predictions without having to build models or write algorithms themselves. The idea is that users can explore data in these different ways, through visualizations, through search, natural language querying, natural language generation, and interpretation of the results, in a more user-friendly way that isn't reliant upon creating the algorithms themselves. And the use of these applied algorithms enhances other sorts of analytics processes. So to a certain extent this builds on itself, broadening the audience for these analytics tools and speeding up the time to insights from them. As a result, it's becoming more and more pervasive. These three big picture trends tend to go together, in the sense that we're becoming better at analytics, which means it's becoming more sophisticated, but it's driving automation, which means that the software and hardware providers need to catch up, and they are catching up, which enables us to be better at analytics. So you can see how this is a nicely repeating cycle. As a result, different questions are being asked of data. We truly are now evolving from "what happened and when did it happen" to "why did it happen and how did it happen." And so the idea is that we're thinking about, from a text perspective, how our collection of instant messages and other sorts of text that we can mine helps us to understand what people are talking about. We can start to see what their reactions are and therefore what they do, not just what they buy; we can start to identify more of the behavioral components, like their mood, whether they are satisfied or dissatisfied. We can link these to image and audio. We can link these to speech patterns and that sort of thing. And we can then start to see other sorts of sentiments based on the text or the data investigation. Prescriptive and predictive analytics are becoming more commonly adopted. It's not just that corporations are more commonly using things like prescriptive and predictive analytics, but even software companies are starting to embed these types of analytics into their applications as a competitive differentiation, so that prescriptive analytics can recommend a course of action within a workflow or within an application that helps to standardize what happens. This is largely based on the fact that algorithms are more available and that sort of thing. Graph analytics are more widely adopted. Graph analytics helps to identify and show relationships across all these different data types, relationships that are virtually impossible to see with structured data. And this is becoming more and more common.
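To make that graph analytics point a little more concrete, here is a minimal sketch using the open-source networkx library. The entities and edges below are invented purely for illustration; a real graph would be built from your own customer, transaction, text, and device data.

```python
# Minimal sketch of graph analytics: relationships across data types that
# are hard to see in rows and columns. All entities here are invented.
import networkx as nx

G = nx.Graph()
# Edges link entities that co-occur across different data types:
# customers to IoT devices, customers to support tickets (text), and
# tickets to machine-extracted topics.
G.add_edges_from([
    ("cust:alice", "device:thermostat-17"),
    ("cust:bob", "device:thermostat-17"),   # a shared device
    ("cust:alice", "ticket:4811"),
    ("cust:bob", "ticket:5023"),
    ("ticket:4811", "topic:billing"),
    ("ticket:5023", "topic:billing"),       # a shared complaint topic
])

# Indirect relationships fall out of simple graph queries.
path = nx.shortest_path(G, "cust:alice", "cust:bob")
print("alice relates to bob via:", " -> ".join(path))

# Centrality hints at which entities tie the data together.
central = sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1])
for node, score in central[:3]:
    print(f"{node}: {score:.2f}")
```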
So, you know, one of the things that I think is awesome about prescriptive analytics is that it enables your personal life. My favorite prescriptive analytics application is Waze. I use it a lot, and I'm really trying to find a Waze that guides me through other obstacles in my life. So if anyone out there is planning on creating a Waze for the grocery store or a Waze for IKEA, let me know, because I will use it and thereby increase the value of your app. But what this demonstrates is that analytics are everywhere; they're no longer an isolated capability within an organization. They truly are everywhere. And we'll talk a little bit more about this when we discuss some of the other capabilities. It also means that there is an inherent expectation of real time. Latencies need to be low, practically zero. And this real-time analytics is what continues to fuel the capability and the pervasiveness of prescriptive and predictive analytics as well. So different questions are being asked of different data, which then enables us to use analytics more pervasively. And people are recognizing now that big data is the new normal. Big data, since 2005 or whenever the term was first coined, has been a new way that we've been thinking about data. And as it turns out, big data has really changed the way that we view technology, but also how we respond to innovations, and the rapidity at which we can respond to innovations. And this is really important. But the reason I say that big data is the new normal is that it is becoming mainstream and closely integrated with other types of data. The demonstration of this is that the "traditional" (and those air quotes are probably going to go away soon) big data vendors are now starting to support other types of data, in the same way that the master data vendors and other sorts of vendors have been reaching out to support integration to data lakes and integration to big data. So the idea is that big data is just becoming another way that we look at our entire data landscape. One of the things that's also driving this is that the availability of data and the sources we have access to are growing exponentially. There's a big trend around open data. Whether that open data is made publicly available through a government entity, or whether it is data shared by corporations and that sort of thing, there's a lot more data available to mine. Not only are there data marketplaces, but there are now algorithm marketplaces, so you can buy algorithms, or just use algorithms in an open marketplace, in the same way that previously you could source data. Algorithms and data are being crowdsourced: companies like CrowdFlower and WorkFusion are really trying to increase adoption and make sure that the availability of data is broad and can be consumed. I think there's also greater consideration of dark data. Dark data is that data that traditionally goes unused in organizations; maybe it is considered not relevant, but it is rich in potential. For example, old production line data that's been collected could actually be used to see some future trends and that sort of thing. So there's a tremendous amount of expanding data sources. And what are the implications? Well, I think inherent in the expanding data sources is a reevaluation of data governance concerns, right?
So as you look at leveraging open data, as you look at buying not just data but models and algorithms, as you look at consuming crowdsourced information, what does that mean to the way that you consider data? How does that impact your privacy approach, your ethical approach to how you use data? It's also really pushing this concept of self-service data preparation and data catalogs to being truly business created and managed. Now, within the industry, I'm sure you are all kind of nodding your heads. We've been talking about this for quite a while, in the sense that business glossaries, metadata, all of that should be business created and managed. But there's been a really big IT component in things like business glossaries, metadata, data quality, etc., and it is shifting to be more business created and managed. And so in support of data lakes, there's this renewed importance of and interest in the concept of data catalogs. Data catalogs leverage traditional metadata management capabilities but add some innovative approaches, like self-generating topic extraction, taxonomy generation, and semantic discovery, all of which leverage machine learning capabilities. Some of these new technology solutions are coming up to support the better generation and curation of these data catalogs, which means it's a lot easier for business people to consume and adopt. So there are exciting things happening with companies like Alation, Waterline, and Collibra in this space. And then from a data prep perspective, data preparation tools are addressing that statistic we all know and love, that around 80% of the effort in any sort of analytics or data science exercise is in data access, profiling, cleansing, modeling, sharing, etc. These data prep tools are able to reduce that time and complexity in order to make the data available more quickly. We recently used Trifacta with one of our clients, and it's a great way of shortcutting some of this time through an easily accessible tool. Now, another interesting thing is that in the same way data sources are expanding, the offerings from an outsourcing perspective are also expanding. You can outsource just about anything right now: you can outsource your model creation, you can outsource your algorithms, you can outsource the management of your platforms, and you can outsource this concept of mode two, which is a Gartner term for the new version of the data warehouse, the new way of managing all of this unstructured data. So you can outsource just about anything, which also means that you need to consider intellectual property, regulatory requirements, and, again, governance requirements and accountability for that data. Who truly is accountable for the data? Is it you, because you've hired the outsourcing company? Is it the outsourcing company, because they're actually the ones doing the work on the data? So that drives some different implications and considerations. And Shannon, I am not monitoring the questions, so if there are questions in chat that come up, can you just flag me? Because I'm just going to keep going. That's okay. Yeah, no problem, Kelly. Okay, great. Sorry, I wasn't sure if I lost you there.
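As a quick aside on those data catalog capabilities: here is a minimal sketch, assuming a recent scikit-learn, of the kind of machine-learning-driven topic extraction such tools automate. The documents are invented, and products like Alation or Waterline of course do far more than this.

```python
# Minimal sketch of automatic topic extraction, the kind of machine
# learning capability newer data catalogs apply to descriptions and
# documents. The documents below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

docs = [
    "customer account balance payment invoice billing",
    "invoice payment overdue billing statement",
    "sensor temperature reading device telemetry",
    "device firmware telemetry sensor stream",
]

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# Factor the term matrix into two latent topics.
nmf = NMF(n_components=2, random_state=0).fit(X)

terms = tfidf.get_feature_names_out()
for i, weights in enumerate(nmf.components_):
    top = [terms[j] for j in weights.argsort()[-3:][::-1]]
    print(f"topic {i}: {', '.join(top)}")
# A catalog would attach these machine-generated topics to data sets so
# business users can search and curate them.
```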
Okay, Internet of Things. The Internet of Things is, of course, a key component of a digital business, right? And it is also a tremendous source of data. There's a lot of data that can be created at those endpoints of the Internet of Things that can then be used to improve operations, and that can be analyzed to improve product development, the management of different products, and that sort of thing. Essentially, the Internet of Things is where the world of physical devices is connected via your network to some sort of centralized processing capability. This is everything from your Fitbit or your Nest to connected industrial machines, all of which can then be used for analytics. One of the things about the Internet of Things, of course, is that it then starts to feed, and can potentially invade, this concept of personal analytics and the Internet of Things within your home. Personally, as an example, we recently installed Nest, and so now every time I walk up and down the hallway, Google knows that I've passed by. No doubt they're doing some sort of analysis based on the temperature of my house and the frequency of movement and things like that. Anyway, it is a way of linking endpoint devices into more centralized machines. These devices are becoming more pervasive, which means more data, more complexity, and a requirement for, and the actuality of, more automation. And it is driving areas of investment, like the Industrial Internet of Things versus just the Internet of Things. There's Internet of Things edge analytics; we'll talk about edge analytics a little bit later, but this is where analytics are actually executed on distributed devices, away from the corporate central hubs or data centers, away from the cloud servers, closer to where the actual data is being generated or where the sensor is. Mobile app edge analytics, again, is analytics happening at the device level. Event stream processing is the ability to compute based on event streams, and event streams are a sequence of event objects arranged in some sort of order, typically by time. Event stream processing is growing based on the volume of data coming in from the endpoints of IoT, as well as data coming in from things like websites, social media platforms, et cetera. There's a lot of growth in this area because of the availability of the data and therefore the demand to use it. A couple of implications of this. Well, it's going to be really important to monitor regulatory and legislative actions around what it means to use the data that is generated at that endpoint: federal government regulations, state government regulations, other countries' regulations, to ensure that as you consume this data you understand what it means, at what point you need to comply, et cetera. One of the interesting use cases is with a client we worked with, one of the large automotive manufacturers, where they're looking at intelligent customer interactions. And these are interactions with the car, right? So the maintenance requirements and standards, et cetera. When you link those with an understanding of who the customer is, via say a master data management solution, it can really change the way that you engage with your customers. The Internet of Things provides the information; it's validated against, say, a customer master, and that can drive accurate context and create some meaningful analytics and decision making. But again, there is the consideration that as that data comes in, whether from a car or another sort of endpoint device, what are the privacy concerns, what are the security concerns, and how do we incorporate that into our other governance decision-making processes?
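To make the event stream processing idea mentioned above concrete, here is a toy sketch in plain Python: a time-ordered stream of sensor events aggregated over one-minute tumbling windows. A production deployment would use a streaming engine rather than a loop like this, and the events are invented for illustration.

```python
# Toy sketch of event stream processing: event objects ordered by time,
# aggregated per tumbling window. Event data is invented for illustration.
from collections import defaultdict

events = [  # (timestamp_seconds, device_id, temperature)
    (0, "nest-1", 20.5), (12, "nest-1", 20.9), (45, "nest-2", 18.2),
    (61, "nest-1", 21.4), (75, "nest-2", 18.0), (130, "nest-1", 22.8),
]

WINDOW = 60  # one-minute tumbling windows

totals = defaultdict(lambda: [0.0, 0])  # (window, device) -> [sum, count]
for ts, device, temp in events:
    window_start = (ts // WINDOW) * WINDOW
    totals[(window_start, device)][0] += temp
    totals[(window_start, device)][1] += 1

for (window_start, device), (total, count) in sorted(totals.items()):
    print(f"t={window_start:>3}s {device}: avg {total / count:.1f}")
```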
Okay, next trend: artificial intelligence is everywhere. I've read the phrase "AI is everywhere" in multiple places, so I don't know the exact source of that phrase. But I do know the source of the phrase "AI is the new electricity." In January of this year at the Stanford Graduate School of Business, at something they call the Future Forum, Andrew Ng, a machine learning guru, gave a talk entitled "AI Is the New Electricity." His point was that, similar to electricity in the industrial revolution, there will be very few industries that will not be transformed by artificial intelligence over the next few years. So artificial intelligence is, no pun intended, powering a lot of change within our culture and within every industry. If anyone remembers the phrase that big data is the new oil, well, now we have artificial intelligence is the new electricity. There are lots of statistics around the growth of these areas and that sort of thing, but essentially, artificial intelligence is becoming a key component of any digital business right now. Part of the success of artificial intelligence is based on the data that the platform has. The accuracy and precision of the data is what can set one AI application apart from others. So the success is based on the data. This is why many large AI platforms are more than happy to let you use their services for free as long as they can use your data to improve their algorithms. This is the concept of learning loops; it's also been called a virtuous cycle of artificial intelligence commercialization. The idea is that you launch an AI product and get the users; you can release the product for free, the users generate the data, and the data can then be used to improve the product. So in this way there's a transference of the data, and the success of the artificial intelligence is based on the data. The availability of that data, whether it is through users using a platform or through increased computing power, has fueled the growth. So again, it's this virtuous cycle, if you will, where the success is based on the data, and the availability of the data and the computing power have fueled the growth; it is a sort of self-fulfilling engine. Different types of artificial intelligence that you will have heard about are things like machine learning; deep learning, which is a type of machine learning; natural language processing and generation; conversational artificial intelligence platforms; and then computer vision. Computer vision is a process that involves capturing, processing, and analyzing real-world images and video to allow machines to extract meaningful contextual information from that environment. This could be things like image recognition, pattern recognition, facial recognition. And the idea with all of this computer vision is that you can then create an event or act on it. Google has launched something called Google Lens, a technology that basically turns your camera from a passive tool capturing images into one that allows you to interact with the image within your camera's viewfinder.
So for example, Google Lens can interact with Google Translate to read a photo of a sign so that you can understand what "no trespassing" means in Swahili, and you can act on it. Google Lens can also interact with Google Assistant to process data from a photo and add something to your calendar, for example. So you can see how this interpretation of images, this interpretation of language, can actually create very useful tools, both personally and within an organization. And of course, you can see how there's an overlap into other categories. For example, visual recognition is a natural extension of the Internet of Things, where you have an external sensor that can extend the reach of devices and change the way they interact with the environment by visually recognizing whatever that image is. If we think about this from the perspective of autonomous vehicles, then as more computer vision models are created, it helps those cars to drive with more precise navigation. That requirement in turn creates a need for higher resolutions of images and video, which creates a need for better sensors, which in turn enables more use cases for these sorts of computer vision models. And so it becomes this kind of constantly improving environment that seems to be pervasive within artificial intelligence. So, implications of that. Well, be aware of how your data is used. If these AI providers allow you to use their applications at no charge in order to use your data, be aware of that, whether as a company or as an individual; that's something to be sensitive to. Then there's this concept of corroboration of correlation, and I think John Ladley himself is going to claim creation of this term. The idea is that there needs to be a model of the world around you that's understood, because you can find a correlation in just about anything, and that correlation itself may not create the correct insight. So there needs to be some sort of human corroboration in order to make an intelligent business decision. Artificial intelligence can create these sorts of correlations, but it can't provide the human judgment that turns a correlation into action. So that's another implication of having all of this artificial intelligence. And then another thing that I thought was quite interesting when I was doing my research is that IT organizations are now leveraging AI to better manage their own operations and the growth of data within their operations, so that they're getting intelligent responses from the machines in order to improve the infrastructure and the speed of processing and things like that. So I thought that was really interesting.
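Before moving on, here is a sketch of that recognize-then-act pattern described with Google Lens above, assuming TensorFlow and its bundled, pretrained MobileNetV2 classifier are installed. The act_on() handler and the image file name are hypothetical placeholders, and this is not how Google Lens itself is implemented.

```python
# Sketch of the computer-vision-to-event pattern: classify an image, then
# trigger a downstream action from the label. The act_on() handler and
# photo.jpg are hypothetical placeholders.
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import (
    MobileNetV2, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

model = MobileNetV2(weights="imagenet")  # pretrained image classifier

def act_on(label: str) -> None:
    # Placeholder: a real app might translate a sign, add a calendar
    # entry, or raise an alert when a label is recognized.
    print(f"recognized '{label}', routing to downstream workflow")

img = image.load_img("photo.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
best = decode_predictions(model.predict(x), top=1)[0][0]
label, confidence = best[1], best[2]
if confidence > 0.5:  # only act on confident recognitions
    act_on(label)
```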
Okay, bots. Bots are hot. Bots are a specific type of artificial intelligence: little microservices or apps that can operate on other bots, other apps, or other services, and they operate in response to event triggers, user requests, et cetera. Bots can be based on a specific set of predefined rules, or they can incorporate some of this artificial intelligence and leverage algorithms in order to behave differently. So it's not just a standard question and answer; there's a level of interaction, for example with a conversational user interface, which is essentially a chatbot. We've all experienced chatbots, say when we have been in the process of paying our credit card and then we stop. There's a text box that basically comes up and says, do you really want to not pay your credit card? Or when we are shopping, there's potentially a chat box that says, may I help you find something else? Or maybe we're on a website on which we're trying to solve a problem. But anyway, we've all experienced chatbots. Bots can emulate a person, like they do in a chatbot interface or an app. And as a result, bots have started to be embedded into applications to facilitate workflow, so that they're actually helping people use those applications, and they make and recommend decisions within an application's workflow. So bots are not new, and artificial intelligence is not new. But what is new is the ability to cross-reference some of these capabilities, and the ability to leverage new data sources and new devices, and that has really been extending this industry. Creating a bot framework has become a key initiative in many organizations, and there's been a lot of progress around how companies can leverage bot frameworks to improve their operations or their businesses. Sorry, I needed to take a sip of water. One of our partners, Data Bloom, is working with a company that has issues with their customer support: their call handling time is exceeding what they expected it to be. One of the things they're looking at doing is analyzing the structured and unstructured data of the calls and evaluating, through text analysis, how they can improve call times. They have created a bot framework to enable customers to ask questions directly into the bot, so a chatbot. And then those responses can be obtained interactively, or they can be complemented by a representative actually talking to that person using a callback system or what have you. So they're using text analytics, unstructured data analytics, to improve the experience of these customers as well. They're using it to potentially reduce call volumes and call times, and therefore, potentially, hold times. So rather than putting more agents into the call center, Data Bloom showed them how they can look at their data in a novel way and leverage a bot framework, or bots, to help improve and enable more self-service. Even within our own company, actually, we've started to use bots to improve productivity. For example, our company communicates quite a bit via Slack, and our head of marketing has started to create Slackbots as a Q&A mechanism for some of the basic commonly asked questions. So, for example, you can say, what is Zarina's Zoom account? And it'll give you the code or number of her Zoom account. As we grow and onboard new employees, we can enable Slackbots to answer questions, to provide user manuals for our frequently used tools, to locate people, to locate passwords or files, and generally to provide self-service to applications. So it doesn't have to be super complex, but the enablement of capabilities through bots is really impacting a lot of organizations. Okay, so implications. If this wasn't obvious: bots not only use the data, they also create it. Similar to how we've been talking about understanding that your data can be used and leveraged, every time you input text into a chatbot, that text can be captured and used in an analytics way. So in the same way that I gave the example of Data Bloom using that text analysis, as that client launches a self-service chatbot, the text that goes into that chatbot is data that can be used for analysis. So just be aware of that. Privacy and security policies need to be considered in light of the fact that that data can be used for analysis, and I'm not sure all customers necessarily recognize that their data is used for analysis when they enter it.
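For flavor, here is a minimal sketch of the rule-based end of that spectrum, the kind of FAQ Slackbot just described. The entries are invented, and a real Slackbot would hook the answer() function to Slack's APIs rather than print to the console.

```python
# Minimal sketch of a rule-based Q&A bot, the simple end of the bot
# spectrum. The FAQ entries below are invented for illustration.
FAQ = {
    "zoom account": "Zarina's Zoom account: 123-456-7890 (hypothetical)",
    "expense policy": "See the expense policy doc on the shared drive.",
    "wifi password": "Ask IT directly; the bot does not share passwords.",
}

def answer(question: str) -> str:
    q = question.lower()
    for keywords, reply in FAQ.items():
        if all(word in q for word in keywords.split()):
            return reply
    # Note the point made above: every unanswered question is itself new
    # data that can be mined later to improve the bot.
    return "I don't know yet; logging this question for a human."

print(answer("What is Zarina's Zoom account?"))
print(answer("Where do I find the expense policy?"))
```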
Okay, so I mentioned edge computing a bit before, in the section around the Internet of Things, but I wanted to call it out specifically because of the architectural implications, as well as because of the way it has been enabling personal analytics and other sorts of end-device analytics. Edge computing is essentially a computing architecture in which the data or information processing, as well as the data collection and delivery, happen at the end device or are placed closer to the end device, so that the point of sourcing and the point of delivery of the analytics is the local device. This has really been driven by the need for data to be analyzed in real time when it's too costly, or sometimes not possible, to move it back and forth between remote applications, devices, etc. So there's a big trend, of course, toward making edge applications more intelligent, more robust, and more useful. It's starting to put those decisions, put the analytics, in the hands of the inquirer, if you will. And of course, edge computing works with other capabilities that I've also reviewed previously, like machine learning, to optimize what's happening at the edge. For example, Apple has launched their Core ML machine learning framework, which enables developers to create smarter apps by embedding on-device machine learning capabilities. Advantages are things like faster response time, reduced network bottlenecks, and offline availability of apps. In doing my research, I read this great, concise article by a guy named Matt Hardy on one of the model families for edge computing called MobileNets, and this quote I thought was awesome just based on his tone: his definition is that MobileNets are "a new family of convolutional neural networks that are set to blow your mind," and then he goes on to talk about how to train a custom data set and how to use TensorFlow to develop an application. It's a very novel and fast way to enable edge analytics, intelligent apps, et cetera. So the implication: architectures need to adapt and stretch to include these edge locations. And that means that if we're enabling edge analytics, we need to consider network availability and the requirements for moving that data across the network, and what needs to happen in order to do that: things like data thinning, file compression, et cetera, if we are going to enable edge computing where the device is on the edge but not necessarily standalone.
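As a sketch of the deployment step that edge discussion points at, here is what converting a pretrained Keras MobileNetV2 to TensorFlow Lite for on-device inference might look like, assuming TensorFlow is installed; the output file name is illustrative.

```python
# Sketch of preparing a model for edge analytics: convert a pretrained
# MobileNetV2 to TensorFlow Lite so inference can run on the device
# itself, avoiding a round trip to the data center.
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optional quantization shrinks the model, a form of the data thinning
# mentioned above for constrained devices and networks.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("mobilenet_v2.tflite", "wb") as f:  # illustrative file name
    f.write(tflite_model)
print(f"edge model size: {len(tflite_model) / 1e6:.1f} MB")
```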
All right. Let's turn now to our big data analytics architecture in light of these trends. Just to recap the different things we talked about, our new directions and trends are: analytics are everywhere, big data is the new normal, the Internet of Things is pervasive, AI is everywhere, and edge computing and bots are hot. So what does this mean to us from an architectural perspective? Well, I think the first thing we want to consider is that we now need a truly unified data and analytics strategy. We can't have a separate master data strategy, big data strategy, analytics strategy, et cetera. We need a unified strategy, because these worlds are coming together and blending. We also need to understand that the world is becoming more real time: "analytics are everywhere" means that latency requirements are changing, and there are big differences in what we need now from an Internet of Things perspective. There needs to be real-time ingestion of large data sets and fast integration between outsourcing, cloud, and on-prem solutions. So the requirement for real time really goes beyond real-time analytics. It is real-time analytics, but it's also consumption of the data, and how quickly that consumption can be completed even with the requirement to do some processing, like artificial intelligence. Edge computing is bringing computing back out closer to the sources, back out to the devices. Again, these are things like MobileNets and SqueezeNets, and the idea is that these mobile deep learning models are trained in the cloud and then performed on the device in such a rapid way that there's practically no latency. So processing is happening closer to, and in some cases within, your device. Then there's the integration of capabilities across these multiple areas; the link between, for example, edge computing and the Internet of Things is something we need to consider from an architectural perspective. And then lastly, the concept of data obfuscation, because some organizations may not need to be overly concerned with privacy; and I'm saying that with a little bit of hesitation, because what I mean is that it might be possible to accomplish your analytics with observed, anonymized, or masked data rather than gathering individual private or sensitive data. So what are the requirements for gathering data, and what are the requirements for processing the data, in order to ensure that you are addressing privacy concerns? Now, other things to consider that I didn't necessarily raise here: perceptions and tone can now be collected, so how do we take advantage of that, or consider it within our architectures, and consider leveraging these new types of data?
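On that obfuscation point, here is a minimal sketch of what pseudonymizing records before they land in the analytics environment might look like; the fields, salt, and records are invented for illustration, and real deployments would manage the salt through a secrets store.

```python
# Minimal sketch of data obfuscation before analytics: hash direct
# identifiers and coarsen sensitive fields so downstream analysis never
# touches raw private data. Records and salt are invented.
import hashlib

SALT = b"rotate-me-per-environment"  # illustrative; keep out of source

def pseudonymize(value: str) -> str:
    return hashlib.sha256(SALT + value.encode()).hexdigest()[:12]

records = [
    {"email": "alice@example.com", "zip": "94105", "spend": 120.0},
    {"email": "bob@example.com",   "zip": "94107", "spend": 87.5},
]

masked = [
    {
        "customer_id": pseudonymize(r["email"]),  # stable key, no PII
        "zip3": r["zip"][:3] + "XX",              # coarsened location
        "spend": r["spend"],
    }
    for r in records
]
print(masked)  # analytics can aggregate on these without raw identifiers
```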
So I'm going to show a couple of things that y'all may remember seeing before; I think John presented these a while ago. This is the First San Francisco reference architecture from an abstract perspective. The idea here is that there is a framework and a structure to manage data and the interfaces between the people and the processes, and we leverage this concept of an I-beam. That I-beam is used to support the blend between vintage-area data management and data processing and contemporary types of data life cycles and information supply chains. So it's a way of thinking about what has traditionally been considered the data warehouse world together with the data lake world. Again, going back to a Gartner reference, Gartner talks about the vintage area as mode one and the contemporary area as mode two. But the idea here is that we now need to consider these as a unified approach, and as a consistent and comprehensive architecture within our organization. So if we take this to a little bit of a deeper level, let's look at this more explicit architecture, and a couple of things I want to point out as we make this more specific. Excuse me, I need to drink my water so I don't choke on my words here. This is complex. Orchestration between the vintage area and the contemporary area is very important. A consideration is how data operations runs all of this, and there needs to be very explicit coordination of efforts across business and IT, because the contemporary area is very business driven and the vintage area is very IT managed. And so the contemporary area is not IT managed, and the vintage area is not business driven, but the leanings of each area need to be considered explicitly, and that creates the demand for a greater partnership across business and IT. It also needs to be married with good governance, because just like in the vintage area, the concepts of ingestion, integration, etc. are still very important, and they're important so that you don't end up with things like confirmation bias. Let's use an example of that. Let's say that you're trying to do some analysis on data, and you acquire data through some external sources; maybe it's some of those external sources we talked about previously, such as open data or crowdsourced data. You pull that data into your architecture, you run the analysis incorporating it with the in-house-developed data in your data lake, and you come up with a result. Well, if it's not appropriately governed, you might be duplicating the data you have in-house with your acquired data, and then you end up with a level of confirmation bias, where you confirm a hypothesis that you created with your internal data mainly because the external data is the same data. We've also seen this come up in terms of account management and assets under management, where there's the potential that you're double counting across multiple data sources: some of those data sources are provided by your distribution partners alongside the data created internally, so you're double counting your customers, because your distribution partner counted that customer and you are also counting that same customer (there's a small illustration of this below). So again, governance is very, very important. Another component to this is that on the right-hand side, we talked about edge, bots, unstructured data, the Internet of Things, and all of those sorts of future-looking applications can be outsourced. You really can outsource anything you want, whether it is the applications, or the data, or the bots to manage the data, or what have you. So as you incorporate that into your architecture, it should also be incorporated into your governance requirements, and it needs to create those standards of data sourcing so that you are managing it appropriately and taking advantage of that data at the same time. The point in all of this is that it does need to be explicitly planned, it should be comprehensive, and it isn't just a technology stack with a bunch of abstract arrows. What we want to do is make sure that we are incorporating the rapidly changing capabilities within the contemporary area, understanding that data through cataloging and so on, marrying it with the vintage area, and leveraging that I-beam to ensure that we are creating support for our business and our business processes, and that we're explicitly planning this as a result.
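And here is that small double-counting illustration, a minimal sketch using pandas with invented data: the same customer arrives from an internal system and from a distribution partner, and a governed pipeline dedupes on a mastered customer key before counting.

```python
# Minimal sketch of the double-counting trap: one customer shows up from
# two sources, so counting rows inflates the total. Deduping on a
# mastered key (here, a normalized email) fixes it. Data is invented.
import pandas as pd

internal = pd.DataFrame({"email": ["Alice@x.com", "bob@y.com"],
                         "source": ["internal", "internal"]})
partner = pd.DataFrame({"email": ["alice@x.com", "carol@z.com"],
                        "source": ["partner", "partner"]})

combined = pd.concat([internal, partner], ignore_index=True)
combined["customer_key"] = combined["email"].str.lower().str.strip()

print("naive count:", len(combined))                          # 4, inflated
print("governed count:", combined["customer_key"].nunique())  # 3
```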
So I've just got a couple more slides, unless we want to move into some questions; if not, we can talk quickly about capabilities. In doing the analysis for the "new tools for leveraging more data types" section of our webinar, I did mention some tools as we went through the presentation, but for this section I decided to take a different approach: rather than cite specific vendors or tools, I'm going to talk a little bit more about capabilities. The main reason for this is that I realized it's virtually impossible to know where to start with listing the vendors, and where to stop. For example, Venture Scanner, which is a database that tracks trends in technology, had identified nearly 1,900 startups in the AI space alone as of April 2017. I know that I tend to create busy slides, but that was going to be completely over the top, so I decided to take more of a capabilities approach versus the actual tool names themselves. So, with approval, I've leveraged this graphic from the Gartner Group. I really like this graphic from a few perspectives. Although it isn't categorized in the exact same way that I categorized the components of this presentation, it does identify those big buckets that one needs to consider and action, and then search for the technologies that fall into those categories. That was a big reason I thought this was a good way to present a summarization of those capabilities. Another reason I like this diagram is the way it articulates the overlapping nature of many of these capabilities; you can see how the lines of the different capabilities overlap. In fact, I would call out some additional overlapping capabilities. For example, there is an overlap between new data sources and new algorithms, in the sense that there are now algorithm or model providers, not just data providers. These can also be viewed, purchased, or used via a crowdsourced model. New processing needs to include the capability to process at the edge: although real-time analytics is called out, there's also the consideration of processing at the edge, which also comes into play with the Internet of Things, since that edge may be an IoT endpoint. So there are important things that I think could be added to this graphic in order to further delineate some of these capabilities. But I also like the concepts of the new industrialization and the new thinking, because really, if we don't start to industrialize these concepts and bring them into our core business models, then they will all be very interesting experiments that don't really impact our businesses; and this drives the requirement for new thinking. How do we make decisions around data? Just because we have the data, should we use it? How do we reflect our company culture in our decisions around data and the way we use analytics, or has our analytics capability started to change the way we think about culture?
Those are the things to consider in your new thinking, and this links to the next slide, which is takeaways. Just to make sure that you're still paying attention, I'm going to start at the bottom. When we think about the capabilities, like the new thinking, there's this concept of digital ethics. Digital ethics is the consideration that we should have use integrity when we think about the way that we use the data, and that we shouldn't necessarily just rely on regulations to guide how the organization uses the data. Is it the right way to use the data? Does it fit our company culture? Does it reflect the way that we want to engage with our customers, our business partners, et cetera? Then there's the greater concept of data freedom. Whether that data freedom has led to open data, where we can access and acquire data openly through data sources on the web, or whether we create data freedom internally, it means that the data organization needs to be an insight enabler, not just a data provisioner and data provider. Again, as there's greater demand for things like real-time analytics, the consumption of embedded analytics and that sort of thing, and the usability of predictive and prescriptive analytics, it's not just data provisioning anymore; it is creating and enabling insight with that data. That's the role of the data management organization. Privacy will become increasingly important as computing gets closer and closer to the individual. This includes location data: edge computing and some of the Internet of Things are specific to location, and the value of the analytics has to do with marrying the location with the actions and the data created at that source, so that needs to be considered. And then, recognize that the skills gap still does exist in big data and in artificial intelligence, and that we as organizations will need to plan accordingly. I know I started the trends and directions by saying we're better at analytics and we're still really expensive. Well, it is a reality, and I didn't mean to belittle the fact that it's a highly valued skill set; it is in high demand, and it is expensive for a reason. Therefore, plan for it from a funding perspective, because there's a benefit to having some of that super-high-powered data science capability and super-high-powered artificial intelligence capability. But at the same time, take advantage of some of these new technologies and service providers that can help extend the use of analytics and artificial intelligence, so that you're really optimizing the function of those data scientists and those gurus who can help ensure that the algorithms and the models are appropriate for your industry and your business. And then finally, analytics is and will be everywhere, so it needs to be taken into consideration as you're planning out your architectures; whether or not it's something you're doing currently, it should be something you consider within your future-state architecture plans. And I think that's it. I know that I've talked a lot, and we've got a couple of minutes left. Shannon, any questions, or are we going to do a follow-up email? We do have questions coming in, Kelly. Thanks for this great presentation. Just a reminder: to answer the most commonly asked question, I'll be sending a follow-up email by end of day Monday with links to the slides, links to the recording, and anything else that was requested.
So Kelly, you mentioned, quote unquote, Waze, or the Waze app, and referenced IKEA with it. Could you elaborate on that a bit more? Waze. So Waze is a geospatial prescriptive analytics capability that helps me navigate through San Francisco traffic. I'm sure some of you have used Waze; if you haven't, and you live in a traffic-congested area, it is a real-time application that enables and changes directions to get you from point A to point B in the fastest way possible. And the reason I mentioned it is that if anybody has ever shopped at IKEA, it is impossible to navigate that store. I love IKEA, I shop there, but boy, if I had Waze for IKEA, that'd be awesome. I love that idea. I hope they're listening. I use Waze all the time. All right, well, that is all the time we have; we're right at the top of the hour. Kelly, thanks for this great presentation, and thanks to our attendees for being so engaged in everything we do, and thanks for attending today. Again, I'll send the follow-up email by end of day Monday to all registrants. So thank you, and I hope everyone has a great day. Absolutely. Talk to you next month. Take care. Cheers.