Hello and welcome. My name is Shannon Kempe and I'm the Executive Editor of DATAVERSITY. We would like to thank you for joining this DATAVERSITY discussion panel, Smarter Data. If you like today's discussion, you can meet all of the experts speaking today at our upcoming Smart Data Conference and Expo, August 18th through the 20th in San Jose, California. Just go to SmartDataWeek.com to check it out.

A couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen. Or if you like to tweet, we encourage you to share highlights or questions via Twitter using the hashtag DATAVERSITY. If you would like to chat with us or with each other, we encourage you to do so; just click on the chat icon in the upper right-hand corner of your screen for that feature. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar.

Now, let me introduce our moderator for today, DATAVERSITY's own CEO, Tony Shaw. Tony is, of course, responsible for the business strategy of the company and its subsidiaries, all of which conduct educational conferences, training, and publishing activities focused on enterprise data management. And with that, I will turn it over to Tony to introduce today's panelists and to get us started.

Hello and welcome. Thank you very much, Shannon. And welcome, everybody, including our panelists as well as those in the audience. Today we're going to talk about smart data. I don't think any irony or sarcasm was intended when Shannon played the tune "Smoke Gets in Your Eyes" prior to the call today, but certainly there is a little confusion, perhaps, around what this topic really means and what smarter data can do for you. So I'm very pleased to introduce our panelists, who are going to try to provide some clarity through that haze.

The first panelist is Dave McComb. Dave is a longtime colleague; I'm pleased to say he and I worked together for many years on the Semantic Technology Conference, and Dave was a co-founder with me of that very well-regarded event. Dave is a longtime consultant and the author of a text called Semantics in Business Systems, from Morgan Kaufmann, which remains, I think, a really excellent and well-written introduction to the subject of semantic technology in a business environment.

Sean Martin is the Chief Technology Officer and a founder of Cambridge Semantics, one of the leading vendors in the smart data field. Prior to Cambridge, Sean spent 15 years with IBM Corporation, where he was a founder and visionary of the IBM Advanced Internet Technology group. So Sean has been in the area of internet technology for the past couple of decades.

And I'm also pleased to introduce Dave Duggal, the founder of a very innovative startup called EnterpriseWeb, which offers an award-winning application platform for dynamic, data-driven, and elastically scalable business processes. Among its other awards and honors, EnterpriseWeb won two coveted SIIA CODiE Awards this year, for best semantic application and best GRC application.

Gentlemen, welcome all. Why don't we start by talking about our own perceptions, our own definitions, of what smart data is?
So Dave, please kick us off.

Sure. I guess... oh, which Dave did you want to go first?

Oh, pardon me, I've got two Daves. Dave McComb, why don't you start. Sorry, Dave.

Sure. I like to think of smart data in the context of where the smarts in general sit in a system. In a traditional system, the smarts, and by that I mean what things mean, the constraints, the rules, the usage patterns, all that kind of stuff, are pretty much all in the application code. They were in the modelers' heads, they were in the requirements analysts' heads, and they moved into the developers'. Some of it stuck in the documentation, but most of it is really just in the code. So we've got all these smarts in the code. And people have been knocking existing applications and legacy systems because they're not flexible. It's not that the existing databases aren't flexible; it's this arrangement of putting the smarts into the code, I think, that makes them inflexible. Then NoSQL and big data came along, and what we have done is taken the smarts out of the application code and put them in the heads of the data scientists, which may or may not be a step forward. It seems like that might actually be a step backward. What we really want to do, and this is my take on smart data, is take the smarts, the meaning, the constraints, the rules, the usage patterns out of the application code, out of the heads of the data scientists, and put them in the model where they belong, where they're available to everyone.

Okay. Sean Martin, why don't you tackle this question?

Thanks, Tony. I'm obviously not going to disagree with any of that; I think Dave has the sense of it right. But for me, there's a very formal definition of smart data, at both the model level and the instance level. At the model level we use a standard called OWL, the Web Ontology Language. At the instance level we use a standard called RDF, which represents individual facts. And then there's another standard called SPARQL, which we use to query the combination of those. What's very powerful about these relatively simple standards is that they're standards, and everybody can share them. So as we build software stacks like we've done at Cambridge Semantics, where we have something called the Smart Data Platform, we embrace the meaning and the models that drive every single aspect of the system. What it means is that somebody can come along and produce either a model or instance data, and our system can, without having any prior knowledge of that information, slurp it in and do something sensible with it, because we've got standards. So for me, the Semantic Web standards from the W3C are the core of smart data. We're now starting to see domain groups like CDISC in the clinical trial data space, and FIBO, the Financial Industry Business Ontology, in the financial space (that's the EDM Council and the OMG), starting to embrace OWL as a way of expressing their specific domains. What's so nice about that is that software that respects the underlying standards like OWL can simply import those models and use them directly, and suddenly your software is much smarter.

From a philosophical point of view, and this echoes Dave, we're starting to arrange data for the benefit of how humans want to interact with it, as opposed to the efficiency of processing. We've suddenly got enough horsepower: cheap RAM, very cheap random access in the form of SSDs, very fast interconnects, very fast multi-core CPUs. There's a lot of extra power there that we didn't have before, and so we can add this level of indirection, this level of semantics or smart data. For the first time, it's much easier to build systems that are arranged for the benefit of the end user, and smart data is the core of that. Being able to do things declaratively is very important, and much, much more flexible, easier to maintain, cheaper, and faster. So that's the core of smart data for me.
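To make the standards Sean mentions concrete, here is a minimal sketch, assuming Python with the open-source rdflib library. The vocabulary, resource names, and properties are purely illustrative, not Cambridge Semantics' actual platform:

```python
# Minimal sketch: instance-level facts as RDF triples, queried with SPARQL.
# Assumes: pip install rdflib. All names below are invented for illustration.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.com/vocab#")  # hypothetical shared vocabulary

g = Graph()
# Each fact is one (subject, predicate, object) triple.
g.add((EX.trial42, RDF.type, EX.ClinicalTrial))
g.add((EX.trial42, EX.studiesDrug, EX.drugX))
g.add((EX.trial42, EX.enrollment, Literal(350)))

# SPARQL queries run against the shared vocabulary, not a storage layout,
# so any data published with the same terms can be queried the same way.
q = """
PREFIX ex: <http://example.com/vocab#>
SELECT ?trial ?n WHERE {
  ?trial a ex:ClinicalTrial ;
         ex:studiesDrug ex:drugX ;
         ex:enrollment ?n .
}
"""
for row in g.query(q):
    print(row.trial, row.n)
```

An OWL ontology would sit one level above this, defining what a ClinicalTrial is and how its properties relate, so that any conforming software can import the model and the data together.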
Okay. Sean, just before we move on: on your line, we had a couple of dropouts there, so if that situation continues, we may need to have you call in again. The bulk of your statement did come through, though, so we'll just keep going.

All right. Dave Duggal, what's your version of what smart data is, please?

Batting third, I'm predisposed to say that the other two, Dave and Sean, made some very good points that I'll agree with. Actually, I'll tend towards Dave McComb's more generalized description. I'm not sure I would actually call data itself smart. If you're in the whole D-I-K-W stack, you know, data, information, knowledge, wisdom, data is information without context. So I'm not sure data is smart. I think data with its relationships to concepts and activity, which I think aligns with the constraints and usage patterns that Dave McComb described, that's what's smart, right? Exposing those relationships, the constraints and usage patterns, so that they can be understood by humans and systems, putting data in its context: that's when it becomes powerful. And when that's done in a fully automated fashion, I think that's really critical, because I'm not sure I'd completely agree that there's only one way to describe smart data. We would describe ourselves as a standards-based platform; we use all the modern technology standards, and the Semantic Web, using RDF, OWL, and SPARQL, is one set of standards you can apply for making information smart, and we can read and write RDF triples. But when we're looking at fully automated systems, and I think that's the end goal, these relationships should be fluid. We don't have to have rigid, hierarchical master data management models anymore, which I think we'd all agree with. But I don't necessarily see static triples either. We really want a system of models that can react to each other, where we can have machine learning and adaptation, and we can do this in very fast real-time systems, so that we're bridging the gap, not just talking about data, but bridging the gap between data and applications, which I think is the other side of this coin. If we just talk about data in isolation, we're really only talking about one half. If we're really talking about smarts, then we're really talking about systems and applications and their relationships to data.

Okay. Well, that covers our first question, and I think the audience questions are already coming in. I do want to invite everybody to send in questions as we go through.
We're going to continue with the prepared panel questions first before we get back to the audience questions, but we're usually able to get through everybody's questions by the end of the program. All right. So, having provided some views of what smart data is: where is it being applied? Are there actual deployments, or is it just theoretical? How useful is it? Let's just jump in and let Sean tackle this question first.

Thank you, Tony. Hopefully you can hear me clearly now.

We can.

Yeah. So obviously we have a Smart Data Platform, so pretty much every single engagement we have with a customer is a smart data engagement. At the moment we're seeing a lot of adoption in financial services and pharmaceutical areas, as well as others, but those two have a lot of focus, and it's partly, I think, driven by the standards focus of the domain groups in those industries. If I think about a particular recent situation: we've had a couple of implementations recently where pharma customers wanted to pull together end-user views across clinical trials, embracing both structured data coming out of statistical systems and document-oriented data, like clinical trial protocol documents. One of the things about smart data is that it provides a way to join structured and unstructured data. In the end you get data that is represented formally; it's described using models and so on. But tying the two together, data coming from very gray areas, like text analytics output, along with formal data coming from structured sources, requires a very flexible model that can grow as new facts are discovered and as the model itself expands. You allow end users to very easily ingest data, or rather maybe have IT help them ingest the data, and then have the end users use the same models that made the ingestion easy to help them generate queries automatically, so that they're exploring the data and discovering information without actually having to understand the underlying query mechanisms, and doing that across a much broader set of data than has traditionally been possible. Traditional analytics tools tend to lend themselves to fairly narrow analysis, and users end up asking questions that aren't quite there in the center of the data presented to them, so there's a cycle where they have to go back. Whereas smart data allows you to build much more complex, richer, broader pictures, and in an ad hoc fashion ask questions across that broad picture, provided the data's there. So a very recent example: clinical trials, being able to load up all your clinical trials across a drug and look at them in one go. That's a very useful current example.

Okay. I understand there are some problems with my line as well, so I'm going to try to defer as much of the conversation to the panelists as possible. Dave Duggal, how would you answer the question? Where are you seeing smart data applied and deployed?

Great, thanks, Tony. So again, we look at it from the application side, the application of real-time data for smarter applications. I'll outline two, I think, strong use cases that hopefully will be relevant, or at least add some clarification for the audience. One is in telecom.
In telecom, we just won an award for the most innovative solution in an emerging area called network function virtualization, which is the real-time construction of telecom networks and the delivery of network services: firewalls, load balancers, voice-over-IP services like we're using today. It's the real-time distribution, balancing, and lifecycle management of those kinds of services over volatile networks. You have to be able to analyze the state of the network and work across potentially multiple of what they call administrative domains. In other words, the delivery of a telecom service might actually be managed by multiple partners, and the functions themselves might come from multiple partners: you might have people delivering a voice-over-IP service and people delivering some sort of radio access network to get it to your cell phone. So you have all these different functions, and they're all subject to different standards; there are multiple standards involved, multiple technologies involved. You have these multi-domain distributed computing problems, right? In this case, what our software is doing is taking a declarative model of intent. What is this network service supposed to do? Tony is a platinum customer for voice over IP; when he makes a request, he has to get five nines, and his SLAs have to be met. So instantiate Tony's service over the network and maintain it even though the network is volatile, which means scaling it up and down dynamically. There's a certain intelligence just in managing that in real time. But the other aspect of intelligence, or smart data, that comes into play in the domain I just depicted is that there's a lot of change. The partners' functions might be being updated, the partners' networks might be being updated at the same time, and the policies related to a network service might be being updated. So you have to build a declarative description of a network service that's understandable to a business user, so they can describe it at a very high level, declaratively, through policies. They can say: I just want these behaviors, over these networks, in this fashion. Then the software manages all of the volatility in the background, the volatility in the network and the volatility of the integrations between all of the partners. So that's one kind of real-time challenge we're working on.

Another one is in life sciences, where we support some of the world's largest research hospitals in managing regulations over all of their administrative processes. We actually calculate rules over research activity in real time so the researchers can focus on their actual research studies, the subject of their research, as opposed to trying to keep all of those rules in their minds. Every time a researcher takes an action or a rule changes, the system recomputes the state of that application, and if there are any new requirements, it automatically tabulates them and notifies the appropriate people or triggers the right processes. So, two very different scenarios. One is a highly regulated, long-running human process, and the other is real-time systems infrastructure and cloud processes, but they're fundamentally driven by the same principles of smart data.

Okay.
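The "recompute on every change" behavior Dave describes for the research-compliance system can be sketched in a few lines. This is a hypothetical illustration in plain Python, not EnterpriseWeb's actual implementation; the rule names and state fields are invented:

```python
# Hypothetical sketch: declarative rules recomputed on every state change.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    applies: Callable[[dict], bool]  # predicate over the study's current state
    obligation: str                  # requirement triggered when it applies

RULES = [
    Rule("irb-review", lambda s: s.get("involves_human_subjects", False),
         "File IRB protocol"),
    Rule("coi-disclosure", lambda s: s.get("industry_funded", False),
         "File conflict-of-interest disclosure"),
]

def recompute(state: dict) -> list:
    """Re-evaluate every rule against the full current state of the study."""
    return [r.obligation for r in RULES if r.applies(state)]

study = {"involves_human_subjects": True}
print(recompute(study))              # ['File IRB protocol']
study["industry_funded"] = True      # a researcher action changes the state
print(recompute(study))              # both obligations now apply
```

Because the rules are data rather than code paths, changing a regulation means editing a rule, and the next recompute picks it up automatically.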
So, Dave McComb, I left you until last because you're to some extent the odd man out here. You're the consultant, almost the client representative here, whereas our other two guests have products that solve problems in this space. So let's see whether your response to this question is maybe a little different. Where are you seeing smart data applied?

It ends up being kind of the same, all right. So we're working with a large electrical device manufacturer who has acquired a whole bunch of other companies, and every company they acquire has its own catalog system, its own product management system, et cetera. They're all different, they're all arbitrary, they're all complex. And they ended up writing literally hundreds of these things they call configurators to help electricians put together complex combinations of electrical parts for construction projects, things like that. The complexities went on and on. We worked with them, took one of these systems as a starting point, and created a model, like Sean suggested, in OWL, that abstracted away the structural complexity but retained the model of what these things really meant. That model ended up being about 2% as complex as the model we started with: we started with a model of 700 tables and 7,000 attributes, and ended up with 46 classes and 36 properties. So in some ways this is the Forrest Gump style of smart through simple: just make things as simple as you can. Then we were working with another consulting firm who was doing the rule writing, and they were able to rewrite these configuration rules. Instead of writing them to each individual system and to the arbitrariness of the data structures, they could write rules about the laws of physics and electricity, like "don't put something with a higher amperage rating downstream from something with a lower one, it'll blow up": rules about physical constraints and configurations, but not the arbitrariness that was baked into all their existing systems. Then, also like Sean said, we took their existing data, converted it to the RDF standard, put it in triples in a triple store, and we could query it with SPARQL. They were then writing the rules against that SPARQL endpoint. And we got a couple of extra bonuses out of it. At the end, they told us they'd been having a lot of trouble complying with a standard called ETIM, an electrical device classification standard, because they'd been trying to do it at a very detailed level. We were able to write the mapping at the abstract property level, which took most of the effort out of it. And finally, there was a company they had acquired whose data they wanted to convert into this shared format. We did that in just a few weeks, and afterwards all the same queries still worked; everything worked the same, even though the data came from radically different systems. We then found out that they had actually acquired that company 10 years ago, and for 10 years they'd been talking about trying to convert that data. And we said, you know, if you actually know what it means and know what the rules are, it's pretty straightforward.
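Dave's amperage rule gives a concrete flavor of what "writing rules against a SPARQL endpoint" can look like. A minimal sketch, again assuming Python with rdflib; the vocabulary, part names, and ratings are invented for illustration:

```python
# Sketch: a physics-level configuration rule expressed as a SPARQL query.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.com/electrical#")  # hypothetical vocabulary

g = Graph()
g.add((EX.breaker1, EX.ampRating, Literal(20)))
g.add((EX.subpanel1, EX.ampRating, Literal(30)))
g.add((EX.subpanel1, EX.downstreamOf, EX.breaker1))  # 30A fed by a 20A breaker

# Find any part whose rating exceeds the rating of the part feeding it.
violations = g.query("""
PREFIX ex: <http://example.com/electrical#>
SELECT ?down ?up WHERE {
  ?down ex:downstreamOf ?up ;
        ex:ampRating ?downAmps .
  ?up   ex:ampRating ?upAmps .
  FILTER (?downAmps > ?upAmps)
}
""")
for row in violations:
    print(f"Configuration error: {row.down} is over-rated for {row.up}")
```

The point of the abstraction Dave describes is that a rule like this is written once against the small shared model, never against the arbitrariness of the 700 source tables.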
Okay, that all sounds great. Let me try to get the dummies' version of this for a minute, though, because on the one hand it sort of sounds like the power of the system is that you're programming hundreds of thousands, if not more, rules into a system. So is the intelligence in the rules? In what way is the intelligence, or the smartness, in the data? Dave McComb, this is really a follow-up question to your statement.

So what we're doing is mostly trying to avoid writing hundreds of thousands of rules. In some ways, that is what an application system is: a whole lot of very trivial rules about moving data back and forth between screens and databases and such. We pretty much walk away from all that. A lot of the smarts, as far as we're concerned, is in the economy and the elegance you can get to when you are no longer burdened by the structure. When you get to a fairly elegant structure, there are rules, but there are between dozens and hundreds of them, not thousands or hundreds of thousands. So in some ways, I really do think the smartness is in the simplicity.

Okay. Sean or Dave, do you have a different answer to that question?

Yeah, well, actually, I would say that what Dave is saying might be non-intuitive, but it's absolutely true. In IT we have a lot of accidental complexity, right? We build all these structures on an ad hoc basis, we create all these arbitrary rules, we create legions of them, and then we wonder why interoperability is a challenge. Ultimately, and I think this is the struggle of our age, the world is increasingly dynamic, distributed, and diverse, and if we just accept those things to be true, then we need new architectures to serve those requirements so that systems can adapt more flexibly. I think the key, from the platform perspective, is a very streamlined metamodel. The metamodel of the platform itself has to be really streamlined. It's just like in genetics: G, C, A, T. DNA is all encoded in four bases. And in chemistry, we're mostly made of carbon, hydrogen, oxygen, and nitrogen, right? CHON. All organic things are made out of very simple things, which are then combined to create very complex things. In IT, we've created very complex, rigid, accidental architectures that force us to have very simple applications. When you rethink it and you have a very intentional architecture, the architecture becomes simple and the interactions become very rich, or complex, because they're allowed to dynamically resolve themselves for a certain context. I think that's sort of paradoxical: when the things you build are themselves very complex, your interactions are constrained and therefore very simple. When you have a very intentional architecture and you abstract a lot away from the business user, and just say, hey, work in this universe where you can compose policies and compose nodes to create business value, the system gives the appearance of being much simpler, and the interactions can be much richer.

So this notion of reducing complexity and enlarging simplicity is coming through pretty strongly at this point. Sean, any addition you want to make to these previous comments?

I totally agree with both of the Daves. And I'd add that what we're starting to see now is extreme flexibility.
One example that I think is pretty familiar to most people who do any amount of data integration: if you think about the inherited model, the relational model we've been using for the last 30-odd years, in that model we've conflated storage schema with meaning schema. This has caused all sorts of difficulties in terms of inflexibility. It puts us in a spot where, when we build a data warehouse or do an integration, we have to predetermine exactly what questions we're going to answer, and then we build a data structure which reflects both the storage and the logical meaning of the information, all tied up in the same place. Then we have a big project where we put the data into that. And as soon as you want to ask another question, you've got this huge problem: if you can't find a way to massage what you've got into something that can answer that question, you're going to have to go in and change the structures, and that's very expensive. When you are able to abstract your models away and simplify, and I liked the way the Daves put this earlier, you're able to evolve your models much more quickly in the face of changing business data requirements. More than that, if you've got models that can express relationships much more easily than you can in interlinked tables, which, ironically, is what you have in a relational model; if you've got a model in a language like OWL where you can express complex relationships, you can get to much, much richer representations of reality. That in turn drives business value, because you can ask much, much better questions. When we start layering in unstructured data, with its infinite potential for additional data and entities and types of entities, things spiral out of control. I read the other day that something like 80% of a business's data is text. If you're starting to bring that into the fold, and nearly all of our customers are, and that's really a new thing in the last couple of years, and join it with all the things IT would traditionally be doing with structured data, and tie those together to get much richer views that can answer wider and deeper questions, then you're really starting to get to value. And I think that's the key here: we're starting to use smarter data to get to value quicker and to do more with the same amount of resources.
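Sean's point about separating storage schema from meaning schema is easy to see in miniature: in an RDF graph, recording a fact nobody anticipated is just one more triple, with no table to alter. A sketch, with the same hedges as before (rdflib, invented names):

```python
# Sketch: evolving a model without a schema migration.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.com/vocab#")

g = Graph()
g.add((EX.acme, EX.name, Literal("Acme Corp")))

# A new business question needs a fact no warehouse design anticipated.
# In a relational store this would mean an ALTER TABLE plus ETL changes;
# here it is simply one more triple, and existing queries are unaffected.
g.add((EX.acme, EX.riskRating, Literal("BB+")))

for s, p, o in g.triples((EX.acme, None, None)):
    print(p, o)
```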
Okay, for the first audience question, and this is probably controversial for some folks: a reasonable observation is that model-driven architectures have already failed. The question says they've already failed; I guess that's where there might be some controversy. But in many cases they have spectacularly failed, if you believe someone like Scott Ambler, who's a strong agile evangelist. So if model-driven architectures have failed in the past, what makes us believe that putting models into smart data, as we're describing, is likely to succeed? I'll let anybody answer that who wishes.

May I?

Go ahead, Sean.

I was going to say that it's definitely not proven that the model-driven approach has failed. It's actually totally the opposite: we're seeing extraordinary gains through using models. There's a question of semantics here, because we're not actually talking about the same kinds of models. A model-driven approach in code is very different from a model-driven approach where you're using abstract models that are easy to operationalize, in something like OWL. The other thing that has changed drastically is our ability to scale this. It may have seemed like we were unable to address the wide variety of solutions people wanted to address four or five years ago, but things have moved on. The whole big data movement has changed the level of scale at which we can apply these abstract semantic models, or smart data models, and that is transformative. So we are using models at the center of everything we do, and it is actually extraordinarily liberating. If you don't believe me, please request a demo just to see the kind of power we can unleash with models, because it really is breathtaking.

All right.

Well, as the other vendor on the phone, you might expect that I would likewise agree. But also look at the demand side: we participate in many movements, from OpenStack to, I was just at the OpenDaylight conference, pretty much every major open source movement, and they're all looking to be model-driven. I think the reality is the complexity of the business environment demands it. These are not just philosophical, abstract architectural requirements; this is not just about me, Sean, and Dave really liking these ideas. I would say these are necessary for the continued survival of the organization, because we have to automate our way into the future. It's 2015, and we've automated a lot over the last 100 years, but one thing we haven't automated yet is IT itself. We have to turn automation onto IT so that we can scale our cloud environments, manage complex rules and meanings, and manage interoperability in the new value webs we have today. That will all be done by models, by much more flexible structures that can adapt to changes as they occur while persisting history and being fully auditable. So I think it's true, what the questioner said, as far as prior attempts go. I don't know that they failed, but I think they've been limited. Then again, look at the high rate of failure in IT in general: 75% of IT projects are reported to be failures, and for projects over a million dollars it's more like 90%. So what's the cost of static, siloed IT? It's in the trillions. You have to look at it relative to the alternative, the status quo, and then also look at the new requirements and the new technologies that have been enabled with the web, big data, et cetera. We are in a new age. The person who wrote that in is correct that there have been challenges in the past, and that's led people to be more tactical and code up siloed solutions. I think now people are looking for more generalized solutions, for greater visibility and more centralized policy management, and that's going to drive them towards model-based development.

Okay. Dave McComb, I'm sure you have an answer also, but I'm going to ask you to hold it, please, because this is an important topic and we need to keep moving. I will give you first crack, though, at the next question, which is, and we started to talk about this a little bit already, how does smart data work in practice?
So let's say you're a customer who's intrigued by the ideas they're hearing today. How do they get started doing something useful with smart data? What's your advice?

Well, there's the stock advice: everybody wants to do a proof of concept and then a pilot, et cetera, and I probably can't dissuade them, but most of this stuff is already proven. I guess people just desperately have to have internal demonstrations to prove it for themselves, and I don't know that I can radically change that. But everybody has a huge backlog of gnarly problems they want solved, and a lot of it is picking one with some characteristics that will showcase the smart data approach early on. We're really trying to counsel clients not to treat this as yet another application, because if you already have 1,000 applications and you come up with a clever way to make 1,001, you haven't really made a lot of progress. We want to use this as a way to get beyond the application-centric mentality.

That actually raises a point I was trying to get at with this particular question, which is: what's the mindset required to get started here? What is different about approaching smart data, in terms of how the business needs to think about it, if it's not just another application?

Right. I suppose where most of our clients are starting, and Sean was hinting at this as well, is that their intractable problem is the fact that they have data in many different systems, and many of them aren't even traditionally structured systems: they're social media, semi-structured data, and completely unstructured, purely textual data. If that's what's currently frustrating you, then sooner or later you're going to say the only way to bring that all together is to have a model that is structure-independent, that can represent the stuff that's in all those different systems, and that finally brings it all together so you can get something done. So I think the mindset is a combination of frustration plus hope, I suppose.

Okay. All right. I was really happy with where you were going there until the very end. Let's let Sean and Dave weigh in: how do you recommend your clients get started when they think about smart data? You go ahead, Dave.

So, you know, my answer won't be completely inconsistent with Dave's. I think you generally grow as you go. The challenge is that people feel like this is new, it's transformation, it's scary, their careers are at stake, and you have to back people off the ledge. Sometimes it does help to do a proof of concept just to show people that, look, I can very rapidly build something fully functional that demonstrates exactly the capability you might have struggled with in the past. So usually, two things. One, the whole point of a platform is to provide a unified architecture, common tools, libraries, and services that enable people to start building things on top of the platform and then scale them out over time. You can scale your operation. So you can always start with a scope of X, I guess is what I'm saying. We don't have to burn down the legacy; I think that was generally a bad approach.
I think what you're looking to do is tackle a problem that's real and concrete to the organization, identify some initiative or some pain point, and then call upon a technology like Sean has to offer, or we have to offer, or somebody else has to offer, and say: how would you solve this? How would you make my life better, and not just solve this one problem? How could I apply the learnings and the technology here to a broader set of problems and expand over time? That's the benefit of the new technologies out today: you can think big, start small, and then scale based on success. You don't have to make one big leap into it, right? These are new capabilities, but the more you make it a big do-or-die thing, the less likely it is ever to get off the ground. The key is to say: there have got to be problems where your organization struggles with being faster, being smarter, being more adaptive, being more connected and networked. Let's tackle one of those problems, use data in a new kind of way, and maybe that becomes its own proof point to build support inside the organization.

Okay. Sean, I'm going to ask you to answer this question by having me ask it a slightly different way. You've acknowledged that your products are based on internationally accepted standards, including RDF and OWL. So is the starting point for a customer to learn about those standards and then convert all their data to RDF, and then they've got something ready for Cambridge Semantics to work with? Or how does that all work?

So in the early days, and we've been maturing our stack for quite a while, we did have to take a fairly technical approach, so you were really trying to persuade the people who could understand the semantic approach. But as we've matured, and as we've abstracted ourselves above the standards, you're no longer day-to-day bumping into URIs and the low-level nitty-gritty of SPARQL queries and so on, because our software is taking advantage of all the smartness and actually leveraging it. Users are further and further away from that, and our preferred approach to getting a company started now is simply to show them something they just couldn't do before, and show it to the business. When you put a business person in front of a system and say, okay, all your data's in there now, ask whatever question you like along any axis, and here's the extent of the model, and it's comprehensive, it covers the entire domain of how you think about your job and your business: they've never seen that before, and they will drag IT along in their wake. So there's been a shift in emphasis. We're now far better equipped, simply because the software has matured to the extent that it's caught up with the traditional world. Don't forget, we had to reimplement the entire stack, the entire chain: the databases, the middleware, the tooling, the user interfaces. All of that had to be brought up to work with those international standards. The results now are pretty obvious. So when we load up, say, a competitive-intelligence-for-pharma type demo, we can bring in data from 50 or 60 sources and commingle it, and now we put a user in front of it and they can answer a question that their IT people couldn't get them an answer to in a year, and they can do it in five minutes.
It really is a way to get started, because they say: I don't care how you're doing this, we just want more of it. And that's accelerating the uptake of this technology. So sure, in the long run IT will end up doing what we call linking and contextualization. You take your data sets and put them in a data lake; we call it a smart data lake. Through the process of linking and contextualizing that data you resolve the entities and so on, but essentially, instead of leaving the data in a warehouse where it's really only good for answering the questions you designed that warehouse to answer, you're using these standards to leave your data in a form where it's discoverable, because everything's tagged, and there are shared models behind all the tags in the data. So an end user can discover that data, select data sets, and commingle them, and because you've used the standards, there's no further integration required. I think in the future, IT's job will become to deliver data in a form where it's been linked and contextualized and has become smart data, so that it's basically randomly accessible by anyone who's entitled to see it and mix it with other, similar data sets. The standards pretty much enable all of that. So we're in a very exciting spot at the moment, I think, because the reality of the semantic vision, which has been playing out for what, 10 or 15 years now, is finally being delivered in actual applications. There's a lot of excitement, certainly amongst our customers, and hopefully we'll see this at the show in a couple of weeks, where people are now starting to see the promise of smart data finally being applied.

Okay. Well, you gave me a nice segue to the next question when you mentioned data lakes. Historically, if I can call it that given how recent this history is, the notion of a data lake is basically a Hadoop-based repository of data that accepts everything that flows into it. What's the relationship, then, between smart data and big data? How does smart data make big data better? How does big data leverage smart data? However you choose to answer that question, Sean.

So there are a number of ways. One of the issues with the data lake approach, and I always think about the data lake as the extreme reaction to the inflexibility and control of the warehouse, with the warehouse at one end of a spectrum and the data lake at the other, is that there's a big danger you could find yourself in a data swamp, where you've lost control of your data. Sure, there's a lot of flexibility, because the data's there and it's available, but it's just mountains of files in a file system, effectively. So where semantics can help, and there are quite a few people talking about this, is in providing a metadata backbone to describe all that information. That's part A: you can use semantic models to describe the data, so that IT doesn't just toss it over the wall into a file system but instead is responsible for tagging that data, and semantic models are a very good way of doing that. The other thing we'd recommend is that at least one of the ways you store that data is in these open standards. It future-proofs it. If the notion is that we're going to collect everything because we never know when we're going to need it, then you really want to future-proof it. By using these open standards, you've identified the data and given it meaning, so that whoever comes along later can easily reuse it. At Cambridge Semantics we offer something called a smart data lake, which is really that whole metadata backbone I've been describing, and then, on top of that, ad hoc analytics on any combination of data sets you can find. As a user, you can just come along and select data sets out of a catalog. These data sets have been pre-prepared: they've been linked and contextualized, they've been tagged, they're aligned with a model that has meaning to the end user, the business user. Then you can spin them up into memory, and that's where the big data approach comes in. Not only have you got big data, a large amount of it stored in, say, HDFS, with many thousands or tens of thousands or hundreds of thousands of data sets in a catalog that describes them using that metadata, but now you can select a few of them, the ones you need, spin them up into memory, and be looking at a few billion triples' worth of data, able to do ad hoc analytics. All of that is a relatively new capability; just in the last year or two has it become viable. We're using in-memory graph analytics to do interactive dashboarding and so on. So that's where big data meets smart data for us: smart data lakes.
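The catalog Sean describes, data sets tagged with shared, machine-readable metadata so end users can discover and combine them, might look loosely like the following. This is a sketch using rdflib with the W3C DCAT vocabulary (which ships with recent rdflib versions); the lake layout and data set names are invented, and this is not a description of Cambridge Semantics' actual product. Strict DCAT would hang the download URL off a dcat:Distribution; it is flattened here for brevity:

```python
# Sketch: a data set catalog as RDF metadata, discoverable via SPARQL.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

EX = Namespace("http://example.com/lake#")  # hypothetical lake namespace

catalog = Graph()
ds = EX.trials_2015_q2
catalog.add((ds, RDF.type, DCAT.Dataset))
catalog.add((ds, DCTERMS.title, Literal("Phase III trials, 2015 Q2")))
catalog.add((ds, DCAT.theme, EX.ClinicalTrials))  # tag tied to a shared model
catalog.add((ds, DCAT.downloadURL, URIRef("hdfs://lake/trials/2015q2.ttl")))

# An end user finds relevant data sets by meaning, not by file path.
q = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX ex:   <http://example.com/lake#>
SELECT ?ds ?url WHERE {
  ?ds dcat:theme ex:ClinicalTrials ;
      dcat:downloadURL ?url .
}
"""
for row in catalog.query(q):
    print(row.ds, row.url)
```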
Okay, a good opportunity for me to remind the audience: if you have any questions, please submit them now so we can make sure to answer them before the hour is up. Dave Duggal, how do you see the relationship to big data?

Well, this is a good question, actually. We're working with one of the world's largest consulting firms on this problem right now, and I think you're really talking about two different ends of the spectrum. These big pools of data, collected in sort of an arbitrary fashion because they might be relevant someday, and they might be very large sets, they might be streaming sets, provide a certain kind of value. It's a latent value, right? You're just collecting a lot of data because you can, because, as we've been saying throughout this call, we've gone from a time of scarcity to a time of abundance: network, storage, and processing are all much cheaper today and much more widely available, so you can collect something like a data lake. Where smart data adds value is, to me, like the small data: it's a model of a set of relationships that provides a set of facts that I can then use to trim a big data query, and I can do that in real time. Because big data is a big, unmodeled pool of various kinds of information, of various types, and instead of looking through the whole set (in graph processing, trimming a graph is essentially a brute-force activity), if we're really talking about intelligence, what we really want is applications, right? What we're talking about is transactions, not just data and writing queries and doing analytics against pre-prepared data. And remember that pre-prepared data is a schema unto itself. What we're really talking about is having a higher level of abstraction, creating abstractions for applications that say: hey, I have an application, and it has these concepts it's related to.
And every time there's an interaction, it looks at that sea of relationships. It uses the in-band metadata that describes the transaction to bootstrap the semantic traversal of objects inside the model, and it can then interpret a set of big data and very quickly get to the relevant information, maybe within a big data set. So I would look at smart data as abstract models that facilitate, or bootstrap, the trimming of a big data set. You don't really ever need the whole set; you want the relevant set. Each one of us walks around in a sea of information, and if we couldn't filter it well, our heads would explode. Our ability to act in a specific context and react to it intelligently is what defines us as human beings. In the same way, in systems, we want applications that can look at abstract models, trim large sets of information down to the relevant sets, and work hand in hand with big data. So it's almost small data and big data, it's all smart data collectively: small models against large sets, helping you trim those large sets to get answers that are relevant.

Okay. Well, before we take an audience question, Dave McComb, I'll give you a chance to answer this question. And Dave, I think you're currently muted, so you'll need to take yourself off mute. Maybe not; I can't hear Dave right now, so why don't we take this audience question instead. The question is: if one opens data to a wider set of queries with new flexibility, one can hardly avoid introducing new errors, and even new kinds of errors, perhaps much more subtle than previously. So what are the specific challenges in validating semantic data systems? Dave Duggal or Sean?

I'll tackle it. That's a great question; it's really almost the central question, right? When you go into a more abstract environment, where you're explicitly allowing for more flexible relationships because you want that variety and that flexibility, can your system go out of control, as it were? Compare that to essentially static, siloed applications, where everything's done a priori and everything is rigid. Of course, even rigid applications break too, and you can have side effects in static applications as well, but there you're starting from the perspective that you want to lock everything down. The way we do it in a declarative application is that it's flexible as appropriate. We use the phrase "as flexible as possible, as procedural as necessary." There are some human interactions, say collaboration, where you'd like things to be more flexible and you might have fewer rules on them, whereas a transaction for something specific, buying a ticket or logging something into a store, is more structured, and you need things to be concrete and explicit. The way you bake that into abstract models is through the constraints themselves. You can put constraints on behaviors. You can say: yes, you're related to my object, but you have to use my object in a specific way to record something against me; you have to give me this data in this format, and then I'll record it. If you don't, you're not playing nicely with me, and the transaction won't go through. So the way you layer control back in is through policy and constraints. Again, we're working in highly regulated domains, from life sciences to telecom, and what those customers are finding is that they want the flexibility, but, you're right, they also want those enterprise-class or carrier-grade controls, because their business compliance matters and their IT governance matters. It is tricky, and it's something that has to be addressed in the platform you're looking at. We do it through policies and constraints. The advantage of constraining things through policies is that it's still declarative: if you have the right permissions, you can modify the policy over time to reflect new requirements, whereas if you embed it in code, you get that accidental complexity where the systems themselves become hard to change. You want the ability to lock things down while still keeping them mutable, because inevitably they will change.
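One way to make Dave's "policies and constraints as data" idea concrete is a constraint written as a SPARQL ASK query: the policy lives alongside the data and can be edited without touching code. A hedged sketch with rdflib; in practice one might instead use a dedicated constraint language such as SHACL, and all names here are invented:

```python
# Sketch: a declarative policy expressed as a SPARQL ASK query over the data.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.com/policy#")

g = Graph()
g.add((EX.tx1, RDF.type, EX.Transaction))
g.add((EX.tx1, EX.amount, Literal("100.00", datatype=XSD.decimal)))
# EX.tx1 lacks the ex:approvedBy fact that the policy demands.

# The policy asks: does any transaction lack an approval?
policy = """
PREFIX ex: <http://example.com/policy#>
ASK {
  ?tx a ex:Transaction .
  FILTER NOT EXISTS { ?tx ex:approvedBy ?who }
}
"""
print("Policy violated:", g.query(policy).askAnswer)  # True: no approval yet

g.add((EX.tx1, EX.approvedBy, EX.alice))              # add the approval fact
print("Policy violated:", g.query(policy).askAnswer)  # now False
```

Because the check is data, updating the policy when a regulation changes means editing the query, not redeploying an application.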
Okay, it looks like we have Dave McComb back. Are you there, Dave?

Yeah, sorry about that. The line dropped and I went mute.

No problem. So the question here is: what are the specific challenges in validating semantic data systems?

Yeah, there's a lot in that question. When the Semantic Web standards came along, there was the assumption that when you go to web scale for your information systems, at any given moment your view of the world is incomplete; they refer to this as the open-world assumption. You have to try to square that with the enterprise, where they believe that their set of data is complete. How do you mix and match a set of data that is inherently incomplete, loosely structured, loosely formatted stuff that you can't curate because you're harvesting it from the outside world, with sets of data that you've spent a huge amount of money making sure are exactly right? I would agree with Dave Duggal there. The solution we're working on with our clients now is to recognize that some of your repositories, some of your data sets, are going to be highly curated: you are going to have constraints on the way in, and you can really count on what's in there. And you're also going to be harvesting data from the wild web, if you will, or any other source, which is not going to be curated and is going to be incomplete. The key is that you can have a schema over the top of all of that, so that when you want to, you can combine the less curated data with the more curated data and get a more complete picture.

Okay. So, gents, we've got about three minutes left here. I'm going to ask each of you to make a prediction, in 30 seconds, for the future of smart data. Can you do that in 30 seconds? I'm not hearing "sure." So, Dave Duggal?

In five to ten years, you won't recognize enterprise IT. The things that we do today that are all manual, writing apps to be parallel, writing data to be immutable, handling asynchrony, manually integrating APIs: our children will look at us and say, Dad, you did what? You had to actually look at that, read it, and manually integrate it? In the future that will all be semantic, dynamic, and fully governable. If we had been able, a year ago, to imagine where we would be today, I think it's stunning. Sean, do we have you back?

Yeah, apologies.
I was just going to say: smart data, we're going to see it explode over the next two or three years. Five years out, who knows; ten years, it's impossible to say. But in two or three years, smart data will be everywhere, for sure.

Okay. Dave McComb, what's your prediction?

Well, I hate to be the naysayer on this one, but I suspect that five years from now, 95% of companies will be operating more or less as they currently are. We've watched a lot of people's 10-year plans, and the inertia is unbelievable. However, like I said earlier, there is still hope; there's the 5%. Five percent of companies are going to embrace this and are going to have some radical transformation. I do think most people overestimate change in the short term and underestimate it in the long term. In the long term, I think enterprise information systems are going to look a lot more like the App Store. We won't have these big monolithic applications; we'll have little tiny pieces of functionality. But unlike the App Store, which works because you are the systems integrator, these will be little tiny pieces of applications that are already pre-integrated for the firm, so as soon as you pick one up and use it, you're off and running. So: very rosy long term, but in the short term, for most folks, it's going to look a lot like it looks today. The challenge is picking the timing for everything. The direction is often clear; whether it takes 5 years or 25, the timing is the hard part.

So we're going to need to wrap things up here. If you'd like to meet all three of our panelists today, and in fact Shannon and myself as well, please join us at the Smart Data Conference in San Jose in a couple of weeks, and we'll be giving away a couple of tickets to that conference after the session today. I'd like to thank everybody who joined us, and in particular our panelists. Good job, gentlemen. And I'll hand you back now to our host, Shannon Kempe. Shannon?

Thanks, Tony, and thank you everyone, especially our attendees, for being engaged in everything that we do. I loved the questions that came in. One of the most popular questions, of course, is about the slides and the recording. Just a reminder: I'll be sending out links to both by the end of day Thursday. So if you don't have them in your inbox by the time you walk in on Friday, just let me know and I will make sure to get you a copy. Again, thanks to our panelists and to Tony for such a great discussion. This has been fantastic, and we really appreciate you putting this together. And I will see you in San Jose in a couple of weeks. Thanks, everybody.