 the puzzle and the value is when I put the data all together and look across it. Yes, I want to understand the website popularity, but I also want the internal view, the internal data set, the public company survey, the public end user survey, all of the analyst reports, all the email back and forth between our two companies, et cetera, et cetera. And the more of those, and the more you can boil that into a score, wow. All right. From an architecture standpoint, a follow-up question on that. Can you share with me your angle on all the developments around machine learning? GraphLab, whose conference was just yesterday in San Francisco, and it's very complicated around all the different graph architectures and the data having a graph format, but machine learning seems to be at the heart of that. So, you know, you mentioned ontologies earlier, and machine learning has been around for a while as well. How is machine learning changing these types of plans? It's another great question. You know, I think machine learning brings a little bit of the machine intelligence to it. So, earlier I talked about how Relay kind of curated the set of data. One of the things that brilliant people with a lot of domain experience do when they look at data is they realize things, they understand things, and they may want to tag the data. They may identify separations or divisions or outcomes from the data that they think are strong. The problem very often then is to take that and generalize it, and, you know, technologists who are listening on the phone, how long have you spent trying to tune or tweak an algorithm to deal with the fact that you have far too little data to really reflect the problem? And frankly, in structured data, that's much less of a problem.
When you start looking at unstructured data analysis, you need a lot of data, a lot of it, to really get the algorithm to do something like identify the concept for any document on the internet, right? That takes a lot of data, and it's an advantage, frankly, on the web that there is all that data. Machine learning is the tool you use to do that. So, when Brigham can say, hey, you know, connecting.abc, this is a good pattern, it can now use a machine learning engine to identify the kind of features and patterns that are present in that, make decisions based on that, and find all of the other examples. All the other ABCs, not this one, but the ones that are like it, right, the same kind of effect. That's what machine learning is great for. I can tell you that at Attivio, we've been long-time believers in machine learning. We've brought all the different approaches together, whether it's language modeling to do things like key phrase extraction, which gives you the really good concepts or terms inside a document, whether it's our machine learning classifier, whether it's our sentiment analyzer. These tools allow you to pick an outcome and then find all the others. And that's why I think it's important, because it lets you find an example in data and then find all the others like it. Yeah, and we use it today. You know, I mentioned the RBI score before, the index. I mean, that is fed by machine learning. Now, I will make the point that in my world, or at the edge I'm at, there are hypothesis-driven and non-hypothesis-driven approaches. And the non-hypothesis-driven is kind of the purest machine learning, and we have tools out of the box in Attivio we can use to do that: just draw connections on these data sets and use them to feed certain answers. And we do explore those parts of it for asking kind of an open question.
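To make the "find all the others like it" idea concrete, here is a minimal, hypothetical sketch using plain bag-of-words cosine similarity. Attivio's actual engine is far more sophisticated (tokenization, entity extraction, language models, trained classifiers), and the document text and threshold below are invented purely for illustration:

```python
import math
from collections import Counter

def vectorize(text):
    # Simple bag-of-words term frequencies; a real engine would add
    # stemming, TF-IDF weighting, and entity extraction.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def find_similar(example, corpus, threshold=0.3):
    # Given one labeled "good pattern" example, return the documents
    # in the corpus that look like it.
    ev = vectorize(example)
    return [doc for doc in corpus if cosine(ev, vectorize(doc)) >= threshold]

# Invented mini-corpus: two documents share the example's pattern, one does not.
docs = [
    "acme.abc connects partners to the acme supplier network",
    "quarterly earnings rose on strong widget sales",
    "beta.abc connects resellers to the beta supplier network",
]
matches = find_similar("acme.abc connects partners to the acme supplier network", docs)
```

Here `matches` picks up the first and third documents but not the unrelated second one, which is the "find the ones that are like it" effect described above, in miniature.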
A lot of times, for high-end users or people who really need business intelligence, you need a very targeted question. You're going in with some concept or framework of the question you're going to ask. And we create a lot of these kind of secondary variables and secondary outcomes to measure, that we then machine-learn specific things against. So the difference would be between asking the question, what should I invest in, versus, should I invest in this? Well, let me give you an example. What if we run machine learning and it determines that the cure for cancer is vitamin A, right? That's not really an answer that's meaningful to our users. What would be meaningful is if I can constrain certain parts of it to say, consider, you know, ones that have not been FDA approved and are not generic. Consider mechanisms that are related to that. Consider patent aspects that look like this. So by training that set and kind of targeting it, you open up the power of machine learning and you can actually drive to a meaningful answer, as opposed to kind of throwing data against the wall and seeing what sticks. And this is where you get into things like causal inference modeling, Markov chains, everything else. I think there'll be a lot of sophistication emerging in that, particularly in life science and big data healthcare, quite frankly, in the next five, ten years. So, you know, that's where it's going. You have to have that skill set, both on the broad data set and also in a targeted way, to get to real meaning, I think. So I wonder if we could switch gears a little bit to some of the cultural considerations. We talked before the call a little bit about some of the people in your industry not having very much quantifiable data, making decisions without it, and taking a long time to develop their decision making process.
Your tool comes in and allows them to do it much quicker, but it's different and it's a new way of doing business. How do you approach getting people to adopt a new way of doing business? And in particular, how is it changing their jobs? If now, all of a sudden, they don't have to take weeks or months to come to these decisions, they can do it in a day or less, how is that changing the industry? It's been really interesting, I think. And we come from the folks who understand it. Scientists in particular are cynical about data, and they're cynical about trends and kind of algorithmically driven things. I think the way we've gotten around that in our first product, BD Live, is that we give them quant, but we're one click away from the document that is linked to that quantitative measure. And that's why having Attivio underneath is great, because it's all search-driven directly. You can go there through the ontologies and actually return the reference. So you get to join that qualitative experience of, I read scientific papers, I know what's going on, with this quantitative thing that kind of measures my intuition. And there's a great example of the experience with that culturally. We were with a client, and we were given a number of assets they were evaluating. We had to come up with our evaluation, and they had already done theirs. So there's the moment where I slide my data-driven answer across the table, and they were looking to narrow it to the four top ones. Now, we had two of the same ones they had, so we confirmed something for them with data, which I think is always good, and then two that were different. On one of theirs that we demoted, we picked up something in the intellectual property, i.e. information they wouldn't have joined to their analysis yet. The lawyers would have done it a month later. We picked up something that would have saved them a month of BD time, because they know that's not going to work.
So that's an example of a great downgrade. Then somebody, as they're staring at it, kind of slams his hand on the table and goes, I knew this one was good. And on one of the ones we promoted, they rehashed the argument they had had before. So it was kind of like, quant is not the answer, but it is another evidence base. If you introduce it culturally as, again, they could go talk to KOLs, key opinion leaders, they could talk to the experts, but then add this as a secondary component, culturally that tends to work. Also, being able to be transparent and get to document level is key to them understanding what that quant means. So it's not a black box; it allows them to drill down and really essentially test out the answers they're getting, or the advice they're getting, from Relay. And that's how they're going to ultimately grow the confidence in machine learning and these kinds of technical systems. I think Brigham made a great point. You put up a quant number, but then you let them see the documents. Think about that as a transitional state, right? Previously you would have just surfaced the documents for the talented analysts to consume and then draw the conclusion out. So putting some data, some structure around it, just makes for a much smoother transition. It's a lot different than going from no data to having some data; it's a much more subtle increase. And we've seen it a lot, actually, and it's very challenging. We have some intel clients, and they've been big on machine learning for a long time. They've been looking across silos a long time; that's been a goal for them. But the challenge often is, when you suddenly start putting structure on things, they're used to having the data just surfaced for them as analysts. But over time, as they see, hey, this number is actually working, or this recommendation is bringing me something that's letting me understand more about how I make this decision, their own mental process changes.
They realize, I actually only look at the first four or five documents to assure myself. So now, using the score, I can go down and see maybe a breakdown of the quartiles of documents, and go in and look at some of the lower-level documents, and that changes the way I use the data. And now I start to trust the number more. And then in time, we believe that you'll start to say, well, if the number's above 80, I don't even need to be involved. The analyst doesn't need to be there; the decision can be made automatically. Do you have the ability with your technology to drill down and discover the root source of incorrect information? So can you kind of mine for bad information in these sort of diverse databases of unstructured data? Oh yeah. Could I add a little bit to that? How about deliberate bad data, trying to manipulate the market, for example, ahead of time, or using it to get a drug approved? How do you see that playing out, and how do you accommodate that in your analysis? So, are you asking how long I think it takes the market to respond to this data that we've now introduced? I think he was asking how to weed out the criminals, but yeah. Well, I'll tell you one that I think is a focus and gets brought up by all of our clients. Here's a phrase I hear a lot: we don't believe the publication literature. We think the editorial process is flawed, or we think there's bias at point XYZ. We can actually measure those things. We develop patterns to detect when there's a scientific controversy. So I'll give you a recent example: CETP inhibitors. Okay, this was supposed to be the follow-on to Lipitor. So Lipitor, a huge drug for the industry, and CETP was supposed to be the next drug and get all that revenue. Well, the science isn't looking like it worked out, and there was a drug that failed, and it was interesting. There was the backlash to the failure, and there was the backlash to the backlash, where the scientists said, no, no, no, it was just that drug.
Everything else is still good. And you can actually see those waves in the data. And I think, you mentioned velocity, Sid; I mean, we're looking at multiple derivatives of this stuff and trying to detect when you see those pops in signal, and those mean something. It's almost like, if you see A, then B, then C, you're more confident in F, right? And then it's actually transmitting that back to folks. So we spend a lot of time not just on, you know, the positive, affirmative stuff, but also on the negative things. There's a project we're discussing with the FDA: could we ferret out adverse event profiles based on early data? So can we say, this is what the mouse model data looked like; what's the correlation of that to a potential problem? And not just the data, but relate the whole ecosystem and see if we can detect those things. So, exciting stuff there. And what about kind of helping wade through all the false positives that might be out there, or data or insights that aren't really insights, maybe just noise? How do you help your clients understand when something is really relevant and when it's not? Yeah, I mean, right now we spend a lot of time talking about what the market does, detecting signals about when we have seen those controversies, what they look like, and then looking for common signals. In the future, you know, a dream of mine, you talk about scientific data, and I've got to convince Sid to take in graph axes and, you know, detect statistical models within figures. But long-term, you could imagine coming up with your own model to measure that stuff and seeking out certain aspects of it. Again, kind of hypothesis versus non-hypothesis, targeting it that way. And I think that'll come. I mean, on the data transparency thing, while we keep pointing up and to the right, there's also gonna be data complexity.
And as complexity is introduced, you're gonna get more refined signals, or more important signals, than you're getting just off kind of the basal noise. And so the ability to deal with kind of complex data will also be really important. I think there will be a substantial change in investment there; it's like the analogy of fraud detection, right? There will be more data quality work on unstructured data. People will become very interested in understanding, what does a misspelling mean? Is it deliberate, right? Or what if two entities are placed near each other a lot, which would be a typical approach to try to throw some of these systems off by playing on their biases? Well, one thing is, you know, getting good data sources and clean data sources, that's still important. There's a reason people pay for data, right? They will pay Relay because Relay's data is really high quality and you have, you know, all kinds of checks and things that go on. When you go and harvest data, actually, especially when you harvest a lot of data, that can be one of the hardest things to do quality assurance on. There's so much of it; how would you even go and find issues in the data? We have a lot of different techniques to bring to bear, and some of them work better in different industries. You know, we have a client, a big manufacturer, who is essentially using indexing of documents to find documents that need review. And they have a whole series of special terms that they look for. And if these appear in the documents, then they review them. And what they're finding is, these documents should never have been put in the wild to begin with; wild is relative, right, maybe it's an intranet. But being able to say, this is an example of a document that I don't want to be publicly available to all my employees or partners, and then being able to train the system to go find others like that.
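As a hedged sketch of the manufacturer's workflow described above: index the documents, flag any that contain a watchlist term, and route those for human review. The watchlist terms and document names here are invented for illustration, and a real deployment would then train a classifier on the flagged examples to catch variants the term list misses:

```python
# Hypothetical watchlist of sensitive terms (the client's real list
# would be domain-specific and much longer).
WATCHLIST = {"confidential", "internal-only", "draft"}

def needs_review(doc: str) -> bool:
    # Flag a document for human review if any watchlist term appears.
    tokens = set(doc.lower().split())
    return not WATCHLIST.isdisjoint(tokens)

# Invented mini-corpus standing in for the harvested document index.
corpus = {
    "memo1": "internal-only pricing strategy for q3",
    "memo2": "public press release on product launch",
}
flagged = [name for name, text in corpus.items() if needs_review(text)]
```

Here only `memo1` is flagged; the reviewed, confirmed hits then become labeled training data for the "find others like that" step.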
That's one example. Another is finding essentially variant but repeated paragraphs for plagiarism detection. There's been a lot of work around things like that. Students are hating that. Absolutely. So we've got about 15 minutes left, and I want to make sure that anyone that's online has an opportunity to ask questions. So let me pause here, talking about big data, unstructured and structured data analysis. I wonder if I could ask a question. This is Dave Vellante. Brigham, you mentioned the business user experience early on in your remarks, and I'm wondering how you help your end customer, the analyst, visualize all this data. That is a tremendously important part of this. You know, the users don't like to get a big text or CSV file to pore over. That's not what they're looking for. In our UI, we took kind of two approaches. We've used GWT and a number of apps there that are really rapid for visualizing data off our backend, and always showing things temporally, I think, is crucial. Right now everybody gets kind of today's answer, but showing trends is a big part of it. And the second part that's really big is that we are working right now with both Tableau and TIBCO Spotfire in our front end as an OEM. We've been using them for the last six months, ahead of our product launch, in delivering actual deliverables and consulting, and having our analysts actually work from that end. There's a couple of really important parts to this. Number one, you're able to cut the data in any way you want, beyond what you can do in Excel or anything like that. So it tends to blow away these users just on that basis. But then there's this aspect of it that, again, combines quant with qual. So you can have a graph that's showing a temporal thing on a given gene, for instance, and then right next to it, you can show the documents that are related to that. So it's kind of this experience of, oh, it's going up at a faster rate.
Let me zoom in at that range and then see what happened there. What were the key things that might drive it, that make sense to me from an understanding basis? The other part of this, and this is one of the crucial things with Attivio, and we're actually announcing this week that we're a featured customer now of TIBCO Spotfire and we're doing a joint release with Attivio, is that it's one thing to build a graph that's plugged into a database, relational or otherwise. Those have been around for a while now. It's another thing to engage the user in an exploratory sense, to use their intuition and search to actually craft the quant. So I have a dashboard right now where I'm looking at thought leaders, KOLs. And I can start with my ontologies and use ontologic search and say, all right, I want the best guy in anemia on these two quantitative bases. And then I can add the ability to search the documents tied to that person and say, okay, constrain it by, you know, a certain type of mouse model that is common to research. So I can add a text string that then limits my people results by the documents that co-occur under that structure. So instead of me a priori telling the user, this is what's important, it gives them the ability to explore and then return documents and related information. So it's giving kind of the steering wheel to the user to drive quant, which is great. A last really important part of this, and we started to touch on it, is the ability to do statistical modeling, because at the end of the day with quant, a lot of users, particularly on a big decision like this asset or that asset for M&A, need to know how sure are we, and when were we sure. And to do that, you really need to build in statistical inference. And for us right now, we built a layer with an open source product called R, but there are other packages we can integrate.
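The panelists use an R layer for this statistical inference; as a hedged, illustrative sketch of one such technique in Python's standard library, a percentile bootstrap can attach a confidence interval to an average score. The asset scores below are invented; a wide interval says "we're not so sure about this area," a narrow one supports a confident call:

```python
import random
import statistics

def bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=42):
    # Percentile-bootstrap confidence interval for the mean score:
    # resample with replacement many times, then take the middle
    # (1 - alpha) share of the resampled means.
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=len(scores)))
        for _ in range(n_boot)
    )
    lo = means[int(n_boot * alpha / 2)]
    hi = means[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

# Invented asset scores from two hypothetical therapeutic areas.
ophthalmology = [55, 62, 48, 71, 50, 66]   # noisy, spread out
cardiology = [81, 84, 79, 86, 83, 80]      # tight, consistent

lo1, hi1 = bootstrap_ci(ophthalmology)
lo2, hi2 = bootstrap_ci(cardiology)
# The ophthalmology interval comes out much wider than cardiology's,
# which is the "not so sure about ophthalmology" signal in miniature.
```

This is only a sketch of the idea; the production stack described here does this kind of analysis in real time against the search index.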
We have a kind of front-to-back stack where we have Attivio's index on the bottom, R in between, and then Spotfire visualization on the front. And that's actually doing statistical analysis of this quant in real time, which is great, because sometimes you get back, maybe we're not so sure about ophthalmology, it's not that clear based on the data, but we're really confident about another area. And I think the last leap for big data, for people to turn over the keys to, like you said, the score, is that it's going to start with seeing documents and getting used to it, but ultimately it's going to be, look, statistically, this is what's likely to happen, and I'm this confident about it, and here's when I knew. And I think... Don't listen to this guy. Right, exactly. This guy's wrong. I think it's an amazing example of what unification does. Relay unifies information; they provide intelligence across it. Even think about the user interface, the experience of the business user. Brigham has used a lot of different terms: ontologies, search, dashboards, reporting, visualization, graphs. That's what UIA, unified information access, is all about. It's about creating intelligence off of all these different sources, all these different methods. It's not about one method. Search is not enough, BI is not enough; it's the collective. And I believe that this is very much the future of big data and data analysis: putting all the pieces together with that layer of brilliance and insight on top. So when you sit down with a CIO, what do you tell them? I say that they should be looking immediately to use UIA on their strategic projects. Quite honestly, that's the best way to do it. When you have a project that requires you to integrate information and provide search and BI, analytics, dashboards, all of that experience on top of it, and maybe even workflow.
And those are the apps that I think people are building now that matter, that capture the interaction with the user, get to a deeper level, and provide a greater quant and qual insight for the internal decision makers. They have to start doing that right now, because of the value of gaining the expertise with it, getting the infrastructure in place, and learning what it takes, how easy it can be to merge a bunch of silos together and build an application on top of it, how quickly it can be done, and how reduced the risk can be. We didn't talk about that much, but one thing about Relay is they got to market faster because they didn't try to build the stack. They took a stack and said, we're gonna build the app, and that's where we're gonna put our effort and our uniqueness, and that's where their brilliance shines. They might be brilliant open source developers or they might be brilliant engineers, and they have all those folks, but that's not what it's about. It's the collection. Don't let me, you know... No, actually, it's a very interesting point, because one of the constraints that's been thrown out is the lack of data scientists to do big data analytics. Are you saying that a less sophisticated data analyst can drive more value for the company by using a particular set of tools that make the experience easier or more understandable? I don't know if I would say that they're a lesser analyst. I think they're all brilliant analysts, or the audience at the end of the day, decision makers. The point is, Relay's expertise is in knitting together, this is my take, knitting together and providing the intelligence on top of all the sources and all the technology.
It's not implementing the individual technologies. And that's where, yes, to a CIO I would say, rather than going out and constructing your own UIA stack from mega-vendor one here and some open source there, where you're gonna knit it together, and then you're gonna have to learn how to track these projects and keep up with the patch rates and all the things that go into building software, maybe that's not your expertise. But you've got the analysts and the business, and somebody else could help you put the information together and surface it, right? I keep using this line: surfacing the information for a talented analyst at the end of the day. If you can do that, then the focus is on interpretation. It's a software-as-a-service model. This is Dave again. Can I push on that a little bit? Or should I ask a question of Brigham? Yeah. You're suggesting that the answer is that you just provide better tools for the analyst. What about the analysts themselves and the education and the training that's required, at two levels? At one level, what advice would you give to the government in terms of what's required to really take advantage of this? For example, the amount of knowledge of Bayesian statistics is not exactly great in the country as a whole. And secondly, for a CIO, what advice would you give them to develop the skills required for the next decade? Well, from my perspective, I can tell you I told all my younger cousins to switch majors to bioinformatics about five years ago. My background on Wall Street is that I covered, among other things, genetic tools companies. There's this building technology around human genome sequencing, and there's gonna be a massive amount of data, and the only way to get answers out of it is ontologies and then really sophisticated analysis. And that could be hugely valuable to the healthcare system. So from a government perspective, what should you do?
First of all, I believe everybody should get some level of statistical and biological education. Maybe I'm biased toward the biology, but certainly statistics beyond what's done today. I think you don't have to have an analyst who is a sophisticated Bayesian modeler if you have some of those tools out of the box. And I'm looking at and paying attention to both what could potentially happen on Sid's end and on TIBCO's end. Somebody is going to build this in where it's gonna be more intuitive. And just from the visualization side, those companies have taken leap one, which is to be able to basically build cross tabs really easily and put them in a beautiful form. So I think there's gonna be more out-of-the-box stuff available without having to be an advanced programmer or a stats guy. But from a government perspective, you need to have an understanding of what that means and how to use it. I'm lucky enough to have been tortured by a stats professor in pharmacology, so that was where my understanding of this originally came from. So I guess he should be teaching more folks. On the second side, from the CIO perspective, what would I tell a CIO? I've talked with these groups, because there's some interest, obviously, in them incorporating their own data into our analysis. This is where there's a nice connection between Attivio and Relay. And they look at it like, well, we've got our internal silo and we wanna match it to what you've got. We have the scale and ability to do that. We could create a private instance. I mean, back to the Attivio choices, there's all kinds of reasons why that works for us. I think CIOs are trying to stay ahead of the data problems, but they're not that in touch with the actual people on the end of the phone making the decision. And those are our people. I mean, those are the ones we're trying to reach.
And just to prove my point: you may have a CIO who is investing in a major internal data architecture movement, maybe even around BI specifically. He will have, let's say, 50 users, and 30 of them have SaaS subscriptions, which is something he doesn't really like, and actually usually multiple ones. Not only that, they're using Google and other search engines for specific data sources more often than not. So they may ask something of internal knowledge discovery to get a report back, but they're not using it at their desk. That's just not happening. And there's a couple of reasons for that. Number one, they don't have access to the data internally in any kind of meaningful way. This is the structure side. Combining that internal data in a meaningful and accessible way is problem one. I've seen, and I'm sure you have too, some really bizarre-looking information architectures. And we know how these things evolved: they evolved over time, there were mergers, there were different groups, whatever. The second side of it is, they don't know the tools to really interact with the data. And this is more the Spotfire or Tableau side of the world. Pharma is unique because Spotfire in particular grew up in the bioinformatics world and got a big part of its start dealing with the scientists at a real low assay level. That's not the BI users. They might know what it is, and they might have seen an interaction like it with scientific data, but they haven't gotten there. I think lastly is then kind of connecting the dots and making meaningful analysis of it. And right now, the analysts don't understand the information architecture well enough, and the information architecture guys don't know the questions. So that's why, for Relay, we built this SaaS offering. Because I think we can leap in there and help them out right now.
So it really boils down to a communication issue between business and IT, which we hear across IT segments, and it's particularly important when you're talking about data analytics and solving business problems with data. So how does the CIO go about initiating that? Let's take a CIO who understands the power of big data and what it can do for their organization. How do they go about initiating that kind of first project? Is it a technical problem? Do they need to identify the business case first? I mean, how do you really practically get started? And unfortunately, this is our last question, so make it a good answer. Okay, I'll do my best. So I think number one, they've got to have a long view, just like we did about structuring data. They've got to move to something which is going to be scalable and, long-term, still be in the game. This is one of my concerns with Hadoop. And I think, Sid knows this very well, I get the comment, it's like, yeah, it handles volume, but it doesn't handle analysis very well. And when we've architected considering Hadoop, it's like, yeah, we may put certain data sets there if they're big, chunky ones, but we'll deal with the summary reports in our workflow for analysis. Hadoop is very powerful, but if it just becomes another silo, that doesn't solve the problem. That's exactly right. So I think they want to look to that long term. And then, to help their users, they need to begin to consolidate and give them tools that actually deliver them answers. Now, you could give them all Spotfire subscriptions tomorrow, but I don't think that would necessarily get them there. I think this is why Relay is really exciting. It's a SaaS offering, but it has the potential to plug into their internal data. So it's almost like they can get the outside world organized by us.
We give them the structure to answer the questions and the look and feel, and then strap in the know-how at pharma company X to that decision through an integration with an Attivio index or another data set that we can plug into. So you don't want to try and boil the ocean as far as coming up with the answer, because you don't know what's at those analysts' desks and they don't know the problems you're facing. So in the meantime, you need to focus on, how do we kind of connect those dots? And I think there are some offerings here to do that. Brigham, thank you so much. Sid, thank you. I hope that you'll both join us back here again on Peer Incite in the future, or on theCUBE at one of the many events of SiliconANGLE TV. I'm John McArthur, Peer Incite moderator, here with Jeff Kelly, Wikibon big data analyst, joined by Brigham Hyde, a professor at Tufts University and managing director of Relay Technology Management, and Sid Probstein, CTO of Attivio. Thank you very much for joining us today. Very helpful. We'll have the Peer Incite research notes up on the Wikibon site in the next couple of days. Feel free to jump in, edit, contribute, enhance, or provide another perspective on any of the analysis that we do and anything that we write. Thanks again. This is John McArthur, Peer Incite moderator. Look forward to seeing you all again soon. Thank you.