And we're back. This is George Gilbert from Wikibon, and I'm here with Aman Naimat at Demandbase, the pioneers in the next generation of AI-driven CRM. So Aman, let's continue where we left off. We were talking about natural language processing, and I think most people are familiar with it more on the B2C side, where the big internet providers have accumulated a lot of voice data and have learned how to process it and convert it into text. So tell us how B2B NLP is different, to use a lot of acronyms. In other words, how are you using it to build up a map of relationships between businesses?

Right, yeah, we call it the demand graph. It's an interesting question, because it turns out that while B2B language is very different, it's also quite boring. It doesn't evolve as fast as consumer language, and that makes the problem much more approachable from a language-understanding point of view. Natural language processing, or natural language understanding, is all about how machines can understand, store, and take action on language. We started working on this four or five years ago, and that's my background as well. The problem turned out to be simpler than consumer language, because human language is very rich: converting voice to text is trivial compared to understanding the meaning of words, or even the sense of a word, which is much more difficult. Apparently in English each word has on average six meanings; we call them word senses.

So the problem was certainly simpler, because B2B language doesn't evolve as fast as regular human language; terms stick in an industry. The challenge with B2B, and why it was different, was that each industry or sub-industry has its own specific language, jargon, and acronyms. To really understand an industry, you need to come from that industry. If you go back to the CRM example of what happened 10 or 20 years ago, if you wanted to sell into an industry, you would hire a salesperson who came from it, and that still happens in some traditional companies. So the idea was to replicate the knowledge that person would have as if they came from that industry, the language and the vocabularies, and then ultimately have a way of storing and taking action on it. Very analogous to what Google had done with the Knowledge Graph.
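To make that concrete, here is a minimal sketch of how a demand graph might be represented: companies as nodes, and typed, confidence-weighted relationships (customer, partner, supplier, competitor) as edges. The schema and names are illustrative assumptions, not Demandbase's actual implementation.

```python
from collections import defaultdict

# Relationship types named in the conversation (assumed edge vocabulary).
RELATION_TYPES = {"customer", "partner", "supplier", "competitor"}

class DemandGraph:
    """Toy typed graph of inter-company relationships (hypothetical schema)."""

    def __init__(self):
        # (source, relation) -> {target: confidence}
        self.edges = defaultdict(dict)

    def add_relation(self, source, relation, target, confidence):
        if relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation: {relation}")
        # Keep the highest confidence seen across corroborating sources.
        best = self.edges[(source, relation)].get(target, 0.0)
        self.edges[(source, relation)][target] = max(best, confidence)

    def neighbors(self, source, relation):
        """Related companies, strongest evidence first."""
        return sorted(self.edges[(source, relation)].items(),
                      key=lambda kv: -kv[1])

graph = DemandGraph()
graph.add_relation("Nike", "competitor", "Adidas", 0.92)
graph.add_relation("Tesla", "supplier", "Panasonic", 0.85)
print(graph.neighbors("Nike", "competitor"))  # [('Adidas', 0.92)]
```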
Okay, tell us. All right, so two questions, I guess. First, it sounds almost like a translation problem, in the sense that you have some base language primitives, like partner, supplier, competitor, customer, but the language in each industry is different, and so you have to map each industry's language down to those primitives. So tell us the process. You don't have people on staff who translate from every industry.

I mean, that whole approach of translating, of writing logical rules or expressions for language, was conventional, what's called good old-fashioned AI.

You mean rules-based knowledge engineering.

That's right. And that clearly did not succeed, because it was impossible to do.

There's the old quip from one researcher: every time I fired a rules engineer, my accuracy score would go up.

That's right. And the problem is that language is evolving and the context is so different, right? Even pharmaceutical companies in the US, or in the Bay Area, use different language than pharma companies in Europe or in Switzerland. And so it was just impossible to quantify all the variations...

And to do it manually.

To do it manually, it's impossible, and not just for a small startup. We did try having it generated, and in the early days we used crowdsourced workers to validate the machine. But it turned out they couldn't do it either, because they didn't understand pharmaceutical language either, right? So in the end, the only way to do it was to have some sort of a model and some seed data to validate it, or to hire experts and validate small samples of data.

So going back to the graph: where we've seen sophisticated AI work on complex problems, for example predicting your next connection on LinkedIn, or your next friend, or what ad you should see on Facebook, it has used network-based data, social graph data, or in the case of Google, the knowledge graph of how things are connected. And somehow machine learning and AI systems based on network data tend to be more powerful and more intuitive than other types of models.

Okay, so when you say model, help us with an example: you're representing a business, who it's connected to, its place in the world.

So the demand graph is basically: who are our customers, who are their partners, who are their suppliers, who are their competitors, utilizing that network of companies the way we have networks of friends on LinkedIn or Facebook. And it turns out that businesses are extremely social in nature. In fact, we found that the connections between companies carry more signal, and are more predictive of an acquisition or of the next customer, than even the Facebook social graph. So it's much easier to use the B2B business graph to predict the next customer than to, say, predict your next friend on Facebook.

Okay, that's a perfect analogy. So tell us about the raw material you churn through on the web, and then how you learn what that terminology might be. You've bootstrapped a little bit, now you have all this data and you have to make sense of new terms, and then you build this graph of who this business is related to.

And the hardest part is handling rumors, and handling jokes like, "Isn't it time for Microsoft to just buy Salesforce? Question mark, smiley face." So it's a challenging problem. But we were lucky that business language and the business press are definitely more boring than people talking about movies or...

Or Reddit.

Or Reddit, right. So the way we work is we process the entire business internet, really the entire internet. Initially we used to crawl it ourselves, but we soon realized that Common Crawl, an open-source foundation, had crawled the internet and published at least a large chunk of it, and that enabled us to stop crawling. We read the entire internet, and ultimately we're interested in businesses, because that's the world we're in: B2B marketing and B2B sales. We look wherever a company, a business person, or a business title is mentioned, and ignore everything else, because if a page doesn't have a company, a business person, or a business product, we don't care, right? So we read the entire internet, and if we infer that, hey, Amazon is mentioned here, then we figure out: is it Amazon the company, or is it Amazon the river? That's problem number one; we call it the entity linking problem.
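As a toy illustration of the entity linking problem just described, one common approach scores the words around a mention against a context vocabulary for each candidate entity. The vocabularies and scoring here are invented for the example; a production system would use far richer features.

```python
# Candidate entities and hypothetical context vocabularies.
CANDIDATES = {
    "Amazon (company)": {"aws", "retail", "ecommerce", "prime", "cloud",
                         "seller", "revenue", "stock"},
    "Amazon (river)":   {"river", "rainforest", "brazil", "basin",
                         "wildlife", "flows", "jungle"},
}

def link_entity(mention_context: str) -> str:
    """Pick the candidate whose vocabulary overlaps the context most."""
    words = set(mention_context.lower().split())
    scores = {entity: len(words & vocab)
              for entity, vocab in CANDIDATES.items()}
    return max(scores, key=scores.get)

print(link_entity("Amazon reported strong cloud and retail revenue"))
# -> Amazon (company)
print(link_entity("The Amazon river basin covers much of Brazil"))
# -> Amazon (river)
```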
And then we try to understand and piece together the various expressions of relationships between companies expressed in text. It could be a press release, it could be a competitive analysis, it could be an announcement of a new product, it could be a supply chain relationship, it could be a rumor. It also turns out the internet is very noisy, so we look at corroboration across multiple disparate sources...

Interesting. To decide whether it's true?

Whether the signal is real, right? Because there's a lot of fake news out there. So we look at corroboration across the sources to infer whether we can have confidence in it.

I can imagine this could be applied to a lot of other problems. Political issues. So, okay, you've got all these sources. Give us some specific examples of feeds, of sources. And help us understand, because I don't think we've heard a lot about the notion of bootstrapping, and it sounds like you're generalizing, which is not something that most of us, who have a surface-level familiarity with machine learning, know much about.

There was a lot of research on this. Not to credit Google too much, but bootstrapping methods were used by Sergey Brin; his was, I think, one of the first papers. And then he gave up, because they founded Google and moved on. Since then, around 2003, 2004, there was a lot of research on this topic. It's in the genre of unsupervised machine learning models, and in the real world, because there's less labeled data, we tend to find it an extremely effective method for learning language. And obviously now with deep learning, unsupervised methods are being utilized even more.

The idea, and this was around five years ago when we started building this graph, and I obviously don't know how the Google Knowledge Graph is built, because we don't tend to talk about how commercial products work, but I can assume it's a similar technique, is to generalize models, to learn, from a small seed. So let's say I put in a seed like Nike and Adidas and say they compete, right? Then you look at the entire internet and at all the expressions of how Nike and Adidas appear together in language. It could be, "I think Nike shoes are better than Adidas."

So it's not just that you find an opinion that one is better than the other; you find all the expressions indicating that they're different and that they're in competition.

That's right. But we also find cases where somebody says, "I bought Nike and Adidas," or "Nike and Adidas shoes are sold here." So we have to be smart enough to discern when it's something else and not competition.
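The seed-based bootstrapping he describes echoes Sergey Brin's early pattern-relation extraction work (DIPRE). Here is a heavily simplified sketch of the loop: learn textual patterns from a seed pair, then reapply them to harvest new pairs. The corpus and patterns are toy stand-ins, not the actual system.

```python
import re

corpus = [
    "I think Nike shoes are better than Adidas.",
    "Pepsi competes with Coca-Cola in most markets.",
    "Nike competes with Adidas in most markets.",
    "I bought Nike and Adidas sneakers yesterday.",
]

seeds = {("Nike", "Adidas")}  # seed: these two compete

def learn_patterns(corpus, seeds):
    """Generalize sentences containing a seed pair into slotted patterns."""
    patterns = set()
    for a, b in seeds:
        for sentence in corpus:
            if a in sentence and b in sentence:
                patterns.add(sentence.replace(a, "{X}").replace(b, "{Y}"))
    return patterns

def extract_pairs(corpus, patterns):
    """Reapply learned patterns to the corpus to find new candidate pairs."""
    pairs = set()
    for pattern in patterns:
        regex = re.escape(pattern).replace(r"\{X\}", r"(\w[\w-]*)") \
                                  .replace(r"\{Y\}", r"(\w[\w-]*)")
        for sentence in corpus:
            match = re.fullmatch(regex, sentence)
            if match:
                pairs.add(match.groups())
    return pairs

print(extract_pairs(corpus, learn_patterns(corpus, seeds)))
# Picks up ('Pepsi', 'Coca-Cola') via the "{X} competes with {Y}" pattern.
# Note that the "I bought {X} and {Y}" pattern would also match
# non-competitors, which is exactly the false-positive problem he mentions.
```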
Okay, so you've told us how this graph gets built out: the suppliers, the partners, the customers, the competitors. Now you've got this foundation...

And people and products as well.

Okay, people, products. You've got this really rich foundation. Now you build an application on top of it. Tell us about CRM with that foundation.

Yeah, we have the demand graph, into which we also tie in basic data you could find, like firmographics, plus intent data that we've built. But with the knowledge graph itself, our initial intuition was that we would just expose it to end users and they would figure it out. It was just too complicated, right? It really needed another level of machinery and AI on top to take advantage of the graph and to build prescriptive actions. An action starts with a business problem to solve. The problem could be: I'm an IoT startup, and I'm looking for manufacturing companies who'll buy my product. Or: I'm a venture capital firm, and I want to understand what other venture capital firms are investing in. Or: hey, I'm Tesla, and I'm looking for a new supplier for the new Tesla screen, things of that nature. So then we apply and build specific models, more machine learning, or layers of machine learning, to solve specific business problems.

Okay, so...

Like reinforcement learning to understand the next best action.

And are these models associated with one of your customers?

No, they're general purpose; they're packaged applications.

Okay, tell us more. What was the base-level technology you started with, in terms of being able to manage a customer conversation, a marketing conversation? And then how did that get richer over time?

Yeah, we take our proprietary data sets, which we have accumulated and manufactured over the years, and co-mingle them with customer data, which we keep private, because the customers own that data. The technology is generic, but you're right, the model being generated by the machine is specific to every customer. Obviously, the next-best-action model for a pharmaceutical company is based on which doctors someone is visiting, whether this person is an oncologist, what they're researching online. And that model is very different from a model for Demandbase, for example, or for Salesforce.

Is it that the algorithm is different, or that it's trained on different data?

It's trained on different data. It's the same code. I mean, we only have 20 or 30 data scientists, so we're obviously not going to build custom code for every customer. So it's the same meta-model, but it's trained on different data: public data, but also the customer's private data.

And how much does the customer, let's say your customer is Tesla, how much of it is them running some of their data through this bootstrapping process, versus how much of it is your model being set up so that, once you've bootstrapped it, it automatically starts learning from the interactions with Tesla itself, from all the different partners and customers?

Right. We have found that most startups are learning over small, customer-centric data sets. What we have found is that the real magic happens when you take private data and combine it with large amounts of public data. At Demandbase we have massive amounts of public and proprietary data, and you tell us that, hey, our client is Tesla, so the system understands the localized graph and knows the Tesla ecosystem based on public data sets and our proprietary data. Then we also bring in your private slice of data whenever present. We'll plug code into your website and start understanding the interactions your customers are having, and based on that we're able to train our models. As much as possible, we try to automate the data capture process. So in essence, using a sensor, a pixel, on your website, we take that private stream of data, include it in our graph, and merge it in. And that's where we find our data by itself is not as powerful as our data mixed with your private data.
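A tiny sketch of the "same code, different data" idea: one generic routine builds a per-client training set by merging public graph features with the client's private pixel stream. Every name, feature, and data point below is hypothetical.

```python
from dataclasses import dataclass

# Toy public graph: (company, relation) -> related companies.
PUBLIC_GRAPH = {
    ("Tesla", "supplier"): {"Panasonic", "LG"},
    ("Rivian", "supplier"): {"LG", "Bosch"},
}

def shared(relation, a, b):
    """How many relations of this type two companies have in common."""
    return len(PUBLIC_GRAPH.get((a, relation), set())
               & PUBLIC_GRAPH.get((b, relation), set()))

@dataclass
class TrainingExample:
    account: str
    features: dict   # merged public + private signals
    converted: bool  # label from the client's own records

def build_training_set(client, pixel_visits, outcomes):
    """Merge public ecosystem features with the client's private signals."""
    return [
        TrainingExample(
            account=account,
            features={
                "shared_suppliers": shared("supplier", client, account),
                "site_visits": pixel_visits.get(account, 0),
            },
            converted=label,
        )
        for account, label in outcomes.items()
    ]

# The same meta-model code would then be fit separately for each client.
print(build_training_set("Tesla", {"Rivian": 12}, {"Rivian": True}))
```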
So I guess one way to think about it would be: there's a skeletal graph, and maybe that sounds a little too minimalistic...

There's a graph.

But when you work with, let's say we take Tesla as the example, you tell them what data you need from them, and that trains the meta-models and fleshes out the graph of the Tesla ecosystem.

Right, with whatever data we couldn't get or infer from the outside. And we have a lot of proprietary data where we see online traffic, business traffic, what people are reading, who's interested in what, for hundreds of millions of people, right? We've developed that technology, so we know a lot without actually getting anyone's private slice. But whenever possible, we want the maximum impact. It's actually simple; let's divorce ourselves from the word "graph" for a second. It's really about this: let's say that I know you, right? There's some information you can tell me about yourself. But imagine if I Googled your name and read every document about you, every video you've produced, every blog you've written; then I have the best of both kinds of knowledge, right? Your private data, maybe from your social graph on Facebook, and your public data. And then if I partnered with Forbes, and they told me whenever you logged in and read something on Forbes, they would give me that data too. So now I really have a deep understanding of who you are, what your language is, what you're interested in.

Okay, and that's the simplified version.

Similar, at a much larger scale.

All right, let's take a pause at this point, and then we'll come back with part three.

Excellent.