Hello and welcome. My name is Shannon Kemp and I'm the executive editor of DATAVERSITY. We'd like to thank you for joining today's DATAVERSITY webinar, Integrating Structure and Analytics with Unstructured Data, sponsored today by IBM. Just a couple of points to get us started. Due to the large number of people attending these sessions, you will be muted during the webinar. For questions, we'll be collecting them via the Q&A section in the bottom right-hand corner of your screen. Or, if you'd like to tweet, we encourage you to share or ask your questions via Twitter using the hashtag DATAVERSITY. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and any additional information requested throughout the webinar. Joining us today is John Park. John is the lead product manager at IBM. He has over 10 years of experience in relational databases and analytical solutions. He delivers high-performing, value-driven products ranging from software to appliances to the cloud. With that, I will give the floor to John to start the presentation. Hello, and welcome. Thanks very much, Shannon, and thank you very much to all the attendees. So again, as Shannon said, my name is John Park, and I am a product manager at IBM. I've been doing that for the past six or so years. I started my career doing engineering and development work in the realm of databases, and my career has really spanned a lot of different offerings, but all within the space of data and analytics: from reporting to predictive to prescriptive, from transactional data to warehousing, and now dabbling in the areas of graph and other cloud data services. Currently, I'm the lead product manager for IBM's data warehouse as a service, something called dashDB. We just released it, and I'm going to talk a little bit about it today because it pertains to my topic here.
However, today I'd like to talk to you about the state of databases. I want to go through some ideas and thoughts on the market: what's driving market shifts, and the solutions emerging to solve these market needs. And as Shannon indicated, after the talk we're going to have a Q&A session, and I believe there are methods for how you can post your questions. I'd like to turn that into a good discussion around your thoughts on what I'm talking about here, and any solutions or questions that may come up. I just want to warn you that I've had probably the most amazing cup of coffee today, and I've had two cups of it, so I may talk really fast and I may go off on some wonky tangents, but I ask for your patience with me if I do. I'm just showing here a slide which is essentially a standard disclaimer slide that we typically pull together for discussions like this. I'm going to quickly showcase that and move forward to the discussion. So I've really broken this up into three areas. One is around the state of databases today, what's going on in the world, more of a philosophical realm of discussion. Then I'll talk about who's really driving it, what the market is looking like and what the market is requesting. And then I'm going to get into solutions, and some of the things we're seeing brought to market to solve the problems that I'll be discussing in the first two sections. But before I get into that trifecta of an agenda, I want to talk to you a little bit about Moore's Law. I think we're all pretty much aware of and accept Moore's Law, and if you don't know it, Moore's Law essentially states that roughly every two years, the number of transistors that can fit on an integrated circuit doubles. It also implies that the cost of that technology falls in step, as those efficiencies compound.
Needless to say, I think Gordon Moore really nailed it. And I believe, and probably so do you, that all facets of technology benefit from Moore's Law. It's not necessarily relegated to just electronics or hardware; these scales of efficiency apply to technology across the board, be it software, development operations, or the hardware that software leverages. I believe Moore's Law really describes and showcases how and why technology has accelerated so rapidly over the last 10 years. Now, the realm of databases. Let's level-set, okay? Are databases on the decline? Are databases, or data structures, or anything really to do with data, decreasing in popularity or demand? We all know the answer to that: absolutely not. Case in point number one: Larry Ellison is still rich, one of the richest people in the world, and his technology continues to drive a larger market. Case in point number two: big data is still big. I'm not sure what the latest statistic is these days, but data is big and it's only getting bigger, okay? Case in point number three. Now I'm going to use an analogy here, okay? To me, data really is dirt, and what you do with that data is the key asset. Just like in Canada we have the Alberta oil sands, which are really just dirt: only when you steam-clean that dirt or sand, filter and extract it, then squeeze and transform it into something, do you obtain oil. That oil is really gold. Well, maybe not as expensive as gold, but you get the picture. A handful of years ago, if I were doing this type of presentation, I'd be saying something along the lines of: Google is creating this much data, IBM is transacting with this much data, Facebook and Twitter are creating hundreds of terabytes of data a day. Today, folks like Google, IBM, and Facebook are not just creating data, but creating insights and tools to really refine that data.
So for instance, Google Analytics allows common folk to gain insight into trends based on search entries. IBM and Twitter have released a service to help you understand Twitter sentiment about your world. Facebook puts patterns of trends in front of application developers to leverage. So it's not just about data anymore; it's about refined data, making sense of that data, providing insight and value around that data. So today, yes, there is a lot of data, but the market is requesting that technology make data easier to consume and more available through innovation and commoditization. Innovation driving down total cost of ownership is changing the database management system landscape. You're going to hear me use these two words a lot: commoditization and innovation. Commoditization is really about making something easier and simpler for people to consume, getting to the point where it becomes a facet of our life; and innovation is what's changing all the time, making that ease come to life. So I'm going to use another analogy. Do you remember the BlackBerry? We all know who BlackBerry is, but do you remember that years ago BlackBerry essentially ruled the world? It was a status symbol, and it represented your importance in the food chain of an enterprise. Essentially, if you were given a BlackBerry, it meant that you needed to be connected, because there was some level of expertise that you had which required you to be connected. Then, as Moore's Law kicked in, and innovation and competition ramped up, what did we see? Well, the BlackBerry is not necessarily for the elite anymore. The idea of bring-your-own-device became relevant within the enterprise, and the mobile phone essentially became a commodity. Innovation, commoditization. The same thing can be said about database technology. Managing an RDBMS, managing a database system, used to require a high level of skill. In fact, it still does.
There is an emphasis on certification and knowing how to turn the right knobs in the database. That's really the core part of that vocation. Then appliances came along, or an abstraction of the technology came along: innovation that decreased the amount of tuning and administration. It made the technology simpler to use, a step more accessible. Usability became easier, and although costs were still high, usability kept increasing. Now, with new players, we're seeing the technology push complexity down even further, making it even easier to access. Players like Amazon and Parse, and even the incumbent players like Microsoft, Oracle, and IBM, are coming to market with data stores which are very easy to use: commoditization through innovation. And this commoditization comes at decreasing cost. Now you can jump onto the web, find a database, and leverage a free service right out of the box. Redshift gives you a free trial period. If you jump into Parse, you can do a trial. Microsoft does a very similar thing, where they offer a free level of service so that you can go and try your wares, POC something, and essentially get the job done and work with your data. So highly complex systems are being replaced with this idea of simple, ready-to-use technology. And the number of players, the competitive marketplace in this database space, is driving a new level of commoditization and democratization of database adoption. Standards as well are causing a relative shift in the way we view the technology that we use. For instance, we are seeing the XML standard being replaced by JSON. Where the idea of a well-formed schema went hand in hand with the XML format, that's now being replaced by something where a variable schema definition can be used. Where we used to strive for absolute consistency of the data, now we're leaning towards a world where eventual consistency is the standard.
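To make that variable-schema idea concrete, here's a small sketch in Python (the field names are invented for illustration): two documents in the same collection don't have to share the same field list, which a rigid, well-formed XML schema would force you to version and revise.

```python
import json

# Two documents in the same collection, with different fields: a
# document store accepts both as-is, whereas a fixed XML schema would
# have to be revised to allow the new "devices" field.
docs = [
    {"user": "alice", "city": "Toronto"},
    {"user": "bob", "city": "Austin", "devices": ["phone", "tablet"]},
]

# Both serialize and store without any schema change.
for d in docs:
    print(json.dumps(d))
```

The second document simply carries an extra field; nothing about the first document, or the store, had to change to accommodate it.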
As well, we're seeing that we don't, in fact, need to store and read our data in only one format, such as a row store. Now we have options, because those options are more efficient, they're more innovative, they do certain jobs a lot better. So, for instance, in the realm of analytics, we would be looking at something like a column store. There are ideas around how to use hierarchies to support document stores, as well as graph databases. The way that we store data and the way that we view it has now been augmented to create solutions for the problems we see today. The net of it is: change is the only constant in technology, and this applies to the database management world, where technology is driving new levels of simplicity, ease of use, and, simply, commoditization of data and databases. So what database innovations are driving this commoditization and new level of adoption? Well, I believe there are six key technologies at play here. The first is database as a service, so essentially cloud-based databases. Then in-memory, so leveraging hardware, leveraging the fact that Moore's Law is coming into effect here and making RAM cheaper, more accessible, and faster. Then document stores, Hadoop, graph technologies, and time series types of data. For the rest of my discussion, I'd like to focus in on three of these. I'm going to go through what each one is, who some of the key players are in each of these technologies, some of the use cases, and delve a bit into what technology they're complementing, augmenting, or replacing. So first: what is it? You may hear the terms DBaaS and DWaaS. I'm going to mash them together and just call it all database as a service. It could be based upon a transactional workload, it could be an analytical workload, or it could be there to support things like JSON or XML documents.
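As a toy illustration of the row-store versus column-store point above (this is not any particular product, just the layout idea in plain Python): an analytic query like a SUM over one field only needs to touch that one column in a columnar layout, instead of walking every field of every record.

```python
# Toy illustration: the same table stored row-wise and column-wise.
rows = [
    {"trip_id": 1, "fare": 12.5, "tip": 2.0},
    {"trip_id": 2, "fare": 8.0,  "tip": 1.5},
    {"trip_id": 3, "fare": 21.0, "tip": 4.0},
]

# Row store: each record's fields sit together.
row_store = rows

# Column store: each column's values sit together.
col_store = {k: [r[k] for r in rows] for k in rows[0]}

# SUM(fare) on the column store scans one contiguous list...
total_fare_columnar = sum(col_store["fare"])

# ...while the row store must visit every record to pick out "fare".
total_fare_rowwise = sum(r["fare"] for r in row_store)

assert total_fare_columnar == total_fare_rowwise == 41.5
```

That locality is the intuition behind why analytic databases lean on columnar storage: the query touches far less data, and what it touches sits contiguously.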
There are a number of different types of services around the database, but typically they share the same qualifiers. It's cloud hosted and cloud provisioned. It could be running on Amazon's cloud, or in Microsoft's Azure cloud, or in IBM's own cloud. There is some facet of scalability associated with it: leveraging the components of cloud to expand and contract to the needs of that database, pertaining to the workload and the market demand. There's also the idea of a tenancy model. You can choose a multi-tenant, shared environment, which is essentially, you know, maybe there's a big old cluster in somebody's infrastructure that's running your database. Let's say it's a Redshift instance. Within that instance, tenants or users are given a block of storage and a block of compute associated with that larger instance, so you have neighbors within that single instance. Or you can choose a dedicated model: essentially your own box that you can use and run. As well, with database as a service there's an aspect of included services, insofar as administration, monitoring, and security aspects are also included in the service you're purchasing. So again, this comes back to that whole abstraction, or decreasing complexity, making it easier for you to consume: the commoditization point. The idea of as-a-service is: okay, some of those things are going to be abstracted from you so that you can focus on working with your data, as opposed to administering the database. In terms of monitoring, maybe there's a level of abstraction that occurs within that database that allows you to see only the key aspects of what's going on within your system. Maybe it's the number of connections, or queries that have gone rogue, or something along those lines, as well as some key facets around security. Like, is your data encrypted?
Is your data going to be encrypted as it moves from one place to the other? Are you going to be able to pass your compliance audit? And are there features within the product which allow you to audit your own data? Maybe it's identifying sensitive data or masking sensitive data. That's something that can be included within the service. So who's playing in this space? Right now, we're seeing that the major database vendors are all in this area. Teradata, Microsoft, SAP: they all have a cloud service as well. But one of the things I've seen a shift in is a new era of dot-coms. With the adoption of open source and the ability to leverage open source technologies, we're seeing new competitors come in where they can essentially deploy a database based upon that open source software and create a database as a service, born on the web. Players like 1010data or GoodData are dot-coms that have come into this space. Cloudant, for instance, is a database that came up in the era of the web, providing the service not from on-premises software but truly hosted, born in a hosted fashion. Now, with database as a service, I see two markets it can be leveraged by. One of them is really those who have current businesses running databases on-premises and are looking to extend into the cloud, treating these DBaaS or DWaaS offerings as complementary data stores. The second market is really these born-on-the-web applications, be it mobile applications or web-based applications, that are looking to leverage a low-cost solution to store and transact their data. Okay? So take the first case, where we have current enterprises. Maybe you have an instance of DB2 or Oracle on-premises, and you want to move away from a capital expenditure model into an operational expenditure model.
So you want to move away from having to dish out, you know, some number of thousands or hundreds of thousands or millions of dollars in licenses and S&S, and you want to move to a model where you're only paying, you know, monthly fees, or maybe a fixed-term fee with one- and three-year options. Amazon has a great deal around that, where you can get into a one- or three-year commitment with Amazon and really see some cost savings. And so the argument for you to take to your management, to the people holding the purse strings, is to say: hey, I want to move away from a fixed-cost model, and instead of me coming to you and asking for $2 million, I'm just going to ask for, let's say, $10,000 a month to run a database in the cloud. So: opex versus capex. As well, there's this whole idea around simplicity. If you have the opportunity to play with one of these products that are born in the cloud, like Redshift, like dashDB, you'll find that it's very simple to deploy. Why? Because all of the complexities are taken care of. The infrastructure underneath it, what server you're running on, what storage you're going to attach to: those types of questions go away. And you don't need to spend time tuning your database software to whatever hardware has been provisioned, maybe it's Intel, maybe it's HP. That complexity goes away in the database-as-a-service model. As well, there's a huge facet around speed of adoption of the technology. Because you don't have to worry about the infrastructure layer, and you don't have to worry about installing the database software, you can really just jump into one of these services and let it rip. Kick the service off. You know, maybe you want to run a transactional environment leveraging JSON data, and you want to spin up, say, a CouchDB instance.
You can essentially jump into one of these services and indicate how many nodes you believe you need. Maybe you need 32 cores of server processing power over four nodes, because you want to be able to support a worldwide geography. I'm just making these numbers up. But you can request four nodes of, you know, CouchDB, and you can simply deploy them, have them up and running within hours, maybe a couple of days, and just start playing with your data. Again: commoditization and innovation driving change in the database world. The second key technology: in-memory. In-memory technology has really allowed databases to directly benefit from Moore's Law, insofar as RAM has become available in very large sizes and is steadily decreasing in cost. So in-memory has allowed us to use RAM as a data store instead of just a swap space. In-memory technologies are now being leveraged to store either all of the data or a partial set of the data. We're also seeing in-memory technologies being used to support not only highly transactional types of processing, but warehousing types of technology as well. And I think in the very near future we're going to see this idea, you may have heard of it as HTAP, or Hybrid Transactional/Analytical Processing, where from a single system you would be able to run both a transactional workload and an analytical workload on top of the same data, to get essentially real-time analysis of that data. So the benefit of having in-memory technologies really comes down to the question of what I/O is the best I/O. The best I/O is no I/O at all, and that's what in-memory provides you. Some of the key players in this market are, of course, SAP, with their HANA offering. Oracle has an in-memory database option as well.
IBM has their BLU Acceleration technology, and we have seen and know of other database vendors, such as solidDB, VoltDB, and ParStream, which have been using in-memory to supply essentially in-memory database technologies. I'm going to move forward to the next slide here. Document stores. This is really referring to JSON or XML types of data. Excuse me here while I just take a drink. Typically, these documents are hierarchical in nature, and they're used mostly in web and mobile applications. What we're seeing is actually a movement where the XML type of data structure is being replaced with JSON. Why? Because of this idea of eventual consistency, where "just good enough" is becoming the standard for application developers. So we're no longer having to keep within the confines of older ideas and standards around database technology. In the realm of document stores, I think all the key database vendors have some iteration of this type of technology, including, sorry, I'm going to get distracted here, but including IBM, whose DB2 product has a MongoDB-compatible JSON facet associated with it; Teradata; and of course there's Cloudant. And you're going to see more and more of these database vendors adopting JSON and providing APIs or libraries to store these JSON documents within their databases, because it has become the standard for mobile and web applications. We're seeing them a lot within the gaming industry and Web 2.0 types of apps, really because those apps are able to leverage the scalability of the technology and the databases associated with storing these documents. Now, I'm seeing in the comments here a note that it's interesting that I left out MarkLogic. I didn't do that on purpose. I think MarkLogic, with what they've done with respect to their movement from XML over into JSON and creating their own proprietary JSON store, is definitely a huge player within the market.
I know Gartner has put them into their Magic Quadrant from an operational data store perspective, and they have a very key product in market today around JSON. They've been very good at transitioning from the XML technology over into JSON and really hitting the enterprises in that space. So I think that's a really key piece. Now, what are the effects of these technologies on the world of databases? Well, they're starting to drive answers to questions. Questions that you may be faced with in your day-to-day work, like: how do I control the sprawl of desktop database management systems? How do I collate and aggregate large data sets? How do I take control of the license and capital costs that are affecting my work and scrambling my brain as I try to organize and manage my data infrastructure? These core technologies are causing us to shift the way we architect our solutions in order to answer these types of questions. So maybe it's taking your desktop MySQL servers and saying: okay, enough is enough. We have way too much sprawl. We have way too much shadow IT going on in this enterprise. If the business wants a database that's very easy to access, easy to provision, and easy to get up and running, maybe it's best that we allow them to deploy a database in the cloud, and we can create a tenancy model so that they can access and share the data as need be. Or maybe it's: hey, I have such a huge worldwide enterprise that I need to start consolidating data from Europe and Japan and North America into one significant place. How am I going to do that? Maybe cloud is a solution to that, right? Maybe I can say: okay, I need to go to one single place. Let's turn this cloud thing into an aggregator of all the information that's pertinent to the business. And when we talk about licensing and capital costs, I alluded before to the capex and opex models.
There are benefits to on-premises and perpetual licensing, but maybe technology funding is decreasing within your enterprise, and you want to get to a model where you're still able to leverage newer technologies. The opex model becomes a very favorable piece of the business model. I spoke about the XML-to-JSON movement, really around MarkLogic as well as the other vendors in the world that are seeing this movement from XML to JSON and supporting the JSON format. But one thing that I haven't really touched upon is this idea of open source databases and how they're infiltrating the enterprise, or the business, today. What I'm seeing in the market is that open source is being adopted. We know that CouchDB, MongoDB, MySQL, they're all very important. You know, Hadoop, HDFS: very important to the business, and they want to be able to leverage that technology. Businesses aren't necessarily saying, hey, I want to contribute back to the open source community; but they are willing to spend money on a company to service and support this technology, because they believe it's a mature technology that they can use. And so we're seeing these open source technologies really come to the forefront and be consumed by businesses who may have traditionally just said: okay, I'm going to stick with an Oracle DB and create a whole Oracle infrastructure. Or, to use a concrete example: I was on Oracle, and now I'm moving to an entire Cloudera type of environment. Now, I've spoken about some of the technologies we're seeing really drive the change in databases. I want to delve into who's making these requests, and some of you may be exposed to who's making those requests. You know, maybe it's a CIO that came in and mandated something be done differently, or wants to control costs. But I wanted to put in front of you this idea of a group known as Generation D.
So, a piece of work done by IBM was really around the development of this idea of Generation D, where we surveyed over a thousand enterprises across five different industries to understand how their approach to key technologies such as cloud, data, and analytics was driving their business. 70% of the respondents came back to us and said: hey, we know we have a lot of data. Their biggest challenge was really, what on earth do I do with it? I'm creating all this data, I need to get that insight, and you need to make it easy for me to get to. And so what we found was that there are essentially four categories of data users, with Generation D being the enterprises that are using data to drive business decisions. There are four levels of these types of users, where Generation D is the most data-rich and analytically driven. So, think of analytics as having three types: there's descriptive analytics, like reporting; there's predictive analytics, such as forecasting; and then there's prescriptive analytics, which is essentially, okay, I'm going to tell you what's going on and provide you with some type of outcome, or guidance towards whatever you may be looking at. If we were to take these data users and segment them into four different categories, you would end up with, first, your traditional user base, which is essentially using a limited amount of, say, structured data and really leaning on descriptive types of analytics. The second group is the analytically ambitious, who put a higher priority on data and analytics; maybe they're delving into the idea of predictive analytics. Then there are the analytically enabled, or data-rich, who have adopted this idea of: okay, I need data. I need data to help make decisions. I get and believe in the descriptive and predictive components, and I want to drive my business even further.
So: give me more forms of data to essentially confirm my beliefs and provide me with the direction I need to go in. And then there's Generation D, which is that data-rich and truly analytically driven type of organization, applying structured and unstructured data to drive decisions, to drive that prescriptive or predictive type of analytics. This group really excels at insights with respect to the market, or whatever their business is. And they're more likely to push for modernized technology that makes it simpler to consume more data. So this Generation D is the one asking for the new innovations, the new commoditization, the new modernization of what you do with data: essentially putting the power into the hands of the many. They're the ones who are going to say, I think the organization needs to take this data and analytics culture to the next level, and I think the power should be put into the hands of everyone. During this study, what we also found was that Generation D enterprises typically have better customer retention, a larger percentage of revenue through digital channels, better employee retention, and a larger market share specific to their industry. And some of the statistics here, which I'm not going to go through in full, essentially say: hey, you know what, by being able to access all the various types of data, from the structured to the unstructured, you're going to be able to identify new opportunities, and you're going to be able to leverage those opportunities to create channels that drive better businesses, better business models, better business planning. As I indicated before, Generation D will also champion the latest sources of data. They're the ones who are going to drive that modernization story and request even more sophistication and commoditization of the data.
So they're going to be thinking: okay, I know there's this streaming data coming from Twitter, and I know people are hashtagging me, and I want to be able to access that streaming data and analyze it in real time. IT, or someone, give me a service, give me the technology that's going to allow me to take that dirt of Twitter data and turn it into the golden nuggets that I can make a decision with or create an opportunity out of. Generation D are the ones asking for these new types of data sources and the new types of technologies to leverage this new type of data. To the point of the last chart, Generation D enterprises are the ones really pushing for the latest technologies. What we found was that they're more, I guess, what's the word I'm looking for, willing to move to a new technology like cloud, although cloud isn't necessarily that new. But there are always preconceived notions around cloud with respect to security that can be seen as barriers to entry for some other types of data consumers. Generation D enterprises typically view cloud as an opportunity to drive faster insight, to store more data, to leverage more technology, and to get to whatever they're looking for. As well, within the environment of cloud and within the environment of new application development, there's a higher demand for things like API calls, as opposed to your traditional development of scripts or using, I guess, off-the-shelf types of technologies like BI tools. These folks are using APIs, or UDFs, or scripting languages like R to gain their insight, and they want to get to that data not through, maybe, a SQL call, but through an API call, so they can access that data very quickly. That's Generation D's call.
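As a rough sketch of that API-call style of access (the endpoint, database name, and parameters here are hypothetical, not any real IBM or vendor service), here's how an application might build an HTTP query for data instead of opening a SQL connection:

```python
from urllib.parse import urlencode

# Hypothetical REST endpoint, for illustration only.
BASE_URL = "https://example-dbaas.cloud/api/v1/nyc_taxi/_find"

def build_query_url(filters, limit=100):
    """Build the URL an app would GET over HTTP, playing the role of
    SELECT * FROM nyc_taxi WHERE ... LIMIT 100 over a SQL connection."""
    params = dict(filters, limit=limit)
    return f"{BASE_URL}?{urlencode(sorted(params.items()))}"

print(build_query_url({"borough": "Manhattan"}))
```

The appeal for the Generation D crowd is that there's no driver to install and no connection string to learn: the data is just a well-known URL away, and the response comes back as JSON their application already speaks.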
So, what we've seen in our study with these 1,000-plus enterprises were use cases, and what I'm showing you here are just some of them, around how they're leveraging newer technologies. Maybe it's leveraging cloud to deploy a database and give access to that data to all of their employees within the enterprise. Maybe it's using streaming data, or using mobile applications and JSON to access different data points across a larger geography. Or maybe it's just creating APIs that an enterprise can use to access data. Essentially, what we're seeing is that Generation D is really causing a shift in the way we provide data and insight to the end consumer. It's not just in the hands of a few, the elite few; it's actually a democratization, where we're seeing data dished out to the entire enterprise, to all the users. And this is going to become even more prolific and grow as we move forward into the next number of years of technology. Now I'm going to move into the third portion of the discussion, around solutions. One of the solutions I want to talk about is provided by IBM, and it goes to this whole idea of unstructured-to-structured data analysis. I'm going to quickly go through a very fast demo of two technologies, leveraging the key technologies I spoke about earlier, that allow you to gain insight and answer the call of Generation D. So, IBM has a database as a service and a data warehouse as a service. The database as a service we have in our portfolio is called Cloudant. Cloudant is based upon the open source CouchDB and is really a highly transactional, highly available document store for the JSON format.
We also have something called a Data Warehouse as a Service, which is my product, dashDB, where you're able to store data in memory in a column format to drive some type of analytics, be it descriptive, predictive, or prescriptive. So let's say I'm a taxicab owner in New York City and I have a mobile application, similar to, let's call it, Uber. And I have all this data coming from my mobile applications that I'm storing in a JSON database. Maybe it's Cloudant, maybe it's MongoDB, maybe it's some other JSON store, maybe it's MarkLogic. In my instance of the database, specifically in Cloudant, I have this data sitting and residing here, and I want to do some analytics against it. Natively, JSON is a pretty messy format. Its schema can change, you can't really run SQL against it, and it's very difficult to analyze. So if I wanted to get any real insight out of this JSON data, I'm going to have to turn it into something that makes sense. Typically something that I could run SQL against, or at the very least be able to dump into, let's say, a CSV file and run in Microsoft Excel, or maybe dump into a MySQL server and run SQL against it from there. There are various ways that you could probably do this. But what we've done is we essentially said, okay, we know there's JSON data within a Cloudant database, and let's say me being the taxicab owner, I'm like, I want to learn about my data. I want to take some stabs at this data. So within Cloudant, what we have is this idea of warehousing, where we can deploy a dashDB instance, essentially a relational database, to be able to leverage SQL. We take this dirty data, this highly variable-form data, and put it into a format that we can actively query and look at.
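To make the "turn JSON into something I can run SQL against" step concrete, here is a small Python sketch that flattens nested, variable-schema JSON documents into flat rows and writes them out as CSV. The trip documents and field names are invented; this illustrates the general idea, not what Cloudant or dashDB actually does internally:

```python
import csv
import io

# Hypothetical JSON trip documents with a nested, variable schema.
docs = [
    {"trip_id": 1, "fare": {"amount": 12.5, "currency": "USD"}, "pickup": "Midtown"},
    {"trip_id": 2, "fare": {"amount": 45.0}, "pickup": "JFK", "tip": 8.0},
]

def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted column names, e.g. fare.amount."""
    row = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, name + "."))
        else:
            row[name] = value
    return row

rows = [flatten(d) for d in docs]
# The union of keys across all documents becomes the CSV header;
# documents missing a column simply get an empty cell.
header = sorted({k for r in rows for k in r})
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=header)
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

The resulting CSV could be opened in Excel or bulk-loaded into a relational table, which is exactly the workaround he describes doing by hand.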
So through a few clicks I can come in here and say, okay, I want to isolate my New York City taxicab data and turn it into something that I can query against, and I'm going to use this new warehouse feature in Cloudant. I go into Cloudant and essentially say, okay, the database that I want to use is New York City Taxi, I choose that database, and I say, okay, I want to create my warehouse associated to this data. Through a few clicks, here's what's happening underneath the covers: again, Cloudant is a database as a service, so it's in the cloud, specific to JSON; dashDB, the warehouse that we're creating, is a warehouse in the cloud. So we're leveraging cloud technologies, we're using JSON, and we're using in-memory technologies to take that New York City cab data, look at the schema as it sits within the hierarchy of the JSON format, shred it up, and push it over into the relational database, dashDB. Some magic happens underneath the covers where the schema is determined, and essentially we create that data warehouse service, that dashDB service. And that's available to you through the cloud and the UI. If I click that dashDB link, it takes me over to my next window. So if I were a business analyst and I saw this, it's very simple to do. Again, I want to be able to access all my data. I don't want to wait on IT. I don't want to learn C or C++. I don't want to learn about connection strings. I just want to get to my data and analyze it. This allows you to do it. That Gen D guy, he can go and dive into his data now, because he's taken his JSON data and moved it into this relational DB. And from here I can start doing cool things with it. I can go to my tables, I can go in, and I can actually start running things.
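The "magic underneath the covers" he mentions, determining a best-fit relational schema from JSON documents, can be sketched roughly like this. The typing rules, table name, and sample rows below are all assumptions for illustration, not the product's actual discovery algorithm:

```python
# Infer a best-fit relational type per column from flattened JSON rows.
rows = [
    {"trip_id": 1, "fare.amount": 12.5, "pickup": "Midtown"},
    {"trip_id": 2, "fare.amount": 45.0, "pickup": "JFK", "tip": 8.0},
]

WIDENING = ["INTEGER", "DOUBLE", "VARCHAR(255)"]  # narrow -> wide

def sql_type(value):
    """Pick a candidate SQL type for one JSON value."""
    if isinstance(value, bool) or isinstance(value, str):
        return "VARCHAR(255)"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "DOUBLE"
    return "VARCHAR(255)"

def infer_ddl(rows, table):
    """Build CREATE TABLE DDL from the union of columns across all rows."""
    types = {}
    for row in rows:
        for col, val in row.items():
            cand = sql_type(val)
            prev = types.get(col, cand)
            # Keep the wider of the candidates seen so far for this column.
            types[col] = max(prev, cand, key=WIDENING.index)
    cols = [
        col.replace(".", "_") + " " + types[col]
        # Columns missing from some documents become nullable.
        + (" NOT NULL" if all(col in r for r in rows) else "")
        for col in sorted(types)
    ]
    return f"CREATE TABLE {table} ({', '.join(cols)})"

ddl = infer_ddl(rows, "nyc_taxi")
print(ddl)
```

The point is just that a usable relational schema can be derived mechanically from the hierarchy and value types of the JSON, which is what makes the one-click warehouse experience possible.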
So if I wanted to create new tables, or run a SQL query, or connect a tool, maybe it's MicroStrategy, maybe it's a proprietary tool, maybe you have some in-house tool that you want to run against it, maybe it's an Aginity metadata-layer tool, you can connect it to dashDB very simply through ODBC or JDBC-type connections. But let's say you didn't have that, right? You didn't have a tool, or maybe you wanted a one-stop shop and didn't want to connect Cognos to this thing. You can come in and work directly in our environment within dashDB. And I'm just going to skip through some slides because they're not really in the nicest order. Within dashDB you're also able to go in, work with your tables, and do database-type things. But again, what you're not seeing here are configuration or administration-type things, right? It's very much end-user-focused, insofar as you want your data in there and that's all you really care about. What you want to do now is analyze that data. You don't want to have to massage it. You don't want to have to tune it. You don't want to have to make sure it's going to perform optimally. As a service, dashDB allows you to do that. And this would be similar to some other databases as a service, including what's offered by SAP and what's offered in the Azure space, where it's very end-user-focused. Maybe theirs is API-driven; dashDB is very much UI-driven, as well as leveraging whatever tool you may use. And we give you very basic administration-type things like monitoring. So let's say I wanted to go in and just play with my data and look at it. As you can see here, in my instance of dashDB, I have my New York City taxicab data. That was essentially all the JSON data, and it's been dumped into dashDB.
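As a rough, runnable stand-in for the kind of SQL you could run once the taxi data lands in the warehouse, here is a sketch using Python's built-in SQLite in place of dashDB. The table, columns, and trip values are invented:

```python
import sqlite3

# SQLite used as a local stand-in for the cloud warehouse; the table
# and column names are illustrative, not the actual generated schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE nyc_taxi (
    pickup_zone TEXT, dropoff_zone TEXT,
    pickup_hour INTEGER, trip_minutes REAL)""")
conn.executemany(
    "INSERT INTO nyc_taxi VALUES (?, ?, ?, ?)",
    [("Midtown", "JFK", 8, 62.0),
     ("Midtown", "JFK", 8, 58.0),
     ("Midtown", "JFK", 14, 41.0),
     ("Midtown", "JFK", 22, 35.0),
     ("Chelsea", "LGA", 9, 30.0)],
)

# A simple COUNT(*) over the shredded data...
(total,) = conn.execute("SELECT COUNT(*) FROM nyc_taxi").fetchone()
print(total)  # 5

# ...and the "best hour to leave Midtown for JFK" question as an aggregate.
best_hour, avg_min = conn.execute("""
    SELECT pickup_hour, AVG(trip_minutes)
    FROM nyc_taxi
    WHERE pickup_zone = 'Midtown' AND dropoff_zone = 'JFK'
    GROUP BY pickup_hour
    ORDER BY AVG(trip_minutes)
    LIMIT 1""").fetchone()
print(best_hour)  # 22
```

The same statements would work over an ODBC/JDBC connection to a real warehouse; SQLite just keeps the sketch self-contained.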
And I can go in here and say, okay, it's chosen the best-fit schema based upon whatever the schema was over in my Cloudant instance, and now I can start doing some cool things against it. Maybe those cool things are SQL. I can go in and start running some SQL queries against it. Maybe I just want to do a COUNT(*). Maybe it's a very simple join. Maybe I just want to get a result set that I can dump into Excel and run a chart against. Who knows? Or maybe I want to do something cool. Maybe I'm a data scientist and I know what R is. R is becoming essentially the de facto standard language for predictive types of analytics. Maybe I want to do something with that. I can jump in here and kick off an instance of RStudio and write scripts, query scripts, sandbox scripts, leveraging the open source R language. I can connect to the database very simply, this is all provided, again, within the service, and through RStudio be able to gain insights around that New York City taxicab data. So maybe I want to understand, when is the best time for my cab to go from midtown to JFK? I can analyze this and say, okay, how do I strategically locate or position my cabs to ensure that I have the most penetration within that geography, to get the executives on Wall Street over to the airport as quickly as possible? And I can do those kinds of things. Maybe I want to correlate that with weather data. And I can do that through R by essentially saying, hey, here's a data set.
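The weather cross-reference he goes on to describe, joining cab activity against a publicly available weather feed, can reduce to something as simple as a correlation over matching days. Both data sets below are invented stand-ins, and the calculation is shown in plain Python rather than R:

```python
# Cross-reference daily trip volume with rainfall; both data sets are
# illustrative stand-ins for warehouse data and a public weather feed.
trips_per_day = {"2015-01-01": 410, "2015-01-02": 520,
                 "2015-01-03": 230, "2015-01-04": 600}
rainfall_mm   = {"2015-01-01": 4.0, "2015-01-02": 1.0,
                 "2015-01-03": 12.0, "2015-01-04": 0.0}

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

days = sorted(trips_per_day)
r = pearson([trips_per_day[d] for d in days],
            [rainfall_mm[d] for d in days])
print(round(r, 2))  # -0.98: in this toy data, rainy days mean fewer trips
```

In R the equivalent would be a one-liner with `cor()`; the point is that once both sets sit in the warehouse, the cross-referencing itself is trivial.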
I can load it into dashDB through a very simple API call, and get that data to cross-reference and create some sort of insight based upon the two variables: my New York City JSON mobile-created data, and the weather data that's publicly available to me. And I can start doing interesting things within R to get at that insight. There are a lot of new technologies coming into the database world, and we're seeing vendors adopting these new technologies and making them very easy to access and very consumable to leverage. I think there is going to be an accelerated shift towards ease of use, ease of procurement, and ease of payment, driven by the demands of a new type of user, that user being Generation D. And Generation D could be the business analyst, the data scientist, or it could be the mobile application developer or the web application developer, who are really pushing out new applications very quickly and bringing them to market to be leveraged very quickly by either the enterprise or the general public. So in order for us to actually do something and be prepared for them, we have to be cognizant of these technologies. Hopefully I've shown you a view into what we believe are the latest and greatest technologies coming down the line to help support these new consumers and users. And I've opened it up to some questions here. I'm going to be at Enterprise Data World in a week, so drop by booth 306 to talk. And thank you very much. Over to you, Shannon. I think we have time for Q&A. Most definitely. So one of the most popular questions that we get from everybody, while you're submitting your questions in the Q&A section in the bottom right-hand corner, is, of course, for a copy of the slides and a copy of the recordings.
And just a reminder, I will send out a follow-up email within two business days, so for this webinar by end of day Thursday, with links to the slides, links to the recording, and anything else that is requested throughout the presentation. There was one comment here, that throughout the presentation you said "build more, grow more, sleep less." That was in relation to one of your slides there. First question for you: does Cloudant provide an ODBC or JDBC interface, not APIs, so on-premise tools and scripts can connect to the cloud? I was talking on mute there. My bad. So the question is around ODBC and JDBC connections from on-premise tools into the cloud. ODBC and JDBC connections are supported by dashDB, and that's probably one of the best ways to connect into the database. With respect to Cloudant, there are secure methods to connect into Cloudant. I'm actually not 100% sure if they are ODBC- or JDBC-specific, but there are multiple API calls and security layers associated with Cloudant that allow you to leverage that database. I love all the questions coming in. Do we upload data into data as a service or database as a service using web services? How do we upload data? So there are multiple ways that you can upload data. In the realm of Cloudant, that data is actually created within the cloud environment, because it's used in a transactional arena. So the data is being created on the fly, and that data store is leveraged there. Now, in the realm of a data warehouse as a service, or, let's say, a database as a service that is going to offload data from an on-premises environment into the cloud, there are multiple ways that you can do it. Some of the ways are, okay, maybe I open up an SSL connection and push data from my on-premises environment up into the cloud, or into the database service. That can be very slow, because you're dealing with a single small pipe.
Another way is to push your data into the cloud service provider's storage. So maybe you want to leverage the Amazon storage space, put your data there, and then migrate it over into your database service or software. We've seen clients do that, using that route as an initial load and then doing a trickle feed or continuous load thereafter. Or you can use ETL tools to make that happen. Maybe it's MicroStrategy or Data Studio. There are also other companies coming out with tools that will allow you to really accelerate the loading of data into a data repository. For instance, companies like Aspera, which are built to create accelerated pipes for ground-to-cloud data movement, typically offer a better service when it comes to moving data than what you would find off the shelf or by just using an HTTP or SSL type of connection. The net of it is, there are many ways to do things. Absolutely. Many roads to one location. So, John, data governance is such an important topic to anyone who works with data, and it proves true here. The question is, how does this change data governance as it exists today? So that's a really good question. I think it causes us to focus a lot more on the governance of data, okay? And what I mean by that is, if you have an on-premises system, or if you have, let's say, a Cloudant Local machine, or maybe you have a data warehouse appliance that's on premises, I think there's less risk around the governance of that data, because it's within your own walls. You can control it. Maybe you hook it up to some type of Vormetric server to run the governance around it. However, when you move into a newer technology like Cloud, you really have to be cognizant of what that service provider is going to give you. And what I mean by that is, you're giving up some level of control.
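The "initial load, then trickle feed" pattern mentioned above can be sketched as a simple watermark loop. The source rows and timestamps here are invented, and a real feed would push over a network connection rather than into a local list:

```python
# Sketch of an initial bulk load followed by a trickle feed, using
# in-memory lists as stand-ins for the source system and cloud target.
source = [
    {"id": 1, "loaded_at": "2015-03-01T10:00:00"},
    {"id": 2, "loaded_at": "2015-03-01T11:00:00"},
    {"id": 3, "loaded_at": "2015-03-01T12:30:00"},
]
target = []

def load_since(watermark):
    """Push only rows newer than the watermark; return the new watermark."""
    # ISO-8601 timestamps compare correctly as plain strings.
    fresh = [r for r in source if r["loaded_at"] > watermark]
    target.extend(fresh)
    return max((r["loaded_at"] for r in fresh), default=watermark)

# Initial bulk load: everything newer than the epoch.
wm = load_since("1970-01-01T00:00:00")
print(len(target))  # 3

# Later, a new row arrives at the source; the trickle feed picks up only it.
source.append({"id": 4, "loaded_at": "2015-03-01T13:00:00"})
wm = load_since(wm)
print(len(target))  # 4
```

The watermark is what keeps the continuous load incremental, so the slow ground-to-cloud pipe only ever carries new rows after the first bulk transfer.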
You're giving up some level of being able to say, this thing sits within my walls, so I can control it. Mr. Service Provider, what is your SLA? What are the things that you're going to give me within this service that are going to ensure that governance levels or audit levels or compliance levels are on par with what I need? And so, really, the vendor now takes on that onus. The vendor becomes the one that really has to understand their end user and integrate the technology that's going to give the end user confidence that the Cloud is going to keep their governance standards up to par with what their security officer requires. So, things like encryption, right? Having data at rest encrypted may be a priority for you. Maybe you have to hit PCI types of compliance. You should be asking your vendor, whoever it is that's going to provide you the service: do you have that? Do you have encryption at rest? How do you encrypt data in transit? What are you doing about sensitive data? Is it automatically masked? What about key management? What's your key management strategy? Are you doing anything to help support audits? These types of questions need to be asked of those service providers to ensure that you're going to get that level of governance. Because, again, the security of being able to do it on your own sort of goes away as you adopt Cloud. So, the next few questions, John, come to the same thing: for both Cloudant and for dashDB, is there a free trial or training for either or both? Yeah, so one of the things that I mentioned in my slides here, around this whole idea of commoditization, is this free aspect, right? Being able to tire-kick, essentially, the products or services that you want. So, both with Cloudant and with dashDB, there's a level of free, okay?
But I think that's becoming the status quo across the industry, where you're going to have some level of free where you can go and essentially pick and choose which service provider you want to go with. Is it Cloudant for JSON? Is it going to be something in the Azure space? Or maybe it's DynamoDB, or maybe it's dashDB? There is always a level of free, and dashDB and Cloudant definitely support that idea. Maybe I can bug you, John, to send me the links to those, and I'll get those out in the follow-up email to everybody, along with the slides and the recording. Yep. Cloudant, so cloudant.com, does an amazing job with respect to providing examples and education around getting up and running very quickly and through the API libraries. dashDB as well. As we built out dashDB, we put a very large focus on its usability. And if you go to dashdb.com, that will direct you to how to instantiate a free instance and get playing with dashDB right away. Perfect. Thank you. And the next question is: is database as a service the best option for real-time data such as Twitter feeds, or is it better suited for historical data? So it depends on how you define database as a service. If I were to think of database as a service as something like a JSON store, a Mongo or CouchDB type of instance, maybe that isn't the greatest thing for accumulating Twitter data, right? Whereas a data warehouse as a service would be, because Twitter data, which is streaming and always coming down the line, you probably want to store somewhere so that you can analyze it to determine some trending against that data. So typically, database as a service is held within the realm of transactional types of environments.
Okay, so think of ping and pong, always happening, highly available, highly transactional data movement. Data warehouse as a service is really around the idea of analytics and reporting. So in the case of Twitter, where you have Twitter data coming down the line and you want to analyze that historical data, you'd probably want a data warehouse as a service as opposed to a database as a service. Perfect, and that is right on time. John, we're at the top of the hour here. So thank you so much for this great presentation and information. And again, just a reminder to all of our attendees that I will be sending out links to the slides and links to the recording by end of day Thursday, so if you don't have that in your inbox by the time you walk in on Friday, let me know. And thanks to all of our attendees for taking the time to attend and for being so interactive in everything we do. We just love all the questions that come in during the Q&A. So I hope everyone has a great day. And again, John, thank you so much for the presentation. Thank you. Bye.