Let me start off by saying that choosing technologies is the point of writing the book. Our entire chapter 12 is dedicated to the process of how you match complex business requirements with this incredibly complicated solution space, so it's a very important process. We've tried to borrow a lot of the techniques from Carnegie Mellon and pull them in to do that architectural trade-off analysis. There's never going to be a perfect match, and trying to understand those trade-offs and getting your customers to rank their priorities is a really hard process. So I encourage people to think about it very carefully, and the thing I've learned now is that it's getting more complicated. We have new vendors with new approaches; we have vendors that have entirely rewritten their stack around solid-state drives, saying that's the only way they're going to get the performance. So things have gotten more complicated, and choosing technology is a very challenging question. I'm going to turn it over to the panel and have them introduce themselves and tell us what they've learned, what they liked, and what they think about the process of selecting the right database. Karen, you want to go next?

Next? First!

So, I always enjoy coming to this conference, because as a data architect I'm really doing "not only SQL" — SQL and NoSQL stuff — which sounds very circular. The mantra I looked for in all the presentations is that I believe every good design decision, every good architecture decision, comes down to cost, benefit, and risk. We can't just talk about what's the right solution for everybody, and I think that discussion is finally waning a bit. Thank goodness.
That was also part of the mythology — that there's a right or wrong, instead of this just being part of our toolkit. In a lot more presentations I did see this concept of "not only SQL": that there are certain use cases where these things work and certain use cases where other things work better, and that we all have to fit within our own cultures and expertise. That's the type of discussion I want to see going forward.

I definitely agree with all that. I would love to disagree just to make it exciting, but what can I say? One thing I found interesting — I don't know how many people here saw the lightning talk, but apparently we can now do teleportation, and I think that's amazing. I'm not quite sure what it has to do with NoSQL, but I definitely think it's a great new thing coming along. I also really liked that we had a couple of people speak about antifragility. It's really important when you're designing — the reason NoSQL has become so popular is that we're running up against all kinds of limits — and it's really good as developers to prepare for the fact that you're going to have failure, prepare for the fact that things are going to go down, and be ahead of the curve and plan for it. I think that's a really important aspect of where we're going.

Great. So in my role, running a company that does development, consulting, and training, we get exposed to companies who are trying to identify technologies and the best way to adopt them. In my experience, one of the key questions you need to ask is: what are the things that matter to you when it comes to data? Is it performance? Is it different data formats? Is it large amounts of data? Once you identify these questions, you can prioritize them.
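One lightweight way to act on that prioritization advice is a weighted trade-off matrix in the spirit of architecture trade-off analysis. This is only an illustrative sketch: the criteria, weights, and candidate scores below are invented for the example — in practice they come from ranking priorities with your own stakeholders.

```python
# Toy weighted trade-off matrix for database selection.
# Weights and scores are illustrative only -- they must come from
# your stakeholders ranking their own priorities.

# Stakeholder-ranked priorities (weights sum to 1.0).
weights = {"performance": 0.4, "data_formats": 0.25, "volume": 0.2, "cost": 0.15}

# Hypothetical 1-5 scores per candidate, per criterion.
candidates = {
    "key_value_store": {"performance": 5, "data_formats": 2, "volume": 5, "cost": 4},
    "document_store":  {"performance": 3, "data_formats": 5, "volume": 4, "cost": 3},
    "relational":      {"performance": 3, "data_formats": 2, "volume": 2, "cost": 5},
}

def weighted_score(scores, weights):
    """Combine per-criterion scores into one weighted total."""
    return sum(weights[c] * scores[c] for c in weights)

# Rank candidates from best to worst fit under these weights.
ranked = sorted(candidates.items(),
                key=lambda kv: weighted_score(kv[1], weights),
                reverse=True)
for name, scores in ranked:
    print(f"{name}: {weighted_score(scores, weights):.2f}")
```

The point is not the arithmetic but making the ranking explicit, so the decision doesn't turn into a months-long argument about priorities.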
It is very difficult to prioritize them, which is why you should prioritize but not agonize — you don't want to make a three-month project out of just deciding on priorities. And the key thing, when you are trying to answer these questions, is to try things out. You should never trust the benchmarks of vendors. You will find some really surprising results: you may look at one vendor which, for example, promotes excellent performance, and another which has much lower performance, but when you create a benchmark that goes through your use cases, with your data formats and your volumes, you may find that somebody who appears to be the underdog in a publicly available benchmark shows up as the winner in your specific benchmark. So always try it out yourself — and not only on one developer's laptop. Try to organize a cluster that handles data very close to what you have in production, and if needed, move to one of the clouds, set up the infrastructure there, and run a more realistic benchmark. That's one of the key things you need to do.

The fourth thing is really related to changing the mindset. When you look into traditional data organizations, you will see that the mindset has been relational for the last 15 years. If you present just the exciting technical advantages of some of the NoSQL technologies, they may not buy into it — usually, with a significant change in technology there is a significant change in power within the organization, and a traditional data office may be quite resistant to abandoning that power and relinquishing it to newcomers. This is why education is critically important. Training data departments to perceive things in a different, non-relational way is one of the key challenges, and it is a process. You need to educate
everybody, from the individual architects all the way up to the CEO of the company, about the challenges and benefits. And they should not expect miracles — there are no silver bullets, but you will end up with plenty of aluminum bullets in all the different NoSQL systems.

I think that's exactly what it's about: all those different NoSQL systems. It's about the idea that one size does not fit all, which is really the key take-home from the whole NoSQL movement. And if there's a trough of disillusionment, to go back to the question, then it's probably around the name "NoSQL" as much as anything. I think it's a really unhelpful name, and I'm probably allowed to say that, because the guy who came up with it was actually on the Acunu team for a couple of years, and every time you reminded him — "hey, look, everybody's calling it NoSQL, and that's a phrase you came up with" — he used to hold his head in his hands and say, "yeah, I didn't mean it that way." So it's not really about not having SQL. It's an aspirational title that is really more about scalability, availability, and performance — and data models as well, to your point, that aren't necessarily the right shape to fit a traditional relational pattern. It's aspirational because those things are fundamentally hard to achieve. We are dealing with data sets which have increased in volume and in variety and in velocity, but fundamentally it's about value, to go back to Karen's point: you've got to think about whether processing any particular data set justifies taking on the work and the risk — it's about the economic cost. And there doesn't need to be one overall solution that works for all the data in your enterprise. I think those days have gone, and I think the NoSQL movement as a
whole has really embraced that, and the education around that is really spreading.

I just wanted to respond to Vladimir's comment about benchmarking. One of the lessons I've learned is that benchmarks usually don't test the type of load that you're going to have on your own system, especially when you see companies that only test read and write performance. The biggest thing to remember there is caching: most real-world applications run the same queries over and over again, so those queries are going to run much, much faster the more things end up in cache. So one lesson is that you shouldn't actually be using anybody's benchmarks but your own, and you should spend a huge amount of time understanding the load you think you're going to have — although that's really hard. There are tools like JMeter that are very good at helping you set up and simulate loads, but then you have to simulate the repetition to see whether those caches are really going to be populated and working, and how the systems use them. So never trust a vendor's benchmarks — trust their knowledge. One of the things I try to see is, if somebody says they're really good with solid-state drives, make sure you can go to their website and find out how they really utilize solid-state drives, but ignore their benchmarks. As Nathan Marz said, we have to look at this as any other engineering: you've got to measure everything, you need a baseline, you need to know how things are changing, and you measure whatever you're doing.

Yeah, and maybe on the point of benchmarks as well: remember your system is going to change too. If you're building a project and you've benchmarked it really carefully and sized it really carefully, and then it's really successful, the number of users you're going to need to deal
with is going to change. Maybe the hardware you use is going to change over time — the hardware that Amazon, or whichever cloud provider you're using, installs is going to change. So it's not just about looking at all the parameters once; it's about measuring what happens if you change a particular part of the configuration. What if people start storing data which is larger than you'd anticipated? What if you start using servers which have solid-state drives, or even MLC rather than SLC solid-state drives? You're not just looking for a single number but for sensitivity to a bunch of parameters. And no performance measurement is worth anything, of course, if it's not for your workload, as you say — but also if you don't know the dollar figure next to it.

Okay. Does anybody have any questions they'd like to ask the group? I have one question I'll start with to warm the group up. We've talked about choosing technologies, and I know that right now there are a lot of people in colleges and universities taking classes on databases, and I suspect only a small percentage of classes today are even talking about NoSQL architectures. The students graduating today are still learning the dominant relational model as the primary one — maybe they're talking about analytical databases, and that's about it. Does anybody have thoughts about what we can do to get a broader set of decision-making skills into the curriculum at the college and university level?

So, I actually accredit college and university programs all over the world, and I have to tell you, they're not learning relational databases either in a lot of programs. If they are, they're getting one assignment that uses a relational database, and quite often that's Microsoft Access at the undergraduate level, or something open-source where their literal assignment is the equivalent of a hello
world problem in databases. Just to let you know — because I get to look at student work, not just course descriptions — all is not lost for NoSQL for that reason. Having said that, the way you influence academic programs of study, especially at the undergraduate level, is by participating in college advisory boards, or teaching in them as well — taking several zeros off your paycheck and teaching post-secondary education. The other thing is to help instructors. For the vast majority of people it's not computer science they're studying, it's information systems, often at private post-secondary training organizations — so help those instructors develop materials and labs and all of that, and provide support to them, because the instructors aren't paid well and don't have enough time to go learn all the new technologies and also develop all the courses. That's often what I hear from academics: that's how you get people to understand those basics.

So, open-source textbooks and that kind of thing?

Open-source textbooks and labs — it's not just resource material, it's helping them get those technologies into the classroom in a way that works for people who have no background in data theory.

It would be cool to get them to use GitHub and put some of it out as open source.

They do all that, yeah.

So, having spent some years in academia, I must say I'm a little bit skeptical about succeeding with wide NoSQL education. I think we're going to see something very similar to what we saw with object-oriented technologies when they started emerging. We'll see a couple of top universities with research programs training a small number of students; the majority of universities didn't pick up object-oriented technologies until they became mainstream — until C++ was a very dominant language and then Java started emerging. Is that going to happen with NoSQL?
We see a variety of different technologies and a variety of different players. It may be difficult for an instructor at a college to pick among these technologies, make sound decisions, and create a curriculum, knowing that things may change and that next year the things they are teaching may no longer be so current, or even valid. So I think most of the education is not going to happen in colleges for the foreseeable future. Most of it is going to happen through young developers getting into companies that are experimenting with NoSQL technologies: they'll read tutorials, and there will be all kinds of informal learning, group learning, webinars, and short information segments they get from a website. So I don't think we will see organized training. In companies, yes — when a company decides to go with some technology, then you create a curriculum, with a particular role for this technology adapted to the field in which the company is planning to apply it.

I'd like to ask a question on this, which is: are we even mature enough that we should be thinking about this, or do we maybe need to wait a little longer for things to mature? Because we're still feeling it out — it's kind of bleeding edge.

Well, I think a lot of the core database patterns are stable. I think we're going to see key-value stores as a database selection pattern for the next 20 years; I don't think that's really changed. I don't think graph databases are going to go away, because they solve things uniquely, and I don't think analytical systems are going to go away either. It's not about going away; it's about whether we've really gotten the science down.

That's a good point. A lot of the techniques that we see — concepts like sharding and replication — aren't going to go away, and I think those should be taught.
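Sharding is exactly the kind of durable, teachable technique the panel is describing. As a hedged illustration — not tied to any particular database mentioned here, and with node names invented for the example — a minimal consistent-hashing shard picker might look like this:

```python
import hashlib
from bisect import bisect

# Minimal consistent-hashing ring: each key maps to the nearest node
# clockwise on a hash ring, so adding or removing a node only remaps
# a fraction of the keys. Virtual nodes smooth out the distribution.

def _hash(value: str) -> int:
    """Map a string to a point on the ring via MD5 (fine for placement)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each physical node contributes `vnodes` points on the ring.
        self.ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.points = [h for h, _ in self.ring]

    def node_for(self, key: str) -> str:
        """Return the node responsible for this key."""
        idx = bisect(self.points, _hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
shard = ring.node_for("user:42")  # deterministic; one of the three nodes
```

The design point worth teaching: with naive `hash(key) % n` sharding, changing `n` remaps almost every key, while adding a fourth node to this ring remaps only roughly a quarter of them.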
I think distributed computing is relatively young, but the techniques in distributed computing are relatively stable, so I'd say it's probably 60-40: 60% of the things in our book are going to be pretty stable over the next 20 years, and maybe 40% will change. But I still think it's worthwhile to introduce it. One of the questions I'd like to ask is: does anybody have suggestions on how we can get some of the ideas from this conference out to the more general public? Maybe college and university is too narrow — maybe it's local meet-ups or things like that. There's a hand — Eric, you have a comment?

Internships.

I would agree with that. When I was a graduate student we were teaching NoSQL, essentially, and distributed systems topics to undergraduates, and it was pretty well covered, actually — and mind you, this was at Cambridge, where they insisted on teaching ML as an introductory programming language, with no C++ on the curriculum at all, even though the inventor of C++ was a graduate of the department. So you can't always reason with these things. But there are certainly many interesting hands-on training sessions — and I take the point, the hands-on side is very important — done in workshops at conferences and meet-ups that we've been involved in as a community and as a company. An interesting area, I think, is around distributed systems, because a lot of it does, as you mentioned, come from the academic literature — there's a strong overlap between what researchers were looking at 20 years ago and what is being used now. There have been some interesting projects to try and spread that knowledge around. I'm thinking of the NoSQL Summer project, which was set up by Tim Anglade and ran in, I think, about 20 countries around the world — sort of
informal, do-it-yourself reading groups looking at academic papers. There were groups we were involved in in the Bay Area and in London, and in fact we did offer internships — to Eric's point — to people we met through those reading groups who were clearly very capable and interested in learning these technologies. So I think there is a lot of education going on in this space; the problem is that the problems we're trying to solve here are hard computer science problems.

How about up here — we have a comment?

I won't take much time. Two things came to mind during the discussion. One: at another summit I attended, for Cassandra — I think sponsored by DataStax — they had a scholarship program in colleges, and I saw a competition where they showcased the graduate students. I'm from the Kansas City area; I'm a data coach and pioneer in the Midwest, and there are a lot of companies there. It's been a wonderful week, a lot of good learning, but I have some questions going back to Karen's lightning talk. I've heard about scalability, performance, and database patterns, but I haven't really heard a lot about anybody wanting to understand the data they're pulling in — what about data accuracy and data quality? So I was wondering what kind of steps or tools you are using, and what you're doing in that area — or is that out on the horizon, and you're just worried about serving it really fast? And by the way, I plan on taking this information back to Kansas City DAMA, because I'm usually the one pushing things back — so if there's anybody here who wants to go to DAMA chapters and talk, they're usually very open to speakers.

From my perspective, one of the things about document stores specifically is that you can load up a lot of queryable data very quickly. So one of the first things you can do about data quality is
statistical analysis of large data sets — specifically, you can look at counts of elements, and then at exceptions. If you see a few records with elements that don't occur in many others, those may be outliers, and that's just the beginning of how you might approach data quality. One of the things I've learned is that data quality work can happen after data is loaded; it doesn't have to happen before. So we use much more statistical approaches — frequency counts of letters in last names are a good example: if you start to see 1's and 0's in last names, you know the data entry fields didn't validate characters. Statistics is a really big, powerful tool.

We would call that data profiling — so if you're researching the topic, that's the term — and we have tools to do that, definitely in the SQL world, but also for pointing at spreadsheets and XML files and all that stuff.

Right, but how is that going to work with — I mean, we're talking a lot about streaming and real-time processing.

That's a different case. Not all use cases are streaming, and a document store especially isn't as focused on fast streaming sensor data.

And as I understand it, we can also do it after the load — that's the whole point.

I would like to add to that. I have a client right now for whom we are building a data quality system. They are getting data that often has ridiculous values, and some of the text in these values has typos. What we are currently doing: we are planning to store this data in a columnar database, and then we have a set of agents that run various rules. Some of these agents use statistical information and matching against databases of known entities, and we try to correct the typos; the subset that cannot be machine-resolved, we stream to data
stewards. A data steward is a role where a human tries to figure out what is wrong with the data. For this particular client, for example, the name of the company 3M has been spelled in over 600 different ways in the last 10 years, which is fascinating. What is interesting from the NoSQL approach — I mentioned a columnar database — is that as we ingest the data, we put it into these columns, and then whenever one of the agents applies a rule, we create an enhanced record with a timestamp, and we also keep an audit log. So if you look at a particular record, you have the history of the data — who changed it: was it the automatic agent, was it a person — and you have full traceability through the audit log. When you run queries, you get the data with the latest timestamp, which is the best data you currently know. If you need an answer quickly, that may be the best you can get; if a data steward looks into it the next day, you may get an improved value. So your data quality actually increases over time.

Could we move on to the next question? Otherwise we won't get through enough. One here, and then we'll take yours.

My question is about the original topic of this discussion: choosing the technology. So far at the conference I've learned that I can make a first classification — do I want a graph, do I want a key-value store, do I want columnar or Bigtable-style? Say, for example, I understood based on my business problems that I need a key-value store. Before coming to the conference I had one choice; after the conference I have five choices, four of them fairly new ones I'd never heard of before. Over the last three or four months I've invested in one technology, learning it and building a proof of concept
which I can show to my senior managers, who can take a decision on it. Now, being a developer, I would be questioned: have you thought about this new technology that's just come out? How many prototypes do I keep doing before I actually reach a decision — or do I spend the next few years doing prototypes?

Are you an employee or a consultant?

I used to be a consultant, now an employee.

So probably I made a bad decision already.

You don't want to do too many, then. Anybody want to take that? The paradox of choice.

Yes — like when you go to the grocery store. Remember when I mentioned: prioritize, but don't agonize. At some point you need to decide, and sometimes just flipping a coin is the right decision to make. When you find a system that seems right, go for it and do a prototype; you can refine your knowledge later, possibly with another prototype that does not need to be done under such time pressure.

Another point: key-value stores have relatively simple APIs — security can be more complicated — and the nice thing is that if you build a data access layer and abstract it, you should be able to switch out one key-value store for another without impacting the data access layer interface. Unfortunately, the more complicated your data structures get, the harder it is to divorce yourself from a vendor-specific API.

I'm glad you said the word "security," because I haven't heard enough about that, other than access control. There's not a lot of talk about it. Do we think it's automatically secure — that there aren't going to be any issues discovered? Who feels we have nothing to worry about here?

You didn't go to the Accumulo session on putting security directly in the column-store keys, then?

Oh — I missed that.

Yeah, I think that's a big trend. I certainly saw — I think MongoDB had a release with higher-level security, MarkLogic has had it for a while, and we have an entire chapter in the book
on security, just because we think it's absolutely critical, especially as you scale from a pilot into an enterprise-grade system. The biggest thing is that many organizations won't go beyond a single pilot without a good security solution, and I think one of the keynote speakers from MongoDB pointed out that this is a critical thing for businesses. It might be a checklist item for some people, but for others — in health care, and people who need audit trails — it's a make-or-break thing. I think we have at least three solutions that I know of with fine-grained security in different models in the NoSQL space today, and we're going to see more and more as these systems mature.

The reason I brought it up is that, looking at the schemaless databases, I realized you can do something I just called "schema injection," because the database will accept anything — any field you pass it. I wrote a blog post on it; a Russian podcast picked it up, but other than that there's been no response.

Like SQL injection, but it's going to inject a gigabyte document?

Well, I see this two ways. You can do a denial of service by injecting a ton of different fields. Or, anticipating that the model is going to change, you stick something in there that will break a security layer that might be added on later. Those are two angles of attack, and that's something the application pretty much has to take care of — but we should talk about it.

Absolutely, and that applies any time you have public interfaces. That's a good point.

All right. So last year I was in this same place, and one of the questions asked was: what is going to happen for NoSQL in 2013? There was talk about, hey, we are going to make it more secure, we are going to have transactional ACID, and things like that.
So rolling forward to today, we are still not there. What do you think we have done between 2012 and 2013 that makes NoSQL more powerful now?

Well, I see a huge leap in maturity in many of the products we saw on the show floor. The fact that we have security, we have more databases with ACID transactions, we have more products highly tuned for flash and solid-state drives — I think we're starting to see a lot of these vendors mature, and a lot more choices. I think the market is maturing as it should.

I see one thing that is really interesting, and that is: SQL is back. We see that in several ways.

Back, or bad?

You can see a variety of systems coming back with SQL. You see, for example, Cloudera Impala; you see Apache Drill; you see Dremel from Google; Big SQL from IBM, which lets you run full SQL over your HDFS and HBase loads — full SQL on the NoSQL, columnar storage. We have Cassandra, which has a really nice CQL 3 interface that resembles full SQL more and more. I think this is one of the things that will make actual adoption of this technology easier.

Yeah, I agree. The addition of these interfaces over existing data structures has really made the adoption strategies much easier for big companies.

Okay, this gentleman has the mic.

First, a comment on how to popularize NoSQL. I learned about it by reading a comparison of the digital campaigns for the presidential election — I believe the Obama campaign — which is where I first heard of Hadoop and the statistical language R; I only heard of NoSQL as a derivative of that. So make the statement that NoSQL helps you get elected, and you will have the attention of all presidents and all political figures in the industry. The second question is a joke — half joke, half
serious: they say happiness comes from lowering expectations. So what expectations should we lower about NoSQL?

Good question. I think the expectation we tried to lower was SQL, and that was the wrong expectation to lower. With the advent of the NoSQL age, if you like, the expectation we lowered was that we would remove rich queries from the frameworks, and that's a bad idea. People want to be able to make sense of data; they don't just want to store or collect it, they want to analyze it and understand its significance for their businesses. ACID semantics, on the other hand, was a great thing to get rid of — in some cases, not all. Bear in mind it's very hard to talk in generalizations about the NoSQL space, because the whole point of different databases for different purposes is that sometimes you care about long-running, complex transactions, and that's where ACID is great. But for many — for example, we work in the real-time analytics space — where you're collecting a high-velocity stream of events, doing analytics, and pulling insight out of that data, you do not need transactional semantics, because your sensor data or your financial market data or your clickstream data isn't going to change. You're not looking to do updates on it; you're collecting it — it's almost immutable — and you're looking to summarize it, extract trends and forecasts, count it, and analyze it in a number of different ways. In those cases, the constraint of traditional relational databases — not SQL, not many other things about them, but the fact that they insisted on complex transactional semantics, or gave you those transactional guarantees — that was the challenge, and that was the expectation we lowered in order to gain happiness, in the real-time analytics arena at least.

And I think that's really important. So, if I'm going to
represent Team SQL — which I don't; Team Data, that's me — there's this undercurrent, at first very blatant and now only partially blatant, that we have to trash all things SQL and all things traditional in order to raise up what we're doing, and it really backfires when you try to do outreach, because there are a lot of myths being said about SQL. It starts with the thing I hear at these events all the time: that once you hit 50 gigs on a SQL database it all falls apart. That's clearly not true — there aren't just some case studies on that, there are a bazillion case studies — so you lose credibility, because you've asserted a "fact" that people in the enterprise know to be untrue. If we in the NoSQL world would just pitch best fit for solving a problem — everyone loves that; they want the best fit. In fact we went through this whole thing when data warehousing came about: it was seen as bashing the relational model, because we didn't normalize, we didn't follow the traditional relational model, we did something different — we aggregate, we separate, we load stuff in without worrying about data quality at load time and fix data quality problems later — and we take a lot of crap in the data warehouse world too, because of the fact that the data is incorrect. So if we continue that sort of bashing, it's going to be really hard to sell what is really a better solution for a specific problem, and I think that's where we need to go with all of this.

So you're just advocating true facts?

True facts would be nice. The truth of it all.

Or a "Best Fit Now" conference.

Yeah. All right — some hands have been up at the back of the room; I want to get to the back before I take some of these folks here, and I know you've got one as well.

Recently I've started hearing about something called NewSQL, where they're talking more about the ACID
property, and now NoSQL is also trying to provide that ACID property, right? So where do you see this industry going? Will it go more toward NoSQL, or into NewSQL? Because NewSQL is coming more from the relational database management standpoint. Anybody want to grab that?

So I think NewSQL databases offer, well, interestingly, they offer a lot in terms of allowing you to retain many of the great properties of the toolchain available to SQL databases, while offering you some of the properties you traditionally associate with a NoSQL database, in particular scalability. But there is no free lunch. The reason that NoSQL databases emerged, or the trade-offs that they made in the early days, if you look at systems like Dynamo and Bigtable, were all about making specific trade-offs to allow you to scale out on commodity machines. Now, certain NewSQL databases work around that: they bring you back SQL, but at a price. Some of them don't bring you full SQL; they limit the scope over which you can do joins or full query semantics, so SQL is not just SQL, there's really no such single thing. Certain ones require you to run on proprietary, expensive hardware platforms with very high-performance, low-latency interconnects in order to get around the challenges of distributed systems by using hardware, which is rather similar to what Oracle Exadata does, although that doesn't get called NewSQL. I think you've just got to look carefully. You've got to push aside the labels NewSQL and NoSQL, look at the specific trade-offs that each individual system makes, and evaluate that against your need. For example, Cassandra has great support for building clusters across multiple data centers in a single topology. That's
really useful if you're dealing with large data sets and you're dealing with a problem where you can be reading or writing from multiple data centers; if you're not, it isn't of significant value to you. If you're dealing with very small data sets, or you're dealing with documents or with XML objects, then different solutions will be a better fit for you. So pick your poison carefully.

Yeah, I would like to add to what was just said. The common thesis of NewSQL databases is that traditional relational technology is built on constraints that we had in the mid 80s, when these technologies originated. The thesis now is that we have a different world: the power of CPUs is different, we have more memory, and we can design the engines of our relational stores in a different way. There is also another recognition: for example, that storing data in rows is maybe not the ideal form of storage. Maybe storing it in columns is better, or maybe, depending on your workload, some workloads will benefit from rows and others from columns. So I see the acceptance and creation of new relational database engines based on these newer principles as what drives NewSQL, and with the change in implementation technology it becomes easier for them to embrace some of the features we see today in NoSQL. So I think what will happen is that conventional relational databases will include some elements of NewSQL. We see, for example, that DB2 in its later versions also includes columnar storage, so you can choose between rows and columns; that is an element of NewSQL in a traditional database. So you see one shift going towards NoSQL, and you can also see NoSQL databases adopting more and more of SQL, even though they use a completely different implementation. So I think they will get closer, but they will not touch each other. It's all about tuning the knobs, you know. Yeah, all right, back row here.

So we all know that Google is the one
that came up with the MapReduce paradigm and so on, but I also heard recently that Google ditched MapReduce a long time back, just when we are all adopting MapReduce. So is that really true? And if it is true, what is Google headed towards? Any idea?

So I think that should only matter if you're Google, or another Google. Can someone just Google the answer to that right now? I mean, that should be available. I think what you're probably referring to is Google's new globally distributed database called Spanner, which ironically in German means "peeping Tom". But the fundamental thing I like to think about with MapReduce is that it's really about the parallel transformation of data in independent pipelines, and that's a fundamental process that scales over large clusters, whether you're processing images or transforming data. I don't think that's going to change. I think the concepts around scalable map and reduce functions, which you can write in almost any programming language as long as you embrace immutability, right, no side effects, are going to be universal. Google is continuing to change their infrastructure to take advantage of better distribution, but I don't think they're going to move away from that fundamental model of transformation. The way they do it, and the software packages they're using, will change all the time.

Hi, yeah, Rick Jones from NoSQL.org. I have two questions, well, one question and one joke my son came up with. I'm a little biased, but I'd like to see if I can make this a generic question: how can we as a community get people to learn NoSQL a lot better, beyond marketing and beyond conferences like this one? How can we do that? And I'll follow up with my joke; this is from my 14-year-old son. He said that if Hollywood came here to this conference, it would be the best movie in the world, because there would be no sequel.
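The map/reduce idea described in the answer above, parallel transformation of data in independent pipelines built from side-effect-free functions, can be sketched in a few lines of Python. This is a minimal word-count illustration, not any particular framework's API; all function names here are made up:

```python
# A word-count sketch: independent "map" steps over chunks of input,
# then a "reduce" step that merges the partial results.
from collections import Counter
from functools import reduce

def map_chunk(lines):
    # Map step: count words in one chunk; no shared state, no side effects.
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def merge_counts(a, b):
    # Reduce step: merge two partial counts. Because it is associative
    # and pure, the map steps are safe to run in parallel.
    return a + b

chunks = [
    ["the quick brown fox", "the lazy dog"],
    ["the dog barks"],
]
partials = [map_chunk(c) for c in chunks]  # each chunk is independent
total = reduce(merge_counts, partials)
print(total["the"])  # 3
```

Because `map_chunk` touches nothing outside its own chunk, the same code scales from a list comprehension to a process pool to a cluster; only the plumbing changes, not the functions.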
That was a great question, and I think that community is very much the answer: local meetups, and maybe as part of this conference we can even push doing that a little more. I run the LA MongoDB meetup along with some other colleagues, and I kind of wish that I had maybe done more straight-up NoSQL, because I think sometimes some of the other vendors feel a little uncomfortable presenting. But really, we need a wide variety of speakers, and also training sessions, because it's not enough to just do a one-hour talk and hear the different vendors. It's really good to sit for maybe three days with some proper trainers. I know that you're doing some training, as does Vlad, so it'd be great to do a tour of the cities.

Okay, so this is an observation leading into a question. One theme we've heard here is that change is constant: today's NoSQL solutions will not be here tomorrow; there will be new ones. So one suggestion would be going back to the common characteristics, back to our computer science fundamentals, and focusing on those in terms of the education issue. And then the question is: what are those things that maybe are not being taught as well as they should be, or not being appreciated by our students, that we need to do a better job with, so that people will be more successful with today's NoSQL technologies and tomorrow's?
Yeah, I think one of the ways that I try to approach it is to look for common patterns. One of the things I have on my bookshelf is the original Gang of Four book on object patterns, and I think those patterns are somewhat universal; they're a little more constant than other things. When we wrote our book, we tried to really focus on the patterns we thought were going to be constant, and I think the architectural patterns, the concepts of key-value stores, graph stores, and document stores, are going to be universal. One thing I see a lot less of this year is confusion about terminology; I think we are starting to agree on some of these terms.

All right, so talking about agreement: I think Andy Gross of Basho actually gave an interesting talk immediately prior to this one, looking at the fundamental problems you hit when building distributed systems, what happens when you try to make two machines talk to each other. He put up a quote, which I'll horribly misquote now, by Leslie Lamport, which said that a distributed system is one where your computer stops working because of a dependency on some computer you didn't even know existed. That's quite a nice quip, but seriously, this has been studied in the literature for 30 years; it's just quite difficult to expose, though it's getting better. I think good conferences like RICON, which despite the name is not actually about Riak but about distributed systems in general, cover the fundamental building blocks. Just as learning about the nature of programming languages helps you move from Java to C++ or to Scala or to whatever, these are the tools and techniques I would encourage everybody to get their head around: why the CAP theorem exists, and why you can't just have consistency and
availability, and partition tolerance all together. Those are useful general-purpose tools, universal truths, unfortunately, which will help when building these things, and I think that's important.

I think what's missing from this community is that we have this current reputation of defining ourselves by what we're not, which is a poor definition. We're missing the manifesto, the principles, the splendid truths of why NoSQL isn't just about the flavor of the platform; it's about these important goals we're trying to meet: distribution, high availability, scale, whatever those things are. We need to agree on what those goals are, define them, and explain them in a way where it doesn't matter what the software is. That's what we're missing.

Exactly, and I would like to add to that. What is essential for the longevity of our ability to solve problems is that we have proper fundamentals. Now, the way these fundamentals are communicated in schools is rather dry and theoretical; people just hate them and forget them as soon as the exam is over. So what needs to be done is this: we need to work with the fundamentals, but then try them hands-on with some technology and experience how they work. At that level you will get some basic knowledge and hands-on experience, but that alone is not enough. As was mentioned, you also need to extract patterns, and those patterns are fragments of thought that you can apply to various similar, related technologies. You can think of these patterns as manifestations of the fundamental principles, in a form we can apply with the technology we have today, independently of individual products. That's why these patterns are so critical: when we have a collection of patterns in our mind, we have a language we can use, and with this language we can describe, communicate, and solve problems.

We're gonna have to draw
things to a close. I'll just add that Adrian Cockcroft's reading list was a pretty good one in terms of patterns for development. We're pretty much out of time, then. Okay, all right. With that, I would like everybody to give the panel a great hand.
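The row-versus-column storage trade-off raised in the NewSQL discussion earlier can be made concrete with a small sketch. This is a toy illustration in Python, not any specific engine; the data and field names are invented:

```python
# Toy illustration of the row-versus-column trade-off: analytic scans
# want one column, point lookups want whole rows.
rows = [  # row-oriented layout: one complete record per entry
    {"id": 1, "name": "alice", "amount": 10.0},
    {"id": 2, "name": "bob", "amount": 20.0},
    {"id": 3, "name": "carol", "amount": 30.0},
]

columns = {  # column-oriented layout: one contiguous list per field
    "id": [1, 2, 3],
    "name": ["alice", "bob", "carol"],
    "amount": [10.0, 20.0, 30.0],
}

# Analytic query, SUM(amount): the row layout walks every field of every
# record, while the column layout reads only the "amount" list. On disk,
# that difference is the I/O saving columnar engines are built around.
row_total = sum(r["amount"] for r in rows)
col_total = sum(columns["amount"])

# Point lookup, fetch the record with id=2: here the row layout wins,
# because all fields of one record sit together.
record = next(r for r in rows if r["id"] == 2)
print(row_total, col_total, record["name"])  # 60.0 60.0 bob
```

This is why, as noted in the discussion, the choice depends on workload: scan-heavy analytics favor columns, record-at-a-time transactions favor rows, and systems like DB2's columnar option let you pick per table.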