So first, how many people have used RavenDB at all? Have you downloaded it or tried it? Who had heard of it before they saw it on the conference program? Alright, cool. So, because of that, I'll go through a general description of what Raven is, how it works, why it's interesting, and then get into some of the issues we've found implementing it at a larger scale in a high-availability environment. That's really one of the things I'm going to focus on. Developers — developers love these NoSQL databases. I was a developer. You can do a lot of things, you can change your mind, you can deploy really fast. But the operations team is equally important. So everything that we did, and everything I'm going to talk about here, has a dual focus on developers and operations. I think that's been a theme of some other talks we've heard here: developers may love it, but what happens once it gets into production? We're going to talk a lot about that as well. So, Raven is a document database. In that way it's similar to CouchDB and MongoDB; it's not as similar to things like Riak or Cassandra. Access to the Raven server is access to an HTTP API. One of the things that makes Raven a little different — and it's also been a topic in a lot of the discussions this week — is that it is fully ACID. When you're saving anything, it's always written to disk. That storage is actually ESENT, which has been around since Windows NT as part of Windows, but nobody really knows about it. It's the transactional disk storage that's used by Active Directory and Exchange Server, so it's very mature, very reliable. Indexing is done with Lucene. So you can sort of think of the Raven server as a package of services and tools, and a really great development environment, around ESENT storage and Lucene search.
So you've got your documents stored on disk, and you've got Lucene doing a lot of interesting things. You can easily extend the server — you can extend the client as well, but on the server you can basically just drop a DLL into the server's bin directory and it will get loaded up, using MEF. How many .NET folks in here? About a couple. So MEF is a way of dynamically loading DLLs based on interfaces — contracts that you declare. You just drop in the DLL and it will load up whatever new functionality you want to add to the server. I'll talk about some of that. The management UI is Silverlight. It's okay. It's fine for sort of playing around, and we'll talk about some of the things we did at NBC to make it really suitable for an operations team — we actually built our own UI. You can host Raven as a Windows service, in IIS, or you can embed it, so if you want a desktop app you can do that; there's a license for that as well. The team is actually working on adding a new storage engine in addition to ESENT so that it can run on Linux boxes, since ESENT is a Windows thing. So they're very actively working on a storage engine that will allow it to run on multiple platforms. On the client side, Raven ships with the .NET client — Raven is written in .NET, the server as well. But there are clients for JavaScript and Ruby out there, although they're not official from Hibernating Rhinos; they're pretty well supported by the community. And I think there are others underway, or at least being requested by a lot of folks, for Java among others. The client is really just a wrapper around the HTTP API, but it provides a lot of additional features: caching, and change notifications. So as a client, you can say, I want to know any time this document changes, or any time a document in this set changes, or any time a document that matches this query changes, and you'll get a notification on the client side.
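That subscription model — watch one document, a whole collection, or anything matching a query — can be sketched in miniature. This is purely an illustration of the idea; the names here are made up, not Raven's client API, and the real client delivers these events over a persistent connection rather than in-process callbacks:

```python
# Minimal sketch of document change notifications: subscribers register a
# predicate (a single document, a collection, or an arbitrary query) plus a
# callback; every write is checked against the registered predicates.

class ChangeNotifier:
    def __init__(self):
        self._subscriptions = []  # list of (predicate, callback) pairs

    def for_document(self, doc_id, callback):
        self._subscriptions.append((lambda d: d["id"] == doc_id, callback))

    def for_collection(self, name, callback):
        self._subscriptions.append(
            (lambda d: d.get("collection") == name, callback))

    def for_query(self, predicate, callback):
        self._subscriptions.append((predicate, callback))

    def document_changed(self, doc):
        # Called whenever a document is written on the "server".
        for predicate, callback in self._subscriptions:
            if predicate(doc):
                callback(doc)

notifier = ChangeNotifier()
seen = []
notifier.for_document("speakers/1", lambda d: seen.append(("one", d["id"])))
notifier.for_collection("Sessions", lambda d: seen.append(("set", d["id"])))

notifier.document_changed({"id": "speakers/1", "collection": "Speakers"})
notifier.document_changed({"id": "sessions/7", "collection": "Sessions"})
```

The point is just that the filtering happens for you; the client code only supplies what it cares about and a callback.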
And that's done through a technology called SignalR, which is basically a WebSockets kind of thing built in .NET. That works really well — it keeps the socket open, and it falls back through other transports along the way. The LINQ support is a big deal. A lot of folks here are not .NET folks, so I'll show a little bit of LINQ and why it's interesting. Another theme at this conference has been standard query languages: will these products evolve to where they can match SQL for querying the way you can in the SQL databases, whether with SQL itself or some other language? LINQ is a really interesting thing that addresses that issue. So, licensing. It is open source. You can download it and play with it; you can contribute to it. The license is AGPL, so it's copyleft — be careful with that. If you pay for it, you're off the hook for any of that; you can do whatever you want. There's also an exception for open source projects: you can get a free license if your own project is open source. Commercial licenses are based on scale, and there's also some stuff — like the clustering support, and compression and encryption of the data on disk — that only comes with the higher-end editions. So briefly, I want to show the UI. This is the Silverlight UI that I mentioned. It's pretty simple. If you're spinning up a project, you're a developer, you're getting going — this is great. It's not great for an operations team. But you can go in and create new databases here, and when you go into one of them, you can look at documents. Here's a document; it got an ID when I saved it. You don't have to do that — you can give a document any ID you want. I've got a little issue with my Silverlight here — I'm not a huge fan of the Silverlight UI — it won't let me type, it won't let me paste into that field. But if you give it your own ID, it will save.
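That ID rule — keep whatever ID the client supplies, otherwise generate one — can be sketched with a toy in-memory store. Nothing here is Raven's API; it just illustrates the behaviour, using Raven's familiar `collection/number` style of generated IDs:

```python
import itertools
from collections import defaultdict

# Toy document store: a document keeps whatever ID you give it; if you give
# none, one is generated as "collection/number" (e.g. "speakers/1"). The real
# server hands out number ranges to clients so this doesn't need a round trip
# per document, but that detail is omitted here.
class TinyDocStore:
    def __init__(self):
        self.docs = {}
        self._counters = defaultdict(itertools.count)

    def put(self, collection, doc, doc_id=None):
        if doc_id is None:
            doc_id = f"{collection}/{next(self._counters[collection]) + 1}"
        self.docs[doc_id] = doc
        return doc_id

store = TinyDocStore()
explicit = store.put("speakers", {"name": "Ann"}, doc_id="speakers/ann")
generated = store.put("speakers", {"name": "Bob"})  # gets "speakers/1"
```

Either way the document lands in the same flat document space; the ID is just a key.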
There's another interesting topic there — ID generation, and how clients can manage the ID generation — that we'll come back to later on; there are some interesting things in it, which is why I showed that here. Collections are a way for Raven to group documents. Some people think of them as tables, but they're not tables. Everything in Raven is just documents; they're all the same. It's just that if you have a .NET class and you save it through the Raven client, the collection will be tagged in the metadata. So if we go back to a couple of these documents and open one up, you can see the metadata here, which in this case only has stuff about replication — tracking this document in case it gets replicated. But if you save through the .NET client, it'll actually save additional information here about the .NET CLR type, so that it can be deserialized back into the right type of object. Next, indexes — we'll talk about those in some detail. You can start to see a little bit of LINQ here, although this particular definition is actually more JavaScript than LINQ. Besides plain indexes you can also define map/reduce indexes. You can do a lot of calculation, a lot of statistics reporting. Raven is not great for sort of star-schema-style BI reporting — you probably wouldn't want to do that with Raven — but for things like "how many comments are on this blog post," or counting some value over time, you can create a map/reduce index that will compute that sort of thing, store it in the index, and then you just query the index. That's a trivial example. You can also do transforms, which are server-side projections. Typically you'd do a SQL query — select, join — and then pick out the fields you want. You can do that sort of thing in Raven across documents. You don't want to think relationally about creating your document model.
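The comment-count example from a moment ago is just a plain map/reduce over documents. Here is the idea in Python — the shape of the computation, not Raven's index syntax (which is LINQ-based and runs incrementally in the background on the server):

```python
from collections import defaultdict

# Map/reduce index sketch: "comment count per blog post".
# map: emit (post_id, 1) for each comment document.
# reduce: sum the emitted values per key.

comments = [
    {"id": "comments/1", "post": "posts/1"},
    {"id": "comments/2", "post": "posts/1"},
    {"id": "comments/3", "post": "posts/2"},
]

def map_fn(doc):
    yield doc["post"], 1

def reduce_fn(pairs):
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

index = reduce_fn(pair for doc in comments for pair in map_fn(doc))
```

Because the result lives in the index, a query for a post's comment count is a cheap lookup rather than a scan over comments.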
We'll talk a little bit about that, but when you're loading up a document you can go grab some information from other documents, and reduce the number of round trips you're making across the wire. Patching is great. You can do a query, or specify a single document, and essentially perform operations on every document that matches the query. This is one of those developer-friendly things: I want to keep evolving my schema; I don't want to be stuck working around the schema I created before I knew everything I needed to know. And everything I'm showing here in the UI you can do in code as well. Everything is automatable, or can be done as part of a deployment — so if you want to update your object model, you can go ahead and update the data as well during that deployment. There are some operations-friendly tasks here for exporting and importing, and backups. Turning indexing off — there are some scenarios where you might want to do that. You can also view the logs. So that's a quick tour around the UI; we'll talk about some of these things in more detail. Let me show a little bit of code as well here. Can everybody see that? This is .NET, using the .NET client. Remember, underneath, all it's doing is making HTTP calls to the server. Our instance is running here — when you run it as a standalone EXE it brings up a console, so you can see all the requests it's receiving, and you can see that some of the things we were just doing in the UI were making requests here. So how do you use it? When your app starts up, you create a single instance of the document store, and that's sort of your unit of caching: everything you do within the context of the store is cached on the client. We'll talk about that. You initialize it — I'll come back to the conventions. Then, if you want to create a new object: I created a speaker object. This is just a .NET type that I created here — a couple of types.
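The patch-by-query idea from a moment ago can be sketched like this. This is a toy, not Raven's patching API (which takes a server-side script or a typed patch request); it only shows why set-based updates make schema evolution during a deployment practical:

```python
# Toy patch-by-query: apply an update function to every document matching a
# predicate, the way a server-side set-based patch would, and report how many
# documents were touched.

docs = {
    "speakers/1": {"name": "Ann", "talks": 2},
    "speakers/2": {"name": "Bob", "talks": 0},
}

def patch(store, matches, update):
    changed = 0
    for doc in store.values():
        if matches(doc):
            update(doc)
            changed += 1
    return changed

# Example migration: add a "featured" flag to every speaker with talks > 0.
n = patch(docs, lambda d: d["talks"] > 0, lambda d: d.update(featured=True))
```

Run as part of a deployment, a patch like this brings the stored documents in line with the new object model without touching documents that don't need it.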
So: create a speaker object, create a session object. One of the interesting things is the speaker ID in the session object. There's nothing special there — I haven't done anything with my types — but Raven will recognize this as the ID of another document and do some interesting things with it. Then I go ahead and save. By default you probably want to be using optimistic concurrency; it's not on by default, and typically I'll set it in the store's conventions at startup so that every session automatically gets it. Then you store the object and call SaveChanges. The session is a unit of work: anything you do inside the session before you call SaveChanges will happen as a transaction when you call SaveChanges. Loading is similar — just call Load and tell it the type you want to deserialize into, and the ID. Then there's loading with Include. This is to prevent the N+1 problem: I have a list of documents, each of them has a property, and I want to go get the other document that it points to. You can do that here. When I load in the session, I'm saying: also load the thing whose ID is in SpeakerId, whatever it is. That way, on the second call, it looks like I'm about to make a second round trip to the server, but it's already locally in the session cache on the client. This is an example of LINQ. You can do all of your querying in Lucene syntax if you want — if you know Lucene and that's what you like, you can do that, and if you're using the HTTP API directly, Lucene syntax is what you pass in the query string. Or, in a much simpler way, you can use LINQ. If you aren't familiar with LINQ at all: in .NET it's Language Integrated Query, a SQL-ish language built into C# and the other .NET languages that basically gets translated into something else before it executes. So these two chunks of code do exactly the same thing — just two different C# syntaxes for it.
One is method-based and one is intended to look pretty similar to SQL. But again, it's not being translated into SQL like in NHibernate; it's being translated into the URL that's sent to the server, with the Lucene syntax. Alright. So, why Raven? Why did we choose it at NBC? Just to give a little context on what we were trying to solve: what we called the NBC News Digital network included a whole bunch of websites in addition to NBC News — Today, CNBC, MSNBC, all of the individual television shows like Dateline and Rachel Maddow. These numbers are as of a year ago; I don't have access to the numbers anymore. A little over a billion page views, a lot of video streaming, a lot of unique users — about 60 million a month. And the traffic spikes are pretty big in this business. On average, during a U.S. work day — almost all of our audience is U.S., because people around the world don't really know NBC very well — we're getting something like a thousand page views a second, which translates into some larger number of requests per second at the web servers. But there are days — whether it's a natural disaster, typically bad news, something you know about ahead of time like election night, or something like the tsunami-slash-nuclear disaster. The biggest traffic day of all time — and in talking with other people in the news business, they found the same thing — was the day that Michael Jackson died. We got a hundred-x spike in traffic. The news broke late in the day, we got a huge spike in the evening, and the following morning, about 8 or 9 a.m., as people woke up, another massive spike — on the order of a billion-plus pages in 24 hours. So that's the kind of spike in this business. And we needed very fast page loads and instant publish times. One of the measures our editors and publishers use is: when they hit the button, can they click over to the browser and see their update? If you cache too much, you lose that — the usual tension between data freshness and the caching you need for scalability. We were also deploying six or eight times a day. Things happen fast, whether it's new advertising, new news events, or new types of features, so the ability to deploy continuously without any downtime is very, very important — and we had zero downtime for the last year. Last week I looked it up: Amazon was down for a little bit, Google has gone down, Gmail and Google Apps have had outages. I like to define high availability as: how long can it be down before you have an uncomfortable conversation with the boss? In our case that was about five seconds. Another way to put it is, you can't have downtime. So in order to do that, you need to support rolling deployments and rollbacks without any downtime. What that means is you're going to have multiple versions of your code, and multiple versions of your data structures — say you're adding a property, or moving a property, changing the shape of your objects — live at the same time. In our case we had a lot of messaging too, so different message structures. All of those things have to be forward and backward compatible, so you can start rolling out new code and roll back, and the old code can deal with any new data. And on failover, there are really two types. You want the system to do the right thing — you want it to fail over on its own — but you also want to give the operations team the controls to say: look, this data center is having a problem that the system can't detect, I need to send the traffic here — or this server, or this particular application. I
skipped one point earlier: decoupling. Decoupling is always great. It's not just physically having different servers doing different things; it's also temporal decoupling. If one part of the system is completely down, you want the other pieces to continue to function for some period of time — depending on the application, a few minutes or a few days — but you want things to not break everywhere just because one thing is broken. That's temporal decoupling: you don't require everything to be up at the same time; some things can be down. And you want seamless scale on top of all of that. So in essence, what we were looking for was a private data cloud, where we can evolve the schema, and where the app code — the developers — don't have to worry about calling different places to get the data. They just use the one connection, the one session, and it talks to the right thing: it fails over, it talks to the closest thing. If you've got Raven instances in multiple data centers, and a whole bunch of your applications in multiple data centers, you don't want the applications going across the country, with a 150-millisecond round trip, to get data; you want them to stay in the data center they're in. But if all the Raven instances in that data center go down, you want to give them the option to go across the country — slower, but better than nothing, usually. So for all of these reasons we chose Raven. It was a pretty risky thing — it was version 1.0 then. Now it's on version 2.5, which came out a couple of months ago; there are a lot more features, a lot more users. We did find a lot of bugs. We were an early adopter, pushing it beyond what most people were using it for. And today — well, a year ago — we were using Raven behind all of the apps, across all devices and mobile OSs, and also a growing number of sections of the site, the web apps. The Raven stats here are from a year ago, so this is when we were only doing a couple of news sections, and that's the rate at which we could roll Raven out. I should say, on the operations team's concerns: it may sound weird, but the very first thing we did was deploy Raven into production, and I would highly recommend that for any of you trying out a new technology. We basically put a Raven server into our environment and just deployed it with everything else. It wasn't doing anything — nothing was talking to it — but we integrated it right into the deployment process from day one, and that allowed the operations team to really get comfortable: where is it, how do I monitor it? We could start to build up their knowledge while we were developing systems on top of it for the first time. It's shared learning across development and operations, and it's really important. You don't want the developers to work on something for three or six or nine months and then have the operations team say: what is this thing? I don't want to deal with it. So get it out there. We're a little short on time, and there's a whole bunch of detail here, so feel free to stop me and ask questions. [In response to an audience question:] Yes — Raven doesn't do that on its own, but there's a UUID associated with documents, so you should be able to do that. So one of the things in this detail section is how Raven uses ETags. Raven has an HTTP interface, and it adopts a lot of the sort of native HTTP idioms. ETags in Raven are basically sequential UUIDs. The UUID you saw when I created the document was essentially random, as close as it can be, but an ETag is generated for every single update of every single document or index inside of Raven, and it's sequential: you don't care what the number is, you care whether it's greater or less than another one. That's used for things like version control and optimistic concurrency — two clients save the same thing, and by default that will fail. One of the things we did — this actually gets more into the replication — is: if two
clients save a document with the same ID to two different servers at the same time, before replication happens, you get a replication conflict. The ETag is saved in the metadata, and that's what allows you to resolve the conflict. In our case, we said the last one wins — we just didn't care, and that worked for our data. It may not work for your data; if you've got financial data or something like that, you'll need a different strategy, and you do have to have one. (What is the replication model? Actually, I'll get into that.) Alright — I'll skip some of this stuff on the client, although it really is another place where the ETag is used. As a Raven client inside my app, I'm keeping a cache of all the requests I've made, and I'm using the ETag to invalidate it. So if I request document 123, and then in a different session I request document 123 again, the client will make a request to the server with an If-None-Match header containing the ETag, and if it's the same on the server, the response will be just a 304 — it won't send the whole body of the document back. You can also control that further: you can avoid even that round trip — just to find out whether you already have the latest — by turning on what's called aggressive caching. You can configure that on any given session and say how long you want the results to be cached. One of the things we did for our operations team was make that configurable at runtime through an admin interface, so in the case of a Michael Jackson-type event, they could crank up the caching at runtime so that the Raven clients would make fewer requests to the servers. Sharding is not something we used, because we had a very read-heavy scenario; sharding is great for write-heavy scenarios. If that's what you have, you can mix sharding and replication — each shard can be replicated any number of times. There is no rebalancing, though, and this is something that's different from a lot of other solutions: Raven will not rebalance your shards. If you add another shard, you're going to need to go in and figure out how to move your data around. It's a pain. It does a lot of things, but it doesn't do that. So there's no built-in concept of doing a modulo of a hash — well, there sort of is: you choose what the sharding strategy is. When an object gets saved, the Raven client passes it to a method that you've written, and you determine which shard to send it to. You can use a hash; you can use whatever you want. However, if you add a new shard, your sharding method is going to have to account for the fact that none of the existing data is ever going to exist on that shard unless you move it over yourself. So, indexing and querying. Other than the operational stuff and replication, indexing is the big thing I want to talk about. Every query runs against an index. You can create whatever indexes you want yourself, just like in a relational database, but if you don't, Raven doesn't have a concept of a table scan: it will auto-generate an index based on the query you're running, kick off that index-creation process, and immediately return whatever results exist — which, in that case, will be none. No documents. Indexing and replication both happen in the background, asynchronously. So if you have a new query that's never been run and doesn't match any existing index, chances are you're going to get no documents back the first time, but you will have kicked off an index build, and as you continue to query, more and more of the data that matches will be there. So when you need a new index, you need a way to build that index across all of your Raven instances before real clients start making queries against it. That's a pain, and because it's such a pain there are some tools that help with index creation. We actually created some of our own, so that we could go and generate new indexes before we did a deployment of our code, and give them a couple of hours to build over 40 or 50 days' worth of data. It would take an hour, maybe, to build an index, depending on the complexity: it scans through every single document in the entire store, deciding whether or not it belongs in the index. Lucene is very fast, but it's not instant. The types of indexes are pretty varied. There's your basic field index, just like in a relational store. Full-text indexing: it ships with the basic Lucene analyzers, but you can customize that. If you have logic for entity extraction or other semantic extraction of your data, you can plug that into a Lucene analyzer and it will run it and store the results in the Raven index. Spatial indexing, for maps. Map/reduce, with transforms possible even after the reduce step. Querying with LINQ and Lucene syntax, which we've seen. We talked a little bit about indexes being stale. You can control it: you can say, make this query, but wait until the index is no longer stale. Be really careful with that, though — set a very small timeout, milliseconds, maybe 50 milliseconds. Otherwise your thread could just be hanging out there until the TCP connection fails, sitting in the way. We talked about patching. Scripted index results we didn't talk about, which is very interesting: you can actually update data in your documents, or create new documents, based on an index. I gave you the example before of using map/reduce to count comments on blog posts. Another way of doing that: you could write a script on that index, so that at the end of the indexing process it takes that value and actually writes it back into the blog post document — all happening on the server, during indexing. That way you don't even have to make a query against the index to find out the result; you're just loading your blog post
document, and it's got the count right in it — as up to date as the index. There are obviously more interesting uses than comment counts on a blog, but the idea is very cool. We talked a little bit about creating indexes before you deploy. Again, queries are done against indexes that can be stale, so you need to make sure your application is dealing with that. We had one case — too complicated to get into the details — where we basically needed consistency, but on a property that was not the ID. Since you can't guarantee that in Raven, what we did was generate a new document whose ID was based on a deterministic value — the same value as the property — so that we didn't have to query on the property. Then we could have competing client instances look at that document to determine who would do the processing. Replication, again, is done in the background. It will be up to date — depending on your network speed and how much data is being written — within tens or maybe hundreds of milliseconds. All replication is one-way. There's no concept of two-way replication, but it's easy to get: you just set up two one-ways. So if you want master-master, where everybody can be written to, you just have everybody replicate to everybody else. The Silverlight UI is really focused on a single Raven instance, so you'd have to go to that instance, set up its replication, open up another Silverlight UI for the next instance, and so on. So we wrote a UI that lets us, basically just with a bunch of dropdowns, say: this thing replicates across all of these. I'm hoping something similar gets into the actual Raven distribution — we're working with them to get what we need — because that's an operations task where you're not going to want to use the Silverlight UI. It's great for development, though. One thing that's new in 2.5, which just came out a month or so ago, is setting a write quorum value. If you really need to make sure that your data is in multiple places — make sure it's on three Raven servers, that it has replicated to other instances — before the call returns to the client, you can do that. And you can set a timeout on it so you're not hanging forever: give it a hundred milliseconds to make sure it's written in three places — three nodes or a hundred milliseconds, whichever comes first. Normally a write comes in to a single Raven instance, gets written to disk, and returns, which is very fast. If, however, you need to make sure that data survives even if that server blows up before it replicates out, the write quorum value is a way to say: when a write comes in to me, node A, wait until it has successfully replicated to B and C before I tell the client — before I give a response that says yes, that write succeeded. The hundred milliseconds is just an example of a timeout: if the connection between A and B is slow, or between A and C, you want to send the client a timeout — treat it as a failure, try again — rather than tell them the data was stored. And there's a system document — just as most databases use documents, or tables, in the database to manage the database itself, all of the configuration of Raven is inside Raven documents. The client uses the same replication document that the servers use to figure out the topology, in order to do failover, and you can set in that document whether you allow reads to fail over, or reads and writes, or neither. On the topology and ETags — this is just detail on how the replication happens, but it makes the point that it's always driven from the source to the destination. The source asks the destination: so, what do you have?
"I have up to ETag 42" — and the next time the source asks, if the answer the destination gives is the latest document the source has, nothing gets sent. So: we did set up this multi-master replication across all of the Raven instances, but in order to reduce the number of replication conflicts — you know, the same document written in two places, and then there's a conflict when they try to replicate to each other — we wrote our own code to direct all writes to a single place, a single instance. That could fail over, but at any given moment — even though anybody could be written to — all of the clients were sending everything to one place, just to avoid those conflicts. We still got them sometimes, because other things happen, so we wrote a conflict resolver. That's just one of the plug-in points: if there's a conflict, you can hook into that event in the server during the replication process and decide what you want to do. The default behavior is a little weird: it saves both versions and then creates a new document whose ID says, hey, there was a conflict — and the client will read that and explode, whatever tries to read it. Our approach was last-write-wins, by whichever timestamp on the two different servers was later — not guaranteed to be perfectly consistent, but fine for our data. ID generation: how many people are familiar with the HiLo strategy for ID generation? It's basically a way to let the client control ID generation — to give out IDs without having to go to the server — while guaranteeing that no two clients ever give out the same ID. The client gets a "high" number, and with it a range. The simplest way to explain it: here's 32 IDs you can give out; so every 32 IDs, there's one request to the server to get a new batch of IDs. And it's smart enough to basically do exponential expansion of that range: if there's a ton of ID requests coming from a particular client, the range will grow. So, just reviewing some of the operational stuff. In our case, we controlled where the writes went so we wouldn't get replication conflicts. We let operations control the aggressive-caching time in case of high load. Deploying new instances was done with replication: we'd spin up a new, empty Raven instance, just set up replication to it, and a couple of hours later we had a new instance we could have clients talking to. And we did have scheduled backups — Raven supports all the typical backup tooling you'd use on Windows — but we never needed to restore, because we already had all these multiple instances. Really, the only case where you need a backup is if your application wrote a bunch of bad data and corrupted things; then you want to restore your backup and replicate it out. Fortunately, that didn't happen to us. You want to be able to copy index definitions from one instance to another, so that each instance can rebuild those indexes and have them available before your code needs to make queries against them. And there are a lot of stats endpoints inside the Raven server that you can use to see exactly what's going on: for each individual index you can see exactly how far behind it is, if it's behind, and you can see the replication status — how far behind it is on different servers, if it is behind. So those are the things I covered — a lot of things to keep in mind. I love Raven; it's an awesome tool. But like any other tool, you've got to think about what you're doing; you can't just expect it to solve all your problems. For more information — I won't go through it all here. I'll tweet the links out; Twitter handles are at the bottom. I think the slides will be linked from somewhere as well. Thank you.