Hi, good morning. Thanks for coming out on a Saturday. I'm going to try to tell you a little bit about big data and flash and SSDs. So who is using SSDs today in their servers? About half. That's a lot. That's great. So some of this talk is for the people who are not using them yet, but I'll also have some information for those of you already using them about our experiences at Aerospike: which devices are best, and price/performance trends in the coming years.

First, a little bit about our company. An aerospike is a very small disk at the top of a rocket. It creates a shockwave, and it is tuned for the exact size and shape of the rocket, helping the rocket pass through the sound barrier. So it's a very small piece of technology that makes a big difference in actually getting through the sound barrier. My co-founder, Srinivasan, is out of IIT Chennai, has a PhD in databases from the University of Wisconsin, and was also with Yahoo Mobile, where he was a VP of Engineering. So we have a lot of experience. And part of our goal, as was said, is to take the technology of Silicon Valley, the technology used internally at Yahoo and Google, and bring it out as a product for the companies that are competing with them and building big data solutions. Today, most of those solutions are locked up inside those companies, and you can't use them. So our goal, as people who know that technology and know those people, is to bring a product to market: a database that's capable of those kinds of solutions.

The big data architecture that we think about most today starts with an analytics tier, and that's usually Hadoop or something like Hadoop. We also see a lot of different solutions in analytics, because there are so many different queries you can do: graph databases, document databases, all these different kinds. They're usually working in batch mode with a data scientist to create the insights that are going to drive the big data in your business. These can be used for reporting, showing different managers, for example, what to do with their business. But what we also see is that these insights go into an application tier and are used in real time. So not only is there an analytics tier for big data, there is also an app database and an app server architecture for big data.

The technologies we see: Hadoop, of course, is the big one in analytics, but we also see a lot of Vertica among our customers. Within app servers, we're starting to see a lot of interest in Node.js and Nginx because of the high performance those two systems offer, as well as the older Apache and Tomcat. Within the app database area, we see a lot of Redis, we see a lot of Memcache, and we see a lot of the NoSQL databases, simply because of the scaling required when working with big data.

The kinds of deployments we see with Aerospike look something like this: you have models being created in, let's say, the advertising industry or in social media analysis, where you do user segmentation, find out which users are similar, find patterns of location and similarity, and those are all big batch jobs running in your Hadoop system. But if you're going to use those results on a second-by-second basis, you need a very fast front-end database that is highly available and tied to your application server. Which brings me to the idea of in-memory big data.
So we know the problems that are amenable to in-memory solutions, where you're not just doing a big batch process or, in database terms, a big table scan. What if you need random access to smaller pieces of data, or your analysis has chunks of data that need to be accessed much more randomly? Can we do these kinds of systems with big data? Previously, big data was hard to do in-memory, because in-memory is simply expensive: RAM is at about $30 a gigabyte. However, now we have flash and the capability to run with flash, and flash is very good at these kinds of random access problems for in-memory computation.

So what would you do with in-memory big data? Well, you might map-reduce not over your whole data set, but over very small parts of it. If you are in social media, maybe you would map-reduce just over a portion of your data, a subset of the graph. Maybe you would map-reduce over a particular time range, and just look at users that have changed a profile or posted within a certain period of time. Perhaps you're doing semantic analysis, and you're looking at posts or tweets that have certain keywords. Those are all areas where you want to look at just small subsets of your data without doing a big table scan. Now, in the case where your model needs to look at all of your data, that's what the batch processing systems are made for and why you should use Hadoop. But if you are looking at a smaller subset of big data, that is when random-access in-memory computation might work.

And imagine using something like streaming machine learning, where you have a model, and very small parts of that model are updated every time new data comes through the system. That is an example where you're not recomputing your model every hour or two hours; it can be continually adapting. Systems like Storm and Esper are the front runners of this kind of machine learning. And you're going to need a database with a lot of data capacity, terabytes and terabytes of data, in order to do that. Now, it's not the petabytes that are possible with a rotational-disk HDFS system, but it is the terabytes that you might need online.

So what is a specific example? Within the advertising community that Aerospike serves, we find that customers are now using not just user segments, where they have computed that particular users are similar and use that similarity for advertising optimization. Now they are tracking every IP address. They are tracking every geolocation of a user. They're tracking every search term, and they're keeping longer and longer histories of those search terms in order to do more accurate optimization in advertising.

An example of why we should think about this is that Google, Facebook, and Apple are making huge investments in flash infrastructure in their data centers. We know from the public filings of the company Fusion-io that Facebook and Apple alone bought $200 million worth of their hardware in one year. That's because they are switching their in-memory systems over to flash-based systems. They realize that for graph computation and for search and advertising computations, they need the random access that flash provides, at a much lower cost point. And $200 million of flash is a lot more capacity than $200 million of RAM.
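Going back to the streaming machine learning idea for a moment: here is a minimal Python sketch of a continually adapting model, a per-segment running click-through rate updated on every event rather than rebuilt in batch. It is illustrative only; the class and event names are invented for this example, and this is not Storm, Esper, or Aerospike code.

    # Sketch of a "streaming" model: state updates on every event,
    # instead of a batch job recomputing the model every hour.
    from collections import defaultdict

    class RunningCTR:
        """Per-segment click-through rate, updated per event (illustrative)."""
        def __init__(self):
            self.clicks = defaultdict(int)
            self.views = defaultdict(int)

        def observe(self, segment, clicked):
            # The model adapts with each incoming event; no periodic rebuild.
            self.views[segment] += 1
            if clicked:
                self.clicks[segment] += 1

        def rate(self, segment):
            v = self.views[segment]
            return self.clicks[segment] / v if v else 0.0

    model = RunningCTR()
    for segment, clicked in [("sports", True), ("sports", False), ("news", True)]:
        model.observe(segment, clicked)
    print(model.rate("sports"))   # 0.5

At terabyte scale, the model state lives in a database rather than a Python dict, which is where random-access flash storage comes in.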
So let's talk a little bit about SSDs. The first thing everyone thinks about with flash and SSDs is that all databases run faster, so why would you need an optimized system? And I see you shaking your head there, sir. What we find is most databases go maybe three or four times faster. If you just take MySQL and put it on a RAIDed SSD system, it will go a little bit faster. However, the problem with the old relational technologies is the kind of index updates that they do. They are optimized around the fact that rotational disks are very bad at seeking, so all of the index updates and materialized views are built around keeping similar data next to similar data for when you're doing your queries. With a flash-optimized database, read locality doesn't matter. OK, it still matters a little bit. But flash is so much faster. The idea that you could have 100,000 IOPS from a given drive, or 50,000 IOPS at a reasonable price, $1,000 or $2,000 a device, radically changes the kind of indexing techniques that you should use as a database provider, because you can skip around.

The second thing that is different about SSDs is that you still have to write in large blocks, and this is part of the underlying physics of these chip-based devices. A large block in an SSD is usually at least one megabyte; if you're going to do any write, you will end up rewriting one megabyte on the chip. That has to do with the technology they use to make flash: they have to erase the whole block, setting all of the bits back to 1, before they can write anything within that block. Then they can change 1s to 0s on a transistor-by-transistor basis.

So we've now learned that SSDs are different from any other kind of storage, certainly in my lifetime, in that with every other technology, reads and writes were the same speed and done in the same way. With RAM, it's very fast to read and very fast to write, and very random. With rotational disks, seeks are very slow, but reads and writes of a particular block are about the same speed. SSDs are different: writes are much slower, and reads are much faster.

The other important factor with SSDs is that they are more like RAM in that you can do more parallelism, and you must do a lot of parallelism in order to properly unlock the capabilities of the drive. Most of the drives that we test at Aerospike work best with about 64-way parallelism per drive. So you need 64 outstanding IOs to really get the drive working at its full capacity. Maybe 32 on some devices, maybe 128 on others. But it's not the kind of thing where you can simply memory-map the data and have each individual thread touch it; you're very unlikely to get enough parallelism with that kind of technique. However, with the good old database technologies of async IO and read and write calls, it is absolutely possible to get this kind of parallelism.
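As a rough illustration of what that parallelism looks like from the application side, here is a Python sketch that keeps 64 random reads in flight against a file using a thread pool. The path is a placeholder, and note that without direct I/O the OS page cache will absorb many of the reads, so treat this as the shape of the technique, not a drive benchmark.

    import os, random, time
    from concurrent.futures import ThreadPoolExecutor

    PATH = "/path/to/large_file_on_ssd"   # placeholder: point at a big file on your SSD
    READ_SIZE = 4096                      # small random reads, flash's strength
    PARALLELISM = 64                      # ~64 outstanding IOs per drive
    READS_PER_WORKER = 1000

    def worker(fd, file_size):
        for _ in range(READS_PER_WORKER):
            offset = random.randrange(0, file_size - READ_SIZE)
            offset -= offset % READ_SIZE  # keep reads block-aligned
            os.pread(fd, READ_SIZE, offset)

    fd = os.open(PATH, os.O_RDONLY)
    size = os.fstat(fd).st_size
    start = time.time()
    # One thread per outstanding IO; os.pread releases the GIL during the
    # syscall, so the kernel sees 64 concurrent requests and can keep the
    # drive's internal queues full.
    with ThreadPoolExecutor(max_workers=PARALLELISM) as pool:
        for _ in range(PARALLELISM):
            pool.submit(worker, fd, size)
    print(f"{PARALLELISM * READS_PER_WORKER / (time.time() - start):,.0f} reads/sec")
    os.close(fd)

A single thread issuing one read at a time would leave most of the drive idle; the whole point is queue depth.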
So everyone says to me, but Brian, SSDs are so expensive. And if you compare them to rotational disk, yes, they are absolutely more expensive, about 10 times more expensive. I'll go through the prices of SSDs in more detail, but you can buy them right around $1 per gigabyte: 1 terabyte for $1,000. DRAM is still at $30 per gigabyte. That's Dell's price list as of last month, when I last looked it up. And you have to power it: that $30 per gigabyte you're spending is about the same amount you will spend every year in power as well, just as a rule of thumb. SSD prices, even if you only look at the last year, have dropped by a factor of 10, and this will continue. So: more expensive than rotational drives, yes, but much, much, much cheaper than RAM.

The comparison in performance is, of course, that memory is still faster. There's not really an IO speed for memory, which means you can get about 5 million transactions per second out of an in-memory database, whereas an SSD will only do about 50,000 per device. But rotational disks will only do about 200 per drive. That is a massive difference in IOs compared with a rotational drive, and the price per IO between rotational and flash is just as lopsided. So let's take this all down to a bottom-line cost. For SSDs, and this is a real-world example that I priced out using the Dell R720xd with the Intel S3700s, 10 of those in a chassis, you could get 4 terabytes of storage for about $16,000. With RAM, in the same hardware, you only get 512 gigabytes. That's a very large RAM-based system, and it still costs you $20,000. So if you're buying your own systems, getting 4 terabytes instead of half a terabyte is a massive difference in capacity.

So let's, yes? Sure. Well, as an example, 3PAR is still selling very large rotational arrays. That's how you can still get a lot of IOs with rotational systems: by building up very large arrays. But such a system is capable of only 5,000 transactions per second, compared with this one, which is nearly infinite, and this one, which is capable of probably at least 2 million IOPS. And you're going to pay $100,000 for one of the smallest 3PAR systems. So rotational drives are not really cost-effective. Absolutely. They see the power of this as well. This isn't just about Aerospike; this is something that's changing our entire industry.

So an example here: let's say you need 10 terabytes of highly available, very random storage, and you're trying to make a pitch to your managers that you want to do this. If you're doing this with an in-memory system, getting 10 terabytes replicated twice will cost you nearly $4 million. And I don't know about your managers, but when I walk in and say, well, the first check you have to write is for $4 million, they'll say, well, you're going to do a smaller project than that to start. Whereas 10 terabytes online, and the example here is with a drive I'll talk about later, the Micron P320h, gets you very high IOPS, nearly 500,000 per server, at a cost, for 10 terabytes replicated, of only $250,000. Now, that's still a lot in data center terms. But it's only 10 servers instead of 100. The amount of rack space needed for 100 servers, even the compact servers I priced in this example, is fairly high, and you're going to spend a lot of time getting those machines racked and installed.

Now, that's at 10 terabytes. Maybe you say, well, that's fine, Brian, but we don't really have 10 terabytes. I was at a customer a few weeks ago, actually in China, and they said, well, we're using Redis with RAM. And I was saying, well, maybe you should consider using Aerospike with flash. And they only had a 200 gigabyte problem. Even in the first server buy, flash was going to save them $19,000, just in the first project, in the initial stage. And as they went up to one terabyte, they were going to save that over and over again by using flash instead of RAM.
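The back-of-the-envelope math behind those comparisons is simple enough to script. This sketch uses the talk's circa-2013 media prices; the per-server capacities are illustrative, and real totals run higher because they include the servers themselves, rack space, and power.

    def media_cost(total_tb, replication, dollars_per_gb, gb_per_server):
        """Raw storage-media cost and minimum server count (illustrative)."""
        gb = total_tb * 1024 * replication
        servers = -(-gb // gb_per_server)          # ceiling division
        return gb * dollars_per_gb, servers

    # 10 TB of data, replicated twice: DRAM at ~$30/GB vs. flash at ~$1/GB.
    # Real clusters leave headroom and buy whole machines, which is why the
    # talk quotes ~$4M/100 servers for RAM and 10 servers for flash.
    ram_dollars, ram_servers = media_cost(10, 2, 30.0, 512)        # 512 GB RAM/server
    ssd_dollars, ssd_servers = media_cost(10, 2, 1.0, 4 * 1024)    # 4 TB flash/server
    print(f"DRAM:  ~${ram_dollars:,.0f} in memory alone, {ram_servers}+ servers")
    print(f"Flash: ~${ssd_dollars:,.0f} in SSDs alone, {ssd_servers}+ servers")

Even counting only the storage media, the gap is roughly 30x, before you price the extra chassis and power that the RAM option needs.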
So how do you build a database that is optimized for SSDs? The first idea, like I was saying, is that you need to write in large blocks. There's a new name for this, the write-anywhere file system, which is sort of an old trick really, a twist on a log-based file system. In a log-based file system, when you're using copy-on-write semantics, when you write the next copy of something, you just put it anywhere, and you keep in-memory pointers to where the data is and simply update those. This allows you to put the data anywhere and write very, very quickly, maintaining very high write speeds, while still keeping the other copies of the data if necessary, and also reading very quickly. And the idea that goes with this is keeping your indexes in RAM, because this is still in-memory computing, and RAM is still very cheap compared to where it was. The bulk of your data is on your storage device, on your flash, but your indexes should be in RAM, because then you can still do your indexing, and all of the updates and iterations, at very high speed, and only go to the storage device once you have decided which record you're going to look at.
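Here is a toy Python version of that idea: a write-anywhere log with a RAM-resident index. This is a sketch of the technique, not Aerospike's implementation; a real system would buffer writes into large (roughly 1 MB) blocks and run a defragmenter to reclaim old copies.

    import os

    class LogStore:
        """Append-only store: data on 'flash', index in RAM (toy example)."""
        def __init__(self, path):
            self.fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND)
            self.head = os.fstat(self.fd).st_size   # next write position
            self.index = {}                          # key -> (offset, length), in RAM

        def put(self, key, value: bytes):
            # Copy-on-write: never overwrite in place. Append anywhere and
            # repoint the in-memory index; old copies linger until reclaimed.
            os.write(self.fd, value)
            self.index[key] = (self.head, len(value))
            self.head += len(value)

        def get(self, key):
            # One random read straight to the record; no on-disk index pages
            # to traverse, which suits flash's fast random reads.
            offset, length = self.index[key]
            return os.pread(self.fd, length, offset)

    store = LogStore("/tmp/toy.log")
    store.put("user:1", b"profile-v1")
    store.put("user:1", b"profile-v2")   # v1 becomes garbage to reclaim later
    print(store.get("user:1"))           # b'profile-v2'

The point is the division of labor: sequential large-block writes for the data, and pure RAM operations for every index update and lookup.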
One thing that we found is that with these kinds of in-memory structures for your indexes, you don't have to use RAID hardware. And RAID hardware has become a massive bottleneck in systems of this type, in our experience. In the examples I was showing with the Intel S3700 drives, there is no RAID hardware that we have found that will keep up with four of those drives, and that includes our engineering samples of the latest Dell cards with the very highest-end LSI controllers on them. That chassis can hold 20 drives, and still the LSI controller will bottleneck at four. This is a reason why the industry will be moving very, very rapidly away from SATA; SATA is essentially dead in the industry today, and all of the high-performance drives I will talk about are PCIe. We have already started seeing a move from PCIe cards to PCIe front-panel modules in the same 2.5-inch form factor that we have for rotational disks. So imagine it still looks like a 2.5-inch front-panel drive, you can still hot-swap it, but it's actually got a PCIe connector on the back instead of a SATA connector. You can buy those today from Dell.

So there are a lot of people selling SSDs, a lot of manufacturers, and there's a lot of difference between the SSD vendors. The biggest reason there's so much difference is that firmware for SSDs is more complicated than firmware for rotational drives. And whenever you get a bunch of hardware guys trying to build a bunch of software, it usually goes very badly for about four years, and then they get smarter and smarter and everything's fine again. But we're still in that bad stage, where they're having to run what is essentially a garbage collector and a defragmenter inside those drives, because of the wear-leveling algorithms and the chip failures that can occur. And that's really much more like software. So we are, in my view, in the third generation of SSDs and flash storage devices right now, and only now are we starting to see the best algorithms in those classes of software become available from the different manufacturers.

So the first thing you should do is measure the drives, because we live in a data-oriented world in computer science. We at Aerospike have created a tool called ACT, the Aerospike Certification Tool; it's available on GitHub. What we found was that all of the current benchmarks have a fundamental problem, which is that they tend to do a lot of writes, and then they tend to do a lot of reads. Well, most databases and most data problems aren't like that. If you are only writing data or only reading data, that's easy. The hard problems are when you're reading and writing at the same time. Those are much more like real-world workloads. I won't go into detail here, because you can download the code yourself and see what we're doing, but we're applying these large-block writes, we're doing example defragmentation of those blocks in the code, and then doing very small reads, just as you would in a data problem where random 1.5K user profiles are being read from these disks while constantly being updated.

And usually what our customers want is a particular latency guarantee. So the way to run the test is to run at a particular performance level and measure the latency, then increase that level of performance and measure the latency again. The other problem with SSDs is that the drives change over time, so you need an automated test that you can run for about two days in order to see the true performance characteristics of a drive. Some drives, some of the better ones now, don't change very much over time. But some of them do, we have found. We actually wrote this test because of an earlier Micron drive about two and a half years ago. We recommended it to a customer; that customer deployed it, and then they found, at about the two-day mark, that their servers were suddenly slowing down so badly it was taking their service offline. There was a pause in those particular drives of about 500 milliseconds every two minutes, just like clockwork. Once you filled the drive up to a certain extent, it would have these pauses. And being offline for 500 milliseconds in a database can seem like an eternity. So that's why we developed this tool, which allows you to very easily see the time progression.

So this particular drive, and it's not one of the cheaper drives, it's one of the more expensive ones, is an SLC drive, so it also has very good write characteristics. We measure the percentage of responses over 1 millisecond, over 8 milliseconds, and over 64 milliseconds. What we got out of this drive at 150,000 read IOPS were these characteristics, and they never changed: this chart is the first six hours, but it stayed exactly like this through 48 hours. Now, this particular drive is at $8 a gigabyte, which is fairly high. On the other hand, these are very impressive read and write figures at this latency: very nearly 99.9% under 1 millisecond at these very high throughput rates. In comparison, let's look at the Intel S3700, which is also a very excellent drive, priced right around $3 per gigabyte. But it was only doing 6K IOPS, and getting to 1.6% above 1 millisecond. So comparing those numbers, 150K versus 6K IOPS, you see a drive where, if you need this kind of performance, $8 a gigabyte is a steal. Now, the benefit of the S3700 is that it's a SATA drive, so it's very easy to build up higher densities in these systems, and if you're willing to live with a little bit of extra latency, you can push them a bit higher. So this ACT tool that we've developed and released as open source allows you to measure whatever different kinds of drives you have, and to do things like experiment with over-provisioning.
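The shape of that kind of test is roughly the following Python sketch: hold a fixed mixed read/write load and report how many reads exceed each latency threshold. This is only the shape; the real ACT tool is written in C, runs for days against raw devices with direct I/O, and is the thing to actually use.

    import os, random, time

    PATH = "/path/to/test_file"        # placeholder: a large pre-created file
    RECORD, BLOCK = 1536, 1024 * 1024  # ~1.5 KB reads, 1 MB large-block writes
    READS_PER_SEC = 2000               # the fixed load level to hold
    THRESHOLDS_MS = (1, 8, 64)         # ACT-style latency buckets
    DURATION = 60                      # seconds; real runs last ~48 hours

    fd = os.open(PATH, os.O_RDWR)
    size = os.fstat(fd).st_size
    buf = os.urandom(BLOCK)            # one reusable 1 MB write buffer
    over = {t: 0 for t in THRESHOLDS_MS}
    reads = 0
    deadline = time.time() + DURATION
    while time.time() < deadline:
        if reads % 100 == 0:           # interleave large-block writes with reads
            off = random.randrange(0, size - BLOCK)
            os.pwrite(fd, buf, off - off % BLOCK)
        t0 = time.perf_counter()
        os.pread(fd, RECORD, random.randrange(0, size - RECORD))
        ms = (time.perf_counter() - t0) * 1000
        reads += 1
        for t in THRESHOLDS_MS:
            over[t] += ms > t
        time.sleep(1.0 / READS_PER_SEC)   # pace the load; rerun at higher rates
    for t in THRESHOLDS_MS:
        print(f"reads over {t:>2} ms: {100 * over[t] / reads:.3f}%")

Rerunning at increasing READS_PER_SEC values, over long durations, is what exposes the kind of clockwork pauses described above.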
The Fusion-io drive that we had, the ioDrive2, an MLC drive from their line, was also priced around the $8-a-gigabyte level. But compared to this Micron drive at the same price, it was not performing anywhere near as well. And we're partners with Fusion-io. I like Fusion-io. I know the CEO, the old CEO, the one who left, and the CTO very well. But we have a hard time recommending drives at that performance. Just because they're funded by NEA and we're funded by NEA doesn't mean I can recommend them when I see numbers like this. Yes? Sorry, say again? Over-provisioning. A classic trick with flash drives is that, since they need to defragment continually inside the drive and do garbage collection, there's always more capacity inside the drive than it says, because they need a lot of spare blocks while they're constantly consolidating partially filled ones. Yes? So one lever a drive manufacturer has on price is how much of the drive to expose. Maybe it's got 1 terabyte of flash inside, and they say it's a 700 gig drive, whereas the drive performs better if they say it's only 500 gigabytes. So over-provisioning makes the drive appear smaller, and you use less of the drive, in order to allow the defragmentation algorithms within the drive to not work so hard. That's a common trick among drives; you can read about it on our website. In Linux, hdparm will let you resize the device so that it seems smaller.

So, OCZ. OCZ is a great low-price provider; I had one in my laptop for over two years. However, we talked to a customer who used the Vertex 4 even against our recommendation, and they said that as soon as production traffic started to flow, they had two drives fail every day. So when you hear people talk about SSDs possibly being unreliable, they may be using drives like this. The drives that we are recommending are things like the Intel S3700. The Samsung drives are very reliable. These Micron drives are great. The Fusion-io drives are reliable; they're just not as fast as they could be at this price. We do have some customers using Vertex 4s, but at lower levels of performance.

So here are some raw numbers for the previous generation of drives. We recommended and put into production with our customers a lot of these Intel 320s and Samsung SS805s. The Intel X25s we never actually deployed with our software; they were the very first drives that were production quality. And you can see some of the different numbers that we had in our tests.

So I want to talk to you for a few minutes about the Aerospike software. This is a list of some of our customers. You probably recognize, hello, yes? Yes. That was the year before those Microns came out. Yes?

Hi. So I'm a software developer; we mostly just focus on the bits. I usually buy EC2 instances, which now have SSD-backed instances as well, and from other providers like MediaTemple or DigitalOcean, DigitalOcean specifically because they are extremely low priced: you can get an SSD-backed machine for as little as $5 a month. Now, there I don't have a choice in selecting the SSD. I just click on the web page and get on with the machine. I really don't know much about SSDs. How do I find out which SSDs they are actually using? Is my software running on those compute devices compromised in any way?

So first of all, good question. Amazon, of course, uses a lot of SSDs under the covers. They have the high-IO instances, which have local SSDs, and they also use them throughout their infrastructure, within DynamoDB, SimpleDB even, and EBS.
So the benefit of using Amazon is that you don't have to worry about it, and the pain is that you can't do anything better than whatever they're doing. You just live with that. For the other hosted providers, what you'll probably find them using are devices such as these. For example, SoftLayer, which is very popular and a little higher priced than the ones you mentioned, is using a lot of the SMART devices. So you'll find devices in this particular range with the lower-cost providers. It's still better than rotational disk, and it still hits that nice middle range. And you're probably going to be running a file system on top of it. File systems have a lot of problems with SSDs, because the tricks I mentioned about keeping your metadata, essentially your index data, in RAM, file systems don't do that. So instead, what you've got is a lot of rewriting, and the write amplification that occurs with file systems really ends up hurting. Most drive manufacturers expect that if you do a file system write, it will actually generate seven writes within their system. That's one of the biggest problems with write amplification, whereas the algorithms that we've built have a write amplification of exactly two: for every write you do, you're actually doing two under the covers. So as a software person, you're not going to get this kind of benefit unless you talk to someone who has done some flash optimization.

So we've eaten a little bit into the question-and-answer time, so I'm going to go through this part quickly. These are some of our customers. We have an office here in Bangalore; if you see Sanil back there in his classy Aerospike shirt, he's one of our local team leads. Companies such as PubMatic are out of India, as well as InMobi. So why do these guys all use a for-pay database instead of using one of the open-source solutions or just using Amazon? At some point, at the scale that you're operating at, having and maintaining your own data center often makes sense, and that often happens with big data. Now, the elasticity of the cloud is also a huge benefit with big data, so of course you're going to mix and match your solutions: some projects I start in the cloud, and I might move them to my own data center later, et cetera. But there's a reason why these guys are using a for-pay database, and we've been proven in production with a large number of advertising customers. One reason is we are simply faster, much, much faster than a lot of the other NoSQL databases, and you can find a lot of this information on our website.

So there's a company called Thumbtack, a boutique integrator of NoSQL solutions out of New York. They are not only Aerospike partners, they're also Cassandra and 10gen integrators, and they work with a lot of the Couch solutions as well. They did some benchmarks, including measurements of what happens when you bring down a cluster node right in the middle of operations: what actually happens to your latency? We're very proud of the results that they found with our solution.

So I'll tell you a few things about the architecture. The architecture of our solution is a true clustered system with shared nothing. Whenever a new node joins a cluster, we run the Paxos distributed consensus algorithm a very minimal number of times, only when it joins the cluster. We have our own Paxos implementation for that.
Once the cluster is formed, each individual transaction is still applied as a transaction must be, for high consistency with high-speed data. So there is a master for any given row, but that master might change depending on the number of servers you have, or change over time. That gives us a lot of capability in terms of simply being able to snap new nodes into clusters and also tolerate downtime. We ship clients for all the major languages, and those clients do the work of sharding for you and also track the cluster. So when you add a new cluster node, or one goes offline because of a hardware problem, transactions immediately start getting rerouted without you having to do anything. We also have cross-data-center replication, because if you really care about high availability, you're not just in one site and one data center; we do that for you automatically. And we also have a really cool monitoring UI, done by some folks I saw here earlier today as well.

So we're very proud of our capabilities with both in-memory and flash. And again, as a software person, with a flash-optimized database you can start applying your in-memory techniques and algorithms, the ones that aren't just table scans and batch jobs, to workloads in the terabyte range. We also have a meetup on Tuesday, near our office, where we're going to be talking about database uptime and more about our product, and you can find information at our booth outside as well. So, any questions? More questions? Yes?

Yeah, so my question is basically related to that comparison between Fusion-io and Micron. My personal experience is that if you run that kind of mixed workload on the Fusion-io, the performance starts degrading, and if you just run reads, they are a lot better than the Micron. So what percentage of reads and writes are you running in your test cases?

Good question. The test framework allows you to choose different mixes, because it's going to be very different for different applications. We find an interesting workload is 50-50. In a lot of cases for real-time user behavior, such as advertising, as soon as you read in order to make a decision about a single user action, you then need to update it. So these workloads are often 50-50 in advertising and real-time behavioral work. We have some customers that are doing 80-10 in one direction or the other, and the tool can be configured for that.

And if I may continue, I have one more question, related to that WAFL, write-anywhere type of file system you have. You are saying that you are not going to overwrite in the same place; you basically go ahead and write in some other place and then update the pointer, right? Eventually, eventually you will. Yeah, so we need the transaction there, right? So you have to do the small writes onto the disk somewhere. So where do you keep these things? Is it on the SSD where you maintain that transaction log? Sorry? Is it on the SSD where you maintain this transaction log, all these small writes? No, they're large writes. No, what I'm actually saying is that you cannot keep the data in RAM, because when the system goes down, you are going to lose it, right?
So you have to maintain a transaction log of some sort, so that you can replay it when the system is coming up. Right, and this is where we have done something unusual in database history. If you do it the way you describe, you can only go as fast as today's databases. So we have several techniques. First of all, you have a whole second server with all of that data in it, so if one server goes down, the data is on another server right now: a hot standby. Second of all, most of the time when a system goes down, you only need to restart the process. In that case, you can use shared memory and allow that transaction information to survive the restart of the process. Now, in the third case, where you actually have to do a full restart of the entire operating system: first of all, that had better be very rare. If that's happening even once a week on these large systems, you're probably doing something wrong; you've got some faulty hardware or some bad software. And in that case, you can go back and reconstruct all of that information from disk, if your hot standby fails and you have to restart the whole machine. Thank you.

Hello? Yeah, I have several questions. The first one is about the durability of the SSD itself. As we know, SSD cells keep degrading as you do more writes. What's the average life cycle you have seen for these SSDs?

That's a great question, and I'm not sure why I didn't include my usual slide on it, because it's a very interesting question. Each one of these drives is rated for the number of write cycles that it can take. The newer drives like the Intel S3700s have fairly high write durability; Intel has created an entirely new kind of flash chip, which they call eMLC, to raise the durability. But even with the older generation, like the Intel 320, we did the math for our customers, and we found that as long as the drive was going to last in the two-to-three-year range under their write load, everyone tended to be fairly happy, because we know how fast this market is moving, and in two or three years you're going to replace it with the next generation anyway. In that case, it's also advantageous to have a clustered system, because we can do a generation upgrade where you take down a machine, swap out its drives, bring it back up, and it resynchronizes. We've had customers switch between one generation of flash and the next that way. But typically, we're finding even the low-write-rate drives will last for three or four years. You do have to do the math, though.

Now, the second question: in one of your slides, you mentioned that there are 700 GB drives, but this slide only shows a max of 150 GB. Yeah, these were some of the older drives, the previous generation, and on these I didn't call out the sizes. Of these, we were testing the 400s, but 800s are also available. Fusion-io capacities are going up; I think they announced their 10 gig card, sorry, 10 terabyte card. And this Micron drive is a 700 gig drive. All of the new PCIe form factors are now starting to push a terabyte per card, on up through 10; Violin has their new 10 terabyte card out as well.

Yeah, I had a related question regarding the durability question he asked. Of course, I understand that a power outage is something you cannot rule out at these data volumes, but how do you ensure that an issue in the data on one system doesn't replicate to the other? What kind of issue?
Some data corruption at one node not being replicated to the second one. So when you're doing the write, you don't write to the disk, read it back out, and then write it to the other disk. So the kind of corruption you would have is if the data corrupts in RAM as you're doing the replication. And frankly, that's a pretty hard thing to fix; if that happens, I'm sorry. No, I meant in terms of your own software: when it's building those blocks, realigning and merging multiple small writes into one single large write to the SSD, if there's a software bug that corrupts one block of memory, which gets replicated to the other node anyway, it can replicate the error itself. That's the question. We don't do replication at a block level; we do it at a transaction level. That's actually very important. When you're doing the transaction, the most common configuration for us is that you're replicating and committing to all the copies in DRAM, essentially getting k-safety out of your transaction in DRAM, and then doing write-behind to the devices. So in that case, if you have the fault, it would only happen on one of the devices, and you'd still have the other copies. We actually do have a check for that kind of hardware failure in the system, where we do a checksum on each individual row-based transaction when we're reading it back. We've turned it on occasionally, and with these particular higher-quality drives we haven't seen problems, so our customers tend not to turn it on. Thanks.

That's all we have time for today. One last question. I just read through your pamphlet as well as the presentation, and I just want to know how Aerospike actually earns money here. We save people money on hardware costs. So they pay you, or, I mean, is it only the consultancy part? No, no, we're a business and we sell software. It's kind of old-fashioned. OK, OK. We actually sell software for a living. It's only a little crazy; Larry Ellison did a good job of it, and maybe it will work for us too. So this is like a perpetual license, something which would be in the systems, from Aerospace, Aerospike. We're a database software company, so you would buy us and install us. We have consulting and services that go with it, and we support our customers. So it's a piece of software you install. Thank you. And we also have a free version: anyone who wants to try this, go to the website; up to a certain data size and a certain cluster size, it's free forever.

That's all we have time for today. Please catch up with Brian offline if you have any issues you want to discuss with him. We have a half-hour break now. Thanks, Brian. Thank you.