Hello everybody, and thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled Migrating Your Vertica Cluster to the Cloud. I'm Jeff Healey and I lead Vertica Marketing. I'll be your host for this breakout session. Joining me are Sunit Kizwani and Chris Daley, Vertica Product Technology Engineers and key members of our Customer Success Team. Before we begin, I encourage you to submit questions or comments during the virtual session. You don't have to wait, just type your question or comment in the question box below the slides and click submit. As always, there will be a Q&A session at the end of the presentation. We'll answer as many questions as we're able to during that time. Any questions that we don't address, we'll do our best to answer offline. Alternatively, you can visit the Vertica forum at forum.vertica.com to post your questions there after the session. Our engineering team is planning to join the forum to keep the conversation going. Also, as a reminder, you can maximize your screen by clicking the double arrow button in the lower right corner of the slide. And yes, this virtual session is being recorded and will be available to view on demand this week. We'll send you a notification as soon as it's ready. Now, let's get started. Over to you, Sunit. Thank you, Jeff. Hello, everyone. My name is Sunit Kizwani, and I will be talking about planning to deploy or migrate your Vertica cluster to the cloud. You may be moving an on-prem cluster or setting up a new cluster in the cloud. There are several design and operational considerations that will come into play. Some of these are cost, and which cloud platform your team has expertise in. There may be a personal preference too. 
After that, there will be some operational considerations, like cluster sizing, which Vertica mode you want to deploy, Eon or Enterprise, depending on your use case, what DevOps skills are available, what elasticity and workload separation you need, what your backup and DR strategy is, and what you want in terms of high availability. You will have to think about how much data you have and where it's going to live, and in order to understand the cost and the benefit of this deployment, you will have to understand the access patterns and how you are moving data into the cloud. These are things to consider before you move a Vertica deployment to the cloud. One thing to keep in mind is that virtual CPUs, or vCPUs, in the cloud are not the same as the CPUs that you've been familiar with in your data center. A vCPU is half of a physical core, because of hyperthreading. There is definitely the noisy neighbor effect: depending on what else is hosted in the cloud environment, you may occasionally see performance issues. There are IO limitations on the instances that you provision. What that really means is you can't always scale up; you might have to scale out. Basically, you have to have more instances rather than bigger or right-sized instances. Finally, there is an important distinction here: virtualization is not free. There can be a significant overhead to virtualization, as much as 30%. When you size and scale your clusters, you must keep that in mind. The other important aspect is where you put your cluster. The choice of region, how far it is from your various office locations, and where the data will live with respect to the cluster all matter. Popular locations can fill up, so if you want to scale out, additional capacity may or may not be available. These are things you have to keep in mind when choosing your cloud platform and your deployment. At this point, I want to make a plug for Eon mode. Eon mode is our latest mode. 
It is a cloud mode for Vertica. It has been designed with cloud economics in mind. It uses shared storage, which is durable, available, and very cheap, like S3 storage or Google Cloud Storage. It has been designed for quick scaling, like scale-out, and highly elastic deployments. It has also been designed for high workload isolation, where each application or user group can be isolated from the others so that they can be billed and monitored separately without affecting each other. But there are some disadvantages, or perhaps a cost, to using Eon mode. While S3 storage itself is cheap, access to it is neither cheap nor efficient: there is high IO latency when accessing data from S3, and there are API and data access costs associated with accessing your data in S3. Vertica in Eon mode has a pay-as-you-go model, which works for some people and does not work for others, so it is important to keep that in mind. Performance can be a little bit variable here, because it depends on the local depot, which is a cache, and it is not as predictable as Enterprise mode, so that's another trade-off. So let's spend about a minute and see what a Vertica cluster in Eon mode looks like. A Vertica cluster in Eon mode has S3 as the durability layer where all the data sits. On top of that there are subclusters, which are essentially just execution groups of separate compute that can service different workloads. So in this example, you may have two subclusters, one servicing an ETL workload and the other servicing queries. These subclusters are isolated and do not affect each other's performance. This allows you to scale them independently and isolate workloads. So this is the new Vertica Eon mode, which has been specifically designed by us for use in the cloud. You can use Enterprise mode or Eon mode in the cloud; it really depends on what your use case is, but both of these are possible, and we highly recommend Eon mode wherever possible. 
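As a rough illustration of those S3 API charges, here's a back-of-envelope sketch. The per-request rates used below are assumptions for illustration only, not current AWS pricing; check the S3 pricing page for real numbers.

```shell
#!/bin/sh
# Rough monthly S3 request-cost estimate, illustrating the API
# charges discussed above. Rates are illustrative assumptions:
# $0.0004 per 1,000 GET requests and $0.005 per 1,000 PUT requests.
estimate_cost() {
  gets=$1; puts=$2
  awk -v g="$gets" -v p="$puts" \
      'BEGIN { printf "%.2f\n", g/1000*0.0004 + p/1000*0.005 }'
}
estimate_cost 500000000 10000000   # 500M GETs, 10M PUTs -> 250.00
```

The point is not the exact dollar figure but that request volume, not just stored bytes, drives part of the Eon-mode bill.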
Let's talk a little bit about what we mean by Vertica support in the cloud. Now, as you know, a cloud is a shared data center. Performance in the cloud can vary between regions, availability zones, time of day, choice of instance type, what concurrency you use, and of course, the noisy neighbor effect. We at Vertica performance test, load test, and stress test our product before every release. We have a bunch of use cases; we go through all of them, make sure that we haven't regressed any performance, and make sure that it meets our standards and gives you the high performance that you've come to expect. However, your solution or your workload is unique to you, and it is still your responsibility to make sure that it is tuned appropriately. To do this, one of the easiest things you can do is pick a tested operating system and allocate virtual machines with enough resources; pick something that we recommend, because we have tested it thoroughly. It goes a long way in giving you predictability. After this, I would like to go into the various cloud platforms that Vertica works on. I'll start with AWS, and my colleague Chris will speak about Azure and GCP and our path forward. So without further ado, let's start with the Amazon Web Services platform. As you are probably all aware, Amazon Web Services is the market leader in this space. They really are the biggest provider by far and have been here for a very long time. And Vertica has deep integration with Amazon Web Services. We provide a marketplace offering with both pay-as-you-go and bring-your-own-license models. We have many knowledge base articles, best practices, scripts, and resources that help you configure and use a Vertica database in the cloud. We have had several customers in the cloud for many, many years now. 
We have Management Console-based point-and-click deployments for ease of use in the cloud. So Vertica has deep integration in the Amazon space and has been there for quite a while now, so we have a lot of experience here. Let's talk about sizing on AWS. Sizing on any platform comes down to four or five things: picking the right instance type, picking the right disk volume and type, tuning and optimizing your networking, and finally some operational concerns like security, manageability, and backup. So let's go into each one of these in the AWS ecosystem. The choice of instance type is one of the important choices that you will make. In Eon mode, you don't really need persistent disk. You should probably choose ephemeral instance-store disks, because they give you extra speed and come free with the instance type. We highly recommend the i3.4xlarge instance type, which is very economical and has a big 4 TB depot cache per node. The i3.metal is similar to the i3.4xlarge, but has significantly better performance, for those subclusters that need the extra oomph. The i3.2xlarge is good for scale-out of small ad hoc subclusters. It has a smaller cache and lower performance, but it's cheap enough to use fairly indiscriminately. If you are in Enterprise mode, we don't use S3 as the layer of durability; your local volumes are where we persist the data. Hence, you do need EBS volumes in Enterprise mode. In order to make sure that the deployment is manageable, you might have to use some sort of software RAID over the EBS volumes. The most common instance types we see in Enterprise mode are the r4.4xlarge, or the c4 or m4 instance types. Of course, for temp space and the depot, we always recommend instance volumes; they're just much faster. Let's talk about optimizing or tuning your network. The best thing you can do to tune your network, especially in Eon mode but in other modes too, is to get a VPC endpoint for S3. 
This is essentially a routing rule that makes sure that all traffic between your cluster and S3 goes over an internal fabric. This makes it much faster, and you don't pay egress costs, especially if you're using external tables or your communal storage. But you do need to create it; many times people forget, so you really do have to create it. Best of all, it's free. It doesn't cost you anything extra. You just have to create it at cluster creation time, and there's a significant performance difference when using it. The next thing about tuning your network is sizing it correctly. Pick the geographical region closest to where you'll consume the data. Pick the right availability zone. We highly recommend using cluster placement groups; in fact, they are required for the stability of the cluster. A cluster placement group is essentially AWS's notion of a rack. Nodes in a cluster placement group are physically closer to each other than they would otherwise be. This allows 10 Gbps bidirectional TCP/IP flow between the nodes, and this makes sure that you can sustain a high number of commits per second. As you are probably all aware, the cloud does not support UDP broadcast. Hence, you must use point-to-point UDP for spread in the cloud, or in AWS. Beyond that, point-to-point UDP does not scale very well beyond 20 nodes. As your cluster size increases, you must switch over to large cluster mode. Finally, use instances with enhanced networking or SR-IOV support. Again, it's free; it comes with the choice of instance type and operating system. We highly recommend it. It makes a big difference in terms of how the workload will perform. Let's talk a little bit about security, configuration, and orchestration. As I said, we provide CloudFormation scripts to ease deployment, and you can use the MC's point and click. With regard to security, Vertica does support instance profiles out of the box on Amazon. We recommend you use them. 
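The AWS-side plumbing mentioned above, a cluster placement group, the S3 VPC endpoint, and an instance profile, can be sketched with the AWS CLI. Every ID, name, and region below is a placeholder assumption; the sketch writes the commands to a file for review rather than running them directly.

```shell
#!/bin/sh
# Sketch of AWS setup for a Vertica cluster: placement group,
# S3 gateway endpoint, and IAM instance profile. IDs are placeholders;
# fill in real values, then run the generated script.
cat > aws_cluster_setup.sh <<'EOF'
# Physically co-locate nodes for 10 Gbps node-to-node flow
aws ec2 create-placement-group --group-name vertica-cpg --strategy cluster
# Route S3 traffic over the internal fabric (no egress charges)
aws ec2 create-vpc-endpoint --vpc-id vpc-0example \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-0example
# Instance profile so nodes reach S3 without passing keys around
aws iam create-instance-profile --instance-profile-name vertica-profile
aws iam add-role-to-instance-profile --instance-profile-name vertica-profile \
    --role-name vertica-s3-role
aws ec2 associate-iam-instance-profile --instance-id i-0example \
    --iam-instance-profile Name=vertica-profile
EOF
echo "wrote aws_cluster_setup.sh"
```

Review the file, substitute your VPC, route table, role, and instance IDs, and run it with credentials that have EC2 and IAM permissions.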
This is highly desirable so that you're not passing access keys and secret keys around. If you use our marketplace image, we have picked the latest operating systems and patched them. Amazon actually validates everything on the Marketplace; it scans for security vulnerabilities, so you get that for free. We do some basic configuration: we disable root SSH access, we disallow any password access, we turn on encryption, and we run a basic set of security checks to make sure that the image is secure. Of course, it could be made more secure, but we try to balance security, performance, and convenience. Finally, let's talk about backups. Especially in Eon mode, I get the question: do we really need to back up our system, since the data is in S3? The answer is yes, you do, because S3 is not going to protect you against an accidental drop table. S3 has a finite amount of reliability, durability, and availability, and you may want to be able to restore data differently. Also, backups are important if you're doing DR, or if you have an additional cluster in a different region; the other cluster can be considered a backup. Finally, why not create a backup or a disaster recovery cluster? Storage is cheap in the cloud, so we highly recommend you use it. With this, I would like to hand it over to my colleague Christopher, who will talk about the other two platforms that we support, that is, Google and Azure. Over to you, Chris. Thank you. Thanks, Sunit. Hi, everyone. While there's no argument that we already have a long history of running within the Amazon Web Services space, there are other cloud service providers where we do have a presence, such as Google Cloud Platform, or GCP. For those of you who are unfamiliar with GCP, it's considered the third largest cloud service provider in the market, and it's priced very competitively to its peers. 
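Before we dig into GCP, the Eon-mode backup advice above can be made concrete with a minimal vbr configuration sketch. The bucket, paths, and database name below are placeholders, and the key names follow the vbr documentation; verify them against your Vertica version before relying on this.

```shell
#!/bin/sh
# Minimal sketch of a vbr config for backing up an Eon-mode database
# to a separate S3 bucket. All values are placeholders.
cat > backup_restore.ini <<'EOF'
[CloudStorage]
; back up to a bucket that is NOT your communal storage bucket
cloud_storage_backup_path = s3://my-backup-bucket/mydb/
cloud_storage_backup_file_system_path = []:/home/dbadmin/backup_locs

[Database]
dbName = mydb

[Misc]
snapshotName = mydb_snapshot
restorePointLimit = 3
EOF
echo "wrote backup_restore.ini"
```

With a real config in place, `vbr -t backup -c backup_restore.ini` run as the database administrator takes the backup, and the same file drives restores.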
It has a lot of similarities to AWS in the products and services that it offers, but it tends to be the go-to place for newer businesses or startups. We officially started supporting GCP a little over a year ago with our first entry into the GCP marketplace: a solution that deployed a fully functional and ready-to-use Enterprise mode cluster. We followed up on that with the release and support of Google Storage buckets, and now I'm extremely pleased to announce that with the launch of Vertica 10, we're officially supporting Eon mode architecture in GCP as well. But that's not all, as we're adding additional offerings into the GCP marketplace. With the launch of version 10, we'll be introducing a second listing in the marketplace that allows for the deployment of an Eon mode cluster, all driven by our own Management Console. This will allow customers to quickly spin up Eon-based clusters within the GCP space. And if that wasn't enough, I'm also pleased to tell you that very soon after the launch, we're going to be offering Vertica by the hour in GCP as well. And while we've done a lot to automate the solutions coming out of the marketplace, we recognize the simple fact that for a lot of you, building your cluster manually is really the only option. So with that in mind, let's talk about the things you need to understand in GCP to get that done. Now, you may think this slide looks familiar. Well, nope, it's not an erroneous duplicate of a slide from Sunit's AWS section. It's merely an acknowledgement of all the things you need to consider for running Vertica in the cloud. In Vertica, the choice of operational mode will dictate some of the choices you'll need to make in the infrastructure, and some of those choices are hard to change later. Just like with on-prem solutions, you'll need to understand the disk and networking capacities to get the most out of your cluster. 
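If you are building a GCP cluster manually, the gcloud CLI is the usual tool for scoping out what the platform offers. This sketch just writes survey commands to a file for review; the zone and filter expressions are assumptions, so adjust them to your region and preferred OS.

```shell
#!/bin/sh
# Survey sketch for a manual GCP build: what machine types and
# OS images are available. Zone and filters are assumptions.
cat > gcp_survey.sh <<'EOF'
# High-memory N1 machine types in one zone, ordered by vCPU count
gcloud compute machine-types list \
    --filter="zone:us-east1-b AND name~n1-highmem" \
    --sort-by=guestCpus
# Available images for a tested operating system family
gcloud compute images list --filter="family~centos"
EOF
echo "wrote gcp_survey.sh"
```

Running the generated script against your project tells you, before you commit to a design, whether the machine shapes you want actually exist in your target zone.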
And one of the most attractive things about GCP is the pricing, as it tends to run a little less than the others, but that does translate into fewer choices and options within the environment. If nothing else, I want you to take one thing away from this slide, and Sunit said this earlier: GCP runs on top of hardware that has hyperthreading enabled, and a vCPU doesn't equate to a core, but rather to a processing thread. This becomes particularly important if you're moving from an on-prem environment into the cloud, because a physical Vertica node with 32 cores is not the same thing as a VM with 32 vCPUs. In fact, with 32 vCPUs, you're only getting about 16 cores' worth of performance. GCP does offer a handful of VM types, which they categorize by letter, but for us, most of these don't make great choices for Vertica nodes. The N series, however, does offer a good core-to-memory ratio, especially when you're looking at the high-mem variants. Also, keep in mind that IO performance, such as network and disk, is partially dependent on the VM size, so customers in the GCP space should be focusing on 16-vCPU VMs and above for their Vertica nodes. Disk options in GCP can be broken down into two basic types: persistent disks and local disks, which are ephemeral. Persistent disks come in two forms, standard or SSD. For Vertica in Eon mode, we recommend that customers use persistent SSD disks for the catalog, and either local SSD disks or persistent SSD disks for the depot and the temp space. A couple of things to think about here, though. Persistent disks are provisioned as a single device with a settable size. Local disks are provisioned as multiple disk devices with a fixed size, requiring you to use some kind of software RAID to create a single storage device. So while local SSD disks provide much more throughput, you're using CPU resources to maintain that RAID set, so it's a bit of a trade-off. 
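The software-RAID step just described can be sketched with mdadm. The device names below assume NVMe local SSDs enumerated as namespaces of one controller; verify yours with `lsblk` first. The sketch writes the commands to a file so you can review them before running as root.

```shell
#!/bin/sh
# Sketch: combine GCP local SSDs into one RAID-0 device for the
# depot/temp space. Device names and mount point are assumptions.
cat > make_depot_raid.sh <<'EOF'
# Stripe four local SSDs into a single block device
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
    /dev/nvme0n1 /dev/nvme0n2 /dev/nvme0n3 /dev/nvme0n4
mkfs.ext4 -q /dev/md0
mkdir -p /vertica/depot
mount /dev/md0 /vertica/depot
EOF
echo "wrote make_depot_raid.sh"
```

RAID-0 is appropriate here precisely because the data is ephemeral cache; anything durable belongs on persistent disks or communal storage.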
Persistent disks offer redundancy either within the zone they exist in or within the region. If you select regional redundancy, the disks are replicated across multiple zones in the region. This does affect the performance of the VM, so we don't recommend it. What we do recommend is zonal redundancy when you're using persistent disks, as it gives you that redundancy level without actually affecting performance. Remember also that in the cloud space, all IO is network IO, as disks are basically block storage devices attached over the network. This means that disk actions can and will slow down network traffic. And finally, storage bucket access in GCP is based on GCP's interoperability mode, which means that it's basically compliant with the AWS S3 API. In interoperability mode, access to the bucket is granted by a key pair that GCP refers to as HMAC keys. HMAC keys can be generated for individual users or for service accounts. We recommend that when you're creating HMAC keys, you choose a service account, to ensure that the keys are not tied to a single employee. When thinking about storage for Enterprise mode, things change a little bit. We still recommend persistent SSD disks over standard ones. However, the use of local SSD disks for anything other than temp space is highly discouraged. As I said before, local SSD disks are ephemeral, meaning that the data is lost if the machine is turned off or goes down. It's not really a place you want to store your data. In GCP, multiple persistent disks placed into a software RAID set do not create more throughput like you can find in other clouds. The IO saturation usually hits the VM limit long before it hits the disk limit. In fact, the performance of persistent disks is determined not just by the size of the disk, but also by the size of the VM. A good rule of thumb in GCP to maximize your IO 
throughput for persistent disks is that the benefit from disk size tends to max out at 2 terabytes for SSDs and 10 terabytes for standard disks. Network performance in GCP can be thought of in two distinct ways: there's node-to-node traffic, and then there's egress traffic. Node-to-node performance in GCP is really good within the zone, with typical traffic between nodes falling in the 10 to 15 gigabits per second range. This might vary a little from zone to zone and region to region, but usually it's only limited by the existing traffic where the VMs exist, so kind of a noisy neighbor effect. Egress traffic from a VM, however, is subject to throughput caps, and these are based on the size of the VM. The speed is set per the number of vCPUs in the VM, at 2 gigabits per second per vCPU, and it tops out at 32 gigabits per second. So the larger the VM, the more vCPUs you get, and the larger the cap. So some things to consider in the networking space for your Vertica cluster: pick a region that's physically close to you, even if you're connecting to the GCP network on a corporate LAN as opposed to the internet. The further the packets have to travel, the longer it's going to take. Also, GCP, like most clouds, doesn't support UDP broadcast traffic on their virtual networking, so you do have to use the point-to-point flag for spread when you're creating your cluster. And since the network cap on VMs is set at 32 gigabits per second per VM, to maximize your network egress throughput, don't use VMs smaller than 16 vCPUs for your Vertica nodes. And that gets us to the one question I get asked most often: how do I get my data into and out of the cloud? Well, GCP offers many different methods to support different speeds and different price points for data ingress and egress. There's the obvious one, right across the internet, either directly to the VMs or into the storage bucket, or you can, you know, light up a VPN tunnel to encrypt all that traffic. 
But additionally, GCP offers direct network interconnects from your corporate network. These are provided either by Google or by a partner, and they vary in speeds. They also offer things called direct or carrier peering, which is connecting the edges of the networks between your network and GCP, and you can use CDN interconnects, which create on-demand connections from your network to the GCP network, provided by a large host of CDN service providers. So GCP offers a lot of ways to move your data in and out of the GCP cloud. It's really a matter of what price point works for you and what technology your corporation is looking to use. So, we've talked about AWS, we've talked about GCP, and that really only leaves one more cloud. Last, but by far not the least, there's the Microsoft Azure environment, holding on strong to the number two place among the major cloud providers. Azure offers a very robust cloud offering that's attractive to customers that already consume services from Microsoft. But what you need to keep in mind is that the underlying foundation of their cloud is based on the Microsoft Windows products, and this makes their cloud offering a little bit different in the services and offerings that they have. The good news here, though, is that Microsoft has done a very good job of getting their virtualization drivers baked into most Linux operating systems, making running Linux-based VMs in Azure fairly seamless. So here's the slide again, but now you're going to notice some slight differences. First off, in Azure we only support Enterprise mode. This is because the Azure storage product is very different from Google Cloud Storage and S3 on AWS. So while we're working on getting this supported, and we're starting to focus on this, we're just not there yet. This means that since we're only supporting Enterprise mode in Azure, getting the local disk performance right is one of the keys to success in running Vertica here. 
The other major key is making sure that you're getting the appropriate networking speeds. Overall, Azure is a really good platform for Vertica, and its performance and pricing are very much on par with AWS. But keep in mind that the newer versions of the CentOS operating system run much better here than the older versions. Okay, so first things first. Again, just like GCP, in Azure VMs are running on top of hardware that has hyperthreading enabled. And because of the way Hyper-V, Azure's virtualization engine, works, you can actually see this: if you look down into the CPU information of the VM, you'll actually see how it groups the vCPUs by core and by thread. Azure offers a lot of VM types and is adding new ones all the time, but for us, we see three VM types that make the most sense for Vertica. For customers that are looking to run production workloads in Azure, the Esv3 and the Lsv2 series are the two main recommendations. While they differ slightly in the CPU-to-memory ratio and the IO throughput, the Esv3 series is probably the best recommendation for a generalized Vertica node, with the Lsv2 series being recommended for workloads with higher IO requirements. If you're just looking to deploy a sandbox environment, the Dsv3 series is a very suitable choice that can really reduce your overall cloud spend. VM storage in Azure is provided by a grouping of four different types of disks, all offering different levels of performance. Introduced at the end of last year, the Ultra Disk option is the highest-performing disk type for VMs in Azure. It was designed for database workloads where high throughput and low latency are very desirable. However, the Ultra Disk option is not available in all regions yet, although that's been changing slowly since its launch. The premium SSD option, which has been around for a while and is widely available, can also offer really nice performance, especially at higher capacities. 
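A provisioning sketch along the lines just recommended, an Esv3-series size with premium SSD storage, might look like the following. The resource group, VM name, image URN, and disk sizes are placeholder assumptions; the commands are written to a file for review rather than executed.

```shell
#!/bin/sh
# Sketch: provision one Azure Vertica node with an Esv3 size,
# premium SSD data disks, and accelerated networking.
# All names/sizes are placeholders.
cat > make_azure_node.sh <<'EOF'
az vm create \
    --resource-group vertica-rg \
    --name vertica-node-1 \
    --image OpenLogic:CentOS:7_9:latest \
    --size Standard_E16s_v3 \
    --accelerated-networking true \
    --storage-sku Premium_LRS \
    --data-disk-sizes-gb 512 512
EOF
echo "wrote make_azure_node.sh"
```

Note the "s" in `E16s_v3`: that's the premium-storage-capable variant, which matters for the disk throughput discussion that follows.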
And just like with other cloud providers, the IO throughput you get on VMs is dictated not only by the size of the disk, but also by the size of the VM and its type. So a good rule of thumb here: VM types with an "s" in the name will have a much better throughput rate than ones that don't, and the larger VMs will have higher IO throughput than the smaller ones. You can expand the VM disk throughput by using multiple disks in Azure and a software RAID. This overcomes the limitations of single-disk performance, but keep in mind you're now using CPU cycles to maintain that RAID, so it is a bit of a trade-off. The other nice thing in Azure is that all their managed disks are encrypted by default on the server side, so there's really nothing you need to do here to enable that. And of course, as I mentioned earlier, there is no native access to Azure Storage yet, but it is something we're working on. We have seen folks using third-party applications to access Azure Storage as an S3 bucket, so it might be something you want to keep in mind and maybe even test out for yourself. Networking in Azure comes in two different flavors, standard and accelerated. In standard networking, the entire network stack is abstracted and virtualized. This works really well; however, there are performance limitations. Standard networking tends to top out around 4 gigabits per second. Accelerated networking in Azure is based on single-root IO virtualization (SR-IOV) of the Mellanox adapter. This is basically the VM talking directly to the physical network card in the host hardware, and it can produce network speeds up to 20 gigabits per second, so much, much faster. Keep in mind, though, that not all VM types and operating systems actually support accelerated networking, and just like disk throughput, network throughput is based on VM type and size. Now let's talk about tuning for networking in the Azure space. Again, stay close to home. Pick regions that are geographically close to your location. 
Yes, the backbones between the regions are very, very fast, but the more hops your packets have to make, the longer it takes. Azure offers two types of groupings of their VMs: availability sets and availability zones. Availability zones offer good redundancy across multiple zones, but this actually increases the node-to-node latency, so we recommend you avoid this. Availability sets, on the other hand, keep all your VMs grouped together within a single zone, but make sure that no two VMs are running on the same host hardware, for redundancy. And just like in the other clouds, UDP broadcast is not supported, so you have to use the point-to-point flag when you're creating your database to ensure that spread works properly. Spread timeout, this is a good one. Recently, Microsoft has started monthly rolling updates of their environment. What this looks like is that VMs running on top of hardware that's receiving an update can be paused, and this becomes problematic when the pausing of the VM exceeds 8 seconds, as the unpaused members of the cluster now think that the paused VM is down. So consider adjusting the spread timeout for your clusters in Azure to 30 seconds, and this will help you avoid a little of that. If you're deploying a large cluster in Azure, more than 20 nodes, use large cluster mode, as point-to-point for spread doesn't really scale well with a lot of Vertica nodes. And finally, pick VM types and operating systems that support accelerated networking. The difference in the node-to-node speeds can be very dramatic. So how do we move data around in Azure? Microsoft views data egress a little differently than the other clouds, as it classifies any data being transmitted by a VM as egress. However, it only bills for data egress that actually leaves the Azure environment. Egress speed limits in Azure are based entirely on the VM type and size, and then they're limited by your connection to them. 
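The spread timeout adjustment mentioned above can be sketched as a single vsql call. The `SET_SPREAD_OPTION` meta-function and the `TokenTimeout` option name follow the Vertica documentation, but verify them against your version before relying on this; the user name is a placeholder.

```shell
#!/bin/sh
# Sketch: raise the spread token timeout to 30 seconds (30000 ms)
# to ride out Azure maintenance pauses. Function/option names per
# the Vertica docs -- verify for your version.
cat > set_spread_timeout.sh <<'EOF'
vsql -U dbadmin -c "SELECT SET_SPREAD_OPTION('TokenTimeout', '30000');"
EOF
echo "wrote set_spread_timeout.sh"
```

Run the generated command from any cluster node as the database administrator; the new timeout applies cluster-wide.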
While not offering as many pathways to access their cloud as GCP, Azure does offer a direct network-to-network connection called ExpressRoute, available through a large group of third-party partners. ExpressRoute offers multiple tiers of performance that are based on a flat charge for inbound data and a metered charge for outbound data. And of course, you can still access the cloud via the internet, and securely through a VPN gateway. So on behalf of Jeff, Sunit, and myself, I'd like to thank you for listening to our presentation today, and we're now ready for Q&A.