Okay, so good morning everybody. I'm glad that so many people are here after yesterday's party. We are glad that we can be here today and share our success story with you. My name is Jakub Pavlik and I work for TCP Cloud, and together with Pavel Zaitz from AVG we would like to share with you today our user story of how we implemented OpenStack, and not just OpenStack, in AVG Technologies.

First, a little bit about the agenda. We would like to start by introducing ourselves: who is TCP Cloud, who is AVG. After that, Pavel will introduce the goals, where we started with the project and what the target was, and what the AVG infrastructure layers are. Next I am going to talk about our implementation phases, from proof of concept through the pilot to the production environment. After that I would like to share with you our OpenStack architecture and model-driven continuous integration, and finally Pavel will show you the best thing: how we decreased the time for deploying a staging environment. So Pavel, please introduce yourself.

Okay, hello, good morning. My name is Pavel Zaitz. You probably know AVG as a security company, well known for our antivirus product, but we do much more. In AVG I'm a team leader of one of the technology groups. We are responsible for everything that is outside of the internal network: data centers, F5 load balancers, CDN providers. The biggest part of our work is Linux server maintenance. We maintain the web platforms, the e-commerce platform, and the backend systems for end-user customers, like the license server. Because we are responsible for production, we also support the development teams, and we prepare the infrastructure for them so they are able to produce our products.

Okay, so who is TCP Cloud? We are quite a new company, but we have focused on building private clouds based on OpenStack, OpenContrail and open source technologies since 2011, and we are very active in the global community.
For example, we are one of the main contributors to OpenContrail. We have our own data center, and our key message is that we try for maximum openness: use technologies from all vendors and don't be vendor specific.

Okay, where we started: the technology department in AVG has four groups. As I said, one of them is my group; we are responsible for Linux. Then there is a Windows group, a hardware group and a network group. Adding resources to our infrastructure, or making some change in it, took us days, because each group is responsible for its own part, and doing even the common things was, and is, very hard. 100% of our virtual infrastructure was running on VMware, and we were not so happy with the API for automation, because we had to do a lot of manual tasks. We had almost no time for innovation.

Now let me briefly go through the process of deploying a staging environment for development. When development requested a new environment for testing, we got the request, and when we realized that we needed additional resources like storage, or needed network changes like creating a VLAN or firewall rules, we had to push the task to the other teams and wait until they delivered their part of the work. Then we were able to manually clone the server in the virtualization platform, apply the Puppet profile, and do a few things by hand, like getting an IP address from the IP plan and creating DNS records for the servers. When we had all these things done, we were able to hand the servers over to development so they could continue their task. All of this takes several days, because we had to wait for each other until each of the infrastructure teams delivered its work.

Okay, what were the goals? The request from the business was quite simple: speed up the time-to-market delivery of the infrastructure from my team.
It means speeding up infrastructure delivery from days to hours, which means abstracting from the physical layer, mainly in the storage and network parts. The second thing that was necessary was full, 100% automation of all the processes that we did manually. Therefore we chose OpenStack as the automation platform that is able to give us all the necessary things that were requested.

Now I would like to show you the infrastructure layers: what is necessary to have a fully automated infrastructure. As the virtualization hypervisor we use KVM, which fits best with OpenStack. For network virtualization, OpenContrail was recommended by the guys from TCP Cloud as the most powerful SDN solution. Orchestration is done by OpenStack with its APIs and Heat templates, and on top of it, it is good to have billing, because it's good to know who is consuming our resources and how much.

But this is just the basis, and it was necessary to do much more. The infrastructure has to be monitored; one of the requirements was autoscaling; and when we want to log in to a server, we need to authorize there, so we chose IPA from Red Hat. On the infrastructure, it is also good to know what's happening.
Therefore we implemented logging using Fluentd, Elasticsearch and Kibana. As server configuration management we use Puppet version 3 with Hiera, and for application configuration we are using Consul. Patch management is done by Spacewalk, and application content delivery is built on Git and Artifactory, with Bamboo as the orchestrator. When we have all these things, we can deploy virtual servers, including the infrastructure described in Heat, and each server gets its own application: Java, Tomcat, a database, PHP, whatever it needs.

Okay, so, implementation phases. We divided the project into four parts, and the first part started in the last quarter of 2014. The first step was to set up the proof of concept. As Pavel mentioned, and as was described on the slide, we had to use the existing hardware; we did not want to invest any more into another storage system and things like that. So we needed to prove that we could use the existing Hitachi VSP storages. AVG also required the use of live migration inside the cloud, and we had to test Load Balancer as a Service together with the F5 load balancers, plus automatic DNS registration. I will talk about the details in the further slides.

When we set up this environment and proved these things, we started with the pilot. The goal of the pilot was to take one of the AVG applications, decompose it into parts, prepare it for configuration management and automatic Heat deployment, test it in the staging environment, and measure the effectiveness of all this. Now we are in the second quarter of 2015.
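The "infrastructure described in Heat" mentioned above boils down to declarative templates. As a rough illustration only (this is not AVG's actual template; the flavor, image and network names are invented), a minimal single-server HOT structure can be sketched as a Python dict and printed as JSON for display:

```python
import json

# Minimal sketch of a Heat (HOT) template: one server on one tenant
# network. All names below are illustrative assumptions.
template = {
    "heat_template_version": "2013-05-23",
    "description": "One app server attached to a tenant network",
    "resources": {
        "app_server": {
            "type": "OS::Nova::Server",
            "properties": {
                "flavor": "m1.medium",        # assumed flavor name
                "image": "ubuntu-14.04",      # assumed Glance image
                "networks": [{"network": "app-net"}],  # assumed network
            },
        },
    },
    "outputs": {
        # expose the server's address once the stack is created
        "server_ip": {"value": {"get_attr": ["app_server", "first_address"]}},
    },
}

print(json.dumps(template, indent=2))
```

The real templates in the talk go much further (volumes, security policies, routing policies), but they follow this same declarative shape.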
Now we have a production environment in two data centers, and we developed the model-driven deployment automation approach which I will show you in the next slides. The future goal is to add the next data center and scale up to 300 compute nodes.

Okay, so the first question when we started was the Neutron SDN solution. As you know, clouds are all about networking; it is the most crucial key component inside the cloud, and as you can see at this summit there are lots of sessions about networking and how to do it. There exist many possible solutions, and it's very difficult to decide which is best. The key things you need from the networking are high availability, scalability, live migration, multi-tenancy, and things like that, and there are also several buzzwords like Load Balancing as a Service, Firewall as a Service, and service chaining.

In AVG we had to decide between four solutions. The first idea was vanilla Neutron, I mean Open vSwitch and the L3 agent, the standard implementation as it was in OpenStack Icehouse and Juno. As you know, there are a lot of problems there: high availability, how to scale it, how to provide advanced functions like service chaining, and the bandwidth is not good enough in that solution either. So we left this solution.

The second one was Cisco APIC. Cisco APIC is more slideware than reality, and as you can see here as well, when you come to the Cisco booth and discuss it with them, "please show me Cisco APIC," maybe they answer that there is one guy who knows something, but you've never seen him.
So this was a very difficult problem, and when you want to deploy a cloud today, you need a solution that exists, not a solution that will be available in two years.

VMware NSX was a different story, because as Pavel mentioned, at AVG everything, all the infrastructure, was on VMware, and there is still a part of the infrastructure which is on VMware. But there were two points. The first point is the licensing model, and the second point is VMware itself, because you never know when they will change strategy and lock something down, and decide that multi-hypervisor must die, or something like this. And as I mentioned, we are an open source company and we wanted to build an open solution based on OpenStack.

So we decided on Juniper OpenContrail, and I can give you exactly five arguments why OpenContrail best fits our requirements. The first one is licensing. It is a completely, fully open source solution, without any limitation, with the possibility to buy commercial support. What does that mean? You can build the whole environment, scale the environment, and when you decide that you really need support and you want to go to production, you buy the commercial support from Juniper.
The second thing is high availability. High availability is natively supported in OpenContrail, and it uses standard protocols which have existed for many years in standard network boxes.

The next key criterion is cloud gateway routing, and this is very important, because OpenContrail is one of maybe two solutions which are able to provide gateway routing on the network boxes. Routers were designed for routing; routing on a Linux machine doesn't make sense. You need to scale, you need to provide the bandwidth, and things like that, so it's not possible to route all the traffic of a whole cloud on one Linux server or something like this.

This is very tightly connected with performance, because we are now able to get 9.6 gigabits on a 10-gigabit line out of the cloud and inside the cloud. But it is not only about bandwidth; it is also about packets per second, and the performance of OpenContrail is amazing; the vRouter is a much better piece of software than Open vSwitch.

The next point was the interconnection between the SDN and the fabric. There are not so many SDN solutions that can provide this very easily, and the requirement was: how can you connect the intranet network into the software-defined network? With this solution you are able to connect the underlay world with your overlay world very easily, and we really needed it, for example, for some bare-metal servers or for external physical firewalls.

The last thing is physical F5 integration. All the solutions tell you that they have integration for F5, but they have it just in the graphical user interface, just for clicking, and we want to automate it. We don't want to click buttons.
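To make "automating instead of clicking" concrete, here is a sketch of a load balancer declared as resources rather than configured in a GUI. It uses the generic Neutron LBaaS v1 resource types that Heat shipped in that era, not OpenContrail's F5-specific plugin, and the subnet, address and port values are invented for the example:

```python
# Illustrative sketch: a pool, a health monitor and one member, declared
# declaratively. All IDs, addresses and ports are invented assumptions.
lb_resources = {
    "web_monitor": {
        "type": "OS::Neutron::HealthMonitor",
        "properties": {"type": "HTTP", "delay": 5,
                       "max_retries": 3, "timeout": 10},
    },
    "web_pool": {
        "type": "OS::Neutron::Pool",
        "properties": {
            "protocol": "HTTP",
            "lb_method": "ROUND_ROBIN",
            "subnet_id": "app-subnet",          # invented subnet name
            "monitors": [{"get_resource": "web_monitor"}],
            "vip": {"protocol_port": 80},       # clients hit the VIP
        },
    },
    "member1": {
        "type": "OS::Neutron::PoolMember",
        "properties": {
            "pool_id": {"get_resource": "web_pool"},
            "address": "10.0.0.11",             # invented backend IP
            "protocol_port": 80,
        },
    },
}

print(sorted(lb_resources))
```

The appeal of this approach is that the load balancer lives in the same reviewable template as the servers it fronts, so the whole environment is created, updated and deleted as one unit.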
So the idea is to use the F5 integration through Heat resources and describe the whole infrastructure, including the physical F5s, in Heat. This is now in the beta release of OpenContrail.

So what were our findings after the POC? We needed live migration, and for live migration we decided that our production instances must be booted from volume. For this purpose we used the existing Hitachi VSP storage over Fibre Channel. So all instances in production are booted from volume, and disks are mapped as raw devices into the instances.

The second thing is automatic DNS registration. With OpenContrail we don't need to develop some integration that registers something into Microsoft Active Directory after provisioning, or things like that, because Contrail natively supports automatic creation of domain records.

For Glance images, we decided to put them on NFS, the existing EMC NFS storage. Maybe in the future we would like to move them into Swift, because Swift is on our internal roadmap to implement in AVG. And the last thing was orchestration, where we found the approach which suits us: use Heat for the creation of virtual resources, so describe the infrastructure in Heat templates, provision it, then register it into configuration management and deploy the configuration into the application.

So, to conclude: for Nova and the KVM hypervisors, we decided on Ubuntu. It has several improvements for us, for example in the kernel, and it serves us better than CentOS. For Cinder we use the Hitachi storage driver. For the Neutron SDN solution, OpenContrail. For configuration management we use SaltStack for provisioning the underlay infrastructure, like compute nodes, controllers and databases; Puppet already existed in AVG before we started, for the application deployment side. For monitoring and billing we use our TCP Cloud solution,
because we are not just deployers and integrators; we also use our own monitoring system, which is based on the Sensu monitoring framework, and we have also developed our own billing application, which we include in our cloud deployments.

This picture covers our OpenStack architecture. I could speak about each slide for more than 30 minutes, so I will try to describe it very briefly. This is how one data center looks. In each data center, on top, we have two Juniper MXs, where we created two types of VRF routing instances in which we terminate our SDN world. The first one is INET, which provides public IP addresses, the standard floating IPs as you know them from OpenStack, and the rest of the VRFs are different demilitarized zones inside the AVG intranet, to be able to provide direct access to instances inside the cloud.

We separated the MySQL Galera cluster from OpenStack into three separate virtual machines. We also created the OpenStack controller stack: all the OpenStack APIs run there in high availability, with RabbitMQ, and keepalived with HAProxy proxying all these APIs. We run three virtual machines with a highly available setup for OpenContrail. Then there is a proxy, because we proxy our APIs: we don't put the APIs on HTTPS/SSL directly, we use the proxy for that. We separated the Ceilometer MongoDB for metering, and we store the data in our Graphite database inside the billing system. The blue one is our master node, the Salt master node, which deploys all the configuration. At the bottom is the NFS from EMC; it is the Glance image repository.
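The keepalived-plus-HAProxy layer described above boils down to a virtual IP fronting the same API port on each controller. A small generator sketches the kind of HAProxy stanza involved; the VIP, controller names and addresses are invented for illustration, not AVG's real addressing:

```python
# Illustrative sketch of generating an HAProxy stanza that load-balances
# one OpenStack API (Keystone here) across three controllers behind a
# keepalived-managed VIP. All addresses below are invented.
def haproxy_backend(name, vip, port, servers):
    lines = [
        "listen %s" % name,
        "    bind %s:%d" % (vip, port),     # the keepalived VIP
        "    balance roundrobin",
    ]
    for host, addr in servers:
        # 'check' makes HAProxy health-check each API backend
        lines.append("    server %s %s:%d check" % (host, addr, port))
    return "\n".join(lines)

stanza = haproxy_backend(
    "keystone_public", "172.16.10.254", 5000,
    [("ctl01", "172.16.10.101"),
     ("ctl02", "172.16.10.102"),
     ("ctl03", "172.16.10.103")],
)
print(stanza)
```

In a real deployment there is one such stanza per API endpoint (Nova, Glance, Heat, and so on), which is exactly the sort of repetitive configuration the talk's Salt-driven model generates rather than hand-writes.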
On the compute nodes we have just one network, which we call the cloud underlay network, and we are using MPLS over GRE encapsulation. Each server has 10-gigabit ports with bonding.

Okay, so what did we create that is more important than just the platform? Because OpenStack is just a platform; you also need to prepare the processes around it. OpenStack by itself is not enough for your company if you don't have a continuous integration and delivery system around it. If I start at the bottom: there is a version control system, for which we are using GitLab, and it holds all the Heat templates, all the job definitions for Jenkins, the formulas, and the two databases for our configuration management system. Orchestration of resources is done by OpenStack Heat, so Heat prepares all the virtual resources, including IP addresses, disks, security policies and routing policies: everything we want, we deploy through Heat. Heat is of course managed by a continuous integration tool, which in our case is Jenkins. When the Heat resources are prepared and deployed, the configuration management takes over; for the application side, Puppet, and now also Ansible, is used. As you can see, the process is that AVG has a development environment, a QA/staging environment and a production environment, and this is the way applications are produced inside the company.

We are finishing now, so this is how the AVG dashboard looks. It's pretty nice and I like it. It's the standard Horizon dashboard; we just branded it, adapted the design, and put the manual on top. We also added our internal monitoring and billing. So we are using Horizon for everything and integrate everything inside it, not hard-integrated, but similar to other OpenStack projects, through the API. This is much better than having some external dashboard or things like that.
So, one pane of glass for everything. And now I give the word to Pavel to explain the best part.

As I said at the beginning, deployment of the infrastructure took us days. In the first step we had to manually deploy, for example, the F5 load balancer configuration, firewalls, networks and storage. Then we were able to deploy the servers and the application content onto them, and prepare the deploy scripts for the application; for deploying a new version of the application, this again took days. When we had these two steps done, we could start the basic tests to see whether the infrastructure was really working, and that again took days. In total, the creation of, for example, a staging environment took us 10 days, because there were a lot of delays waiting for each team.

Now, we choose a template, apply it, the automatic infrastructure deployment starts, and we just wait. When the infrastructure is deployed, the application deployment starts; how long that takes depends on how big the database and the project are. When all these things are done, we automatically start the tests, and if all tests are green, the infrastructure is deployed. This whole process takes about half an hour. From 10 days to half an hour, that's a big improvement, and it's nice. Therefore our development teams are very satisfied, because every two weeks they can have a new infrastructure, prepared on one click, for the new development. So we fulfilled our goals.
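The automated flow Pavel describes, template, then infrastructure, then application, then tests, can be sketched as plain stages. Only the ordering mirrors the talk; the function bodies are placeholders, not the real tooling:

```python
# Sketch of the automated staging flow. The stage order mirrors the talk;
# the bodies are stand-ins for Jenkins/Heat, Puppet/Salt, and the test jobs.
def deploy_infrastructure(template):
    # real setup: Jenkins drives Heat stack creation from the template
    return {"stack": template, "status": "CREATE_COMPLETE"}

def deploy_application(stack):
    # real setup: Puppet/Salt apply profiles, app content is shipped
    return dict(stack, app="deployed")

def run_tests(env):
    # real setup: automated smoke tests gate the hand-over to developers
    return env["status"] == "CREATE_COMPLETE" and env["app"] == "deployed"

def staging_pipeline(template):
    stack = deploy_infrastructure(template)
    env = deploy_application(stack)
    return "ready" if run_tests(env) else "failed"

print(staging_pipeline("staging-env.yaml"))  # -> ready
```

The point is that a failure at any stage stops the pipeline before hand-over, which is what lets a half-hour run replace days of manual coordination between teams.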
Okay, now I have to thank the guys from TCP Cloud for their great job, and I have to thank my team as well, because they also did a great job. Thank you for attending. It's time for your questions.

Yeah, the question is what our plan is, because the deployment is very agile and we are using legacy-style Fibre Channel Hitachi storage, so what is our plan for the storage? Probably we will stay with this solution, because right now we are using many types of storage, EMC, Dell, whatever, and all these storages are virtualized behind the Hitachi, and this model gives us the ability to get the power of the biggest Hitachi storage we have. As I mentioned, for the application content, instead of NFS we plan to use Swift as a distributed object store, because we need to expand NFS across all our data centers over the world, and with NFS that's not possible; therefore Swift is probably the solution. Also, the developers work with Amazon S3, so we need to offer them an alternative for that. So the plan is to implement Swift object storage.

Yeah, one thing is creating the environment, which takes some time. When you have it, and you are a developer and you produce a new version of your code, it is possible to just update the application content, so with each commit you don't need to wait those minutes for new infrastructure; you just commit and run the job to apply the change. In agile development this is the two-week iteration: we create a new environment with every sprint, and during the sprint the environment remains the same.

Yeah, it is SDN, so there are no VLANs.
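Since the overlay rides MPLS over GRE, as mentioned earlier for the cloud underlay network, every packet pays a fixed header tax. A back-of-envelope sketch, assuming standard minimum header sizes and, purely for illustration, a 1500-byte underlay MTU:

```python
# Back-of-envelope MPLS-over-GRE overhead on the underlay network.
# Header sizes are the standard minimums; the 1500-byte underlay MTU
# is an assumption for illustration, not a value from the talk.
OUTER_IPV4 = 20   # outer IPv4 header, bytes
GRE_HEADER = 4    # basic GRE header, bytes
MPLS_LABEL = 4    # one MPLS label, bytes

overhead = OUTER_IPV4 + GRE_HEADER + MPLS_LABEL
underlay_mtu = 1500
instance_mtu = underlay_mtu - overhead

print("overlay overhead: %d bytes" % overhead)        # 28 bytes
print("usable instance MTU: %d bytes" % instance_mtu)  # 1472 bytes
```

In practice deployments typically raise the underlay MTU (jumbo frames) rather than shrink the instance MTU, so instances can keep a standard 1500-byte MTU.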
We are using MPLS over GRE encapsulation. Basically, OpenContrail uses the vRouter kernel module in the hypervisors instead of the standard Open vSwitch, and each network is actually a VRF routing instance inside the hypervisor. Between all compute nodes and the gateways you have GRE tunnels, and your traffic is tagged with MPLS labels inside the GRE tunnels. So it is a completely overlay world: we have just one transport VLAN between the compute nodes, and inside it are all these virtual networks.

Sorry, I didn't catch that... Yes, NSX uses OVS, Open vSwitch. No, we don't want a combination; we want to have OpenContrail across the whole infrastructure.

Yes, it's based on virtual machines. Part of the virtual machines is on standard KVM, and part is on the existing ESXi infrastructure, but everything is virtual machines. The only physical machines are the Cinder controllers, because we are using Fibre Channel, so you need to be able to create the volumes for booting: you have a Fibre Channel card, and that cannot be virtual. But the rest is completely virtualized.

No, this is the big advantage, because it is a vendor-open solution, and we have several different deployments where we use, for example, Cisco routers. You don't need Juniper MX boxes for OpenContrail; you need devices which support iBGP, MPLS over GRE, and VRF functions. It doesn't depend on the vendor. But if you want to use the F5 integration, I mean the Load Balancing as a Service function which we would like to implement, you need the MX, because it is proven and tested with automatic management of MX routers.

Okay, so if there are no more questions, thank you for your attention.