for almost two and a half years. I work on networking modules and plugins to make sure network automation works. And by network automation I don't mean Linux network automation, but automation of actual networking devices like routers, switches, and so on. Before I start, is there anyone here who is a network engineer or operator, or who has ever used a network device or a virtual device? You have? OK. Great.

So let's start with why automation is necessary for networks. Managing networks hasn't really changed in the last few decades. If you look at what network engineers are used to, it is doing things manually; they are not really much into programming. And since the network is very critical, they follow legacy operational practices. By legacy operational practices, I mean that they SSH into the device, make changes there manually, validate them manually, and then log out of the device. This is the practice they have been following for decades.

Next, vendor-specific implementation. All these networking vendors have implemented their technologies in ways that are very specific to themselves. Say I'm a network operator working for a network organization, and I have a Cisco IOS router, a Juniper switch, and an F5 load balancer. When I'm configuring them, the way each vendor has implemented its technology is very different. So I have to have a very domain-specific skill set: I have to be a Cisco router expert, or an F5 expert, or something like that. Also, there is no standard way to abstract the different vendors that would make network operators' lives easy, which is why siloed organizations have arisen and proprietary platforms have developed.

And there is no source of truth and no version control system (VCS) at all. So let me give an example here.
Say I came to work today and configured my network device. I changed the entire configuration of the device, did everything manually, and left for the day. That's what we have been following, right? Now say I fall sick tomorrow, and one of my colleagues comes in and has to do something on the same device, but he is not aware of what has been changed or what the current state of the device is. Of course he can check manually, but when it comes to more than one device, even 20 devices, which is very few when you're maintaining an infrastructure, it is very painful for him to manually check what has been changed in every configuration and then push another configuration based on that. So there is no source of truth and no version control.

Another important point is configuration drift. Configuration drift means that you are configuring something on a device and, at the same time, someone else is doing the same thing on the same device. Say I'm configuring my router ID, and each router can have only one ID, right? Another person is doing the same thing. So either my task or his task is going to fail, which means I did not have any configuration state to validate whether I should proceed with setting the router ID or not. That makes the chances of error very high. And of course, since we are human, we are prone to error. Considering all these points, when it comes to maintaining more than 15,000 devices, this whole workflow and practice is not at all scalable or maintainable.

Now, if you need another reason, this is a survey Gartner Research published in "Look Beyond Network Vendors for Network Innovation" in January of last year. According to this survey, almost 71% of people still SSH manually into network devices to configure them, and automation is in fourth place, at only 6%. So this is the current state of network management.
Now that you know why we need network automation, the automation tool that I'm going to talk about is Ansible. I'm pretty sure everyone here knows about Ansible, but still, a brief introduction. It was started in 2012 and acquired by Red Hat in 2015. Currently we have 3,700 contributors, five lakh downloads per month, and 2,100-plus modules. Ansible is very simple. It uses YAML, which is a very English-like language, to maintain your entire infrastructure or workflow. And despite being simple, it's very powerful. You can do almost anything with it: deploy applications, orchestrate your workflow, configure your devices, and automate your daily jobs. Another important point is that it's agentless. It does not require an agent on the managed host; it is just control node to managed host communication. Currently we have support for cloud, containers, databases, files, messaging, monitoring, networking, notification, system, and test utilities. All of these you can currently automate with Ansible.

Now, coming to the point: why Ansible for network automation? Yeah, here it is. Since Ansible is very easy, and network engineers don't like getting into programming languages, they always look for an easy tool, something that won't make their life more difficult. With Ansible they can get started very easily, probably in two days. The other point I talked about earlier is vendor-specific implementation. With Ansible, you can integrate multiple vendor configurations, and Ansible also gives you the ability to abstract multiple vendors. So you don't have to care about how the vendor has implemented things under the hood. You just care about what you want to push to the device. And you can manage state with Ansible, which means you can track each and every state of the configuration of your device.
So before pushing any configuration, you can fetch the current state and check whether you need to push it or not. You can also make changes not only on one device, but across any number of network devices. And even after making changes, you can validate whether they are there or not. For network engineers, it's very necessary to know whether the configuration is actually there after pushing it, because it's very important information: they are configuring BGP, OSPF, SNMP, and all that. That's why they do all of this manually, because they want to validate it, and Ansible lets them validate it as well.

And scale with AWX. You might have heard of Tower; AWX is the upstream open source version of it. With Tower, you can manage and build dynamic inventory. Tower is, you can say, kind of the UI of Ansible, but it's not only that. You can do more with it: you can have role-based access for users or teams, you can create a diff between your current configuration and what you want to push based on a scheduled job, and you can also integrate RESTful APIs and other third-party APIs with Tower.

Let's talk about some common use cases that a network operator might handle on a day-to-day basis. Since there are very few network operators or device users here, I'll talk a little bit about how it works. When you boot up a network VM from an image, it comes with some default configuration, which is called the startup configuration. And what you actually push to the device is called the running configuration; that is the configuration that the device is running. So there are usually two kinds of configuration: running config and startup config. As a user, network engineer, or operator, I want to know the current state of the devices.
So I should be able to back up my device, where backup means fetching the current config of the device, which can be either the running configuration or the startup one, and I should be able to restore the device configuration at any point in time. Say I have done something wrong that might affect the network: as a network operator, I should be able to roll back to the previous state. Then, upgrading network devices. Just as you can upgrade a Linux operating system, we can upgrade physical networking devices, so I should be able to upgrade from one version to another. The next point is ensuring configuration compliance. This is similar to what I was saying earlier: validating whether the configuration is there or not. Dynamic documentation means I should be able to fetch the config or the state of my device at any point in time, or at the runtime of the playbook. This documentation is the source of truth for your entire network. That is what I meant here. And other than these use cases, like pushing everything as a whole or fetching everything as a whole, I should be able to perform some discrete tasks. For Linux, that would be something like just configuring an HTTP server. For a network, it's something like configuring just a VLAN or managing just the firewall ACL entries. So I should be able to do that.

Now, how is Ansible actually helping with all these use cases? Take backup and restore of configuration: since Ansible is infrastructure as YAML, you can get the entire state of your network device in either JSON or text format. The next part is configuration management. You can make an incremental change and check or compare the difference between your change and the prior state of your network device. And then I should be able to check and validate whether something is configured or not. That is what you can do with Ansible.
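As a rough sketch of the discrete-task and incremental-change ideas above, a playbook along these lines could make sure a single VLAN is named as expected and then report which commands were sent. The group name nexus_switches, the VLAN ID, and the VLAN name are hypothetical, and using the nxos_config module for this is an assumption made for illustration, not something shown in the talk.

```yaml
---
- name: Make one discrete change and report what was pushed
  hosts: nexus_switches          # hypothetical inventory group
  connection: network_cli
  gather_facts: no
  vars:
    ansible_network_os: nxos
  tasks:
    - name: Ensure VLAN 100 has the expected name
      nxos_config:
        parents: vlan 100        # hypothetical VLAN ID
        lines:
          - name web             # hypothetical VLAN name
      register: vlan_change

    # "commands" is only present when something differs from the running config
    - name: Show the commands that were sent (or would be sent in check mode)
      debug:
        msg: "{{ vlan_change.commands | default([]) }}"
```

Running a playbook like this with ansible-playbook --check should report the pending commands without pushing them, which is one way to compare an intended change against the prior state before committing to it.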
And this is another survey, by Network to Code, which is a networking company. They did it in November 2016. The survey was about the tools people have already deployed or are interested in. Ansible ranked second, and its "already used in production" share is much higher than that of any other tool.

Now, the difference in how network automation works. I'll actually demo it after the talk. For Linux or Windows, when you run Ansible on your control node, it copies your module code to the remote device, uses the interpreter on that device (Python, in the Linux case) to run it, and returns the output to stdout. For networking it's a little different, because a network device looks like a Linux environment but actually is not: it does not have a Python interpreter, or some devices do have one, but to use it you need to change the shell mode. That's why we run the modules locally on the control node and just fetch the configuration from, or push it to, the device.

Okay, so this is a little bit about the Ansible parts, like what inventories and modules are. The inventory is the entry point for your hosts; this is where you list all the hosts. The hosts can be Linux, Windows, containers, routers, switches, et cetera. Playbooks, as you all know, are written in YAML. Each playbook can contain multiple plays, and each play can contain multiple tasks, where a task is what invokes a module. Modules are mostly written in Python, but can be written in any language, and they are what does the actual work. Plugins are what augment Ansible's core functionality. An action plugin is kind of the front end of a module: if there is anything that needs to be done on the control node, we write an action plugin for the module. There are connection plugins as well, and for networking we have several of them. So if you want to connect over HTTP, you can just use the httpapi connection plugin, and so on.
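To make the inventory and connection plugin ideas concrete, here is a minimal sketch of a YAML inventory in which each group of devices selects its own connection plugin. The host and group names are hypothetical; network_cli, netconf, and httpapi are the names of Ansible's network connection plugins, and the pairing of platforms with plugins here is only an illustration.

```yaml
all:
  children:
    ios_routers:                       # hypothetical group
      hosts:
        rtr1.example.com:              # hypothetical host
      vars:
        ansible_connection: network_cli   # CLI over SSH
        ansible_network_os: ios
    junos_switches:                    # hypothetical group
      hosts:
        sw1.example.com:               # hypothetical host
      vars:
        ansible_connection: netconf       # XML RPCs over SSH
        ansible_network_os: junos
    nexus_switches:                    # hypothetical group
      hosts:
        nx1.example.com:               # hypothetical host
      vars:
        ansible_connection: httpapi       # HTTP(S)-based device API
        ansible_network_os: nxos
        ansible_httpapi_use_ssl: yes
```

Keeping the connection settings in the inventory means the same playbook can target different groups without each task having to know how the device is reached.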
There are other kinds of plugins too, like filter plugins, with which you can filter the data that you are getting or want to push. Currently we support 65-plus networking platforms, a thousand-plus network modules, and 50 Galaxy roles. We started with version 2.1, with seven platforms and 28 modules. In May of this year we released 2.8, and now we have 65 network platforms and 1,098 modules. And these are the networking platforms that we have enabled so far. You'll find the most commonly used ones, like Cisco, Juniper, and all that.

Since you're mostly, I guess, into Linux automation, you don't really have to mention which connection plugin you are using in your playbook, because it uses SSH by default. For networks, though, you get a shell environment when you connect to a device, called the CLI shell environment. So we have written a connection plugin for it, network_cli. It connects via SSH and then works with the CLI shell. The other connection plugin we have is netconf. It's XML-based: it sends and receives RPCs over SSH, using the NETCONF subsystem. And we also have HTTP for network devices. Cisco Nexus, for example, supports an HTTP-based API called NX-API. Same for Arista EOS, whose HTTP-based API is eAPI. For those we have written httpapi, which talks over HTTPS or HTTP.

Okay, now I'm moving to the demo part. I'm going to connect to a Cisco Nexus switch. It's a 9K switch. Just a little bit of information from the show version output: it's a Cisco Nexus switch and the image is 7.0.3173. And this is my playbook. At the top I'm connecting to the host, and I have mentioned which connection plugin I want to use. I'm going to gather some facts from the device. And the first task I'm going to perform is a backup of the device, which gets the running config of the device. If I run show running-config, this is how you can actually see what config the device is running.
If you do this, you get the entire state or config of your device. So I'm going to take a backup of this using the task. First I'm gathering the facts. I did not gather all the facts because that is going to take some time. These are some default facts we have, which gather the name of the device, which platform it is, what version it's running, and what API it's implemented in. For CLI, it's cliconf. And now I have just backed up the device. It has created a folder in my directory called backup. If I go to the backup folder, I can see a file name with a timestamp. If I open it, I see the same configuration that my network device has. So this is what I was talking about. And I can also push this to the device at any point in time.

Okay, next one. The next task I'm going to run validates the output of the show version command, because I was saying that I want to be able to validate something. What I'm doing here is running the show version command and testing whether this device is NX-OS or not. Okay. If you remember, I ran the show version command earlier, and it returned the same thing that the playbook just got. So, which is this? It's the same thing. And this is the second task that we just ran.

Okay, now I'm going to the device again. In the next task I'm going to check the state of my Ethernet1/3 interface and then configure it: I want to disable the interface, so I'm going to shut it down. Then I'm going to verify it again in the following task. Before that, I just want to show you what it looks like when I execute the command show interface ethernet 1/3. Currently the state of the interface is up. Let's do it. So it's gathering the facts, now it has returned the show response, now it's disabling the interface, and now it's verifying the state.
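The demo tasks described so far could be sketched roughly as the playbook below. This is a reconstruction under assumptions, not the speaker's actual playbook: the group name nexus_switches is hypothetical, fact gathering is left out for brevity, and the choice of nxos_config and nxos_command for each step is a guess at one reasonable way to do it.

```yaml
---
- name: Back up, validate, and reconfigure a Nexus switch
  hosts: nexus_switches          # hypothetical inventory group
  connection: network_cli
  gather_facts: no               # the demo gathers default facts; omitted here
  vars:
    ansible_network_os: nxos
  tasks:
    - name: Back up the running config to ./backup on the control node
      nxos_config:
        backup: yes

    - name: Run show version and check that the device reports NX-OS
      nxos_command:
        commands:
          - show version
        wait_for:
          - result[0] contains NX-OS

    - name: Capture the current state of Ethernet1/3
      nxos_command:
        commands:
          - show interface ethernet1/3
      register: before

    - name: Disable Ethernet1/3
      nxos_config:
        parents: interface Ethernet1/3
        lines:
          - shutdown

    - name: Verify that the interface is now administratively down
      nxos_command:
        commands:
          - show interface ethernet1/3
        wait_for:
          - 'result[0] contains "admin state is down"'
```

The wait_for conditions make the play fail if the expected text never appears in the command output, which is how a validation step can be expressed without a person reading the output.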
You can see it's the same output that you see on the left-hand side. Let me actually switch to that side. So yeah, admin state is up. As you can see, this is it on the left-hand side, and on the right-hand side you can also see that the admin state is up. Then, in the next task, which is this one, I shut the interface down. Then I ran the command again, which returned admin state is down, and here I verified its operational state. Now let's check the same thing on my device, whether it has actually been shut down or not. And it has been shut down, so admin state is down here as well. So we have configured something and then verified it.

Oops. This is the same playbook that I had in my demo. Gather facts is something that we added for network devices; previously, we did not have support for gathering facts. So in 2.8, we have added this. You can gather any kind of facts from the device: that can be interface, VLAN, or other information or configuration. So this is the same output that we saw while gathering the facts. And the other part is that, just like on Linux, on networking VMs we can also switch to privileged mode. There the mode is called enable, rather than sudo as on Linux. So you can also set become to true and switch to enable mode using the enable become method.

All right. If you want to join our community, this is our Slack: ansiblenetwork.slack.com. The developers usually hang around in the #devel channel, and if you have user-specific questions, we have the #general channel in the same Slack org. On IRC, the channel we are on is #ansible-network on Freenode. This is the Google Group we have for users, this is the ansible-devel Google Groups list, and the last one is the announcement list, which is also a Google Groups list. That is where we announce releases.
And this is the page where you can get started with network automation, which is ansible.com/networking. If you don't know Ansible, you can get started with the second link. On the third link, we have some network automation tutorials and playbook examples. If you want to use it, or at least check how it works or how to get started, that is a good place. We also have webinars and resources at the ansible.com/resources link. And this is my info. I think that's it. Questions? Any questions?

Hello. My question is about the rollback part on a network device. I have thousands of pieces of network gear, and I'm really concerned. If I automate like this and one of my backbone devices goes down, that will have a really big impact for me. So if I lose connectivity from the Ansible console to the router, what precautions can we take, or how can we avoid this condition? Or if I lose control of the router or the remote switch, what can we do?

Okay. So, like the backup task I had, there is also an option with which you can set the file name, that is, which file name you want to save the backup to. If you don't set it, by default it uses your inventory hostname and a timestamp. And if you have, say, multiple hosts, you can use the inventory_hostname magic variable.
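A minimal sketch of what this answer describes, assuming the nxos_config module and its backup_options parameter (available from Ansible 2.8); the group name, directory, and file extension are hypothetical choices:

```yaml
---
- name: Back up each device to a predictable file name
  hosts: nexus_switches          # hypothetical inventory group
  connection: network_cli
  gather_facts: no
  vars:
    ansible_network_os: nxos
  tasks:
    - name: Save the running config as <hostname>.cfg
      nxos_config:
        backup: yes
        backup_options:
          dir_path: ./backup                         # hypothetical directory
          filename: "{{ inventory_hostname }}.cfg"   # one file per host
```

Because inventory_hostname is evaluated per host, each device gets its own backup file even when the play runs against many hosts at once.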