Okay, I think it's about time to get started, so let me welcome you. Thanks for joining us today. I'm Cliff Lynch, the director of the Coalition for Networked Information, and you have joined us for one of the project briefing sessions that are part of the final week of the Spring 2020 CNI virtual meeting. We'll hear three speakers today. I'll introduce the topic in just a moment, and when they're done we will take some questions. I would just note that there is a Q&A tool at the bottom of your screen, and you can use that to enter questions at any point. We'll respond to them all at the end, but there's certainly no need to wait until the end to put your questions in. I would also note that there is a chat window over on the side, and we will be putting out a couple of URLs during the presentation that may be helpful for some of you. Finally, I'll just say that Diane Goldenberg-Hart from CNI will be moderating the Q&A when we get to the end of the presentation.

Now just a quick word about the presentation, which I'm going to be very interested to listen to, and I think you are too. One of the great disagreements among CNI member institutions, and we've explored some aspects of this in executive roundtables in recent years, has been: what's the right role of the cloud? Do you go all cloud? Do you strike a balance between local and cloud? What's best situated where? What are the pros and cons? This has been a really lively thematic discussion.
We're going to hear, I think, from one of the institutions that has been more aggressive than many in taking a cloud-first strategy, and I think we can all learn from their experiences. So I'm really grateful to Jean Phillips, Favenzio Calvo, and Louis Brooks for sharing that experience and that thinking with us. I think this is going to be a very timely presentation as well, because in the aftermath of what's happened over the past few months with the pandemic, we are going to see some institutions reassessing how aggressively they want to be moving services into the cloud and shifting some of the operational facilities risk to vendors. So with that, it just remains for me to thank our presenters, thank you for joining us, and say: over to you, Jean.

All right, thank you, Cliff. So I am Jean Phillips, and we are here to share our experience in moving to AWS: three years in the cloud. Next slide. I'm the associate dean for technology and digital scholarship at Florida State University Libraries. Louis Brooks is the head of systems and many other things. Favenzio Calvo is the director of software development. Next.

So, in the beginning. The beginning is when I came to FSU Libraries in 2013. Okay, our computer room does not look like that, but I think you get the idea. We had an environment where we had a computer room in the libraries, and we had started using virtual machines, which the campus supports, for our critical servers. This was a very expensive solution for us, especially as we needed more and more storage. It also had some problems with reliability and lacked a failover option. Our computer room itself was not optimal. We had AC problems. We had power problems. It really was an office that had been turned into a computer room, and we had a growing number of services we wanted to offer.
We had just recently added programmers to staff, and we were able to add sysadmins, so we had some capabilities in staffing, and now we wanted to offer more services. One of the biggest areas that we saw we needed to expand into was digital libraries and preservation. Next. So that was the atmosphere we were working in. This is our traditional setup of servers. Our two critical services were our EZproxy server for off-campus access to e-resources (no surprise there, and it's even more important today) and our web server. We see the web server as the front door to our virtual library services, in addition to many other things. Next.

Okay, so what we did was start looking at the cloud, and I have to say that our first sub-bullet point, storage cost, is probably the thing that inspired us the most. We decided that there were many advantages to the cloud, but two cents a gig versus four dollars a gig a month for campus IT, you just couldn't beat that. Campus IT could not compete. They also didn't have very flexible pricing: it was four dollars a gig no matter what. We looked at it and decided it would cut our costs by 50 percent compared with the campus hosting service. In addition, we'd have more reliability. We wouldn't be dealing with hardware or networking problems. We would be able to have our servers and our data in multiple data centers at no extra charge; it would have doubled our cost to do that in the campus IT virtual machine environment. We'd have better monitoring of the services in the cloud, and of course we wouldn't be dealing with power or climate issues. We also had never really been able to consider disaster recovery or automatic failover for continuity of operations, because of the cost and because it wasn't offered by our campus, and now the cloud had these abilities. We were able to spin up servers pretty darn quickly.
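
To make the storage comparison concrete, here is a minimal sketch of the arithmetic using the two per-gigabyte monthly rates quoted above. The 14,000 GB volume is an assumption chosen only for illustration, not a figure from the campus quote, and the sketch covers the flat storage rate alone: real cloud bills add request and data transfer charges, and the overall hosting saving the speakers cite is a separate number.

```python
# Rates quoted in the talk, in cents per gigabyte per month.
CLOUD_CENTS_PER_GB = 2      # "two cents a gig"
CAMPUS_CENTS_PER_GB = 400   # "four dollars a gig"


def monthly_cost_dollars(gigabytes: int, cents_per_gb: int) -> float:
    """Return the monthly storage charge in dollars for a flat per-GB rate."""
    return gigabytes * cents_per_gb / 100


if __name__ == "__main__":
    volume_gb = 14_000  # hypothetical volume, for illustration only
    cloud = monthly_cost_dollars(volume_gb, CLOUD_CENTS_PER_GB)
    campus = monthly_cost_dollars(volume_gb, CAMPUS_CENTS_PER_GB)
    print(f"cloud:  ${cloud:,.2f} per month")
    print(f"campus: ${campus:,.2f} per month")
    print(f"campus rate is {campus / cloud:.0f}x the cloud rate")
```

Working in whole cents keeps the multiplication in integers, so the only floating-point step is the final division by 100.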
Our servers are not housed in Tallahassee, where every once in a while there is a local disaster such as a hurricane (we are starting the hurricane season watch right now), and it was inexpensive to mirror servers. Those are the things that we saw as the advantages of moving to the cloud. Next. Oh, there are my dogs. Not all mine: the one on the left is Louis's and the one on the right is mine, but that's just a little break and a place that tells me to hand this over to Louis to talk about how we moved into the cloud.

Thanks, Jean. After a lot of internal discussion and some testing, going back and forth between our developers, our sysadmins, and management, we decided to dip our toe into the cloud. You can go ahead and move to the next slide. This is very basic, what I'd call a 0.1 cloud-based architecture. It's recreating a physical server in a virtual environment, where within one system you have your CPU, your file storage, your RAM, networking, and software. Everything is in one server. This is really easy to set up. We did a lot of testing with the free tier of service that AWS and many other cloud vendors provide, and using this model we moved all of our mission-critical servers up into the cloud. In fact, our EZproxy server still uses this architecture, even though we have transitioned all of our other servers to more of a cloud-centric setup. Next slide.

One thing I do want to talk about real quick is cloud security, because they use what's called a shared security model in the cloud. AWS itself is compliant with a bunch of different programs, FERPA and HIPAA being the two that we deal with the most, for student data and for medical data, since we have a medical school on campus.
They guarantee that the layer below the applications, which includes the compute, storage, database, networking, and virtual hardware, is secured. They're responsible for that, whereas the customer, that's you, is responsible for the applications that you run, your identity and access management, and any kind of server-side encryption. So it's a partnership, but there is security built into the cloud, so you no longer have to worry about the lower layers of what you're building on. You can just focus on your applications and your data. Next slide.

So this is our 1.0 architecture. I call it 1.0 because, as opposed to basically moving a physical server into the cloud, we did start taking advantage of their Relational Database Service, which is a database service AWS offers that emulates a lot of different popular databases: MySQL, Oracle, MariaDB. It allows you to connect to those with your existing scripts that are set up for a specific database, but it hides the entire database back end so you don't have to worry about it or do anything with it. We had our primary web server, and then in another availability zone we had a secondary web server. We also had a primary search server and a secondary search server. These were all backed up and connected with scripts, and mirrored nightly.
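
The talk does not say how the nightly mirroring scripts were written, so the following is only a sketch of the general idea, assuming the primary and secondary content are both reachable as local directory paths. The function names `file_digest` and `mirror` are hypothetical, and a production setup would more likely use a tool such as rsync or a managed replication feature.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def mirror(primary: Path, secondary: Path) -> list[str]:
    """Copy new or changed files from primary to secondary.

    Returns the relative paths that were copied. This is a one-way mirror:
    files deleted on the primary are NOT removed from the secondary.
    """
    copied = []
    for src in sorted(primary.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(primary)
        dst = secondary / rel
        # Copy when the file is missing on the mirror or its contents differ.
        if not dst.exists() or file_digest(src) != file_digest(dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel.as_posix())
    return copied
```

Run from a nightly scheduler, a second pass with no changes copies nothing, which is what keeps a mirror like this cheap to repeat. It also shows why the later setup grew complicated: every primary and mirror pair needs its own job of this kind.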
We tried to do it with EZproxy. If you've ever worked with EZproxy, it's not the most flexible software. I'll give it its due in that it's rock solid for what it does, but it's pretty dated, and we had a lot of trouble getting it to adapt to the virtual environment. So, like I said, we're still running a very basic architecture with EZproxy. Next.

Once you get your servers up into the cloud, the next thing you tend to move into is data in the cloud, which is what we did. We started by backing up servers. Those over in the cloud made sense, and then our physical servers, so we'd have them all backed up in a location off campus. Then a few years ago there were the ransomware encryption scams going around, so we made a decision (sorry, my dog's barking in the background) to push up our file backups. We moved 14 terabytes up to the cloud in a week to protect us from this encryption scam, and we've had it up there ever since. It's been slowly growing. The reason we have 14 terabytes is that, as you know, librarians never throw anything away. Then we moved to staff desktops. We've just started moving into Preservation, and I say that with a capital P because that's library preservation, not IT preservation, which basically means it needs to be there forever. And we're looking at research data now.

We also spun up a bunch of new services, because it's easy to stand up new servers: we became a DPLA hub; we set up a Wikipedia wiki server, which has been hugely successful for our internal communications and sharing knowledge within the library; and our Request Tracker, or RT, ticketing system, which was originally developed to be an IT ticket system but has been expanded organization-wide. It's now used by special collections, human resources, facilities, a bunch of different groups. Next.

So this is where things started to get complicated, because it's so easy to set up servers. Back in 2017 we thought we had everything
set up. We had our primaries and our backups, but they were all in one region: they were all in Northern Virginia, and Northern Virginia went off the grid for AWS. It was in all the newspapers. So then we decided we should have a separate disaster recovery site, and what we did was mirror everything we had in Northern Virginia to Oregon. That got very complicated, because we had scripts backing up not only from the primaries to the mirrors but also from the production site to the disaster recovery site. You'll see that I call this a hot site, because all the servers were up and running all the time. We had three-year commitments for everything. It was very bulletproof. Again, we still didn't have the capability of actually failing over automatically from one system to the other, but I will say that we had two hurricanes come through in one year, and while the campus's web servers all went down, ours all stayed up, so I was pretty happy with that. Next slide.

And then things really got crazy, because we had not only our primary and our backup; we also had servers set up in California, Ohio, and Canada, which were used for testing or experimenting by our sysadmins and our developers. We had a lot of new systems, as I mentioned, but we'd also gained a lot of knowledge. This was two years into production and three years into working with AWS, because we spent about a year spinning up and getting people ready to work in the cloud. We had a lot of new knowledge and experience, we better understood how things worked in the cloud, and we decided that we really needed to simplify and work on a new plan. I'll turn it over to Favenzio now, who will talk about our 2.0 architecture.

Okay, thank you, Louis. Yes, as Louis said, we had been working in the cloud for a few years, and we realized that we had made a few gains. We were doing better as far as money was concerned and we were providing better services, but it was
certainly getting unwieldy, and we realized that there was much we could do to improve. The three-year anniversary was coming up, and this was important because, I don't know if you caught this on our previous slide, we were budgeting on a three-year cycle. We were buying our servers for three years, and as the new budget cycle came up we wanted to reassess what we were doing and ask whether we could do this better. So we wanted to optimize for the cloud.

What does that mean? When you want to optimize for the cloud, I think you have to shift your thinking into a whole new way, and that happens gradually. Louis mentioned how we had already started to use RDS and all these different things, but what happens initially is that you're still thinking in the old way: we have a server here and we want to move it over there. You're still thinking in the old hardware way, or at least that was the case for us. But with the three-year budget cycle coming up, and also because we had a denial-of-service attack that brought down the web server for a few hours, we realized that things were still a little more fragile than we wanted them to be, and there were things we needed to think about.

So what does it mean? The cloud is not just servers. I think the cloud is really about the platform. It's about using AWS and all the tools that it gives you. So what we had to do was stop thinking about servers and start thinking about services. And this is what happens when you start moving to the cloud: you start hearing all these new words, all these new acronyms. You start hearing about EC2 and S3 and KMS and all of these things. It's a lot to take in, and it's
understandable that we initially wanted to lower complexity and move everything the way we had it before. But as you learn more about it, you start thinking: okay, at first we were using EC2 to move our virtual servers, we were using S3 and EBS to create disk space, and we started using the Relational Database Service for the databases, but all of this is very much one to one. This is what we were doing before, this is what we're doing now, and it's all one to one. What you start realizing is that there are tools that can help you improve and optimize your systems if you think in a different way. So instead of using something like Key Management Service, which is what you initially use to give people like your developers and your sysadmins SSH access to your systems, you might start using something like Systems Manager, which is a more advanced way of providing this type of access. You might start using things like Global Accelerator, so that rather than assigning specific IP addresses to your servers, you can provide one IP address for a service in general and then redirect traffic from that IP address to any number of servers that can live anywhere. So you start decentralizing your systems, making them more resilient and more flexible. You might also start using things like Application Load Balancers and Auto Scaling, and that also gives you ways to do things like infrastructure as code. So you start thinking of new strategies. What you really need to think about is a way of building cloud-centric services that use only what you need. That's really the key when you think about cloud-centric strategies: how can we stop buying servers that are going to be bought only for peak times?
So, in the old way of doing things, whenever the new budget cycle came up, we would say: okay, because the most hits the website gets during any period of time is a thousand hits per minute (that's just an example), we are going to buy a server that can handle that type of traffic, even though that would only happen for maybe 30 minutes a day at peak time. For the rest of the time that server was sitting there spinning, doing nothing. What you really want to be able to do is use Auto Scaling groups (sorry, I'm going to move to the next slide), which let you build smaller servers and then spin up new ones whenever traffic makes it necessary. So there are all these different types of optimizations that you can start to do.

The biggest one, going back to this slide, was being able to build our infrastructure as code. Before, we would build servers as individual, I want to call them beautiful flowers. You would start up a server, you would configure it manually, and now that thing lives in a way that cannot be rolled back or destroyed, because it has all its history living inside it as configuration. The way we do things now is that we create CloudFormation templates that allow us to reuse components as we bring up new applications. An example would be an Apache web server: if you need an Apache web server, you don't need to reinvent the wheel every single time. We can use our CloudFormation templates to roll out a new one, and then as changes happen to that server we keep them as part of the code, so that if we ever need to rebuild it, we can destroy our entire infrastructure and rebuild it in another availability zone if we wanted to, or simply rebuild
if we wanted to use another type of software. If instead of using Apache we wanted to use IIS or another type of web server, there are ways for us to be more flexible about the type of applications we use. So this is what we're thinking now, and this is where most of our services are going. We're making it much easier to manage our services, because rather than each service being an individual thing that lives separately, we reuse the systems, which makes it much more flexible and efficient for us to manage.

So this is what our new applications look like. You can see that we're thinking about using things like Lambda functions. Let's say you have a specific process that you want to run, and it only runs for a few minutes, say, once a year. For example, our ETDs: we receive those from ProQuest only three times a year, and we have some processes that need to run at specific times during the year. Rather than spinning up a whole server, there are ways for us to run applications in a serverless way, so that we don't have to pay for servers and think about configuration at the lower layers. We simply write code and run it in these cloud services. That has made things a lot easier to manage, cheaper, and better. And with that I am going to hand it over to Jean for lessons learned.

I know we're running a little late here, so I'm going to try to rush through these things. Our lessons learned at the administrative level were how difficult it is to explain what cloud services are to people who really think about technology as hardware, stuff they can see. So first we had to start with library management and talk about things that we really didn't know that much about, but talk them into it. They were willing to take that risk. Then get campus approval: campus had not had a relationship with a cloud vendor at that time. They
just signed a relationship with AWS last year, and they have a relationship with Azure also. Then how to go through purchasing when you can't get a fixed-price quote from multiple vendors to do some price comparison and everything is sort of a la carte. That's also very challenging. And I say misconceptions about the cloud, but I really mean concerns about the cloud: privacy and cost are the two things we hear the most about. We had to rethink how people work together, sysadmins and programmers. It's a very different way of working, and as Favenzio says, when you say that the service is the code, that's kind of interesting, and we had to rethink that. We had to make sure we had funds for training because, as we said, it's essential to being able to shift the way you look at things. Next.

I think we've talked about the strengths a lot. We do consider the last two: the services are now under our control and we can respond to emergencies faster than we have in the past, and our emphasis is on the service and not the OS or the facilities. But there are definitely weaknesses, and I think these are some of the things that Cliff was talking about in the introduction. There's vendor lock-in. Most of the things we say about AWS you can get from other cloud services now, but every one of them is done differently, and so we are locked into this vendor at this moment. There is cost variability. The slide that Louis showed shows how things can get out of hand, and luckily you know by the end of the month that you've done something that's going to up your bill very quickly. We're highly customized to AWS, and it's very complicated. There's a suite of services that you can use, and how we mix and match and use those is probably different at each institution. It's not right for all services. We are running virtual desktop servers that are on prem right now, because that's the best place for them, so we can't use
those for the cloud. And there's a steep learning curve in doing this. Next.

But we have a lot of future dreams. We're going to take over our digital asset manager, which is a consortial service right now, and move that to the cloud, because this is an area of growth for our libraries. We are creating a data repository specific to one subject, for learning disabilities data. We're going to reimagine our whole born-digital library workflow. We're looking at open publishing, virtual research workstations, and managing the lifecycle of data and research. We know we need to develop an exit strategy, and that is one of the hardest things. And there are always more things we want to do. Next.

Okay, so we're the star there in Florida, if you can see the little red star, and that's a hurricane coming at us. We're not that concerned, because our critical services are in the cloud. Sorry, but we're not in that cloud; that is a hurricane. So are there any questions that we can address? I'm sorry about the time.

Not at all. Thank you for making that distinction, Jean, and wow, that's really concerning. I'm glad you got your stuff in the cloud. Thank you for this wonderful presentation. We really appreciate it, and it really is very thought-provoking. I see that we already have a question in the Q&A box, so let me go straight to that. It comes from Mary Beth Snap. Mary Beth says: very informative, we're along the same journey; I'm curious, how many team members do you have?

So we currently have three developers and two sysadmins, is that right, guys? And then the two managers, and that's working on a huge variety of services.

Wow, fascinating. Okay, thank you. Did anybody else want to weigh in on that?

Oh, sorry, yeah. One thing that we've really had to do with the staff that we have: the type of people who work in this environment, it's really hard to hire them
and honestly we probably couldn't afford them as a library anyway. We've had to develop our own. What we've done is take developers and give them some sysadmin skills, and take our sysadmins and give them developer skills. Some are stronger in some areas and some are stronger in others, but it was the way we were able to get people to get this done effectively, and it's been very successful for us, I'll say.

Yeah, I think that's a very good point. What you have to do when you deal with budgets like ours is hire potential rather than experience. We look for people who we think we can train and who can learn, and then we hire them and bring them up so that they can work with us on these new technologies, and hope they stay with us.

Thank you. Thanks, Mary Beth, for that; she comments, "spot on on your assessment on staffing." So thank you for the question and thank you for those responses. From Sarvani, who asks: for research lab virtualization, what apps have you considered? And thank you for the presentation.

So we've looked at a number of things. The whole campus has been looking at how to manage data, and we've been keeping an eye on a project that's going on with Emory and others to create a whole management suite in AWS. I don't actually know what apps they're thinking about. Louis, do you want to talk about that?

So yeah, we've been taking a look at this for about two years. It started off with us being pulled into a big data research initiative to build some infrastructure for that on campus, and then we started taking a look at how we could support it. Being libraries, we're more interested in supporting all kinds of research data, and they're putting data into the cloud. I know that what Emory is doing with their project is mainly a HIPAA-compliant way to store and work with data in
the cloud, where they set up an S3 storage bucket and a virtual server that the person can log into remotely and run queries with different software such as R. We're taking a look at how we can do that as well, on a much smaller scale. We're just starting to spin that up, though.

Okay, thank you. I see we have a few questions in chat that Louis has responded to, so let me read those out to be sure everyone knows what they are. Let's see: how frequently do you back up the servers and data? What about the network infrastructure and costs to support these transfers? Can you please talk more about that? The reply is: we back up nightly. There's no cost to upload or move data within an availability zone. There's a small charge to move data between availability zones, such as Virginia to Ohio. We feel it's worth it. And it goes on to say that the big cost with data movement is out of the cloud, but there are ways to minimize that as well. And also, Martha Anderson asked: are there any other platforms or programs you're considering now for your exit strategy from AWS? The reply from Louis is: while we're not planning to leave AWS, we are looking at what it would take. The other options would be Microsoft Azure or Google Cloud. And then we have a question from Cliff. Cliff asks: are you part of an institutional AWS contract, or are you on your own contract?

So we were the first official contract at FSU with AWS. The institution just started last fall with a contract that they're riding on from a Texas university contract. We haven't joined that yet. We've had some conversations with them. They're just starting off on AWS, at the very beginning of their journey. They're trying to do disaster recovery there, and they're really just getting started, so we're on our own right now.

Interesting.

I will say that we're looking at it, mainly for
simplification and for getting the bills paid on time. We've had some issues with purchasing paying on time and getting things taken care of, since this is a new way of doing business. One of the things I personally like about the agreement at the campus level is being able to pass through costs. If we have a researcher with a grant who wants to set up something like LDbase, or who wants some research storage and a website, we can pass that cost directly through to their billing without us having to handle the money or anything. It's all taken care of at the campus level.

Great, thank you. We have another question from Mary Beth Snap: have you experienced resistance or hesitation to having master digital objects, i.e. Preservation with a capital P, in the cloud versus locally?

We're really just getting started on this process right now. As Louis said, we took our highest level of preservation need, our electronic theses and dissertations, and we moved those into Glacier. We're now getting Archivematica set up, which is a tool to manage the preservation of your objects, and we're trying to understand what tools we're going to run. We're setting that up in AWS right now, and we're trying to figure out all the dependencies. And it's our people managing the digital library and the digitization who are interested in doing this, so I don't think we have resistance, because we haven't figured out a reason not to do it. We will be doing checksums, etc. AWS has native tools to do those, and we're trying to figure out the best fit for all of us.

Great, thank you so much. Thanks for that question, and thank you for those answers. A lot of great questions here today. We really appreciate everyone being here, and our panelists as well. We're still around, so please stay with us if you would like to chat with Jean,
Favenzio, or Louis. If you'd like to make a comment or ask a question, all you need to do is raise your hand and I will unmute you. Take care, everyone. Bye.