Good afternoon, everybody. I guess we'll get started. My name is Aviva Weinraub. I'm the associate university librarian for digital strategies at Northwestern University and the co-product director for the Avalon project with John Dunn at Indiana.

The Samvera application is called the Avalon Media System. It is an open source system for managing and providing access to large collections of digital audio and video, co-developed by the libraries at Indiana University Bloomington and Northwestern University. Version 6.4 of the software was just released, and as far as we currently know (I think all of us know that tracking open source software has its difficulties), there are 12 other academic institutions using Avalon. The software has been supported by multiple grants from IMLS and the Andrew W. Mellon Foundation. The overall goal of the two-year Mellon grant we received in 2015 was to figure out ways to sustain ongoing support and development for Avalon. And as part of our most recent IMLS grant, we were able to partner with LYRASIS to run a pilot project with nine partner institutions that span a diverse range of use cases and institution types.

So I'm going to talk a little bit about running Avalon in AWS at Northwestern. Then I'll pass it on to John to talk about the LYRASIS pilot, and then to Carl to talk about what it was like to actually run it as a local institution. And then we'll open it up for questions and comments.

In support of the Mellon grant, one of the specific deliverables was to deploy Avalon in the cloud as a proof of concept for potential vendors to provide the software to customers as a hosted solution. Northwestern was primed to take on this part of the grant since Northwestern had signed a larger campus contract with AWS. So in late spring and early summer of 2017, we migrated and upgraded our instance of Avalon from NU's local infrastructure to AWS. Our migration ran smoothly with the care and attention of a single developer, and we are now fully in production on AWS. As of March 2018, we have 140 moving images and 851 audio items publicly available, plus 3,430 moving images and 4,029 audio items that require you to log in.

Because of the grant timeline and the academic calendar, we had to move fairly quickly in getting this up and running. That meant we relied on the lead developer at the libraries to do the bulk of the work, since he had worked with the DevOps team at Indiana in creating Avalon 6.0, which was our first version that could run in AWS. Getting all of this to deploy cleanly, with the right settings and in the correct dependency order, represented the steepest part of the learning curve for us. In complex systems, one cannot expect to just move things to AWS; it really requires rethinking the system from the ground up in many instances. I don't actually expect that you'll be able to read this image, but this is a resource dependency graph for the underlying Samvera stack on the left and Avalon on the right. Even if you can only vaguely see the lines, it should give you some idea: this is a really complex system.
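To give a concrete flavor of what "correct dependency order" means in practice, here is a minimal sketch in Python. The component graph is an invented, much-simplified slice; the real Avalon/Samvera graph pictured on the slide has many more nodes and edges. The idea is just that a service is started only after everything it depends on is up.

```python
from graphlib import TopologicalSorter  # Python 3.9+ standard library

# Hypothetical, much-simplified slice of an Avalon-like dependency graph:
# each service lists the services it depends on.
DEPENDS_ON = {
    "postgres": [],
    "fedora": ["postgres"],
    "solr": [],
    "redis": [],
    "transcoder": [],
    "streaming-server": [],
    "avalon-web": ["fedora", "solr", "redis", "transcoder", "streaming-server"],
    "avalon-worker": ["fedora", "solr", "redis", "transcoder"],
}

def start_order(graph):
    """Return a service start order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())

if __name__ == "__main__":
    for service in start_order(DEPENDS_ON):
        print("starting:", service)  # in reality: boot the instance/container
```

Real deployments express the same idea through infrastructure-as-code tooling such as CloudFormation, which resolves resource dependencies automatically; the sketch just makes the ordering problem visible.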
It's a fairly complex piece of software because it involves streaming and transcoding, which actually makes it a really great candidate for AWS in terms of utilizing available services. Moving to AWS involved many lessons learned for our technical teams at the library.

One of the big mistakes we made was that we did not involve our own DevOps team from the start. I can't emphasize this enough: this was a huge mistake. AWS allows for a different approach, a different mindset, to architecting systems and developing code. Infrastructure as code, if you will, is something your DevOps staff will need to embrace and actually internalize. Where did this impact us, and why is that important? When it comes to maintenance and support, our DevOps team still needs to be able to support these systems. Not having them on board caused us some issues, and I'll point some of those out. Supporting the systems requires properly understanding how they work so that they can be properly monitored, and understanding them well enough to troubleshoot and fix issues. DevOps had to get up to speed to be able to do this kind of work.

For example, we did not have alerts and monitoring set up correctly. We were switching from SolarWinds, which is at our data center, to CloudWatch, which is in AWS, and that brought some significant changes in logic and in the approach to how alerts should be set up in a system designed to self-heal and expand to meet demand. This led to a component of the system not restarting properly overnight, but to AWS everything seemed fine, so the outage was not identified until the application was not working for 8 AM classes. We had some serious issues with this: nobody really knew who was responsible, how to fix it, or what the problem was, and at bottom it was an issue with our system just not alerting us properly (I'll come back to this with a concrete sketch at the end of this section). So it's really important to make sure your staff has the training and support they need before you make this kind of migration. This is essential for maintenance and support, because the lingo in AWS is very specific to AWS. It will take time to train your staff; we're using a combination of vendor classes that work toward various certifications and ramping up participation in various campus groups. In fact, we're taking a real leadership role in that on campus.

As far as digging into the code goes, during the pilot phase a lot of tweaks were necessary to get the code for this complex system deployed and running properly in AWS. Our library DevOps team assisted in reviewing and modifying the code with our lead developer to reflect all the necessary tweaks and changes so that it could build Avalon in AWS from start to finish. This process became an early step in bringing our DevOps team up to speed on Avalon, AWS, and how best to support and maintain this environment.

Now, I mentioned this before, but it's really important to take the time to rearchitect your systems. Not all lessons learned here were a problem for us; this was something we actually did really well. Depending on what you want to move to AWS, you may need to spend a fair amount of time rethinking and rearchitecting your system for the cloud. This is beneficial because it helps optimize how the system runs in the cloud and takes advantage of how the cloud works.
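Coming back to the monitoring lesson: here is a minimal sketch of the kind of CloudWatch alarm that would have caught our overnight failure. The metric name, namespace, and SNS topic ARN are all hypothetical; the metric would be published by a small heartbeat script that probes the application itself rather than the instance. The key detail is treating missing data as breaching, so a component that silently dies still pages somebody.

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Hypothetical names: "AppHealthy" would be pushed by a heartbeat script that
# checks the application (e.g. an HTTP probe of the Avalon login page), and
# the SNS topic is whatever your operations team pages on.
cloudwatch.put_metric_alarm(
    AlarmName="avalon-app-unhealthy",
    AlarmDescription="Avalon application probe failed or stopped reporting",
    Namespace="Custom/Avalon",
    MetricName="AppHealthy",
    Statistic="Minimum",
    Period=300,                 # evaluate in 5-minute windows
    EvaluationPeriods=2,        # two bad windows in a row -> alarm
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    # The lesson from our outage: a component that dies overnight simply
    # stops publishing data. Treat that silence as breaching so it alerts.
    TreatMissingData="breaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-pager"],  # placeholder ARN
)
```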
A case in point on rearchitecting: AWS transcoding services completely changed how we manage that part of our work. Transcoding for video scales up really well, and that matters because preparing content for the repository includes spikes in resource-intensive work, such as ingest of large archival materials, which vary greatly in size depending on the type of work or project request. We can now ingest and transcode thousands of minutes of audio and video in a fraction of the time we could on our local infrastructure, which was not really scalable in real time because of hardware limitations. The savings in transcoding time mean, for example, that we can easily say yes to ingesting large archival collections of, say, ensemble performances, and because of AWS the collection is available in a much shorter timeframe. We've also seen tremendous speed improvements in the way we ingest content for classroom delivery: if somebody comes in on a Monday morning and says, "I have a three-hour movie," we can have that up and running for them within an hour.

In sum, AWS means we can easily scale to meet seasonal demand. On premise, that would mean adding many physical servers; we simply wouldn't be able to handle it, and the cost would be absolutely astronomical for us. The system also self-heals: if a component goes down, AWS simply rebuilds it. This is a huge benefit for our DevOps and developer staff, because before, when a component went down, there was an outage. Now, assuming of course that we have everything set up properly and our alerts are letting us know, the system should be up and running basically 24/7. Generally, AWS gives our developers and DevOps team ease of deploying and maintaining systems, and it means we only pay for what we're using, which is a shift for our libraries, since we've primarily worked with one-time costs and are now moving into a recurring-cost model.

This is a snapshot of costs from September to March of this year. We've spent $13,682 in total for production. This gives us more insight into what our repository costs us. I'm going to give you real numbers, which may or may not be what you want, but I'm going to give them to you anyway. In the September-to-February six-month period, we ingested 2,151 items, where an item is either an audio item or a video item: 1,785 were audio items from an archival collection, 222 were new course reserve videos, and the other 100 or so were music. We stream to about 1,100 courses a year, based on last year's stats for online course reserves. Since we only pay for what we use, budgeting for this is very different, and the cost is now an operational cost as opposed to a capital expense. So it's really important to track these expenses carefully so that accurate estimates can be made for budget purposes. This impacts our growth as well: moving additional services and systems over to AWS will increase the annual operating costs.
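Since tracking these expenses carefully is the whole game for budgeting, here is a minimal sketch of pulling per-service monthly costs programmatically through the AWS Cost Explorer API via boto3. The date range matches our snapshot window; everything else is generic.

```python
import boto3

ce = boto3.client("ce", region_name="us-east-1")  # Cost Explorer API

# Pull monthly unblended cost per service for the reporting window, so
# figures like "S3 ~$600, EC2 ~$270 a month" can be tracked continuously.
result = ce.get_cost_and_usage(
    TimePeriod={"Start": "2017-09-01", "End": "2018-03-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for month in result["ResultsByTime"]:
    print(month["TimePeriod"]["Start"])
    for group in month["Groups"]:
        service = group["Keys"][0]
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        if amount > 1:  # skip sub-dollar noise
            print(f"  {service}: ${amount:,.2f}")
```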
And I'm not sure about you, but every institution I've ever worked at usually used year-end funds to buy servers; we're not used to thinking of running these systems as an ongoing operating cost. It requires a very significant change in the way we talk about running our technology systems in-house, and that's been an interesting conversation on campus as well.

All right, a brief snapshot of some of the services and what they cost. S3 storage of derivatives for us is about $600, and as we add more content, we're seeing approximately a 5% increase in monthly costs. EC2 is stable and consistent at around $270 a month. CloudFront went from a high of $200 in October to $10 in December; this service facilitates high availability of our streaming derivatives around the globe, and costs fluctuate based on demand. We're assuming the number of courses streaming audio and video in October was much higher than in December, since classes were mostly over and the school was on break by December. Elastic Transcoder, which transcodes or converts our media into streaming derivatives, has costs that fluctuate based on the minutes of media actually transcoded. For example, we spent $1,400 creating derivatives for a large audio collection we ingested from our Bienen School of Music; that was about 150,000 minutes of music.

I do want to mention, on the cost side, that we still need our on-prem storage for archival content for our repository. This is a key component of our preservation plans. Having archival assets on-prem, in a storage system that fulfills requirements like being geographically diverse in the event of a natural disaster and supporting integrity checks, is key to our mandate to keep these assets for many generations to come.

I have a whole list of what it actually costs us to run these things on-prem, which I'm going to go through pretty quickly, just to give you an idea. Transcoding hardware for us was about $12,755 a year, and it needs to be replaced every five years. Streaming hardware was $13,593, again replaced about every five years. A streaming license was about $800 a year. For Isilon costs over five years: we're currently using 287 terabytes, which was costing us about a million dollars for our portion. We've paid our local NUIT, our central IT unit, about $600,000 in service costs and for some of the nodes; we essentially paid for some of the nodes last year, and the library funded about $200,000 of that. Whether these are costs that will end depends, for example, on whether Glacier is seen as a replacement for Isilon as we reach Isilon's limits and meet our geographic diversity requirements. We also have a number of unknown costs that may end or be replaced: what it actually costs us to run production systems in a managed hosting environment from our central IT unit, the cost of staging the repository in the cloud, and the staff time for setting up and managing servers, which should be less than what it takes to get systems up and running in AWS. And since DevOps was still left managing other servers at NU locally, this comparison is a bit of an apples-to-oranges thing.
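On the integrity-check requirement just mentioned: here is a minimal sketch of one common approach, recording a SHA-256 checksum as S3 object metadata at ingest and re-verifying it later. Bucket and key names are hypothetical, and note that objects in a Glacier storage class would have to be restored before a re-read like this.

```python
import hashlib
import boto3

s3 = boto3.client("s3")

def upload_with_checksum(path, bucket, key):
    """Record a SHA-256 digest alongside the object at ingest time."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    s3.upload_file(path, bucket, key, ExtraArgs={"Metadata": {"sha256": digest}})

def verify_fixity(bucket, key):
    """Re-read the object and confirm it still matches the recorded digest."""
    obj = s3.get_object(Bucket=bucket, Key=key)
    recorded = obj["Metadata"]["sha256"]
    actual = hashlib.sha256(obj["Body"].read()).hexdigest()
    return recorded == actual

# Hypothetical bucket and key, purely for illustration.
upload_with_checksum("masters/concert-001.wav", "nul-archival-masters", "concert-001.wav")
print("fixity OK" if verify_fixity("nul-archival-masters", "concert-001.wav")
      else "FIXITY FAILURE")
```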
So I'm going to hand everything off now to John, who's going to talk about the Avalon pilot that we ran with LYRASIS, using funds from our most recent IMLS grant.

Thanks, Aviva. For those who don't know, I'm John Herbert. I'm the director of technology services at LYRASIS. We were approached by Aviva and John Dunn from Indiana about a year ago to participate in the pilot of the Avalon Media System as part of their latest round of funding from IMLS. We were particularly intrigued by the opportunity because, I assume like you, since you're here listening to us, we're aware of the larger problem that video repositories are trying to solve for academic libraries and others. As I understand it, as we launched into this round of funding, what we did and what Aviva did at Northwestern were the first deployments of Avalon in the cloud. Interestingly, our hosting services operate on SoftLayer, which is IBM's quote-unquote equivalent to AWS. It doesn't provide some of the features and functionality that AWS does, which was a bit of a negative, but the positive, and what it helped us understand better, is that Avalon isn't designed specifically for AWS and is capable of running on different cloud services. That's certainly a positive.

Similar to what Northwestern did, we ran a six-month window of active testing from September of last year to February of this year. You can see the nine institutions there. The University of Oklahoma: Carl, sitting to my left, is going to talk more specifically about their participation in the pilot and speak to one participant's experience. You can see the others there, largely academic libraries, although DC Public Library, Houston Public Library, and the University of the Arts are in the mix as well.

Unlike Northwestern, which had experienced Avalon users, we were piloting with participants who, we assumed, were new to the platform. So we broke the pilot into three distinct phases. We spent the first phase introducing the product and giving them specific training on how to use it. The second phase used scripted protocols: we would say, take this object, upload it with this metadata. And the third phase was open, where they could engage with their own content in their own way. Through the large majority of the pilot we were running version 6.2, although towards the end we brought in 6.3.

As you might expect with nine institutions, we were able to ingest most every audio and AV file type. Most, though, were access copies, with some mezzanines; digital masters were not engaged. I believe Avalon is designed mostly for access. Do folks understand the difference between master, mezzanine, and access copies? Okay. The largest use amongst the nine was about 500 gigabytes. In fact, it was interesting: for as enthusiastic as our participants were, one of the limitations of the pilot was them actually getting access to large quantities of source data at their institutions. We would have liked a run at more scale, because bandwidth is a serious issue as you start to transmit over the wire, but unfortunately we couldn't get there. We did have our tech lead implement a script in the ingest process to directly integrate with DuraCloud and take two copies; through DuraCloud we were connecting to AWS, putting one copy in S3 and one in Glacier, if folks understand the difference.
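The pilot did that two-copy step through DuraCloud's own API; purely as an illustration of the underlying idea, here is a sketch in plain boto3 with hypothetical bucket names: one immediately retrievable access copy in standard S3, and one cheaper preservation copy written to a Glacier storage class.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket names. The pilot routed this through DuraCloud; the
# sketch shows the same idea with direct S3 calls: an access copy that is
# immediately retrievable, plus a second, cheap, slow-to-retrieve copy.
def ingest_with_two_copies(local_path, key):
    with open(local_path, "rb") as f:
        body = f.read()
    # Access copy: standard storage, immediately retrievable.
    s3.put_object(Bucket="avalon-pilot-access", Key=key, Body=body)
    # Preservation copy: Glacier-class storage, restore required before reads.
    s3.put_object(Bucket="avalon-pilot-preservation", Key=key, Body=body,
                  StorageClass="GLACIER")

ingest_with_two_copies("masters/lecture-001.mp4", "lecture-001.mp4")
```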
So, some of our findings. Our assessment of version 6 is that, for a hosted service, there are some gaps in the service and functionality it provides; I think Carl might get into a little more detail. We did an interesting exercise: we gave all the participants, I think it was a thousand points each, and we had assembled a list of about 40 proposed improvements, and we let them spend and assign their points across however many of those improvements they liked. What that gave us was a ranking and prioritization of the improvements.

Just to run through some of them: there's a need for more customization, to be able to add and edit metadata fields, to customize them for local collections, and to have collection-level configuration. Batch functionality, of course, as with most digitization services, is a big need; it's a large requirement because you want to deal with your digital assets in bulk. Manifest creation was, I think, challenging for new users of the platform, so getting the manifest created correctly, and having a batch validation tool, would be helpful (there's a sketch of that idea below); I know some other programs use a similar sort of tool. Integrations: as you might expect with most repository platforms today, integration and discoverability is a big issue, integration with an IR platform in particular. And since video and audio obviously have a spoken component, integrating with a transcription service would be nice, and better UI and error messaging, I think, would help.

It's also important to say what wasn't tested. As I mentioned, large-scale patron streaming wasn't orchestrated into the testing protocol; it was largely individual institutions and their staffs streaming it out. What would it take if you were streaming some popular video with, let's say, 200 people watching it at once? That sort of large-scale patron streaming wasn't tested. The authentication and authorization (authn/authz) integration was not tested. It's an interesting issue, I think, because, I don't know if Aviva mentioned this, but Avalon is of course built on top of Samvera, and I think this is a component of Samvera that's not yet finalized and complete, so authn/authz integration is something Avalon assumes the deployer is going to do themselves. An edge case was integrating with a learning management system; I think you touched on this a little bit. You can see where, for video and audio used in the classroom, connecting to an LMS would be good. And scale testing, as I mentioned: bandwidth was not stress tested. There's a bit.ly link on the slide; it goes to the full report that we're submitting to IMLS, which will be part of the formal reporting out of the pilot.

All right, next steps. There's not a lot of rocket science here. We're recommending, of course, that the gaps in functionality be brought into the development cycle and the tech road map wherever reasonable. We assume version 7 is a good opportunity for at least some of that; the better UI and collection management support, we understand, are coming along. Addressing some hosted-service deployment needs, some of which I just outlined, would be good: user account management, batch loading.
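Here is the manifest pre-flight idea sketched out. This is not Avalon's actual manifest format, which has more columns and rules; it's a simplified stand-in that shows the kind of validation, required fields present and media files actually on disk, that would save new users a lot of failed batches.

```python
import csv
import os
import sys

# Simplified stand-in for a batch-ingest manifest; the real spreadsheet has
# more columns, but the pre-flight idea is the same: catch missing required
# fields and missing media files before you submit a batch.
REQUIRED = ["Title", "Date Issued", "File"]

def validate_manifest(path):
    errors = []
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing_cols = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
        if missing_cols:
            return [f"manifest is missing columns: {missing_cols}"]
        for lineno, row in enumerate(reader, start=2):  # row 1 is the header
            for col in REQUIRED:
                if not (row[col] or "").strip():
                    errors.append(f"row {lineno}: empty required field '{col}'")
            media = row["File"].strip()
            if media and not os.path.exists(media):
                errors.append(f"row {lineno}: media file not found: {media}")
    return errors

if __name__ == "__main__":
    problems = validate_manifest(sys.argv[1])
    print("\n".join(problems) if problems else "manifest looks OK")
```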
And interestingly, Oberlin College has asked us to continue hosting them, and IMLS is actually providing support for us to do so through the end of June. So we're going through a process now to migrate their data in from their old system, called Variations, I think. That's interesting; migrations are always tricky. We're going through these steps with Oberlin to more closely simulate how a hosted client could come on with legacy data that we would need to migrate and bring in. That's a work in progress. And finally, almost every participant is keen to retest, so once version 7 is available, we might come back to this and pilot test it some more. With that, we'll pass it on to Carl. You ready? Okay, good.

All right. By the way, I've caught the term DevOps quite a bit, and if you don't know what that means, we're actually running a session tomorrow to introduce that concept, so stop by; it's on the schedule.

Why did we participate in this test? Well, I think there were a lot of reasons. The first was that we are currently running an in-house system on FileMaker Pro. It's really just for staff, and to say that we maintain it with bubble gum and baling wire would be a generous description. So the librarians have been after me for quite a while to get a real system in. We had been looking at Avalon as a local install, but I wasn't thrilled with supporting another platform in-house; I was not particularly happy about that at all. But we knew we needed a streaming media platform, for audio files first and for our video secondarily; we have a lot of audio in our fine arts library that needs to get out there, so we've really been moving in that direction.

I'm a huge fan of hosted platforms, and I'd really like us to get as much as we can into a hosted environment. I don't want my librarians having to hunt for the particular recording a user is looking for; that's wasted time. I don't like having my tech team spend their time maintaining a back end; I want them adding value for the users on the front end. So I'm really pushing for that. I'm also a huge believer in the network effect. While this is not a multi-tenant system, and in my opinion it needs to be, what I'm trying to do is get as much of our content loaded in an environment where it can be shared with others, and that's important to me. The network effect is the value we see in Amazon and Google and all over the place: the more users a service gets, the more users it attracts. Obviously, as librarians, we can't keep our heads in the sand about the network effect; we have got to get into that game. I think Joan was just talking about that a little in her talk.

We were also looking for a better display of the metadata elements, because I know from being on the vendor side of the fence that I've never met a music cataloger I could keep happy. Wow, they have a whole other dimension of metadata that they want. So we need to do a much better job there. We know that the discovery interface on our current system is too generic for them and will never make them happy, so we're looking for better support in that area. And then, of course, the other thing we loved about the test was that the cost was right; they did that really well, so we were pleased with that. So let me tell you a little about our specific needs as a result. If I could just add, for the participants, the cost was zero. Right price, anyway.
So what we came up with, and you've seen some of this in the earlier slides, of course, is that we really do want more flexible metadata fields; they're far too rigidly defined at the moment. Particularly when harvesting, when we were trying to crosswalk these records, we found we couldn't do the things we needed to do. As I mentioned, it's not multi-tenant software, which we feel strongly it should be. We also want to see better support for DPLA. We are a DPLA hub at OU Libraries in Oklahoma, so we are feeding as much as we can into DPLA, and we didn't find that the system as currently put together had good connections that would enable us to interface with DPLA. We didn't get to test the metrics, as I think John mentioned, and of course we want to get into that, because we are under the same pressure I'm sure many public institutions are, if not private ones: we increasingly have to show actual usage, and the impact behind those usage statistics, and we didn't get a chance to look at that. And the last thing we're really looking for is CAS integration, so that users can log in very simply and easily. So that was some of what we were looking for.

Here's the rest of what we were looking for. John mentioned the points voting system; we like that kind of system, because it forces users to prioritize, to some degree, what they're really looking for. What we spent our points on when we voted was primarily the OAI-PMH interface; again, that would help us with DPLA quite a bit (there's a sketch of what that harvest would look like below). We wanted better batch export capabilities, and we were looking at batch metadata modification capabilities, so that was also important to us as we went through this. I've obviously already mentioned statistics. And finally, we were looking at the workflows and how configurable they were to match the processes we were using in-house. Although I will say, I think that's a place where lots more discussion is needed. Again, having served on the vendor side of the fence for many years in this field, I think we don't spend enough time trying to come up with best practices and trying to get libraries to evaluate what they're doing in light of what might be a better practice. Could we all come to a little more agreement on processes? I've always said that if you ask 200 catalogers how you should catalog something, you will get 200 answers. That's ridiculous. We've got to do better in this profession; we can't afford what we're doing.
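To make the OAI-PMH ask concrete: version 6 didn't expose such an endpoint, which is exactly why we spent points on it, but here is a sketch, against a hypothetical endpoint URL, of the harvest a DPLA hub would run if an Avalon instance offered one.

```python
import requests
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the sketch shows what a DPLA-feeding harvest would
# look like against a standard OAI-PMH interface, were Avalon to expose one.
ENDPOINT = "https://avalon.example.edu/oai"
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

resp = requests.get(ENDPOINT, params={
    "verb": "ListRecords",
    "metadataPrefix": "oai_dc",
})
resp.raise_for_status()
root = ET.fromstring(resp.content)

for record in root.iter(f"{OAI}record"):
    header = record.find(f"{OAI}header")
    identifier = header.findtext(f"{OAI}identifier")
    title = record.findtext(f".//{DC}title")
    print(identifier, "->", title)
    # A real harvester would also follow resumptionTokens for paging and
    # crosswalk oai_dc into the DPLA metadata profile before submitting.
```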
Now, items that we really, really liked, and that I really want to call these folks out for. We thought LYRASIS taking on this kind of role was very admirable: they put it in a test environment where they could collect input from a number of people together, and they brought us together where we could have conversations and hear each other talk about how we were using it. I thought that was really quite good, so we were pleased with that, and we want to thank them for it. We also thank IMLS for providing the funding; we thought that was great and we very much appreciated it. We liked the structure of the test; we thought they did a particularly good job of putting it together. John just described some of that for you, but it was very useful, and it kept us all focused on evaluating the same types of things, trying to bring us to some agreement, because otherwise we'd all go off like stray cats looking at different things. So that was well done, I thought.

We did come out of this feeling that this is a viable system for us, and that we will be able to work with it in the future, so we were very pleased with that result. And I think I already mentioned this one: they provided a forum for working toward best practices. And again, the enhancement process being linked to a way to prioritize needs is a good way to move forward and get some focus. The person we worked with rather continually throughout this process was Carissa at LYRASIS. She's dynamite; we just think the world of her. She did a great job, and I had a lot of admiration for the way she handled the process. So that's the why, and what happened, at OU.

Just to clarify, we actually do have CAS authentication available through Avalon; we just may not have been able to get it set up for the LYRASIS pilot. So, we are open to questions now.