My name is Andrew Woods, and it's a pleasure to be here. I'm privileged to be a member of the DuraSpace team, one of the folks actually developing the DuraCloud product. Before we jump into the session, I'd like to support the ongoing effort to disambiguate the organization, DuraSpace, from the product, DuraCloud. Something on the order of a year ago, the gazes of two organizations met across a crowded room: DSpace on the one side and Fedora on the other. One thing led to another, and this past summer, July or so, the two organizations joined, realizing they really had so much in common. That joined organization became DuraSpace. So DuraSpace is a not-for-profit organization that supports open technologies around the accessibility of durable content, working in collaboration with other organizations and institutions that share the same values, namely the academic, scientific, and cultural heritage institutions, as well as the technology community. With that, I'm tempted to jump right into the presentation. But before doing so, I'll take a moment to recognize the team of people and the effort they've put into making DuraCloud a reality. To begin with, there's a small group we affectionately term the triad: Sandy Payette, who is over here (hi, Sandy), Michelle Kimpton, and Brad McLean. We also have Chris Wilper and Chris Smith, as well as Dan Davis and Danny Bernstein on the drums, or the UI, rather. And I'd particularly like to recognize Bill Branan, who is not here today but certainly deserves so much of the credit for making DuraCloud what it is. We've been working on DuraCloud for something on the order of a year now.
What I'd like to do today is give an overview of what we've been doing: talk a little about what DuraCloud is, discuss our pilot partner program that's been underway since this past fall and some of the things that have come out of it, take a slightly deeper dive into the architecture, look at a demo of a running instance of DuraCloud, and then make an attempt at peering into the future. At an organizational level, DuraSpace, that is, we've put considerable energy into defining the mission that we believe in and that drives a lot of the decisions we make as an organization. We're interested in promoting durable, persistent access to digital data and in creating these open technologies in collaboration with the communities that share those same values. All right, now a little about DuraCloud itself. I'll start by saying that it is, or will be, a service: you can go to, say, duracloud.org, input your criteria for what your storage needs are, maybe what your geographic concerns are, needs around where you want your content to reside, what sort of resource requirements you have, and then click Go. An instance will be started on your behalf in the cloud, hosted and supported by DuraSpace. So that's one thing: it's actually a service provided by the not-for-profit organization, DuraSpace. And a couple of things this service does. One, it abstracts away the underlying storage providers. More specifically, on the commercial side we work with Amazon S3, with Rackspace Cloud Files, and with EMC Atmos. On the non-commercial side, institutional providers come up as well, and the framework certainly allows plugging those right in, as you'll see when we jump into the architecture.
Additionally, beyond being a service, it is an open-source cloud-based application. So you can run it as a service, or you can download it to your local institution, play around with it, stand it up, and run it as your own internal service. And the pluggable service framework it provides really leaves the door wide open for any number of service implementations. So if you leave this session with nothing else, I would hope that, at the very least, you pull away these key points: namely, that DuraCloud offers mediated cloud storage. It helps mitigate, on several different levels, the issues around having to understand how to communicate and integrate with any given cloud provider, and it mitigates the possibility of that cloud provider making a business decision that's not in line with yours. DuraCloud facilitates your content flowing across various providers; it mediates that. It also offers ready-to-use, out-of-the-box services that you can execute on your content residing in the cloud. And obviously, from a user perspective, at the very least, the intention is for it to be easy. This slide is graphically reinforcing that same concept, the idea of mediated storage (you can't really see it at the top). Any sort of digital file, whatever the format, audio, video, whatever, you can push into DuraCloud. DuraCloud abstracts away the idiosyncrasies of interacting with the various underlying providers, and it abstracts away the details of the various APIs they offer. The lower portion of the diagram basically indicates that you can push all the content to any one or more of the providers, and you could potentially stand up rules that dictate that certain types of content go one place or another.
And then, also reinforcing the other key notion of cloud-based services: here we're just talking about the ability, through DuraCloud, to run processing on the content that is hosted in your DuraCloud account. All right, great. Now I'd like to change gears a little and talk about some of the experiences and what the whole pilot partner program has been about thus far. Sometime back in the fall, we started to establish relationships with three organizations, the Biodiversity Heritage Library, the New York Public Library, and WGBH in Boston, in order to create a symbiotic, mutually beneficial relationship around DuraCloud. On the one side, we were offering the DuraCloud service to these three pilot partners in order to address real business needs that they individually had. On their side, they were offering concrete use cases that helped drive and refine the processes and services we were developing, giving concrete feedback along the way. It has been absolutely wonderful. I'll dive into a little of the use cases they brought to the table. This is just an aggregation of all three pilot partners' use cases, or at least some of the top-level ones. As you might expect with DuraCloud, they were all interested in having an online, off-site backup of their content, and DuraCloud certainly offered that. There was also, from NYPL's perspective, an interest in converting what turned out to be a subset of their corpus of TIFF images. Actually, for all three pilots up front, we defined a data set on the order of 10 terabytes for each partner. So with NYPL, we started off with 10 terabytes of their TIFF images, and they wanted to convert those to JPEG 2000. So we developed services around that, and you'll see shortly that there were some lessons learned there.
Naturally, once those images were converted to JPEG 2000, they wanted to be able to see them. So we stood up the djatoka JPEG 2000 image server, with a viewer on top of it, and we'll see that shortly. Additionally, the Biodiversity Heritage Library had the use case of internationally replicating their corpus. So what we're working on with them is, we've already pushed somewhere over 13 terabytes of the BHL content into DuraCloud, and we're working through the details of pulling that content back down from the cloud, either in the UK or in Australia (we're not sure yet where it's going to land), and any issues that may come out of that. Additionally, on the WGBH side, as you may expect, they're interested in streaming the video they are responsible for. Straight out of the box, as low-hanging fruit, we decided to leverage what Amazon provides in their CloudFront service. They have a video streamer, so we can easily stream content hosted in S3 that way. But we're also in discussions with the open-source video toolkit Kaltura and have made some early steps toward integrating it. Additionally, at a high level, there is this notion of processing over the corpus of content held in DuraCloud. More specifically, BHL has a particular use case around extracting the scientific names out of the books they have hosted. The tool is called TaxonFinder, which you may or may not be aware of. The idea is that we iterate over their holdings and extract these scientific names using the TaxonFinder tool, and the tool itself creates the links and their relationships. But from DuraCloud's perspective, the point is enabling mass processing over the entire content set held in one's account.
And then there's also this notion, which came out of the New York Public Library, where they get a hard drive on the doorstep late at night or something, and you want to triage it: do an initial look at what it is and what should be done with it, push it into maybe a quarantine area in the cloud, have somebody analyze it, do some bulk tagging of what these files might be, so that another department can pull it down or do whatever needs to be done with it. Out of those use cases, we had some good times and some bad times, but certainly a handful of lessons learned came out of it. A lot of them, as you may expect, came out of the ingest process. We're talking about three different organizations that, at least for this pilot round, each had upwards of, or in some cases exceeding, 10 terabytes of data. So we learned quite a bit from that. One thing came up immediately as we were loading content, and I can speak specifically to the BHL case: their content was previously, or still is actually, hosted in the Internet Archive, and we were pulling the content from the Internet Archive into their DuraCloud account. In the process of doing that, during the validation phase of the ingest effort, we realized there were some errors. Some of the errors were around the fact that some files didn't make it in at all, so it was like, hmm. And sometimes the initial MD5 associated with a file didn't match the MD5 that landed in DuraCloud. In the case where files didn't show up at all, what was going on was that BHL apparently wasn't aware from the beginning that files exceeding a certain size cap don't get accepted into the underlying storage provider. In the case of Rackspace and Amazon, that cap is 5 gigabytes, so all the files that exceeded that size couldn't be pushed in. That's where this chunking and stitching comes from.
So we had to solve the issue of taking these large files and, as they're coming into the system, breaking them into chunks. Obviously we want to capture the MD5 of each chunk, and also, as the stream is going through, capture the MD5 of the entire file, package all that information in an XML document, and push the chunks separately into DuraCloud. Then, on the flip side, we want to be able to stitch those chunks together, pulling the constituent parts back into the parent file. Working through a lot of the details there was good fun. So we have two use cases around ingesting content into DuraCloud. One was over the wire, like I was mentioning, with BHL. WGBH also brought to the table the use case of not having the bandwidth to push the content into the cloud in any reasonable time frame, so they worked with us in developing processes around shipping that content via hard drives. Obviously, some of our concerns were around having a manifest of the MD5s of each content item on the local system, then on the hard drive, and then once it lands in the cloud, so we have accountability and verification from hop to hop that the file that ultimately ends up in the cloud is the same file that initially resided on the local system. One thing I'll mention about parallel upload, and it also applies to the bulk image conversion: one thing we found, no big surprise, is that one way to address time concerns, be it bulk image conversion or the time it takes to push content into the cloud, is to break up the job and parallelize it over multiple cloud servers. I have a graphic shortly that shows that, in the case of BHL, pushing these 13 terabytes of content into the cloud serially would have taken a month or more.
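The chunk-and-stitch flow just described can be sketched in a few lines. This is an illustrative Python sketch of the idea, not DuraCloud's actual (Java) implementation; the function names are made up, and the real tool additionally records the checksums in an XML manifest alongside the chunks:

```python
import hashlib
import io

def chunk_with_checksums(stream, chunk_size):
    """Split a byte stream into chunks, recording each chunk's MD5 and
    the MD5 of the whole file in a single pass over the stream."""
    whole = hashlib.md5()
    chunks = []  # list of (chunk_bytes, chunk_md5_hex)
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        whole.update(data)
        chunks.append((data, hashlib.md5(data).hexdigest()))
    return chunks, whole.hexdigest()

def stitch(chunks):
    """Reassemble the parent file from its constituent chunks."""
    return b"".join(data for data, _ in chunks)

# Example: a 70-byte "file" chunked at 16 bytes (in reality the cap
# that forced this was 5 GB on S3 and Rackspace at the time).
payload = b"0123456789" * 7
chunks, file_md5 = chunk_with_checksums(io.BytesIO(payload), 16)
```

Verifying that `stitch(chunks)` reproduces the original bytes, and that its MD5 matches the recorded whole-file MD5, is exactly the hop-to-hop accountability mentioned above.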
But by standing up, I think what we ended up with was six servers, with three different processes on each server pulling from the Internet Archive and pushing up, we were able, with that configuration, as you see here, to push the entire content set up in about a five-day time frame. The green line marks, I imagine this is fairly unintelligible, but the green line marks the 10 terabytes, and each of the hash marks at the bottom indicates a day. This was just an internal graph we were using to track that progress. Additionally, a big thing, talking about lessons learned, is the bottom point about the asynchronous nature of cloud storage. Naturally, with DuraSpace, the organization, hosting or supporting Fedora, the repository, as well as DSpace, it only makes sense to plug DuraCloud in underneath those repositories. Currently, Fedora, as you may be aware, has a low-level storage interface called Akubra, and the default implementation writes blobs to the local file system. Now, Chris Wilper, bless his heart, went ahead and pulled together another implementation using the same Akubra interfaces, but instead of writing to the local file system, it writes to DuraCloud. And it works like a charm. Well, except it takes a lot longer than you might expect, because of this: when you write to a file system, you write a file, you get a return, it's there, and you're happy and can move along. When you're dealing with cloud content, you write a file, and it takes a certain amount of time, usually on the order of seconds or tens of seconds, for the other load-balanced servers within that underlying storage provider to register the fact that a new object has been created or updated.
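That eventual-consistency window means a client that needs read-after-write guarantees has to wait for the write to become visible. A minimal sketch of that write-then-poll behavior, using a hypothetical store client with `write()`/`exists()` methods (the real Akubra adapter is Java and its API differs):

```python
import time

class NotVisibleError(Exception):
    """Raised when a written object never becomes visible in time."""

def write_and_wait(store, object_id, data, timeout=60.0, interval=1.0):
    """Write an object, then poll the (eventually consistent) store
    until it reports the object as present, or give up at `timeout`.
    `store` is a hypothetical client exposing write() and exists()."""
    store.write(object_id, data)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if store.exists(object_id):
            return  # the write is now visible; safe to report success
        time.sleep(interval)
    raise NotVisibleError(object_id)
```

The cost of this loop, a few seconds or tens of seconds per write, is exactly the overhead described next.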
And so what Akubra does, in order to maintain the contract that when Fedora writes an object it's guaranteed to be there, is spin until it gets the confirmation that, OK, the file's there, and then it returns and moves on. But the overhead involved is noticeable. I'm quite sure that in many use cases that's probably acceptable, depending on how you're using your repository. But it also drove us to come up with an alternative solution for integrating repositories with the cloud. That solution revolves around a client-side utility that basically does a synchronization between your local file system and your DuraCloud account. You can point this utility at a directory, or any number of directories, on your local system, tell it which space in DuraCloud you're interested in synchronizing with, and any new files, changed files, or deleted files get synchronized with DuraCloud. So you can run your repository just like you normally would, and in the background, out of the mainstream of request and response, it's synchronizing with your DuraCloud account. It obviously works in the case of repositories, but it's really a general utility that anyone can use. Something we're doing internally within the DuraSpace organization is each of us standing up our own instance of this client-side utility and synchronizing whatever files we want, just to work through any kinks that might show up in the initial release of this synchronization utility. So, a handful of things have come out of the pilot effort. Now I'd like to talk a little about DuraCloud itself and try to pick up the pace. Like I said, it's a service you can go and sign up for. Once you input your criteria and sign up, you click the button, and it launches an instance in the cloud for you based on whatever criteria you specified.
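The decisions that sync utility makes, upload anything new or changed, delete anything removed locally, can be sketched by comparing checksums. This is an illustrative Python sketch of the idea only; the actual utility is a separate (Java) tool with its own options:

```python
import hashlib
from pathlib import Path

def local_state(root):
    """Map relative path -> MD5 for every file under `root`."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.md5(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def plan_sync(local, remote):
    """Given {path: md5} maps for the local directory and the remote
    space, decide what to upload and what to delete: new or changed
    files go up, files gone locally are removed from the space."""
    upload = [p for p, md5 in local.items() if remote.get(p) != md5]
    delete = [p for p in remote if p not in local]
    return upload, delete
```

Run periodically in the background, a loop over `plan_sync(local_state(dir), remote_listing)` keeps the space mirroring the directory without sitting in the repository's request/response path.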
From that point on, most of your interaction with the system takes place through that instance that was launched on your behalf. It's actually a set of web applications you can interact with. So we'll pop the hood and look inside at some of the moving parts of DuraCloud. What I'd like to do is show a component view of the DuraCloud architecture. I'll talk to this at a high level, just giving you an idea of what the various boxes represent, and then I'd like to flip over to a demo. We can walk through a running instance of DuraCloud to provide some initial, up-front context, look at the application, and then come back to this, and maybe some of the smaller boxes will make more sense. One thing I'll say before we jump over to the demo is that what I would consider the two main components of the system are the two boxes in the middle, the pink one and the greenish one: DuraService and DuraStore. As you might expect, DuraService is the component that manages the deployment, installation, and configuration of all your services. DuraStore is responsible for mediating requests for interacting with content down to the underlying providers there at the bottom. One thing worth noting is that both of those components have a RESTful API sitting on top of them, which opens up all kinds of doors for other types of integrations. At the top, we have our own implementation of a web-based interface that interacts with DuraStore and DuraService, and that interaction takes place through the REST APIs. It would be easy to throw away DuraAdmin, the UI we've put together, and strictly interact with your DuraCloud account via command-line scripts, using just wget and curl or whatever.
Or you could build your own applications at your institution that interact with the REST APIs, or you could throw a Drupal on top of this that, under the covers, plugs into these REST calls. As it stands right now, just to facilitate that interaction, we've pulled together a web-based UI called DuraAdmin. We'll take a look at that. But in looking at it, I'd just note that this is a UI we've pulled together; like I said, it's not required, and it's not your only way of interacting with the system. We did it to simplify things at the beginning. So let me poke around in here a little. This would be the home page, your first experience with your institution's DuraCloud instance. You can imagine any sort of customization taking place here in terms of icons, color schemes, and that sort of thing, or messages about what content has been most recently viewed or used in some way. I'll note that there are a couple of different tabs here at the top: one that relates to your content, called Spaces, and one that relates to your services. We'll jump into those momentarily. Also, on the right-hand side here, we happen to show the underlying storage providers connected to this instance of DuraCloud; in this case, we have Amazon S3 as well as Rackspace. So, jumping into your Spaces. It's worth saying that a space, in the DuraCloud context, is really just an abstract notion of a container for digital objects. Right now we have one space called CNI Content. You can have up to 100 spaces as it stands right now, just as a hard cap, but in terms of content you can put inside each space, it's unlimited, depending on how much storage you want to provision. You see that there's some metadata associated with the space. You can add content to it, remove the space, or add a new space.
There are some special characters you don't want to use, but otherwise you can name the space whatever you want; for example, something like this, maybe. Then there's this notion of access, open or closed, and it's really as simple as that. If a space is open, that means all the content residing within that space, the RESTful URLs that access that content, are publicly available. If it's closed, they're not; you have to provide credentials in order to view that content. I'll also mention here on the left-hand side, if you can see it, the notion that you can associate metadata with the space: any sort of metadata that would be appropriate, just name-value pairs that you associate with the space. Likewise, you can add tags to a space, which would facilitate whatever you use tags for. We'll come back to the spaces in a moment, but I'd like to jump over to services. In terms of available services, we have three plugged into this instance right now. Namely: the replication service, which enables the ability to replicate content that, say, comes into your Amazon account over to Rackspace, simple as that; the JPEG 2000 service, which stands up a JPEG 2000 server as well as a viewer for serving and viewing images that exist in your DuraCloud account; and likewise the image conversion service, which can convert from any number of image formats to any number of other image formats. Under the covers, the image conversion service is just using ImageMagick. So let me kick off, say, the JPEG 2000 service. There's a little bit of a workflow here; it's fairly simple at this point. You can specify which server you want to deploy it on, but since there's only a single server here, that makes the decision somewhat trivial.
And then this is basically just saying, OK, let's make it happen. You may notice that it takes a little bit of time. The reason is that, the way the framework is set up, the services are bundled as OSGi bundles; for those that are into OSGi, these are separate JAR files that can be deployed into an OSGi container. What's happening is that these JAR files reside within DuraStore: we have a service repository that hosts a variety of services, and when you want to deploy one, you click Deploy, it pulls the service bundles from the repository, streams them over to whichever instance you want to deploy to, in this case the primary instance, and then deploys them into an OSGi container resident on that instance. As it happens, the JPEG 2000 service takes a little longer even than that, because once everything is installed in the OSGi container, it starts up a new application server, in this case Tomcat, and then deploys the JPEG 2000 service into that new instance of Tomcat. Once that's all happened, it returns and says you're good to go. I'll also kick off the replication service, and then we can go back to the content. Same decision process, quite easy. With the replication service, and I should mention that we've had a couple of different internal releases of DuraCloud, this one is 0.2, so it's quite young, embryonic really. We'll have 0.3 towards the end of the month, and we have a series of other internal releases mapped out with their associated functionality. But here, with the replication service, you can specify which store you want to be the source and which one you want to be the destination. In this case, we'll say, OK, everything that comes into Amazon, go ahead and push that to Rackspace, please.
This one takes a little less time, just because there's no application server that has to come up and that sort of thing. But you see a listing of the services that are installed. There are two here that I didn't mention, the ImageMagick service and the Web App Utility service. From a user perspective, they don't matter all that much; they're infrastructure required by the other services, as you may expect, the JPEG 2000 and image conversion services. There are also some properties associated with a service, and you can look at those properties by clicking on View here. But it's just a REST call, so you can get that information from the command line as well. Really, one of the reasons that drove us to choose OSGi as our framework for hosting services was the ability, at runtime, to reconfigure a service without having to stop and restart any applications; OSGi provides a very dynamic interaction. Let me cancel out of this. All right, let's go back over here to Spaces and maybe add some content. You can pull any sort of file from your local system. I was hoping we would find a pleasant image, but maybe we'll just grab something; hopefully it's non-offensive. You can specify the content ID; if you don't specify anything here, it just takes the file name. Or, excuse me, Sandy? Sandy? Oh, just this one. I'll just leave the extension off here. If it has the extension, it obviously tries to determine the MIME type based on that. I'll actually let it do its own thing in terms of the content ID and figuring out the MIME type there, which it did. And a thumbnail is created from that image. I imagine this one, yeah, it's actually quite small. If you click on it, it launches the JPEG 2000 server and viewer, as it happens.
Since it's such a small image, you don't really get the benefits of JPEG 2000, which deals with tiling and different resolutions, so this particular example is less dazzling. But a couple of things you might be interested in here are the checksum, we retain the checksum of the file, which is later used by a bit-integrity check service, and then obviously the MIME type and the file size. Just as at the space level, you can also associate metadata and tags with digital objects. All right, let's take a look at the Rackspace side. You see now, in the top right-hand corner, that we're looking at content stored within Rackspace. And you notice there's this space that was created because we turned on the replication service: I created the space over on the Amazon side, and that space was replicated over on the Rackspace side. Likewise, the image we pushed into Amazon also got replicated over here on Rackspace. So I'll move along. Additionally, time permitting, we could do an image conversion and change the PNG to a JPEG or whatever, but we'll leave that for now and talk in a little more detail about the architecture. On the storage side, we have a REST API with what we like to think is a logical naming convention, I'll show you in a second, of space and then content ID for creating, updating, and deleting objects, leveraging the HTTP verbs: PUT, POST, DELETE, HEAD, and GET. Those calls are translated in the storage mediation layer into an internal interface we call the storage provider interface. So those calls get turned into calls into this interface. In this case, we have Amazon, EMC, and Rackspace plugged in.
In order to plug in additional storage providers, all that needs to be done is to create another adapter that can translate between the API of that storage provider and the storage provider interface. Then you can plug it right in, and it works right along with the rest of the system. On the services side, we similarly have a REST API, which gets translated into calls into the service manager. Like I mentioned, the service manager is responsible for the listing of available services and deployed services. (In the diagram, the arrow going off to the storage should actually be pointing up at the REST API.) It interacts with DuraStore through the REST API in order to pull the services over to itself, or to the instance that is going to host the service, and those services are deployed into the OSGi container on the appropriate server. In a similar way that the storage providers all implement a common storage provider interface, all the services implement a common compute service interface, which allows the installation, statusing, and update of services in a generic way. In terms of services, we have three different flavors. First, there's pure Java. That applies to the replication service: it's just Java code that deploys within OSGi and knows how to interact with DuraStore. When a message arrives that a piece of content has landed in whatever it's configured to listen to, in this case Amazon, it knows how to talk to DuraStore and say, OK, go ahead and make a copy in Rackspace. So that's pure Java code; it runs in the OSGi container and does everything it needs to do, self-contained in that way. Another example of a slightly different type of service is the ImageMagick service.
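The adapter pattern described here, one common interface with one adapter per back end, can be illustrated like this. DuraCloud's actual storage provider interface is Java and has many more operations; the method names and the in-memory back end below are illustrative only:

```python
from abc import ABC, abstractmethod

class StorageProvider(ABC):
    """Stand-in for the common storage provider interface that the
    storage mediation layer calls into (illustrative, not the real
    Java API)."""
    @abstractmethod
    def add_content(self, space_id, content_id, data): ...
    @abstractmethod
    def get_content(self, space_id, content_id): ...

class InMemoryProvider(StorageProvider):
    """A toy adapter. Plugging in a new back end (S3, Cloud Files,
    Atmos, an institutional store) only means mapping these calls
    onto that provider's own API."""
    def __init__(self):
        self.spaces = {}
    def add_content(self, space_id, content_id, data):
        self.spaces.setdefault(space_id, {})[content_id] = data
    def get_content(self, space_id, content_id):
        return self.spaces[space_id][content_id]
```

The mediation layer only ever sees `StorageProvider`, which is why a new provider drops in without touching the rest of the system.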
Basically, it's a thin wrapper that has a representative in the OSGi container in order to implement the basic functionality of deploying, undeploying, and reconfiguring, but ultimately the service itself runs as a local system utility, like ImageMagick; it gets installed on the system in a command-line fashion under the covers, and the wrapper just manages and facilitates that interaction. So we have the ability to handle system-level services as well as these pure Java services. And then additionally, as I talked about a bit before, the JPEG 2000 service is actually its own standalone external web application; web apps that provide some sort of web-based service are also something we're able to handle within this framework. This slide I'll just gloss through briefly; it shows examples of the syntax of our REST APIs: a handful of examples for getting a listing of the contents within your space, getting a particular content item, and deploying a service. Your primary instance has a static IP associated with it, so obviously you can map any sort of domain name to that. For the example of getting the listing of contents in your space, there's the context of durastore, and then the space ID. In our example, instead of "images" there, it would be April 2010. Just doing a GET on that URL provides, in XML format, a listing of the contents within that space. In terms of getting an actual content item, the second example, it's the same durastore context; instead of "images" it would be April 2010, and instead of the Rome JPEG it would be the one we used, whatever the logo was. This is here just to demonstrate the simplicity of interacting with the storage and services API that DuraCloud provides. So that's your primary instance.
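The URL shapes on that slide can be captured as a couple of helpers. These mirror the `/durastore/<space-id>/<content-id>` layout described above; the host name is a placeholder, and the exact paths on a real instance may differ by release:

```python
def space_listing_url(host, space_id):
    """URL whose GET returns an XML listing of the space's contents,
    per the REST examples in the talk."""
    return f"https://{host}/durastore/{space_id}"

def content_url(host, space_id, content_id):
    """URL for a single content item: GET retrieves it, PUT
    creates/updates it, DELETE removes it, HEAD fetches metadata."""
    return f"https://{host}/durastore/{space_id}/{content_id}"
```

With a domain mapped to the instance's static IP, these URLs are all a script needs to drive DuraStore with wget or curl, no DuraAdmin involved.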
There are times, as I've talked about a little, where it's interesting to break up a job and run it in a sort of parallel-processing scenario, over multiple instances. So you can also spin up multiple managed instances; the little purple box is there to indicate that each has its own OSGi container, which hosts these deployable services. So you can break up a job and distribute it over managed instances.

Likewise, at the bottom, there is this notion of a preconfigured service. That would be a virtual machine image that you have, or that you've created at your local institution, implementing a few basic management APIs, something akin to the compute service interface. We can deploy it, undeploy it, and do the reconfiguration required to launch a brand new cloud server instance, but what actually happens inside that instance is completely up to whatever you've baked into the machine image.

It's sometimes of interest how security is handled, so I'll speak a little to that point. We have a couple of different layers in DuraCloud. At the very bottom layer, each of the underlying storage providers has its own notion of security. What we do in DuraCloud is lock it down from the storage provider's perspective, to the point that you have to be the owner of the content in order to interact with it or do any sort of updates or additions. On top of that, we have the DuraCloud application, and application-level security provides the ability to log in or pass in credentials, which then give you the authorization to interact with the underlying content. And on top of the application security, we have channel security: basically the encryption of the appropriate calls, so that your passwords aren't being pulled off as they go across the wire.
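The "few basic management APIs" that both OSGi-deployed services and preconfigured machine images implement could look something like the following. This is a hedged sketch: the method names and status strings are assumptions modeled on the deploy/undeploy/reconfigure/status behavior described above, not DuraCloud's actual compute service interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical common compute service interface: implementing these four
// operations is what lets the service manager treat an OSGi bundle, a
// wrapped system utility, or a whole preconfigured VM image generically.
interface ComputeService {
    void deploy();
    void undeploy();
    void reconfigure(Map<String, String> config);
    String getStatus();   // e.g. "STOPPED" or "RUNNING"
}

// Toy implementation standing in for, say, a preconfigured machine image;
// what actually happens inside deploy() is up to whatever was baked in.
class ToyService implements ComputeService {
    private String status = "STOPPED";
    private final Map<String, String> config = new HashMap<>();

    public void deploy() { status = "RUNNING"; }
    public void undeploy() { status = "STOPPED"; }
    public void reconfigure(Map<String, String> cfg) { config.putAll(cfg); }
    public String getStatus() { return status; }
}
```

The service manager only ever sees the interface, so statusing and updates work the same way across all three service flavors.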
And that's just having Apache sitting in front of the Tomcat that your DuraCloud, DuraStore, and DuraService web applications run on. And then, on top of that, your whole instance is firewalled off so that only the common HTTP port 80 is actually open at all.

All right, so moving along, on the home stretch here, we have a look at the horizon. Some of the things coming up are, like I mentioned before, a series of internal dot releases that we'll push through the summer and into the fall. There are lots of things included there, but just to pull out a couple of highlights: the integrations with repositories through the sync utility, and obviously expanding the scope of support for underlying storage providers as well as compute providers. Right now, for storage we have the three: Amazon, EMC, and Rackspace. For compute, we have Amazon. Rackspace just came out, about two weeks ago, with the ability to persist custom machine images, which didn't exist before; that opens the door to using them in a compute capacity in a way that makes sense for DuraCloud. And we're in continual, at this point weekly, conversations with EMC, working through the issues of having their compute service work in a way that works for us and works for them. So: expanding that support, and certainly beefing up and expanding the service selection.

Here are some examples of services that are on the horizon. When you're talking about terabytes of data or more, obviously you want to be able to find things, so we'll have some sort of indexing and search capability, as well as more robust bit integrity services. We have several tiers of those. You can provide, say, a manifest listing your digital objects and the MD5s associated with them, push it into DuraCloud, and it can tell you whether they match up.
And whether they match up across your storage providers. At a higher level of trust or assurance is the ability to regenerate the MD5s: potentially the user could provide a nonce, or salt, have the content re-read and the MD5 regenerated with that salt, and then see whether it matches what the owner of the content was expecting. So: beefing up the bit integrity assurance, and putting more functionality around replication, so you can replicate based on different rules, by MIME type or by space or whatever, as opposed to simply "everything that comes into Amazon, I want to push to Rackspace." And then, like I mentioned before, the video streaming and auditing services.

We're actually open sourcing the baseline for OR10, so early this summer, and we're bringing in another set of pilot partners in the fall. A subset of that group will be working with us in a similar way to the existing pilot partners, BHL and NYPL and WGBH, but more in a development capacity, particularly since the code will be open sourced; we'll be working with each other and having more eyes on the source code. So the subset that we pull over will be helping with actually expanding the baseline itself. And then the first public release, or at least a beta version of it, will be coming at the first of the year.

And with that, I'll say that if you'd like any more information, there are the websites at duraspace.org and duracloud.org. And thank you for your time.
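As a footnote to the salted-MD5 integrity tier described above, here is a minimal Java sketch. The salting scheme, digesting salt followed by content, is an assumption about how such a re-check might work, not DuraCloud's actual implementation.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of a salted integrity check: the owner supplies a nonce/salt, the
// service re-reads the content and digests salt + content, and the result
// is compared against the checksum the owner computed locally. The salt
// forces a fresh read, so a cached or stale digest cannot pass the check.
class SaltedChecksum {
    static String md5Hex(byte[] salt, byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(salt);
            md.update(content);
            // Left-pad to 32 hex chars in case of leading zero bytes.
            return String.format("%032x", new BigInteger(1, md.digest()));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available
        }
    }

    static boolean matches(String expectedHex, byte[] salt, byte[] content) {
        return expectedHex.equalsIgnoreCase(md5Hex(salt, content));
    }
}
```

A mismatch here signals that the stored bytes no longer agree with what the owner expected, which is exactly the assurance the higher integrity tier is meant to provide.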