So, welcome this early in the morning to the presentation about the Memento project. I started Memento with Michael Nelson, to your left, around April of this year, and the project is partly funded by the Library of Congress. As the months went by, other people on our teams started to work along with us, both in creating proof-of-concept demonstrators to indicate that what we were proposing with Memento is indeed viable, is truly doable, but also at the level of refining the concepts that Michael and I had been developing. Rob Sanderson, whom some of you heard yesterday talking about the Open Annotation project, was very instrumental in getting prototype work done and helping us think through all the issues involved. And then two other people on my team helped, and at Old Dominion we also had Scott Ainsworth helping us out with technology and concepts. Memento is all about the past, the past of the web. And Michael wanted me to emphasize to you that looking at the past is not only important for reasons of historical analysis; it can also be fun. So what we have here is a page from the Internet Archive: the CNN homepage around the time of the Dick Cheney shooting incident, sometime in February 2006. This is cnn.com, and it says, Cheney prays for hunt victim. Then we stay in the Internet Archive but go to the Fox News page, and here suddenly it's the press that attacks Cheney, and you start wondering who the victim in this entire incident really was. So it's fun, and it's interesting, to be able to look at the past. And with the Memento project, we want to actually make it easy. What we're going to do over the next 50 or so minutes is, first of all, I'll give you a recap of the essence of the architecture of the World Wide Web, since this is all about the web and how to introduce a time dimension into the web, and we have to know about its fundamentals before we start tinkering with them.
Then I'll give you a rather extensive problem statement, after which, of course, we'll introduce the solution to the problem, otherwise we wouldn't be here. Then we'll show you a video recording, a screencast, showing the results of a prototype that we built over the last two or so months. Then we will highlight some consequences of the framework that we are proposing. And then we'll end with basically indicating what is happening, or what might happen, with all of this regarding acceptance of the framework that we're proposing. So let's get started with a recap of the basics. We're really at the foundation of the web here, the architecture of the World Wide Web, and that architecture introduces this thing called a resource, which is an item of interest. A resource is identified by means of a URI, and for the purpose of this presentation, we're going to pretend that only HTTP URIs are what we're talking about. We then have the concept that you can dereference an HTTP URI, meaning you can click on it, for example, and the result of that is a representation of the resource that is delivered to your client. It's very simple; that's what makes the web tick. There's a little twist to this, because there's this notion called content negotiation. What happens there is that a client, like your browser, dereferences an HTTP URI while indicating a preference for, let's say, a certain format or a certain language. That is then taken into account by the server that is serving that resource, and depending on the preferences expressed by the client, you may end up with different representations of the same resource. I emphasize this, and we'll come back to it, because it is actually at the basis of the solution that we're introducing with Memento. So, the problem statement. This is supposed to depict the web.
Those little nodes are resources, and the little letters you see in them are their URIs, HTTP URIs for the purpose of this presentation. The little arrows indicate that resources on the web are interlinked, okay? So, as I just indicated, resources on the web have representations, and those are the red and blue little documents that hang off there. As a matter of fact, it's a little bit more complicated than that, because representations of a resource change over time. When you go today to cnn.com, you get a certain page. Tomorrow, that page is entirely gone and a new one has shown up in its place. So the whole point they make in the web architecture is: you have a resource and a representation. Yes, but those things change over time. The way the web architecture is defined, the way HTTP works, is that at any moment in time, at the current moment in time, you can only get to the current representation of a resource. You type cnn.com in your browser, and what you get back is the page as it exists now. The old representations, the one that was available yesterday, the day before, last year, and the year before that, are gone forever. You can't really get to them. That's a little white lie, because there are actually pockets on the web of old representations, of archived versions of resources, and I'll give you examples of that. Here are two. At the left-hand side, you see an archived version, as it exists in the Internet Archive, of the cnn.com home page at a certain point in time, September 11, 2001. We need to note here that this page is available at a URI that is different from the cnn.com URI. As you can tell there, it has a URI somewhere in the Internet Archive. At the right-hand side is an example of a different kind of archived resource: a page from Wikipedia. Wikipedia is a content management system, and content management systems on the web care about old versions. They maintain them themselves.
So what you see here is an archived version, a history version, of the September 11 page in Wikipedia. That is again available at a URI that is different from the URI of the current version of the September 11 page in Wikipedia. So the point is, these pockets of archived versions do exist, and when they exist, they all appear at URIs that are different from the URIs of their originals. Okay, that's the essence. Now, the good news is that these archived things exist. The less good news is that it's not all that easy to get to them. So how do you get to that old CNN page? Well, it starts with having to know, and people in this audience, I'm sure, do know, that something like the Internet Archive exists. Believe me, the general public is way less aware of that. So let's presume we do know the Internet Archive exists. You go to the Wayback Machine, you enter the cnn.com URI, that's what you see at the left-hand side there, and you submit that query. It's really a search action that you're performing here. The result is the page at the right-hand side, which basically gives us an extensive list of all the versions that the Internet Archive has saved of the cnn.com home page. It's now up to you to start scrolling and find the exact day that you were looking for, for example, September 11. It's a similar, yet slightly different, story with Wikipedia. With Wikipedia, the way to get to an archived resource is to actually go to its current version. So you use the URI of the current version of the page, and there you use the history tab, which you can't really see, but it's at the top right there. You click the history tab, and you see the thing at the right-hand side, which is basically a history of all the versions that have ever existed for this Wikipedia page. Depending on how controversial or how popular the topic is, there will be anywhere between ten and tens of thousands of versions of a certain page.
It's up to you now, as a user, to start finding the date that you're looking for. That's actually just one part of the problem: how do you get to one of those archived pages? There's a second part to the problem. Once you're there and you start navigating, what happens? Well, we stay for a moment in Wikipedia. At the left-hand side, again, we see an archived version of a page. There is a link in there to the Pentagon, and the link is actually to a description of the Pentagon, also in Wikipedia. Now, what happens when you click that is that you end up at the current page for the Pentagon. That might be what you want, but it also might not be. If you are really thinking about going back in time and looking at the past, then you would rather have the Pentagon page as it existed at exactly the moment in time that the left-hand page existed. You can't really do that. You have to start your whole history navigation all over again on the right-hand side. It's a different story for the Internet Archive. Here we are again with the cnn.com homepage, and there's a link in there to a page in CNN that is about space, news about space. When you click that, you actually are being pointed back into the Internet Archive. The way the Internet Archive achieves that is by rewriting the URIs that are in that cnn.com homepage. It's obvious why they do that. They're a real archive, and if they did not rewrite those URIs, you would end up at the current representation of the space page, which in the context of an archive is definitely not what you want. So they rewrite the URIs, and you're really time traveling now, and you end up at the right-hand side page, which is indeed the space page at the moment in time that we started off from, September 11, 2001. But also here, there are problems, and the problem is those empty slots you can see. There's a limit to how much an archive can crawl. It has to stop somewhere.
And as we can tell, it stopped somewhere, and so that stuff is not available to whomever is consulting the archive. Note that the stuff that is missing may very well still exist on the web. It may be in another archive, or it may actually still be on the original site, okay? So it's not necessarily gone. It might still be there, but we are not getting that complete experience in the way things are architected now. So this is really the problem statement at large. To summarize it, we can say that the current web and the past web are not tightly integrated, and there are two instances of the problem. One is that you can't easily get to the past from the present. As I've shown you, that becomes a search or navigation exercise. And you can't navigate the past in a consistent way either. So those are two problems, and those are the two that we're trying to address in Memento. Here is what we try to do in Memento, and we'll explain how we want to do it. We're going to say: well, if you're interested in an archived version of the Wikipedia page, just talk to that Wikipedia page. Use its URI, but qualify your request for that page with the datetime, as you see there. Submit that request, and as a result, you want to immediately receive the history page from Wikipedia, the page as it existed at the moment in time that you requested. You want to continue your navigation in that way. There's a link there to the Robots Exclusion Protocol. It's a link inside of Wikipedia, so it's a Wikipedia URI. Again, we're going to do the same thing. We're going to talk to that resource with the Wikipedia URI of the current version, but we're going to qualify our request with the datetime. And as a result, we're going to get back the page for the Robots Exclusion Protocol as it existed at that moment in history. And just one more navigation here.
There's a link to an external site here in this Wikipedia document, to the Robots Exclusion website, as a matter of fact. And again, we're going to qualify that with time. So we're going to go to robotstxt.org, but we're going to qualify our HTTP request with the datetime. In this case, what we get back is a page from the Internet Archive, and the most recent one that we can get there seems to be from November 9, 2007. All of this can work rather smoothly, as I will show you, but you will of course understand that it's all subject to the availability of archived resources. If they're not there, Memento isn't going to do any wonders. However, if they're there, and if the Memento framework is implemented, you're going to get to those old pages seamlessly. All right, here's how we do this. This is where it starts to get a little bit more technical. There are really two components to the solution, the first one clearly being more important and more fundamental than the second, so I'll spend more time on the first than on the second. The first is: in order to get to an archived resource, we're going to talk to the URI of the original resource, and then we're going to use a notion called content negotiation, which I'll explain, to get to archived versions. That's the first component of the solution. The second part is a discovery API. Basically, we are proposing an API for archives of web resources that allows us to ask, for any given resource, so for a URI: what are the archived versions that you have, and at what datetime were they archived? There is a little more to it than that, but that's basically the essence. So, let's talk in some detail about the first component. Just to make it totally clear what we're trying to do, here's a slide of the current situation. At the left-hand side, we see the original resource, and it has its own URI, URI-R. If you want to get the representation of its current state, you dereference URI-R.
At the right-hand side, we have the archived resources. If you want to see them, you have to actually dereference another URI. So the good news is that these archived resources exist. The bad news is that in order to get to them, you have to use a different URI than the one by which they had been known throughout their lifetime. This is actually really weird if you come to think about it, okay? So what we're saying here is: well, no, let's change this equation, and let's get to the archived resources via the original resource. As a matter of fact, I was reminded by David Rosenthal, who's also in the audience, that they pioneered this kind of approach in the context of the LOCKSS project. Here, what we do is talk to the original resource, but we qualify our request with the datetime. And then beyond that, some magic is going to happen, which I'll explain, that will seamlessly bring our browser to an archived version. To understand how exactly we do this, I need to give you a brief lesson in the notion of transparent content negotiation for HTTP. It won't be hard, I promise. What we have here is your most basic HTTP interaction: at the left-hand side the client, at the right-hand side the server. The little node there is a resource with HTTP URI A, and it turns out that this resource is an HTML document in English. So it's one of those good old-fashioned resources on the web. Our client says: GET A, under HTTP 1.1. So an HTTP GET request, and the server comes back: yep, 200 OK, here it is. And as a matter of fact, it's text/html and it's English, and the location is A. This happens all the time. This is your most basic thing. Now here's transparent content negotiation. What happens here is the following. Rather than only having resource A, which is the resource in HTML in English, there's another resource, B, which represents the same information in PDF in English.
And a third resource, C, that represents it in PDF and French. What happens with content negotiation is that there's this new resource, T, that is being introduced, which is kind of a gateway resource. It's a resource that you can talk to, to get to any of A, B, and C. So you do your talking not to A, B, and C directly; you do your talking to T. And this is where the client starts to express preferences and starts to negotiate. What you see here is the client saying: give me resource T, but it qualifies the request by saying: I have a preference for HTML rather than PDF, and I have a preference for English rather than French, and absolutely rather than German, okay? The server gets this request, looks at the client's preferences, and then says: well, you know what, I understood you. That's the Vary header there. I understand that you're negotiating with me in the dimensions of format, the media type, and language. And, TCN: choice, I made a choice for you, because I can deliver exactly what you wanted. You wanted English HTML, I have it, and it's actually available at location A. So that comes back to the client, and now the client does another request, just like the basic HTTP request. It says: give me A, and the server responds as usual. That's transparent content negotiation in brief. This wasn't too hard, right? There are also scenarios that are a bit more complicated in content negotiation. For example, when a client has a preference that the server cannot honor. In the situation we just depicted, let's presume the client is asking for a German PDF, but it's not there, and it's not going to emerge out of thin air. In that case, the server is going to say: I understood you, I understand what you're trying to get from me, but I do not have it. And so it says: 406 Not Acceptable, but I can point you at alternatives. I have A, B, and C for you, okay?
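The negotiation just described can be sketched in a few lines of Python. This is a simplification of what an RFC 2295 server does, not a full implementation: the resource names A, B, and C mirror the example from the talk, and the q-value handling is reduced to a simple product of the two dimensions.

```python
# A minimal sketch of transparent content negotiation (RFC 2295 style).
# The gateway resource T holds a list of variants, each at its own URI;
# the server picks the variant that best matches the client's q-values.

# Variants of the negotiable resource T, as in the talk's example.
VARIANTS = [
    {"uri": "A", "type": "text/html",       "lang": "en"},
    {"uri": "B", "type": "application/pdf", "lang": "en"},
    {"uri": "C", "type": "application/pdf", "lang": "fr"},
]

def negotiate(accept, accept_language):
    """Return (status, payload) for the client's preferences.

    `accept` and `accept_language` map values to q-values, e.g.
    {"text/html": 1.0, "application/pdf": 0.5}.
    """
    best, best_q = None, 0.0
    for v in VARIANTS:
        q = accept.get(v["type"], 0.0) * accept_language.get(v["lang"], 0.0)
        if q > best_q:
            best, best_q = v, q
    if best is None:
        # 406 Not Acceptable: point the client at the alternatives.
        return 406, VARIANTS
    # 200 with a TCN choice: the server picked a variant for the client.
    return 200, best

status, choice = negotiate(
    accept={"text/html": 1.0, "application/pdf": 0.5},
    accept_language={"en": 1.0, "fr": 0.7},
)
# English HTML is preferred, so the choice is variant A.
```

Asking for a German PDF with the same table yields the 406 case, with all three variants returned as alternatives.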
Right, we are now ready to finally talk in a bit more detail about Memento. First, a little terminology. I'm going to use the term original resource to talk about, let's say, cnn.com, the URI cnn.com, as it exists today. And I'm going to use the term memento to refer to an archived version of it somewhere on the web, okay? So: original resource, memento, okay? With the project, we are proposing to extend the notion of content negotiation beyond what it has been used for so far. RFC 2295, which introduces this notion of content negotiation that I've just shown you, defines four dimensions: media type and language, both of which I've shown you, plus compression and character set. That exists. It's in an RFC that is ten years old, and it's used all over the place. We are basically saying: we're going to add a fifth dimension to this. This is starting to become Star Trek here, right? A fifth dimension, and that dimension is time. And so we're introducing a new header, X-Accept-Datetime. The X is best practice for an experimental header; that's why it's there. Basically, it allows a client to express a preference for a datetime in an HTTP request header. If you have listened carefully, you will understand that as we build up our solution, we'll need that kind of gateway resource somewhere. Remember that blue T that was a gateway to all the other ones? Technically it's called a transparently negotiable resource. We'll need to put that somewhere in the picture in order to be able to do this content negotiation in time. And how we do that, I will show you by means of two distinct classes of servers. The first is servers that have internal archival capabilities. Those are servers like Wikipedia, content management systems on the web, and actually version control systems for software, that know exactly where their archived versions are and at what datetimes those versions were available.
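A request carrying the new header can be illustrated in a few lines of Python. The target page is just the Wikipedia example used later in the talk; like every HTTP date header, the value uses the RFC 1123 date format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# The datetime the client wants to travel to (11 October 2009, as in
# the demo later in the talk), expressed as an HTTP date.
target = datetime(2009, 10, 11, 12, 0, 0, tzinfo=timezone.utc)

headers = {
    "Host": "en.wikipedia.org",
    # X- is the conventional prefix for an experimental header.
    "X-Accept-Datetime": format_datetime(target, usegmt=True),
}
# headers["X-Accept-Datetime"] is "Sun, 11 Oct 2009 12:00:00 GMT"
```

A server that does not understand the header simply ignores it and serves the current representation, which is what makes the extension backwards compatible.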
They have all that knowledge internally. That means they have all the knowledge internally, in those systems, in their databases, to be able to respond to a request for a resource qualified with a datetime. It's just there; they don't have to go look elsewhere. So here's the example. Left-hand side, the current page of Wikipedia for the 9/11 event. Right-hand side, old versions of it, each at its own distinct URI. Using our own terminology: at the left-hand side we have the original resource, at the right-hand side the mementos. In this case, the approach is rather trivial, because here you can use the original resource itself to negotiate in time, since all the information to do so is available locally in this content management system. So you can basically directly talk to the URI-R up there, qualify your request with the datetime header, and then just let the system poke in its versioning database and redirect you to the appropriate archived version. I need to introduce another piece of terminology here. We're going to use the term time gate, again Star Trek, to refer to a resource that supports this content negotiation in time. It's basically a resource that understands this X-Accept-Datetime header, okay? So for this class of servers, these content management systems, the original resource up there and the time gate coincide; they're the same thing, okay? There's no need for a distinction between the two. So we've solved this one, as a matter of fact. Here's the depiction; I won't go into detail. Just follow the one, two, three, four of what happens there. In one, the client says: give me the resource, the Wikipedia page, right? But don't give me the current version. With the X-Accept-Datetime, I indicate that I want an old version, from somewhere in October 2009. Just as was the case with those other transparent content negotiation examples I gave you for media type and language.
We're now going to have a server that says: I understood you, and I can honor your request 100%. The memento that you're looking for is at URI M2. The client says: good, I'm in business, I know what to do now. It gets the URI of that memento, and it receives the old resource back. Done. Again, there are obviously more complicated scenarios. For example, what happens if a client requests a resource outside of the datetime range for which an archive has archived resources? That's all dealt with; it's on our website. We don't have time to go into those details. That was one class of systems: the content management type systems that have internal archival capability. Now we're going to the other systems, basically the bulk of the web servers out there, that rely on third parties to come and archive them. For example, they rely on the Internet Archive to crawl them and basically suck up their stuff and push it into an archive. That's the bulk of the systems. These systems do not have the information to respond, by themselves, to a request for datetime content negotiation, because they have no clue where the archived copies of their own resources are out there on the web, nor for which datetimes they were archived, because the archiving happens independently of them. Still, those systems can play a constructive role in the framework, because although they may not know the exact URIs of the mementos, they may know in which environment those are available, for example, in the Internet Archive. So here's what we do in that case. Left-hand side, there's a regular kind of web server. Right-hand side, there are archived versions of the server's pages. The line in between them indicates that those are really different systems that hardly know about each other's existence. So how are we going to address this? This is kind of weird now, right? Well, let's focus on the right-hand side. First of all, we know the left is the original resource, the right is the mementos.
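Going back for a moment to the content-management-system case: the time gate's job there is simple enough to sketch, because the selection logic is just a lookup in the versioning database. The version table below is invented for illustration; a real CMS would query its own history store.

```python
from datetime import datetime, timezone

# Sketch of a time gate for a CMS with internal version history: given
# the requested datetime, pick the version that was current then.
# (URIs M1..M3 and their creation times are invented for illustration.)
versions = [
    ("M1", datetime(2009, 9, 20, tzinfo=timezone.utc)),
    ("M2", datetime(2009, 10, 5, tzinfo=timezone.utc)),
    ("M3", datetime(2009, 11, 2, tzinfo=timezone.utc)),
]

def timegate(requested):
    """Return the URI of the memento active at `requested`.

    The applicable memento is the latest version created at or before
    the requested datetime; requests before the first version fall back
    to the first one (one of the edge cases the talk defers).
    """
    active = versions[0][0]
    for uri, created in versions:
        if created <= requested:
            active = uri
        else:
            break
    return active

# A request for mid-October 2009 resolves to M2, created October 5.
memento = timegate(datetime(2009, 10, 14, tzinfo=timezone.utc))
```

In the real exchange, the server would answer with a redirect to that URI rather than return it directly, which is exactly the "I understood you, the memento is at URI M2" step described above.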
If you focus on the right-hand side, you'll see that that right-hand side is a system with internal archival capabilities. It knows about its archived things because it is an archive. So here we can play the game that we played with the first class of systems. We say: we're introducing a time gate here. We have all those mementos, each with their own URIs. We're going to put the gateway resource, the time gate, in front of them, and we're going to negotiate with that one in time. As a result, I will get the appropriate one, the one that was operational at that moment in time. We still have another problem: how are we getting from R to G, from the original resource to the time gate? Well, we're solving that by means of a basic HTTP redirect, and the redirect is based on merely detecting the X-Accept-Datetime header. It's a binary thing. If the header is there, redirect to a time gate at some external archive. If it's not there, perform as usual. There's nothing more, nothing less going on here. Implementing this for an Apache server, for example, takes exactly three lines added to the Apache config file. That's it. There's no magic going on here, really. So we kind of have a solution for these types of systems also. But there are still two problems lurking here for this class of system. First of all, which archive are you going to redirect to? There are all these archives out there. Which one are you going to pick? Well, it's not really a technical decision; it's much more a policy or knowledge kind of decision to be made here. If you have, like we had in our experiment, a transactional archive that keeps a very fine-granularity record of whatever was served by your original server, then you're going to redirect to that one. If you're in Finland, then probably you're going to redirect to the Finnish archive; if you're in Denmark, to the Danish archive; Canada to the Canadian one; and all the rest of the world just points at the Internet Archive.
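About those three lines of Apache configuration mentioned a moment ago: the exact lines from the experiment aren't shown in the talk, but with mod_rewrite a rule of roughly this shape would do the job. The archive host and the timegate path here are purely illustrative.

```apache
# Sketch only: if the request carries an X-Accept-Datetime header,
# redirect it to a time gate at an external archive; otherwise the
# server performs as usual and serves the current representation.
RewriteEngine on
RewriteCond %{HTTP:X-Accept-Datetime} .
RewriteRule ^(.*)$ http://archive.example.org/timegate/http://%{HTTP_HOST}/$1 [R=302,L]
```

Note that the condition only tests whether the header is present at all, matching the "binary thing" described above; no datetime parsing happens on the original server.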
There's always something to point at. It can get more subtle than this, obviously. It could very well be that your videos are archived in system X, your images in system Y, and your HTML in system Z. That's just a little test that has to happen, and you redirect to the appropriate environment. So I don't think this is a big deal. The second problem is: okay, we are saying we're going to redirect to an archive. That's fine, but we still need the URI of a time gate in that archive. We can't just redirect to the Internet Archive URI; we need some specific spot in the Internet Archive. And that is something we propose to solve by means of a syntax convention: to express the URI of a time gate as a function of the URI of the original. You see the example there. This is not far-fetched, because already today, systems that are based on the Wayback/Heritrix solution for web archiving use URIs like this. The big index page that I showed you earlier on, the one with all those links to past versions of the cnn.com page, has a URI that looks very similar to that; it has a star where the time gate part would be. So this is not impossible to achieve. Anyhow, we kind of have this one under control now as well. There's just this one extra step compared to the previous class of systems, whereby we talk to the original resource. The original resource is not the time gate itself, because it doesn't know about its archived copies, so it redirects the client to a time gate. And now we're in the same situation as with the previous class of systems. The client talks to the time gate with the qualification for datetime, and the time gate says: I understood you, and there is the resource that you're looking for. So we've covered both these cases now. And again, there are special cases that we don't have time to mention. Component two is basically an API that we propose web archives should implement.
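The syntax convention itself amounts to a couple of lines: the time gate URI is computed from the original URI, nothing more. The `timegate` path segment and the archive host below are assumptions for illustration, not a fixed part of the proposal.

```python
# The convention: express a time gate URI as a function of the original
# URI, much like Wayback-style index URIs already embed the original
# URI (with a star where the time gate segment would go).

def timegate_uri(archive_base, original_uri):
    """Build the time gate URI for `original_uri` at a given archive."""
    return f"{archive_base}/timegate/{original_uri}"

uri = timegate_uri("http://web.archive.org", "http://www.cnn.com/")
# "http://web.archive.org/timegate/http://www.cnn.com/"
```

The point of the convention is that the original server can construct this redirect target knowing only which archive to point at, without any per-resource lookup.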
We call the API with the URI of an original resource, the URI-R, and the response is basically a list of URIs of mementos, of archived resources, that the archive has for that particular URI-R, plus additional metadata: at which point in time was this memento archived? What's its media type? What's its language? Maybe, what's its digest? You can add all kinds of metadata; we don't care about those details here. The point is that if you have multiple web archives that support this API, you can build an aggregator of that stuff. That aggregator will now have a much better understanding of all the mementos that are available across the web, not just in one archive, but across archives. And hence, when a client comes in to that aggregator, the aggregator will be able to respond with a memento, or redirect to a memento, that is much closer in time to what the client requested, compared to when the client would have talked to one particular archive, because now the aggregator has a perspective on all these mementos around the world. I'm skipping ahead, because what we really want to do here is also show you a little demo. As I indicated, we've built a rather elaborate experiment, which is depicted here. It consists of a variety of systems that, in one way or another, implement the Memento solution. What I mean by that is: for some systems that were under our own control, we implemented it ourselves; we natively implemented Memento on those systems. For other systems that were not under our control, we basically had to do the implementation by proxy, meaning that we pretended they were supporting Memento. What you see at the left-hand side there, at LANL and ODU, are two servers at our end that archived their own materials in transactional archives as they were serving materials to clients. They completely played the Memento game the way it should be played.
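The aggregator idea described above can be sketched briefly: merge the per-archive listings that the discovery API would return, and pick the memento closest in time to the request. The per-archive listings and hosts below are invented for illustration; a real aggregator would fetch them via the proposed API.

```python
from datetime import datetime, timezone

# Invented discovery-API responses from two archives: each entry is a
# (memento URI, archival datetime) pair for the same URI-R.
archive_a = [("http://a.example/mem/1",
              datetime(2001, 9, 11, 20, 0, tzinfo=timezone.utc))]
archive_b = [("http://b.example/mem/7",
              datetime(2001, 9, 11, 13, 30, tzinfo=timezone.utc)),
             ("http://b.example/mem/9",
              datetime(2002, 1, 5, 0, 0, tzinfo=timezone.utc))]

def closest_memento(requested, *listings):
    """Merge the archives' listings and return the (uri, datetime)
    pair closest in time to the requested datetime."""
    merged = [m for listing in listings for m in listing]
    return min(merged, key=lambda m: abs(m[1] - requested))

best = closest_memento(datetime(2001, 9, 11, 14, 0, tzinfo=timezone.utc),
                       archive_a, archive_b)
# The 13:30 memento in archive B wins: it is 30 minutes from the
# request, versus 6 hours for the best candidate in archive A alone.
```

This is exactly why the cross-archive view matters: a client talking to archive A alone would have been six hours off.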
What you see at the right-hand bottom there is that for existing web archives, like the Internet Archive, Archive-It, and WebCite, we've implemented time gates and that API by proxy. So we pretended they existed, although they do not really exist there currently. We also implemented the aggregator that I talked about. Anyhow, I can keep this short. We're going to move to that movie now. There we go. This was recorded about a month ago, right before a presentation we did about this at the Library of Congress. At this point we are just navigating in real time, so we're not time traveling. What we do here is pull up one of those pages that we set up for the experiment: a picture of myself and a picture of Michael. You see the timestamps on the pictures there, and also the timestamp of the HTML page. These pictures changed once a day. The HTML changed once a day, and everything is timestamped. Here we start time traveling. We're introducing the new header, X-Accept-Datetime, and we're saying: we're looking for what? 11 October 2009. We're refreshing this page, and what happens here is that we're talking to the URI of this resource, but it redirects us to a transactional archive. That transactional archive has a time gate, and it is now looking in the archives, both the one at Los Alamos and the one at Old Dominion, to reconstruct the page as it existed on October 11, 2009. You see the timestamps again on the images and at the bottom of the page. So it's October 14, I think, that we were talking about, not October 11. Just to point out that this is distributed as the web is distributed: this image is served from the transactional archive at Old Dominion, whereas the image of myself is served from the Los Alamos transactional archive. So this is not a monolithic kind of archive; it is distributed, as the web is. So here you go, and you see the timestamp there, October 14. I love that t-shirt, by the way. OK, we're going back.
And I think we're going to do a little navigating. There's a link there to web archiving, the page in Wikipedia. So we're talking to Wikipedia, in this case via proxy, because Wikipedia has not natively implemented this. But as you see, the result is that we get a history page from Wikipedia directly, and it's the page that became active on October 1. We keep navigating Wikipedia. We go to the page for national libraries. Same thing, qualified by October 14. We talk to the proxy, we get a history page. And same thing, we go to the page for the Library of Congress in Wikipedia, again qualified by October 14, and the result is, again, immediately a history page. I think this takes a little time. And this was the one that became active on October 15. So as you can tell, we're really time traveling here. Everything we see is qualified by time. We're going back to this home page of the Memento project, and we're going to change the time. So we are changing to another datetime, and, now I can't read it, the 6th of November, 2009. And we're going to do some other navigation. Conveniently, we put the URIs of cnn.com and the BBC News home page in here. So we click those, qualified, again, by the datetime that we put in. And via the aggregator that Rob built, we are now finding a resource closest to that moment in time on the web. So we're being brought to the closest resource in time, which is, in this case, from November 1. We're going back, and we're now navigating towards the BBC News home page, again qualified by time. Again, we're talking to the aggregator, which collects the information about mementos across the web. And in this case, I think we're retrieving a page from, is it Archive-It? I think so. I think it's a page from Archive-It. Yep, there we are. And it is a page of November 7. Maybe not exactly what we asked for, but I warned you before: if it's not available, you won't get it, right? You'll get the closest to what is available.
OK, I'm going to stop the little demo here. Memento also has quite interesting repercussions for the notion of persistence on the web, because it provides some capabilities to deal with 404 situations, which can be handled by redirecting to archives and so on. I'm going to skip this; we can talk about it in the Q&A later on. That's because I want to talk a little bit about what is going on with all of this. We published our paper on this on November 6, on arXiv. And actually an enormous amount of things have happened since then. First of all, there has been this huge hype on the web around all of this. It got started by a news article in New Scientist, and from there it literally spread virally across the web. It's been in Chinese news, it's been in Indian newspapers, in Thai news feeds, what have you. It's been in a tweet from Tim O'Reilly, who has 1,300,000 followers, which was really cool for us. So it's gotten a lot of, let's say, hype attention. Fortunately, it was not only hype. It also received a serious amount of rather fundamental discussion, on the Linked Data mailing list among others, and also in a blog entry by P. Johnson in the UK, where people started to think fundamentally about: is this OK from the perspective of the architecture of the World Wide Web? Can we do this? Is this OK from the perspective of REST principles? And I think the outcome of the rather serious discussion on that list was: yeah, this seems totally OK. This is legit to do, and it's actually quite cool. Other stuff that we've seen is people who are already kind of taking this for granted. They say, well, Memento will happen, right? So what does that mean? What can we do with it once it's real? There is a really interesting blog post by Steven Owens, in the UK also.
And he's talking about how we do formal citations of web pages, like in articles, where you always have the URI and the datetime of the download, when you visited the page. And here he says: what if we put that information in an OpenURL, direct it to a linking server, and the linking server fetches for us the Memento that is closest to the datetime being cited? Really exciting. More exciting stuff: over the past couple of weeks we have worked with the MediaWiki people. MediaWiki is a wiki platform that is used rather broadly; it's the one that Wikipedia operates on. And we're working with that community. We have the software available to make it Memento compliant, and we're working with them to try to get it into the general release of MediaWiki. Meaning, one day, Wikipedia may natively support this. Just yesterday at the reception, that's what receptions are good for, we heard from Chris Carpenter that the Internet Archive is going to implement Memento, which I think is absolutely phenomenal news for us. It will start somewhere early next year; we'll visit the Internet Archive, and we'll take it from there. And then, thanks to the great connections that Rob has in the UK with JISC, they decided to fund one of those developer competitions around Memento. This will all be about: well, here's the Memento concept, here's the money, you build us a really interesting prototype demonstrator of what can happen when this actually happens for real. That's at the end of February. Rob will be there, and we hope to be able to get Michael to attend also. Since this is about coding, it is totally out of my league. I would also ask you, or the people at your institutions, to contribute to making this happen. I hope you agree with me that this is a rather cool kind of concept. We have a web page with preliminary information on how you can help us make this happen.
Somewhere mid-January, we hope to have draft specs, or at least descriptions of how you can go about all this. We have a technical development group, a Google group; the URI is listed there. So you can contribute, or the technical people at your institution can contribute. And then last, but not at all least: we did this work on a very limited amount of funding, and there's still a lot to be done. Both in the realm of standardization (obviously an RFC or a W3C spec or something needs to be written for this) and in outreach: we need to talk to a lot of communities, to a lot of people in charge of web architecture, what have you, to really make this happen. There are researchy kinds of things still lurking underneath; we don't really have time to talk about those. And there's software development that we want to do. So if you have a pocket of money, just talk to us. The little bin is right here. OK, I hope you enjoyed this, and we can open it up for questions now. Thank you. Hi, David Rosenthal from Stanford. First, I want to say I think this is great work. And I want to point out that the reason why it's great work is that it adheres to two fundamental principles of the web, and I want to reinforce this from our experience with LOCKSS. The first thing is: resources on the web have names; they are URIs. Trying to change the name of something on the web gets you into all sorts of awful problems. Anyone who's had to implement URI rewriting will tell you it is really a bad idea. And this is one of the fundamental things about archiving on the web: you need somehow to avoid having to rewrite the URIs, because you cannot do a good job of it. That's the first one. The second one is that you need to talk to your archive about what you want, and the web has a mechanism for doing that. It's called content negotiation.
And the fact that most people don't know it exists, even most people who build tools for the web, doesn't mean that inventing some other way of talking to the archive about what you want is a good idea. It isn't. You need to use the tools that the web gives you. So this isn't really a question, it's more a comment. But I want to say: you should keep stressing that these are the two reasons why this is a good idea. It's because we've tried the other things, and they all lead into terrible rat's nests. Thanks for that comment. Actually, I want to respond briefly to your first point, about URL rewriting. I had a slide about that, actually, and it's one of the things we would like to explore further, a part that we think is somewhat researchy. There's this notion that, indeed, web archives currently rewrite URLs. And there's something good about that, in the sense that the links point back into their archives, and in that sense they keep the navigation self-contained. But as I showed you with that CNN example and the empty spots, sometimes it's not a good thing. Especially not if there's no archived version of the linked or embedded resource in that specific archive, yet it may be available in another archive, or even at its original location. So what we would like to work on with the Internet Archive and the other archives is: what could be strategies to combine those two approaches in a constructive way? Rewriting when it's essential, but not rewriting when you can do better. Ed Fox, Virginia Tech. You've given us quite an impressive vision of how to deal with specific resources, but the context of your discussion was time travel. So, knowing the group up there, I would expect that you also have the next step of this, which is a historian's or scholar's workbench: I'm going to move to such-and-such a time, and I'm going to see the whole web that way, and I'll be able to look at events and other kinds of things.
So when is that coming, and how will that work? Well, we're working on a Firefox plugin in which you can essentially set the date at which you want to traverse the web, and you can go and visit the web as it existed at that point. Now, all the caveats apply as to whether or not material has actually been archived; you can only use what's available to you. One of the things we're researching is that conveying notions of time is actually very difficult. I set my request for a particular time and I get things that are close to it, and then within a particular page that has resources embedded, I get things that are potentially further and further away, either forward or backward in time. So you develop this temporal bubble that changes as you traverse the page, and you have to decide: are there policies I would like to set that always minimize that temporal bubble, even if I continually slide outside of my realm? Do I want to exclude any resources before or after a given date, if there's a particular epoch I'm interested in and I don't want to see things outside of it? There are all kinds of interesting policies that one would have to develop, and also user interface mechanisms to convey to the user what is actually occurring. So that's future work: trying to convey to the user what's happening as they go back in time. Just before you ask your question: for those who are interested, Michael just yesterday evening recorded another demo that illustrates this Firefox plugin we're working on. So for those interested, stay after the Q&A, and Michael can run you through that one also. Nathan Lambert, Case Western. I'm thinking about acceptance and how you would gain mass appeal for this type of concept. There's really only one central solution in place where all traffic filters through, and that's DNS.
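The "temporal bubble" idea above can be made concrete: measure how far the mementos actually shown on a reconstructed page drift from the requested datetime, and optionally apply a policy that excludes resources outside an epoch window. A small sketch with invented dates (the function names and the specific policy are illustrative, not the plugin's actual implementation):

```python
from datetime import datetime, timedelta, timezone

def temporal_bubble(requested, resource_datetimes):
    """Return the spread (earliest offset, latest offset) of the mementos
    shown on a reconstructed page, relative to the requested datetime."""
    earliest, latest = min(resource_datetimes), max(resource_datetimes)
    return earliest - requested, latest - requested

def within_epoch(requested, resource_datetimes, window):
    """A policy sketch: keep only resources within +/- window of the request."""
    return [dt for dt in resource_datetimes if abs(dt - requested) <= window]

# A page requested for 11 October 2009 whose embedded resources came back
# from 7 October, 14 October, and 20 September (illustrative).
req = datetime(2009, 10, 11, tzinfo=timezone.utc)
shown = [datetime(2009, 10, 7, tzinfo=timezone.utc),
         datetime(2009, 10, 14, tzinfo=timezone.utc),
         datetime(2009, 9, 20, tzinfo=timezone.utc)]

print(temporal_bubble(req, shown))                 # how far the page drifts in time
print(within_epoch(req, shown, timedelta(days=7))) # drop the September outlier
```

A UI could surface the bubble as a range indicator, so the user sees at a glance how "tight" the reconstructed page is around the requested moment.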
Have you thought about doing this whole URL rewriting and negotiation at the DNS level, where you could take that traffic, pass your headers, your TimeGate-type headers, and try to identify at that point where someone might actually be trying to go, without having to do local implementations for each individual website? I don't think DNS would give us the granularity that we want. I mean, we could swap host names around, but I don't think we could figure out exactly what the rest of the URI is. Some of the archives, like the Internet Archive, are sort of predictable about what the URI structure is, but other web archives, like webcitation.org, have a completely different syntax. I'm not sure we could do DNS tricks to get to those things. We haven't really studied it, but I don't think it would do what we want. No further questions? Michael, do you want to go ahead? Yeah, sure. So for those interested, Michael is going to give you another demo. We were working on setting up this Firefox plugin, and, are we actually going here? There we go. We did this last night, and what we're doing here is showing tabbed browsing. This is the current version of our hello page, and then we decide that we want to go and visit the past. So we mouse down to the bottom right-hand side, where there's a little M, and we turn that on. A panel pops up, and we have a datetime navigator here. Essentially what we've done is gone back to 11/13, and you can see that the images are loading, and I'd point out that the ODU image always loads much, much faster than the LANL image. He always has to do this. Yeah, I can't fail to mention that. It's a government site, right? Well, yes, exactly. You're going through that one little modem at LANL that they have. So now we pop up the little calendar and we set it for November 12th; the idea is that we have a sort of user-friendly mechanism for specifying time.
We've gone to 11/12, and we're reloading the images. We're going to go and, I think, bounce back; there's Herbert wearing his favorite t-shirt again, and this time we're going back to October 12th. We hit reload. So the interesting aspect is that we're staying on this page, we're hitting reload, and we get the right thing happening, right? We're not having to re-navigate out of an archive and so forth, because the plugin keeps track, and it's hard to read here, it keeps track of this thing called the original URI, and it continues to issue the request back to the original URI, and Firefox does that for us. So, our friends at the Library of Congress: we've opened up a new tab, Memento is not enabled here, and we see the current version of digitalpreservation.gov. We click the enable button, we've dialed it back to 11/12/2009, and now we're loading the new version. We're going to pull up a version in WebCitation, and this one was actually archived October 31st, and they're missing the style sheet, so it's not quite exactly right. We dial it back to 2006, and we're going to revisit the page. It's doing that mod_rewrite rule, the three lines that Herbert talked about, and we get a version as it shows up in the archive, and this one looks a little nicer. So the idea here is that the content exists in multiple locations, right? The browser in this case talks to the aggregator, and we go and find the right one. So, here's somebody who'd like to go back in time. We load up Tiger Woods, and the current page right now, because Memento is not enabled, gets a little funny. So there's the personal life section and all his problems right here, and what he would like to do is click enable. We're using a proxy, and we're dialing it back to just before Thanksgiving, before all this happened.
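The key trick just described, keeping the original URI per tab so that "reload" re-issues the request against it with the tab's current datetime rather than against whatever archive URI is displayed, can be sketched as simple per-tab state. The class and method names here are hypothetical, not the plugin's actual code; only the original-URI idea and the X-Accept-Datetime header come from the talk:

```python
# A sketch of per-tab time-travel state: the original URI is remembered and
# never rewritten, so a reload always goes back to it, qualified by the
# tab's current datetime. Names and structure are illustrative.

class TimeTravelTab:
    def __init__(self, original_uri: str):
        self.original_uri = original_uri   # never rewritten to an archive URI
        self.accept_datetime = None        # None means: browse the present

    def set_datetime(self, http_date: str) -> None:
        """Dial the tab to a moment in the past (HTTP-date string)."""
        self.accept_datetime = http_date

    def reload_request(self):
        """(URI, headers) for the next request; always aimed at the original URI."""
        headers = {}
        if self.accept_datetime is not None:
            headers["X-Accept-Datetime"] = self.accept_datetime
        return self.original_uri, headers

tab = TimeTravelTab("http://www.digitalpreservation.gov/")
tab.set_datetime("Thu, 12 Nov 2009 00:00:00 GMT")
print(tab.reload_request())
```

Because each tab carries its own state, one tab can browse November 2009 while another browses the live web, which is exactly the tabbed behavior shown in the demo.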
We hit reload, and what's happening here is that Firefox is loading the page, and now it's going to determine, so we're on the current page, but it's still loading all the images and so forth, and it's going to figure out that Wikipedia, which we're talking to, doesn't natively support this yet. Once it's finished loading, Firefox sends a signal saying it's done with the page load, the plugin sees that we're done, and it essentially redirects to talk to the aggregator at that point. So now, if you look at the URI bar up at the top, you see that we have one of these oldid URIs. We still have the fragment, marital infidelities and so forth, but we didn't jump to it, because it didn't exist at that point. So we scroll down into sort of the summary, and you notice that that whole section eight doesn't exist anymore. So this is the mechanism for interacting with the web: we can have different tabs, some with time browsing enabled, some without, and each can have a different time associated with it. The idea is that you can use different plugins, different mechanisms; you can write your own. As long as it speaks these headers, you can access the material. There's no one true interface or plugin for accessing it; it's wide open. Any more questions at this point? Well, then we thank you for your attention. Thanks a lot.