Thank you all for coming in from the great sunshine outside. Brewster, you put on a great party in many, many respects, including this amazing weather and amazing spot. It was a wonderful thing. I'm sorry to those who are watching virtually that you didn't get to enjoy the California sunshine here in San Francisco. This next session is a long-awaited affair: the first discussion slash unveiling of the technical platform that some people have been working incredibly hard on. I like the title very much, From Beta Sprint to Alpha Release, which we're about to see. I thanked the heads of a bunch of different work streams earlier, and I wanted, before introducing my friend and colleague David Weinberger, to send a particular shout-out and thanks to the technical aspects work stream. Martin Kalfatovic and Chris Freeland have been co-chairing that group. Martin and Chris, are you here? Give us a wave. You two have been astonishingly good at this job and have been great partners, both to the conveners in the work stream and to the steering committee, as we've been working with the technology development team, the interim tech dev team that you're going to hear from in a moment. Thank you to all who have been part of this. It's a challenge, of course, to do any distributed effort, but I think we're well on track for a 2013 release. Along the way, we wanted to have something that we could push against and test some assumptions with, and this team has been working round the clock to get to this point. I'm thrilled to introduce my colleague David Weinberger, who works at the Harvard Library Innovation Lab, where he's co-director with Kim Dulin, and his team, who have also been teaming up with the Berkman Center crew. I'm sure you'll do introductions with this group.
We have been blessed to have your incredible thought and hard work, and this is an early-stage chance to push around what they've done. There will be much more chance for input and discussion over the course of the next year. David, we look forward to hearing from you. Thanks for all the work. Thank you very much, John. It's an ineffable pleasure to be in this temple with you people. I get to call it a temple, I guess, not so much a church. Why don't we introduce ourselves? John's already introduced me, so I'm going to pass it to you, Dan. I'm Dan Collis-Puro, with the Berkman Center. I'm a developer there, and I am the point-two developer on the platform. I'm Paul Deschner, a developer at the Harvard Library Innovation Lab and also on the dev team. And I'm Matt Phillips, a software developer in the Library Innovation Lab and also on the core dev team. And there are actually lots of other people; this is the core, and they're amazing. There are lots of other people who are helping at every level, including the secretariat and Rebecca, who can't be thanked often enough during this session, and Maura and the other work streams have been really, really helpful. So what we want to give you today is a taste of the platform. And the very first thing to say... are the slides up, by the way? Ah, okay, great. The very first thing to say is that what we're presenting today is the opposite of etched in stone. This is a proposal that's been written in software, for reasons I hope will become clear. So we want to give you an idea of our thinking behind this and show you some very quick demos. And I need to let you know, and you'll figure this out anyway, that the demos are all screen captures. They're all screencasts. But they were done absolutely live; they're unedited. We did them basically over the past day in my hotel room, so we wouldn't have to worry about doing them live. I can't see these screens, of course, but I'm told that the sun...
Brewster, if you could stop the course of the sun in the sky, we would appreciate that. Working on it. Failing that, you may find a better picture through the live stream. Unfortunately, the only black text on white in the entire session gives you the address for that: you can go to either dp.la or to livestream.com/dpla. And you may actually get a better picture, especially with the wonderful Wi-Fi here at the Internet Archive. So, a very quick timeline. We're going to try to do everything really quickly; there's a lot to go through. We were constituted on January 3rd... oh, the middle, right? It's the third week of January. We did the initial build: live data, live API, and, of course, we do all of this in public, with the idea that having people push against it is the best way of figuring out, in an environment like this, what works, what doesn't, and what might be useful. So we've been doing early builds. Then we built a bunch of scenarios. Nate Hill and his group and we are talking together about the set of scenarios, but we made up a set of scenarios of things we thought developers might want to do on top of what we're building. We spent a lot of time putting together a scope document, a sort of technical overview and proposal, which is available on our wiki; all the addresses are at the end. That lays out what we think is the scope, what we're proposing and suggesting, so that you can push back, for the project through April 2013. April 5th: Martin and Chris, the amazing, awesome heads of the technical work stream, put together a hackathon, and you'll see some of the results from that. April 24th, three days ago, we did build .03, which has more fully developed schemas and a lot more data, including the 12 million metadata records from the Harvard Library as well as 7 million from the Library of Congress, and more.
And that brings us to here. Throughout this process we've been gathering data, writing scripts, and trying out different sorts of data right from the very beginning, making sure that we don't focus only on library metadata but also on the sorts of metadata that come from online collections. We've been talking with lots and lots of people in the space, partners and potential partners, and we've been encouraging the development of applications that test out what we've been building. So, I'm going to start one big step back. Everybody here knows this, but just in case: computer software often consists of some data, some computation, some interfaces, and a user interface that sits on top of it, also known here as the shiny thing. In a web application running in your browser, it's the same thing, except the web is in between the two and the UI talks to the back end. What's important for the moment is what our team is not building. We are not building a user interface. That's not part of our charter. I'm very happy to hear the discussion of the user interface, and the need for one, arising at this meeting; I think it's very important, but it's not what we're doing. So: we're gathering metadata and some content. We are planning, this is the plan anyway, to enhance that metadata in ways we'll talk about, to provide a set of services that will be useful to developers, and to make all of this available, not through a user interface, not directly, but through an API, an application programming interface, by which programs talk to and use the services and data that a platform provides. And that's what we're building. We're building a platform. A platform consists of those pieces, at least in this definition. So, why would you build a platform?
Well, you do it in the hope that developers will come along and build stuff that is gorgeous, wonderful, useful: a place where you can live and thrive and take advantage of all of the incredible metadata that's available in this ecosystem, some of which will be available through the platform. In a little more detail: you have the platform, and then you have some data sources underneath it, nodes, for example online collections, library collections, DPLA digitizations, perhaps some stuff from the web, perhaps some stuff from users. We'll talk about these in a minute. And on top, a set of applications that we hope developers will build, and they will amaze us. We don't know what; this is the central conundrum and the central value of the platform. We don't know what people will build, and the platform is successful to the extent to which developers surprise us, because it's a platform for innovation. So we hope that developers will make incredible, useful things, that users will come to those applications, and we also hope and expect that other sites, sites that already exist or maybe some that don't quite yet, will use the data and metadata within the platform to enhance their own offerings. Whether it's Library of the Future, I'm going to give you a plug, I might as well give it the right URL, libraryyourfuturelibrary.com, or maybe, at the other end of the scale simply in terms of the number of users at this point, Wikipedia. So, why a platform? Why do that? Well, in order to get more value from the objects of culture that are spread out, distributed across the environment. We want more people to find them, more people to understand them, to share them, and to do things with them, and also for the cultural institutions, whether they're real-world or online or in between or both, to find benefit in what we're doing as well. The platform should serve both.
So, even though I just said we cannot anticipate what developers are going to do, we've done some scenarios; we have to sort of anticipate some of it. So here are six extremely quick scenarios of things that maybe people will build. For example, somebody might build a library browser, an OPAC for children, or somebody might build library analytics packages. That's the sort of thing you could do with the sort of platform we believe we're building. A local historical society that has a collection of documents or images and wants to make them more available, more findable by the public, might find this platform useful. Another site, for example Wikipedia, might find some of the data in it useful, and likewise, absolutely, going in the other direction; Wikidata is a really interesting project, for example. There are lots of different sorts of social applications that could use metadata of the sort we're contemplating. Books and cultural items are social objects; social groups form around them. That's the point of cultural objects: they form culture around them, and that requires social formation. So maybe somebody will build some social applications on top of the DPLA platform. And within the DPLA platform, we hope there will be such a richness of metadata that people will do computationally intensive work with it. So, those are three simple sorts of scenarios. There's one more piece of this sketch. Of course, what we're building is open source software. On the one hand, we would find it incredibly valuable if there were various types of connectors that made it dead easy for nodes, for other institutions, and even for users to contribute their metadata into the platform and to have a two-way programmatic conversation. And in addition, on the right-hand side, there's...
We want to make it possible, we think, this is a proposal, for a local institution to use this platform. Of course it's open source, so they can just get all the software and use it however they want. But there are things we can anticipate that would make it more useful to local institutions. We can perhaps build that, or help somebody build it, so that a local institution can get the same sorts of advantages out of its own metadata, building local applications on top of it. And, of course, that could optionally feed directly into the DPLA as well. I think that's technically known as a win-win. We don't know what to call this. We've thought about calling it a deplet or, in a less gendered way, a diplet, or a dimple, or a dumpling. And we tend, I think, to be calling it a dipling or a diplet, just informally. It's possible it needs a better name. So, three big sections of what we want to talk about today. The first is the challenge of the task of gathering interesting metadata. We currently have a whole bunch of metadata that we've ingested, that we've brought in, including, as I mentioned, the Library of Congress, 7 million MARC records; Harvard, the 12 million; the Biodiversity Heritage Library, just about 50,000; the Bancroft Library from California; the University of Illinois, Urbana-Champaign collection; San Francisco Public Library images, Old SF, that's 48,000; and a little resource called Journalist's Resource that has 600 scientific reports of particular interest to journalists. We've also been experimenting, because why not, with bringing in some web content, on the grounds, and you'll tell us if this is not where you want it to go, that if somebody comes to the DPLA looking up a topic, there may well be some stuff on the web that they would benefit from if it were returned in the results. So things like TED Talks, Google Author Talks, NPR broadcasts about books. We've brought in that metadata as well.
And then usage data in addition, which we think has the potential to spur lots of innovation. There are lots of issues around usage data as well: circulation data, books on reserve, things like that. So far we've been gathering through custom methods; we've been writing scripts. If it's a MARC 21 record, the standard library record format, it's pretty much automated; we'll just bring it in, and likewise for other sorts of data as well. But we absolutely recognize that custom ingestion doesn't scale. We needed to do it that way for now, but it doesn't scale. So the idea is that ingestion will be done through an API, through an ingestion engine, from nodes. The joke there, so they'll remember to make a node, is joy. Joy to nodes? Node to joy. Which I think I'm going to skip. As well as perhaps from individual institutions, and perhaps from other websites. And then there are the deep and difficult issues of maintaining and scheduling and syncing and correcting in a two-way dialogue. These are very, very difficult issues, and nodes offer some serious advantages in this regard. I want to thank the content committee again for their help with this; I look forward to more help. So Dan is going to... So we're going to show you a couple of quick demos, then jump back into sort of boring exposition, which is my job, and then some more demos. So here are some demos of some things that have been built. And as I say, all of these are screen-captured but very fresh and unedited. All right, Dan. Thank you. So I'm going to show an experiment that was developed at the DPLA hackathon, in collaboration with a couple of other developers. What I wanted to do is make a more visual way to browse and query the DPLA API, allowing you to refine a set of queries and stack criteria on top of one another so that you can get a more refined result set. So I'm going to start this. This is live as well, so you can play with this now if you feel so inclined.
So we're going to enter a keyword search across all 19 million records for "monkey trial." And you can see that this is going to find a number of relevant results. And then this is filtering down to the Library of Congress results as well. The covers are coming from openlibrary.org, so we're calling out to another open API. Now we're doing a search on creators, so authors, for Collins. You can see that that returned 10,000 records currently. And we're going to refine that search by putting in the name Suzanne; that, again, is going to be against the creator keyword search. So now these covers are being loaded from Open Library. We're going to filter again on Library of Congress records, and then we're going to go down and look at the... So there were 25 results. That's a pretty tight result set for an author, and you can see The Hunger Games here. When we click this, we're loading external jackets from the LibraryThing open API, and then this is a link to the LibraryThing page as well, based on ISBN. If we click this cover, we go to the Open Library page. And if this were something we could read online, then we could read it online or borrow it through WorldCat as well. So that's one example. Thank you. Thank you. The next application is another experiment from our hackathon, developed over the course of a few hours one day, basically responding to the challenge of how you can visualize the facets that come back with any query to the API. How can you visualize them in a more intuitive way? Typically they come back as text-rendered results, so this is one attempt to address that issue. We do a keyword search on Darwin. We get the results back, and it's looking at the facets for subject, language, author, and year. In the top section, we have rectangles of different magnitudes showing the relative importance of the various facets.
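The stacked queries in Dan's demo, a keyword or creator search progressively refined by further criteria, can be sketched as simple query composition. The endpoint path and parameter names below are assumptions for illustration, not the platform's actual API:

```python
from urllib.parse import urlencode

# Assumed base URL for the item-search endpoint (illustrative only).
BASE = "http://api.dp.la/dev/item/"

def build_query(keyword=None, creator=None, source=None, page_size=25):
    """Compose stacked search criteria into a single query URL."""
    params = {}
    if keyword:
        params["q"] = keyword                # free-text keyword search
    if creator:
        params["filter[creator]"] = creator  # refine by creator field
    if source:
        params["filter[source]"] = source    # e.g. one contributing source
    params["page_size"] = page_size
    return BASE + "?" + urlencode(params)

# Stacking criteria the way the demo does: creator, then source.
url = build_query(creator="Collins Suzanne", source="Library of Congress")
```

Each successive refinement just adds another parameter to the same request, which is what lets a visual browser like this one keep narrowing a result set interactively.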
The timeline shows the years of publication of the books returned in the object set, and then we have a word cloud that expresses the same sort of idea. The second search is, of course, monkey, and we get a similar display, with the various facets that come back shown as rectangles of varying magnitude. The next application is one that wasn't developed at the hackathon; it's being developed at the Innovation Lab to address the question of how to help collection managers in libraries assess their collections based on statistical information. It's a data set that we hope we'll be able to include in the DPLA at some point not too far in the future. We happen to have that data internally at Harvard and are using it with this application, but it shows you the sort of thing that can be done with circulation statistics. It's using the data that we have, in this particular case checkouts for books in the period 2002 to 2011. It's also looking at the subjects of the different items, based on their LC class numbers, and the year that the circulation event happened. If we look here, we see a global presentation. The different colors represent the different subject areas; at the top you have language and literature, for example, with a little over a million records in our data set. Then you do a specific keyword search, in this case on evolution, and you bring back the subject set of all the items that respond to that particular keyword, broken down by subject classification and, once again, by the circulation statistics. So a collection manager could look at this and be able to tell which areas of the collection are being used most for these particular kinds of interests. Thank you. Hello. Great. So one thing this demo shows is that you can take a small slice of DPLA content, as opposed to taking a big slice like Dan showed.
You might refer to this as the card-in-pocket jQuery plugin. We have an old-school card here, in a pocket, like you would get in a library ten years ago. You pass this jQuery plugin a unique identifier, an ISBN, an OCLC ID, and you get back details on how the community has engaged with this item: the number of copies, the number of checkouts. Right? So this is really easy: one jQuery plugin to install. Maybe someone would install it in their OPAC or in a blog post. It's not the kind of whole thing that you need an army of engineers to install. So I like the idea of taking a real small slice. This next one is a project that came out of the beta sprint, called MINT. What it does is create a crosswalk between incoming things and what's in the API already. When we bring things into the platform, we try to map incoming data to a set of standard terms, so that you can search over, say, a title field and it can be pretty standard. But obviously, when a data set is coming in from the wild, maybe someone calls that field "my title" or "book title" or something. So MINT is a web interface that lets an end user create that crosswalk between "my title" and "title" and helps us ingest the data. So, the next segment is: how do you make metadata usable? It addresses the problem that Matt was just talking about, the one that MINT is one part of a potential solution to, well, that's too strong, a helper towards. So, each of these jelly beans is one of the items pointed to by the platform, by the DPLA meta-collection, except that this is way out of scale. In fact, you should scale this picture down by some orders of magnitude to have some sense of the scale of the difficulty. And the issue is, as Matt says, that there is, as is well known, no uniformity about how to label things, either what the labels should be or what the contents should be. So things that are the same have different metadata attached to them.
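A field crosswalk of the kind MINT lets a user build can be sketched as a per-source mapping from native field names to standard terms, applied at ingest. The field names and function below are invented examples, not MINT's actual configuration:

```python
# Per-source mapping from a contributor's native field names to the
# platform's standard terms (invented examples).
CROSSWALK = {
    "my title": "title",
    "book title": "title",
    "written by": "creator",
}

def apply_crosswalk(record, crosswalk):
    """Rename incoming fields to standard terms; unmapped fields pass through."""
    out = {}
    for field, value in record.items():
        out[crosswalk.get(field, field)] = value
    return out

incoming = {"my title": "On the Origin of Species", "shelfmark": "QH365"}
standardized = apply_crosswalk(incoming, CROSSWALK)
```

The point of a web interface like MINT is that a contributor builds this mapping once for their collection, and the ingestion engine can then apply it to every record from that source.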
This is one of the key difficulties. So the challenge is to get enough detail that it's useful, but not so much that you scare away contributors by requiring them to do extensive mapping, something they often are not familiar with; and also to do this in a pretty rough and rapid way, because April 2013 is basically tomorrow. So we're trying to do this imperfectly, but as quickly and usefully as we can. What we are proposing, and what we're trying out so far, and it seems to be actually pretty good, is to have two types of schemas. On the one hand, you take one of these jelly beans and you put a thin wrapper around it of pretty predictable, standard data. Most things, not all of them, but many, are going to have a title, a creator, a publisher, a license, whatever. It's a fairly small set; express it in completely standard ways. So we are proposing smushing together Dublin Core and schema.org, two very well-known standards. But then there's all that other goodness on the inside of the jelly bean that isn't in the standard format but may have tremendous value. What we're proposing is that we maintain all of that data, not trying to put it into a standard schema, for the reasons I mentioned, but keeping it so that if somebody knows what data is there, he or she will be able to get at it programmatically. And there are absolutely pros and cons to this two-part proposal. The pros: well, it makes it much easier to ingest, and you get a big chunk of stuff that can be searched across all of the content; a search on title will turn up titles from lots of different sorts of collections, and the rest is still there. We're not throwing out any data. But, on the other hand, to find the stuff that's not going into the standard schema, the simplified schema, you have to know what you're looking for.
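The two-part proposal can be sketched as a simple record envelope: a thin layer of predictable, standard fields (Dublin Core / schema.org style) wrapped around the untouched original metadata. The key names here are illustrative assumptions, not the platform's actual schema:

```python
# The small set of predictable fields lifted into the standard wrapper
# (illustrative; the proposal names title, creator, publisher, license).
STANDARD_FIELDS = {"title", "creator", "publisher", "license"}

def wrap_record(original):
    """Lift the predictable fields into a standard envelope while
    preserving the full source record alongside them."""
    envelope = {k: v for k, v in original.items() if k in STANDARD_FIELDS}
    envelope["source_record"] = dict(original)  # nothing is thrown away
    return envelope

rec = wrap_record({
    "title": "Moby-Dick",
    "creator": "Herman Melville",
    "852$h": "PS2384",  # a MARC-ish local field: not standardized, but kept
})
```

Searches run against the standard envelope work across every collection, while the preserved source record stays reachable for anyone who knows the original field names, which is exactly the trade-off described above.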
We will make each particular schema, that is, the documentation about it, available through the API, but you have to know what you're looking for. And it's certainly the case that doing this purely through linked data internally, because we will support linked data out and in, but doing it internally, would give some tremendous advantages for making connections among all those different jelly beans, along likenesses and similarities that a schema cannot predict. On the third hand: April 2013. We also have questions about scaling up linked data. So here's a proposal to the DPLA: it would be, I think, wonderful if the DPLA sponsored some research. Anybody who wants it has all of the data, of course, and the API specification; try to build this using linked data internally, with a triple store. If it scales and it works, that would be fantastic. That would be amazing. In the meantime, we will charge ahead with a more traditional approach to managing this data. So, we currently have three schemas, three types of information that we're tracking: item information, contributor information, and event data, which is circulation and other sorts of events. We will also have a creator schema. With all of these, we're very interested in having a discussion about what you think should be in there; we want to do as little re-creating as possible. And we think, but this is for the DPLA to decide, that there's room for a user schema as well, tracking some level of information, with privacy and security built in, about the users of the DPLA. It may be a terrible idea. It may be a good idea. We want to suggest some reasons why it's at least something to think about. So, Matt? So, people like putting things on shelves, right, in the physical world.
They like putting things on shelves that they want to hang on to, in the long term and the short term. There's no reason why that can't translate to the web world, right? So, this is a project we've been working on in the lab, with an integration with the DPLA, called Shelf.me. I've got a little bit of video here. This is my personal Shelf.me page, and I have three shelves, one called Mapping the World. These are the things that I've put on my shelf, right? These are books. I'm going to add a new one here. I'm going to search for a new book. I like that one. And I click the bookmarklet and say, add to my Shelf.me page. Shelf.me extracts some metadata from Amazon here, and I have to do a little bit of cleanup because this is a beta version. And now I've got that item on my shelf. So, I took something from Amazon and put it on a list of things that I'm calling Mapping the World, and you'll see that I have details of this thing, of course. So this one is from Amazon, but it can come from anywhere. Right now, it can come from IMDb or MusicBrainz or Dan's covers app. So, we did a quick integration here. Let's search for "map." We find a bunch of cool things in the DPLA. Let's say I want one of these things, I want to put it on my shelf: this Italy Builds book. That looks really great. A little cleanup, because this is a beta version, and now I'm adding it to my shelf. And here it sits, right next to my thing from Amazon. This could be sandwiched between something from Google Books, maybe something from IMDb, anything on the web. This is supported for multiple users, with multiple shelves, at Shelf.me, which will be out pretty soon. I have another thing to show you here called StackView. StackView is really the engine that powers Shelf.me. It's a kind of general-purpose tool that we've packaged up and put on GitHub, and we hope people will use it. We did an integration with the DPLA, so you can pull results in and build a stack from lots of sources.
Here's one from Amazon. You can scroll through this stack. You see the books are represented in dimensions you might find in the real world: the length comes from the number of pages, and the color comes from maybe some heat mapping. Here's an example from Google Books; I can scroll through. Here's one from the DPLA. There are also examples from Open Library and WorldCat on this page. Again, this is a production project that's on GitHub. We hope you'll go grab it and build a DPLA-powered stack. Thanks. So, this is in some sense a small widget. It takes a list of books coming from a variety of sources, including the DPLA, and presents it in an appealing, visual way. One of the hopes is that, since this is open source, people will build widgets as a way of very easily embedding content from the DPLA into non-DPLA sites, whether it's a blog post or a local library that wants to give its users the ability to navigate visually through the content of that library, of the DPLA's meta-library. So, I wrote this next one. This is a toy; this is the opposite of ever being production-ready. This is from the hackathon. The idea is that the DPLA has lots of works that we don't have covers for. So let's say you do a search on evolution, and you click on the result for Dreaming in the Dark, and you say, well, I'd like to get a cover. It goes out to Flickr, and it pulls back what it thinks are relevant covers. You pick one, you crop it, and, because you're a good citizen, you correct the capitalization and get rid of the punctuation mark that shouldn't be there, and then, if you want to, you can add a new background to the banner. And then you do the important thing, and the thing that absolutely doesn't work, which is you save the cover. It's not saving anything to the DPLA, in part because we don't have a concept of a user. And this is one of the questions for the DPLA: is there room for a concept of a user?
I should point out these are the four images that came back from Flickr. The title is Dreaming in the Dark. The absolutely most terrifying one you could pick is the one on the right; that would be the wrong cover to pick for Dreaming in the Dark. I hope that's the nightmare version. So, this next one is a demo from one of the beta sprinters, known then as extraMUROS, now called Zeega, that James Burns there contributed. Here's the idea. First of all, you log in. Zeega is a phenomenal tool for pulling in images and content of any sort from any site that has an open API, and then letting the user construct beautiful narratives, walkthroughs, from it. In this case, we've done a search at the DPLA. It's come back with some raw data. We said, add it, again through a bookmarklet, which it now has done. It has taken the first ten and brought them in. You can see the metadata that's been pulled in, quite accurately, by Zeega from the DPLA's collection. And then we're going to start adding these images to a collection. In this case, the collection will include what you're looking at now, which are images from the Boston Public Library. Zeega doesn't care where the images come from. You pull them all together, you make a collection of images and sound and video, and then you construct something like a deeply personal walkthrough, a movie. So this is a type of integration with another application that adds, I think, tremendous value to the content that people and institutions have contributed to the DPLA. There's also the possibility of taking those collections that users have created and bringing that information back into the DPLA, because it's incredibly useful semantic information. Of course, the user would have to opt into this. But that would be, I think, just an amazing round trip. So, there are lots of types of enhancements that need to be done to the metadata. All of these are impossible to do fully, or even to do really well.
These are well-known computational problems in the field of library science. First is uniform title, where you have a set of books that are basically the same book, but they have different titles for whatever reason. You want to bring those together, so that when a user does a search, she finds all the books and other works that she's looking for. Paul has actually done a lot of work on this. Uniform title is, I think, the sort of thing we can do successfully. But then there's a whole other range of ways in which books are related, and it's an open-ended field. This is a deep and very important problem, and we need help of all sorts in dealing with it. Likewise, pulling together books that are recommended, or providing information to somebody who develops a tool that's going to make recommendations. And then the inevitable deduping, the removal of duplicates, which is a challenge when all their metadata may be different. Doing this across collections is very difficult. There are people working on it; it's a deep, deep problem, and again, we look for help. We will do the best we can, but we need help. So, how to do this? The aim is to provide a platform that gives access to metadata, and through the metadata to content, in a way that makes it very easy for developers to work with. From the developer's point of view, the developer is going to see outputs, and that will include the core simplified data that we talked about; access to all the data that's been brought in, all the metadata, even getting back to the original records; having the information available as a data download, so if you just want all the data, go take it and do what you want with it; and linked open data, both coming in and going out. So here's the last demo, and this is coming from Dan Brickley. He's narrating it, so I will simply stop.
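For a concrete feel of where work like uniform-title clustering and dedup starts, here is a deliberately naive normalization-key sketch. This is an illustration only, not the approach Paul actually uses; as noted above, the real problem is far harder than any single key:

```python
import re
import unicodedata

def title_key(title):
    """Collapse case, accents, punctuation, and leading English articles
    so that near-identical titles map to the same clustering key."""
    t = unicodedata.normalize("NFKD", title)
    t = "".join(c for c in t if not unicodedata.combining(c)).lower()
    t = re.sub(r"^(the|a|an)\s+", "", t)        # drop a leading article
    t = re.sub(r"[^a-z0-9]+", " ", t).strip()   # fold punctuation to spaces
    return t

# Two records that should cluster together under this key:
k1 = title_key("The Origin of Species.")
k2 = title_key("Origin of species")
```

Keys like this group the easy cases; the hard part, which is why the team asks for help, is everything the key cannot see: translations, abridgments, variant editions, and records whose metadata differs in every field.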
Zoe Bybutovic and I created this demonstration of one way the DPLA platform can play in the world of linked data. What we're doing is using topics and entities from linked data to create entry points into diverse DPLA collections, as well as into other aggregations like Europeana. For this first demo, we're using manually collated results, but we're happy with what we're seeing from the platform APIs, and we expect a live prototype in a couple of weeks. Our top-level navigation uses RDF data from Freebase, which itself aggregates linked data from Wikipedia and others. We've filtered those for use in the DPLA and generated per-topic, per-artist, and so on, pages. Here we see a simple A-to-Z listing of art movements, where each entry is in effect backed by a miniature database of associated information. So let's jump into the abstract art entry. You'll see how each page links up data from DPLA and elsewhere. The idea is to cross-reference across collection types: books, images, museums, etc. You'll see here items from the DPLA meta-collection, and later on some images from Europeana. As we scroll down through the entry for abstract art, you can see how new modules can be dropped in as the DPLA collections grow. These entry points are built from linked data, with which we can generate appropriate queries to send off to the DPLA and Europeana platform APIs. This lets us generate new pages today, but this kind of micro-site will benefit greatly as these platforms and collections move towards providing richer linked data natively. Our approach shows how DPLA can hook into the linked data ecosystem and how linked data lets us build browsable navigation systems on top of the DPLA platform. Dan is amazing. He's many time zones away, but he is just wonderful to work with. So we had a hackathon April 5th. The last little bit of video to show you is, in fact, a two-minute round-up. The videography is horrible. We are glad everyone could make it on relatively short notice.
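The core move in Dan's demo, turning one linked-data topic label into queries against several aggregator APIs, can be sketched in a few lines. The endpoint URLs and the query parameter below are placeholders I've assumed for illustration, not the real DPLA or Europeana API details.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URLs; the real platform URLs and
# parameter names may differ.
ENDPOINTS = {
    "dpla": "http://api.dp.la/items",
    "europeana": "http://api.europeana.eu/search",
}

def topic_queries(topic_label):
    """Turn one linked-data topic label (e.g. an art movement
    from Freebase) into search URLs for several aggregators."""
    return {name: base + "?" + urlencode({"q": topic_label})
            for name, base in ENDPOINTS.items()}

urls = topic_queries("abstract art")
```

A per-topic page is then just the fanned-out results of those queries, which is why new modules can be dropped in as collections grow: each new source is one more entry in the endpoint table.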
Thanks to the Berkman staff. I'd like to introduce Chris Freeland, the co-chair of the technical aspects work stream. Hi, Chris. I pulled together a tiny Ruby script that basically took a query, in my case, monkeys, and grabbed four fields, and used that file as input into ViewShare in an attempt to build views, like having a pie chart showing groupings by data source. For instance, I did the monkey search as inspired by Corey, and I find that there's an author here, David Lipsky. It came back and tells me that there are three different David Lipskys that it knows about. Being able to generate lists from queries to the DPLA and placing them on a map. And it geolocates your position through your browser and finds the three nearest local public libraries. In version two... one of them happens to be in the middle of a river. So I was just working with James and Nate a little bit to help them hook up Wikimedia images into their map, because they also wanted a little image. Now, this is a query for butterflies. It's the first 20 results, and it puts them all on a timeline. It's just a start, but you can see butterflies, and what people have thought and said about them, over time. Well, I created a really simple Python library wrapper for the API. Define the facets that you want, define the sort parameters that you want. One of the things that we've wanted to do, but haven't had time to do yet, was to go out and get related works. OCLC has an xID service, they call it. The suggestion is that when an ISBN shows up, we can call out, get the related items, and create a work, in FRBR-speak. That's the work level. It was really great, in just a few hours, to get all that done. So again, I want to thank everybody. Yeah, that was an amazing day. I want to thank Martin and Chris again. Those were most of the demos that you didn't see presented live here today. That was all in one afternoon. That was, to me, astounding. I think it made us all feel really, really hopeful. An amazing set of people as well.
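The "simple Python library wrapper" described above might look something like the sketch below, which only builds the request URL with facet and sort parameters. The endpoint path and parameter names (`facets`, `sort_by`) are my assumptions for illustration, not the hackathon wrapper or the actual platform API.

```python
from urllib.parse import urlencode

class DPLAClient:
    """Minimal sketch of an API wrapper like the one described
    in the hackathon round-up. The endpoint and parameter names
    here are guesses, not the real API."""

    def __init__(self, base_url="http://api.dp.la/v1/items"):
        self.base_url = base_url

    def build_request(self, query, facets=None, sort=None, limit=20):
        """Assemble a search URL with optional facets and sorting."""
        params = {"q": query, "limit": limit}
        if facets:
            params["facets"] = ",".join(facets)
        if sort:
            params["sort_by"] = sort
        return self.base_url + "?" + urlencode(params)

client = DPLAClient()
url = client.build_request("monkeys", facets=["source"], sort="title")
```

A real wrapper would then fetch that URL and parse the JSON, but even this thin layer is enough for an afternoon hackathon to pipe results into ViewShare, maps, or timelines.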
So thank you for coming. Okay, so April 2013: here's what we think we can get to, with a lot of help from all of you. A working API and ingestion engine. It's not going to get everything, not even close, but it will be, we hope, automated enough. A pretty good collection of worthwhile material, that is, metadata pointing out to many of the astounding collections that are available, as well as whatever digitization the DPLA has sponsored. We hope to have something like Internet connectors specified. And we hope for lots of community-developed apps and integration with sites large and small. We are not doing a front end. I think I've been clear about that, but I also want to point out the following. The metadata will be incredibly messy. It will be nowhere near the point where you look something up and get back everything that's relevant about it. We're dealing with metadata that's coming in from as many rich sources as we can get. Sources that were not designed with supporting the DPLA, or even being interoperable, in mind. The ingestion will be clunky. It will, as Paul says, perhaps have a mild case of indigestion. It will, we hope, work and support the set of nodes and collections that will make the platform worthwhile, but it's going to be really hard and difficult and uneven. And most of all, we can promise that this will be wildly imperfect. It will be an alpha at best. That's what we're aiming for, because a year is not very long. The only way we can do this is with help: from you as individuals, from your institutions, from the work streams, from the mailing lists, from other labs, from other application teams, and by reusing massive amounts of code, because it's an open-source world. So this is a plea for help, because this is a gigantic undertaking.
But we think that this approach in general, with all of the room for argument and discussion and improvement, this approach of building a platform that gives open access to open metadata through a set of services that will support what developers want to do, has the possibility of generating the type of innovation that the web itself was designed for. These are some links. Thank you very, very much, and we look forward to talking with you. And thank this team, which is an amazing, awesome team. I believe we have two minutes for questions. We'll do our best. You actually have an API that we can write apps against? Is that what you guys are doing? Yeah, that's all this. And some of those links will lead us there, is that it? Yes. Great. Everything that you saw was live in my hotel room, against the API, against real data, in real time. Absolutely. Please, kick the tar out of the API and let us know. Cool. Just a request to put up the URLs again, please. Ah. I was just wondering, I see that you are using PHP, and I was wondering why you chose PHP. I'm not going to answer that one. In my opinion, the language doesn't really matter. It was one of the common languages across our team, so it seemed like an easy choice, and we're certainly not wedded to it. I've built stuff against the API using Ruby and Rails, and my app that I showed, the covers app, is purely client-side. So it's just a choice. If you have further thoughts about that, see us afterwards, get on the mailing list; we would love to hear from you. And somebody else is going to have to tell us to get off the stage. One question? Just how much content is available today through the platform? If by content you mean metadata... No, content. Ah. So this is a metadata server? We are not aggregating content directly. We don't have a repository for content, not at this point.
We will for DPLA digitizations, which don't yet exist, but the primary, the first aim of this is to aggregate metadata that will point out to content. Nineteen million records, though. Thanks.