Good morning. My name is Tim Cole from the University of Illinois. I am Mathematics Librarian and Professor in the University Library, and I'm an affiliate professor in our iSchool's Center for Informatics Research in Science and Scholarship. And I better make sure I'm muted too. Okay. At the W3C, I'm past co-chair of the Web Annotation Working Group, co-editor of a technical note issued by that working group, and currently the co-editor of a prospective new recommendation being developed by the recently chartered Publishing Working Group. This is an update on the release of the W3C Web Annotation standards and an early look at the ongoing work of the Publishing Working Group currently underway at the W3C. In regard to the latter, before we get into the main presentation, let me acknowledge that I've liberally borrowed from the slides created by two of the co-chairs of the working group, as well as Dave Cramer and Ivan Herman of the W3C.

So I'll be focusing on developments in the last 12 to 16 months, but regular CNI attendees will be well aware that the groundwork for the Web Annotation specs was laid through most of the last decade by a lot of people: a major effort provided by many individuals. So to provide a little context for those of you who are not familiar with the recent developments in web annotation, this is my single history slide. Several projects led to the chartering of the Web Annotation Working Group. It started with the Mellon Foundation-funded Open Annotation Collaboration project that began in 2009, with Jane Hunter, Herbert Van de Sompel, Neil Fraistat, Dan Cohen, Rob Sanderson, and myself as the PIs. In 2010, the Annotation Ontology initiative based at Harvard and Manchester got underway, with Paolo Ciccarese, Tim Clark, and others. In 2011, Dan Whaley founded Hypothesis, which works extensively with annotation now. By 2012, the Open Knowledge Foundation had released JavaScript-based annotation tools.
And by 2013, we were seeing work coming out of Europeana and other areas in Europe, which led to the Annotorious image annotation libraries and the Pundit annotation tool. 2013 also saw the first I Annotate conference and the creation of the W3C Open Annotation Community Group, and separately the creation within W3C of the Digital Publishing Interest Group to begin exploring web publishing issues. So we had built up a lot of momentum by 2014. The W3C agreed that critical mass with regard to web annotation had been reached and chartered the Web Annotation Working Group. The working group was able to leverage all that activity and interest, which led to the release in February of this year of the recommendations and specs, three recommendations and two notes from the working group, and also led to the development and furtherance of the publishing interest activities, and so to the chartering in June of the Publishing Working Group.

Very quickly, I'll show you the specs and then talk a little bit about them. So let me enumerate the web annotation specs published in February. First, the Web Annotation Data Model, which gives developers of both annotation clients and servers a lingua franca to exchange web annotations. The Web Annotation Vocabulary, which is the formal ontology that underpins the data model; the data model is RDF and linked open data compliant. The Web Annotation Protocol, a framework, at this point a little loose, for connecting annotation clients to web annotation servers; I'll say one or two words about this. The working group also issued a note describing in particular the submodel for specifying segments of resources, known as specific resources in the model, for when you're talking about segments of a resource that you want to use as a target or a body of an annotation. And finally, a summary note describing possible approaches for embedding web annotations in HTML. This presentation will be up on the CNI website soon.
I've included, for those of you who like reading W3C specs (and God help you, by the way), the URLs you can use to find these specs; have at it. For others who want just a sense of what these documents are about, let me give you the ten-minute tour, and you can ask me more questions at lunch or later. I'm also going to talk a little about the implementations we're seeing emerge.

So at its core, there's a very simple idea, and those of you who've seen Herbert or Rob talk about this work over the years will have seen this diagram. Let me highlight three critical facets of this very simple model. First, every annotation is itself an object on the web, which can in turn be annotated, shared, and stored separately from the content being annotated. And that's important. Second, the body (the contents, the comment, if you will, if it's a simple comment) of the annotation is not necessarily part of the annotation object itself; it's in its own little box there. It's an addressable object in its own right that can be referenced, and in some cases, if you're, for example, tagging, it may be a tag in an ontology somewhere else entirely. Third, annotations have a structure, and so they can't really be captured as a simple URI. Instead, the data model specifies serializing annotations in JSON-LD, the linked data version of JSON, JavaScript Object Notation.

A brief anecdote about this last point. The Web Annotation Working Group charter was presented to the W3C Advisory Committee in April 2014. The AC is composed of a rep from every W3C member. Doug Schepers and I presented the charter at the AC meeting in Boston that year. During the Q&A, someone from the audience got up and volunteered that they knew of the perfect scheme for defining annotation targets and annotations. It's similar to how fragment identifiers work.
You just put double hash marks instead of a single hash mark, and you do this and that, and everyone will be happy, and we'll just be done like this, with no working group needed. At that point, someone from the back of the room piped in with: if we do that, we'll break the web. That someone turned out to be Tim Berners-Lee. The charter review went favorably, the AC decided to charter the working group, and we went forward from there. But now you know why we don't try to make URIs out of annotations; we actually use JSON.

Okay, all very nice in the abstract, but concretely, what kind of things can you do with the web annotation data model? I'm not going to spend a lot of time here, but I want to show you a little bit of what annotations look like in this data model. Simple use cases are relatively straightforward, and in fact almost human readable. To be clear, annotations are created and consumed by software, but some simple annotations are easily comprehended at a glance. For example, there's the idea of a bodiless annotation, shown in the upper left corner of this slide. We were surprised when we began the Open Annotation Collaboration that, even among scholars, this is a very common and important use case: just empty bookmarking, just like swiping a highlighter across or circling something in a text. Okay, so here you have a four-line annotation, as simple as you get. The first and third keys, context and type, have fixed values, and they're the same in all annotations. The id key gives identity to the annotation. The last key, then, is the important one: that's the target. Here it's a simple URL of the web page being annotated. For slightly more complicated annotations, like sticky-note kinds of things, there are two options. The lower right shows what's called a bodyValue annotation. It's a shorthand that may be used when the comment is plain text and extremely simple, and you don't need to talk about the language of the text.
And you don't want to reference the comment later on separately. The lower left is a longhand version of the same annotation. This pattern has advantages if you want to annotate with Markdown or HTML, like including bold or italics in your comment text, identify the language of your comment, or be able to reference this annotation body separately. And of course, critically, the model allows you, both on the annotation itself and on the bodies that you create for annotations, to provide provenance: who created it, when they created it, that kind of thing. More complex annotations then require more complex JSON, the way it should work.

One more powerful feature of the annotation model is the concept of specific resource descriptions, essentially a vocabulary for identifying resource segments in the annotation target or body. I won't take time to detail how this works, but if you have questions, see me at lunch or drop me a note. This just illustrates that the model does anticipate different ways you might want to target text, or graphics, or whatever. You can talk about byte streams, you can talk about text streams, you can talk about quote selectors, and so on, and even find advice in the model on how a particular community might extend this. Similarly, the model provides a means to deal with multiple bodies and targets, which comes up in more complex scenarios, or, as illustrated in this example, to tag the motivation for creating the annotation and/or tag the role of each body in a multi-body annotation. This part of the model, the motivation and purpose part, is almost certainly underspecified. There's a paper that Allen Renear, Jacob Jett, Dave Dubin, and I did in the 2016 Balisage proceedings about this. But the reality is that there are only a few tools actually sophisticated enough to allow experimentation with how motivation and purpose might be used.
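Since the slides themselves aren't reproduced here, sketches of the two patterns just described look roughly like this; the example.org IRIs are placeholders of mine, not from the talk. First, the four-line bodiless annotation:

```json
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno/1",
  "type": "Annotation",
  "target": "http://example.com/page1"
}
```

And a longhand annotation whose body carries a format, language, and purpose, and whose target uses a specific resource with a selector to point at a segment of the page:

```json
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno/2",
  "type": "Annotation",
  "body": {
    "type": "TextualBody",
    "value": "Note the <em>emphasis</em> here.",
    "format": "text/html",
    "language": "en",
    "purpose": "commenting"
  },
  "target": {
    "source": "http://example.com/page1",
    "selector": {
      "type": "TextQuoteSelector",
      "exact": "an interesting passage"
    }
  }
}
```

The bodyValue shorthand mentioned above simply replaces the whole body object with a single key, e.g. `"bodyValue": "A plain-text comment"`.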
So for the working group, we felt like more experience was needed before we got more sophisticated than this fairly simplistic model. I could go into a lot more depth about the model, but I don't have time today, so I want to change gears and mention the protocol briefly. At a high level, this document is about how annotation clients and servers interact. So the protocol spec talks about what a client application, an annotation tool, can do with an annotation once it's created, and how it might send it to a server. That tool could also go to that server and discover whether there are other annotations involving a resource of interest. The protocol is based on the W3C Linked Data Platform recommendation. It was written to follow the five basic principles laid out here: stay consistent with the architecture of the web; follow REST best practices; expect annotation clients to talk to servers using HTTP, the protocol of the web, and vice versa; take maximum advantage of existing specifications; and, while keeping it simple was a priority for us, we realized those first four principles really take precedence for the most part.

To return to the comments yesterday about annotations and the security and privacy of annotations: the protocol does not define how to do that. Instead, the protocol, like most current W3C recommendations, including the Linked Data Platform specification, relies on the built-in mechanisms of HTTP to manage permissions to annotate and so on. Basically, if somebody tries to do something, using your annotation tool and sending an annotation to the server, the server has a right to say: you're not an authenticated agent, I'm not going to let you do that. And it uses HTTP to authenticate.

So I want to turn to the testing that we did as part of this process, before we published the recommendations as final recommendations, and a little bit about the implementations we've seen emerge that are using those specs.
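Before getting to testing, to make the protocol concrete: under the LDP-based protocol, a client creates an annotation by POSTing it to an annotation container on a server, roughly like this (the host, path, and Slug value are placeholders of mine):

```http
POST /annotations/ HTTP/1.1
Host: example.org
Content-Type: application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"
Slug: my-first-annotation

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "type": "Annotation",
  "bodyValue": "A short comment",
  "target": "http://example.com/page1"
}
```

If the server accepts it (and here HTTP's own authentication machinery can refuse an unauthenticated agent), it mints the annotation's IRI and replies along the lines of:

```http
HTTP/1.1 201 Created
Location: http://example.org/annotations/my-first-annotation
```

A GET on the container then lets a client discover existing annotations involving a resource of interest.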
At the W3C, testing is about proving the technical viability of a recommendation; it's not about proving the correctness of the recommendation in some higher-level sense. It's basically about ensuring that the required features that you talk about in the recommendation can be, and have been, implemented, so we don't put a recommendation out there that says you're required to do something and it turns out it's very hard or impossible to do. So this is a screenshot, just a fragment, of our test report; the link is there if you want to go see more. Ten developers submitted JSON created by their implementations, and we tested all the files submitted to see exactly which features they had successfully and correctly implemented from our required feature set. We wrote over 160 tests, each comprised of one or more JSON schema checks, over 300 JSON schema fragments in all. We then made sure that we had at least two implementations of each required feature, or we stripped that requirement out; there were only a couple that came up that nobody had chosen to implement. The first column basically shows the feature. So we wanted to make sure that you could do an embedded textual body and have the required properties that we specified, and there were three schemas that you had to pass in order to do that. The right-hand columns are the abbreviations for the implementations. For that particular required property, four implementations had that particular kind of body, a text body. For using external resources as targets and bodies, eight of the ten had done that, and so on. As I say, you're welcome to go look at the actual report. At the bottom it's got these nice little green and red and yellow boxes, very festive to look at this time of year. The other thing we've been pleased to see is the number and diversity of implementations.
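To give a flavor of the kind of feature check those JSON schema tests performed, here's a simplified sketch in Python. This is illustrative only, not the working group's actual test suite, and the function and variable names are mine:

```python
import json

# Fixed value the data model requires in every annotation.
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"

def check_required_features(doc):
    """Return a list of failure messages for one submitted annotation."""
    failures = []
    if doc.get("@context") != ANNO_CONTEXT:
        failures.append("@context must be the fixed anno.jsonld IRI")
    if doc.get("type") != "Annotation":
        failures.append('type must be "Annotation"')
    if not isinstance(doc.get("id"), str):
        failures.append("id must be an IRI string")
    if "target" not in doc:
        failures.append("every annotation must have a target")
    return failures

# A submitted file would be parsed and checked like this:
sample = json.loads("""
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno/1",
  "type": "Annotation",
  "target": "http://example.com/page1"
}
""")
print(check_required_features(sample))  # an empty list means all checks passed
```

The real tests were expressed as JSON schemas rather than hand-written code, but the shape of the question is the same: does this submitted JSON exhibit the required feature or not?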
These are specialized tools that implement the web annotation specs, or at least parts of them. We've had discussions as well with browser manufacturers, but at this stage annotation implementations require embedded JavaScript libraries or plugins in order to use. We're still seeing quite a bit of use, and we may see some progress on that, or we may do some polyfills showing how they can be used more natively in browsers.

This screenshot is from Hypothesis, under the general heading of publishing on their website. It illustrates some of the work that they're doing with publishers to add annotation features to publishing platforms. In particular, these are some of the partners that have joined with them in doing work on annotations in their applications, and as of last week you can actually add Ingenta to this list. Hypothesis, for those who don't know, founded by Dan Whaley, is a non-profit and does everything open source. They have a nice GitHub repository; you can actually contribute, under certain terms, to their code base if you want to, and you can borrow from it and use it in your own tools and applications.

Another application is an Italian project started several years ago called Pundit. It got started in part with help from Europeana. It's very similar to Hypothesis, and one thing they've done that's very interesting is interoperability experiments: tag or annotate something in either Hypothesis or Pundit and use the other client to see what it looks like, and that seems to work very well, a very good proof of concept for the interoperability which, after all, is one of the main purposes of these specs. They have a GitHub repository, which is primarily AngularJS, and a developer site, so again open source software that may be of interest.

Here's a relatively new initiative within Europeana: an API for annotating any object in Europeana.
It's at the very early alpha stage right now, but watch this space over the next six to twelve months. They do expect to get to a mature beta in 2018, so we'll see that work, and it's very interesting to see an API that allows you to create annotations about interoperable objects and store them with Europeana; and again, both a GitHub repository and a nice Google discussion forum for the API.

This is something you may be familiar with in another context: the International Image Interoperability Framework (IIIF) initiative. Rob Sanderson, one of the chief instigators and architects of IIIF, was also co-editor of the web annotation specs as well as co-chair of the Web Annotation Working Group. The reason is that in IIIF, in the Presentation API as it describes them, annotations are used basically to associate images with the canvas, which is the underlying model of IIIF, originally called Shared Canvas when it was first worked on. So this is a very interesting, a little bit novel, use of annotation, and it's already at work. By the way, Rob is just finishing his paternity leave. He should actually be here co-presenting this with me, but he had some family obligations. He says hi, but the leave has given him lots of time to go work on the next version of the Presentation API that will be out next year, and it will bring IIIF fully into line with the final recommendations that came out of the W3C web annotation work.

Finally, just a real brief plug for one of my own projects, with an acknowledgement of the generous support of the scholarly communications program at the Andrew W. Mellon Foundation. We've been transforming our now-venerable Kolb-Proust Archive into IIIF as a step toward making better use of linked open data. This is an archive built from digitizing Professor Philip Kolb's notes, which he and his students took over 25 or 30 years in the last century.
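Backing up to IIIF for a moment: in Presentation API 2.x terms, "painting" an image onto a canvas is itself expressed as an annotation, roughly like the following sketch (the URIs are placeholders of mine):

```json
{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "http://example.org/iiif/anno/p1-image",
  "@type": "oa:Annotation",
  "motivation": "sc:painting",
  "resource": {
    "@id": "http://example.org/iiif/page1.jpg",
    "@type": "dctypes:Image",
    "format": "image/jpeg"
  },
  "on": "http://example.org/iiif/canvas/p1"
}
```

The revision of the Presentation API mentioned above is what will align this vocabulary with the final W3C web annotation recommendations.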
Basically 20,000 or so note cards, literally note cards. They've been digitized, put into TEI initially and now into RDF, and they mention a lot of people: more than 7,000 names associated with Marcel Proust, his writings, his letters, his life and times. In fact, Kolb was the editor of more than 20 volumes of the correspondence of Marcel Proust. We've gone in and made it into RDF, and now we're starting to mine this information, looking for linkages to VIAF, to Wikipedia, and so on, and finding which families the various names are associated with: many of the prominent families of the time, like the Rothschilds. This is just a way of visualizing some of that. Every node here is a family with a member that is mentioned on the same card as one of the Rothschilds; the Rothschilds were mentioned on 43 of his cards. Obviously, the next thing you want to do is begin to annotate this graph, which again is different from annotating pages of text or even images. This is kind of fun. This just shows a couple of screenshots from a very crude interface we've been working on: the screenshot in the upper right, where I'm about to add a comment to the Rothschilds node, and then the lower left, where, as we get links as annotations, we can build those into the right context so you can link out to, for example, Wikipedia entries that have been added by annotation.

Finally, in terms of implementations, before I turn to the publishing work at W3C, let me just mention that an Apache Annotator incubator project has been established to see if we can get web annotation libraries into the Apache software universe. This will allow us to do more work on annotation services in particular, and potentially make it relatively simple to configure servers to capture, for example, annotations people want to make about content, or some part of the content, on your website.
The incubator project has not proceeded as fast as we had hoped, but it is proceeding, and it's getting a lot of input from the Hypothesis folks, from Benjamin Young at Wiley, and from other people, and I'm hoping this ends up being yet another avenue for the influence of these recommendations on interoperable annotations.

So I'm going to spend the few minutes that I have left turning my attention to the web publishing activities within W3C, which have been around for a while but have gotten a big boost this year for a couple of reasons. In particular, the web annotation working group showed some viability for annotation, which is important to publishing, and then in January of this year the International Digital Publishing Forum, the people who created EPUB, merged with W3C, and EPUB 4, or changes to EPUB 3.1, will now come out of W3C. So with that initiative and that momentum, W3C in June chartered a new working group, the Publishing Working Group. In slightly fewer words than the official goal in their charter, basically the goal of the Publishing Working Group is to bring traditional publishing more into the web world and the open web platform, and to bring the open web platform a little closer to traditional publishing. We'll see how that works out.

So what's the problem space that they're really trying to tackle here? Why can't we just put web publications up as HTML pages and be done? A lot of people in the W3C community think that's the right answer: just put it up there, add some JavaScript, and you've got a perfectly fine platform for publishing traditional journals, books, everything else.
It's not entirely clear that's true. It doesn't deal with the fact that there are other things besides browsers that use the web, like reading platforms. And basically the web is very resource-centric, where resources are conceived of as single, simple objects, basically a file, like an HTML source. There are ancillary files, like style sheets and JavaScript, that basically help you present and render that file, but they don't contain the content for the most part; content generally is in the resource that you get, whether it's drawn from a database or an HTML resource. Web publications are a little bit different, in the sense that structure is more complex and structure is more important. If you have an intellectual object, say a book with chapters, or an article with sections and proofs and a bit of video, each of those in separate files, now you have to worry about whole-part relationships, first/next/last, reading order, conditional sequencing for certain kinds of modern books, offline reading of books. The open web platform, sophisticated as it is, is not fully adapted to treat such objects well. Currently EPUB basically wants to treat all internal resources as XHTML files, zip everything up, and call it a day; that also is limited. So what we're trying to do here is finally to talk about resources that contain other resources, that are publications in nature, in a sensible fashion that can be rendered by the various systems.

The working group has been charged to create at least four deliverables; we have a fifth deliverable in process as well, an outgrowth of the web annotation work. The four required deliverables are listed here. EPUB 4 will come out before this working group is finished.
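To make the whole-part and reading-order problem concrete, a web publication manifest might look something like the following. This is purely illustrative: the working group's actual manifest vocabulary was still being drafted at the time of this talk, so every key name here is hypothetical:

```json
{
  "type": "WebPublication",
  "url": "https://example.org/my-book/",
  "name": "An Example Web Publication",
  "readingOrder": [
    "cover.html",
    "chapter-1.html",
    "chapter-2.html"
  ],
  "resources": [
    "style.css",
    "figures/proof.svg",
    "video/demo.mp4"
  ]
}
```

The point is simply that something beyond a single HTML file has to carry the list of constituent resources and their default sequence, whether the publication lives on the web or in a package.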
Another important deliverable is the DPUB-ARIA deliverable, which is about making EPUBs and everything else much more accessible. Version 1.0 of that is now being voted on, based on work done with the interest group; ARIA is the group within W3C that does accessibility work. I'm very optimistic, very pleased by what's happening there; version 2 will come out over the next two years.

In trying to achieve its deliverables, the web publishing working group is looking at several items, a number of technical issues, some of which are listed here. For locating and identifying segments within a web publication, we're going to borrow heavily from web annotation. Notice the mention of archiving; I was talking to Nick earlier, and we need more help. We need more librarians in general in the publishing working group, and we definitely need archives, or experts on things like archiving. If you know any, please prompt them to talk to me. At the very least we'd like to get their input, which they can provide even without being part of W3C, because the work is done transparently.

There are a lot of publishers involved in this; with the joining of IDPF, a lot joined W3C, and while they all agreed this merger was wonderful, they do not want to lose the idea that the web is also still, and always will be, they hope, a web of documents, like publications. So the members of the working group think about things in a document-oriented fashion, and it's interesting to see how this meshes, in our meetings and our calls, with people who don't think about a document-centric web anymore. At the same time, there's no false sense that everything is or should be an EPUB. The publishing working group is not just creating EPUB 4; they're creating a more general specification for web publications, and for packaged web publications: web publications that are created, and often live, offline, but that can be expanded and brought to the web because they were created in the
right way. We're also looking at prior work: the open web platform's service workers, web application manifest, HTML nav tags, link references. We're looking at what EPUB did with manifest and spine, fragment identifiers. We're looking at personalization, what browsers are doing as well as accessibility personalization efforts, and looking at annotations, offline again, archiving of objects, and so on. So that's kind of where web publishing is, just getting started. In January we release our first public working drafts of at least two, hopefully three, specs. Please take a look at those. They are transparent and open; anybody can look at them, anybody can go to GitHub and comment on them. So look for the Publishing Working Group, find our GitHub site, and add to our issues so we do a better job with the specs, and think about things like what libraries have to have to make EPUBs work well.

And then think about joining community groups. The nice thing about W3C community groups is you don't actually have to be a member of W3C to join a community group. A new community group that just got started, for people who work in museums or have segments of libraries that do museum-like work, is the Art and Culture Community Group. Rob Sanderson and a colleague just opened this a month ago at TPAC, the W3C's technical plenary, and it's got 40 members already; they're just getting organized. If these things are interesting to you, you might want to join a community group. It's a gateway drug: I joined the community group for open annotation, and a year later I managed to get my institution to join W3C, which for nonprofits is not terribly expensive, and then I was able to participate in the working groups, and it was a good thing. To be honest, for those of you at educational institutions, the challenging part of getting to join W3C is less the budget cost than it is getting your office of technology management to sign off on the no-patents kind of policy. Basically, you don't put stuff in W3C that you're going to try to
patent, and you have to convince them that what we're doing for W3C is not patentable anyway. So, I did not leave much time, but there's a couple of minutes if people have questions.