Good afternoon, everybody. I'm Diane Harris. I'm Dean of the College of Humanities, and I'm delighted to be welcoming Dan Whaley to our campus this afternoon. I'm just going to give Dan a sort of informal introduction. I've known Dan for a little while. We first met when I was director of the Humanities Center at the University of Illinois at Urbana-Champaign, where Dan got his undergraduate degree in English. But I first met Dan, actually, at a small meeting that was convened by the Mellon Foundation to talk about digital humanities and the future of several digital humanities projects. I learned a little bit more about Hypothesis afterward and became very enthusiastic about it. But before I say a little bit more about that, I'll just say again: Dan has an English degree from the University of Illinois, and he then went on to do some incredibly important innovations in the digital world, like inventing the first travel reservation platform and also, I believe, the first e-commerce platform on the web.
These are major innovations that have impacted all of our lives every day since they happened. It's also exciting to me to think about how someone with an English degree takes their knowledge of the kinds of things that we teach students to do in an English department (close reading, learning and understanding how people think, how humans interact with the worlds around them, the kinds of things that matter to us in the humanities) and then can parlay that knowledge into building the kinds of tools that are incredibly important in our everyday lives, and that probably, I'm going to venture to say, would not be as useful to us if they were not designed by someone who had the kind of depth and breadth of understanding that Dan has, I think, because he also comes from a background in the humanities. So it's just a delight to have him here with us today. I really wanted Dan to come because, although I have not yet had the opportunity, as I've been doing administration mostly since I met Dan and learned about Hypothesis, I'm very, very convinced that this is a tool that has the potential to shape the way we work in higher ed more than almost any other innovation in the digital humanities or in the digital world. I think it offers the potential for new pedagogical practices, really innovative pedagogical practices, which some of our faculty are already pioneering, but also new modes of collaboration among scholars, at a time when being able to work together to produce critical commentary that's also authoritative commentary on any content that's out there on the web is more important than ever before. So with that, I won't take up any more time. I'm going to let Dan come up here and tell us about this wonderful tool. So welcome, Dan. Thanks, Dan. Thanks, everybody.
This is what I'm going to talk about. So thanks for having me; it's great to be here. It's my first time at the U of U, or U squared, or whatever you guys affectionately refer to it as here, and I'm super impressed by all the stuff that's going on. I got to see the new DH Matters, or the Digital Matters, group (thanks, Rebecca and everybody), and the university experience has come a long way since I have been in one, so it's really exciting to see what's going on. So I'm going to tell you a little bit about a very interesting new project (some of you may have heard about it), an interesting new paradigm, why I decided to divert from the for-profit world into this area, and a little bit about what's happening and what you guys might expect to see over the next couple of years. So, I work with a nonprofit called Hypothesis. We have about 15 people; I'll tell you a little bit more about it later. We have a vision within the scholarly world, which is essentially to bring an open, collaborative layer over all knowledge and all the bits and pieces that make up knowledge, in terms of articles and books and images and art and media and so forth. These effectively are all on the web now. The amount of undigitized information that's still not on the web is getting smaller and smaller. You guys are an active part of HathiTrust and are continuing to scan your archives, and we're getting to the point where it's basically all on there, and certainly the most important parts in terms of what they mean for our daily lives. So we think, and this is the reason why Hypothesis is a nonprofit, that it's super important that this technology be open, based on standards, and really built for the long term in terms of what it can mean for all of us. And the plan for how to do that, in a nutshell, is really to partner with the world's great publishers, libraries, and platforms like ORCID, Internet Archive, HathiTrust and so forth to develop,
design and roll out this platform so that you guys can all use it. So the problem statement is kind of like this: here's an article, and it's been published (actually, I think this is on arXiv, so technically it hasn't been published), but it's basically gone to die. Somebody may cite it at some point later, and then it will continue to have a life in the citation history of a paper, but there's no way for you to see anything, really, about what other people think about this. If it's been cited, perhaps you can see what other formal papers are saying about it or how they're referring to it. You could Google the title and see whether Google might get lucky and give you a blog post about it or something like that. But you can't see on the article, like a heads-up display or something, what other people think. You can't see if there's been a correction. You can't see if there's been a retraction. You can't see if somebody tried to reproduce an experiment at a different temperature, and the author of the paper responded and said, "Oh no, try 85 degrees instead of 83 degrees." So there's a lot of information that you're lacking because of the way publishing works and has worked for a very long time. There's been a vision for how this might be different that goes way, way back. People that cite early influencers point to Vannevar Bush, who was the first head of the Office of Naval Research and Time Man of the Year in 1943, and really a big thinker about information. He imagined the web and how we might get here, and he kind of conceived of the concept of Wikipedia, this collaborative thing, even before the infrastructure to create that thing might be possible. But he really first talked about this idea that all of us together might come collaboratively and create trails
through knowledge, sharing our perspectives on the things that are out there in addition to just sharing the things that are out there. Then Tim Berners-Lee, in his rough proposal for the web in 1989: one of the nine foundational principles of the web that he wanted to create was that it should be able to be annotated. Surprisingly, not only the nodes, the pages, but also the links between them; those two things, surprisingly, are the two fundamental design principles today in the way that annotation has been imagined to work. In 1993 Marc Andreessen, a U of I alum who many of you may know, created the first graphical web browser, called Mosaic, and then later he moved to Silicon Valley and started the company called Netscape, which is essentially the spiritual origin of Mozilla, which brings us Firefox. So in 1993 he put out a note on a bulletin board saying, "Hey, I'm going to start a group annotation server and build it into Mosaic." And so they did that. They turned it on for about a month, so the only browser that everybody was using back then was able to annotate every page on the web, and you were able to see the annotations in groups of people that you wanted to join with. And then they turned it off. He's written a blog post recently about how much he regrets that. Even though they didn't have the funding or the resources or whatever to be able to run this server, which would have had to scale to web scale with every new page that was added, he still regrets not having this 25-year history of layers of annotations and thinking on top of this web. So what we've gotten instead is this thing called the comment widget, which is down there at the bottom of pages. It's really terrible. First of all, it's below the fold, as an afterthought at the bottom. It's implemented by the publisher; it's their agent, their vehicle, not your vehicle. It's all proprietary tech; there are no open source implementations of commenting systems that anybody
uses. It's a haven for trolls, it's poorly moderated, and it's really poorly conceived as a way to generate high-quality signal versus low-quality. But the bigger problem is that it's only on pages where it's implemented, so most of the web is quiet. You go there, you see the page, but you don't see any activity or thinking around it. If you want to see that, you go somewhere else: you go to Facebook, or you go to Reddit, or you check Twitter to see if anybody's pasted the URL in there and what they might have to say about it. So we have an interesting technology that's been with us for a very, very long time, called the annotation. The Talmud and illuminated manuscripts from the Middle Ages have these things, and the reason why people started doing them is because they were useful. They placed the thinking of a person on the page (they didn't really mind scribbling in books back then too much), and so it was a way to layer thinking from other people as a kind of living guide and living legend as people went. There are other solutions out there, more in the STEM areas. People might be familiar with services like ResearchGate and Mendeley and Academia.edu, where you can take a document and upload it and have a collaborative conversation there. That's really kind of forking the content: you still have to do it over there; you can't really discover it where it is. So the idea, the promise, is to build the capability to have, number one, annotation, and number two, the most important thing, layers of annotation on top of the world around us: layers for any purpose that you might want, for personal use, for classrooms, for small groups, for public channels, for specialty communities that annotate for a specific purpose. Machine annotation is something that we're seeing a tremendous amount of interest in, so lots of different groups want to create machine reading systems
that will take and process, do entity extraction, and create annotated service layers that may be a benefit for many folks. So these annotations, like I said: lots of layers, lots of purposes. The other thing that's really important about the way that people have come together to conceive of this is that these layers of annotation can come from different servers. So this is not a monolithic new Twitter, where all tweets run through Twitter, or Google Docs or something like that. This is kind of like the web, where anybody can run a server and anybody can build the client software that would be able to see these annotations or render them. That way you can have a user interface control, a paradigm, that lets you see different layers, and those different layers could be coming from different places. This is all based on a new web standard. As of February this year, the W3C, which is the standards body for the web (so they determine how HTML and CSS and other things like that get extended), has ended a four-year process of bringing this technology, called Web Annotations, through to formal approval. This means, number one, that people can have a shared understanding of how to build these technologies in a way that will be interoperable, and number two, that browsers can start to build this stuff in natively, so that your browser will come built in with the capability to create and anchor annotations, and you'll have the freedom as a browser user to plug in whatever servers you want to be asking for relevant annotations on the pages that you go to. Not to get too geeky, but annotation in W3C-speak involves four things that we commonly know of as discrete concepts; they're actually simply permutations of the annotation model. An annotation is the fully specified thing: it's got a note, it points to a page, and then it points inside the page to a bit of text. A comment doesn't point
inside the page; it's just a page-level thing. A highlight points inside to a sentence, but it doesn't say anything; it just highlights that. And then bookmarks don't point in and don't carry any text, but they mark a book or a page or a document in a way that we can save, archive and access later. We can start to tag all the bookmarks that we make in the same tag structure that we're using to make annotations or highlights or comments, and being able to combine these things in a fluid, single data model is very, very powerful. So the class of applications that are kind of in the crosshairs of this new paradigm are things like note-taking; things like tagging (Diigo, Delicious, Pinboard); things like discussion (Reddit, Facebook, Twitter); and bookmarking, stuff that tends to come natively inside your browser. The W3C paradigm is format agnostic: all it needs is a selector to be specified for the particular data type. So text, EPUBs (which are text), images, video, data, you name it. We'll show you some examples. My goal is to try to speed through this and get to the demo and show you how it actually works. So we're a nonprofit, like I said, funded through the support of the Mellon, Omidyar, Sloan, Helmsley, Shuttleworth and Knight foundations, thankfully. We have a team of about 15, and all we do is annotation. It's about half software developers and half program staff that help focus on verticals and adoption. The goal for us is to be shared, neutral infrastructure, both from a services point of view, to the point where people want us to run or host their annotations, and also from a code point of view: to produce the absolute best-of-breed open source reference implementation of this new technology, and to partner broadly and protect and preserve this paradigm to the best that our organization can, in a way that aligns long-term with users. We do run a service ourselves at Hypothesis. You can go to our
website, you can get a user account, and you can start annotating the web right now. We have about 100,000 users, and we're on track for about two million annotations by the end of the year. About a quarter of the annotations are public, so other people can see them. Fifty percent of them are in private groups, so for classroom use, or a bunch of authors making notes; sometimes people make groups just for themselves to help organize things. And then 24 percent, the rest, are personal private notes, just for personal use. So we try to think about what is going to drive the next billion annotations, which we think will start to happen very rapidly. There are things I'll talk about in a little bit. Number one is true federation of the client. Our client right now is still single-server. By the end of this year we'll have extended the client to be able to listen to any compliant annotation server, including multiple servers simultaneously. So you'll be able to log into two or three servers. A company might want to run one behind their firewall, where you're creating company-confidential notes and collaborations, while at the same time you're listening to see whether there were any public observations on that piece of case law. The client will also be able to automatically discover new servers as you go around the web. So the page might say, "There's a server over here that has authoritative annotations for this content," and the client will automatically go connect to that server and pull them in. That's really interesting for publishers and other kinds of use cases. Publisher and platform integration is the key, really, from our perspective, to use: making sure we support every format. You should essentially not be able to encounter something on the web that you can't reach into and annotate. That Google Doc: you should be able to annotate that. Even though it's got a commenting system, being able to use yours means that you can annotate
against your account with the same group structure that you used on that web page over there. So the goal is to break apart these silos so that you've got that same fluid, powerful capability everywhere. Spotify, or Shazam: what are you doing when you're tagging all those songs? You're annotating them. Who's got that data? Shazam. How come you can't plug your annotation data store into Shazam and then tell Spotify to listen to that as your structure of tracks? The promise out here is to start to use these open paradigms to break apart these silos. They're providing useful services, but these are very brittle ecosystems that aren't a good long-term architecture for humanity to do all the things we want to do. So a couple of years ago we pulled together a consortium of, first, scholarly and educational publishers that have agreed to start integrating this. So JSTOR, Elsevier, Wiley, HathiTrust, arXiv, PLOS: most of the large publishers are part of this now, and we're beginning to roll out the integrations into their platforms. Yesterday we just announced MIT Press; we'll be announcing NYU Press and a few others, and some of the other larger ones are very near to announcements as well. We also run a large workshop every year called I Annotate, with about 150 people that come together to hack and share stories and have a good time. So annotation is useful for a lot of things. Here are some user stories. I'm going to dive down into some of these, but there are a lot of different ways that you can use annotation. Let me power through a couple of these. The classic use case, kind of the equivalent of comments on the web, is post-publication discussion. So here's somebody who's annotated an article at arXiv, the same article I showed you before. His research was cited in this article, and so he's gone in and said, hey, it's really great, thanks for
citing my work, but folks might also want to look at this other one, and he goes on to make a few other notes in the preprint. This is a very interesting use case for preprints, because they're not peer reviewed, but potentially a collaborative annotation on top of them can be a way for the article to improve, perhaps on the way to publication. Here's an example of a researcher who is one of the co-authors of the paper, who's gone in later and annotated their own work to talk about some additional projects that they're working on. So this is good. One of the most authoritative people about a paper is, of course, the author, so how come they can't go back and have a system that's completely within their control, outside the publisher, which lets them provide additional context, corrections and so forth as time goes on? Publisher layers are something we're seeing an absolutely tremendous amount of interest in. The journals are kind of like, "Well, this is interesting, but I don't really want to stick this thing on my journal and just invite conversations and discussion that I can't moderate." The users, on the other hand, want to be able to go to that article and annotate, and not necessarily be moderated by the publisher, and bring all their communities and groups and everything with them. So how do you solve the standoff? Well, the architecture kind of lends itself to the solution, which is that the publisher can specify, in the page template for the article, an authoritative server for that article, and the interface will show that server. This is an article by eLife. eLife is about to roll out this technology; they're an open access life sciences journal. And so here the eLife discussion layer is on top. If you're not logged into Hypothesis, you won't see that there are any other annotation layers, so the casual person who doesn't even know about this will only see the
publisher-sponsored, moderated discussion. So eLife will have full moderation rights over that layer that they control, but if the person wants to sign in, create an account and go create groups or have a general discussion, they can. Interestingly, publishers have been okay with this compromise, where, until the person knows about it, their layers are the only ones that are shown. Peer review: I'll show you some more slides on this later. We have a big announcement this week: this technology is starting to be integrated into submission workflow systems. So with eJournalPress, which is one of the submission workflow systems that journals use, instead of writing a long-form review in a Microsoft Word document where you say, "On page three, section two, I had this note," you can just go annotate it, and those annotations flow into the eJournalPress system. But they're all formatted with the open standard, and the editors can now selectively go in, take interesting conversations, and publish them as a layer that will flow along and be visible on the eventual public published article. So this is an area that we're pretty interested in. Entity annotation is one I'm going to talk a little bit more about later. Illuminated footnotes: bringing interesting information in the citations, the footnotes of the bibliography, forward so that you can see what's being talked about, see whether it might be interesting, and then click straight through to that document wherever it is, without having to go to Google. Journal clubs: we've got a bunch of journal clubs now. So instead of just having to get together in person to talk about those three papers on Thursday night, you can still get together, but they might send the three papers out at the beginning of the week, and you can start to read them and make notes in that private layer, in ways that are confidential, protected and private to the
journal club. Especially if you're an early career researcher, you can ask that dumb question that maybe would have been obvious if you were 10 years further into your career, and do exactly what journal clubs are about, which is learning more about science. And then there's classroom use. We're seeing a lot of it: 60 percent of our use right now is classrooms, where the teacher will assign a bunch of material, and then the students can come and help each other, basically asking questions about that material, and the teacher can see, "Oh, what are people struggling with? How should I spend my time when we all come together during class?" We're starting to build shims for all the LMS systems. We have an alpha shim for Canvas, so you can drag and drop the material you've already put into Canvas, wrap it with the Hypothesis interface, and assign a group for the class, so that if you have different sections or whatever, they can all be in their own space. But then if you want to teach something that you didn't put in the LMS, students can just grab the extension and go to whatever New York Times article it was, and that same group that you've created inside the Canvas environment will be available. So it's a great way to start to blend the boundaries between these systems that have been very siloed. For instance, if you guys use Ex Libris here as an interface to library resources, Ex Libris has its own user interface and Canvas has its own user interface. There are shims between them, but they still don't really work like a single system. Here, no matter where you enter the library resource, through ProQuest, through Ex Libris or whatever, you'd be able to see the same annotations in the same layers that you were interested in. Another interesting use case we're seeing is communities. So here
the idea is: what if we were able to take the combined expertise of the world's most knowledgeable people and deploy it, really, as a public service over knowledge, news and so forth? So we have a class of folks called professional fact checkers (Snopes, PolitiFact, etc.), very small and understaffed. And then we've got another, really enormous group of folks, which is really everybody that is a special expert in their field, and much more credible. As soon as folks see Snopes, some people go, "Oh, great, that's what Snopes had to say." Other people go, "Oh, Snopes, right, I don't know those guys." But if it's the people that really know, the world's experts on the Zika virus, they've got a lot more agency and credibility, because they're specialists and that's what they do all day. So there's an interesting group called Climate Feedback. This is a group out of Berkeley. They run their own annotation layer, and they annotate climate news and rate it. So this is some article in the Telegraph. The article in the Telegraph might have quoted a few recent studies. Climate Feedback, because they're the world's top climate scientists, will just go find the person who wrote the article that was cited, bring them into the annotating group, and have him or her point out whether this was accurate or not, or what the nuance was that was lost in translation. So we're super excited about this paradigm of communities. We're about to launch our communities initiative this fall to bring more of these folks on, and we asked ourselves, what might the next hundred communities be? You can imagine specialty communities on constitutional law that would annotate leaked drafts of stuff coming out of subcommittee in Washington, or folks that have a lot of experience in cybersecurity or
hurricanes or whatever it is. If you're on a page where one of these communities has agency, you might all of a sudden see a little alert that says, "Hey, this high-profile group that a lot of people are following, with a high credibility score, is annotating. Do you want to see that specific layer?" Giving communities layers gives them agency and allows them to control their voice, whereas typically in comment widgets they're the first to leave, because the lowest common denominator basically controls the conversation. So let me show you how it works. I was going to be all cool and come up with a digital humanities example here, but let me just use a Springer article; this is an open access article on earthquakes. This capability is not implemented at Springer. This is just an article that's in my browser. I can turn on the Chrome extension, which is one of four ways you can bring the annotation technology to a page, and then I can come along and annotate. So I can go to the public channel, for instance, and I can say (hopefully I would say something more interesting than just "that's interesting") and make a note. And so there it is: the annotation is stuck to that piece of text. I can go on and make another one down here, and so now there are two annotations. I can use the cards to browse the page and take me to where those annotations are. I can grab a link to any annotation anywhere on any document and then immediately share it with somebody: put it out in a tweet, or embed it as a link in a text. And then it takes them straight to that article, even if they don't have the annotation technology, they're not aware of it, and they haven't downloaded the extension or anything. So that's how that works. And then, for instance, let me go download the PDF. So here's the same article in
PDF form. I turn on the extension, and the annotations that I made are automatically in the right place on the PDF. The technology can read all the metadata on the page. It knows what the DOI is; all the URLs that are in well-formed scholarly articles are in the metadata behind the page, and it automatically says, "Oh, this PDF is this article," and pulls them over. And I can go backwards the other way: I can annotate here and say "hey," and then over here there's a new annotation on the page. I can pull it down: an annotation on that part of the HTML that came from the PDF side. It also works on the fingerprints of PDFs. So one of the really cool things, which is harder to demo here, is that I can take this PDF and save it to my desktop, however I do that, and email it to you, and you can pull it from your email into your browser, turn on Hypothesis, and you'll see all the annotations on it, because it doesn't even need a URL. It can go just off the binary hash that's native to all PDFs. So that means that the whole corpus of scholarly articles that you've been saving for whatever reason on your hard drive, or carrying around as you move computers, or storing in a Dropbox or something like that: you can see the annotations on those without having to go back to the original version. Another thing that we're really interested in is data. So here's a CSV file somewhere on the web. This one happens to be a list of publicly traded companies and little bits and pieces of data on them. If I drag this into the Hypothesis annotator like this, it takes the CSV file, renders it in an HTML table, and then I can just go annotate the data. Actually, I forgot one thing I was going to show you. So instead of annotating in the public channel, let me create a group. I can create a group very
easily, so let's be quick. Now this group called Earthquakes that I just made is available to me, and I can create an annotation here, and I can tag it, a quick one. And then maybe I can make another annotation and tag this one, like this: "Italy." I could make an annotation on the HTML version again and go back to my Earthquakes group, like so. Looks like I've made some other annotations in this group before on this document. And so this group becomes a way for me to start to accumulate annotations on documents wherever I go. And it works across formats, so that CSV file I can now also annotate here, like this. I could grab the link to this annotation, maybe, and link one annotation to another: "Check out this data." I've got a hypertext link that points to the annotation I just made. So now I've taken this piece of an article and I've linked it directly into a CSV file, and when I link there, it actually scrolls me over to that cell, puts me into that group, and shows me that annotation. So we can start to annotate the links between nodes, as Berners-Lee was imagining. One of the things we've been working very hard on is getting the next format in place. We had wanted to do EPUBs for a long time. EPUBs are pretty tricky. They're HTML, but they're all zipped up, and they've got spines, and they're all reflowable, so there's no one single page; the reader dictates what the pages are that you see. That creates all kinds of special considerations. There are two primary open source EPUB frameworks. One's called Readium JS, which was created by the group called IDPF, which is now part of the W3C and which kind of manages the EPUB spec. Then there's another wildly popular framework called Epub.js. So we partnered with both of those groups and NYU Press
and then integrated with those two frameworks. So as of this last week, I can take an EPUB, and we'll take this really interesting one called "One Hundred Proofs That the Earth Is Not a Globe". Here's a little EPUB from Project Gutenberg, and I can drag and drop it onto our little test site. So here it is now, in Epub.js, and I can flip through it like a book. I could embed this book on a page in a frame, packaged with the reader, if I wanted to, if I was running a press or a library where all of our titles were available. Some of them might be behind DRM and watermarked and so forth, or some of them might be open access, and then I can annotate them. Now we get a little bug. Anyway, let me annotate the book. This was working this morning but broken yesterday, so I was hoping... Anyway, you can see that my group, the context, is still here from across the web, where I've been annotating with a bunch of people. I can make an annotation in this book on that page, as pulled from that particular press, and stick it there and organize it with tags and so forth, and have that context with me wherever I'm going. Annotations don't have to be just text. You can also stick cool things in them, like video. So there I've got a YouTube video that I can play right from the annotation card. Annotations can also have math in them, so LaTeX in there. So there's an equation, which is fully selectable and annotatable itself. We're expanding the media formats. We've just expanded the framework so that you can go to the Internet Archive's TV News Archive, which some of you may be familiar with. They've been recording 60 channels of video for the last 12 or 15 years. Now you can get any snippet of anything that's been on CNN for the last 10 years, and you can dive down into, you know,
a 30-second bit where somebody says something about something else and drop that right in an annotation card, for instance, as a way to annotate with media. So let me go back to our browsing interface. This was my group here, the Earthquakes group. I've annotated a few more things. Here was the data table that I was annotating. Here were the tags that I was using. If I want to narrow the documents, this is a faceted search, so I can add the "data" tag up there and just look at the one annotation that was tagged "data". I can see all the members that were in this group, so it's just me to begin with. If I click on my member name, then I can scope the search query again by a particular user and a tag and a group. I can just look at all of my annotations. I can go to my user profile; these are the 1,200 annotations I've made over the last few years in all kinds of documents. And from any annotation I've made, I can always click and go straight back to the document and scroll right to the place. People can come along and reply to the annotations; it's a fully threaded model. If somebody wants to come along and say something to that, I will get a notification that somebody's replied to my annotation anywhere on the web. There are a lot of interesting things that you can start to build with us. Here's a project called SciBot. SciBot is a project of the Neuroscience Information Framework at UCSD. They've created an annotating robot that scans the scholarly literature for a new kind of identifier. The identifier is called an RRID. So let's go look at one of these. Click on that annotation card, and here is an article in PubMed Central on something about high-fat diets. And this RRID, the biologists have created over the last couple of years, because these articles are mentioning
all these things like antibodies and reagents that were very specific, down to the manufacturer. If you wanted to reproduce the experiment, you'd have to actually know, but that information wasn't in the article. So what they did was come up with an identification scheme, like a DOI. Those are now embedded in articles, and now SciBot goes and annotates every RRID in every article that comes out, so that instead of having to cut and paste that and put it into Google, you can just click on it and see which manufacturer created this. You can see all the other articles that it was mentioned in, and you can click on it and see all the other articles that were tagged with that particular RRID. So you can start to use that as an easy way to pivot through the literature. This use of tags is a really, really powerful thing that a lot of people are starting to leverage. We're about to release, maybe a little bit later next year, a structured tag capability, so you'll be able to load or reference external controlled vocabularies or dictionaries that communities maintain, and so forth. Let's see, another interesting project. The Syracuse Qualitative Data Repository is starting to form partnerships with publishers, to where whenever a piece of evidence in a published article points to something that was stored with the Syracuse repository, they'll annotate the place where that citation was mentioned in the article and show you the thumbnail of the element that's stored in QDR. You can read a little bit about it. This one is actually in Spanish, so you can see the Spanish and the translation of it. And then you can go to the actual document as stored at the repository and read it, but you'll click on it knowing what you're
going to and knowing most of the relevant information about it ahead of time. So that's pretty interesting. We're seeing a lot of potential for this. Publishers are very interested in enhancing their articles with these different kinds of more structured, machine-driven annotations. The eJournalPress integration I talked about, I'll just show you the simple few slides here. Basically, what they wanted to be able to do was allow reviewers, editors, and authors to mark up manuscripts; implement all the blinding requirements for the review model that's in place; override the tag editor with their structured tags to classify the different kinds of interventions; and be able to filter by all that stuff and have it stored in their database. So now, as of this week, AGU, which is the co-development partner with us and eJournalPress, is rolling us out to all 19 of their journals, so that the reviewers can annotate. If they want to do an annotated review, they can click on that. This is a little bit backwards, but they'll come to the manuscript and they'll see the annotations. If they're the editor, so this is the editor's view, the editor can see that it's Reviewer 1 and who the person is, what was said, the tags, whether it was a major comment or a minor one, etc., and whether the annotation is something that should not be shown to authors or something that can be shown to authors. They can filter the annotations through the scope control by any of these different parameters. For the reviewers, it's easy to create the annotation, the same way that you do already. Just add whatever you need. All the capabilities of the editor are there: they can use YouTube videos or math or whatever. They can file that away. Decision letters that have all that information are automatically generated and created into a
PDF so they can be sent to authors, and so forth. So that is the review stuff. I'll just end it there, I've kind of already explained that stuff, and take your questions.
There's a scholarly article on climate change, and if someone goes in and just throws a bunch of junk or bad science at it, is there a way for the author to remove that? That's a challenge, right? I see great things with this, but I can also see bad things like that happening.
Yeah. So the goal of having lots of layers with lots of potential communities moderating them is specifically to address that problem, which is the biggest problem. And the one channel that is unique amongst all of them is the one we call Public, which is the channel that's visible by default. Right now we've been existing in a kind of beautiful paradise where this is still all relatively early adopters. We have a set of community guidelines. We enforce them. Occasionally people break them, and we address that and either turn off the annotation to where it's just visible to them (we don't delete it, we just make it private), or, if it's a problem person, then we'll make all their annotations private, so they can still continue to use the service to make personal notes or whatever, but nobody else can see them. They can participate in private groups; they can't participate in the public channel. So the question that we have is, how long are we going to be able to... Clearly, in order to scale this further, we'll have to pull the community in to do community-based moderation. We have to build a set of tools to enable that. For us, whether or not we can continue to have the public channel operate and provide its functionality is an open question. It's an experimental hypothesis. We know that Wikipedia's got some fantastic approaches
that have been relatively successful in doing this. We know it's possible. We have a very strong advantage in that, because we're a non-profit, we don't have to generate commercial revenue by pumping numbers and generating ads, so we're not naturally disposed to allow anybody to say anything because of that outcome. So from our perspective, annotation is a privilege, not a right, in the public channel. And the great thing about the architecture is, if you want to just create a really terrible place for anybody to say anything, you can go start your own annotation server and run that, and people can subscribe to it, and maybe they will. There are certainly places like that on the web. So that's the simple answer.
It's this idea of being able to create these communities and bring your community to wherever you browse on the web, right? That's what I think is most robust about this platform, rather than just going to a social media site and having your community just post stuff from the web. So you bring your community to the web, rather than your community bringing the web to you, right? But related to, I'm sorry, what was your name again? Peter's question was a concern of mine with regards to the balkanization of the web, right? So what is to stop a bunch of community members who use Hypothesis, who are interested in denying climate science or in anti-vaccination, from just subscribing to that channel and bringing that kind of discourse to where they browse on the web? I mean, is that something that you guys have thought about?
Here's the philosophy behind the whole thing. It's open. People will be able to create their own layers. We have some guidelines that we've created in terms of what we consider discourse that we are willing to host on our platform, and then a separate set of rules for things that we'll be willing to host in our public channel,
and I would not see as a violation of our guidelines somebody creating a group with just terrible information in it. The way that we believe that this works is, number one, when you're on the page, you'll be able to discover that there are multiple groups with different perspectives. There may be on that same page a group of climate scientists who know what they're talking about that have also weighed in. So you'll be able to see both perspectives, as opposed to just seeing one, which might be that skeptic's page that you landed on because your friend posted it on Facebook and said all things from one perspective and didn't include that other perspective. So from our perspective, it's letting people see those different channels and providing a lot of metrics about the reasons why one channel might be dramatically more credible than another. For instance, maybe just the number of people that are following that channel. Number two might be whether that channel has posted a code of conduct that stands the test of time; the credibility and participation of the individual members and what their bona fides are; but also, does the channel allow replies from non-members, moderated perhaps, to flow in as responses to their annotations, and do they allow perspectives from people that are outside of their zone of influence, or whatever? So this is all kind of green field for us. We need to experiment, see what works, and partner with people that have solved some of these problems in terms of applying a lot of automation to the problems. There's a lot of Bayesian analysis you can apply on annotations to see the toxicity score for a channel: how does it rate in terms of just the tenor, the style of discourse that they tend to use
in their channel, as an overall metric for the channel. So I think there are some powerful techniques that we can use to get at some of that.
One of the strengths of it seems to be that it's a kind of central place for the things that I think a lot of us have been doing piecemeal, in a kind of kludge style. I've used the Genius annotator, the Chrome extension, and maybe five or six years ago the Institute for the Future of the Book had something called SocialBook, which is really cool. (Yeah.) So it's fantastic to be able to have all of those piecemeal things integrated into one place. I'm wondering, in terms of the siloization that you mentioned, is there any kind of effort to communicate with those already existing projects?
Big time, yeah. At our conference that we throw, Bob Stein has spoken probably three times, and Genius has come and presented three times on their project. So we kind of encourage everybody to come compare notes. When we launched the working group at the W3C, we invited a lot of the people that were there, and I think the people that were clearly trying to solve this problem in a coordinated, community-based way showed up, and the others that weren't so interested or incentivized to be more open tended to show up or do a hit and run, but not really participate over the long term.
Yeah, well, that is a great question. Two things. One is, Genius is terminating their web annotator project. They're pivoting back to lyrics. But the larger question, I think, assuming that hadn't been the case, is that, I think, that puts a tremendous amount of pressure on them to essentially open their API, so that their service (because there are two things, right, their technology and their service) can be discovered by other open annotation clients, so that you could see the Genius
annotation alongside the other ones. If the world is starting to add annotation services that you can browse openly in a client, kind of like a web browser, but to see these other annotations you've got to go get a special browser, and you can only see those annotations in that browser, it puts a tremendous amount of natural pressure on them to move to the standard.
Thanks very much. I've got a couple of questions about teaching applications, given, I mean, it's not core necessarily, but it's definitely something you guys are obviously very interested in. One is a pretty straightforward question about the fact that what you were showing us was browser integration on what looks like a fairly typical desktop or laptop. What about other devices, like mobile?
It works on mobile. There's a lot more that we can do to optimize it for mobile, but it does work. Once we get our Firefox integration up, Firefox mobile allows plugins, so you'll be able to have a version where you can make annotations on mobile. You can browse to annotations now if you're on your mobile device and somebody sends a link to you. But there's really a lot more that can be done. If I'm taking pictures with my phone, how come I can't tag them with the same tag set that I'm using to tag my documents? Maybe I'm actually taking a picture of a field sample for my research, and I should be able to tag it along with the tag set that I'm using for the whole research project, and I should be able to do that right from within. Apple's not going to integrate that into their iPhoto app, but we might be able to get a third-party app to do that first and then force these guys to open up their system. I mean, it's actually surprising you can't tag photos in Apple. I find that astonishing to
me. But mobile is a huge category. The narrowest bit is creating and viewing annotations, but there's really just a lot of potential.
Thanks for that. The other question I had was about, I mean, you mentioned Canvas, and it almost looked like what you were describing was a copy-and-paste operation where you can go back and forth. But I'm wondering about the possibility of closer integration, and specifically, just as someone who also wears an administrative hat and has to think about things like assessment, is it possible for the application to work with a CMS like Canvas to capture some level of student activity, like number of annotations or quality of annotations, something like that?
Yeah, we're starting to work on those. We've integrated with SpeedGrader in Canvas, so we've got at least the basics of assessment, but attention and other kinds of metrics, time on page, annotations, are all things that definitely should... It's egregious that you can't get that information in Canvas now from this application. So those are on the roadmap as soon as we can build them.
I'm just curious: say I make an annotation on a web page that then changes or disappears altogether. I'm guessing those annotations still exist in my user profile, but what do they link to at that point?
Yeah, so we have what's called orphans. If you make an annotation, the fuzzy text anchoring uses the selection of the sentence and then 32 bytes on either side of the sentence to dynamically re-anchor the annotation to the page. And that's why it works between HTML and PDF, because actually the text in HTML and PDF is not always exactly the same. Spaces are sometimes gone between words, things break differently, but they're close enough that the edit distance tolerance of the algorithm is able to anchor
them, as long as there's something like 70% of the same material in the sentence there, which is also important for things like news articles, which change dramatically through the course of the day as the story breaks and they continue to go back and re-edit. But if it fails, because it was on the second sentence and they ripped out the whole paragraph, so there's just no chance, then it's orphaned. There's a separate channel on the sidebar which only shows up if there are orphans on the page, so you can still see them. They're always in your profile, but you can at least see when you're on the page that there were some that didn't anchor. The cool project that we're working on: the Internet Archive is about to integrate Hypothesis into the Wayback Machine, so you'll be able to go to old versions of articles and annotate them. But the other part of the project that we're working on with them is that every time you annotate, it'll ping what's called their Save Page API with the URL and tell it to go grab the page. So if the page changes, they will have scanned the page within five minutes of when you originally made the annotation, and if the annotation orphans, you'll shortly be able to say "take me to the archived version" and re-anchor the annotation to the article the way it was when it was originally made. That's becoming a bigger and bigger need for people. It's surprising how much of the web changes. It's something like 40% of pages are dramatically different within 90 days, and within a year something like 30% of all pages are gone. Crazy statistics like that. So it's a big deal, and something we're excited about.
Hi, thank you for that, Dan. I think what's so exciting is to think there are limitless applications to this technology, and even when I think of a particular document, I think of the
different ways that people might annotate, and for different audiences. So for instance, if you have a climate science report or a Supreme Court decision, you can imagine how someone might annotate it for a very high-level, specialized audience, where someone else might want to annotate it so that the common man could understand something that's sort of complicated. So how would you parse those annotations, and how could you create ways where you go and find what you're looking for, the tone, the level? With Wikipedia, for instance, they really say: write for a non-technical audience, write at an eighth-grade level, spell out your acronyms. But with Hypothesis that's not the purpose. It's not just for a general audience; it might be for a very specialized audience. So how do you make that work?
Well, I'm glad you asked. Science in the Classroom, which is a AAAS project, is using Hypothesis to annotate articles for lay audiences. I won't bother to go show you all this stuff, but they'll take a scholarly paper and then they'll annotate it. You can see that there are annotations. I can't mouse over that, so I can't show you how cool it is, but you can see that the Hypothesis sidebar is on the right-hand side, and this is a layer of lay annotations targeted at a certain age range. So our thinking is, maybe we can start to classify: if you've created an annotation layer for a specific purpose, for instance translation for different age ranges or something, we'll let you tag the group layer as being for a certain purpose, and then you could start to browse those sets of layers, run by lots of different organizations and targeted at different types, for all the interesting stuff that happened today that was targeted at a lay audience in neurobiology or something like that.
Annotations in the browser natively: so we're talking
to Mozilla. They have a project called Test Pilot. Test Pilot is a cohort of 100,000 users that test new concepts that would typically be brought in as a plugin but are candidates for being made native. So our goal, now that our Firefox plugin is almost done, which is probably a requirement for that, is to work with them over the next year to say, what makes a successful Test Pilot run? We want to make sure we're well prepared for that. We need to get all the multi-auth backend stuff done, so that we're not asking them to bake Hypothesis the service into the browser, only the open client. So my guess would be to say three years, maybe, for the majors. But there are probably a few others, like Brave, that we might be able to do a little bit sooner, or we could just roll our own: just take WebKit or Chromium or something like that and make a copy that's got the annotation client in it.
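The fuzzy re-anchoring described in the Q&A above (the selected text plus about 32 characters of context on either side, matched with some tolerance for edits, and orphaned if nothing is close enough) could be sketched roughly as follows. This is only an illustrative sketch, not Hypothesis's actual implementation: the function names `make_selector` and `reanchor`, the use of Python's `difflib` for similarity scoring, and the 0.7 threshold (standing in for the roughly 70 percent figure mentioned in the talk) are all assumptions made for the example.

```python
from difflib import SequenceMatcher

CONTEXT = 32  # characters of context per side (assumption; the talk says "32 bytes")


def make_selector(text, start, end):
    """Record the quoted text plus a little context on either side."""
    return {
        "prefix": text[max(0, start - CONTEXT):start],
        "exact": text[start:end],
        "suffix": text[end:end + CONTEXT],
    }


def reanchor(text, selector, threshold=0.7):
    """Return an approximate (start, end) span for the quote in `text`,
    or None if nothing is similar enough (an "orphan")."""
    pattern = selector["prefix"] + selector["exact"] + selector["suffix"]
    n = len(pattern)
    matcher = SequenceMatcher(None, autojunk=False)
    matcher.set_seq2(pattern)  # difflib caches details about the second sequence
    best_score, best_start = 0.0, -1
    # Slide a window the size of the pattern across the document.
    for start in range(max(1, len(text) - n + 1)):
        matcher.set_seq1(text[start:start + n])
        score = matcher.ratio()  # similarity in [0, 1]
        if score > best_score:
            best_score, best_start = score, start
    if best_score < threshold:
        return None
    # Offsets are approximate: they can drift by the number of characters
    # inserted or deleted inside the matched region.
    quote_start = best_start + len(selector["prefix"])
    return quote_start, quote_start + len(selector["exact"])
```

A selector made on the HTML version of a sentence could then be passed to `reanchor` along with the PDF text of the same article. Small differences such as a missing space should still score above the threshold, while a page where the whole paragraph has been removed would return `None`, which corresponds to the orphan case.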