Okay, since we don't want to run too much into the lunch hour, I think we'll go ahead and get started. So, quick introductions: I'm Dean Krafft, and this is my colleague Brian Lowe; we're both in the Cornell University Library, talking today about the VIVO project, where we are and what we're up to. The format: I will talk for about 10 minutes, with a quick introduction to VIVO for anybody who's not familiar with it and an update on what's happened recently with the project. Brian is going to talk about the work on VIVO-ISF, our new ontology work, which extends the VIVO ontology to do a much better job with research resources. And then I'll pick up again at the end and talk about a very recent project that will extend the VIVO ontology in an exciting new direction. So, let's go ahead and get started. What is VIVO? You can think of it as four different things. It's a piece of software: a system you can install and run at your institution. It's a set of data, typically linked open data made available as RDF on the web, institution-wide, publicly visible information. The system doesn't support private information about individuals; it's really focused on public information about researchers, scholars, and their scholarship. It's a standard: the VIVO ontology, what we call VIVO data, and now VIVO-ISF with the recent extensions. And it's a community, an open community with strong national and international participation. I'll talk a little more about all of these. So, what does VIVO do? VIVO takes a whole bunch of information silos, both within and outside an institution, and pulls together all the information about researchers and research into one common, normalized infrastructure. So, HR data, faculty reporting data, self-input data, research facilities, outside sources like PubMed, all sorts of information.
And then it creates a network of context around researchers that demonstrates what it is they do and what they're engaged in, and it builds these networks that interconnect researchers with their organizational structure, their publications, all of their academic life and projects. So, what does VIVO look like? This is my page at VIVO Cornell. The two senior research associate titles are because I'm actually on both the endowed side and the statutory side of the university. And what does it look like elsewhere? Well, if you're in China, it looks like this: the subject knowledge environment of the Chinese Academy of Sciences. In the Netherlands, like this. If you're at the University of Colorado at Boulder, here's their CU Boulder VIVO. Griffith University focuses more on the research hub and finding individual researchers. The University of Melbourne uses a "find an expert" interface to connect people with researchers in particular areas. And this is the Mexico City VIVO installation. So, lots of different views and visualizations. The VIVO community is now over 100 institutions around the world. So, why is VIVO important? It is the only standard way to exchange information about research and researchers across diverse institutions. It provides authoritative data from institutional databases of record and makes it available publicly as linked open data, so that other people can use it, build on it, and create tools and other things. The structured VIVO data supports search, analysis, and visualization across institutions. The provost at Cornell can see all of what's happening at Cornell across all the diverse colleges and departments, and then across consortia or other regional groupings. It is highly flexible and extensible to cover research resources, facilities, data sets, and much more. We're going to be talking a lot about that.
Given the way the system works, it displays this information, the underlying RDF triple data about an individual, either as an HTML page or, via a different HTTP call, as the actual underlying RDF data. Either way, you get the same information about the individual. There's a lot of value here for institutions and consortia: it creates this common data substrate, and it supports distributed curation. Individuals can review and maintain their own data; departments can do it at the departmental level. It doesn't require a central office trying to collect all this information; it's actually the individuals who are closest to the information. And finally, data that VIVO makes visible gets fixed. One of the biggest complaints we got when we put up VIVO was "the data is wrong." Well, in fact, the data in the underlying university database is wrong; VIVO is just presenting it to you, and you're seeing it for the first time. So, that's an important value. Here's a nice example: the U.S. Department of Agriculture is implementing VIVO. They have a bunch of agencies, they have a large set of internal researchers, and they have no way to see across and make connections across that group. So, it's a portal for 45,000 internal researchers. Their long-term goal is actually to link Cornell and the other land-grant universities into a researcher network, so they can really see all that's going on in the agricultural area. So, VIVO really supports exploration and analytics. This is something a number of partner institutions have really been working on. Since it's structured data in this common format, it can be easily navigated, analyzed, and visualized. You can see the strengths of interconnections: how much are people in this college actually working in agriculture, or working with folks in engineering? What are the interconnections within my campus?
And across a consortium, a CTSA (Clinical and Translational Science Award) consortium, you can see how much the groups are actually working together. You can create dashboards to understand academic outputs, map research engagements and impact, all sorts of things. So, what does this kind of visualization look like? Here's a Cornell faculty member in biological and environmental engineering. We can visualize his co-author network. We can drill down on a particular colleague within that network. We can also view his publication record across the full map of science: what are the actual disciplinary areas he publishes in, and how does he fit over the broad set of disciplines? You can do the same thing at an institutional level or a college level or whatever: how does the set of publications play out over the broader disciplinary span? For the College of Agriculture and Life Sciences at Cornell, we've done a map that shows the impact of individual research projects across the world. You can do the same thing for New York State: drill down, get information about the individual projects. Because you have this geolocated information, it's easy to find out what's going on where; when your local state legislator shows up, you can immediately identify what projects are relevant to their district. Another thing VIVO does, and we've been extending VIVO in the area of research data, is provide context. All this context around the researcher is also context around the research data, and that kind of context is critical to being able to find research data sets that you can potentially make use of in your own research. The kinds of context that are valuable include the standard narrative publications and citations, grants, research resources, data set registries, and the web of linked open data. So, here's a nice example that's been developed at the Laboratory for Atmospheric and Space Physics at CU Boulder. They have a set of data products.
They basically had been curating this data set list by hand. But trying to keep things up to date and make the connections is very hard to do by hand. So they've installed a VIVO instance where they've modeled the spacecraft, the flight equipment that all of their instruments run on. They've modeled the individual instruments, tied them in to the output data sets, and tied those back to the researchers and the specific grants. So, again, you have this entire constellation of information around the research data set, tying it to the resources that produced it. There's other work in data set registries. In fact, quite a while back, the University of Melbourne was doing a data set registry with the Australian National Data Service. At Cornell, we've developed a project called DataStaR over a number of years, which is a data registry tool; it allows you to create metadata for and describe data sets in useful ways. At the Melbourne site, you see a breakdown of how the data sets in fact interrelate: how they relate, how they are cited, how that ties together with other data sets. A particular data set is cited 14 times, linked to 20 publications, and has two related data sets. So, again, you get this full context in a structured data format. So, what is VIVO today? VIVO is an open community, hosted by the DuraSpace organization, a 501(c)(3). VIVO has strong national and international participation. We are currently hiring, in fact we're scheduling interviews right now, for a full-time VIVO project director, who will drive the project forward. VIVO is an open suite of software tools; we're at release candidate 4 right now for our 1.6 release, so we're hoping this will pass all the tests and be turned into the release itself very soon. And it's a growing body of interoperable data, and Brian will give some nice examples of that.
And it's an ontology, now VIVO-ISF, with a community-driven process for extension. So, now I will hand off to Brian to talk about the integrated semantic framework.

Thanks, Dean. Yeah, so I'm going to focus a little more on the data and how the data works. Because there's VIVO the software, the release candidate 4 that we're hoping is going to do well (I think it's going to go to release candidate 5, maybe the last one), but then there's also VIVO the data. And so, you don't actually need to run any of the software that we write to be part of VIVO. You can use a lot of other tools to produce the same kind of linked open data and be part of this big network of the context of research. So, the VIVO-ISF is the VIVO Integrated Semantic Framework. And so, what is the ISF piece all about? What are we integrating? Well, it started out by being VIVO plus eagle-i. In 2009, there was major funding from the NIH for the VIVO project, and at the same time there was a parallel project called eagle-i. eagle-i was focusing on research resources while VIVO was focusing on research networking, and they both had similar consortia of institutions that were building this out. It was really more just an artifact of the way the funding worked out, because this was one-time stimulus funding from the NIH, not part of an established program, that they couldn't make this into one big, giant grant; it was split out into these two different piles. But right from the get-go, there was the realization that there was an awful lot of overlap between what VIVO was interested in and what eagle-i was interested in. They were both taking a semantic web approach, putting together ontologies and producing semantic data. And so, all along, we were in contact between the two groups, trying to coordinate to some extent what we were doing with ontologies and semantic data.
But after that major NIH funding, it was time to go through and actually put these two ontologies for VIVO and eagle-i together in a real way, and actually have a common name: this ISF, this integrated semantic framework, VIVO-ISF. And so that led to the CTSAconnect project, which was led by Melissa Haendel at Oregon Health and Science University. This was an 18-month effort, and one of the main things was putting together these two ontologies; that was funded by the Booz Allen Hamilton contract. These are the teams that were involved; again, a multiple-institution endeavor. As you can see, there's a fair amount of overlap between what VIVO and eagle-i were interested in. Over on the left, there's the people-focused world, where we've got people's affiliations, who they work for, the roles that they play in different projects, the grants that they're pursuing and investigating, the credentials they have. Over on the eagle-i side, they get into more esoteric, real research-y things, like the genes that people are using in their research and various pieces of anatomy and kinds of biomedical details. But then in the middle, there's all kinds of overlap: the different techniques that people are either experts in or are using in particular investigations; the training that they've got; publications they're producing, obviously, which is something we're always interested in; and protocols and things like that which they're using in the course of their research. So the ISF is all about making that work together in one harmonious way, rather than having it be an artificially separated set of ontologies. So the VIVO-ISF ontology is really an ontology that's about making relationships. In a lot of cases, when people think about the word ontology, they might think more of a taxonomy, a set of terms, where you're going to go and tag some piece of research with a particular scientific term that says this is what it's about.
And it's important to be clear that we're really not about trying to classify people's research. It's not about applying terms to things. It's more about, as Dean mentioned earlier, this idea of building this rich context; it's about linking things together. And so when we want to talk about the details of what research is about, what people are actually studying, that's where we rely on other terminology that we bring in from other places. So down on this slide here, in the lower right-hand corner, we've got the example of the Unified Medical Language System, which is another piece of the CTSAconnect project that we'll mention a little bit later, where we can bring in things like the ICD-9 and ICD-10 medical billing codes and the Medical Subject Headings, things like that. These can all be intertwined with other ISF data, and that can provide the meat about what people are actually working with: the patients that they're treating, or the kinds of things that they're actually studying in their research. But then ISF weaves together all these other bits. You can tie in the organizations that people are associated with, documents that they're producing, databases that they're producing, other kinds of research resources, grants and contracts that are funding their work; it's all about putting together these communities. And it's also about being usable in a very flexible way, picking and choosing the bits that are important for any particular application. Because the ISF has a lot of possible different connections that you can make between things, and only a certain subset of them are going to apply in any given situation. So it's not about having a big metadata record with a bunch of fields, either required or optional, that you're filling in. It's a very different kind of thing.
It's about making the kinds of connections that are appropriate to the context of the research that you're describing at a particular time. So this is about going beyond traditional CVs, static documents that are just big lists of publications or grants that people have gotten, and going beyond author lists on publications where you've got two initials and a last name; all this kind of data that we still have to deal with all the time. We want to get beyond that kind of structure and actually have a real structure behind the data that shows where it's coming from and who's actually involved, rather than just a list of text. And so it's about putting that research and scholarship in context. By building out this richer context, we're helping, for one thing, with the idea of disambiguation: figuring out who it is that we're actually talking about. As we move into a world with ORCID iDs and things like that, this is going to get easier, but it's always going to be an issue to some extent. So we want to build as much of that context around people as we're capturing the data, rather than just having people's names floating around. We're also interested in getting a richer vocabulary of roles for what people are doing. There's more to scholarship and research than just being listed as author number 27 on a paper. There are all kinds of things that people are doing, things that they produce, other kinds of outputs, and eventually outcomes that are more than just writing a paper: resources that you've produced, other things that you've contributed to your field. And so the ISF is about trying to set the stage for being able to capture those kinds of things in addition to what's traditionally been recorded. And as Dean mentioned, this is all about linked data. We want this to be open linked data that's discoverable and usable by different applications.
So this idea that from one address, from one identifier, you can go and get the human-readable stuff or get the structure behind it, that's really what we try to keep in mind all the time. And so in the course of designing the semantic framework and building out the ontologies, we always try to come back and think: if we're faced with a design decision about different ways we could approach modeling something, there might be one that maybe is a little bit more correct in a pure ontology world, but that maybe doesn't work quite so well for people actually trying to consume that data for practical applications. And in that kind of balance, we usually come down on the practical side. We want to make it usable for things that are actually crawling this data and doing something with it, and also for the people who are producing that data, to make that an easy thing to do, and let the ontological details sometimes sort themselves out as they may. So to do this, we've incorporated in this framework a bunch of existing linked data vocabularies. One that's probably familiar is FOAF, Friend of a Friend; we use a piece of that for basic stuff about people, organizations, groups, some of that standard vocabulary. We're also now pulling in another W3C spec, the vCard ontology, which is a little bit in flux right now as it's being revised, but we're using the new version of it, which we're finding pretty attractive as a nice way of getting all of that nitty-gritty detail about people's name parts and their phone numbers and email addresses and how to contact them, getting all of the rich detail about that bundled up in a nice way that we'll look at again in a minute. We're using BIBO, the Bibliographic Ontology, as sort of the core piece of how we represent publications. That's something that may change a little bit in the future.
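To make the vocabulary reuse concrete, here's a small sketch of the kind of triples a profile might combine: FOAF for the person, vCard for the contact details, BIBO for the publication. The `ex:` identifiers and the `ex:hasContactInfo` property are invented placeholders standing in for the ISF's actual terms; only the `foaf:`, `vcard:`, and `bibo:` class names come from the real vocabularies:

```python
# A profile as a plain list of (subject, predicate, object) triples,
# using prefixed names from the vocabularies the ISF reuses.
# All "ex:" identifiers are hypothetical placeholders.
profile = [
    ("ex:jane",       "rdf:type",          "foaf:Person"),
    ("ex:jane",       "rdfs:label",        "Jane Smith"),
    # vCard bundles up the nitty-gritty contact detail.
    ("ex:jane",       "ex:hasContactInfo", "ex:jane_vcard"),
    ("ex:jane_vcard", "rdf:type",          "vcard:Individual"),
    ("ex:jane_vcard", "vcard:hasEmail",    "mailto:jane@example.edu"),
    # BIBO types the publication itself.
    ("ex:article1",   "rdf:type",          "bibo:AcademicArticle"),
    ("ex:article1",   "rdfs:label",        "A Study of Something"),
]

def objects(triples, subject, predicate):
    """All objects for a given subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects(profile, "ex:jane", "rdf:type"))      # ['foaf:Person']
print(objects(profile, "ex:article1", "rdf:type"))  # ['bibo:AcademicArticle']
```

An application that only cares about publications can query just the `bibo:` statements and ignore the rest, which is exactly the pick-and-choose flexibility being described.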
We may extend that in different ways, but as the nugget of how we deal with publications, we find BIBO pretty attractive because it's usually a pretty nice granularity. If you're familiar with the FRBR scale, the work-expression-manifestation-item continuum, BIBO usually falls in the expression-y sort of realm, which is usually what people are interested in talking about. So that's often a pretty attractive way to model publications. And then SKOS, the Simple Knowledge Organization System, for dealing with thesauri and controlled vocabularies, broader and narrower terms and related terms, things like that; that's another standard thing that we bring in. And there's a little technical twist in the latest version of OWL, the web ontology language, whereby when we bring in these other vocabularies for things like scientific terms, we can treat them all as SKOS vocabularies, even if they're also something else. So we have a nice, consistent way of referring to terms from these other taxonomies. And then, coming in from the eagle-i side of things, there's a suite of ontologies under the OBO, Open Biomedical Ontologies, umbrella that are part of the VIVO-ISF. Some of those are things like OBI, the Ontology for Biomedical Investigations, which allows you to describe in greater detail what actually happens in the course of an investigation, of an experiment: things like the planned process that you go through, the methods you use, things like that. And again, that's a case where, for a lot of data that you might be putting into a VIVO system or a VIVO-like system, you might not have that kind of stuff, and so you can ignore it and leave it out if you're not interested in it. But that rich detail is there if you have it and if it's available. So we try to encourage capturing as much detail as people are willing to put in, and not getting stuck by not having a slot to stick it in.
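The SKOS piece can be sketched the same way: terms from an external taxonomy get typed as `skos:Concept` and linked with `skos:broader`, so an application can walk the hierarchy without knowing anything about the source vocabulary. A toy illustration; the term identifiers are made up for the example, not real MeSH or UMLS codes:

```python
# SKOS-style broader links over plain triples.
# Term identifiers are invented for illustration.
triples = [
    ("term:heart_disease",          "rdf:type",     "skos:Concept"),
    ("term:heart_disease",          "skos:broader", "term:cardiovascular_disease"),
    ("term:cardiovascular_disease", "rdf:type",     "skos:Concept"),
    ("term:cardiovascular_disease", "skos:broader", "term:disease"),
    ("term:disease",                "rdf:type",     "skos:Concept"),
]

def broader_chain(term):
    """Walk skos:broader links from a term up to the top concept."""
    chain = []
    current = term
    while True:
        parents = [o for s, p, o in triples
                   if s == current and p == "skos:broader"]
        if not parents:
            break
        current = parents[0]
        chain.append(current)
    return chain

print(broader_chain("term:heart_disease"))
# ['term:cardiovascular_disease', 'term:disease']
```

That uniform treatment is what lets a search or browse interface generalize a query ("show me everything under cardiovascular disease") regardless of which taxonomy the terms originally came from.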
So that's OBI. There's, of course, ERO, the eagle-i research ontology, all the work that came out of the eagle-i grant for modeling research resources. RO, the Relation Ontology, for a common core set of relationships, like something being part of something else or containing other things. And then IAO, the Information Artifact Ontology, which goes a little bit beyond the bibliographic stuff and is also kind of a core set of metadata for other things in the ontology. And these things in the biomedical world, these OBO ontologies, all descend from and extend the Basic Formal Ontology, BFO. And so, as a result, VIVO-ISF now all sits under this common top-level ontology. I'm not going to go into detail about this, but it's sort of the philosophical grounding for the whole world that these biomedical ontologies live in. You get into very abstract ideas like occurrents and continuants: things that exist fully at one instant in time, and things that play out over time, and stuff like that. The idea here is that it's nice to feel like we are fitting into a basic, coherent philosophy of ontologies. But we don't religiously adhere to it: if it comes down to wanting to do something that's going to have a practical result for actually trying to share data, and it doesn't necessarily quite fit with Basic Formal Ontology, we're probably going to do it anyway. And this is actually a perfect example of something that doesn't quite fit exactly with what BFO would do, but we do it because it's important: one of our basic patterns is this notion of a reified relationship. So if you're familiar at all with RDF and its triple structure, where you have a subject and you've got a predicate that points out to some other object, that's a really nice, simple model, and it's great for a lot of things, but it also breaks down at certain points.
The nice thing about RDF is that you can say anything you want about anything. The problem with RDF is that you can say anything you want about anything, and what you often run into is contradictory information when things change over time. People don't just stay in one position their entire career; they move around, they do different things, and if you just make simple statements like "John, employee of, University of Chicago," well, that might be true today and it might not be true next year. So if these triples are just floating around out there together and mixing around in the wild, then eventually you're going to have to try to figure out which of these things is actually true at the time you're looking at the data, and so we try to give you better clues for figuring out what's actually correct at the time you're querying. So we have this node that we put in the middle of two things. Instead of just having one subject-predicate-object here, we take the predicate position and stick another resource in between.
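The reified-relationship pattern being described here can be sketched with plain triples: instead of a direct "person employee-of organization" statement, a position node sits between the person and the organization and carries its own start and end dates. This is a simplified illustration; identifiers like `ex:position1` and properties like `ex:startYear` are stand-ins for the example, not actual VIVO-ISF terms:

```python
# Reified relationship: a "position" node between person and organization,
# carrying its own metadata (here, start/end years).
# All identifiers are illustrative, not actual VIVO-ISF terms.
triples = [
    ("ex:john",      "ex:hasPosition",    "ex:position1"),
    ("ex:position1", "ex:inOrganization", "ex:uchicago"),
    ("ex:position1", "ex:label",          "Postdoctoral Associate"),
    ("ex:position1", "ex:startYear",      2007),
    ("ex:position1", "ex:endYear",        2009),
    # A later position coexists with the old one; nothing is retracted.
    ("ex:john",      "ex:hasPosition",    "ex:position2"),
    ("ex:position2", "ex:inOrganization", "ex:cornell"),
    ("ex:position2", "ex:startYear",      2009),   # open-ended: no endYear
]

def value(subject, predicate):
    """First object for subject/predicate, or None if absent."""
    return next((o for s, p, o in triples if s == subject and p == predicate), None)

def positions_held(person, year):
    """Positions whose time interval covers the given year."""
    held = []
    for s, p, o in triples:
        if s == person and p == "ex:hasPosition":
            start = value(o, "ex:startYear")
            end = value(o, "ex:endYear")  # None means the interval is still open
            if start <= year and (end is None or year <= end):
                held.append(o)
    return held

print(positions_held("ex:john", 2008))  # ['ex:position1']
print(positions_held("ex:john", 2013))  # ['ex:position2']
```

Because the old position is never deleted, only closed, a query for any past year still answers correctly; that's the "distributed CV over time" idea.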
So you actually end up with a triple going from the subject to this node, and that node in turn becomes the subject of another triple, with another predicate to another object. So we can actually treat the relationship as an entity unto itself that can have its own metadata, and one of the important things with that is putting on times and dates. We do that for things like positions; we do that for things like authorship. Instead of just saying somebody is the author of a publication, or a publication has a creator or has a contributor, there's a node representing the actual authorship, because we usually want to hang additional data off of that to further qualify the idea. That basic pattern plays out in a number of other places, and that's what we often really try to do with VIVO-ISF: while we might make things initially a little bit more complex than you might see in other ontologies, we try to make it a good trade-off, so that once you learn that new pattern, you see it applied in a lot of different places and it becomes familiar to you, and hopefully that richer data, that richer structure, is going to be a lot more useful for the applications that are actually trying to consume this data.

So here's the idea: if somebody has a position, and the position is in turn linked to the organization, then because that position is a resource on its own, we can say the position was held over a particular time. So someone was at the University of Chicago, they had a postdoc position from 2007 to 2009, and then they moved on and went somewhere else. And we can keep around all this data; we don't have to try to pull it out, we don't have to try to retract it or say it's invalid. It can live there harmoniously with all of the current new data, and we can build up this distributed CV over time of how people's careers evolve, because we can accurately say what was true at a given period of time. So here's the notion of two different positions: somebody moves from position one, with one time interval, to position two. We don't have to delete any of that old data; we don't have to pull away position one, we just close its time interval, say it's got an end date, open a new time interval for position two, and we can start building up this history.

And similarly, because we're using the vCard standard for things like people's names: one of the annoying things you end up dealing with any time you're dealing with people is that people's names change, and if you can figure out at what point they changed, then it becomes a lot easier. We can explicitly represent that kind of thing in our semantic data by having things bundled up into a vCard, essentially a business card that's true for a certain interval of time. You can put a time interval on it, say this is the accurate information for this date range, open up a new one for a new date range, and we can start aggregating that data, and you know which is true at a particular time, and they're not stepping on each other's toes. So it's the same exact pattern that we just saw with positions; we do the exact same kind of thing with vCard. And similarly, because we've got the vCard as a resource on its own that can have its own metadata, you can associate it with an authorship. We have the authorship, again, as its own resource, and so you can say: here's the particular form of a name that somebody published with on a particular publication; this was the contact information they listed with this particular publication. And it's very clear what's associated with what. There's sort of a slot for everything you could fill in, even if it's not necessarily something you're going to fill in in every case; you build the relationships that are important for the context you're describing. So, going beyond the pure notion of authorship, of people being listed as authors of documents: roles are very big in VIVO-ISF. We get this basic notion of a role from the Basic
formal autology so it ties into broader notions of roles elsewhere and so we really want to try to make it as attractive as possible to capture richer detail about what people are doing in the course of actually pursuing their research in addition to just what eventually results from so in addition to just being an author on a paper we also want to capture the fact that you have a particular role in a project or you have a role in some kind of event or activity the different things that then ultimately might have an output that is a more familiar kind of art effect and so again this is the notion that we have this kind of you know this set of this web of things that you could kind of build in whatever complexity is useful for what you're trying to describe we give you that basic vocabulary to stick things together with and so you can link together all kinds of different roles that people might have to different projects or different pieces of projects you can make it as detailed as you want or as simple as you want and then ultimately those activities those projects kind of temporal things can have documents and resources that come out of them and again those can be inputs to other things it's a very generalized framework for kind of building out these relationships so when you have all this data what can you do with it so here's an example of what you can do with linked open data this is the beta.vivo-search.org and so this goes around and crawls linked data from not only vivo but other software types that are then publishing using the vivo ISEP ontology and publishing the same kind of linked open data and putting it together in a search index so it's kind of you know it's a google-like activity going around and gathering data but instead of scraping it off of html pages it's pulling it out of the structured data and so here's the network of different things that are folded to this index and so overall left all of these different blue things are things that are 
actually using our vivo application and on the right are some other things like Harvard Profiles which is a totally different software application looks nothing the same under the hood but it's storing the same kind of RDF triples in its triple store it's publishing vivo linked open data and it can be harvested into this index the same way with the vivo system similarly Iowa's Loki system it's got a completely different database it's simply storing its data but it's got a translation layer on the top that can dynamically turn that into linked open data if you come and make these linked open data requests and so this the crawler that's building this index can put these all together and make a nice consistent search interface out of it this is vivo search light by Miles Worthington another example of what you can do once you've gone and aggregated this linked open data if you're at a particular page and you highlight some text that you're interested in it can go and look up and find different people who might be working on that topic or have some expertise in it and I'll let you click through to their vivo profiles and learn more about them and so it's again it's using that linked open data to provide the basis for being able to do that lookup here's a search engine that was created by someone directly affiliated with the digital effort this is Dave Weichmann's group at University of Iowa with CTSA search and so the clinical translational science awards they have adopted the vivo ontology as their sort of recommended way of sharing researcher networking data and so he's got a search here that goes and also gathers that semantic data from these different CTSA sites and as you can see they've got 1,945 persons and 19 different institutions and over 1.3 million publications and so they're building up a quite rich network of linked data out there and so this is an example of an application that isn't trying to go and search the entire world of linked data but it's 
identified a particular set of institutions, a particular set of sites that are of interest, and is crawling those particular sites for a particular purpose.

So, some of the use cases that the ISF is going to be useful for in building applications: being able to find the publications that resulted from grants, and to reverse those relationships and see what funded what; being able to discover and reuse facilities or equipment, things that an investment has been made in and that could be reused more optimally than they currently are (this is one of the classic eagle-i use cases, being able to reuse resources efficiently); demonstrating the importance of different facilities on a campus in actually producing research results; and being able to discover people who have access to resources or who have expertise in particular techniques.

Going back to the CTSAconnect project: Stony Brook, Florida, and the HSU were heavily involved in this project to link together clinicians and researchers by taking the codes associated with the particular things that clinicians were treating in their practice, matching that up with the MeSH terms for what the researchers were researching, making those ties through the Unified Medical Language System, and ultimately tying that all together with the ISF to publish it as linked data. This is the basic bench-to-bedside idea: being able to get research more quickly from the researchers to the clinicians, and being able to find the connections that will make that happen, is another application for this kind of linked data.

Looking a little more toward some other ways of expanding the ISF: Steve McCauley and Ted Lawless at Brown have been doing a lot of work looking at the peculiarities of some aspects of the humanities and artistic works, especially for things where someone has created a work but then wants to track all the different people who have performed it in different venues and different places, works that have been translated into different languages, and collections and exhibits that have been shown on an ongoing basis. You want to be able to aggregate all of that together, because that's part of the evidence of the impact that this scholar has had. These are, again, a natural extension of what the ISF is doing, and also an example of the way that it has these multiple different pieces that you can mix and match and choose from for a particular application.

So if you want to get involved and learn more about what VIVO ISF is doing, we've got bi-weekly calls as part of the VIVO ISF working group under DuraSpace. There's a link here to our wiki page; you can look for the ontology working group under there. We'd like to invite anybody to join the calls, learn more about what we're doing, get involved, and give us feedback and ideas. Here's just a brief list of some of the different interest groups, things that people have been talking about recently in the group: more detail around grants; again, the humanities aspects; things like knowledge mobilization, how you actually get the things that universities are doing into practice in the community; more about annotation and provenance; and also more detail about publications.

So we've looked at a lot of ways that the ISF has been extended to research resources and other things, and now I'm going to hand it back to Dean for more about linked data for libraries.

Okay, so I'm going to be talking about a very new project called Linked Data for Libraries: creating a scholarly resource semantic information store. Last Thursday, the Mellon Foundation made a grant to Cornell, Stanford, and Harvard for basically a million dollars, starting this coming January. The partners will work together to take the same kind of idea that you just saw in the biomedical area and apply it to scholarly information resources.
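To make that same kind of idea concrete: the grants-to-publications use case mentioned a moment ago comes down to storing relationships as triples and traversing them. Here is a minimal sketch in plain Python; the URIs and property names are invented stand-ins, not the actual VIVO-ISF vocabulary.

```python
# Illustrative sketch: VIVO-ISF-style relationships as simple
# subject-predicate-object triples. All identifiers here are made up.
triples = {
    ("ex:jones", "hasRole", "ex:piRole1"),
    ("ex:piRole1", "roleIn", "ex:grant42"),
    ("ex:grant42", "funded", "ex:project7"),
    ("ex:project7", "hasOutput", "ex:paper99"),
    ("ex:grant42", "funded", "ex:project8"),
    ("ex:project8", "hasOutput", "ex:dataset3"),
}

def objects(subject, predicate):
    """All objects linked from `subject` via `predicate`."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

def outputs_of_grant(grant):
    """Follow grant -> funded project -> output, one of the use cases above."""
    results = []
    for project in objects(grant, "funded"):
        results.extend(objects(project, "hasOutput"))
    return sorted(results)

print(outputs_of_grant("ex:grant42"))  # -> ['ex:dataset3', 'ex:paper99']
```

The same traversal run in reverse (output back to grant) gives the "see what funded what" direction; a real system would do this with RDF and SPARQL rather than in-memory tuples.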
So basically, we'll extend the VIVO model and ontology to cover all of the contextual information around scholarly information resources: relationships, metadata, broad context. This will leverage the work that VIVO has done, and we're also leveraging the work of the Hydra partnership and the things they've been working on as part of that effort. The project team at Cornell is me, Jon Corson-Rikert, Brian, and Simeon Warner, plus another half new FTE. At Harvard it's David Weinberger and Polish Nair, and they'll be bringing in an outside consultant with expertise in the linked data area. At Stanford, the official participants are Tom Cramer and one new FTE who will be full time on this, but when I just talked to him about our first meeting, he said he had about eight people in his group that he wants to bring in and engage with this effort, and I think we're going to be doing a lot of that as well.

So what is the goal of this effort? It's to create a scholarly resource semantic information store: a model that works both within individual institutions and through a coordinated, extensible network of linked open data, to capture the intellectual value that librarians and other domain experts add to information resources when they describe, annotate, organize, select, and use those resources, together with the social value evident from patterns of usage. We want to draw on the work that the Harvard library innovation lab has been doing with their LibraryCloud, where they're tracking usage information about information resources. We want to pull in things like LibGuides and other sources where librarians have been adding value, but typically in very siloed systems that don't necessarily inform what exists in the basic catalog and the basic search mechanisms, and to make all these things available to support discovery and use of the materials.

Here's a quick look at our project timeline. Beginning now, we're starting to work on the initial ontology design and identifying all these disparate data sources; that's one of the reasons we're very glad to have both Harvard and Stanford working with us on this, since we see ourselves as all having individual, localized sources of added value for these information resources. Stanford is looking at some of their archival and manuscript descriptive information and how to fold that in, and I mentioned the Harvard LibraryCloud work. We'll begin creating this scholarly resource semantic information store, which actually takes code from the VIVO project: the underlying software is called Vitro, and you can pull out the VIVO ontology, plug in any ontology you want, and do a little tuning to make everything look good, so that's going to be part of that work. And then, connected with Hydra: Hydra now has a component called ActiveFedora that talks to a Fedora back-end store, and we're going to build an ActiveTriples Ruby layer that will interact with a triple store in the same way, with the Hydra framework up above.

In the second half of the year, we'll complete an initial ontology and do the initial data ingests at Cornell. Maybe of interest to some of you here: we'll be running a workshop in December where we bring in folks from 10 to 12 institutions to give us feedback on the ontology and on the overall approach and design, to make connections to support potentially piloting this at other institutions, and to understand how the institutions see this fitting in with other collaborations that are happening now: DPLA, VIVO, SHARE. People are doing all sorts of collaborative work, and we want to find out how this can best fit into that environment. In the second year, we'll be doing pilot instances at Harvard and at Stanford and populating our own instance from multiple data sources. CuLLR is our curated list of library resources, a framework for organizing and annotating existing resources in our collections. We will develop a test instance, we will have search across the partner institutions, and we'll integrate with ActiveTriples. Then in the
second half of 2015, we'll do a full public release of the open source code, a public release of the ActiveTriples Hydra component, a public release of the ontology, fully functional instances, and a full demonstration system.

So, project outcomes. The primary outcome of the project, as we see it, is this open source, extensible ontology, compatible with the VIVO ontology and compatible with BIBFRAME. There are a lot of library linked open data efforts going on, and we really want to work to be compatible with all of those, things like Open Annotation. We'll also release the open source semantic editing, display, and discovery system, and the Hydra-compatible interface to the store. Let's see, I think at this point we are at the question stage. Well, thank you all very much.
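As a footnote to the cross-institution search ideas that ran through the talk (vivo-search, CTSAsearch, the planned LD4L search): the aggregation pattern is essentially to harvest triples from each site and index them centrally. A toy sketch of that pattern, with invented site names and data standing in for real linked open data requests:

```python
# Toy aggregation sketch: "harvest" triples from several sites (hard-coded
# dictionaries standing in for linked open data requests) and build one
# keyword index across all of them. All names and data here are invented.
site_data = {
    "cornell": [("ex:alice", "researchArea", "semantic web")],
    "stanford": [("ex:bob", "researchArea", "linked data")],
    "harvard": [("ex:carol", "researchArea", "semantic web")],
}

def build_index(sites):
    """Map each keyword to the (site, person) pairs whose triples mention it."""
    index = {}
    for site, triples in sites.items():
        for subject, _predicate, obj in triples:
            for word in obj.split():
                index.setdefault(word, set()).add((site, subject))
    return index

index = build_index(site_data)
# One query now spans every harvested site:
print(sorted(index["semantic"]))  # -> [('cornell', 'ex:alice'), ('harvard', 'ex:carol')]
```

The point of the pattern is the one search index over many independently maintained sites; the real crawlers do the same thing over RDF fetched via HTTP rather than in-memory dictionaries.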