We're going to get started because we're the last session of the day, and somehow we got a 30-minute slot with about 86 minutes of content, so we're just going to go fast and furious and get through this. I'm Bill Aske; this is Jay Brodeur. We're from McMaster University in Hamilton, Ontario, and we're going to be talking about our infamous Vivo and Elements implementation project, which is about research information management, or research intelligence, as Tom Cramer suggested earlier.

To give you a sense of what we're going to cover: we'll sketch our context broadly for a couple of minutes, then talk about the requirements and needs of what we were trying to accomplish, go through the implementation, talk fairly extensively (because this is really the meat of the presentation) about the challenges and the solutions we came up with, then give a quick roadmap of what's next in our project. And then some recommendations and takeaways for you to think about, which I suspect is why some people are here: they're looking at projects like this.

Okay, so we thought it relevant to give you a brief contextual background on McMaster. McMaster is classified as a medical-doctoral, research-intensive university in Canada, located in Hamilton, Ontario, with about 30,000 full-time undergraduate and graduate students in total. McMaster lists 949 full-time faculty (keep that in mind) and describes itself as the most research-intensive university in Canada, at $405,000 (Canadian dollars, note) in annual research funding per faculty member. We also thought it relevant to define the acronyms and the Canadianisms up front, so that when we mention them later you don't have to take time to work out what they actually mean. So first, the systems.
A RIMS, also referred to as a CRIS or a RIS and by many other names, is a research information management system: a system capable of integrating information about research, researchers, and their scholarly activities. Elements is a specific RIMS developed by Symplectic, which is part of the Digital Science suite of products. Vivo is an open-source linked data web application and an ontology for representing scholarship.

Then, in terms of people and groups: VPR is our Vice-President of Research. RHPCS is Research and High-Performance Computing Services, a unit within the VPR's office that we worked with on this project. And we want to distinguish the two senses of "faculty" that we'll be using here. In Canada, Faculty with a capital F means colleges, the primary academic units of the university, whereas faculty with a small f means the individual faculty members.

So we had a challenge that I think a lot of universities face, which is that a lot of different websites from the various capital-F Faculties present profile information per the style and the dictates of that individual Faculty. You have a lot of information, but it's on various subdomains and there's definitely no central store, so from a web-indexing standpoint it's kind of a disaster. Profile content is fairly inconsistent: different Faculties highlight different information. The ones in the humanities are incredibly text-intensive; they give very long overviews of their research but are a little light on the data, whereas in engineering they have all the publications. It's very much a hill-and-valley profile situation. Beyond that, there are various designs. I just grabbed three screenshots showing that everyone uses different templates: it's complete web chaos. And these are, at their core, technically speaking, simple CMS web profiles. There's no automation in how the data gets fed into them.
These are all just bespoke, handcrafted web pages, and they don't publish to the web as linked data or anything fancy like that. When you have that kind of situation, you have a serious problem with currency. When one of the faculty updates a web page (and a lot of them did this), they have these wonderful, great-looking web pages for a while, but of course they age very quickly, because no one is going to go in every two weeks and add all the new publications; it's all manual labor. Consistency is clearly a problem. Accuracy is a problem too, because the data source they use for publications or other information is often highly suspect, or at least incomplete. It requires, of course, manual labor: someone has to do this work, and you can imagine the faculty aren't super keen on yet another place to curate their information. And you have repeated entry: they have to put the information here, it has to go into their Faculty reporting systems, and so on and so forth.

Beyond that, McMaster, like a lot of universities (though others are more fortunate in having more robust information systems; Tom was describing what Stanford has earlier), has no central RIMS and nothing that would stand proxy for a RIMS. There are various systems across campus, but they don't talk to each other, so there are a lot of data silos. Silos, of course, will be a feature of this talk, as they are of so many. There's actually zero interoperability between the silos, and this distinguishes us, for example, from Tom's talk earlier. McMaster also really has no identity management (IDM) solution in place, so we have a hard time just knowing who is faculty and anything about them. That was also a major challenge for our project.
In terms of what we did: the idea began percolating in 2015, and in 2016 we put together, with RHPCS, a proposal to the provost to do something about this. We went after $150,000 to do what we said we were going to do: set up a Vivo instance with attractive, consistent, accurate profiles, at least stubbed shell profiles, for all faculty at McMaster; bring in some publications from external sources; and try to persuade faculty that there's value in doing this, because they would have a public profile that would be consistent and so forth. Of course, I don't want to get ahead of myself here. We got $148,000 of the $150,000, and we used that money to hire a project manager and do a little bit of hardware purchasing. We said we were going to buy out staff time from RHPCS and the library; in fact, we just ate the cost, which I think is a key takeaway here: we put an enormous amount of human labor into this without compensation. In July of 2016 we hired a project manager and began actually doing the implementation work. I think that's you.

Okay, so we've parceled this out into six-month chunks from this point on. The first four months after that were spent scoping, gathering requirements, and scanning for and classifying the various internal and external data sources that were available. There was a lot of discussion with various groups, figuring out who had what data, how current it was, and how complete it was. There were two major outcomes of all this work: number one, there was no internal data source that could be considered comprehensive and complete for all scholarly works at McMaster; and number two, building custom applications to harvest researchers' publications from the accessible external data sources was beyond the resources and time that we had.
That led us to investigate a solution that would gather that information and feed it into Vivo on the front end. Over the next six months, October 2016 to April 2017, we explored Symplectic Elements as a potential solution. During that time we continued to have conversations with various groups around campus to understand their interest in this and the potential needs or requirements it could meet for them. There wasn't a whole lot of enthusiasm, to be honest. Many groups were reluctant; they were locked into what they were doing. Some were pessimistic: "we tried something like that and it didn't work, so we don't think it'll work for you either."

In April 2017, we signed a contract for Symplectic Elements to be the RIMS as well as the Vivo connector, so they do the work of connecting the information in the RIMS database to Vivo as linked data. In the end, this is essentially what the system looks like now. Between April and October of 2017, we stood up a RIMS for the entire institution. Essentially, how it works is that we get data extracted from a central HR data warehouse and do some automated cleaning and formatting for ingestion into the RIMS. We also take in non-faculty: we allow them to sign up via an online form and ingest them at the same time, so they can have profiles, or just accounts, inside the system as well. Once a faculty member has their search parameters refined, the RIMS does the work of automatic publication harvesting and tries to do some of the deduplication and disambiguation work. Of course, there's still work required on the researcher's end in some cases: finishing up publication claiming, to actually claim what is and isn't theirs, and entering their other relevant scholarly activities. This can be done by the researchers themselves or by any delegates they allow.
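The "automated cleaning and formatting" step between the HR extract and RIMS ingestion might look roughly like the minimal sketch below. All field names, department labels, and title lists here are invented for illustration; the talk does not describe the real extract format or the Elements import format.

```python
# Sketch of the automated cleaning/formatting step between the HR data
# warehouse extract and RIMS ingestion. All field names are hypothetical.

import csv
import io

# Map inconsistent department labels from the HR extract to one canonical
# form (a tiny illustrative subset of what such a mapping would contain).
DEPT_CANONICAL = {
    "history dept": "Department of History",
    "dept. of history": "Department of History",
    "civil eng.": "Department of Civil Engineering",
}

# Position titles treated as faculty appointments (illustrative only).
FACULTY_TITLES = {"professor", "associate professor", "assistant professor"}

def clean_record(row):
    """Normalize one HR row into the shape a RIMS import might expect."""
    return {
        "employee_id": row["emp_id"].strip(),
        "first_name": row["first_name"].strip().title(),
        "last_name": row["last_name"].strip().title(),
        "department": DEPT_CANONICAL.get(
            row["department"].strip().lower(), row["department"].strip()
        ),
        "is_faculty": row["position_title"].strip().lower() in FACULTY_TITLES,
    }

# A fake two-row HR extract, standing in for the real warehouse dump.
hr_extract = io.StringIO(
    "emp_id,first_name,last_name,department,position_title\n"
    "001, donald ,SMITH,history dept,Professor\n"
    "002,Ada,Lovelace,civil eng.,Research Associate\n"
)

records = [clean_record(r) for r in csv.DictReader(hr_extract)]
faculty_only = [r for r in records if r["is_faculty"]]
```

The non-faculty sign-up path described above would feed the same `clean_record` shape from the web form instead of the warehouse extract.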
The information from the RIMS is then pushed out to the Vivo site at the front end, which we've branded McMaster Experts. The data from the RIMS is also available to other internal users via the API, for uses like internal reporting and web-page content population. Getting back to what Dale was talking about, they can integrate this and make sure they have better currency in their information.

Just a quick show of what the profile pages look like. The Elements pages are private to the individual or their delegates. The main profile page shows a research summary with the typical information about research output, and there's a carousel that prompts the user if actions are required, such as entering an ORCID iD or claiming some publications. In edit mode, the researcher can also add other relevant biographical information and educational details, and assign themselves research areas, something we'll talk about in a little bit. On the publication manager page, users can filter their publications, claim or reject what has been automatically harvested, or manually add publications, either by entering all the metadata themselves or by entering a DOI and hoping it's somewhere in Crossref, from which most of the metadata can be pulled back.

McMaster Experts, the Vivo site, looks like this, and you can get there at experts.mcmaster.ca. This is the main landing page; it looks more or less like a Vivo front page with a little bit of skinning on it. We also have an area populated with tagged media stories: the Communications and Public Affairs department tags our stories, and they get fed automatically to the box on the front page. For an individual, their profile page will show name, title, contact information, a research overview if they've provided it, a list of institutional affiliations and appointments (which come from the HR feed), as well as a sparkline showing cumulative publications over the last 10 years.
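The DOI path mentioned above leans on the Crossref works API (a live lookup is a GET against `https://api.crossref.org/works/<doi>`). Here is a minimal sketch of pulling the core metadata out of a response; the sample body is abridged and invented, standing in for a real response rather than making a network call:

```python
# Sketch of the DOI lookup path: given a Crossref works-API response,
# pull back the core metadata a researcher would otherwise type by hand.

import json

def parse_crossref(message):
    """Extract basic publication metadata from a Crossref 'message' object."""
    return {
        "title": (message.get("title") or [""])[0],
        "journal": (message.get("container-title") or [""])[0],
        "year": message.get("issued", {}).get("date-parts", [[None]])[0][0],
        "authors": [
            f'{a.get("given", "")} {a.get("family", "")}'.strip()
            for a in message.get("author", [])
        ],
        "doi": message.get("DOI"),
    }

# Abridged, invented stand-in for a real Crossref response body.
sample = json.loads("""
{
  "status": "ok",
  "message": {
    "DOI": "10.1234/example",
    "title": ["An Example Article"],
    "container-title": ["Journal of Examples"],
    "issued": {"date-parts": [[2017, 4]]},
    "author": [{"given": "Donald", "family": "Smith"}]
  }
}
""")

meta = parse_crossref(sample["message"])
```

When the DOI isn't in Crossref, the `message` fields simply come back missing, which is why every lookup above is defensive; the researcher then falls back to entering the metadata by hand, as described.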
Then there are the stock Vivo visualizations, such as the researcher's co-author network. One addition we've made is that profile pages are also integrated with media stories: when Communications and Public Affairs tags an individual in their stories, those stories are harvested and fed into the bottom of the profile page. So it's tied into our branding operations.

Finally, from October 2017 to the present, what we've really been focusing on is wrapping up this project and transitioning to a sustainment mode. This has mostly involved hiring two ongoing full-time library employees to sustain operations, and developing the appropriate governance structure to ensure there are feedback mechanisms between the users of the system, those who operate it, and those who provide its governance.

As you can imagine, doing something like this in an environment such as ours, or yours perhaps, brings a number of challenges and solutions. The first one to point out (this is, of course, the obligatory cat-herding slide) is that we were trying to build the system without any strict requirements from a defined audience. We know that universities need this infrastructure; they just don't know it quite yet. You have to build it to convince them that they need it, and then, of course, they want it. We just saw a need, and so we were sort of whipping people into shape. The funny thing is that this is actually a poor picture, because it implies we were on the horses herding the cats. In fact, a better picture would have all of us as cats, running around kind of spastically, because we were cats ourselves and had to learn the hard way how to get on the horse. And that was perhaps the most difficult thing.
The other piece (and of course you have to have a picture of silos) is that our silos, as you can see, are shiny and silver and connected by infrastructure, so that something in one silo can travel to another, or to a central silo. Our silos are an improved version of silos. The Faculties are building all these websites with no integration, and now there's at least the possibility that, on an ongoing basis, with whatever CMS they're using, whether it's Drupal or Plone or whatever, they can pull in at least the publication data from the Elements database. That would go a long way toward addressing the currency and manual-labor issues of maintaining the profiles on their websites. So we've at least created that possibility. That's the superstructure here.

Another challenge we ran into was that there were no Canadian, and certainly no campus, ontologies to describe research areas. There are various ontologies one could pick and use, but they're always partial and incomplete. We ended up using the Australian and New Zealand Standard Research Classification (ANZSRC), which is quite robust and fairly large, but not overly large and complex to use within our Vivo instance and our Elements instance, and it has worked fairly well for us.

And last but not least, the work we're doing is sometimes perceived by others on campus as possibly threatening their jobs or their livelihood, because we're competing with other systems that have been built on campus. Those systems haven't been able to solve the problem, but their owners definitely see this as an intrusion upon their turf.

All right, so one of the challenges we encountered that we didn't expect to encounter was a complete lack of data governance, oversight, and foresight at our institution.
So we went searching, and, yeah, we really should have expected that, but we didn't think it would be that bad; maybe I'll put it that way. What we discovered was that there was no authoritative vocabulary for things like department names, Faculty names, or position titles. We also learned that the HR files are populated and maintained for the purposes of finance: as long as people get paid, there's no real issue with the data, is what we learned. Most of the information we were able to access followed few to no standard practices, so we had to do quite a bit of work to clean it up. In fact, we did so much work that we became the de facto authoritative source of administrative information on campus, with other projects contacting us to ask for a list of all faculty members on campus, or everyone's appointments, group memberships, et cetera. We even had HR contact us at one point to ask for clarity on department names, to which we said: we just use Google, and we're not really too sure either. The good news is that this has led to an ongoing partnership with campus IT and HR; they now get the importance of data governance and are making great strides to improve it, which helps us as well.

And finally, thinking about linking within and beyond the institution. Dale already talked about research areas, and I think that's the most salient point where we see this need. What we would like is a Canadian vocabulary of research areas, something we can take to faculty members and say: here is the from-on-high Tri-Agency standard classification of research areas. Because what we've discovered is that each Faculty likes to describe itself using research areas that are its own, and by its own, I mean it creates them itself.
It's a push and pull: giving Faculties the ability to describe themselves, while making sure you're not reproducing the silos we've been trying to break down over the past decade or so.

Another challenge we've had (and boy, our screen's gone yellow; I swear these pages actually have a white background) is that you have to persuade faculty, at some level, to buy in. They have a curation role to play in Elements; you can't completely automate the system. If they really want an accurate and fulsome profile, they have to do some curation, or delegate someone to do it for them. This is an example of an empty profile: a lovely outline of Donald Smith with absolutely no information about Donald Smith other than his name and email address. Not terribly information-rich. So that's a bit of a challenge.

The other issue is multiple points of entry; we've heard this in other talks today. It's more work for faculty. They don't see an immediate benefit, so it's very difficult to persuade them that this curation will somehow yield results for them. There's minimal staff support for that type of work: faculty don't universally have someone they can go to and say, would you do this on my behalf? Some Faculties have that; some don't. Asking people to do curation also triggers scope creep. As soon as you ask people to do something, they want something back for their labor, so they start asking: can I do this, can I do that, can you turn this on, can you turn that on? And it's: well, yes and no, let's talk about that some more. So the harder you push for curation, the more resistance you get in the form of scope creep. There are also some high and perhaps unrealistic expectations on campus of what we'll call a stub RIMS can provide.
We're running Elements in a fairly modest implementation, without a lot of the modules it offers. People are coming to us asking for certain kinds of analysis, and there's lots of interest in that, but the current Elements implementation is light years from it. They also don't understand that part of what they want to do requires data that isn't in there yet, because faculty haven't curated it or completed their profiles. So when deans come and ask us for business intelligence or research intelligence, we basically tell them: well, we can give you something, but it's incomplete; have you talked to your faculty about doing this?

We really want to drive this website integration. We want them to stop doing all this hand labor, and we want to put the brakes on all of these bespoke ontologies that rise up. And the one issue I'll put out there (I think we're not the only institution) is that we have a bit of a health-sciences gorilla: one Faculty that is larger than all the rest of them put together and has a lot of really interesting and unique attributes. Jay said earlier that we have 949 faculty, but if you go to our Experts instance and click on "faculty," it will say we have 5,640. Now, McMaster, for the purposes of its rankings inside Canada, takes the amount of funded research, divides it by 949, and says: look, we do $400,000 and change per faculty member. If you divide that same number by 5,640, you can imagine you get a rather different figure. Anyone with any basic facility in math can look at this and poke at it when you've got a nearly sixfold difference between the number of faculty you say you have and the number of people you claim as faculty.
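The arithmetic behind that gap, as a quick sketch using the $405,000 and 949/5,640 figures from the talk (the funding total is back-calculated from those figures, not quoted directly):

```python
# Same research-funding total, two very different per-faculty figures,
# depending on which faculty count you divide by.

reported_per_faculty = 405_000   # CAD per faculty, per the 949-faculty figure
reported_faculty = 949           # faculty count McMaster reports
experts_faculty = 5_640          # people listed as faculty in the Experts instance

total_funding = reported_per_faculty * reported_faculty   # about $384.3M
per_experts_faculty = total_funding / experts_faculty     # about $68,000
ratio = experts_faculty / reported_faculty                # nearly sixfold
```

So the same funding envelope yields roughly $68,000 per faculty member under the Experts count, against the advertised $405,000.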
So this is an open and unresolved issue, and quite an entrenched one, between health sciences and us and other people at the university: who's in Elements? We think that of those 5,640, a good 3,500 or more will never even know they have a profile in the system, so they're always going to remain empty stub profiles, which degrades the utility of the interface. Basically, to sum it up, it's a classic cart-before-the-horse situation. We've pushed the university a little further down the road of doing research information management, but we have not built a system that can do a lot of the things people want it to do in terms of providing actionable, useful data for the university.

So let's wrap this up so we can get to cocktails. Real quickly on future directions: we're going to take the whole bunch of stuff people are asking for, this cart-before-the-horse stuff (people send us all sorts of crazy ideas; deans call us every day, it seems, and say, hey, can we do this? and it's: yes, but only if you give us money), and mash it through a big filter, turning it into actions that we can actually take, afford, and sustain. That, of course, is a very limited list, and that's our biggest challenge right now.

All right, a few final thoughts, and then we'll open it up, hopefully, for a little discussion. To wrap it up, we would consider this a success. I think we did everything we set out to do, and a lot more besides. And so can you. This is the part where we say: you could also do this; but we also say: you might not want to, though. The challenges and obstacles are considerable, and oftentimes it can feel very thankless; at least in our situation it did.
I mean, it feels good to do it, but some days I don't know if it's worth the headache. I've aged a lot. Also, it's mostly about the data. Vivo does everything we thought it would do, and we're very grateful for that. But what we've discovered through this process is that the value, at this point anyway, is in the data: centralizing it and having a consistent product that you can use, and that others can use as well. And finally, Tom talked this morning about a RIMS not being a profile page. I'd like to add that Vivo is not a RIMS, and this is something we learned too, of course, though we've heard grumbling that it might become one. So I think we have a few minutes; we'd be happy to open it up for any questions or comments. Well, we're over time, and I really appreciate you coming to the last session of the day and staying over. Thank you very much. Thank you very much for the last couple of hours. Right, you were about to say the same thing.