In the spirit of getting everyone to the reception on time, I think we're going to get started. My name is Tom Teper. I'm the associate dean for collections and technical services at the University of Illinois at Urbana-Champaign. And I'm joined by my colleague, Bill Mischo, who's the head of our Grainger Engineering Library Information Center, as well as holder of our Berthold professorship. And we're here to talk about the shadow acquisitions budget: APCs and open access publications at a research university. Before we go on, a little bit of context. The University of Illinois at Urbana-Champaign is a comprehensive land-grant institution. We have almost 50,000 total students, at this point going on 15,000 graduate students, and almost 1,900 faculty members. Our ICR income on campus in 2017 was $138 million. And our sponsored research in that same year was $462 million. That's just for that campus, not the whole system at this point. In terms of scholarly communications on campus and what's been happening with that, some of you may be aware that in about 2013, there was a member of our state legislature who actually tried to pass a resolution stating that all research at state institutions needed to be made open access, regardless of the type: book, journal, et cetera. That didn't go very far. And actually, the research universities in the state worked with that legislator to draft a new piece that changed the tone a little bit. And it went in the direction of requiring each campus to complete a study, begin drafting an open access policy, and pass one. So we have an open access policy that was passed in 2016. It's very similar to the policy that the University of California system operates under. Now, the library has strong partnerships with campus research administration as well. Illinois Experts is our faculty profiling service, which is actually run out of the University Library, as is the Illinois Data Bank.
And in the spirit of walking the walk, I recently put a data set in there for a forthcoming publication, went through those steps, and it's been downloaded 30 times in the last two months. I don't know why. It's not that riveting. But in the library, we have a fair amount of support for open access overall. Our campus provided some of the leadership for the policy development itself. We've been heavily engaged in open digitization activities. We've supported acquisitions-related OA and scholarly communications efforts like SCOAP3, arXiv, and Knowledge Unlatched. And we have a developing publishing program. We have, however, no history of any direct support for APCs, either at the library or at the campus research administration level. And it's a little bit interesting, and one might wonder why. In part, I think it stems from the financial situation in Illinois for a few years. You may recall, we had a bit of a dip. We had no state budget for, I think it was, three years. And there's also been a bit of a skeptical view of APCs, especially when it comes to commercial publishers and their support for open access publishing. Were we going to just create another revenue stream for these publishers? So we really refrained from being engaged in that activity. About 12 to 18 months ago, Bill and I started having some conversations. And those conversations are really the root of where some of this work comes from. We started talking about various issues like: How much does Illinois pay in subvention fees? How would we begin tracking that or figuring that out if we wanted to? What are the opportunity costs for the institution in paying some of those fees? And should we be trying to look at those in some way? And rather than go through all of these, what I think we can do is start turning to Bill's portion, where we go through the methodology a little bit and can begin talking about some of the findings.
I think we can get back to some of these questions as we go through the methodology, as Tom mentioned. So I think it's really important for us to be better informed about OA activities on our campus. And this is particularly useful now, since we have Plan S and we have the OA 2020 program and Pay It Forward. There's a lot of talk about a gold, APC-based model for publishing. I think this is also useful as we negotiate contracts. There's been a lot of concern, a lot of activity around negotiating contracts where the portion of articles that a particular institution is contributing offsets the cost of the subscription. So you sometimes see this term offsetting or offsets. It's also an important element as we go through the contract negotiations with particular publishers. So there have been a number of studies that try to figure out what percent of the literature is OA. Some of the older studies are really methodologically flawed. But there have been two or three studies in the last couple of years, which I've listed here. Science-Metrix was funded by NSF to actually try to make an estimate of the percent of the literature that's OA. They found that at least 50% of the articles became available in OA within 12 to 18 months. That's probably a little high based on what we were seeing. There was a nice article in PeerJ in 2018 which actually tried to calculate this in three different ways and found a wide range in the percentage of papers that were open access. Another article in 2018, in the Journal of Informetrics, where they actually looked at Google Scholar, found that 54% were open access. What we found was something sort of in between also. Again, I wanna mention these open access initiatives. I don't know if you're familiar with Plan S, but that's a very hot topic right now.
OA 2020 started out as a European project and has kind of been embraced by California and other places, and a lot of this is gold open access built around APC costs. A lot of universities, including us, have invested a lot in gold OA, and we're trying to measure these frequencies at UIUC. Now, I think we're a pretty good representative of a typical R1 institution. Our program in biomedical and medical sciences is less strong, but we're just starting a new medical school, so that's gonna go up. We have very strong programs in physical sciences and engineering and in social sciences, where a lot of the OA activity is. All right, so the thing that's changed recently, I think, is that we finally have in place the tools and the services that let us do this. I wanna show you one way that we actually try to calculate OA activity longitudinally, but you can do this with other types of systems. We used our local SciVal Pure database. How many people have Pure here? Like a handful. Pure has a couple of nice ways to download metadata. One is their API, which is really not very good, but there's another way where you literally can download the metadata into spreadsheets. We took the spreadsheets and converted them into relational databases so we could do SQL queries over them. So we took our SciVal Pure database, which covered 2013 to mid-2018, when we did the study a few months ago, and pulled out 37,000 papers or articles or book chapters or books. From those 37,000 we determined that 27,000 were journal articles. We excluded errata, letters, and editorials. So after we did that, we were left with the 27,000. So this is 72.9% of all the publications. We're getting about 5,000 articles a year that our faculty are publishing, which I think is typical for a lot of the institutions at the R1 level. So we're pulling out metadata which includes the title, author, ISSN, EISSN, and DOI. DOI is extremely important.
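As a rough illustration of the spreadsheet-to-database step just described, here is a minimal Python sketch using the standard-library `sqlite3` module. The table name, column names, publication-type labels, and sample rows are all invented for illustration; the actual layout of the Pure export is not described in this talk.

```python
import sqlite3

# Hypothetical rows standing in for a Pure spreadsheet export.
# Columns: title, publication type, ISSN, EISSN, DOI (all values invented).
ROWS = [
    ("Paper A", "article", "1111-1111", "2222-2222", "10.1234/a"),
    ("Paper B", "article", "3333-3333", None, "10.1234/b"),
    ("Paper C", "book chapter", None, None, None),
    ("Paper D", "editorial", "1111-1111", "2222-2222", "10.1234/d"),
]


def load(rows):
    """Load exported rows into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE publications "
        "(title TEXT, pub_type TEXT, issn TEXT, eissn TEXT, doi TEXT)"
    )
    conn.executemany("INSERT INTO publications VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def article_share(conn):
    """Return (journal articles, all records, articles / all records)."""
    total = conn.execute("SELECT COUNT(*) FROM publications").fetchone()[0]
    articles = conn.execute(
        "SELECT COUNT(*) FROM publications WHERE pub_type = 'article'"
    ).fetchone()[0]
    return articles, total, articles / total


conn = load(ROWS)
articles, total, share = article_share(conn)
print(f"{articles} of {total} records are journal articles ({share:.1%})")
```

The same kind of query, run over the real export, is what would produce a figure like the 27,000 journal articles out of 37,000 records mentioned above.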
DOI is kind of the coin of the realm now. We can use DOIs with a lot of different API services, and I'll show you how we're doing this here to extract information and make comparisons. We use APIs a lot in our discovery system. So then we downloaded the DOAJ database, although we also use their API, and pulled the open access records out of Ulrich's. And then we started looking to see how many of these are open access. So for the 27,000 articles, we searched the ISSN, EISSN, and title against Ulrich's and DOAJ to get the gold OA articles. DOAJ is the Directory of Open Access Journals. It has about 11,000 titles in it. Then we also searched the article DOI against the Unpaywall API to determine OA availability. With Unpaywall, you just send a DOI to it and it'll come back and tell you if there is an open access version of that article, and actually what the best open access version is. Unpaywall gives the typical gold and green OA matches, and also the hybrid matches and what they call bronze OA. Bronze OA is an open access article from a journal that's typically behind a paywall. So it's typically not an open access journal. They may not even allow hybrid payments for articles, but they're making the article open for one reason or another. Some of these come and go, which is one of the problems with using those. So then we also looked up the APC charges by looking at the DOAJ database, which has the APC charge. Max Planck has a nice website that lists all the OA APCs. And then, what's cut off here, we actually even looked at the journal website in some cases to try to figure out the APC costs. Again, and this is important: we just took these base APC costs. We didn't prorate for non-Illinois co-authorship. We didn't try to calculate additional charges, because we didn't know them. Some journals still charge extra, for color prints or for page charges.
We didn't differentiate between faculty and grad students. We didn't make any time adjustment for a moving wall or embargo period, although you'll see when we show a chart later that for articles that are a few years old, a higher percentage of them are open access. One important thing: I've talked to a couple of faculty members recently, and what I understand is that these APC charges are typically paid by the corresponding author. So in many cases, one of our faculty is the corresponding author, and they paid the APC charge. So the charges are not typically divided between institutions. In fact, one institution pays the charges. All right, here are some of the numbers. So we look at this database of 27,000 articles published between 2013 and mid-2018. 13% are in gold OA journals. So 13%, which is a little bit lower than what you would expect, although again, these are not individual articles; they're at the journal level. And there were a total of 409 different gold journals. So there's a long tail. 205 of them had one article only. So there's a bunch at the top where our faculty were publishing a lot, and a bunch from the middle on down where they published only one time over five years. Remember, gold OA does not require that the publisher charge an APC, an article processing charge. Gold OA by definition just means that the article's available free from the publisher website. However, only nine of the top 100 did not charge APCs. Four of them were journals that are in SCOAP3, and we'll talk about that later. So there are really only five journals of the top 100 that people are publishing in that are not charging APCs. So there are very few free lunches, very few of these gold OA journals that aren't charging an article processing charge. We add up all the APC charges from these 409 different journals.
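The adding-up step can be sketched in a few lines of Python. This is only an illustration of the approach as described (base APC times article count, with no proration for co-authors and no extra fees), not the actual script used in the study. The function name is hypothetical, and the two journals shown use the article counts and APCs quoted elsewhere in this talk.

```python
def estimate_apc_total(journals):
    """Sum base APC x article count across journals.

    Mirrors the stated simplifications: no proration for non-local
    co-authors and no page or color charges.
    """
    return sum(count * base_apc for count, base_apc in journals.values())


# journal -> (article count, base APC in USD), figures as quoted in the talk
JOURNALS = {
    "Nature Communications": (164, 5200),
    "Chemical Science": (23, 0),  # gold OA journal that charges no APC
}

total = estimate_apc_total(JOURNALS)
print(f"Estimated APC spend: ${total:,}")
```

Run over all 409 gold journals with their looked-up base APCs, the same sum gives the campus-wide estimate.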
It comes to about $5 million over the five years, basically 2013 to mid-2018. The average APC charge per article is $1,511, just taking the simple average. This compares very favorably with a quote of $1,825 for research universities that was derived in the Pay It Forward project. SCOAP3, I mentioned this earlier. For those of you who aren't familiar with it, CERN negotiates the tender offers for 11 high-energy physics journals. We all contribute to that, or at least the libraries that used to subscribe to those journals do. We actually had 569 articles published in those journals from 2013 to 2018. We have a very active high-energy physics program at the University of Illinois. This may be different at your institution. It's the one area that we may be specializing in that some R1 universities are not. Our assessment for SCOAP3 last year was $10,203; that's come down over the last five years. But if you figure this over five years, it only averages $88 an article. At $1,500 an article, we'd be paying 900K. So the SCOAP3 model for OA is very inexpensive. Right now, it's only high-energy physics that has this model. It's a lot of work for CERN to do this. They're basically collecting this money and negotiating with publishers for how much they're gonna give them. Gold OA percentage by year. You can see, this is kind of interesting. This kind of peaked in 2015. It's going down a little bit. This is a little misleading also, in that these are the Pure dates of loading, not necessarily publication dates. So that may even out, but that may actually be a factor. This actually reflects something that we've been told by faculty, where they're getting concerned about APC charges because their discretionary funds are shrinking. More ICR money's being taken off of grants, and they're concerned about that. I hope people can see this. This is really the list of the top OA journals. The highest on the list is PLOS ONE, Public Library of Science ONE. This is their peer-review-light journal.
They publish about 25,000 to 30,000 articles a year in PLOS ONE, and they charge $1,495 per article. Very interesting in that a couple of years ago I talked to the College of Engineering department heads, and they said basically that they would not accept articles for promotion and tenure that were published in peer-review-light journals like PLOS ONE or AIP Advances or other journals. IEEE also has one. Last year I talked to a faculty member who said, yes, they were looking at them now if they were cited heavily. So there's still a bit of a prejudice against these peer-review-light journals, but that's changing. And you can see we have a lot of articles being published in PLOS ONE. Two of the other top five here are high-energy physics journals. So we have 243 articles in the Journal of High Energy Physics. That is one that's under the SCOAP3 process. If you go down and look at Nature Communications, that's 164 articles. That charges $5,200 per article. So altogether, in the last five years we've put $852,000 into Nature Communications. The campus has, from grants or from discretionary funds. Because as Tom mentioned, the library's not paying for any of that. The Office of the Vice Chancellor for Research is not paying. Maybe some departments are. So, very interesting. Again, if Nature had come to us five years ago and said, we've got a new journal that we want to charge you $110,000 a year for, I think our answer would have been that that's very difficult for us to understand. And yet we are paying that kind of money as a campus for this. Down here to the next 16. At number 35 we're down to 16 articles published over the five-year period. You can see a lot of these are in the $2,000 range. Cell Reports is $5,000. An interesting one in the middle here, number 23, is Chemical Science, which is the new ACS journal, which charges no APC fees. But it's very difficult to get an article accepted in that. We've had 23 articles accepted in that journal.
That's really the only one in the top 30 here, other than the SCOAP3 journals, that does not charge an APC. How are we doing for time? Very good. Hope you're saving up your questions. Now, so these are the gold OA journals, most of which will charge article processing charges; that's 13%. Now we went back and looked at all of the articles, again the 27,000, to see how many of them were open access by checking them against Unpaywall. There were a number of them, I don't remember the exact number, where we didn't have DOIs. So we actually went into Crossref, wrote another script that searched Crossref to try to pull out the DOI, and that brought back another thousand or so. But the universe here is less than 27,000; it's about 25,800 articles that have DOIs associated with them. So we looked those up in Unpaywall and found that about 41% of all the articles were available in an OA version: either a gold OA version; green, for example, in an institutional repository; a hybrid, where it was at the publisher site but the author paid to make the article open access; or in this bronze category. So that 40.9% or 41% compares very favorably with what we're seeing in the literature, if you look at Unpaywall themselves saying it's between 27 and 47, the Google Scholar study showing 54, and the Science-Metrix study showing about 50%. So it means that of all the articles in our SciVal Pure implementation, 40.9%, 41% of them are open access and available free of charge. We added the Unpaywall API into our discovery system. So if you look at our bento-based discovery system at Illinois, you'll see what we call open access links. These are links that we've got from Unpaywall by looking up the article DOIs in the person's result set against the Unpaywall database, and we're putting those open access links up in addition to the paywalled links. I should also say there are some problems with the bronze Unpaywall data.
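The Unpaywall check described above can be sketched roughly as follows. This is an assumption-laden illustration, not the script used in the study. It relies on the public Unpaywall v2 REST endpoint, which takes a DOI in the path plus an `email` query parameter, and on the `is_oa` and `oa_status` fields in its JSON response. The helper names and the sample records are hypothetical, and no network call is made here.

```python
from collections import Counter
from urllib.parse import quote

UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi, email):
    """Build the lookup URL for one DOI (Unpaywall asks for a contact email)."""
    return f"{UNPAYWALL_BASE}{quote(doi, safe='/')}?email={quote(email)}"


def oa_breakdown(records):
    """Tally oa_status values and compute the share of records with is_oa true."""
    statuses = Counter(r.get("oa_status", "closed") for r in records)
    open_count = sum(1 for r in records if r.get("is_oa"))
    share = open_count / len(records) if records else 0.0
    return statuses, share


# Invented responses, trimmed to the two fields used above.
SAMPLE = [
    {"doi": "10.1234/a", "is_oa": True, "oa_status": "gold"},
    {"doi": "10.1234/b", "is_oa": True, "oa_status": "green"},
    {"doi": "10.1234/c", "is_oa": True, "oa_status": "bronze"},
    {"doi": "10.1234/d", "is_oa": False, "oa_status": "closed"},
    {"doi": "10.1234/e", "is_oa": False, "oa_status": "closed"},
]

statuses, share = oa_breakdown(SAMPLE)
print(unpaywall_url("10.1234/a", "me@example.org"))
print(dict(statuses), f"{share:.1%} open access")
```

In practice each URL would be fetched and the parsed JSON collected into the record list. Keeping the bronze count separate makes it easy to report the OA share with and without that less stable category.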
So this 41% is an estimate, or rather it's an accurate figure based on the Unpaywall database, but there are some mistakes in the Unpaywall database. In particular, some of these bronze articles that were open access for a short period of time, on a complimentary or introductory basis, may no longer be open access, but they still show up in the database. We've done some manual checking of that and found some of them. But then there are also some things that are not in DOAJ and not in Unpaywall. D-Lib Magazine, for example, isn't in either one, and those are open access articles, obviously. So this is representative; these are the Illinois numbers, and in a lot of ways they're representative of the R1 institutions. Same thing here for the gold OA: you'll see the percentages kind of peaked in 2015 and 2016, and, I'm sorry, this is the all-OA chart, and it's been going down a little bit. All right, so this is our last slide, actually. This kind of repeats some of the questions that we had on Tom's second slide. So we need to duplicate the study at other institutions or expand this. We could try to do this with all the BTAA, Big Ten Academic Alliance, institutions and see what the differences are. It would be a good test in terms of differing disciplinary emphases at the different BTAA institutions. Again, we've got these tools now, so this can be done not only from SciVal Pure, but also from the Web of Science database, if you had it, or a Scopus database, or even a journal list of all the articles that were published by an institution. We've done a lot with the Scopus API and the Unpaywall API also. So, one other thing: I think I mentioned earlier that we were getting some pushback. Recently, we talked to a couple of faculty members in our medical school, actually, who have been very active in terms of publishing. A couple of them have published a lot of articles, a number of those 164 articles in Nature Communications. And they're getting a little upset.
I mean, this is something they've brought up at the campus level. It's something that needs to be discussed. One person said, well, if I publish three articles in Nature Comm, that's actually one less grad student that I can hire to actually do real work. You're cutting my discretionary money, you're taking more ICR off, my grants are very tight, we don't have this kind of money. So this is a question I think a lot of institutions are going to have to address in the next couple of years. Particularly if we go to a Plan S model or OA 2020 model, where we're focusing on gold OA, somebody's gonna have to pay those charges. One of the things we've talked about is trying to take some money out of the library budget. But again, I don't think any of us want to be in a situation where the library is deciding whether or not to pay somebody else's OA charge. So if we get 500 people publishing in PLOS ONE, we should not be the arbiters who decide who we're gonna pay for and who we're not gonna pay for. So those are interesting questions. Tom, do you wanna say anything more about these questions? Well, yeah, I think certainly that opportunity cost is something that's important. I'm struck a little bit by the list of journals that is represented. They are very heavily weighted toward the STEM disciplines, as one would expect. Yet just the other week, I had a conversation with a colleague in the library who had a paper accepted in an open access publication and wanted to walk the walk. She wanted to take that step, and one of the challenges she had was a $1,600 bill. Now, she doesn't have grant money to pay that. She does not have that same kind of funding.
And if you create a fund that's going to support this and provide the same level of access to people regardless of whether they have grant money or not, because we want to be equitable in some way, that creates a different challenge, because one would anticipate there's going to be a growth in non-STEM publications going in this direction. Frankly, for those who don't have a lot of grant money, who don't have endowed professorships, et cetera, the APCs probably turn people off, in some disciplines, from publishing in an open access model. And we haven't even talked about faculty in developing countries who have a $5,200 bill for Nature Communications; that might be their research budget for a year. So I think one of the follow-ups to all of this is a need to actually do some qualitative work with some faculty on campus as well, to get an idea of how they feel about this and have more than just hallway conversations, to have some data about their particular feelings. So we have a little over five minutes. We have a question. That's really a great question. Sorry. Thank you.