Good morning everyone, thanks for coming to the session. I want to talk today a little bit about the transition of Transkribus at the moment and the place of the user within that. We're obviously in a time of great growth and we're obviously in a time of great change, and part of the success of Transkribus is down to our user community. But it does seem strange to me to come to the user conference and talk to you for half an hour about what you do when the audience is actually filled with users. So I'm only going to talk for five or ten minutes about the kind of things that we're doing, and then I want you to have a think about what you want to tell us about what's going on, so we can have a discussion about some of the information flows, some of the things that are going on. For example, start to think about the questions you want to ask or the things you want to know about. What information do you need that you don't have? How can we support your use of Transkribus better? If you need support, who do you contact? Do you know? Are the forums that we have for help adequate? What additions would you like to have to our new website? We'll be putting a new website up soon. And what functionalities would you like to see added that we don't have? Listening to the talks yesterday, another few things came up. Firstly, obviously, the pricing issue and getting the information that you need about that, particularly for those who need to embed the new cost structure into their own financial plans going forward. So let us know what you need to know about this so that we can figure it out. Another thing that came up yesterday is that, due to the growth of the community, there are lots and lots of models that people are publishing, and other people can't seem to find them or don't know about them. So what can we do to help with that?
And something which has come up on social media is the openness and transparency of the system and the code itself. We're not formally open science because we don't publish all of the code, and we're thinking about whether that's an issue for you. So these are the types of things that I'm quite happy to have a conversation about as we go forward. But let's talk about users and the growth of users. We saw yesterday there's been a massive increase in the number of users of Transkribus, particularly over the last couple of years. We launched in 2015, we had a couple of hundred users for the first six months and a lot of comments about how the tool didn't really work. There was a step change about 2017, 2018 when we introduced keyword spotting, and we got past 10,000 users at the end of 2017. Now from a network perspective, once you get past 10,000 users of the system, that's when it becomes a word-of-mouth thing, where people start talking about it and using it and telling other people about it. I went to a really interesting conference earlier in the year in London called AI and Archives, and it was about archivists thinking about how artificial intelligence was going to change things. And I was giving the keynote about Transkribus at the end of the conference, but lots of people there didn't know who I was, and that was fine. I didn't expect them to know who I was. But I was just listening to the conversations that were happening in this archivist community. They were all talking about Transkribus. There were people in lifts explaining Transkribus to me, and they had no idea that I was part of the Transkribus team. And we've gone from that moment, from it being a relatively small community, to a large one. Now that means there are some changes in how we support the community, and we're still a small team, and it's that tension I want to talk about today. Just to give you an overview of the use of the system.
Although we now have 32,000 users registered, we have about 1,000 core users who regularly log on. This is a report from last week over a period of seven days. We are getting about 350 new users every single week at the moment. And we have about 1,000 active users who are logging on and trying stuff. About 8,000 jobs are going through and about 100,000 images a week are going through. In November last year, we hit the tipping point: a million images went through the system for the first time in one month. A million images a month are now being uploaded. Now psychologically, that's really interesting, how humans go, wow, a million; it's not that different from 999,999. But hitting the million is like, wow, we've actually built a system which is regularly in use. So it is something which has a large user base, and it is now kind of self-sustaining in that community. But it's not a popularity contest, this. We want to make sure that we are still delivering a really good system to users, and we are in that process of change. We're still a small team, and we're trying to support people. So as things grow and as things change and as people talk about it more, we have to make sure we're sorting out the back end. And that is part of this transition to the co-op: making sure that we are building not only the technical infrastructure, but the people infrastructure and the information infrastructure to support our user community. Now there was a survey done at the last user conference, which was in November 2018. This is a fairly high-level one, but at the same time it shows various perceptions about how people are finding the tools and what they're finding difficult. This is really useful and really interesting to us and really helpful. You will find in your packs a user survey for this conference. Please do think about filling it out and leaving it at the front desk. It'd be really useful for us.
And if there are things you don't want to talk about publicly that you just want to let us know, then please do fill it out. But it did show that a lot of the people who are here are expert users of Transkribus, and lots of people are really satisfied with the system. For the people who are not satisfied, we could tell, looking at the individual features, the things that they're having difficulty with. For example, the things that people found most difficult were segmenting and training the HTR model. So if we know that's difficult for the users of the system, we have to go back and look at our user documentation and our help and support. I did another survey in March and April this year. And it's fascinating to me as an academic; the digitisation of the past is my academic area. I'm a professor of digital cultural heritage. So I'm learning a lot about that through the Transkribus project. But I'm also writing up a study just now about how HTR is going to change historical research, and what we can learn from users of Transkribus and the volumes of stuff that they're doing and their ambition for what comes next. It's important to know that Transkribus is the only user-facing HTR system at present that you can actually use. We have good levels of use. There are other commercial providers who have HTR embedded into their systems, for example Adam Matthew, but you cannot upload your own documents. You can only use the HTR that they have done in their systems. There are other publishers who say that they've done HTR on their systems. I won't name them today, but I have access to all the logins for Transkribus. So some of the major publishers who are claiming that they have done HTR on their digitized manuscripts, they've used Transkribus to do it. And they've put the HTR content up and then they're selling access to that HTR content.
So if you want to upload your images and have HTR done and you're not a computer programmer (because there are plenty of HTR systems that, if you know what you're doing, you could implement on your own), Transkribus at the moment is the only commercial-facing system where you can upload your own documents. So over the summer I did this study: can we use Transkribus as a case study to look at how this is affecting users or readers of historical documents, and what this means for history? Many of you will have contributed to that, and thank you. I put up 50 questions. It was a long survey. We sent it to all 20,000 registered users. It was open for a month. We got 155 really detailed responses. That doesn't sound that much out of 20,000. But at the time there were 800 active users of the system, so about 20% of the active users of the system responded. And there was a huge amount of information from that. I'm still writing that up just now and it might turn out to be a book, actually. It isn't all unicorns and rainbows, though. It isn't all, yay, this is amazing. We heard yesterday that Transkribus, it just works. I think for the most part it does. We did get some really negative responses. My favorite one was someone who went through that survey and filled out every single question with "your tool does not work". But when you looked back at where we asked what type of documents he was trying to read with Transkribus, he said ancient Greek, and that's not a language or a linguistic system that we support in Transkribus. So he was trying to do stuff that we couldn't do anyway. That was a really useful insight. We also learned from that what people are using the system for and the size of projects. The majority of people using Transkribus, outwith the 1,000 regular people that are logging in, are putting very small projects through, so 70 pages or so, or up to 1,000 pages. We have a core of people who are genealogists who are using the system, and that's great. And that's fine.
It's one of the reasons we're giving free pages to process when we go towards the paid model, to keep supporting the people that are doing very small projects. There are two projects that are putting through over a million images over the course of the project, and it's at that type of scale that we want to see those major funded projects contributing to keep the infrastructure of Transkribus up and running. I asked people what they're going to do with their data once they actually get it out of Transkribus, and this is the kind of interesting stuff. So it's things like creating a TEI edition; private use, whatever that means; two Horizon 2020 projects. So we were a Horizon 2020 project, and there are two more who had embedded Transkribus into their workflows and into their funded workflows. So that's a big success for the EU. Annotating it for grammatical analysis, data extraction, and editing and publishing it. "Our ultimate aim is to import the transcriptions into our collections management system." Now I think this is a game changer in digital cultural heritage, when we can have DAMs or CMSs engage with the outputs from Transkribus and allow users to get access to that. Most CMSs are not set up to incorporate this large-scale text alongside digital images. So it's a fundamental infrastructure issue for libraries and archives to sort out how we work Transkribus into a general digitization workflow. That's something I'm working on just now, both at Edinburgh and the National Library of Scotland, to see how we can integrate that into our digitization workflow. So for me, that's a step change, when we can get the outputs of Transkribus embedded seamlessly so users don't even have to think about the difference between the two. And "completion of my ancestor research", and of course "your tool does not work", which was filled in for every single one.
In general, when you look at the analysis of what everyone said in the survey, the most important thing was access: how this is going to be transformational to access, and access to the past. And I think there's a lot to unpick with that and how that's going to change. And I am writing that up just now. Moving on to things which are more pressing for what's happening just now, I did ask people if they were concerned about the sustainability of Transkribus. 45 people, 29%, had even considered it. A lot of people said no, a lot of people said don't know. So only half of people had thought, wait a minute, this is a free tool that at some point we have to pay for to keep the lights on, basically. And that's where we are: our funding ended in July. We now need to get to a model where we can keep the servers running and keep the people employed behind the scenes. And so I know there's some difficulty in that, and I know people have been used to using a free system, and we're trying to engage the user community in this, but also in a way that we can support and continue this rather than just shutting the whole thing down. It would be normal at the end of a project, a funded project, just to go, oh well, that was a success. We write up our papers and we shut the whole thing down and we move on. But we're at this point now where we have 32,000 registered users and 1,000 active users and a million images getting uploaded. We're a success, so we have to try. And we are trying to do this in a cooperative way, in a way which is profit making for the system, but it is not set up in the same way that most tech firms actually are. We asked if you were aware that we're going to be moving to a paid-for model. Most people were not, and most people hadn't heard of turning Transkribus into the cooperative. So we need to do a lot more work with the user community on that and on actually convincing people. I hope you've heard a lot the last few days about this.
So I'm still writing this up, and I've got lots of questions about the difference between paleography and HTR, and what it means to actually read documents and to have computers read them, and the black-box nature of that. And I can get all philosophical about it. That's kind of my academic thing. There are lots of things we need to do to understand that this is a historical turn in our approach to historical documents, right? And you are the folks who are doing that, and it's fascinating to be part of this. So it's a really exciting moment. But I want to talk now about help going forward and how we're actually going to help folks as part of the new co-op. We are putting a new website together. It's not up yet. It will be at readco-op.eu. If you have any questions or queries about anything, the new contact is info at readco-op.eu. So that's the new core contact, where we can then triage that to other people depending on what your query is. So that will be coming soon and we're putting together the materials now. So if you've got any insight or thoughts on what you would like to see, please do let us know, and please do start contacting us via this method to make sure that your query gets to the right person. But there is a change when you go from a free thing to a paid-for service, and that comes with a shift in expectations. That comes with a shift in knowing how we can support people, and in expectations from users. There are service delivery issues. There are expectations about communications. There are expectations about standards. And so there's going to have to be a shift in our thinking. And it's worth me stating that we are not Google and we are not Microsoft. We do not have billions of pounds to invest either in user front-end design or in support systems. Our team at Transkribus hasn't grown massively even though we've gone from 300 users to 32,000. So we have to find ways to navigate that. There are existing information sources.
Here's where I'm going to start doing a question and answer. So who here uses the wiki to find out information? So a few people. We need to update it. Some bits haven't been updated in a few years. So we need to make sure that as we integrate and change the system, we actually update that too. Who here uses any of the tutorials on YouTube? Oh, that's good. The user numbers on that are not that high, but we can see that people are looking at the whole thing. So it's evident that that's a really important introduction, and I think these serve well as important introductory material. Who here follows Transkribus on Twitter? So that's the major kind of current news source which is coming out. Good to see so many of you are engaged. That, and the official Transkribus Facebook group, are the places where we're trying to keep the community engaged. So we do have the official Facebook group, and Facebook has a few issues, like the way it's destroyed democracy throughout the whole of the world and how it sells all your data. But it seems to be the forum where people are coming together, so it is an effective way for us to reach the user community regularly. What is of great interest to me, actually, is the unofficial Facebook group, the Transkribus users one, which was set up by the users; it wasn't set up by the Transkribus project. The engagement there is fascinating because you're all so helpful to each other. Newbies or very experienced people post very specific questions, and people all pile in and give very detailed, helpful answers. Now as we move to the cooperative model, where we're trying to be a cooperative and we're trying to all be helpful to each other, this is the type of thing where we need your help and your support too. So supporting this type of activity is where you can help us grow the community with your expertise.
If I could be honest, I am not using Transkribus every day in my day-to-day life. A lot of you are, and so sometimes people ask me a question and I'm like, right, okay, I have to go back and do my homework to find out that one, but you are right there because you are on it all the time. So we are dependent on the user community. And this is not something I've talked to Andy or Gunther about, but I was thinking about this this morning: given it is a cooperative, we might have to think about this labour that users are putting in to support other users and how we can support that, whether it is giving credits; if you're doing training, we can give credits for pages and things like that. So if we're going to this paid-for model, there might be a way of helping us support people who are willing to give their time to support other users, and I think that's something we have to think about, but that's just entirely all in my head and we need to figure all that out. So those are the kind of discussions we're having as we move towards this more sustainable model. I just want to shout out to everyone who was so excited about coming to the conference; everyone was going, "who's coming to the conference?" on the Facebook group. And this is the type of thing, seeing the community grow and seeing how people interact online and don't know each other and then come here to meet; it's a kind of fascinating intersection between the digital and the physical. So we've got 10 more minutes and I would like to pass this over to you now for a discussion. What are the things that you would like us to know? What are the things you would like us to think about? What are the things you would like us to address, and can we be doing anything differently, more, or better as we move towards this co-operative model? Yes.
Thank you. Yeah, I was wondering, because I heard around me some concerns about pricing and whether archives were able to find the funding to pay for that.
Whether you also thought of a model where you have less innovation and more the basics, and maybe then be able to cut half the price.
Do you mean like folks that are using other folks' models? Because the model generation is, from the processing point of view, the heavy lifting at the back end, right? If you're using models which have already been generated by others, that's not nearly as computationally expensive. So is it that kind of thing, where you're saying if you're processing using that then there would be one tier?
No, I mean more, it's a very innovative team and product. And I was wondering, well, I don't know, maybe you can say a little bit now about where your biggest expenses are, but I was wondering if you put less innovation in it for the future, because it works already quite well, and if you then also retrain but maybe don't add so many new features, whether you can lower the price, so that it may be more sustainable also for more archives to do projects on it.
So we will not be having the same level of innovation as we've had during the EU-funded project. There were eight different teams that were funded across Europe and most of them have been disbanded, and it is the folks at Innsbruck who are now keeping the system that was developed alive. There will be tweaks and there will be improvements, but we are not now employing groups of people who are working on, for example, the tables or the keyword spotting to develop those tools better. So we're going from an R&D project, a research and development project, to rolling out the product that we delivered. So what we're trying to cover is the three or four salaries at Innsbruck, the cost of compute at Innsbruck to keep it alive, and the user support at Innsbruck.
And at the moment the University of Innsbruck is putting in a lot of resource for that until we get to the stage where... but given the amount of money that's been pledged, we have got enough money to keep things going for the next year while we sort all this out. But new research and development is not part of the business model we're trying to service. So I'm not sure there'll be much savings on that, but there might be a differential between people using it to do the large-scale compute of new models or not, I think.
Okay, well, it's also where the value creation is, so that's a bit...
Exactly.
Yeah, but very insightful. I think maybe it's also good in your communication to tell people that Innsbruck University is now paying for it to make it sustainable, because this is the first time that I've heard it here, so I think this is also good for people to know.
I think we have a huge thing to do with comms, to improve our communications about what's going on. I mean, that's on us, right?
And maybe it's an option for the future to do more innovation through other European funded projects in order to go to a next level and then incorporate it into the co-op again.
Yeah, we are, exactly that. We are actively looking at other research grants, EU and other, that we can apply for to develop and support Transkribus and then to bring it back in so that we can do the extra R&D and keep improving the system.
Thank you.
A proposal for your business model. If I were you, I would think about offering the software in a strongly reduced version to the Ministry of Education with the intention to use the software at public schools. And let's say pupils at public schools could use the software to go through digitized versions of, for example, letters of their grandparents, just with the software, with some adjustments via selection boxes. Let's say: letter of some grandparents from Scotland, end of 19th century, type of subject of the letter is private or trade or scientific.
And so younger people could get accustomed to the use of such software. In addition, they get information about historic events and about their ancestors. And so they should have an intrinsic motivation to use the software.
That's really interesting. Thank you. We'll take that away. Other comments?
Thank you for your insights. As a member of the co-op, I think that one of the things that we run into is that, as you're changing from a funded organization to a corporation, it would be good to have some kind of service or support ticketing system in place. So if you run into an error or request new features or small changes to, for example, the web interface, you can actually insert a ticket into the system. We get a number or something. And then we know it's in the system and it's being worked on. Now, of course, everyone is trying their best and we have the communication in there by email and stuff. But sometimes things get lost on both sides.
Exactly. I think a ticketing system is now pretty standard as well. So I think that's a good point.
That at least would help us, I think, a lot. And you know it's in there, that it's part of it. You can actually see if it's being worked on or not.
Yeah. Great. Thank you.
Thank you also very much for your presentation. One of the things I was thinking about is that you have a huge community and the potential is enormous. But these people can only meet via actually quite specific social media platforms. And we, of course, have the experience with VeleHanden that that is one of the biggest powers of VeleHanden: that people can meet on platforms of all kinds of projects and just communicate with each other about nice findings, about the weather even, but also about problems they have with segmentation, et cetera.
So are you using your own? You've got your own forum set up.
We have our own forum with VeleHanden. We will also present it shortly.
OK.
Yeah, but it's really a big thing where I think Transkribus, the technique is working great.
I see the innovations. Bilk, indeed, is, of course, one of the issues. But I'm not sure if that's really something related to crowdsourcing right here. But the forum aspect, that's really rewarding, and communicating directly with your volunteers all over the world. That's really something that I think is missing in Transkribus right now.
Yeah, I would agree. I think that has to be part of the new website as we're scoping it out, because it seems to me, well, hey, Facebook... I was talking to someone yesterday in the audience, and he's like, well, I've never been on Facebook, and I never would because of the issues with Facebook. And so there's only a small section of the community that is there; same with Twitter as well. So we need to actually own our communications and we need to actually own a place for discussions to happen, which is a place where anyone else can then find it and know that that's where it's hanging out. So I think that, A, the ticketing system for issues, but, B, also the place you come to chat and hang out; we have to have that as part of our own infrastructure. And it's something that was never part of the EU project; we never had a work package to deal with that. You know, the EU package was mostly sorting out the infrastructure to actually do this. But now we have to think about the project in the round, which engages these types of things. So yeah, thank you.
Thanks, Melissa. I have three points. The first one is, if you're intending to do a press release or anything like that, we, as the British Library, as a member, would like to help you spread the word. So we would also like to shout out that we are a part of this, and this could be great for promotion. The second point was that it would be good for us to have the minutes from your board meetings, when you have those, for transparency. And, you know, we can report internally how this is going forward.
Third is a question about communities and fora, etc. For me, I think it would be useful if the new website would also have some type of, maybe, mini discussions about specific areas of language and script. So for example, I would like to know who in the wider Transkribus community is working on Arabic and on Indian scripts and on Hebrew, so that it would be easy for me to approach a smaller group and talk about the very specific challenges that each language has.
Yes, that's really useful, thank you. So the first point, with PR: we now have nearly 40 institutions that are partners, and so once we're ready to go, yeah, this is what we're doing, we will be really grateful for support in spreading the word further. We set up our official Google Drive last week, that is a professional Google Drive, so it's all GDPR compliant, to share board meetings, financial reports, all of that. So that's in train; because we only got set up properly in December, it's just taking us a wee while to kind of go, right, okay, now we have the legal bits out of the way, what are we actually doing? So that's coming. And all partners, or everyone who's signed up to the co-op as a member, will have access to that shared secure space and can see all the documentation. I'd like to move towards being totally transparent about what we're doing. And the third thing, I think that's a great idea. So it's something like the wiki, but more based around types of..., and actually having a community part of the wiki that people can kind of self-fill in, so you can help find each other. I think that's great as well. So we've got the forum, we've got the tickets, and we've also got a wiki part for community support and making sure that we can help users find each other. And I think that's three things I'm going away with today.
And I sit and bang on about this stuff, but having the support of the users saying this is what we need is really powerful for us. So if you could fill those things in on your forms as well and hand them in, that would be great, because it shows that there's a need from the user community for these things too. So please do write this stuff down on the conference user survey as well, that would be great. Any other points people want to make?
This came up when our membership was being reviewed, and as a university partner, as opposed to an archival partner, it's actually much easier for us to contribute to the research end of the cooperative than it is to what we see as basically renting processing, because we have lots of processing and we don't need to rent it. We could do that at the client side at no cost to us. What's much easier for us to do is to, say, embed a researcher within the University of Innsbruck to support our own research and the research of others. So I'm wondering if you've looked at doing it in that way, having the research funded by the co-op rather than relying on this processing model.
Okay, that's something we haven't worked out actually, but that is interesting, and there are also other opportunities for going for joint funding towards things like postdocs, or for things like PhD studentships that we can embed into... I'm writing one just now that would be half-time at Edinburgh and part-time at Innsbruck, and having that model where we can support people using external funding to support things through the system as well. So thank you, that's something that we need to... I'm looking at Andy going, yeah, we need to seriously consider that. That's really useful, thanks. I'm aware that time is going on, but that's been incredibly useful to hear your thoughts on this. It is a time of transition and change and imagination, and also sorting out a lot of boring administration behind the scenes. But I'm hopeful it's really...
It's an exciting opportunity, right? I don't know any other large-scale digital humanities or digital archives project which has tried to go from this funded model into being something that is sustainable. So what we're doing is brave, what we're doing is innovative, and we're trying to keep the lights on. I'm aware that there are some tensions with that, but we hope to work with the community and try and work these issues out. So it's really important that we keep this dialogue going. And I think today, and this conference... these things didn't come up at the last user conference a year and a half ago, right? This is a space where we're transitioning. And so I think it's important that we keep that dialogue going and we keep it supportive, both supporting each other and acting as the co-operative that we want to be. So thank you very much for your time today. That's been really useful.