Okay, where we are. The claim, which is also on our website and which we wrote into the proposal two years ago when we applied to the EU Commission for the project, is that we want to revolutionise access to handwritten documents. I know that this is a big claim, but I think that, not only because of the project but because of many technical developments going on, this claim can be realised. In order to realise it and to really make something happen, we tried to follow five key concepts. We said we want to set up a culture of cooperation, we want to provide a breakthrough in handwritten text recognition and similar fields, we want to set up a service platform, that is Transkribus, we want to initiate a cycle of growth, and we want to make innovation happen.

Okay, the culture of cooperation. I always stress in my presentations that this slide really is, for me, one of the core concepts of the project and of the platform: I believe that really good results can only be achieved if different user groups are working together. So all the user groups who are dealing with historical documents should be at the table, should contribute their knowledge and their competence, and should of course also receive benefits from this infrastructure, from this service platform. And we really try to follow this concept, so Transkribus understands itself as a platform mediating between the different user groups and trying to satisfy the needs of the different user groups.

Breakthrough in handwritten text recognition and related technology. One important component of this are the scientific competitions organised by the READ partners. For years now, especially in Valencia but also here in Vienna, they have been organising scientific competitions within the framework of scientific conferences. And if you look at the results of these competitions, you can say that you get somehow a view into the future.
So what is now keyword spotting is something which was already part of research some years ago, but it of course took a while to be realised as a service. These competitions are, from my point of view, a very good indicator of what is actually happening in the research field, but also a means to guide the researchers a little bit and to feed them with interesting material. Because that is really one of the key things of this cooperation: real-world material needs to be provided to the research competitions. This increases the applicability of the algorithms and the technology that are developed. So there is a long list of institutions providing material for the competitions, e.g. for writer identification or for baseline detection. And actually, in a week or two, ICDAR 2017 will take place in Kyoto, Japan, and READ partners will be there; some of them are very successful in these competitions.

A breakthrough in handwritten text recognition actually already took place, as I said in my introduction, maybe some years ago. But making it really available, that is something we can say is part of the Transkribus platform. To my knowledge it is the only place where users really have access to this technology by relatively simple means: just transcribe text line by line and then train a model. And as you have maybe experienced, the technology is very robust and does not depend on the language in the background; it just focuses on the characters, on the alphabet. It is also able to read from right to left, so Hebrew, Arabic and other alphabets can be processed. That is one of the nice things when we are approached by people and they ask us: this is a medieval document, a Latin document, or a Greek document, or this is an Arabic document. Can you process that? Do you need a dictionary?
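The language independence comes from the fact that training only needs pairs of line images and their transcriptions: the character set is derived from the ground truth itself, with no dictionary or language model required. A minimal sketch of that idea (the function name and data are illustrative, not the actual READ/Transkribus training code):

```python
# Toy illustration: deriving the alphabet for an HTR model purely from
# ground-truth transcriptions, with no language-specific resources.
# (Hypothetical sketch, not the actual Transkribus training code.)

def build_alphabet(transcribed_lines):
    """Collect the set of characters seen in the training transcriptions."""
    alphabet = set()
    for line in transcribed_lines:
        alphabet.update(line)
    return sorted(alphabet)

# Ground truth is just line-by-line transcriptions; the same script works
# whether the lines are Latin, Greek, Hebrew or Arabic.
ground_truth = [
    "Anno Domini 1683",
    "die Festung wird belagert",
]
print(build_alphabet(ground_truth))
```

This is why the engine can answer "do you need a dictionary?" with "no": everything it needs about the alphabet is already in the transcribed training lines.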
And we can say: of course we can process it, and no, we don't need a dictionary. It might help sometimes at the end, but for a first shot, or for many operations, it is not necessary. So it is really a big breakthrough from my point of view.

All those who have worked with Transkribus will know that segmentation, in our case the detection of baselines, is an important issue, because the computer has to know where the text is. And all of you will have experienced the typical problems: lines which are narrow, lines which are very short, table structures which are very complicated, postcards where text is written at different angles, and so on. Actually, during this summer we can say that there was a real breakthrough in research: a colleague of the team from Rostock was very successful with this, and I think it is no exaggeration to say that this is now probably one of the best, or probably the best, layout analysis and baseline detection tools available anywhere. As you can see here, there is only one tiny mistake, one baseline is a bit too short; the rest was detected automatically with the standard model. And here the same with a complicated table structure: nearly all baselines are correctly found by the engine. That really will make a difference. There is still a little bug included, namely that the reading order of the lines is not correct, so you might experience this in the workshop, but that will be fixed probably next week or so when Tobias is back from holidays. What you can see is that there is a real difference, and that is of course extremely important for automated processing.

Then I think we achieved a breakthrough in keyword spotting. We said that is the next big thing, and you will see more details on that later on. But keyword spotting is, simply said, a way to search very effectively directly in the image and not in the transcribed text.
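The essence of keyword spotting can be stated in a few lines: the engine assigns each line image a confidence that the query word occurs in it, and the search result is simply the lines ranked by that confidence. A toy sketch of the ranking step (the scores here are invented; a real engine computes them from the image, never from a transcription):

```python
# Toy sketch of the keyword-spotting ranking step: given per-line
# confidences from the engine, return the hits above a threshold,
# best match first. (Scores invented for illustration.)

def spot_keyword(line_scores, threshold=0.5):
    """line_scores: list of (line_id, confidence) pairs from the engine."""
    hits = [(line_id, conf) for line_id, conf in line_scores if conf >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

scores = [("page1_line3", 0.92), ("page1_line7", 0.31), ("page2_line1", 0.66)]
print(spot_keyword(scores))  # [('page1_line3', 0.92), ('page2_line1', 0.66)]
```

The threshold is the knob that trades recall against precision, which is exactly why keyword spotting can be useful even on material whose full transcription would still be too error-prone.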
That is a new concept for all of us who are coming from the humanities, and it is hard to understand; we had a lot of discussions in the project about how to get this concept across. But you will see the results and you will be surprised.

Then the third key concept was that we want to set up a real service platform. So not a demonstrator, not something which works very nicely for a certain kind of material or a certain collection: we wanted to really provide services. The expert interface of course plays an important role here. The idea behind this expert interface was to integrate all the important things which are necessary to set up a project: importing or uploading files, exporting, processing, measuring the results, user management, and so on. Of course that gets a bit complicated, or complicated is the wrong word, sophisticated: it needs some time to understand all the functionality, but then it is a tool which allows you to do a lot. That will be complemented by a web interface which reduces the complexity and is designed to support users who are mainly interested in transcription; we will show it to you in the evening. And, which is very abstract for non-technical people but conceptually very, very important: all the services in Transkribus are also available via an API, so they can be used by other machines. The machines can talk to each other, can exchange information, can exchange documents, and that is extremely important if we think of larger-scale projects.

Then the fourth concept was the cycle of growth, and for me it was one of the very nice experiences in this project that so many institutions all over the world are interested. I have just listed here some of these memorandum of understanding partners, a lot of them here in this room, and I really am very, very thankful to those who concluded such an agreement with us.
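The machine-to-machine pattern such an API enables is typically asynchronous: a client uploads a document, receives a job id back, and polls until the requested job (segmentation, HTR, ...) is finished. A self-contained toy simulation of that flow; the class and method names are illustrative, not the actual Transkribus REST API:

```python
# Toy in-memory simulation of the asynchronous job pattern behind a
# processing API: upload -> job id -> poll status -> fetch result.
# (Names are illustrative, not the real Transkribus endpoints.)
import itertools

class ProcessingService:
    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, image_name, action):
        """Client uploads an image and requests an action; gets a job id back."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"status": "RUNNING", "result": None}
        # On a real server the work runs in the background; here we finish at once.
        self._jobs[job_id]["status"] = "FINISHED"
        self._jobs[job_id]["result"] = f"{action} done for {image_name}"
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["status"]

    def result(self, job_id):
        return self._jobs[job_id]["result"]

service = ProcessingService()
job = service.submit("page_001.jpg", "baseline-detection")
if service.status(job) == "FINISHED":
    print(service.result(job))  # baseline-detection done for page_001.jpg
```

The point of the pattern is that another institution's repository software can drive the whole pipeline without a human ever opening the expert client, which is what makes large-scale projects feasible.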
It will definitely help us to proceed with the platform, to look for new funding, to come up with new services, and so on. So you see we have institutions from Australia, Austria of course, Canada, Denmark, Finland, France, of course many institutions from Germany, from Greece, Italy, Luxembourg, from the Netherlands. Actually, in the Netherlands it is really fantastic what is going on there with digitisation, with archives that have really understood that they need to open their collections and make them available in digital format, and it also seems that the government has understood this and is willing to invest some money. I think that is really great. Then Norway, Serbia, Spain, Switzerland, which is also a strong country here, and the United Kingdom, where we also have a lot of requests and many interesting projects, and in the United States we have some institutions as well.

Currently there are more than 7,500 users registered in Transkribus; 10 to 15 users are registering every day and downloading the expert client. Not everyone of course then starts to work, but many people are uploading their files and trying it out, and it is always interesting to see that so many people are actually somehow involved in this. Usually we have 20 to 30 users online at any time, someone is always working, and we are of course thinking about how others could benefit from that fact. Among the users, many are family historians, and they have a very real request: they say, I have here an old contract from my father's house and I cannot read this old writing, so is your software able to read this? Currently, in most cases we have to say no, we first have to train a model. We are heading for global models, but that is not yet the case. The other half of these users, I would say, are researchers, archivists and librarians looking for ways to make their collections available and trying it out, and a good portion are students.
Students involved in digital humanities courses, students writing their master's theses. Just 14 days ago we had a student here writing a master's thesis about her grandfather's war diary. She provided just 50 pages, we trained the model, she ran it against the whole diary, and the results were excellent, so it was really nice to see that.

Services: thousands of documents and hundreds of thousands of images have actually been uploaded to the platform. We are currently using three terabytes of storage, but many more terabytes are reserved at our computing centre in Innsbruck. If you look at the job IDs you will see that about 160,000 jobs were completed on the platform, mainly segmentation, HTR and so on. We have now trained hundreds of HTR models. The training is a standard activity, but I would say in 30% of the cases it fails at first; then some parameters have to be changed and it will work. And of course we have answered thousands of emails.

The last key concept we had was: make innovation happen. Innovation is now one of the buzzwords in the university environment and also in the EU Commission. I think there is somehow a connection between research and innovation, but often innovation is not connected directly to research, so we wanted to have a space in the project where new ideas can be developed and might come up. As it is with ideas, of course not every idea is a breakthrough, so it also makes sense to just try things out. One of these ideas was that we said, okay, can we use the documents also to train people, students and volunteers to read historical handwritten documents? And there is now an interface dedicated to self-study where you can try it out yourself.
Then another idea was to use the mobile phone in an archive for taking pictures and sending them to the Transkribus platform, and from this idea the ScanTent actually resulted; you will see it and be able to try it out tomorrow. That is the ScanTent: the idea is that both hands are free. And another idea was that we want to motivate people to look for famous hands, we called it writings of famous people, and to give them the chance to upload these writings, because there is a big community fascinated by handwriting itself. This was something we also developed during the summer; you will see it tomorrow as well.

Okay, so this is what we did, a small part, but the most important issues were mentioned. So where would we like to be? Of course our main objective is that we would like to be the research infrastructure for everyone interested in transcribing, recognising and searching historical documents. Something which will be important, which is at the core of my thinking but still not realised in the way I would like to see it, are the network effects which we can gain due to the fact that we are such a large community and that on a technical level so much synergy can be used. One of these network effects could be, or will be, that isolated models become global models, so that ground truth or training data produced on one side can also be used by other users. That is something we hope to work on, and will work on, in the future, and I am convinced that it will really be one of the success factors for the next years: that you really feel a benefit from others using the platform as well.
Yeah, large-scale demonstrators. Many of you are representing archives and libraries with large collections, and it is nice to work with hundreds or thousands of images, but of course the direction goes towards hundreds of thousands or millions of page images. So that is also on our list: we want to show and demonstrate how the technology can work on large collections. And where we would like to be is that we are able to provide centralised access via keyword spotting to the collections in Transkribus. This will also require that we have the chance to make collections publicly available, and that is a big step, because as you know, currently everyone is working in their own collection, which makes a lot of sense for many projects, but for public collections that also needs to be adapted.

So how will we approach these objectives? Of course the READ project is still running for one and a half years: 2018 will be full power, and the six months in 2019 will be a bit reduced, mainly for rollout and dissemination, but of course there are still resources in the project. Several services are planned for 2019, like the global HTR models or document understanding. Document understanding means that the metadata which are somehow included in the structure of documents can also be exploited for your purposes; table recognition falls under this heading. Large-scale demonstrators and crowdsourcing services are also on our list and should come during the next year. And of course we are involved in a lot of small test projects with many MOU partners, so that is on our list too, and keeping the services running and answering the user requests is a big thing.
So this was the READ project, but the EU Commission also expects us to come up with a sustainability concept, and within that concept a business component will play a role as well. One of the ideas is to come up with a subscription model. That is meant for permanent cooperation and will very likely be a yearly fee which enables users to upload pages and process them. Such a subscription model could look like this, and I am happy to receive your feedback on it; we are currently in the process of discussing it. One of the ideas would be to have all services for free for a certain amount of documents, e.g. 1,000 pages per year or more, depending also on where the images are coming from, so that someone is really able to do test projects and try out all the services, or even do smaller projects, on a free level. When it comes to processing larger amounts of pages, e.g. 10,000, 30,000 or 100,000 per year, a certain yearly fee could be applied, and that would also include special services, such that users are able to train models on their own, or text-to-image matching, or crowdsourcing and so on. Large-scale projects: the subscription model is meant for a permanent connection with Transkribus, but of course there will be projects which need to process large amounts of documents in a relatively short time. That would be based on an API, so that images are transferred via the API, processed, and the results are transferred back, and it would be calculated on a per-page basis. Of course some projects will have requests for special services, e.g. where development or research is involved, and that is something which can also be provided, as long as it does not fall under the standard services which are there anyway.
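The tier idea sketched above can be expressed as a simple lookup. All thresholds and fees below are placeholders for discussion, taken from or invented around the talk's examples (1,000 free pages; tiers at 10,000, 30,000 and 100,000), not actual prices:

```python
# Toy sketch of the proposed subscription idea: a free allowance plus
# yearly volume tiers, with large-scale API projects priced per page.
# All numbers are hypothetical placeholders, not an actual price list.

TIERS = [
    (1_000, 0),        # free level for test and smaller projects
    (10_000, 500),     # hypothetical yearly fee in EUR
    (30_000, 1_200),
    (100_000, 3_000),
]

def yearly_fee(pages_per_year):
    """Return the fee of the smallest tier covering the requested volume."""
    for limit, fee in TIERS:
        if pages_per_year <= limit:
            return fee
    raise ValueError("beyond the largest tier: priced per page via the API")

print(yearly_fee(800))     # 0 -> within the free allowance
print(yearly_fee(25_000))  # 1200
```

The structural point, independent of any concrete numbers, is that the free allowance keeps experimentation open to everyone while the tiers and the per-page API pricing carry the sustainability load.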
We will try to implement the business component in 2018. Don't worry, there will be no surprises for those who are actually working with this, but for the long-term sustainability and for the commitment I believe that this really makes sense, and it will also increase the flexibility with which we can react to requests and special projects.

The third component which will be important for the future is the European Open Science Cloud. Actually, last week the EU Commission published the European Open Science Cloud Declaration. You should have a look at this document; you will find it easily under the heading EU EOSC. It outlines the main principles for research and for research infrastructures in the European community for the next years. It is a very high-level strategic document, but I have to say that at first glance it comes relatively to the point and is not a bad document from my point of view. Many good ideas are in there: open research data, open access, the connection of services with research and research infrastructures, and so on. This is not a project; the European Open Science Cloud shall really be a top-level research infrastructure under which other research infrastructures are working or are connected with it. And that will be important for us, because as I said at the very beginning, we want to become one of the research infrastructures for processing historical images. Okay, here are some citations from the document, but I think I have used my time. What is important is that the infrastructures I mentioned are in there as well, we are one of these infrastructures, and of course billions of euros will be guided into this field. This means that it will offer a number of chances for us to connect with Transkribus: we have this interdisciplinary cooperation, I think we can show that we are driven by real needs and user requests, that we have running services and not just demos, and that standardisation also plays a role in our case.
This means that the more users are working with Transkribus, the higher the chance to demonstrate the need for this research infrastructure on a European level. So once again, thanks for your interest. MOUs, and later on the subscription model, will be a very good way to express commitment and interest. And of course we want to have critical users: we want to learn, and we are thankful for your input and for your critical remarks. Yes, thank you for your attention. Questions? Feel free. Yes, please.

[Question from the audience about making model training available to everyone.]

It can be made available to everyone; the user interface is there, it is simple, and some users are already using it. Why we are a bit careful about this is just that the amount of computing power on our side would be very high if every user could start training, so currently we are waiting for users who ask us. Sorry for that, but last week I ordered GPU servers via the university, and if they agree, things will improve. As far as I know, the Google team, this is OCRopus, is also working in this direction; part of the Google team comes from handwritten text recognition. I heard from one user that they are now also opening their API for handwritten text, but I haven't heard anything more about that. So of course there are other groups, I think there are groups in Paris and in America, but I think it will be hard to find a group which is really able to provide such standard services as you can find here; those are more research prototypes, I would say. But of course more groups are working on these technologies. Okay, yes.