You can always come to the front. Nevertheless, it's a pleasure to be here. CNI has always been a very interesting meeting. My name is Jan Brase, and I'm from the Göttingen State and University Library. We're here because the German Research Foundation is a CNI member and has the fantastic program of sending a German project to each CNI meeting to present what it is doing, to exchange ideas and to strengthen the cooperation and coordination between US and German projects. So I'm here to talk a bit about the Discuss Data project. Let me first start with a brief introduction of who we are. Göttingen University was founded in 1736 and is one of the larger universities in Germany, with around 30,000 students, located in a city of about 80,000 people. So Göttingen is a very nice, very small, classical German university city. The University Library of Göttingen is actually two years older than the university: it was founded in 1734, is one of the largest in Germany, and was actually the first science and research library in the world, that is, the first library that started collecting books according to a scientific acquisition method. I'm the head of the Research and Development Department. By the way, I apologize that my director, Wolfram Horstmann, cannot be here today. He had also planned to give this presentation with me but had to leave due to colliding events. The Research and Development Department is the largest research and development department of any library in Germany. We currently have around 30 people working in 14 to 15 third-party funded projects; more than 80% of our staff is financed through third-party funding. We just had our 15th anniversary, because we were founded in 2004. Our general mission is to take a close look at how the digital transition changes the way we work with information.
So basically, we look at all the different aspects of virtual research environments that help scientists in their everyday research and everyday work; we look at services for data centers, at services for exchanging data between the environments and the centers, and at all aspects of the underlying infrastructure. The Discuss Data project is one example of what we do. The general idea is to establish a repository for research data on the post-Soviet region, so that scientists who discuss and research the post-Soviet region have a central repository where they can access all of their data types. This data includes census information, information from media articles, and any other kind of material, up to videos or pictures. The basic idea when we started this project was that we wanted to establish a close cooperation with the scholarly community. So the repository should include not only the data, but also context information about each data set and information about the methodology of its collection. And it should also include interactive ways for researchers to discuss the data; that is where the name Discuss Data comes from: an interactive platform for researchers to talk about data sets and also evaluate them. The partners, apart from the Göttingen State and University Library, are the Research Center for East European Studies at Bremen University, which follows from the fact that from the beginning we wanted to work in close cooperation with the scientific community. The funding comes from the German Research Foundation, and it is a three-year project that will end in December 2019. As I said earlier, the basic idea is a community-centered approach to research data management, because we believe that for this kind of data, the scholars themselves are the best experts on data quality.
Especially with such heterogeneous data in such a specific field as post-Soviet studies, the scholars are the ones who can actually tell you something about the quality of the data and are the best people to evaluate the data itself. This also follows the idea that in this case a small-scope repository works better than a larger one. So we pick a very specific scientific field and build a tailor-made repository for the needs of this specific discipline, instead of having a bigger repository that serves many different disciplines. The basic idea was also to see whether it is possible to build an interactive solution, that is, to allow the researchers to use the repository as a platform for discussion and for ongoing, interactive evaluation of the data. And last but not least, we wanted to make use of existing infrastructures: instead of building everything up from scratch, we tried to use existing solutions, especially from our department or from our community, as often as possible. And finally, although we are very strong believers in open access, the sometimes sensitive nature of the data that we are looking at requires certain restrictions on accessing the information. So the idea of Discuss Data is, as I said earlier, that we not only want to publish the data and make it available, but also want to link it with the documentation of the survey and the methodology. We want to establish a single point for evaluation of the data and give researchers tools for interactive evaluation, contextualization, and discussion, including social media functionalities, to enable an interactive exchange between the scientists and also to strengthen the community building of this small, specific community of post-Soviet studies. So it is, as I said, a specialized platform.
We do not look at the complete discipline of the social sciences, but specifically at this sub-discipline, to address specific topics and to have the chance to include the scientists in the design of the platform from the beginning, so that we can actually tailor a platform that specifically addresses the users of this community. And the final idea is that, in the long term, we want to make this platform so easy and adaptive to use that after the initial build-up phase, the curators and the scientists, the domain experts themselves, can take over the long-term curation and moderation of the platform. So we allow the researchers to upload the data; the data comes from the community. There is, of course, a peer-review process: the data is peer-reviewed by the domain specialists themselves to ensure its quality. For the storage of the publications that accompany the data sets, we rely on existing infrastructure such as the DARIAH-DE repository. DARIAH is a European project providing infrastructure for the digital humanities, which also covers the publication of texts and data sets. So a publication on the data gets a DOI through DARIAH, the data sets can be commented on, and in a future version we also aim at allowing DOI registration of selected comments that accompany the data sets. And last but not least, we also rely on existing authentication infrastructures: DARIAH provides a Shibboleth-based authentication and authorization infrastructure that we use to support the rights management and the privacy of the material. So in a nutshell, these are the features of the Discuss Data services: we have evaluation and annotation functions such as discussion and peer review of the data, linking the data to publications and the underlying documents, user rating of the data sets, and tagging of the data.
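As a hedged sketch of what the DOI registration for a data set publication could look like: DOI registration services commonly follow the DataCite metadata format, so the payload below is modeled on the DataCite REST API. This is purely illustrative; the actual registration in the project is handled by the DARIAH-DE repository, and the DOI, title, and credentials shown here are assumptions, not the project's real configuration.

```python
import json


def build_datacite_payload(doi, title, creators, publication_year):
    """Assemble a minimal DataCite-style JSON:API payload for a data set DOI.

    All values are illustrative; in Discuss Data the registration itself
    is delegated to the DARIAH-DE repository infrastructure.
    """
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "doi": doi,
                "titles": [{"title": title}],
                "creators": [{"name": name} for name in creators],
                "publicationYear": publication_year,
                "types": {"resourceTypeGeneral": "Dataset"},
            },
        }
    }


payload = build_datacite_payload(
    "10.5072/example-caspian-media",  # 10.5072 is the reserved test prefix
    "Media reporting on export pipelines in the Caspian region, 1998-2011",
    ["Example Author"],
    2019,
)
# The actual registration would then be one authenticated HTTP call, e.g.:
# requests.post("https://api.datacite.org/dois", json=payload, auth=(user, pw))
print(json.dumps(payload, indent=2))
```

The point of the sketch is that DOI minting is a thin layer: once a repository stores the metadata, registration reduces to building one JSON document and sending it to the registration agency.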
We provide social media features such as sharing of content, dashboard functionality, notification mechanisms, and a researcher network. The documentation includes best-practice documentation and guidelines for using the system, and of course we have core functionalities: account management, metadata, and data ingest. All of this is done in cooperation with external services, and from the beginning there was a clear focus on the community, especially on curation through the community. In the project infrastructure, we have a clear link to existing infrastructures such as the DARIAH-DE repository, which is responsible for the storage of the publications that accompany the data, for the DOI registration of any content, and for authentication and authorization. We can also include external data repositories, so data sets can link to other resources outside of our system, and the data collection, metadata, and data integration can be part of the system or can partly reside outside of it. Although we started only last year and are about halfway through, we have established a prototype of the system. It is still an internal alpha release, but I'm happy to show you some slides and screenshots of how the system will look. The first open beta will be released at the end of this year, and we will launch the repository itself next year to evaluate it with the community. So these are just some previews of the final system, and again I apologize for the size of the slides; if you're not able to see them, the talk was originally intended for a much more comfy atmosphere in a smaller room, but you might still get an idea. It's a classical infrastructure that allows users to log in, have a profile, and have their individual pages. This is an example of a page for a data set.
You can see that there is a title and a summary, of course, and alongside the metadata there is the description of the data set. This example is an analysis of how media reported on export pipelines in the Caspian region: it looks at pipelines for gas and oil that ran through the Caspian region and how media reported on these pipelines between 1998 and 2011. So this is not the media articles themselves, but an analysis of the media articles, looking at the topics they cover and at how the reports differ depending on whether a country has a more or less free press. And of course there are some tags describing the data, there is author information, and the page also gives you some context information about the data set. And this is a screenshot of a discussion page. This data set actually has a discussion attached to it, where different logged-in users can comment on the data set, evaluate it, rate it, and highlight specific aspects of the data that are worth mentioning. This was a specific request by the community: to allow them to have an active discussion on top of the data sets. So what are our challenges, and what are our own evaluation criteria? We started with a three-year project, and the idea is that at the end of these three years we will have a final system that the scientists can use and whose operation they can take over independently, building on the basic platform that we give to them. So we of course rely very heavily on community involvement and on the willingness to exchange data. Since the community was heavily involved in the project from the beginning, we are very positive that this project and the final service will be carried by the community. There has already been a lot of community involvement, and the very small but very specific community of researchers looking at the post-Soviet region has a big interest in working with this platform.
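A discussion page like the one described above can be thought of as a list of comments, each optionally carrying a rating, attached to one data set. The following is a minimal sketch of such a model; the class and field names are my own assumptions for illustration, not the project's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Comment:
    """One contribution to a data set's discussion thread."""
    author: str
    text: str
    rating: Optional[int] = None  # e.g. 1-5, not every comment rates the data


@dataclass
class DataSetDiscussion:
    """A discussion thread attached to a single data set."""
    dataset_title: str
    comments: List[Comment] = field(default_factory=list)

    def add_comment(self, author, text, rating=None):
        self.comments.append(Comment(author, text, rating))

    def average_rating(self):
        """Aggregate the ratings users attached to their comments."""
        ratings = [c.rating for c in self.comments if c.rating is not None]
        return sum(ratings) / len(ratings) if ratings else None


discussion = DataSetDiscussion("Caspian pipeline media analysis, 1998-2011")
discussion.add_comment("user_a", "Coding scheme is well documented.", rating=5)
discussion.add_comment("user_b", "Sampling for 1998-2001 is unclear.", rating=3)
print(discussion.average_rating())  # → 4.0
```

The design choice worth noting is that the rating lives on the comment, not on the data set: the aggregate score is always derivable from the discussion, so the evaluation stays transparent and attributable to individual reviewers.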
Nevertheless, in the end it is the community's task to provide long-term curation, so we also need the willingness of the community to contribute additional work once the platform is ready. For the data integration, we are currently looking at the technical implementation and trying to write best-practice guidelines for data suppliers and users. Still, without a critical mass of data the community will not use the platform, and if the community does not use it, we won't have a critical mass of data. So we hope to have the system ready at the end of the year, to start filling it next year with more and more data from the community, and hopefully to engage the scholars using the system in more and more interaction. As I said earlier, we have a pretty tight schedule: we have three years for the development of the platform, and at the end we will hand it over to the community to establish its further use. Nevertheless, the strength is the community involvement from the beginning, and that is why we spent the first year actually designing the platform in close cooperation with the community: we worked on a user-oriented design of the interfaces and workflows based on the needs of this specific community of scholars. Our idea of sustainability is that after these three years of initial platform development, the community should be able to operate the system on its own, with continuous cooperation of the project partners.
We also have to look at challenges like ownership, copyright, and data privacy. The current approach is that we have authentication and also access restrictions to sensitive information, because sometimes, especially with post-Soviet era data, the data can be sensitive in nature: you may not be able to openly publish who the sources are that gave you this type of information, and you sometimes want to anonymize names that appear in the data sets in the service. So there need to be access restrictions and anonymization on top of the data sets. What we achieved, and wanted to achieve, is to create a platform for a specific sub-community, a very small, very specialized community of researchers of the post-Soviet region, so that they have their own platform for their own data sets and from the beginning had the chance to be actively involved in designing it. We also try to provide strong support for large data sets and institutional collections, because there already exists a number of collections that we want to integrate into this infrastructure where needed; sometimes we just provide a close connection, for example by assigning DOI identifiers to data sets, so that it is easy to integrate the data from existing infrastructures without physically moving it. We also want to provide a platform that gives special support for data integration and visualization of the data. These are parts that we haven't actually touched in detail so far, but we are positive that we will have some support for that.
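One common pattern for the anonymization step mentioned above is to replace source names with stable pseudonyms before publication, so that identities are hidden but links between records are preserved. The sketch below illustrates that pattern; the function and field names are illustrative assumptions, not the project's actual implementation.

```python
def pseudonymize(records, sensitive_field="source_name"):
    """Replace values of a sensitive field with stable pseudonyms.

    The same real name always maps to the same pseudonym, so connections
    between records survive anonymization while identities do not.
    Returns the cleaned records plus the name-to-pseudonym mapping,
    which would be kept under access restrictions, never published.
    """
    mapping = {}
    cleaned_records = []
    for record in records:
        cleaned = dict(record)  # leave the original record untouched
        name = cleaned.get(sensitive_field)
        if name is not None:
            if name not in mapping:
                mapping[name] = f"Source-{len(mapping) + 1:03d}"
            cleaned[sensitive_field] = mapping[name]
        cleaned_records.append(cleaned)
    return cleaned_records, mapping


records = [
    {"source_name": "Ivan Petrov", "statement": "..."},
    {"source_name": "Anna K.", "statement": "..."},
    {"source_name": "Ivan Petrov", "statement": "..."},
]
cleaned, mapping = pseudonymize(records)
print([r["source_name"] for r in cleaned])
# → ['Source-001', 'Source-002', 'Source-001']
```

Note that pseudonymization alone is not full anonymization: if the statements themselves identify a person, access restrictions on the data set are still needed, which is exactly why the platform combines both mechanisms.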
And last but not least, this is an interesting, very specialized example of active community engagement in a specific area, with the strength of actively providing interactive features for the evaluation and rating of data sets, based on our experience and on the needs of this specialized community. We're happy to share our thoughts with you, happy to learn from your ideas, and curious to see if you have examples of comparable infrastructures. So I think that was already my presentation, and I believe I'm on time, so we now have 10 minutes for questions.