Let's start. This is the second OpenAIRE content providers community call, an online meeting dedicated to all the repository managers, CRIS managers and journal editors that are part of the OpenAIRE infrastructure and contribute content to it: to make you aware of recent developments, to hear your feedback about how you are using our services, and to share the new things we are integrating into our different services. So welcome to this community call; everyone is welcome to participate and to ask questions. The organisation of this call is as follows: in the first five minutes I share some novelties about our content provider dashboard and related services; then we have the main presentation, this time dedicated to the usage statistics service, for about 25 to 30 minutes; and then we have 20 to 25 minutes for your questions, about today's presentation on the usage statistics service or about any other issue you have with OpenAIRE-related services or activities, from the guidelines to the different services we have available. If you have questions during the presentation you can put them in the chat, but you are also welcome to ask using your microphone. This is a participatory call, so after the presentation on the usage statistics please join in, ask your questions and discuss. We have all the channels open to your participation; we don't mute you, so you can simply enable your microphone and put your question.
All the information about these calls (links to the minutes, the agenda and the different sessions) is available on the Provide community calls page, and there is also a short URL for the notes, which André already shared in the chat, where you can put your comments and questions; in those notes you also have access to the presentation and the agenda. Before I give the floor to my colleagues in charge of the usage statistics service, I want to highlight a few things; there are, of course, lots of other activities going on in OpenAIRE. First, the public consultation that we are running until January about the new enriched research graph: please go to the OpenAIRE portal and check the information about it. I think some of you also received it via our newsletter. Then there are three new things related to, and available via, the dashboard for content providers. The first, which I think is quite important, concerns the broker events, the enrichment events available in the content section of the dashboard. We now have new events related to ORCID: we send you metadata enrichments with ORCID iDs that we have found for the authors you are exposing in your repository, and you can use the information we send you via the dashboard to enrich your metadata. So check the broker events in your private area in the dashboard to see whether you have any ORCID-related events; I know that lots of repository managers already have access to this new type of event. Second, about the statistics (my colleagues will certainly talk about this): we just want to highlight that, via the dashboard and via the support information for the usage statistics service, we now also share a new generic tracker script. Complementary to the plugins already available for the EPrints and DSpace software platforms, there is now a generic tracker script to track usage events, so be aware of that and check GitHub and our support pages. The last item is something that came from our community: thanks to the alignment between OpenAIRE and Canada, a set of Canadian universities is supporting the development of extensions to DSpace 5 and 6 to make them compliant with the OpenAIRE literature guidelines version 4, the latest version of the guidelines for repository managers. It is great that this emerged from the community: we needed it for the Canadian case, but we can all benefit from it. By February next year, February 2020, we hope to have the needed developments in place, so those of you running DSpace repositories in version 5 or 6 will be able to comply with the guidelines easily. Those are the three main pieces of information. I also want to highlight that the content provider dashboard has a public roadmap in Trello: check what we are working on, and if you have suggestions or want to provide feedback about specific recent developments, share your thoughts directly in Trello. Of course there are other initiatives; if you want to share news related to OpenAIRE, please do so in the chat or in the second part of our meeting. Now let's focus on our main subject today, the usage statistics service, presented by my colleagues Andreas, Dimitris and Jochen. Andreas and Jochen are from Bielefeld University,
in charge of the aggregation process in OpenAIRE and also responsible for the usage statistics service, while Dimitris, a member of our technical team, is in charge of running the workflows for the data collection of this new OpenAIRE service. I will give the floor to Andreas for the first service overview; let me just make him presenter. OK, Andreas is now presenter. Do you see the screen? Yes? Perfect, the floor is yours. Before Andreas takes the floor, let me say that unfortunately I will need to leave in the middle of the meeting, so I will not manage the second part; my colleagues will manage the discussion around the service and everything else, and my colleague André from the University of Minho is also here to manage the call. Thank you very much for joining, and I hope you can benefit from this presentation about the usage statistics service.

Thank you, Pedro, and thank you all for participating in this call about the usage statistics service. I will do the first and the last part of the presentation, and in the middle I will give the floor to my colleague Dimitris from Athena Research Center. We start with an overview of the infrastructure. On the right side you see our OpenAIRE information space and the services running on it: services to clean up and de-duplicate records coming from different repositories, and services for validating repositories (some of you use these for the guidelines version 4 or the CRIS guidelines, so you can validate your repositories), plus some features for the enrichment of records. The information space is aggregated from publication repositories, research data repositories, CRIS systems and so on, and it produces the OpenAIRE research graph, which links projects, results, publications and so on together. In addition to these information space services there is the usage statistics service, dedicated to tracking usage events, that is, log events from your repository, which can reach the statistics service in different ways; the service in turn produces reports, in different formats, to expose the information it has collected. That is a short overview of the OpenAIRE information space, the OpenAIRE graph and the usage statistics service. Its features are: tracking views and downloads, but also collecting reports in the COUNTER format; the different ways are described later in Dimitris's part. There is the anonymization of IP addresses: you have the possibility to anonymize the IP addresses of your users before sending the information to our usage statistics service, which is relevant to data protection in some areas. Next, as you have seen, we collect from many different repositories, and our deduplication service identifies records that are the same; this deduplication also works for the usage statistics service. Last but not least, the service is compatible with the COUNTER Code of Practice and produces reports in this format. Now we come to the technical part of the usage statistics service in more detail, and I would like to give the floor to Dimitris. Can you make me presenter? Of course, one moment. Hi Dimitris, I think you can now. Sure. Can you see my screen now? Yes. OK. I am Dimitris Pierrakos from Athena Research Center; let me welcome you to our second community call. As Pedro mentioned, I am part of the technical team of OpenAIRE, and I will try to provide some more technical details of the usage statistics service: what the service is about, what it does, how it can be configured, and how you can use it to get useful information about your repository.
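As a small illustration of the IP anonymization feature Andreas mentioned a moment ago: conceptually it masks the trailing bytes of each visitor address before the event is stored. The following is a minimal sketch, not the service's actual code; the octet-masking behaviour simply mirrors the spirit of Matomo's anonymization setting.

```python
from ipaddress import IPv4Address

def anonymize_ipv4(addr: str, masked_bytes: int = 2) -> str:
    """Zero out the last `masked_bytes` octets of an IPv4 address,
    e.g. 192.168.100.50 with 2 masked bytes becomes 192.168.0.0."""
    if not 0 <= masked_bytes <= 3:
        raise ValueError("masked_bytes must be between 0 and 3")
    octets = str(IPv4Address(addr)).split(".")  # validates the address
    kept = octets[: 4 - masked_bytes]
    return ".".join(kept + ["0"] * masked_bytes)
```

With one byte masked only the host part is hidden; with two or three bytes masked, progressively less of the visitor's network is retained, which is the trade-off between analytics detail and data protection.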
This slide presents the architecture and the workflows of the usage statistics service. There are two main workflows, two main strategies let's say, for tracking and collecting usage activity. The first workflow, depicted on the left side of the figure, is based on what we call a push approach. We provide tracking software that operates on the server side, meaning on the repository side, and this tracking software is used to push usage activity (metadata views and item downloads) to OpenAIRE's web analytics platform, which is based on the well-known Matomo web analytics software. The tracking software is installed either as a plugin for EPrints or as a patch for DSpace versions 4, 5 and 6, and it operates in real time, meaning that each usage event is sent to Matomo at the moment it occurs. Recently, as Pedro mentioned, we have developed a more generic tracking approach: we provide repository managers with a Python script that can be configured to parse web server logs, extract the required information (again, only metadata views and item downloads) and subsequently push this information to OpenAIRE's analytics platform. The difference between this script and the plugins and patches is that it operates offline: it can be scheduled to run at any time, but it does not send the usage activity to the Matomo analytics platform in real time. The usage events stored in Matomo are retrieved at a later stage into OpenAIRE's usage statistics database, where they are processed following the COUNTER Code of Practice, currently release 4: we remove double clicks, and we associate item information with our metadata index so that we can handle deduplication and, for instance, extract the usage statistics for the same item on different repositories. This information is subsequently used to generate reports, such as the graphs shown in the dashboard, or COUNTER reports exposed through a SUSHI-Lite endpoint; I will give examples of these reports later in the presentation. So that is the first strategy, the first approach or workflow, that we use to track usage activity.

The second approach, depicted on the right side of the slide, is based on a pull approach. This workflow exploits SUSHI-Lite endpoints provided by aggregators like IRUS-UK. These endpoints allow us to retrieve usage statistics in the form of COUNTER reports that contain usage information. The reports are collected, processed in the usage statistics database using information from our metadata index, and displayed as reports in the portal or, again, exposed through the SUSHI-Lite endpoint. These are, more or less, the two approaches we use to track and collect usage activity.

Just to recap the push (tracking) workflow: after an institutional repository is registered in the OpenAIRE usage statistics service via the content provider dashboard, we provide either the server-side, real-time tracking patches for DSpace and plugins for EPrints, with which the usage activity is tracked and logged at OpenAIRE's analytics platform in real time, or, as a second option, the generic log file parser, a Python script that parses the repository log files and sends usage events to OpenAIRE's analytics platform, not in real time but offline, at a later point. A workflow based on the Matomo API then transfers and stores the usage events in the OpenAIRE database for statistical analysis, deduplication, removal of double clicks and so on, and the statistics are then deployed for human consumption in OpenAIRE's portal and for machine consumption via the SUSHI-Lite API endpoint.
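Conceptually, each tracked event in the push workflow is one HTTP request to the Matomo Tracking API. The sketch below builds such a request; the endpoint, site ID and token are placeholders, not OpenAIRE's real values, which the plugins read from their configuration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real analytics platform URL, site id and
# authentication token are supplied via the content provider dashboard.
MATOMO_ENDPOINT = "https://analytics.example.org/matomo.php"

def build_tracking_request(site_id: int, token: str, page_url: str,
                           action_name: str, client_ip: str) -> str:
    """Build a Matomo HTTP Tracking API request URL for one usage event."""
    params = {
        "idsite": site_id,        # the repository's site id in Matomo
        "rec": 1,                 # required flag: record this request
        "url": page_url,          # item page or bitstream URL that was hit
        "action_name": action_name,
        "cip": client_ip,         # override visitor IP (needs token_auth)
        "token_auth": token,
    }
    return MATOMO_ENDPOINT + "?" + urlencode(params)
```

In the real plugins and patches the request is fired server-side the moment a view or download happens, which is what makes the push workflow real-time; the generic script fires the same kind of requests later, from the parsed logs.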
To enable the metrics and access the usage statistics service for a repository, we provide in the providers dashboard the information on what you have to do to enable the service. The repository manager has to download the tracking code for their repository platform, configure it according to the instructions provided, and deploy it. On the OpenAIRE side, we validate the installation of the tracking code by confirming that we are receiving usage activity, and we inform the repository manager accordingly that everything is OK. By clicking the "enable metrics" button, the repository manager receives information on where to download the code for their platform, for example as a patch for DSpace, as a plugin for EPrints, or as the generic script, and is also provided with a Piwik site ID and an authentication token, which have to be configured in order to allow the tracking of the usage activity.

These are the more technical details of the configuration files supplied with the patches and plugins for DSpace and EPrints. You can see that for both platforms you have to specify the Matomo endpoint, which corresponds to the endpoint of the analytics platform, and you have to configure the Piwik site ID and the authentication token. Piwik and Matomo are the same thing: Piwik has recently been renamed to Matomo, but the configuration files in the platforms still use the old name, Piwik; I don't know when that will be changed. We also provide the option to anonymize the IP addresses in the tracked usage activity: you can specify how many bytes will be hidden, one, two or three, or you can skip it completely and deliver the usage activity without IP anonymization. You also have the option to enable or disable the tracking, for instance if you have technical issues or you want to update your software. The same configuration is required for the EPrints plugin: you have to specify the Piwik tracker location, the Piwik site ID, and so on.

For the generic tracker script configuration, somewhat more technical skills are required. You have to create a Python virtual environment for the project and activate it. You have to download the COUNTER robots list, which specifies a set of user agents that correspond to non-legitimate traffic and therefore have to be ignored by the script. You then download the Python script and the configuration file, configure the Matomo parameters in the matomo.yaml file, install the PyYAML package if required, and simply run the script, or schedule it to run whenever it suits you, to parse the logs and send the events to our analytics platform. In the generic tracker script configuration file you again have to specify the endpoint, the repository site ID and the authentication token, plus some parameters for Matomo. The important part is on the right side of the slide: you have to specify the OAI-PMH preamble of your repository, and the metadata location and item location patterns, in order to tell the script what kind of information it needs to parse. For example, for DSpace you have to tell the script that metadata item pages are located under one path prefix while downloads are located under another, and regular expressions are supported, to allow the script to parse the logs and send the usage activity to the platform. All of this refers to the first workflow, the push workflow.
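To make the log-parsing step concrete, here is a toy sketch of the kind of classification the generic tracker performs. The log format, path patterns and robots list are simplified assumptions (the patterns are modeled on DSpace's handle and bitstream URLs), not the actual script's code.

```python
import re

# Combined log format (Apache/Nginx default), simplified.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical location patterns in the spirit of matomo.yaml:
# item metadata pages under /handle/, file downloads under /bitstream/.
METADATA_RE = re.compile(r"^/handle/\d+/\d+$")
DOWNLOAD_RE = re.compile(r"^/bitstream/")

ROBOTS = ("bot", "crawler", "spider")  # stand-in for the COUNTER robots list

def parse_line(line: str):
    """Classify one access-log line as a view or download event, or None."""
    m = LOG_RE.match(line)
    if not m or m.group("status") != "200":
        return None
    if any(r in m.group("agent").lower() for r in ROBOTS):
        return None  # non-legitimate traffic per the robots list
    path = m.group("path")
    if METADATA_RE.match(path):
        return ("view", m.group("ip"), path)
    if DOWNLOAD_RE.match(path):
        return ("download", m.group("ip"), path)
    return None
```

Each event this yields would then be pushed to the analytics platform via the Matomo API, which is why the script can run on any schedule: the timestamps come from the log, not from the moment of submission.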
For collecting consolidated usage statistics reports, which is the pull workflow, we simply gather consolidated statistics from aggregation services such as IRUS-UK using the SUSHI-Lite protocol. The statistics are stored in OpenAIRE's database for analysis and then deployed via OpenAIRE's portal or the SUSHI-Lite API. For now we have 61 repositories tracked with Matomo, 78 repositories whose statistics are collected from the IRUS-UK SUSHI-Lite endpoints, and 10 SARC OJS journals from Portugal whose statistics are also collected using the SUSHI-Lite endpoints they provide.

This is the evolution of our metrics service: you can see how the Matomo traffic has evolved over the last two years and how many visits we have had. You can also see some statistics for the last year in terms of views and downloads: in total, almost six million downloads have been collected in our platform. At the top right you can see the IRUS-UK, OpenAIRE and SARC OJS downloads that have been collected, and at the bottom right the number of views collected by Matomo using the push approach. Note that for now we collect views only through the Matomo tracking software, because IRUS-UK does not provide views yet; with COUNTER release 5, I think, they will also include information about metadata views.

This is a snapshot of the user interface in the content provider dashboard, where you can see the views and downloads for a particular repository, in this case the repository of the University of Minho, displayed for each month. This is another snapshot from the dashboard, where you can see, for a particular item, the views and downloads tracked in two different repositories: for this item we have a number of views in the UTL repository and a number of views in the University of Minho repository, and likewise the number of downloads in each. This slide illustrates the functionality we discussed earlier, the deduplication of information.

We also mentioned that we provide SUSHI-Lite reports; the supported reports are compliant with COUNTER release 4. We provide the reports AR1, IR1, RR1, BR1 and BR2, that is, the number of successful article download requests by month in the repository, the number of successful item download requests by month in the repository, and so on. They are provided through this endpoint: you connect there, specify which report you want, and download it as a JSON file. The response can be requested either for a whole repository or for a particular item, so you can use our SUSHI-Lite endpoints to get usage statistics at either level. That is more or less all from me regarding the technical details of the service; I will give the floor to Andreas to discuss the next steps and close the talk.

Thank you, Dimitris, for the detailed information about the different plugins and components that we have. From my side there are two more slides, presenting what we see as the next steps. The first: you have seen in Dimitris's presentation some screenshots of our upcoming new provider dashboard, which we are currently working on, and in it we are improving the visualization of events. The second is that we would like to offer a short code snippet for your repository software, to embed the
usage statistics from our analytics service in your repository. If you use the snippet in your repository software and embed it in the page of an article, the snippet will automatically receive the usage statistics for that record and show the numbers from OpenAIRE's analytics service. We will also support the COUNTER Code of Practice for research data; we are working on that, along with updating from release 4 of the COUNTER Code of Practice to release 5. The last point is dedicated to infrastructures such as La Referencia or IRUS-UK: setting up and using a usage statistics hub to exchange information about usage statistics between infrastructures. So these are the next steps. In the coming year we would also like to extend the usage statistics through cooperation with other projects, in this example OpenAPC and the open analytics service from Knowledge Unlatched, and we would like to discuss a layer of open research analytics services built on these different sources, to get a better understanding of the transformation to open access in the European countries and all over the world. That is the future we are working on at the moment. With that I would like to close, and I thank you.
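To give a feel for how the SUSHI-Lite JSON reports discussed in the presentation can be consumed, here is a minimal sketch in Python. The response structure shown is a trimmed illustration in the spirit of COUNTER release 4 over SUSHI-Lite; the exact field names, metric types and endpoint of OpenAIRE's service are assumptions here, so check the service documentation for the real schema.

```python
# A trimmed, illustrative SUSHI-Lite style response for one item.
sample_response = {
    "ReportResponse": {
        "Report": [{
            "Customer": [{
                "ReportItems": [{
                    "ItemIdentifier": [
                        {"Type": "OAI", "Value": "oai:repo.example.org:123"}
                    ],
                    "ItemPerformance": [
                        {"Period": {"Begin": "2019-10-01", "End": "2019-10-31"},
                         "Instance": [{"MetricType": "ft_total", "Count": "42"}]},
                        {"Period": {"Begin": "2019-11-01", "End": "2019-11-30"},
                         "Instance": [{"MetricType": "ft_total", "Count": "58"}]},
                    ],
                }],
            }],
        }],
    }
}

def total_downloads(response: dict) -> int:
    """Sum the monthly full-text download counts in an IR1-style report."""
    total = 0
    for report in response["ReportResponse"]["Report"]:
        for customer in report["Customer"]:
            for item in customer["ReportItems"]:
                for perf in item["ItemPerformance"]:
                    for inst in perf["Instance"]:
                        if inst["MetricType"] == "ft_total":
                            total += int(inst["Count"])
    return total
```

A repository-level report has the same shape with more ReportItems, so the same traversal works whether the request was made for a whole repository or for a single item.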