about this in the afternoon. And it's my pleasure to invite our next speaker, Nikos Houssos, from the National Documentation Centre in Greece. And he'll talk about the ENGAGE project, which is an infrastructure for open linked governmental data provision towards research communities and citizens. Thank you very much. Good morning. So I'm here with, let's say, two hats. First of all, and mainly, I'm a software development manager at EKT and a board member of euroCRIS since 2009. And actually, since last month, I have undertaken at euroCRIS a new role, which is to be responsible for a task group called CRIS Architecture and Development. So under both these hats, I'm involved in OpenAIREplus, especially in the design of the data model and its alignment with CERIF. However, this presentation concerns another activity, another project. And for this presentation, I will wear mainly the hat of euroCRIS board member. So I will talk about ENGAGE, which is a project of the FP7 Research Infrastructures Programme, and it concerns an infrastructure for open public sector information, and especially data sets targeted mainly towards research communities, but also citizens. So it's an important initiative, which is complementary to the activities concerning open access to research information. And in this presentation, I will try to provide an overview. And my main focus will be to identify some key points and key issues where collaboration would make sense between the ENGAGE effort, concerning public sector information, and the OpenAIREplus effort, which focuses on research information. So a very short overview of ENGAGE, some factual data. It's a project, as I said before, of the Research Infrastructures Programme of FP7. It aims to build an infrastructure for open linked governmental data; both the open and the linked terms are very significant there. It has been running for one and a half years now, and it ends in May 2014. And the coordinator is the National Technical University of Athens. 
There is a range of partners from the academic and the commercial sector, and euroCRIS is also involved in this project, mainly in the area of metadata specification and interoperability. So let's see what we mean by public sector information. It's simply data produced by governmental organizations. It typically refers to data sets, and one can think of many examples like geospatial data, demographic data, statistical, environmental, financial, public safety, and so on. As probably all of us know, there is a growing international movement towards open access to public sector information. It's a worldwide movement, which has grown significantly during the last years. And the reason for that, besides the intuition that openness is good for the data, is that there are also some quite practical financial motivations behind it. So there have been several studies on the opening up of data sets and on the impact of opening up governmental information. And these studies have shown that openness can really lead to substantial economic gains. So a recent study published by the European Commission in 2011 estimated the potential benefits at 40 billion euros per year across the EU countries as direct benefits of open data, and 140 billion euros as indirect benefits. So there is really an important economic and wealth creation motivation behind the opening up of public sector information. So just to provide an overview of the objectives of ENGAGE, the aim is to build a real-life system. It's not a research project or a project aiming at a prototype. So we need to build an infrastructure capable of opening up public sector information data sets, capable also of supporting collaboration between users of this information. And we'll talk about that in a little bit more detail later. 
With a specific target to scientific communities, especially in the area of the social sciences and humanities, where the data that is provided and produced by governments is probably particularly useful. So ENGAGE has, as its main objective, a two-way scenario where public sector information will be collected, will be curated, and improved also using the efforts of people probably not involved in the public administration, so through a form of crowdsourcing, which can be really valuable. And then through this curation, one can get very good contextual information about the data sets, and better quality for the data sets themselves. And then this can be the enabler of providing very advanced services on top of the PSI information. Also, another dimension of the usage of ENGAGE is to deliver open data requirements and guidelines to the governmental organizations. So provide feedback to the governmental organizations and guide them towards publishing their information in an appropriate way that promotes easy reuse and the maximum utilization of the huge potential of public sector information. So if we want to have a schematic overview of ENGAGE: there is a huge heterogeneity of public sector information sources. Some of them are structured, some are semi-structured, some are maybe totally unstructured. So ENGAGE would like to traverse all this information, provide a single point of access, and also a single point of collaboration, which is a very important notion in the project. A key objective of ENGAGE is not to be an isolated data silo, but to be a vital part of the global linked open data space. So the idea is not to provide a single system, a gateway to public sector information, just to gather people there, but to be a part of the emerging linked data paradigm and enable the linking of information to other sources, which really unleashes the potential of this information. 
And we will talk later about how this is addressed at the technical and the interoperability level. So in the interest of time, I will probably proceed a little bit faster. So just to, let's say, identify a couple of key issues and key messages from the ENGAGE approach. One important finding in the course of ENGAGE is that what we need for public sector information is high-quality and rich information regarding the data sets. Both the data sets and the information about them can be improved heavily through the contribution of people outside the public administrations, for example, researchers, citizens, scientists that might not be the original producers of the information, but have the knowledge and probably the motivation to help in improving the data, for example, by using it for their own purposes. So in ENGAGE there is the capability for somebody to derive a data set from an existing one, improve it, and upload it back to the platform to make it available to other users of ENGAGE. So this is a really important part. And this is assisted in the platform by Web 2.0 collaboration features. So the idea is to attract also groups of users that span administrative boundaries, country boundaries, and so on. People that are interested in a specific subject, in a specific category of data sets, can set up their own groups and work together, collaborate to improve these data sets, combine them to produce higher-level information that can be used for value-added services, make the most out of the information, and also contribute back to the platform. So a really crucial, let's say, defining or characterizing factor of ENGAGE is the emphasis on the high quality of information, on the availability of rich metadata that enables easy discovery and reuse of information and also value-added services. 
And the ability and the, let's say, effort to utilize the capabilities of users in research communities, scientific communities, but also the wider audience of citizens that can, let's say, improve the information. So we do not just aggregate information from a large pool of public sector information data sources, but we try to provide the tools to improve the information, to attract public sector organizations to deliver information directly to us, and other communities to contribute there. So this is an important fact and an important, let's say, effort within ENGAGE. So this is a platform that is now being built. It will be available in spring 2013. There is a prototype platform that is already available, but it does not have the full features that we have planned and are developing. So in the next few months, this will be available. The functions and features for the users of the platform are mainly focused, as I said before, on extending, improving, and deriving data sets from existing ones. And for that reason, we have also incorporated some features about versioning, about seeing the history and the provenance information of each data set within the platform. And we have done some considerable work to facilitate data improvement, data cleansing, data visualization, not reinventing the wheel but trying to connect with already existing common tools that are best in these areas. We encourage collaboration, for example, through data requests: if a user does not find a data set in ENGAGE, they can ask for it, or for some improved version of a data set. So again, there we try to build a community of users helping each other with data sets. And we try to engage as much as possible the providers of information, the governmental organizations, both by guiding them and providing them information on how to publish their data sets, but also by trying to attract them to be part of the community, to form groups of employees that can work together. 
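To make the versioning and provenance idea above concrete, here is a minimal sketch of how a derived data set might point back to the data set it came from, so that the full history of a crowd-improved data set can be reconstructed. All names and identifiers here are hypothetical illustrations, not the actual ENGAGE data model.

```python
# Hypothetical records: each derived data set keeps a "derived_from"
# link back to its source, forming a provenance chain.
datasets = {
    "ds-1": {"title": "Air quality 2012 (raw)", "derived_from": None},
    "ds-2": {"title": "Air quality 2012 (cleaned)", "derived_from": "ds-1"},
    "ds-3": {"title": "Air quality 2012 (geocoded)", "derived_from": "ds-2"},
}

def provenance_chain(dataset_id):
    """Walk the derived_from links back to the original upload."""
    chain = []
    while dataset_id is not None:
        chain.append(dataset_id)
        dataset_id = datasets[dataset_id]["derived_from"]
    return chain

print(provenance_chain("ds-3"))  # ['ds-3', 'ds-2', 'ds-1']
```

A real platform would of course store timestamps, contributors, and change descriptions alongside each link; the chain walk is the essential mechanism.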
Probably form groups of employees across organizations, which is probably an idea with huge potential: different groups from different parts of the administration working together and combining their data sets to provide some higher-level information that is meaningful also to the government itself. So all these features are in the platform. And just to go briefly into the technical stuff and the interoperability stuff, which has some implications also on collaboration opportunities, we found out that rich and structured metadata is really important to enable linked open data. That is really useful for the ecosystem of ENGAGE and the public sector information data sets to be part of the global linked data infrastructure. So a couple of important points there. The metadata model that is used needs to be structured. So entities and semantic relationships are used instead of plain data fields, or mostly instead of plain data fields. So each entity, each organization, each person, has structured metadata, including a URI field to uniquely identify it. And instead of having fields like author, maintainer, or whatever, that contain mere plain strings, you have separate entities. And the role of each entity in a particular relationship is defined separately through semantics. So there is a record for an organization that can be the creator of one data set, the maintainer of another, the commissioner of a range of other data sets. So this information is explicitly recorded in the metadata model. And there is the ability to dynamically include new vocabularies in the system, not hard-coded into the system, which makes a lot of sense for linked data. So, let's say, the predicate that is used to link information can belong to a vocabulary that is initially external to ENGAGE but can be reused there. For this to become possible, we have used CERIF, the entities and the semantic layer, which provide the required features for producing high-quality linked open data. 
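The entity-plus-semantics idea described above can be sketched in a few lines. This is an illustrative toy, not the CERIF schema itself: organizations and data sets are separate records identified by URIs, and roles such as "creator" or "maintainer" live in typed links drawn from an extensible vocabulary, rather than in plain string fields on the data set record.

```python
# Extensible role vocabulary: new predicates can be added dynamically,
# they are not hard-coded into the record structure.
ROLE_VOCABULARY = {"creator", "maintainer", "commissioner"}

# Each entity is a separate record with its own URI.
entities = {
    "urn:org:stats-agency": {"type": "Organisation", "name": "National Statistics Agency"},
    "urn:ds:census-2011":   {"type": "Dataset", "title": "2011 Census"},
    "urn:ds:budget-2012":   {"type": "Dataset", "title": "2012 Budget"},
}

# The role of an entity is defined per relationship, so the same
# organisation can be the creator of one data set and the maintainer
# of another.
links = [
    ("urn:org:stats-agency", "creator",    "urn:ds:census-2011"),
    ("urn:org:stats-agency", "maintainer", "urn:ds:budget-2012"),
]

def roles_of(entity_uri):
    """Return the (role, target) pairs recorded for an entity."""
    return [(role, target) for source, role, target in links
            if source == entity_uri and role in ROLE_VOCABULARY]

print(roles_of("urn:org:stats-agency"))
# [('creator', 'urn:ds:census-2011'), ('maintainer', 'urn:ds:budget-2012')]
```

The design point is that the string "National Statistics Agency" is stored once, on its own record, and every relationship to it is an explicit, semantically typed link that can later be exported as RDF triples.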
Of course, having the right structure and the right metadata model is not enough. A lot of work needs to be done for assigning identifiers and all that stuff. But this is a necessary condition, we believe, to provide linked open data. And this is very important for allowing users to discover data sets, to evaluate their utility and their reuse potential, and actually to reuse them, which is the essence of all this activity. So going deeper into the technical level, we have adopted a three-level metadata approach. I won't go very much into detail. So we have three levels: the discovery metadata at level one, the contextual metadata at level two, and at level three the domain metadata. So we have seen that in data sets there are vertical domains, which each have their own standards. So you probably cannot create a very detailed domain-independent model, but at level two one can create a metadata model which is common across domains and can adequately capture the semantics of the relationships of different entities with each other, which is the crucial thing for interoperability. So this is a depiction of the architecture in more detail. We use CERIF as the level-two metadata model, which is central to the architecture. The idea is that CERIF has the expressive capability to represent any type of information that we can find in the public sector information data sources, at least the domain-independent information. And from CERIF we can import and export other standards that are common and that are necessary to be supported, like CKAN, e-GMS, Dublin Core, DCAT, and so on. And we link from CERIF to the more detailed metadata standards, which can be DDI, for example, in the social sciences, SDMX for statistical data, INSPIRE for geospatial data, and many others. So this architecture enables interoperability, and this can be used also for the exchange of information with systems like OpenAIREplus. 
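The hub-and-spoke pattern described above, with one rich entity-based record at the center and flatter formats derived from it on export, can be illustrated with a small sketch. The record layout and field names below are hypothetical, chosen only to show the idea of generating a Dublin Core-style view from a richer CERIF-like record while passing domain (level-three) metadata through untouched.

```python
# Hypothetical level-2 record: structured links with explicit roles,
# plus level-3 domain metadata kept as an opaque payload.
level2_record = {
    "uri": "urn:ds:census-2011",
    "title": "2011 Census",
    "links": [
        {"role": "creator",
         "entity": {"uri": "urn:org:stats-agency",
                    "name": "National Statistics Agency"}},
    ],
    # Level-3 domain metadata (e.g. SDMX for statistics) is referenced,
    # not flattened, since each vertical domain has its own standard.
    "domain_metadata": {"standard": "SDMX", "payload": "..."},
}

def to_dublin_core(record):
    """Export a flat Dublin Core-style view from the structured record."""
    creators = [link["entity"]["name"] for link in record["links"]
                if link["role"] == "creator"]
    return {"dc:title": record["title"], "dc:creator": creators}

print(to_dublin_core(level2_record))
# {'dc:title': '2011 Census', 'dc:creator': ['National Statistics Agency']}
```

Because the level-2 record is the semantically richest representation, exporters like this one only ever discard detail; the reverse direction (importing a flat record) populates what it can and leaves the richer links to be filled in by curation.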
This should be particularly feasible since CERIF is, let's say, the model with which both systems are compliant, and having, let's say, a semantically and expressively rich model at the center enables the generation of all other types of information and, of course, the generation of linked open data in RDF format and the provision of this information through a SPARQL interface. Actually, for the implementation, we use the Virtuoso universal server, which enables the storage of the information in a relational database and in a triple store, and its simultaneous exposure as RDF. So that's about it. Thanks very much. I think we have time for one, perhaps two questions.