So good morning, everyone. It's a pleasure to be here. I'm Fernanda, and I'm currently leading the Open Knowledge chapter in Brazil. I'm going to present our dearest project, Querido Diário, which literally means Dear Diary. It's a wordplay that maybe doesn't make much sense in English, because in Portuguese we say "diários" both for official gazettes and records and for the personal journals we call "dear diary". So that's the idea. And I'm going to try to answer these questions, all these topics: why we did it, what we did, how we made it, who is using it and what for, the impact, and what's next. So let's see if I can do it on time.

This is basically why we did it; this image summarizes it. This is an image of an official gazette from almost two centuries ago, and this is yesterday's gazette. It's basically the same thing, except now they are on the internet. But of course, you definitely wouldn't read it while having breakfast, so it's not something easy to digest. And in Brazil this becomes a huge problem when you consider that, because of our federative system, we have 5,500-something autonomous cities that can deliver all sorts of public services. They must deliver public services, and they don't have any standard for doing so. There is no standard for records publication, and in a situation like that, you can imagine that the PDF reigns, lonely and sad.

We also used to say that we have data deserts, because most of these cities don't have transparency portals. The vast majority don't; only the capitals, the larger cities, do. And even the ones that have transparency portals don't publish everything. They publish the mandatory things, such as contracts, and the rest is forgotten in the gazettes: things like public purchases, contracts, appointments and dismissals of public servants, local legislation, public policy, and so on. So this is Querido Diário.
This is just a glimpse of what it is. In a more friendly format, you can search by keyword, by city, and by period of time. But it's important to say that this is not only a platform; it's also an open infrastructure. It's more than a platform. So I'll just give you an overview of this infrastructure. Well, we start from the official gazettes' websites, which we first have to discover, because we run a collaborative census to find where they are. Then we build custom scrapers. They have to be city-specific, because each city publishes in a different way. We collect the official gazette file and keep it, and we also capture metadata such as edition number, dates, and so on. We keep everything in our database; for indexing we are currently using Elasticsearch, though we're rethinking that choice. And we process all these PDF files, extracting plain text and making it available through an API. This API is already serving some platforms, including the one I showed you, and of course the final users: journalists, researchers, public managers, and so on.

So it has four components. I already mentioned the data engineering part, with the scrapers and the processing. We have an open data component, with the public API and the download options we are implementing. We have the search console in the web platform. And I haven't yet mentioned the intelligence part: we are trying to gather other public data sets and make sense of this data as well. So we are exploring some artificial intelligence, like natural language processing, for contextualizing, categorizing, and enriching the data. For instance, we enrich it with other data sets, such as Brazil's company registry and its record of company owners: when a company code appears in a gazette publication, we can surface all the information we have about that company and its owners, and so on.
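The collection step described above, a per-city scraper that captures the gazette file plus metadata like edition number and date, can be sketched as a small parsing function. Everything here is an illustrative assumption, not the project's actual code: the URL pattern, the `GazetteRecord` shape, and the function name are hypothetical, since each real city publishes differently (which is exactly why Querido Diário maintains one custom scraper per city).

```python
import re
from dataclasses import dataclass
from datetime import date

# Hypothetical URL layout for one imaginary city; real gazette sites
# vary wildly, so each scraper encodes its own city's conventions.
GAZETTE_URL = re.compile(
    r"/diario/(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"/edicao-(?P<edition>\d+)\.pdf$"
)

@dataclass
class GazetteRecord:
    territory_id: str   # IBGE municipality code
    published_on: date  # publication date parsed from the URL
    edition: int        # edition number parsed from the URL
    file_url: str       # where the PDF itself lives

def parse_gazette_metadata(territory_id: str, file_url: str) -> GazetteRecord:
    """Extract publication date and edition number from a gazette file URL."""
    m = GAZETTE_URL.search(file_url)
    if m is None:
        raise ValueError(f"unrecognized gazette URL: {file_url}")
    return GazetteRecord(
        territory_id=territory_id,
        published_on=date(int(m["year"]), int(m["month"]), int(m["day"])),
        edition=int(m["edition"]),
        file_url=file_url,
    )
```

A record like this would then be stored alongside the extracted text and indexed for search; the real pipeline adds the downloaded file and the plain text pulled out of the PDF.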
So all of this, this crazy idea of opening up local records, is possible because we have a participation and collaboration strategy going on. First of all, it's open source and available on GitHub; we have more than 80 contributors in just one of the repositories. We have a Discord community with almost 600 people. We run trainings with our School of Data: for instance, we created a Python for Civic Innovation course to teach people how to build scrapers, so they can write scrapers for their own cities and help us bring them into our infrastructure. And we participate a lot in conferences, in the Python ecosystem and in our own data journalism conference, Coda.Br. We have a network of civic ambassadors, more than 150 people. We have a cooperation program with universities; for now we have agreements with three higher education centers, where students and professors are exploring technical problems we have. For instance: how do we segment the several cities published inside a single PDF, and how can we put this in an easier format for people to consume? They are working on that, and an institution in the Northeast actually solved our segmentation problem with undergraduate students, which was so, so nice.

We also run open calls for local journalists. We pay them a small grant, a micro-grant, to cover stories based on the content they find in the platform, or, if they are more advanced, through the API. For instance, there is an independent journalism collective in Favela da Maré, in Rio de Janeiro, which built its own data-driven lab; they were trained by School of Data, by the way, five years ago. And they did a report on basic sanitation in their community using what they learned with Querido Diário.

This is a more recent page, available now: a Querido Diário context exclusively for monitoring keywords related to technology in education.
It shows how cities are implementing technology. For instance, I just searched for Google to see what happened in the last month. It's not showing here, but I had selected the last month, and it showed a city acquiring Google licenses for education through an emergency bid, with no transparency and no competition. It's not doing things by the book. Just an example.

And here we have alerts: people can receive email alerts for the keywords they want. It's already being used by public managers to see what other cities are doing, but also by institutions like InternetLab in Brazil, which researches surveillance in schools and is keeping an eye on what cities are doing with facial recognition in schools. If you enter "facial recognition", it shows all the cities doing something with that. And this is another layer, because we do some processing to identify what is really linked to education and to technology, a filter that makes it easier for people to monitor the subject.

We're doing the same thing with the environment. This is the climate diary, which does this processing to find content on the environment: public acts such as infractions, policies, or legislation related to the environment, which makes it easier for journalists to search. Otherwise, they would have to go to each municipal gazette and try to find keywords in PDFs; it's much easier this way. And I forgot to say that we built the previous one with education organizations, and we are building this one with environmental organizations. So we are partnering with people who have knowledge in those fields.

By now, we have 67 cities available. That may not sound like much if you consider the absolute number of cities, but if you consider that 47 million people live in those cities, because we started with the larger ones, it's already an impact: almost a quarter of the Brazilian population. So, what's next?
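The alert flow described above, where users register keywords and matching gazette content triggers an email, can be sketched as a simple matcher over the plain text the pipeline extracts from each PDF. This is a hedged illustration only: the function name, the data shapes, and the context-snippet behavior are my assumptions, not the platform's actual implementation.

```python
import re

def find_keyword_hits(text: str, keywords: list[str], context: int = 40) -> dict[str, list[str]]:
    """Map each watched keyword to short excerpts of the gazette text where it appears.

    Matching is case-insensitive; an empty result means no alert is sent.
    `context` controls how many characters of surrounding text each excerpt keeps.
    """
    hits: dict[str, list[str]] = {}
    for kw in keywords:
        for m in re.finditer(re.escape(kw), text, flags=re.IGNORECASE):
            start = max(0, m.start() - context)
            end = min(len(text), m.end() + context)
            hits.setdefault(kw, []).append(text[start:end])
    return hits
```

The thematic layers (education, climate) would sit on top of something like this, first filtering gazettes down to the relevant subject before keyword monitoring, which is the extra processing step mentioned above.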
What are we trying to solve and to do next? We have to improve the architecture, because it's limiting our ability to scale, which is the next step: we are planning to reach 250 cities by the end of the year, and we want to cover at least 1,600. But we have to improve the architecture first, because it costs a lot. So we do crowdfunding, and we get dedicated funding for building these thematic layers on top of the infrastructure, because no one cares about official records until you do something related to a subject they care about. This is our strategy to keep the project sustainable and growing, along with more research into NLP and other artificial intelligence applications.

And our dream, our final goal, is to build an open-source publication system for cities, open by default, because it makes no sense to keep doing all this scraping forever. If cities adhered to it, it would be much better, but we have no power over that. So we are starting by showing the value of opening this information, and cities are starting to want to be open by default. They are coming to us and asking: what do we do to be in the platform? How can our city be there, so we can use the API? We are starting this dialogue with them to show the value of having this information. So join us on GitHub if you want. We are making an effort to keep the code bilingual, so people from other contexts can contribute, but the main language is Portuguese.

And just some quick facts about Open Knowledge Brazil. We have these four lines of action. This project sits inside the Data Science for Civic Innovation program, but we also have the School of Data, and Advocacy and Research; for instance, they produce reports about education and technology using the diaries. We do trainings based on this content, and we have services and cooperation as well. We have 15 people on the team. These are the people, and Giulio is the technical leader of this program.
And we have our community manager, Rebecca, and our intern, Juliana, who is super, super engaged in the project. And these are the other people of Open Knowledge Brazil. So thanks so much. Let's talk. Thank you so much.

We've got time for a couple of questions. Does anyone have a question? Your hand went first; I'm going to run over here.

Oh, you know how much I love this project. I want to know about the census data that you had in the diagram but didn't really get into. How is it connected with the diary data?

The census is basically for discovering the URL for each city, and also for showing the level of openness the city has. The platform then reflects that status: either we have already discovered the link, or we know what it is but it's still not in the platform, or we know what it is and it is in the platform. So it feeds back into the platform.

Thank you for the presentation, amazing project. I was curious about the reception, if any, that this project has had on the government side of things, from the agencies responsible for publishing these records.

Yeah, it's mixed, because some of them don't understand, even though we scrape responsibly, without making a mess of their servers. But mostly they want to know more; they want to know how to be open by default. And we are starting to cooperate with associations of cities. I haven't mentioned it, but we have a dialogue going on with two states that have several cities in the same proprietary system, one that just publishes the PDF, and we are trying to make them understand that it would be much easier if we could just get the information at the source. So they are starting to understand the value. It's mostly control organizations, internal control bodies, because they want to monitor, and they are trying to see how they can use the API. And the federal government, I think August is not here, but I just discovered the federal government is using this to keep track of what cities are doing.
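The census answer above describes three levels of openness that the platform reflects for each city. As a reading aid only, under the assumption that these are the states tracked (the names and representation are invented for illustration, not taken from the project's schema), they could be modeled like this:

```python
from enum import Enum

class GazetteStatus(Enum):
    """Illustrative census status for a city's official gazette."""
    UNKNOWN = "source URL not yet discovered"
    DISCOVERED = "source URL known, but city not yet in the platform"
    AVAILABLE = "city's gazettes searchable in the platform"
```

Each census entry would then carry the discovered URL plus one of these statuses, which is how the census feeds back into what the platform displays.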
That's brilliant. Thank you so much. Let's thank her one more time.