Hello. Welcome, and thanks for joining me in my talk about the future of personal data. I'm actually really happy to be back at the LibreOffice conference. It's been such a long time since I talked to you guys. So, yeah, I'm really happy to be back. For those who don't know me, let me quickly introduce myself. My name is Björn. I'm a psychologist. Professionally, I work for KDAB as a project manager and a UX person. But I also have a long history in free software, which started around 2000, when I helped to create the Open Usability Initiative. With this initiative, we helped a lot of free software projects to build up usability expertise to improve the user experience of their software solutions. I was also involved in LibreOffice in the past, mainly working on trying to understand who the users are. So we introduced surveys into LibreOffice, trying to get a more data-driven approach to user experience. Next to being a UX person, I've also always been a privacy activist. And, you know, the struggle between these two competencies is actually what qualifies me for the talk I'm giving to you today. Because as a UX person, I always wanted to know everything about the users, while as a privacy activist, I try to educate people not to share personal data. You know, this is kind of the struggle of my life. And today, I'm going to present to you what I think a solution to this problem can look like. So what can you expect from the talk today? I've split it up into two parts. In the first part, I'm trying to explain to you why we have a problem and how this problem can be solved theoretically, which is data cooperatives. And then I want to introduce to you a practical solution, because there already is an initiative or a project that actually tries to solve this problem, which is called polypoly. I will talk for, I don't know, roughly 30, 35, perhaps 40 minutes.
And after that, we'll have enough time for discussion, where I want to know from you how you like the ideas I present and how we can move on. What are the next steps? I'm really happy that Felix from polypoly is going to join me for this discussion. So he'll be able to answer in more detail the questions I don't have the answer to about the polypoly initiative and the polypoly project itself. So let's start at the beginning. Why should we care about personal data at all? I mean, we all know personal data is bad, right? We all hear those stories about this bipolar person who is drifting into a manic episode, and the AI systems recognize this. What are they going to do? Are they going to call for help? Are they going to call a doctor? No. They are going to sell you a trip to Las Vegas. They're going to present you an advertisement for a trip to Las Vegas. I don't know if this specific story is true, but you hear so many that all have the same kind of story to tell, and that is: your data is used against you. Why? Because it's not your data anymore. Your data is stored by an oligopoly of a few companies that know such a lot about you, and their business model is basically selling you something or presenting ads, more or less. And if these ads take advantage of a weakness of yours, then they don't have any problem with this. They're just going to do this. But the problem is not only on the personal level, and I'm not going to lecture you for long about the problems of personal data because I'm really sure we're on the same side. The same thing has happened on the level of society as well. There have been attempts to influence elections, like the first Trump presidential campaign or the Brexit campaign. If you take a look at the Cambridge Analytica scandal, then you'll understand a lot about what has been going on there. There's a great documentary on Netflix about this.
But the idea basically is that because those systems know so much about so many of us, campaigns are able to spot those people they can influence, and influence them in a way that they either go to the elections, if they are expected to vote the way the campaign wants, or that they are kept away from going to the elections. And by this they try to influence the outcome of elections, which is how personal data is used against us as a society. So I don't want to dig any further into this topic because there's more interesting stuff to talk about. I'm pretty sure you all know about how bad personal data is. So the question again: why should we care? And the answer to this is just as easy, because we all know personal data is good. Think about the COVID situation. How much better could we have dealt with this situation if we had a trustee we would give our personal data to? If it were possible to have a central mechanism to evaluate all our personal data? But we don't have this, so we ended up, at least in Germany, and this is my kind of biased perspective on the world, with a COVID app that is so-so. I think we could have done much better than this. But also other innovation nowadays is basically not derived from smart algorithms or anything, but simply from mass evaluation of personal data. Think about things as trivial as predicting traffic jams. You can only do this if you know the position and the velocity of a lot of people. And if you find out they are not moving at the speed they could, then you have a traffic jam there. But you need this kind of data in order to create this kind of innovation. It's not an algorithmic innovation. It's innovation based on personal data, and this is something we're facing much, much more. This is something where free software actually loses. As free software, will we be able to create traffic jam prediction or something like this? I guess not, even though it's trivial, because we can't get the data.
The second aspect I want to mention is that we need this kind of data to be able to create a better user experience. For those who don't know user experience and still think this is some kind of creative magic thing happening: no, user experience is an optimization process. And this optimization process is fed by knowledge about the people that are using the software, the situations they are using it in and the goals they are using it for. And with this kind of knowledge you try to optimize the tool you are creating in the end. And this is something our proprietary competitors actually make a lot of use of. They get a lot of personal data that they use to improve their user experience. And this is something we as free software don't have. Right now, we're not assessing personal data. So we cannot improve the user experience of our products in the way our proprietary competitors can. Of course, I said I tried to introduce surveys into LibreOffice. But when we take a look at the answers we find out, for example, that 99% of our users are male. I mean, we're definitely not reaching out to the whole population. All we get is a very, very biased sample, and that's all we can access. And we should not optimize our user experience based on these biased samples. That is very, very problematic. So this boils down to the central problem I think we have. And that is: we will not be able to create software that is as good as the software of our proprietary competitors. And this again leads to the situation that our users are users that use free software for ethical reasons, not because what we are offering to them is a good tool. And this obviously is especially true for free software with a user interface. I mean, with free software we sort of won the battle about the backend of the web. That's more or less all free software running there, the backbone of the internet.
But for, I don't know, 20 years we've been proclaiming each year: this year is going to be the year of the Linux desktop. And it's not. And, you know, the same is true elsewhere: there are very, very few software solutions out there that actually convince people through quality and not because they are free. And this is something I'm very concerned about. I do want free software to succeed for ethical reasons. I want it to succeed, but I know it will only succeed if we are able to produce good software. And that's why I think we need a totally new way of thinking about personal data. Personal data does not necessarily have to end up in the silos of a few big companies. There are different possible solutions. We just have to work to make them become real. And basically the solution to the problem is trivial. It's a cooperative. You can read a lot on the web and in the literature about the idea of creating a data cooperative. A cooperative is nothing other than people who cannot achieve something on their own joining together and achieving their common goals together. And if we rethink personal data like this, I think there is a great chance that we can find a different system of how personal data is handled. But let us step back and compare for a second how a traditional cooperative works and how a data cooperative, in contrast, would look. What are the core differences between those kinds of things? For this comparison I would like to introduce to you a milk cooperative. So this is a couple of farmers who have decided to join together to sell their milk together. Because if they individually try to sell it to the supermarket, they get prices which are not acceptable. So they join together, and together they can say: okay, look, now we dictate the price. Or at least you have a basis for negotiation between the supermarket and the milk farmers' cooperative.
So let's take a look at the three major differences between a traditional cooperative and a personal data cooperative. The first thing is about the nature of the good itself. Milk: if you sell milk, it gets drunk and it's gone at some point. Or it's made into cheese and then eaten at some point. But it will be gone eventually. For personal data, quite the opposite is true. If you give away personal data, it won't fade away in some way. Just the opposite: the more personal data you can add to it, the more valuable it even gets. Over time it might get a little bit less interesting, after years. But in general it's just the opposite of milk. So the more personal data is out there, the less you can control what's happening to it. You can copy it as much as you want. You can multiply it, and it simply won't vanish by itself. A second problem actually is that in this milk cooperative all the farmers have to physically bring their milk to some central point to be able to collectively sell the milk and so on. This limits the area of this cooperative. It will be just, I don't know, perhaps 50 kilometers, 100 kilometers or something. Beyond that it's probably not worth it for the farmers to carry the milk all the way. Quite the opposite is true for personal data. Each and everyone in this world produces personal data if they interact with some sort of technical device. So the personal data cooperative needs to be open for everyone in the whole world. And this makes the governance, finding decisions, finding a central direction and so on much, much more difficult. And the third aspect I want to point your attention to is: if you spill a liter of milk, you wipe it away and it's gone. But if you spill personal data, this is a catastrophe, on a personal level for sure, but perhaps even on the level of society.
And so when I applied for this talk, I actually had the idea that we as the free software community should start an initiative like this. And I was actually talking to the KDE community at Akademy earlier this year about this topic. And the great result of this talk was that I can officially say hello from KDE: we from KDE, we want this to happen. This is an outcome of Akademy, and I got great support from the KDE community, and it's sort of an official KDE project right now to start something like this, to start an initiative that deals with personal data. But at the same time something else happened, and that is that a member of the community, someone watching my talk, actually pointed me to an initiative called polypoly. And so I was very curious, because I didn't find them beforehand, and I still can't explain why I didn't. But it doesn't matter, I got the hint, and so I reached out to them and we had some long, long, long discussions about their approach and so on. And this all happened after I applied for this LibreOffice talk. So I had to change a little bit of what I was going to talk about. My initial idea was to present to you how I envision such a cooperative. But instead I have decided to advertise polypoly right here, to advertise that we should work together with them. So let me start by introducing polypoly to you a little bit. First of all, polypoly is not an open source initiative. It is not something where volunteers gather together and try to find a solution. No, polypoly is a funded startup. They do want to make money. I don't find this very problematic, as you will see later on, because I think they're doing the right things. So what do they want to achieve? They want to build the infrastructure that allows users to actually own their own data. I will go into more detail in the next slides, but that is sort of the basic idea. Why does polypoly exist?
Because they actually want to provide the infrastructure so that we can keep control and keep ownership of our own data. Something I just want to touch on very, very briefly is that they do have a monetization concept that will allow you as a user to sell your data under certain circumstances. But I don't want to go into this too much because I think it's not of that much interest for us right here, right now. What makes me advertise them here is that they truly convinced me that they want to be good FLOSS citizens. They understood that it's essential to their success that the whole software will have to be free software. And as I said, they are a startup. So they've chosen a dual licensing concept. I come from the KDE community, so I don't think dual licensing is much of a problem if it's done right. So again, they are a startup. They are producing free software which has a dual license. So whenever we contribute, we'll have to contribute under a dual license as well. And what they do is they build the infrastructure so we can own our own data. And this is based on a cooperative concept. And I'm really, really happy that Felix is joining us later for the discussion, and he'll be able to explain all the things that, you know, I can't explain or might not be very accurate about in my talk here. So let's move into the technical and organizational description of what polypoly does on a more detailed level. The first thing, and this is something I really love about their concept, is that they introduce something they call pods. You might have heard about pods from the Solid framework from Tim Berners-Lee. This is not the same. This is a different kind of pod. If you want to compare both systems, then Solid is for all the data you want to share, and polypoly is for all the data you want to keep private. So that's sort of the rough difference between those two projects. But still, they both have a concept they call pods.
So the pods in polypoly are basically the containers that hold your personal data. All applications you have feed their data into your pod. This pod always stays on your premises, on your devices. So if you have a set of devices, let's say you have a computer, you have a smart TV, you have, you know, a couple of IoT whatever things, you have a car, whatever. In the long term, all these devices should have their own pod. All of your pods create a mesh network, and this mesh network stores your data. So your data will never ever leave your premises. You own the data. So that's the first step, about storing the data. But how about, you know, giving projects access to this data? And again, the idea there I think is very, very smart. What they do is: you can download features into your pod, extending the functionality of your pod. And once your pod has this new functionality, evaluations from certain parties are allowed. So let me give an example. We as LibreOffice want to know if our users are really 99% male. So what we do is we integrate into LibreOffice a mechanism that writes the sex of our users, however we assess this, into the pod. And now we hope that our users allow us to access that data. If they do so, they will download a feature from us. So they will download the LibreOffice feature. And what this feature does is the following. It fetches a model from a server onto your pod. On your pod it learns whether you are male or female. Then it pushes this newly trained model up to the server again, and the next users download it, adding their information about whether the user is male, female or diverse or whatever category we want to introduce. So what happens is that by this, you know, up and down ping-pong push, we don't have access to the individual data anymore. We cannot say in the end, like, you know, Björn stated he is male.
We simply don't know this, but in the end, you know, we will understand the sex distribution of our users. This allows users to control, on a very, very detailed level, who is accessing which of their data. And by not giving away the data itself, but by always training a model, you're not giving away the data point "Björn, male". You just train the model, and with each ping-pong the model takes, it gets better and better and better. And in the end, you cannot tell who was answering and who are the male, the female and the diverse users of your software. This will become very, very complex in the end for the users to tune manually and in a fine-grained way: I'd like LibreOffice, for example, to collect information about my sex, but I don't want Facebook to know about this, and I do want the next research institute to know about it. You see, this becomes very, very complex. So what they will introduce is something they call a feature depot, which is nothing else but a set of features from trusted parties. So what I envision here is that this should not be a LibreOffice thing. I think we should start a free software initiative and in the end have a free software feature depot, where people can say: okay, I do trust free software and I do want to contribute to make free software better. So I'm allowing the free software feature depot in, and then not only LibreOffice can collect data, but also other projects like KDE, for example, can collect data. So you could, for example, try to understand when people are using LibreOffice as a text editor and when they are using just, you know, a small note editor or something. Yeah, this is something I envision, where I think we as the free software community should work beyond our projects and work together to, in the end, create good feature depots.
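The ping-pong evaluation and the feature depot opt-in described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not polypoly's actual software: the names `Pod`, `run_feature`, `evaluate` and `"floss-depot"` are hypothetical, and a simple counter stands in for the trained model. Note that in this naive form a server comparing two successive models could still infer a single contribution, so a real system would need additional protection that the talk does not describe.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical stand-in for a user's pod; private data never leaves it."""
    owner: str
    private_data: dict = field(default_factory=dict)
    trusted_depots: set = field(default_factory=set)

    def run_feature(self, depot: str, key: str, model: Counter) -> Counter:
        """Update a copy of the shared model locally and hand back only the model."""
        if depot not in self.trusted_depots:
            return model  # user did not opt in: the model goes back unchanged
        updated = Counter(model)
        if key in self.private_data:
            updated[self.private_data[key]] += 1
        return updated


def evaluate(pods, depot, key):
    """Pass the model from pod to pod ("ping-pong") and return the aggregate."""
    model = Counter()  # the "server" only ever holds this aggregate
    for pod in pods:
        model = pod.run_feature(depot, key, model)
    return model


pods = [
    Pod("a", {"sex": "male"}, {"floss-depot"}),
    Pod("b", {"sex": "female"}, {"floss-depot"}),
    Pod("c", {"sex": "diverse"}, set()),  # has not opted in
    Pod("d", {"sex": "male"}, {"floss-depot"}),
]
distribution = evaluate(pods, "floss-depot", "sex")
print(dict(distribution))  # {'male': 2, 'female': 1}
```

The project only ever sees the aggregate distribution, and the pod that has not allowed the depot contributes nothing, which mirrors the opt-in idea from the talk.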
This is what I want to say about the technical aspect. I think the organizational aspect is also interesting, because they have different layers there, or not layers, they've created different entities, which together create a kind of balance. So the first entity they created is a cooperative. Actually, it's not one cooperative. It's one cooperative per country, because this cooperative actually is the owner of the pod software for this specific country. And by this, they can make sure that each and every pod software respects the legal situation in this country. Yeah, because obviously we have to comply with the political regulations and demands. And this cooperative is something you can join. I did. Yeah, all I did so far is I bought a couple of, I don't know what you call them, probably not shares when you buy yourself into a cooperative, but I did buy a couple of them to just say: I think I like it. So just as a disclaimer, this is what I did. But anyone can join them. This is just open. This is a cooperative. And by joining there, you basically say: I'm one of many, and we are many. We want to change how personal data is handled. So the cooperative, or the cooperatives, are the actual owners of the software. They are the ones that try, within one country, within one jurisdiction, to further develop the software. Now the second part. As I said, they are a startup. They have an enterprise. And this enterprise does for-profit stuff. Like, for example, they create solutions for the industry, because obviously companies have to change the way they work. Currently they work with a centralized data storage paradigm. And they need to switch this to distributed data storage. And polypoly will offer tools for companies, for example, to go this way, to make it possible for companies to go away from central data storage towards decentralized data storage.
And this is what they want to sell. And this is how they actually want to make money. Something I think is not of that much interest for us. The third part, which I also find very interesting, is that they created one foundation. This foundation is, first of all, a lobby organization, because they found out that in order to do this change from centralized data storage to decentralized data storage, we need to be supported by politics. And so what they do is they talk to politicians and try to influence how future laws are being made. But the foundation also has another very, very important function. That is that it actually owns the polypoly trademarks. As I said, the cooperatives are local to each country. But if there is a cooperative that is drifting out of the initial scope of what polypoly actually stands for, then the foundation can say: okay, you're not allowed to call yourself polypoly anymore. They can obviously still use the software because it's free software. But they cannot use the trademark anymore. And this way they try to solve this problem of what can be an official polypoly project. So that's all I wanted to tell you about the technical and the organizational structure of polypoly. So the basic question is: should we work together? And my thoughts on this are the following. I think every FLOSS project with a user interface actually faces the same problem. All of us, we're not gathering personal data, but our competitors do. We all fall behind. And at the same time, this is not a topic for any existing FLOSS project. This is not a project that natively belongs to LibreOffice or should start out of LibreOffice or out of KDE or whatever free software project you have. We all have the same problem, but it's not our core problem. So I think it is very smart to try to find a third party here. In this case it's polypoly we found, or we can join.
That actually drives this, and we should sort of accompany them. Because the whole thing is such a complex issue. I mean, this is more than just writing a bit of code. This has legal implications, a lot of legal implications. And it's really important that we get it done right. And so I think there is a balance, or there can be a balance. So we have the polypoly people that really, truly promised that they do want to be good FLOSS citizens. And I think this is a chance for every free software project with a user interface to join together and critically accompany polypoly on their way. Just as the polypoly foundation can revoke the use of the polypoly name for individual cooperatives, we can add our good free software name to polypoly as long as they are doing the right thing. When we feel that they're not doing the right thing, we can always, and should always be able to, withdraw our name from this cooperation. But I think we should start. I think they are doing the right thing, they want to do the right thing. Let us help them and help ourselves by this. Because this is a fundamental problem that gets solved if polypoly succeeds. We actually have a way to access data of our users. We can then understand who our users are. We can then do user experience based on data, not on gut feeling. And that is a major difference. So the question is how? What can we do? How can we work together? And I think obviously the first thing is that we should start collecting data, feeding it into the pod. Obviously the next thing is we would need to create a feature on the LibreOffice side to access data on the pod. All of this obviously needs to be opt-in for the users. We don't want to force anyone into using the system. But we should start to build up this alternative for users and see how it is accepted. See how we feel about it. And if we go this way and we find other free software projects doing the same, I think creating a FLOSS feature depot is something very, very attractive.
The last thing I think we can help polypoly with is building up the governance structures they need. And the good thing about all of this is that I got the promise from polypoly that they will help us when we are willing to invest time to improve the situation, to gather user data, to evaluate user data. They have promised me that they will support us. Whatever that means, we'll see. But at least they are open to this. They are not only open to this, they encourage us to do this. And I'm trying to encourage you, because they convinced me and I hope that I can convince you. But this is the time to say thank you for your attention in this talk. And I think we should switch over to a discussion where we will welcome Felix. And so I say thank you for the moment. Thanks for your attention. And I hope we'll be able to create something new that actually revolutionizes how society deals with personal data. We need to provide an alternative that people can use and that becomes the new normal. An alternative that doesn't allow any company to use personal data against us, neither on the personal level nor on the level of society. Personal data should only be used for the good things it can be used for. So again, thank you, and I'm looking forward to our discussion.