Welcome to my talk about the Public Domain Project. My name is Christoph Zimmerman. I'm from Switzerland, and in the Public Domain Project I'm the server administrator and mainly the guy for all technical questions, computer related or involving any other technology.

The Public Domain Project is essentially about collecting and digitizing music and bringing it to all the people who are interested in music. For us, and in the Linux and open source community, that is a very natural idea. In the professional world of archives and libraries it should be the usual way of thinking and the normal thing they do in their everyday job, but unfortunately the situation is not that great at the moment. So we try to help in this situation and go forward with a good example of how we think a modern archive or library should work and should leap forward to its users.

As I said, we are focused on digitizing music, so our intended audience is of course music listeners, music historians, and the academic world that is interested in interpretation history, but also all private people who collect music or are interested in listening to it. And of course, here at the Linux Audio Conference, where we have a lot of active musicians, they are also interested in music.
They can freely use it for their own works, as a base to get samples from, or as a reference for modeling physically correct virtual instruments from very old instruments for which only old recordings exist as the basis for the model, for example. There are of course a lot of other use cases for music that is freely available and usable on the internet, like simple background music in holiday movies and that kind of stuff, which otherwise somehow gets deleted, or people cannot watch it because they see a little warning sign that says you are not allowed to see this in this country. You can easily avoid that when you use, for example, music from our platform or from other platforms that provide free-to-use music.

This brings me, unfortunately, to the more juristic side of my talk. I have to talk about what free actually means in this context. When I say free, I mean free as in freedom, and in this sense we have to talk about the copyright laws that are in force in a lot of countries.

The general term in copyright law is always a work, and it means any creative thing. So we don't have different rules for images, paintings, or music productions; all of them are works. A creative work is automatically under copyright law, and you gain a monopoly for a certain time. After that time you, as the creative person, lose your rights to your work. That is true in all copyright laws that are in force at the moment: at some time in the future you lose your rights, and when this time frame is exhausted your work enters the so-called public domain. That is the correct legal meaning of the public domain. In that sense it is the opposite of copyright.
A work that is in the public domain cannot be in copyright, and as I said, copyright is time limited in every country. For the people from German-speaking areas: the legally correct term in German would be gemeinfrei. It is a very seldom known word, but that is the correct term for it.

So, about the different copyright laws. What you should remember from this picture is that it is colorful. Every color stands for a certain set of rules that apply to copyright terms. I am only speaking now about how long a certain work is protected in these countries, and as you see, every color means a different time frame during which the copyright is in force. This makes it quite complicated for us to make sure that the works we digitize and upload are really free to use. But it is our job to do this complicated work, so you should not have to care about it yourself. You should come to our platform and be sure: yes, I can download that and use it. In that sense this is more background information about what we have to do in the background.

As I said, I'm from Switzerland, so our servers are also located in Switzerland, and that means that we have to follow the Swiss rules. At the moment that means we have to wait 70 years after the death of all the authors of a certain work until we can release it, and in addition 50 years after the first publication of the work. Of course, it would also be possible for copyright holders to give us a license so that we can freely use it. For example, CC BY or the public domain license from the Creative Commons framework would be suitable. When an author applies such a license to a work, we would be very happy to include it in our archive.

Here is a short example of what kind of information we have to get about a certain work to make sure that it is really usable. I have chosen a song everybody knows, Happy Birthday to You.
Please don't sing it; I will explain why now. To know the copyright status of this work, you have to know, for the copyright itself, who wrote it. In this case the melody was written by two sisters, Patty Hill and Mildred J. Hill. They wrote the melody for a completely different kind of song, and afterwards it was used together with different lyrics for the Happy Birthday song. So for the melody itself, the only interesting thing is when Patty Hill and Mildred Hill died. In this case we have two co-authors, which means that we have to wait 70 years after the death of the longest-living author. Here that was Patty Hill, and she died in 1946. That means that in Switzerland and the European Union the copyright of this track will expire on the 1st of January 2017. So please don't sing that song yet, because you still have to pay royalties for it. Even if my birthday were today, I would still ask you not to sing.

Then copyright law also has the so-called neighbouring rights. These are all the rights concerning the labels and manufacturers of the records, and for these we have to know when the work was first published. That story is more complicated for this example, because no one knows for sure when the first version with this melody and the changed lyrics really appeared. Historians say it is certain that it was released before 1911, so that is fine for us in the European Union. The story in the United States is very complicated.
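The term arithmetic from the Happy Birthday example can be sketched in a few lines of Python. This is only an illustration under assumptions, not the project's actual tooling: the function names are made up here, the terms (70 years after the death of the longest-living author, 50 years after first publication) are the Swiss figures quoted in the talk, and the convention that a term runs to the end of the calendar year is assumed from the stated expiry date of 1 January 2017.

```python
from datetime import date


def copyright_expiry(author_death_years, term_years=70):
    """Return the date on which the authors' copyright lapses.

    Assumes the term runs until the end of the calendar year that lies
    term_years after the death of the longest-living author.
    """
    last_death = max(author_death_years)
    return date(last_death + term_years + 1, 1, 1)


def neighbouring_rights_expiry(first_publication_year, term_years=50):
    """Same calculation for the label's rights, counted from first publication."""
    return date(first_publication_year + term_years + 1, 1, 1)


def is_free(author_death_years, first_publication_year, today):
    """A recording is usable only once both terms have run out."""
    return today >= max(
        copyright_expiry(author_death_years),
        neighbouring_rights_expiry(first_publication_year),
    )


# Happy Birthday melody: Patty Hill died in 1946 (from the talk);
# 1916 for Mildred J. Hill is not stated in the talk.
melody_authors = [1916, 1946]
print(copyright_expiry(melody_authors))  # 2017-01-01

# 1911 is only the upper bound the talk gives for the first release.
print(is_free(melody_authors, 1911, date(2016, 5, 1)))  # False
```

Taking the maximum of the two expiry dates reflects that either right alone is enough to block a release; other jurisdictions would need different term lengths passed in.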
They changed their rules in the field of copyright law a lot, and as it is a very popular and also economically interesting song, it went to court to decide for sure what the status of this work is. The good news is that it has also expired in the United States. So on New Year's Eve, start singing right after midnight on the 1st of January. It could be a good joke, and a chance to explain to other people why you are singing it and to celebrate that this great folk song we all know, which is important to a lot of us, is finally in the public domain.

Now I will come back to our project and what we are doing. First, some facts about what we achieved in the last years. We talked to a few record collectors and already gained about 50,000 physical records. I don't ask a lot of collectors anymore, because it is a bit of a hurdle to handle so many records in a quite small project. We have statistics from Wikimedia Commons for only one month, but what we got back was quite impressive: around 40,000 downloads on Wikimedia Commons each month for just around 900 files that we have uploaded to Commons. That is a really nice achievement. And we are around 8 to 15 more or less active people.

This is the entry page of our project, where you can see what parts the project consists of. Of course we have the main part, where you can download our stuff and browse through our files. Then we have the online radio station, where we use our own digitized music to make radio channels. Then we have different language versions of our wiki, where we try to collect the information about artists, musicians, lyricists, conductors and so on that we need ourselves and want to provide. And in the background of the project there is a non-profit foundation that is based in Switzerland.

Now going a bit
deeper into the technical stuff. I mean, I'm here at the Linux Audio Conference, so I have to talk about technical stuff, not legal stuff. I will go briefly through our digitizing process as we do it at the moment. Here you see a quite new photo of our current office in Zurich, in a place a very nice sponsor gives us to use.

First we clean our records, and then we digitize them with a laser turntable. I will come back to these steps in detail afterwards. We export the result as FLAC files, then we do the complicated copyright investigations, and then we upload it to our own platform and to Wikimedia Commons.

Here you see two examples of why you should clean your records, especially when they are 80 or 90 years old and have been stored in unknown places. We clean our records with the Keith Monks record cleaning machine. It was purpose-built for archival use, so you can also clean records that are bigger than 12 inches, for example the ones that were used in radio studios.

When we have properly cleaned records, we use our Japanese laser turntable to digitize them, if that is possible. It is a great machine with a great output resolution, and it is really handy to work with, but it has certain limitations with some special old records. For example, there are single-sided Edison discs, which are very thick, and you cannot digitize them with our laser turntable because of the thickness: the laser is just unable to focus on these records. Another problem is colorful discs. The machine is only able to digitize black records, so no picture discs. And yes, picture discs exist on shellac as well.
Picture discs were already invented in the 30s or 40s or so. It is a great machine. We are very proud to have one, and we are still proud that Wikimedia Germany bought us one of these machines.

As I said, we digitize all our stuff into lossless FLAC files. We shoot with cannons at little birds by digitizing 80-year-old records with 24 bits and 192 kilohertz, just to get as much information as possible out of these old records and see what we can do with this detailed information.

Then, as I explained, we do the copyright investigation, and during this step we also add all the nice-to-know metadata to our files. Of course you can see on the record itself who the writer of the song was, but a lot of the time the lyricist is missing, you don't know the release date, and for a live performance you would like to know where it was first played. So we gather all this information together.

This is one example of how it looks in the end in our wiki front end, where you see what kind of information we collect. You also see that we take pictures of the labels of the record, so you can browse through these as well. And here is the important information: you see the copyright status for different areas. As you see, in the international area there are some countries with longer protection time frames, but this is a pretty good example that you can use more or less all over the world.

Then we upload it to our own storage servers, which are also located in Zurich with a great sponsor, and we also upload it to Wikimedia Commons. So if you are used to searching for your music or samples on Wikimedia Commons, you will probably find all our stuff there as well.

And of course I attend the Linux Audio Conference because we use a lot of open source, and a project like ours wouldn't have had any chance of being possible without all the great open
source software that is out there. So that is also one reason I'm here: to thank you, if you are involved in any coding, any documentation, any project that helps us and others. And also to say again that I am doing a great project and I'm very happy with what I'm doing, but I couldn't have done all of this without the help of a thousand other people.

So what kind of open source do we use on a daily basis? As I said, we use MediaWiki as a front end, more or less, and to organize a lot of stuff. Then we have a lot of server systems you also know, Apache and statistics stuff and all the virtual machine management stuff, and it all runs really great. I mean, it's 2016 and you really can rely on that in your daily work on highly loaded sites. It wouldn't be possible without all this background stuff. Take GCC: I don't know if anybody here is able to write a compiler by himself, but you all need one. So thanks for that.

Then, as I said, the more interesting part for you audio guys here: we run our own online radio stations. Of course we also use free and open source software for this purpose. Here is just a short diagram of how online radio works, if you don't know that already; I think a lot of you will. Somewhere you have a studio source. Nowadays it doesn't have to be a studio anymore. This source creates one stream with the music you would like to distribute to your users. Then you have the streaming server, which actually handles all the user requests, and the internet radio stream itself is multiplied for each user. So it is a unicast system; it is not multicast and not broadcast.
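What unicast means for the streaming server's uplink can be put into numbers with a small Python sketch. The listener count and bitrate below are hypothetical and not figures from the talk; the function names are made up for this illustration.

```python
def unicast_bandwidth_mbit(listeners, stream_kbit_per_s):
    """Server uplink needed when every listener gets a private copy of the stream."""
    return listeners * stream_kbit_per_s / 1000


def multicast_bandwidth_mbit(listeners, stream_kbit_per_s):
    """With multicast the server would send the stream once, whatever the audience."""
    return stream_kbit_per_s / 1000 if listeners > 0 else 0.0


# Hypothetical numbers: 500 listeners on a 128 kbit/s stream.
print(unicast_bandwidth_mbit(500, 128))    # 64.0
print(multicast_bandwidth_mbit(500, 128))  # 0.128
```

The unicast cost grows linearly with the audience, while the multicast figure stays at the bitrate of a single stream.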
So online radio needs a lot of bandwidth compared to what it would use if you were able to use multicast, as in a local network. But that is not a problem anymore nowadays; in the time of video streaming, online radio is getting very cheap. That is how the system works.

In our case, in the background, we use Icecast 2 for the streaming servers. It is a very well-known project, quite old by now, but it runs very solidly: you have uptimes of two years with this project without any problems. It also still provides streams to SHOUTcast servers, because that is still the most popular radio platform for attracting users. We provide our streams in open formats like Ogg Opus, and also in the formats commonly known to users, like MP3 and AAC+, but of course we always use open source encoders for these proprietary formats.

To generate the whole stream and automate what music is played, we use Liquidsoap. That is a great project from France, written in OCaml, and it is a kind of scripting language for all your audio manipulation. For you it is very common to think in data streams, as you do in Pd or Faust, and Liquidsoap takes the same approach to handling audio streams. Somewhere you have inputs, which can read files or network streams. You manipulate them in modules: you add metadata, you mix them, you make crossfades, or whatever. Then you plug the result into some outputs. The outputs in our case are of course the streaming servers: we feed our audio stream to the Icecast servers and select the encoding that should be used, and things like that. It is pretty handy. You can also use the old LADSPA plugins with it, for example.

In the meantime Liquidsoap has also become a very stable, rock-solid project. The last time I saw it crash was because some log files had filled my hard drives, and after three quarters of a year it just stopped running. That was the
only problem I had during one year of administering the online radio. We now also have an HTML5 front end for the radio. That is also great; it is based on the open source jPlayer and works really well.

Inside the office we also have computers to digitize our stuff. These workstations are also filled with open source software, of course. Unfortunately, because of some missing drivers for proper high-end audio gear, we have to run it on Windows. I don't like that part, but I have to mention it. That is always the case when you have been using open source for a long time: it all comes down to drivers and to manufacturers' support for developers who would like to write drivers. In the audio world I think it is still necessary that we go to the manufacturers of our beloved gear and punch them again, and punch them every year in case we forget, so that they at least open all their specs.

So that is the project and the technical background of it. I will give a small outlook on what we are trying to achieve in the near and midterm future. The midterm goal is mainly to extend our community and to extend our reach, also in terms of countries. We only have an office in Zurich now, where a core group dedicates its time to digitizing, and that is of course focused on the records we have here in central Europe. So we are collaborating with a project in Uruguay. They don't have the recording equipment yet, but they are gaining people and building up the needed information about authors from Latin America, so they will already know the copyright status of the material when they start work on digitization.
That is a really good thing they are doing there.

Of course we still want an archive with a controlled climate for our records. At the moment we move them every half year or year, because we get a new sponsor or need a new room, and that is quite a bad situation for the archive.

The more short-term goals are of course to get funding. It is a volunteer, community-based project with fixed costs every month, so we always have to try to get money for that. Then we also have a lot of technical stuff that is not solved yet. As I said, we are using MediaWiki. So how do you work in a MediaWiki? It is all done by hand. You have to write your own links; nothing is automated at the moment. This has to change, of course, for quality reasons and because nobody has time to do all the manual labor. All of us would like to have a workflow that is as automated as we can make it.

On the server back-end side, too, it is at the moment only a simple file storage without any protection measures against long-term failures like bit rot, or people changing files by accident, and so on. So we need to do a lot of groundwork in the background to get a fault-tolerant storage system. As I'm the technical guy and I haven't found any people to fix these last two problems for me, I have to do it myself. So during this year
I'm more or less working full-time on this kind of stuff in our project.

The other part is to reach out more from our project, so that our files and the metadata we generate are more usable for other projects and by other means. We have to improve our output from the MediaWiki, which at the moment is optimized to be read by humans, so that it is also readable by machines. Then search engines can find our stuff better, and we can include the metadata we generate in free music databases like MusicBrainz, for example. It is very important to bring the research we do into open databases. I talked to the MusicBrainz guys some weeks ago and asked whether they are interested in recording information on these old records, and they said: yes, for sure, but we only have a few users who have access to these records. So we are very happy to get that information in.

The other example is Europeana. It is a great project, a meta library of all the European cultural content. A lot of museums and archives are feeding their catalogs into Europeana, so you have one platform where you can search for interesting paintings and musical recordings from all over Europe. It would be really great for us to be included in this huge catalog as well.

That brings me more or less to the end of the talk. Of course, we are a volunteer-based project.
We need a lot of volunteers, like a lot of the projects in this room, I guess. I would like to thank our partners, who have supported us for several years now in different areas: technical, financial, and also political areas, where we also need help to protect the rules at least at the level where they are now, or hopefully to get them a bit better, but at least to defend the line where we sit now in the copyright discussion. Then I thank our sponsors, who make this possible. That is mainly what I wanted to explain about the project, and hopefully we still have some time for questions, also despite the technical problems at the beginning.

You mentioned the image discs, shellac image discs or video discs or something like this?

Picture discs, yes. That is when you have a record that is not black, but has a nice picture beneath the surface of the record.

So how does this project relate to archive.org, who also publish a lot of public domain content?

Archive.org has a lot of similarities to what we are doing. We try to work together with them, because they share a lot of the same ideas, so why not go in the same direction together? What I saw on archive.org that is a bit problematic, in my opinion, is that they don't have any means of quality control or a review process at the beginning. So you can find a lot of files in MP3 format on archive.org, which in my opinion is very problematic. Then you have a lot of files with only file names as information and no entries in the metadata of the file. So there are more technical points I have to discuss with the people from archive.org directly. But I'm really happy about what they are doing. They also have their own digitizing equipment and offices for that, and they are training people, and that is really great to see.
I'd be interested in something; I don't know the English terms for it. There is the Urheberrecht and the Leistungsschutzrecht. For instance, if the author of a work has been dead for 70 years or so and it is copyright free, what about a recording that was made recently by someone who actually paid for it, so that the Leistungsschutzrechte then apply?

Then it is still protected, in the sense that you are not allowed to use it for any commercial use. That means you are not allowed to remix it, and you are not allowed to stream it in an online radio without a license.

So do you have to differentiate between these two a lot, or is it just...

Especially with classical music it is the normal daily case that you have to know all of these things. I mean, Beethoven is very old and very famous, and we can only provide online the recordings that were made before 1966, in Switzerland. In the European Union, and that means on Wikimedia Commons, we can only provide them 70 years after the first release, so you have to go back even 20 years earlier.

So is it the same time range for the copyright and the Leistungsschutzrechte?

No, because the neighbouring rights start with the first publication, and the time frame for the copyright starts with the death of the longest-living author. The case can be very different for different examples.

In the world map you showed there was a color for a country which has no copyright. Which one is this?

Yeah, let's see. There it is. I saw it: no copyright, and unknown.

Yeah, but that should be ... I think that's just the coast.

Yeah, I looked that up once, and no, this is bigger.

Yeah, that's also possible. But there you have your example, up there, east of Afghanistan. Can you see it? There you have your example.

Close to an unknown one, though. In Afghanistan it's unknown.
It was unknown when this map was made. It is already some years old, from around 2010 or so, I think. I don't know whether anything has changed in Afghanistan in the area of copyright, or whether they are still trying to fix all the more important problems there.

You mentioned that you first do the physical work of digitizing and all that, and then you start researching the copyright. Why? I would expect it the other way around.

At the moment we have no catalog of our collection, so at the moment it really is taking out a box, opening it, and looking at what's inside. My good friend who is doing all the historical research looks through it and selects highly probable candidates to digitize, so he uses his acquired knowledge to select which ones to digitize. But it is possible that in the detailed research you find out that there is an additional lyricist on this specific track, and that this one makes it non-public-domain. So we have cases where the front side is on our servers and the back side we have to keep at our place for another ten years or so.

We also do it this way because the copyright investigation is very time-consuming. For digitizing, I don't care about building robots that put records on the record cleaning machine and afterwards directly onto a laser player, and things like that. I get this question quite often: why don't I build robots and automate this kind of thing for this huge amount of 50,000 records? I really understand the question, but in the end digitizing a record takes half an hour or so, including cleaning both sides and all the work in Audacity. But you are really lucky if you are done with your copyright investigation after half an hour. So that should be the place to automate stuff.

Speaking up. Are there any databases that keep track of, well, the things that are unknown to you? Do you have like a really
good go-to place for yourself, basically, if you try to digitize a record and you think: okay, I have to find out if this is under some sort of copyright in some country, or my country, or whatever? Is there something like MusicBrainz, for instance, for this kind of stuff?

There are several places you can go, most of them academic. My friend is heavily using the CHARM project of a university in the UK. It catalogs their own collection, and this collection is so huge that it has a lot of interesting information in it for us. Of course it has gaps for our use cases, so where we can, we use Wikipedia plus the other references that are used on Wikipedia. We also have a big shelf of dead wood, to have a known library of reference books from this time. There are modern collections like Die Musik in Geschichte und Gegenwart, the music encyclopedia in 20 books; we have that, for example. But we also have very specialized books on label XYZ that existed only up to 1940 and was bought afterwards, with a catalog of all the matrix numbers that were released by the label, including the release date. The first release date is a lot of the time the most complex information to get, because it is not written on the record and not a lot of people are interested in it. Most of the time these books with lists of matrix numbers are bought by archivists and collectors, collectors who would like to know: okay, I have these numbers, this one is missing, I have to search for this one.

So these books are very helpful for us, but they are also a bit of a problem, because we need money to buy these specialized books, which are sometimes out of print and sometimes very expensive. There are digital versions of these encyclopedias, but licenses for them are very expensive. But that is the other story.
As I told you, we try to make our structured data usable for other projects. We have these books now, so we should try to reuse that information in a more open, more usable sense for developers and academics.

Isn't there copyright on those books as well?

On the books you have copyright, but you don't have copyright on facts. So the parts of the books that are just facts we are free to reuse, at least in Switzerland. I have to limit this statement here in Germany, because here in Germany you have a so-called database right. But don't ask me for details about that; we don't have that in Switzerland.

Thank you.