Xian is a computational linguist and a member of Fair Soldering. Sebastian Jekutsch has been active in the field of fair electronics for many years, and Lara Pfennig-Schmidt is currently writing her master's thesis and works on Fairtronics as a job. They will give this talk about how to rate the sustainability of electronics and how public data can be used to find the material footprint of an electronic device. I am very glad about this. The stage is yours.
Yeah, thank you. I, Seba, will begin. Thank you to the tech team and for the introduction. That was not that easy. Welcome, everyone. We will introduce you to Fairtronics, which is a software tool for analyzing concrete electronic hardware, and specifically where the resources in it come from. And we do that because of this here. These are kids working in the Democratic Republic of Congo. They are breaking down these rocks. This is cobalt ore. Amnesty International showed a couple of years ago that for a lot of the cobalt used in the batteries we use, mostly car batteries but also smartphones and such, there is child labor involved in the supply chain. Another example: in Papua New Guinea, this woman's home burnt down because money was found there. Well, not really money, but gold, which a big company that made a contract for the area will now mine. So the residents have to be evicted, which is also a classic issue with resources for electronics. These are just two examples of why it might be interesting for an electronics developer to look into where their resources are coming from.
That's why we invented Fairtronics: we want to do this for concrete devices. This is a specific mouse with specific ingredients and parts. Based on the data we have, some estimation, and a bunch of sources, maybe we can find out how much child labor and eviction is involved. And all of that can go into a report.
Maybe a bit more concretely. We want to show with one specific example how we get to our conclusions and how we use the data that's available. So to be clear: we start with a device. We need to know what is in the device; the manufacturer knows that, or maybe we can disassemble it. Then we have to find out what materials are used: plastics, metals, ceramics and so on. Those in turn are made from raw resources like cobalt ore or crude oil. These all come from specific countries, and for each country there are specific risks involved. But we want to look closer at the step that's circled in red here, which is the transition from parts to materials.
The interesting thing is that some manufacturers actually tell us what is in their parts. This, for example, is a little resistor from the manufacturer Bourns, which is a US manufacturer. It tells us that this thing weighs 2.24 milligrams in total. You can see here at my mouse pointer which ingredients, which materials, are in there. I don't have to go into everything. And we can see the percentage by mass. This is for one manufacturer and one family of resistors, an entire family of metal film resistors. There are also some other resistors from this manufacturer.
Now our problem is that Bourns is not that important a resistor manufacturer, just one of the ones that actually publish this data. There are a lot of other manufacturers, maybe also in the mouse we discussed earlier, and we can't directly use the data from this resistor for their parts. Some manufacturers do publish these full material declarations, FMD for short, but this link from part to materials often doesn't exist. So we're using a trick: if we have a specific part for which the materials aren't published, we can still use the properties of the part. Here is an example. Vishay, which is a popular manufacturer, has a Zener diode here, a standard part.
And there's a datasheet here. Datasheets are published everywhere in order to sell the part, but they don't list what materials are used. These data can even be queried electronically, on the right here, for example, using Octopart, which holds this data for a huge number of parts. For example, the package is listed, and the weight, although the values differ a bit and exist in different grades of quality. We can now use these part properties to guess the materials, on the basis of the information that we have from other manufacturers. So the material data exists, but only from a few manufacturers. That's what is circled in green here. The part properties are the other thing we have, the other green circle. And now we have to deduce the part circled in red from that.
All right, I will get into the technology a bit more now: how we can figure out the material composition of a new part. We need some sort of data basis to work from, meaning parts where we already know the material composition. We did that with crawling and assembled a database. There are a few manufacturers that actually publish these FMDs, and you can download them by hand or scrape them automatically. We focused on two manufacturers, NXP Semiconductors and TE Connectivity. NXP publishes FMDs for CPUs, diodes and other semiconductors, while TE mostly makes plugs, cable systems and things like that, which is very different. The crawling is mostly done with Scrapy, which is a Python library, and it ran on an external server. The programming effort for this kind of crawler is not that big if you use a framework like this, so it is easily expandable to more manufacturers, but for now we stuck with those two. It can take a lot of runtime, though, especially if you don't want to overwhelm the manufacturers' servers. We had about a week of runtime for some of them. But we are very happy with the data we got.
In total that is 200,000 XML files and also 7,500 FMDs in PDF format, which we couldn't scrape that easily.
Okay, so now let's look at what this kind of FMD looks like. For each FMD we have a part, which might be a resistor or a chip or a capacitor. The part may consist of further parts, sub-parts. These sub-parts are made of homogeneous materials, like metal alloys such as bronze or solder, or plastics like nylon. And those in turn consist of chemical substances like gold and copper, maybe salt and water and plastics. These substances usually have a number associated with them. When we use these identifiers, we get about a thousand different ingredients, and there may be others that don't have a number assigned. For the homogeneous materials we get about 8,000 different ones. And then there are about 20,000 parts that we have.
So what do we do with this now? 20,000 parts is a lot. When we want to compare them and maybe find out which part fits best, we don't want to look at all 20,000 parts. That's why we tried clustering to reduce the data. Maybe we can group parts from one manufacturer together, because two resistors from the same manufacturer are very similar, so maybe we only need to look at one of them. So we tried to cluster these parts based on their material composition. We did that with scikit-learn. I'm not quite there yet. Wait. We mostly used the k-means algorithm from this library. We ran that on a server that has a GPU, with an interface where we could use GPU algorithms and data structures for the computation. And then we did the clustering based on the material composition.
Okay, now we can go one step further. The last thing that's missing is the comparison between two parts. For that we want to build a similarity heuristic, and so we now have to look at the properties of these parts.
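The FMD hierarchy described above (part, sub-parts, homogeneous materials, substances) can be sketched as a small data model. This is only an illustration, not the speakers' actual schema: the class names, the example part and its masses are hypothetical, and the substance identifiers are assumed here to be CAS-style registry numbers, which the talk does not state explicitly. The `composition` function flattens the hierarchy into mass fractions per substance, the kind of vector a clustering by material composition could work on.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Substance:
    name: str
    mass_mg: float
    identifier: Optional[str] = None  # assumed: a registry number; may be missing


@dataclass
class Material:  # a homogeneous material such as an alloy or a plastic
    name: str
    substances: list = field(default_factory=list)


@dataclass
class SubPart:
    name: str
    materials: list = field(default_factory=list)


@dataclass
class Part:
    part_number: str
    sub_parts: list = field(default_factory=list)


def composition(part):
    """Return {substance key: mass fraction} aggregated over the whole part.

    Substances are keyed by identifier when one exists, otherwise by name.
    """
    totals = defaultdict(float)
    for sub_part in part.sub_parts:
        for material in sub_part.materials:
            for substance in material.substances:
                key = substance.identifier or substance.name
                totals[key] += substance.mass_mg
    total_mass = sum(totals.values())
    if total_mass == 0:
        return {}
    return {key: mass / total_mass for key, mass in totals.items()}


# Hypothetical example part; all masses are made up for illustration.
example_part = Part(
    part_number="EXAMPLE-RESISTOR",
    sub_parts=[
        SubPart("body", [
            Material("ceramic", [Substance("aluminium oxide", 1.0, "1344-28-1")]),
            Material("coating", [Substance("proprietary coating", 0.5)]),
        ]),
        SubPart("terminals", [
            Material("plating", [
                Substance("copper", 0.25, "7440-50-8"),
                Substance("tin", 0.25, "7440-31-5"),
            ]),
        ]),
    ],
)

example_composition = composition(example_part)
```

Keying by identifier first means the same substance is merged across sub-parts even when manufacturers spell its name differently, while substances without a number still show up under their name.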
For this we have the Octopart API, which we can query with a part number or name. We get an answer with all of the technical properties of the part, plus textual descriptions, images, datasheets and some manufacturer information. Of the 20,000 parts, about 3,000 didn't have any technical data associated, but only 250 didn't have a description. And the description often contains a lot of information, for example the resistance of a resistor. So the descriptions are an alternative source of information that could be used to extract even more of the technical data. But for now we stuck to the technical properties that are listed in the database. There are lots of different ones and they are very broadly distributed.
So now we want to look at these properties. There are four different types: numerical values, numerical values with units, categorical values, and lists of categorical values. The numerical values are plain numbers, like the number of pins or connections. For these attributes we want a similarity value between 0 and 1, so we can use a ratio: we take the smaller value divided by the larger one. Numerical values with units are a bit more difficult. That's where we have things like megahertz or kilohertz for frequencies, or voltages, or maybe even negative temperatures. These have to be cleaned up. We have to separate the units, convert between different units to make the values comparable, and we don't want negative values, so we have to shift them into the positive range. But then we can do the same thing to get similarity values between 0 and 1. The categorical values are things like: how is this plug built? What gender does the plug have, for example? Is a part compliant or not compliant with a certain standard? And other regulatory information.
For those we can just check for equality. If they're equal it's 1, and if they're not equal it's 0. Then there are properties with multiple categorical values, often things like which connection types or storage types a part has. Some have Ethernet or HDMI or USB or whatever. There we can take the intersection of the two sets and divide its size by the size of their union. So for every property we now have a similarity value from 0 to 1. We can take the average of all of these similarity values, and we still have a value between 0 and 1 that tells us how similar the two parts are.
So now the only question is how many of the parts in our collection we compare against. We could do this with all of our parts, a complete comparison. But this would take a lot of time and a lot of computation, and hopefully it won't stay at 20,000 parts, so this would get impractical very quickly. This is basically just an upper bound, to see what kinds of comparison strategies there might be. The lower bound is a random choice, where we just pick one other part and compare to that. It couldn't really get any easier. Ideally we want something in between these two bounds. So we do sub-sampling: we look at a subset of the parts, see which of them best fits the properties we need, and then use that part for the material composition. As an alternative we can look at the centers of each cluster, because we know that those parts represent their cluster very well, so we can reduce the search space quite well this way.
There is a nice graphic for this. On the x-axis we can see the number of comparisons that were made, and on the y-axis what percentage of the materials were guessed correctly. On the bottom left we can see the random choice, which only makes one comparison, but we extended the line to make it easier to see. It guesses around 30% correctly, which is not very usable.
A complete comparison gets around 75%, which is pretty good, but it also takes a lot of time. The sub-sampling and the clustering lie somewhere in between. We can also see on this slide that all of the methods are better than the random comparison, which is very good. We also see that we already get a lot of the quality with only about a third of the comparisons, around 6,000 of the 20,000 parts we have, which is also very nice. But we also see that the clustering isn't that great, because it's below the sub-sampling. This can mean that the cluster centers don't represent the technical properties of the cluster very well, only the material composition. But it can also mean that the sub-sampling just got better data by chance. So this is something where we can go deeper into the analysis. Or we can just say that we will use the sub-sampling approach because it's less complicated than finding a better clustering.
Okay, so this is basically what we did. Let's put it all together. For a new part we want to figure out what the material composition is, so we can do the social analysis of the part. We send a request for the part to Octopart and get its technical properties. Using these properties we can then look into our database of parts with already known material compositions and compare the properties. That way we find the part with the most similar, best-fitting technical properties. Now, this is just an assumption. It can contain a lot of errors, because we're guessing a lot. Ideally there would be another way, which is that the part manufacturer simply publishes the material composition. Then we wouldn't have to take this detour, the results would be much more accurate, and we could focus on the social analysis of the part. So that's all for the technical part.
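The similarity heuristic and the sub-sampled lookup from the technical part can be sketched as follows. This is a minimal sketch under stated assumptions, not the speakers' implementation: the function names, the way properties are encoded (plain number, `(value, unit)` tuple, string, or set), the small unit table, the choice to shift temperatures into the positive range by converting to kelvin, and the decision to average only over properties both parts share are all hypothetical. The example parts and their values are made up.

```python
import random

# Assumed unit table: unit -> (base unit, factor, offset). Hypothetical and tiny.
UNIT_TO_BASE = {
    "Hz": ("Hz", 1.0, 0.0), "kHz": ("Hz", 1e3, 0.0), "MHz": ("Hz", 1e6, 0.0),
    "mV": ("V", 1e-3, 0.0), "V": ("V", 1.0, 0.0),
    "°C": ("K", 1.0, 273.15),  # shifts negative temperatures into the positive range
}


def to_base(value, unit):
    """Convert to a base unit; raises KeyError for units missing from the table."""
    base, factor, offset = UNIT_TO_BASE[unit]
    return value * factor + offset, base


def ratio(a, b):
    """Smaller divided by larger, for non-negative numbers."""
    if a == b:
        return 1.0
    if a < 0 or b < 0:
        raise ValueError("values must be shifted into the non-negative range first")
    return min(a, b) / max(a, b)


def property_similarity(x, y):
    """Similarity in [0, 1] for one property, dispatched on the value type."""
    if isinstance(x, (set, frozenset)) and isinstance(y, (set, frozenset)):
        union = x | y
        return len(x & y) / len(union) if union else 1.0  # intersection over union
    if isinstance(x, tuple) and isinstance(y, tuple):  # (value, unit)
        (vx, base_x), (vy, base_y) = to_base(*x), to_base(*y)
        return ratio(vx, vy) if base_x == base_y else 0.0
    if isinstance(x, (int, float)) and isinstance(y, (int, float)):
        return ratio(x, y)
    return 1.0 if x == y else 0.0  # categorical: equal or not


def part_similarity(props_a, props_b):
    """Average over shared properties (an assumption; the talk does not say)."""
    shared = props_a.keys() & props_b.keys()
    if not shared:
        return 0.0
    return sum(property_similarity(props_a[k], props_b[k]) for k in shared) / len(shared)


def best_match(query_props, known_parts, sample_size=None, rng=None):
    """Return (part_number, record) of the most similar known part.

    With sample_size set, only a random subset is compared (sub-sampling);
    without it, every known part is compared (complete comparison).
    """
    candidates = list(known_parts.items())
    if sample_size is not None and sample_size < len(candidates):
        candidates = (rng or random).sample(candidates, sample_size)
    return max(candidates, key=lambda item: part_similarity(query_props, item[1]["properties"]))


# Hypothetical database of parts with known material composition.
known_parts = {
    "OSC-1": {"properties": {"pins": 2, "frequency": (2, "MHz"), "mount": "SMD"},
              "composition": {"copper": 0.6, "silicon": 0.4}},
    "CON-7": {"properties": {"pins": 8, "interfaces": {"USB", "Ethernet"}, "mount": "THT"},
              "composition": {"copper": 0.3, "nylon": 0.7}},
}
query = {"pins": 2, "frequency": (2000, "kHz"), "mount": "SMD"}
match_id, match_record = best_match(query, known_parts)
guessed_composition = match_record["composition"]
```

Setting `sample_size` to 1 corresponds to the random-choice lower bound from the talk, and leaving it unset to the complete comparison; a cluster-center strategy would pass only the center parts as `known_parts`.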
Yeah, okay, and that's where I will take over for the general part at the end. The idea of Fairtronics is that we put a list of parts in and get a risk assessment out. This is based on a model that uses... and that captures the risks of each part. That's the view we had earlier. So what do we do with this model? The original thought was that Fairtronics should be a tool for designing electronics, for people who are thinking about how fair their device might be in the end. You can choose the right parts to make the device as fair as possible, and determine the impacts of these parts already during the design.
The second thing is that Fairtronics also wants to do some education. In education especially, it's great to have model devices that can be played around with, where you can swap the parts in the assembly and see how the results change with the use of different resources, for example if I pay attention to certain certificates on parts, or if regulations change. This is a big thing we want to do.
Beyond that, in a very big context, Fairtronics is part of the big journey of human society. There's an example here: a humanity-wide project that wants to ensure the well-being of people around the world. This is a big transformation that, in our opinion, can only work if it's based on data. Sustainability-relevant decisions shouldn't be made, and aren't made, in just one place. They are made in lots and lots of heads, in tens of thousands of organizations and governments. All of these people need access to the data that lets them make these decisions for themselves with regard to sustainability goals. This goes for big corporations as well as for individual hackers in their hackerspace, and everything in between. And if one looks at the state of this kind of sustainability data,
then there are a lot of problems that one often sees elsewhere as well. Lots of things are only in PDFs and are hard to extract. The tools you need for the data are often proprietary. The databases where this data should be listed are licensed and expensive. And some data isn't accessible at all because it is a company secret, and one has to sign NDAs or the like saying that you can't publish it. That of course hinders us from actually using this data for the public good, and it results in this data only being available to the people who have the expertise, the structures and the money for licenses and such. One thing we would like to leave everyone with is to think about this topic of sustainability data. This is a very hackable topic, and it really needs that. Keep the open data idea in mind, and also a term like sustainability data, which I hadn't really found on Google before doing this. And maybe get involved in this kind of transformation. So thank you for your attention, and I'm very curious about the questions.
Yeah. Thank you very, very much for this very important talk, for all of the effort you put in, and for all of the background knowledge that is needed to actually look behind the curtain there. The open sustainability term is something I learned. The questions that came up are of course about the awareness that our, let's say, entertainment devices leave a footprint in the world. And so there's the general question in consumers' heads of how we can act and what we are actually doing. One of the questions here is whether there is some sort of certification that consumers can use to orient themselves in the market. What's your opinion on that? Who wants to answer?
Okay. As we've seen, this is a very complex field. There is no product that's 100 percent fair. With bananas or chocolate or coffee there are products that are certified.
I like the idea of doing this with electronics, but there are 160 or more components that go into a device. Of course it should be tried. I don't want to give up. But it's not that simple with a seal. There are some seals that go for the least evil, or the least bad. There's the Blue Angel, for example, which also includes social aspects. And then there are environmental seals. But we are about social sustainability, and there's not much for that. The mouse that we showed does not have a seal; it's not certified. And of course these certifications are quite expensive. You might know Fairphone as well. They publish a lot on their website about what they are doing. I think that's much more believable. But we have to know this, we have to research a lot. That is the situation in electronics.
Another question is about the clustering algorithm that you used. In this question I find maybe a little bit of hope. Is this clustering algorithm also applicable to different areas? Or did I understand that wrong?
Well, it depends on the other area, but generally yes.
Is it used in different areas at the moment? Maybe a bit far-fetched, but for example for coffee, bananas and maybe cocoa, or could it also be used in other material sciences?
Well, maybe just to provide context, because we didn't go into depth on how it works: it's basically a standard algorithm. It's nothing complicated; you can read about it on Wikipedia. It's used in a lot of cases where you have data in which you think there are hidden clusters, or where some data points are closer to each other than others and you want to group them. But in itself it's nothing that has a certain goal or target. It's just a tool for working with data. Maybe that helps you imagine it.
Okay, there's another question here. You said in the beginning that only one of the manufacturers really published this kind of data.
So why is there only one manufacturer, and could there be others? Maybe they can be convinced to also show all of their material data.
Yeah, that was just about those SMD resistors. There may well be others. There are not that many producers, maybe a few thousand worldwide. When we analyzed this mouse, or a laptop, we noticed 60 producers or so, and not even 20% published their data. NXP, Schoenbelauer and TE Connectivity. The names are complicated. Some of them are quite relevant, but the big ones in particular mostly don't publish, and I wish that were different. And how can we push them? Well, they could be forced. They won't do it on their own. They won't volunteer the information, although they could. They have been asked for it for a long time, but they just don't. Those who do publish don't have anything to hide, or are not afraid. I don't know, maybe it just saves them work. But there are laws coming. Some laws require certain materials to be declared. If there is lead in there, you have to publish that. And at some point some say, well, then let's publish everything and we'll comply with all the laws. But many just say: there's no lead in there, and I won't tell you anything else. So we just have to keep making laws and put the right people in the government.
Well, maybe we need a new understanding of toxicity. It's not only about the lead itself, but also about where the lead is being mined and how certain raw materials are processed. This needs awareness, and we need a new awareness of toxicity. You've contributed to a better understanding. And as he said, there's not only the legal way: through your work there can also be a consumer force of mindful consumers. So I've covered all the questions. Now I can only thank you for your wonderful presentation. It was over far too quickly. I also want to thank all of those who listened, and I want to point you to the Q&A section as well. Thank you very much.