Hi, everyone. I'm Gordon Haff, technology evangelist at Red Hat. Open source software has clearly been a great success and has had a significant impact in so many ways. However, openness and related concepts aren't limited to software. They've also played out in areas such as data and hardware. Now, I would argue that there's been some pretty significant success in some of those areas, while at the same time, we run into some barriers that don't apply to software to nearly the same degree. So in this talk, I'm going to take you on a whirlwind tour through what some of those different areas and their barriers are.

Let's start by talking about data. And I'm going to start this discussion by taking us back to 2008, to an article written by ex-Wired magazine editor-in-chief Chris Anderson. In this article, which was probably deliberately provocative at the time, Anderson argued that we don't really need theory. We don't really need models behind data. If we have enough data, the data itself will effectively talk to us. Now, as I say, I think Anderson was being overly provocative at the time, and he probably knew it. At the same time, though, when we look at what's happened with machine learning over the roughly 15 years since this was written, I think we can see that there are many use cases where having access to a lot of different data sets, whether or not we have some sort of causal model behind that data, can be very useful.

Now, what's available today? Well, among other things, the US government, just to give one example, has opened up a huge number of data sets. Google has some public data sets available, which can be accessed with their BigQuery engine. You also have some data which is not necessarily fully open for reasons of privacy or confidentiality.
Something like Harvard Dataverse is only open to qualified researchers, for example, and a full discussion of anonymization and privacy is sort of beyond the scope of this talk. I do have one or two other videos up on YouTube if you want to go into more detail. I will touch on it a little bit, though. And then finally, if we look at something like OpenStreetMap, for example, this is a public data set, community created, and I think it's fair to say that for certain things like hiking trails, it's better than what Google Maps puts out there.

This is an example from the US Geological Survey, USGS. Among many other things, they track river levels within the US. You used to have to do this really painful screen scraping to get at this data, but now they have pretty good, if somewhat arcane, access through downloading JSON files and so forth. So these are the kinds of things that are available. And it's very possible, certainly if you live in a city of any size, that there's going to be a bunch of local data about things like crime stats and health code violations and that kind of thing, which you can download. A very common thing is to stick them on a map, but you can also do other kinds of manipulation.

One of the challenges with all this data is that, well, data may not actually lead to action. One of the old data warehousing stories was basically that when guys went to this pharmacy after work on Friday to pick up diapers, they'd pick up a six-pack of beer while they were there. So maybe those should be placed close to each other. And it turns out the real story there was that this correlation actually did show up in the data analysis, but the store never actually did anything about it. And that's a pretty common problem. There are also a lot of issues with machine learning bias. Again, I'm not going to get into those here, but it's very easy to train your models on bad practices. And anonymization is basically hard.
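
As a rough illustration of why anonymization is hard, here is a minimal sketch of what is often called a linkage attack: joining a "de-identified" table to a public one on a few shared fields such as date of birth, gender, and ZIP code. Every record, field name, and the `link_records` helper below is hypothetical and made up for this sketch; it is not taken from any real data set or from the talk's slides.

```python
from collections import defaultdict

# All records below are invented for illustration only.
# "De-identified" hospital visits: names removed, three fields left in.
hospital_visits = [
    {"dob": "1961-07-14", "gender": "F", "zip": "02138", "diagnosis": "asthma"},
    {"dob": "1975-03-02", "gender": "M", "zip": "02139", "diagnosis": "fracture"},
    {"dob": "1988-11-30", "gender": "F", "zip": "02140", "diagnosis": "migraine"},
]

# Public voter roll: names plus the same three fields.
voter_roll = [
    {"name": "Alice Example", "dob": "1961-07-14", "gender": "F", "zip": "02138"},
    {"name": "Bob Example", "dob": "1975-03-02", "gender": "M", "zip": "02139"},
    {"name": "Carol Example", "dob": "1988-11-30", "gender": "F", "zip": "02140"},
    {"name": "Dana Example", "dob": "1988-11-30", "gender": "F", "zip": "02140"},
]

# The fields shared by both tables (hypothetical names).
QUASI_IDENTIFIERS = ("dob", "gender", "zip")


def link_records(visits, voters):
    """Return (name, diagnosis) pairs for visits that match exactly one voter."""
    # Index voter names by their (dob, gender, zip) combination.
    names_by_key = defaultdict(list)
    for voter in voters:
        key = tuple(voter[field] for field in QUASI_IDENTIFIERS)
        names_by_key[key].append(voter["name"])

    reidentified = []
    for visit in visits:
        key = tuple(visit[field] for field in QUASI_IDENTIFIERS)
        matches = names_by_key.get(key, [])
        # Only a unique match pins the record to a single person.
        if len(matches) == 1:
            reidentified.append((matches[0], visit["diagnosis"]))
    return reidentified


if __name__ == "__main__":
    for name, diagnosis in link_records(hospital_visits, voter_roll):
        print(f"{name}: {diagnosis}")
```

In this toy data, the first two visits each match a single voter and are re-identified, while the third matches two voters and stays ambiguous. How often real records are unique on these fields is an empirical question that the sketch does not answer.
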
Some of this data is going to be sensitive, and it's far easier to de-anonymize than you probably think. The reason, and this is one example, is that you have these public voter registration records, and you have certain specific public information there: date of birth, gender, ZIP code. You can correlate that to something like hospital visits about 90% of the time, with just those three items.

One of the things that the US Census, for example, brought in with the 2020 Census is this technique called differential privacy. And again, I'm not going to go into a lot of detail here, but you inject random noise in a mathematically rigorous way. And that's a good way to basically create a privacy budget and set limits on what the data reveals. The technique is not perfect. But these are some of the things we're doing around open data sets, so that we can sort of balance these very legitimate privacy concerns against the ability to use data.

Let's switch gears to hardware. Many of you from the US are probably familiar with Radio Shack, which was a source of do-it-yourself electronic hobbyist products going back to the ham radio days. The story of Radio Shack is a story of two separate companies that eventually came together to form the store we know today. Now, moving on from the Radio Shack of a few decades ago, open hardware for individuals, whether those are consumers or inventors of some sort, has gone in a number of different directions. In the lower left, there's a lot of interest in 3D printers, for example, in terms of creating prosthetics, which are normally extremely expensive. And they wear out, and they don't necessarily fit well. So there's a lot of interest in 3D printers here. 3D printers as a whole, I think, have arguably been a little bit of a disappointment. With the materials, it turns out you can't actually replace a lot of things with the sort of plastics that are used in 3D printers, but there continues to be progress made there.
And again, for people who would normally be using little machine shops and so forth, 3D printers have been a pretty effective replacement for certain things.

Another area that I think has been really interesting over the last 10-plus years is, in the upper middle there, the Arduino, which is an open source microcontroller. Something else many of you are probably familiar with is Raspberry Pi. Raspberry Pi itself is not open hardware, but there are obviously tons of schematics and so forth for connecting other hardware to a Raspberry Pi, and Raspberry Pi also runs Linux. So you have an open source operating system there. One of the reasons I think these devices are interesting is that electronic hobbyists used to work with discrete components or very simple integrated circuits. And what really happened over time was that the sort of stuff you could build from the ground up kind of became very complex. Heathkit wasn't really that interesting any longer. But what we see here in the upper left, for example, is an open source hardware project for an instrument called a theremin. If you've never heard of it, it makes that kind of eerie, high-pitched sound that you might have heard in a science fiction film at some point. In this case, it's a circuit board with open schematics that interfaces with the Arduino. And basically, by having all the computational power of an Arduino, they're able to have an instrument which you can build yourself at relatively low cost. Really, it's very low cost. And it still gives very high quality.

Now, we switch gears to open hardware for what I would call large scale needs. There have been efforts here for a while. You can go back to OpenSPARC, for example, in the Sun Microsystems days. RISC-V is a very interesting project here in terms of having an open source RISC processing chip. You've also got OpenPOWER. And Automotive Grade Linux is not specifically hardware.
It's a Linux project, but obviously in the automotive space, the Linux operating system and the hardware in the actual car are closely tied together. I'd argue open hardware has so far not made as big an impact here as in the hobbyist and consumer space. You need a lot of broad cooperation among big industry players, you need capital, and you need to overcome legacy and the inertia of how things are being done currently. But I'd say there are still promising efforts in this area.

Now, the last two things: knowledge and education. And maybe you're going, well, aren't those kind of the same thing? I'd argue they're related, but they're not the same, for important reasons. Wikipedia is probably one of the best known examples here. You have the Internet Archive. You have Project Gutenberg, which really predates all of this newfangled web stuff. And Web 2.0, this whole idea that you had a read-write web and individuals could publish content for the benefit of everyone. And then, of course, we've also had libraries around forever. I don't want to get too precise about what open means in all these contexts. But I think there are many areas, whether they're pure, true open source or whether they're kind of open source adjacent or open source overlapping, that really have opened up and democratized knowledge to a significant degree.

Now, I kind of draw a distinction with education because education almost inherently, or at least historically, has had this kind of certification aspect to it. The first thing we really saw in the open source education space was MIT OpenCourseWare. And again, I don't know if they were really the first, but they were certainly early on. Now, you didn't really have this certification thing here, because MIT OpenCourseWare initially was really pitched as raw materials for a teacher to assemble a course using some of the same curriculum and notes and so forth that were used in MIT classes.
And one also suspects that when MIT president Charles Vest initially pitched this concept, it was probably easier to say, hey, no, this isn't a class at all. No, they're not getting an MIT education for free, they're just getting some of the raw materials that can be used. And at the time, there was very limited audio and video anyway, so it really was, you know, a pile of lecture notes and maybe some external reading to do, because there just wasn't a lot of cheap video recording, for example, available at the time. And a bunch of other schools started doing something similar.

Well, move along, and a lot of people were going, yeah, this is great, but, you know, I don't want to put together a course myself, I want to take a course. And this is what these MOOCs promised: the idea of very large classes enabled by online lectures and multiple choice exams and maybe automated program graders and that kind of thing. So now we had courses in a box. Now, one thing I'll say is, first of all, these are at best free as in beer. These were not open source. There were some open discussion boards in these classes, which, frankly, mostly didn't work very well. And there was a lot of hype about MOOCs. Lots of people would sign up. Lots of people didn't finish the course. There's certainly good material out there. And I'm certainly not trying to bad-mouth MOOCs in any way, but I think what we can say is they've generally failed to meet their promises. They certainly didn't meet the hopes of the VCs in many cases. A number of these are VC backed. edX is a non-profit; it was originally started by MIT and Harvard. And, you know, some of them have tried to put in certification programs in various ways. Others have really just nakedly pivoted to being essentially professional education for young to middle-aged or older folks in the workplace.

So where does that leave us? So, a few things I take away from here. Glass half empty, glass half full, depends on your perspective.
I tend to like to be optimistic. There are obviously a ton of data sets out there. They're being used for useful things. But there are some sharing challenges, particularly when we're talking about data at an individual sort of level, such as we might well use for things like health care. And secondly, I see a lot of organic stuff here. You know, I think for some of these big efforts, like the VC-backed MOOCs, for example, or the enterprise-level open hardware, it's been harder going, whereas I think there have been a lot of grassroots types of things, like Wikipedia, like hacker spaces around Arduinos and Raspberry Pis, that have really been much more successful in general. And that's probably kind of true of a lot of open source software as well, although, of course, there are big open source projects out there. So with that, thank you for your time. And I hope this has been at least mildly interesting.

Thank you very much, Gordon, for this interesting open data talk. I think I will post the link to the breakout room, and people can meet you in that room. So I'll share the link in the chat box over here. It's really a great presentation. We don't have any questions at the moment in the Q&A section, so we will be having a break for one hour. After that, we'll have another talk lined up at 1 PM Eastern time about unlocking the potential of openness through diversity, by Kate and Arnando... Arnaldo, sorry for the pronunciation. And thank you very much for joining us again.