is taking a few steps towards making sure that everyone who's attending has a really great understanding of some aspects of generative AI, both from a technological perspective and in terms of how people are talking about it from a policy perspective, so that everyone can operate from a common framework. AI, as we describe it, is a set of technologies. It's a term that's been used for a long time to describe a whole bunch of different things, but roughly, these are technologies designed to emulate some elements of human intelligence. And when we think about generative AI, there are lots of different types of technologies. Recently, AI has trended more toward the machine learning approach, and within that machine learning umbrella there is a suite of tools and capabilities that have been developed for generative AI. What is generative AI? It is AI, this tool for emulating human intelligence or human behavior, that is really about producing content rather than processing other information. Producing content could mean producing text, images, audio. There are so many different things people are doing now with these generative tools, but ultimately it's a bucket term for a bunch of different approaches to creating tools that emulate people's ability to generate new content. With modern artificial intelligence, you can think of AI as a big box. I'm going to try some metaphors here. You can think of AI as a big box. The Pong paddle, the other Pong paddle, that still counts as AI. Or a little character running around in a video game, that's AI. And then you get into machine learning, which to some extent is where the data writes a little bit of the code. As it goes through the process, it's no longer just a decision tree; there's at least some probabilistic element that emerges as a function of employing the data.
And deep learning just takes that to another level of abstraction, and I'll explain what I mean by this. So there are two parts of artificial intelligence: training and inferencing. When you hear people talking about AI eating hundreds of millions of dollars, that's the training side, and honestly it's why you see the massive fundraising by the generative AI companies: there's a need for all these gigantic computers for the training aspect. The inferencing aspect can happen on a Raspberry Pi these days; people keep making the act of using a model in an individual instance more accessible and doable. The way you can think about this is to think about crossing the street. You have these factors about what makes you go or not. Some are obvious, like the crossing sign. Some are kind of random: you can feel the car coming, or there are strange edge cases that come up. And by the time you're, I don't know, 14, you don't want to do anything really stupid while trying to cross the street. I'm just assuming this from my old age and how stupid I was. That, right, is the training process: the processing of all that information over your life into some kind of abstract probability of when you should cross the street. And inferencing is the next time you walk up to the street corner. In terms of generative AI, I think I'll stick to talking about the specific types of AI that we develop at OpenAI. The analogy we like to make is that you can think of at least our tools as really fancy autocompletes. They're, in some ways, very similar to what happens when you're typing an email or a text message and a suggestion pops up. It just turns out that if you do that with a lot of data and a lot of compute, you get a lot of interesting capabilities.
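The "fancy autocomplete" analogy and the train-once, infer-cheaply split can be sketched with a toy next-word model. To be clear, this is a deliberately tiny illustration, not how GPT models actually work internally; the corpus and function names are made up for the example. "Training" processes all the data once into statistics, and "inference" is a cheap lookup:

```python
from collections import defaultdict, Counter

def train(corpus: str) -> dict:
    """Expensive step: go through all the data once, building counts
    of which word tends to follow which."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def infer(model: dict, prompt_word: str) -> str:
    """Cheap step: a single lookup of the most likely next word.
    This part really could run on a Raspberry Pi."""
    followers = model.get(prompt_word)
    if not followers:
        return "<unknown>"
    return followers.most_common(1)[0][0]

# Train once on a toy corpus, then infer as often as you like.
model = train("look both ways then cross the street . look both ways")
print(infer(model, "cross"))  # prints "the"
```

A real model learns billions of parameters instead of word-pair counts, but the cost asymmetry between the two steps is the same shape the panelists describe.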
But the reason we like to talk about it that way is that what's important is having enough of an understanding of how the technology works to know what it can and can't do, what it's good at and what it's not good at. So when you think about how the models are trained, where you take a lot of information, much of it from the Internet, and the model builds a sort of picture of the world by forming statistical relationships between the types of data it's exposed to, then you'll have a better understanding of why, for example, the models are prone to providing inaccurate information sometimes, and that can inform how you interact with them. I think that's a useful framework for at least beginning to think about the inner workings of generative AI, and then how that's directly relevant to how you think about it from a policy perspective or from a music perspective. Exploring expressing ideas in different ways, I think that's really exciting for me. That's probably the most useful thing, just because I'm a person who really likes to work in conversation with other people, and occasionally they're not available to give me feedback on my draft, so I can short-circuit that process a little bit. From the image and other media creation perspective, there's a really interesting frame on this work too. I know, and I'm sure we're going to talk a lot about, some of the concerns around IP and rights and all these things, but I also think it's really valuable to recognize that these are tools that allow people to fill in the blank: tools that allow people to create images, tools that allow people to generate text. For me, there have been many tools throughout history that people have worried about disrupting or devaluing the skill associated with a particular trade or task.
And I also see really interesting conversations to be had around accessibility and the ways in which, you know, things being harder for a lot of people isn't necessarily a good thing. So for me, what has me most excited is people who are exploring these tools as a capability enhancement, making them a resource that can make certain activities more accessible for people. I know we'll also get into some of the potential risks of disrupting tasks and work in that way. If I can add one thing: one thing I love about it the most is that it gets rid of the layer of, like, bullshit esoterica on top of everything, right? Because there's a certain level of, hey, these are just terms. I use a bunch of acronyms. There are certain steps to communicating about a certain thing that have a gatekeeping aspect, right? It's like, if you do not know how to fill out this outline this way, you cannot participate. If you do not know what a KPI is, you cannot participate, you know? And I think a lot of that is a relic, much like I think suits are a relic. So if we are actually going to make societal improvements, and if we are actually going to rethink what our lives and the lives of our neighbors and children and families, and all the things you say to get people emotionally engaged, are going to look like, then we can't start from "we should keep doing it the same way" or "anything that removes that barrier is bad," because removing it is meaningful, right? For myself, it's meaningful to be able to cut through at least one or two layers of "you can't do this." I agree with a lot of what they said. I think it's very clear that there are a lot of things these tools can do that are productivity enhancers in a very obvious sense.
They allow you to do something that you're currently doing, faster, and we are fully supportive of that. We think that's very important. But there's another way to look at it too, which is also something we're very interested in: seeing all the new, novel, interesting ways people can make use of these tools that shift how we think about things. I can give a specific example. Do I think this is the most important example? No, but it's something I find fascinating. Let's say I'm facing a dilemma: I don't know what to cook for dinner. All I know is what's in my refrigerator, and I don't want to go to the store. A thing AI can do is take that information and give you a recipe, and this is very, very different from what you can currently do on the internet with traditional search. You can certainly go through and manually scan recipes and think, okay, I can't make that, or this is close but I need to make a substitution. But you can't get something personalized to your situation in the same way; the internet just isn't organized for it. The information is there in theory, you just can't access it. So this is something I like to think about as an interesting way to shift how we access that information and make it more useful for our individual circumstances. Now, that recipe may or may not be good; you will get a very believable recipe regardless. I do want to raise one other point here, which is that we are dealing with an extraordinarily general thing that's about to move to a more specific thing.
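The "what's in my fridge" idea boils down to packing the user's personal constraints into one prompt for a chat model, something search engines aren't organized to accept. A minimal sketch of that step, where the function name and prompt format are illustrative rather than any particular product's API:

```python
def build_recipe_prompt(ingredients, constraints):
    """Assemble a personalized prompt string that could be sent to a
    chat model. (Hypothetical helper; only the string-building is real.)"""
    lines = ["Suggest one dinner recipe using only these ingredients:"]
    lines += [f"- {item}" for item in ingredients]
    if constraints:
        lines.append("Constraints: " + "; ".join(constraints))
    lines.append("List any ingredient substitutions explicitly.")
    return "\n".join(lines)

prompt = build_recipe_prompt(
    ["eggs", "spinach", "cheddar", "day-old rice"],
    ["no trip to the store", "ready in 30 minutes"],
)
print(prompt)
```

The point of the sketch is that the personalization lives entirely in the prompt: the same model answers everyone, but each person's constraints narrow what it produces.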
So every single company that is now doing a GPT-4 thing is going to take its existing data and, while they're not going to tune the models until it's really cheap to do so later, they'll at least have some type of information, like an embedding store, for the model to refer to, because the models respond based on your prompt, right? You type the prompt in whatever way, and that narrows the probability distribution of what it should say to you down to a slice, right? So having different aspects to it, having different pieces of data it can access, will change it dramatically. Now, each of these companies is also going to have its own custom LLM, or its tuned LLM, which goes one step further... Right now you have kind of a spectacular intern, right? GPT-4 is like the best intern you've ever had: it can do anything pretty decently, but it can't use the internet unless you use plugins, right? So it might just make stuff up. However, it's already being plugged into the internet; all these things, like the "is it using information from 2021" problem, are already being solved, and we're going to move to the stage where it's all interconnection and customization, which is going to remove some of these concerns. First of all, the National AI Research Resource. Second is one of the important factors of democratization. I would say there are two really critical reasons why democratization is important, from both a research and implementation standpoint and on a societal basis. First, even if you don't care about people at all, your AI is going to be worse, right? Even if you're a soulless sociopath and you don't care about anything except that, your AI is going to be worse, so let's get that out of the way. The second thing is the contextual application, right?
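The "give the model some data to refer to" pattern the speaker describes is usually retrieval via embeddings: documents and queries are turned into vectors, and the nearest document is prepended to the prompt. A sketch with tiny hand-made vectors standing in for real embedding-model output; in practice you would call an embedding API and store vectors in a database:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Company documents with pretend embedding vectors (illustrative only).
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "office holidays": [0.0, 0.2, 0.9],
}

def retrieve(query_vec, docs):
    """Return the document most similar to the query embedding."""
    return max(docs, key=lambda name: cosine(query_vec, docs[name]))

# A query like "how do I get my money back?" would embed near the
# refund policy; its text is then prepended to the prompt, which is
# what narrows the model's answers to the company's own data.
best = retrieve([0.8, 0.2, 0.1], docs)
print(best)  # prints "refund policy"
```

This is why "having different pieces of data it can access" changes behavior dramatically without retraining: only the retrieved context changes, not the model.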
If you drop one thing in one town with one set of circumstances and the same thing in another town, and I'm talking more about narrow, deep learning AI in this case, though the large language models have the same implications in terms of that cutoff-probability thing, if you do not have that broad, broad base of testing and research on things that are not explicitly commercially valuable, you're going to fall into some random hole you didn't know was there, right? And again, it's because you're working with a 650-billion-point probability cloud, right? Now, some folks say it's just going to randomly wake up and do stuff, or there are these black swan scenarios that we can thought-experiment about but have no way to put probabilities or likelihoods on. I think that's a less constructive focus of energy than the things we know will fairly likely go wrong. And when we talk about things like licensing or regulatory mechanisms, one of my great concerns is that we do not have the research and the knowledge to know how those processes should effectively work, outside of governance mechanisms, which are very important and a great starting place, right? But from a technological perspective, we just have a crazy amount of research to do, and a crazy amount of translational research, which is novel, but unfortunately we don't yet have the mechanisms to do that. So if I were going to name one policy intervention, I would say take very seriously the narrow, related stuff, right? Broad-based testing, which can also be a learning exercise; we have AIRedTeam.org, check it out. But that's a really key thing: with the stuff we already have in place, when you're looking at licensing, at least make sure it's going to bring some transferable governance practices to other people and doesn't just apply to a small subsection.
And then finally, do not forget that it's just people. It's always people. It's not the AI doing something crazy. If some insane thing happens where AI contributes to one of these black swans, like making a pandemic, it's not going to be because the AI decided it was a good idea. It's going to be because somebody acting like a terrorist tried to make it do that, right? So I think it's extremely important that we separate ourselves from the idea that AI becoming self-directed is priority one. Again, AI could certainly do something like that, but it would be because a human hacked together a system with an objective and a bunch of different models working together, plugged into a bunch of stuff on the internet. So if you're concerned about that, that concern should always be associated with people. Well, for those of you who were paying attention last week, this may sound a little repetitive, but as a company I think we've been pretty forward with our policy ideas for how to regulate AI, and our CEO has been out there talking about these ideas. So despite Austin's skepticism, the idea we've put forth is that at least a subset of very, very capable generative AI models should be licensed. That licensing is a means to an end, because what we're really interested in is making sure that AI developers are putting in place a set of safety practices that include evaluations of capabilities, evaluations of emergent capabilities, red teaming processes, and external validation and testing. We do a lot of that. We are making our best guesses at what some of it should be, and what we're hoping is to work with governments to figure out a mechanism to improve it and make sure we're doing the right thing.
So the ideas we laid out are a licensing mechanism; a mechanism for creating the appropriate set of evaluations and standards, driven by a multi-stakeholder process so it can reflect and be updated in line with developments in the technology; and then some sort of global agreement or global angle to this, because whatever we do in the US will only touch US models. And if we are really worried about the implications of AI, then it needs to touch all AI models, so there needs to be a global angle to this type of regulation or governance. Alright, I'm glad you did that, because now we can have a fun conversation. Okay, so I will say, first of all, I do generally agree with that as a frame: here are several things that need to happen in some meaningful way. My issue, or my lack of enthusiasm for licensing, is primarily that there's just going to be this massive number of models that aren't commercially deployed, right? The open source world is going to continue. And that's why I like the governance things, and the things that are learned from that process. I think what you're describing is going to be, in my mind, the most immediately impactful thing, right? People thinking they're going to have to get licenses in the immediate term is almost more useful than the mechanism itself being immediately imposed, because everybody has to evaluate their governance systems, right? What they're doing to at least, in good faith, be doing their best, you know? But from my perspective, the most important aspect of that is almost the transferability of it, right?
For anybody making an open source product to be able to adopt similar things, whether that's guardrails technology or governance mechanisms or testing protocols or systems. All of those things mean letting the ecosystem develop while supporting the development of an ecosystem of assurance and testing. Part of what I found immediately when we announced the red team thing we're doing together at DEF CON is that almost none of it exists, right? There's very little mechanism for these types of exercises at scale, or even below a certain scale, right? Because this work is generally done through the dev process or whatever, and we're just now at the point where it's touching consumers, and consumers are using it regularly, so it's valuable to have that access. So again, credit to you guys and everybody else; we have folks signed on across both the commercial side and the open source big-picture side, but that's what it has to be, right? Everybody working together on the things they specifically know are open concerns. And again, most of this is going to live at the application level, right? Most of the conversations I have are folks coming in and saying, hey, we see how this could be extremely useful for people, but we're not sure if we should do it right now, right? So one of the things I really recommend, again as a governance mechanism, is: if you're not sure, just research it with people and test it with people, right? Figure out who it is likely to benefit and adversely impact, find a reputable university, and then make sure it actually serves the folks you're intending to serve, or at least some similar archetypal scenario in terms of need, application, whatever. That's your safe way to do it.
Again, even if you don't care, it's a safer, lower-risk way, and you're going to have better technology and people more engaged. And I think there's a very real sense that people feel like they should have more power now, and it's not going to be obvious until somebody tries to shut it down and take it away, once we're one step further in terms of functionality and interoperability with the other services you have. If you do not allow for folks to avoid feeling that way, it's going to be really scary, right? And I'm going to be pissed off too, you know? But I think that's one thing we'll walk into and not understand we're walking into until it's too late. Great. Bea, I'd love to give you the opportunity to jump in on this very big idea and big conversation as well. For sure, thank you. I was frantically googling just now, because, building on some of the things Austin has shared, there's this reality that these systems are interacting with each other in the real world, piloted by or used by people, and that means we need to study how these systems interact with people and how people change their behavior. One of the great frustrations I had working in the policy space on the Hill was that advocacy groups would come to me with some recommendation, oh, if we make this thing illegal or if we incentivize this, then behavior will change, without thinking two steps down the road: okay, well, then people will adapt to that new status quo, they will have this new form of behavior, and therefore this new outcome might come to take place. So I think the emphasis on testing, and trying to sandbox where possible, is really powerful. I also think this is something we built into the Algorithmic Accountability Act: nobody knows.
Nobody knows the best way to measure these tools. Nobody knows the right threshold for how much accuracy against a dataset a system ought to have, in trade-off with other variables that might be measurable. Nobody knows which things we aren't measuring yet that maybe we'll come back to and realize, oh wow, there are really interesting network effects here, really interesting interactions between different ensemble models that create these weirdo feedback loops. So I think a strong emphasis on testing is really valuable. I also think that, as much as I understand a lot of the arguments for the licensing approach, there is a really strong regulatory capture concern that I hold. While I want everyone who is developing these tools to do so with concern and with care, I also live in the United States of America, where we have a goofed-up healthcare system in part because of regulatory capture, and I don't want to see that happening here. There are some really big players, and it takes a lot of resources to build some of these systems. That is already a threshold that's really tough, and having this additional layer of regulatory capture on top of it raises some red flags for me. So while I do think there should be mandates for greater testing and transparency, both for what goes into these systems in terms of datasets and training practices and for monitoring of the outputs and how these systems work in the world, I'm wary of certain thresholds being used to define the set of players who are allowed to play in this space. The other thing I'll add is that, coming back to this idea that it is all people, it's really important to recognize where these tools are actually being used. With generative AI, we're in the very, very early days.
And, you know, to Alex's point, that's really exciting. Gosh, there are so many funky ways that people are going to glue these things together and come up with new stuff, right? Even for a lot of folks working in the space, the idea of generating computer code was like, oh, dang, yeah, wow, code is language; I hadn't thought about it that way before. So I think a lot of really exciting things are going to come out of those interactions, but it's also the case that there needs to be a better understanding, from a policy-making context, of where these tools are actually being used. What systems are they actually being built into, and how can consumers themselves (but let's be real, how many of us read terms of service?), or organizations that act on behalf of consumers, actually understand what's under the hood? A lot of these systems are being white-labelled in ways that make it unintuitive for people who aren't behind the scenes to know which parts of our lives actually rely on generative or other AI systems. Hallucination is a primary concern, for two reasons. First of all, it's just your best intern ever making up something based on its connection to something else; generally, hallucination is just the model being confidently wrong. The second thing is... What would you call it, bullshit? Yeah, yeah, I don't know. Being wrong. Yeah, just being wrong would be the word. But ultimately, if you use Bing, if you use the different tools, it's not going to be right all the time, but at least it's getting better: it can search the internet when that's needed, and there are all the plugins. You have Wolfram Alpha plugged into ChatGPT now, so all the stupid mathematical stuff it used to do is being corrected by Wolfram Alpha.
So I think we've got to stop getting stuck in time. Every two days, some stuff changes, right? Every two days. If I get stuck on some other topic for two or three days, whatever I said two days before has already happened and then forked twice, right? So whatever we have to do to make folks understand that, and not get stuck, is critically important. That's great. So, I'm old enough to remember when Photoshop became really powerful, and I remember a very similar debate happening back then: well, how are we going to know whether an image is real when these powerful tools are out there? And I think what we've seen in hindsight is that we just kind of got used to it. We are more skeptical consumers, and we've learned to understand that, look, there are these tools out there, which means the things we look at can't necessarily be relied on 100%. We've been living with Photoshop for a long time. So sometimes I want to tell people, let's take a deep breath. It is a problem, it's something we should be concerned about, and let's not ignore it. But also, let's give American consumers a little bit of faith that they will learn, as they have come to live with these tools over a good amount of time. We're really only about eight months into the world of generative AI. Once we've lived with these tools for a little while, we'll get used to them, understand how to deal with them, and be smart media consumers. How about that? Alright, I've got one thing to say about deepfakes, though. I feel like it's actually maybe going to be positive for some sections of society to stop trusting everything they see. I was always kind of annoyed, even when it first started, because I'm like, it doesn't require that.
Humans love believing the thing they want to believe. I remember my great aunt or somebody shared the "Obama wears a Muslim ring" thing, and it was really the One Ring. It had 50,000 likes and everybody was looking at it. I brought this point up at a panel one time and everybody was annoyed at me, like, what do you want to do, fix society? I don't know, kind of? I'm not telling you to just fix society, but let's at least start by saying: this is the thing we've got to think about and care about, and we don't do propaganda. Shit, man, I can't believe I said that. But you know what I mean, so what do we do? Question mark. Well, I almost tried to do it. Oh, and by the way, testing and playing with this stuff hands-on, that's the best way to do it. I think it's an easy trap to fall into, but a lot of people still think, when they think about generative AI, that the AI is looking stuff up in its dataset. I want to emphasize that it is learning from that dataset, but what comes out of it is something much smaller than the data it was trained on, and it does not have access to that dataset. Again, the reason that's important is not because mechanically it's super important to know, but because it has implications for how you think about what the model is doing, what it's good at, and what it isn't good at. Building on that, I've seen people say things like "ChatGPT has admitted that it learned this from this or that" when they try to ask it questions about its capacities. Alex, does that go along with that same idea, in terms of it not really being capable of that kind of self-referencing, of explaining what its internal sources were or what is actually in its dataset, because it doesn't have the direct ability to query that? Would that be right? Pretty much. It answers that like anything else. It's not like it knows specifically that you are asking it a self-referential question.
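The point that the model is much smaller than its training data, and so cannot be "looking things up" in that data at answer time, can be made with back-of-the-envelope arithmetic. All four numbers below are illustrative orders of magnitude, not any specific model's real figures:

```python
# Assumed, round numbers for illustration only.
training_corpus_tokens = 1_000_000_000_000   # ~1 trillion tokens of text
bytes_per_token = 4                          # rough average text size per token
parameters = 100_000_000_000                 # ~100 billion model parameters
bytes_per_parameter = 2                      # 16-bit weights

corpus_bytes = training_corpus_tokens * bytes_per_token
model_bytes = parameters * bytes_per_parameter
ratio = corpus_bytes / model_bytes
print(f"corpus is ~{ratio:.0f}x larger than the model")  # ~20x here
```

Even with these generous assumptions, the trained weights are a small fraction of the corpus size, which is why the model retains statistical patterns rather than a retrievable copy of its training data.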
It just treats it like any other text and answers it like other similar pieces of text, without necessarily recognizing that there's anything unique about the question that requires a unique type of answer. Great. The myths and misconceptions. Thanks, yeah. The main myth I would really like to counter is the idea that it isn't people behind the scenes. There are so many people behind the scenes, both in curating and labeling these datasets and in building these powerful models, as well as in creating the use cases they're applied in. Part of why, in the Algorithmic Accountability Act, we approached things as critical decisions rather than critical systems was that you can have a super screwed-up decision-making process, throw some automation at it, and now you have a way faster screwed-up decision-making process. We have to interrogate whether we're making society better; we have to actually ask the question, are we doing a good thing here? I think that's one myth. Another, and this is not necessarily a myth, but it's phrasing I really appreciate: early on in the discussion of these technologies there was the name LLM, large language model, and something that has been really wonderful is the pushback that they are actually large text models. They're great for generating text, but there are lots of languages that aren't represented as text, including ASL. Here in D.C. we have folks at Gallaudet University doing incredible work on that, and I think that's one small language tweak we could all make to not over-represent what these systems are doing. The third myth I would debunk is something from our IBM days: when I worked at IBM, there was this great myth IBM would tell about augmented intelligence. This isn't about replacing people.
This is about empowering people with technology. And some of you may have seen the recent news of IBM saying, oh, we're going to replace 100 HR people with automated systems. I was literally in meetings at IBM where us idealistic young IBMers were saying it's about augmented intelligence, and clients were saying, well, I want that. I want to replace people. I'm trying to save money here. I do think a myth we have around technology is that, oh, it's going to create a bunch of jobs and therefore it's going to be fine for people. It is true that a lot of technological tools, Photoshop among them, have created so many new opportunities, so many professions that didn't exist 50 years ago; it's wild and inspiring. And also, the process in the meantime has sucked really badly for a lot of people. So another myth, as we think about the power and capability of these technologies, is the idea that these are not substituting for human tasks or skills and therefore won't lead to labor disruptions. I think we should really challenge that. I don't think it means we're necessarily automating people's entire work, but there's a great article, the name is escaping me, that coins the term "fauxtomation," like fake automation.
There are so many times where we introduce a technology, and it may make some parts of a process better, but it also may make some parts of a process worse. Just because you introduce technology into something, and just because it makes it faster or less expensive or whatever, doesn't actually mean the experience of it is better. I personally like checking out with the human cashier; that's my little rebellion against the fauxtomation of our systems. But when you standardize things, when you automate things, when you build systems around things, that has real implications for the humans doing those tasks today. That's not to say we necessarily shouldn't do those things, but we ought to think about what's going to happen, how people are going to be able to adjust to those changes, and whether or not the awesome, exciting new web developer roles that came out of all these technologies are actually going, for instance, to the people who were working in the newsrooms, or whatever other area is being disrupted by those technologies. I think the same fundamental principle applies, which is that you have to make some type of partnership. First of all, any kind of natural investment in what folks are already trying to do, because there are a lot of people working on things already, and I think there's a kind of presumptuousness about some type of limited progress in emerging markets. But going in and looking at how we can make multilateral or bilateral research work that helps provide resources for some of the aspects that are going to be more difficult, especially around some of the safety stuff. The support is about understanding. The example I think of all the time, and it's kind of a terrible one, is that part of the justification for investment in rural communities and building rural broadband was operating in contested environments, in wartime environments; there's a reason the military funds a bunch of stuff. The flip side here isn't that, but something like it: the need to find joint interests and joint needs. And from the perspective of international development entities in general, I think it's acknowledging that there is a lot of progress being invested in and grown. There are some awesome research institutions popping up, companies are anchoring, and part of it is also demonstrating the value and putting information together that shows, here's why you should do this. You have to make it kind of turnkey for people to do something good a lot of times. Anyone else want to jump in on this one, or we could have one more question? I'll jump in really quick on that. So one, I think the labor disruption conversation is actually really central to a lot of the parts of the world that have been on the receiving end of these technologies historically; it's not a great outlook, and I think there's a lot more value in, and need for, conversation about which tasks and which roles are facing disruption in this space. I also think, from the development perspective, we've all heard there's been a big push for a pause on advanced AI research or whatever, and I know there are a bajillion interpretations of that, but I actually think there's a lot of research work to be done. There's a lot of need for pushing the boundaries of what are considered low-resource languages, languages that don't have tons and tons of data online already, and there's some really fascinating work happening across the African continent, because of the wealth of languages that exist there, to really push the boundaries of what the capabilities are for systems that rely on less data. So I think part of what we need to do is also reframe some of these spaces. Certainly, when I was working at IBM, we were working with
folks for whom having their data center constantly powered was a big challenge, so this is not to discount the real, concrete infrastructure challenges that people face. But I also think there are real opportunities when we reframe those problems, because these are the cutting-edge AI research questions that we have, and the folks doing leading AI research in many of these spaces are the people who have the wealth of knowledge to give here. It's like writing great poetry: sometimes having constraints leads you to greater creativity and problem solving. And I think in these spaces we would do well to honor the challenges those problems present, the kind of frontier challenges where we as an AI community say, hey, we actually want to work in this space and work on these issues. So the first thing I would say is that ChatGPT is a different type of technology with a different purpose. It's really likely that you'll have some middle layer, some chatbot with other little checking components, that accesses traditional narrow deep learning AIs, but I think you're going to see an added layer of specialization. You saw Reddit close their API; everybody that has sources of data is now saying, to the question about copyright, that data is now, ironically to Mr. Nelson's point here, suddenly getting more oily, because it's easier to monetize in that way. But that's still about who holds the third-party data rights, so that's not going to change. And I think, if anything, it's likely that the open source community, to the point about necessity, the open source community's desire to remove the hardware barrier has just dropped it lower and lower and lower, right? I think you're going to see something similar with open source LLMs that are specialized: they're going to be in different verticals, they're going to be interconnected with different types of things, and I think you're actually going to see kind of governing systems that bounce back and forth between some type of checking LLM and some narrow AI. So the big competition concern that I have, on the question of whether data is the new oil and should be paid for, is that if you do require payment for it, it'll make it impossible for any upstart to get into the business. When you're talking about billions and billions of pieces of data and content, anything that gives any sort of reasonable compensation to anybody, just do the math: one penny times billions and billions is a lot of money. So that's a real concern in this space: if you try to impose some sort of licensing regime, whether because of copyright or anything else, you will end up in a world where only the wealthiest companies, if anyone at all, can afford to do it. I think what might end up happening is that these models end up getting developed in other countries which have more permissive copyright regimes; Japan has a very permissive law permitting training of AI models. I don't think we as policymakers here in the US should really want that result. So assuming we have a world that recognizes that training of AI models is fair use under copyright law and does not require licensing, then I think it's open to
a lot of companies to try to build their own models, because they're able to access the same data that everyone else is able to access. I want to challenge that a little bit, not to challenge the idea itself. I do think there's a real risk of a push toward enclosing what has historically been public knowledge. There is this really wonderful thing that is our information commons, the knowledge that we as humanity share amongst each other out in the open, and I share the concern about what it looks like to enclose that, and to permission it under frankly archaic systems of intellectual property. That being said, even without that, the barriers are tremendous. Just having access to it: theoretically I could look at all the pages on the internet, but can I actually look at all the pages on the internet? No, I will die first. The reality is that having the kind of compute systems to do this type of analysis is still a tremendous barrier to entry, and it's part of why I think there is a lot of value in having these kinds of resources made available with more intentional competition frames. But we shouldn't fool ourselves into thinking that this isn't really expensive. I've known people who, in past lives, worked at companies that provide cloud computing, and there were massive battles and bidding wars happening inside those organizations between developing companies who were just fighting over raw compute resources. So while it's true that keeping information out in the open provides some opportunity, until we move toward architectures that favor low-resource, low-intensity training methodologies, the reality is that the game is still going to be played by big money. I think there might be a role for public models in that space; there are a lot of open questions here about how to address that. But we shouldn't fool ourselves into thinking that just because information is open, everyone's on an even playing field, when access to compute itself is still such a precious thing. I don't know, there are a lot of tiny, tiny companies out there.
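[Editor's note: the "one penny times billions" point made in the discussion can be sketched as a quick back-of-envelope calculation. The per-item fee and corpus sizes below are illustrative assumptions, not figures cited by the panel.]

```python
# Illustrative sketch of the panel's "one penny times billions" argument:
# even a tiny per-item licensing fee becomes prohibitive at the scale of a
# modern training corpus. The rate and corpus sizes are hypothetical.

def licensing_cost(num_items: int, fee_per_item: float) -> float:
    """Total cost of paying a flat per-item fee across a training corpus."""
    return num_items * fee_per_item

FEE = 0.01  # one penny per licensed item (hypothetical rate)

for corpus_size in (1_000_000, 1_000_000_000, 10_000_000_000):
    cost = licensing_cost(corpus_size, FEE)
    print(f"{corpus_size:>14,} items -> ${cost:,.0f}")
```

At a billion items, a one-cent fee already totals ten million dollars, and realistic corpora are larger still, which is the speaker's point about licensing regimes favoring only the wealthiest entrants.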