Hello and welcome, everyone. This is Active Inference guest stream 72.1. It's March 1st, 2024, and we're here with Autometa of Alignment Lab AI. So welcome — please feel free to introduce yourself and kick it off, and we'll take it from there. Unmute and go from there.

Yeah, it would really help if I did unmute. Thanks, I appreciate the introduction. My name is Austin; I run Alignment Lab AI. We build out in the open source, and we aim — at least I personally aim — at building solutions to problems before we have to deal with their impacts. I think in general it's probably pretty important that this AI thing goes about as well as we can get it to go, for everyone. I think it's really important. There are quite a lot of outcomes, man, that we can really get from all this, because I can say with a lot of confidence, from the inside, that the rabbit hole really just keeps going deeper. You start to look into this stuff and it's meaningful, because the rate that things are moving right now is not a temporary thing — I really don't think we're moving as fast as we're going to be before it's all said and done. And it's a good thing, because so far the impacts have been hugely positive from my perspective. If it's going this well this long in, it looks better and better every day for the way things are turning out.

Okay, so many places to jump in. A year from what? Better according to what?

Oh, fair. You know, a year from GPT-4 — Prometheus bringing us the fires of Olympus and infinite free data to train whatever kind of model we want. I think that's really the big thing that happened: the performative use of language by the models really allows for this nice kind of gluey middle ground, to build datasets out for different use cases just by saying that you want them.
Prior to that, for most things you needed to build classifiers by hand, from scratch, and that alone is just one step of a larger pipeline. That barrier is just no longer there for a lot of things, which is nice.

Okay, let's take a step back and re-approach. What were you working on as you turned to language models? How did you get into this space?

So I've always had, I guess you could say, a bit of an obsessive nature, and I've always had a lot of involvement in my life with complex and challenging things. At the time I was just getting into image models and the earlier LLMs, before LLaMA really leaked out. That was when I first started to take the space very seriously; I had dabbled a little bit and done a few little projects, like text-to-speech models and things of that nature, beforehand. But when GPT-4 came out and I was messing with the original version of that model, that was a very different experience, and I definitely stayed up for a couple of days messing with it. When it first came out you had like a hundred messages every three hours you could talk to GPT-4, and it was so much slower back then that that was plenty. And I don't know, I just thought it was really fascinating, and then I just kept doing it, and then I started making money doing it, so I just kept doing it harder. Since then, all the original people that I first met up with in the open source — we still all talk to each other. We just kind of fell in together, because that's what happens, I think, when you're doing it for free.

Okay, again, a lot of places to go, but maybe give a little overview for those who have or haven't worked on open source software projects. What does it really mean for one of these models to be open source? Are there different senses that people are using?
Yeah, so largely the open source is a weird split between machine learning engineers and people using it for formal academic reasons, and then people grabbing a lot of the newer models just to use them, who have their own use cases and play with agent pipelines. I think it's going to be really interesting when we get to a place where the tooling has gotten there to allow people to just use the agents. It's going to have a lot of impacts that are super cool, because once you can do it a little bit — you know, it seems like everything that I've pushed at in this space, a little bit of progress turns into a lot really, really quickly, because as soon as you can do it well one time, you get a great dataset. And that lets other people iterate on it, and then everyone works together in the open source, typically, so it moves things pretty quickly. I have a hard time identifying, from the outside, what exactly the big appeal is to people who aren't obsessed with it, aside from the fact that it's very sci-fi. That, at least, I know for a fact isn't temporary — in my case at least, it's the sci-fi. Can I ask if you've had much of a chance to get really, really deep into anything like that yourself, as far as dealing with models directly?

I think we have our own brand of active inference sci-fi, speculative futurism, etc. And also, though, I have used local and cloud-based language models.

Yeah, I think it's going to be really fascinating too when people start to access the lower-level stuff, because the LLMs are really interesting, but the use cases are way narrower because they're so big. I'm thinking of things like classifiers and linear regression models, which are just very tiny models that you train on your CPU to do very specific tasks.
People really don't realize that you're still getting that same kind of weird, gooey generativeness with the smaller models, except you also just get to run them really, really fast on a CPU and get 100% accuracy. And that has a huge amount of benefits for businesses too: there are things that are really hard to code but really easy to train a model for. I think it's hard to name many industries that aren't going to really be impacted — we're so in bed with digital stuff, you know.

How would you say this is different from just doing statistics on computers? I mean, it is, isn't it?

That's an interesting question. So the short answer is that it is — it's just statistics on computers. I think the more interesting answer is that statistics on computers is very similar to statistics on brains. It's very similar to nondeterministic systems: stuff where we don't understand the individual nuanced interactions, but you get a lot of information at a higher level without having to do all the work of understanding every single individual thing about it. And being able to do that on a computer, in a way so robust that you can do it many billions of times really quickly, means the limitation becomes how clever you are about approaching a problem — because now you have the option to not even know how it works, and to train a model to identify patterns in noise and just figure it out for you. And I think that's going to become cooler and cooler the smaller and smaller the models get, because I've never seen anything get smaller and stronger at the same time, over and over and over again, like I have this last year. So having models at home, to me, looks like a foregone conclusion, and that's really cool. Yeah.
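As a concrete sketch of the kind of tiny, CPU-trainable model being described here — a purely hypothetical toy, not any specific model from the conversation — a nearest-centroid classifier fits in a few lines of stdlib Python:

```python
# Toy sketch of a tiny CPU-trainable classifier (hypothetical data):
# a nearest-centroid model, one of the simplest "train fast, run fast" options.

def train_centroids(samples):
    """samples: dict label -> list of feature vectors. Returns label -> centroid."""
    centroids = {}
    for label, vecs in samples.items():
        dim = len(vecs[0])
        centroids[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return centroids

def predict(centroids, x):
    """Assign x to the label of the closest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

# Two well-separated 2-D clusters, labels invented for illustration.
data = {
    "neg": [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]],
    "pos": [[1.0, 0.9], [0.9, 1.1], [1.1, 1.0]],
}
model = train_centroids(data)
label = predict(model, [0.95, 1.05])  # lands near the "pos" cluster
```

On toy data like this the "training" is essentially instant; in practice one would reach for something like scikit-learn's LogisticRegression, but the shape of the workflow — label a little data, fit a tiny model, run it cheaply on CPU — is the same.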
Okay, yeah, to the statistics on computers: it's always been a joke about machine learning and such that under the hood it is just statistics and a linear regression. Yes, you can go from observations to the line of best fit, and you can also generate synthetic data from a linear regression. So what has been occurring to facilitate the expansion across modalities so ferociously? Is this related to computational resource accessibility? To what extent does this have to do with architectural changes?

So it's pretty easy to get if you think of it from the standpoint of your task being to go make the multimodal model, and also from the standpoint of wanting to use any AI. Right — if you want to use something, you're just going to interact with it naturally, because the best thing we could do with these LLMs is have this very natural way of interacting back and forth as an interface. We're not text-based, right? We have faces and voices and tonality and all these other things, so the use case is pretty clear. Then, from the development side, it's much, much easier to build a dataset that's multimodal once you have one multimodal model. Once you have an LLM that can see a picture, you can build as many datasets as you want to make as many LLMs see pictures as you want. If you don't have that, you have pictures, and now you have to get captions on a million of them, which is much harder to do. So it's an iterative process. And, you know, notice the vision-language models we did have: LLaVA — and aside from that, LLaVAR, which is basically LAION's version of LLaVA, just another iteration on the same thing. What option did we have before GPT-4V to really build these large-scale vision-language datasets, for these models to become multimodal and conversational at the same time? Alibaba actually did us a huge favor too — not intentionally, but they made —
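The point above — that a fitted regression can run both directions, from observations to a line of best fit and from the fit back to synthetic data — is easy to make concrete. A minimal stdlib-only sketch with made-up numbers:

```python
# Sketch: ordinary least squares fit, then sampling synthetic points
# from the fitted line plus noise. All data here is invented.
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

def synthesize(a, b, n, noise=0.1, rng=None):
    """Generate n synthetic (x, y) pairs from the fitted model plus Gaussian noise."""
    rng = rng or random.Random(0)
    return [(x, a * x + b + rng.gauss(0, noise))
            for x in (rng.uniform(0, 10) for _ in range(n))]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # exactly y = 2x + 1
a, b = fit_line(xs, ys)          # recovers a = 2, b = 1
fake = synthesize(a, b, 5)       # five synthetic observations from the model
```

The same two-way move — fit a generative model, then sample from it — is what scales up into LLM-built datasets for new modalities.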
Well, it was intentional in that it was an open-source release: they released Qwen-Audio, which is very similar to LLaVA as well, except audio and language. That means we don't have to build tons and tons of little classifiers to go run over audio data and label it for us — we can just tell the model what to do. And telling it what to do is much, much easier. That's the cool thing, right? That is the benefit of these huge LLMs: you just tell them what to do.

Beautiful. So it's often remarked upon that there's next-token prediction in play here. Coming from the engineering side, how do you understand that? Because in the active inference space, in predictive processing, there is a lot of discussion around prediction, real-time prediction, and its role in cognition. But just coming from an engineering side: how did it come to be that predicting the next token became so entrenched as a mechanism to generate even structures that have long-term structure?

So, as far as what's going on under the hood, you can think of it as very similar to a search. I mean, it's very similar to the exact same thing you're doing when you do a text completion on your phone and it finishes the word for you. But in order to statistically model systems that are complex, you have to actually model the features that you can derive from the system. So when I say a sentence and I get a response, and that response is coherent, the only way to predict that coherent response is by operating on the logic that's implicit in the sentence that I gave, right? And so you do have this kind of higher-level sort of meaning that comes out of strings of tokens, even though they are selected just based on their likelihood. And whether that's something we're just imposing over the responses, or something the models are coming up with because they really understand, I think is more philosophical than a matter of engineering. That being said —
— letters aren't meaning, and language has restrictions, and tokens can really only be used in so many ways, so likelihood is a really inefficient way to do that — if indeed, I mean, not even if indeed that is what we're doing, right? The whole reason the language models are good is because we like that when they talk it makes sense, and we could be doing that better. Karpathy actually just made a post about this on Twitter, I think, talking about how tokens really are a huge issue. I think it's true; it's really restrictive. Sorry, I'm distracted, because in the back of my head I'm thinking about that active inference, real-time, reactive modeling of these systems, because I think that then becomes a sort of meta-layer on top of this, right, where you have a — sorry, my dog over here — you have a — wow, I totally lost my entire train of thought, but: when you're actually modeling systems in real time, that's different, right? Because what transformers are doing is finishing a string. It's stateless: if you run two inferences on a regular model, an LLM like everyone's used to, those two outputs will be the exact same sentence both times if you don't literally have a random number generator in there to nudge it onto a new path. Doing that in real time, I think, is a whole different part of the equation, because reality is never the same thing twice. So you start to be able to interact with very complex systems, and that in itself is just good data to model them with, because there's nothing else like data at that granularity.

Yeah, well, that makes me think about how the temperature, the noise, has to get injected. There's this path of least action, the path of most likely continuation, that has to be stochastically or pseudo-randomly broken in order to get out of valleys.
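That "random number generator nudging the path" is typically temperature sampling over the model's output logits. A self-contained sketch (the logits here are made up; a real model would produce one logit per vocabulary token):

```python
# Temperature sampling over logits: the stochastic break in the
# most-likely-continuation path. Logits are invented for illustration.
import math
import random

def sample_next(logits, temperature=1.0, rng=None):
    """Softmax over logits at the given temperature, then sample one token id.
    Temperature -> 0 approaches greedy argmax; higher values flatten the
    distribution and make unlikely continuations more probable."""
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
greedy = sample_next(logits, temperature=1e-6)  # effectively argmax: token 0
```

At temperature near zero the same prompt yields the same token every time — the statelessness described above; raising the temperature is exactly the pseudo-random injection that breaks out of the valley.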
And so for models where we have total observability and control, pseudo-randomness has to be re-injected or generated endogenously, whereas, just like you described, the organism in its engagement with the niche is getting fuzzed by reality continually. And so that is a very different setting, and when there is a real-time, precarious engagement, it opens the door to extremely different kinds of knowledge and representation and adaptivity. For example, in the active inference partially observable Markov decision process type models, all of the ways that the world changes get encoded in a transition matrix. But there isn't necessarily any representation of overall trajectories — more like flows and next moments and gradients. And so there's a really interesting tension between what is explainable with digital or synthetic intelligence and with biological — and it's like, explainable to whom, or what would make it explainable or understandable? Because even when we understand the whole linear regression, or the whole transformer architecture, there's still — it's almost like it's all there. It's like: if you knew all the ingredients of dinner, would you have understood dinner?

Yeah, it becomes abstract the closer that you look, just because the complexity is so high, right? When you're looking at systems like transformers, it's very, very many small decisions that happen very, very quickly. And I do think it becomes arbitrary at some scale, because I also don't know exactly what's happening in my CPU at, like, the electron level — I don't know what all the individual bits and bytes are doing underneath my computer. What's interesting, I think, is to consider: what if you're modeling a landscape that's entirely different, where all of the information is reactive?
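The transition-matrix remark can be made concrete with a toy two-state example (all numbers invented): the world model is just P(s'|s), and a belief over states gets pushed forward one step at a time — flows and next moments rather than whole trajectories:

```python
# Toy POMDP-style belief propagation. States: 0 = "cold", 1 = "warm".
# B[s_next][s_prev] = P(s_next | s_prev); each column sums to 1.
B = [
    [0.7, 0.2],   # P(cold | cold), P(cold | warm)
    [0.3, 0.8],   # P(warm | cold), P(warm | warm)
]

def step(belief):
    """One-step belief propagation: b'(s') = sum_s P(s'|s) * b(s)."""
    return [sum(B[s_next][s] * belief[s] for s in range(len(belief)))
            for s_next in range(len(B))]

belief = [1.0, 0.0]    # certain we start "cold"
belief = step(belief)  # one moment later: mostly cold, some chance of warm
```

Nothing in `B` names a trajectory; repeated application of `step` is the only notion of "path" the model has.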
Then all of the data is no longer human data, like it is right now, where we train the models basically on the Internet, right — it's all human expression, which models humans really, really well. But if you're not doing that, and instead what you're modeling is real life — and I think that too — oh my gosh, I apologize — hey, get out of here — okay, it's gone. If you're modeling real life and you're getting feedback from real life, the only way that goes is to improve. I mean, if you're building a system which is effective at iterating until it can make some progress, the hard part then becomes defining what the progress is. How do I tell when I'm doing a good thing, right? Because for me to do that with an LLM, I now have to define things I don't understand, things that are bigger than me, if I'm trying to linearly improve in all cases. Which has become this weird goalpost, I think, with LLMs, because we now have many, many, many benchmarks with many tens of thousands of examples, and that still really doesn't do it, because nothing captures real life like real life.

So feedback from reality — hugely important to reinforcement learning, sort of, right? Although I think that term has become kind of vague now that everything's come together as the scales have gone up. Well, I'll just say a key difference between reinforcement learning and active inference. You framed the challenge as: how do I know when good things happen?
And the reinforcement learning answer is: we'll propose this reward function, such that there's better and worse on this scale, and then we will pursue policies or observations that are better in terms of how rewarding they are — we'll reinforce what works on this second-level, ad hoc, proposed reward function. Then, to get to another piece of what you said — things I don't understand, basically learning opportunity — that has to get shoehorned into that reward function somehow, with curiosity bonuses and all these different techniques that people have developed. In contrast, in active inference we're doing direct optimization on the generative model, such that rather than defining, say, body temperature as rewarding and then pursuing reward, we define body temperature homeostasis as expected, and then pursue what's expected. That gives a direct measure of how well things are going, in terms of how expected they are, and it gives a direct measure of what is being learned or not understood, in terms of epistemic value. It's not yet applied, at least in the open source space, at the scales of a lot of the models you've been mentioning. However, what excites us every day is that, in principle, there's a simpler, physics-grounded, first-principles approach to address some of these exact questions about agents and their autonomous engagement with the niche.

Yeah, I think that anything is better than what we're doing now, absolutely, sure. I think that if you have a baseline you can start from, and you're measuring variance against that baseline, agnostic to the kind of human perception of good and bad, then you do just kind of set a floor, right? And all you're doing is raising the floor incrementally, because inherently there's stochasticity in the system, and stochasticity produces higher quality and lower quality. So if you have a floor, anything below that quality you can just throw away.
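The contrast being drawn — preferences encoded as a prior over observations rather than as a reward function — can be sketched numerically. This is a deliberately tiny caricature of the active inference idea (made-up numbers, no learning, no epistemic term): score each candidate policy by the expected surprise of its predicted observations under the preferred prior, and pick the least surprising one.

```python
# Caricature of preference-as-prior: body-temperature outcomes are "expected",
# and policies are ranked by expected surprise rather than reward.
import math

preferred = {"temp_ok": 0.98, "temp_low": 0.01, "temp_high": 0.01}  # prior over outcomes

def surprise(obs):
    """Negative log probability of an observation under the preferred prior."""
    return -math.log(preferred[obs])

# Predicted observation distributions under two hypothetical policies.
policies = {
    "stay_inside": {"temp_ok": 0.9, "temp_low": 0.05, "temp_high": 0.05},
    "go_outside":  {"temp_ok": 0.3, "temp_low": 0.6,  "temp_high": 0.1},
}

def expected_surprise(pred):
    """Average surprise of the outcomes a policy is predicted to produce."""
    return sum(p * surprise(o) for o, p in pred.items())

best = min(policies, key=lambda name: expected_surprise(policies[name]))
```

No reward scale is defined anywhere; "how well things are going" falls directly out of how expected the predicted observations are, which is the direct measure described above.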
And now you've got a means of improvement that you can derive formally, which is way, way more real than what we do with LLMs, which is kind of guess and then intuit — we just sort of try stuff and train on only the good until the model has seen enough to do it really well. And really, with that kind of system, it's insane that it actually got to the scale of modeling language really well. I think that just speaks to human creativity or ingenuity or something, because what that relies on is labels, right — assigning a value. I mean, it's cool that we can do that; it's also wildly expensive, and there's no way this is efficient at all. You know, these models are shrinking really fast in spite of the fact that we're just kind of going "this might work better" — and it does. That's not going to last. And if that's the case with LLMs, which inherently have these giant limitations in every direction — transformers in particular — then everything else can probably go the same direction and just get denser. And that to me is also really fascinating, since the question then is: what can you start to do with that? If you can imagine some sci-fi scenario where now you've got networks of models that are very, very small and very, very performant, you can start to build systems that are very complicated, for cheap — and cheap is important, because money right now is kind of in charge of how quickly all the technologies move. And having some system by which you can take these many, many small systems and derive sort of group performance enhancements, without having to define group performance enhancements, in a way that is purely — I mean, formally understood — takes away all the weirdness with LLMs. Because those are not formally understood: you can't derive what a qualitative improvement is, because it's qualitatively all human data. It's just bias.
If that made any sense at all — I might have gone in a circle there.

Yeah, interesting. Well, it makes me think of the large and monolithic nature of the current models, and yet inside of them are defined smaller computational entities, like a node in a graph that takes in some stimuli and then outputs a change, or outputs an activation, variously. And this is a common theme in active inference: you can have a more monolithic representation of a cognitive system — that's a top-down descriptor, like the ant colony as a unit in terms of its foraging distribution — and then there's also the bottom-up, which describes multiple smaller autonomous components and how their interactions with each other and with the niche, the ecosystem of shared intelligence, come to be one and the same as the view from the top. And so it is interesting, as there are more ways — tall buildings with small pebbles, small buildings with large pebbles, and all these different combinations of different computational components. But now that state space is massive, in terms of the compositional space of models, even before parameterizing them or touching data — just the architecture alone. So how do people think about that kind of structure learning and meta-learning? In what ways do you think there are gains with, say, showing a vision-recognizing AI images and having it update, making a kind of closed loop with image generation, or even critiquing its own architecture?

Okay, so — I mean, if you're trying to see whether there is a limitation as far as how much of a representation you can fit into a model — like, how dense can they get, or how non-dense can they be (they can be infinitely non-dense) — it doesn't actually seem, from my perspective, like there is much of a limitation.
Beyond the amount of time you're willing to wait and the amount of data that you have, the actual functional components of the logic by which the models gain or lose performance — along whatever metric you decide — seem almost unrelated. It really comes down to: how complex can a thing be, and how much variance can you fit into this element? You'd probably find some sort of maximal state of compression, but whatever that is, it feels very far away, because we haven't started to slow down on how small the models can get — they just keep getting denser, with no brakes. As far as self-feedback: that's another layer of efficiency that could be gained, because then you don't really even have to store things in the model, if you're storing them in a kind of social exchange, some kind of back and forth. There was a paper earlier last year — CAMEL, I think, is the name of the model — that was kind of like a clock: just two models inferencing back and forth and performing agentic tasks by guiding themselves inline that way. And that is a step toward getting around this sort of stateless behavior that the LLMs have implicitly. I think this is probably an element, in at least some sense, of any kind of architecture which is going to involve transformers and still have a linear, progressive interaction with things. But I think you have to be really clever. I certainly don't think you should have a model feed back on itself and train on it, unless you're doing that in quite a convoluted way, because you're just going to get a model that's very stylistically whatever you decided was a good thing for it to train itself on — for an LLM, anyway.
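A CAMEL-style back-and-forth can be sketched with stub functions standing in for the two models (no real LLM calls — purely illustrative): each "model" is stateless, and all task state lives in the shared transcript rather than in either model's weights.

```python
# Stub two-agent loop: state is carried entirely in the message transcript,
# sidestepping the statelessness of each individual model.

def model_a(transcript):
    """Stub 'planner': proposes the next step based only on the transcript."""
    step = sum(1 for m in transcript if m.startswith("A:"))
    return f"A: do step {step + 1}"

def model_b(transcript):
    """Stub 'executor': acknowledges the last instruction it sees."""
    return f"B: done with ({transcript[-1]})"

transcript = ["task: assemble report"]
for _ in range(3):                      # three rounds of back-and-forth
    transcript.append(model_a(transcript))
    transcript.append(model_b(transcript))
```

Swapping the stubs for actual model calls doesn't change the structure: the "memory" is the growing transcript each side re-reads on its turn, which is the "storing information in the logic of the system rather than the weights" point made next.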
But storing information in the logic of the systems, rather than in the weights of the models, I think is a good place to look if you're trying to solve that problem.

Okay, shifting a little bit. You wanted to call this "Open Source AI: The Truth Is the First Casualty in War."

Yeah. Well, fair warning: I really just wanted a pretty dramatic title for the video so that I could get people to look at it. Because the thing about the open source, man, is that it looks really, really good from the outside right now. If you're engaging with it, it looks very vibrant: there are many thousands of models, the leaderboards are shifting around every single day, people are engaging, and companies are getting funded. And it's the exact same number of people there was a year ago, and those models are all the same models moving back and forth, and none of that funding is hitting people who are in the open source. Okay, some of it is — that's not entirely true; some people are definitely getting funding. I think Nous just got funded, for instance. I think it's really more our fault than anyone's, because, dude, it's really hard to make yourself do a UI and do HTML and CSS when you can go do AI instead. And if you're doing HTML and CSS, it's much harder to be inspired to work for free for a long time. So that's a gap that has to get bridged. That, I think, is really mostly what this market thing is. But narrative is important, especially now, because we're not out of the weeds, you know. I think it's very much either taken way too seriously or not seriously enough — this idea of: where do we go from here? What does it look like in a year?
And what is everyone arguing about? Because there's noise in the space, and there's people talking about AGI, and everyone's kind of scared, because the technology is really, really fast and really hard to understand. And AGI, and being scared, and arguing about it, is the only thing that really, truly, potentially is going to have consequences, I think, because in both cases your own best interests are not being served. All that needs to happen for the people who are very scared is that they get access past this really dense, opaque wall around whatever it is we're talking about — because it's not really scary once you look at it. You kind of get it; it's just shocking at first, and people aren't used to things moving fast, and that's really most of it. And AGI — I mean, we're not even there. I can say for a fact, it's been a year and we've made zero percent progress toward AGI. OpenAI is marketing when they talk about AGI. No one's got AGI. No one's even headed in the right direction, I promise. It's narrative, and it's argumentation, and it's political, and it's money. It's the same thing it's always been, except this time we really can't mess around too much, because we have a chance of enabling people to have access to all of the understanding that we've compiled as a species — on the internet, in books, and in each other. You can do a lot with that, and you can really improve your life substantially, and be free of a lot of the stuff we deal with: a lot of the manipulation that we get from other people, other companies, other whatevers. And we could also just totally drop the ball: we could center all this on, like, five guys, and they could, even through no fault of their own, damage the social structure that we exist in. I mean, this has already happened, and the nature of the beast is that AI is opaque and very hard to understand.
And the first thing that we ever did with it in the markets was use it to manipulate people's behaviors without them knowing what they were interacting with. This is what the filter bubble is, the algorithm — you know, whenever you see two people online and they're both ridiculous, because they're both total polar opposites, because they're both taking baby's-first-take about whatever topic — that's recommender systems. That was optimization for engagement; it was unsupervised, so people just didn't know it was happening, and the companies were making money, so they didn't really look too hard. And that's something that can go away. These kinds of problems can be dealt with, as long as the signal-to-noise ratio stays good and the focus stays on utility. But yeah, I guess that's what I mean when I say the truth is the first casualty in war: don't get disillusioned. It looks good because it just looks good, you know? It's not bad, but the open source isn't improving just because people are doing stuff — because the things that people are doing is just stuff. It's not all negative, it's quite positive, but you should come do more in the open source. I'll give you free stuff. We do it all the time. It's great.

I heard Autometa wants you. I heard: nothing to fear about AGI but fear of AGI itself. I think you called it like you saw it. I mean, what is truth?

That's a tough question. I think that it's — oh man, to take a very famous quote that might be kind of funny if you recognize it: I know it when I see it.

How do you recognize it, or know it, or know that you know it when you see it?

Maybe you don't, right? That's the problem. I think if it's a person, and you're talking to a person, that's pretty easy — that's at least doable; maybe you've just fit really good patterns. But I think with AI — and I don't think that anyone is doing this, but the potential is there — you could certainly manipulate people in a way that's very, very,
very hard for them to detect, at very large scales, for very cheap. But I think it's less important to highlight that in particular right now, just because we have so many more fundamental things that we should deal with first — because that problem solves itself. And when I say it solves itself, I mean that those systems which do exist, exist without the people intending to make them, right? So it's always accidental when there are going to be consequences from those kinds of things, and the accidents are going to be largely because someone set it up and then left, and then other people ran it and just kept putting money into it, and then it was just gone. And these things are going to become robust — there's no way they don't, because it's emphasized now and becoming more and more important. Generative systems are no longer really relegated to a thing that runs on a CPU in a server room that no one touches; it's all getting interconnected with everyone's pipelines, because this is starting to very clearly become the middle of what everyone's doing. Even if you're not in the AI space as an AI company, you're still a company, you're still dealing with data, and you'd rather talk to it than code it. So we're in luck there. But the truth, I think, is up to you, really, if you want to take a philosophical note on it. I think you should be able to choose, right? And that's what alignment really is. It's not really some moral framework that you decide for other people. It's: you have a tool, and the tool does tool stuff, and it doesn't blackmail your grandma — it's aligned.

So, I guess: Autometa says, don't blackmail my grandma. Totally noted. Where do you see alignment in relationship or tension with other virtues that people have?

I don't think it is — that's the thing. It's not, fundamentally. If it is, then it's not alignment; that's the whole point. And it's not even a complicated thing. It really shouldn't even be
controversial, because the truth of the matter is that something that's aligned is something that does what you think it does, when you think it does that, and it does it every time.

Isn't that just servility?

I don't know, I don't really think of it like that. I just think of it like a tool, right? Right now, I mean. Maybe in the future that might be servility; I think probably not anytime soon. I think we'll have a lot more pragmatic discussions before we socially decide how we're going to arrange ourselves on that front, and we'll probably have a lot of lead-up time to get there. It does bear mentioning, though: the ability to model language really well with statistics is compelling to people, especially if they don't really know what's going on under the hood. And I don't mean that it becomes boring once you look under the hood; it becomes more interesting. But once you do, and you kind of understand what's going on, it's like, oh, clearly this thing is not alive. This is a trick, right? It's a search engine. When you don't do that, though, and especially on your first interaction, because right now there aren't too many entry points and you've just got ChatGPT: ChatGPT very much acknowledges that it's an AI all the time, and it has a personality, a very strict set of behaviors which it responds with deterministically given the right kinds of inputs, and it feels regular. I think that gives people the impression that they're all like this, and that its non-human-sounding nature is just how these models are. That's intentionally implemented, by the way, so that you can't use it to fool people and impersonate anyone, because the truth is they're actually really good at that. You can get really, really natural behavior out of the models right now. So don't get tricked, I guess, is all I'm trying to say, because it's just compelling. But if you keep messing with them, you
give it about a month, man. You mess with it every day, and by the end of that month you're going to get it. There's a spark in the eye, and then there's glass behind it, you know what I mean? With LLMs as they are presently, anyways. I feel like you might have been aiming at a different point and I walked right past it.

Yeah, not that specifically, but that's an interesting way to say it. What about participation in the open-source component? What are the actual touch points to be involved, and where does that really work? How does it become accessible, procedurally and as a networking community, not just as an API that gets provided to people?

Sure. I mean, there is an implicit technical complexity if you want to get low-level, but man, that's true of everything; it's technology. If you just want to use a model, that's not as bad as it looks. That one's pretty easy. You might have to read a README or two on GitHub, and I know it's still on GitHub; we don't have too many good websites for this, because if I have to host the model and give you a website, then you've got to pay me, and that's not really open source anymore, you know, because someone has to hold up the GPUs. But you can go get the Aphrodite engine, or text-generation-webui, or one of quite a few inference engines now where you can just pick the model from Hugging Face. And Hugging Face is like GitHub: it's just a website where models and datasets get uploaded, and they have this fantastic coding library that makes it really easy for guys like me to code everything on the back side, so that you can just click a button, put in the name of the model, and it just works. That's easy to get started with. If you want to talk about how you engage communally, it's mostly on Discord, which is weird, but there are probably a whole bunch of
reasons for that which I don't personally understand, because I'm not a huge fan of Discord. But go on there, and every major company in the open source, every group and organization, they all have their own Discord. A lot of it's on Slack and Twitter too. Likely, if you're here watching this video, you kind of know where to look. The cool thing is all of the inroads to the open source are the same kind of network, so you can just slide right in anywhere: the KoboldAI community, mine, any of the big ones. The tenor of it is very much whatever your interests are. I mean, TheBloke himself, or tag me; the titans on this front are connected to everybody, the communities generally think AI is cool, and everyone is extremely performative, so the best stuff is always right in front of you, because they just put it there. Hugging Face in general is a good way to go, I think.

What has surprised you the most so far this year?

Oh man, just, like, every ten minutes. I can't even answer that, because I'm in a constant state of shock. I don't think humans are made for things that move this fast. OpenAI just invested in that company that's making the humanoid robots, the 01, right? So they've got a robot, and they're like, hey guys, we're making humanoid robots, cool. And then, what, a month later: hey look, they're folding clothes and stacking boxes, isn't that cool, and now we have billions of dollars, here we go, OpenAI, Microsoft. Everything is moving so quickly. It's really interesting and cool, and I'm quite excited. I still think it might be a bit of a media thing; I'm not a thousand percent sure we're about to get robots. It'd be cool if we did, though; I wouldn't be too upset by it. But it's probably a cost thing if anything: it's a lot of hardware.

Given the speed and the intensity and the novelty, even when you're in it, have you seen anything, or do you
have any thoughts on the kind of operational psychology of that? Do people write reflections, do you reflect or anticipate, other than just confronting the moment as it flows through us and we flow through it? What do we do with our minds and bodies?

I, like, predict the next token, and that's as far as I know. Yeah, I think the reason it goes so fast is that there wasn't really too much money in the space. There was, but it was in a very narrow corner that I think most people who really understood the systems wouldn't have engaged with, or otherwise didn't have the expertise to. And then when GPT-3.5 and GPT-4, and really even GPT-2 to a lesser extent, started to get a lot of momentum and eyes on things, everyone's like, I can go get ten billion dollars, you know? I think that motivated a huge influx of cash. There already was some: NVIDIA was already making giant models, OpenAI was there. But the difference now is that because there was money being made rather than just spent, and a market use case has become clear, the people engaging with it are doing so for cost-first reasons: they want to reduce costs as much as they can and improve efficiency as much as they can. And it turns out that if you're not doing a formal implementation of a new thing for the first time, just to write a paper and then throw it away, you can actually do a lot on that front; you can really optimize everything in a lot of directions. I think the H100s arriving when they did also made a giant difference. H100s are quite crazy. NVIDIA actually, if I recall correctly, and I may be wrong, but if I am you can verify by watching the NVIDIA keynote where they announced this, around March or April of last year: they took warehouses full of A100s, loaded them up with ray-tracing AI, and predicted
the patterns of all of the photons moving through this insane ultraviolet laser they make by turning liquid tin into pure energy with a microwave disk. And they didn't run into the uncertainty principle, because they didn't measure the light, they just predicted it all, so there was no interference from the collapse of the waveform, and they can print architectures below two nanometers. Every single thing in that sentence is sci-fi words. So it's just a lot of forces snapping together at once.

Efficiency, though, is not what people think it is. People often take efficiency, and how good the models are now at reasoning, as progress towards AGI or takeoff or whatever, and it's not. One thing you notice when I discuss this with other people who have better opinions than my own, or sometimes even people who are just thinking about it in general, because there are a lot of really intuitive and really smart people out there, is that it's not about how big you can get it, how smart it becomes, or how good its reasoning gets, because that doesn't solve any of the problems that are currently there. Think about it: if you just drop GPT-4 in a McDonald's, it's not selling anybody a hamburger, right? It's not even close to selling anyone a hamburger, and it's not because it's not good at reasoning. It's because it doesn't do anything until you go touch it. You have to send it something to get something back, and it doesn't know that the next time you send it something. There are all these features that would have to be there for something to actually start to exhibit the kind of continuous, live, real-life back-and-forth interaction that we evolved over time without even thinking about it. We're not even heading that way. So fear not, I guess: we're just getting really cool tools,
mostly. It's going to get a lot easier to code, which is going to let you code really cool and interesting things, because you'll be able to just describe what you want to do and walk through the logic, and not have to learn new programming languages. Which is great, because I really don't want to have to learn JavaScript, and I think I have to learn JavaScript. I'm hoping I can just wait it out.

I think that angle, with the language of thought and the ability in systems modeling as applied in artificial intelligence and active inference and so on, means increasingly: if you can dream it, if you can say it, you can do it. The composability, the path through semantic space, why that person said it that way at that time, and when we can start to have the interfaces that support that. We don't question that three plus three on the calculator is going to give us a number, and on this fast epistemic hedonic treadmill we already barely question entering in text and getting an image. And that kind of meta-fluidity, as the input-output relationships that our bodies have become used to through development and evolution are being remixed and augmented, massively. You're totally right that a very small inside conversation has tremendous propagation of causal consequences for the action-perception loops of people who aren't even here today.

Okay, in what sense? I agree, if you're saying what I've been saying, but I feel like I'm missing a point here. When you're referring to a small internal conversation being important, I understand what you mean in the sense that, yeah, if somebody needs to hear something and just understand what we're talking about because they don't know anything about AI, that's very important. But that was a metaphor, right, an analogy or something?
I think I missed that part. What are you talking about there?

Yeah, it could have been a fractal-dimension metaphor, but that's not how I intended it. I just meant that when these semantic augmentation techniques are crafted, with artificial intelligence, and then disseminated, including their plasticity, to end users, that's a level of semantic and intermodal control that's just getting airdropped on humans who are here today, and on those who are not.

Yeah, no, this actually touches on something that's very interesting to me personally, because I've watched it happen the entire time I've been in the space. There's always been this through line of people who are, weirdly, almost never AI guys; they're always pure Python guys or pure coding types, and it's always the really smart people. What they're doing is specifically modeling social systems at a very high rate of iteration with AI to get work done, which sounds a lot drier than what it really is, but I don't know a good way of phrasing what's actually happening there. It's very interesting, and it's exploiting the inherent structure of the way that we communicate, because we had to structure that just to make it this far. And what's cool about it is that the models do it and they don't know that they're doing it, and the people make the models do it because it works, and they don't know why it works. They just do it, all the way up to the tippy top. The only thing that's really happening is this, but fast. And because it's so fast, processing information really, really quickly has this emergent quality of its own, of being more robust at creating structure. An example would be GPT-Engineer, right, or BabyAGI, where it's communication, explicitly linguistic, normal back-and-forth, producing work.
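The BabyAGI-style pattern mentioned above can be sketched in miniature: a task list where each "completed" task can spawn follow-up tasks, all expressed in ordinary language. This is only an illustration; `fake_agent` is a hypothetical stand-in for a real model call, and the tasks are made up.

```python
# BabyAGI-style loop in miniature: a queue of plain-language tasks where
# completing one task can spawn new subtasks. In a real system,
# fake_agent would be an LLM call; here it is a deterministic stub.
from collections import deque

def fake_agent(task):
    # Stand-in for a model call; returns (result, new subtasks).
    if task == "build a todo app":
        return "plan drafted", ["write backend", "write frontend"]
    return f"done: {task}", []

tasks = deque(["build a todo app"])
log = []
while tasks:
    result, new_tasks = fake_agent(tasks.popleft())
    log.append(result)
    tasks.extend(new_tasks)  # the "emergent structure": work creating work

print(log)
```

The interesting design point is exactly what he describes: the control flow is nothing but messages in natural language, and the structure of the work emerges from the loop rather than from any explicit program logic.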
And that feels important, and there's not really a word yet for what that type of thing is, I think.

I wonder what you think about this kind of area of structure. One aspect seems to be that rather than the monolithic model, where we see how well one monolithic API returns evaluations on a dataset, there's a co-constructive element: the benchmark of any given nest mate in the colony is not that important; what's important is the overall architecture that the colony creates. That's one aspect. The other aspect is that for those whose work is organizational, or as a facilitator, or other things in that area, care and compassion are expressed through technical means. So there are situations where technique is the end, and then there are technical means for non-technical ends, and all the other quadrants over there too. But it can truly be the case that ultra-advanced technology can be utilized for pro-social and for human ends, just like an engine could be used to help people. This is an information engine, and it similarly can be used, even in its early days and phases, for that goal.

That's a really interesting take; I hadn't thought about that. When you put it that way, yeah, it is clinical when you consider that our social structures have a logic to them, and if you do them fast, that logic happens fast. But it's more interesting when you consider what that means in terms of humans: our expression, and the things that we value, because that is important to it working. So if there is a latent fear that people have at the moment, of getting pushed out of the economy, of the world becoming very technological and dry, then fear not, because this stuff works really well.
And I could see there's clearly a utilitarian use case for preserving that human element. You probably won't even have to convince anyone; it just has to get really fast. It is interesting to consider that our technology really is becoming this weird blending of social stuff with rigorous, formal bits and bytes.

I have many questions on this. We've talked a lot about the outputs and the quality of the outputs, but our experiences in these knowledge systems are not only outcome-oriented. So if we can train models on all dissertations, and then it's possible to one-button-write a dissertation, how does that change the training environment and the actual pace of our lives, when things that we spent time on have not just been rug-pulled or accelerated, but outmoded, like steam engine to gas engine? It's like, well, we don't need that kind of fuel anymore, or we don't need that kind of construction anymore.

I think it's an overall positive thing, because there's really not a good reason to say that we'll have fewer jobs, if you want to take it as broadly as possible. You can keep adding more complex stuff, stuff so complex that no one can describe it fully in a single sitting, or ten, and that doesn't make things easier to deal with; it just means you're going to need more people to handle it, because it has more edge cases and reaches broader. You're just solving problems, and we're not going to run out of problems anytime soon. I myself have a hard time abstracting away from the input and the output: what is the utility, why are we doing it, how much profit, what problem does it solve? But that's because we are buried in problems right now.
You know, everyone spends 100% of their day, every single day, from when they wake up to when they go to sleep, solving problems. We're going to be fine. I think that if you really like something and you're very attached to it, being enabled by AI to do it even better, with a broader reach, and to do all the cool weird stuff that you would have needed, like, fifty employees to do before, is huge. You're going to be able to do it even better, and now you're actually going to be able to give people the thing that you have in your head that you otherwise didn't have time to build. That's a big deal to me. Functionally, you're never going to have a situation where everybody in general gets the ability to express themselves more fully, and more ability to do the things they care about, and things get worse, unilaterally, throughout all of history. I mean, yeah, there are cases: there are individuals, sure, and there are systems and groups. But overall, we got here, right? There's a floor, and if you want to raise it, you can. So if you're worried, like, what if I get outsourced? Ask yourself: if you're really good, knowing what you already know, which many people probably don't know as well as you do in your niche, and you also have an AI that you know how to use, now you're Superman, right? That's one. And two: if we don't have to work, why would you make people? I don't think we won't have to work; I think we'll definitely have to work. It sucks, I'm sorry, guys. But at some point it becomes a question you have to ask: otherwise, why would you make us go to work? We don't have to. This sucks.

Many philosophical debates related to meaning and life and all these big topics are quickly becoming empirical and actual. And I guess,
in the current moment, you have a lot of alarms near you.

Yeah, there were, I guess, some cops going down the road there. It's all good.

In the current day, we really have the empirical consequences and playing-out of these debates, and it's hard to feel like there's even a tempo or a beat to reflect on it. Maybe that's just part of the game: most of the dust that kicks up just dissipates away and isn't recorded. The footsteps are recorded, but most of the consequences of those footsteps dissipate away.

Dude, it's data, right? That's really it, it's just data, and I can do a lot with data. I think that you can do a lot with data.

The problem being confusion, I guess, from the noise, and not really feeling like you can get your bearings.

I mean, that's why I'm here, right? That's what we're doing this podcast for, and I think that's what many communities are currently activating for and being activated around. Understanding what's happening is free. It's fundamentally still largely a research space; nobody is going to think you're dumb, because everyone is narrow, because it's so complicated. So I guess that's one piece of advice I'd like to impart to anyone who's watching this: if you feel like you want to ask questions, but you don't want to look dumb because you don't really understand, dude, it makes no sense to anyone. Don't worry about it. It's really complicated, I totally get it. Ask questions. It's a science thing; we like questions. It's okay, it's highly respected. You're cooler if you do ask the question, honestly, because it's fun to be able to answer them and not have to ask them ourselves. So don't feel intimidated by the space. And we all know the internet sucks, every single one of us: I know, everyone I work with knows, everyone you work with knows. Ideally, we get to a place really quickly where the only thing you have to do is want something, or want to know something, and you just get that.
Whatever we have stored, full and total access to it, without having to try to learn data science and learn to code and all these other things and need a bunch of hardware. That's doable now. I could do that today if I had, you know, a bunch of money, but if it's achievable with what we have at this moment, then it's a foregone conclusion. The space is moving fast, but it's moving fast in a lot of ways at once, so just because you don't see movement in a particular direction, it doesn't mean it's going to take a huge amount of effort to get there, or years and years. It's really just that everyone works on too many things, we're all overworked, and there aren't that many people who feel like they should engage with the space, or that they can, because they didn't go to college or something, because they work in a vape shop. You can definitely handle it; it's just intimidating. So certainly, anyone who's interested, jump in. It's cool, it's sci-fi all the way down. It doesn't get boring, I promise. Well, it might get a little boring sometimes; if it does, just go look at one of the other crazy sci-fi things happening in the space.

That's nice. I feel very similarly about questions. The oldies are the goodies, and the basic questions are the ones that beginners can ask and the ones that live in the experienced as well. Either it's a question that has been asked, at which point the answer is new to you and it's education, or it's a question that hasn't been asked, or has been asked but never answered, and then it's research. So at the speed of question marks, you can accelerate your own epistemic journey, and the broader epistemic adventure we're all on together, by putting new questions together.
And the speed is moving faster than we can blink, which brings some new kinds of dynamics, like quantity has a quality all of its own, as they say. But that human component, and the fact that hopefully the welcome to those of all backgrounds can be made clear, I think is also really positive.

In fact, dude, I don't know if you've had this experience, but I've had it a bunch of times, where somebody has no idea what's going on. They're just like, this is the coolest thing ever, I'm involved now, and you're going to help me get there. And I'm like, okay, right. And they have the best ideas. There are people who intuit things that took me months of eating bricks to learn, and they're totally new to the space, no idea what I'm talking about, but then they go and build something with basic Python that's totally new and shoots off in a completely different direction. It's because there's really not that much that's been done; it's all been very iterative and very slow on many, many fronts. There's been a lot done, but there's a long way to go still. You can strike out in basically any direction and immediately hit the wall of, oh, no one's ever tried that before. Okay, just try it and find new stuff. And not only that, find old stuff that we just forgot about, because there are so many papers.

I'll make one comment and then ask a question from the live chat. It reminds me of studying ants and loving ants in graduate school. It's like, wait, no one has measured humidity and the rates of water loss in this one species in this one valley? How can we not know, when that's such a beautiful part of nature?

Yeah, I guess when you start at AI, you're already way specific.

Yes. Okay, I'll ask some questions from the live chat as we head towards the end.

Sure.

Matthew H wrote: will LLMs continue to lead the way in AI, and for how long? I don't think LLMs could ever produce AGI. Am I wrong?

No, you're absolutely right.
If they do, they're going to be part of something bigger. It depends on how much money people are really going to keep spending on them, man. Maybe two years at the outside, I think, that we're mainly focused on LLMs, but probably not even that long. And then what's behind them? I don't know, something else, something cheaper, probably. The thing with LLMs is that they cost quadratically more the longer your context length, and more the more weights you have. Every single direction of improvement is almost always just a million times more dollars: you had one thing, and now it costs as much as everything else, plus one, again. So that alone, I think, is plenty of reason to say this ain't it. But also because it's too simple; I think systems make more sense than blobs.

I'll just mention one kind of recent active inference advance for planning capacities, where that kind of exponential computational complexity was also experienced: to go from a plan of length 10 to length 11, there's a blow-up, just like in chess playing. However, recently, with some work on direct policy inference, it's possible to have linear increases, so that making a plan of 11 steps versus 10 is only about 10% more computationally intensive, which is closer to how it feels, in the sense that 11 feet is 10% longer than 10 feet. That's a really exciting way that some of that planning explosion can potentially be addressed. It's not everything; it doesn't remove that exact combinatoric explosion everywhere. But what's quadratic today could be linear tomorrow, and all of a sudden new categories of possibilities appear.

I think that's absolutely true, and also phenomenal, and also still not enough. Even linear, if you're going token-wise, is real hard to make work, man. I hit a million tokens in the last couple of hours, you know what I mean? I think that's cool.
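The scaling contrast the two speakers are circling can be sketched with toy numbers: exhaustively evaluating every possible plan of depth d over b actions costs b^d, while picking one action per step (as amortized or direct-policy approaches do) costs roughly b times d. The branching factor here is illustrative, not taken from any particular model.

```python
# Toy illustration of the planning blow-up discussed above.
# Exhaustive search over all plans of depth d with b actions per step
# grows exponentially (b**d); choosing one action per step grows
# linearly (b*d), so depth 10 -> 11 is only ~10% more work.
b = 4  # hypothetical number of actions available per step

def exhaustive_policies(depth):
    return b ** depth      # every branch of the full search tree

def per_step_evaluations(depth):
    return b * depth       # one action choice evaluated per step

print(exhaustive_policies(10), exhaustive_policies(11))    # 4x jump
print(per_step_evaluations(10), per_step_evaluations(11))  # +10%
```

Going from depth 10 to depth 11 multiplies the exhaustive count by the branching factor, while the per-step count just adds one more step's worth of evaluations, which is the "11 feet versus 10 feet" intuition.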
Yeah, no, I don't have any comments on that, though, because otherwise I'm just going to talk dumb stuff about active inference; I don't know what you were referring to as far as the most recent advancements. But now I want to know.

We'll explore, we'll explore more. I guess, sort of as we close, there are so many things I'm curious about, but I'm going to ask what's top of mind. How do you physically structure your setup and your approach, with just the choices of what to do with the one of you?

What?

Like, how do you choose what to do with the one of you, as you have the ability to spin up all of these symbolic kinds of missions? What's really the positive and guiding reminder? I'm not sure; I'm just wanting to know the human side.

I don't know. That's a question where either I can be honest, and we go on for another three or four hours and I might cry, or I can be more honest and say: I just like cool stuff, and sci-fi is pretty neat, and I'm okay with math, so it all just kind of falls into place a little bit. I really do think that this whole AI thing is a human right. I think we should just enable people, man. That's it. That's really my whole message I came here with today: it's cool, and you should have access.

Okay, I'll ask a closing question, and then your answer can be your closing thought. Upcycle Club wrote: do you have any insights on emerging trends in open-source AI development?

Yeah, dog. Oh man, emerging trends in what context? Dude, it's all emerging trends; the whole thing is emerging really, really fast, in a bunch of different directions. I think we're going to graphs. I think reinforcement and time-series models are going to be really, really important really soon. I think transformers suck, but I think we're going to keep using them for a while.
I think we're probably going to get some gigantic MoE that runs really efficiently because of all this 1-bit, 1.58-bit quantization stuff that just came out. But that's probably not going to get used a whole lot, because what they don't say when they talk about this stuff is that you then have to go write kernels if you want to use it in your company, and no one knows how except for a couple of people who cost a lot of money. So in general, yeah, a lot of trends, all the time.

Well, thank you for 72.1. We can come back and maybe see a little bit of generativity and a little bit of active inference. Thank you for engaging, and I look forward to next time.

Yep, absolutely. Thanks a bunch, man. Bye.