Alright, hi everybody, and welcome. I am here with JJ Allaire. My name is Jeremy Howard, and we are having what I was originally very proud of myself for inventing: a two-way AMA. I wondered why other people hadn't come up with this idea, and then I realized, oh, I think I just invented another name for a conversation. So this is either a conversation with JJ Allaire or a two-way AMA; we'll see if it turns out to be any different. So g'day, JJ. Thanks for joining.
Great being here.
I always like to find out a little bit about people's environs. Where are you talking to us from?
I'm talking from my home in Newton, Massachusetts.
Newton, Massachusetts. And where is that?
It's due west of the city, maybe 15 minutes outside of downtown. So it's an inner-ring suburb.
And why there? Is this where you've always been? What's your favorite place in the world?
Not even close. No, I was in Philadelphia till I was 13, then I moved to Minnesota and was there for a long time, till I was about 30. Then I moved to Boston because of work, because of the company I was working with, and I just ended up staying here. And I'm here because the public schools are great. All other things considered, I'd probably live in the city, but for the time being I want to take advantage of the public schools.
Boston's a great town. I spent quite a bit of time there when I set up an office for one of my earlier companies, and I got very into the Boston Red Sox, as you do. And I got very into the Tennis and Racquet Club.
I used to be a member of Tennis and Racquet, and I have a very good friend there who still plays there quite a bit. So yeah, that's a gem.
Yeah, that's a good town. And I also loved that, you know, because I didn't know anybody.
Yeah, it's definitely a town where you can just go to a random bar, sit down, watch the game, and whoever's around you will chat. Interesting people to talk to.
Yeah, that's true.
So you're in Australia now.
I am in Australia now.
And you were in Australia before?
Yeah. So I'm in Queensland, and the part I'm in is kind of a subtropical beachside town. It's not a resort town, but the nearest capital city is Brisbane, which is like four or five million people, and it's the nearest beach town that people would go to for a weekend or something. I always wanted to live in Queensland. I never understood why everybody didn't want to live in Queensland, and now that I'm here, I'm even more convinced.
And the university is in Queensland too?
Yeah, so the university is in Brisbane. I'm an honorary professor at the University of Queensland, which is 45 minutes from here. When I teach there, I just drive in, do my thing, and drive home.
Cool. And in between, you were in California.
Oh, yeah. So I grew up in Melbourne, which seemed like the centre of the world at the time. I never understood why people talked about Australia as being far away, because it seemed pretty close to me. But then I moved to San Francisco, where I lived for 10 years for a previous startup called Kaggle, and suddenly realized, yeah, Melbourne is a very long way away, physically and in every other way. I do now feel that it's a very good experience for somebody who grows up away from an intellectual centre like that to spend some time living in one, just to experience it. So tell me about what you're doing now. You're the CEO of RStudio?
RStudio, which I started about 11 or 12 years ago. It started off originally as just an open source IDE for R.
And it was not actually intended as a company; it was just me and one other person. We had worked on lots of development tools, programming languages, and authoring tools in our previous lives. And I had been involved, in graduate school and as an undergrad, in social sciences and statistical programming in the social sciences, and originally that was what I wanted to make my career. Then I got swept up into software. So when I finished with the startup, I found out about R, and I said, wow, there's an open source statistical programming system. That's cool; I really would like to work on open source. And, as you know, it's written by statisticians for statisticians, which meant there were a lot of things it got right, but some of the software tooling part they struggled with. So I said, well, here I can make a contribution. I know the tooling part, and I'd like to see this project get used by more people. So we just started working on the IDE. That was just a couple of us, and then, long story short, we ended up getting to know Hadley Wickham, who was working on what was then not the tidyverse but dplyr and ggplot and things. And we said, let's all work together. That sort of begged the question of how we were going to work together and make sure everyone gets paid and everything, so he said, well, let's try to make a company out of this. And we did that by building enterprise-grade servers that made it easier to adopt a lot of open source software.
I knew Hadley from before RStudio, because of course everybody in Australia and New Zealand knows each other. I remember actually hanging out with him in Texas when he was at Rice University.
And he was already famous for his amazing contributions, and I was saying, wow, the university must love having somebody like you there. And he said, no, quite the opposite. They don't appreciate it at all, and it's hard to get support for what I'm doing. And I was just like, what a terrible thing about academia is going on here. So I was glad when you found him and he found you, and that's worked well.
Yes, it has. So that's good, and the company has developed well. One of the projects I worked on was R Markdown, which is kind of a literate programming system for R. I actually started working on that about 10 years ago. We had a lot of success with it, but it was quite narrow in a sense.
Why did you do that? Did you have some previous interest in literate programming?
Honestly, there were two things that happened. I was working with a bunch of faculty who were teaching R, and what they were teaching at the time was Sweave, which was this LaTeX-based literate programming environment that was built into R. They were doing that because they wanted to teach people literate programming and reproducible workflows, but that meant they were teaching LaTeX, which was really hard.
But that's unusual already, right? Not many languages had this. R had this thing, Sweave, built into it in like 2007 or something. They were way ahead.
Or even before. R always had this in the community, and it was actually one of the core members of the R team who built Sweave, so they were pushing this literate programming idea. So I kind of got infected with it by exposure to that.
Cool.
And then at the same time I went to useR! in 2012 in England, and one of the people there presented a three-hour seminar on org mode, which was another system for literate programming that was more human-readable and ASCII-oriented.
And just to clarify for people who haven't seen it: org mode is in Emacs. It's not just a mode but also a file format, which is in many ways a lot like Markdown. It's not at all compatible, but it's the same basic idea, a text-based format. But in org mode your code can also be evaluated, and the results of the execution appear in the document. So it has a lot of what R Markdown is, right? Executable code, and the outputs appear.
That's exactly right. So it was this idea of, well, we've got Sweave, but it's really hard to teach people LaTeX. Some people were saying, is there a way we could get this into Office? Can we do this with OpenDocument? How are we going to get people to do this while not burdening them with learning LaTeX? When I saw org mode I said, wow, that's a better idea to me, a more human-readable, ASCII-based idea. But at the time Markdown was already really taking off. It was already in use on GitHub, and it was in use in a bunch of wiki systems. So I said, let's take the core ideas of org mode and Sweave and build a Markdown variant of them. And I did it with R because that's the environment I was working in. I just had blinders on: let's just make this work in this environment.
Were you personally doing stuff with literate programming yourself and finding it useful?
It was helpful because I was building websites and documents, and yeah, I definitely thought this is a great way to work.
And at the same time, Yihui Xie had created a package called knitr that was a sort of replacement for Sweave, a better, feature-enhanced version of Sweave. And he made it open so it could do reStructuredText, it could do AsciiDoc, and it could do Markdown. So he and I got together and said, let's create this thing called R Markdown, which basically says we're going to use knitr as the computational engine and we're going to use Markdown. At the time we basically used Sundown, which was GitHub's Markdown processor, and we added math. So that was pretty straightforward.
These tools all have R in them. Are they all exclusively R tools?
They are. They require R to run. Now they're multi-engine: knitr has this idea of engines, so there is a Python engine and a Julia engine. But you're calling Python from R. You have an embedded Python session in your R session, or an embedded Julia session in your R session. So even though it's multiple languages, it's very R-centric. So we did the first iteration of it, and you could just make web pages. And then at the same time Pandoc was evolving, and people were trying to figure out, oh, let me just glue together R Markdown with Pandoc, and then I can make Word documents and PDFs.
That's going to be something a lot of people are not familiar with. So Pandoc is basically a Markdown processor. I think it's written in Haskell, right? Although it's a compiled binary, so that doesn't matter for most people. I mean, it is kind of a Markdown processor, but it can take almost any input, convert it into its Markdown, and then convert that into any output. Any text to any text.
Yeah, it doesn't actually even convert it to Markdown. It converts it to an internal format that's a sort of abstract document. So if you're going from Word to PDF, it has never seen Markdown.
So JJ, I had used Pandoc before talking to you about all this stuff, but I had used it in a very naive way: I've got an HTML document and I want to convert it to LaTeX, or I want to convert LaTeX to Markdown, or whatever, and I just run it. What I've learned from you is that Pandoc actually has this embedded Lua interpreter and this very generic system, a bit like nbconvert in the notebook world, that takes the input as an abstract syntax tree. You can munge it however you like and spit it back out, and you can put that anywhere in a Pandoc pipeline to construct your own.
It's a pipeline of transformations to the document. The most obvious of which is, I just want to make a PDF, or a Word document, or a web page. But there are others.
The thing to mention is, as you say, it doesn't particularly require Markdown, but Pandoc Markdown is this fairly universal format, because you can express things like divs and classes. With the Markdown syntax you can express the whole Pandoc AST in Pandoc Markdown. It's kind of Markdown on steroids.
The idea that was taken really seriously by John MacFarlane when he created Pandoc was this. The original Markdown had the idea of raw HTML, because John Gruber's idea was that this is just an easier way to write HTML. So of course you can put raw HTML in there; if something isn't in Markdown, just go ahead and add the HTML. That's a good idea, but MacFarlane was interested in creating technical manuscripts, so he extended that so you can put raw LaTeX in there.
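The "pipeline of transformations" over an abstract document can be sketched in a few lines. This is only an illustration, not how Pandoc itself is implemented: Pandoc's own filters are typically written in Lua, and here plain Python is applied to a small hand-written tree in the `{"t": ..., "c": ...}` shape that Pandoc uses for its JSON representation. The names `walk`, `shout`, `emph_to_strong`, and `run_pipeline` are made up for this example.

```python
def walk(node, action):
    """Visit every element of a Pandoc-style tree.

    `action` receives each {"t": ...} element and returns either a
    replacement element or None to leave the element as it is.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replacement = action(node)
            if replacement is not None:
                return replacement
        return {key: walk(value, action) for key, value in node.items()}
    return node


def shout(elem):
    # Hypothetical filter: upper-case every plain string.
    if elem["t"] == "Str":
        return {"t": "Str", "c": elem["c"].upper()}
    return None


def emph_to_strong(elem):
    # Hypothetical filter: turn emphasis into strong emphasis.
    if elem["t"] == "Emph":
        return {"t": "Strong", "c": elem["c"]}
    return None


def run_pipeline(doc, filters):
    # Each filter is a full pass over the tree, applied in order.
    for action in filters:
        doc = walk(doc, action)
    return doc


# Hand-written, simplified tree for the Markdown text: Hello *world*
doc = {
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Hello"},
            {"t": "Space"},
            {"t": "Emph", "c": [{"t": "Str", "c": "world"}]},
        ]}
    ],
}

result = run_pipeline(doc, [shout, emph_to_strong])
```

Because every step takes a tree and returns a tree, steps can be reordered or dropped independently, which is the property the conversation is pointing at: the input format (Word, HTML, Markdown) and the output format never need to know about each other.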
So he basically said, you can also have LaTeX, and he made it very good at generating LaTeX.
And there's also Pandoc citations, for example.
Citations, right. So he added this idea of, let's take LaTeX really seriously, in a way that other Markdown processors tend not to, because that's not really their use case; a lot of them are tied to content management systems and are producing web content. And then, let's take citations really seriously. So it had a really robust implementation of citations and integration with Citation Style Language. So, really first-class citations and support of LaTeX, and then ultimately support of Office document formats, OpenDocument, and things like that. It was a more elaborate, comprehensive, hackable version of Markdown. So then we migrated: R Markdown v2 was based on Pandoc.
And how long ago was that?
That was about eight years ago. Pretty early on we moved to Pandoc, maybe even nine years ago. And then, to make a long story short, we created a lot of extensions to R Markdown. We created a thing for making books, a thing for making blogs, a thing for presentations, and one for fancy grid layout of documents. We did a version of Distill, the machine learning journal from Google; if you've seen those articles, we built an R Markdown version of that. So we innovated a lot, in a very fragmented way. And we ended up with a system that has a lot of functionality, fractured across a bunch of packages with a bunch of inconsistency, and that's R-only. And so we said, that is kind of a dead end in terms of having a bigger impact on scientific computing.
And so we said, what if we could take a step back and build a system that was agnostic to the computational engine, and at the same time roll up and synthesize a lot of the ideas that we developed over that 10-year period into one uniform system? That would be what we needed to do to really continue investing in a way where we felt this project is going to be meaningful in decades. So it was almost like taking a couple of steps back, and that was a couple of years ago. We said, let's start working on Quarto, which is language independent and engine agnostic, where the first two engines supported are knitr, which was what we supported in R Markdown, and Jupyter. And so those are sort of equal citizens.
Let me just get that up. Okay, so here's Quarto. So this is what you're working on?
That's what I'm working on now. That's pretty much what I've been working on, directly or indirectly, for about the last three years.
And this looks a lot like Markdown.
It does. Yeah, the syntax and the approach to things are derivative of R Markdown.
And so you've got some YAML front matter, so some metadata, which is supported by Pandoc, I believe.
That's right.
Then you've got some Markdown. This looks like something that's not in any Markdown I'm familiar with.
That's right. That's a cross-reference.
Okay. So it's saying, here is the label. And then the code. And so now we've got, as a result, the Markdown here, the metadata here. The code is also folded. And I guess I can't click on this picture, but if I could. And a hyperlink.
That was the cross-reference. It has numbered the figure. It's only figure one, but if there were 17 figures you'd see one, two, three, four, et cetera. So that's kind of the idea.
So, interactive documents as well?
Yep.
So we do integration with Observable and Jupyter. With Jupyter we put the most effort into Python and making everything work great in Python, and we've put some effort into Julia. Any Jupyter kernel works with it, but if we do a little extra work then it works better.
Yeah, so seriously, anything works. I've been recently playing with APL, and I created the first ever Quarto document with an APL kernel.
Nice.
And so here are links to APL glyph documentation. Here is an auto-generated table of contents.
Cool. And then here's a Python one.
Yeah. So that's what I've been working on. And the way that you and I got connected was that we got introduced; Wes said you should get to know each other, and then we got talking.
Yeah, that's right. [unclear]
Yes. So Wes introduced us, and it was like, what are you working on, what are you working on? You talked a little bit about what you do and literate programming, and I said, well, this Quarto thing might be related to what you're doing.
It might be. I mean, I already very much knew you by reputation, because I was not a big user of ColdFusion but I was an enthusiast of it, which I can come back to and talk about. And I was a user of Windows Live Writer. These are both things that you built, and Windows Live Writer was something which reminded me of the original Mac OS Graphing Calculator. Because it came from Microsoft, it kind of felt better than all the other things that were around it somehow, and I thought, how did something like this end up in the Windows extras, or whatever it was, Windows Plus?
Yeah, there was like Plus.
Anyway, yeah.
And then I remember at the University of San Francisco, one of our admin staff said, oh, we've got this question from a guy who is thinking of flying in for the lessons; you might want to get in touch with him to see if that's suitable. And it's a guy called JJ Allaire. And I was like, he's interested in fast.ai? That's really cool.
Well, the reason I was going to do that was I was creating an R interface for Keras. We had created the R interface to Python, which is called reticulate, and then we built the TensorFlow interface, and then I was building the Keras interface. And I said, I'm going to go take Jeremy's course in Keras. And then I found out, wait, it's not in Keras anymore.
Right.
So I said, okay, I would still like to take the course, but it's less right down the middle of what I'm doing. So I didn't do it, but I actually convinced one other person to do it.
Although you did tell me that some fast.ai ideas did end up in some of your work.
Yeah, yeah. So we were studying fast.ai, especially as we did our PyTorch work. Because, as you know, PyTorch doesn't offer you much in the way of a built-in training loop and doesn't really organize your work. Keras does. And we rather liked the things you did in fast.ai. So we said, can we do some variations of those for our interface? Because clearly it wasn't enough to just say, oh, you can use torch from R. For certain researchers it's fine, but not for end users.
Yeah, I mean, I try to encourage even researchers not to just use raw PyTorch for everything, because you really want to be incorporating best practices as much as you can.
Since we're on fast.ai, I did have a couple of questions. And one of them is this.
How do you help both new users ramp into things and make experienced users productive? You provide these abstractions, and there's a dial of how leaky you let the abstractions be. At one end it's, hey, we've hidden it; you don't even know PyTorch is here. At the other end it's, learn PyTorch, then learn our special shortcuts. And in the middle is something like: PyTorch is present, it's not hidden, and you can probably extend this with PyTorch. I think different software design problems lend themselves to different levels of leakiness. How did you think about that?
Yeah, so. I've been coding for 40 years, and I've spent a lot more time coding than building deep learning models, and a lot more time reading about and studying coding than deep learning. Software engineering, our ability to do good things with computers, is based on being able to use abstractions. And those abstractions are in turn based on other abstractions, and so forth, down to machine code. We're hidden from the hard disk controller, et cetera. And none of those levels of abstraction is the correct level. They're all correct for what they do. So with fast.ai, my approach has always been just the same as all the coding I've always done, which is: if I'm writing some high-level API, I write it using some lower-level API, which I then write using some lower-level API, and so on, until I get to the point where each of them is, ideally, trivially easy to use and is a carefully designed set of primitive operations that make sense at that level of API. So, for example, there are three main levels of API in fast.ai: the high level, the mid tier, and the low level. The high-level API is focused on applications. We provide support for four, which are vision, text, tabular, and collaborative filtering.
And then there are other folks in the community who have added stuff around medical, audio, and so on. In each case, you basically use the same four lines of code.
Okay, a push-button interface, if you want. Like a recipe you follow.
Yeah, and that was very much designed around the idea that one day we want to get rid of the code, and there'll be a higher-level API still, which is not code.
Yeah, this is what I wanted to ask you. When you finish, I want to ask a follow-up question.
Okay, cool. This is really important for stuff like deep learning, because the more boilerplate you have, the more things there are that you can screw up. If you have to manually create your validation set, manually make sure it's not shuffled, manually make sure the training set is shuffled, and manually make sure that the augmentation is only applied to the training set, each of those is something that you're reasonably likely to forget. And when things break in deep learning, they don't break properly. Generally they don't give you an exception or a segfault. They just give you slightly less good answers, or misleading metrics. So then the mid-tier API is the bit I'm most proud of. I find that's often the hardest bit to write: you want something that's extremely flexible, so that you almost never have to go deeper, but still really convenient. For example, we've got a thing called the data blocks API. I've been doing machine learning for, let's see, over 30 years now, and I just thought back to, well, what's the entire set of things I've had to do to get data into a model for training? And I realized that there were just four or five basic things. And I realized something when I pulled out those four or five basic things.
The huge number of classes I used to have before I built the data blocks API, I realized I could replace them with just these five things by putting the blocks together. So I was able to reduce the amount of code I had tenfold, make it easier for me to write my high-level API, and then give the same thing to all my users. And then the bottom-level API, which is still above PyTorch, was mainly filling in the things that aren't in PyTorch but should be. For example, I like using some object-oriented programming, and I believe that types should, where possible, represent semantic things. That's something which doesn't really exist in PyTorch, so I added object-oriented, semantic types to PyTorch. Something that they've since added, and it's still not amazing, but we did it first, is a computer vision library that operates entirely on the GPU and does things in a really efficient way. So stuff like that. The idea is that if a user is doing something supported by our application API, we want them to be able to use it. Then we want them to be able to say, okay, that worked okay, but I wonder, could I make it faster by doing this, or by doing that? And they can just replace a piece with a mid-tier API thing. So rather than starting at the bottom and then simplifying things with a higher tier, you start at the top, which is also how we teach, and then add in lower-level things if and as you need them.
Did you have a goal, and this is what I'm thinking about with leaky abstractions, for the case where someone has found something elsewhere? I have not personally used PyTorch, but I've used Keras quite a bit. Say someone finds the equivalent of a layer: someone has written a layer for PyTorch, and they find it on Stack Overflow. How do I reduce the error here, or whatever? Oh, do this.
One level would be, oh, you can literally just point to that. Another level would be that you need to package it, you need to put it in a frame that fast.ai can consume.
Yeah, so the idea is basically that it should be very easy to grab stuff from elsewhere and just use it. We've got a bunch of integration examples.
Yeah, that's a great virtue of a system, if it can do that. Then it doesn't suffer from "we have to do everything".
Exactly. So no special packaging, no special wrappers. For this one I grabbed the MNIST training code from the official PyTorch examples. They originally had it as a script, so I just changed it to a module. So here it is. This is their code. I took their code, and then I said, okay, what if we wanted to replace their training loop and test loop, which is a lot of code and also not a particularly good training loop and test loop, with the fast.ai one? By using the fast.ai one you're going to get for free things like TensorBoard and Weights & Biases integration, all kinds of metrics, automatic mixed precision training, whatever. So the answer is that you can take all that train and test stuff and replace it with these two lines.
That's great.
And then run this one line. And this is now also going to run with one-cycle training, so it's going to do a warm-up, it's going to do a cool-down, and it's going to print out as it goes. And that's literally it. And it's the same for other things. For example, I grabbed the PyTorch Lightning quick start and converted it to a module.
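To make the "warm-up, then cool-down" idea concrete, here is a minimal pure-Python sketch of a one-cycle learning-rate schedule. It is not the fast.ai implementation: the function names and the default values (`warmup_frac`, `div`, `final_div`) are assumptions chosen for illustration, and the cosine shape is just one common choice.

```python
import math


def cosine_interp(start, end, pct):
    """Smoothly move from `start` (pct=0) to `end` (pct=1) along a cosine curve."""
    return end + (start - end) / 2 * (1 + math.cos(math.pi * pct))


def one_cycle_lr(step, total_steps, lr_max,
                 warmup_frac=0.25, div=25.0, final_div=1e5):
    """Learning rate for `step` under a one-cycle policy (illustrative defaults).

    Warm-up:   lr_max / div  ->  lr_max              over the first `warmup_frac`
    Cool-down: lr_max        ->  lr_max / final_div  over the remainder
    """
    pct = step / max(1, total_steps - 1)
    if pct < warmup_frac:
        return cosine_interp(lr_max / div, lr_max, pct / warmup_frac)
    return cosine_interp(lr_max, lr_max / final_div,
                         (pct - warmup_frac) / (1 - warmup_frac))


# One learning rate per training step for a 101-step run.
schedule = [one_cycle_lr(step, 101, lr_max=1e-2) for step in range(101)]
```

A training loop would look up `schedule[step]` before each optimizer update. Starting low, peaking part-way through, and then decaying to a very small value is the whole policy; a library that owns the training loop can apply it without the user writing any of this.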
And so the data types in that one line of code can be used by fast.ai since they're fundamentally the PyTorch data types? That's how it all fits; they're not obscured?
Yeah, that's either true or we recreate our own API-compatible versions. The PyTorch data loaders are things which take something that is either indexable or streamable one item at a time and batch it. We created something with the same name, the fast.ai data loader, and then we added stuff to it. We added a bunch of callback hooks so that you can modify the data after it's been batched, or after it's been turned into an item, or whatever.
So I was thinking about your application layer, because I know in your course you say that high school mathematics and some programming is what you need to be able to learn this. And my question is, and I don't even know if this is a good or a bad thing, so it's more just a question: you can imagine, as you said earlier, an application that does transfer learning, takes various types of data that are known, and lets people say, oh, I'm doing computer vision. Is that the right layer or not? Do you think that's a desirable layer to have, or are you at the right layer now, where the person will encounter enough complexity that they had really best know some math and know some programming? I can see where it might not be desirable to go further.
So the answer is that so far we've failed at our goal of making deep learning accessible, because we require high school math and a year of coding. And that's not accessible, because I think only about 1% of the world has that coding background.
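The description of a data loader, something that takes indexable or streamable items, batches them, and exposes hooks after an item is produced and after a batch is formed, can be sketched in plain Python. This is a toy under stated assumptions, not the fast.ai or PyTorch data loader: the function name `batches` and the hook names `after_item` and `after_batch` are taken loosely from the conversation and are hypothetical here.

```python
def batches(dataset, batch_size, after_item=None, after_batch=None):
    """Group items from any iterable into batches, with optional hooks.

    after_item:  called on each item as it is read (e.g. resize, tokenize)
    after_batch: called on each completed list of items (e.g. stack, augment)
    """
    batch = []
    for item in dataset:  # works for lists and for one-item-at-a-time streams
        if after_item is not None:
            item = after_item(item)
        batch.append(item)
        if len(batch) == batch_size:
            yield after_batch(batch) if after_batch is not None else batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield after_batch(batch) if after_batch is not None else batch


# Usage: scale each item, then freeze each batch into a tuple.
out = list(batches(range(5), 2,
                   after_item=lambda x: x * 10,
                   after_batch=tuple))
```

Keeping the same calling shape as the thing being replaced, while adding hook points, is what lets code written against the original keep working when the richer version is swapped in.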
The goal has always been to get to a point like the one I describe with the analogy to the internet. When I started on the internet, you would have to do it all through the terminal, and even when the first GUI things came in you would have to set up PPP configuration files and whatever. And I'd read Usenet news with rn, with all its arcane keyboard shortcuts. I mean, I loved it, but it wasn't the most accessible thing. Nowadays my mom, who's 83, uses the internet every day to chat to her six-year-old granddaughter on Skype and whatever. That's what most AI should look like. We're starting to see a bit of that with things like Codex and DALL-E mini and DALL-E 2 and Midjourney, or GPT-3. I don't know if you saw, but yesterday a book on OpenAI prompt engineering came out.
Okay.
In fact, I'm going to see if I can find it, because it's quite interesting. Basically, there's still skill involved in trying to create beautiful and relevant images using DALL-E 2, but it's not coding.
Yes. It's a different skill; it's prompt engineering.
Yeah. And okay, I think I found it. So let me share my screen here. I like this because we're all about domain experts. So here's a whole book about how to create nice pictures with DALL-E, with lots of examples of nice pictures from DALL-E, and there's no code in it. It's saying, we've done some research to find out what kinds of words create what kinds of pictures, and here are examples of that for you.
So essentially, here you learn a craft: how to say the right sorts of things. It's totally different from programming.
Right. And it requires a genuine understanding of the domain.
So if you want to create good camera shots that don't exist, you have to know about words like "extreme close-up" and "sinister". You can become very, very good at this, you know, "extreme long shot", or describing shadows and proportions. This is the kind of thing we want people to be spending most of their time doing, and also the kind of people I want to be doing it: domain experts in that field. So we want product photography people using their product photography skills to create product photography mockups, we want disaster resilience experts to be doing disaster resilience, we want radiologists doing radiology. Right. So the tool that you would build for a radiologist, I mean, you can imagine a radiologist training a model. In a way, they're doing transfer learning; they're applying their data there. Yeah. But it's in their DICOM viewer, in their radiology workflow software. Okay. So again, the answer is that you would like to go quite a bit farther than you have. Right. I don't quite remember what we said at the time we started, but when my wife Rachel and I started fast.ai, we were thinking it's at least a 10-year goal of making deep learning more accessible. Our first step was, well, we should at least show people how to use what already exists. That's why we started with a course; that was the first thing we built. That way we would find out what doesn't exist but ought to. And then it was like, well, basically nothing works except computer vision at the moment, so we should at least make sure this works for text.
So step two was, I did a lot of research into text, and I built the ULMFiT algorithm and integrated that. So there was a lot of research to do, and then it was like, okay, from the research we've done, we've realized that there are a lot of things you could do a lot better if only the software existed. So step three was to make the software exist, and that was a lot of coding. And then you come back full circle and do another course, now showing the best practices using everything we've learned and built. Where are we now? And so repeat this. We're just about to launch version five of this process, which, except for a year off for COVID, has been an annual exercise. So yeah, I wouldn't be surprised if in the next five years we have quite a bit of the code-free stuff that we're aiming for. Okay, yeah. All right, my turn, if I may. Okay, you got it. I wanted to change track a little bit, if I can, to talk about your background, JJ. And the reason is that I like to understand the background of people who are doing interesting things in interesting ways. One of the ways I find you interesting is that your title is CEO, but in an interview I read, you said you spend about 80% of your time coding. And I know from personally interacting with you a lot over the last few months on building nbdev2 that, generally speaking, if before I go to bed I send you a message saying there's a bug here, by the time I wake up in the morning there's a message saying you've fixed the bug. So that's unusual. And it's also unusual that, I don't know, you seem to do things differently from most people. You feel like a kindred spirit to me in a lot of ways, in that you seem to like doing things reasonably independently but leveraging a small number of smart people.
And I was also interested to learn that, like me, your academic background is non-technical: you did poli sci, I did philosophy. I'd love to hear what your journey was from doing poli sci to founding at least three successful software companies and now working in scientific publishing. How did that happen? Well, there are a couple of different threads that come together. One was how I got interested in data analysis and statistical computing: I was a huge baseball fan. When I was about 12, I got hold of books by Bill James, who you have probably heard of. He was a math teacher from Kansas City who wrote the Bill James Baseball Abstract, which essentially created this idea: why don't we empirically measure everything we can about baseball and see what's true and not true? And it's amazing. I don't know any other sport that has a whole field of academic study of its statistics, named sabermetrics. He started all that. Anyway, what was impactful for me was that I was also very interested in politics. My parents were political activists, and I was mostly interested in politics and in baseball. I got the Bill James abstract, and I realized that a lot of what people said on television about what was true in baseball, not everything but a lot of it, was just nonsense: the coaches, players, and broadcasters. That had a big impact on me. I was like, well, if that's true, then a lot of the things people say about a lot of things are probably nonsense, and data analysis is probably really fundamentally important. So when I was looking at political science, that was my lens. And I happened to find a great mentor in college who was also really into it.
Can I just mention, I had a similar background but for a totally different reason, which is that I started at a big management consulting company when I was basically 10 years younger than everybody else. They all worked using their expertise and experience, which I didn't have, so my approach was, I'm going to have to use data analysis. Yeah. So anyway, I was in political science, and I was convinced I wanted to be a political scientist focused on data analysis, so I went to graduate school to get a PhD in political science. Before that I had taken a year off and worked at the Minnesota Department of Revenue as an analyst. I had done plenty of messing around with software. I had learned dBase and HyperCard and various other scripty things that a lay person could get access to. I had no training in computer science and I didn't take computer science in college, but I was able to get my head around things like HyperTalk and dBase. And then SAS as well. I was reading that you were doing stuff with SAS and SPSS and Excel macros. At the Department of Revenue I did a lot of SAS; I used a lot of very pragmatic programming tools. So then I got to graduate school and I just found, wow, I really care a lot more about software right now than I do about political science. It was '92, '93, that moment when software was really coming into its own. Can I just ask about that discovery: were you okay with that? Because I wasn't. For me, I felt embarrassed. I did, because of my mentor. Oh my God, I had spent four years, five, I had spent so much time with my mentor.
And I was just like, wow, I know what I'm supposed to be doing, and this is not what I'm supposed to be doing. But I really just went with the evidence: when I go to the bookstore I spend all my time in the computing section, and that lights me up, and that's what I want to talk about. I think you had more self-confidence than I did. Well, I also had a negative experience with academia, even though I had a couple of great professors. I didn't feel like I was going to succeed even if I was into it; it didn't resonate when I got there. So I was like, well, I'm not going to do this, and I think I want to do that, so I'm going to go try it. I basically went off and said, I'm not trained to write software, I need to learn a bunch of stuff. I started teaching myself what I needed to know, and then I eventually got bootstrapped into doing some contracting. So I was a contractor and kept learning stuff, and then by happenstance and good fortune I ran into the internet. And I had actually worked with my brother on it. So when was that, roughly? Was that in the '90s? Well, we got the internet at college my senior year, so that would have been '91. And then the web was '93. My brother was really into the internet and he was going around the Twin Cities. He got City Pages, which is the city newspaper, to say, we're going to do classifieds and forums and all this stuff on the internet. And my brother doesn't write code, so he's like, hey JJ, you're a contractor, why don't we do this? I was like, sure, I can figure this out. So I did that.
And the other big thing that happened for me was that I was a fan of these tools that let ordinary people program. I was a fan of dBase and HyperTalk and spreadsheets, and I was like, that's really empowering. So what happened was I said, wow, my brother just told me he's going to learn Perl so he can write websites. And I'm looking at what I did: I learned Perl so I could write websites, putting data in and out of a database, putting it through a template, and mapping form fields to data. We don't need Perl here. I mean, it turns out that to do fancy stuff you need the equivalent of Perl, but to do the most basic things you don't. So that's what I came up with. I always loved the idea of tools and abstractions and making computing and programming accessible. And one of those tools for the web was Australian: it was HotDog, do you remember that? That's right, that's exactly right, it was. Yeah. So I said, well, I'm going to take a shot at making a tool and see what happens. And that was called ColdFusion. So here it is, ColdFusion. It still exists. It's part of Adobe now; there's a developer week happening, it looks like, right now. What year was the first version of this? '95. '95, I mean, that's good longevity. That's right. It's had a great existence. One of the biggest things I learned: when ColdFusion came out there were probably 10 tools that did the same-ish thing. Before or after FrontPage? It was concurrent with FrontPage, and FrontPage didn't really do this.
So it was concurrent with FrontPage, and basically the two biggest differentiators were that we had really good documentation and really good error messages. I mean, we'd see competitors that had twice our feature set get no adoption. And what language did you write this in? C++. Okay, but I don't remember a point at which you said you learnt C++. When I left graduate school, I learned C++. It took a couple of years. I did the City Pages project, and then this was my first serious project. I wouldn't say I was good at it at that time, but I was certainly good enough to ship something. So that was a great experience and I learned a ton from it. I also learned that, as you were saying, I didn't particularly relish the parts of entrepreneurship that didn't involve product development. And a lot of those are really important things that need to happen. Right. So nowadays, I think you said before, you delegate that largely to your president. Yeah, the president of the company runs everything. I do get involved with company strategy, and there are certain things that are really important for me to be a part of, but I preserve that roughly 80% of my time for coding. And it's not just an indulgence. I actually think that great products need to have people who are aware of the whole matrix of what's going on: why is this important, why is this feature important, which users are important, how do users think. Having that stuff close to the keyboard is imperative, and a lot of times that doesn't happen because somebody else...
Yeah, somebody else I've spoken to who has a similar approach is Michael Stonebraker, who's built a lot of the best database tools in the world at many companies. But he's also an academic, so he kind of invents stuff and then finds a trusted partner to bring it to market. I don't think he's ever called himself a CEO; he calls himself CTO. But it's his vision, and somebody else is running the company. There's conceptual integrity in what he creates, and he gets all the trade-offs. He's in Boston, now that I think about it. That's right, he's in Boston. I met him once, and it was mostly fun just to watch the two of them talk. I should let you have a go at a question. I wanted to get back a little bit to nbdev2. Oh, please. So maybe just to orient the listeners who haven't seen nbdev or nbdev2: you've taken notebooks further than anyone thought possible and have created something really incredible, so I think other folks would love to hear about the framing of what that is, and I have some follow-up questions about it. Yeah, sure. One of the best things I received was when the original creator of Jupyter and IPython notebooks sent me an email and said he'd printed out this blog post and put it on his wall, and he shows it to everybody who wants to understand what the original notebooks were meant to be all about. Basically, I really enjoy writing code in notebooks, and this is what my notebooks look like. So this is a bit meta here, but these are the first few cells of the first notebook, which is used to generate nbdev2. When I first started, I didn't know anything about notebook internals, so I had to figure out what a notebook is. So I wrote this thing that reads a notebook, and then I look inside it.
And as I do that, I'm a huge fan of the scientific idea of journaling. Most of the world's best scientists have been very thoughtful about how they journal. For example, the discovery of the noble gases was something where basically there was a little bit of residue left over, and because the scientists had been so careful about the process and about journaling the path, they recognized that it shouldn't be there. It's not "I made a mistake, throw it away", it's "let's look into it". It helps with the rigor and with knowing what's going on. So I like to document what I do as I do it. I also know that at some point I'm going to want to share this with somebody else and show them what I found out. And I'm going to forget this in a year, so I want a record for myself of what I found out. But I don't want that to be a separate artifact somewhere else. As I go along I'm writing little functions, so initially these two lines of code would have been in their own cell. And I would have been like, oh, okay, that's how you open a notebook, let's make it a function, and so I chuck a def on top. And you're also articulating your understanding. Yeah, exactly. And then it's like, I think it ought to give something like this, and I check, and it did give that. So now I've got a test of my understanding and of the API, and I want to check that it's going to stay consistent, and so that becomes a test. So let's actually have a look at this. Here is the notebook which reads notebooks, which creates nbdev. So here's notebook number one. And then we can look at the documentation for nbdev, because writing documentation is something most people don't really do. You were saying that the whole reason ColdFusion succeeded is because you wrote documentation. So you'll see that my documentation here is the same thing as the source code.
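The workflow described above (explore how to open a notebook, wrap the lines in a function, then keep the check as a test) can be sketched with the standard library alone, since a notebook file is JSON. The name `read_nb` and the tiny sample notebook are assumptions made for this sketch, not a claim about how nbdev itself is written.

```python
import json
import tempfile
from pathlib import Path


def read_nb(path):
    "Return the notebook at `path` as a dict (a notebook file is JSON)."
    return json.loads(Path(path).read_text(encoding="utf-8"))


# A minimal notebook written to a temporary file so the sketch is self-contained.
minimal_nb = {
    "cells": [{"cell_type": "code", "source": ["print(1)"], "metadata": {},
               "outputs": [], "execution_count": None}],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
with tempfile.TemporaryDirectory() as d:
    nb_path = Path(d) / "example.ipynb"
    nb_path.write_text(json.dumps(minimal_nb), encoding="utf-8")
    nb = read_nb(nb_path)

# What started as "look inside and see what comes back" is kept as a test.
assert nb["cells"][0]["cell_type"] == "code"
```

In a notebook, the function, the example call, and the assertion would each sit in their own cell, so the exploration, the documentation, and the test remain side by side.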
And that's because source code and documentation and tests are all in the same place. In some ways this is a lot more than just literate programming; it's what I call exploratory programming. It's this idea of trying to recognize that programming is a process done by humans, and that we can support humans doing that process by giving them tools that fit that process. So that's really what nbdev is all about. And it's not a new idea. Obviously, Knuth was the guy who created the idea of literate programming, combining a programming language with a documentation language, along with these ideas that programs should be more robust, more portable, more easily maintained, and also more fun to write, all things I've found to be true. When I'm writing code like this, I tend to be in the flow zone all the time, because every line of code that ends up in a function, I've run it independently, I've explored it, and I've played with it. I know how it works. So I don't have many bugs, and if I do, they're never weird bugs that I don't understand. So I'm always progressing. Then Bret Victor, who I really admire, talked about a programming system for understanding programs, and he has some amazing examples of what programming could look like in a way that's much more exploratory and playful. And another thing which was fantastic: my friend Chris Lattner built Xcode Playgrounds, which again lets you see what's going on, how many times it's going through the loop and what that looks like. And of course Smalltalk. Smalltalk was explicitly designed for exploration. Smalltalk is in my follow-up questions. That's great. Yeah.
So there was all that going on, and then perhaps most relevant was Mathematica, which really developed the idea of the notebook. I always enjoyed working in Mathematica, but never enjoyed not being able to do anything with it, because there just wasn't a great way to take a Mathematica notebook and give it to somebody else to play with. So when Jupyter came out, I felt like this was a good opportunity to take these good ideas and turn them into the thing I've always wanted, which is a way to build real software, real documentation, and real tests, but in this exploratory way. So that's what nbdev is. You write your software in notebooks, and you run a cell or a CLI command and it exports it to a Python module. That module automatically ends up on PyPI so you can pip install it or conda install it, it automatically gets a documentation website, and it automatically gets continuous integration tests. Somebody who tried using this for the first time a couple of days ago told me that going from zero to having a website, a module, and continuous integration took 10 minutes. And that's what you want, right? Because you want to be able to say, I wrote you a little tool, here it is, that's the website. And then when I get pull requests, they're generally good, because people wrote them in the notebook, so they can see exactly what the code is meant to be doing. They can see the tests there. They don't forget to write tests, because they're in the same place. They don't forget to write documentation, because it's in the same place. They understand the context of what it's about. So I also find it helps with open source collaboration. Now, I will say that the tooling we built it on top of, which is largely nbconvert and the surrounding tool set around notebooks, I was never fond of.
I found it a bit slow and a bit clunky. I'm very grateful that open source volunteers built that stuff, but I didn't particularly like it. So then when I came across Quarto, the first thing I noticed was, oh, this looks like nbdev. You guys are actually using cell comments, just like we pioneered. Which we took from you. Fantastic. Because we were struggling with attaching metadata to cells. As you know, notebook editors have a facility for that, but it's hard to find and requires you to edit raw JSON. So we said, well, that's not good. And then I saw you do that. People are also using tags. Absolutely. But even the tag interface is really clumsy. So I was just like, why not the comments? Exactly. But you guys do it better, because yours were a comment followed by a pipe sign. I had always struggled with this idea of, how does anybody know whether something in nbdev is a comment or a directive? So you made that explicit. And I wasn't surprised, because I thought, okay, JJ, I've always admired this guy's work, and he's now taken my work and made it better. Now I know it was intentional, but I didn't know at the time whether it was or not. And I thought, that's great, we should at least use that syntax. Sure. And then I started looking into what you're doing with it, and I thought, oh, this is a whole tool set that does everything nbconvert does and a lot more. But it's also more delightful to work with, because it's got much better documentation and much better defaults. The stuff that's built in for free is much better.
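To make the comment-versus-directive distinction above concrete, here is a minimal sketch of how a tool might separate `#|` lines from ordinary comments at the top of a cell. The function name `split_directives` and the rule that only leading lines count as directives are assumptions for illustration; this is not the actual nbdev or Quarto parser.

```python
def split_directives(source):
    """Split a cell's source into (directives, remaining_lines).

    Assumption for this sketch: only lines at the very top of the cell
    that start with '#|' are treated as directives.
    """
    lines = source.splitlines()
    directives = []
    i = 0
    while i < len(lines) and lines[i].startswith("#|"):
        directives.append(lines[i][2:].strip())  # drop the '#|' marker
        i += 1
    return directives, lines[i:]


cell = "#| export\n# a plain comment\ndef add(a, b): return a + b"
directives, code = split_directives(cell)
print(directives)  # the directive list
print(code)        # the plain comment stays with the code
```

Because the pipe makes the intent explicit, a plain `#` comment is never mistaken for an instruction to the tooling.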
And then when I spoke to you, I kind of said, this feels like something I could build nbdev2 on. Tell me a bit about the technical foundation: how is this working? You explained it to me, and I started reading the source code to understand it. It's actually a relatively thin wrapper around fantastic functionality that already exists in Pandoc. That's right, it's an orchestrator. Yeah, with a bunch of good defaults. So it's kind of like what fast.ai is to PyTorch, in a way. Right. It's amazing foundational technology that's actually just too hard for people to get their heads around. Yeah. It gives you, like you said, good defaults and good ergonomics. And it's the same with Pandoc. I just had so many problems with it; when I used it, very often it just didn't quite work. So you've also just made sure it works. Like, ooh, that's unfortunate, okay, make sure that works. Yeah. So nbdev2 should look very similar to nbdev1, except for the pipes after the comments, but it's dramatically faster. That's partly because I wrote a lot of stuff myself from scratch using Python's ast.parse, so I'm working with the abstract syntax tree directly. I make sure I only have to parse once, and I reuse the cached AST. And it's partly because we leverage Quarto, which is much faster than nbconvert. So it's much faster, and the code base, even though it does a lot more, is a lot smaller than nbdev1, again by trying to build on better foundations. Yeah. Well, the interesting thing: I noticed that the title of your blog post was "use notebooks for everything". And there's one thing that would be interesting to explore.
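The "parse once and reuse the cached AST" idea mentioned above can be sketched with the standard `ast` module and `functools.lru_cache`. This only illustrates the technique; the helper names `parsed`, `function_names`, and `has_imports` are hypothetical and do not come from the nbdev code base.

```python
import ast
from functools import lru_cache


@lru_cache(maxsize=None)
def parsed(source: str) -> ast.Module:
    "Parse `source` once; repeated calls with the same text reuse the tree."
    return ast.parse(source)


def function_names(source):
    "Names of all functions defined in `source`."
    return [n.name for n in ast.walk(parsed(source))
            if isinstance(n, ast.FunctionDef)]


def has_imports(source):
    "True if `source` contains any import statement."
    return any(isinstance(n, (ast.Import, ast.ImportFrom))
               for n in ast.walk(parsed(source)))


src = "import os\ndef f(): pass\ndef g(): pass\n"
names = function_names(src)   # first call: parses the source
imports = has_imports(src)    # second call: served from the cache
info = parsed.cache_info()
print(names, imports, info.hits, info.misses)
```

Two different analyses ask for the same tree, but the source is only parsed the first time; on large code cells that saving adds up.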
So I kind of came up through this interactive computing metaphor, which was really defined by, have you heard of ESS? Emacs Speaks Statistics. Yes. That was an Emacs mode for S, originally, and then R. One of the things it said was that you want everything to be interactive and responsive, and you're always in a live session. The way they achieved that, rather than having a notebook, was line-by-line execution. The fundamental model is that I select a line or a group of lines. It can be smart syntactically, like, oh, I see the line continues. And you just edit lines, basically. Then at some point you might, like you did, reorganize that into functions and so on. I think one of the most delightful and powerful things about notebooks for Python is that they give you this interactive development experience. And Smalltalk gives you an interactive development experience with yet another way of organizing the interactive development. So as we build tools now, we have this tradition from R of ESS-derived line-by-line execution, where you see your side effects maybe in another pane or console. And then we have notebooks, and we're sort of trying to do tooling for both. Yeah. And one of my questions is about how much of what's amazing about notebooks comes from which idea, because there are multiple ideas wrapped up in notebooks. There's everything in one place. There's bundling output. There's the interactive computing experience, and there's an immediacy. And there's the thing that a lot of people hate, which is state. The state, right. And that's a side effect; it's all trade-offs, you know.
Which I think is actually part of what's excellent about notebooks, if you know how to leverage the state. Yeah. I mean, it's like your file system, your home directory. That is state. When you cd into something and you copy something, that's state. And this is your home. You created a side effect, and it happens to be a model or a data set. It's what you've created, and you have it now. Yeah. You've created this environment to be in the state that you want it to be. Yeah. It's funny, because we have some religion in R that says you need to be able to execute the thing from top to bottom and have it work every time. Sure. But then there are people who will say to you, well, I don't really want to do that, because it was really expensive to create this piece of state, and I don't so much want to run it top to bottom. So people have tried to build ways to split the difference. It's funny, when I first encountered these ideas, I was like, wow, it's so messed up that there's all this state; Mathematica must have some solution for this. I was at some conference and I walked up to them and asked, how do you guys do this? And they said, we don't, you just execute. And I'm like, okay, because it turns out that if you want to solve that problem, it's its own quagmire. People have reactive notebooks that essentially do solve the problem, but then they're really painful to work with interactively, because as soon as you're doing anything that takes more than 10 seconds, you're stuck waiting. Yeah. So can I tell you, so I'm happy.
I can tell you a bit about my thoughts about that. I would love that. So that kind of sets the table of all the stuff that's out there, and where do we go? Yeah. So a lot of people are very into line-by-line approaches in Python as well, particularly using the IPython REPL. Yep. And it looks basically identical to how people coded in APL 50 years ago, except they used a teletype. Right. It's based on that idea; APL kind of invented that way of working. And APL was more than just a programming language, because it was your REPL. That was also how you would text chat; there was an APL command for that. That was your OS, if you like. Yeah. And there's nothing wrong with that, but there are other ways, right? With a notebook, you can do it top to bottom if you want to, but you don't necessarily want to, because it's often nice to go back and change something a little bit earlier to answer the question, "I wonder what happens if...?" So you change that, and you select the four cells underneath and hit Shift+Enter to run those four cells. It's like, oh, well, what if I did this? And then you think, okay, let's try three different versions of that. So you copy and paste those cells twice, and then you select them and run the different versions, and then you compare. You're doing experiments, and the artifacts of those experiments are right there in front of you. And that doesn't mean that you're then finished, right? Hopefully you've learned something, and you're refining your understanding of the problem. Right. So then you package it up a little bit. You say, okay, for somebody reading this notebook, I want them to see these three different versions.
So maybe you put it into a little for loop, or maybe you create some kind of function to display it and put it on a graph or whatever. For me, there are two critical keyboard shortcuts in notebooks: Shift+M and Ctrl+Shift+hyphen. Shift+M merges two cells together and Ctrl+Shift+hyphen splits them apart. So I'm always grabbing a single line of code. I'm running it, I'm exploring it, I'm assigning it to something, I'm fiddling with it. And after a while, well, normally all of my functions are three to four lines of code, so I've got the three or four lines of code that do that thing. I just hit Shift+M a couple of times, indent the block underneath the def, add a docstring, and then all those examples are still there underneath, and I add some prose for each one. That's a nice way of working. Yeah. And as you say, particularly in deep learning, sometimes I'll be like, okay, I want to show how we can interact with a language model. All right, let's run this for 10 hours. I come back in the morning and I've got a language model, and I've got it just where I want it. I mean, maybe that's not a great example, because I'd probably serialize that as a pickle file or something. But yeah, we don't necessarily want to run everything all the time. An hour or 30 minutes would make the point just as well. I think there's an issue which reminds me of my time in spreadsheets. I'm a huge fan of spreadsheets, even though a lot of people use them badly. Yeah. I read a book 30-plus years ago which was a book of spreadsheet style. It was designed to be like, you know, what's that English style book? Rather than the grammar and style of English, it's that kind of thing for spreadsheets.
And yeah, it explained, here's how you take care of auditing, error checking, self-documentation, whatever, in spreadsheets. And so ever since then, I've tried to follow these rules in my spreadsheets. Yeah. It's taking a very flexible tool and using that flexibility to create a process for using that tool which works really well. Same with notebooks. Yeah, you can shoot yourself in the foot with them, but that doesn't mean we should tell people not to use them. Yeah. We should help people use them. You can shoot yourself in the foot with a .py file or sitting at the IPython prompt. Or a C++ file. Or a C++ file, definitely. So yeah, we're kind of adding more and more stuff. So something that I've built as part of nbdev2 is something called execnb, which is just a tiny little Python module that runs notebooks. And you can parameterize the runs. It'll save the results back into the notebook. The idea is that you can very quickly and easily run some experiments and share the results with people. And any nbdev repo, as I mentioned, creates continuous integration for free. That continuous integration runs every notebook top to bottom. So if your notebooks don't work top to bottom, as soon as you commit, you're going to find out. So it's kind of harmless to create a local out-of-order notebook, because it's going to get checked. So I mean, yes, you've deluded yourself temporarily, but there's a net. Yeah, that makes sense. All right. So if I can come back to Quarto a bit, JJ, I wanted to understand where you're going with it and why. So you mentioned earlier that scientific programming is, broadly speaking, something you were trying to improve. But Quarto is not just scientific programming. You've got all this stuff about scientific publishing as well. Yeah, yeah. So what are you trying to do with Quarto, and why are you trying to do it?
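As a rough illustration of the "run every notebook top to bottom" safety net described above, here is a minimal sketch using only the Python standard library. It is not execnb's API and not what nbdev's continuous integration actually runs; the function name `run_notebook_top_to_bottom` and the two tiny in-memory notebooks are hypothetical, and IPython magics are not handled.

```python
import json  # only needed if you load a real .ipynb file from disk


def run_notebook_top_to_bottom(nb: dict) -> dict:
    """Execute every code cell in order, in one fresh namespace.

    Raises RuntimeError at the first cell that fails, which is how an
    out-of-order notebook gets caught.
    """
    namespace = {}
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are skipped
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb stores source as a list of lines
            source = "".join(source)
        try:
            exec(compile(source, f"<cell {index}>", "exec"), namespace)
        except Exception as err:
            raise RuntimeError(f"cell {index} failed: {err!r}") from err
    return namespace


# A notebook that works top to bottom.
good = {"cells": [
    {"cell_type": "code", "source": ["x = 2\n", "y = x * 3\n"]},
    {"cell_type": "markdown", "source": ["Some prose."]},
    {"cell_type": "code", "source": "z = y + 1"},
]}

# A notebook that only worked because cells were run out of order.
out_of_order = {"cells": [
    {"cell_type": "code", "source": "total = a + 1"},
    {"cell_type": "code", "source": "a = 1"},
]}

result = run_notebook_top_to_bottom(good)
print(result["z"])

try:
    run_notebook_top_to_bottom(out_of_order)
    out_of_order_ok = True
except RuntimeError:
    out_of_order_ok = False
print(out_of_order_ok)
```

For a file on disk, you would pass `json.load(open(path))` to the same function. The second notebook fails on its first cell because `a` is not defined yet, which is the kind of mistake a top-to-bottom run is meant to surface.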
Well, I would say scientific computing is what RStudio and tidyverse and Arrow and all those projects are about. I'd say that Quarto is very squarely about scientific communication. And I would say that there's a few things that, just by working in the field for a little while, I have noted that I think warrant significant improvement. So one is the fact that scientific communication, for a lot of good reasons, is very tied to print. The coin of the realm is these print articles. And that's fine. There's good reasons for that. And there may even still be good reasons for that in the age of the web, where, for example, a PDF is a more durable entity than a website that might get taken down or have its links break, et cetera. Maybe, maybe not. Okay, what I'm saying is I've certainly seen some discussions where people say it's not a terrible thing to have a self-contained representation of your work, or whatever, or better to have a Docker image that can run everything. Anyway, so it's very tied to print. And so one of the things is to help scientific communication take better advantage of the web while still not losing the focus on print. So not going completely like, hey, everything now and in the future is web, and then all of a sudden I actually can't write an article that I can publish with that mindset. So that's one piece. Another piece, which was a huge focus of the R community, is reproducibility. And this idea that everything should be in, you know, an R Markdown document that runs top to bottom, where your figures and your tables and your results and everything are all reproducible and produced by code. And so helping people do that is a big motivator. So let me come back to the first one, which is about scientific communication, making it more web friendly.
Yeah, I guess, like, why? What's this got to do with RStudio? Or what's this got to do with you? Well, to me, my own kind of beginning of the Renaissance was the Bill James Baseball Abstract. Eyes opened. And then I got into politics, and my mentor is also like, wow, we're making decisions that affect hundreds of millions of people with no evidence, or making, you know, medical decisions with no evidence. "No evidence" is probably an exaggeration, but really weak, under-rigorously prepared and under-evaluated evidence. And so to me, doing science well has a lot of consequences. So this is a mission for you, to do science better. That's right. And John Chambers, in his book about R and S, software for data science, actually has this concept in there, which I use in all my slides. It's called the prime directive, which is basically that accurate, trustworthy computing of scientific results is the prime directive. It's really important, and the same thing goes for social policy, for medicine, for, you know, just safety. So that's it. I mean, I was really compelled by that. So helping people do science really well, and communicate technical content and persuasion well, is to me very, very compelling. Is there something about accessibility there as well for you, like making science more accessible and making scientific publications more accessible? Not per se. I'm taking scientific communication at face value, that it serves whatever purposes it serves and has whatever virtues it has. I'm not saying let's change that.
That's not my thing, at least. But I will say that another related influence was, you've probably read it, Tufte has this pamphlet, The Cognitive Style of PowerPoint, you know, Pitching Out Corrupts Within. And he sort of breaks down what's wrong with a lot of the way we communicate about technical information. And at the end, he says, you know, really what we should be doing is giving each other handouts that have analysis and evidence and data. We should be reading the handouts before the meeting, and then we should be talking about them, not pitching bullets at each other. So I was compelled by that too. I was very compelled by the idea of, let's give people tools to communicate effectively about technical matters and science. So that's very motivating to me. So just showing this, this is the Tufte. It's really great. Yeah, there's a really funny thing in there where he says, here's what, was it the Gettysburg Address, would be as a PowerPoint presentation. And, you know, there are similar ideas in how Amazon do things. They do a six-page kind of memo. And of course, also Feynman, in talking about the Challenger space shuttle disaster, felt like a lot of that problem came from simplifying complex ideas. We just saw yours, and I think you just posted on your blog about this, the evidence update regarding masks and COVID-19. That's exactly what I'm talking about. Let's have a dialogue about a matter of public health importance, and use evidence, and do technical communication really effectively. The reason I asked about accessibility is that, I mean, this was an article that this team and I wrote in early April 2020. So, you know, within a month really of the pandemic taking hold in the U.S., well, within a month.
But it wasn't published, I mean, it says here, accepted December the 5th, and then I think it was published quite a bit later than that, maybe even. So by the time this was available in the Proceedings of the National Academy of Sciences, it was almost obsolete, you know. But what we did do was we also put it on preprints.org, where it was there from, here we go, the 10th of April. And these were very minor changes, right? And this version has received 439,000 views of the abstract and 98,000 downloads, which is by far the most viewed preprints.org paper of all time. And, you know, the fact that that was much more, you know, if we compare it... Well, let me re-answer the question, because when you say accessibility, I read that as the accessibility of the discourse. Can a layperson understand this? And that's not per se a goal. But there is, you know, accessibility in the sense of the way scientific publishing works, and the delays that are inherent in the progressive refinement of knowledge, and the various choke points that there are for publishing that gives people credit for their careers. That is all pretty messy. I don't personally have good ideas about how to resolve that, but a lot of people do. And a lot of people are working hard at that. So it is motivating to me to build... If I could build a tool that's widely adopted for scientific communication, then I can marry that to good ideas that are out there. And make them easier to adopt. I mean, that's kind of why I asked, because... That's kind of the number one goal, even though we got to it third, but that's a hope that I have. Because, I mean, to be clear, I hate thinking about, talking about, writing about, and learning about masks. I find them tedious and annoying, but, you know, I have to because other people aren't. And so I updated that paper quite recently. But I didn't put it in a journal.
I put it on our website because I felt this is more accessible. And also because I just couldn't be bothered doing all that LaTeX stuff and blah, blah, blah. These are real links, you know, you can click on one and go there. This is a goal we have, which is not evident yet, we're working on it, that you should be able to basically create a blog like this that's got this content, and then take that same content, repurpose it, and send it to the journal. Exactly. That's exactly what I want to do. It's like single-source publishing, where you can almost be web first. And then, oh, look, we also know how to make LaTeX that you can submit to the other places where you need to get this published. In fact, I can show you how horrible this looks nowadays. So I did exactly that. I did a paper about vaccine safety with my friend Yuri Lanna. So Yuri wrote a study, or was a senior author on a study, which for whatever reason got picked up by the conspiracy theorist world as showing that vaccines are harmful. And so he and I got together to write a paper that basically said, here's what that paper actually says. So this is the paper here, from 2021. But again, you know, we actually wrote this probably in about April 2021. And in the end, nobody had yet reviewed our submission. And so in October, I just kind of went, oh, fuck it, I'm just putting it on the web. So I had to take that LaTeX document and turn it into a web page. And I did use Pandoc to help me. But as you can see, we end up with these kind of references. And I had to kind of paste them down at the bottom. And then here they are, but they're not hyperlinks. Go to the GitHub org. This will be out by the time this video broadcasts. There's a GitHub org called Quarto-Journals. And this should show you, yeah. So basically we're working on journals.
So you can see, you know, if you go to one of those, let's see what it shows you. Yeah, so scroll down. Anyway, it's not showing you much, but go to that template.qmd file there. And you'll see an example of, you know, you've got your metadata, your authors, all the stuff you need to do. It's making the LaTeX that the journal wants, including getting all the fiddly bits right. But then the exact same content is going to render perfectly in HTML. That's great. Yeah, I love it. Typography, it's going to do everything right. So I think the idea is, let's just write in Quarto. And now we're going to be able to put it on the web, maybe web only, you know, but also in print. That world of publishing, my God, I was so shocked when I discovered how it works for this PNAS thing. So PNAS is, I think, like the third highest impact journal in the world. And so I thought, oh, this is going to be a smooth, professional experience. And, you know, I did the whole thing in Overleaf and LaTeX and BibTeX, and that was fine. It was pretty easy. And thanks to Overleaf, all 19 of us authors could collaborate by working on different sections. And so then when it came to publishing it, I had to upload the rendered PDF. So I uploaded the rendered PDF. I wasn't quite sure how that was going to help them. And then a while later, yeah, they contact me and say, okay, we now need you to look at these questions. And they'd basically put annotations in the PDF, which is already kind of hard to work with. So I ended up trying to reply in the PDF to the annotations. And then eventually they're like, okay, now you have to go through and look at the camera-ready document, whatever, and look at these things. And they sent me back a Word document. They'd taken the whole thing and redone it in Word. And then, just wait for it.
So then they had a question about a reference. They're like, maybe this reference doesn't really make sense there. I think they said you're not allowed to use it because it violates some rule or something. I was like, I don't want to fight about this. That's fine, you can get rid of it. And they're like, okay, so what you need to do is remove that and then renumber all the references afterwards. There's 150 references, and this is reference number seven. Yeah, whereas a proper, you know, scientific markdown system will do that and will renumber everything. Exactly. So I just said no. No, no. I've made that change in the LaTeX. Yeah. Here is the PDF with the corrections. Yeah. You fix it. Well, I almost view it as, you have to give people tools that help them with the problems they have now, which is, I need to interact with all these journals and publishing systems. And then you have a chance to help them evolve what they do and help them do things they never thought were possible. So I think that's one of the reasons we are focused on really tooling LaTeX well. We're very focused on that, even though we think, wow, it would sure be great if we didn't have LaTeX. We're not ignoring it. We're saying, okay, we'll tool that, but we'll also tool the web, and it'll be great. And we'll all get there eventually. So, now, can I say, I'm very, very excited about this, by the way. I love that. This is something I'm passionate about. Yeah. And in kind of a slightly weird way, in that I'm passionately anti how academia works. Yeah. To the level that everybody was assuming I would go into academia following school, and I refused to. Yeah. On the basis that I didn't like how academia worked. Yep. And I've now finally come full circle. I am actually a... Yeah. Professor. Yeah.
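The renumbering chore from the journal anecdote above (remove reference seven, then shift everything after it) is the kind of bookkeeping a citation-aware toolchain does automatically. Here is a minimal sketch of the idea in plain Python, assuming numeric citations written as `[n]` in the text. The function name `remove_and_renumber` and the sample text are hypothetical, and real systems such as BibTeX or Pandoc work from citation keys rather than rewriting numbers like this.

```python
import re

# Matches a numeric citation such as " [12]", capturing an optional leading space.
CITATION = re.compile(r"( ?)\[(\d+)\]")


def remove_and_renumber(text: str, references: list, removed: int) -> tuple:
    """Drop reference number `removed` (1-based) and shift later numbers down by one."""
    if not 1 <= removed <= len(references):
        raise ValueError("removed must be a valid 1-based reference number")

    def shift(match):
        space, number = match.group(1), int(match.group(2))
        if number == removed:
            return ""  # delete the citation together with the space before it
        if number > removed:
            number -= 1  # every later reference moves up one slot
        return f"{space}[{number}]"

    new_text = CITATION.sub(shift, text)
    new_references = references[:removed - 1] + references[removed:]
    return new_text, new_references


text = "Masks reduce transmission [1]. Some disagree [2], but later work agrees [3]."
references = ["Study A", "Study B", "Study C"]

new_text, new_references = remove_and_renumber(text, references, removed=2)
print(new_text)
print(new_references)
```

With 150 references the same call does the whole job in one pass, which is the point being made in the conversation: this is work for the tooling, not for the author.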
You know, only because I'm able to do it on my terms, and I totally refuse to do any of the normal things. Right. So it's great to have you involved in this fight. Yes, we are going to be very involved. We have a couple of questions from the community. Okay. So this one is actually asked to me, but I wouldn't mind asking it to you as well, and then I can come back to myself. This person said, "For Jeremy: your productivity amazes many people, including myself. Do you have any tips that might be valid in general? What does your usual day look like?" Now, I feel the same way about you, JJ. I'm amazed at what you've done and what you do. And Hamel and I are both, you know, like, wow, how does JJ do all these things so quickly? So yeah, I'd love to hear yours. Well, I would say that to me, the main lever for productivity is not how fast you can code. That certainly helps. I think it's more, what problems do I choose to solve, and in what order, and at what level of depth. You know, to me, getting through a problem or a problem domain is about making those choices. And there are side quests you can go on that waste three times the total effort required to actually solve the problem. So I think a lot of that just comes from experience. So there's the choosing what problems to work on. And I think you can level yourself up by talking over what you're planning to do with other people. So, I was thinking of trying to solve this and then this, and then they say, huh, well, why is that important? Isn't that only important to this, and couldn't you do it this other way? You know, so I think some dialogue helps. Inner dialogue is great. If you have a lot of experience, maybe you can get it done with mostly inner dialogue, but talking to people helps. And then tactically, I think there's just throughput: how much code, how many features can you write?
And to me, the biggest thing is just, you know, several hours of completely distraction-free time. So you turn off any notifications, build up a stack, get your stack. You've got to get an appropriate head of steam and not let yourself be distracted. Do you work at an office, or do you work from home? I do work at an office. Yeah. And I find that to be helpful for that purpose. I do have a good setup for working at home too, and it's separate enough from the rest of the house, so I can approximate that pretty well at home too. But yeah, I feel like I need to get a four, five, six hour chunk of distraction-free time. So then it helps just to batch up things, and you can even batch up things by the day. Monday, I'm going to do all the fiddly bits and distractions and calls, or Monday and Tuesday, I'll do that. And then I know I have nothing scheduled at all Wednesday through Friday, and I can get to good focus. I mean, that is significantly more hours than most experts in creative fields say they can achieve. Normally four hours seems to be considered about what you can aim for as a best. Five or six is, yeah, fantastic. Yeah. Is that because of just sustaining concentration? Yeah. And it's not just in, I mean, it's like in the kind of deliberate practice stuff. Yeah, yeah. Kind of what you're doing as well is deliberate practice. Four hours is normally what the best violinists are doing. That's just a helpful genetic attribute that I have. Yeah. Yeah, I can't do that. You know, I very rarely could do four hours. I'd say three is good for me. And you try to get the three hours distraction free? Yeah. I mean, and also...
So I mean, my main thing by far is a deliberate choice I made as an 18 year old to spend, on average, half of every day learning or practicing something new. Yeah. Which, yeah, drives everybody I work with crazy, pretty much, but it makes you very creative, inventive, able to see around a lot of corners and solve problems in ways that other people can't. Yeah. And I know tools extremely well. I tend to know all the keyboard shortcuts and all the tricks and whatever, and all the libraries. But it does mean, yeah, people are working with me, and we're like, okay, we're going to have this thing finished by Friday, and you're, you know, learning. Yeah. And then you look at it with a long view. And it also means, you know, very often using a tool I'm not very familiar with to do something, even though it would be five times faster to do it manually. Yeah. But it's definitely got me to a point now where I find, with nearly everybody I work with, I just get things done, you know, often 10 times faster, and it tends to work the first time. And I often find when I do live coding or whatever, people are like, oh, I didn't know that tool existed, or I didn't realize you could work in that way. Yeah. And efficiency, yeah. And then I think something people would be surprised about with me, since people think of me as productive, is how few hours of productive time I have a day. I spend a lot of time hanging out with my daughter, and going for a walk on the beach, and eating ice cream, and trying to be in a good mindset to have a good three hours. It's a very, very good three hours that you have. Yeah. Not many people have a good three hours that often. That's right. No, I see that with a lot of people that I work with. Their day is divided up into small bits, and there's probably not even three hours of engineering in there. And they're all broken up. Yeah.
And it's also a case of being good at saying no. Like, I very rarely do meetings. And if I do, I want it to be a good one. Like this, you know, talking to somebody I really want to talk to about things that we care about. And so generally, if somebody is like, can I get on your schedule for a half-hour phone call? I'll say no. But, you know, if you send me some email, I will respond. Yeah. Okay. So, you know, yes. So your brother apparently does some rapping. Somebody else wants to see you doing some rapping, JJ. That's not going to happen. Okay. Yeah. So this one is for both of us. When making design or development decisions regarding nbdev2 and Quarto, were there any trade-offs? Yeah, I would say two trade-offs. One was going back to the discussion we had earlier about leaky abstractions. You know, how leaky an abstraction over Pandoc should we be? Because R Markdown actually pretty fully abstracted Pandoc. You just used all these R functions, and you didn't even know Pandoc was there. And if there was any given piece of functionality in Pandoc that you needed to address, you'd need some hacky way to work around the fact that we'd written this wrapper. And so for Quarto, you know, it basically says everything that's in Pandoc is kind of there, pass-through. Partly that's because Pandoc had evolved. It used to be that it could only accept a lot of things by command-line parameters. And now it can take everything through YAML. And so it became a system that you could interact with more reasonably, without a special wrapper. And so I felt like if we decided to try to wrap it, we'd be trying to keep up with everything people are trying to do. By making it leaky, we would sort of freeroll on everybody's knowledge of Pandoc and all the things that are in Pandoc. So that was one.
And the other one, I think, which we didn't really decide on until about a year into the project, was how much we should be batteries included, or how much we should be extension and plugin driven. And, you know, extension and plugin driven can be very dynamic. Like, the JavaScript ecosystem just keeps evolving every three weeks. On the other hand, it's really hard for people to get their bearings. And so we went with batteries included, because we felt like it was a somewhat bounded problem. We looked at a bunch of systems. It's a known feature set. And the users are not JavaScript engineers. They're analysts and scientists. They will appreciate batteries included. So I would say, as a user, I've definitely appreciated that. Yeah, I don't want to spend my time figuring out how to add a JavaScript-based syntax highlighter, a JavaScript-based table of contents, and how to modify the CSS to create a collapsible sidebar. I mean, nobody wants to do that. And everybody needs all those things. So just, you know, you give it to me. But you do a good job of making sure I can replace it if I want to. Yeah. And there are plenty of things that I wanted to replace. And, you know, very kindly, one of the first things you did for us was you added the ipynb-filters directive, where we now have a Python script that takes a notebook on stdin and feeds back to stdout a notebook that's been modified. And by using that, we can totally do anything we like, between that and the Lua filters on the AST. Yeah. Yeah, we just recently introduced Quarto extensions, which are basically Lua filters. Yes. They're installable. They're kind of easy to bundle, and so that's nice. Yeah. Hamel and I were talking about that this morning.
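The notebook filter idea mentioned above (a script that receives a notebook and hands back a modified one) can be sketched with the standard library alone. This is an illustration under stated assumptions, not nbdev's actual filter: the `# skip-render` marker and the function names are hypothetical, and how the script gets registered with Quarto is not shown here.

```python
import io
import json

# Hypothetical marker for this sketch; not a Quarto or nbdev directive.
MARKER = "# skip-render"


def drop_marked_cells(nb: dict) -> dict:
    """Return a copy of the notebook without code cells that start with MARKER."""
    kept = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # .ipynb stores source as a list of lines
            source = "".join(source)
        if cell.get("cell_type") == "code" and source.lstrip().startswith(MARKER):
            continue
        kept.append(cell)
    return {**nb, "cells": kept}


def run_filter(stdin, stdout) -> None:
    """Read notebook JSON from one stream and write the modified JSON to another."""
    json.dump(drop_marked_cells(json.load(stdin)), stdout)


# Demonstration with in-memory streams standing in for stdin and stdout.
example = {"nbformat": 4, "nbformat_minor": 5, "metadata": {}, "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
     "source": ["# skip-render\n", "debug = True"]},
    {"cell_type": "code", "metadata": {}, "outputs": [], "execution_count": None,
     "source": ["print('kept')"]},
]}

buffer_out = io.StringIO()
run_filter(io.StringIO(json.dumps(example)), buffer_out)
filtered = json.loads(buffer_out.getvalue())
print(len(filtered["cells"]))
```

Used as a real filter script, the last step would be `run_filter(sys.stdin, sys.stdout)`. Keeping the transformation in a pure function that takes and returns a dict makes it easy to test without any streams at all.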
I think, you know, the main one is actually not just about nbdev, but kind of everything we do, which is that in Python there's a schism between treating it as a kind of static language that you write a bit like Java, versus a highly dynamic language that you write a bit like Lisp. Yeah. And in my opinion, Jupyter is best for the latter. And in general, I like writing code using the latter approach. I like exploratory code where I'm manipulating objects and taking advantage of metaprogramming and dynamic features. Yeah. The Python community has very heavily leaned in towards the former, you know, so static typing, and a lot of more surprising approaches to testing and documentation, and lots of single-use tools with their own concepts to learn and stuff. So that's the big trade-off we've made: to basically opt out of the usual way of doing things in Python, to the extent where we're almost starting to think, why don't we describe this as a different dialect of Python? Because it's not particularly recognisable to... And you wouldn't want people to expect, oh, I can just pour in all the stuff that I'm already using. Well, I mean, it interacts with it all fine. But you write it in a different way. So if you're used to using VS Code and very heavily relying on static type annotations, you're not going to love our libraries, because they're so dynamic that VS Code doesn't generally know what the hell's going on. It gets confused. Whereas Jupyter always knows exactly what's going on, because it can do real-time introspection of the symbols. And, you know, this is something you've got to say about the Python community. There's this kind of basic principle that comes from Guido, the original developer of Python, which is that, ideally, there should only be one way to do it.
And I don't understand how this ever became a thing, because as soon as you say that, you basically turn off innovation. If you want to do something better, you're not allowed to, because you've just created a second way to do it. And so the Python community, or at least this kind of core group, is often quite anti fast.ai stuff, because we're a second way to do it, for all values of "it". You know, we have a different way of testing. We have a different way of building libraries. We have a different way of doing types. We even have a Julia-inspired type dispatch system. We do a lot of stuff inspired by non-Python languages. And, you know, I think that's really problematic. Whereas R seems to be a much more flourishing, welcoming, and diverse community than the Python community. It does feel that way. And there's a lot of variance in how people do things. And it's generally accepted, I would say. Yeah. People are always finding new ways to use the language's dynamic features to express things differently. Yeah. So, you know, there's a lot of things I don't love about R, and there's a lot of things I do love about R. You know, like you, I came out of the SAS, SPSS, Excel world. We used S-PLUS, you know, back before R was really a thing, in a previous startup. That was my world for many years. And I wouldn't go back to it. I like the language of Python more. But there's a lot of stuff I wish I could have, you know, like everything that Hadley's written. And the community, the documentation. Yeah. Yep. The formula language. There's been a lot of good stuff. All right. We've got one more each, if that's okay. Okay. Sure. That's great. Yeah. All right. JJ, nbdev2 is built on top of Quarto.
Do you have any other thoughts for stuff that might be able to build on top of Quarto that would be interesting? I think a couple of classes of things, and nbdev2 is an exemplar of one, which I think of as generation of web content from software artifacts. So I have a software artifact. In this case, I have my notebook that defines a bunch of functions and exports things. And I can generate a website from that. You could think of, you know, it has real-time elements, but like TensorBoard. There's these artifacts created in a directory, and then they create this web experience from it. And so I do think there's a lot of things like that. And the Bioconductor project had this thing pre R Markdown. They had these S4 objects that were very complicated. They could have gene sequences in them and all kinds of stuff. And you'd literally just call a function, pass the object, and it makes a website from it. So I think taking those artifacts and then just creating websites from them is really interesting. And obviously documentation for a software package is one variant of that, but there are other ones. And the other is, you know, we sort of promote, hey, look, you can make a website, you can make a book. But you can feed just about any publishing pipeline, from notebooks, through Quarto, into the publishing pipeline. So, you know, if you've got a big Hugo website, you can pump Markdown into that, or you're using Confluence and you need to put all your articles there, you can pump things in. So it's sort of building these publishing pipelines downstream of Quarto to these other systems, because, you know, it's great that you can easily make a website, but oftentimes you need to get your content somewhere else.
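To make "generation of web content from software artifacts" concrete, here is a minimal sketch that turns a few Python functions into a Markdown page using their signatures and docstrings. This is only an illustration of the general idea, not how nbdev2 or Quarto generate documentation; `render_markdown` and the two sample functions are hypothetical.

```python
import inspect


def render_markdown(title: str, functions: list) -> str:
    """Build a Markdown page with one section per function."""
    lines = [f"# {title}", ""]
    for func in functions:
        signature = inspect.signature(func)  # e.g. (a, b=1)
        doc = inspect.getdoc(func) or "(no docstring)"
        lines += [f"## `{func.__name__}{signature}`", "", doc, ""]
    return "\n".join(lines)


# Two sample "software artifacts" to document.
def add(a, b=1):
    """Return the sum of `a` and `b`."""
    return a + b


def shout(text):
    return text.upper() + "!"


page = render_markdown("Example API", [add, shout])
print(page)
```

The resulting Markdown could then be handed to any downstream publishing step. The artifact (here, the functions themselves) stays the single source, and the page is regenerated from it rather than maintained by hand.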
And so, you know, hopefully we can teach people how to do this. I mean, it's all possible. I remember you guys asked about it, and I was like, oh, here's an example. You can totally feed a Docusaurus site with Quarto. You know, I know how to do it, I've got to teach other people. Yeah. Yeah, great. And so then my last question was: "Jeremy, nbdev made literate programming in Jupyter feasible. nbdev2 improves upon that even further. What are some open research or exploration areas that could help improve literate programming even further in the future?" That was one of my questions too. So that's good. All right. So I'm just going to totally hand that over to somebody else much smarter than me, who's thought about it for longer than me, which is Bret Victor. Yeah. So Bret Victor has this talk from 2013 called The Future of Programming. And Bret talks about this idea of coding being, you know, trying to work with direct manipulation of data. And so I think to me, as I say, it's not so much about literate programming. It's about exploratory programming. And Bret's given so many great examples of directly manipulating things to code. But he actually shows examples from the 60s, like Sketchpad, where Ivan Sutherland was directly drawing things on a display, believe it or not, to create constraints or to create automatic drawings. '69. A Prolog-based approach to kind of describing what you want. Pattern matching. Doug Engelbart's ideas from 1968. Again, all manipulating things on screen directly. Yeah. RAND Corporation's GRAIL, building things up in this way. And of course, we've talked about Smalltalk. And yeah, it's all interactive responses. And so people like Bret and Alan Kay talk about how we've somehow, you know, lost our ability to write things in environments that look more like this.
I mean, there's a classic example from Bret Victor where he's designing a computer game, like a Super Mario style computer game. And he sets up this kind of time-travel debugging type system, but it actually shows you exactly what would happen if somebody pressed the buttons you pressed in your game just now, and shows you where the characters would end up, and he modifies them in real time and you see them moving. Yeah, this is what it should feel like to work with code. It should feel like this artisanal, real thing. We're pretty far from that. It's funny, notebooks are great and data science REPLs are great. They're probably like 15% along the way to where they need to be. I'm pretty excited about working on those problems too. And Bret also had a great example. He had this award-winning app for basically the train schedule, the BART schedule, in San Francisco, and he showed this example in one talk where he describes how you could have written the whole app entirely using a kind of graphical system that's just totally unlike any coding that I've ever seen. Yeah. Well, thank you, JJ. I appreciated our two-way AMA slash conversation. I think I did just reinvent the idea of a conversation. You'll have to see if you're going to promote it as a two-way AMA or a conversation. All right. Well, good luck with the last couple of weeks up to the launch. Yeah. Absolutely. Well, thanks. And we're going to be launching around the same time. Exactly the same time. It'll be fun. All right, mate. Take care. Bye. Thanks.