In the last couple of weeks, I've spent several hours each day cleaning up a lot of my personal projects over on my GitLab. You guys have seen me make videos about some of this recently, regarding dmscripts and shell-color-scripts. Well, most of the last two weeks I've spent focusing on my website over at distro.tube, because I wanted to completely rewrite and redesign it. I previously maintained a small website that I wrote in Org mode, and I used Emacs Org mode to publish a static HTML site. Well, I had something a little grander in mind. I still wanted to use Org mode, but instead of maintaining a site that had 20 or 30 pages, I wanted to see how Org mode would handle a website that had around 22,000 pages. Now I know some of you guys are going to say, that's crazy. You would write such a large static website using Emacs and Org mode? Well, that's the point: I wanted to see how viable an option it would be. Could you actually maintain a rather large website using strictly Org mode and Emacs?
So before I go into the details of how I created this site, let me show you the website in its current form. This is distro.tube. Most of the documentation here, most of the articles and things, were on my previous versions of the site. But to really push the boundaries, I now have this section here called Linux man pages. I took all of the man pages that were currently installed on my Arco system, which is rather bloated because I have a ton of stuff installed on my main production workstation, mainly from reviewing software and things like that. So I grabbed all 21,500-and-something man pages that were installed on my system, converted those man pages over to Org, and then converted the Org over to static HTML. And this seriously took a lot of time.
When I say this took me a couple of weeks, it really took me a couple of weeks. It's one of those things with a lot of upfront cost, but maintaining it going forward should be rather simple, I think. You see under the Linux man pages, I have it broken down by section. If you're not familiar with how man pages work, they're organized into sections. For example, if I run man man, you see "man" and then out to the side you get a number 1. That means that particular man page, for the program man, is in section 1, which covers executable programs or shell commands. As you can see, there are about 10 or 12 different sections of man pages. If I click on one, we'll do section 1 here. Section 1 and, I believe, section 3 are very large. There are many thousands of man pages in those sections. So instead of having a really lengthy list of man pages on one page, I broke it down alphabetically. For example, A, and here are all the section 1 man pages that begin with the letter A. I could read the man page for audacity: let me click on audacity, and there is the man page for that. And you can see some of the styling and the CSS. I didn't spend a ton of time on the colors and everything. I kept it similar to the previous version of the site, just a little darker.
Something I did want to add, because I spent so much time with these Org documents in Emacs: when I converted them to HTML, I wanted to keep the bullet points, like the heading bullet points you have in Emacs Org mode. Let me show you. Let me navigate to the GitLab repository where I have all of the source code for the site. Let's look at the index page here, and you can see, in Emacs Org mode, the bullet points if I zoom in here. I wanted to keep the same style of bullet point, not just in the Org documents, but also in the converted HTML.
Now creating some of these man page listings was really challenging. For example, this listing of all the man pages in section 1 that begin with the letter A. How do you do that? It's a challenge getting all the alphabetical stuff separated. And here is the main index page for the Linux man pages subdirectory. How did I accomplish this? Let me show you exactly. Let's navigate to this subdirectory here, man-org. This is the directory for all the Org man pages. Let's go to the index page. I have this section here, "Search alphabetically", and I created a table with A through Z. I also included numbers, because some programs start with a digit rather than a letter.
Say I go to the A page here. Let me click on that. How I get this alphabetical listing in the rendered HTML is that I create a source code block here in Org mode. Let me zoom in. The source code block is a bash script. When this bash script executes, it prints the results into the Org document, and they also end up in the rendered HTML page when I run org-publish.
Just quickly, this bash code here: I create an array, and I call the array "starts with A", right? Everything that starts with the letter A. This array is populated by the find command. I search in the man directory for type file, for any name that starts with the letter a or capital A. It's a case-insensitive name search. Then I take that list of all the files that begin with the letter A and pipe it into the sort command, because I want an alphabetical listing. But because I'm dealing with subdirectories, sort would by default sort on the full path, and I want it to sort strictly on the file name. So with the sort command, I use the -t flag to specify a slash as the field separator. And I want the third column.
I want it to sort by the third column, because the paths that come out of the find command look like ./directory/filename. With slashes as the separator, the third column is the file name. So that's how we do the sort correctly. Otherwise, the sort will be all messed up.
Now that we've got our array of all the files that start with A sorted correctly, we put them in a for loop. So this for loop here: for the variable x in the "starts with A" array, run the following command. All this command does is clean up the file name a little more, because each of the file names has .org as an extension. So we get rid of the .org extension. That's what this echo piped into awk and sed does. And then finally, we echo this here, which creates a link in Org mode. A link in Org mode is created with double brackets. The first part inside the double brackets is $x. That is the file path, which is the link target. And then $name is what shows up as the description of the link.
Now when you're dealing with hundreds or thousands of files, running some of these source code blocks in an Org document can take some time, anywhere from several seconds to several minutes. So these kinds of files take a while to generate. But I only have to run it once, right? Unless I'm adding more man pages down the road, this is not something I'll have to execute very often. But I'll show you this on camera. I have my cursor here on the #+end_src line of the source block. So let me hit Enter. In the echo area down here, you can see it's writing to /tmp. It's doing a code block evaluation, and it just said code block evaluation is complete. But Emacs is still tied up, because now it has to take that output and write it into the Org document here.
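As a rough sketch of the kind of source block described above (not the exact script shown in the video), the find, sort, and link-printing steps might look like this. The function name list_org_links and its letter argument are hypothetical, a pipe into a while loop stands in for the array and for loop, and parameter expansion stands in for the awk and sed cleanup.

```shell
# list_org_links is a hypothetical name. Run it from the directory that holds
# the section subdirectories, so find prints paths like ./man1/awk.1.org.
list_org_links() {
    # -iname makes the match case-insensitive ("a*" and "A*").
    # With "/" as the separator, field 1 is ".", field 2 is the directory,
    # and field 3 is the file name, so -k3 sorts on the file name only.
    find . -type f -iname "${1}*" | sort -t/ -k3 |
    while IFS= read -r x; do
        name="${x##*/}"       # strip the directory part
        name="${name%.org}"   # strip the .org extension
        # Org link syntax: [[target][description]]
        printf '[[%s][%s]]\n' "$x" "$name"
    done
}
```

Calling list_org_links a from inside the man-org directory would print one Org link per matching file, ordered by file name rather than by full path.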
And yeah, now it's done. We're not dealing with a huge file here. Let's see how many lines: 806, so a pretty good bit of results. But that took several seconds. If you had something several thousand files long that needed some serious processing, don't be surprised if Emacs is tied up for a couple of minutes. And remember, Emacs is single-threaded, not multi-threaded. I can't do anything in Emacs while it's running a process like that. All I can do is sit there and wait for it to finish. I have quite a few pages on this site that use source code blocks to generate the output that you guys actually see on the pages. So here's that page we were just looking at, the A page.
Now you're probably wondering how I got the individual man pages. For example, the man page for a52dec, whatever that program is; I'm not familiar with it. How did I get those man pages converted to Org? Let me open a terminal here and zoom in. Let me show you where the man pages are on your system. If you go into /usr/share/man, this is where all the man pages on the system are. There are a lot of language-specific directories, but here are the real sections of man pages, right there. And that is what I put on the site. If I go back to the sections: 0p, 1, 1p, 2, 3. Back in the terminal: 0p, 1, 1p, 2, 3, et cetera. Let me cd into one of these sections. Let's cd into man1 and do an ls. You'll see that these man pages are .gz, so they're compressed files. Before I could work with them, I had to uncompress them. What I did, let me get back into Emacs, is in this project directory I created a directory called scripts, and I wrote some simple shell scripts. For example, let me zoom in, decompress-gz.sh. This is a shell script, obviously.
All it does is create an array using the find command once again. I created a folder called mandb, for man page database. The script finds every file that ends in .gz, and then the for loop passes each one to gzip, which decompresses it. So it's no longer a .gz file; it's just in standard troff format, which is the man page format. So now I have about 21,000 uncompressed man pages that are still in troff format. How do I get them into Org format? For that, I used pandoc. For those of you not familiar with it, pandoc does file type conversions. For example, troff to Org mode, or Markdown to ODT, PDF, or HTML. It converts between a lot of different text markup formats, LaTeX, for example.
The way pandoc works for the conversions I was doing: I would do pandoc -o, for output, and then name.org. So we're writing to a specific file, name.org. Then -f man; that's the format the original document is in. Then -t org; that's what we're converting to. And then name.man, or whatever the extension happens to be. That is the file we're converting to name.org.
Now obviously you don't want to run pandoc 21,000 times by hand, right? So this is another thing you'd stick in a for loop. Let me show you some of the other scripts I had in that scripts directory. I had one for converting the man pages here. Same deal: we're creating an array using the find command, and we're finding everything in the man page database under man3. So this is everything in section 3, with any name. Then I've got this for loop that formats the name the way I want the pages to be named, and then runs the pandoc command, very similar to what I just showed you in the terminal. Pretty simple stuff; I'm not going to go into detail.
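A minimal sketch of those two steps, under some assumptions: the directory names mandb and man-org come from the video, but the function names, the output naming scheme (page name plus .org), and the use of find piped into a loop are illustrative guesses, not the actual scripts. The pandoc flags are the ones described above.

```shell
# Decompress every .gz man page under the given directory, in place.
decompress_pages() {
    find "$1" -type f -name '*.gz' -exec gzip -d {} +
}

# Work out the Org file name for one troff page (assumed naming scheme):
#   mandb/man3/printf.3 -> man-org/man3/printf.3.org
org_name_for() {
    src_page="$1"
    out_root="$2"
    section=$(basename "$(dirname "$src_page")")
    printf '%s/%s/%s.org\n' "$out_root" "$section" "$(basename "$src_page")"
}

# Convert one section directory (for example mandb/man3) from troff to Org.
# Requires pandoc; pages it cannot convert are reported on stderr and skipped.
convert_section() {
    section_dir="$1"
    org_root="$2"
    find "$section_dir" -type f ! -name '*.gz' | while IFS= read -r page; do
        target=$(org_name_for "$page" "$org_root")
        mkdir -p "$(dirname "$target")"
        pandoc -f man -t org -o "$target" "$page" ||
            echo "pandoc failed on $page" >&2
    done
}
```

With those definitions, decompress_pages mandb followed by convert_section mandb/man3 man-org would handle a single section in one run.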
Some of the other scripts I have are specific scripts for each section. When you're dealing with 21,000 files, I didn't want to run one script that had to convert all 21,000 files with pandoc, right? So I created a script for each section. That way I can manage each section separately.
Let me show you a little bit about org-publish. I've talked about org-publish in videos before, but let me find my config for Doom Emacs. Let me search for my Doom config.org here. And in my Doom Emacs config, you can see I've got a section labeled Org Publish. I have this variable set here, org-publish-project-alist. These are our org-publish projects. You can essentially think of them as websites. Because this website was so massive, I split it up. There's the website without the man pages, so just my 50 or so pages that have nothing to do with the Linux man pages section. That way I don't have to render 21,000 files if I just need to render stuff outside of that directory. Then I have each individual man section, man0p, man1, man1p, et cetera, all the way to man8, as its own project. That way I'm not tying up my computer for hours at a time; I can split the conversion up by section.
If I run the org-publish command, which in Doom Emacs is SPC m P, you see I get suggestions for the next key. I can do a, for org-publish-all. I don't want to do that; it's going to take a long time. Or I can do f, which is a single file. So if I'm only making an edit to one single file in this 21,000-page project, that's the one I want. I go edit that one page, and when I'm done, I do SPC m P f.
Now I love working with these websites written in Org because they're essentially self-contained Org sites, right?
This whole site, I can just read as Org documents, as its own little Org wiki, because everything just links to other Org documents, right? And then the conversion just magically translates your Org wiki into an HTML site. It looks the same and it's really seamless. The website and the Org documents don't look that different. Also, because I'm pushing all of my source code to GitLab: for those of you that use things like GitHub or GitLab, they have the ability to render Org documents correctly. So if I go to my GitLab page at gitlab.com/dwt1, go to my repository for distro.tube right there, and scroll down, there is the index page, right? It actually renders the tables and the links and everything. And if I clicked on one of these links, it would just go to that page here inside GitLab.
So I just love creating websites in Org mode. I will never go back to being one of those WordPress users. I mean, it's okay if you have a need for a dynamic site, something that needs a database behind it. That's fine. But 99% of the websites out there could just be static websites. And if you're an Emacs user and you're using Org mode, it makes sense. If you're like me, I write everything in Org mode, which means literally everything I've ever written, all my notes, anything on my system that's in Org, I can convert to HTML just like that. And it could be a website.
So this is the project I've been working on for a couple of weeks. I've had to do a lot with this. For such a massive project, 21,000-plus pages, does it make sense to use org-publish? Well, I will say that the upfront costs are massive, because it's going to take hours and hours to convert all those pages from Org to HTML. Also, you're going to run into mistakes, right? Nobody creates a website from scratch without making mistakes.
I had to do a ton with grep, sed, and awk to correct things, because after I ran the conversion I had 21,000 Org documents and now 21,000 HTML documents. And then I find, you know what? In the link to my CSS file, I accidentally typed httpss:, right? I misspelled it. So this is not going to render right. When I throw this up on the web, it's not going to look right. So I had 21,000 pages, actually 42,000 pages, to fix, because I wanted to correct the links in both the Org documents and the HTML documents. That's where I get into things like grep, sed, and awk.
I also had to rename a lot of files, because a lot of the man page file names include colons, and I can't use those in Org. Org does special things with colons, especially when you have two colons in a row, and a lot of man pages have two colons in a row as part of their file names. Org can't handle that, especially if you're linking to a file that has two colons in a row in its name. That link will be broken in Org. So I had to do a lot of renaming. We talked about renaming in a video just a couple of days ago, where I covered all the tools for renaming files. I had to use a lot of those, because a lot of those man page file names had to be changed.
So, a lot of upfront cost, many hours. Now that I've got it to the point it's at, there are still errors on the site. There are things that are still not right. But for the most part, it's right enough that I went ahead and put it up live on the web. Don't be surprised if some of the man pages you look at don't look right. Out of those 21,000 man pages, I know there are at least 50 or 60 that are not going to look right, because they included some special formatting in the troff source that pandoc could not handle. Pandoc could not convert that over to Org. And really this was just kind of a proof of concept.
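For illustration only, here is one way the two cleanups described above could be scripted. The video does not show these commands, so everything here is an assumption: the function names, the exact typo pattern (httpss: replaced with https:), and the choice of an underscore as the replacement for runs of colons.

```shell
# Replace the misspelled scheme in every file under a directory that contains it.
# Writes to a temporary file and moves it back, so it does not rely on sed -i.
fix_scheme_typo() {
    grep -rl 'httpss:' "$1" | while IFS= read -r f; do
        sed 's|httpss:|https:|g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}

# Rename files whose names contain colons, collapsing each run of colons
# into one underscore: File::Basename.3pm.org -> File_Basename.3pm.org
# Name collisions are not handled in this sketch.
rename_colon_files() {
    find "$1" -type f -name '*:*' | while IFS= read -r old; do
        dir=$(dirname "$old")
        base=$(basename "$old")
        new=$(printf '%s\n' "$base" | sed 's/::*/_/g')
        mv "$old" "$dir/$new"
    done
}
```

Any Org links that point at a renamed file would need the same substitution applied to their targets, or they would still be broken.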
I wanted to see whether I could make such a large website using org-publish, and I can. And it was fun because there were so many challenges. You saw all the crazy little scripts that I wrote to make this site happen. I had to do so many conversions: file name conversions, and also text conversions. Let me show you one of the scripts I wrote, because I don't think I mentioned this particular one and it's actually kind of neat. Once I converted the 21,000 man pages, I needed to add the header, the Org mode header, to each of those files. And how am I going to accomplish that? Well, the way we accomplish that is with this for loop here. With sed, we insert this line of text here on the very first line of the file. And then we do the same thing with these next five lines. Each one gets inserted as the first line, so this one actually ends up as line five. This will be four, this will be three, this will be two, and this will be one, which is the correct order. And these variables here in the for loop, for file name and extension, are how I get the file name without the extension, so that the title of the page can be "man pages", dash, name of program, for example. So many challenges with this site. So many things I had to overcome, but it was fun. I had a blast. And I hope you guys appreciate the new website over at distro.tube.
Now before I go, I need to thank a few special people. I need to thank the producers of this episode. Devin Gabe James Matt, Michael Mitchell, Paul Scott, Wes Allen, Armored Dragon, Chuck Commander, Ingrid Diokai, Dylan George lead, Lennox Ninja Maxim, Mike Erion, Alexander Peace, Archon, Fedor, Polytech, Rib prophet, Steven and Willie. These guys, they're my highest tiered patrons over on Patreon. Without these guys, this new distro.tube website wouldn't have been possible. The show is also brought to you by each and every one of these ladies and gentlemen, all these names you're seeing on the screen right now. These are all my supporters over on Patreon, because I don't have any corporate sponsors. It's just me and you guys, the community, right? If you like my work and want to see more videos about Linux and free and open source software, subscribe to DistroTube over on Patreon. All right guys, peace. Vim users are jealous. Nano users are probably confused.