I'm David Bremner, and welcome to the notmuch BoF. These things have a tendency to turn into talks instead of BoFs, and the fact that we're starting with slides is already one step down that slippery slope. So... We'll heckle you a lot during the talk. That's perfect. Yeah, I encourage heckling. So the chief heckler in the front row is Carl Worth; we've been working on this project together now for a few years, although we just met today. So that's one of those internet stories for you.
So what is notmuch? It's a library, although I guess originally it was just a command line tool, and then people started to want to do crazy things like call it from Python. And so it morphed into a shared library and a command line tool. You can think of it essentially as a front end for Xapian, the full-text search engine, with specializations for dealing with mail. And so, well, it does indexing; otherwise searching is very slow, right, if you don't have any kind of index. It deals with tagging. And in my opinion, the most important thing it does is fast searching. So I think in the beginning, many of us thought that tags, Gmail, wow, this was really the thing, but over time, I think the most useful feature it supports is full-text searching.
So here's a diagram which took me a while to draw, so that's why I show it every time I talk about notmuch. So really the heavy work is done by Xapian and, to a slightly lesser extent, by GMime. So notmuch is not a GNOME project and is full of GNOME haters or something. I don't... Haters is too strong, but people who prefer not to use GNOME. But nonetheless, GMime is great. We have bindings, and there's a bunch of user agents built on top of it. So... The blue box, yeah, great colors, huh? Example tools. So these are some tools that have been built that are not mail user agents. So they're mainly tools that are built using scripting or using the bindings.
So I think part of what I want to pitch briefly is that it's quite an extensible system. So there's lots of different ways to use it. If you want to write C code, you can. For the rest of you, there are various bindings. Honestly, the Python bindings are the best supported. They're the ones that people are using daily. So unless you have reasons to really dislike Python, that's probably... Well, or scripting from the command line is also a perfectly sensible thing to do. Okay. So does anything else not make sense here? All right. So Bower is written in Mercury. So if you've always wanted to work on a project written in Mercury, this is your chance.
Okay. So this was up to date as of May of this year, so it's slightly out of date now. So you can see that Carl did a whole bunch of work in terms of number of commits. And then... I think it's really small. The rest of us do too, actually. So yeah. And then half of my commits are like, you know, "merge release branch to master" or something. So I overinflate my own importance here. So the real people to thank lately for their ongoing efforts are Austin Clements, Jani Nikula (Yanni or Jani, I don't know how to pronounce Finnish names), Mark Walters, and Tomi Ollila. Something like that. Okay. So we're Finnish-heavy, which is always a good sign in a free software project, if you have lots of Finns. So you know, all right. So you should get on that bandwagon, or move to Finland and get on, or get out of the way, right? That's what they say about... It's kind of aggressive. I mean, I don't really care if you...
Okay. So here's a sort of man-page-level look at the command line interface. I guess Carl can say whether it was consciously modeled on Git or not. You copied a user interface feature. I copied the one good user interface feature. The video people are getting twitchy. You'll have a microphone.
If I'm going to copy from something that has a notoriously bad user interface, which Git does, I got the one thing that I thought was really good.
So we have query syntax, which is essentially Xapian's query syntax. So if you're used to that, then great. So that has its pluses and minuses. So we occasionally find it a bit limiting, but we're working on that. But you can search on some headers. You can't yet search on all of them. It's a constant feature request, so I'm sure it's going to happen at some point. Yes, Ian? Can I have the one? Yes. You said I should heckle. Right. So query syntax. I'm missing body from that set. Missing body. Search in the body. Oh, don't do any of these things and you'll search in the body. Search only in the body. If you search for a term that you're wanting in the body, you'll also get occurrences where it's in the subject or at the front. Would it actually be a hard feature to fix? No. I mean, we wouldn't even have to change the indexing, I don't think. Yeah. All right. So if you ever start to use notmuch and you make a feature request, it wouldn't be that hard to do. Body only. Okay. So sure.
So here's an example of some queries. So I'm looking... I have some kind of hierarchical scheme of tags, which is just my convention, and I'm looking for things that are to me. So one of the original design decisions of Carl's is that notmuch doesn't modify your mail. And so here I am modifying your mail. But I'm doing it by doing a search for mail that has been tagged previously in some user agent as deleted. And then once in a while I do a purge and remove those. We can come back to these scripts and critique my Perl. But what I would say is that it's not just because of the language here that the Python script probably looks cleaner: the Python script is using bindings that have been written, whereas the Perl script is acting as a glorified shell script. It's just shelling out to notmuch.
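The purge idea described above (find everything tagged as deleted, then remove the files) can be sketched by shelling out to the command line tool. This is not the script shown on the slide, only a minimal illustration: the helper names are made up for this example, and it assumes the `notmuch` binary is on the path and that `deleted` is the tag your mail client applies.

```python
import os
import subprocess


def deleted_files_command(tag="deleted"):
    """Build the argv that lists the files of every message carrying `tag`."""
    return ["notmuch", "search", "--output=files", f"tag:{tag}"]


def parse_file_list(output):
    """Split the one-path-per-line output, skipping blank lines."""
    return [line for line in output.splitlines() if line.strip()]


def purge(tag="deleted", dry_run=True):
    """Return the files tagged `tag`; delete them only when dry_run is False."""
    result = subprocess.run(
        deleted_files_command(tag), capture_output=True, text=True, check=True
    )
    paths = parse_file_list(result.stdout)
    if not dry_run:
        for path in paths:
            os.remove(path)
        # Let notmuch notice that the files are gone from the mail store.
        subprocess.run(["notmuch", "new"], check=True)
    return paths
```

Calling `purge()` with the default `dry_run=True` only reports what would be removed, which seems the safer default for anything that deletes mail.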
I think the only reason I included it is that that's a piece of production code that runs our bug tracker. So it's real. I mean, that's worth something. You can do Ruby... Ruby, Python, it's all the same, right? Python with Rails, sure.
So there's the aforementioned bug tracker. I'll do most of this stuff as demos. Let me just mention the demos I could do if people were interested. So there's a bug tracker based on notmuch, there's an Emacs-based mail user interface, there's a Vim-based user interface, there's alot down here. And these are all actually, unfortunately, you know, bitmap images not reproducing too well. All showing the same search results. Yeah, this is out of date. These are from last year, these images, so. Right, you may well argue that if I just managed my mail better I wouldn't have to develop this software, but there you go. So I don't know if Zack can give a more informative demo of notmuch-mutt than I can. So it's a way of sticking with mutt while taking advantage of the search facilities of notmuch. Sure.
There's a web client, which I've actually got running on my laptop right now. It's sadly not the version from Debian, and if you'd like to know why then I'll rant about that a bit. But so I can demo the web client. Rob, who is hiding in the corner there, has sort of by mistake written another web client. So you have your choice. You can use the portable, portable Marshmallow, not much itself is written entirely on that stage. Right. Okay. So we have a web client written in Haskell and a web client written in Clojure. So. I know me early. Yeah. Okay. And what do you have on web applications that are written in a live? Fair enough. You'll have to write your own then, because we're clearly short.
Okay. So let me finish off just by mentioning a few of the things that have recently happened, so if you're running the Debian notmuch, you won't see these yet. They're in Git master or... so, sorry? So MELPA's a bit dicey with notmuch, yes.
If it works at all, you will get notmuch jump. But the problem is that MELPA only brings you the Emacs Lisp, and so that can... really, you should get the C and the Emacs Lisp together, but. So well, some bug fixes, some fixes for unread message handling, some UI features. You notice Emacs is coming up here a lot. So you can see where most of the development is happening, or rather most of the development that I'm aware of. So some of these things are not happening in the main notmuch tree. alot is its own project, and notmuch Emacs, or, I mean... Notmuch outside of Emacs. Yeah. What's that? What's notmuch jump? So notmuch jump is a cool new UI feature, which I can demo for you. It's simple, but cool.
So what we're working on, this is kind of foundational stuff. We're going to go from database versions to database features. It's sort of like having a symbols file instead of one SONAME. So I think it should be groovy. We're going to get metadata modification time. So you're going to say, do me a backup of all my tags that have changed since Thursday. So we're going to be able to do incremental backups of your notmuch metadata. Some improved message citation stuff, charset, and CID. So some stuff with HTML and the ever-ongoing battle with character encodings. And so notmuch insert was introduced. So right now we have two MDAs, notmuch-based, and this is really irritating. So basically notmuch insert was introduced and it misses a key feature. So we're going to fix that so that we'll be back to having one.
Ian. So I had a bit of a read round, but I'm still kind of a bit confused. So notmuch is an indexer, but also, if it needs an MDA, then it's a mail store. And now I'm told it's an MUA as well. So which of these, what is it anyway? So it needs an MDA because the thing it indexes is mail. So it doesn't... I guess it's mission creep in some sense. Well, I don't know.
So the actual history of where notmuch came from: I started using an email program called Sup, which actually goes a long way to explain the name. Not much. "Not much" was the answer to "Sup?". Anyway, so Sup was written in some horrifically slow language, Ruby. We don't hate Ruby on it. No, Ruby's great. I'm sure it was great. But it was using Xapian, and Xapian is a C++ library, which gave very fast full-text searching of the email. And I thought that was really great. But the indexing code, the thing that actually parsed the email and shoved it into Xapian, was all written in Ruby. And just the thing that read in an email message and constructed an internal data representation of this simple little blob of text was horrifically slow. And it took forever to index your mail in the first place, not because Xapian was doing the full-text indexing. That was actually pretty fast. But because Ruby was actually interning these strings. And I said, okay, this is really ridiculous.
So I set out just to speed up Sup by writing a little bit of C code that could read in an email message and shove it off into Xapian. That's all I was trying to do. Because if I had tried to do anything of what we're talking about up here, I never would have started. I would have been terrified. That's a big project. But I'll just accelerate the indexing here in Sup. And so I got done with it. And some of the design decisions that have afflicted us for a long time... one of the features that I never did at the beginning. And in the past, I don't know, two years, I've totally abandoned notmuch and let these guys continue to run with it. Thank you guys. One of the features that I never did, and I'm surprised that no one has done yet, is the indexing of arbitrary headers. Or even the indexing of just the List-Id header, which everyone complains about.
One of the reasons I didn't do that is I was intentionally, from the beginning, making an index that was absolutely compatible with the indexing that Sup did. So I could do, like, bitwise comparisons to check that, yes, I fed all the same data to Xapian and I got the same index out, so that we could turn my code over to the Sup author and say, hey, take it. It makes a bit-for-bit identical index. And I got to that point and I said, here you go. Take a look at it. What do you think? And he's like, I don't want any C++ code. No, it was only C code. I don't want any C code in my project. I want to be able to fix bugs in Ruby. So no, we don't want it.
I thought, oh, well, okay, well, I can index my mail, and now that I can index it, I could write a little bit of code to search it. That's not very hard. And I've got this command line tool. And then Keith Packard says, well, heck, at that point, Carl, it would just be... I was struggling with the Sup user interface, which is a curses-based interface written in Ruby. I really wanted something inside of Emacs. He's like, that's just a little bit of Elisp code. I could write that really fast, said Keith. And somehow he tricked me into doing that. A couple of days later, I'm like, oh, I've got an email program here. And I switched to using it entirely.
So, finally getting back to Ian's question, it is an indexer that expects to have a local mail store available. It doesn't do anything in terms of fetching your mail, getting it onto your system, being an MDA. This guy who talks about it being an MDA, well, we'll talk about what that means in a second. So from the beginning, it assumes that all of your mail is available locally, which is great, because if you compare it to something like Gmail, you can have your mail store live on your laptop and not be visible to any third party. So that has some benefits. And then on top of that, so now we have a command line interface.
And I want to do a little bit of a demo of the command line interface. Do you have HDMI over there, by the way? Just VGA. Just VGA. Do you have a VGA? I bought too new of a laptop, which did not give me a VGA. Anyway, I'll try to figure it out. You want to do it on my machine? Yeah, would you? We probably have more or less the same mail. What it takes, we don't know, but there's HDMI in. We have HDMI? I'm going to get one more piece of it. No, that's OK. It's OK. No, we want to do the actual demo. If I could just... I just need to get this one-line command to you somehow, or you could type it in. It won't work for you because you don't have the code repository I'm dealing with. I'll talk about that in a second. Oh, holy what? This is real life. This was the command I typed this morning. This will be a story in a minute. So I can't type all that. It doesn't matter if you get it right, because it won't run. OK. It focuses it well. He doesn't have the code repository. There's some curtains.
OK, so notmuch is an indexer; it assumes a local mail store. We did a library interface, and my idea was, once I got it to the point where I had something working inside of Emacs, I said, nobody is going to want to use this. Well, there's these few diehards that run their entire lives inside Emacs, and they'll be happy. They'll love it. They've been waiting for this for years, since they ran into Wanderlust's speed and limitations years ago. But anyone else, like, I don't know, they want to use Thunderbird or some other email program. We'll need to give them a library interface. That's why we did a library interface, so that someone who's actually maintaining a graphical email program could simply hook notmuch up, and we'd get fast indexing for Evolution and Thunderbird. And nobody's ever done that either, so that's been kind of sad for me. Not that I want to use those programs, but just to be able to say that we did that.
So: an indexer, a library, and an MUA built on top of it. OK, so now I'll do David's story since he's typing this. Are we failing? OK. I can't even get it. It's a one-liner. I typed it this morning. Yeah, but it took you all morning. Yeah, put it in a pastebin and I'll grab it. Turn on your webcam and point it towards his laptop. I don't have a webcam. This is technology. This is difficult. I'm so glad we asked for this to be streamed. Can I just paste it to #notmuch? Are you in there? Oh, yeah, I can be. Oh, I could be if the network worked. OK, so I don't know why I have no network.
So this morning, Eric Anholt, who is a notmuch user, says, oh, I had this notmuch disaster happen to me yesterday. I said, what was that? He said, well, I thought I was in a filtered view of a few commits, and I ran star minus inbox. These are Emacs key binding commands, meaning star, act on all messages in the current view; minus, take away the current tag; not inbox, but todo. Star minus todo. And he said, I cleared all the todo flags, because he flags emails that he wants to get back to and do something with later. But he wasn't. Yeah, yeah, wasn't that great? Well, that was the thing. He thought he was in the filtered view, but he was at the top-level view, and he threw away the todo tag from all messages. Well, the tags are the only bit of precious information that exists in the notmuch database. Everything else you can recreate from the mail store. Tags are something that you've added manually. And so now he said, I didn't remember anything that I had marked todo. I said, well, that's great, you don't have anything to do. Your work just got easier. And I said, well, surely you can recreate a bunch of that. He said, yeah, I could remember that someone had asked me about that kayaking trip, and I was supposed to reply. And he said, I went back and recreated that. So the biggest problem was this software project he's dealing with called piglit.
He said, there were a bunch of patches that people had sent, and I had marked them, because they were tricky to review, and I wanted to get back to them and get back to me. So in particular, the people that had sent a web patch, and it was their first patch, and I wanted to make sure it really got reviewed and got at least replied to and inserted, right? I said, well, so you're talking about people who have sent patches to the piglit mailing list. Their patch has never been applied. This seems like something that's really easy to describe. So we wrote this one-liner. We still don't have it up. Okay, so anyway. This part's not my fault. This was just, it was just a little bit... he's got the notmuch part at the beginning. Okay, well, I can keep typing. Yeah, so yeah, keep typing. And the rest of it is just a git shortlog thing. It's not really that interesting now that we're making David type it.
The point was we could generate a query. So this first part, notmuch search to:bremner subject... So Bremner, this was actually piglit, he changed it on me. So we were looking for all the mail sent to the piglit mailing list with the subject of patch. And that was about 2,600 email messages, more than Eric wanted to go through. But we wanted to exclude from that any mail sent from someone who had a patch successfully applied to the piglit code repository. So that was a git shortlog with some sed, blah, blah, blah, blah, and that was perfect. Thanks, David. And they then resolved, do you want to type? No, you're not going to type the output of the command. Because the output of this command was, I don't know, it filled my whole xterm. 60 lines' worth of query, but you could run it and you get about 100 messages out of it. And even then, we had some names like José where the accent was in the git message or the, you know, there were some easy things to filter out.
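The one-liner itself never made it onto the screen, so the following is only a sketch of the idea as described: take the people who already have commits in the repository and exclude them from a search for patch mail sent to the list. It assumes committer addresses come from `git shortlog -s -e` style output (count, tab, then name and address in angle brackets); the function names are invented for this example, and matching on addresses rather than names is a choice made here to sidestep the accented-name problem mentioned above.

```python
import re


def committer_emails(shortlog_output):
    """Extract lower-cased addresses from `git shortlog -s -e` style lines."""
    emails = set()
    for line in shortlog_output.splitlines():
        match = re.search(r"<([^<>\s]+@[^<>\s]+)>\s*$", line)
        if match:
            emails.add(match.group(1).lower())
    return sorted(emails)


def unapplied_patch_query(list_address, emails):
    """Query for patch mail to the list from people with no commits yet."""
    base = f"to:{list_address} and subject:patch"
    if not emails:
        return base
    excluded = " or ".join(f"from:{email}" for email in emails)
    return f"{base} and not ({excluded})"
```

The resulting string can be handed to `notmuch search` as a single argument. With a few hundred committers it becomes a very long query, which matches the "filled my whole terminal" experience in the story.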
I think he had 70 messages after about five minutes, and that was a small enough list that he could go through and manually recover. So that was kind of a cool story. The scriptability of an email program has been really valuable to me. And that was something I didn't expect to create from the beginning.
So basically what you said is that because you've managed to make an Emacs mail client that doesn't support Emacs undo, you ended up writing some hideous shell script. That's not it. Well, and he talked earlier about this incremental metadata update, so that Eric could have been backing up his tags in the beginning and avoided this problem. Right, but that would require people to make backups, which is kind of a silly... Okay, so I don't know where people want to go. Somebody asked about notmuch jump, Rob. I think the other thing to note there is that there is a dump and restore command, which you can run whenever you want to. And it keeps a file containing all the mappings between the message IDs and your tags. And so if you do that, then you can get most of it back later. And people do fun things with notmuch dump, because it's a text file that comes out. You can put that under Git revision control and you have, like I said, the only precious bit of your database that's not recreatable from your mail store. So you can put that under Git revision control. It's pretty cool. And that's actually the basis of the nmbug thing that does the bug tracking.
Do you want to demo? Sure. So let me do two demos at one time. So... This is on your real mail store. Yeah, so we're gonna... So you can now do some injection attacks. If you want to email something to David right now, you could put embarrassing things on there. Oh, cool. I was gonna go to the bug list. Okay, so let's go to the review list. All right, so notmuch jump is nothing other than just some cool key bindings for saved searches.
So now from anywhere in notmuch, you can jump to what I guess non-notmuch users would call folders, and it's much faster. So we used to have sort of a top-down... we still have, in fact, so this just adds on some more key... let's not look at my inbox. God only knows what's in there. That's much better, right? Nothing incriminating on a Debian list. So, okay.
So, right, so this is in fact nmbug. So nmbug is, in some sense, just a particular search in my mail client. So instead of going to the BTS, I just bring up this. And so for example, if I decided that... And this is the list of outstanding patches to be reviewed, or? Which is only 66 patches at the moment, which is pretty good, I have to say. I mean, it's been worse before anyway. So our patch review takes place on the mailing list, and suppose that I think, you know what, you only live once, let's just merge this. So, I just remove that tag. And then I do... oh, this is not too good, is it? All right, so then I could do, but I'm not going to, an nmbug commit, which commits my tag changes to a shared Git, well, to a Git repository, and an nmbug push. So it's a variety of distributed bug tracker.
And so why do you care, right? I'm just messing about with my mail store. Well, the point is it's not just my mail store, right? Other people can fetch this same Git repository and share partial versions of tags with me. So everybody who's interested can participate in this metadata sharing. It's an experiment... well, I guess it's not really an experiment anymore. No, it's been the bug tracker for notmuch for quite some time. And for people that are using notmuch, it's been very effective that we can collaboratively work on this single namespace of tags. And yet we all have this reflected in our own mail store.
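Both the dump file mentioned earlier and the tag sharing described here come down to comparing two sets of (message ID, tags) pairs. As a rough sketch, assuming the line-oriented dump format in which each line looks like `+tag1 +tag2 -- id:message-id` (and ignoring the escaping the real format applies to unusual characters), two dumps can be diffed like this. The function names are made up for illustration; this is not how nmbug itself is implemented.

```python
import re

# One dump line: zero or more "+tag" words, then "--", then the message id.
LINE = re.compile(r"^(?P<tags>.*?)\s*-- (?P<id>id:\S.*)$")


def parse_dump(text):
    """Map each message id to its set of tags; comments and blanks are skipped."""
    tags_by_id = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            tags = {t[1:] for t in match.group("tags").split() if t.startswith("+")}
            tags_by_id[match.group("id").strip()] = tags
    return tags_by_id


def tag_changes(old, new):
    """Return {message id: (tags added, tags removed)} for messages that differ."""
    changes = {}
    for message_id in old.keys() | new.keys():
        before = old.get(message_id, set())
        after = new.get(message_id, set())
        if before != after:
            changes[message_id] = (after - before, before - after)
    return changes
```

Run against yesterday's dump and today's, a diff like this would have shown exactly which messages lost their todo tag in the story above.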
So you don't have to operate... since all of our code submission is by emailing patches, and all of our code review is by replying to patches, now all of our bug manipulation is simply adding or removing tags from those emails within the email client.
Mika, one of you. So now I understand your saved searches in Emacs were all notmuch::foo tags, which are all related to this, because those are the bunch of tags you're sharing through the Git repo. Right, so I have a couple of saved searches. And I mean, it's only a convention; there's a configuration for this webpage. So the set of tags is just a set of tags, right? And so the webpage and I have a set of views which more or less agree, except that some things I hide, because once I tag something some way I don't want to see it anymore, but yeah, essentially.
Has anybody thought about downloading the Debian BTS, running notmuch to index it all, and then tagging all the tags in it and seeing what comes out of it? Yeah, it should be possible. I mean, because yeah, the Debian bug tracking system is similarly all mail and then some extra metadata on top. I don't know that anyone's ever thought... Well, Jerome has a comment, probably about the size of the mail store. You're right. I was just gonna mention that, for people who are thinking about evaluating it, it might be relevant. In terms of performance, it performs pretty well, but when you get to a very large number, or a not quite "not much" number of mail, unless you have an SSD, I suspect it can get a little slower. So if you're up in the 1.5 million messages or whatever, it'll be a little pokey on, like, a fast laptop drive. And of course, the BTS is many times that, right? I have no idea what the total number of messages is in all threads, in all BTS bugs, even the unarchived ones. That's a... what, it could be tried for sure. The early conspiracy theory that came out... I was working for Intel when I started the notmuch project.
I'm still working for Intel. It was that I came out saying, yeah, it works great for me. I have 800,000 messages. It's blisteringly fast. And everyone's like, wait, what file system are you using? Oh, what about it? ext4 or whatever. And I'm like, oh, wait, I'm using an Intel SSD and it goes really fast for me. So the conspiracy theory is that the whole project is just a way to sell SSDs from Intel. But it actually works really well on an SSD. And I think there's some kernel bugs in something, or something, we're tickling some bad behavior somewhere.
Right. So at the moment, I'm using VM, which maybe not everybody knows. It's an Emacs mail reader. VM likes to take your whole inbox. My inbox is, you know, several tens of thousands of messages. I have to rotate it about every six months. And it slurps the whole inbox into core. This works particularly well if you have a 32-bit Emacs. Can I evaluate notmuch without having to convert my entire mailbox to Maildir, in case I, you know, because converting back is going to be a pain as well. So you said you currently have to rotate it occasionally; that's because of some VM limitation or something. There's one mbox file. Right, I have one file. Right, we want to do one file. And so Emacs can't, the 32-bit Emacs can't hold the whole mbox file. Right. So you've been regularly running into the limitations of the mbox file format for a long time and just dealing with it by rolling this file over. Anyway, okay. Support for mbox file indexing. Well, as I said to Richard Stallman, currently nobody's working on that, but you'd be welcome to. Yeah. But I mean, as far as permanently, it's pretty trivial to run mb2md and see what it would be like. Which by the way has, sorry if the author's in the audience, the most awful user interface ever. I mean, it makes Git look like genius. But it is, it is really bad.
It, like, doesn't take relative paths, you have to give it absolute paths, and weird things like that. But you give mb2md the absolute path to your mbox and the absolute path to the Maildir you want it to create, and try it out. Maybe you'll never want to go back. That would be nice.
Questions? Yeah, I just wanted to mention a pet peeve of mine, which is totally in the area of patches welcome, as David already told me. And I just wanted to mention it because it might be relevant for specific use cases. So namely, I'm living in a country which is very much accent-ful, which is France. And as far as I understand, there is no kind of stemming, which is not really stemming, but canonicalization, that makes accented letters be indexed also as if they didn't have the accent in the first place. So if you have a lot of words with accents, and especially if you have the same word appearing with and without accents, you essentially need to do a lot of queries with a lot of ORs. And I guess we might need to do some kind of canonicalization before indexing, but if you are not doing that, that will be really painful.
So a lot of that feature would be down inside of Xapian. Xapian does have the ability to specify, oh, this is the language of the text I'm indexing, and it can behave differently. And so you might trigger some of that behavior one way or the other. But you have different languages. But you have different languages. And so there's a fundamental problem here, which is that email doesn't come decorated with what the language of the text is. There are various projects that examine a body of text and try to automatically determine the language. And then we could pass that on to Xapian. I don't think we've yet incorporated one into notmuch. So if I, I mean, I think we talked about this in Switzerland, too. And, and, and I guess it doesn't sound, at least to get some initial handling for it.
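The canonicalization being asked for here can be illustrated with Unicode decomposition: split each accented letter into its base letter plus combining marks, then drop the marks. This is only a sketch of the idea using Python's standard library. It is not something notmuch or Xapian is claimed to do, and the helper names are made up for this example.

```python
import unicodedata


def strip_accents(text):
    """Decompose to NFD and drop combining marks, e.g. 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_variants(term):
    """Return the term plus its accent-free form when the two differ."""
    folded = strip_accents(term)
    return [term] if folded == term else [term, folded]
```

Until something like this happens at indexing time, a helper such as `accent_variants` could at least generate the list of alternatives to OR together in a query, which is the workaround described above.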
I mean, it sounds to me like Xapian is the right place for it, but the problem with that is then, well, do we really want to use bleeding-edge Xapian? This is a constant question for us, but. Do we know how much the language matters? Because I think notmuch is currently telling Xapian English. And I don't know if you changed it to... I'm not sure you want to. I think that Zack has this sort of Unicode map in mind that maps everything to ASCII. So as far as I understand, there is a decomposition rule in which every single letter can be decomposed into the letter without the accent plus the accent. So you can imagine doing some sort of normalization. Dimitri is going to tell us why this is impossible. Well, I mean, ICU itself, like half of that database, it has translation tables from most of the languages that have ever existed into a canonical translation. You could use libicu. But it's like a seven-megabyte database which comes with it. And then you could translate it, but I'm not sure if Xapian hooks into that or not. But it is the canonical way to canonicalize localized text.
There were a couple of other questions. The video team has a question, which I don't understand. You're happy. Okay. Hi. How well does notmuch handle duplicate message IDs? So you take that? No, you go ahead. So usually in most mail stores, if you have the same message ID showing up in multiple places, you're actually dealing with logically the same message. And that's how notmuch treats it. In the database, it uses the message ID as its primary key. It will keep track of the fact that there are multiple files associated with the same message ID. And there's even UI where you could say, hey, give me a list of all the message IDs that map to multiple files, and you could prune out the duplicates and things like that. What notmuch currently does, the last time I looked at the code, is index the full text of the first such message it encounters.
Later, if it encounters another message with the same message ID, it assumes it's already indexed it and won't index it further. Commonly, what happens to me, for example, is I'll get two copies of every message sent to a mailing list, one that has and one that does not have the mailing list's little signature thing at the bottom. So that's worked in practice just fine. You can imagine potential attacks where people are trying to forge message IDs and trying to mess up my mail store by sending me duplicates, and we haven't dealt with that potential problem. Yeah, I mean, the other thing is I think the consensus is that what we really want to do is index all the copies, at least so that even if you have these false friends which look like the same message but really aren't, you can find the message you want. And that's, believe it or not, why we haven't indexed the List-Id yet. So there's some chain of causality. And then there's probably some UI things to be done to say, hey, by the way, this message has three other duplicates. Would you like to see them as well? And if we had that UI in there, I think at that point you could avoid any potential problems from weird duplication.
Yeah, I mean, besides the footer at the bottom, there's also the List-Id, like you mentioned. The headers are different for those, and if I'm going to search, oh, I want to see all the ones from that list where we had this conversation, suddenly the ones that came to me directly and got indexed first are missing. And that's exactly what David meant. That's why we haven't indexed the List-Id yet, because we'd have to do it reliably by indexing the copies first. So we've got to deal with the indexing of copies before we can add that. Why on earth are you using the message ID as the primary key? That's obviously insane. I should start sending email with the same ID all the time. That was actually a question. I want to know what the answer is. Why?
I have this great built-in answer to most of these hard questions that had to be answered early in the project. It was that I was only doing what Sup did, right? I wasn't trying to invent something new. So I'm going to pass off all the tough questions to that point. But no, it turns out, like I said, it's not that primary. It's a thing that's searchable. There's nothing really primary about it other than that we don't index duplicates. Once we're indexing duplicates, there won't be anything primary about it. It would just be another term that's indexed. Why is it that when I use different output formats, I get different headers output? Like, I was looking for all the SpamAssassin rules that hit, and some output formats didn't have them included and some did. Not for searches, just for dumping, showing the message. So the format question here is whether you're going to output plain text versus JSON versus whatever else. And I don't imagine there's any good reason that the set of headers from different formats would be different. It could be a bug. So it seems most likely that the plain text versus one of the structured formats got out of sync. I mean, I expect the structured formats to all be in sync. But these days nobody uses the plain text as a key part of their workflow. So it's maybe bit-rotting. Yeah. And I think, related to that, for people who aren't familiar, it's very handy that they're talking about the command line client. With the command line client, you can ask it to spit out well-documented JSON or S-expressions of your data. So it's very easy to build on top of that. Yeah, I mentioned earlier we have a library, which is a C library interface, and we've got these bindings on top of it. But almost always when I've built something on top of notmuch, I haven't used the library, but I've built on top of the notmuch command line interface.
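Building on the command line client's JSON output might look like the following minimal sketch. It assumes the `notmuch search --format=json` output is a JSON array of thread objects carrying at least `thread`, `subject`, and `tags` keys; the helper names and the sample data are made up for illustration, and `search()` only works where notmuch is installed and configured.

```python
import json
import subprocess


def parse_search_output(raw_json):
    """Turn JSON search output into (thread id, subject, tags) tuples."""
    return [(t["thread"], t["subject"], t["tags"]) for t in json.loads(raw_json)]


def search(query):
    """Run the notmuch command line client and parse its structured output."""
    result = subprocess.run(
        ["notmuch", "search", "--format=json", query],
        check=True, capture_output=True, text=True,
    )
    return parse_search_output(result.stdout)


# Hand-written sample in the assumed shape, so the parser can be tried offline.
SAMPLE = (
    '[{"thread": "0000000000000001", "subject": "hello",'
    ' "tags": ["inbox", "unread"]}]'
)
```

Keeping the parsing separate from the subprocess call means a script can be tested against saved output without touching a real mail store.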
All of the initial Emacs stuff was done that way, even with the cheesy plain text output, not anything structured. Fortunately, now it uses the structured output, which is much more reliable. But that's been the nicest thing for scriptability, using the command line interface. Other questions? We're probably low on time. I was going to do a... We have like seven minutes. We have seven minutes. Oh, ten! Look, Joe's got signs for us. Ten minutes. Other questions? Otherwise David's going to come up with other stuff to demo. So we mentioned undo, or at least I did. Have you got any plans to make undo work in Emacs? Because I use that in my MUA all the time. So one thing... Yeah, we're getting closer, maybe. I'm not sure of the right way to do it because it essentially has to keep the database in sync. I mean, you can reach behind the Emacs interface and do stuff in parallel. But one minor UI feature, which isn't what you asked for, but which is getting closer... Oh dear. So... Right, so you can't really see, but here I just asked nmbug, how has my mail store diverged from the canonical one? And I'm undoing that. Undoing my previous demo. Okay, so it's back. So one minor UI thing that's changed... Again, it's a workaround. But one thing that's changed is, rather than making the tags disappear, we're now giving visual feedback. So this red line says it's actually gone from the database, but you can put it back. So it's a related UI feature, I guess. What's the recommended way to fetch your mail? So I guess... I mean, in an important sense, notmuch doesn't care as long as it lands there, right? But in the BoF spirit of how do we solve these problems, I use syncmaildir. I have SSH access. Many people use offlineimap. I hesitate to make a recommendation because I'm still using fetchmail, but I don't necessarily recommend that. Straight rsync of the Maildir. Rsync of the Maildir.
Yeah, rsync, in some sense, you know it's much better tested and isn't going to do crazy incompatible upgrades. Before we get to the next question, I want to say one more thing on Ian's question. We have discussed making every command construct a journal of the changes it's made, so that you have a sync point in the journal and you could replay back to do your undos and things like that. So that's what a solution would look like; we haven't implemented that. I actually wanted to respond to that as well. I'm not sure what undo you might be referring to, and I don't know how familiar you are with notmuch itself, but as it is in the Emacs mode, when, for example, you're constructing a message, you can do all the undo you're typically doing with your text editing. No problem. It's just the notmuch commands: where you're adding a tag, for example, there is no Emacs undo for that add or remove of a tag. But you can quite simply do the opposite of the command you just did. If you just removed that tag, or if you removed all the tags in a particular view, then you can just add them all back. You just don't have the key binding to undo that. And as David demoed here on a single message, if you have a search, for example, showing all the messages currently tagged todo, and you remove the todo tag from all of those, it won't automatically update that search, which would lose the messages. They stay there so that you could say, oh wait, I messed that up, add them back. So you can usually recover from those kinds of things. Right, but we had the example earlier: VM has virtual folders, which are a kind of search really, and I'm often doing unwise things to the contents of a folder, like, you know, delete all the messages in this virtual folder when I had previously deleted half of them, right, and now I want to undo that deletion. That's very easy, I just hit undo.
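The journal idea discussed above has not been implemented in notmuch, according to the speakers, so the following is only a sketch of what such a journal could look like. The class `TagJournal` and all of its methods are hypothetical. Recording only the changes that actually took effect is one way to handle the virtual-folder example, where half of the messages had already lost the tag before the bulk operation.

```python
class TagJournal:
    """Hypothetical in-memory tag store that journals every effective change."""

    def __init__(self):
        self.tags = {}     # message ID -> set of tags
        self.journal = []  # (op, message ID, tag) for changes that took effect

    def _apply(self, op, message_id, tag):
        tags = self.tags.setdefault(message_id, set())
        if op == "+":
            tags.add(tag)
        else:
            tags.discard(tag)

    def change(self, op, message_id, tag):
        """Apply '+' (add) or '-' (remove); journal it only if it changed state."""
        present = tag in self.tags.get(message_id, set())
        if (op == "+") == present:
            return  # no-op, so there is nothing to undo later
        self._apply(op, message_id, tag)
        self.journal.append((op, message_id, tag))

    def sync_point(self):
        """Return a marker that undo_to() can later roll back to."""
        return len(self.journal)

    def undo_to(self, point):
        """Replay the journal backwards, inverting each change, down to `point`."""
        while len(self.journal) > point:
            op, message_id, tag = self.journal.pop()
            self._apply("-" if op == "+" else "+", message_id, tag)
```

Taking a sync point before a bulk tag removal and calling `undo_to()` afterwards restores exactly the tags that the bulk operation removed, and leaves alone the messages that did not have the tag to begin with.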
Yeah, absolutely, the full-featured undo would be a nice thing, and we'd love to have that. We have two minutes, so I think we're down to two questions. Given that you don't index the List-Id, how do you deal with mailing lists? Usually, most of us just use the to address, or a conjunction of several to addresses if the list's to address has changed, and it works. For example, if you wanted to search for anything that was sent to the notmuch list, you could just do, right now, search to:notmuch, right, and that's all the messages sent to an address that had notmuch in it. Now, you could have written the complete destination address for the mailing list, but that's what most of us do, use the to address. So, I should mention that another option is delivery-time tagging of some kind, which some people do, or if you happen to deliver into folders, in the old procmail style, or the current procmail style, I guess, then you can search on folders. So, this is a little harder, but maybe this will work, you never know. Ooh, look, so, I have folders. I happen to have my mail more or less organized into folders by mailing list from... Don't get sidetracked, answer me. And so I'm able to use that file structure as well to search. And let's not talk about the warning. I apologize that I missed the first half of the BoF, so I don't know if there has already been discussion about indexing the clear text of encrypted messages. Ooh, ooh, ooh. Yeah, I wanted to talk about this. Right now, it's encryption. I switched a few months ago to encrypting by default all messages that I send. If I have the key of the person I'm sending to, sending to an individual person, not to a mailing list, my client will automatically and transparently do encryption. I'm not doing encryption because I necessarily have some private content in the message.
I'm just trying to protect the transport of the message so it's not intercepted as it goes along. Because of that, if I'm imagining a day where all of us are sending a lot of messages back and forth that are all encrypted by default, I would love to continue to be able to use notmuch. Notmuch right now can't do any indexing of that encrypted content. What I would like to do is this: when receiving the message, notmuch will currently decrypt it. You know, it'll prompt for the passphrase. I enter the passphrase, it will decrypt it and show it to me. I'd like it to say, oh, I just decrypted this message. Should I insert the plain text version into the mail store? Which I would almost always say yes to. And then I would protect my mail store separately, such as by using an encrypted file system or something like that. Protect your index and the mail store. I would do those together. But why is it that your mail store is still encrypted but the index... No, I'm suggesting that... I would suggest that when decrypting the message to show it to me, I would insert a plain text version into my mail store. This is the system I would like. And then that would make protection of the mail store and the index a separate step. Give it to DKG and make him stand up. So I think people are going to have a lot of different preferences about the sensitivity of their mail. So I personally want the messages in my mail store to be byte for byte what I received. One of the things that I love about notmuch is the fact that it doesn't tamper with the messages in the mail store. And so what I would like is to have the index be something that I can protect. And, as a per-message decision in the way that you describe, be able to say, please index the clear text of this message.
And I recognize that if you have access to the index of the clear text message, you can probably reconstruct at the very least a bag of words of the message, if not... You've got the order. You've got the order as well? So you might as well put the clear text there. Okay, so... But there's still a question of, I've got my mail store and I've got my index. You'll have two copies, one in particular and one out. Both are identical. Right, but I don't want to have to think about protecting my mail store. Right now I understand what's going on in my mail store. I understand where the encrypted messages are. And if I'm doing rsync or offlineimap or something, I don't want to be pushing clear text messages back up to the server's mail store. And my index is local and it's going to stay local. So I want to protect the index and I would like to leave my mail store the same. So the workflow I would like to see is just focused on the index. So I should have started answering that question by saying that currently notmuch does not index encrypted messages. So the index is safe in that sense. It's not useful, in the sense that you can't search any of the encrypted messages. So yeah, we do need to develop the workflows and the right protection for that. So this is the end of our time slot. There are no other talks in this room, so you guys can continue as long as you want. But there are probably other talks going on. Just so you know. Thanks. I was just going to say that presumably it's just the body of any message that's encrypted and not the headers, so you are still indexing the headers and all that information, right? Okay. That's correct. Typical email encryption will encrypt just the body, and notmuch will successfully index all of the unencrypted headers. So you can still search those. And so anyone is free to leave at this point. There may be other more interesting talks going on, but anyone's free to stay and chat. Yeah, so I'm interested in encryption as well.
So I thought about the problem and I was wondering, can you simply... I don't know if it's simple or not, but can you just think about encrypting the whole index with some transparent layer on top of Xapian, or maybe working with the Xapian people to just do that? It's certainly possible. There's a project... Is it the Mailpile project? Is that a free software thing? I can't remember. Yes. They are currently doing an encrypted index, I believe. And they're using Xapian? No. No, they did their own indexing. They're focusing entirely on a web client. So it's another project perhaps worth looking at. So it seems to me like the workflow is very much a filter-prior-to-index workflow, where every message that comes in has no filter and then you can apply filters for the particular indexing of any given message. And then of course I guess it needs to be ordered filters, because you could decrypt a message that then has a PDF in it, and index the PDF successively. Yeah, but notmuch doesn't currently, I think, go to that level. We will index the file names of attachments, but we're not sucking them in and indexing other attachments, etc. Unless someone's added that. Yeah, no, I don't think so. It'd be pretty simple to do. I think it would be an issue of how much you are going to bloat your index. So I'd suggest that, I mean, it's the beginning of DebConf. I'm around all week. So let's sort of wrap it up, and please find us if you want to talk more or want to work on notmuch. Thanks for coming. Patches are definitely welcome. Appreciate it.
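The ordered, filter-prior-to-index workflow suggested near the end of the discussion could be sketched roughly as follows. Nothing here exists in notmuch; `collect_text`, the toy content type `application/x-toy-encrypted`, and both stand-in filters are assumptions made purely for illustration, and real decryption or PDF extraction would replace the fake helpers.

```python
def collect_text(part, filters, max_depth=5):
    """Apply filters successively until a part yields plain text to index.

    `part` is a (content type, payload) pair. Each filter takes a payload and
    returns a list of new parts, so a decrypted message can expose a PDF
    that is then handed to the PDF filter in turn.
    """
    content_type, payload = part
    if content_type == "text/plain":
        return [payload]
    handler = filters.get(content_type)
    if handler is None or max_depth == 0:
        return []  # no filter for this type, or nesting too deep: index nothing
    texts = []
    for sub_part in handler(payload):
        texts.extend(collect_text(sub_part, filters, max_depth - 1))
    return texts


def fake_decrypt(payload):
    # Stand-in for real decryption: simply reverses the string.
    return [("application/pdf", payload[::-1])]


def fake_pdf_to_text(payload):
    # Stand-in for a real PDF text extractor: simply lowercases the string.
    return [("text/plain", payload.lower())]


FILTERS = {
    "application/x-toy-encrypted": fake_decrypt,
    "application/pdf": fake_pdf_to_text,
}
```

The `max_depth` limit is one simple way to bound how far nested attachments are followed, which touches on the index-size concern raised in the discussion.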