One of the steps I absolutely dread in writing manuscripts or grant proposals or anything is having to insert citations. If I have to use EndNote, oh my gosh, don't come near me because I'm just going to be pissed. Nothing, nothing is fun about using EndNote. So what do I use instead? Well, if I had to be honest, it probably isn't a whole lot better than EndNote, but at least I'm not paying for it. At least I understand what's going on, and things aren't going to crash and burn for me like they typically do when I use EndNote.
Hey, folks, I'm Pat Schloss and this is Code Club. In each episode of Code Club, I apply principles of reproducible research to an interesting biological question. We've been working on a project over many, many, many episodes, and we're at the point in the project where we're working on the manuscript. If you've watched previous episodes, you might be thinking, gee, Pat, you haven't cited anybody. Are you really going to write and submit a paper that has no references in it? No, don't worry. I'm going to cite other people. I mean, I need to cite myself, right? No. Anyway, I hate using EndNote, as I said earlier, and what I'm using instead is something called BibTeX. Now, don't worry about the jargon. I'm going to walk you through the process of finding references and inserting them into your document, and I'll show you how what we already have set up will let us render those citations and make a nice-looking bibliography without the computer crashing and burning like it always seems to do whenever I use EndNote.
All right. Good. I'm going to do most of my work today in Atom rather than in RStudio, because I'm going to do a fair amount of iterating between the R Markdown document and the final PDF. And I personally find that works a lot better in Atom and the terminal than in RStudio.
Anyway, you might prefer it the other way, with RStudio, and know that the process is basically the same. So you do what works for you, and I'll show you my process. Okay. So on my computer, I'm going to go to my project root directory and fire up Atom. I'm going to go to my manuscript.Rmd file. As we scroll through here, you'll see my R Markdown document, which is, again, that blend of Markdown, which is simple, light text formatting, along with R code. And because we're generating a PDF or Microsoft Word document, there are some other field codes in here for doing things like inserting vertical spaces or daggers or whatever. But we've talked about that previously, so we don't need to worry about that today. What we want to think about today is inserting citations.
So we're writing a short-form paper that I think is limited to 1,200 words, and this paragraph is my introduction. It's really compact, and you'll notice there are no citations here. I prefer to write without worrying about the citations. To do this, though, you need to know the literature well enough to know what you can say and what you can't say. Generally, when I write, I do it a little bit differently than what I've done here: as I'm writing my sentences, I put a marker in wherever I want a set of references, right? It's a note to myself to say, hey, Pat, when it's time to insert the references, go ahead and put one here. I don't want to put my references in as I'm writing because that slows me down. It allows me to procrastinate. I can, you know, open up my browser and populate just thousands of tabs when I really only need one paper for that citation. So again, I like to save inserting the references for the end, when I have some good text and I'm getting ready to polish up the paper, edit it, and submit it.
So what I do is I look through my text for the different places where I think I need to put a reference, and I might make a note of what references I want to go there, right? So this one here: the distance-based thresholds that were developed and are now widely used, 3%, were based on DNA-DNA hybridization. So this is going to be Stackebrandt. I'll say Stackebrandt and Goebel. So I know what I need to plug in there. Genome sequencing technology suggests that the widely used 3% distance threshold, blah, blah, blah, is too coarse. And I might put in another "refs" there, right? And as an alternative, amplicon sequence variants have been proposed as a way to adopt the thresholds, right? So I want to put in a set of references there. Here I might put in things like the references to DADA2, to UNOISE2, to Deblur. And let's go ahead and put in MED, that's Meren's oligotyping tool that kind of got all this started. And then we'll also put in a reference here to, it was Callahan and Holmes. Basically, the paper with the headline that exact amplicon sequence variants are what you should use, and you should burn OTUs and never think about them again.
Anyway, let's keep coming through here. I think a lot of this is getting into being data and not necessarily something that needs to be cited. Maybe we'll go ahead and put a reference here for the rrnDB, right? And so we could keep working through the manuscript inserting these callouts to myself of where I need to put references and what references I want to put in there. For some things, like this sentence, I don't know exactly what paper I want to put in there. So I'm going to hold on to that, and I'll figure that out later. So I could keep going through this looking for different places where I need to add citations.
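The callouts described above are easy to lose track of in a long manuscript. The following is a minimal sketch, not part of the episode, of how one might list any that remain before submitting. It assumes the callouts are parenthesized notes such as "(REF)", "(refs: ...)", or "(citation)"; that marker style, the `find_placeholders` name, and the sample sentences are all assumptions for illustration.

```python
import re

# Assumed marker style: parenthesized notes such as "(REF)",
# "(refs: DADA2, Deblur)", or "(citation)". Adjust the pattern to
# whatever callout you actually type while drafting.
PLACEHOLDER = re.compile(r"\((?:refs?|citations?)\b[^)]*\)", re.IGNORECASE)


def find_placeholders(text):
    """Return (line_number, marker) pairs for every leftover callout."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Made-up sample text standing in for a manuscript draft.
sample = (
    "Distance-based thresholds were based on DNA-DNA hybridization (REF).\n"
    "ASVs have been proposed as an alternative (refs: DADA2, UNOISE2, Deblur, MED).\n"
    "This sentence already has a real citation [@Stackebrandt1994].\n"
)

for number, marker in find_placeholders(sample):
    print(f"line {number}: {marker}")
```

Running something like this over the R Markdown file right before building the PDF would flag any note that never got replaced with a real citation.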
One place that I know I'm going to want to add citations is this paragraph. Here I already put in "citation", right? And I want to make sure I replace those, because I don't want to submit a paper that says "citation". There are a number of papers that have somehow gotten published with things like that in there, and we don't want to be that person. Anyway, we want to include references in this paragraph. Again, we're looking for sentences that are statements of fact that we need to back up with a reference to the literature. Also, in our materials and methods, there are going to be various places where we need a citation. Again, the rrnDB, the SILVA reference alignment, I want a reference for that. And so forth, mothur, and on and on and on. Right? So let's come back to this first paragraph, and I'll show you, as an example, how I start populating some of these references.
Okay, before we can get going, we need to come back up to the top, to the YAML. Remember, the YAML is that code chunk. It's not really a code chunk, it's the header material between the pair of three-hyphen lines. And we'll do bibliography: references.bib, and then we'll do csl: asm.csl. Okay, so what's going on here? references.bib is a file we need to create that contains BibTeX-formatted reference information for anything we're going to cite in the paper. So we'll need to create that file here in submission. I'll go ahead and do New File, and we'll do references.bib. Okay, so that's there. It's empty. We'll leave that there. Good. And then we'll also create asm.csl. asm.csl is going to have the formatting information to make the references compatible with the style guide for the American Society for Microbiology. Now, the CSL is what's called XML code.
And I'm not going to prepare it. I'm going to go grab it from GitHub. So if you open your browser, I'm going to search for GitHub and then CSL. And this should bring me to, yeah, the official repository for the Citation Style Language, CSL, on GitHub. You can see I was here earlier checking it out. This has 1,324, or I guess 2,324, entries, with each entry being a different CSL file. And luckily, American Society for Microbiology starts with an A, so we can do a very simple search for American Society for Micro right here. Again, this is XML code. I'm going to click on Raw, and you can see what it looks like. There's no need to read it at this point, though we might do a little bit of tweaking to it later on. So I'm going to do Command-A, copy, and then paste that into asm.csl and save that. And so now I have both of my files.
Before I forget, I'm going to go to my Makefile and add both of these files as dependencies. So we'll do submission/asm.csl, and we'll do submission/references.bib. Okay, so now if we modify asm.csl, or if we add a reference, and then run make submission/manuscript.pdf, it'll regenerate that for us. Good. All right, I'm going to close that Makefile because I don't think I need to change anything more there. I'm going to close asm.csl for now. We don't need to worry about that.
Let's go ahead and add a reference. The first one that I know I want to put in is a reference to the 1994, I think, paper by Stackebrandt and Goebel. I'll show you a couple of ways that we can do this. So let's go back to our browser, and I can do scholar.google.com. We could search for Stackebrandt Goebel 1994, and let's see what we get. So "Taxonomic note: a place for DNA-DNA reassociation". This is the paper I want. This is the paper that I know is where we get that 97% definition from. So how do we get the BibTeX-formatted information out of this?
What we can do is click on those quotes, and this will open up a window that's got various formats of the citation. What we would like is this BibTeX version. So if you click on BibTeX, we see that we get this chunk of code, and we can copy it and paste it into references.bib. We see that we've got stackebrandt1994taxonomic. It's got the title, the author, the journal, the volume number, on and on. And then, if we insert this into our document, that asm.csl will tell the rendering software, pandoc, how to format the citation. Okay, we're going to use this name as the tag when we cite the paper. It's kind of like the field codes, or the name and number, from EndNote if you're familiar with that. If you use Zotero or Paperpile, I'm sure it's the same kind of idea. So I'm going to make this instead capital-S Stackebrandt1994, and I'll remove the "taxonomic". I use the last name of the first author and the year. And when I copy that into here, I need to add an at sign before the Stackebrandt1994. If we save that, and we make sure that we saved references.bib, we can come back and do make submission/manuscript.pdf to double-check that all the plumbing works. And if we do open submission/manuscript.pdf and come down to the end, we see that we've got a citation here for Stackebrandt and Goebel. The formatting is not so great. But if we come up to our introduction paragraph, we see that now we have a 1 for that reference. Great.
Now, going through Google Scholar is one way to get the BibTeX-formatted information that we need for our references file. I'll go ahead and open EndNote. I don't even have EndNote in my Dock, I hate using it so much. So I'll do a search for Stackebrandt, because I've cited it before. And so it's this citation here from 1994.
And all that information is there. The output, though, is in ASM format. So if I want to output it as BibTeX, I can do Output Styles, Open Style Manager, and then down here find BibTeX Export and make sure that it is checked. Again, if you're using a different reference manager, do a Google search for the name of your reference manager and then "export BibTeX" and you'll get what you want. So we'll go ahead and use that. Now if I come over here and choose BibTeX Export, I can grab this format and paste that in there, right? This is the same type of information. It seems to be formatted a little bit more nicely. And this tag up here, again, I'm going to make Stackebrandt1994. Save that, and we could then re-render. All right. So that's our first reference, our first citation in our manuscript. I'm going to close EndNote because I hate using it, and we won't use it any further.
Let's go ahead and look at the next reference here in the manuscript. We'll do this Callahan and Holmes one, and I'll show you my typical workflow for most cases. I go to PubMed and search for Callahan, Holmes. And I come down, and it's this paper right here, exact sequence variants to replace OTUs, right? Clicking on that will open up the page for this paper, and I will highlight the DOI and copy that. Then I will go to a website called doi2bib.org. I even have a button for that in my bookmarks. If I paste that in there and click Get BibTeX, it outputs the BibTeX information for that reference. I can then copy that and paste it into my references.bib. I try to insert things in alphabetical order, so it's easier for me to find an entry if I need to edit anything down the road. I can save that. And again, it's got this tag, first author's last name and the year, which I can then paste into my square brackets with that at sign.
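The re-tagging step described above (swap the exported key for first author plus year, then cite it as an at sign plus the key in square brackets) can be sketched in a few lines. This is only an illustration, not something shown in the episode: the function names are hypothetical, the entry is abbreviated to two fields, and the sketch assumes the author field is written "Last, First and Last, First" with a four-digit year field.

```python
import re


def author_year_key(entry):
    """Build a key such as 'Stackebrandt1994' from a BibTeX entry.

    Assumes the author field is written "Last, First and Last, First"
    and that the entry has a four-digit year field.
    """
    author = re.search(r"author\s*=\s*\{([^,}]+)", entry, re.IGNORECASE)
    year = re.search(r"year\s*=\s*\{?(\d{4})", entry, re.IGNORECASE)
    if author is None or year is None:
        raise ValueError("entry needs both an author and a year field")
    last_name = re.sub(r"\W", "", author.group(1))  # drop spaces and hyphens
    return last_name + year.group(1)


def rekey(entry, new_key):
    """Replace the citation key that follows '@type{' on the first line."""
    return re.sub(
        r"^(\s*@\w+\{)[^,]+,",
        lambda match: match.group(1) + new_key + ",",
        entry,
        count=1,
    )


# Abbreviated entry in the shape Google Scholar exports (most fields omitted).
scholar_entry = """@article{stackebrandt1994taxonomic,
  author={Stackebrandt, E. and Goebel, B. M.},
  year={1994}
}"""

key = author_year_key(scholar_entry)
print(rekey(scholar_entry, key).splitlines()[0])  # first line now carries the short key
print(f"[@{key}]")  # tag to paste into the R Markdown text
```

Doing this by hand, as in the episode, is perfectly fine for a handful of references. The point of the sketch is only to make the naming convention explicit.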
And then I can build it, so that if I make submission/manuscript.pdf, the output PDF will have that reference formatted for me. Looking at this, we should now have a 2 for the definition of ASVs. And if we come down, we see that we have that citation for Callahan, McMurdie, and Holmes. Okay, great. This is not putting the references in the ideal place. It's putting them at the end of the figure legends, but we're not going to worry about that for now.
Let me show you one more thing. We've got four references here that we would like to have formatted so that we get a range of numbers for these four papers. What we can do is go back to PubMed, and we can do McMurdie, or no, it's Callahan[au] Holmes[au], and then DADA2[ti] for the title. And there it is. Again, I'm going to grab the DOI, go to doi2bib, paste that in there, and get the BibTeX. So this is Callahan2016, and we'll go to references.bib. Now, if this had also been published in 2017, these tags, like this Callahan2016, need to be unique, so I could do like a and b. I'll go ahead and copy that, and I'm going to put that in here. Again, we'll do the at sign, Callahan2016. And instead of commas, we're going to use a semicolon to separate the citations.
For the next one, I'm going to come back to UNOISE2 because that's a little bit different. And we'll want Deblur and MED. So let's go ahead and do Deblur. Deblur is in the title, and Knight was the senior author on that. And again, we'll grab the DOI. You might already have a lot of these in your reference manager of choice, and like I did from EndNote, you can output them directly. I'm kind of between reference managers, if you couldn't tell. So this is Amir, and that goes up at the top. I'll grab the Amir tag, and then wherever I want to cite it, I'll put it there with that at sign. And then MED is from Meren, Eren.
And entropy was in the title. Again, there's also the author. So, Meren. He goes by Meren, and I forget what his actual name is. So let's see: entropy, maximum entropy. So Eren AM. Ah, it's not maximum. It's minimum entropy decomposition. So we'll grab this DOI. You know, it would probably be worth me using an actual reference manager. But this works for what I'm trying to do with writing these papers. It's a little bit cumbersome, but it's mine, right? All right. And so then, again, that needs a semicolon. The semicolons within the square brackets allow us to string together a handful of references.
The next one is UNOISE2, and actually, you know what, I think it has been published. So let's do Edgar and UNOISE2[ti]. I keep thinking it was a preprint. And let's see, there's like 400 things there. I know it's in bioRxiv, so we're going to do a search there for Edgar and UNOISE2. There it is. I thought it had been published, but maybe not. Yeah, it looks like it's just a preprint, so it hasn't been published yet. We can click on Citation Tools, and we can then click BibTeX. This will actually download a file that we can open up, and we see the BibTeX-formatted entry here. I can then copy this into my references, and that will go between C and E. I'm going to make this tag Edgar2016, save that, and then pop that into here. And again, we need that semicolon to separate them.
Okay, so let's go ahead and make this so we can see that we have a range from one number to another, and we'll have, I think, six papers in our bibliography now. Okay, so we have our six references here. And coming back up to this first paragraph of our intro, we see we have 2 to 5, right? One, 2 to 5, and 6. Great. Okay, so this is basically the process I go through to insert citations. Now, as I do this, I'm confirming what I think I remember from the literature.
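Two rules from the steps above are worth spelling out: several citations go inside one pair of square brackets separated by semicolons, and every tag in references.bib has to be unique. The sketch below is an illustration under assumptions, not the episode's code. The `cite` and `duplicate_keys` helpers are hypothetical names, and the toy bibliography contains only keys, no real field data.

```python
import re
from collections import Counter


def cite(*keys):
    """Join citation keys into one bracketed group, e.g. [@A2016; @B2017]."""
    if not keys:
        raise ValueError("need at least one citation key")
    return "[" + "; ".join(f"@{key}" for key in keys) + "]"


def duplicate_keys(bib_text):
    """Return keys that appear on more than one '@type{key,' line."""
    keys = re.findall(r"^[ \t]*@\w+\{([^,\s]+),", bib_text, flags=re.MULTILINE)
    return sorted(key for key, count in Counter(keys).items() if count > 1)


print(cite("Callahan2016", "Edgar2016"))  # [@Callahan2016; @Edgar2016]

# Toy bibliography with a repeated key; the fields are left out on purpose.
toy_bib = (
    "@article{Callahan2016,\n}\n"
    "@article{Callahan2016,\n}\n"
    "@article{Edgar2016,\n}\n"
)
print(duplicate_keys(toy_bib))  # ['Callahan2016']
```

A repeated key like the one flagged here is the case where appending a and b to the tags, as mentioned above, keeps them distinct.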
And sometimes I might need to edit the text around that citation a little bit. And if I get something wrong, yeah. So I'm going to go ahead and insert these, and I'll be right back.
All right, so I've gone ahead and inserted a bunch of references in here. The reference I ended up putting in for this first sentence is the classic Lane 1985 paper, which is really the first paper out there from Norm Pace saying, hey, we can use 16S to characterize communities. You know, there are lots of different choices, but I think that's a good one. And then down here in my discussion section, I had a number of references, and I also put a number of references in my materials and methods. Let's go ahead and build this. We'll do make submission/manuscript.pdf, let this churn through, and we'll see what our bibliography and references look like. And we see that we've got all sorts of citations in here now. If we scroll down, we have the references page, but our references are instead down here at the bottom. Actually, we also see that we have 25 citations. I did not intend to have 25 citations, but the mSphere instructions to authors limit you to 25 citations. Although if you look at other Observation papers, you'll notice that they tend to have more than 25 citations. Anyway, I think we're in good shape.
I would like to move the references up to the references section instead of having them at the end of the figure legends. To do that, we'll need to come to the references area where I have that title. And I can then do <div id="refs"></div>, so an opening div tag with id equals refs, and then close out that div with a closing div tag. This is going to tell pandoc to put the references in this spot. And if you make that, what we'll see is that the references, instead of being at the end of the figures, will now be where I have that References heading. All right. And so again, we see that the end of our document is the figures.
And if we scroll up here, we see we've got our references. So that looks nice. Something that's a little bit annoying is that I'm used to having my references have a hanging indent. So the first line of each reference is against, say, the left margin, and then each subsequent line would be lined up under the L in Lane there. We can do that as well. Through some, you know, fancy googling, I was able to figure this out. We do \setlength{\parindent}{-0.25in}, so backslash setlength, then parindent in curly braces, and then outside of that curly brace, in a second pair, minus 0.25 inches. And then another setlength, and here we do leftskip, \setlength{\leftskip}{0.25in}. And then we add \noindent. And if we save that, we can then reset things after the references. So basically, the first line pulls the paragraph indent back a quarter of an inch, and the second sets the left skip to a quarter inch. So we're going back a quarter inch on the top line and then moving everything over a quarter inch. I then want to set this back to zero for my figure legends. So again, those need to be 0in where we had 0.25in up there. And we'll go ahead and build this, and hopefully it looks like it has that hanging indent.
All right, so that ran through. Let's check out the hanging indent in our references. So yeah, we see we've got a hanging indent. It's not perfect. Normally there might be a tab between the number and the last name of the first author. I've noticed here that Rodriguez-R is missing something. So maybe what we'll do is go to that paper and figure out what's going on. Is that actually Rodriguez-R, or is it a hyphenated last name that somehow got screwed up in the translation? But I also notice that Stackebrandt and Goebel is all caps, so let's go back and take care of that first. I'm going to make Stackebrandt and Goebel title case and save that. And then Rodriguez, that's up... R comes before S, Pat.
So let me look at this paper. Let's go to our browser and see what the author's last name was. Huh, they've got it as dash R. I'm going to assume that's right. It kind of surprises me. Let's look at the PDF and see what it says. Yeah, it's dash R. Okay, cool. Well, I just wanted to check. I don't want people to be mad at me for citing their names wrong. Okay, and we'll save that.
Something else I'll do as I look through here is look for major problems in the formatting. One I notice here is that DNA DNA gets concatenated together, whereas it should be DNA, hyphen, DNA. So let's look at Stackebrandt and see how they did it, and see why it wasn't done that way here. So there, each DNA is in curly braces with a hyphen in between. And the one that we want to fix is Goris. So come back up to Goris. Again, this kind of thing, where you're making sure that your citations and references look right, I think I have this problem with everything I do, with every reference manager I use. So we'll put in a hyphen instead of whatever it had there before. That should work. And look through these again. ASM is not going to get bent out of shape about the formatting of these, but I do like them to look good because I don't want reviewers to think that I'm a chump and I'm doing things half-assed.
Coming back to this Rodriguez one, I just have 84, which is the volume of the journal. So let's come down here and see what might be going on. Rodriguez has the volume and the number, but there's no page number, right? So this Schloss from 2009 has pages, but this one from 2018 doesn't. And if we look at the manuscript on the ASM journals website, under Info & Metrics, it'll tell you how it should be cited. So if we come down, we see the citation, right? And it's got the last part of the DOI right there. And I think I want that to be the pages.
And so I'll add pages equals, and then that in curly braces. I don't want that period, right? Yeah. So that'll do it. So maybe I'll come through and look at some of these other ones and make sure that I've got the right thing. Here's another one, from Amir. It's got the volume but not the, yeah, the paper ID. This one has eLocation-id, and I think that should be pages. So we'll put that in and keep looking through here. Again, I'm just trying to do my best to make sure these references look good and look like I've actually proofread them to some extent. And here's another one from Johnson, a different Johnson here. And here again, we don't have the pages. There's probably a way to modify the CSL file, but we don't want to get into that. Let's see if the paper has an indication of how it should be cited. What do they say? Cite like this. Yeah, so they say 5029 is the paper number, and I'll go ahead and put in 5029 there as the pages. That should work. And let's keep looking. That looks good. Looks good. Here's one from Barco up here at the top, and again, another one that needs pages, and it'll be this number here. And we're getting to the end. That looks good. This looks fine. This looks fine. Okay. Very good. So we'll save that and go ahead and make that. I think we're going to be in good shape for our references, and we've done our due diligence to proofread things.
Sometimes, as you know, different journals will abbreviate the titles of the journals differently. I'm not going to go in there. I don't care that much. I think having everything spelled out is fine. And so let's see, one of those was Barco. So yeah, now we see we have the volume and the electronic ID for that, and we're going to be in good shape. So I'm happy with how these references look. And I think they look close enough for submitting the first time around.
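The proofreading pass above mostly hunts for entries that have a volume but no pages field. As a rough aid, and purely as an assumption-laden sketch rather than anything from the episode, the snippet below lists the keys of entries with no pages field so you know which papers to look up. The `entries_missing_pages` name is hypothetical, the two toy entries use placeholder keys and values, and the parsing assumes each entry starts on its own line with "@type{key,".

```python
import re


def entries_missing_pages(bib_text):
    """Return the keys of BibTeX entries that have no 'pages' field."""
    missing = []
    # Split just before each line that opens an entry with "@type{".
    for chunk in re.split(r"(?m)^(?=[ \t]*@\w+\{)", bib_text):
        header = re.match(r"[ \t]*@\w+\{([^,\s]+),", chunk)
        if header is None:
            continue  # text before the first entry, or a stray fragment
        if not re.search(r"(?mi)^\s*pages\s*=", chunk):
            missing.append(header.group(1))
    return missing


# Toy entries: keys and values are placeholders, not real references.
toy_bib = """@article{HasPages2009,
  volume = {1},
  pages = {1--10},
  year = {2009}
}

@article{NoPages2018,
  volume = {84},
  year = {2018}
}
"""

print(entries_missing_pages(toy_bib))  # ['NoPages2018']
```

What to put in the missing field (an article number, or the last part of the DOI as in the examples above) still has to be checked against how each journal says the paper should be cited.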
I think they look close enough that a reviewer isn't going to complain that things are formatted poorly. So, you know, if you're reviewing this and you see this video, please don't yell at me about my references. Focus on the science. Anyway, a lot of these journals, like those from ASM, will have copy editors who will go through and make sure that everything is formatted properly.
Okay, so again, this is my process for inserting references. It includes, when I write the text, maybe putting in little flags to myself to remember to come back and put in a citation there, or perhaps, after I've written everything, going back and figuring out where I need citations. I then use this process with BibTeX, where I take the DOI and use it with the doi2bib.org website to get the BibTeX entry. You can also get that information if you're using a website like bioRxiv, and ASM journals have that same interface. Your reference manager can also typically export the reference information to BibTeX, so that you can create this references.bib file. The other thing that I just want to remind you that we did in manuscript.Rmd was way up here at the top. We had bibliography: references.bib and csl: asm.csl. Again, you can go to GitHub and get all the CSL files you'd ever want for your favorite journal. And also down here, we used this div with id refs to insert the references at that location. And of course, we've then made this part of our make rule for building the PDF. We've added dependencies for references.bib and asm.csl, so if we add a reference, it's going to trigger rebuilding that manuscript file. Very good.
Again, this was a big step that was kind of a glaring omission from the manuscript. I hope you didn't think that I was going to try to submit a paper without any references. That'd be pretty weird. Anyway, I know we all have idiosyncrasies in how we insert references or how we manage references.
It seems like all of the systems out there just kind of suck. I find that this works really well with R Markdown documents. Something I've done in the past is use EndNote with R Markdown. What I do is copy and paste the field code from EndNote into R Markdown, and then I'll generate a Word file. And then, when I'm happy with everything else, I can use EndNote to build the bibliography in that Word file for me. I suspect if you're using some other system, Zotero or Paperpile or whatever else you might be using, you could do the same type of thing. So you don't really have to worry about building that references.bib or that asm.csl. At the same time, it gets a little bit kludgy, and you then have to have that EndNote library with the manuscript up on GitHub if you want anyone else to be able to build the paper. So again, this isn't the perfect system, but there doesn't really seem to be a perfect system out there. That said, it works well for me for writing papers. Give it a shot. Let me know what you think of it. And maybe you've seen other ways to do this that are a little bit easier. Let me know down below in the comments, and you'll teach me something, hopefully.
In the next episode, we're going to move on to thinking about our title, our abstract, and the importance section, those things that will really sell our paper to readers whose eyeballs we want on our paper. How do we do the marketing for our paper? That's kind of a way to think about a title and abstract, right? We tell people, don't judge a book by its title or by its cover. Well, we do, right? So we want people to judge our paper by the title and abstract, so they actually read the thing, or maybe they just cite it without even reading it. Anyway, stay tuned over the next few episodes. We're getting closer to submitting this as a manuscript, and I want to have you along for the ride.
Anyway, keep practicing with these ideas. Tell your friends about Code Club. Be sure that you give a nice big like on this video so others can find it more easily. And we'll see you next time for another episode of Code Club.