Why do you think anyone is going to read and cite your paper? I'm not being mean, I'm not trying to be negative, but really: out of all of the bazillions of papers that are out there, why is someone going to read yours? The odds are good that if they're reading it, it's because you had a catchy title, or your abstract had some kind of hook that led them into reading your introduction and the rest of the paper. And because you wrote a solid piece of research that made key contributions to the field, you're now going to have a big impact. But that all started with a compelling title and a solid abstract. That's exactly what we're going to talk about in today's episode of Code Club. Hey folks, I'm Pat Schloss, and over the past 20-25 years I've published more than 100 peer-reviewed manuscripts. All of those manuscripts have had a title and an abstract, as well as, hopefully, solid science behind them. Well, I'd like to think the titles are compelling and the abstracts are well written. The reality is that I'm not so convinced that my abstracts or titles are anything special. So as I was putting this video together I decided I'm going to be really intentional about how I think about putting together my title and my abstract, so that I can give you good advice on how to do the same. It's good to think of your title as a breadcrumb leading to the abstract. We've all lived in the internet age for decades now, and we've heard of clickbait, right? Clickbait, of course, is the idea that you've got a title that induces you to click on a link, and then you get to that link and you realize it really has nothing to do with the title. We don't want to do that, but we do want to have a compelling title that a reader sees as they're searching through Google Scholar or PubMed or some other search engine for scientific publications, and they say, wow, that's an interesting title. I want to learn more about that.
That's the first step, right? A compelling title gets you into the paper, so to speak. It brings you to an abstract, and as you read through the abstract you say, wow, this is really compelling. They've got some interesting results. I want to learn more about their motivations, their questions, and how they did what they did, right? So the title and the abstract are the last sections of the paper that I work on when I'm working on my manuscript. If you've been tracking previous episodes of Code Club, you know that the abstract and the title are two elements of the manuscript I've been working on that I haven't fleshed out yet. Yeah, I have a title in there. It's fairly tongue-in-cheek, a play off of another very provocative title that's in the literature. I'm not going to use that because it's kind of borderline plagiarism, and I don't really like the original title at all, so I don't really want to riff off of it. We'll come up with our own title and I'll show you how. So instead of starting with the title, I'm going to start with the abstract today, and before I start writing I want to know what the journal expects my abstract to say, how long it expects it to be, and whether it expects there to be any type of structure. My plan is to submit this paper to mSphere. As you read through these instructions to authors, I know this document is really long and dry and tedious, and it's like, who reads these things? Well, I'm telling you, read it, because it's got a lot of great information in here to understand what they're looking for. As I'm scanning down through here I'm looking for the abstract, and what I see is that mSphere and all of the ASM journals have a two-part abstract. They have abstracts consisting of two sections with their own headings, an abstract and the importance, and they're going to be published together. If you do a PubMed search and find papers published by mSphere or any of these ASM journals, you'll find both sections are present.
They give you a sample structured abstract for guidance. The abstract section should be 250 words or less and should concisely summarize the basic content of the paper without presenting extensive experimental details. The importance section is a 150-word non-technical description of the significance of the work. I have had a horrible time trying to figure out how to write an importance section in the papers that I've previously published in ASM's journals. I think the key thing to focus on is that the importance is a non-technical description of what's going on. So, 250 words for the abstract, 150 words for the importance section. One of the nice things is that mSphere gives me a link to an example structured abstract, and it brings us to this paper, "Single nucleotide polymorphisms in regulator-encoding genes have an additive effect on virulence gene expression in a Vibrio cholerae clinical isolate." Really nice: descriptive, direct, and it tells you exactly what they found in this paper. They have the abstract, which I can see has a lot of technical jargon, a lot of gene names, and a variety of abbreviations. I would try to minimize the number of abbreviations in there. But then as you look at the importance section, we see this is written in a language that most college-educated people, I think, could read and understand. It doesn't expect you to be an expert in the field to understand what's going on. There isn't a lot of jargon in here, there are no abbreviations, and it's obviously not technical. So I think this gives us a good framework for thinking about how we write the structured abstract, where we have the abstract that's more technical and the importance section that's non-technical but really highlights the significance of the work. One of the things I like to see is that the final sentence of the importance section here starts with "The significance of our research is..." and they can then complete that sentence.
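The two limits just described (250 words for the abstract, 150 for the importance section) are easy to check before submission. Below is a minimal Python sketch of such a check. It is not part of the video or of the journal's tooling: the function names are hypothetical, and counting whitespace-separated words is an assumption, since a submission system may count words differently.

```python
# Limits as stated in the instructions to authors described above.
ABSTRACT_LIMIT = 250
IMPORTANCE_LIMIT = 150


def word_count(text: str) -> int:
    """Count whitespace-separated words (an assumed counting rule)."""
    return len(text.split())


def words_over(text: str, limit: int) -> int:
    """Return how many words the text exceeds the limit by (0 if it fits)."""
    return max(0, word_count(text) - limit)


if __name__ == "__main__":
    # Stand-in drafts; paste the real abstract and importance text here.
    draft_abstract = "Replace this with the abstract text."
    draft_importance = "Replace this with the importance text."
    print("abstract words over limit:", words_over(draft_abstract, ABSTRACT_LIMIT))
    print("importance words over limit:", words_over(draft_importance, IMPORTANCE_LIMIT))
```

Reporting the overage rather than a simple pass/fail is a design choice: it shows how much still needs to be cut during editing.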
And so I'm going to make a mental note to myself that I want my importance section to have a sentence that makes it crystal clear what the significance of my work is. Excellent. Another tool that I'm a big fan of, as I already mentioned, comes from Nature Publishing. They don't call it an abstract, they call it a summary paragraph. It's the first paragraph of their papers, and they have a template for what they're looking for in that first paragraph when they receive a paper. If you recall, a number of episodes back I talked about the "and, but, therefore" framework for writing: you have a series of observations that are connected by "and", then there's a "but" statement that increases the conflict or identifies a hole in our current understanding, and then a "therefore" statement that resolves the conflict and says therefore we went off and did this, or therefore we did this experiment and found blah, right? And so this is really laid out as a series of two "and, but, therefore" threads.
So: one or two sentences providing a basic introduction to the field, that's an "and", right? Two to three sentences giving more detailed background comprehensible to scientists in a related discipline, that's another "and". The other thing they're doing here is a funnel, right? Two sentences giving a basic introduction that's sensible to any scientist, and then more detailed background for people in a related discipline, so they're really bringing in that funnel. Then the "but" sentence: one sentence clearly stating the general problem being addressed in this particular study. Then the "therefore", right? "Here we show", or something like that, summarizing the main result. So there we have our "and, but, therefore", where again we're funneling the reader to the result. Then they give you two or three more sentences explaining what the main result reveals in direct comparison to what was thought previously. Another way you might do this would be to detail the results a little bit more. And then one or two sentences putting the results into a more general context. There might not be another "but" here, but this is kind of a "therefore" that again puts things into the general context. Finally, they have two or three sentences to provide a broader perspective, readily comprehensible to any scientist in any discipline. This is a lot like our importance section for ASM journal papers. The other thing that we're starting to see more of, I think more in publications from Cell Press, is a visual abstract: a picture to give a non-technical overview of what happened in the paper. So I'm going to use this outline, this template, to create the abstract for our paper. To do that, I'm going to go to my project root directory and open up my manuscript. I'm going to do this in my text editor rather than RStudio; it doesn't really matter where you do it. I'm going to come down to my abstract section, and you'll see that I have it just kind of empty here with the number
of words, and I'm going to paste in some bullets. I've got these double pound signs for the instructions from the Nature outline. I'm then going to move this last one down to my importance section. The next step, after I've got this template inserted, is to flesh it out with some bullet points. What I might do is grab sentences that I've already written in the manuscript and drop them in here. I can then flesh those out, maybe modify them, pull them together, and edit, and I'm going to do that over a series of steps. One mental image or model that I have is this: I don't know if Microsoft Word still has this feature, but they used to have a feature called "executive summary" or "make an executive summary", where they would take your term paper or your document and basically pull out the topic sentences of every paragraph and plop them together as an abstract. That's kind of what I'm thinking about here, but again trying to fit it into these different categories. Of course I'm not going to do that with the topic sentences, but that's just a mental model I have for what this template is going after. So again, one or two sentences providing a basic introduction to the field. I might put something in there about what amplicon sequence variants are and what the controversy with OTUs is. I guess that might instead go in here, right, in the two or three sentences with more detailed background, whereas this first sentence or two might outline the importance of 16S rRNA genes and what they've done for microbial ecology and microbiome research overall. And then I could maybe go into saying what OTUs are and what ASVs are, and then point out that there's this conflict over a movement, and the popularity of ASVs, to supplant OTUs. And then I could say the main result that we found is that if you use ASVs, you're going to artificially split genomes into multiple different clusters. So I'm going to work on that, and I'll come back and show you what I've come up with.
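The template-pasting step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the prompt wording is paraphrased from the Nature outline as read out in the video, the `##` marker follows the "double pound signs" mentioned, and the names `NATURE_OUTLINE` and `build_skeleton` are hypothetical.

```python
# Prompts paraphrased from the Nature summary-paragraph outline read out above.
NATURE_OUTLINE = [
    "One or two sentences providing a basic introduction to the field",
    "Two to three sentences giving more detailed background",
    "One sentence clearly stating the general problem being addressed",
    "One sentence summarizing the main result ('Here we show...')",
    "Two or three sentences explaining what the main result reveals",
    "One or two sentences putting the results into a more general context",
    "Two or three sentences providing a broader perspective",
]


def build_skeleton(prompts, marker="##"):
    """Return a text skeleton: one marked prompt line per element,
    each followed by a blank line to be filled in with draft sentences."""
    lines = []
    for prompt in prompts:
        lines.append(f"{marker} {prompt}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_skeleton(NATURE_OUTLINE))
```

Keeping the prompts on their own marked lines makes them easy to delete later, once the draft sentences have been pulled together into a single paragraph.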
Okay, so in many cases I've taken sentences from the main body of the manuscript and used them to address the various elements of the template. So, one or two sentences providing a basic introduction to the field, comprehensible to a scientist in any discipline, right? This "16S rRNA gene sequencing is a powerful technique" sentence is the very first sentence of my paper, so I'm kind of plagiarizing myself. You'll see that as we massage this, bring things together with transitions, and edit to get it down to 250 words, it's not going to seem so redundant between the abstract and the rest of the paper. Anyway, I've gone through and filled key results and key statements into this template, to a point where it's quite long but has all the information, right? If I highlight all this, my word counter will tell me how many words I have. Of course it's going to include those template lines, and at the bottom it tells me I have 429 words. Maybe 100 or 150 of those words are these bullets that I'm filling in. So we've got too many words, but it's better to have too much than too little, because then we can still cut and prune and edit, right? In my importance section, which I'll show you here, I again have some bullet points that I think are general and non-technical. I've got "16S rRNA gene sequences", and that's a little bit of jargon, but I feel like I have to include that, and ASVs and OTUs, because that's really what the paper is about. I only have 150 words, so I can't break it down too much further than that. We'll see, but again this is quite long. How many words do we have here? We have 159 words, so it's close, but we'll want to pull that together. The next step is that I'll go ahead and remove these bulleted titles, pull it together, and start editing. All right, so I've removed
those bulleted titles and I've concatenated my various elements of text together. In my abstract here I have 320 words, so I'm 70 over the 250. That is a hard threshold. I've experienced this before, where I had 251 words and the submission system would not let me proceed. So it has to be 250 words for the main abstract and 150 for the importance, and this is 70 words over. The other thing you'll notice in here is that I've got some placeholder text to describe things like how many genomes were in the database that I used and how many species there were, as well as this information about the thresholds. I can fill that in with R Markdown, but for now I'm just focusing on the things that you can do in any platform, whether it's Microsoft Word or any other word processing software. I'll come back later and add in the R code to populate those values. I've done the same thing with the importance section, and here I'm at 159 words, so I'm a little bit long. What I'm going to do next is think about what I can cut and what I can consolidate to make the package a lot tidier. I look at the abstract paragraph and I notice that this is kind of my "but" sentence, right? "However, ASVs and the use of overly narrow thresholds to identify blah blah blah creates the risk of splitting..." But I've got all this text up here that is introductory material, and that's 85 words. If I could reduce this down to one sentence or two short sentences, I could save a lot of text. The other thing is that I've got this sentence here that is my general statement of what I found, but that's also using up a fair amount of text, about 26 words. I could perhaps forgo that sentence and instead emphasize the results a little bit more with the actual values. So there's a variety of things I can do to tidy this up and get it to be shorter and perhaps more impactful. I'm going to keep editing, and I'll show you what I come up with next. So here's the
abstract that I came up with. It's at 249 words, so it's just under the threshold, and I think this reads pretty well. I think it's compact and it has this funnel. It gets me to the important results about the problem of splitting an individual genome into multiple bins when we use ASVs or overly fine OTU definitions. My importance section is at 147 words, so it's under the 150-word limit. I also have the significance sentence: "The current research is significant because it quantifies the risk of artificially splitting bacterial genomes into separate clusters." Okay, I think this is going to work. I'll still do more editing of my abstract and importance section as I proceed. I also need to go in and plug in the values for these placeholders. I'll do that later using R Markdown. You've seen me do that in previous episodes, so I'm not going to belabor the point here. We've got a good abstract and importance section, I think, so the next thing I want to take on is the title. I'll go ahead and remove that "250 words" because I know I'm in good shape. The title that I have in here is a placeholder. It was a tongue-in-cheek play on the title of a paper that I'm responding to, and my version of the title was "ASVs should not replace OTUs in marker-gene data analysis." The original title, I think, was "ASVs should replace OTUs in marker-gene data analysis." I don't want to be accused of plagiarizing their title. I also don't really like the original title, because I don't think the data in the paper really did a good job of backing it up. As I've shown here, there's at least one element of their argument that I think I've shown is incorrect: again, this problem of artificially splitting genomes into separate bins. The other thing that we'll have to come up with is a running title. If I go back to the instructions to authors, let's see, where was that, and if I do a search for "title", see if there's
any information about titles. So here we go: the title, running title, and so forth. We need a title, and each manuscript should present the results of an independent, cohesive study. So we don't want a series of titles like Pat's paper 1, Pat's paper 2, Pat's paper 3. Each paper should stand on its own. Let's see: avoid the main title/subtitle arrangement, avoid complete sentences and unnecessary articles. We want to include the title and the running title, and the running title should not exceed 54 characters and spaces. This is an example of the running title here. The document is called the Instructions to Authors, and what I have highlighted here is where that running title goes. If you look at the top of any page, you can see an abbreviated version of the title, and that has to be no more than 54 characters. They also have a sample title page. Mine is pretty close to that, and I think we're in good shape. So we need a title and we need a running title. There's a variety of thoughts on how to write a title. One thought is that you should explain or describe what was done. This goes back to what we talked about previously with figure legends or figure captions: do you want them to be descriptive, or do you want to tell the reader what they should see in the figure? I think we should tell the reader what they should see in the data. Another thing that I try to avoid in my titles is a description of methods. The result I find should be true independent of the method, right? So I don't want to say, you know, "Characterization of the human microbiome using Illumina MiSeq data." I mean, that's kind of a crappy title anyway, but I certainly wouldn't want to include "using Illumina MiSeq sequencing", because the result I find should be independent of the sequencing platform that I used. Now, if I'm developing a new method using the Illumina MiSeq platform, that can be in the title,
but otherwise I don't want my title to include the method, right? So I wouldn't say "ASVs should not replace OTUs in marker-gene analysis based on analysis using data from the rrn database project", or whatever it's called. So I don't want to include the methods. I also want it to be direct and tell the reader what they're going to learn when they read this paper. I don't want it to be generic, like "Analysis of intragenomic variation of 16S genes in bacterial genomes", because that tells you what we did but it doesn't allow me to tell you what I found. And again, if I want you to click on my paper, I want you to know what I found so you're eager to read more. So what do I do to come up with a title? Well, the first thing I do is come up with a list of possible titles, so I'm going to go ahead and brainstorm. One of the things that I like to do is to have a document where I keep track of a bunch of titles that occurred to me over time. As I'm doing whatever in my life, I might email myself a title idea from my phone, and I'll create a document with all those title ideas. Some of them might be variations on each other. Then I'll go through and try to find patterns, maybe come up with new title ideas, and then I'll refine it further. So let me come up with a list of ideas for possible titles and we'll go from there. So I came up with a list of maybe 10 or 12 different titles. They're all variations on a theme. Some of the themes that popped out to me, that I wanted to be in that title, were things like "operational taxonomic unit", "amplicon sequence variant", "artificially splitting", and "intragenomic variation". Those are phrases or themes that reappeared repeatedly, and so if those terms don't show up in my title, I want to be sure that they are included as keywords when I submit the paper. So again, this first
title is an example of what I think of as a boring or overly descriptive title: "Analysis of the intragenomic variation among bacterial 16S rRNA gene sequences." Again, it doesn't tell you what I found. It tells you what I did, and I want you to know what I found. This next one was the placeholder title that I used, and now that I think about it, this might be a good tweet, as mSphere and other ASM journals, and other journals now, are inviting you to provide a tweet that you might use to help advertise your manuscript. This is a little bit provocative and might be a good way of helping to sell the paper. Someone on Twitter might see that and be like, oh wow, that sounds like it's on fire, I'm going to go check that out, right? So who knows. And then you can see, as I scroll through this list, a variety of different titles. These two are similar to each other: "...split bacterial genomes into different" or "into separate units of inference." "Units of inference" is kind of wonky here. These start with "Adoption of amplicon sequence variants blah blah blah", and you can see I've got "artificially splitting" in here, right? Going through here, one of the things that I noticed with these titles is that for the most part they were negative. They're kind of bashing amplicon sequence variants, and so I wanted to write a couple of titles that were positive, right? So, "use OTUs because they are good", not "don't use ASVs because they're bad." I came up with a few down here, these last four, that I saw as being more positive. I don't know; negative sells, unfortunately, but I wanted to have these in the mix as part of my overall list of potential titles. So the next step, once I've got these dozen or so titles, is that I want to refine them further and maybe get down to a handful of titles that I could then shop around to other people. So let me take this list, spend some time with it,
and then I'll show you the final list that I came up with. Okay, so I came up with seven titles. Two of them were the ones I said I didn't like: the first one was that overly descriptive, boring one that says what we did, not what we found, and the second one is that placeholder one that would probably be a better tweet. Then three through seven are what I thought were the best of the different categories of titles that I had. What I'm going to do is take these titles, put them into my lab's Slack, and ask the people in my lab what they think is the best title. This is something that I would encourage you to do: pick a couple of titles, share them with your friends or colleagues, and see what they think is the best title. So I'm going to go put this in Slack, give people in my lab a few hours to mull it over, and we'll see what they thought was the best title. I let my survey run with my lab for a few hours. One of the things you can do in Slack is what I did here: I put in my numbered list, then I put in seven reactions, the emojis for each of the numbers, and then invited everyone in my lab to choose the two titles they liked the most. Six and seven didn't actually get any votes. Those were the more positive versions of the title, which is interesting, right? The negative sells better. People kind of liked the provocative one, number two. Somebody liked the more descriptive version. But what really resonated the most with people in my lab was title three: "Amplicon sequence variants artificially split bacterial genomes into separate units of inference." Again, that "separate units of inference" is a little wonky, but I think that works pretty well. I could say "into separate clusters", but I think "units of inference" is good. We might come back and change that to "clusters", and I might ask people again which they prefer, "units of inference" or "clusters", but for now I'm pretty
happy with that, so that's going to become my new title. I will come back to my manuscript and put that in here, and we're in good shape. Now, my running title has to be 54 characters or less, and this is 96 characters. So let me put this down here for my running title, and maybe if I do "ASVs artificially split bacterial genomes into separate units of inference", that's 74 characters. What if I did "clusters"? What do I get there? That gives me 64 characters. Maybe I'll do "ASVs artificially split bacterial genomes", and that's good. This comes in at 41 characters, and I think that'll be a good running title. You know, I'm not going to get my paper accepted or rejected because I have an amazing running title. I think it's purely functional, for when people print out the paper as a PDF. So that's great. Okay, so I have my title, my running title, my abstract, and my importance section, and I'm really happy with how those look. Now the final thing that I want to do is create a document of keywords. Whenever I'm submitting my paper, I always forget to come up with a list of keywords beforehand, and when I'm submitting the paper I'm in such a rush that I'm just annoyed that I have to come up with keywords. But again, if we're thinking about search engine optimization, we want to think about coming up with a good list of keywords. In general, we don't want our keywords to be in the title, and they don't necessarily need to be in the abstract. I don't really know how Google Scholar's or PubMed's engines and algorithms work, but a keyword probably doesn't need to be in the title. So let's make a list of keywords. So: ASV, as an abbreviation for amplicon sequence variants; OTU. Have I got 16S? I don't have 16S, so let's put in "16S rRNA gene", and let's do "16S rDNA". I hate that phrase, but people search for it, so let's include it. Let's do "microbial ecology", "microbiome", and "microbial communities", and
let's do "bioinformatics". This might be a list that I curate over time, kind of like the title ideas, where I ruminate on it and think of other keywords that come to mind. I think 10 is probably as many keywords as I want to include. Something I might do, actually, is go ahead and google some of these keywords and see what comes up. So why don't I do that? Let's go and google "ASV OTU" and see what types of papers show up. We see "OTU versus ASV" shows up for Zymo Research; this exact sequence variants paper that I'm kind of reacting to; "amplicon sequence variants"; "ASV versus OTUs"; "OTU versus ASV". So there's a variety of this kind of "OTU versus ASV" phrasing. I think what I'll do is actually use that as one of my keywords, because when people search for "ASV OTU", I want my paper to be at the top. So let's try "OTU versus ASV", and maybe "ASV versus OTU". I don't know that flipping the order really matters, but we'll see. So I think that's a good list of keywords. As I'm continuing to edit and think about the paper, I might revise this list. Okay, so I'll save that as keywords.md, and this is now located in my submission directory, so I'm in good shape. All right, so we've done a lot today, and this is really important stuff. Again, this is creating the funnel of how we get the people we want to be interested in our paper to get to our paper. We give them a compelling title that's direct, that tells them what's really important. That leads them to the abstract, which is going to be a funnel from broad interest to more narrow: identify the problem, show them the results, show them a little bit about how we found that result, and then give them the upshot, the "therefore" statement of why this was important. And then hopefully, if they don't stop and read the importance section for a more general statement of significance, they go on and read the rest of the manuscript. So hopefully we've done
everything we can to bring eyeballs to this paper, and time will tell whether people read it, whether people search for it, and whether it comes up in search. But, you know, we can do our best. What's more important than titles or abstracts or keywords or running titles or any of these things is that the research is really sound. I really want to emphasize that. I do, though, definitely want eyeballs on my papers, and as much as we tell people don't judge a paper by its title, don't judge a book by its cover, we all do, right? A good title, a provocative title, is going to bring eyeballs. I'm confident that the science here is solid, so I want to do my best to bring eyeballs to my paper so that people read it, so people cite it, and so people are influenced by it. I want to have influence, and getting people to read this, I think, is going to help bring that influence and get people to think more deeply about how they're using these different methods, ASVs versus OTUs. Anyway, I hope you find this useful. I hope these are some tips that you can use regardless of what you're studying and regardless of the type of work you're doing. I think it probably extends even more broadly than scientific publishing. You know, writing a blog post: how do you pick a good title? How do you pick a good paragraph, a good hook to get someone into the paper? In the next episode we'll continue on our march towards getting this paper submitted. We'll go over my tips for how we edit a manuscript to get it polished, really tight, and to be a good package that we're proud of and ready to submit. Anyway, keep working on your own writing. Let me know down below in the comments what some of your tricks are for picking good titles and abstracts. I'd love to hear what you come up with. Please feel free to share your own expertise, and we can all learn from each other. Anyway, until next time, keep practicing with these concepts, and we'll see you for another episode of Code Club.