It's my pleasure to introduce Professor Harry Swinney, whom you heard yesterday. Today you'll hear him again, and he will say a few words about how to write a paper. Harry has written some of the clearest and most interesting papers that I've ever read. So although I've heard him talk about this a few times, I'm going to be listening to what he has to say today as well.
Okay, so we're all involved in writing, and it's worth thinking about how to communicate effectively. I want to make some suggestions in the next few minutes. Now, this is different from proving theorems. This is one person's perspective on writing. There is no rule book, and there are no proofs of what works and what doesn't. But we all know that some papers and some talks are better presented and more readily accessible than others. And so I want to discuss some considerations for making your work more accessible to a larger audience.
Here are some typical numbers. They're not exact numbers. It's hard to get figures, but some journals publish information on the number of people who hit on the title of a paper, download the abstract, or download the full manuscript as a PDF. So you have some idea of the numbers. In the five years after an article is published, the numbers are something like this. The precise numbers should be taken with a grain of salt, but they give an idea of the response to a paper that you write and that appears in the literature. You have in it certain keywords. People are doing searches using Google Scholar, ISI Web of Science, the arXiv, and various other search engines, looking for particular terms that relate to their own interests. And for maybe 10,000 perusers of the web, using their keyword search, your article will pop up. That's not atypical. That's a huge number. When you see that, you think, wow, a lot of people are going to read my article. Well, they read the title, and then a few go ahead and read the abstract.
The title has some words in it that the reader is not familiar with: jargon or some arcane terms. But they go to the abstract. Some find it interesting and proceed. They read the abstract. They read the first sentence. And their keyword search, using Google Scholar or some other search engine, has turned up a list of 600 articles. So do they really want to read this paper? They start to read the abstract. And then out of these thousand readers, maybe one tenth will go ahead and look at the paper online, to go beyond the abstract and begin to look at the paper. Of those who look at the paper, maybe ultimately, if you're lucky, ten people will cite the paper. So you see there's a considerable fall-off in numbers here from those who first see the title of your paper, or some keywords listed in your paper that match their own interests. Of these 10,000, you may be fortunate enough to have, five years later, ten people who have cited your paper. So we want to think about ways of increasing this number ten, of having a larger audience appreciate what great work you have done.
So let's think about the title. It's just common sense to say that the title should describe what you've done. You want to have some keywords in it that others interested in the topic would recognize, so that they want to look further. In writing a title, in writing an abstract, in writing a full paper, you want to write for the broadest audience possible. I don't mean water it down and write it the way it would be presented on Fox News. I mean write for a scientific audience, but using, if possible, no jargon, or a minimal amount of jargon, and words that would mean something to a broader audience.
Of course, broad has a different meaning if your audience is that of Nature, which really is very broad, or of Physical Review B, which is condensed matter physics, or of a specialty journal in, say, the area of general relativity, where some terms would be common to that audience but not commonly known in the broader scientific audience. Now, you sometimes see titles, especially in specialty journals, that go on for 25 words, including some long chemical names and various unfamiliar terms. Try to be brief and to the point. And as I said already, and will say many times, avoid jargon and acronyms. Any field you're in will have acronyms that are very commonly used. We use PIV in our experiments. What the heck is PIV? Particle image velocimetry. Well, if you do experiments with PIV, it's part of your everyday vocabulary. If you work on quantum field theory, it is not. So, if possible, avoid acronyms. There are a few acronyms common enough that you might use them. NASA, N-A-S-A, is okay. You don't have to write out the National Aeronautics and Space Administration. But there are not many such.
So, now you have the paper. You have downloaded the paper, and you're going to read it. I want to consider the order in which people read things. And I would like to have seven volunteers from the faculty, some of the people in the back and front, to write the order in which they read things. Well, if you pass out that side, and you could pass out this side. So, this is a sheet on which you fill out for yourself the order in which you read things, and I'd like 12 participants to fill this out. By filling it out, I mean there is a number. One is the first thing you read, two is the second, then three, and so forth. And then you have different sections of a paper that are listed to the right, and you can put yours right up there on the board. Let's get some volunteers.
Please come, would you go up and put your list on the board, please? You come up. Need more volunteers, just to get an idea. Okay. Yes, please, please come. Well, let's get 12 participants. Would you please? Okay. All right. How about you? You go up. Yes, does everyone have a sheet? If you're not up there, you can fill out the sheet. Would you like to list the order in which you read a paper on the blackboard up there? Just pick one of those columns and complete it on the right side. Pick a column. Okay. So, participants, sir? You want to come on this side? I just want to see if there's a difference between the participants and the faculty. We need some more faculty. Where are the faculty? Bala. Yes, thank you for volunteering. Yes, over on the right side. On the left side, excuse me, your faculty and participants. There are plenty of columns over here. Yes, I see more here. You're on the right, correct side. Sorry. All right. It's missing more thirds. Pardon? That is some more thirds. Are they your friends? Or your enemies? Exactly. Right. Okay. Does everyone have a sheet, and are you filling it out? Done it? Okay. Let's see. You want to go up and then fill out? There are some empty columns there. We have several empty columns. We need some volunteers. Volunteer? You're a volunteer. Yes, there's an empty column there. There's no grade given. Everyone can be a winner. No grades. You want to go up? There are still some columns that are empty. All right. Let's fill in as many columns as we can. You can go fill out one of the columns, please. Okay. Okay. Yes, you have a column there. All right. You've done it? Okay. All right. Good.
Okay. So let's look here and see if some pattern emerges. There is no right order and wrong order. But it's interesting to think about, when we write a paper, the order in which people read the paper. Okay. Thank you very much, all. So we see that the first thing people read is, in most cases, the abstract. Oops. Not everyone.
Here are people who look at the figures. They look at the figures first. All right. Now let's see what's next. Does everyone read the introduction? That's B. Well, if we go across here, here's someone who reads the introduction. Remember, you wrote that paper, and you worked days, weeks on that introduction because that would really set the stage for people understanding what you've written. But look, most people skip it. Right? Here's someone who reads the introduction. But there are a lot of people who don't. Look at this. There seems to be a predominance of H's. Right? Those are figures. And if you look over the board here (does somebody have a cell phone camera or something? I'd like to have a record of this before it gets erased), I see lots of H's in the top part here. So people go straight to the figures. They read the title, they read the abstract, and they jump to the figures.
Now let's see what is after H. There are a number of G's too in the second row. G's. G's. G's. What is G? Conclusions. I would be one of the G's after looking at the figures. I would go to G. Some people go directly from the abstract to the conclusions. And many, after they've looked at the figures, then go to G. You see, G would normally be low in the order of the material in the text, but you see G's are predominantly high on the list. People jump to the conclusions to see what the authors conclude is important in the work that they've written. And you go on down. And I see some people get to, what is it, references, and then forget about the rest. Some sections are unnecessary. You spent that time writing. There are people, maybe some of your friends, not anyone in this room, who would really put references up on the first line, to see if their name is on the list, right? Okay. No one admitted that here. It means it's very important. It's important. It is. If someone doesn't find me, you'd be relevant. Yeah, well, it's clear. Their lack of understanding of what's important in the field.
Yeah, they've demonstrated it. There's no need to read that paper. Okay.
So the point is that when we write a paper, we think of people reading it in a serial manner, but that is not the way it's read. People jump around, and therefore when we write a paper, we should think, in every section we write, about making it as self-contained as we can, explaining terms or giving a reference to a table that has definitions. If we just use a lot of terms in, say, the conclusions that we've used throughout the paper, well, if they've gotten to the conclusions, you might think the readers would understand exactly what you're talking about. But no, they haven't read anything else in the paper. They've just jumped to the conclusion section, and you have all this terminology which they don't understand. And so what would they do next? They would put that paper aside and go to another one, right? They have a whole long list of papers that have come up in response to their keyword search. So why read a paper that you can't understand clearly by just looking at the conclusions? What's concluded here? And so forth. So did someone get a picture, a photograph? Yeah, for reference. All right.
So what I learned from doing this and asking people was that no matter how much care you take in writing this beautiful manuscript, one that's written to be read line by line from the first word in that beautiful opening sentence all the way to the concluding sentence, it won't make sense to a lot of the readers, because they've jumped from the first line, or maybe from the abstract, all the way to the conclusion.
So let's talk about the abstract for a moment. Think about some of the things we might want to include in the abstract. We'll be writing abstracts today and in the next few days. We'll have abstracts on the posters. You write abstracts for papers all the time. Abstracts are short. In some journals, they are a few sentences, three or four sentences.
In other, specialty journals, the abstract is more often longer, maybe eight or twelve sentences. Still, it's important to state what problem you're addressing. And then very briefly, maybe even in the same first sentence, why is it interesting? There are many problems in science which aren't very interesting. Why is your problem interesting? And then you've done something. What did you do? Did you do calculations? Did you do some experiment with a new technique? Did you do some simulations? What was your approach? And most importantly, there in the abstract, you say what you have learned that is new. What have you found? And given that you've stated what you have found, why is that interesting? Why is it remarkable? How does it fit within the framework of past work? And what are the ramifications for the future? What are the implications? What further work needs to be done? What prior work, done with certain ideas in mind, turned out, based on your work, not to be correct? What needs to be redone? What needs to be reconsidered?
Now, there's no need to make notes on this because I have a handout which has all of this. Also, this talk, like every talk, including the one Mark Shattuck gave this morning, is online. You can download it from the ICTP website. If you don't have the URL, we'll get it distributed. Every lecture given at the Hands-On School is online as a video together with slides. You see the individual. And then I have a little write-up in addition.
Now, we saw, by considering the way people read papers, that figures are very important. We are, as people in a culture in the world, storytellers. Every picture, every figure should tell a story, to some extent a self-contained story. And it should be possible to understand it without referring a lot to the text. If at all possible, you should have it.
So the figure together with a caption is a vignette, a short story within a larger story, that the reader can understand without digging out definitions and without going through the detailed text, because most readers won't. They will look at one figure, and if they don't understand it, they won't go to the text in order to understand it on their first read-through or perusal of the paper. They'll go to the next figure. And if they understand that, that's fine. But if they don't, they'll skip again to another figure, and at some point they skip again to the next paper. They'll drop your paper and go on if the figures are not clear and relatively straightforward to understand. Maybe they have to put in some effort, but the information that is critical is contained in the figure and in the caption. So the ideal figure is one that doesn't require you to even read the caption, much less the text. Sometimes you can have figures that are self-explanatory and interesting in themselves. For example, do you need a caption? Do you need a ten-page paper to describe the situation?
Now, there are a lot of people who have written guidelines for making figures, but one that I particularly like is a book by Edward Tufte entitled The Visual Display of Quantitative Information. And Professor Tufte, or Dr. Tufte, also goes around the world giving lectures on how to make figures, how to display data. I regularly get web announcements that for $300 I can go hear him lecture or something. But the information by and large is in his book, and the book is pretty widely available. It can be summarized very briefly, not the whole story, but in short: when you make a figure, you want the figure to show results, data. Not a lot of labels and great detail, not a lot of numbers on the ordinate and abscissa, but emphasis on the data. So you look at the figure, and the result, the data, pops out. And that's what catches your eye.
So you can make a graph with ten curves, one solid curve, one dashed, one blue, one red, one with green triangles, and so forth. And you make labels to explain all these differences, but it quickly becomes a quagmire, and the reader moves on. You want to make your point as simply as possible, without a lot of curves. As far as a legend is concerned, that little box over in the top right of the graph that has an explanation of the twelve different symbols that are used, that's the kind of figure where people skip to the next one. Because you're trying to figure out which points go on which curve, and at some point it becomes tedious and you just move on.
So now here's a graph by a very famous scientist, the chemist Linus Pauling, Nobel laureate, and just about everyone is familiar with Linus Pauling's work. This was a graph he had in 1947. Now I want you to think about how, given the guidelines of Tufte, this graph can be improved. So pair up with your neighbor and talk about it. I think there are probably ten ways that this graph can be improved. Same information, same data points, no new data points, but make this graph communicate to the reader better, more eloquently, more succinctly, so they look at the graph and some information about the science pops out. Okay, we'll take two or three minutes. Talk to your neighbor and list how many things you would change in this graph. Everyone should be able to list at least six things to improve this graph. If you're not sitting next to someone, move next to someone. You want to go sit next to someone there? Yeah. Make a list. Yeah. Okay, make your list of things that you would change.
Okay, so let's get some suggestions of things that Nobel laureate Linus Pauling overlooked. He should have known better. But you do know better, right? Yes, sir? Yes. The plus signs add nothing, and that's a response also to the admonition of Edward Tufte: maximize the ratio of data ink to other ink.
That other ink in those plus signs adds no information. Right? Yeah. Okay? That would be a first step. Very good. Someone else. Yes, sir? Use colors. Well, in 1947, that wasn't so. But what would you color? What would you color? Pardon? I mean, you can make it look prettier in the artistic sense, but in what way could you use color to add information for the reader, to make the figure more accessible to the reader? Yes. Say it again? Instead of dashed lines, use smooth curves? Yes. I don't know why they're dashed lines. Why not use smooth curves? Okay. That's reasonable. Dot, and contrast between dot and dash. Okay. Yeah. You have the dots of the data points. That's the one thing we do not want to remove. Yes, sir? There are some different sets of points. But the result will be in the graph. Yeah. Okay. I don't have an answer. Why does this dashed line not go up? It goes up here, but not on the left side. I don't know. Pardon? Do you see any relationship between the numbers identifying the peaks? Maybe you could put the numbers down next to the peak, corresponding to the value of the atomic number at that peak. Okay. Maybe you could put the particular element corresponding to the peak. Good idea. Good idea. Yes? You can have four or five reference values and a larger font on the axis as well. Right. You could have a larger font, and you certainly could get rid of, say, all these odd ones, 10, 30, 50, 70, 90, or you could even maybe have just zero, 50, and 100. But certainly fewer numbers on the axis, and the same on the ordinate. Right? Yes, sir. Why didn't he put units? That's a good point. You teach your students to put units, right? What's the volume? Here, this is a volume. What's the unit of that volume? I don't know what the unit is. I thought it might be cubic Bohr radii, but it didn't work out. I don't know what the units are. Does anyone have an idea what the units might be? You should have put units, right?
And as we said, on the horizontal axis, you could have fewer numbers: zero, 20, 40, 60, for example. And I would certainly, since it's to be read in an article, put the numbers horizontal so the reader is not doing like that to read the numbers. Here, it's pretty simple, but just as a matter of practice. Anything else? Yes, sir. Good. If you're reading in a journal or on a web page, and most of the time now we read on a monitor or a laptop or something, why not make this horizontal? Put it up higher. Atomic volume. Agreed. Others. Yes, sir. Yes, there's this extension here. What does that mean, that dashed line there? I don't know. This dashed line goes up here. Right. Okay. So, those are very good suggestions.
And here is what Tufte did to improve it. And we made some further suggestions, which I think are good, that Tufte didn't pick up on. But as was suggested, label the element that is at the peak, and you immediately recognize that these are the alkali metals, right, that are the peaks. And you have this section right here, which is different, the rare earths. And notice that he's gotten rid of the odd 30, 50, 70 here. He didn't put in units; he didn't know what the units were either. A graph in unknown units. Sometimes people put arbitrary units, which is okay. Anything else we notice from this? It's so much cleaner. Yes, sir? He ignores the origin, and he makes a point of that. You don't need the origin. Well, I have to disagree. I like to see where the zero is, because sometimes people offset the origin from zero. But Tufte says you don't need that. Another thing that he emphasizes is you don't need the, oops, you don't need the side and the top. See this bar across? That's extra ink that doesn't add anything to the understanding of the data. I tend to put in the side and the top, but it's a reasonable point. It's more ink that is not contributing to the data. Okay. So this is a graph you can look at and immediately interpret.
You learn something by seeing this graph. Okay. Now Tufte has an example of a graph which is not as good. Okay. Yes, he's rotated it. Yes, but I still like the horizontal. I mean, the axis label is often written vertically. Particularly if it's long and you have units after it, then it's necessary, because you don't have the space out to the left. But you can frequently write it horizontally like this. Any other comments about this graph? Ways to improve it?
So let's go to the next graph and see what you learn from looking at it. See if you can figure out what the author wanted to present here. Time is moving on, so we're not going to spend much time. This is clearly a junk graph, right? So let's get some suggestions of things that are wrong with this graph that you would change. The labels, yes, are very tiny, and you have 28, 29; what are these numbers representing? Age structure of college enrollment. So these students in the bottom part are under age 34. And these are the years 72 through 76. What about the color? How much have you learned from the color? Nothing. It's gratuitous color. It adds color; it adds nothing, right? Now, do you see any similarity between the bottom and the top? They contain the same information. It's a reflection, right? There is no information there. So there's unnecessary three-dimensionality. Yeah, it's subtraction, OK. You're right. Well, and we've got this three-dimensionality, which adds nothing. And we have numbers like this 28 connected to the 29 with this curve. It curves, and then there's a flat line. And then there's a curve and a flat line. But that's not based on any evidence. That's an artist's rendition, making these flowing curves, right? The top certainly duplicates the information in the bottom. The fonts are too small, we said, and so forth. So let's think of ways these same data could be presented so that they are more quickly accessible. And you could do it with a table.
There are only five points, five coordinate pairs, right? Five pairs. So this is the percent of college students over age 35 through some period, these five data points. That's it. So you really don't need a graph. You could have a table, or you could have a graph, if one wants to have a graph. OK.
Now, this is a nice image I like, a figure that was made by Paul Umbanhowar. To make a nice graph takes time. You draw it over and over and over. And the example I'm going to give here is this picture that was made more than 20 years ago, when experimentalists used cameras with film, Ektachrome film. Paul Umbanhowar, who took this picture, set up on a Monday morning to make it. And he spent the whole day Monday moving the lights around to take a picture of this oscillating structure. I showed a movie that Mark Shattuck made in which it oscillates up and down. This is a snapshot in time. And you can see these bronze balls here flying in the air, but with a basic structure. And the lighting wasn't quite right the first day. In those days, we took photographs during the day, all day, and the camera shop closed at 5 p.m. If you got the roll of 36-exposure Ektachrome film into the camera shop by 6 p.m., they would have it developed by 8 a.m. the next morning. So he goes by the camera shop the next day. And the pictures weren't so good. He did that for five days. He spent a whole week obtaining that figure. But it's a nice figure, and it ended up on the cover of Nature and in the New York Times, in Scientific American, in the American Physical Society calendar, and in various other places. You make a nice figure, and I mean this is an image, but the same goes for a nice graph, and it will be reprinted by other people who find the information useful and illustrative. So we could talk about the graph itself, but it's simply enabled. And there is a more recent graph.
It's not exactly following the guidelines of Tufte, but my colleague Michael Marder made this and won some award for it. Fracking, hydraulic fracturing, is widely used in Texas, and they want to know, after the first yield, the first few months or first year, how much gas is going to come from a well. Can they predict it? In this analysis that Marder did with his student and a colleague, he was able to take the data from 6,000 wells from this particular shale, and he did it for other shales. He has 100,000 data points or so from the Barnett Shale, the different shales that are being explored these days. And he found that within a short period of time he could predict, with some uncertainty and with appropriate scaling, and the key was the scaling, the long-term total yield of gas. That's used all over the petroleum industry now.
Okay, now to writing. We're talking about papers, where you do have to have writing, but figures are very important, and figures are easier to interpret than writing. The eye was developed after the great dinosaur disappearance 65 million years ago. Then various species developed eyes. About 25 species, including our ancestors, developed eyes that are similar to ours, similar in structure, so it's a very efficient kind of light detector, image detector. So 65 million years ago: that's millions of generations that the brain has had to interpret images. So we're very skilled in interpreting images. That's one reason why figures are so important. We're much less skilled at interpreting symbols and making meaning of a set of symbols, that is, writing, which was developed relatively recently: 5,000 years ago, a short time in history, by the Egyptians and Sumerians. And even after it was developed, most people were not able to read. Even in 1850, just five generations or so ago, only 10% of the world could read. Most of us in this room, including myself, had ancestors at that time who were not able to read.
So the development of the brain for interpreting the symbols that we read in a journal article is much less than its development for recognizing that there is a bear behind you about to attack you. You've got the peripheral vision. You can see that bear. But interpreting symbols is much more difficult, and that's why we look at figures. Most of us look at figures first. So only a few generations ago did people begin to read. Right, widely.
Okay, any paper, if it's scientific, must be reproducible. I'll skip through this quickly, but that means it must have all of the information that is necessary for another person to reproduce the results. Otherwise it's not science. Right, if you do simulations and don't give the initial conditions on some nonlinear system like many of us study, then different initial conditions can lead you to different attractors, different behavior. If you have a theoretical analysis and you make certain assumptions and approximations, you have to state them. Recently we were reading a paper we could not understand. We read it for months. There was a change of variables where the new variables used the same symbols as the old ones. And one of my colleagues, Philip Morrison, after looking at the paper multiple times, realized there was a change of variables between one line and the equation that followed from that line in the next. And the author never stated that. The work must be reproducible. If you have detailed sample preparation, you have to describe what was done.
References are important. The American Physical Society has made a statement about it, and I subscribe to this 100%: omission of a pertinent author or reference is unethical and unacceptable. We all have access to Google, and we can find the relevant articles. If a person has written on the same subject but you don't like their work, or you've had a conflict with that person, that does not give you permission to omit them as a reference.
Full referencing is ethical and required. Oh, yeah, and you should reference other works. Here's an example that I showed before. And we were surprised to see that the cover of this magazine, a journal, every month has a picture that looks rather similar. It's grayscale, not color, and it's been flipped over. See the two particles there? But no one ever asked us for permission to use that.
So, now I'll skip this. I think there's far too much emphasis on journal impact factor. Journal impact factor is based on the citations in the first two years, but for a typical article, 6% of the citations over the long term are in the first two years. Some of the papers from our group had only a few citations in the first two years, and a couple of papers that I mentioned yesterday in a talk have thousands of citations, one of them 4,000 and another 2,000 citations. But the citations didn't come until a decade later, or even later. Take Einstein's papers in 1905, the famous series of papers. There were very few citations to those for a long time. It took a while for people to appreciate something that was different from what they knew. Now, if you publish a paper on a new cure for cancer, in Nature, and it's a cover article, you'll immediately get a huge number of citations. There's a large body of scientists working in the field, it's of interest to many people, and it will be rapidly cited, and maybe not much later. But many times, work that is really innovative is not cited very much in the first few years. Those are just a couple of reasons why I think impact factor has no significance. I'm not concerned about impact factors.
So I'm through. The most important thing is to write and rewrite and rewrite. Ask others to critique your manuscript. Read the text aloud yourself, or read it to a friend, or get them to read it. When you read it aloud, you pick up errors and ambiguous expressions.
You pick up transitions from one paragraph to the next that make no sense. And lastly, you should cut extraneous material. You spent six months working on an aspect of the experiment that didn't pan out, or it turned out to be a blind alley. All that effort, and you feel like you have to say it? No, you don't. Cut it out. Make the paper as concise as you can.
So writing a good paper is not something that you do quickly. It takes time. It is really, I feel, an integral part of the research. In writing a paper, you find, or at least I find, that some of our arguments have holes in them. Some results we present are incomplete. You learn a lot from the writing process, and it takes time. It's important to start early putting together the paper and thinking about the story. Coming back to the idea that we're storytellers: a paper should be a story. It should unfold like a novel unfolds, and at the end you have this beautiful result. Thank you.