My name is Jacinta DeSanto. I am a core developer for Galaxy, working at Penn State University. I've been on the team for a little over a year now, and today I'm going to be stepping you through the rule-based uploader tutorial.
The rule-based uploader allows you to upload datasets or collections, depending on what you have, and apply rules to them as you upload them. Rather than uploading them as plain datasets and then applying the rules afterwards, you can modify them as they're being imported. This can save you time, and it can be a little more efficient, especially if you're doing the same thing over and over again. Hopefully that gives you some background on what we're doing and why.
All right, so this is the actual tutorial, and this is what it should look like. We're just going to go through some of the basic things that you can do in the tutorial. One thing I want to draw your attention to is these data blocks with this little copy button. When you click this copy button, it automatically copies the data, and then it's very easy for you to just Ctrl+V, or paste, the data into the upload box. That's a really nice feature, and we'll be using it a bit. So you'll want to have the actual tutorial open in a tab so that you can quickly copy the data like that.
All right, well, let's get started. This is the first chunk of data we're going to be working with. I'm going to copy that now, and then we're going to head over to Galaxy and upload it into the rule builder. The first thing you might want to do is create a new history. I've already done that, but I'm just going to rename it "rule-based uploader", and then it'll all be in one place.
From this main page, over on the tool panel, there is a button that says Upload Data. It looks like a little arrow pointing up. If you click that, you'll see the uploader, and it should start up for you like this. We don't want the regular uploader, this time at least.
We want the rule-based uploader, so you can just navigate to that tab. For this first one, we're uploading the data as datasets, and it's from a pasted table. So we're just going to paste what we copied from that copy block.
Now this is what the rule-based uploader looks like. You can see that it has broken our data into these different columns. We have a column header row here that we're going to be getting rid of. And we have a warning up here, which is what is stopping us from clicking Upload. When you see that this warning has disappeared, the Upload button should be blue and you should be able to click it. But right now it says that it's disabled because the data is not valid yet.
So the first thing that we're going to do is get rid of this first row. We don't need it. It doesn't have any data that we actually need, and if we tried to use it, it doesn't have a URL, so it would just break. So we're going to, let me show that again, go to Filter, then First or Last N Rows. We want to filter the first row, just one of them. If you had a number of header rows at the top here that you wanted to get rid of, you could remove more than one, but we just want to get rid of the first. That's the only one that doesn't have real data. We'll click Apply.
The next two things we're going to do to our data are add a column definition for the name and add a column definition for the URL. Our name is going to be column C, and the URL is going to be column D, the one that looks like a URL. The URL is actually what we need in order to move on. So we're going to go to the Rules button here and click Add / Modify Column Definitions. From here, as I said, we're going to add a name, and that name is column C. We can apply that. And then we can do it again: add a definition, URL, and that's column D.
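To make those two steps a bit more concrete, here is a minimal Python sketch of what "filter the first row" and "define columns C and D as name and URL" amount to. This is only an illustration and not Galaxy's own code: the pasted table, the example.org URLs, and the helper names `parse_table`, `filter_first_rows`, and `apply_definitions` are all made up for the example.

```python
# Hypothetical pasted table: one header row, then tab-separated data rows.
# The accessions and URLs are placeholders, not the tutorial's real data.
pasted = (
    "study\tsample\texperiment\tfastq_url\n"
    "STUDY1\tSAMPLE1\tEXP1\thttps://example.org/EXP1.fastq.gz\n"
    "STUDY1\tSAMPLE2\tEXP2\thttps://example.org/EXP2.fastq.gz\n"
)


def parse_table(text):
    """Split pasted text into rows, and each row into tab-separated cells."""
    return [line.split("\t") for line in text.splitlines() if line]


def filter_first_rows(rows, count=1):
    """Drop the first `count` rows (here: the header row without a URL)."""
    return rows[count:]


def column_index(letter):
    """Map a column letter as shown in the rule builder (A, B, C, ...) to an index."""
    return ord(letter.upper()) - ord("A")


def apply_definitions(rows, name_col="C", url_col="D"):
    """Assign a name and a URL to each row from the chosen columns."""
    n, u = column_index(name_col), column_index(url_col)
    return [{"name": row[n], "url": row[u]} for row in rows]


rows = filter_first_rows(parse_table(pasted), count=1)
datasets = apply_definitions(rows, name_col="C", url_col="D")

for dataset in datasets:
    print(dataset["name"], dataset["url"])
```

Each remaining row becomes one dataset, which is why the header row has to go first: it has no usable URL in column D.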
And once we apply that, we see that the warning at the top has disappeared and we can click Upload now. That's what we're going to do. That should create six datasets, named these different things, and it should get the data from that Zenodo link. You just have to wait a few minutes while it does the upload. It shouldn't take very long, but if a lot of people are uploading at the same time, it can take a bit.
In the meantime, let's talk about why we should be using the rule-based uploader instead of manually editing our data. Manually editing your data is not reproducible, which means you can't reliably repeat it the same way over and over again. It's not scalable: if you have a thousand datasets that you're trying to put into the same collection and you have to change something for each one of them, that's going to take you a very long time. Using this will let you get through them just like that. And it's also error prone: when you're doing things all by hand, you make mistakes. I went out of order, but that's what the section says. There's also a link here, "Why not use Excel for this?", which you guys can check out for more context.
I see up here that it says six jobs have completed. We can see now that they're all green, and that they have the data that was taken from Zenodo. That's a basic use case of the uploader, and that's just for datasets.
So now we're going to work on a collection. It does say to create a new history here, but we just did that. Maybe we'll do it anyway. We'll call it "simple list uploader". There we go. You can call it whatever you want; that doesn't matter. The first thing we're going to do is upload the metadata from the first example and, this is important, as a normal paste upload. We don't want to use the rule builder this time. We want to go to Upload Data and use the regular Paste/Fetch data. I'll copy that again, since I lost it, and we want to paste it there.
Then I want to click Start. We wait for that to finish uploading, and we can take a look at it. It's exactly the same table that we had already uploaded.
All right. Now we're going to open up the rule-based uploader, but this time we're going to upload the data as a collection, and we're going to load the data from the dataset that we just uploaded. So we go to Upload Data, Rule-based, and then change this to Collection. We're no longer putting a pasted table in here; we're using a history dataset, and that history dataset is the very same one that we just uploaded. Then you can click Build.
We see, again, that there are some warnings up here, and we're going to need to resolve all three of them in order to move forward with the tutorial. But of course, I'm going to show you how to do that. The first thing we're going to do, just like last time, is get rid of this first row. Again, it doesn't have any data that we need, so there's no sense in keeping it around.
The second thing we're going to do is add or modify our column definitions. We're going to be using D as the URL, just like last time, because that is the column that looks like a URL and is, in fact, the URL. The only difference this time is that C is not going to be the name. It is going to be a list identifier. So we go to Rules, Add / Modify Column Definitions. We add a list identifier, C, and we add a URL, D, and we apply that.
The type for this is fastqsanger.gz. We can go over here to Type and start typing that, and whenever it matches, you can find it and just click it, and you're done there.
The last thing that we have to do is because this time it's a collection, and a collection needs a name. We're going to name it according to the tutorial, so we'll name it this. You can name it whatever you'd like; you can name it "my special list" or whatever. And finally, you're going to upload that.
This time we should see that those six datasets are in one collection, one list. We'll wait for it to finish uploading there. Again, this can take a little bit of time, but it can also be very quick. It really depends on who else is using the same resources. And there we have it: we have our collection with our six datasets, all the information that we didn't want was stripped from them, and there they are. So that's how you create a simple list.
Next we're going to be creating a list of dataset pairs. This is a slightly more complex collection. We're going to copy this chunk of data here and upload it as a collection in the rule builder. In the rule-based upload, we're uploading this as a collection from a pasted table. I'm going to clear that out, because we don't want that anymore. We want a collection from a pasted table. I'll make sure that I'm copying the right thing, and then I'm going to click Build.
We again have our warnings up at the top that tell us what we need to do to move forward, so let's get that data in order. Again, this line up here: if you ever have a header that doesn't have any valid data in it, you want to get rid of that. So that's how we're going to start here: get rid of the first row. Then we're going to go to our Rules menu, select Add / Modify Column Definitions, set column C as the list identifier, and add our type as well. Those are things that we just did. Whoops, I clicked the wrong thing. List identifier. We don't have a paired-end indicator yet, so let's not set that. Apply that. And the type down here we're going to set as well.
Now I want to look at column D. If you look at column D, which we've been using for our URLs, you'll see that it looks like two URLs. And in fact it is: there are two URLs there, and they're separated by a semicolon in the middle.
So we need to break that up, and that's what we're going to do next. From the Column menu, we're going to select Using a Regular Expression, and we're going to create columns matching expression groups using this regular expression. I'm going to actually copy the regular expression to make it easy, although this is a pretty simple one.
So, Column, Using a Regular Expression. We're looking at column D, because that is the one that we want to break up. We paste our regular expression there, and we choose to create columns matching expression groups, so the second radio button. And I believe our number of groups is two. Yes, that is correct. So this is how it should look: from column D, matching expression groups, the regular expression, which is two sets of parentheses with dot star inside each one and a semicolon in between, and the number of groups is two.
When you click Apply, we see that we have two new columns on the end here. We've got column E and column F, and they are column D split into two different columns. There's information in the tutorial briefly explaining how to use regular expressions, but really, dot star means any number of any characters, and whatever matches inside each pair of parentheses is captured as a group, which is how it splits that up.
So now we're going to get rid of column D. Column D is this one. It has the two URLs separated by a semicolon, and we really don't need it, because we just took the data that we needed out of it and separated it. So we can do Rules, Remove Columns, column D. When you click Apply, D has disappeared and E and F have shifted over. So this is what was on the left-hand side of the URL column, and this is what was on the right-hand side.
Moving on. Now we're going to split our columns. The odd-row column is going to be column D and the even-row column is column E. So again, from the Rules menu, we go to Split Columns; the odd row is D, the even row is E, and that's going to line these up very nicely.
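As a rough illustration of those three steps (split column D with the regular expression, remove the original column D, then split the two URL columns into alternating rows), here is a small Python sketch. It is my own approximation of the behaviour described above and not Galaxy's implementation; the rows, the example.org URLs, and the function names are hypothetical.

```python
import re

# Hypothetical rows with columns A-D; column D holds two URLs joined by ";".
rows = [
    ["STUDY1", "SAMPLE1", "EXP1",
     "https://example.org/EXP1_1.fastq.gz;https://example.org/EXP1_2.fastq.gz"],
    ["STUDY1", "SAMPLE2", "EXP2",
     "https://example.org/EXP2_1.fastq.gz;https://example.org/EXP2_2.fastq.gz"],
]


def add_regex_group_columns(rows, col, pattern):
    """Append one new column per capture group matched in column `col`."""
    regex = re.compile(pattern)
    result = []
    for row in rows:
        match = regex.match(row[col])
        if match is None:
            raise ValueError(f"pattern did not match {row[col]!r}")
        result.append(row + list(match.groups()))
    return result


def remove_column(rows, col):
    """Drop column `col`; later columns shift one position to the left."""
    return [row[:col] + row[col + 1:] for row in rows]


def split_columns(rows, odd_col, even_col):
    """Turn each row into two rows: one keeping `odd_col`, one keeping `even_col`."""
    result = []
    for row in rows:
        result.append([v for i, v in enumerate(row) if i != even_col])
        result.append([v for i, v in enumerate(row) if i != odd_col])
    return result


D, E = 3, 4  # zero-based indexes of columns D and E

with_groups = add_regex_group_columns(rows, D, r"(.*);(.*)")  # adds columns E and F
without_d = remove_column(with_groups, D)                     # old E, F become D, E
split_rows = split_columns(without_d, odd_col=D, even_col=E)

for row in split_rows:
    print(row)
```

After the split, every original row has become two rows that share the same identifier in column C, one per URL, which is why the result looks like a flat list until the forward and reverse reads are marked.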
However, now it just looks like we have a list, and what we really wanted was a list of pairs. So we need to keep going. We need to, as the tutorial says, inform Galaxy which of these rows are our forward reads and which ones are our reverse reads. We're going to do this by adding a new column using a regular expression that picks out the underscore one or underscore two in the name of the file.
So we're going to go to Using a Regular Expression again, on column D. And this time, again, I suggest you just copy the regular expression. It's easier to copy it if you're following a tutorial, because if you get it wrong by just a little bit, you could be matching on the wrong thing. So, Column, Using a Regular Expression. From column D, create columns matching expression groups. You can paste in our regular expression there, and I'm looking for one group. I'm just going to make sure that is all correct, and it looks like it is. So you're going to click Apply. And now we can see that there's a new column added at the end, E, which has taken the one or the two, the underscore one or two, from the file name here and put it into this column.
There's an optional step here to swap the columns. I'll show you guys how to do that because it could be useful. The tutorial says it makes it easier to see what you're doing. We're going to swap columns D and E, and that's just going to make the column with the URLs go to the end and the column with the paired indicators come forward, so that we can see them more clearly. They're more at the forefront there.
And now we're going to tell the rule-based uploader that those are our paired indicators, because it doesn't know that yet. So from the Rules menu, we're going to add or modify a column definition: our paired-end indicator is going to be column D and our URL is column E.
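Here is a small Python sketch of that idea: pull the 1 or 2 out of each file name into its own column, swap it in front of the URL, and group rows that share a list identifier into forward/reverse pairs. The transcript does not spell out the tutorial's regular expression, so the pattern below is an assumed one for file names ending in `_1.fastq.gz` or `_2.fastq.gz`. The rows, URLs, and function names are hypothetical, and treating "1" as forward and "2" as reverse is an assumption of this sketch.

```python
import re

# Hypothetical rows after the split step: columns A-D, one URL per row.
rows = [
    ["STUDY1", "SAMPLE1", "EXP1", "https://example.org/EXP1_1.fastq.gz"],
    ["STUDY1", "SAMPLE1", "EXP1", "https://example.org/EXP1_2.fastq.gz"],
    ["STUDY1", "SAMPLE2", "EXP2", "https://example.org/EXP2_1.fastq.gz"],
    ["STUDY1", "SAMPLE2", "EXP2", "https://example.org/EXP2_2.fastq.gz"],
]

# Assumed pattern (not taken from the transcript): one capture group for the
# digit that follows the last underscore before ".fastq.gz".
PAIR_PATTERN = re.compile(r".*_(\d)\.fastq\.gz")


def add_pair_indicator(rows, url_col):
    """Append a column holding the captured 1 or 2 from the URL in `url_col`."""
    result = []
    for row in rows:
        match = PAIR_PATTERN.match(row[url_col])
        if match is None:
            raise ValueError(f"no paired indicator found in {row[url_col]!r}")
        result.append(row + [match.group(1)])
    return result


def swap_columns(rows, a, b):
    """Return copies of the rows with columns `a` and `b` exchanged."""
    result = []
    for row in rows:
        row = list(row)
        row[a], row[b] = row[b], row[a]
        result.append(row)
    return result


def build_pairs(rows, id_col, indicator_col, url_col):
    """Group rows by list identifier into forward/reverse URL pairs."""
    directions = {"1": "forward", "2": "reverse"}  # assumption of this sketch
    pairs = {}
    for row in rows:
        direction = directions[row[indicator_col]]
        pairs.setdefault(row[id_col], {})[direction] = row[url_col]
    return pairs


C, D, E = 2, 3, 4  # zero-based indexes of columns C, D and E

with_indicator = add_pair_indicator(rows, url_col=D)  # indicator lands in column E
swapped = swap_columns(with_indicator, D, E)          # indicator in D, URL in E
pairs = build_pairs(swapped, id_col=C, indicator_col=D, url_col=E)

for identifier, pair in pairs.items():
    print(identifier, pair)
```

The swap does not change the result; as in the walkthrough, it only moves the indicator next to the identifier so it is easier to read.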
So, Rules, Add / Modify Column Definitions. We're going to add a paired-end indicator, and like I said, that is column D, the ending that we took off of the URL. Then we're also going to add our URL, and that is column E, and we're going to click Apply.
Now we can see that most of our conflicts have been resolved. All that's left up here, at least, is to name the collection, and in fact that is all that the tutorial asks you to do as well. We can name it whatever we want. We can call it our paired list, PRJDB3920. That is what is in column A. And we can upload that. Again, you can call it whatever you'd like. When that's done, we should have a list of pairs that were split up along the paired-end indicator that was in the URL and matched up along it. Again, this might take a little bit of time.
While it's working, and hopefully you got through it successfully, I wanted to direct your attention to the feedback form, which is at the bottom of this tutorial. Providing us with feedback on how you thought this tutorial went is really helpful for us. And if you're interested in learning more about the rule builder, there is an advanced rule builder tutorial, and that's down here too. If you want to extend your knowledge of using Galaxy and managing your data, there's Rule Based Uploader: Advanced, and you can click Hands-on and that'll take you to the advanced rule-based uploader tutorial. Hopefully that'll help you understand how to use this tool even better.
Again, here's our completed paired list, and I hope that was informative for you. Thank you for watching this and participating. I hope you guys have a great rest of the day, and thank you.