 Hello everyone, welcome to this UK Data Service webinar on tables and pivot tables in Excel. The speaker today will be Peter Smyth of the University of Manchester. What we're going to look at today is tables and pivot tables in Excel. There are a variety of things we'll look at, so we're starting with what a table is, why we use them, and why you might want to use them, I recommend them. Tables and data validation. Data validation, of course, you can do without tables, but it's a lot better with tables, and we'll go through some of that. We look at creating relationships between tables in very practical scenarios, rather than getting into the nitty gritty of set theory and all that sort of stuff. We'll just show practical examples of doing that. We'll talk about what a pivot table is, again, nothing too technical, just based on examples and why you want to use them. At the end, if we've got time, I'll just show some pivot charts and dashboards. That's really just because pivot tables by themselves are pretty boring to look at. Most of the webinar is going to be taken up with demonstrations, which means I'm going to have to keep switching screens, but it could be problematic the way Zoom is working at the moment. I'll give it a go. There are a few slides to set scenes and then lots of demonstrations. Let's start with what is a table in Excel. If you've used Excel before, you know what screens look like, what worksheets look like, and so on. You might think, well, isn't everything a table? It's got rows and columns. You can't not have rows and columns in Excel very easily. Yes, they've got rows, yes, they've got columns, but in reality, they are really just collections of individual cells. They just look like a table because of the way they're arranged and you conveniently give columns, titles, and things like that. Individually, though, there are individual cells, A1, B1, C1, and so on. Things that we're going to start looking at in the demo, I'm going to show you the slides first and then we'll get into the demos. How to create a table? You've got to start with some data. You can either import the data or you can have it put in manually. Having got the data in your spreadsheet, there are three ways of doing it. You can use the insert table option from the insert ribbon, control T from the keyboard, or use format as a table from the home ribbon. Plenty of options. Microsoft are probably trying to encourage the use of tables. There's a fourth way as well which we'll come on to later on. What can we say about tables? When you're creating a table, Excel thinks it ends with a blank row or a blank column. Typically, it will try and work out how big your table is, what the extents of your table are. It can't get it wrong, but you do have the option of correcting it, so it's all right. Potentially, the issue is going to be blank rows, because the blank row will cause the table to come to an end, and for large datasets where you can't necessarily see the blank rows, that can be a potential problem. You have to be a little bit careful about that. You can give tables names. Sales are all automatically doing for you A1, B1, C1, and you can't actually override them and give them their own little names if you want to. Tables are much the same. When you create a table, Microsoft or Excel will give it a name very imaginatively, Table 1, Table 2, Table 3, which doesn't tend to be too useful for you, but you can change the name. Generally speaking, you should choose your own names for the tables. You can treat tables as a single entity, a single object. That's part of the beauty of tables, is that it is a single thing, and it all works together. For almost everything that we're doing today, I'm just going to treat them on the screen as tables and within tables. But if you're into writing formulas in Excel, there are many formulas now which will accept a table name as a parameter to the formula. So where typically you might have to provide a range, you can often provide table names. Tables have their own ribbon, and we can see that. You can format tables using themes. When you actually create a table, it actually gives you a default theme for the table anyway. In all honesty, I tend to just leave the default theme in place, but you've got lots of varieties of changing the name should you want to do so. You can include a totals row at the end of the table. Now, that's a very, including individual totals, the same that you commonly want to do on your data anyway. A table allows you to include a complete row of totals, and then within each cell within that row, you can pick what kind of total you want, whether you want the sum or the average or whatever. That's potentially very useful, but what is more equally useful, but perhaps not use, no real realism to use it, is that if you subsequently remove that totals row, and then add it back in again, it remembers what totals you've been using. Obviously, if you've removed it to add more rows of data, and then put it back in, it will obviously recalculate and include those rows as well. The tables will automatically expand as you add new data, and this is particularly useful when we're doing data validation or just adding data in general. The table, if you add something to the bottom of the table, it will expand to include the rows. You can also add new columns to the right, and it will automatically include them. Unfortunately, if you try to add a new column to the left, they're not automatically included, but as I pointed out before, you have the option of manually extending the dimensions of the table. OK, first demo creating tables. OK, hopefully now we can all see this web page, which I've gone to. It's from the Met Office, and basically, it is horrible data on weather conditions and storm away. What I'm going to do is I'm just going to create some data by copying all this data down here. 100 odd years of data from storm away. I'm just going to copy that, and I'm going to go to an Excel spreadsheet, which I think is this last one here, which is just an empty Excel spreadsheet I've opened, and I'm just going to paste that in. That's just the way I've copied it in. On the face of it, it doesn't look very good, so certainly, I want to get rid of these first few rows because that's not really relevant to the tabular type data. Equally that row there. I don't really need that. On the face of it, this looks rather tabular, except for the fact, if you look at this row I mean, or even the top row, it's all bunched together. It's all actually in column A. The first thing we do, we need to fix this. If I select the whole of column A, this isn't tables at the moment, this is just setting up our data. Then I want to go into data, and then use text to columns. If I haven't used this before, it can be very useful. Text to columns will look at what I've selected and say, oh, I think that might be fixed with, but I know better because I know that's limited. I go on to the next screen, and it says, oh, I think that must be tab delimited, and I know better because I know it's actually space delimited, and I want to treat multiple spaces as consecutive delimiters. Here you can see the lines here are telling me how it's going to split up my data. Click on next, and I can say, oh, this column here in black, the first one, I don't want you to import that because there's nothing in there. Then I can say finish. Now I've imported the data, and now it's sort of looked a lot more like a table now because I've got rows and columns and headings and all this sort of stuff. In fact, this is just a collection of cells. To get this into a table, I've got the options of I can use the table here, or I can go to the home tab and say format as table. I think if I come out here, I think that one cell is a table. If I'm somewhere in my data, and say format as table, or insert table, I'll tend to be the one I use, or tell you there you can use control T as well. I'm just going to click on that and see what it says. You can see here it actually works out how big this table is, a1, up there, right down to the bottom 1773. I happen to know there's no space in this, so we're okay here. You've got the option saying it guesses that there's headings in this table, and so it has clicked, checked. I can uncheck that if I want to. Notice that it includes this apparently blank row here. I don't know if you notice when I was copying it, but right down at the bottom, some of the bottom rows actually have something in that column H. That looks good. I'm going to say it okay. Here we can see the first example of the formatting. This is the default color scheme of a table, and the scribed colorings is really just to help the data stand out more. I said that there are, in fact, table options here for the table settings, which comes up automatically. You can see here, down here, the table styles, which you can change to your heart's delight. I tend to just stick with the standard one as it comes up, and don't worry about it. Up here, you can see where it says table one. As I said, the first table, it calls it table one. You can change that. Let's just call it a stone away. I would recommend you change the names to something meaningful for your tables. Then we have the table itself. One of the things you may have noticed is all of the columns have their own automatic filters put on there. These filters are exactly the same filters as you could put on yourself manually one at a time if you wanted to in non-table formats. But I think it's quite convenient having them there because they tend to get used quite a lot. The other thing, which is very clever, I think, is that if I am scrolling this table, watch what happens as I scroll down to these headings A to H. They get replaced by the table column headings. This is really quite convenient. Other little things I can do is if I want to select the whole table, I can use control A, and that will select the whole table. Shall I need to do that? I said, if I control home and control end, it won't say what to do for anyone else. So control end takes me down to the bottom of the table. Things to notice here is if I start adding data into here, like 202.1.3, whatever. As soon as I start doing that, the table expands, and you can tell immediately by the colouring of it to include the row I've started there. Also, on the bottom right of the table, you may be able to see this little inverse L shape, and that indicates the end of the table. I'll try making that bigger. You can see the end of the table with that little L shape. You can select that, and you can drag it, and that's how you can change the size of the table, certainly in this direction, diagonally, like that, and down, and what have you. I'm just going to cut out that data now, back to where we were. So this star is in there. I'll just delete that. If you want to add new columns, I'm just going to go control home to get to the top. I'll notice this column 1 has been named by Excel because it knows my table has columns, because I told it there wasn't a column name for a search call of column 1. I can change that myself if I want to. If I want to add a new column to the right, I can just say new column, new call, and as soon as I hit enter, I get a complete new column added to my table. If I wanted to put a formula in there, again, I select that first cell, and I can just say, I don't know, that equals 1, a very primitive sort of formula, and if I hit enter, it will automatically propagate down to the end of the table, and it will stop at the end of the table. I know you used to be able to, well, obviously you still can, if it's not a table, you can use a little drag bar down there and tell it to do that, but it will do this automatically. I don't really want that column, so I'm going to delete it. We can just go through my little checklist here. We can use text columns. We've done that. We've created it. I've done it three ways of doing it. Selecting the whole table is control A. The table design ribbon, we can see... By the way, if I go out of the table, I lose the table design ribbon. It only appears when you're in the table itself. If I click on it, we see the various thing you can do. We've covered the style sort of. First, header row is there by default. Banded rows is there by default. You can actually switch that off if you want to, like I said, I tend to leave it on. You can see here, I've got total row. I can click that on and it comes down here and gives me totals. It seems to have automatically countered them as text, so that is going to be a sum. But if I want to do my own little one down here for whatever, I can click on the down arrow and I can pick any of the ones I may want to do. It's all only usual contenders are in there. Okay. If you want one that, you can switch it off. The filter buttons, again, they're on by default. We can export and include slices, which we'll do much later on. We can summarise it with a pivot table, which we'll sort of be doing later on as well. If you want to get rid of your table, there's an option to say convert to range. You can just go back to having a non-table if you want to. We've got the resize table there. If I click on that, it comes up and effectively just allows me to manually either change these values here or I can use the outline type thing as you can do on formulas. There's lots of things you can do manipulating the table if you need to. Back to the slides. Advantages of the tables are the formatting. I think the formatting of the tables is only useful because it helps the data stand out. Lots of numeric data. It helps if you ban the rows and it makes the stand out. Automatic filters. Yes, well you can do it without that, but they're nice to have as well. They keep column names when scrolling. That can be very, very useful if you're trying to add data in and you can't remember what column you want to and things like that. You can add rows, columns. You can add totals. They're very flexible, but no matter how you add things or take things away, you still only have one table there. Right, creating table the fourth way. This is where you're going to import a CSV or a tab delimited data set or whatever. The way we're going to use this is using the get data from the data tab. In most cases, unless you're creating your own small amounts of data, this is probably the most common way of getting data into an Excel spreadsheet and making it into a table. If you've done this sort of thing before and you said, oh well hang on, surely I just have to double click a CSV file and it'll load it up automatically. Well that is true, but there are a couple of problems with doing that, which I'm hopefully going to show you in a minute. So let's create some tables. I'm not going to start with the Twitter data, I think. I'm going to start with test data 1 CSV. So I'm going to start off by just double-clicking test data 1 CSV. OK, so that is me starting up double-clicking and you can see it's imported everything from there, but it's got a gap in there and it doesn't leave it as a table. It just imports the data as is. If I want to create that into a table, I'll use control T and you can see here that, yes it's recognized table headers, but notice how it stopped at the blank line. OK, so that's one downside of having to do this. You don't get a table automatically and it doesn't necessarily, well, I suppose a blank line would occur anyway. If I want to do, I could select all of it and then say insert table and then it will create a table of all of it. OK, and that's up to me to deal with the blank line. You can have blank lines in tables. It just won't, it will use a blank line to stop the length of the table if it has to. So that's how you can double click on a CSV file. But what we don't want to do is, we don't want to say that. What we really want to do is create it from scratch. So I'm going to create an empty spreadsheet again, starting from scratch and the better way to do it is to use data from text CSV. OK, so I'm still going to effectively import a CSV file and find it. And the one I want to use is this UK Flatten 1 CSV. And what this data is, I click on it and say import. It's data, it's Twitter data which I've flattened from the JSON into a standard CSV type format. OK, so I've got dates created at, I've got user ID strings in first other things, most of which we're not really terribly interested in. It's really just to show that we can import it. And when we import it like this, again, it will make it some, it will actually read the first 200 rows of the data. And from that it will conclude, oh yes, it's probably got table headers up here. It's comedy limited. Notice this isn't based on the fact it's called dot CSV. It actually looks at the data and works it out from the data. And the encoding is by default it will be utf8 for this type of file. That's probably a safe encoding if you're using social media type data. What is a bit odd about this is this ID string. This is meant to be a Twitter ID string. Let me just show you what this file looks like when I haven't got it inside Excel. If I go to this UK data flatten one, what I'm going to do is I'm going to open this in a text editor called VS Code. It's just a fancy text editor, not Excel. The important thing is not Excel. The other important thing, or useful thing about it is it has facilities to nicely color code CSV files. So you can see here I've got a date with a time on it. I've got that number which you saw down. That's the same because it's all come from the same person. And this yellow here is actually a tweet ID. Yes, the tweet ID. So there are very long numbers. If you compare that with what I had appear to have here, it's changed that to scientific notation. Okay? Now what I want to do is I want to try and preserve the full length of that string rather than have it like that. So I could just say load and it would start here and load it all in. But I would have problems of this. So what I want to do is say load two. Oops, no, I don't want to do that. Let me cancel that for a minute. Try that. What I want to say, rather than loading it at this point, I want to say transform data and the reason I want to do this is for two reasons. I want to show you that these strings are of the important way they were looking before. But if I go up here and this is already ID, it's got one, two, three, so it thinks it's a whole number. So the problem with just doing it like this is that I'm going to lose those long digit numbers. It's going to save this and put it into my spreadsheet in that format. And if I subsequently save it, let me just load that up. I'm doing this a bit backward, but bear with me. You can see here, I've got these ID strings here in scientific notation, e to the plus 18, so they're very big numbers. Now I can select that column in the usual way and say format cells and I can say I want it in number, number here's a zero. If I do that, yes it made it nice and long numbers, but you see at the end everything is zeros. It's because it had already lost the full set of digits and it's now approximating to what it should be. And that's what would happen if I had just double-click this double-click the CSV file as well. So if I want to another sheet, I'm going to drag the same one in only this time I'm going to do it properly. I'm going to say import. And instead of here saying based on first 200 rows I'm going to say don't try and detect the data types. The rest of it I can leave the same. So the major effect of this is that it can no longer recognise the fact that I've got headers say. So now when I transform the data everything is going to be treated you see these little ABCs everything is treated as a string. But at the same time I do have my full numbers in there. So if I take this column this column and I say change this into whole numbers or I'll leave that as a text string. I can use this to promote those first rows into headers. Now the treble A is this now tends to think this is a string a number now. So what I want to do is select that and say it is in fact text. I'm going to replace that and you can see now it's text because it's left aligned and I've retrieved all of the data the full values. And if I say load and close and load rather it will load that in there and you notice whenever you use this method using the data import method when it loads it up it loads it up as a table automatically for some reason using a slightly different theme a green theme but there you go it's loaded as a table. And you can see here on my ID string because I explicitly said these are strings rather than numbers I haven't lost any data whereas these ones here have been lost. So that is one very good reason why I should always import using data import. The other advantages are that it automatically puts it into a table for you and the third advantage is that rather than having a file called something something .csv it actually copies this into a new excel spreadsheet. So this is just called book two now my original csv file hasn't been touched I've read it in but it's not touched. And this is a good thing to do when you're using data never touch, never risk damaging your original data so it helps you with that as well. Okay covers all of that so back to the let me just close up I'm not going to need that one again and I'll go back to the slides and right so double click don't do that use the data tab and then you don't use your data Tables and data validation might be considered a little bit of a digression but excel does have quite powerful data validation options but you need to set it up and if you set it up you're better off setting it up with tables as we shall see okay so here we have very simple bit of a couple of tables I've set up this table down here did I give it a name which I've called items is me adding data to my table okay and this has already been set up and what I've done is if I go to the end of this row here the last data enter in there I'm going to put an egg in there I've set it up so it automatically gives me a new row number I've got data validation there I've got data validation there but I'm not going to use if I come down to here oops I haven't got data validation on that one don't know why I haven't let's go back up to this one data validation here and what I'm doing is I'm using a list here of colours and this list red down to indigo is this list here red down to indigo and so I've forced to pick one of those colours I tried to put any in there in an object on this one this appears to look the same and I've got red down to indigo but the difference is that this one is let me just shorten that table a little bit this one is purely based on this list of cells whereas the validation on this one is based on this table column options is a table you can see the colour coding and what have you the filter what have you so what happens here is I can pick one of them in the same way they seem to be the same but then I realise oh hang on I haven't got violet in here it's not in that list either so what happens if I add violet onto that list and violet onto my table on the table you can see straight away that the table has been extended to include violet here well you can't tell it hasn't done anything but if I go back up to my list on this one if I go back to my one down here I've now got violet included because the way it's been set up it knows this list is coming from a table and when you extend a table it's still coming from the same part of the table so I'll just show you down here briefly how this was set up I won't put the validation on most of them but certainly on the rainbow colour which works with the table on fact the rainbow colour on the table let me just show you how it was set up if I click on the green for the colour and go to oops no data data validation here if I click on that because I've already got it set up here this is going to come up and show me how it's been set up now when you're setting it up for new you just get a blank screen here and you can select how you want these things to work picking the ones you want so for this one I've picked green I've picked the list and then the source of that list I could manually put them in the actual text in there but typically you're going to pick a range which you're allowed to do and here you can see I said L2 down to L7 and violet of course is in L8 so although I've added violet at the end here this isn't going to pick it up if I go to the rainbow colour and look at the same thing oops you can see here it's exactly the same it's a list but here I've said the source is a list called rainbow now the table itself isn't called rainbow it's just a list which is called rainbow and how did I get that list let me just cancel that I can create a list by going into formulas name manager and this has shown me a list of my tables and lists so rainbow is a list and you can see at the bottom here how did I get lists it is equal to colours which is the table name and then within colours it is called options and it's because it's done this way yes you have to have a list of some kind but because this list is defined in terms of the table colours it means that when I extend that table it's automatically picking up the new values so the moral of the story is always use tables when you're using pick lists in this manner because then if you need to add something new to it you don't have to go back and change your formula they're really not going to show you something down here I'm going to do the same thing down here I'm just going to add it to this rainbow column here notice there's no validation on this at the moment but if I go to data validation so here I get the validation screen in its normal empty settings and I'm going to say list and here I'm just going to do exactly what I had before equals rainbow and say okay and now you see the little drop down arrow here and I've got my list of colours in there now the reason I want to do that on a blank list or blank row is that if this was data input I would have my columns like that and what I want to do is extend it to the row covering where I put all my data validation steps in and that is what I would want to turn into a table okay, yes it's got the rows I'm going to say okay and now when I go down here I've got the validation here and when I tab to go on to automatically take me down to the next, begin of the next row extending the table as it goes if I now click on there it's automatically got my validation in there so the idea is you put your columns in, you put your data validation in then you create the table and then you can have data, I've only done it for that one thing but you can have data validation all the way across the board and as you add new rows it automatically includes the data validation that you might require okay so the next sort of obvious question is well okay I'm validating my own data creating my own data input but we've already said that in most cases you don't go around creating your own data in this way, you import it from a CSV or something like that so the question is if I import it from a CSV do I still have the data validation facilities available to me well to look at the slides for bulk input can it be done yes, no, yes, no, yes now the question is is it really yes or did I just run out of room on the slide okay so we'll have a little look at that using a different data set this is a data set which I've already imported it I've made it into a table and down here much as I had before I've got a little table which I'm using for data validation okay ideally if this was for real I would not have this data validation table on this same tab I'd have it on a separate tab somewhere this is just for clarity for the demonstration so the question is can I do data validation on data which is already in there well the answer is sort of yes because if I select this column observation say and go to data data validation data validation list I've already got it on in fact I was going to do it exactly the same way I say list and I'm calling it observation which is a list from this table here okay I'm going to apply that and on the face of it well cold that's in my list nothing wrong with that data so on the face of it it doesn't really stand out and if I just squeeze down here if I was using too quickly you might miss this entry here and this entry here okay now in fact just doing the data validation does actually mark them with the little green triangle in the top corner as you can see there and the little actress when you go next to it and it will tell you if I click on that it will tell me data validation error the problem with this method is it's not very clear very easy to spot now you can sort of make it better by saying by saying data validation circle invalid data oops I should have done that for the whole thing nope no I haven't so now it circles it now again that's making those stand out but again if that was row 96000 and not lines 69 it doesn't have trouble spotting that so ideally what you'd want to do is use some different method so one of the things you can do if we're just checking the humidity rather than using the data validation you can instead go to the home tab and do conditional formatting so if I use conditional formatting um and I can say let's say for my only values that are above or below oh maybe options yes cell value between um say zero humidity between 65 and 75 oops 65 and 75 okay so that's a very similar sort of arrangement to when we're doing the data validation and I should have coloured that let me just do that again um new rule that contains 65 and 75 um and I'm going to format that with a colour oops fill red okay let's try this again ooh humidity demonstrations new rule only cells that contain between 65 and 70 format red so now they're highlighted okay and the advantage of them being highlighted in this way is that if I go down to the filter I can actually filter by colour so I can make all of the errors come to the top yep I'm not going to do that just moving on to the observation ones where it's a bit different because this is a list of names now we know this is already set up so I'm going to do it in quite the same way but what I can do is I can insert a column yep insert a column so it went back to the top for me automatically to do that insert a column and in here I'm going to use a little formula which hopefully I've saved some way because it's complicated this is the formula I'm going to use so we'll see I'm going to put the formula in there before I hit enter I'll just read this to you see if it's just a standard function formula but you can see here I'm using table 2 open brackets square brackets observation type which is effectively the column from this table and then I want to check check whether or not this little square brackets in the outside means the value in my table current table in the observation column so you can refer to cells in tables now having done that if I just hit enter oops cancel that data validation data okay let's do this again using a row on this side insert a row add to my table let me just delete that it's just a curse at me I've actually got observation error here okay fortunately here's one I've done before so you can see in here that I've created a column observation error I've put that formula in and when I use that formula typically if everything is okay I'm going to get a 1 in there and if it's not I'll get a 0 in there having done that I can use my normal filtering and say well one is okay so get rid of that and just show me the zeros and that way you can bring up all of the errors in the observation column okay so if you're looking to do validation on bulk files like that it's sometimes better to use data validation not use data validation but use either formatting or your own little formula okay next thing we want to look at is table relationships what a table relationship is it's a connecting of tables that have something in common and the two types we're going to look at are appending tables and this is where the tables have a common set of columns and we want to concatenate the rows of the tables together we're going to look at merge a bit later on when we do pivot tables and we're going to see there that this is where you have a a single column which is common in usage to two tables and you can combine them together based on those so when you append tables you get the same, typically you get the same number of columns because you're adding a table onto a table if you like at the bottom and when you merge two tables you typically get a table with more columns but the same number of rows because you're adding new columns from the second table onto it but again this is far better done showing you the demonstrations of these of appending tables and we're going to do appending tables first because this is the easiest one to show so what we have here are four tiny little tables A, B, A, D, B and C that was just my naming convention going to write and what I want to do is I want to add table D onto the bottom of table A well if they're nice and small like this what I could do is use control A to select all of table D and I could drag it and drop it there and you can see that table A has automatically been expanded to do that so that's very quick and easy if you've only got small little tables and of course all the columns are right but what happens if I wanted to add all four tables together and you can see here that table B has an extra column table C also has an extra column and got these switched around the other way so can we add all these four tables together well rhetorical question of course we can and where we're going to do it we're going to use data, get data combine queries a little word on the use of the word queries queries means tables in this context and merge we're going to do later on I'm going to do the append for now append tables and you get this little dialogue up to say oh two tables or three or more tables well I want three or more tables and these are my available tables and oh yeah table D is missing so the question is why is table D missing and the reason is that behind the scenes before you do this all of your tables have got to be put into what's called the data model in Excel and I've already done it for A, B and C but I haven't done it for D so I can show you so let me just cancel that go to table D and I can say from getting transform data from table range and when I click on that it brings up this thing called the power query editor and all I have to do is do very nothing much it's already got that table selected it says table D there, that's okay so I'm just going to say load and close to and this brings up this little import data dialogue and the reason I wanted to bring this is I've got to say add it to the data model because I need it in there but equally I don't really need a table because I've already got a table so I'm just going to say only create connection and there we now have on this left hand side I've now got tables A, B, C and D so back to where we started from, get data combine queries, append three or more tables A, B, C or D I'm just going to add the more in and click on okay and you can see there that is the result it's going to give me it's probably going to be easy if I do the load closing load two and the reason I want to say I just want to control where it's going to go down there say okay and now you can see this created table of A, B, C and D things to notice about it it doesn't care if you've got characters in there or not it puts all of the columns in the right order including D and E which weren't in the first two tables it has noticed that in table C A and B are swapped over and put them in the right way there so this gives you a lot of flexibility for tables which are importing which are supposedly similar tables but in fact aren't quite similar tables okay so it's a lot better than just trying to I mean you can imagine that definitely you'd have if you try to just drag and drop this little table onto the bottom of A or D yeah or any other combination okay so appending tables if you do it using this method it's going to cover most of the common type problems that you're likely to come across okay let me just close that one don't need that one anymore appending tables so now we're going to move on to pivot tables I'm going to come back to do the merging of tables as part of pivot tables so what is it it's used to summarise data and it's used to convert long format to wide format so simple example here not that one store pivot so what I've got on this spreadsheet is a table here did I name the table no I didn't I should have named that something table one it's called but what I've got is typical sort of data sales type data stuff and it's in a table and what I wanted to do was create a pivot table now I've already done that we'll go through basically that in a minute but what I wanted to show you is the way it has converted the data so down here I've got north I've got south which is a region for north and south I've got what stores occur in that region A and B and A and C and D for the other one and then for the item column the values in the item column so the unique values have in fact become rows column headers rather so this is the notion of taking this long column and make it into a wide set of values and then within the body of this I have got the values of which is in fact the sum of the values in here so for here we can see for B for bikes you can see down here I've got two B bikes 200 in each and here I've got the sum of them okay so that's the long and the wide sort of thing and what the basic format of the table of this is how do you create it well I'm going to let's create another one let's kill that for a moment anyway in the table I'm going to say insert pivot table normally I would do these on separate worksheets but I want to do this in an existing worksheet just down here below that one I had and I'm going to say okay and what happens here you get this little emptyness down here and on the left on the right hand side here we've got the pivot table fields you can switch these off switching back on using this field list up here notice we've got our own little tabs up here pivot table analyse and pivot table design design is just for changing themes this is where you do all your work and what you do is you get a list of all of the all of the columns from our table and you can put them where you want to so if I was to recreate this one up here what I've got is I've got a region as a rose so I've got my north and south within the region I've got my stores A and B for the columns I've got item so I now get the different columns going across the top there and then I want to put the value and by default for values if it's numeric you get the sum of the values if I was interested in how many of each item I could left click here say value field settings and change this to count for example and it changes them to count so that is the basics of how you create a pivot table you can see that it's summarising the data in various different ways and why you might want to do that the other thing you might want to do is having got the pivot table you might actually want to export it and you can do this so I make it look like a flat CSV file now this works I'll go through the part of doing this but not all of this I'll explain why not all of this if I select in my pivot table and go on to my pivot table design option up here and then we've got report layout and if I say default is showing compact form if I say showing tabular form then you can see here each of these values now get its own row as opposed to up here where they're sharing the same row that's stage one stage two report layout is to repeat all item labels and that has the effect of filling in all these norths and souths and the third thing I want to do is get rid of these subtotals do not show subtotals so now I have pretty much got a basic table and I can copy that if I use control A to select it control C to copy it if I go on to a new sheet and do copy values you can see there how I've got a very basic table and that idea is that I can now export into a CSV file the other thing that you may have wanted to do I'll go back to my original one is rather having all these blank values you may want to put them as 0 and you can do that by if you right mouse click and say pivot table options I can say for empty cell cells show 0 or whatever you want to show and I'll fill all I'm in for you and then you can export that okay moving on quickly on to the last there are the basics of how you set up a spreadsheet you can have things more than once so if I wanted item I've got the count of the value if I drag value down there again I now get the sum of value and the count of the value for each one so I can have the same item in there and use different simulations in there some average whatever you want usual sort of things okay nely finished so for the last demonstration this is where we're going to do the merging of tables I'm actually going to use a data set data service data set SN7613 which if you haven't used it before it's a census training data set not last Monday's census but the one before I imagine it's got about half a million records in it 18 variables and all of data is numeric and it's mainly represent the numeric data is mainly representations of categorical data so for this I want to show you first when you download this data service data set it's hard to say fast part in the data you actually get a code book or data dictionary which explains what all of the variables are and what they mean and what values they can take okay essential reading before you try to do any work on the data set so from that what I've done in my Excel spreadsheet here I have created a few little tables in advance of some of those variables so I've got the age variable age GPT and I've just made into a table the value one is not a 50 and so on and I've did the same for economic activity health and region okay because we're going to need them later on now what I want to do is actually import the data set itself which I don't have at the moment so what I'm going to do is I'm going to go to data and I'm going to say get data from CSV as we did before this data is hidden in here this is the value download it actually downloads this tab file that does not have renamed it I'm going to click on that and I'm going to say import and then we get up this little dialogue or screen saying what you want to do in this case I don't need to make any changes to this because it all looks pretty good to me so all I really want to do is load this in but I'm just not going to say load I'm going to say load to and here we get our little import data tab coming up window coming up and here I am going to want to put this in the data model because I'm going to need it in the data model to do my merging of tables later on and all I really want this for is to create a pivot table so in fact I don't actually have to import the table I just want to keep a link to where the data is okay so with that only create connection and add this table to the data model if I click on okay it will go away and start importing the data and you can see here the progress it's making to serve a 500,000 rows I think to bring in a when it finishes in a minute now I'll tell you why it's doing this one of the advantages of using the data model only option as I've done here is that if this it's got 569,000 rows if it had 5 million rows I wouldn't have been able to import it as a table into Excel because Excel is limited to a million rows but it would still let me import it into the data model so if you've got very large tables and you only really want to do the summaries of it you can put it into the data model and still do the summaries okay but just hover over that you can see the little options I get there for refreshing it, deleting it and various of the little things and it gives you a very little small snapshot of what the data looks like but we're going to want a bigger look at the data I'm going to go to manage data model and this brings up what's called power pivot and in here it's effectively as the name suggests allows you to see what's in your data model and at the moment we've only got our one little table down at the bottom here and this is the data and you can do various things on this data from here which we don't really want to do all we really want to do is create a pivot table so I'm going to click on create pivot table and I'm going to say it in a new worksheet to keep it out of the way and this looks this bit exactly the same as we had before down here it looks a bit different because I've actually got my data model table in here you can tell by the little disk symbol in the corner here saying it's in the data model and it also recognises that I've got my four other tables down here yeah so let's start off with our data creating our pivot table in the same way so what I'm going to do is I'm going to have health down side there I'm going to have age in the columns and then for the values I'm actually going to use purse ID now purse ID is the identifier the unique identifier and it's again like everything else in the dataset it's a number so if I drag that down to values it will automatically give me the sum which doesn't actually make a lot of sense so what I really want here I want to change that and make that into count and then it sort of makes more sense because you can see down here my grand total 569.741 which is a number of records in the file I can change these labels up here into something more meaningful like age group row labels our call health okay and on the face of it that is the end of our little pivot table the problem is it leaves a lot to be desired because well what does this 1, 2, 3, 4, 5, 6, 7, 8 in across the top and minus 9 to 5 on the columns rather than having those numbers what we really want to know is what they are representing so I've got health down here so I go back to my other set of tables here and find health I can oh yes warning is very good health and so on and so forth but it would be nice if I could have these labels actually in here yeah so how do I get them in there well going back to our pivot table fields you may have noticed that my age table has been included for me to use value and label and what I want to do is use the label so if I put the label in the row there it adds it in I can also take out the health one because I don't want that oops and I've done that wrong way around that's age group I should have put that one up there let me just put health back in there for a minute okay on the face of it I've now got these looking a lot better but something doesn't quite add up in that all of these totals at the bottom are all saying the same thing and if that isn't enough is a clue to something wrong I've got this little box up here saying oh relationship between tables may be needed well it's not a maybe it is yes we need to create a relationship so what we do is click on create the dialogue box comes up and the way we do this is that the first table is effectively the table we're making the spreadsheet from which in our case is the one in the data model the census one the column in brackets foreign this is the column which is equals or we're going to equate with a column in our other table which is the primary one so the column we want here this is age I'm looking at is age group and the related table is my age group table and the related column here is going to be the value because it's a value of those which are recorded in the the census table I'm going to click okay and now you can see this looks better now I've got proper values down there so now what we're to do is exactly the same thing for health health table down here I'm going to put the label in because that's the text that we want to see get rid of that and now we've got the same problem again very clear in this case and I'm going to have to create another relationship the top line is again table is going to be the same there here I'm looking at health here I want my health table and here I want the value okay so now I've got something readable which is a lot better age groups yes they all go up in the right way the slight problem I've now got is the ordering down here bad health, fair health, good health no code very bad health, very good health the problem is that I can sort these but as the text I can only sort them ascending or descending order okay so what we need to do if we want them in a more reasonable order we can go back to our lookups and go down to health and I'm going to add a new column here and say sort eye it's a table so it gets added automatically and down here I'm just going to say one, two, three, four, five, six okay so I've added that into my table and this is the way out ideally like to sort it if I go back down I've still got my data model open down here you can see here health one no difference there it hasn't changed that but what I do have is an option to refresh if I click down there and say refresh I can refresh everything if I want to but that will go back in trying to load that half a million rows of tables so I just want to refresh the table I'm on and it says yes it's done that six rows transferred and in a minute I can see my sort by column appearing in the data in the data model what I didn't point out is that as I create the relationships it automatically adds a copy into the data model for me I don't have to manually do that so now having got that in what I want to do is go back to my label column and up here I have an option called sort by column sort by column and what I'm saying is when I've got the label column I want you to actually sort it by the value by the sort by column okay click on it okay and then hopefully when I go back into my spreadsheet you can see I've now got these in the order I think I would probably want them to be displayed in and just very finally from my table here I just want to insert a chart I'm going to go for the line chart that's just a basic line chart and what you can mess around with this as much as I like but one of the things that can make it look quite nice is if it's not too bulky you can add the data onto the bottom and then this table tells you everything you need to know and the only final thing I want to do is we haven't used this little filters option up here so going back to to region I'm going to put in there the region label area name and when I said oh no another create a relationship so again this is always going to be the same one here I'm looking for region at the bottom there related table is going to be region and I'm interested in again the value because that's what they have in common okay so that goes away and that's alright and now I've got the label area name up there and the final thing I want to do is to insert a slicer for that final region label area name click that which I'm going to put down here it's sort of a little bit bigger various things can do with slicers but I wanted to show you this because this is currently all coloured in indicating that I've got all of I'm showing all of the data on this table on this graph but I can do them individually if I want to to show the various health status against age for different age groups or I can put them all together if I wanted to okay so that's our little dashboard which is now I think the end