Okay, hello everybody. My name is Stephen Farrell and I'm going to be leading this webinar. I'm going to talk for about 30 minutes or so, and what I'm going to be doing is taking you through a couple of data sets which I and some colleagues collated. So I'll be explaining a little bit about the thinking behind the project which led to the creation of these data sets, telling you a bit about what you can expect to get in them if you download them from the UK Data Archive, where they're lodged as SN 7875, and describing some of the analyses that you might wish to do with them. Okay, so I'm going to give you, as I said a moment ago, an introduction to the wider project, and I'm going to explain a little bit about the original data sources. Then I'm going to explain in more depth the data sets that we've constructed from those original data sets. Then I'm going to spend a little bit of time talking through some of the analyses which one could undertake using those data sets, think a little bit about some of the extensions that one could make to them, and then explain how one can get hold of and use the data sets that we've lodged at the Data Archive. So these two data sets were produced using data from the British Crime Survey, which is now called the Crime Survey for England and Wales, and the British Social Attitudes Survey. And we put them together really as a way of allowing us to assess the impact of what one might think of as Thatcherite social and economic policies on crime in the period after she left office. So the research project, which was conducted with social policy analysts and political scientists as well as myself, really drew upon ideas from political science in an attempt to bring fresh thinking into criminological debates on the relationship between social and economic change, social policies and crime.
Now we did all of that because our earlier analyses had really relied on data that was collected at the national level, so things like GDP, the Gini coefficient and recorded crime trends. And that was all well and good, but what we wanted to do was to explore the way in which the shifts that took place in the 1980s and 1990s, and in the period since, were also related to attitudinal shifts, so the way in which social attitudes were either shaped by, or shaped thinking about, social and economic policies. We also wanted to explore subgroup experiences and in so doing incorporate self-reported data. So the first step was to review all of the data sets that we could easily get hold of, and that was a very short project funded by the Economic and Social Research Council over the summer of 2008. Then we set about thinking about the best ways of interrogating those data sets so that we could get to the answers that we were looking for. After we had reviewed the data sets that were there, and gone away and written up some basic analyses using national-level data, we got another award from the Economic and Social Research Council, and that enabled us to spend a couple of years bringing together and collating all of the data that we had good time series runs on from the British Crime Survey and the British Social Attitudes Survey. That was then lodged, as I said earlier, at the UK Data Archive as study number 7875. So the data preparation for the collation of these data sets took around about 18 months, and that was partly because we needed to go through and check that variable names were consistent, and of course they weren't; that the values were consistent, and of course sometimes they changed from being a one-to-four scale to a four-to-one scale, as it were; and to check the consistency of question wordings over time, and just anything else that might get in the way of doing analyses over time and comparing trends in those data sets.
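As an illustration of the kind of value check described above, here is a minimal sketch of putting a flipped one-to-four scale back onto a common direction. Everything in it is an assumption made for the example: the variable name `worry_burglary`, the years in `REVERSED_SWEEPS` and the use of 9 as a "don't know" code are invented, and this is not the project's actual recoding procedure.

```python
def harmonise_scale(value, low=1, high=4, reversed_in_source=False):
    """Return value on a common low-to-high scale, or None if out of range."""
    if value is None or not (low <= value <= high):
        return None  # treat out-of-range codes (e.g. 9 = don't know) as missing
    # Flipping a low..high scale: 1 <-> 4 and 2 <-> 3 when low=1, high=4
    return (low + high - value) if reversed_in_source else value


# Hypothetical: suppose these sweeps coded the item four-to-one instead of one-to-four
REVERSED_SWEEPS = {1994, 1996}

# Invented rows, one per respondent, with the sweep year attached
rows = [
    {"year": 1992, "worry_burglary": 1},
    {"year": 1994, "worry_burglary": 4},
    {"year": 1996, "worry_burglary": 9},
]

harmonised = [
    {
        **r,
        "worry_burglary": harmonise_scale(
            r["worry_burglary"],
            reversed_in_source=r["year"] in REVERSED_SWEEPS,
        ),
    }
    for r in rows
]
```

The point of keeping the list of reversed sweeps separate from the function is that the same check can be repeated for each variable once the questionnaires for the relevant years have been read.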
Now of course there were changes in the survey designs over time as well. That's particularly true of the British Crime Survey, less so of course for British Social Attitudes, but there were changes in the survey designs, and that's one of the things that you need to bear in mind when you're analyzing the data that we've collated from the British Crime Survey, now the Crime Survey for England and Wales. This is in effect, given that it goes back to the early 1980s in the case of both of those surveys, historical data. It's data that was collected contemporaneously, but which we have been analyzing in order to tell us something about the past and about how we got from the past to where we are today, and in that sense it has all of the problems that one normally associates with historical data: missing values which can't be recovered, for example. So as I've already indicated, the data sources which we relied on were the British Crime Survey, which started in 1982 and is now known of course as the Crime Survey for England and Wales, and the British Social Attitudes Survey, which started in 1983. If you download the data sets from the UK Data Archive, what you'll find is that as well as those two data sets there are aggregate data sets from a range of other official sources. These were just variables that we were interested in in our attempts to model fluctuations in criminal justice system actions, as it were, and crime trends, and the relationship between those and other factors. So there's a whole raft of different data sources there which might throw some light on crime. What you won't find, I'm afraid, because there simply wasn't time to collate all of that data, are any of the BCS booster samples which have been collected. For some considerable time the British Crime Survey had booster samples of, for example, ethnic minorities, or additional samples of people aged 12 to 15, who normally fall outside of the typical survey age range.
We weren't able to collate those in, simply because they weren't of sufficient interest to ourselves, and so, given that we had limited resources, we focused really just on the main surveys. So the data sets include a whole range of variables. Obviously the British Crime Survey ones are principally focused on things which one would be interested in if one was explaining crime: things like the fear of crime, victimization, perceptions of antisocial behavior in the local area, perceptions of the effectiveness of the criminal justice system and so on and so forth. There are some additional interviewer assessments, for example whether the area is in a good state of repair or whether there are houses that are in an uncared-for state, those sorts of things. But the bulk of the British Crime Survey data obviously comes from the respondents themselves. When one turns to the British Social Attitudes Survey there is, as one might well imagine, a whole series of survey questions which attempt to measure people's attitudes on a number of different topics. As well as that there is data on voting, political engagement, trust, newspaper readership, those sorts of things, and both of those data sets include all of the usual socio-demographics: age, housing tenure, gender, the region the respondent was living in at the time of the interview and all those sorts of things. We've also included all of the original weighting variables, so that if people need to weight the data for various things then they have that to hand. And as I said earlier there's a whole range of other official data on recorded crime etc. So our philosophy from the very beginning has been that this data isn't ours. I mean, it isn't ours in any sense really. We downloaded all of this data from the Data Archive ourselves as individual, that is to say annual, files for the surveys. And what we've done is just collated runs of questions that are repeatedly asked. So it isn't ours.
It was never our intention to keep this fantastic resource just to ourselves. So that's why it's been lodged with the Data Archive, in the hope that other people will benefit from it as much as we have. So the data is there to be used for any analyses which you may wish to perform. Obviously you need to bear in mind all of the health warnings which I'm going to touch on in a minute, but other than that the data is there for people to use any way they wish. If you or any of your friends and colleagues want to use the data, I would encourage you to download it direct from the UK Data Archive rather than sharing it amongst yourselves, because this means that the UK Data Archive can get much more precise data on who's using it and what they're using it for etc., which of course is invaluable for the Data Archive themselves. So, the health warnings. I'm afraid to say that we don't have the resources to support users of the collated data sets after the end of the research project. The research grant that we were working on to collate these data sets ended about 18 months ago. We are still, most of us, working together on another project analyzing the birth cohort studies from 1958 and 1970, but I'm afraid to say we don't have time to respond to queries about these particular data sets, unless you happen to spot something that you think we may have done incorrectly. Also, I'm afraid to say we can't respond to queries about the original data sets, because we weren't the ones who designed them. And another thing to bear in mind of course is that all data sets have limitations: they're limited by what the original investigators were interested in back in the early 1980s. And then of course in some respects you're limited by what we were interested in. We've mainly focused on the data relating to crime, or that we think might tell us something about crime trends over time. So in some respects this is filtered twice by other people's interests.
The sampling, as I mentioned earlier, for the British Social Attitudes is fairly straightforward. It's rather more complex for the British Crime Survey, and probably what you'll need to do is to read the survey reports provided to the Home Office or the Office for National Statistics by the survey company for the years that you're particularly interested in, because the sampling varies from year to year for the British Crime Survey. Inevitably there are some missing variables, that is to say variables that we wished people had thought to ask in the early 1980s and since, or variables which for some reason just don't get asked at a particular sweep. They're very few and far between, but there are examples such as that. What one also has to bear in mind is that the counts for victimization in the British Crime Survey have been capped, so there's a limit to the number of repeat victimizations that an individual could report. How is the data structured? Well, it's important to say at this point that there's one file that contains the British Crime Survey data and another, separate file that contains the British Social Attitudes Survey; they've not been brought together into one file. What we tried to do when we were merging together all of these different data runs was to keep the data at the individual level, so each row, if you want to think of it in an SPSS or Stata file sense, is an individual who has completed the survey at some point. So looking at the table at the bottom of the screen, this is a screenshot from the Stata version of the BCS, and you can see the first variable on the left simply gives you the name of the sweep, the year that the data relates to, because of course for the crime survey it's about the previous year.
Then there's the source file, which is the name of the file that you get when you download the data straight from the UK Data Archive, and then a couple of items there from the actual survey itself. The first question is about whether people feel safe walking around in the dark on their own, and the second question, which you can see slightly clipped at one side, is about whether people worry about being burgled. So all of that data comes in that kind of format and then has the demographic data appended towards the end of it, that is to say on the right-hand side of the screen as you're looking at it. We've had to collapse some variables together, and so what we've tried to do there is to provide different versions of the same variable. So if you look there you'll see that there's two versions of tenure: whether it's rented, or whether it's rented from a social landlord or a non-social landlord. When it comes to things like income, for example, we've had to do something slightly different. Because we were looking at data that encompassed the best part of 30 years, raw income wasn't in itself particularly useful, because it was inflated so much during that period. So income has been recoded into a standardized variable for every individual, which puts them in the top quartile, the middle (the interquartile range) or the bottom quartile within the year in which they were interviewed. The data also includes a whole range of questions, and this is again another screenshot from the British Crime Survey, on worry about crime, victimization, attitudes and punishment, and that's fairly straightforward to get your head around when you start to explore it. The data on victimization includes both dichotomous measures, that is to say yes or no, they have been victimized, and also questions about the frequency with which they've been victimized. So if you look along the line there, in the middle of the screen, you'll see car damage, which is whether the car's been damaged,
and you'll see that's a binary: it has yes or no below that. Then immediately to the right of that is N car damage, which is the number of times that individual's car's been damaged, and you can see for the first couple of people it's just once, and for the people towards the bottom of the screen they've had their car damaged twice. We also produced summary measures of all sorts of victimization, so simply whether somebody had been a victim or not, the number of times they've been a victim, whether they've been a victim of a car crime or property crime, those sorts of things, so that data is appended for you there as well. Turning now to the British Social Attitudes Survey, that again is structured in very much the same way. So again we've retained things at the individual level, but with a code for the year that they were interviewed. It's very simple therefore, with recoding, to take averages for a particular variable within a year and then to compare average answers on a particular question year against year against year. So you can use this to set up, if you like, time series runs of average attitudes about a particular topic. Unlike the British Crime Survey, which doesn't include any questions about social class, the British Social Attitudes includes a lot more measures of social class, although these do vary over time, and of course a lot more attitudinal questions. The British Crime Survey doesn't really explore wider social attitudes at all; you really do need to go to the British Social Attitudes Survey to get attitudinal data, even though some of that will relate to crime itself. So what kind of analyses are possible using this dataset? Bearing in mind it's all been collated at the individual level, all of those things that you would typically associate with individual-level analyses are possible, from things like linear or logistic regressions, to negative binomial or Poisson regressions for
things like victimization. Factor analyses are possible, as are path or structural equation models, and of course t-tests, cross-tabs, anything else that you would normally perform on individual-level data is possible using this dataset, because it's been retained at the individual level. What one of course can also do is to collate that data into groups, and so here is data from the British Social Attitudes, and what we're doing is we're comparing groups from the Registrar General's classification of social class. So if you look at the bottom of the graph you see that the black line is the top three social classes and the dotted or dashed lines, I'll get my words straight in a minute, are the lower three social classes, and this is their average responses to a question on whether people who break the law should be given stiffer sentences. You see that actually the two different groups of social classes mirror each other over time: their fluctuations are very similar to one another, but they start at different points, and this is referred to in the political science literature as parallel publics. Moving on from that style of analysis, what one can also do of course is rather more complex analysis. So we've undertaken analyses looking at age, period and cohort effects in terms of attitudes or in terms of worry about crime. There's also conditional formatting, which is quite fun, and I'll come on and show you a bit about that in a moment. And then of course there are other things that you can do. Multi-level modeling is possible, although I suspect that there are probably too few regions to use, because there's only 10 or so in most of these data sets. You may be able to use lower-level output areas for some of the more recent British Crime Survey or Crime Survey for England and Wales data sets, but we didn't append those markers to this data set
because we were looking at analyses back to the early 1980s, and of course they weren't available then. So you might be able to do multi-level modeling with years; I don't think you can do it with regions, because there are probably just too few. The other thing that you can do of course is to explore what would normally be quite rare populations in any survey. So for example male victims of domestic violence, of whom you may find very few in any one year's sweep of the British Crime Survey: what one might be able to do is to pool or aggregate those individuals across a number of different years and then use them to undertake more robust analyses, the robustness coming from the larger number of cases. So if there are rare populations that you're interested in, then this might be one way of unpacking or exploring those in a little more detail. We've used data from the British Crime Survey to explore the degree to which anxiety about crime is related to popular discourses which were floating around, as it were, or being espoused by politicians at the point where one starts voting, and found that people's fears seem to be a product of the discourses that were circulating as they came of age politically. We've also used the same age, period and cohort analysis techniques to explore the extent to which generations hold attitudes which are more in keeping with Thatcherism, as it were, and that's been published in a paper in the British Journal of Political Science, where the lead author is a colleague of ours, Maria Grasso. What I'll come on to now is conditional formatting. This is a way of getting a good glance at different trends in the data set which you may wish to start to unpack and explore a bit more systematically using more widely recognized statistical techniques. So this is a plot of the answers to a question from the British Social Attitudes about whether
the gap between what rich people and what poor people have is too large. Higher values, that is to say people agreeing with that, are the cells in red, and the way to read this is that the columns are the years and the rows are individuals of that age in that year, so if you like this is a synthetic cohort analysis. If you look at the first cell in the top left-hand corner you've got people who are 18 in 1983, and if you want to follow that cohort forward, well, they're 19 in 1984, 20 in 1985 and so on and so forth along. So a cohort, if you like, should appear as a diagonal running from top left to bottom right. Now what you can see here is that as a country we felt fairly neutral about the gap between rich and poor people from really the start of the run, 1983, until early 1991. Then in 1993 there seems to be something of a change in that attitude right across all of the different generations, and that becomes particularly pronounced amongst those people that are in their early 40s in the mid 1990s, 95, 97, those sorts of years. What one then sees in 1998 is a dropping away of that concern, as I suspect people think that, well, now Labour are in, we're going to see them do something about economic inequality. But then as it becomes clear that Labour are less interested in tackling economic inequality than people had initially suspected, one sees a band running from the middle of the screen down towards the bottom right-hand corner. It's a faint band, but it's discernible; sometimes it helps if you make your eyes go slightly out of focus when you're looking at these things. And you see a warming up, that is to say an increasing number of cells that are becoming red, in the bottom right-hand corner. What's interesting is that above that, for those people who are around about 18 in 1998 and the years since, we go back to that yellow color, where people are much less concerned about levels of
economic inequality. So conditional formatting is a good way of getting a handle on some of the general trends in the dataset, but it's also quite a nice visual technique for showing to people the sorts of trends that might be hidden away in the data. Other analyses that are possible are those things that aggregate up data. So as I mentioned earlier you could aggregate up attitudes into years for groups of people or whole survey sweeps and then use that to undertake time series modeling. One could also undertake that using structural equation models. We've undertaken dynamic factor analyses in a paper that was published in Governance, and the lead author for that was Will Jennings, and of course there's also things like latent growth modeling. Now the data that we have archived runs from 1982 for the British Crime Survey, or 1983 for the British Social Attitudes Survey, through until 2012, and there are of course a number of possible extensions that other individuals may wish to make to these datasets. So one could extend the run of the years by appending some of the variables from data collection sweeps after 2012 to the data that we've collected. There's been of course another four or five years' runs of data at least made available, so that could be appended. One could also include variables which we didn't collect. There were some attitudes in the British Social Attitudes, for example, which we didn't collect, which one could go back and collate and put into the dataset that we've produced. Or of course one could add other aggregate-level variables from other datasets or from other data sources. So for example if you're interested in woundings or near-fatal criminal incidents, then one might be able to get access to NHS data from A&E wards and put that in by year or something like that. So there's all sorts of other things that one could do with the dataset in terms of adding in variables that we didn't collect or adding in additional
years that we weren't able to collect because the data wasn't yet available. Then of course what one can also do is to create new variables from what's already in the dataset. So for example you might be interested in, and here I've picked one at random, single people without a car: do they experience different types of victimization, or do they have different attitudes, compared to other people that have cars or other people that are single? So you can create new variables from what's in there and then use those to explore the things that you're particularly interested in. If one wanted to, one could go back and do one of the things that we didn't do, which was to collect the ethnic minority booster samples from the British Crime Survey data run. There are some crime surveys in Scotland. Of course the first British Crime Survey was properly a British Crime Survey and included data from Scotland, but the crime surveys in Scotland have been few and far between in the years since, so that might be a challenge, but certainly one could do it. And of course one could do similar things with other datasets. We've just explored the British Social Attitudes and British Crime Survey, but of course there's all sorts of datasets like the Labour Force Survey or the General Household Survey that one could do similar things with in a UK context, or one could do similar things in other countries, using the General Social Survey in America for example. So what are the potential uses? Well, one of the obvious ones of course is data analysis, and this is, we think, a set of data that could be used really at every level. It allows sufficiently sophisticated analysis that one could use it at post-grad level, either for PhD theses or for masters theses. I've used some of the data for teaching purposes when I'm teaching people about designing survey questions at undergraduate level, but of course there's lots of people working in
Q-Step programmes who may wish to use this themselves. And then of course if you're in the business of training non-academics in statistical techniques, then this dataset should enable you to teach people lots of different techniques using datasets that are fairly easy for them to get hold of. And of course you have to bear in mind that some of these trends were picked up by the people who went on to design the European Social Survey. The European Social Survey was initially run by the late Roger Jowell, and of course it was Roger who played a very, very big part in the setting up of the British Social Attitudes survey. So there are some questions that map straight from the British Social Attitudes on to the European Social Survey, and so there are some trends that could be explored there. The thing that you have to bear in mind with the European Social Survey is that, I think I'm right in saying this, there's now only 10 years' worth of data, and so you won't be able to do quite the kind of analysis that we've done. But certainly there's a lot more data in some respects, because there's a lot more countries and a lot more respondents. There will be of course some countries that aren't in particular years, or some questions that rotate with the rotating modules, but certainly the core questions for the European Social Survey could have this kind of collation done to them and then be used in that sense. Okay, in terms of getting the data, we lodged the data in early 2016 with the UK Data Archive, and the study number is there on the screen for you: it's SN 7875. That's free to download. You need to register with the UK Data Archive if you haven't already, but that only takes a few minutes, and then you can download it. It's available to download in Stata and in SPSS formats, and I think possibly SAS as well, and so it should fit into whichever stats package you use; of course they've become much, much easier to translate across from one to another. If
you do use it and you want to cite the award, the ESRC award that the data was based on, it's there for you on the screen. Similarly you may also wish to cite the very short paper, it's only 10 pages long, which those of us principally involved in the data collation wrote, and which is published in the British Journal of Criminology. Because it's ESRC-funded work it's open access, so you can get that for free anywhere in the world by going to the British Journal of Criminology, and that gives some of the thinking behind the project and some of the thinking behind the data sets, and again a little bit more contextual information about the wider project. So if you do go as far as downloading the data set, what you get is of course the individual-level data sets, and as I said a moment ago you get Stata and SPSS versions. There's also an access spreadsheet which you can use as basically a code book, and what you also get is the original study numbers, so that if you need to you can go back and explore the original questionnaires. What you don't get are the questionnaires and the survey reports and the technical reports and all those sorts of things that the original data sets would have had if you downloaded them individually. So if there's a particular set of survey years that you're interested in, what you might want to do is to download those questionnaires so that you know exactly which order the questions were asked in etc., because of course, as we all know, there may well be question order effects. In terms of the actual amount of data, for the British Crime Survey, well, now the Crime Survey for England and Wales, you get around about 600,000 respondents. We were fascinated when we did a very simple frequency on the age of the respondents, sorry, the year of the respondents' birth, and discovered that there was somebody in the British Crime Survey, it must have been one of the very early sweeps, who was born at some point in the 19th century. So this is a British
Crime Survey which in some respects therefore spans parts of three centuries, rather than just the one or two that we normally associate it with. You get around about 200 variables in the British Crime Survey data set. Of course in some cases not all of those respondents provide data for all of those 200 variables, so there are some questions that weren't asked early on that become, if you like, core parts of the British Crime Survey later on, and of course there were some questions, follow-ups to victimization questions for example, which not everybody gives answers to because not everybody was a victim. Now the number of sweeps that you get between 1982 and 2012 is, I'm afraid to say, 20 rather than the 30 that the years add up to, and that's because the British Crime Survey didn't run annually when it first started. It first ran in 1982, then again in 1984, there was a holiday, as it were, in 1986, and then it ran again in 1988 and then 1992, and subsequently, some time after that in the late 1990s, it becomes an annual survey. But the 30-year period is contained within 20 different sweeps of data. Of course in the five years since then there've been another five sweeps of the British Crime Survey. The British Social Attitudes is a bit of a tiddler when one compares it to the British Crime Survey. There are only around about 90,000 respondents contained in the datasets that we've collated, and that comes to around about 120 variables when one includes all of the socio-demographics and information about whether people were in receipt of benefits and so on. However that dataset encompasses 28 of those 30 or so years; the only years that are missing are 1988 and 1992, which are the years that the survey didn't run. In terms of the aggregate-level data, which again may be useful for people undertaking time series analyses, there's a whole range of socio-demographic data: there's data there on housing repossessions, number of children in care, divorce rates,
economic inequality, all those sorts of things, and so there's quite a bit to play with. There's also some data from parliamentary questions and public and policy agendas too. If you're interested in the actual research project that we're undertaking, the web page for the project is there. It's based in the School of Law at the University of Sheffield, and so if you follow that link you'll eventually get to a page devoted to the research project, where you can download some of our journal articles and the PowerPoint slides from various presentations which we've made in the past. We run an email newsletter, which won't be clogging up your inboxes because we do it fairly infrequently; we do it when there's major things to announce. If you'd like to join that email newsletter just drop me an email on that email address and I'll add your name to it. We're also on Twitter, which is actually quite good fun, and our Twitter handle is there, and again if there are major announcements relating to the project, or anything to do with British politics that we can link back to the 1980s or 1970s at some point, then we generally tweet about that there. Okay, so that's about it from me, other than to thank you for listening and to hope that you do take the time to download the dataset. Between us lodging it in February 2016 and about a week or so ago, when I last asked the Data Archive to check, there've been 46 downloads of the dataset, which I think is actually a relatively high number for a dataset, particularly one that's as complex as this, because it's not necessarily for the faint-hearted. So there certainly seems to be some appetite for datasets like this, and I hope that you've found this useful and that you do indeed use the dataset.
If you do use the dataset, be it for research or be it for teaching purposes, if you could drop us an email and just let us know, that would be absolutely fantastic. We're quite keen to find out how people have used it, and so any feedback on that would be gratefully received. Thank you very much.
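For anyone wanting to try the age-by-year grid described in the conditional formatting part of the talk, the following is a minimal sketch under stated assumptions. The field names (`year`, `age`, `gap_too_large`) and the sample rows are invented for illustration and are not the variable names in the archived files; birth year is approximated as interview year minus age; and no survey weights are applied, which a real analysis would probably want.

```python
from collections import defaultdict
from statistics import mean


def age_by_year_grid(rows, value_key):
    """Average value_key within each (age, year) cell, skipping missing values."""
    cells = defaultdict(list)
    for r in rows:
        if r[value_key] is not None:
            cells[(r["age"], r["year"])].append(r[value_key])
    return {cell: mean(values) for cell, values in cells.items()}


def follow_cohort(grid, birth_year):
    """Read one synthetic cohort off the grid's diagonal, in year order."""
    return [
        (year, grid[(age, year)])
        for (age, year) in sorted(grid, key=lambda cell: cell[1])
        if year - age == birth_year  # approximate birth year
    ]


# Invented respondents: one row per individual, with the sweep year attached
rows = [
    {"year": 1983, "age": 18, "gap_too_large": 3},
    {"year": 1983, "age": 18, "gap_too_large": 4},
    {"year": 1984, "age": 19, "gap_too_large": 4},
    {"year": 1984, "age": 18, "gap_too_large": 2},
    {"year": 1985, "age": 20, "gap_too_large": None},  # item not answered
]

grid = age_by_year_grid(rows, "gap_too_large")
cohort_1965 = follow_cohort(grid, 1965)
```

Each key of `grid` is one cell of the table (rows are ages, columns are years), so colouring the cells by value gives the heat-map style display, and `follow_cohort` traces the top-left to bottom-right diagonal for people born in a given year.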