Hi, I'm Curtis. I'm a research director at the National Centre for Social Research, where I work on our longitudinal surveys, and in particular on the NatCen Panel. What I'm going to talk about today is how we've developed this approach, the methodology involved, some evaluation of the sample quality it produces, and some of the measures we're starting to introduce to develop it in the future. It falls very much within a lot of what Patrick and Joel have been talking about already in terms of push-to-web designs, so hopefully you'll see a lot of common themes, but from a slightly different angle and a slightly different perspective.

So before I get into things proper, I'll give a little bit of background and context. Firstly, what is the NatCen Panel? It's the first probability-based research panel in Great Britain that is open to be used for data collection by the social research community. In the sense that it is random probability, it's unlike the YouGovs, the Populuses, the ICMs and the Lightspeed panels that have been mentioned earlier today. As it stands, it has around 3,500 people in it, which translates into around 2,000 people taking part in an interview at any given wave. That will shortly be increasing to around 5,500, as we're doing some fresh recruitment at the moment. It's designed to be representative of the adult population and to allow researchers to produce reliable estimates of people's opinions and behaviours in a shorter timeframe and at a lower cost than the traditional probability-based approaches currently available.

To put those traditional approaches in context in Great Britain, and as Patrick was talking about earlier, the gold standard for social research surveys is to conduct them face-to-face: randomly selecting households, then randomly selecting individuals within them, and having trained interviewers make multiple visits to households over fieldwork periods sometimes spanning months, to maximise the likelihood of contact and of getting people to take part, and therefore to minimise non-response bias. However, although this approach provides really high-quality estimates, the lengthy fieldwork periods and the costs of paying interviewers may not always be appropriate for a given project. Research budgets may simply not stretch to cover the costs of face-to-face fieldwork, or a project may wish to respond quickly to events as they unfold in the real world. Further, as response rates continue to fall and fieldwork costs continue to rise, which we've talked a little bit about already, alternative methodologies have become more appealing. For example, web panels and RDD surveys have found real popularity in the market research industry, but there are still unanswered questions about the quality of these alternatives. The use of self-selecting samples, quotas and very short fieldwork periods can lead to hidden biases in the samples, which won't necessarily be adjusted for by the very basic calibration weights on sex, age and region that tend to be used by those kinds of survey providers. And web-only fieldwork excludes a sizeable group whose experiences of and attitudes towards society may be very different from the rest of the population. There's also evidence that these theoretical concerns about the alternative approaches translate into practical ones.
For example, the British Polling Council's inquiry into the 2015 polling miss concluded that it was a result of unrepresentative samples, and a number of studies have demonstrated that probability sample surveys are consistently more accurate than non-probability surveys. It's within that context, which Joel alluded to earlier, that we're assuming random probability is the best methodology, and that's the context we're working in. Finally, although probability-based panels do exist in Great Britain, for example Understanding Society, these are mostly very large scale, and therefore still relatively slow and expensive, or dedicated to specific research projects, rather than being open to the research community as a whole. Internationally, however, this has not necessarily been the case, and in recent years a number of open probability-based panels have been developed, for example the GESIS Panel in Germany or AmeriSpeak in the United States. Although these vary in their specific methodologies and approaches, they have demonstrated the feasibility of setting up something like this and maintaining a probability-based infrastructure that can be open to the social research community as a whole.

So it was in that research context, in 2015, that the Joseph Rowntree Foundation, having already considered existing non-probability online access panels, asked NatCen to propose an approach to establish the feasibility of a bespoke panel with a high-quality probability-based sample at its core, which they would then be able to use to explore the attitudes of people living in poverty in a quantitative manner. This feasibility study ran a total of six surveys from August to November 2015, ranging from short web-only polls with a one-week fieldwork period to larger 15-minute surveys with web and telephone fieldwork lasting around a month. As well as establishing the feasibility of setting up the panel, the study aimed to experiment with different designs to investigate the best approach for maintaining a panel in the longer term. So, for example, we looked at the impact of different wordings of the recruitment question, and we compared the impact of different levels of incentives, offering people no incentive, a one-pound donation to charity, or five-pound or ten-pound incentives. As I think we've touched on a little earlier, we found that the five-pound incentives were much more effective at getting people to take part, but the impact diminished as the amount increased further. We also compared the impact of the number of reminder letters, emails and text messages that we sent out. For example, we found that if you concentrate your reminders close to the start of fieldwork rather than spreading them out over the whole fieldwork period, the final effect on response is the same, given the same number of contacts; all it does is shift the pattern of response across the fieldwork period. That's really important if you want to get people to complete very quickly, or if you would rather spread things out and spend less money on sending out those reminders. We also looked at the role of a telephone unit, comparing the relative impacts of not using telephone fieldwork at all, using our telephone interviewers to prompt people to take part online, and using our telephone interviewers to encourage people to take part over the phone.
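Just to give a flavour of how experiments like these can be evaluated, here is a minimal sketch, in Python, of comparing response rates between two experimental arms using a simple two-proportion z-test. The function name and the example counts are purely illustrative and are not figures from the feasibility study.

```python
import math

def compare_response_rates(resp_a, n_a, resp_b, n_b):
    """Two-proportion z-test for the difference in response rates
    between two experimental arms (e.g. two incentive levels)."""
    p_a, p_b = resp_a / n_a, resp_b / n_b
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, z, p_value

# Purely illustrative counts, not results from the feasibility study
print(compare_response_rates(resp_a=220, n_a=400, resp_b=245, n_b=400))
```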
Following the conclusion of this feasibility study, NatCen took the decision in early 2016 to maintain this sample of people, to open it up and to expand the panel. We felt this was a useful piece of research infrastructure that we wanted to make available in Great Britain, and this was a really good opportunity to create it. Following that decision, we then developed a standard methodology based on the findings of the feasibility study. What I'm going to do now is outline that approach. This is our standard approach, so it's fair to say that we do deviate from it occasionally, certainly if we have particular requirements for a particular piece of research, or when we're doing our own experimentation or innovation in how we run things.

The first question we needed to address when setting up the study was how to recruit people to the panel, with three key goals. We wanted a random probability design, to avoid the biases of self-selection and convenience samples and to enable the application of common statistical tools such as confidence intervals and significance tests. We wanted a high recruitment rate, to minimise non-response bias but also to provide larger sample sizes for the analysis of subgroups. And we wanted to keep costs relatively low, so that the infrastructure at the end of this would be affordable and accessible to as wide a group of researchers as possible. We initially considered fresh recruitment, which is the approach used by the GESIS Panel in Germany, but fresh face-to-face recruitment was considered too costly. We also looked at telephone, paper and push-to-web approaches, but we were concerned that the relatively low upfront response rate would bias the sample from the outset, and that these modes have some of the issues around respondents following selection protocols in self-completion modes that Joel talked about a little earlier on. We therefore decided to use a piggyback approach, recruiting panellists off the back of an existing face-to-face survey, in this case the British Social Attitudes survey. That approach had recently been used to develop Pew's American Trends Panel in the US, where they recruited off the back of a large national telephone survey, and has since been implemented in the development of the Cross-National Online Survey panel, or CRONOS, at City University.

The British Social Attitudes survey is a probability-based face-to-face survey of people aged 18 and above in Great Britain. Households and individuals are selected at random, and considerable effort is expended by our field interviewers to achieve an interview, including visiting an address multiple times. By recruiting from the BSA survey, we were able to achieve those three goals: we maintained a probability-based recruitment design; we produced higher recruitment rates than if we had used alternative, non-face-to-face modes; and we kept our recruitment costs low, because the only costs were the marginal costs of adding additional questions, with the actual fieldwork costs already covered by the original study. It also allowed us to obtain a large, rich amount of background data about our participants.
A 45-minute face-to-face interview about people's attitudes and behaviours across a range of different areas of life in Britain is really useful for us to have. Those interviewed as part of the BSA were asked to join the panel at the end of the interview, and those who agreed were asked to provide contact details. Interviewers were briefed and provided with additional materials to help them answer any questions participants might have, to make sure we were getting informed consent. Those who agreed to join the panel were then sent a further information leaflet and a letter confirming that they had joined, providing more detailed information on what taking part would involve and, again, giving them the option to opt out if they wanted to.

From that point, sampling is fairly straightforward. At any given fieldwork wave, following our standard design, we will always issue all of our panel members for fieldwork or, alternatively, a random subsample of that group. That allows us to maintain the random probability design, even though it's not necessarily the most efficient thing to do: we're not using quotas, and we're keeping that high-quality element at the core.

In terms of questionnaire development, in line with best practice guidelines, surveys run on the NatCen Panel typically last around 15 minutes. We talked a little earlier about what an appropriate length of interview is for an online survey, and we think around 15 minutes is about right, although obviously that will vary by mode and by individual circumstances. Although we have extended it on occasion, we do try to avoid making the surveys much longer, simply to minimise the burden on our panellists and any negative effects that may come as a result, be that on data quality or on longer-term panel attrition. Were a longer interview required, we would typically recommend splitting it across multiple waves, which perhaps goes back to Patrick's point earlier about shorter, higher-frequency surveys; but that's not something anyone has taken us up on yet, and people seem to prefer to cram as much as they possibly can into one survey. The questionnaires are broken up into modules, with questions from multiple research projects asked in one survey. That allows us to be more efficient in terms of the number of invite letters, phone calls and incentives sent per question asked, and it also makes sense for us logistically: we don't have to manage multiple live web surveys for the same person simultaneously. It also makes sense for the respondent, who doesn't have to log into multiple surveys at once or receive multiple different reminder letters for different projects, which can all get very confusing.

Due to the panel's mixed-mode design, a key consideration in questionnaire design and development is addressing the potential for mode effects, including how questions may appear on smartphones as well as the differences between web and telephone fieldwork as a whole. The approach to this will vary depending on the question and its background: for example, is it a new question, or is the aim to make it comparable with something that's been asked in the past? We might take a mode optimisation approach, designing different versions of the question so that each is optimised for a given mode. For example, we might not use grids on a smartphone but continue to use grids in the standard PC version of the survey.
Alternatively, we might go for a more mode-neutral approach, which involves designing the question in a way that minimises the differences between modes. So, for example, if we asked a scale question on the telephone and on the web, we might remove the visual cues from the web version of the questionnaire to keep the two as similar as possible. Where appropriate, we also use other techniques to address the impact of mode effects. For example, we often randomise or reverse answer option orders, given that primacy or recency effects may interact differently with different modes; by randomising them we can minimise the effect on the data overall.

So, as I touched upon there, the NatCen Panel employs a sequential mixed-mode fieldwork design which lasts slightly over four weeks. At the start of fieldwork, all of our active panel members are sent a link and a unique login to the survey and are invited to take part online. After two weeks, all of those who have not taken part online by that point, and for whom we have a phone number, are issued to our specialist telephone unit to be followed up by phone, either to be supported in completing online or to do the interview over the phone. As with a typical face-to-face fieldwork approach, considerable effort is put in by the telephone unit to pull these people in: we call them a minimum of six times, varying the times of day and the days of the week to maximise our chances of contact. By employing this sequential mixed-mode design within a four-week fieldwork period, we aim to strike a balance between maximising quality and maximising efficiency. By issuing all of our cases to web first, we maximise the number of cases completing online, therefore minimising our telephone interviewer costs. The four-week fieldwork period balances allowing all types of people an opportunity to take part, so minimising the bias towards early responders who are relatively time-rich and able to respond to surveys quickly, against still providing data in a timely fashion for those who want to respond quickly to unfolding events. Following up with telephone fieldwork helps us boost our overall response rates but, crucially, allows us to include in our survey sample those people who are not comfortable with, or do not have any access to, the internet.

In terms of reminders and the additional effort we make to push people online, once fieldwork begins panellists are contacted multiple times to provide them with the required information and encourage them to take part online. Multiple modes of contact are used, so letters, emails and text messages, sent on different days of the week in order to maximise the chances of reaching people. Again, this is based on findings from our feasibility study: we found that emails, text messages and letters each independently reached different groups of the population, different types of people, helping to improve the representativeness of the sample overall. We send these during the first two weeks of fieldwork to maximise the number of people taking part online; a rough sketch of the resulting contact schedule is below.
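As an illustration of that contact strategy, here is a minimal sketch of the schedule written as a simple data structure. The exact day offsets are hypothetical; the sequence itself, a web invitation, one reminder letter, two reminder emails and two reminder texts front-loaded into the first two weeks, then telephone follow-up with a minimum of six calls, follows the core design described in this talk.

```python
from dataclasses import dataclass

@dataclass
class Contact:
    day: int       # days since the start of fieldwork (offsets are illustrative)
    mode: str      # "letter", "email", "sms" or "phone"
    purpose: str

# Sequential mixed-mode fieldwork lasting slightly over four weeks:
# web-first, reminders concentrated in the first two weeks, then telephone
# follow-up for web non-respondents for whom we hold a phone number.
CONTACT_SCHEDULE = [
    Contact(0,  "letter", "invitation with survey link and unique login"),
    Contact(0,  "email",  "invitation with survey link"),
    Contact(3,  "email",  "first reminder email"),
    Contact(6,  "sms",    "first reminder text"),
    Contact(9,  "letter", "reminder letter"),
    Contact(11, "email",  "second reminder email"),
    Contact(13, "sms",    "second reminder text"),
    Contact(14, "phone",  "telephone follow-up begins: minimum of six calls, "
                          "varying the days of the week and times of day"),
]
```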
So, going back to that feasibility study finding from earlier, that if you compress your reminders early on you end up with the same total response but people take part earlier: that's what we employed here, because we wanted people to take part as quickly as possible so that we didn't have to spend time and money phoning them up. As well as these contacts, all participants are sent a £5 voucher as a thank-you for taking part. We also send feedback in the form of summary findings, so inter-wave mailings maybe twice a year or so, as well as feedback through our website, where people can find out about the latest impact that we've had. We do our best to make sure that those communications are designed to be informative and to emphasise the importance of taking part, while avoiding otherwise influencing people's behaviour or suggesting that there's any kind of preferred option or bias in our study and the project as a whole.

Finally, as with any survey, despite the efforts we go to, the NatCen Panel will suffer from non-response bias, so once the data are collected we need to adjust for this. In our design, non-response can occur at four stages: when people are recruited to the BSA survey; when people say that they do not wish to join the panel; when they leave the panel once they've already joined; and when they decline to take part at any particular survey wave. To account for the non-response at each of these stages, we compute a weight to adjust the sample to look like the population; a minimal sketch of this stage-wise adjustment is shown below. One of the key advantages of recruiting people from the BSA is that all of the panellists have a large amount of consistently collected background information, and that is really helpful in allowing us to model and adjust for non-response at each of those stages very effectively. This contrasts with one of the issues with non-probability samples, which is that their standard calibration weights on sex, age and region aren't necessarily very good at explaining the non-response and the bias occurring in their samples.

So that's a summary of our standard, or core, design. What I'll look at now is how that approach has panned out. Overall, a total of 4,205 people, out of the 7,270 people interviewed as part of our 2015 and 2016 British Social Attitudes surveys, agreed to join the panel, representing a 58% recruitment rate. As I mentioned earlier, we have varied the way that we invited people to join. In 2015 we ran an experiment, asking around half of the sample if they would like to join the panel and the other half if they would simply be interested in taking part in further research or follow-up studies. This chart shows that BSA participants were substantially more likely to agree to be contacted for follow-up studies than they were to join the panel when asked specifically. However, this really large difference is somewhat offset by the fact that participants who agreed to join the panel are significantly less likely to subsequently attrit from the panel and significantly more likely to actually take part in the surveys once they've been recruited.
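Before turning to attrition, here is a minimal sketch of that stage-wise weighting idea: estimating a response propensity at each stage from the BSA background data and weighting by its inverse. The column names are hypothetical placeholders, and the modelling choice (a simple logistic regression) is an assumption for illustration rather than our exact production specification.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def stage_weight(df: pd.DataFrame, responded: str, covariates: list) -> pd.Series:
    """Inverse-propensity weight for one non-response stage (e.g. agreeing
    to join the panel, remaining a member, responding at a given wave),
    modelled on the rich BSA background data."""
    X = pd.get_dummies(df[covariates], drop_first=True)
    model = LogisticRegression(max_iter=1000).fit(X, df[responded])
    propensity = model.predict_proba(X)[:, 1]
    return pd.Series(1.0 / propensity, index=df.index, name=f"w_{responded}")

# Hypothetical BSA background variables
covs = ["age_group", "sex", "region", "education", "tenure", "nssec"]

# Conceptually, the final wave weight chains the stages together:
#   BSA weight x P(join panel)^-1 x P(still a member)^-1 x P(respond at wave)^-1
# with each propensity estimated on the cases eligible at that stage,
# followed by trimming of extreme weights and calibration to population totals.
```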
Looking first at attrition: as with all longitudinal samples, the NatCen Panel is subject to panellists either deciding that they wish to leave the panel or, for some reason or other, becoming ineligible, for example if they die or leave the country. As of July 2017, we had a total of 3,666 people still members of the panel, giving an overall attrition rate of 13%. Interestingly, the chart gives an indication that attrition doesn't really tend to slow as the project progresses. This only goes up to July, and it has actually started to plateau a little since, but attrition seems to be very much an ongoing issue, more so than we see in other longitudinal studies. We think that's a reflection of the high levels of contact and engagement that we have with our panellists: they have more opportunities to say that they don't want to take part any more. What this chart also shows is that the attrition rate varies by invite group, with those recruited to follow-up studies, the pink line at the top, far more likely to leave the study than people who were asked to join the panel in 2015, the orange line below, or in 2016, the blue line, which obviously needs to be shifted back along if you want to compare them. As well as this, the recruitment rate difference is offset somewhat by differential survey response rates.

When we talk about response rates, we use two different measures to track our fieldwork. The first is a survey response rate, which looks at the proportion of all participants invited to take part in a survey who do so. The second is an overall response rate, which goes all the way back to the original BSA sample frame and looks at the proportion of participants eligible for the initial BSA interview who actually take part in a given wave. The former is really useful for understanding how response rates are changing between our surveys: typically around 60% of the people we invite to take part in a survey on the panel will do so, and that has remained very stable across our waves. This chart uses those survey response rates to show that participants recruited to join the panel consistently have a survey response rate of around 64%, while those agreeing to follow-up studies typically have a lower survey response rate, of 53%, again offsetting somewhat that different initial recruitment rate. To see the cumulative effect of all of these different points of non-response, we can use the overall response rates, which account for the non-response that occurs at each stage of panel recruitment: BSA participation, recruitment to the panel, attrition and non-response at a given survey. These overall response rates are again broadly consistent at waves where our standard design has been used, at around 15% or 16% overall. However, this chart also suggests that, while broadly stable, the overall response rates are actually very gradually declining, by about 0.4 percentage points per wave. The fact that this pattern is not seen in the survey response rates tells us that it is being driven by the underlying attrition I mentioned a couple of slides ago. That said, the level of attrition we see is much higher than would be indicated by the very small drop-off in the overall response rates.
That's because around two-thirds of the people who do attrit from the panel never actually took part in any of our surveys: those people who have left the panel were not taking part in our surveys anyway, so they're not really impacting on our response rates at all. Going back to the variation between the invite groups, this chart shows that, while it does vary from wave to wave, the overall response rates for those agreeing to be contacted for follow-up studies are actually consistently higher than for those agreeing to join the panel, suggesting that this may have been the more effective approach, and that we made the wrong choice in 2016 about which invite wording to use. However, this higher response rate needs to be balanced against managing a panel that is about one and a half times the size it would be under the alternative approach. That leads to larger costs from sending out more invite letters and more reminder letters, and from having our telephone interviewers call more people. And we're not actually convinced that this two or three percentage point improvement in response rates has any impact on the sample profile and sample quality behind it, which brings me on to my next slide, on sample composition.

Although response rates are a helpful proxy for sample quality, under the assumption that as response rates decrease non-response bias increases, they do not actually show whether or how the underlying sample is biased. Looking in detail at the sample, using the background information collected on the BSA, we can see that there are biases in place. We can see, for example, that people who take part in panel surveys are more likely to be women, more likely to be older, more likely to be in managerial and professional occupations, to have a degree and to own their own home. But for the most part these are the kinds of biases that we see in all of our survey samples, and are really just a continuation of the bias that already existed when people took part in the British Social Attitudes survey. In fact, there are some instances where the subsequent non-response improves the profile, so we're more representative in terms of household type. But of course there are some instances where we make things slightly worse: socio-economic classification, level of education and tenure in particular stick out as variables where the additional non-response that occurs when people are recruited to the panel and take part in panel surveys further biases the sample. However, while these biases exist in the underlying sample, the non-response weights are really effective at removing them: when we compare the weighted population estimates from the panel sample with those from the British Social Attitudes sample, we find that they're very similar, and the weighting addresses most of these non-response biases quite well.

While demographics are important, we also try to look at non-demographic variables related to survey outcomes, and this will often vary from wave to wave. If we were doing a survey focusing on health, we might try to understand what bias there might be in terms of whether our sample has a long-term health condition; or, if we were doing something on political attitudes,
we might look at our background profile data to see how the sample varies in terms of interest in politics. Over the years we've also run a number of questions duplicating those asked on high-quality probability sample surveys, and our estimates are very comparable to those. But, and I think Joel alluded to this earlier, I don't really like using other probability-based surveys as a benchmark for the estimates we're producing. For a start, it's really difficult to disentangle any mode, sample or timing effects; and it might be that both probability samples have exactly the same underlying biases, in which case we're not really finding out anything true. We are currently undertaking a benchmarking exercise to find some hard measures that we can validate against, although unfortunately population figures for non-demographic, i.e. non-census, variables are relatively rare.

One thing that we have tried to use as a reasonable proxy for attitudes in the population is how people vote. This chart gives an indication of how the weighted estimates from the panel survey compare with the actual results of the 2016 EU referendum. Here we can see that the panel estimates for the direction of the vote are actually very close to the actual figures for the population, but the level of not voting is significantly underestimated. This reflects, again, established biases that we tend to see in surveys, in that those who take part are more likely to be engaged in society and more likely to want their opinions heard. Just for a bit of context, the British Election Study, run on the YouGov panel, produced post-vote non-voter estimates of around 7% for the 2016 EU referendum. So at least this gives some indication that, although we're not going the whole way and we're not matching the population perfectly, the additional effort we put in to pull in people who are disengaged is improving our sample somewhat and making a real difference to the estimates we produce.

But clearly, looking at every single socio-demographic characteristic every month and a half, when we run another survey, to try to understand the bias just isn't the most convenient way to work. So, to make things more accessible, what we're currently developing is an adjusted R indicator, which produces a summary score between 0 and 1 to show how the unweighted panel sample compares to the population, which we assume to be the weighted British Social Attitudes survey estimates. We use this across a range of social and demographic variables to try to understand the bias in a little more detail. The idea is that this will give us a simple metric that we, and anyone else using our data, can use to monitor changing sample quality over time in a more informative manner, still alongside the overall response rates. However, what we're particularly keen on doing is using these indicators for a more in-depth understanding of how non-response bias occurs.
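For reference, the standard propensity-based R-indicator that ours builds on can be sketched as follows. This is the textbook form, one minus twice the standard deviation of estimated response propensities, rather than the exact adjusted version we're developing, and the variable names are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def r_indicator(df: pd.DataFrame, responded: str, covariates: list) -> float:
    """Propensity-based representativeness indicator:
    R = 1 - 2 * sd(estimated response propensities).
    A value of 1 means response is equally likely across the covariates
    (a balanced responding sample); lower values mean more variable
    propensities, i.e. a more selective, less representative response."""
    X = pd.get_dummies(df[covariates], drop_first=True)
    model = LogisticRegression(max_iter=1000).fit(X, df[responded])
    rho = model.predict_proba(X)[:, 1]
    return 1.0 - 2.0 * np.std(rho, ddof=1)

# e.g. r_indicator(panel_frame, "responded_wave",
#                  ["age_group", "sex", "region", "education",
#                   "tenure", "internet_access"])
```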
So, for example, rather than just applying them to a given survey wave, we can apply the measures at the recruitment-to-BSA stage, at the recruitment-to-panel stage, and again after attrition. That allows us to measure at which stage of the non-response process the sample becomes biased, and to target our interventions a little more effectively. We can also use them to analyse subgroups. For example, for those differences we were looking at based on the recruitment question, the R indicators let us ask whether that difference of two percentage points has made any difference to the sample profile, or whether the two groups are actually comparable and it's not worth the additional money and effort for a slight improvement.

In this context of focusing on the representativeness of the survey sample, we have continued, ever since the initial feasibility study, to embed experiments at multiple waves of our fieldwork: for example, on the impact of inter-wave mailings, on the design and structure of our communications, on question design and on weighting design. One area we've particularly been focusing on more recently is targeted design, that is, using the survey data and paradata collected from previous waves to target our fieldwork approach in a more efficient manner. This can be fairly simple. For example, after around four waves of fieldwork we typically have a fairly good idea of the people who aren't going to take part in our surveys online any more. So, while we still always invite people to take part online first, what we now do is start the telephone fieldwork for that group a little earlier, which gives them more time to take part in their preferred mode and also takes some of the pressure off our telephone unit while they're working the rest of the sample in that quite short two-week period. Another example: we have a relatively small proportion of our total sample, around 7%, who do not provide us with a phone number, either landline or mobile, so our telephone unit obviously can't contact that group. With our standard approach, that would essentially mean that after the initial two weeks we just leave them alone for two weeks while the web survey is still open, which is quite wasteful. So what we now do is send an extra reminder letter and an extra reminder email to that group, to try to encourage them to take part online during that time.

However, we're also using targeted design in a more sophisticated manner, to try to improve the representativeness of the sample. After a few waves of fieldwork, we can again use that rich background information from the original British Social Attitudes survey to understand the biases that exist in our sample, which I alluded to earlier. Using that information, we can shift our resources away from those who are over-represented on those variables and towards those who are under-represented, decreasing and increasing their response rates respectively and therefore balancing the sample profile overall. However, we make sure that we do that within the following two constraints.
Firstly, we need to make sure that our overall response rates are maintained; we don't want them to drop overall, because otherwise we could essentially cheat, just taking effort away from the people we're over-representing and making everything look better when it actually isn't. Secondly, the approach needs to be cost-neutral. If costs weren't an issue then clearly we would just put far more effort into improving everybody's response rates, and that would be great; but this is a targeted approach designed to enable a more efficient allocation of resources to balance the sample profile within a fixed budget.

To implement this, the first thing we did was to model the extent to which an individual sample member has characteristics that are over- or under-represented in a panel survey. This model gives every participant a score, with scores greater than one indicating that they are over-represented and scores less than one indicating that they have characteristics that are typically under-represented. For simplicity, we then grouped that continuous score into eight groups, from most over-represented to most under-represented, indicating where we wanted to move resources from and to. However, simply moving resources in that way is not actually that efficient, so to make the approach more effective we also incorporated information on past participation, grouping panel members into those who have never taken part, those who always take part and those who sometimes take part. While these two measures are obviously related, they don't quite capture the same thing. By combining the two, we were able to make sure that our resources are focused on individuals where they are likely to make a difference, and taken away from those where they are not. For example, we don't want to spend additional resources on people who always take part anyway, or on those who are never going to take part, because it's unlikely to have much of a positive effect. Likewise, we don't want to take resources away from people who only sometimes take part, as that is more likely to harm response rates overall; taking resources away from those who never take part, or who always take part anyway, is a lower risk. This chart summarises how we combine those two variables to identify five priority groups; a rough sketch of the grouping logic follows below. We have our highest priority group, the big upward green arrow, of people who sometimes take part and are the most under-represented. We then have a high group, the smaller upward green arrow, of people who sometimes take part and are under-represented, but not the most under-represented. We have a medium group, who are either under-represented and always take part, or over-represented and sometimes take part, and so on and so forth.
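A minimal sketch of how those five priority groups can be derived from the representativeness score and past participation is below. The cut-off values and the handling of the bottom two groups are illustrative assumptions rather than our exact production rules.

```python
def priority_group(repr_score: float, participation: str,
                   most_under: float = 0.7) -> str:
    """Assign a fieldwork priority group by combining a representativeness
    score (>1 = characteristics over-represented among responders,
    <1 = under-represented) with past participation
    ('never', 'sometimes', 'always'). Thresholds are illustrative."""
    under = repr_score < 1.0
    if participation == "sometimes" and repr_score < most_under:
        return "highest"    # most under-represented and persuadable
    if participation == "sometimes" and under:
        return "high"
    if (participation == "always" and under) or \
       (participation == "sometimes" and not under):
        return "medium"     # keeps the core fieldwork design
    if participation == "always":
        return "low"        # over-represented and take part anyway
    return "lowest"         # never take part: lower-risk to pull effort from

# Note: panellists who report having no internet access are manually moved
# out of the bottom groups so they still get the telephone follow-up.
```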
This table then summarises how we've been varying the fieldwork design for those five groups. The neutral group, who also happen to be the largest, keep the core design: a £5 on-completion incentive, one reminder letter, two reminder emails, two reminder text messages and a minimum of six calls from the telephone unit. We then change this to varying degrees. So far, the highest and lowest groups receive the most extreme changes: the highest group's incentive is raised to £10 and they receive an additional reminder letter, while the lowest group are not issued to our telephone unit at all and receive no reminder letters. The high and low groups get more modest variations in between. Just to note, we manually move everyone who says they don't have access to the internet out of those bottom two groups, to make sure they're still given the opportunity to take part.

So what impact is this having? This table shows figures for the last wave where we didn't use the targeted design, in July, and then the first two waves where we have. Firstly, as we'd expect, the survey response rates for those identified as highest priority during the August and October waves are lower than for the sample as a whole. However, once we started implementing the targeted design, we can see those response rates lift somewhat. It's also interesting to note that the change to a targeted design didn't seem to have much effect on the survey response rates of the other groups. As well as this, we can see the impact on the representativeness of the sample overall. Firstly, the removal of effort from elsewhere has not harmed the overall response rates, which have gone 15, 14, 15, so basically stayed about the same. Secondly, the adjusted R indicators, which I've marked with asterisks because they're still in development but which we think give a rough indication, are moving in the right direction. And because they are still in development, I've also included the design effect due to weighting, which is equivalent to the weighting efficiency Joel was talking about earlier (a minimal sketch of that calculation is below). We can see that by employing a targeted design we're reducing the amount of work the weights have to do, which indicates that we do have a more balanced sample profile. So overall, I'd say our initial analysis suggests that it certainly isn't doing harm, and it does look like it's working, but the impact seems to be somewhat modest, and we want to think about how we can enhance it. It may be that we are focusing too much on the extremes of the sample: while that is where we can have the most impact, the most under-represented groups are relatively small in number, so any impact may be somewhat muted. Alternatively, it may reflect the fact that the intervention is taking place quite late in the non-response process: if a lot of the loss of representativeness is happening at recruitment to the BSA or at recruitment to the panel, then this intervention is occurring too late, and we might want to think about how we can move it further upstream.

We are also now looking at potential ways to further leverage our available paradata and target our fieldwork design to individuals. Very quickly: we have a lot of paradata on how people open emails and text messages, and we could use that to better time, and vary the number of, emails and text messages to particular groups of people who we know interact with them. Although I will say that when we've tried this in the past it hasn't been very effective at all, and the use of letters is really important, even just as a support to the emails and the other bits of communication. We have also in the past asked participants for feedback on what they like about the panel, what they dislike, and what motivates them to take part.
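For reference, that design effect due to weighting is essentially the Kish weighting-loss formula, driven by how variable the final weights are; a minimal sketch, with the function name being mine rather than anything official:

```python
import numpy as np

def weighting_design_effect(weights) -> float:
    """Kish design effect due to unequal weighting:
    deff_w = n * sum(w^2) / (sum(w))^2 = 1 + CV(w)^2.
    A value of 1 means the weights do no work (a perfectly balanced
    sample); larger values mean the weights are correcting more
    imbalance and the effective sample size is smaller."""
    w = np.asarray(weights, dtype=float)
    n = w.size
    return n * np.sum(w ** 2) / np.sum(w) ** 2

# Effective sample size under weighting is n / deff_w,
# e.g. weighting_design_effect(final_wave_weights)
```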
Going back to that participant feedback, we can again use it to model and target our approach. For example, for people who say that incentives are important to them we can raise the amounts, and for people who say they aren't important we can lower them; for people who say they're motivated to take part for civic engagement reasons, we can build our communication messages around that. These are some of the kinds of things we're trying to explore, and again it's about being more efficient in using our existing data and the huge amount of information we hold about the people we're working with.

One of the key features of the NatCen Panel is its use of telephone fieldwork after an initial period of web fieldwork, and this has two purposes: firstly, to improve response rates by engaging people who are typically disengaged from surveys, and secondly, and I think more importantly, to give those who do not have internet access the opportunity to take part. So what impact has that had? This chart looks at the impact of telephone fieldwork on the overall response rates, and while it varies a little from wave to wave, we can see that it generally adds around three percentage points. Interestingly, we find that those recruited with the 'join the panel' wording are actually less likely to take part on the phone, which may reflect their overall higher levels of engagement. What you probably can't see from this chart, but can see if you look behind it at the data, is that there is some indication that the impact of telephone fieldwork is declining over time: a smaller proportion of all of our completed interviews are being done on the phone. We have a few different hypotheses for this, although we're not entirely sure. Firstly, it may be that panel members are becoming more engaged and more used to the process of accessing an online survey, simply getting better at it, so they don't go through to the telephone fieldwork stage. It might also reflect the fact that the less engaged participants, who typically take part over the phone, have now attrited, so the overall sample profile is changing, which results in a changing profile here. Or it might reflect the fact that once somebody has taken part in a panel survey we are more likely to have their email address and mobile phone number, so we are better able to reach them through email and text message reminders and improve the web response rates. But we're not sure.

So, although boosting the response rate is important, and that is part of the reason we issue all of the cases that haven't yet taken part to the telephone unit, the really fundamental role of the telephone fieldwork, methodologically, is to include people who don't have access to the internet or don't like to use it. Overall, around 11% of the people recruited to the NatCen Panel reported not personally having access to the internet, which is roughly in line with the 13% or so that Joel mentioned earlier. They are significantly different from the rest of the population in terms of their demographics, so their age, their income and their level of education, but also in the attitudes and behaviours that they exhibit.
For example, they were more likely to think that Britain should leave the EU and less likely to think that the government should be spending money on education, so it has a real substantive impact on the things we're trying to measure. This chart shows the overall response rates again, for waves where we've used the standard design, but split between those with web access and those without. It firstly shows that, overall, those with web access do have higher response rates than those who don't, which is not surprising given the web-first nature of our methodology. But it also shows the effectiveness of the telephone fieldwork in lifting the response rates for those without internet access. Mirroring what Joel found, there are some people who said they don't have internet access but then go and take part online; I haven't looked into why that is too closely, I just accept it, they're taking part in the survey, and that's great. But a much larger proportion of them take part over the telephone, and that makes a big difference in making sure this group is included.

Personally, I think that one of the key unanswered questions for probability panels, and I guess for push-to-web methodology in general, is how we include the offline population. Our current view is that, within the framework of total survey error, the benefits for sample representativeness outweigh the problems of mode effects; but that's not something that's established or definitive, it's just our current judgement on the issue. When we look for alternatives: face-to-face follow-up, perhaps self-administered, is obviously very appealing, but would end up being far too expensive. Paper has also been used historically, but again it creates limitations in terms of dependent interviewing and the amount of data you can collect for that group. Web enablement, where you provide a tablet to people who don't have access to the internet, certainly seems to be where things are going: it's currently used by Pew on the American Trends Panel, and it's being used by the CRONOS panel at City University. That approach is really attractive because it's cost-effective, it's quick, and it removes the mode effects. However, it only helps where the problem is internet access rather than an unwillingness to use the internet, which is the problem for a lot of people, and it doesn't have the added benefit of pulling in those disengaged people and boosting response rates overall. So, while I suspect that in the longer term that's probably where we'll end up going, for now we're focusing on making sure the design of our questionnaires is relatively mode-neutral and minimising the mode effects. Obviously we can do something about how people interact with a questionnaire, but if somebody doesn't take part in the survey at all, there's not much we can do about that.

I'll now talk about some of the different ways we have used the panel over the past couple of years. When we first decided to develop and expand the panel, we envisioned that it would operate fairly simply as a cross-sectional vehicle, and for the most part that is what it has done, with the usual descriptive, bivariate, regression and segmentation analyses, working with academics, charities and government departments.
However, I didn't include this section just to talk about the things we were expecting to do. Although the panel was designed with cross-sectional research in mind, as you'll hopefully have picked up by this point it's actually more akin to a longitudinal study in its design, as we're essentially maintaining a panel of the same people over a period of time, and as such longitudinal analysis is possible. For that to work, though, we need a relatively high re-interview rate, that is, making sure that people who take part at wave 1 of a longitudinal study also take part at wave 2, wave 3, wave 4 and so on, for however long you want the analysis to run. Looking at the NatCen Panel waves conducted between November 2016 and May 2017, we found that around 85% to 90% of people who took part in any one wave also took part in any other given wave. This reflects the fact that we have a high within-panel response rate, with a section of panellists who take part pretty much all the time, a section who never take part, and a relatively small group of people in between. This chart shows how that re-interview rate accumulates should longitudinal analysis be required at literally every single wave: 76% of people who took part in November 2016 also took part in February 2017 and in May 2017. And if I extended this further, all the way up to October 2017, we'd find around 60% of people would have taken part in every single one of those seven waves, which translates to around 1,500 interviews with seven uninterrupted waves of panel data.

So far we've used the panel longitudinally in a few different contexts. We've used it pre and post the EU referendum and the general election, to try to understand voter swing during the campaigns, to understand who did and didn't vote, and to analyse the effectiveness of turnout weights. We'll also be using it for a well-being study, where we'll try to understand the impact of shorter-term changes on people's overall well-being. And we also use it, although perhaps this isn't truly longitudinal, to follow people up in a qualitative manner to understand their survey answers in more depth. So we haven't used the panel in a longitudinal manner a huge amount so far, but I think that ability to get at very short-term longitudinal change for a large number of people could be a potentially very powerful tool and something we want to explore further.

The second way it's been used that I'd like to talk about is in implementing experimental designs. By that I mean, and this is by no means a formal definition, randomly allocating participants to different treatment groups where they experience different inputs, and then collecting output measures in order to be able to infer causal relationships between the variation in those inputs and the variation in those outputs. We've used this in a few ways. Obviously a lot of the methodological experimentation falls under this category, but we also do it within the questionnaire.
So, for example, we've done this for online questionnaire testing, randomly splitting the sample in half, presenting each half with a different version of a question, and then evaluating how the estimates vary to get an understanding of the impact of the change in wording. We also follow up with different probes: if we change the phrasing of a question slightly and then ask people to say what they thought we meant by a particular word or phrase, we can see whether the different wordings allow respondents to understand the question in a more consistent manner. We've also done work presenting respondents with vignettes, where we randomly vary elements of the vignette and ask follow-up questions. For example, we did this for attitudes to benefit claimants, randomly varying different characteristics of the story, so the person's sex and age, but also things like their amount of time off work and their level of qualifications, and asking people to what extent they thought that person deserved to receive unemployment benefits. One of the advantages of using the NatCen Panel for this kind of work again draws on that background information, in that we can stratify the random allocations to the different experimental groups, making sure they are better balanced.

The final thing I'll mention, although it's not something unique to the panel, but I think it fits in well with some of the things Patrick was talking about earlier, is asking people for consent to link their survey data to other records. We've done this so far on a study I was working on, where we asked participants for consent to link their survey data to their Twitter data. This was a pilot study to test the feasibility of doing so, and we used it to run some analysis of how different voter groups, which we identified from the survey data, talked differently about the election online; that kind of analysis just wouldn't have been possible using the two sources of information separately. It was a really interesting study and a really interesting pilot process to go through, but the reason I mention it now is that, even with a relatively engaged online panel sample, the consent rates to data linkage among those taking part were quite low. This is something we see semi-regularly across our online surveys: people are less likely to consent to these kinds of data linkage online. And this is important in the context of the development of survey research: if we're consistently pushing to use online data collection methodologies, but also pushing to use big data more effectively, then those lower consent rates are a really important issue that we need to address.

Okay, to summarise and talk a little about our next steps. In summary, the NatCen Panel was set up in the context of what we identified as a lack of a high-quality probability-based option for projects with smaller budgets or requiring faster-turnaround research. The panel was recruited off the back of the British Social Attitudes survey to keep a probability design and higher recruitment rates without costing the earth. It uses a sequential mixed-mode design to help keep costs low and speed up fieldwork, but also to ensure that everyone has an opportunity to take part,
in particular those without internet access. However, there are some question marks over the impact of mode effects in this design, and we need to do further work to look into that. The surveys do seem to produce robust estimates, with much of the bias that we find being accounted for by the weighting, but we're also developing our adjusted R indicators to improve our ability to monitor sample quality and to target design approaches to improve it. Generally, we've found that the panel can be used for quite a wide range of types of project, more so than we'd initially envisioned, and we're exploring how we can exploit it further, in particular with regard to longitudinal projects.

Also to note, one of the things I haven't really touched on at this point is that there's a huge amount of logistical and systems work going on in the background. We typically work to a timetable of going from a signed-off questionnaire to a fully cleaned and weighted dataset within around eight weeks, and if you've worked on probability-based surveys you'll know that that's pretty quick. The only way that is possible is because we have very efficient systems and processes set up behind the scenes to manage the sample, send out reminders, clean the data, et cetera. And actually a huge amount of the innovation that goes into this approach is around the development of those kinds of systems, rather than the things we might typically consider methodological and that I would present at events and conferences like this. So it's very much worth considering the amount of work that goes on behind the scenes on those kinds of things.

In terms of the future, clearly we want to continue with the experimental work we're doing and, when I get the time, write it all up. It's been a really great opportunity for us as a research organisation to develop our methodological understanding and to run a whole range of different experiments and tests that we can then spread out across our other projects. A big part of that experimental work will be finalising the adjusted R indicators, to help us thoroughly evaluate the work that we've done and guide where we want to go next. The targeted design is also something we're particularly keen to develop, in particular how we can make it more impactful: should we increase the interventions, or should we move them further upstream? And how can we use the rest of our paradata to better improve our sample quality? We're also keen to develop our understanding of mode effects. We're currently doing some experimental work to try to understand and measure these, and we're also working closely with the people at City University who run the CRONOS panel to compare our two panels and to see how their web-enabled approach compares to our telephone follow-up approach. We also want to do more analysis on the impact of conditioning on panel estimates. The analysis so far seems to suggest that attrition isn't causing us a problem in terms of our sample profile, but there's a question about whether being on the panel itself affects how people answer questions. For example, if you repeatedly ask people what their views are on Brexit, does that actually end up changing their views on Brexit? Do they get better at satisficing? Or do they actually get more thorough in how they answer the questions?
These are some of the things we want to work out, and because we have separate cohorts of people recruited in 2015 and 2016, we can start to make some of those comparisons and tease out those effects. We're also trying to expand how we use the panel; I've talked about longitudinal research, but also data linkage, qualitative follow-ups and so on. And finally, we're thinking about how we can expand the panel further, to provide larger sample sizes overall and more robust analysis of smaller groups. We have, for example, set up a similar panel for people based in Scotland, and we're looking into what we can do in Northern Ireland and Wales to make sure we're better able to represent the different nations of the country. But we also want to think about how we can improve and expand the core panel as a whole.