I think we'll make a start, so welcome back everyone to the fourth and final session of today's LFS and APS user conference. You are here now in the parallel session for methodological innovations. If you'd like to join the other parallel session on, I think, subjective wellbeing, then please follow the link in the chat; the passcode is there as well. We've got two fascinating papers lined up in this session, and Lorena and Xabi are making a start. Lorena Cross-Sarano is a senior economist in the Arup City Economics and Planning team. She has a master's in public administration from the London School of Economics and over five years of professional experience in infrastructure, policy analysis and governance. Lorena was part of the team in the Freeports Monitoring and Evaluation project which produced the Power BI dashboard. Xabi Pogoni is a senior economist in the Arup City Economics and Planning team as well. He has a PhD in urban and transport economics from Imperial College London and more than nine years of professional experience in economic research, analysis and evaluation. Xabi leads the data and quantitative evaluation workstream of the Freeports Monitoring and Evaluation project which produced the Power BI dashboard. They're both now presenting their paper titled 'UK Freeports Monitoring and Evaluation: the Innovative Power BI APS Dashboard'. So if you're both ready, I'll hand over to you.

Yes. Hi everyone. Thank you very much for inviting us to this session. I am going to share my presentation; hopefully it's going to work. Yeah, I think it works. So my name is Xabi Pogoni. I'm, as Martino mentioned, a senior economist in the Arup City Economics and Planning team. This is work that was done by several people; it's only me and Lorena presenting it, but as a team we produced this dashboard and the underlying methodology for the UK Freeports Monitoring and Evaluation project.

I think it's important to first talk a little bit about the program, which might not be familiar to everyone. This is a project which monitors and evaluates the impact of the UK Freeports program. Freeports are special areas within the UK which are subject to different economic regulations. It is essentially a spatial policy offering three main types of benefit. First, if a firm moves into one of these areas, that firm receives tax reliefs. There are also customs benefits, and there is seed capital funding available for local infrastructure projects. There are currently ten freeports in the UK: eight in England and two in Scotland. I think the reason this is a really interesting project is that in most cases, including much of what we've seen at this conference today, there is a policy which happened a couple of years back, we wait a couple of years, and then we do an evaluation. We want to understand exactly what the impact of that particular policy was, and how far its aims and objectives were actually reached, and then we want to feed this back so that the next time a decision is being made on a policy, we learn from previous experience. What we do here is a little bit different.
Essentially, the monitoring and evaluation strategy started almost at the same time as the policy itself, which means we are also really interested in monitoring the results. This is why we created the dashboard, and this is also why we created these synthetic freeport areas across the UK. It also means we have a real opportunity: if the evaluation produces early findings, we are able to feed these back, and the Department for Levelling Up, Housing and Communities, the UK government department responsible for this program, can potentially tweak the policy to make sure we are maximising public value. So this is a really exciting opportunity, I think, for a researcher to get involved with what's happening on the ground.

As I mentioned, there is a large set of measures here. If you're familiar with the literature, you will recognise this as a spatial policy which is an interesting mix of investment zones, special economic zones and free trade zones, but a new type that was specially designed for the case of the UK. It has three important goals. The first is to promote regeneration and job creation, and this is the government's lead policy objective. The second is to make these places hotbeds for innovation: to ensure clustering of economic activity is happening there, and not just any type of clustering, but one producing high-value jobs and green jobs as well. The third is to make these areas national hubs for global trade and investment across the UK.

As I mentioned, the purpose of the monitoring and evaluation strategy is to assess the effectiveness of the policy, understand what works and why, evaluate whether these objectives have been met and to what extent, and see how much of this is actually the direct impact of the program rather than background growth or some other economic shock. It is also important to note that what we are doing is not just evaluation; we are also trying to work together with the ministry to make it easy for them to engage with and understand not just our final outputs, but also how we got there. This is why we created the first iteration of this Power BI dashboard. The reason we think this dashboard can be really interesting is that instead of just producing a report or a slide deck with many, many charts, it gives experts within the ministry the opportunity to explore the data themselves, filter it, and create the charts that are interesting to them. It's also important that once you create a dashboard, you have an effective data pipeline behind it. We wrote everything in Python and tried to automate as much of the process as possible, because every six months we update the whole dashboard as new monitoring data comes in, and we produce monitoring results based on that. As I mentioned, this is a project funded by the UK Department for Levelling Up, Housing and Communities, and it's not just our company working on it: we are leading a consortium which includes Cambridge Econometrics and Technopolis as well.
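As a rough illustration of what such an automated six-monthly refresh could look like, here is a minimal Python sketch; the file names, indicator names and columns are hypothetical illustrations, not the project's actual pipeline.

```python
# A minimal sketch of an automated refresh step, assuming hypothetical
# file names and columns; an illustration, not the project's pipeline.
from pathlib import Path

import pandas as pd

DATA_DIR = Path("data")

def refresh_indicator(indicator: str) -> pd.DataFrame:
    """Append the latest six-monthly release to the existing series."""
    existing = pd.read_csv(DATA_DIR / f"{indicator}_history.csv")
    latest = pd.read_csv(DATA_DIR / f"{indicator}_latest.csv")
    combined = (
        pd.concat([existing, latest], ignore_index=True)
        .drop_duplicates(subset=["freeport", "period"], keep="last")
        .sort_values(["freeport", "period"])
    )
    # Power BI reads this flat file as the dashboard's data source.
    combined.to_csv(DATA_DIR / f"{indicator}_dashboard.csv", index=False)
    return combined

if __name__ == "__main__":
    for indicator in ("total_jobs", "employment_rate"):
        refresh_indicator(indicator)
```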
I think there is one very interesting part of our project, and that's this no-freeport-program counterfactual. In many cases in econometrics, or in any kind of evaluation, we do a post-evaluation: using data from both before and after the implementation of the policy, we try to create a no-policy counterfactual against which we measure the impact of the policy. We take this idea and use it already for monitoring. To do this, we came up with this synthetic freeport method, where we use a propensity score matching algorithm and, for each of the freeports, identify areas which are similar to the actual freeport area but not directly impacted by it. We create weighted averages of three, four or five local authorities from elsewhere, in this case England and Wales, to make sure we have a good counterfactual for each of the freeports. So in the case of England, for each of the eight freeports we created a synthetic freeport. Hopefully this helps us going forward: each time we gather a new set of monitoring data, we gather it both for the freeports themselves and for the synthetic freeports, and we can argue that the observed difference is most likely due mostly to the program itself.

The reason we're at this conference is that we have used the APS, but we are not just using the Annual Population Survey. We also used several other publicly available data sources, and we did a data gap analysis, and there were quite a few crucial indicators that we were unable to gather from publicly available sources. This is why we also created a survey, which is sent to the freeport areas every six months and asks certain questions about their operations and what's happening on the ground; this survey fills the data gap. It means we are in a really special situation, I think: hopefully in a couple of years' time we will be able to do a mixed-methods, holistic evaluation where we can answer whether this program was good value for money. And where we see significant impact, we will be able to answer why, or why not, those objectives were achieved.

What we see here is an example tab of the dashboard we have been working on. As Xabi already mentioned, the idea behind this dashboard is for it to be a centralised source of information for the monitoring and evaluation of the freeports, which will facilitate the measurement of freeport performance against outputs defined in business cases, against the synthetic freeports that Xabi already talked us through, and against economic baselines and forecasts. What we see here is not just a snapshot of each of the freeports (and, at the moment, prospective freeports as well); it will also allow us to draw comparisons across different freeports, against freeport areas and the UK more generally, and to analyse trends across time. This dashboard, as Xabi mentioned, is going to be updated regularly, twice a year, with both public data sets and primary data reported directly by freeports, and as such it will be a source of up-to-date data for relevant stakeholders across government and allow the monitoring of the program at any point during the monitoring and evaluation period and beyond.
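To make the synthetic freeport construction Xabi described a little more concrete, here is a minimal Python sketch of the general idea: score local authorities with a propensity model and weight the closest non-freeport donors into a synthetic comparator. The covariates, donor count and inverse-distance weighting rule are illustrative assumptions, not the project's exact specification.

```python
# Illustrative sketch of the synthetic freeport idea: score local
# authorities with a propensity model, keep the closest non-freeport
# donors, and weight them into a synthetic comparator. Covariates,
# donor count and weighting rule are assumptions for illustration.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def synthetic_weights(la: pd.DataFrame, covariates: list[str],
                      n_donors: int = 5) -> pd.Series:
    """`la` has one row per local authority; 'treated' marks the freeport area."""
    model = LogisticRegression(max_iter=1000).fit(la[covariates], la["treated"])
    score = model.predict_proba(la[covariates])[:, 1]

    treated = la["treated"].to_numpy() == 1
    target = score[treated].mean()

    donors = la.loc[~treated].copy()
    donors["dist"] = np.abs(score[~treated] - target)

    # Keep the closest donors and weight them inversely to distance,
    # so their weighted average mimics the freeport area.
    closest = donors.nsmallest(n_donors, "dist")
    weights = 1.0 / (closest["dist"] + 1e-9)
    return (weights / weights.sum()).rename("weight")

# The synthetic outcome series is then the weighted average of the
# donors' series, e.g. (jobs[weights.index] * weights).sum() per year.
```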
Just to show a different tab, with the different indicators we have selected to include in this dashboard: ideally the dashboard will allow us to track the program's objectives across time, mostly in terms of job creation and levelling up, freeports becoming hotbeds for innovation, and increased trade and investment.

So essentially this is what we wanted to show. As you can see, this is very much just the first iteration, and we don't really have any actual monitoring results yet; this is just the output of the initial baselining analysis. But I think what's really interesting here is that, for instance on this tab of the dashboard, on the lower-left chart, you can see that in the case of Liverpool we show total jobs per year for each year between 2015 and 2021, and the synthetic freeport we have created follows the trend of the actual Liverpool freeport area quite closely. This gives us hope that once we get the monitoring results, this synthetic freeport method will actually work, and it will be a really useful comparison for understanding the impact of the policy. I'm very happy to answer questions or receive comments as well, of course.

We'll now move on to our second presenter in this session, Chris Martin from the University of Bath. Chris is a specialist in labour markets, macroeconomics and monetary economics, with expertise in search frictions and New Keynesian DSGE models. He has been a professor at Bath since 2010, and current research topics include a study of the impact and aftermath of the COVID-19 pandemic on the UK labour market, estimating real wage rigidity, and using machine learning to explore heterogeneity in the UK labour market. He'll be presenting his paper on heterogeneity in the UK labour market: using machine learning to test macroeconomic models. Over to you, Chris.

Thanks very much. I'll just share my screen, if that's okay. That's working now. Okay. Well, first of all, thanks very much for letting me speak. My co-author and friend Maglin and I are very keen for your comments on this, so if you've got any comments, please let us know. If you'd like a copy of the presentation (there's quite a lot going on in it, and I'm only going to do highlights today), then please ask us in the chat or send us an email; we'd be very glad to respond. So in this presentation we're really looking at three things; here's a summary of what we're trying to do. The first is: what do we do? The second is: what do we find? And the third is: why do we do it? So, what do we do? We use machine learning, as you said, to investigate the main sources of heterogeneity in the UK labour market. What do we find? We apply a clustering algorithm to data on individuals, and when we do that, we find that membership of clusters is driven by two things, occupation and education, which we regard as proxies for productivity. We therefore conclude that the main source of heterogeneity in the UK labour market is around productivity. Why do we do it? We come to this as macroeconomists, and as macroeconomists there are longstanding issues in the analysis of labour markets around accounting for differences between individuals. Specifically, allowing for differences between individuals in terms of productivity can help us understand those problems.
So just to give you a flavour, what you can see now is a list of six issues which allowing for differences in productivity between individuals helps us get a better understanding of. I won't spend time going through them, but you can see they're clearly pretty important issues. Okay, let me skip ahead and look at the research questions. There are two main research questions. The first we've already touched on: are the differences between individuals mainly driven by differences in productivity, or are there other sources of heterogeneity in the UK labour market? And subsidiary to that: can we discriminate between different models in the theoretical literature? This is how we come at it as macroeconomists. There are different sorts of theoretical models in the macroeconomic literature, and we're trying to discriminate between them.

Okay, now, we're going to use machine learning. Why? Fundamentally, it's actually very simple: we're trying to explain heterogeneity, and there is no obvious measure of heterogeneity. There is no obvious variable that you could put on the left-hand side of a regression. So in the absence of a clear measure of heterogeneity, what we're going to use instead is clustering. How does clustering work? The idea is that clustering partitions the data into a predetermined number of clusters or subsets. We're going to focus on the case where there are two clusters, so every observation in our dataset will be allocated to the first cluster or to the second cluster. And that's not done arbitrarily; it's done on the basis of a criterion. The idea is that the clustering algorithm allocates data points to clusters so that data points within a cluster are as similar as possible to each other and as different as possible from data points in other clusters. In other words, it maximises similarity within a cluster and dissimilarity between clusters. Now, how is that helpful to us? Because when we look at the characteristics of individuals in one cluster compared to another cluster, that points us to the main sources of heterogeneity. So clustering is a very useful tool for addressing what the main drivers of heterogeneity in the UK labour market are.

Okay, let me skip ahead slightly in the presentation and show you our data. We've got 25,000 observations. These are individuals from the Labour Force Survey, and we're taking the last survey before the pandemic, so we're using 2019 quarter four. What you can see here are the variables that we will use. For each individual, we have measures of these characteristics, and these characteristics are binary: it's a one or a zero. The first three characteristics are about occupation. They're based on the Standard Occupational Classification: if your one-digit SOC code is between one and three, you're in the high-skill group, so the high-skill variable takes the value one and the medium-skill and low-skill variables take the value zero. You can see how that's done. Then we've got a bunch of variables which measure education: whether the individual is a graduate, whether they have A-levels or higher, whether they have GCSEs or higher. So we've got three measures of education.
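As a rough illustration, here is a minimal sketch of how such binary characteristics could be derived from LFS-style microdata; the input column names and codings are hypothetical, not the actual LFS variable names.

```python
# Sketch of deriving binary characteristics from LFS-style microdata;
# input column names and codings are hypothetical illustrations.
import pandas as pd

def build_features(lfs: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=lfs.index)
    soc1 = lfs["soc_major_group"]                      # one-digit SOC code, 1-9
    out["high_skill"] = soc1.between(1, 3).astype(int)
    out["medium_skill"] = soc1.between(4, 6).astype(int)
    out["low_skill"] = soc1.between(7, 9).astype(int)
    out["graduate"] = (lfs["highest_qual"] == "degree").astype(int)
    out["alevel_plus"] = lfs["highest_qual"].isin(["degree", "a_level"]).astype(int)
    out["gcse_plus"] = lfs["highest_qual"].isin(["degree", "a_level", "gcse"]).astype(int)
    out["female"] = (lfs["sex"] == "female").astype(int)
    out["young"] = (lfs["age"] <= 30).astype(int)
    return out
```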
And again, those are binary measures. You can see the whole-sample average in the right-hand column here. But there are many different dimensions to heterogeneity. For example, there is heterogeneity in gender, so we have a measure of whether the respondent is female or not. There is heterogeneity around ethnicity, so we've got a measure of whether or not the individual is non-white, a very crude measure. We've got a measure of whether the individual is young, aged 30 or less; that's another dimension of heterogeneity. There's geography, and here we've got measures of whether the individual is employed in London or in the South East. We're looking at different sorts of contracts: whether the individual is on a temporary contract or on a zero-hours contract. And we're looking at tenure: has the individual been in their job for one year or less, a short tenure, or in their current job for five years or more, a long tenure? Are they looking for a new job? And also whether they're employed in the public sector. So there are many dimensions of heterogeneity there. There are something like 18 to 20 variables, so each individual is described by around 20 binary observations, each a one or a zero. That's our basic data, and we're going to put all 25,000 individuals through a clustering algorithm.

So let me say a little more about clustering at this point. As I said at the start, clustering aims to allocate data points into clusters in order to maximise similarity within clusters and to maximise dissimilarity between clusters. A clustering algorithm is defined by its central point; in our case, the central point is the median. If I show a picture, here we go: this is a very idealised visualisation of how clustering works. It's based around the central point. Here you can see a stylised representation of a bunch of data points. The colours have no meaning; the sizes of the dots have no meaning. We've got two clusters, and you can see the median point of cluster one on the left and the median point of cluster two on the right. I should say this is highly idealised: this stylised data set obviously has two clusters, but in practice you do not get such clean separations between clusters; reality is much more complicated. What the algorithm does is try to determine the central point of each cluster and allocate data points to clusters so as to minimise the distance between the centre of a cluster and its members, and to maximise the differences between members of one cluster and members of another. That's the idea.

Now, an obvious question is: how many clusters? We're going to use two; let me briefly explain why. We're going to use something called the silhouette statistic, and what you can see here is the silhouette statistic for our data in the case where there are two clusters: information for every one of the 25,000 individuals in our data set. The silhouette is a value between minus one and one. A value greater than zero means that, on average, the data point is closer to the other members of its own cluster than to the members of the nearest other cluster. So the data point fits well within its allocated cluster if the silhouette is positive.
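A minimal sketch of what this could look like in Python, using a median/medoid-centred clustering with per-point silhouettes; KMedoids from the scikit-learn-extra package is one possible implementation, assumed here because the talk does not name the software actually used.

```python
# Sketch: a median/medoid-centred clustering of the binary features,
# with per-point silhouettes. KMedoids (scikit-learn-extra) is one
# possible implementation; the talk does not name the library used.
from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn_extra.cluster import KMedoids

features = build_features(lfs)          # from the earlier sketch
X = features.to_numpy()                 # 25,000 x ~20 binary matrix

km = KMedoids(n_clusters=2, metric="manhattan", random_state=0).fit(X)
sil = silhouette_samples(X, km.labels_, metric="manhattan")

print("average silhouette:", silhouette_score(X, km.labels_, metric="manhattan"))
print("share of negative silhouettes:", (sil < 0).mean())
```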
And in this case, you can see there's only a very small number of negative silhouettes. Now, we calculate these statistics for two clusters, three clusters, four clusters, and so on, and compute some simple summary statistics, which you can see on the next slide. The blue bars are the average silhouette statistic, looking from the left at two clusters, then three, then four, then five, then six. This criterion is maximised in the case of two clusters: the average silhouette value is highest with two clusters. The orange bars are the proportion of cases in which the silhouette statistic is negative, and here we want it to be minimised. It is minimised where the number of clusters equals two, although it's less clear-cut in this case. So on the basis of this, we're going to work with two clusters. In the associated paper, we look at what happens if there are three clusters and four clusters.

So we run our data through the clustering algorithm, telling the algorithm that there are two clusters. There are some technical details here, which arise from the fact that with 25,000 observations this is a large data set, so the algorithm has to do some sampling; if you're interested, you can contact me and I'll talk about that more, but let me not dwell on it now.

Let me show you what the results look like. Our main results are captured in this table. The column headed 'All' is the whole-sample average, which you've already seen. The column headed 2A is the first of our clusters; the column headed 2B is the other cluster. What we're doing is looking at the differences between the average value of a characteristic in cluster 2A and in the whole sample, and likewise for cluster 2B. If we look at the first row, which is high skill: 83% of members of the first cluster are in high-skill jobs, that is, their SOC is either one, two or three. By contrast, only 21% of members of the second cluster are in so-called high-skill jobs. We highlight in red cases where the cluster average differs from the whole-sample average by more than 40%. That has no statistical meaning at all; it's purely a visualisation device. If you were to do t-tests on the differences between values in one column and another, then because of the large sample size, all the differences here are significant at very high levels of significance. Colouring in red is simply a device to highlight the largest differences. So you can see that in terms of high skill, cluster 2A has a lot of high-skill individuals and cluster 2B has a much smaller share. If we look at medium skill and low skill, relatively fewer members of cluster 2A are in those occupations; they are much more concentrated in cluster 2B. So it's clear that there is a separation within our sample in terms of occupation: one cluster has high-skill jobs, the other has medium- and low-skill jobs. There is evidence of clustering around occupation, which we think is interesting. Now let's think about education. Here the results are even stronger. If we look at the graduate row: in cluster 2A, 74% of members are graduates, whereas, in sharp contrast, only 2% of members of cluster 2B are graduates.
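A minimal sketch of how such a comparison table could be produced from the fitted cluster labels, building on the earlier sketches; the 40% threshold mirrors the highlighting rule in the talk, and the mapping of clusters 0 and 1 to the 2A/2B labels is arbitrary here.

```python
# Sketch: profile the two clusters, comparing each characteristic's
# cluster mean with the whole-sample mean, as in the results table.
profile = features.groupby(km.labels_).mean().T  # one column per cluster
profile.columns = ["2A", "2B"]                   # arbitrary labelling
profile["All"] = features.mean()

# Flag the largest gaps, echoing the red highlighting in the talk.
profile["large_gap"] = (profile["2A"] - profile["All"]).abs() > 0.40
print(profile.round(2))
```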
So there's a very clear, very strong difference in cluster membership in terms of whether the individual is a graduate. Looking at whether the individual has A-levels or higher, the same is true: 87% in cluster 2A have at least A-levels, compared to only 7% in cluster 2B. And if we go down to GCSEs, the same again: 98% of cluster 2A have at least GCSEs, against only 42% of cluster 2B. So there's a clear separation in terms of occupation and a clear separation in terms of education. Those are interesting results by themselves, and if we interpret occupation and education as proxies for productivity, this is strong evidence that you can cluster our data on the basis of productivity.

What about these other measures? The next row, for example, is female. There are differences (there are more females in cluster 2A than in cluster 2B), but the difference isn't as large. The same is true for ethnicity, age, geography, temporary contracts, long and short tenure, and even the public sector. There are large differences in terms of zero-hours contracts, but only 1% of our sample are on zero-hours contracts, so that's not a very important measure, simply because there are so few workers on those contracts. So in terms of our research question: we can distinguish between individuals on the basis of occupation and education, which we take to be measures of productivity, and not on these other measures. This is evidence to support the argument that you can distinguish between high-productivity jobs and low-productivity jobs, which gives support to the macroeconomic models that we were interested in and which led us to this question.

Okay, let me, in the remaining time, demonstrate the credibility of those results. How do we do that? The first thing we do is what's called a validation exercise, and this is simple. We randomly divide our sample into two subsamples (each observation is randomly allocated to one subsample or the other), and we cluster on each subsample. Then we ask: in how many cases is a data point allocated to the same cluster whether we use the subsample or the full sample? The answer is over 96%. So in almost every case, whether you cluster on the whole sample or on a subsample, you get the same result. That's evidence of robustness.

Okay. If you know a little about clustering, you've probably heard of the K-means algorithm, a very well-known algorithm. We do not use that algorithm, because with K-means the centre of a cluster is the average of the cluster, and the average may not actually be a member of the cluster, which is why we use the median instead. If we use the well-known K-means algorithm, then 90% of our data points are assigned to the same cluster as they were with our main, median-based algorithm. We also use what's called soft clustering. With soft clustering, you don't assume that every data point belongs to one cluster or another; you allocate a probability that it does so. Interesting, but kind of complicated. In that case, 91% of data points are allocated to the same cluster with soft clustering as with our main algorithm.
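A minimal sketch of that split-sample validation, building on the earlier sketches; the label-alignment step is an implementation detail assumed here, since the 0/1 numbering of clusters is arbitrary between runs.

```python
# Sketch of the split-sample validation: cluster each random half
# separately and check how often points land in the same cluster as
# in the full-sample solution, after aligning the arbitrary labels.
import numpy as np

rng = np.random.default_rng(0)
half = rng.random(len(X)) < 0.5

agreement = []
for mask in (half, ~half):
    sub = KMedoids(n_clusters=2, metric="manhattan", random_state=0).fit_predict(X[mask])
    full = km.labels_[mask]
    # Take the better of the two possible label mappings.
    agreement.append(max((sub == full).mean(), (sub == 1 - full).mean()))

print("share assigned to the same cluster:", np.mean(agreement))
```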
Okay, so let me just quickly summarise what we are finding here. We're finding that the main drivers of cluster membership are occupation and education, and to the extent that these are good proxies for productivity, we can conclude that the main source of heterogeneity in the UK labour market is around productivity, and not these other dimensions of heterogeneity. As I said at the start, this supports the use of theoretical models which distinguish between high-productivity jobs and low-productivity jobs. I won't go through the rest of this slide, because that gets into more detail about the macroeconomic models.

How can we take this forward? There are lots of interesting ways. The first is that in this paper we only consider employees. We could include the self-employed, the unemployed and the inactive, and that's something we're currently doing: we're applying this to the issue of inactivity, which of course we're all very concerned about at the moment, trying to understand what leads to inactivity and what could be done to encourage inactive individuals back into employment, where that's relevant. More widely, this is an example of the sort of thing you can do with machine learning. Once you explore machine learning, you come to realise that, if you're familiar with econometrics, a lot of it is actually quite familiar; there's nothing really very scary about it. But it does open up a very wide range of new tools, so we can look at old questions in a new way and, more importantly, look at new questions, which is quite exciting. And as we move into the world of big data, tools like machine learning, which are designed to deal with big data, become more and more important. Okay, I'll leave it there.