So, Mark, welcome. Please introduce yourself a bit more if I've missed important things, and your project.

Yeah, thanks very much, Nick. This is a complete change from the more technical side that Sophie's very expertly presented there, even if there's a sense of her being a bit distanced from a research domain, because I'm a sort of meat-and-potatoes researcher. I'm a historian and a criminologist, and I'm a professor of history at Griffith University. For the last five years, I've been directing this project called the Prosecution Project, which is a history of the criminal trial in Australia. What's unique about it is that we're building a database of, as far as we can get them, all criminal prosecutions in Australian criminal jurisdictions, which are mainly the states (the six states and the Northern Territory), over very long periods of time. So we have records dating from 1788 through to the 1960s.

This has been a digital project that has relied on partnerships with archives that provide the data. Our typical data is from original court registers. We extract that data and transcribe it, because it's mostly manual data, so there is no way of accessing it by machine technologies at the moment. So we've had to organise transcription, using the research and volunteer community, into a database that we built with the research services at Griffith University. On this topic today, we probably really should have somebody from our research team here to talk about some of the issues that are likely to be of most interest to this group. But you've indicated an interest in this new type of research, so I might just introduce a little bit about it and show you some of the tools we have, and particularly the issue around what we do with the data once we get it, because, let me say, there are two types of users of this kind of data.
There are researchers like ourselves who may be interested in telling individual stories or, in conventional social science terms, in looking at aggregated data and analysing the factors that shape how a criminal trial develops and what its outcomes are. At the individual level, we also have a very large community of people involved in family history and genealogy and so on who also access our database. Those users are really interested in individual stories, and in descriptions of events and individuals as they were originally recorded, not classified into some sort of higher aggregate. But for the purpose of thinking about patterns of events, visualisation of our data is becoming quite important. And it's at that point that we have to think about how we aggregate into meaningful categories that respect historical forms but also make sense in terms of the social science possibilities of analysis.

So this is the public search page; I think you can all see that here. That just outlines the purposes of the project. We have "search historical trials" here, which has got a basic keyword search that works across a select number of attributes of our data and simply searches in an uncontrolled way for any term that somebody might choose to investigate. So somebody coming in might want to know about a particular individual and type that in, or they may want to know about a particular offence. Without having to go into the more advanced search, they may wish to see whether we've got anything on forgery, and there's plenty of stuff there for them to look at. But if they've got more information about the area in which they want to search, then they are able to search across a number of our attributes. Now this is a select number of attributes for a specified period of time, which is constrained by archive access conditions.
Some of our records are from closed periods or under restricted access of other kinds, such as children's court material. But for most of the records we have, people can search across this range of attributes, and as we're getting to a more complete data set we're in the process of starting to consider releasing a bit more of our data.

So how do we derive these things? I think, in terms of applying any kind of principles of classification, the original data challenges are just at the transcription level, of getting accurate terminology off the page. First name and surname are significant challenges, and it's very important for our data that they be as accurate as possible. The offence category is one where we have the possibility both of having an original transcription and of considering how it might be aggregated for our purposes, and I'll show you that in a minute. Most of the other terms we have available we simply transcribe from the original record, and we have an open search that enables people to look for, say, somebody guilty of offences in New South Wales in the 1910s. That should get some results, I think. Yes. So that's just how that search function works.

Well, I might just draw your attention to what lies behind this, as this is probably more interesting to a lot of people.
Our first challenge was that we were dealing with a number of jurisdictions in which terms that we regard as common to all of them might have been represented differently in the original records. The records in any case vary in the extent to which they cover all aspects of the criminal process; Queensland and Victoria are particularly rich data sets in terms of including earlier stages of the trial as well as later ones. So we had to develop a process that would enable the researchers to define the different registers, as we call them (the different state jurisdictions and the particular courts from which we were accessing data), and have an approach that would allow us to add attributes as they emerged over time and to have registers with different numbers of attributes, while at the same time respecting the original data.

So this is a typical example, Queensland State and Supreme Court, which has got 67 attributes here. Some of these attributes will be shared with other states and others not. Some of the data is available in the original sources; other data is very inconsistent. It's very important in this area to look at Indigenous identity, for example, but for the most part these records don't contain that, and it tends to be derived from other reports, such as historical newspapers, which can be searched through a Trove API that we link to our records.

I'll just show you quickly how this looks in practice, again with examples from Queensland. A key thing for us is verifying the data, and the system for most of our states enables us to check the data extracted against the original record. That's very important because our data has been prepared both by researchers on the research team and, as I mentioned, by quite a large number of volunteers. This record itself has been entered by a volunteer just in the last day or two.
So I'm able to check the accuracy of this record, and this is a pretty experienced transcriber, so I'd be expecting an accurate representation of what's on the page. One of the key classification challenges for us is making sense of this offence here, "breaking open a locked showcase and stealing therefrom", which is a very specific definition of an offence; if you looked at crime statistics you wouldn't find a category for that. So we've done quite a lot of work over the last couple of years coding our offence data in particular to enable us to visualise the records. So let's get out of that one.

Back on the main page, people are able to visualise our records through this facility, and here, as I say, we've run a code over...

Sorry, Mark, you just cut out for about a sentence there. If you could just repeat that last sentence, please.

Yes. So the visualisation is a product of work we do on aggregating, particularly, our offence categories, because this is obviously a key area of interest for people looking at this in social science or historical terms. We run a code over our offence data for whole jurisdictions over long periods of time and generate levels of aggregation through that code. The classifications are pretty familiar to people working in criminal justice, and anybody looking at criminal statistics since the 19th century will recognise that these are generally the kind of categories that are used, and really across national borders now as well. So there's a lot of work gone into that, and we have both meta-level aggregation, looking at homicide offences, property offences and personal offences, and then, within those categories, more refined aggregations that still have their reference point in historical statistics or in contemporary criminal justice statistics of the kind you see from the ABS.
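
As a rough illustration of the kind of two-level offence coding described here, the following is a minimal sketch only, not the project's actual code. The three broad category names come from the talk; the keyword rules, the finer category labels, the function names and the sample records are all hypothetical, chosen just to show how a transcribed offence description can be mapped to aggregate categories while the original wording is kept untouched.

```python
from collections import Counter

# Hypothetical keyword rules: (keyword, broad category, finer category).
# The broad labels echo those mentioned in the talk; the keywords and the
# finer labels are illustrative assumptions, not the project's scheme.
RULES = [
    ("murder", "Homicide offences", "Murder"),
    ("manslaughter", "Homicide offences", "Manslaughter"),
    ("stealing", "Property offences", "Theft"),
    ("forgery", "Property offences", "Forgery"),
    ("assault", "Personal offences", "Assault"),
]

UNCLASSIFIED = ("Unclassified", "Unclassified")


def classify_offence(description):
    """Return (broad, fine) for the first rule whose keyword appears."""
    text = description.lower()
    for keyword, broad, fine in RULES:
        if keyword in text:
            return broad, fine
    return UNCLASSIFIED


def aggregate(records):
    """Count records per broad category; the transcribed text is not altered."""
    return Counter(classify_offence(r["offence"])[0] for r in records)


# Made-up sample records, each keeping the offence as originally transcribed.
records = [
    {"offence": "Breaking open a locked showcase and stealing therefrom"},
    {"offence": "Forgery"},
    {"offence": "Manslaughter"},
    {"offence": "Entry not legible"},
]

for r in records:
    print(r["offence"], "->", classify_offence(r["offence"]))
print(aggregate(records))
```

Keeping the classification as a separate lookup, rather than overwriting the transcribed field, means that users interested in individual stories still see the original wording, while the aggregate counts feed the visualisation.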
The other areas are pretty much drawn direct from our data, although we do aggregate again the verdict fields and sentences particularly, because there's some interest in considering, during this period when the death penalty was still in place, those occurrences in which the death penalty was in fact applied, particularly for the 19th century. For the trial place and committal place we just use the original data at the moment. We're involved in some mapping exercises at the moment, where we've got an ARC grant to look at a more detailed study of interpersonal violence over long periods of time using this database and extending it, and we'll be very interested in geocoding crime events if we can get more specific information, as we hope.

So that's sort of what we're about, and as much as I think I can say at this point. I'm very happy to answer any questions.

Thank you very much, Mark. That's very, very interesting stuff.