Thank you for the invitation to present at this meeting. I'd like to talk today about the fact that we have lots of data, and in fact increasing amounts of data, and this poses some questions; so I ask one of those questions: now what? About a year ago, one of our board members handed around a report from EY on the data held by the National Health Service in the UK. One of the comments it made was that medicine is no longer just being supported by data; clinical science and data science are now very closely related, and data is in fact driving change. It also put a value on the NHS data, saying that the 55 million patient records are worth several billion pounds to a commercial organisation. That instigated a bit of internal reflection at our institute here in Perth, Western Australia. We have 120 staff members, 35 audiologists and 15 clinics around our state, and the numbers are growing. We have an implant program with 2,000 clients. Our database has grown over the last 20 years to include data from over 30,000 clients, and about 10,000 appointments per year are added to it. We manage that data on something we call EOSuite, which is built on a commercial product called NetSuite. NetSuite looks after the finance, the stock and the HR. What we've built on top is a patient management system, which we call EOSuite, which handles the case notes, calendars and reporting, and through EOSuite we have contact with our funders and our suppliers. We can also communicate with our patients and our referrers. Alongside that sits the NOAH platform, which is used to communicate with instruments and devices, and we do have a link into NetSuite, which I'll get back to shortly. So in terms of the challenges, I want to discuss a few of these; perhaps you have encountered them as well and have some solutions already in place, or have some ideas, and I want to use this as a platform to start or continue some discussions on them.
Firstly, it's floors and ceilings. We all need a floor and a ceiling over our head, but in terms of our data, these present challenges. When we see the audiograms of implant candidates, we see lots of limits to the instrumentation: it can't record any data beyond those limits, as shown there by those arrows. On the right you can see one of our studies on speech data. This is CUNY data from our implant patients, from preoperative through to 12 months. If we use a 100% score as the cutoff, we can see that at 12 months about 30% of patients reach a ceiling score. One could even consider 95% a ceiling score, in which case half of our patients reach the ceiling at 12 months. We've since been involved in a study across three large clinics internationally, and when we looked at the data, we found that about a quarter of the records across the three data sets were affected by either ceiling or floor effects. So what do we do? I don't imagine there is much we can do about audiometry. In terms of speech perception testing, we should certainly be using adaptive tests, such as the AzBio sentences or adaptive speech-in-noise tests. When it comes to analysis, if you look through the literature you can see that people have imputed values for ceiling scores: they've done things like adding 5 dB, 10 dB or 20 dB, or in another instance simply setting the value at 125 dB. That certainly isn't correct, and it's also not correct to ignore those scores. We can see instances of people interpolating to get intermediate values, which I guess is fair enough. A potential solution is for us simply to classify these data in terms of severity, the type of audiogram or outcome we see, or the change in classification. The next challenge we've been confronted with is accessing data in proprietary databases. We know that NOAH is used to communicate with instruments and hearing aids, and we have cochlear implant fitting software and hearing aid fitting software.
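To make the classification idea concrete, here is a minimal sketch of flagging speech scores as floor-, ceiling- or mid-range observations rather than treating them as exact values. The 95% ceiling and 0% floor cut-offs are illustrative, taken from the observation above that a 95% cutoff puts half of patients at ceiling; any real analysis would choose its own cut-offs.

```python
# Sketch: classify percent-correct speech scores so that censored (floor or
# ceiling) observations can be analysed as categories rather than exact values.
# The cut-offs are illustrative assumptions, not an established standard.

def classify_score(percent_correct, floor=0.0, ceiling=95.0):
    """Return 'floor', 'ceiling', or 'mid-range' for a percent-correct score."""
    if percent_correct <= floor:
        return "floor"
    if percent_correct >= ceiling:
        return "ceiling"
    return "mid-range"

scores = [0, 42, 88, 96, 100]          # hypothetical 12-month scores
labels = [classify_score(s) for s in scores]
```

Scores flagged this way can then feed a censored-data or ordinal analysis instead of being imputed or dropped.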
Now, part of NOAH is public, so you can look up an XML file and identify and extract the ID, the date, the name, the audiogram thresholds and the transducers that were used. But there is a lot of data that is encrypted by the manufacturers. In essence we can get access one record at a time, but sitting down to transcribe all these data is not a job we should be doing. So it looks like we have to work hand in hand with the manufacturers to extract that data on, say, hearing aid fitting and data logging, or we have to get hold of some APIs. Either way, we certainly need the help of the manufacturers, because they've encrypted that data, and they can only get that access if we give them permission, because in effect it is our data. The third challenge we've been confronted with is linking the records of clients who are on different databases, and there I want to go back to the example of NetSuite and NOAH. Over time, we've noticed that clients are not recorded identically in both databases. We don't have a platform for common IDs, and even if we did, it would be prone to error and to audiologists or administrators not paying attention. So we see multiple records for clients, mismatches between names, and errors in names and dates of birth. A Susan can be a Susan in one database and a Sue in the other; there can be misspellings in names; we see dates of birth being transposed, being born on the third of the sixth or the sixth of the third. One of the tools I've used is available from the Centers for Disease Control and Prevention: a program called Link Plus. As you can see there in the example, I've used it to link the first and last names from the NOAH database with the first and last names from our EOSuite database.
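The public part of a NOAH export can be read with any standard XML parser. The sketch below is illustrative only: the element and attribute names (Patient, FirstName, ToneThreshold, FrequencyHz, LeveldB) are assumptions standing in for the real schema, which should be checked against HIMSA's documentation; the manufacturer-encrypted payloads are not readable this way.

```python
# Sketch: extract the publicly readable fields (name, audiogram thresholds)
# from a NOAH-style XML export. Element/attribute names are hypothetical
# placeholders for the actual NOAH schema.
import xml.etree.ElementTree as ET

def extract_public_fields(xml_text):
    root = ET.fromstring(xml_text)
    return {
        "first_name": root.findtext(".//FirstName"),
        "last_name": root.findtext(".//LastName"),
        "thresholds": [
            (int(t.get("FrequencyHz")), int(t.get("LeveldB")))
            for t in root.iter("ToneThreshold")
        ],
    }

sample = """<Patient>
  <FirstName>Sue</FirstName><LastName>Smith</LastName>
  <Audiogram>
    <ToneThreshold FrequencyHz="1000" LeveldB="45"/>
    <ToneThreshold FrequencyHz="2000" LeveldB="60"/>
  </Audiogram>
</Patient>"""
record = extract_public_fields(sample)
```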
Link Plus uses a statistical approach to matching these names and can therefore identify when things are slightly misspelled, when dates have been transposed, or even where names have been shortened or slightly changed: a Joe and a Joseph, a Sue and a Susan. Then of course we have limitations in the available data, and I've given some examples there. We have inconsistent definitions and classifications. If you look through the literature, people have used the three-frequency average, the four-frequency average, the better ear, the worse ear, and the WHO and the Global Burden of Disease group have used different definitions; it's good that they have now unified their classifications. There are inconsistent or non-existent definitions of etiology and disease states. For example, we recently did a study on hyperacusis, and there is no standard definition of hyperacusis. Also, in the case of implant registers there are no formal definitions of etiology, which makes analysis quite tricky, particularly in cross-platform studies. What we've also come across, and I'll get back to this shortly, is that there are insufficient and inadequate measures, particularly outcome measures. And finally, we should of course acknowledge that we are an international community with our own languages, and therefore comparing speech perception tests across languages is a bit tricky. This was illustrated by a study we did with three centres around the world, which contributed a large amount of cochlear implant data. We had a series of predictors, as you can see there on the left, and our outcome measure was word recognition at four months. When we applied standard regression analysis models to these data, we found that we could explain only about 12% of the variance in the outcome, which compares with the roughly 20% that others have reported in the past.
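The kind of matching described above can be sketched in a few lines. This is a rough stand-in for what Link Plus does with probabilistic weights, not its actual algorithm: difflib's string similarity approximates fuzzy name comparison, a small lookup table (hypothetical entries) handles nickname pairs like Sue/Susan, and a helper checks for day/month-transposed dates of birth.

```python
# Sketch of fuzzy record linkage: nickname-aware name similarity plus a check
# for day/month-transposed dates of birth (the third of the sixth vs the
# sixth of the third). Illustrative only; Link Plus uses probabilistic weights.
from difflib import SequenceMatcher
from datetime import date

NICKNAMES = {"sue": "susan", "joe": "joseph"}  # illustrative lookup table

def name_similarity(a, b):
    """0.0-1.0 similarity after normalising case and known nicknames."""
    a, b = a.lower(), b.lower()
    a, b = NICKNAMES.get(a, a), NICKNAMES.get(b, b)
    return SequenceMatcher(None, a, b).ratio()

def dob_matches(d1, d2):
    """True for an exact match or a plausible day/month transposition."""
    if d1 == d2:
        return True
    try:
        return d1 == date(d2.year, d2.day, d2.month)
    except ValueError:          # e.g. day 31 cannot become a month
        return False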
We then applied some machine learning algorithms to our data, and while we found that machine learning improves the performance of the models beyond the linear models, the gains were pretty modest. We also showed that they were unlikely to improve even if we had more data. The only way the models could be improved is with better quality data and different types of data, which highlights that we need to standardise our data collection in terms of definitions, and we need to improve the quality or the number of both predictor variables and perhaps also outcome variables. And of course, we can all recognise that a word recognition score alone is not the best outcome measure. Finally, I want to quickly make a couple of points on statistics. I think we really do need to make good use of our statistical colleagues. As an example, take the IOI-HA, which many of you may be familiar with as an outcome measure for hearing aid use. It has been misused quite a lot in recent years, even by senior people. I just want to draw your attention to where the neutral point in each of these questions is: in question one, it's at the beginning; in question two, it's in the middle; in question seven, it's at the second point. What we found is that people have assigned a number to each of the points on the scale, added them up and then looked at a total score, which on the surface is really not appropriate. Thankfully, somebody has drawn attention to this: an article recently appeared in the International Journal of Audiology making the case that we really need to revise the way we analyse these data. As I said, we really do need to make use of our statisticians. And finally, I want to mention the whole aspect of ethics in using client data, which we've been confronted with recently: when and how is clients' clinical data available to researchers?
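Because the neutral point sits in a different place on each IOI-HA item, one simple alternative to summing the 1-5 codes is to report each item separately, for instance as a per-item median. The sketch below illustrates that item-wise approach; the item labels and response data are hypothetical, and this is one possible treatment, not the specific revision proposed in the International Journal of Audiology article.

```python
# Sketch: analyse IOI-HA items individually (per-item medians) instead of
# summing codes whose neutral points differ across items.
# Item names and responses are made-up illustration data.
from statistics import median

responses = {                       # one list of 1-5 response codes per item
    "item1_use": [3, 4, 5, 5, 4],
    "item2_benefit": [2, 3, 4, 4, 5],
    "item7_quality_of_life": [3, 3, 4, 5, 4],
}

item_medians = {item: median(codes) for item, codes in responses.items()}
```

Reporting the items side by side keeps the differing neutral points visible instead of burying them in a single total.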
In Australia, the Privacy Act says that data can only be used for the purpose for which it was collected, and therefore for managing somebody's hearing loss, not for somebody's research project. We have instances where our clinicians are also researchers, and it is very difficult for them to wear one hat one day and another hat the next. The whole topic of de-identification also needs some attention: just removing the name is not, strictly speaking, de-identification. And finally, who really owns the data? Certainly, clinics collect it. But particularly if some commercial gain is made, how do we communicate with and acknowledge the contributions that patients, our clients, make to this output? Thank you very much for your attention. I'm looking forward to us gathering face to face rather than remotely in the near future, and I would particularly like to welcome you to come to Perth, Western Australia.