Today I'm going to talk a bit about multimedia as a data source, particularly about data management and supporting research where multimedia is a data source and is the research material. And I guess I need to confess right up front, I'm not a new media expert, I'm what could be referred to as a digital librarian. So I come at this with a very different kind of understanding, potentially an exploratory approach, and I'm very interested in the kind of questions, comments or thoughts those listening in today have. What I wanted to start off with today was to have a bit of a think about what multimedia is, because I have to say that this is what went through my head when I was thinking about offering up this webinar. I also wish to acknowledge colleagues over at Edith Cowan University who triggered the idea for this webinar, because they were interested in looking at different types of data, particularly the sort of data that is generated in support of performance studies. So that's where this idea came from.
But I really wanted to look at how multimedia was defined, to get a bit of an idea about the way that it's referred to by different groups of people and also what it is from a technical point of view. So you can see there on the slide, I've given a mixture of definitions of what multimedia is, and I think the second one in there is the one that really holds my attention the most: that it's about different types of data that are actually contained in a file. And there are different ways of referring to those components, so I've picked some of those out in the third point. I was quite intrigued to discover that there are chunks and atoms and parts, and perhaps because I don't have any media background, I come at this as someone who's looking at the material nature, in some ways, of multimedia.
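To make the idea of "different types of data contained in a file" a little more concrete, here is a minimal sketch in Python. It is not something shown in the webinar. It builds a tiny silent WAV file in memory and lists its chunks, "chunk" being the term the RIFF container format uses for its components (QuickTime-style files call the equivalent units "atoms"). The function names `make_wav_bytes` and `list_riff_chunks` are made up for this illustration.

```python
import io
import struct
import wave


def make_wav_bytes() -> bytes:
    """Build a one-second silent WAV file in memory (illustrative data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(8000)   # 8 kHz sample rate
        wav.writeframes(b"\x00\x00" * 8000)
    return buffer.getvalue()


def list_riff_chunks(data: bytes) -> list[tuple[str, int]]:
    """Return (chunk id, payload size in bytes) for each top-level chunk."""
    riff_id, _riff_size, _form_type = struct.unpack("<4sI4s", data[:12])
    if riff_id != b"RIFF":
        raise ValueError("not a RIFF container")
    chunks = []
    offset = 12  # skip the 12-byte RIFF header
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack("<4sI", data[offset:offset + 8])
        chunks.append((chunk_id.decode("ascii"), size))
        offset += 8 + size + (size % 2)  # chunks are padded to an even length
    return chunks


for chunk_id, size in list_riff_chunks(make_wav_bytes()):
    print(repr(chunk_id), size)
```

Even this very simple file has two parts: a `fmt ` chunk describing how the audio is encoded and a `data` chunk holding the samples themselves. That is the "material nature" of the file in miniature.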
I wanted to have a think about where multimedia appears in the research environment, and there I've listed films as an example of multimedia; webpages, where you can have a mixture of moving image and sound and possibly some graphics; digitised documents, which can have both image and annotations, markup and transcripts; and also satellite images, which is something quite new to me. We had the great benefit of a lecture within the Australian National Data Service recently from Stuart Minchin from Geoscience Australia, and he opened my eyes to what a satellite image is and all the layers that actually exist in a satellite image. That was really interesting. He gave a great talk on the data cube that they've developed, but that's a whole other topic. I just thought I'd pause at this point to see if any of those in the group have a background in multimedia or have some questions about the definitions of multimedia.
So looking into where multimedia appears in the research domains got me thinking about the fact that this topic had come up through those working with people who undertake performance studies. So I listed some of the research domains there where multimedia is generated, to help me unpick who's actually using or creating multimedia, where that's happening and why they're using it, and to get a better understanding of how it's created and also perhaps used.
So I guess I'm looking at this from a cultural production perspective, and I started to look at the methods that researchers were using to create or process data or information. I found that a very iterative process. I kept bouncing between the idea of something as information and something as data, having a bit of trouble identifying what was what, and I kept circling back to asking myself: what is the researcher doing? I think it really helps to look at research methods to understand when a digital object is being treated as a piece of information and when it's being treated as a source of data. I'm sure that there could be quite an exhaustive conversation about this, but I really thought it helped to understand the purpose to which this material was being put, how it was being used, whether it was a human or a computer looking at the multimedia, and whether that was a useful distinction or not.
Where I got to in looking at multimedia was trying to understand what happens to a digital object, which is how I refer to a piece of digital material. So I tackled annotation as an area of research, I guess as interpretation, or as a process in which information or data is applied to data, and I became very tangled. So I picked out four areas where I could see the word annotation being used, and the first was genetics, which was really interesting.
I'm quite fascinated by the idea of automated annotation and how that actually operates in genetics. It's more out of curiosity than ever having a desire to be someone who studies genetics, but it was really interesting to understand the capacity for generating very large amounts of automated annotation. Then I moved into geoscience to look at what happens when people annotate geospatial information, and the kinds of terminology that are used to understand what's actually happening: whether data is being applied to data, or information is being applied to information, or data is being applied to information. I really don't have the answers to this, but I wanted to unpick what was going on to get a better understanding of how multimedia was being used and enabling research.
So I moved on to linguistics, where it was again quite fascinating to discover the different types of annotations that are applied to languages, and I've listed them there: descriptive, analytic, time sequence and text. It was really interesting for me to pick apart, say, a descriptive annotation from a time sequence annotation and try to understand what's data and what's information. It helped distinguish different annotations, but it certainly didn't help me resolve the dilemma of what's data and what's information. I did recognise, though, that the material seemed to be becoming increasingly multimedia in nature, if it hadn't started off that way in the first place.
So the last area I looked at was biomedicine, and I've got some images following this slide where researchers annotate images, and they do that in different ways, by drawing dots and marking areas. I thought this was really fascinating, and it made the idea of simplifying the management of multimedia into what is data and what is information kind of meaningless in a way, because it might be a theoretical concept rather than something which actually helps the researcher to do their research.
So I thought I'd put some examples in front of us here. This is a biomedical slide, and you can see that it's been annotated with a line shape and also with some words, and that there's a scanned image underneath. The next image I've got is this wonderful gene annotation image that I looked at and really couldn't make head or tail of, but it made me pretty interested in understanding how geneticists actually manage their data and what information they derive from that data. It's really, really complex. And the fact I've mentioned, that machines actually generate these annotations, made me want to understand where those annotations are put and how they're linked to the gene sequence, but I think that's a whole investigation unto itself. The next image I've got, which is slightly more familiar for many people, is a Google Earth image, which is a satellite image that has street markings and bubble pop-ups and line tracing. It made me want to understand a little more about how that information was being captured, to know how to support researchers who want to manage their data effectively and potentially make it available, cite it or present it as part of their research.
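One way to picture where annotations like these are "put" is to keep them as separate records that point back at the image, rather than drawing them into the image itself. The Python sketch below is only an illustration under that assumption. It does not describe how any of the tools behind these slides actually store annotations, and the `Annotation` fields and the file name `slide-001.tif` are invented for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Annotation:
    target: str          # identifier of the annotated image (hypothetical)
    shape: str           # "point", "line" or "region"
    coordinates: list    # pixel positions as [x, y] pairs
    label: str = ""      # optional words attached to the shape


def to_sidecar(annotations: list) -> str:
    """Serialise annotations to JSON that is stored alongside the image."""
    return json.dumps([asdict(a) for a in annotations], indent=2)


def from_sidecar(text: str) -> list:
    """Rebuild the annotation records from the stored JSON."""
    return [Annotation(**record) for record in json.loads(text)]


annotations = [
    Annotation("slide-001.tif", "line", [[120, 80], [340, 95]], "margin"),
    Annotation("slide-001.tif", "point", [[210, 150]]),
]
sidecar = to_sidecar(annotations)
restored = from_sidecar(sidecar)
print(sidecar)
```

Kept this way, the scanned image stays untouched and the annotations can be managed, cited or reprocessed as data in their own right, provided the link back to the image (here the `target` field) is preserved.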
This is the last one, which I hope those of you who've ever been to Portland enjoy. I found this on Flickr. It's a graphic image in the background, and on top of that it looks like there are letters that are very carefully placed in alignment with what's called a spectrogram, which is someone saying "it rains a lot in Portland", and I thought this was really interesting. I wanted to understand a whole lot more about how these discrete pieces of data were actually brought together, whether you capture that all as one thing or capture it separately, and whether the researcher uses those separate components as part of that multimedia. But it made me understand that where the annotations or the combinations occur may be really critical in supporting some of the research findings.
I thought what I'd do is introduce a project that's happening here in Australia that, for me, emphasises this idea of what's information and what's data and what other research is looking at. It's a project based up at Griffith, but I think with people dotted around Australia. It's the Centre of Excellence in Policing and Security, and they're looking at criminal trials over time, and they've been digitising archival materials and transcribing them. You'll see there on the slide a nice kind of slashed image that Mark supplied to give you a view of the digitised image, which to me is information, but also the lower part of the image is where the data entry occurs for transcription. What I've found interesting in the exchange with Mark about this project, and I met him through an interaction with Alana Piper recently up in Brisbane, is that they're really looking at making the absolute most of this digitised material: looking at it from an informational point of view, to be able to read the records of these criminal cases here in Australia, and also looking at what the data underlying the information can tell them.
It's been a pretty interesting process to get to grips with what it is that they're doing, and I hope that this offers some insight into why it's important to understand what the research is trying to do, and that they're interested in using whatever method and whatever feature of multimedia enables them to do their research. So Mark has emphasised here in the outcomes that they're looking at a mixture of research methods, both quantitative and qualitative. He's sent me an article, and I will pop the link into these slides so that others can have a chance to go and have a read of it. The qualitative aspect of it was something that was a little more familiar to me. The quantitative aspect was something quite different, and it made me realise that perhaps looking at mixed research methods was also a way of understanding how multimedia is operating as both an information source and a data source. But it's the data side of it which I'm finding, I guess enlightening is the word to use. And they're getting that data through transcription, human transcription, but in other cases of digitisation it can be character recognition.
So this is where I got to with multimedia as a data source: I got to the point where I decided that it could be both information and data at the same time, because it depends on the way that the researcher is using it and building on whatever they learn from that multimedia, whether it's being looked at as a piece of information or as a data source for their research. And so reading the cases or the court records, and also doing text analysis or data mining, is enabling that research group to do their research, which I think is pretty incredible potential from one source of digitised material, and I think that's quite an exciting prospect.
So from a management point of view, it made me think about how they were going to approach managing that, and Mark has been kind enough to give me a description of how the back end to the Prosecution Project is going to work. They've got archival materials as digital images. They're going to transcribe those images into an SQL database, and that supports them doing quantitative analysis of longitudinal and comparative patterns. This is an email that he sent over the last week. They're looking to extend that database by accessing or linking other data sources, and he's mentioned the Trove archive and possibly other projects or other digitised material like the police visits, to enable qualitative, what he's referring to as case-level, as well as quantitative analysis. And they're also looking to enrich the data by accessing and transcribing the trial transcripts and other text archives.
So I guess what I understood from this was that my notions of splitting something into information and data were helping me to understand what it was that was going to enable this research group to do their research, but also that I needed to dig even deeper into what sits underneath this application, to understand how they're storing the digitised images, how they're storing the transcriptions, and where they want to store the linkages between those two things. I realised that multimedia in this context is very complex, and that all that language I introduced at the beginning about layers and components is important, I think, to inform how we support the management of this material. So that's where I got to with the Prosecution Project.
Last but not least, I decided to have a look at three applications that enable a person to manipulate multimedia.
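As a rough illustration of the kind of back end described for the Prosecution Project, where transcriptions sit in an SQL database and stay linked to the digitised images they came from, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and every row of data are invented for the example. The project's real schema was not shown in the webinar.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up images and transcribed trials."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE image (
            id        INTEGER PRIMARY KEY,
            file_name TEXT NOT NULL           -- the digitised page (information)
        );
        CREATE TABLE trial (
            id         INTEGER PRIMARY KEY,
            image_id   INTEGER NOT NULL REFERENCES image(id),  -- the linkage
            trial_year INTEGER,
            offence    TEXT,
            verdict    TEXT                   -- transcribed fields (data)
        );
    """)
    conn.executemany("INSERT INTO image VALUES (?, ?)", [
        (1, "register_p001.jpg"),
        (2, "register_p002.jpg"),
    ])
    conn.executemany("INSERT INTO trial VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1885, "larceny", "guilty"),
        (2, 1, 1885, "larceny", "not guilty"),
        (3, 2, 1901, "assault", "guilty"),
    ])
    return conn


def verdicts_by_offence(conn: sqlite3.Connection) -> list:
    """Quantitative view: number of trials and guilty verdicts per offence."""
    return conn.execute("""
        SELECT offence, COUNT(*), SUM(verdict = 'guilty')
        FROM trial GROUP BY offence ORDER BY offence
    """).fetchall()


def source_image(conn: sqlite3.Connection, trial_id: int) -> str:
    """Case-level view: find the digitised page a transcribed trial came from."""
    row = conn.execute("""
        SELECT image.file_name FROM trial
        JOIN image ON image.id = trial.image_id
        WHERE trial.id = ?
    """, (trial_id,)).fetchone()
    return row[0]


conn = build_demo_db()
print(verdicts_by_offence(conn))
print(source_image(conn, 3))
```

The same rows serve both uses discussed above: the aggregate query treats the transcription as data, while the join back to `image` lets a person return to the page and read it as information.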
So I picked three that seemed to be reasonably familiar to me and just wanted to have a look at how they enable material to be brought in, how they enable information or data to be applied, and what happens in these three applications. These are, I guess, reasonably ubiquitous applications: Final Cut Pro, ArcGIS and WordPress. They're certainly not the very domain-specific and, I guess, less commonly used applications that you might find in biomedicine or genetics.
So I had a look at Final Cut Pro to try and understand some of the language that's used and what's actually happening when you use it, and if you've got any competent users in the group today it would be great if you offered some advice. I just wanted to look at what Final Cut Pro does to digital material, and from what I can understand, it technically consists of separate files. There's something called a project file, a media source file, and render or cache files, and that gave me an understanding that the multimedia was being captured in different ways and potentially for different purposes. I have to confess I haven't ever used Final Cut Pro, but I think this is an interesting way for us to understand how multimedia is brought into an application and where it is saved, and also what happens when you want to get that material out of the application: how you store it, and whether you store it as a combined object or as separate objects.
The Open Archival Information System model, which is used in the digital archiving world, has been interpreted in different ways. To give you an example, a long time ago when I was working on the National Digital Heritage Archive in New Zealand, we decided to be very clear that we would capture metadata separately from the material that we were hoping to keep, and I think it was the Dutch National Library that decided to go a different way.
They decided to build the digital object with both the metadata and the object that was being collected. To me those are two very simple ways of approaching the capture of multimedia: to separate concerns, if you like, or different types of digital information, or to actually build it into a bundle. But it made me realise that if I was trying to get material out of Final Cut Pro, I would want to understand how this material could be linked back together again in case I ever wanted to work on that multimedia material again. I won't carry on with that one, but I did look at this and wonder how the output of Final Cut Pro is captured and how the components are captured, and I don't have the answer to that today.
So ArcGIS is another tool that I haven't used, but I went in to have a look at how material was viewed in that application and what happened to it when it was being used. What I can understand from this is that it's possible to pull in geospatial images, and it's possible to pull in geospatial data, if you like, that lat/long kind of data, and to build up layers within this application. Again it made me think that being able to maintain those components separately, but also to maintain the final output, which may be a combination of those components, could be critical to a researcher. It might not be, but when you're dealing with different parts of material, how are they used by the researcher? Looking at a map from a human point of view, we can read it, but is that an important aspect to the researcher, or is it the annotations on the map that are more important? I can't answer that, but in terms of being able to support researchers who use or create multimedia, it's important to ask what it is that they want to do with it and whether they want to deconstruct or reconstruct from those original components.
So the last one is WordPress, and this is one that I suspect many more people have experience with.
I've always wondered how people get the content out of WordPress, so I went and had a look to get an understanding of what happens if you've had a website up using the WordPress application and you want to suck all the content out so you can capture it and perhaps put it into a different application. It may be very important to keep the narrative that's in posts, pages or comments discrete from categories and tags. From what I can glean, I think you can get that out as separate pieces of data, but it made me wonder how a researcher might actually use that material: whether they would just want to re-import it into another application, or whether they'd actually want to process those tags or categories to see how much content has been given those categories or tags.
Thanks again Ingrid and thanks very much to everyone for attending today and for your questions and comments.
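Returning to the WordPress question raised above, about processing tags or categories to see how much content has been given each one, the Python sketch below counts them from an export file. It assumes the export follows the usual WordPress eXtended RSS (WXR) layout, in which each `item` carries `category` elements whose `domain` attribute distinguishes categories from tags. The sample XML and the function name `count_terms` are invented for the illustration, so check a real export before relying on this.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A tiny made-up export fragment in the assumed WXR layout.
SAMPLE_EXPORT = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>First post</title>
      <category domain="category" nicename="news"><![CDATA[News]]></category>
      <category domain="post_tag" nicename="data"><![CDATA[Data]]></category>
    </item>
    <item>
      <title>Second post</title>
      <category domain="category" nicename="news"><![CDATA[News]]></category>
    </item>
  </channel>
</rss>
"""


def count_terms(export_xml: str) -> Counter:
    """Count how many items carry each (domain, term) pair."""
    root = ET.fromstring(export_xml.encode("utf-8"))
    counts = Counter()
    for item in root.iter("item"):
        for term in item.findall("category"):
            counts[(term.get("domain"), term.text)] += 1
    return counts


term_counts = count_terms(SAMPLE_EXPORT)
for (domain, name), total in sorted(term_counts.items()):
    print(domain, name, total)
```

Here the narrative in the posts is left alone and only the categories and tags are treated as data, which is one version of keeping the two discrete.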