Thank you so much, Mike. Well, welcome back everyone. As you know, this is our final webinar for the course Caring for Audiovisual Material, and we really have a fantastic webinar planned for you today. It looks like right now we have about 154 people logged into this meeting room, so feel free to continue saying hello in that chat box, and throughout the webinar feel free to post all your questions in there as well, and we'll try to get to them all by the end of our session today. As you know, this is just one course in our series, Caring for Yesterday's Treasures Today. Six courses have already concluded and are available on our website to go back to at any time, and recordings of this course will also be posted there shortly. These courses, and I'm not exaggerating when I say this, would not have been possible without the Laura Bush 21st Century Librarian Program grant from the Institute of Museum and Library Services. So a huge thank you to IMLS for supporting training opportunities like this. We're also fortunate to have Mike on board with us from Learning Times to help us with both the website and webinar support. And for this particular course, we also owe a great debt to the Conservation Center for Art and Historic Artifacts for organizing all of our speakers and materials. Laura is on board with us again today to help field all your questions. Laura, do you want to say a quick hello? Sure. Thanks, Jenny. Again, my name is Laura Hort Stanton. I'm the Director of Preservation Services at the Conservation Center for Art and Historic Artifacts. We are a nonprofit regional conservation center based in Philadelphia, and we specialize in paper-based materials but do also have some experience with audiovisual materials. So that's why it's been great to be able to work with Heritage Preservation on this series. So thank you. Thank you.
So before we move on to our topic today, let me just quickly review what you can expect following the conclusion of this webinar. To officially complete this course, we just ask for a few things. The first is that you've registered, so you're in our system. We ask that you watch all five of the webinars in the course, whether you're showing up live or you're watching the recordings. And finally, we also ask that you complete all five homework assignments. All of these assignments are due one week from today on November 6th. So Wednesday, November 6th. Now, we have had to make a few adjustments to the way we award our Certificate of Completion. When we began these courses, we had absolutely no idea how many people would not only go through the effort of watching the webinars but also complete all of those homework assignments. On average, we have about 300 people finish each course. And as you might imagine, we are very close to hitting our ceiling on our postage budget. So in an effort to help reduce the cost of postage and printing, we are going to ask you to help us just a little bit. For those of you who complete the course, you will receive an email notification that includes your name, number of instruction hours, and other pertinent information about the course. This will in essence serve as your certificate and proof of your achievement. And as always, you will receive a digital credential from credly.com. We hope that this is a satisfactory alternative and that you understand why it's necessary. But with that, I will say we do know that that 8.5 by 5.5 piece of paper has become important to some of you. And we want to do our best to accommodate those folks who have found it really important. So there are two additional options. We can email you an image of your certificate that you can print yourself, or, if absolutely necessary, we can print it and mail it to you like we have done in the past.
In today's homework assignment, you'll notice that the second question after your contact information will be a spot to denote how you want to receive your certificate. So make sure to fill that out to let us know. As in courses past, the final assignment is actually the evaluation. We really look forward to hearing your feedback. And if you feel more comfortable doing so anonymously, you'll notice at the end of this evaluation there is an opportunity to provide anonymous feedback. And if you're not interested in earning the certificate or completing the course officially, which means you haven't been doing the homework assignments, I'm going to ask if you could please still do that last assignment and fill out that evaluation, because that would be incredibly helpful to us. The course web page will remain up and continue to hold all of those presentation resources and transcripts. After the 6th, the day your homework is due, the course will have officially concluded, and we'll start the process of posting the webinar recordings to that page so you can share them with your colleagues or go back and look at everything that you've gone through already. So what's next? Shortly following this webinar, we will send you an email with links to all the webinar recordings; we hope that will happen today. It will include recordings and links to homework assignments so you'll have everything in one place. Again, all the materials are due November 6th, one week from today. And shortly following that deadline, we'll pull down all the links to the homework assignments and replace them with recordings of the webinars on the course web page. Staff at Heritage Preservation will then begin the process of logging all your homework assignments and tracking attendance. Once we have that logged, which usually takes about a week, we will send you an email notification and online credentials from credly.com.
Then, if you haven't done so already, consider signing up to become a member of the online community. Membership is free and gives you access to posting on the discussion board, which is a great way to continue some of these conversations. And as always, if you have questions, please feel free to email us or call us. All right. With that out of the way, let's move on to our topic. I am pleased to introduce today's instructor, Stephanie Ureni. Stephanie is an archivist and audiovisual specialist at George Blood Audio and Video. She has worked in a variety of academic and public libraries, archives, and nontraditional library settings around the globe. Her accomplishments and experiences are extensive, including the management of the Arms Control, Disarmament and International Security Library at the University of Illinois, aiding in the development of librarything.com, and independently managing the Pacific Basin Institute Archive at Pomona College in Claremont, California, where she was also responsible for building a new on-campus facility. Stephanie, I know I'm missing a lot of stuff in there, but I'm going to go ahead and move this out of the way and hand things over to you. Great. Thank you. I am happy to be here and happy to see so many people from all over the globe, so welcome. Like I said, I work at George Blood Audio and Video. We do digitization and reformatting of music, film, and video collections for libraries and archives all over the globe. Today, I just wanted to talk to you about understanding reformatting options and providing access to your collections. I was just going to begin with a bit of an overview of the audio digitization workflow process here. As you can see by the graph, this is just a visual representation of the digitization process that we go through here at the studio.
I'm mainly just going to cover details of audio in this course given the time frame, but video is just as extensive and complex, and I'd be happy to be a resource in the future for questions pertaining to that as well. So I'm going to go through the particularities of audio digitization in practice, covering the standards, including many metadata standards. Right now, I think most of you are probably dealing with collections where you're trying to figure out what tools to use and also what happens during the digitization process, so I just want to provide an overview for you. When we receive and process materials, there are three different types of files that can be created to make up a digital archival set. They consist of the preservation master, the use and access copy, and the web accessible copy. All of these depend on the situation of your institution and your wants and needs for the project, but we'll begin first by looking at a preservation master. A preservation master is the most important file to manage, and as such it should be rarely accessed. It's meant to provide a copy of the original, but in digital form. The standard is typically 96 kilohertz, 24 bits, which refers to the sample rate and bit depth of an audio file; sometimes that is done at 44.1 kilohertz, 16 bits. You can think of the kilohertz figure as referring to the pulse code modulation in an audio file, which is the digital representation of sampled analog signals. You can compare this to the DPI of a TIFF file. Bits measure the volume, or the amplitude, and it's similar to how a TIFF file would document its range of colors. Preservation masters are usually processed as a WAVE or Broadcast WAVE format, and I'll be discussing the Broadcast WAVE later in the course. So the key advantages of a preservation master: we process it into a Broadcast WAVE format, which is the most widely used format.
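To make the 96/24 versus 44.1/16 numbers concrete, here is a small arithmetic sketch of how sample rate and bit depth translate into uncompressed PCM file size. The helper name is illustrative, not part of the studio's workflow:

```python
def pcm_bytes(sample_rate, bit_depth, channels, seconds):
    """Uncompressed PCM size: samples/sec * bytes/sample * channels * duration."""
    return sample_rate * (bit_depth // 8) * channels * seconds

# One minute of stereo audio at the 96 kHz / 24-bit preservation standard:
master = pcm_bytes(96_000, 24, 2, 60)   # 34,560,000 bytes (~34.6 MB per minute)

# The same minute at CD-quality 44.1 kHz / 16 bits:
access = pcm_bytes(44_100, 16, 2, 60)   # 10,584,000 bytes (~10.6 MB per minute)

print(master, access, round(master / access, 2))
```

This is why preservation masters are "too big to house on a CD-ROM": the 96/24 master is more than three times the size of its CD-quality derivative.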
It has a higher resolution than 99% of sources, and as a file format for audio data, it adheres to the EBU, or European Broadcasting Union, Technical Recommendations and Standards. It's better than most playback chains, and derivatives can be easily created from this file. So you can think of it like a sound TIFF; that's a way to visualize what a BWF file is in the field of audio. The preservation master comes with some difficulties as well. There's no standard storage medium. The files are often expensive to maintain because they are larger, so they're too big to house on a CD-ROM. Online storage requires ongoing maintenance, and internet delivery is often impractical for this type of file; it's just too large. The advantage of having a preservation master over its derivatives is that it'll provide you with a basis from which you can use your files. A typical solution that we provide for some of these problems is that we'll deliver a 96/24 hard drive to a digital library, which requires enterprise-level storage; often you might need IT staff to help with that. 96/24 files can be put on DVD-ROM, and from there they can be migrated to hard drive when available. Or you can also put them on a gold CD-R: if a file is small enough, you can put it on a CD-ROM, or on an LTO-3 data tape to keep it long term. So a main derivative of the preservation master is the use and access copy. The key traits of a use and access copy are that it's readily accessible in a user-friendly format, so often we'll make CDs or DVDs to have on the shelf in the library. It's good enough to substitute if you lose the preservation master, but an important element to remember is that once you're pulling a derivative from a preservation master file, you lose some of the information in the file. It's often information that you can't necessarily hear with the human ear, but because it's pulled from a larger file, the derivative doesn't have the same breadth and depth to the sound.
Some key difficulties of a use and access copy: we'll look at CD audio versus CD-ROM. Neither of these mediums can last long term, because they will deteriorate in some form. A CDDA is a digital audio CD. It has a pure serial read, so you can't reread it to correct errors, even small transient errors. A CD-ROM, on the other hand, is sector-based; its digital audio is provided as data, so it can be reread. It's a bit more reliable, but it requires computers, software, and a particular OS to retrieve the information. So CDDAs are more widely playable, but CD-ROMs are more reliably played. Depending somewhat on the preservation master file, a use and access copy will be provided on a CDDA for near-universal playability; in video form, we provide it on a DVD. And it's always important to have multiple copies. You could put one copy on a gold CDDA and one on a green; these are different standards of a CD. Often we'll put the preservation master on a gold CD-R and the use and access copy on a green one; the gold CD-R is just a bit higher quality. So our third format of file is the web accessible copy. Whether or not you're able to stream this material depends on the rights. RA and AAC, which you'll see on the PowerPoint here: RA refers to RealAudio, which is a streaming format. AAC is an Advanced Audio Coding file, which is the standardized lossy compression scheme for digital audio that was designed to be the successor of MP3 as part of the MPEG-2 and MPEG-4 specifications. MP3 or WMA is used as a web accessible copy, but to have RealAudio (RA) or AAC, you have to have access to those rights. It depends on your needs. You might have restrictions on putting things online beyond your institutional ability, but sometimes you want to give further access to your material, and a web accessible copy can provide that.
It won't be quite the same standard of sound as you would get in a preservation master or a use and access copy, but it can still provide a different resource for your material. One of the benefits, I think, of a web accessible copy is the accessibility. Obviously it says it in the name, but that's a large part of preservation: just access to materials. I think I accidentally pressed something here that confused my screen. Oh, there we go. Apologies, thank you for bearing with me. There it is. So just to summarize: preservation masters used to be provided on analog, processed from one format to a further format as things were developed, but now, since we're in the digital realm, we stick with a 96/24 preservation master file, which can then be moved to DVD-ROM, hard drive for storage, or an LTO data tape. A CD-R can then be made into a CDDA or CD-ROM in either gold or green; it depends on your choice and the resources that you have to give to your collection. But ultimately, it stands to point out that digital is not forever. The idea of preservation is often thought of as the caretaking of old archival material found in someone's dark attic that has to be retrieved from a back room, but the practice of digital preservation is more largely understood in the field as that of providing access to information as well as history. Now, the American Library Association PARS definition of digital preservation: PARS is a cohort of the American Library Association that specializes in preservation; it's the Preservation and Reformatting Section. I think I provided a link to their website at the end of the course or on the resources page, I believe. They state that digital preservation combines policies, strategies, and actions to ensure access to reformatted and born-digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.
The accurate rendering of authenticated content over time. This leads us to look at preservation in terms of access to the content versus that of the medium. Audiovisual media deteriorates rapidly. Playback machines for most, if not all, analog material have become obsolete. Analog is widely considered to be dead, and digital, as noted, was never considered to be forever. As a result, we need to be conscious of the constant change and challenges that face the practice of digital preservation. This uncertainty principle considers and values the idea of access to the information documented, whether expected or unexpected, in recordings. The catch is "regardless of the challenges of media failure and technological change." This, therefore, makes migration a way of life. So what do we consider when discussing migration? How frequent should it be? How long will your materials last? What determines when you should migrate? Often that's due to format obsolescence, which isn't as big of a problem since we know how to move forward between formats, but carrier obsolescence is a really big problem: if you can't find anything to play your files on, then there's no way for you to listen to them. So what then is obsolescence? Consider a CD-R versus an LTO tape. A CD-R is a good option for a use and access copy, but it often ends up in somebody's hands. An LTO tape is something that can last a little longer, but you need some IT support; we term that enterprise-level storage. It boils down to the ability that your institution has to support your media, and also the limitations. Can you handle IT-level storage, or are you dealing with a bunch of drives on a shelf? The term "bunch of drives on a shelf" was coined by Andy Kolovos at the Vermont Folklife Center. It's cheap and fast and familiar; we can all use a hard drive. But the issue with hard drives is they die easily and they're easily erased.
An LTO tape has high density and high resolution, but it's IT intensive, has short life cycles, and has a complex machine dependency, so you'll often need IT support staff to help with the backups of LTO tape. CDs are cheapish and widely available with mid resolution, but there's lots of handling to migrate, and there's no metadata except for the label. An enterprise-class hard drive is a fast, preferred solution, but it needs technical staff and can often be expensive; this is a large server-type hard drive that you could use for backup of your materials. So what does all of this say? Well, IT is getting cheaper ever more quickly. It also gets obsolete. So over multiple migrations, you have to plan ahead for the life cycle cost. What ability will your institution have at any future time to support the migration of digital content? Because the decisions you make today are governed by that future ability. This is an important point. If you can't support the migration choices you make now further down the road, then you're unable to support the continued digital preservation of your material. So I think before I go into the standards section, it looks like we have a few questions that I'll try to answer if there's any confusion. Yeah, hi Stephanie. This is Laura. There do appear to be a few questions, and I think a lot of it is due to, like in any field, all of the acronyms that end up floating around. So some of the questions related to just that terminology. One was: when you were talking about the preservation master, you mentioned PCM. What does that stand for? PCM stands for Pulse Code Modulation. It's essentially the digital representation of sampled analog signals. So if you picture a waveform for an audio signal, the ups and the downs, that is a pulse code modulation of that audio signal. Okay. And so, other questions in that same vein.
Some folks wanted to know, and I know Jenny did a little bit of explanation in the chat box, but what actually is LTO? Yes, LTO is Linear Tape-Open. It's a magnetic tape data storage technology. It was developed in the 90s as an alternative to proprietary formats. It's used especially for backup; we use it to back up our files. Essentially we have things housed on the server and then we'll back them up to LTO tape. It's a reliable backup option. Okay, great. We also had some questions about the term 96/24, which you were mentioning with the preservation master, and people were looking for a little bit more clarification there as well. Oh, sure. 96/24 is the sample rate and bit depth referred to in an audio file. The sample rate is essentially the frequency: it defines the number of samples per unit of time taken from an audio signal, a continuous signal. So the sample rate is the frequency. Bit depth refers to, well, both of these elements are part of the pulse code modulation that I mentioned before, but it's the number of bits of information in each sample, so it corresponds to the resolution of each sample. There are 16, or in this case 24, bits per sample. I hope that clears it up a little more. Yes. A few more questions for you and then we'll let you get started again. One question was, how do you convert other formats to the WAVE file? The person was asking, how do you convert, for example, WMA to a WAVE format, if there's an easy explanation for that. I guess generally our transfer engineers would be doing that process. You would use a converter, a particular computer program that would convert the WMA to the WAVE. It's similar to how, from analog to digital, you have an analog-to-digital converter, essentially a machine that you plug in that connects the analog to the digital. It's something that makes the files talk to each other. So I'm not positive.
I think there are various sources that you can use for that process; I'm not positive of all the ones that we use in the studio here. Okay. Fair enough. But knowing that there are tools out there that you can use to do that I think is helpful to people. And I lied, I do have one more question for you. You were talking about your access copy, your use copy, and of course your preservation master. And the question that we had is, when you're doing your migration or conversion, which copy would you use in the future? Is that why you're making the preservation copy as well? Yeah. I mean, the preservation master copy would serve as the go-to copy. Essentially, use and access copies and web copies are made to provide access. The preservation master copy is a similar thing to taking your film and putting it in a freezer to preserve it. It's not a copy that you want to be using; that's what you make the use and access copies for. They're developed as derivatives of the preservation master. So the preservation master aims to be the closest to the original. The idea of it is to provide a digital representation of the analog signal as close as possible to its original. Great. Well, we have lots more questions, but we're going to go ahead and hold those until you're ready to take another break, I think. Okay. Great. Well, I guess we'll just move on to highlighting some of the standards. I realize that I'm probably throwing quite a few terms your way and seeming to gloss over some things. These are standards and ideas that are a little difficult to explain in just a couple of minutes. But as an introduction, I hope this can provide you with resources and a way to begin looking into how to process material. We also welcome questions or calls to the studio. I'm happy to be a resource in the future.
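As a concrete illustration of the conversion question above: one widely used open source tool for this kind of format conversion is FFmpeg. That's an editorial example, not the speaker's named tool, and the file names and helper below are hypothetical. A minimal sketch of assembling such a command from Python:

```python
import subprocess

def build_wav_conversion(src, dst, sample_rate=96_000, bit_depth=24, channels=2):
    """Assemble an FFmpeg command that decodes `src` (e.g. a .wma file)
    and writes uncompressed PCM WAVE at the requested rate and depth."""
    codec = {16: "pcm_s16le", 24: "pcm_s24le"}[bit_depth]  # little-endian signed PCM
    return ["ffmpeg", "-i", src,
            "-ar", str(sample_rate),   # output sample rate
            "-ac", str(channels),      # output channel count
            "-c:a", codec,             # PCM codec choice sets the bit depth
            dst]

cmd = build_wav_conversion("lecture.wma", "lecture.wav")
print(" ".join(cmd))
# To actually run it (requires FFmpeg installed on the system):
# subprocess.run(cmd, check=True)
```

The same shape works for producing 44.1 kHz / 16-bit access copies: pass `sample_rate=44_100, bit_depth=16`.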
But in looking to the practice of digital preservation, there are some important standards. These include BWF files, as I mentioned before, the Broadcast WAVE that we use for the preservation master; BEXT and INFO chunks, which are part of the metadata process within a BWF file; ID3 tags, which are also a form of metadata; and the AES57 guidelines, which are the Audio Engineering Society guidelines. These are all things I will be going through in as much depth as I can, plus PBCore, and then checksums, which provide data integrity. So, just to launch in. The BWF is a derivative of the WAVE file; it's a Broadcast WAVE. The WAVE format was released in 1992 as part of Windows 3.1 under the RIFF standard, which is Microsoft's Resource Interchange File Format. RIFF also has derivatives you've probably heard of, AVI files being probably the most common on this list, in addition to RMI and RDI. So the BWF is what we use in file automation at George Blood Audio and Video. All of our files are created from a single original capture file. That specific file info is then gathered from a FileMaker Pro database, which is where we house the metadata. We use Linux command line audio utilities to read the information. The original file is split into the three different types: a preservation master, access copy, or web copy. As to the specifics of those that I briefly mentioned before: the preservation master is often processed as a WAVE and is provided at 96 kilohertz, 24 bits, with BEXT metadata attached that's housed within the file. An access copy is often processed into a WAVE, but at a lower pulse code modulation, 44.1 kilohertz and 16 bits, and that's provided for CD burning most often. The web copy is MP3 at 192 kilobits per second, and there's ID3 metadata tagged onto that. So some of the audio utilities that we use in the digitization process include SoX, which is Sound eXchange.
It's often termed the Swiss Army knife of sound processing. It's used for sample rate and format conversion, so most likely this could be something that would be used in the WMA-to-WAVE conversion question that was asked earlier. And then there's libsndfile, which is a C library. It contains an example program that gives a lot of useful info about files with BEXT embedding. You can check that out at mega-nerd.com. I provided these links on the resource page as well; they're open source programs that you are able to download. So a BWF file consists of two mandatory WAVE chunks. The fmt chunk describes the contents of the WAVE file; this speaks to the metadata, like the BEXT chunks that are attached to a BWF file. The format chunk includes descriptions of the format: the number of channels, whether it's mono or stereo, the sample rate and bit depth, which are 96/24 as I referred to before, and the streaming info. The data chunk then covers the audio data, with no compression. There are many other things that you can include in the audio data portion: multichannel format, 64-bit audio. It's the place where your waveform, the pulse code modulation, is documented in the file. There's also an optional list of WAVE chunks, or INFO chunks, that can be embedded in a BWF file. This includes information on the list you see here. Any new INFO field can be defined, and an application should ignore any chunk it doesn't understand. There are common registered INFO fields, like artist, comments, copyright, genre, and name, some examples of these optional items that you can include. Additionally, there's the sample chunk, or sampler chunk, SMPL, which defines the basic parameters that an instrument, such as a MIDI sampler, could use to play the waveform data. It includes info about looping the waveform during playback.
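The two mandatory chunks just described, fmt and data, can be seen by walking the RIFF structure directly. Here is a minimal sketch; the helper name and the tiny in-memory WAVE file are illustrative, and per the RIFF convention a reader simply skips any chunk it doesn't recognize:

```python
import struct, io

def wave_chunks(stream):
    """Walk a RIFF/WAVE stream and yield (chunk_id, size) pairs,
    skipping each chunk body rather than failing on unknown chunks."""
    riff, total, wave = struct.unpack("<4sI4s", stream.read(12))
    assert riff == b"RIFF" and wave == b"WAVE", "not a WAVE file"
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break
        cid, size = struct.unpack("<4sI", header)
        yield cid.decode("ascii"), size
        stream.seek(size + (size & 1), 1)  # chunk bodies are word-aligned

# A minimal in-memory WAVE: a 16-byte 'fmt ' chunk plus an empty 'data' chunk.
fmt = struct.pack("<HHIIHH", 1, 2, 96_000, 96_000 * 2 * 3, 6, 24)  # PCM, stereo, 96/24
body = b"fmt " + struct.pack("<I", len(fmt)) + fmt + b"data" + struct.pack("<I", 0)
wav = b"RIFF" + struct.pack("<I", 4 + len(body)) + b"WAVE" + body

print(list(wave_chunks(io.BytesIO(wav))))   # [('fmt ', 16), ('data', 0)]
```

Running the same walker over a real BWF master would additionally show the bext chunk (and, depending on the software that wrote the file, smpl or pad chunks).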
It's useful when data is used in samplers, but it rarely holds value in the preservation world. Peak is one of the audio software programs that we use, and it inserts a sample chunk in every WAVE file it saves, so it's just included as part of that particular program. There are other optional WAVE chunks, like a pad or junk chunk, which is really just a placeholder chunk. These are less common and are used to align files of different sizes for easier conversion. It allows a quick expansion of any other header chunks, and WaveLab, a program that we use, inserts pad chunks in all saved WAVE files. So, we lost my screen again. One of the most important elements of the BWF is the BEXT chunk, which is a mandatory part. It defines the metadata fields and holds a controlled and suggested vocabulary for most fields. That includes the description, the originator, originator reference, origination date, origination time, time reference, and coding history. It limits the data chunk to pulse code modulation or MPEG formats; I mentioned pulse code modulation before, and MPEG I can go over in a little bit. So here's an example of a BEXT chunk as we would show it in an audio file: the description at the top with the name and the song that is played; the originator, which is where it came from; the origination reference, which is the client; the date and time. And then at the bottom is the coding history. You can see at the beginning there's PCM, which is the pulse code modulation, and then 96/24, the sample rate and bit depth. Here we document whether we processed it in stereo or dual mono, the machine that was used, and the programs that were used in conversion. So moving on: obviously BWF is a derivative of the WAVE file, but there are some problems with normal WAVE files, which is partly why BWF was created. They have proprietary chunks, which means, for example, that Peak is one of the only apps that would read its own WAVE chunk.
A lot of the info is redundant. Older applications don't always ignore chunks they don't support, like the INFO chunk or the junk chunk that I mentioned before. So efforts should be taken to write the most basic WAVE file, because the simpler it is, the more interoperable it will be. There are some programs that exist to strip extraneous chunks from your WAVE files after conversion. WaveTrim is a Windows application that removes superfluous chunks from WAVE files, and SoX is a command line application that does many other utilities; you can download that from sourceforge.net, which is a great resource. I also see some conversations about BWF MetaEdit. You can download that from sourceforge.net as well. We use BWF MetaEdit in our quality assurance after a file is processed: we look at the metadata from the original to the converted file and compare between the two. Then there's WAVE Format Extensible. There was a Windows 2000 update to the spec that supported higher sampling rates and greater bit depth, with multichannel (greater than stereo) audio. It's best to avoid it if you can; it's difficult to support down the road. As for BWF, there are some problems in the implementation of it. There are only a few commercial software titles that read BEXT chunk info and few pro audio applications that embed the metadata. For example, Peak 6, Adobe Audition, and WaveLab are some examples of professional audio applications that can. But if you look at Peak 5, the old version, or Audacity, they can't embed the same type of metadata that was held in the BEXT chunk. So, BWF is mostly geared toward broadcast applications and has some limits for preservationists. Honestly, it's just best to keep it simple: avoid extensible formats and know your software. Stick with similar versions. So I don't know if I should maybe pause again for a few questions; it looks like there are some large questions over there.
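The checksums mentioned in the standards list are the usual way to back up this kind of quality assurance: a digest recorded when a master is written lets you confirm, after every copy or migration, that the bits are unchanged. A minimal fixity-check sketch, with function names of our own choosing:

```python
import hashlib

def file_sha256(path, block_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB blocks so even a
    multi-gigabyte preservation master never sits whole in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()

def verify_fixity(path, recorded_checksum):
    """True if the file on disk still matches the checksum logged
    when the master was first written; run after every migration."""
    return file_sha256(path) == recorded_checksum
```

Note that a checksum only proves two copies of the same file are identical; an original and its converted derivative will, by design, have different checksums.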
Yeah, I think that would be great, to address some of the questions. Again, some are more technical while others are more general sorts of questions. Andrew from Rock Hill, South Carolina asked: do you think that there's any alternative to WAVE files for storing audio, since they're just so huge? And are you familiar with the FLAC file format, and would you be able to comment on that? Sure. I mean, as far as the preservation master goes, we really recommend a WAVE given the amount of sonic information that is covered. If you don't have as much space, that's what we provide use and access copies for, but storage is becoming a lot cheaper, with hard drives and the like, so it's actually not extremely difficult to store your files when they're in WAVE format. I think part of it is prioritizing what you're preserving. Well, I think some people's heads are spinning a little bit with all this information, because it is a ton of information. And I guess I'm wondering, and a few other people were wondering too: say that they don't feel comfortable using one of these converters, or they might not have access to one for one reason or another. So some people have asked, if they can't convert things to WAVE files, is it so important that they should send this out to a vendor to do right away? Or, as long as they're storing what they've got okay, are they safe? Should they worry? Yeah, I mean, at the very least, stick to standards that you know in the field, keeping things in a cool, dry environment with proper temperature and humidity controls. If you can't digitize right away, then you have to care for the AV material itself. But the fact is that a lot of these materials are deteriorating, and time is not something that's on our side.
So I think the sooner the better, which is partly the idea of prioritizing some of your material. Great. Selena from California just wanted to confirm what she was hearing: can you confirm that the two required WAVE chunks are the fmt and the data chunks? And although BEXT is optional, do you think that part is actually most important? I would say the fmt and data chunks are the parts of the metadata standard that are mandatory, and the BEXT chunk is considered to be part of the definition of those fields. So I guess it might be a little bit misrepresented here, but I would recommend all three. And let's see, I did have another question here I wanted to ask. Are you going to touch on metadata a little bit more later? Because one of the questions was: is there a piece of software you would recommend for adding metadata to WAVE files? Sure. I guess there might be a little bit of confusion, then. A BEXT chunk refers to metadata fields. In the standard broadcast WAVE file that I'm discussing, these chunks refer to different fields of metadata that can be added to the file. So it's a way of breaking down how metadata is attached to an audio file. Okay, great. And then one more question, and then I do see a lot of questions coming in about video formats as well. Maybe those are questions we can reserve for the very end, since you did say you're going to focus mostly on audio here; I just wanted to let people know that we're not missing their questions there. But the one question I did want to mention was about RAID drives, and wanting to know what you think of those for storage. We use a RAID drive for some of our larger storage. As I'm not the IT department at the studio, I'm not quite as knowledgeable about that element.
But we back things up to LTO tape and use RAID drives for a large part of our storage, since we have so many projects coming in and out. Okay, great. I'll let you get back to it and certainly field more questions here at the end. Okay, so moving forward: ID3 tags are another type of metadata, similar to what I was mentioning with the BEXT chunk. ID3 tags are the most widely compatible, but they have limits. Pardon me, there are several different versions of the tag implementation. The metadata is embedded either at the beginning or the end of MP3 files, depending on the version. It's not used for certain other types of files, like WMA, as they each have their own tagging format. This website is actually incorrect, I apologize; this website went down. But on the resources page, you'll see a link to the ID3 tag standards; it's also on sourceforge.net. That will help explain a little bit more about what's involved with this type of metadata attribution. As I was saying before, version 1 is the most widely compatible, but it's the least capable because it has limits on characters and size. There are lots of flavors of ID3 tags. There's version 2, which is the most capable, but it's tricky to support: applications are not really consistent with this type of metadata, often for proprietary reasons, so transferring between tools is a little difficult. Version 2.3 is the most popular version used; it stores the tag at the beginning of the file. Version 2.4, which stores the tag at the end of the file, hasn't caught on yet as a successor. So I guess just to drive the point home: we worked with Princeton University on a project where we were creating master files and streaming files from lectures. The metadata they sent us was incomplete, and so as we were processing the audio, they asked us to get dates and info from the beginning of the lectures, import that data, and then edit the date fields.
So when we sent them back the audio files, it turned out that they were seeing something different in the metadata: the dates were changing. The reason that was happening is that we embedded the files with a program called LAME, which embeds all four types of ID3 tags, and those tags didn't agree with each other. Different tools often look at different versions of the metadata by default. So it's something that you have to be aware of: which tags are part of a file when you're embedding, which tags are involved when you're editing, and which tags are involved when you're playing back. It's really an authority control problem and a versioning problem, so it's just something to pay attention to. Here's an example of the different ways you can see ID3 tags in iTunes or RealPlayer, and the way the metadata is displayed differently. As you can see, iTunes uses version 2.3; RealPlayer uses a different version, which displays the information differently. So my recommendation is just to pick a version and use that version only, and be aware of using multiple tools to embed metadata in files. You can take a look at the resources link that I provided at the end on ID3 tags for more information. Winamp is a Windows program that can be used to write and view both version 1 and version 2 ID3 tags, and id3v2 is a command line tool for writing, extracting, and erasing version 1 and 2 tags. Another standard, developed by the Audio Engineering Society, is AES57-2011. It's a standard that set out to develop a vocabulary to describe both digital and analog audio objects, using Extensible Markup Language (XML). It provides a structured, human-readable document that is easily parsed and manipulated using different tools. It really concerns the technical documentation, or metadata, for long-term storage and preservation, linking the document to the physical objects.
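As an illustration of the "beginning versus end of the file" point above, here is a minimal sketch in Python that reads an ID3v1 tag, the fixed 128-byte block stored at the very end of an MP3 file (unlike ID3v2.3, which sits at the front). The field layout — a "TAG" marker followed by title, artist, album, year, comment, and genre — is the standard ID3v1 structure; the function name is just for this example.

```python
def read_id3v1(path):
    """Read an ID3v1 tag: the fixed 128-byte block at the very end
    of an MP3 file. Returns a dict of fields, or None if absent."""
    with open(path, "rb") as f:
        f.seek(0, 2)                 # jump to end of file
        if f.tell() < 128:
            return None
        f.seek(-128, 2)              # back up over the tag
        tag = f.read(128)
    if tag[:3] != b"TAG":
        return None
    # Fields are fixed-width, NUL-padded; ID3v1 predates Unicode.
    text = lambda b: b.split(b"\x00")[0].decode("latin-1").strip()
    return {
        "title":   text(tag[3:33]),
        "artist":  text(tag[33:63]),
        "album":   text(tag[63:93]),
        "year":    text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre":   tag[127],
    }
```

The fixed 30-character fields are exactly the "limits of characters and size" mentioned above, and the end-of-file placement is why a player that only reads ID3v2 at the front can show completely different metadata for the same file.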
PBCore is the Public Broadcasting Metadata Dictionary project. It's organized as a set of specific fields that can be used in database applications, and it's utilized as a data model for media cataloging and asset management systems. As a schema, it enables data exchange between media collection systems and organizations. It was based on Dublin Core. Version 2.0 was released in 2011, and it's provided free with Creative Commons licensing. It's an interesting application and is used as a standard in the field. PBCore offers different metadata elements, as you can see represented in the map. So I guess maybe before moving on to checksums, it looks like there are some questions. Yes, and I understand how difficult it is to distill a complex topic into an hour and a half, so of course more questions are going to come up. For those of us who might end up not doing this work ourselves, but outsourcing it to a company or working with our IT departments, do you have any suggestions for resources we can turn to for taking a look at our own collections, assessing what we need to do, and learning some of the options for conversion, so that when speaking with a vendor or our IT department we can be more informed on the language? Sure. I realize I sound like I'm speaking from the technical hierarchy of the digitization world, but most vendors are really approachable; we often act as a resource just to answer questions about this process. I was just at an oral history conference, and there's a project called Oral History in the Digital Age; if you want to check that out online, there are some resources for smaller institutions and ideas of what you can do in-house. Mainly, if you're interested in the digitization process, it depends on the machinery.
You need to have the resource of the machinery, essentially the equipment that will convert your analog materials to a digital format. A lot of what I'm talking about is essentially the technical detail of the metadata that we add to files, that can be embedded in files. I think it's something that's possible to do; it's just a matter of learning the process and having the staff to devote to it. Essentially, it takes time: everything that's analog has to be processed in real time to transfer to a digital format. Our transfer engineers sit with the analog material from start to finish to convert it to a digital format. I'm not sure if that helps answer the question. Absolutely, and thank you. In addition to the resources and ideas that you gave, other people are offering up a ton of great resources in our chat box as well. The other question along those same lines that I wanted to ask: a lot of what I'm hearing you say is that you need to be consistent in what you're doing and to document what you're doing. Do you have any other recommended resources for people who might want to look at developing a policy or a plan in relation to reformatting, maybe some samples they could look to? Sure. We do have a resource page on our website that provides an introduction to audio digitization and processing. I could provide more resources for the website for you after the course; I'm sure there's an extensive list I could put up there, some places for you to go. That would be fantastic, and that might be easier than listing them at the moment; I think people would really appreciate that afterwards. One question that always comes up when we're talking about these sorts of materials: the reason we're really having to do all this conversion is because things become obsolete so quickly.
Some people are saying that they're a little afraid to commit to one digitization format or technology, because it then becomes obsolete so quickly. Is there anything we can do to prepare ourselves for this or make it easier down the line? Yeah, I think the expectation in digital preservation is that technology and media will change, and being able to roll with the changes is important. Migration becomes a way of life, so you're going to have to take into account that you will need to continue to convert files down the road. That's partly why we recommend keeping it simple: keep with machines that process things in the correct way, and use consistent software programs throughout, as I was discussing with the ID3 tag versioning issue. I think migration is the main point: accept that migration is a part of preservation. Which is so hard for us as preservation people to truly embrace. We would love to just put things in a box and have it be done, and not think about it as this continual process. It's a change of mindset on all sides, I think, in some respects. That's the idea: digital preservation is a bit of a different concept, and digital is not forever. We look at analog material that has a particular lifespan. For example, there are some glass discs that we've received that we are unable to process because they have deteriorated so badly. The idea of digitizing material is to save it before it's gone. Speaking to the content: a lot of the technical detail I'm discussing, these complex processes in the transfer, is there to ensure that we're giving as true a representation as possible of what was originally recorded, because we're not here to edit it or change it; we're here to preserve it. That's the idea of sticking with a lot of these standards that have been developed in the field and that we use here at the studio.
We have a little less than 20 minutes left, so I'll hold any more questions until the end. I guess moving forward to the checksum formula. You can take a look at what's on the slide here; it looks extremely complex, a bunch of mumbo jumbo. This is essentially what a checksum algorithm looks like. This isn't something that we would be passing on to your end, but this is the technical data integrity of a file. Essentially, after we take, for example, an analog reel and convert it through an analog-to-digital converter, which is connected to the computer, which is running the software program that creates the digital file, the checksum is an algorithm that we run to compare the original file to the final file. We can use this mathematical equation as an example: if you change one digit in the input value, you can see how it changes the end result. I point to this because a change of a single digit gives a very different checksum; it isn't subtle. There are 3.4 times 10 to the 38th possible values that a 128-bit checksum algorithm can produce. The point of the checksum is to compare the two files; it's simply pure math behind the file. The chance of two different files having the same checksum is, for practical purposes, nil. There are three major types of checksums: MD5, which is a message digest, SHA-1, and SHA-256. These are technical terms, but MD5, for example, is a 128-bit value that's converted to hex to make it easier for humans to read. In the example at the bottom, you'll see how the complicated algorithm that I showed before is represented in hex, with 32 place values. So when you're looking at a file after it has been transferred, we attribute a checksum to that file to highlight its uniqueness.
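The comparison just described can be sketched with Python's standard hashlib module. This is a generic illustration of the idea, not the studio's in-house QC tooling: compute a digest for the original and the copy, and any single changed byte shows up as a completely different value.

```python
import hashlib

def checksums(data):
    """Compute the three common checksum types mentioned above."""
    return {
        "md5":    hashlib.md5(data).hexdigest(),
        "sha1":   hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

# Two inputs differing by a single character (hypothetical labels).
original = b"analog reel, side A, transferred 2013-10-30"
altered  = b"analog reel, side B, transferred 2013-10-30"

a, b = checksums(original), checksums(altered)

# MD5 is a 128-bit value, shown as 32 hex digits; the one-character
# change produces an entirely different digest, not a subtle one.
assert len(a["md5"]) == 32
assert a["md5"] != b["md5"]
assert a["sha256"] != b["sha256"]
```

In practice you would checksum the file's bytes right after transfer, store the value alongside the file, and recompute it later: a match means the bits are unchanged; a mismatch means the file is no longer the same object.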
If we were looking at it in binary, that is what the checksum would look like, whereas hex is much easier to read and compare. If you were to change one value, it would completely change the checksum. Basically, that's looking at the data integrity of a digital file: if any element in that file is changed, it becomes a new, unique file, and the checksum is a way to show that this is the one and only version of this file. For some conclusions, I would just say that it's important to note that there are really no magic solutions. Vendors are here to help, and we're all in this field trying to preserve material together. The idea of digital preservation is this: use established tools, upgrade cautiously, and realize that every solution is temporary, but these are steps that often have to be taken so that we don't lose the content that is on the analog original. I know that was an overview of a lot of really technical things that might have sounded complicated and confusing; I'd be happy to answer more questions. Sure, I'm always glad when there's time for a few more questions. I'm going to try to keep them a little more general, because we did have some very specific questions in there as well. One of them: we were talking about audio throughout all of this, but we did get a lot of questions asking whether there are standards for video preservation, and where folks might turn for those. This might be another opportunity to add more links on the website a little bit later, but if there's any insight you have into that, I'm sure people would appreciate it. Sure, I can certainly provide some more resources online for looking at video preservation as well. Right now, I think the standards for video are still being developed.
At the moment, George Blood, the president of our company, is working with the Library of Congress to write a white paper on digitization standards for video preservation, which highlights a lot of the standards that I'm talking about: the technical details of the back end of how we're processing things to ensure that the result adheres as closely as possible to the original representation of the material. The standards for video preservation are quite extensive, but they're still being developed. We're learning about this process as we go, but the Library of Congress would be a good place to start; they have a lot of resources on their website. And I'll certainly provide more links to an overview of video preservation and some different resources you can look at online. It might be easier just to provide a list of links after the webinar for people to reference on their own time. Great. So I think Marcia in Northern California had a really interesting question, maybe a bit more theoretical in nature, but she was asking for your opinion: what do you see as the next method for preservation after digital? Is there something coming up that we should all be on our toes for? Oh, Lord. I couldn't tell you. I think at the moment we're just looking to use the resources that we have, with the amount of knowledge that we have, and to share that knowledge within the community. A large part of preservation is about access and about the conversation, and especially about being able to do things like this, where I can see that throughout the entire time I've been presenting, people are exchanging ideas and links and helpful processes of how things are done at their particular institutions. The more we talk to each other in that sense, the better we'll be able to exchange these ideas and help support preserving intellectual material. Great.
And Marcia just commented again that she's heard of DNA computers as well as light beams, which I just can't even wrap my head around. So that is really all the questions we have right now, so I can throw things back to you, Jenny. Sure. Let me go ahead and put up the assignment. If you were feeling a little lost, don't worry: it's the evaluation, so it should be pretty easy to fill out. And let's see, I have the link to the course webpage. And then again, as always, if you're watching in a group, we just ask that your group leader write down everyone who's watching with them, so we have a better idea of everyone who's with us today. It looks like I'm going to give this a few minutes for our group folks to post their names, so you've just got a few more seconds to throw in any last-minute questions for Stephanie. And I'd also say, I think my contact information is online, too; I would gladly accept questions over email if people want to ask about specific projects they have, or want help with ideas for resources, after I provide the links online. People can also feel free to contact me here at the studio; we have lots of very intelligent transfer engineers who work with these technical processes every day. So it does look like we have one last-minute question, from Catherine. She's curious: any thoughts on AIFF versus WAVE for audio files? To be honest, I'm not sure how to answer that. I suppose AIFF is potentially a proprietary format, and WAVE is more adaptable. I think AIFF is specifically for Mac, and you can run into some issues when you have a proprietary file format, just in terms of transfer, conversion, and moving things between different computers. Okay. Well, in addition, folks can feel free to email info@heritagepreservation.org.
I've also posted my email, and we can forward your questions along to Stephanie if anything comes up. And again, I've been tracking all of your fantastic links, and Stephanie has more links to give me for resources, so if you go back to the webpage, I will attempt to add those as quickly as possible. So it looks like we are done. Again, the deadline for all the homework assignments is one week from today, November 6th. Thank you all so much for logging in and coming with us over the past couple of weeks, and I have to thank all of our speakers who aren't necessarily here right now: you were all fantastic. And Stephanie, you were fantastic as well. Oh, thank you very much for having me; I appreciate you all attending my session. Yeah, and also CCAHA, thank you guys so much, and thank you Laura for being on board to help us with all those questions. Great, it's been a pleasure working with you. All right, everyone, have a fantastic evening and a happy Halloween.