Hello, I'm Sam Long, the new director of technology and preservation at the Bay Area Video Coalition. This is my first time at AMIA, so that's very exciting. Today is a series of firsts: we released QCTools 0.6 today, which is very exciting for us, and I also get the chance to introduce Dave Rice, which is a first for me, and I'm very excited about that too. I'll keep going.

QCTools is very much the result of hard work from many individuals and organizations, both in this room and not with us today but probably still working right now, as we speak, to make sure that 0.6 actually works for you. Previous generations of BAVC employees have played a role in a variety of ways, particularly Lauren Sorensen, whom I had the opportunity to listen to earlier and hope to introduce myself to properly. The Dance Heritage Coalition was critical in supplying us with video examples for testing in the early stages, and so was the rest of Dave's technical team, who aren't able to be here today. Alright, I'll continue.

I think one of the major themes I've found through this stream over the last several hours really speaks to what QCTools is for us, and hopefully for you: the goal of being committed to a community, and not only committed to a community but engaging a community that actually wants to collaborate, because that is the essential key to creating an open source tool. There's also our determination to release early and often, as was discussed earlier in the open source panel. Do I have time for anything else? No. On that note, I'd like to introduce Dave Rice, with some music.

Hi, everybody. There was a panel last year about QCTools, when the project was about eight months into its roadmap. Now it's a two-year project and we're almost done, wrapping up the demo phase, so we've got to show what we've got. And really, in this panel, I'd like to encourage you to give us feedback to help guide the remaining work on this project. Sam probably mentioned BAVC, NEH, QCTools; these are the two main websites, the one at BAVC and the one at GitHub, where all of our content has moved.

One of the big, frustrating parts of this is the huge dilemma that we have content on these plastic rectangles that we need to move into shapeless digital media. We've got to do it in a big hurry, and it would be good if we knew we were doing a good job as we went, so that we could trust the results and save as much content as we can. Now, the broadcast world has a different set of resources, different sorts of tools accessible to them, so often we need to use different resources to make tools specifically for archivists. QCTools is that project; that's what I'm talking about.

Shout out to Angelo, who's not here today; he was formerly at BAVC. This whole project was based on conversations that arose out of a panel that me, Skip Elsheimer, and Angelo did maybe five years ago about quality control. We shared lots of video and tried to explain why it keeps some people up at night: these random glitches and problems that we find and have to try to understand. Angelo and I ended up writing a grant, and NEH rejected us and sent some feedback.
We responded to all their feedback, made a second draft, submitted it, and got rejected again. And then, Angelo having moved on by that time, BAVC kind of rewrote and redesigned the approach to the project and submitted it successfully the third time, which was really funny.

So the project, as you've heard about GitHub, has all moved into GitHub now. We don't have anything to hide; whether it's good or bad, it gets thrown in there, and occasionally, at moments when our confidence level is slightly higher than usual, we'll make a major release, like we did this morning with 0.6.

This is the team that's actually contributed code so far. Me, Dave Rice, in the upper right. Jérôme Martinez is a pretty crucial developer on the project; he's best known as the principal author and developer of MediaInfo. Devon Landes, who presented with me last year, coordinates a lot of the documentation efforts, and Ben Turkus at BAVC has started pushing in documentation as well. Ashley Blewer, you know, the one who just took off on us, did a lot of the design work, like the artwork for the pause and play buttons. And so far Erik Piil is the first one outside the development team who's contributed, who's sent in pull requests, which were mostly based on snarky conversations about YUV and YCbCr. Outside of this list, a big part of QCTools is a filter called signalstats, which was moved into FFmpeg, so there's a different set of developers on that side: Clément Bœsch and Mark Heath, in France and Australia, who do a lot of the signal analysis to take video in and get metadata and statistics out that we can use in the plots.

In addition to the coding, a big part of the project is research. This is Erik Piil. We destroyed a D5 tape and tried to play it back, to figure out how it fails. There are a lot of formats where we don't know how they fail in particular. We assumed D5 would fail like DigiBeta and give you sparkles all over the place in nice little grids, but it fails by just repeating a constant color in a grid shape.

Here's U-matic failure, along with the tools necessary for the job: we've got the tape, wine, a tape measure, and, you know, an iron. So this time we made a born-digital animation combined with a little live-action open-access video, recorded seven minutes to U-matic, captured it cleanly, and then destroyed the tape very methodically. Every 40 inches we would do a different kind of damage, going from likely damage, like tape scratches and crinkles, to unlikely extremes, like when a tape has been ground into the asphalt; we got that covered too. That's my dental kit at the bottom, so we could use those tools for different kinds of scratches. And this is me trying to get the heavily damaged tape to actually play. For one part we had this thing like a very concentrated hair dryer that we used on the tape; it folded it in and shrank it, so in the clip you see a massive loss of stabilization and the whole image shaking around as the deck tries to play past that.

The research was kind of concentrated in this feedback session at BAVC. This is the crew, you know, the QCTools crew.
We all gathered at BAVC and spent two days working over what we had, the samples we could bring together and gather. That focused a lot of the workflow, the UI, and the priorities of the project. I think this happened a year or so ago, and a bunch of this crew is here.

Alright, now to shout out FFmpeg. FFmpeg is an enormous building block for this project. It provides all the decoding, so we don't have to write our own JPEG 2000 and uncompressed decoders; we rely on FFmpeg to decode all the video. But one of the contributions we got into FFmpeg to support the project was a filter called signalstats, which takes video in and gives metadata out: information about the YUV values, like their average, maximum, and minimum, as well as quantifications of certain visual qualities that are not typical to see in analog video unless there is something going on. For instance, counting the number of lines that have near-identical luma as you go from one line to the next. It's not common for analog video to do that, because there's always some noise making each line distinct, but a dropout compensator in a time-base corrector can do it by just repeating the same line over and over. Another issue we look for is temporal outliers, which means comparing a pixel to its temporal neighbors in the previous and next frames; if it's too distinct from those neighbors, we call it a temporal outlier and tally it. That usually works well to flag frames that have skew or crinkles, anything that puts white speckle all over your image.

We weren't sure if this would work out or not, but we worked with the FFmpeg community, who have a very rigid set of specifications and styles and how they like things, and we were able to get the filter into official FFmpeg, which definitely helps with project sustainability. It also helps with scaling up this kind of quality control, because people can now use FFmpeg directly to generate all the statistics and then potentially use QCTools just to view them. Generating the statistics is the more time-intensive part of the process, because you have to decode all the video and analyze it to produce the reports. But archives that are processing video all day can potentially have computers loop through all the video overnight, using FFmpeg to produce these reports.
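As a rough sketch of what that kind of overnight batch run could look like (not from the talk; the file name and the flagging thresholds here are made up, but the ffprobe invocation and the lavfi.signalstats tag names are FFmpeg's documented interface):

```python
# Sketch: pull per-frame signalstats values with ffprobe. Assumes ffprobe
# is on PATH; "input.mov" is a placeholder file name.
import csv
import io
import subprocess

cmd = [
    "ffprobe",
    "-f", "lavfi",
    # movie source + signalstats, with the optional tout/vrep/brng measurements enabled
    "-i", "movie=input.mov,signalstats=stat=tout+vrep+brng",
    "-show_entries",
    "frame_tags=lavfi.signalstats.YAVG,lavfi.signalstats.SATMAX,"
    "lavfi.signalstats.TOUT,lavfi.signalstats.VREP",
    "-of", "csv=p=0",
]
out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

for n, row in enumerate(csv.reader(io.StringIO(out))):
    if len(row) < 4:
        continue
    yavg, satmax, tout, vrep = (float(v) for v in row)
    # Flag frames the way the talk describes: high TOUT suggests speckle or
    # dropouts, high VREP suggests repeated lines from a dropout compensator.
    # These thresholds are arbitrary placeholders, not QCTools defaults.
    if tout > 0.01 or vrep > 0.1:
        print(f"frame {n}: YAVG={yavg:.1f} SATMAX={satmax:.1f} TOUT={tout} VREP={vrep}")
```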
On GitHub, QCTools has an issue tracker that we've enabled and tried to start encouraging people to use. We've got 27 issues in there so far, about half of them closed, but you can see what other people have identified as issues with the software, and you can contribute your own. That includes bugs (if you manage to crash the application, we'd like to know how you did it), but also wish-list items and enhancement suggestions, to help guide the remaining resources we have in front of us.

To introduce you to the UI, there are three main layouts. This is the most prominent one, the one you get to first; it's called the graph layout. The signalstats filter analyzes the video, computes values for each frame, and then plots them over time. In this example we're seeing four different plots: a plot at the top for the luma, the Y channel; then plots for U and V, the two chroma channels; and at the bottom we're plotting saturation levels.

I don't know if you can see the lines too well here, but the saturation, for instance, peaks at 181, which is the maximum on our scale. If you were looking at a vectorscope, from the center to the side is a distance of 128, so by Pythagoras it's 181-point-something to the corner; that's the maximum. If you have a pixel in the corner of your vectorscope, that's maximum saturation, and those levels are illegal colors: they don't actually convert back to RGB without causing a negative value or an overflow. Identifying them is important because it's not likely that naturally recorded video gets there. (Hey, that's the screen acting up, not the video; we didn't have a lot of prep time.) If the video were black and white, for instance, the saturation levels would be zero all the way across.

For each of these plots you'll see five different lines: the minimum, the maximum, the average, and two called high and low, which are plotted by excluding the highest and lowest 10 percent. Often in your video there will be some random pixels that just don't behave and are toggling to extreme values, and if you want a general impression of how light or dark a frame is, or how much contrast it has, you need to exclude those outliers so they don't throw off the whole reading.
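To make those two numbers concrete (the 181 ceiling and the 10-percent-trimmed high and low lines), here's an illustrative sketch; this is not QCTools' actual code, and the exact trimming signalstats does may differ:

```python
# Illustrative only: the geometry behind "181" and a 10%-trimmed high/low.
import math
import numpy as np

# Saturation here is distance from the neutral point (128,128) on the U/V
# plane. Mid-edge of the vectorscope is 128 away; the corner is the diagonal.
print(math.hypot(128, 128))  # 181.019... -> the 181 ceiling in the plot

rng = np.random.default_rng(0)
y = rng.normal(110, 20, size=100_000).clip(0, 255)  # fake luma samples
y[:50] = 255  # a few "misbehaving" pixels pinned to an extreme value

print(y.min(), y.max())            # min/max get dragged around by outliers
print(np.percentile(y, [10, 90]))  # "low"/"high": top and bottom 10% excluded
print(y.mean())                    # the average line
```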
The tool also has a player that lets you play back the video with a bunch of different filters, multiple at a time. Typically it opens in this two-filter view: on one side there's a filter just called "normal," which is the plain presentation of the video, and on the other side, in this case, the vectorscope is enabled.

And this is the newest layout we're developing, just called the list layout. For each file that's open, it gives you some overall statistics that indicate how dropout-y the file is, to what extent it has pixels out of broadcast range, or how off-center the chroma is, like how many frames contain illegal or out-of-broadcast color values. So if you digitize a large collection, you can potentially use this to find the worst of your work, to focus on the material that deserves to get redone, and to put your attention on the places where there's likely an error rather than skimming through a collection randomly to find problems.

Here's a couple of examples that Michael Angeletti sent me this morning from QCTools. You can see here that all the lines pretty much go straight across, but then there's a massive spike where the maximum and minimum chroma values go pretty far out from the center. When he would look at that frame, this is what he would see. A lot of people ask how to actually use QCTools initially, and like I said, it's mostly this: look for things that are out of place in the graphs, a spot where the graph is having a crazy, abnormal moment, and try to figure out why. In his case, he had updated the firmware on one of his capture devices, one piece of hardware in the digitization chain, and the next day this happened three times in three different videos. With QCTools he was able to find all three of them, redo the work, and address the problem. With more traditional QC approaches, where you're just randomly spot-checking across the video, this stuff is easy to miss because it's one frame; but if you let these one-frame errors happen consistently over time because of a poorly behaved new piece of hardware, you will be sorry one day.

This is another error he noticed. The bottom plot here is the temporal outliers, which is counting white speckle, pixels that are inconsistent over time. You can see there's this low rumble that happens here, where the video is otherwise pretty peaceful, and when he looked at the video in those places, the top third of the image is very unstable. He knows digitization much better than I do; his expertise, much more than mine, is what to do to actually fix the problem, and he was able to digitize the tape without this issue on the second pass.

So that's the slideshow part; now I'm going to cut to the demo part, and then I'd be happy to take questions. We're only half of our panel, so I'll try to be quick.

Alright, so there's been a focus on making this very accessible and cross-platform, so we developed the GUI in Qt, an open source framework released by Nokia, which is good for developing once and then releasing on multiple platforms. So if you go to bavc.org, you can get versions for Linux, Mac, and Windows pretty easily that you can hopefully get onto your computer.

First up, I'm going to walk through some of the player filters. I was setting up a digitization station in Seattle at one point, and they had two Betacam decks. I figured I'd be clever: I'd put a Betacam tape in one deck to digitize it, then put it in the other one and make sure the results were about the same. But they were not at all the same, and at first I had a really hard time figuring out what the problem was. I noticed that when a scene was very saturated, like this very bright yellow hat, it looked kind of striped, and in the player I could see this more clearly. There's a field split button at the top of most of the player filters, so instead of seeing the image with the fields combed together, you can see field one on top and field two on the bottom. With this kind of view it becomes much more obvious that field two is sending out no chroma at all. When the fields are together, it just looks a little less saturated, a little less colorful, and it's difficult to see on the computer that half the color of the image is missing.

I made a sample of the other scenario, where there's no color on field one, and I noticed that when you transcode it to lower chroma subsampling, like for the web (I transcoded it to 4:2:0 in H.264), the result had no color at all, because the subsampling conversion only takes chroma from field one. So this can be a very deceitful problem: it looks enough like there's color that somebody might let it pass, but when you try to transcode it, you only get black-and-white output.
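Here's a hypothetical sketch of that field-split chroma check, assuming planar 8-bit chroma arrays where even rows belong to field one and odd rows to field two, with 128 as the neutral, colorless value; QCTools does this visually, not with code like this:

```python
# Hypothetical check for "one field has no chroma": split a planar frame's
# chroma rows by field and compare how far each deviates from neutral.
import numpy as np

def field_chroma_energy(u: np.ndarray, v: np.ndarray) -> tuple[float, float]:
    """Mean absolute deviation from neutral chroma (128), per field."""
    dev = (np.abs(u.astype(int) - 128) + np.abs(v.astype(int) - 128)) / 2
    return float(dev[0::2].mean()), float(dev[1::2].mean())  # field 1, field 2

# Toy frame: field 1 carries color, field 2 is stuck at neutral gray.
u = np.full((486, 360), 128, dtype=np.uint8)
v = np.full((486, 360), 128, dtype=np.uint8)
u[0::2] = 90   # even lines (field 1) pushed off neutral
v[0::2] = 170

f1, f2 = field_chroma_energy(u, v)
print(f"field 1: {f1:.1f}, field 2: {f2:.1f}")  # a big gap -> one field missing chroma
```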
On the right we see the vectorscope, sorry, the waveform, and all these filters have contextual options. I think last year I showed the filters with no options at all, you just saw what you got, but now we can control them a bit more. So I can see the luma or the two chroma planes separately, or all three presented on top of each other, or separated by field. Here I'm seeing all three signal components of the image: field one on the left, field two on the right, Y, U, and V going down. You can see the luma is about the same on both sides, but the chroma is very distinct on field two; there's almost no data there. And you can see that, in general, the luma uses almost the full range of what the digital sample can contain, while the chroma is usually condensed into the center, using a very small range. But I've found that for diagnosing some problems, it's helpful to split those ranges out and stretch them. This is what the video looks like when you look at one chroma plane on its own, just the Cr channel, so you can see what's happening on the two sides. On field two I'm getting a lot of data at the center, of course, but instead of color I'm just kind of getting this noise pattern over time.

We have a line select filter, because a lot of waveform monitors do that. One of the people who've seen this suggested we put in a means to see which line we're plotting. So if you click the background option, you'll see the image underneath the line select, and you can adjust the brightness of the plot. I don't know if you can really see it, but there's a yellow line moving up and down, and that shows which line I'm plotting in the waveform.

Here's the vectorscope. If I go back to the graphs and look just at the saturation graph, I can see where the video is particularly saturated, like this abnormal purple that shows up, then correspond that to the vectorscope and let it play back to see these weird loops and patterns that happen because of the non-ideal state of the hardware. In this case it's not the tape in particular that's causing the problem; it plays fine in another deck, but this one particular deck produces these kinds of irregular patterns.

I'm going to open up a born-digital video, and then the analog version of the same thing, just to show you a couple more filters, and then I'll take questions. In this video there's a test pattern at the beginning, then just kind of a blue frame, and then live-action video from then on. At the beginning, since this is born-digital animation, the waveform is extremely thin and patterned: this is 100% color bars and a scrolling gradient, and this is what that looks like on the waveform. On the vectorscope, the scrolling gradient covers pretty much the boundary of the YUV color space. Anything beyond this hexagon is illegal color that you can't convert back to RGB properly, but everything inside is where video will normally have its image.

I also wanted to show off the bit plane filter. Video is encoded at a certain bit depth, for instance 8-bit, and with the bit plane filter you can isolate the individual bit positions and play them back individually. This is just looking at the first bit, but I can go up to the second, third, fourth, fifth, sixth, seventh, eighth.
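The idea behind the filter is simple to sketch. This illustrative version (not QCTools' implementation) numbers the bits the way the talk does, with the first bit as the most significant:

```python
# The idea behind the bit plane filter: keep one bit position of the luma
# plane and blow it up to full contrast. bit=1 is the most significant bit
# here; bit=8 is the least significant, which on analog captures looks like noise.
import numpy as np

def bit_plane(luma: np.ndarray, bit: int) -> np.ndarray:
    """Return an 8-bit image showing only the chosen bit (1=MSB .. 8=LSB)."""
    mask = 1 << (8 - bit)
    return ((luma & mask) > 0).astype(np.uint8) * 255

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(480, 720), dtype=np.uint8)  # stand-in luma plane
msb = bit_plane(frame, 1)  # broad light/dark structure
lsb = bit_plane(frame, 8)  # near-random for noisy or analog-sourced video
print(msb.mean(), lsb.mean())
```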
So this is what the least significant bit of the luma channel looks like. You can see, even viewing this, that it's structured and clean, because it's born digital. If I flip over to the analog version of this, which is where we recorded it out to U-matic tape and digitized it back, you can see the waveform is very different. It's much messier and sloppier: instead of one very particular luma value per column, there's a wide range of maybe fifteen values in these fuzzy lines you see. If I go to the bit plane filter, by the second bit there end up being a lot of differences between the original born-digital copy and the analog one; things get fuzzier and noisier very quickly. This is the fourth bit, and there's already a lot of noise from the analog carrier affecting the image. If I go down to the seventh bit, it's barely recognizable, so many of the values have changed from the original copy, and at the eighth bit it's about the equivalent of random data.

We're short on time, but this is sort of a side effect we found in the project: we can sometimes use these filters to detect problems. This is looking at the chroma planes individually, but instead of leaving them very compressed as they are, stretching them far out. When you watch video this way, you can identify the different ways lossy codecs throw away data. For an analog capture, this would look very noisy, but for compressed data (this is H.264) you end up seeing these distinct patterns. Let me pick an MPEG-2 file so you can see that; it's a test file we made. This is a shot of some leader; is Tommy here? I could talk all about his media now. It's going to be difficult to see from where you are, but this is looking at the chroma plane of this encoding. This was digitized off of a digital media player as a lossless file, but if I look at the chroma patterns, I can see these patterns of MPEG-2 compression, this very square-shaped macroblock pattern in the video. When you look at DV, MPEG-2, H.264, lossy codecs like this, you see distinct patterns based on the wider brick shape of DV codec blocks, or the square MPEG-2 macroblocks, or the more flexible, fluid patterns of H.264.
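You can approximate that stretched-chroma view with plain FFmpeg tools. This is a hedged sketch, assuming a placeholder file name; QCTools' own filter chain may differ:

```python
# Approximate the "stretched chroma plane" view: pull out one chroma plane
# and stretch its contrast so codec block patterns stand out.
# "input.mkv" is a placeholder; requires ffplay on PATH.
import subprocess

subprocess.run([
    "ffplay",
    "-vf", "extractplanes=v,histeq=strength=0.2",  # V (Cr) plane, contrast-stretched
    "input.mkv",
])
```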
Yeah, so that's the project. I should stop there to take questions, and I really would like to encourage feedback. I'd like to get you to go to the GitHub site, potentially after testing this, report issues, comment on existing ones, or join the discussion somehow. Yeah, thank you very much.

First of all, really beautiful work. Thank you so much; this is really going to be great for our community. What I was wondering, though: those of you who have been working on the project sort of know what normal might look like. Is there any sense that somebody will build basic instruction for people starting out, so that we might understand what the tools are and what the normal output might be for the multitude of tools you've given us?

Certainly, I agree. Yeah, we've made some attempts at video tutorials but need to make some more serious efforts there. We do have help documentation in here: there are six articles currently that walk through the different filters and options and give a couple of hard-coded examples of what normal and abnormal look like. But back to my initial advice: open a video, look for inconsistencies or incoherencies in the graph, and then try to play over them to find out why that issue is happening. At the bottom there's this thumbnail view, which we made so you can see nine frames at the same time. So if you see a place where, say, the color all pinches to the center at some point, you can see in the thumbnails at the bottom what the frames are doing right at that spot. But yeah, documentation: there's some of it in here, these six HTML pages that give an overview, and we definitely need to produce more. Now that the software is settling, that's definitely our priority over the next couple of months, some kind of education into how video is built and how to interpret what we're seeing here.

Have we got to wrap? No, you're actually the first speaker in the next presentation. The lightning talks have to get their stuff together, so we need to wrap. Alright, thanks for the leading question. Alright, thank you.