Our next presentation is by Kitware's Matt Dawkins, talking about the accessibility of open source AI platforms for marine science; Anthony Hoogs is also on the panel today.

All right, thank you for inviting me to speak here today. My name is Matt Dawkins and I'm a software engineer at Kitware, a small software company headquartered in upstate New York. I'm here today to talk about VIAME, an open source toolkit that we're currently developing alongside NOAA Fisheries and a couple of other collaborators. For those of you not familiar with Kitware, we make open source software in a few areas. Being open source, most of the software is free to use and available on sites like GitHub. Our well-known software includes CMake, VTK, and ParaView. Anyone who has built other computer vision software like OpenCV or PyTorch has probably encountered CMake, which is used to generate cross-platform builds on Windows, Linux, or Mac, but it's also widely used outside of computer vision. We're headquartered in upstate New York, but have offices in North Carolina, DC, New Mexico, and Lyon, France.

So what is VIAME? Over time it has evolved into a do-it-yourself toolkit that biologists can use to generate their own machine learning models with little or no programming experience, in either web or desktop interfaces. We have recently also added support for an online repository for storing imagery and related annotations, with additional support for training models. Some of its current use has been on the first three items on the left of the below figure, particularly object detection and tracking, but more recently we've been expanding its capabilities into stereo measurement, image enhancement, registration, 3D model generation, and a couple of other fields. Lastly, it provides tools to assist in the evaluation of different algorithms.

We've been applying VIAME to a diverse range of data and challenging problems. For example, here are images from underwater downward-facing cameras pointed at the sea floor. Many of these problems require the collection platforms to apply their own illumination, which necessitates running image enhancement prior to automated detectors (a minimal sketch of one such enhancement step appears after this overview). Aerial surveys are also fairly popular; you can see a diverse range of climates here with different types of seals and sea lions, all the way from the Arctic at the top to warmer climates in the middle, for example off the coast of San Diego. Scene segmentation has also come up with a few groups, such as on the right, for classifying land cover into different categories. Outward-facing cameras are a bit more diverse than the others, with a large range of different sensors targeting fish populations around the world. Tracking objects frame to frame to avoid duplicate counting typically becomes more important here, but that depends on the application. These might be underwater, on board fishing vessels for electronic monitoring, or just GoPros strapped to penguin heads.

Just to give you a feel for the assortment of platforms collecting this imagery, there's a large variety. Some are towed vehicles with a fixed cable to the surface, and others are more autonomous, such as AUVs. The aerial ones we've worked with so far have been mostly manned fixed-wing aircraft, with a small number of UAVs we've gotten involved with.
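As a rough illustration of the enhancement step mentioned above, here is a minimal sketch, in Python with OpenCV, of applying contrast-limited adaptive histogram equalization (CLAHE) to artificially lit underwater imagery before handing it to a detector. CLAHE and the function name `enhance_underwater` are illustrative assumptions for this sketch, not necessarily the specific filter VIAME applies.

```python
# Minimal sketch: CLAHE as a pre-detection enhancement step for
# artificially lit underwater imagery. Illustrative only; not the
# exact enhancement filter used in VIAME.
import cv2

def enhance_underwater(path_in: str, path_out: str) -> None:
    bgr = cv2.imread(path_in)
    # Work in LAB space so only the lightness channel is equalized,
    # preserving the image's color balance.
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    l = clahe.apply(l)
    enhanced = cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)
    cv2.imwrite(path_out, enhanced)
```

The idea is simply that locally adaptive contrast correction recovers detail washed out by uneven platform lighting, which tends to help downstream detectors more than global histogram equalization would.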
Planning for VIAME began in late 2015 with the formation of NOAA's AISI committee, and development picked up a bit more in 2016 and 2017. AISI was created with a representative from each fisheries science center around the US, plus a couple of members from industry and university partners. VIAME wasn't the only product funded by AISI, and it was originally designed as an integration platform. Over time it evolved into a full do-it-yourself AI toolkit with multiple workflows for generating different types of classifiers or detectors. The most traditional approach is in the middle: the user has some raw data, they do a lot of manual annotation on it, then train a model. The second is when you already have an existing model similar to what you're trying to do, either from someone else's work or a first-pass model trained on your problem; in these cases, it might be more helpful to run the existing model and correct its annotations, as opposed to making them from scratch. The last workflow involves low-shot learning and video search to rapidly generate a model.

There are different types of annotations that the annotator interfaces support: box level, pixel level, frame level, and key points. Each is useful in different circumstances, depending on the application and the user time available for annotation. Additionally, we have now developed a couple of different interfaces. The newest, DIVE, is shown in this video and can be run either on the web or on the desktop. DIVE supports multiple types of annotation in addition to model training on multiple sequences, and we now have a web version of DIVE hosted online at viame.kitware.com that contains a couple million annotations. Then there are some more specialized desktop applications in addition to DIVE, including SEARCH for rapid model generation and SEAL for multimodal annotations, for example on optical and thermal imagery simultaneously. And lastly, we have project folders for bulk processing data from batch scripts outside of the interfaces.

I'm now going to give you a very quick feel for some of the capabilities in VIAME, but I'm not going to linger on these slides too long. On the detection side, we've done some high-level integration of algorithms in wider use in the computer vision field, and then some optimizations on top of them, such as fusing motion information or depth maps into these detectors, and similarly ensemble classifiers for taking multiple detectors and running them in parallel. This is a quick example of our interactive learning for rapid model generation: the user provides only a few samples of the target, the system searches a database for additional things that match that target in terms of appearance similarity and returns those results to the user, and the user annotates what was right and wrong to rapidly generate a new model for the query of interest (a minimal sketch of this kind of appearance search appears at the end of this section). Image enhancement comes up a lot underwater, and one of the things we've been doing is building 3D models based on camera calibration and stereo cameras, trying to fuse those into detectors for improved performance on buried objects, such as flatfish that might be covered in sediment.

Registration comes up in a few domains for aligning imagery, especially aerial, where we might have lower-frame-rate data, like one-hertz data, that might only overlap by 50%, but you still don't want to double count different seals or sea lions (a registration sketch also follows below). Object tracking also comes up in a few domains, and one object tracker doesn't always fit all problems. So we typically have one tracker dedicated to fish-type problems that runs on higher-frame-rate video, like 10- or 15-hertz data, but then we might have lower-frame-rate, registration-based trackers for some of those aerial sequences.
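To make the interactive-learning loop described above concrete, here is a minimal sketch of the appearance-similarity part: average a few exemplar embeddings into a query vector, rank a database of candidate embeddings by cosine similarity, and return the top matches for the user to confirm or reject. The embedding model itself and the function name `rank_by_similarity` are assumptions for illustration; this is not VIAME's actual implementation.

```python
# Minimal sketch of query-by-exemplar appearance search.
# Assumes detection embeddings have already been computed by some
# feature extractor (out of scope here).
import numpy as np

def rank_by_similarity(exemplars: np.ndarray,
                       database: np.ndarray,
                       k: int = 50) -> np.ndarray:
    """exemplars: (m, d) embeddings; database: (n, d).
    Returns indices of the k most similar database entries."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)
    q = normalize(exemplars).mean(axis=0)   # average exemplar embedding
    q /= np.linalg.norm(q)                  # unit-length query vector
    scores = normalize(database) @ q        # cosine similarity per candidate
    return np.argsort(scores)[::-1][:k]
```

In the actual workflow, the user's confirm/reject labels on the returned candidates then become training data for a refined model, which is what makes the loop "rapid": a usable detector can emerge from a handful of exemplars rather than thousands of hand-drawn boxes.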
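And for the double-counting problem on low-frame-rate aerial imagery, here is a minimal sketch using OpenCV: estimate a homography between overlapping frames from ORB feature matches, warp the previous frame's detection centers into the current frame, and drop current detections that land too close to an already-counted animal. The function names and the simple center-distance test are illustrative assumptions, not the exact VIAME pipeline.

```python
# Minimal sketch: feature-based registration of overlapping aerial
# frames, then suppression of re-detected animals. Illustrative only.
import cv2
import numpy as np

def register(prev_gray, curr_gray):
    """Estimate the homography mapping prev-frame pixels into curr frame."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(prev_gray, None)
    k2, d2 = orb.detectAndCompute(curr_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

def deduplicate(prev_centers, curr_centers, H, radius=20.0):
    """Drop current detections within `radius` px of a warped previous one."""
    warped = cv2.perspectiveTransform(
        np.float32(prev_centers).reshape(-1, 1, 2), H).reshape(-1, 2)
    keep = []
    for c in curr_centers:
        if np.min(np.linalg.norm(warped - np.float32(c), axis=1)) > radius:
            keep.append(c)
    return keep
```

A fixed pixel radius is a crude stand-in for proper track association, but it conveys why registration alone goes a long way toward preventing double counting when consecutive frames overlap by around 50%.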
For measurement, similarly, we have two different approaches: one for when you have lots of annotations, such as head and tail points on what you're trying to measure, and a non-deep-learning approach for when you don't want to rely on having lots of annotations. The same thing is true for full-frame classification, where we have a rapid SVM-based model that can be trained on few annotations, and then a standard deep learning model, just ResNet-based, for when you do have lots of annotations for the problems of interest. People there are typically looking for full-frame properties, such as background information or whether or not an event is happening in a video, as in those penguin feeds. Lastly, we have embedded processing: a lot of people are fine taking their data back to their lab, but we've also started doing a bit with detectors directly downstream of the camera systems themselves, running on embedded, low-SWaP hardware.

Some of the future steps for VIAME include trying to get it working on more types of data sets, for example acoustics. Lastly, I just want to thank NOAA for funding VIAME, which has been instrumental, and the image annotators across multiple organizations who have contributed annotations to it. Thank you.

Thank you for sharing, Matt. We have a question for you and Anthony. I guess that you should get the prize for fitting the most into a five-minute talk. There's a lot of work to be done to tease apart all of the questions; I'll leave one of the technical questions to Matt, and I'll take a 30,000-foot stab at a question for you. You mentioned the AISI community and how it has helped the development of the system for teams in America. Can you give us some insight into how that rolled out, what worked and what didn't, who funded it, and how you managed to get people to work together and talk together, and so on. Just a little bit of insight into how that worked out.

Thank you. So, thanks, I guess, for inviting me here today. The AISI committee was founded with one representative from each of the fisheries science centers. I was on the original committee, as was Anthony, who's also on this call. That certainly helped begin the process of getting annotations; in the beginning, a lot of people didn't have many annotations for their problem, though some did. When you're doing traditional deep learning methods, you typically need a lot of annotations; it's not a small amount when you're coming into a new problem. So one of the original goals of the AISI committee was to connect the people who had a lot of annotations with those who didn't, who were just starting out for the first time. I think having contacts there really helped a lot in bringing together people who had lots of experience with a problem and those who didn't, who may have come into this raw, having never heard of convolutional neural networks or deep learning. Yeah, I think that was very helpful.

Anthony, can you add to any of that, about the history of how the community took off and how that helped?
Happy to. Back in 2014, NOAA organized a workshop at the National Research Council, which is part of the National Academies of Sciences and Engineering here in the US, to try to bring together the computer vision community with NOAA Fisheries researchers. Even back then they knew that computer vision could help them; they had people dabbling in this across the different science centers, and they wanted to coordinate those efforts and bring in researchers who knew how computer vision works. This was well before computer vision and deep learning popularized AI around the world, and it was really ahead of its time. NOAA started this initiative, which came with funding from NOAA to give out to various researchers; some of that came to Kitware for VIAME development, some went to UC San Diego for CoralNet development, and the rest went to different universities and labs. This was coordinated and allocated through the AISI, so it was really very visionary. Because of this, the NOAA team that put this together and ran it for years under Ben Richards (who was to be one of the co-panelists but is not here today) won the Gold Medal for scientific achievement, the highest award in the US Department of Commerce, back in 2019 for this effort, because it really predated this huge wave of AI usage and acknowledgement in fisheries. NOAA was way ahead of the game here; now VIAME exists and fisheries scientists are using AI and computer vision, some of them for years. So I really tip my hat to these NOAA folks who put it together, and I thank them again for funding Kitware to do this work; it was really quite visionary.

I'm going to hand over to Anton, and then Matt will probably hit you with a real technical question. Thank you. Sorry, I have no real question here. Okay, Matt, go ahead.

Yeah, kind of a technical question. What's the biggest challenge that you faced rolling so many dimensions into one package? You really have coverage of many different aspects, from thermal infrared to SLAM to image stitching; you've got so many components in that one piece of software. It's such a mountain I think you've managed to climb, and putting it together and making it accessible to users as well is something really remarkable. How did you approach doing that?

I definitely think that's one of the challenges we've faced. Typically, there's not one workflow that works for everyone from the get-go. A lot depends on how long the user is going to sit down with a particular problem and how long they're going to spend doing annotations; that might control which type of workflow they want to use. In their lab, for example, there might be four people who want to do co-annotation at the same time, in which case a web solution might be better; or if it's just one researcher who has access to GPU resources, they might be fine working on their desktop. That's sort of what's led to us having a lot of different workflows for these algorithms and adapting them to some of these different problems.

Sorry, and what's been the user uptake? How many users do you think you've had using it so far?

Probably an estimated couple hundred right now. I think on our online annotator we have roughly 400 users who have signed up. A lot of those people just signed up to get annotations and are not active users, but looking at the past 60 days, it has been about 320 or so active.
I don't actually have a feel for how many people have downloaded the desktop version; there are no statistics for that, but I'd guess it's something similar.

To follow up on the question about adding all this functionality in: as I mentioned, VIAME has been under development for five years or more. We maintain the capabilities that are there, but new capabilities come along, and when there's enough interest across NOAA centers, we use the NOAA maintenance and improvement funding to add them in. So when there's a capability, let's say stereo reconstruction from stereo cameras producing 3D depth maps, that might be of interest to multiple centers, we might add it. We also have additional efforts that are funded through NOAA centers directly, or by anyone else who wants to pay for something to be added to VIAME. We've had a few of these where we've added in significant detector enhancements for detecting certain kinds of items in the sea, like scallops. So the capability keeps growing, and we try not to delete things or let them become obsolete and stop working; things just keep getting added in. One thing we don't have is reinforcement learning, and it would be interesting to put something about that in there for various problems, but that's on the to-do list.

Well, thanks very much for that. I think we