Good afternoon, everyone. I am Alina Semo, the Director of the Office of Government Information Services, known by our acronym OGIS, a part of the National Archives and Records Administration. I am very happy to be welcoming all of you today. I just want to share a personal note: today is also my 10th NARA anniversary. I joined the National Archives exactly 10 years ago today. It is my absolute pleasure to welcome all of you to our 13th annual Sunshine Week program, as we hope to shed light on the importance of artificial intelligence, open government, access to government records, and the Freedom of Information Act. For those of you who are watching us via the NARA YouTube channel, we are very happy that you are able to join us virtually today. This week we celebrate Sunshine Week, an annual nationwide initiative in the United States that promotes open government and transparency, and that coincides with both President James Madison's birthday and National Freedom of Information Day on March 16th. Sunshine Week was launched in 2005 by the American Society of News Editors, and as of December 2023, Sunshine Week activities are coordinated by the Joseph L. Brechner Freedom of Information Project at the University of Florida College of Journalism and Communications. Sunshine Week strives to raise awareness about the importance of access to public information and the need for government transparency. The goal is to encourage public dialogue, engage citizens in discussions about open government, and advocate for policies that promote transparency at all levels of government. The title Sunshine Week is derived from the idea that sunlight is the best disinfectant, emphasizing the role of transparency in holding governments accountable and ensuring a well-informed citizenry. Before we begin, I have a few housekeeping items for those of you who are in the McGowan Theater with us today. First, please remember to silence your cell phones.
Second, if we need to evacuate the auditorium, please follow the exit signs on either side of the auditorium. And if time allows and our moderator allows, we will have a question and answer period following the panelists' remarks. Please plan to use the microphones on either side of the auditorium to ask your questions. With that, it is my pleasure now to introduce our Deputy Archivist of the United States, Jay Bosanko, who will provide opening remarks to get today's program started. Thank you. Thank you, Alina. Hello, everyone. I'm Jay Bosanko, Deputy Archivist of the United States. I am pleased to welcome you to our 13th annual Sunshine Week program, whether you're here in the William G. McGowan Theater or joining us virtually on the National Archives YouTube channel. Since 2012, the Office of Government Information Services, or OGIS, has spearheaded the annual Sunshine Week program here at the National Archives in order to shine a light on the critical importance of access to records and information. Since then, we have witnessed the transition in federal records management from analog to fully digital record systems, and this is just the start. Artificial intelligence, or AI, will undoubtedly have a profound impact on the way the government conducts business, and I am thrilled to welcome the members of today's panel for an engaging and informative discussion on this topic. Here at the National Archives, we are putting in place the infrastructure to prepare the agency to move forward on AI-related efforts within a framework that ensures careful consideration of the tremendous opportunities and risks that AI presents. Late last year, Archivist of the United States Dr. Colleen Shogan signed the charter of a newly formed Executive Steering Committee for the Freedom of Information Act and AI at NARA.
I am pleased to co-chair with Michael Cheatham, our Chief of Management and Administration, the Steering Committee which is tasked with exploring the application of AI technology to expedite search, review, redaction and response to FOIA requests received by NARA for both archival and day-to-day operational records. For me personally, today has been many years in the making. I've spent my entire NARA career, more than 30 years, focused on access to records, starting out as an Archives technician and later as an Archivist and a Management and Program Analyst, working to provide access to NARA's rich and diverse holdings. As the Director of the Information Security Oversight Office from 2008 to 2011, I was responsible for oversight of government-wide security classification, controlled unclassified information and the National Industrial Security Program. Later, as the Executive for Agency Services from 2011 to 2013, I was responsible for NARA's efforts nationwide to service the ongoing records management needs of federal agencies and to represent the public's interest in government records with regard to accountability and transparency. And most recently, prior to my appointment as Deputy Archivist, I served as NARA's Chief Operating Officer for just over 10 years, responsible for all facets of the agency's mission. While we at the National Archives continue to face the seemingly endless challenges and opportunities for our work that I have seen in the last three decades, it is clear that the next frontier is artificial intelligence and machine learning. AI will undoubtedly transform the way the government does business, and I look forward to hearing from today's panelists as they describe their groundbreaking work in this area. 
Before turning the mic over to Pam Wright, NARA's Chief Innovation Officer, I think back on the first observances of Sunshine Week here at the National Archives in 2012 and 2013, when the original FOIA statute was displayed in the East Rotunda Gallery here at the National Archives Building. Today, I invite those of you who are here in the McGowan Theater to stay in the building after our program. As you leave the theater, there will be a temporary display of documents related to the FOIA from our holdings. I also invite you to visit the Rotunda, home of our nation's earliest founding documents. Today's program is moderated by Pam Wright. Pam became NARA's first Chief Innovation Officer in 2012. She leads staff responsible for agency-wide projects and programs in innovation, digitization, web, social media, online description, and online public engagement. Pam previously served as the agency's Chief Digital Access Strategist, where she pulled together the web, social media, and online cataloging staff into an award-winning integrated team for improved online access. She has previously served as NARA's representative to the White House Open Government Working Group and currently serves on advisory boards for the Digital Public Library of America and Library and Archives Canada. I will now turn the mic over to Pam, who will introduce our distinguished panel today. Thank you. Thanks, Jay. You know, although Mr. Bosanko has just recently become our Deputy Archivist, he has been a leader at NARA and a role model for NARA staff for three decades, so it's an honor to work with such a dedicated public servant. I also want to thank OGIS Director Alina Semo and her team, Martha Murphy, Kirsten Mitchell, Kimberly Reed and Dan Levinson, for all the support behind the scenes and for coordinating everything for today's events. Thank you, guys. So good afternoon, everyone.
I'm pleased to host today's Sunshine Week panel with my esteemed colleagues from across the federal government. I invite you to sit in on the discussion we will have on the use of artificial intelligence in the federal government and how the various forms of this technology can support our efforts to ensure an open government at our respective agencies. The pillars of open government are transparency, collaboration, and participation. The work we do through our open government plans helps us hold our government accountable and builds trust with the American people. I cannot think of a more important mission for our work as federal employees and public servants. In 2009, the White House issued an Open Government Directive requiring federal agencies to take immediate, specific steps to achieve key milestones in those areas of transparency, participation, and collaboration. Since then, federal agencies have set forth their steps in biennial open government plans available on each of our websites. Over the years, NARA has completed over 100 initiatives from open government plans, and some of NARA's initiatives from our most recent open government plan are in the fifth and current U.S. National Action Plan, which includes initiatives for improving our catalog and redesigning our website for better public access. This year, GSA has reinvigorated open government efforts and established a new Open Government Secretariat that is leading interagency efforts to develop the sixth National Action Plan. They recently held their first interagency working group meeting, and in the past, many of these meetings were held just upstairs in the Innovation Hub here at NARA; we co-hosted them for many years. NARA stands ready to support open government efforts going forward as well. So what makes our work in open government so exciting to me is the way it can be turbocharged through emerging technologies.
We have seen it in the past with the wave that came through with social media, allowing federal government agencies to interact directly with the public, where the public was online. We take that for granted now, but it was a sea change back in 2008. Artificial intelligence as a field has been around for many years, in a variety of flavors. As most of you know, AI refers to computer systems capable of performing complex tasks that historically only humans could do, such as reasoning, making decisions, and solving problems. Today, the term AI describes a wide range of technologies that power many of the services and goods we use every day, from apps that recommend TV shows to chatbots that provide customer support in real time. Over the past year, AI has exploded in the public consciousness with the advent of publicly available tools like OpenAI's ChatGPT. I just want to stop for a second: how many of you in the room have tried ChatGPT? Everybody. Yeah. And much like social media did 20 years ago, these tools have gone mainstream before we have a full understanding of the potential impacts, for good or bad, that these technologies offer the world. Now that the Pandora's box of AI has been opened, there are very few fields that artificial intelligence won't affect in the near future. ChatGPT, large language models, and other forms of machine learning and artificial intelligence are part of this generation's wave of emerging technologies. I'm both super excited and concerned about the surge of interest that generative AI is creating. My hopes for this event are to explore both sides of this opportunity and risk with some of the best thought leaders we have in this space. So today, I am honored to host the panel of federal government experts that we have. Each has prepared a short presentation, and once we've gone through those, I hope we'll have some time for a little discussion and maybe some questions. So let's get started. Allow me to introduce our first speaker, Abigail Potter.
Abby is a senior innovation specialist at LC Labs, which is within the Digital Strategy Directorate in the Office of the Chief Information Officer at the Library of Congress. Abby is a founding member of LC Labs and has been leading a program of digital experimentation with an emphasis on practical and human-centered outcomes. Since joining the Library of Congress in 2005, she has helped build capacities in local, national, and international networks for mass digitization, digital preservation, web archiving, machine learning, and GLAM research labs. In the past year, she led the creation of the LC Labs AI planning framework and the NLP vendor evaluation guidance from the GSA AI community of practice, and is the current co-chair of the AI4LAM secretariat. Abby has a background in digital publishing and an MSI from the University of Michigan. Please join me in welcoming Abigail Potter. Thanks so much for having me today. This is a great event that we're really excited about, and I think it really is a good example of how the promise of AI can be put to good use, if we do it carefully and know as much about the process as possible. At the Library of Congress, we have been experimenting with machine learning for several years, since about 2018, and with a distinct purpose: we wanted to use this technology to create more entry points into our digital collections, engage more people, and make our collections more relevant to more people. That was the driving force behind exploring these technologies. As we did those explorations and saw what was happening, we realized it's not as easy as it seems to simply unleash this technology on our collections; there are things we have to keep in mind. So that's where this framework came from. And I should say, let me see. That works.
Okay, so we have been iterating on the framework for a while. In November, we released a blog post, and the framework is on GitHub. It's a very basic set of Word documents and questions, not a super advanced thing; it's something anybody could pick up and use, and we plan on updating it over time. That's the link if you want to dig into pieces of the framework that I don't get to cover today. I also want to reiterate that the framework is not our official policy. The Library of Congress also has a newly formed AI working group that is going to recommend the official policies, but the framework has been really useful in helping to establish those policies with actual evidence of how the data and tools work. Okay, so I won't spend too much time on this, but it helps to abstract what is happening in AI. It is complex, but it's not a wholly new technology. It's been around for a long time, and it has become more impressive and more powerful, but at its core there are still these elements. The data is very key to machine learning and AI. There are the models, which is an abstract way of talking about the technology that's in there: the actual models, the architectures, the types of training, the different ways the technology works. And then there are the people. These are all overlapping, but the people are at the center here. People are represented in the data, they create the models, their work is represented in the use cases, and they make up the organizations that are going to use this technology. People are also ultimately responsible for what happens, and they are the ones who are impacted. So when thinking about AI, it's good not to lose sight of the fact that it's not a magical technical thing; it's something you can understand and control if you want to.
So this is a very high-level view of the different phases of AI that we have outlined. The first phase is understand, the second is experiment, and the third is implement. I should say we've not gone all the way through to an implement stage, where we would be implementing a new instance of AI in a production application or across the organization with all content types. It's a pretty high bar to get there, which is what we're learning in our understand and experiment phases. These phases also lay on top of our current governance and policies. Like I said before, this is not a wholly new technology; a lot of our current governance and policies are still appropriate and still valid. There may be some tweaks or updates to make, but we are still relying on our current policies around how we deploy and prioritize technology in our organization. This is a zoomed-in version of those three phases, and I'll start with understand. The first thing I usually recommend, when people are thinking about what they might want to do with AI, is to think hard about the values and principles of your organization, or the goal of the specific task you're trying to advance. The activities in this understand phase are usually collaborative assessment activities: people in rooms talking together about what data we have, what expertise we have, and where the risks and benefits lay out. We have workshops and worksheets to help people think through the major questions there. Next, we have experiment, and we call it experimentation because we are in a digital innovation lab, so we already have a process of experimentation that has been super useful and applicable to AI.
And this is where we can actually get specific about what data we're working with, use it with specific models, and then look at the outcome when that data is processed through the model, and let our staff or users see what comes out of it. What we've learned is that expectations in terms of performance for a lot of AI processes and tools are overblown at this point. If you are looking for high-quality, highly accurate outcomes or data at the end, it will take some time to get there. This is where the experimentation process comes in: you can take real data with a real use case, see what happens, and then review and develop a baseline of what quality means and what good enough is, which in some kinds of cases we have, and in some we don't. And then there's the implement phase. Like I said, we haven't gotten here yet, but we know that we'll need a program of monitoring and measuring, because of the nature of AI: when models see more data, the outcomes can change over time depending on that data. So monitoring is a really important aspect of implementation. Also, I think there will be a skills gap, but not a huge skills gap. A lot of the information management, librarian, and archivist background, especially working with digital collections and digital content, really still applies in this area. It's a matter of updating skills and capacities; we don't have to start from scratch here. There's a lot we already have. And then one really important part that we're lacking, which I would love to work toward together, is shared quality standards.
And I think of OCR, optical character recognition, which is a widely used machine learning technology. As a community, we've come to expect maybe about 80% accuracy, and for a lot of use cases that's probably good enough. We have embedded those quality standards, in terms of imaging and in terms of OCR, and we've shared them in venues like the Federal Agencies Digital Guidelines Initiative. I think we need things like that for AI: for our key use cases, what does quality mean there? What is it? Then we can communicate that to our vendors, and we can communicate it to each other. Okay, going back to the principles bit: there are executive orders about AI, and they outline a number of principles for what AI should look like in the federal government. Across different administrations and different parties, there are exact matches of words: people's civil rights, civil liberties, and security have to be number one. I think that's a baseline, but in our individual organizations and for our individual tasks, we can refine these. And Pam already talked about some really salient ones, around transparency, participation, and collaboration, that you can build into the AI programs that you make. Just one more thing about assessing quality, because I think this is something we are really primed to work together on. These are outcomes from some of the experiments that we did at the Library, and we're trying to think through how we assess the quality of the outcomes. On one side, I think it's on the right, there's precision, recall, and F1.
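As a minimal illustrative sketch (not the Library's actual evaluation code; the document IDs and counts below are invented for the example), precision, recall, and F1 can be computed from a model's predictions like this:

```python
# Illustrative sketch of precision, recall, and F1 for a labeling task.
# Not the Library of Congress's actual evaluation code; doc IDs are invented.

def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 from sets of predicted/actual positives."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: items a model tagged with a subject heading vs. the true items.
predicted = {"doc1", "doc2", "doc3", "doc4"}
actual = {"doc1", "doc2", "doc5"}
p, r, f1 = precision_recall_f1(predicted, actual)
# precision = 2/4 = 0.5, recall = 2/3, F1 = 4/7 (about 0.57)
```

Precision penalizes false positives, recall penalizes misses, and F1 balances the two, which is why all three are typically reported together when deciding whether a tool is "good enough" for a use case.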
These are common machine learning performance metrics, and those performance scores are one way to measure whether a tool is working well enough for our purposes. There are also a number of qualitative measures we could evaluate models on: how reliable a model is, how much compute it costs, how many resources it takes to run, whether the documentation is good, whether the project is active. Those kinds of things are very relevant. And there are other, higher-level program assessments that we want to do, especially around cost, because cost is really variable depending on what tools you're using, and it's really unclear what the actual costs are going to be. There are also compliance issues. I think back to the Web 2.0 days, when in the federal sector we couldn't use a Google map or a SurveyMonkey because of how the data was treated in those systems, and from what I hear we're starting at square one with some of these tools in terms of getting government-friendly terms of service. There are also things we want to track around fairness and balance: if these technologies are used only for a certain type of content or only for a certain type of user, we want to track that to make sure we are applying the technology in a fair way. And of course there are the human impacts of what the technology can do, which we'll want to assess on a case-by-case basis. So anyway, this is food for thought on how to assess the quality of an AI system that we may want to implement. Okay, I'll stop there. Thank you so much, a lot of food for thought there; that's terrific. So let me introduce our next speaker, Eric Stein. He is the Deputy Assistant Secretary for the Office of Global Information Services at the U.S. Department of State. In this role he is the department's senior agency official for privacy.
Previously, he served as the Director of the Office of Information Programs and Services at the State Department, where he was responsible for the department's records management, FOIA, Privacy Act, classification and declassification, library, and other records and information access programs. He also serves as co-chair of an interagency FOIA technology working group of the Chief FOIA Officers Council, led by the Department of Justice and our own National Archives and Records Administration. Please join me in welcoming Eric Stein to the podium. Well, good afternoon, happy Sunshine Week, and thank you to the National Archives for this opportunity today. Our AI is better than this. While we're waiting for the slide to pull up, I'm very excited to talk about some of our recent experiments with artificial intelligence in our transparency programs at the State Department. Over the past couple of years we've been looking at how to declassify records, and I've spoken quite a bit about those pilots and efforts in other fora. Today I look forward to speaking to the public, to everyone here, and to other agencies about a pilot for FOIA and AI that we started in June of last year and that just ended last month. The thought was: could we use some of the lessons learned from our declassification AI pilots to improve our FOIA processing? We came in with very big, ambitious goals and learned that maybe we needed to take a few steps back before jumping into some of those bigger dreams. We'll try one more time here. What we learned from our AI declassification pilot is that a machine learning model can make a decision using discriminative AI: propose to declassify a record, propose to keep it classified, or say "I don't know," in which case a human review has to occur. I'm very happy to say that starting in September of last year, from that AI declassification project, we started proactively disclosing those declassified records on our FOIA website, and we have done so every month since.
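The three-way outcome described here — propose declassification, keep classified, or "I don't know" with human review — is commonly implemented by thresholding a classifier's confidence score. The following is a hypothetical sketch; the threshold values and function names are invented for illustration and are not the State Department's actual system:

```python
# Hypothetical sketch of routing a classifier's confidence score into
# declassify / keep-classified / human-review buckets. Thresholds are
# invented for illustration; the actual State Department model differs.

DECLASSIFY_THRESHOLD = 0.90   # high confidence the record can be released
KEEP_THRESHOLD = 0.10         # high confidence the record must stay classified

def route_record(p_declassify: float) -> str:
    """Map a model's probability that a record can be declassified to an action."""
    if p_declassify >= DECLASSIFY_THRESHOLD:
        return "propose_declassify"
    if p_declassify <= KEEP_THRESHOLD:
        return "keep_classified"
    # The model is unsure ("I don't know"): send to a human reviewer.
    return "human_review"

print(route_record(0.97))  # propose_declassify
print(route_record(0.03))  # keep_classified
print(route_record(0.55))  # human_review
```

The design choice is that the model only acts where it is confident in either direction; everything in the uncertain middle band stays with human reviewers, which matches the pilot's description of keeping people responsible for borderline decisions.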
We reviewed tens of thousands of records for declassification using that model, and we're very happy to have released about 700 or so records so far. There's a bit of a bottleneck in terms of getting those out to the public, but it was a declassification review, and there are a myriad of other factors for public release that we need to consider as well. Today I'll be talking about the FOIA pilot we had for AI, and we tried to address a couple of different things. There was a particular challenge: we were wondering how we could improve the experience of our requesters who go to our public website looking for records we've already released, and also how we could improve the experience of our employees in searching for records, so they are not searching or reviewing the same records over and over again. Fortunately, we're able to share some of the results today from those pilots, and I can say that we have gone, to take Abby's terms, through the phases of understanding and experimentation, and we're now looking toward implementation of some of these FOIA results. All right, here we go; here's our FOIA business case up behind me. Essentially, we've seen from the public FOIA reports that the number of cases and requests has gone up over the years. We find there's increased complexity of requests and large volumes of data. Included in these slides is an appendix with a graph that I absolutely love, which shows a large increase over time in the volume of records we have captured and have to review for declassification, growing and growing, and that's just the classified material, not even unclassified information. The point is that the complexity of requests continues to grow, and the electronic volume of data we have to search continues to grow as well.
So if we don't try to embrace technology now and figure out how we can leverage these tools to improve our process, we are not going to be successful now or moving forward. We also found there were certain silos between our teams and tools that led to inefficiencies and delays in our FOIA processing, and I'll be talking about that in a moment, in addition to a snapshot of our FOIA program stats. We had our goals here on the bottom. First, identify similar FOIA requests. Right now a lot of this is very manual. We get 10 to 15,000 FOIA requests a year. Someone can request something very similar, maybe not exactly the same, and someone has to manually search our public FOIA.state.gov website, search the case management tool we have, and search our archives. All of this is manual. And to be very clear, the public can request whatever they want from agencies under FOIA. Sometimes a request is a sentence, but sometimes these request letters are 10 pages long or even longer; sometimes they list hundreds and hundreds of terms. That's not a complaint in any way, but rather an acknowledgment that as we go through these different terms, someone has to manually copy and paste them over. So the thought for this pilot was: could we take a request letter and the terms in it, and once we perfect the request, put it into a search across existing requests and what's on our public-facing FOIA website? That's what I'm going to show you: a couple of real screenshots of this AI tool that we developed. Additionally, we conduct searches of the central databases, as I mentioned, and we want to improve the customer experience. We also realize that, coming to our website, we have 250,000-plus records on our public website. Is it the most user-friendly?
No, and we're going to work to improve that, but this pilot was able to point out some areas where we could start. Thank you. So here's our project timeline and approach. When Abby was talking, I kept thinking about how I have phase one and phase two here: our phase one is our understanding phase, our phase two is our experiment phase, and then where we say "present," that's where we're going to work toward implementation. We started with two phases. The first phase was to understand some of the challenges in our FOIA processing and to look at how we could use AI and machine learning. A lot of times we get questions like, can't we use this for redactions, can't we use it for this or that? We decided to start with some search features that we saw worked well in our AI declassification model and see whether they could be applicable in the FOIA context. I also want to emphasize that this work was done with the intention of making our FOIA process more efficient, actively seeking out inefficiencies in process and technology that we could improve upon, and none of this would be done without our people, of course. I'm going to take a moment to thank everyone from our Center for Analytics, which includes our chief data and AI official, to our CIO's office in the Bureau of Information Resource Management, and my own organization. Thank you to the team; they seem to really love working on this project, and we're really glad to have you with us. So for phase one, you can see we took a couple of months to do some research about what challenges we have, and for phase two we then looked at what we could do with AI and machine learning. Next slide, please. Thank you.
So phase one in our understanding phase here are key findings the pilot found that FOIA employees were using these manual processes and this is where we've learned that AI and machine learning can be very valuable saving time and energy by not having to do the same task over and over and over again. I want to emphasize if you're looking at the chart on the screen here I'll be talking about this more in a moment but this is from our FOIA reading room our public reading room on the on the y-axis is from the left up and down it's number of documents on the x-axis on the bottom there left to right is the repeated number of times and we found from just in this understanding phase that there were records that we looked at 118 records were posted three times on our public reading room and if anything I think that goes to show we do try to be transparent respond and so forth but what I saw was inefficiency. How many times did we possibly review the same record and in some instances cases are similar so before anyone's like well wait why are you doing this well some people request the exact same things and sometimes cases are about cases are about cases so you could be instances like that but you look at we said 113 documents were reviewed four times and the far right you have six different documents that at least appeared to be posted 12 times it could be a very popular document or could be a lot of questions asked but we kept digging and digging why does this happen how did this happen what steps could we take now and maybe what could be used with technology and what we do not use technology for manual processes can be inefficient we already mentioned that they have to search across different areas and then for certain cases we're duplicating efforts reprocessing the same documents over and over again so if an archive we can search our centralized archives we have to go back to different state department bureaus and offices and take them away from the key work that they 
do in their respective missions, to ask: do you have these records? Doing consultations, understanding sensitivities, and so forth. Could we find ways to improve that as well? Next slide, please. So here's our new capability; this is the actual tool that we developed, and I want to point out that we didn't go for aesthetics, we went for functionality. In the upper left you have a title field, which could be something like a request ID or case number, but I'm going to focus on the request text box. Think of that like a Google-style search feature; what I'll show you on the next page is the result of the machine learning and AI tool. You can put the actual request text in there: a paragraph, multiple paragraphs, pages, those 200 terms I talked about. If you leave the search beginning date and search end date blank, it searches all the dates in the data we've used; if you want a very targeted date range, you can enter one. Then there's a choice between term frequency and context. What we're about to show you is a term-frequency result, because context is a different search meant for longer, broader text, and I didn't think it would be beneficial to put a 10-page request up here for this presentation today. Let me take a step back and explain what this tool searches and what we piloted: we pulled all the data from foia.state.gov, all the public records that are on there, and we pulled all the FOIA requests in our case management system. When we hit submit, in a moment you'll see the results, and I'll walk through what they say. Each case and document is assigned a score from 1 to 100, but don't think of this like a grade in school; there are different cutoffs, 70 percent, 80 percent, but it assigns a score to each case that may be related to this request, or at the
document level, what's related. Thank you for advancing the slide; I didn't touch the button there, and I guess a magician shouldn't give away tricks. Here we have our phase two results. For our matching tool, in the upper left corner you see a split screen showing FOIAXpress results, which is our case management system, and our reading room results. From that search, and I believe "travel advisories and warnings" were the exact terms we used, there were 40 potentially related cases currently active at the Department right now involving some instance of a travel advisory or warning, comprising about 4,871 documents. On the right you have a hundred instances in our reading room, with potentially 125 results; you can imagine the words "travel warning" and their different iterations pop up quite a bit, and we'll talk about drilling down those results in a moment. Underneath is a bar chart, which might be a little difficult to read, showing the grouping of like requests. To the right of the red line are cases where we're very confident the case might be identical to those exact words, and moving left, we move further from that confidence of it being identical. On the upper right you see a breakdown: on the left side, again, the possible related cases in our case management system, and on the right, what's in our public FOIA reading room. At the very top of the left-hand side, though you probably can't see it, it says 100 percent: the first words in that FOIA request were "travel advisory," a direct hit on that term, and on the right there's also text about travel advisories. In the live tool you can click on those bars and narrow the results down, so you can see which documents have been released, and this will help us respond to requesters
faster as well. If we see the information is already on the website, we can email the requester and say: here's a link, here are things you could look at. On the left-hand side, it shows us that there might be several related cases where we've already done searches and reviews, and how recent those searches or reviews are, and so forth. On the bottom you see word mappings from the reading room, and I want to be very clear: this is not all State Department data. It comes from our FOIA cases and the records we've released publicly, in this case releases related to travel advisories and warnings. If you click on a specific country or term, you can see the number of documents by year and start drilling down on results, whether you're looking for a specific country or region or for particular travel topics, warnings, or advisories. It does the same thing on the case management side as well; we just ran out of space on the slide. So we have been able to successfully pilot this tool, see different ways we can look at cases, and find efficiencies in our current search process and our case processing, and we've learned quite a bit as well, which I'll talk about. Next slide, please. For those of you interested in the different forms of artificial intelligence, machine learning, and the technology we used, we have on the screen everything that we've used. Let me be very clear: this is not an endorsement of any product or tool in particular, but rather a list of the ones we used for our specific work. I'm very happy to say there are a lot of different moving parts in this type of work, in terms of looking at what records we have requests for, what records we have found in response to those requests, and what we've already publicly released. And as we go back to the whole
concepts of understanding, experiment, and implement: we're going to continue our proactive disclosure of our declassified cables, and going forward we will be looking at different ways to implement this in our FOIA process so we can improve our processing time and do better with our FOIA case management. With that, I want to thank you for the opportunity to present here today, and I'll turn it back over to Pam. Thank you so much, Eric. All right, our next speaker is Bobak Talebian. He's the director of the Office of Information Policy at the Department of Justice and is responsible for developing policy guidance for executive branch agencies on the Freedom of Information Act, providing legal counsel and training to agency personnel on the procedural and substantive aspects of the act, and encouraging agency compliance with the law. OIP also manages the Department of Justice's obligations under the FOIA, including adjudicating administrative appeals from denials of access to records; handling initial requests for records of the offices of the Attorney General, Deputy Attorney General, and Associate Attorney General; providing staff support for the Department Review Committee, which reviews Department of Justice records containing classified information; and handling the defense of certain FOIA matters in litigation. Please join me in welcoming Bobak Talebian. Thank you so much, and thank you to NARA for inviting me. It's such a privilege to be here with such a distinguished group of experts on a really important topic, in FOIA especially. So this Sunshine Week, I think we celebrate a historic new chapter in FOIA.
For the first time ever, we have received and processed over one million requests, and I think it's a great thing every time we get and process more requests, because FOIA is a really important tool for citizens to engage with their government, keep government accountable, and meaningfully participate, so we want it to be put to good use. But of course there are challenges that come with that. As Eric also mentioned, with more complex requests and the need to keep up with that demand, now more than ever it's critical that agencies continue to find advanced tools to help with every part of the request process. From working with agencies and reviewing their Chief FOIA Officer reports, we've seen a lot of great work done in several areas of FOIA administration to implement advanced tools. For a number of years now, agencies have been using advanced e-discovery platforms with machine learning functionality that helps with searches and responsiveness review of records. We're excited that agencies are making headway, especially, for example, the Department of State and Eric, who have been pioneering in this area, on the potential for these types of tools and machine learning to help with the review process. And of course we're really excited to have been working on a new search tool for FOIA.gov to enhance the user experience, one that focuses on customer service, and we were able to build it because of a level of machine learning.
We are very committed to this and see the value of it. The Chief FOIA Officers Council technology committee, which Eric co-chairs, is doing a lot of great work as well, and we're really excited for the next-generation FOIA technology showcase that the technology committee will be putting together this coming May. This year we're really focusing on asking industry to come and show us what they have, not just in case management, processing, and other tools, but specifically in the area of artificial intelligence. What I wanted to showcase today is the work that we've done upgrading FOIA.gov with our new search tool, which we were excited to launch in October 2023. The problem we were trying to solve is that we understand how complicated the federal ecosystem can be, especially for someone not familiar with how the federal government is organized, when they're looking for information that may already be available, or trying to find the right place to make a request or ask for that information. We did a lot of user discovery to find out what the proper solution would be. Just to give you context: in the federal government we have about 120 agencies that are subject to FOIA, and within those, several hundred components that have FOIA offices for receiving requests and maintain separate FOIA libraries. So it can be a daunting task for someone not familiar with the federal government to find records.
So we did a lot of user discovery on what the solution would look like and what would be helpful, and we came up with a tool that uses a combination of a logic pathway, based on user journeys for predefined topics that we thought would be the most common topics people are looking for records on, as well as a level of machine learning, which I'll get into, to help direct the requester: to information that is already available, and what that information is; to making a request for that information; or, if there's another avenue to make a request outside of FOIA, to the right place. I have some snapshots here to give you an idea of the first iteration of our final product. Here you can see we have six predefined topics, and these were topics that, through our user discovery, we assumed would be the most commonly sought-after types of records: immigration and travel records, tax records, Social Security records, medical records, personnel records, and military records. If you click on one of those, as I'll show in just a second, it takes you down the logic-based question journey to get you to the right place. However, if that doesn't satisfy what you're looking for, you can then enter a description of what you're looking for in the search box so the tool can help. For example, if we clicked on immigration and travel records, the tool takes you to a list of common categories of records within that topic, from A-files to naturalization certificates to visa records, and, knowing Eric was going to be here, I specifically wanted to use the State Department as the example.
So if you click on visa records, it takes you to the best results, where it identifies the Department of State as the place to ask for your visa records. It then provides some best practices and tips on how to make a request, and takes you directly to our form on FOIA.gov, where the requester can make the request to the State Department. And as I said, it's not just about where to make a FOIA request; if a record is better accessed outside of FOIA through a different means, we built that in as well. For example, if you're asking for records of a former military service member, it asks whether you're looking for the Official Military Personnel File, and based on that, depending on who you are, it takes you to file a request directly through eVetRecs online, or directs you to the Standard Form 180 and the National Personnel Records Center website. If none of those predefined categories satisfies what you're looking for, you can then put in any kind of description of what you're looking for that didn't fit. So here, for example, if you're looking for JFK assassination records, the machine learning part of our tool finds what's already available online on JFK assassination records, and obviously you can see a lot of records posted by NARA, and also where you might want to consider making a request if those proactive disclosures do not satisfy what you're looking for; as you can see, NARA was one of the options.
So, how it works. We wanted to be very transparent about what we're doing and how it works, so this is also something that we post prominently on the FOIA.gov website. If you do a text search, the first thing the system does is take that text and look at the predefined journeys we built; if one of those journeys would potentially satisfy what the requester is looking for, it puts you back into that logic-based pathway. If that doesn't satisfy the query, the system then scans agency names, acronyms, and agency mission statements to see if those would help direct the requester to the right place. If that does not satisfy it either, that's when the machine learning comes into play. We did a lot of testing and discovery, and worked with our partners, to find the best machine learning models, and to take agency FOIA logs that are already published on agencies' websites, as well as frequently requested records posted in FOIA libraries, and use that publicly available data and information to help direct the requester. We're really excited that, just this past week, we passed 50,000 queries made in the system since we launched it. We're constantly looking at the analytics provided by usage of the system to see where we can refine and improve the functionality. What we've seen so far is that 58 percent of the people who have queried actually went to one of those predefined categories, which reinforced our observation during discovery that those would satisfy a lot of the information needs of the requester community in finding where to go for records. Another 14 percent were routed into the predefined categories based on their text search.
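The three-tier routing just described, predefined journeys first, then agency names, acronyms, and mission statements, then machine learning over published FOIA logs and frequently requested records, can be sketched in a few lines of Python. Everything below (the topic keywords, the agency data, and the `ml_rank` callable standing in for the trained model) is a hypothetical illustration, not the actual FOIA.gov implementation:

```python
def route_query(text, journeys, agencies, ml_rank):
    """Tiered FOIA.gov-style router (illustrative sketch).

    journeys: {topic: [keywords]} for the predefined logic pathways.
    agencies: {name: {"acronyms": [...], "mission_terms": [...]}}.
    ml_rank:  fallback callable standing in for the ML model trained
              on published FOIA logs and frequently requested records.
    """
    q = text.lower()
    # Tier 1: does a predefined user journey cover this request?
    for topic, keywords in journeys.items():
        if any(k in q for k in keywords):
            return ("journey", topic)
    # Tier 2: match agency names, acronyms, and mission statements.
    for name, info in agencies.items():
        terms = [name] + info.get("acronyms", []) + info.get("mission_terms", [])
        if any(t.lower() in q for t in terms):
            return ("agency", name)
    # Tier 3: fall back to machine learning over public FOIA data.
    return ("ml", ml_rank(text))

# Hypothetical configuration for demonstration only.
journeys = {"immigration and travel": ["visa", "passport", "immigration"]}
agencies = {"Department of State": {"acronyms": ["DOS"],
                                    "mission_terms": ["foreign policy"]}}
ml_rank = lambda t: ["proactively posted records matching: " + t]
```

In a real system the tier-2 matching would be token-based rather than substring-based (so "DOS" doesn't fire inside "dossier"), but the control flow is the point here: cheap deterministic routing first, machine learning only as the fallback.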
So as you can see, a large share of the requests are satisfied through the categories we identified, and I took this as a good lesson: machine learning is very important and very powerful in its potential, but it's not the solution to everything; it's part of the solution. Here we have the remaining 28 percent, who are being served by, and taking advantage of, the machine learning. We knew when we launched this that we wanted it to be a very first iteration; it was important to get something out there to get real use, real feedback, and real analytics to help us decide how best to refine and improve the solution, so that requesters get better and better results. And I'll note that in all of our analytics, we're ensuring that personal privacy information is not being transferred into the analytics we use. One thing I'll say: I encourage folks to play around with the tool and give us feedback; it's really helpful. We're already looking for ways, based on the analytics and feedback, to improve the functionality. We've already identified another user journey that we believe is going to be very helpful, and we're working on it. But we're also trying to solve one of the biggest challenges of this project, which was getting the data. Machine learning is very helpful if you have a good amount of good-quality data, and that was the difficulty we had, because we were trying to get the FOIA logs and frequently requested records from a number of different agencies that post them in different formats with different levels of accessibility. The team went through a very manual and laborious effort to pull all of that together, because it was a very important first step to making this a reality.
But going forward, we want to work with agencies, and we are going to issue guidance to help standardize and make these FOIA logs and frequently requested records more accessible, so that they can be digested closer to real time. That's our next vision, the pie in the sky: the tool informed with more and more data, in more and more real time, and we look forward to that. Of course, that also has the added benefit of making all those postings more accessible. So that's what we've been doing with the FOIA.gov tool; it's been exciting, and I encourage everyone to please give us feedback where you can. Again, I want to thank NARA for including me in today's presentations. That was fantastic, and I appreciate how user-driven everyone on this panel has been. Saving the best for last, our final speaker is Gulam Shakir. He's been NARA's chief technology officer since May of 2020, and in his role Gulam establishes agency-wide enterprise technology architectures and provides program-level IT strategic direction to mission-critical programs. Gulam also currently serves as NARA's chief AI officer. He has also served as NARA's chief data officer and as a system architect within the Office of Information Services at NARA since 2016. Before joining NARA, Gulam served in various technical leadership roles, and I'm not going to get these right, at data zoo and Marchucks and the IBM Corporation. He has a Master of Science degree in computer science from West Virginia University. Please join me in welcoming Gulam to the podium. Thank you, Pam, for inviting me to the event. I'm going to briefly talk about exploring AI at the National Archives and go through some test use cases that we found were critical for our mission. To set the context, I just want to go through some background slides. The first is about an AI executive order that was issued in 2020.
It basically encourages all agencies to use AI, but to use it in a way that, and the key phrase here is, constitutes responsible use of AI. The other key phrase I want to point out is protecting civil liberties. That was one of the impetuses for us to start looking very seriously at AI in our agency's mission use cases. We are also watching what's coming from the executive branch, and we are currently reviewing a draft AI memo. It basically follows the lines of the executive order, but it has three goals that we are working on, and we are trying to align whatever we are doing with AI use cases with those three main goals. The first one is AI governance. Again, as I touched on in the last slide, responsible AI is key here, and one of the ways we can pursue it at the agency level is by having some kind of governance body around it. We are also working toward the second goal, which is advancing AI innovation, and investing a lot in our workforce, especially in upskilling in data skills, AI skills, and generally what is required to work with AI. The third one is about managing risks, which means constant evaluation; I think some of the speakers touched on this, making sure that AI is not producing any unsafe outcomes. You must have heard the term hallucinating; we want to avoid those kinds of things by constantly evaluating the risks. We have been looking at AI use cases since 2019. In 2021 we partnered, through Pam Wright, with external partners at Virginia Tech to work on ethical frameworks around AI. Along with our CIO, we are constantly in touch with other agencies and partnering, and we have representatives on the responsible AI officials council roundtable through our CIO, through myself, and through the chief data officer. Okay.
Before I go into the use cases we are trying at NARA: all of these use cases are documented, because that was one of the requirements of the executive order. If you go to archives.gov and search, you will find the inventory that we have documented so far. We review that inventory every year and periodically update it, adding any use case that becomes applicable to AI. Now, there are lots of use cases around AI, but we thought it would be better to segregate them into buckets so that you can easily follow where each use case falls. The first bucket is a pretty big problem: search, and how to improve the search experience; we are looking at several use cases there. The second is PII detection. This comes in because we make a lot of records public, and we don't want to inadvertently leak any personal information. Then there are self-describing records. It could use a better name, and we are calling it first-draft descriptions now, because, as one of the speakers touched on, data is what powers AI use cases. One of the ways to improve that is to use AI itself to enhance the metadata: when was this record created, what was the title, what was the scope and content? Since we have a very high volume of records, we are looking to see how we can use AI itself to annotate the records with this metadata. Finally, we are looking at how we can support the workforce through these AI tools. So, the first category is search based on AI. We divided search into two buckets. One is how search is going to be useful for the people working on the mission, the NARA workforce, and in that, obviously, FOIA comes out on top. We are looking to use natural language processing where you can interact: basically, you upload the documents.
You have some kind of free-form text chat interface so that you can ask questions about the data. Hopefully that will help you narrow down the search, because the breadth of the documents that you're looking at is pretty large; usually it's not humanly feasible to review every record manually. We are also experimenting with another tool that automatically tags FOIA exemptions, such as (b)(5) or (b)(6); this is in the R&D phase. The next one is about national declassification. We are still thinking about how to do it, because it requires a special kind of environment to try this out, so we are looking to acquire that capability before we even attempt certain use cases there. And then, sorry for the acronyms here: we have a system called the Electronic Records Archives, and an Executive Office of the President system. Both of these have search functionality built into them to look for records, and the records in the EOP system alone range up to 700 million; just looking at the emails, there are several hundred million of them and things like that. So we are going to see how we can improve that search. Then we are looking at semantic search; this has actually moved into a pilot phase. Again, we want to improve our search experience. Traditionally, search has always been about keywords. One example we give is: if a user comes and searches our public records for "USS Truman," that combination of words should automatically imply that the person is searching for the Truman that is the carrier, not Truman the president. So we are trying to understand the context behind the query and improve the quality of the search results. We are really focusing on assessing the quality of the first-page results, and trying to see whether semantic search improves that aspect.
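The "USS Truman" example can be illustrated with a toy sketch in which each catalog entry carries a sense-specific context lexicon; production semantic search would use vector embeddings rather than hand-built word lists, and all the titles and lexicons below are hypothetical:

```python
def semantic_rank(query, docs, lexicons):
    """Rank docs by direct word overlap plus overlap with each doc's
    sense lexicon, so context words like 'USS' pull in the carrier
    record rather than the president's papers. A toy stand-in for
    embedding-based semantic search."""
    q = set(query.lower().split())
    scored = []
    for doc in docs:
        title_words = set(doc["title"].lower().split())
        score = len(q & title_words) + len(q & lexicons.get(doc["sense"], set()))
        scored.append((score, doc["title"]))
    return [title for score, title in sorted(scored, reverse=True)]

# Hypothetical catalog entries and hand-built context lexicons.
lexicons = {"ship": {"uss", "carrier", "deployment", "strike"},
            "president": {"administration", "doctrine", "1948"}}
docs = [{"title": "Harry S. Truman presidential papers", "sense": "president"},
        {"title": "USS Harry S. Truman deployment records", "sense": "ship"}]
```

A plain keyword match would score both titles similarly on "truman"; the context terms are what tip the ranking, which is the intuition behind replacing keyword search with semantic search.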
And then, of course, we want to bring in ChatGPT-like interfaces so that people can naturally interact with our records. We are also trying AI search for our National Personnel Records Center records, and for searching the knowledge base for our internal workforce. The second bucket is public-facing systems. Semantic search applies here as well, because we have lots of searches; at the last count, we have close to 400 or 500 million records in the National Archives Catalog. Again, we are trying to target how we can improve the first-page results there, because whatever people are trying to find, we want it to appear on the first page. We are also trying out different OCR capabilities, and metadata extraction to improve data quality. PII detection is another bucket: we do a lot of public release of documents, so we are aggressively trying out several different out-of-the-box tools and assessing them on questions like, what is the false positive rate, and what is the false negative rate? We would rather err on the side of false positives than end up missing any PII and accidentally releasing it. We are also trying to classify the data automatically, so that we know what kind of data we are dealing with before doing any work. And finally, I spoke about AI-assisted creation of record descriptions. This is about creating the metadata: populating the scope and content and the title, and identifying the entities. If you give it a document, it has to identify the topics, the people, and the locations, and add all of that metadata alongside your actual documents. If people search by entities, they will go immediately to the record. So that's something we are looking to do.
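The "err toward false positives" posture described here is easy to see in a minimal rule-based sketch. Real pipelines use far more sophisticated tools; the patterns below are deliberately loose illustrations, not NARA's actual detectors:

```python
import re

# Deliberately broad patterns: in a release-review workflow it is
# cheaper to over-flag (false positives a human can clear) than to
# miss PII (false negatives that get accidentally released).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b"),
    "email": re.compile(r"\b\S+@\S+\.\S+\b"),
    "phone": re.compile(r"\b(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b"),
}

def flag_pii(text):
    """Return {category: [matches]} for anything that even loosely
    resembles PII; a human reviewer makes the final release call."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits
```

Tuning here is exactly the trade-off the speaker names: widening a pattern raises the false positive rate and lowers the false negative rate, and for public release the second number is the one you minimize.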
And finally, all of this obviously happens through our workforce, and we are thinking about what we can do to empower them. We are a Google shop, so we are looking at how we can give those tools to our employees so that they can be more efficient in working on the agency's mission. That's all I have. Thank you very much. Well, thank you so much, Gulam, and thanks to all of you for the great presentations here today. I have a few questions for the group that I can start with, but we would also welcome any questions from the audience; there's a mic right there, and one over there too. If you have a question, I know some of you will have some great questions. While I ask the first one, think about whether you want to ask one. Okay. All right, I'll start with Abby, but anybody can answer this question as we go. Abby, what are the biggest challenges that agencies face when exploring and implementing AI and machine learning solutions in their agencies, and do you have any tips for overcoming those challenges? Your mic should be on. Sure. Well, I think everyone has already talked about it, but it's the data, the management of the data. The amount of effort it takes to ready data for AI is way more than it takes to just process it, so that is a big challenge. But if the agency already has a data management practice, that helps. And like I said before, these are not completely new processes; they are understandable. So supporting people in understanding how AI is slightly different from the many digital transformations that we've gone through already will help a lot.
But I think it comes down to focusing on the program of data: having practices and policies, having groups that talk about it internally and externally, and thinking through specifically what you need to capture about the data for AI. A lot of times that is documentation of what the process did to which data, so using things like model cards and data cover sheets, so you can track how data has been transformed over time, will help in the long run, especially with issues around transparency and authority and things like that. Does anybody else want to comment on the challenges that agencies have implementing these, and what tips you might have? That was great. Just a few points on that. I think it's going to depend on the agency; different agencies are going to face different challenges: large agencies, small agencies, those with different structures. These details matter in terms of how IT is organized and where your data officer sits. So I just want to emphasize again the partnership piece, and that the support of leadership is really important; those are factors as well. I've been talking to a lot of different agencies, and they've asked, how do I get started, or said, we don't have the same setup, or we don't have the same support. I think it comes down to having an idea, everything Abby said, of course, and in addition to that, knowing who to talk to, and just starting those conversations. Finally, I'll just add one more thing, and I think we all mentioned it: the ability to try things before they go live, what we call experimentation or piloting or prototyping, is really essential to make sure that the outcome or the output from these products is good enough.
You want to make sure there's a step for that before diving in. Those are great tips. Anything else? I would just say that one of the challenges for us, especially for folks who are just getting into this, is education, especially for the mission program people. Obviously, if you have a strong partnership with your data officers or CIOs, that's really helpful, but the learning curve is also a big challenge. How have you worked with leadership in your agencies on that learning curve? Have you seen leadership out in front, or have you had to help educate leaders? No, absolutely, leadership is very supportive. Through having the chief data officers and now also a chief AI officer, there's a lot of visibility and support for it, and that's super helpful. All right, we have a question from the audience. Hi, Nate Jones, FOIA director at the Washington Post. Happy Sunshine Week, and thanks for the great presentations. I have two questions. For Eric: can you talk a little more about the really interesting data you showed indicating that some material was posted 12 times? I know there are probably good reasons for it, maybe people requested it 12 times, but some examples would be helpful if you have them; it's just fascinating. And my second question, building on that, is maybe half challenge and half opportunity: the FOIA, as everyone here probably knows, requires in theory that if a record is requested three times or more, the document is supposed to be proactively posted. The great material you showed today confirms what I've seen anecdotally: people are not keeping track of whether things are requested three times or more. So maybe you could talk about the opportunity of using AI to make sure that that requirement of the FOIA is better fulfilled. Thanks again.
It's a great question. On the why 12 times, I'm going analog here, given our challenges earlier. There were six documents posted 12 times. I don't have the exact case data in front of me, but it could be that they were requested multiple times. At State, we embraced a release-to-one, release-to-all policy years ago, so I don't think we actually track whether we're getting a request three times, because it doesn't matter: if we release something and it's not Privacy Act material or personally identifiable information, we release it to the public. Having been through so many cases, I would guess that we probably had very similar requests, which just reinforces the value of the pilot: we probably released these things in different cases, and we probably released the same records to different requesters. I'm speculating here, but those are the initial thoughts off the top of my head. I think this tool will allow us to see what's been released, and we've already thought of a couple of new ways to update our public website to help guide people to records, or to the different places where reports and other materials are made public.

On the proactive disclosure and posting question, I think I've addressed a lot of the points there for State. I would just say that when we made this decision several years ago and I asked some of our employees whether they could track how many times we get the same request, that was looked at as another onerous task on top of processing requests. The feeling was: let's just keep moving, and let's start posting as much as we can to the public.

As far as identifying proactive disclosures, especially those that hit what we call the rule of three: agencies are striving to do that, and it is challenging.
We have different levels of best practices: using case management systems to tag and flag, and having systems on the front end to help identify repeat requests. But it's very difficult, because requests are described differently and records are processed differently based on what fits within a certain request. That's where Eric has shown the potential for machine learning to augment what we are already doing at a human level, and to really help bridge the gap for requesters and for agencies trying to better identify what hits the rule of three.

Can I add one more point here? Some requesters have very specific terms they want searched, and we want to honor that. I touched on this a little bit, but we know we may not use the same terms in the State Department that the requester uses. Sometimes there's this distrust, this thought that we're not trying to respond to the public. What was pretty neat about this technology is that we're going to be able to search those exact terms for everyone, but we're also going to be able to do the broader search, even if people weren't interested in it, to show: look, we had this here anyway. Right now we go through those specific terms, and you might not get a response, and then it seems like we're concealing something, when in fact, if the wording of the request had been ever so slightly different, you might actually find a lot of records, or there might be information out there already. So we're able to make associations we weren't previously able to make. And I can tell you, in seeing some of the other test cases, it was amazing. On something I had worked on very closely, I saw that someone who had never worked on it before could come in and learn right away. It's an incredible knowledge management tool.

Thank you. All right, another question? Sure.
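The rule-of-three tracking and near-duplicate-request problem discussed above can be sketched in a few lines: flag any document released to three or more distinct requesters, and detect request wordings that differ only slightly. The data, function names, and similarity threshold here are illustrative assumptions, not any agency's actual system.

```python
# Illustrative sketch: flag records that hit the "rule of three" for proactive
# posting, and catch near-duplicate request wordings that exact-term search
# would miss. All data and thresholds are hypothetical.
from collections import defaultdict
from difflib import SequenceMatcher

releases = [  # (document_id, requester) pairs from a release log
    ("cable-1987-0042", "alice"), ("cable-1987-0042", "bob"),
    ("cable-1987-0042", "carol"), ("report-2020-17", "alice"),
]

def rule_of_three_candidates(log, threshold=3):
    """Documents released to >= threshold distinct requesters."""
    requesters = defaultdict(set)
    for doc, who in log:
        requesters[doc].add(who)
    return sorted(d for d, r in requesters.items() if len(r) >= threshold)

def similar_requests(a: str, b: str, cutoff: float = 0.8) -> bool:
    """Near-duplicate wording check for slightly different request phrasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= cutoff

print(rule_of_three_candidates(releases))  # ['cable-1987-0042']
print(similar_requests("records about embassy cables",
                       "records on embassy cables"))  # True
```

A production system would of course need human review of the candidates; the point is only that counting distinct requesters per document, once releases are logged centrally, is the easy part that machine assistance can automate.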
Alex Howard. It's great to be back at the National Archives for Sunshine Week, so thank you for hosting us all here, thank you for your continued commitment to open government, and thank you to everyone on stage who's talked about their work.

It's a good time to offer our thanks to you.

I've been here in a number of incarnations over the years to ask questions, and I'll repeat a couple of them because I'd like to follow up. The specific one I have for the State Department is about your release-to-one, release-to-all policy. About seven or eight years ago, I came and asked this question of your predecessor, Bobby, about why that policy hadn't been put out, because the Obama administration had a pilot under OIP and said agencies should explore this prospect. So I'd ask the State Department, now that you've actually done it: have there been any downsides? When we had a robust national debate about whether agencies should be proactively disclosing the records they give to one requester to everybody, so that they can be searched and tagged, there was some back and forth about it, and then the policy sat on the shelf and never got issued. Has the State Department had any issues, and has OIP considered issuing that policy writ large now, based on State's experience?

In terms of downsides, I can only think of one. Overall I'm a big proponent of it; I was the one who put it in place. The downside is that we can't always keep up with demand. We do the best we can each month to post everything that's been released, and we've been fortunate to have a technology team in our organization that can do this work as we look at ever-growing volumes of information. I can see challenges, and as I mentioned with the proactive disclosure of the declassified cables, we want to get that information out, but it's a resource issue; it's timing. So I think the only downside is the added workload.
We're going to see increased demands, but overall, there's no downside to a proactive disclosure policy. In fact, I think even our employees appreciate it, because we're releasing information and it reduces redundancy, for lack of a better term.

And of course, we are and have been big proponents of proactive disclosures beyond the legal requirements, and we encourage every agency that can to adopt what the State Department has been doing. But when we did that pilot, there was a tension for some agencies. As Eric mentioned, there are varying capabilities and resources at the agencies, and it was difficult for some to do something like that. So we've been encouraging it and trying to help agencies solve some of those problems. Through the Chief FOIA Officers Council and the FOIA Advisory Committee, we've looked into the issue, which obviously includes the resources for making those records accessible under the Rehabilitation Act. I'm hopeful that the work we're doing right now with what I like to call the FOIA wizard is going to push that further: as the records become even more accessible, we see more of the value of it, and we get pushed, one step at a time, toward agencies being able to post not just what's legally required and what's identified as of public interest, but even more records.

Thank you. It's FOIA.gov slash wizard, so it's not just you describing it that way; it's in the URL structure, which will always be in the internet's memory. To follow up, and if someone gets in line behind me I'll walk away: there's an opportunity here for you all to lead in a way that maybe hasn't been there for a while, which is to offer proactive guidance on generative AI.
One of the things I loved about today's discussion is that you usefully separated out what came before the public imagination was excited by ChatGPT and the large language model movement, right? This idea that these things can actually generate pictures, generate text, transform something, see something that we can't, just as Facebook now shows us the expressions of people in pictures and identifies them for us. That's getting into creepy areas, but obviously for something like the National Archives, having technology recognize what's in a record and then put metadata next to it could be transformative. Declassification is a great example, and I really appreciate that you brought up the example of the cables. Six years ago, we might have been talking about the disclosure of the former secretary's emails, right? Which is a great example of releasing to all. As you look at this next year, when the public imagination is out there and agencies are rushing forward, not just on the machine learning that was documented here at NARA but on the generative AI side: is the Department of Justice considering issuing guidance on the responsible use of it? I know your contemporaries down the street at the House of Representatives have put out guidance telling offices, we want you to experiment, but we don't want you to do it with sensitive data. The idea is to create sandboxes for agencies to experiment without running into some of the issues of generative AI that we've seen in other contexts, which look, frankly, somewhat hallucinatory, right? These tools cite papers that don't exist and create historical anachronisms when asked to describe things, which could be detrimental to public trust in government if official government channels were creating those kinds of synthetic data or synthetic media.
Will DOJ put out guidance this year on responsible use of AI, and will you at least say that all uses of AI should be disclosed, not just on AI.gov but on FOIA.gov, so that everyone in this community can understand where it's being used, how, and for what, and understand where it's not yet and why?

That definitely goes beyond what we do at OIP, but I know the Department is very supportive of responsible usage of AI. I can't speak to it for the Department; our office is responsible specifically for the Freedom of Information Act, and where we are going to use AI and machine learning, obviously we'll do it in the most responsible way along those lines. But as far as more global use of AI, that wouldn't be us. I'm sorry, I understood your question to be about AI generally and not FOIA, but we can happily connect offline and I can help you reach the right people.

All right. Thank you, Alex. And we have another question up at the mic.

Hi, I have a related question. To the extent that y'all have already deployed AI-supported or AI-augmented tools, or are considering deploying them to the public, what have your agencies considered about, I don't know a better word than proactive disclosure, not your kind of proactive disclosure, but telling the user up front: hey, this is supported and augmented by AI? Like a disclaimer or something. Is that something you've considered, particularly as you roll out more complex FOIA solutions?

Yes, we definitely wanted to be transparent, especially about how our tool works, that it's using AI and how it's using AI. We wanted to have that prominently linked and give the requester the opportunity to go to that page during their journey. So we definitely believe in being very transparent about that.
At the Department, this is one of the things I get asked about. I'm not responsible for artificial intelligence or data; I'm not the chief data officer or the chief AI official. But we do have an agency-wide governance structure. There are the executive orders that were mentioned here today. We have a public policy: you can go to fam.state.gov. FAM is the Foreign Affairs Manual, our central policy repository, and if you look under 20 FAM, we have a section on data and AI. So we have a governance framework, a publicly available policy on AI, and enterprise data and AI strategies, also available on state.gov. That covers governance, policy, and documents on what we're doing strategically. And of course, there's the AI inventory. I think there's nuance in your question about which tools are going to be using AI and so forth, and I'm not the right person to answer that, but there may be some information in those resources I just mentioned.

I can speak to this a little more generally, about how to embrace transparency throughout all the many layers of AI. There's definitely communicating to the end user whether AI has been used. You can also communicate confidence levels, if the models allow it: this was produced by AI, and we have, say, 70 or 80 percent confidence in it. Publicly documenting exactly how the data was transformed by AI is also really important. That may not be for all users, but being able to trace through how data was created matters, because it gets at very important questions around authenticity. For libraries and archives, our whole business is authentic records: keeping them authentic and keeping trust with the public. It's something we think very carefully about, and that's why we're so focused on the quality of the data.
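The confidence-level disclosure described above could be as simple as attaching a label and disclaimer to each AI-produced item before a requester sees it. This is a hypothetical sketch; the thresholds, field names, and wording are illustrative assumptions, not any agency's interface.

```python
# Illustrative sketch: package AI output with an up-front disclosure and a
# plain-language confidence label. Thresholds and wording are hypothetical.
def label_ai_output(text: str, confidence: float) -> dict:
    """Attach disclosure metadata to one machine-produced result."""
    if confidence >= 0.9:
        level = "high"
    elif confidence >= 0.7:
        level = "moderate"
    else:
        level = "low - human review recommended"
    return {
        "text": text,
        "ai_generated": True,  # always disclosed, per the transparency point
        "confidence": round(confidence, 2),
        "confidence_label": level,
        "disclaimer": "This result was produced with AI assistance.",
    }

result = label_ai_output("Suggested records for your request ...", confidence=0.78)
print(result["confidence_label"])  # moderate
```

The design choice worth noting is that the disclosure travels with the output itself, so any downstream display (a website, a response letter) can surface it without separate bookkeeping.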
If we were going to use these methods to create our canonical, authentic data that is available nowhere else, we would have to be extremely careful about doing so and hold ourselves to a very high standard. That's not all use cases, though; there are other use cases around search and discovery that I think are much less risky to dive into with the current set of tools. But yes, I think it's an important question for FOIA and for the creation of records in our organizations.

I just have one comment about generative AI and whether guidance is being issued, at least on the technology side. What we're seeing in the industry is a trend toward something called explainable AI, because when AI reaches some kind of conclusion, you want to have a backup for saying, OK, how was this conclusion reached? And another thing, for my agency's leadership: we are thinking about the authenticity of documents and how technology can help us. Should we watermark generated digital content, or should we somehow sign our actual records themselves, so that you can verify this is an actual record versus a record that was pulled out of thin air?

Yeah, I think that's a good point. The other piece you sometimes lose track of with generative AI is that organizations that receive requests or content from the public will see these tools used to create much, much more of that content, and then we're going to have to manage it and deal with it. The scale that I see coming from all these generative tools toward our agencies is going to be huge. So we really do have to figure out how to use AI well so that we can manage all that data.
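The record-signing idea raised above can be sketched with a standard keyed hash from the Python standard library: the agency signs each record's exact bytes at creation, and later verification fails if even one byte changed. This is a minimal illustration under stated assumptions, not any agency's actual scheme; a real system would likely use public-key signatures so verification doesn't require holding the secret key.

```python
# Minimal sketch of signing a record so its authenticity can be verified later.
# HMAC keeps the sketch short; production systems would prefer asymmetric
# signatures so the public can verify without the secret key.
import hashlib
import hmac

SECRET_KEY = b"agency-signing-key"  # hypothetical; would live in a key vault

def sign_record(record: bytes) -> str:
    """Return a hex signature binding the record's exact bytes."""
    return hmac.new(SECRET_KEY, record, hashlib.sha256).hexdigest()

def verify_record(record: bytes, signature: str) -> bool:
    """True only if the record is byte-for-byte what was originally signed."""
    return hmac.compare_digest(sign_record(record), signature)

cable = b"DECLASSIFIED CABLE: example record body"
tag = sign_record(cable)
print(verify_record(cable, tag))                 # True
print(verify_record(cable + b" tampered", tag))  # False
```

This is the "sign our actual records" half of the question; watermarking generated content is a separate, harder problem, since a watermark must survive format conversions that a byte-exact signature deliberately does not.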
I would just add that that's why we need to start thinking about innovation, not just about requests for information coming in, but about what we can start proactively disclosing as we generate and manage new data and records, not just in very specific, focused instances, but regularly. That's something we've been discussing. I mentioned this chart before: you see this growth of data and information, and there's no way, at the current resourcing level and with our current processes, that what we're doing will ever keep up. So to your point, Abby, we need to partner with this technology, and people are a key part of it, as you emphasized and as I think all of us touched on. That's critical to success moving forward.

Well, thank you so much to this panel; thank you for all of your wisdom and everything that you've shared today, and thanks to the folks who asked questions. It was very helpful. I will now invite Alina back up to the podium.

Thank you, Pam. Please join me in another round of applause for our panelists, and in particular for Pam, for doing a great job as moderator and getting this group together. For those of you who are here in person, you get an extra treat. Our archivists were hard at work during the panel presentation to prepare for display some original documents from NARA's holdings. Those include the oaths of allegiance at Valley Forge, the original Federal Records Act (I'm geeking out about this too), the original FOIA statute, the E-FOIA amendments of 1996, the Supreme Court decision in Department of Justice v. Landano, and the OPEN Government Act of 2007, which of course created our office, OGIS. The documents will be on display as you exit the theater. You may also wander up to the Rotunda, as our Deputy Archivist Jay Bosanko mentioned earlier.
There you may view the Constitution, the Bill of Rights, and the Declaration of Independence before you exit to the Constitution Avenue side. I just want to take a minute to thank my terrific OGIS staff for supporting this annual event, and many, many thanks to Maureen McDonald, Marilee Harris, Grace McCaffrey, and Trevor Plant for all of their help in making this event a success, including the terrific document display waiting for you outside. Thank you to everyone again for joining us today, and happy Sunshine Week.