I'm very pleased to be with you virtually, although I regret that I cannot be there in person as initially planned. I was able to give this presentation at DrupalCon Portland earlier this year, and it was great to be back at my first conference since DrupalCon Seattle in 2019. I do hope that I will be able to join DrupalCon Europe in person fairly soon, but for now, I am very happy to have you here virtually as I share some tips for auditing PDFs for 508 compliance at scale as part of a website migration to Drupal.
Before we continue, I would love to share some information about myself. My name is Lauren Mateo, and I am a service designer for a government contracting firm called Steampunk, which is based in the Washington, D.C. metro area. Steampunk is a human-centered design firm, and we build technical and design solutions for federal government clients. I have spent the majority of my time at Steampunk working with the Department of Agriculture on various projects. I am also writing a book on designing data governance in the context of DataOps, and that book should be in beta with Pragmatic Bookshelf this autumn. If you'd like to speak more about that book, I would love to talk to you about it and show you where you can pre-order if interested. I was also a correspondent for opensource.com for two and a half years, and I've contributed several articles to the site. Overall, I have 10 years in technology and four years contributing to open source in both voluntary and professional capacities.
So, as mentioned, I'm here today to share some advice that I gained from a Drupal project that I was put on at work. The project is still ongoing, even though I'm no longer part of it, so my client's identity will be kept private, but my team and I did learn some valuable lessons about auditing content at scale in order to prep for a Drupal migration.
Specifically, we learned how crucial it is to design Drupal sites and content for 508 compliance from the start, even when that's not always easy due to the content format. So, I'd love to share some insights into this project and what my team and I learned as part of this presentation today.
You can see on the slide the first key steps that we needed to take. We needed to start by auditing the content and information architecture for our client's .gov website, which was built in PHP. We also needed to prepare all of this content for migration to Drupal 9, which was going to be our client's first-ever content management system. We needed to ensure that all content meets 508 requirements once it's in Drupal, and we needed to account for all of the content on their current site as it existed in both HTML and PDF format.
I mentioned 508 compliance on my previous slide, and if you're not familiar with it, you might be wondering what it is. 508 compliance refers to regulation that requires the U.S. government to make its digital content accessible to everyone. That means that someone who is deaf should not be shut out of a meeting recording online just because they can't hear, and likewise someone who is blind should be able to get the information they need from a website no matter which format that information is in, because not every content format online is accessible to someone who is blind or has impaired vision. This is some formal language around 508 compliance. It says that under Section 508, agencies must give disabled employees and members of the public access to information comparable to the access available to others.
Now, you might be thinking that this is a niche topic or that it is very specific to the United States. While the 508 regulation itself is specific to the United States, the issue of accessibility and allowing everyone to access government sites is not niche, because according to the World Bank, one billion people globally have at least one disability. That equals 15% of the total population, and 508 compliance is a way to ensure that all of those citizens can access the internet. The disability spectrum is very broad, as I'm sure you know. I have one or two disabilities myself. For this presentation, we will discuss accessibility in the context of screen readers and making sites accessible to those who are deaf, have hearing loss, and/or are blind. That's because people with these particular disabilities often rely on assistive technologies like screen readers to access digital products and services, and so we're talking today about how we audited content for accessibility in order to prepare it for use by these assistive technologies. And when we think about accessibility more broadly, I really like this quote from Sir Tim Berners-Lee, where he says that the power of the web is in its universality. Access by everyone, regardless of disability, is an essential aspect.
So you already know what the assignment was for my team and me regarding this to-be Drupal site, but once my design team dug into the website's content, we quickly saw what a challenge this project was going to be. Historically, our client's primary way to share information was through PDFs, and it had been that way for decades. That's a big barrier to 508 compliance, which I'll elaborate on soon. The site also had zero existing hierarchies, including no taxonomies or organized information architecture. So they had a ton of really valuable, insightful data and content with no coherent way for users to find it.
They had a total of 5,000 PHP web pages and 52,000 PDFs, and as mentioned, no hierarchies in terms of how content was organized in their PHP site. That meant that people were really led into a labyrinth to find even the simplest document. This got more complicated when I realized that 508 compliance for HTML is different than for PDFs. You saw on the prior slide that this website had several thousand PHP web pages, but it had many more PDFs. That means that there was a lot of content nested within these 52,000-plus PDFs, and we had to find a way to make it accessible.
508 compliance is very different for content in these two formats. HTML is the expected 508 compliant format for screen readers. It's also quicker to update than other formats. It's responsive across desktop, tablet, small tablet, and mobile. And it's better for multimedia than embedding that multimedia into PDFs or PowerPoints. By contrast, PDFs are the hardest format for 508 compliance. They are difficult to update. If you've had to update a PDF yourself, you know that it is not as simple as you might like it to be. PDFs are not easy to make responsive on alternative devices like tablets and phones. And as mentioned, they are not the expected default format for screen readers.
I noted here that 508 is difficult to achieve with PDFs in part because they are difficult to update. That means that if PDFs are not designed up front to meet 508 compliance standards, making these PDFs compliant retroactively is a challenge. It's a challenge even if you have one PDF. Now imagine trying to do it at scale with several thousand. It quickly becomes impossible. You would need a team of several people doing that work manually, full time, unless you can come up with some technical solutions like parsing. So what did my team and I do to make this information accessible?
I mentioned earlier that this project is still in progress, so we are still a ways away from standing up their new Drupal website in their environment. But we did complete some key actions in phase one of this project, which was design and discovery, to help prepare that content for compliance. My team and I started by making an as-is information architecture map in Mural. That's M-U-R-A-L. There are other design tools that you can use. Miro, M-I-R-O, is another very popular one. My team and I use Mural very often. The first thing we did was look at their information architecture along with what we wanted it to look like moving forward. So we made both an as-is information architecture map showing how their website is currently organized and a to-be information architecture map in collaboration with our stakeholders.
On that note, we hosted a card sorting workshop and a stakeholder mapping workshop with our clients. This allowed them to assess where they thought information on their website should logically be. It also gave us the opportunity to sit down with them early, confirm who the key stakeholder groups were that we needed to engage in phase one of this project, and map out a timeline for when we would engage them. We learned that this was very important because prior contractors had not engaged some of the right user groups or stakeholder groups when they should have as part of their process. As a result, the contract came to a halt. So it was very important that we talked to our clients early on to figure out who we needed to prioritize in our user interviews and usability testing. We also collected data from secondary sources that our client had been using, like Siteimprove and Google Analytics.
This was very helpful to me as a new designer on the project because I was able to go in, quickly get some statistics about use of the website, and then use that to inform the user interviews and types of users that I pursued for various design efforts. On that note, we did conduct several user interviews along with usability testing on wireframes and mockups for the new site. Crucially, we chose to conduct usability testing with several users on the wireframes themselves before we had invested any technical resources in those wireframes. We wanted to validate that our designs were intuitive and could stand on their own before we involved our developers and ran the risk of incurring technical debt. So we decided to test early and often, and that's something that we still plan to keep doing throughout this content migration.
We also prioritized participatory design, and that is the aspect of this project that I am personally most proud of as it pertains to 508 compliance. Participatory design is a method of practicing design that emphasizes co-creation over the typical designer-user relationship, which can be inherently unequal in some ways. Participatory design actively involves all stakeholders in the design process. It engages users and business champions alike. It prioritizes processes and procedures, not styles. We often think of design as being quote-unquote "make it pretty" work, and the reality is it's so much more than that. It also gives everyone a vested interest in a product's success by creating a core group of invested super users early on. You can engage users not just in interviews: show them wireframes and mockups for feedback, include them in usability testing sessions, and include them even in user acceptance testing if they are within the organization and have influence over the requirements.
I have found that the earlier and more often you involve the key stakeholders whom you're designing for, the more invested they are in helping you reach your own goals. That's really important when we talk about participatory design in the realm of 508 compliance, because I wanted to make sure that we interviewed users who were deaf, blind, and hard of hearing. To that end, when I was looking for users to interview, I did not only interview two users, one of whom was deafblind and the other of whom was hard of hearing and had had hearing loss throughout her life. I also selected interviewees who had experience themselves auditing digital products and websites for accessibility. It meant that we were not only prioritizing users of our website who have disabilities and accessibility needs themselves; they were also experts at auditing digital products and services to ensure that they served people like themselves. They had experience that was more vast than anything I had, so it was really crucial that we engage these users early on to learn from them.
To that end, we conducted both user interviews and usability testing sessions with them. My user who was deafblind had an interpreter on our call. This was all done during the COVID-19 pandemic, so all of my user interviews had to be virtual via video. When she joined the call, she had an interpreter on with her, and the interpreter would interpret my interview questions for her. She would then give me answers using American Sign Language, which her interpreter would explain for me. We also wanted to make sure that we were not just interviewing these users for their own sake.
As a service designer, I always want to tie every user interview that I conduct back to a prioritized design and/or development decision, so we wanted to ensure that the rest of our stakeholder group and team was hearing what these users had to say so that we could design a 508 compliant site for Drupal moving forward. As a result of our participatory design with folks who have accessibility needs, we presented their recommendations at sprint demos. We work on a bi-weekly cadence where we track all of our work in Jira according to two-week sprints, and at the end of each sprint we present our work to the client to show what we did. So in cases where user interviews were conducted during sprints, I did present those results and recommendations at sprint demos.
We also prioritized their feedback on usability in Adobe XD and in Jira. As mentioned, we used Jira to track our stories and epics and all of the tasks associated with those epics, so we wanted to ensure that we were really capturing what these users had to say in order to design a more accessible site. On that note, I as the service designer was leading the user interviews and usability testing sessions, but I always had one of my fellow designers on the call with me to take notes and listen. That person was most often a UX designer, so they were in the sessions with me hearing the feedback on both the current site as is and eventually on our wireframes. That allowed them to make quick changes in Adobe XD as needed in order to make their proposed designs more accessible.
We also, as a team, have committed to 508 testing pre-production. We have a 508 liaison at our client site, and we also have a quality assurance team of several people whose job it is to audit everything that we push to Drupal for 508 before it gets released to the public. And long term, as part of my client's digital transformation, they're committed to changing the content delivery experience.
As mentioned, they have shared their information in PDF format for decades. There are several reasons why they chose that format to begin with, but the reality is that in today's digital era, for many reasons including 508 compliance, that is no longer tenable. So they are seriously invested in figuring out how they can adapt the content in their PDFs to be more of a self-service business intelligence model, where people can ideally go to the website, enter the data points that they need, and get that information back.
That's all well and good for the future. They do have an idea in mind for how to make their content accessible in Drupal moving forward, but you might recall from the beginning of this presentation that we still had a lot of information nested within 52,000 PDFs. You might be wondering how we solved that problem: how did we go in and parse all of the content within those PDFs to ensure that it is compliant on the new Drupal site and that, if it is not shown in the standard PDF format, we make it compliant in some other way? Again, these are just ideas, because the project is still ongoing and we are knee-deep in the content migration as I speak, but we are exploring various ways to make the content in these PDFs accessible to more users.
The first way is that we have a team member on my side who is devoted to parsing these PDFs. He has written some scripts that are able to pull information out of these PDFs in order to explain what's in there. This is very important because my team and I found, when we were trying to audit this content, that sometimes the title of the PDF wouldn't be an exact match for the type of content within it. There were a lot of issues with lack of standardization.
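The team's actual parsing scripts were not shown in this talk, so what follows is only a hypothetical sketch of the title-versus-content check described above. It assumes the text of each PDF has already been extracted by some other tool, and it simply measures how many words from a document's title also show up in its body. The function names, the stopword list, and the 0.5 threshold are all illustrative assumptions, not anything from the project.

```python
import re

# Illustrative stopword list (an assumption); tune it for the real corpus.
STOPWORDS = {"the", "and", "of", "for", "a", "an", "to", "in", "on", "pdf"}


def tokens(text: str) -> list[str]:
    """Lowercase alphabetic words, minus stopwords and words of 1-2 letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def title_coverage(title: str, body_text: str) -> float:
    """Fraction of distinct title words that also appear in the body text."""
    title_words = set(tokens(title))
    if not title_words:
        return 0.0
    body_words = set(tokens(body_text))
    return len(title_words & body_words) / len(title_words)


def flag_mismatches(documents: dict[str, str], threshold: float = 0.5) -> list[str]:
    """Return the titles whose coverage is below the threshold, sorted.

    `documents` maps a title (or filename) to already-extracted body text.
    """
    return sorted(
        title
        for title, body in documents.items()
        if title_coverage(title, body) < threshold
    )
```

A check like this would not fix anything on its own. It would only produce a short list of documents for a person to review, which matters when there are tens of thousands of files and no consistent naming.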
I mentioned earlier that the website had no real hierarchy to speak of. Similarly, with the PDFs, every disseminator of those PDFs had been doing it their own way, and there was no standardization. As you know, when you're trying to automate, you really need standardization. If the tools and scripts that you're using cannot find patterns, it's going to be very difficult to automate. So we have somebody who is committed to PDF parsing on at least a part-time basis, but the reality is that this challenge goes beyond the migration.
As mentioned, we need to figure out how to make the content in those PDFs accessible once we move over to Drupal, and we came up with a few key ways to do that in partnership with the users whom we spoke to. One suggestion is making a PDF repository. Part of the challenge on our client's current site is that if you go to any page on the site, you might be taken off of the web page to a PDF, and there's no icon or any notification that you are not only going to leave the web page but that you are going into a new file format. If you make a PDF repository, at least the user knows that this is where they find content of a certain type, and they can access it that way.
One suggestion that I found very valuable from user interviews with folks who have accessibility needs is to provide additional Word and text versions per PDF on the site and to add icons denoting that they are Word and text files. It is much easier for screen readers and assistive technology to read content in these file formats, so providing additional versions alongside a PDF is a way to advance accessibility.
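The problem described above, links that send a user to a PDF with no warning, can be audited mechanically. As a minimal sketch, and not something the project team is known to have used, the snippet below scans a page's HTML with Python's standard `html.parser` and lists every link whose target looks like a non-HTML file, so those links can be given an icon or label. The extension list and the names `FileLinkFinder` and `find_file_links` are assumptions for illustration.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse

# Extensions treated as "not a web page" (an assumption; extend as needed).
NON_HTML_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".txt", ".xls", ".xlsx", ".ppt", ".pptx",
}


class FileLinkFinder(HTMLParser):
    """Collect (href, extension) pairs for links that point at non-HTML files."""

    def __init__(self) -> None:
        super().__init__()
        self.file_links: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Look only at the URL path so query strings do not hide the extension.
        extension = posixpath.splitext(urlparse(href).path)[1].lower()
        if extension in NON_HTML_EXTENSIONS:
            self.file_links.append((href, extension))


def find_file_links(html: str) -> list[tuple[str, str]]:
    """Return every non-HTML file link found in the given HTML, in page order."""
    finder = FileLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.file_links
```

Run over each page of a site, the output gives a list of places where an icon, a file-type label, or an alternative Word or text version is still missing. It only detects files by extension, so links that serve a PDF from an extensionless URL would need a different check.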
As mentioned, we do want to add icons that denote any non-HTML content, especially PDFs, so that users know when they are not going to be engaging with a standard web page or when they are going to need additional help accessing a different type of content. We also want to give users contact details for someone at the organization who can provide them accessible content upon request. This should be a last resort. We really want to make the website more self-service so that people can navigate it more easily and find the information they need without additional help. Having said that, if all of these options fail for some reason, we do want to share the contact information for somebody at our client site who could give them an accessible version of the information in the PDF they requested.
So as we come to the end and think about lessons learned from this project, on my end I learned a lot about the importance of designing for 508 compliance at the beginning. I think we've all walked into projects that had an enormous amount of technical debt, and that technical debt extends to content as well. So when I think about advice for optimizing your content in Drupal for 508 compliance, I learned a few key lessons. The first is to design for 508 from the start. On that note, always consider the best format for new content. It can be very easy to get into a habit of creating a type of content in one particular format without investigating whether that format is the best medium. You also want to prioritize responsive design with easy editing access. This allows you to make changes on the fly and keep iterating.
You want to practice participatory design. You should always be thinking about people who might have challenges accessing your website and not making assumptions about what they need. That is really crucial, and I think that's the biggest benefit of participatory design: it involves not only co-creating new solutions with users but giving them the autonomy to explain what's best and what they need. I think designing with assumptions is really dangerous, and that's why I never create design assets like personas, journey maps, or service blueprints without extensive user interviews. You cannot make assumptions, or else that is how you will incur technical debt in production environments.
You also want to test for 508 compliance pre-production. Ideally, you should have somebody who can serve in a quality assurance role on your team so that they can test all of your designs, your fonts, your file formats, everything, before you publish that content on your website. And you do want to perform regular audits to ensure compliance. Even once your Drupal website is live and on the web, you want to make sure that you're using a site crawler like Siteimprove to check that your content is accessible, that it meets 508 compliance guidelines, and that even somebody who has vision or hearing challenges is able to access it.
Perhaps the biggest takeaway I learned from this client engagement is that retroactively making PDFs 508 compliant is hard, and doing it at scale is almost impossible. It requires a lot of strategic thinking, and it requires patterns. As mentioned, the big challenge that we've faced is that there is no real rhyme or reason to the information architecture on the PHP site, and the PDFs don't really have any sequential logic or order to them either. That made it really challenging to automate the practice of parsing these PDFs and the content within them. We're making it work, but it has taken a lot of creative thinking to make that happen. Whereas if you design PDFs to be compliant from the start, you can save yourself truly endless hours of work down the line, because eventually new standards will come into place, they will catch up with you, and if you are not prepared, that can be a lot of time and money invested in something that you could have just done to begin with.
Now, as mentioned, sometimes PDFs are the best format for your content, and there are ways to make them accessible from the start. The tips that I'm going to share with you on these final slides come from Indiana University, and they discuss some key ways to make PDFs accessible to screen readers using Adobe Acrobat Pro. The first is that if you want to convert text that is an image to selectable text, you would go to Tools > Text Recognition and then select In This File. You can also add tags to indicate heading structure. To do that, you would go to Tools > Accessibility > Add Tags to Document. You can and should add alt text to images. This is very important because, for somebody who has vision impairments and cannot see an image on a website, alt text will explain to them what is in the image. I've been adding alt text to images on the Drupal site in our production environment. To do that for a PDF, you would go to Tools > Protect & Standardize > Accessibility > Set Alternate Text. You can also set the reading order. Reading order tells screen readers which order they should read content in so that users can make sense of what they're reading. To set the reading order, you can go to Tools > Accessibility > Touch Up Reading Order. And finally, if you want to set the language, you can go to File > Properties > Advanced > Reading Options.
These principles are not inclusive of everything that you might need to do to make PDFs accessible, but I do hope that they help you get a stronger sense of which content types are most accessible in Drupal, of designing new content for Drupal with accessibility and 508 in mind, and of making sure that if you do use PDFs on Drupal, you practice accessibility from the beginning. It should never be an afterthought. It should be baked into everything you do. And if you practice participatory design from the outset, you can build an exceptional Drupal site that includes all users.
So thank you very much for being here. As mentioned, this is a recording, but I am online right now to answer Q&A, so feel free to ask any questions, and thank you again for being here.