And hello, this is OEG Live, this is Alan Levine. I am your studio host, who is managing to mess up everything technically this morning. But this isn't about me. I've got a great group of folks and colleagues that we're gonna bring on stage right away. And this is our attempt to sort of have some interesting open conversations. So there is no outline for this. There's no script. There's kind of a topic. And so we hope you're interested in talking about audio, OER, and possibly some artificial intelligence. And we're gonna steer clear of all the stuff of, oh my God, ChatGPT. We're gonna talk about just maybe how it might come into play for doing some of this interesting audio work. So I'm gonna call on some folks to say hello. Let us know where you're from and just what your interest is, perhaps, in audio or AI. And I'm really excited, because I wanna start with Amanda Gray, because she was actually the person who started this off with a conversation, with just a simple question, in OEG Connect. So thank you so much, Amanda, for coming on here. And maybe let us know what you had in mind when you asked the question. Hello, my name's Amanda Gray. I'm an open education strategist at Kwantlen Polytechnic University. A large part of my role here is to support faculty to publish and create open educational resources. And over the past several months, I've gotten a couple of requests from faculty who are interested in taking an OER that they have already created but adding an audio component for added accessibility. And I had a couple ideas of my own of how that could work, but they all felt very inelegant or not as efficient. So I reached out to OEG Global to see if anybody else had any ideas of how to do an audiobook version of an OER, and there were a lot of great ideas in there. So I'm here to listen and learn and find out what the options are. Fantastic, thank you, Amanda. And thank you, because I invited you like two days before this. So I really appreciate that.
And next I'm gonna go over to someone who's done a lot of OER audio textbook work, Brian Barrick, who I got to meet through his CC ECHO project. Hello, Brian, and maybe tell us briefly if you can, but you've done fantastic work in this area. Good morning, Alan. Thank you, I appreciate the opportunity to be here. My name is Brian Barrick. I teach political science at Los Angeles Harbor College. We're a community college in the South Bay region of Los Angeles, just next to the LA port. Last year I was awarded a grant to work on an OER audiobook. So we have adapted the OpenStax American Government, third edition text. It's one of the most popular OERs for political science. And working together collaboratively with a student from my college, we were able to produce a full, openly licensed, 30-hour audiobook. It's available on podcast streaming platforms as well as YouTube. To date, we've had about 16,000 downloads of the podcast, which is also built into some Canvas options for other instructors who'd like to incorporate it. And we've had about 20,000 views on YouTube as well. So the project taught me a lot about audio, and Alan, you mentioned AI earlier. This is something that's also been very interesting to me as well. Just last Friday I offered a workshop here for some of our faculty about how we might be able to incorporate some of the emerging artificial intelligence tools into our workflow as educators. And obviously there's a lot of promise. There's a lot of things that we're also trying to figure out in terms of ethics and plagiarism and all of that as well. But I do think that AI will play some role in other types of learning, like for auditory learners, and in other strategies as well. So really glad to have this opportunity to be here this morning. Thank you so much for inviting me. And definitely thank you so much. And I appreciate it. And your work there was kind of traditional. You did the whole recording. You had a student work with you, I think, which is incredible.
But now you're looking at maybe some other ways to do it with these other new tools. And I also turn to someone who's got so much experience with this, Delmar Larsen from LibreTexts, who has been doing some amazing work too with making their entire library available through machine translation. And Delmar mentioned that he's dabbled in this some in the past. So I'm putting you on the spot, Delmar. Just let us know where you stand on this and what you're thinking about how to go about this. Well, first, thank you very much for giving me the opportunity to be here. It's always a pleasure to take a break from preparing for my lecture in the next hour. I'm a professor of chemistry at the University of California, Davis, and also the founder and director of the LibreTexts project. We've looked at audio for multiple years as a mechanism to be able to get content out there to as many people as possible. But one of the issues that we had with the way that we were doing audio, in terms of doing professional recordings like Brian did, is the curation aspect. And this underlies my general perspective on OER in general, in that we want to be able to create an environment that's dynamic, where you're able to update content, because content needs to be continually updated. And that is relatively easy on text-based infrastructures, but with audio, it requires a bit more effort to find, splice, cut, and put everything together, which is all a sign of the Herculean effort that Brian and his student put in. But that static aspect has been one of the reasons why I haven't been fully engaged in terms of adding audio capabilities to our pages.
I favored a more dynamic infrastructure, and thankfully as AI has started to develop, especially in the last few years, it is providing the opportunity to have a dynamic reading of the content on the page as a more viable solution, one that resonates with the way that LibreTexts operates and how I believe OER in general should operate: as a dynamic infrastructure instead of a static infrastructure. That being said, OpenStax is a more traditional versioning-based infrastructure, so I can understand the utility of what you're doing with that. And as far as AI goes, like what Brian and what Alan had mentioned, we've been pursuing machine translation of content in order to be able to get our content out to non-native English speakers, and that naturally relies on AI and training models. And that's been particularly successful over the last half a year as we start to expand our approach. We translated our corpus, or a subset of our corpus, about 150,000 pages, into Spanish and into Ukrainian. Ukrainian was to address displaced students in the war, and Spanish was meant to be able to address a large population. And the Spanish translation of our corpus has grown linearly, up to about 600,000 page views per week, which is a 17-fold increase over the native Spanish content that we have. And it's showing no sign of actually saturating. So we're excited about where AI is gonna be able to take us in regards to translation efforts. All right, thank you so much for being here. And someone else I thought to invite because, well, she's smart, but she's also done a lot of podcasting, is Brenna Clark Gray from Thompson Rivers University, who is in touch with a lot of faculty too. So I think there's some things to think about in terms of audio, in terms of making content accessible, but there's also some flexibility that makes audio really versatile, which I learned from some of Brian's story. So what are some of your thoughts on this, Brenna? Yeah, thanks for having me.
So my name is Brenna Clark Gray and I'm coordinator of educational technologies at Thompson Rivers University, as Alan said, joining you from beautiful Tk'emlúps te Secwépemc territory. I always have to look out the window when I say that; it's beautiful blue skies here today. Audio has been big at TRU for some time. We're increasing our capacity in podcasting in particular, which is a specific scholarly interest that I have, but I really wanna give credit to Professor Paul Simpson in our trades program, who is using audio as a way of making trades OER more accessible to learners in his area, which is piping trades, but that's something that's being taken up by trades across the board. So we're increasingly seeing requests. We use Moodle as our learning management system. We're increasingly seeing requests to build audio versions of content right into Moodle, so that students can make choices about how they want to work with material. I really like thinking about audiobooks and podcasting in sort of the same vein. Using the technology of podcasting to serialize audiobooks can, I think, be a really great and maybe very easy, straightforward, readily available option for lots of folks, and I had the pleasure of being involved in Clint Lalonde's 25 Years of Ed Tech project, which was a serialization of Martin Weller's book that, again, used podcasting as the technology to put each episode out. One of the things I like is the responsive and flexible nature of podcasting, so with the 25 Years project we serialized each chapter of the book. We also had little response chapters where people could get together and talk about and take apart the content from those chapters, which turned it from an audiobook, I think, into an OER, an ed tech OER. So yeah, I'm really interested in this from the perspective of expanding what podcasting can do in educational spaces. So yeah, I'm excited to talk with everyone today. Awesome.
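Mechanically, the "podcasting as serialization" idea Brenna describes comes down to publishing an RSS 2.0 feed whose items point at chapter MP3s via enclosure tags, which is what podcast apps subscribe to. Here is a minimal sketch using only the Python standard library; all titles, URLs, and file sizes are hypothetical placeholders, not from any real project mentioned here.

```python
# Minimal sketch: publish audiobook chapter MP3s as a podcast RSS 2.0 feed.
# Every title, URL, and byte size below is a hypothetical placeholder.
import xml.etree.ElementTree as ET

def build_podcast_feed(title, link, description, episodes):
    """Build an RSS 2.0 feed string; each episode dict needs title, url, size."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # The <enclosure> element is how podcast apps locate the audio file.
        ET.SubElement(item, "enclosure", url=ep["url"],
                      length=str(ep["size"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")

episodes = [
    {"title": "Chapter 1: What is Open?",
     "url": "https://example.org/audio/ch01.mp3", "size": 12345678},
    {"title": "Chapter 2: Licensing Basics",
     "url": "https://example.org/audio/ch02.mp3", "size": 23456789},
]
feed = build_podcast_feed(
    "An OER Audiobook", "https://example.org",
    "A serialized audiobook edition of an open textbook.", episodes)
```

A real feed for podcast directories would also want per-item dates, GUIDs, and the iTunes namespace tags, but the enclosure structure above is the core of serializing chapters as episodes.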
And someone else, a good friend who also has done this extensively, I'm pleased that Jonathan Poritz was available to tune in from Italy. And Jonathan, well, I'll let him describe it, but he's been involved with the Creative Commons Certificate program and, I don't know, did you take it on your own to create it as an audiobook, and why did you do that, Jonathan? Hi everybody, so I'm Jonathan Poritz. I taught math at many different universities in, I think, six time zones, for a hundred thousand years, base two. And then I burned it all down and moved to Italy a few months ago. And I'm working, sort of doing kind of gig work, in the open world. And I guess my feeling on audio really has a lot to do with when I was a professor in Colorado. I had a student who was dyslexic, and she was a wonderful student, and she once played for me the audio that she was given to access one of her textbooks, because the disability resource office had it. And this is where, I know, I think these conversations are more fun if there's a little bit of controversy. So I don't really believe in platforms. I don't believe in AI. And I don't think that these technological tools are really gonna help solve these problems. And as an example, the student played for me the automatically generated voice reading her textbook, and it was just horrible. It was like a 1980s sci-fi movie robot, and she had to listen to this thing to consume large quantities. I think actually it was political science. And it was just so painful to her. And it sort of struck me that, I believe that authors, a lot of what is great about OER is that we authors, we get pissed off at something. It doesn't exist. We want to be an author. We want to take our take on some material and we want to do it. We don't want to wait for some publisher to get a contract. We want to just go and do it.
And so I think that what is gonna make audio really useful for students is not a fancy robot doing a slightly better version, but someone who understands the material reading the material, placing the emphasis where it's important, phrasing the language in such a way that it conveys more meaning. And so I got involved. At that point, when I had this student, I wasn't working on OER; at that moment I was doing a lot of teaching with Creative Commons. And I thought I should practice this by reading the Creative Commons certificate materials. I don't know, I've never printed it all out. It's probably a slim textbook. And so I bought a good microphone and I just went into my closet, which had the best acoustics in my house. And I read the whole thing, and it was a lot of fun. And I released it as MP3s. So I don't really believe in platforms, and I think the great thing about MP3s is that almost any device in the modern world can play an MP3. So I think someone who understands the material reading it is a great way to really convey what is important about some educational materials. And that's the way that I hope we will go. I think some sort of strategy like the LibriVox approach, a kind of crowdsourcing, having lots of people, you know, maybe the author of a textbook will not want to do the reading, doesn't like to hear the sound of their own voice or something. But there'll be other specialists in the area who enjoy reading it, and think of it as a way to learn it and a way to get their take on it by doing a recording. And if we're gonna have places for sharing MP3s of OERs, that will be just a fantastic way to benefit students. That's why I'm glad you said you'd come, Jonathan, because you like to mess things up. And I think there's some good points there.
And I have some questions too I'd like to get into, about the value of human voice versus, you know, and whether it's good to get a lot of content out there or bits. And so we've got some people here with great experience. And last and definitely not least, I was so excited, in a conversation with Steel Wagstaff, who works for this little company called Pressbooks, here on the same stage with LibreTexts, so all the open textbook publishers can hang out together. But also, I know Steel's very thoughtful, but I thought too, when Amanda's question first came out: okay, audio in OER, how does it go in there? Is it per chapter? Is it serialized? Because, like Jonathan, I like the value of being able to jump from episode to episode. It seems to make sense, as well as being able to jump around. So what's your take or interest in all this, Steel? And thanks for coming here, by the way. Yeah, hi everybody. I'm gonna try something here, and let me know if you can hear it. Hold on, it didn't work. Oh, okay, let's try that again. It doesn't work well. Okay, so I programmed my little intro in a text-to-speech engine and I was playing it before, but that was when I had my mic muted. But, so I'm Steel Wagstaff, I'm joining the call from Eugene, Oregon, which is a city built within Kalapuya Ilihi, the traditional home of the Kalapuya people. I'm really happy to be on this call, and I just enjoy the company. So many of you I have known for such a long time and admired and respect the work you do, and some of you I'm meeting for the first time. So yeah, as Alan said, I am the product owner for Pressbooks. It's a free and open source publishing platform. And so we make publishing infrastructure that a lot of people use to publish open texts. And when we think about the book, I guess, we're trying to think about it more capaciously, to imagine a web-first object that's a true citizen of the digital web.
And so you think about web publishing versus traditional print publishing. There's a lot of affordances that can happen on the web, as Delmar and other people have talked about. It's very difficult to include an audio recording when you buy a print book, unless you make a CD and include it with the print text. But when you're publishing to the web, it's quite easy to have interactive video or audio or other kinds of components. And so I think over the years, authors have thought about a lot of different ways that they want to incorporate multiple modalities. Like the concept of universal design for learning suggests that we want to make content available in lots of modalities, to help different learners and to help learning happen. And people like Jonathan, Brenna, and others have said, well, audio is an interesting modality. There are many people that, you know, they're unsighted, or they have difficulty with or don't enjoy reading text but would prefer to hear it, or they're busy and traveling and they have a commute and they want to do something while their eyes are occupied. And so I think the thing we've seen is there's a lot of interest in exactly what Delmar is talking about. If you have a relatively fixed text, it makes sense probably to invest in recording audio and giving it the human touch that Jonathan and Brenna talked about. But if you have a fluid text that's changing frequently, that's quite a lot of time and investment and expense to fix in a recording. I've made a 40-minute recording and I've got to change three words? Well, I don't want to have to regenerate those MP3 files. And so there are things like text to speech: a lot of browsers, or browser extensions you can get, will give you a live text-to-speech rendition. But if you don't have live internet access at the time you want to access it, it's nice to have downloaded files, or files that are portable.
So there are tools like AWS Polly or other machine learning tools that can do a really nice job, at very low cost, of generating something like a human voice rendering a text, with mostly good pronunciation. It is probably not a full replacement for the human voice and a performer or a careful reader, but you can get a lot of the way there dynamically and quickly, in a way that is hopefully sustainable, as Delmar has talked about. I think one of the interesting things that we've seen, though: here's a project that I really liked that I'll talk about, and this comes out of British Columbia, from Josie Gray. So this is the Gray theme. We've got two Grays on the call, and I'll mention that third Gray. Josie Gray, who works for BCcampus, got a master's degree at, what was it at? Toronto Metropolitan, I think, I might be wrong about which school she went to, but one part of this project was making a big book all about equity in OER publishing. And so here's an example from this book, where she's created this text and then made a podcast version of the chapter that I think is mostly the same as the text in the chapter, but it has enough differences, potentially, that even if she were to edit or revise that chapter, the podcast is kind of a self-contained entity that will live and survive and complement that chapter quite nicely. And she's just embedded it right there at the top of the chapter, so that when you encounter this for the first time, you have your choice: do I prefer to listen to it, do I prefer to download that and take it away with me, or do I prefer to skim or read the text in detail? I think providing those choices and options, especially if you're saying we chose open texts because of the possibilities the licenses afford, that's to the good. I mean, the more choices you give readers or consumers or learners, generally the better.
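To make the AWS Polly idea concrete, here is a rough sketch of batch text-to-speech for a chapter, assuming the boto3 library is installed and AWS credentials are configured; the voice name and the chunking budget are assumptions, not from the discussion. Polly caps how much text one request can take (a few thousand characters), so a long chapter is split at paragraph breaks and the resulting audio streams are written out as one MP3.

```python
# Rough sketch: batch text-to-speech for an OER chapter with Amazon Polly.
# Assumes boto3 is installed and AWS credentials are configured; the voice
# name is an arbitrary choice. Polly limits text per request, so chapters
# are split at paragraph boundaries first.

MAX_CHARS = 2500  # conservative per-request character budget (assumption)

def chunk_paragraphs(text, max_chars=MAX_CHARS):
    """Greedily pack paragraphs into chunks that each stay under max_chars.

    Note: a single paragraph longer than max_chars is not split further here;
    a real pipeline would fall back to sentence-level splitting.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks

def synthesize_chapter(text, out_path, voice="Joanna"):
    """Render one chapter to a single MP3 by concatenating Polly's streams."""
    import boto3  # imported here so the chunking helper has no dependency
    polly = boto3.client("polly")
    with open(out_path, "wb") as f:
        for chunk in chunk_paragraphs(text):
            resp = polly.synthesize_speech(
                Text=chunk, OutputFormat="mp3", VoiceId=voice)
            f.write(resp["AudioStream"].read())

# The chunking step alone, with made-up text (no AWS call needed):
sample = "\n\n".join(f"Paragraph {i}. " + "word " * 200 for i in range(5))
chunk_sizes = [len(c) for c in chunk_paragraphs(sample)]
```

Simple byte concatenation of MP3 streams generally plays back fine in practice, though a proper audio tool can clean up the joins; the point of the sketch is the sustainability argument above: rerunning it after a three-word edit is cheap, where re-recording a human reading is not.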
And if you can do it in a way that doesn't wear your people out and grind them down, and can be sustained, that's even better. So I guess those are my opening thoughts on the question. That's great. I might just shut up here and let people talk, but I think we already got into it, and I'm glad the third Gray is listening. I was hoping Josie Gray might be interested, because I know she has this expertise. And I don't know if we can really decide on a rule, whether computerized voice is bad or computerized voice is okay. For some people it probably is effective. And so, I mean, how do we navigate what's gonna be the best way to do this, the way that's gonna be effective for the people using these resources? I think it's worth thinking about what human voice offers in determining whether or not it's appropriate. And for me, one of the most significant value adds of human voice when it comes to teaching and learning contexts is that there's a very intimate audio connection that occurs when you are listening to a human voice. If you listen to a lot of podcasts, you've probably had the experience of thinking that the people who host the podcast are actually kind of your friends, right? I've found myself saying to someone, oh, my friend said, no, no, someone who makes a podcast I listen to actually said that. Those parasocial relationships, we know how much more significant they are in an audio context. So when I was an English teacher, I often found that in teaching literature, audio was a really great way of having really difficult conversations about really difficult texts, because it was intimate and close. It allowed students who were learning at a distance not to have to sit in front of the family computer and watch a difficult conversation, but to take that conversation on a walk or to their room or whatever. And I think that's a real strength.
I don't know, there are perhaps applications where that intimacy may not be what you're looking for, and where you may want the distance. Because I do think computer-generated audio tends to create a distance that's very different from the experience of listening to the human voice. But I think it's worth recognizing that that's part of what works really well about humans speaking: we like, in general, to develop those relationships, and that might help us decide where it's worth investing. And where, as Josie's describing, some learners would prefer to not have that experience, because they're accustomed to a screen reader experience that sounds very different from a human voice. I'll jump in on that too, just to kind of pick up the thread and really engage with something that Jonathan said earlier as well. So a lot of the discussion so far has been around the differences between natural human voice, as we're hearing it right now, and the sort of more robotic-sounding, machine-generated text to speech, and I would be the first to agree with you, Jonathan. I have heard texts like that. I saw that there was one commenter on YouTube who had mentioned that some people prefer that kind of mechanical sound. For me, I just don't absorb any of the material. I begin to just tune it out. And everybody's different, but that would be my experience. So right now, I believe there's a lot of benefit to having the exact model that Jonathan was describing, where somebody is actually reading it out loud, because of the context, because of the flow, because it's more natural sounding. All of us, no matter who we are, we've done so much of our learning over our lives by listening to other people, right? From the very moment we were born, we heard all of these voices around us. And so there's something very deep about natural human voice that allows us to incorporate new ideas and learn.
But one of the things that's been really interesting to me is that the boundary between, say, traditional text-to-speech and natural human voice is more and more going to be a very blurry boundary. There was recently a model that was announced by Google called SoundStorm. And I'm gonna put a link to it in the, excuse me, here in the chat. But for anybody who's curious, I recommend checking out Google SoundStorm. They haven't released the model publicly yet, but as I listened to some of the recordings, to me it was completely indistinguishable from human voice. So essentially they've trained these models on somebody speaking, on spoken voice. And to me, all of the intonation, all of the inflection, all of the sort of qualities of a natural voice were there. And so I think that's just the beginning. I would really expect that within five years, we're gonna see that that technology is gonna be very readily available. And of course there are some other AI competitors that are out there right now as well. I'll just say one last thing, which is kind of how I'm thinking through it right now. When I did the OpenStax audiobook, it's about 30 hours of recorded audio, but in terms of putting that together, I would say it took me at least 200 hours. So it was a huge undertaking of time. And what I'm thinking is, if you can manage to shrink that amount of time in terms of production, let's say you get it even down to one-fifth of the amount of time, you can 5x your reach by having more of these resources available. Maybe it's not 100% as good, but if it's even 90% as good, the value of the time saved and the access to it is really gonna make a big impact, I think. So I really see this technology coming in in a huge way going forward.
I think that 90% is almost what Delmar talked about with his machine translation, because LibreTexts, if I understand right, they were doing human translation, and it takes an inordinate amount of time. And he was like, if we can get 90% of our content to a point where people at least have access to understand it, that is maybe worth not being perfect. And I'm really curious, I will share with everybody else the link you put in chat. I would just say I would echo that. I really admire what LibreTexts and Delmar are doing, and I think people have preferences. We have our own preferences for what we like to hear, but it's really important to remember that my preferences are my preferences and are not universally shared. Like, there are users of assistive technology who have a different experience of the world and the internet than I do. And every day, they're hearing a machine-generated voice that helps them navigate the web. And they're incredibly facile with that technology. They're fast with it. That's the voice that helps them navigate the web. And I expect that many of them prefer that, because they can go faster than if I was reading this in a Shakespearean tone and tenor, right? There's a range of experiences and a range of preferences. And so I think there are many people that prefer different things than I prefer. And that's okay, that's probably a good thing. And when we're talking about machine-generated voices, there's an increasing selection of voices to choose from; the technology companies are trying to provide localized or personalized voices. So let's say I'm a reader in Indonesia and I hear someone speaking with, what's to me, an unfamiliar accent in English. I might prefer to hear Brian's text read by a native Indonesian speaker of English, who sounds more like me or sounds more familiar to me.
And I think that's important to recognize too, that our listeners are different from us and use language, even the native languages that we speak, differently than we do. And just like we talk about indigenizing or providing localizations for our texts in lots of ways, I think that extends to voice, it extends to audio recording. It's a good thing when we allow people to make what we have freely given work for them, according to their preferences. And our preferences are not their preferences, and to be monolithic about that, I think, is a mistake. This conversation is making me think of auto captioning for video, where we know that the auto captions are not great and should be gone into and changed, but that is often a lot of extra work and time that our faculty sometimes just don't have. So we treat auto captions as a bare minimum for accessibility, and then if you have the time and energy and funding to go in and do that work, you should. So I'm wondering if maybe the way forward is text to speech as kind of the bare minimum, this is what you do, and then if faculty have the time and are able to do their own voiceover, that can be an option. Maybe there's a space. I think it's a really interesting, sorry. No, go ahead, Jonathan. I think that's a really interesting point, Amanda. I was thinking also about auto captioning, and I was always told that in the United States, there's the Americans with Disabilities Act, and I think it has a specific legal requirement as to how accurate the captioning has to be. And I was told that, I forget what the number is, like 90%, 95%, something. And in my experience, I used to video all of my classes, and sometimes when I would look at the auto captioning, it was never even close. Brian talked about the multiplicative factor for recording.
I would always spend about another class period outside of my class, maybe one and a half class periods, correcting the auto captioning, because otherwise it wouldn't satisfy the Americans with Disabilities Act. But I think that's something we can expect the technology will get better at, right? I mean, I think the errors that we hear in the robotic-sounding text-to-speech voices we hear today will get better. That's clear, and probably very quickly. I think we have to be a little bit careful of the anti-technology stance, which I was sort of wearing the hat of for a little while at the beginning. I think it's a little bit silly when you think, I mean, I remember the days when the printing press arrived, and we all thought, who was ever gonna wanna read something that was mechanically produced, with all the letters looking the same, instead of those beautiful things that my friend the monk had written with his own particular handwriting? That sounds ridiculous. And in the same way, maybe, as Josie was pointing out, maybe people like to hear a kind of neutral voice saying something, people who use those assistive technologies. And maybe the creative act will be the listener bringing in some of those emotional and meaning-laden tones when listening to a more neutral reading of something. I don't think we should be just strictly against technology. And like everyone else said, I wanna echo everyone else: Delmar's work with automatic translation has been an incredible boon to the community. Even if the translation to Ukrainian is probably not perfect, what a gift he's giving to the world by making that available. So we've sort of transitioned a little bit from the two sides of pre-recorded human audio versus automatic audio. I think the topic now has sort of transitioned into, well, the technology may or may not be here now, but it will be here soon. So the issue is not necessarily whether it's gonna be here. It will be here.
The question is when is it gonna be here, and how much effort do you actually spend on non-technology-enabled effort, like what Brian did or what Jonathan did, versus whether you're better served pursuing the technology approach because it's gonna be around the corner. Now, that depends on a lot of things. I would presume that technology is better suited to handle poli-sci than it is the sort of advanced math topics that Jonathan put down there, but things are changing radically. And obviously the other side of that, in the conversation around math and STEM in general, is that a lot of the human-made audio recordings that address equations have been particularly poor, from what I've heard, whereas once you have a consistent infrastructure for reading equations aloud, or in my field of chemistry, for reading molecules in a semantic way, that is well established and you have that standard out there, it actually is more beneficial for the students that have sight issues. And maybe there's some combination here that we're talking about: let the machine voice do the chunk of the content, but let the instructor come in with contextualizing. Why is this important? Or, here's what I want you to think about. I mean, we read books of content that have no voice or human intonation beyond the words, but you go into class and the professor, the teacher, whoever is running it, brings it alive in that conversation. And I think that might be a way forward too. Yeah, I think to that point, Alan, if you're thinking about how to make information maximally efficient and available to people, we know that text is really, really good at that. It's a durable format. If you're trying to condense a whole bunch of information, delivering it via written text is cheap, fast, efficient, and relatively easy to maintain.
Audio is a little bit harder and more difficult to maintain, but it has the richness of the voice nuance. Video is much harder and more difficult to maintain, and more expensive because of bigger storage sizes. As we add those extra channels, it becomes more complex and difficult to deliver and maintain. So good old-fashioned text is really great. It's a super durable technology for preserving information over time. But we know that an in-person teacher, just think about the difference that that has made in all of our lives, whether it's a parent or a caregiver or the educational setting. There have been really transformative moments in all of our lives from real people that we've encountered, and also probably transformative events from audio that we've heard, video that we've watched, and texts that we've read. But to imagine that education will exist without the presence of real people, I think, would be a really impoverished future. We all expect that real teachers intervening and using their human judgment and wisdom in a situation will be an important part of education in the past, present, and future. But I mean, if we can make the automatic delivery of things faster, cheaper, quicker, and freer or more available, that's a win. I think that's what we're all trying to do. I think this is part of a larger conversation, thinking about the AI applications in the title of this talk. There are so many faculty right now that have a lot of fears about what does AI mean for me as an instructor. And so I think this is just a general question as our society moves forward technologically: what is the human value in the classroom? And I think there always will be one. My background is as a librarian, and when eBooks first started being a thing, there was this big huge outcry about how, oh, the print book is going away, no one will ever want to read a print book ever again, it's all gonna be online, this is horrible. But we know that's not the case.
People still read print. There's still value in that. Some good comments from Chris, who I'm glad is here in the audience. Now we get into some interesting things, like who owns the voice, and what happens if, you know, I don't know, perhaps your voice or someone else's voice gets into your content. Not only is this strange territory, but this is happening, and, you know, look out. Not in audio, but, what was that? There was an Adobe audio product that came out that had to be pulled because it was potentially so dangerous. But I was just playing last night with the Photoshop plugin that does generative image editing in rather profound and scary ways. Yeah, I can see a future where the tool could sit you down and just ask you to read out five or ten minutes of audio with a variety of sounds. And then from that, they could use your voice to generate any possible sounds you want. And there'd be some definite concerns about that. Like, what if someone took recordings of me speaking without my consent and then built an AI that made it sound like Steel was saying offensive things? I'd be upset. But I also think, like, I don't own my voice. I don't really own my likeness. There's a bunch of things that, once you send them out into the world, they're out in the world. And I think having us all be mini copyright lawyers litigating our appearance and our voices, I don't know if that's a battle we want to fight. And I don't know how real of a risk that is, but certainly the deepfake stuff is really concerning. And we're going to face that with disinformation in the political future, for sure. In the near future, we're going to see probably political races where somebody's voice has been doctored to say a thing, and then we won't know whether it's a real recording. Like, that stuff's scary. And that's definitely coming. So.
Some of you may have seen, you know, Ron DeSantis recently had his announcement that he's running for president on Twitter Spaces, right? And immediately after that happened, the Trump campaign actually released exactly what you're talking about. They had AI models that were trained on Elon Musk's voice and Ron DeSantis' voice, and they put together kind of a mocking video of that. So it's really interesting. Every four years it seems like we have some new technology that's impacting presidential politics. I remember when the Obama campaign back in 2008 had this really great technology called text messaging, and they were reaching out to people on their cell phones. And this will be, you know, clearly the first presidential election where we have some of this really good technology that can be used for parody, as it was by the Trump campaign, or could be used for outright manipulation. And a lot of that's exactly based on, you know, what Steel was saying about training a voice model to sound exactly like somebody else. I just think that's really interesting. I know it's kind of further afield from our audiobooks, but the idea that, okay, if I want to hear this audiobook and I want it to be narrated by David Attenborough, the guy that does all the nature documentaries, that's a beautiful voice, that's pretty neat, you know? Does he own his voice? Does he not? That's a big question, I guess. But the idea of having that choice, and, you know, I think, Steel, that goes back to something you were talking about earlier as well, which was having different dialects available. I could imagine, in the not-too-distant future, let's say a decade from now, if somebody does want to access written materials in that way, that they would have the ability to choose whatever voice they wanted to, and I think that's going to be really neat.
I don't think- You can actually do that right now in the browser with your browser text-to-speech engines. It's just difficult to make a stable recording with consumer-grade technology. You've got to do something like, and maybe Delmar can tell us some of the guts of what they're doing to do that at enterprise scale, but it will probably be consumer-available in the not-too-distant future, yeah. Absolutely. So some of this is, Brian's saying, and we see it, that a lot of this is happening in the political space deploying this technology. Like in the old days of the internet, a lot of the revolutionary things were done by the porn industry, so there's some kind of connection, like some of the first e-commerce stuff. And so I don't think we've strayed too far from that. But yeah, this is really fascinating. Actually, Steel, what you described: I'm using this new editing tool called Descript. Basically it transcribes your audio, and then a lot of your editing is done from the text transcript, so you remove things. But it has something, and I'm still trying to get this dub thing working, where if I realize I didn't say someone's name right or I forgot to say something, I can actually type in the text and it will generate it in my voice, because it's been trained on it, to fix something. That's really clever, yeah, I haven't seen that. Yeah, you're right, it's out there. I do think when we have these conversations, and it's probably top of mind for me at least in part because I work at a university where one of our central values is sustainability, right? We're supposed to be approaching all tasks sustainably. And as these technologies move into consumer-grade versions, we're all engaging in play, which I do think is the best way to learn any new technology: to play with it. But we need to address the fact that these technologies are extremely carbon-hungry and they're extremely thirsty. And the thirsty part is, I think, really eye-opening.
You look at a 50-question ChatGPT session: it probably costs about 500 milliliters of clean water. I live in a desert, so I don't know about you, but that's something that I think about a lot. And so as much as I value a lot of what we're talking about here today, I think we need to be really concerned about having these conversations separate from the larger values conversations. It's really easy for tech to behave as though it is neutral in the world. We know a lot of bad stuff happens behind the scenes of AI, but for me, these environmental concerns are particularly of note, because when we get into thinking about, oh, I might change a voice 10 or 15 times while I settle into the audio narrator that I like, do we know the cost of that? What is the cost of making that choice over and over again? And I just think consumers don't have all the information that they need to make appropriate choices. And particularly within our institutions, I think that these are values-level conversations that need to be happening as we establish norms of behavior, because this stuff matters a lot. 500 milliliters of clean drinking water when it's 40 degrees in Kamloops, that strikes me as an awful lot. And it's one thing to use that to make a piece of learning material accessible to a learner who wouldn't have access to it otherwise. And it's another thing to do it as a form of play. And I think we really need to start to tease apart when engaging with these technologies is appropriate and when it maybe is a cost that is not appropriate to be borne in the situation. Not to be a buzzkill, but that is often my role. No, that's not your role, to be a buzzkill. It's just a conversation, but in some ways that's like people not realizing what happens when they throw something in the trash. It just disappears, right? So unless you've gone on the field trip to the landfill and seen where your stuff goes, we haven't gone far past there.
I mean, we talk about invisible labor a lot in these situations, but I think what we're also talking about are the invisible environmental impacts of high-resource-consumption computing. And certainly machine learning and very large language models consume a lot of energy and electricity to be trained. In terms of the actual technical issues, though, I think that what's probably at the heart of what we're discussing is that one of the big problems computer scientists have been thinking about for a while now is how to convert speech to text and text to speech. And it's a two-way flow, right? We're talking about that in the form of transcripts, and we're talking about forms of taking a text object and turning it into something recognizable as speech that sounds natural and delivered well. And I would be, if you're interested or willing, Delmar, I'd be really interested for you to explain a little bit about what specifically LibreTexts has done to take a large text corpus and try to make it accessible speech in audio files for people. Because that's not something that any of the rest of us on this call, I think, have done, and it's probably of interest for a lot of listeners. Would you be willing to speak to that for a little bit? We haven't scaled up anything yet, in part because I wanted the technology to be there. We have a strong relationship with AWS, and they have been investing in terms of the machine translation effort that we have put together using AWS Lambda and a handful of other things connected to that. It's not hard to take our polyglot engine that we built and process that through Polly in the appropriate languages in order to be able to do that. We haven't specced out the price point of that. We do have a proposal with AWS right now for international activities; because we're not-for-profit, we're able to capitalize on that.
And I've been thinking about adding this a little bit into it, but I don't know what the price point is. I can tell you what the price point is for a standard book of, let's say, 100 pages, well, 100 web pages, that is. It's about $20 to convert it using the Amazon Web Services infrastructure. And I don't know what it's gonna be for audio. So we provide that to people if they want to be able to convert their books into different languages. So we're not there yet, I guess, is the upshot, but not because the technology is not there. We just haven't quite set it up. And one of the other issues that we've had is that while we've gotten a significant amount of investment from both federal and state funds in terms of doing what we're doing, in addition to the stuff that we use for sustainability outside of that, we haven't found an ideal source that's able to internationalize our infrastructure, because that's largely a philanthropy-based infrastructure more than it is a government-based infrastructure. So we just need to find the right person who wants to go about doing that. Hint to Bill Gates or Amazon or Bezos or anyone else out there that has an interest in pursuing that. But yes, I'm very enthusiastic about this. I think it's gonna be the future, if not the present. We just have to wait and see. Oh, there's some pricing. Steel has, Steel already answered my question. Yeah, I think Delmar was speaking specifically about an AWS tool called Amazon Polly, and that's the engine that they use for text-to-speech. They have a free tier that's available for people that want to try it out. I'm not sure about all the technology implications that Brenna mentioned. I don't know how resource-intensive Polly is or what environmental impact you'd expect it to have. I don't have good answers about that. But they do have kind of a consumer trial basis, and they have a published API if you're a technical user.
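For anyone curious what a Polly pipeline like the one Delmar describes might look like at the smallest scale, here is a rough sketch in Python using boto3. This is not LibreTexts' actual code, just the general shape of the API: the character limit, the `chunk_text` helper, and the `page_to_mp3` function are my own assumptions (Polly caps how much text a single SynthesizeSpeech request will accept, so long pages have to be split first).

```python
import re

# Assumed per-request character budget; Amazon Polly rejects very long
# SynthesizeSpeech inputs, so long pages get split into smaller chunks.
MAX_CHARS = 3000

def chunk_text(text, limit=MAX_CHARS):
    """Split text into chunks under `limit` characters, breaking on
    sentence boundaries so the synthesized speech stays natural.
    (A single sentence longer than `limit` passes through as-is.)"""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > limit:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

def page_to_mp3(text, out_path, voice_id="Joanna"):
    """Synthesize one page of text into a single MP3 file via Amazon Polly."""
    import boto3  # imported lazily; chunk_text above needs no AWS setup
    polly = boto3.client("polly")
    with open(out_path, "wb") as out:
        for chunk in chunk_text(text):
            response = polly.synthesize_speech(
                Text=chunk, OutputFormat="mp3", VoiceId=voice_id
            )
            out.write(response["AudioStream"].read())
```

Looping something like this over every page of a book is also why the per-book cost Delmar mentions is roughly predictable: requests scale with character count.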
And then I also shared something. Someone recently shared with me a tool for transcription called whiteout.ai. It's built on OpenAI, the people that we know from ChatGPT. They have a tool called the Whisper API that's built to do text-to-speech, sorry, speech-to-text. And they built a consumer, public-facing, free tool that produces, not surprisingly, quite accurate transcriptions, from what I've been able to tell. Now, it's not as good as a skilled, paid human transcript worker, but it is faster and freer. And so this is what we're seeing with that creative destruction, I guess they call it, disruptive technologies. So it's something to look at, I guess, if you work at a university and this is a compliance effort and you don't have a budget to do it with a skilled professional; hopefully this can lower your cost and improve the actual accessible output. Correct, yeah. I mean, but still, might it be lowering the costs by kind of transferring them into externalities? I don't know, Bezos's empire uses cheap labor and does its environmental damage in parts of the world where he can get away with it because the regulatory regimes are convenient. I guess I'm always nervous about when people are going to scale. I understand that going to scale is a way to reach more people, and ultimately I wanna reach more people. But when we talk about going to scale with some solution, that often involves kind of finding a way to push the externalities out of our spreadsheets and hide the real human cost that Brenna was talking about, I think. And the environmental costs, which would eventually be human costs, I don't know, I'm really scared of those. Going back to our earlier conversation about the value of the human voice and reading out loud, I just wanna bring up something that came up in our forum chat about this topic, which was the idea of bringing open pedagogy into the project and having students contribute by reading audio versions.
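As a sketch of what calling the Whisper API for a transcript looks like, here is a minimal Python example with the `openai` package. The timestamp-formatting helpers are my own illustration of one way to render the per-segment timing that Whisper's verbose output includes; the tool the speaker mentions by name is something else entirely, so treat everything here as an assumption rather than that product's workings.

```python
def format_timestamp(seconds):
    """Render a float second offset as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

def segments_to_transcript(segments):
    """Turn Whisper-style segments (dicts with 'start' and 'text')
    into human-readable timestamped lines."""
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in segments
    )

def transcribe(audio_path):
    """Send an audio file to OpenAI's Whisper API and return a
    timestamped transcript. Expects OPENAI_API_KEY in the environment."""
    from openai import OpenAI  # lazy import; the helpers above are SDK-free
    client = OpenAI()
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio,
            response_format="verbose_json",  # includes per-segment timing
        )
    return segments_to_transcript(
        [{"start": s.start, "text": s.text} for s in result.segments]
    )
```

The timestamped output is also a reasonable starting point for the human cleanup pass that, as the speakers note, a machine transcript still needs.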
Just wanna highlight that going to large corporations and big platforms is not our only option if we want to think about the learning opportunities that can come from reading an audiobook chapter. That's a great point. So on my audiobook project, we incorporated a student as well, which is something at first I wasn't even thinking of, but the organizers of the CC ECHO project recommended it, and it was one of the best recommendations that I've had. So we actually just got a notification a couple days ago that we're gonna be able to present at the Open Ed conference this year. And one of the things, as I'm talking to the student, Sarah Aria, who helped me narrate the project: she's now over at Cal State Long Beach studying political science as an upper-division student. She said that that experience for her, being able to participate in the audiobook, opened up a bunch of other possibilities, because when she transferred to university she had that on her resume, and that kind of opened up some other possibilities for internships and for research opportunities at the university, because she'd had that experience before. It was kind of like a resume builder for her, which I was really excited to hear. It was something I was not thinking about at all, but she seemed to have a very positive experience on that aspect of it as well, just being able to be involved and create something and really attach her name to it in this way. And I remembered when we talked about this, Brian, what was interesting is the way you read the content parts of the chapter and she read the callouts and things that were relevant to the student. So I think there were some interesting ways in which the audio subtly signified what was happening in the flow. Because you have a great voice and I could listen to you for hours, but I think introducing different voices into the reading could add a lot to it.
And I'm glad you said that, Amanda, and I wanna come back, because you started this with your question: what have you gotten out of this, and what should we do next? Oh, I don't know. I'm definitely coming away from this with a bunch of tools to look into. I don't know for sure how many of these tools we could adopt at KPU. At KPU we're very protective of our students' data and privacy, and for our faculty as well, so we tend to have pretty strict compliance and procurement processes. So I'd definitely have to investigate those. But I think what I'm taking away from this is that there are a lot of options out there for faculty who are interested in producing an audiobook or adding audio to their book. So I need to put together some sort of how-to or resource or something that collects all of that information in one place that I can easily point faculty to, like: are you interested in creating audio? Here are your options, and here's what we can help you with. At KPU we do have an audio-visual setup here that faculty can come in and use to record and present. Some of our OER authors have already used that to record pieces of OER. So definitely just letting faculty know what their options are, that there are options out there. There's something I saw in the YouTube chat from Alexandra, which I'll just highlight. There's a really cool project in that open-pedagogy, learner-created-content vein that I think is worth mentioning that came out of Ontario. It's called Liberated Learners. It won a big award last year for engaging student material. And in that project, they incorporated a bunch of student co-creators to talk about the learning experience and learning process. And the students came up with a bunch of really cool ideas. And one of the things that they did was they made a project called Beats to Study To.
And these were a bunch of student-generated beats and music, and they incorporated it into the book, because that's what the students wanted to put into it. So I think sometimes you think people want one thing and they tell you they want another thing, and listening to them and responding to it, that's great. You've learned, and you've proven that you've been actually responsive to what the readers or audience wants. And I think when you put an open license on that, you're giving people permission ahead of time to do exactly that. You will not anticipate or expect everything that people want to do with your content, and it has a life outside of you. So that, I think, is beautiful. And that's the really exciting thing about working in open content licensing: you realize, whoa, someone did something with this that I didn't expect, and it was really fun to see. And so I think that's a point to make there. And the other thing I would say is, I used to work at the University of Wisconsin, and when we were there, we made a podcast series. And when we made our podcast, we realized we needed to produce transcripts. And it started with me listening to the Audacity file and typing out the transcript. It took me two and a half hours to type the transcript for a 40-minute conversation. Wow, this is not efficient. And so I was like, oh, I'll get one of those foot pedals and I'll type faster. And I was like, hey, this is terrible. Why am I doing this? And then I said, okay, what are the options that we have? And the campus, it's a system with many campuses, had a contract vendor that you can use for transcription services. And so it was outsourced and probably off-shored. These were the two vendors that I could go to, and here was the price that we could use. And then we also tried to look into exactly what Delmar's saying: could we do this with an AWS tool at a different price and then clean it up?
So we were trying to look at sustainability costs. We ended up working with the university vendor, and sometimes we had good experiences, other times we didn't. But today, I'm in the same boat. We make a video every month and then I need to make the transcript. And I have a relationship now with a for-hire transcriptionist, and we send it off, and she's got her hourly rate. But I don't know exactly her labor conditions and how well she likes them, and what kind of pressure she feels in terms of her wage because of competition with these other tools. And I do suspect that her industry and her profession are being disrupted. And, yeah, we're all implicated in this awful or interesting or complex capitalist machine. And sometimes it's nice for us as consumers, but a lot of times it's really rough for lots of us and for the earth. And so, Brenna and Jonathan, yeah, you're right. And we're all trying to figure it out, I guess. Well, and I think we can get to a space where it's like, well, all ed tech is carbon-hungry and thirsty, so why would I try, right? And that's the point we need to not get to. We can't linger in despair, right? "There is no ethical consumption under capitalism" doesn't end with "so never bother trying." And so we do, we have to try to make the best choices we can within our parameters. And I think that the next step of working with AI tools to enable this kind of work is to really start thinking through: what is the framework by which we make the choice to engage with the technology, and how do we minimize harm? I have come to realize that working in a university is primarily about just, every day, I try to minimize a little more harm. That's all I can do, but I do think that's an important part of the work we do. And we can't do it if we're not thinking about what those bigger-picture problems are and sometimes living in the muck. I hear birds; that means this has been great.
I really appreciate everybody just wanting to jump in here to this conversation. I hope we can figure out more ways to have this. I've learned a lot, and I hope everybody else who's been listening has learned a lot. And so thank you again, Brian, Delmar, Brenna, Jonathan, Amanda, who started this, and Steel for just jumping in. And there are no simple answers, but it's great to have this conversation. And so I'm gonna thank you for joining OEG Live. I'm trying to make a thing of this. My colleagues don't understand what I'm doing. I don't really understand what I'm doing, but we're gonna have a couple more of these throughout the summer. So topics that you think are interesting, work that we should highlight, just let me know. And I'm going to fade out with a little bit of open-license music from the Free Music Archive. I hope everybody has a great day. This is Dr. Ketze. Hey!