Good afternoon, everyone. This is Chris Cox from Carleton University. Thank you so much for the chance to participate in this workshop, and sorry not to be able to join you in person. I'm actually filming this from here in the Yukon Territory in far northwestern Canada, where my wife and I are helping out with a language workshop this week. I wish I were able to join you in person, but I hope that this video will be helpful.

This is meant to be a short introduction to a small tool that I've been working on called Persephone-ELAN. Over the next 15 minutes or so, I'm hoping to introduce you to some of the basic features of Persephone-ELAN, some of the things that it currently does and doesn't do, and hopefully give you a sense of how you might be able to apply it to some of your own projects. As I understand it, you have already covered, or will be covering this afternoon, how to train up your own Persephone phoneme recognition models. As we'll see in a few moments, that's going to be important for you to use Persephone-ELAN in your own work.

I need to start with a few acknowledgments here. A number of the examples that we'll be seeing in the slides to come are from the Tsuut'ina language, a Dene language spoken in southern Alberta in western Canada. These come primarily from the Tsuut'ina Language Commissioner, Bruce Starlight. I need to thank him for his help and support throughout all of this work. With regard to Persephone, Oliver Adams and Alexis Michaud have both provided substantial support for this work. And within Canada, there's also been support from Compute Canada and the Centre for Advanced Computing in developing some of the Persephone models that we'll be seeing today.

So what is Persephone-ELAN, then? This is a small open-source plugin, or recognizer, that brings some of the automatic phoneme recognition provided by Persephone into ELAN's user interface.
What it does, more specifically, is take an existing phoneme recognition model that you've trained for Persephone, as well as a tier in ELAN that contains a number of blank annotations, apply that model to those segments, and then return the corresponding phonemes that Persephone recognizes on a new tier. So essentially, if we have a blank annotation in ELAN, we would pass that tier off to Persephone-ELAN, and after some crunching it would return the corresponding phoneme string that it recognized.

Now, Persephone-ELAN has a number of software requirements that are important to note at the outset. It's currently macOS or Linux only. There's no support for Windows written in at the moment. The people that I'm working with on related projects don't really have much need for that feature. If this is something that would be useful for people at this workshop or elsewhere, please let me know. This wouldn't necessarily be that hard to implement, though it would take some planning to do.

Persephone-ELAN also works well with recent versions of ELAN. It's been tested with versions 5.6, 5.7, all the way up to the current 5.8 release. It may work well with earlier versions as well, but they haven't been tested.

To run Persephone-ELAN, you'll need to have a version of Python 3 installed, either 3.6 or 3.7, as well as FFmpeg. This is the tool that Persephone-ELAN uses to convert some of its media files behind the scenes. You should be able to find installers for both of those programs fairly easily online. Again, if you have any problems there, please let me know and I'll be glad to provide more pointers.

Probably most importantly, for Persephone-ELAN to work, it currently requires that you've installed Persephone system-wide. So for the copy of Python that you have installed on your computer, you need to make sure that Persephone is installed so that any application can access it, not just in a virtual environment.
If you're a Mac user and you open up Terminal, you can usually do this by typing in the command given in green there, pip3 install persephone, and that'll do the trick.

Lastly, and as I mentioned before, Persephone-ELAN relies on having at least one existing Persephone model that it can apply to your transcripts. Developing those models, training them on the basis of existing documentation, is outside of the scope of this little tutorial, but I've included a link here to the documentation for Persephone, where they describe exactly how this is done.

Now, there are also a number of important limitations to take into consideration here. For one, Persephone-ELAN is very much early alpha software. I have a number of improvements still in the works, but the current version, as we'll see, is still a little rough around the edges in places. So if it breaks on you unexpectedly, if something doesn't work quite as planned, if it's a little slower than you might hope it would be, don't panic, and please let me know. A number of these things are issues that can be addressed in ongoing development, but just be aware that this is still very much in the early days.

Persephone-ELAN is currently being developed under macOS, and that's where most of the testing has happened. As it stands, it should run reasonably well under Linux without any adjustments, but as I mentioned before, more work would be needed to create a Windows-friendly version. So, Windows users, this tool currently won't do much for you, although, again, if there's interest, it wouldn't necessarily be all that hard to add that as a feature.

Lastly, and I can't stress this enough, Persephone-ELAN really does need to have an existing Persephone model. We won't be talking here about how to train up a new model on the basis of, say, existing recordings or existing text that you may have. Persephone-ELAN only makes it easier to apply a model that you've already developed to new recordings.
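The requirements just described (macOS or Linux, Python 3.6 or 3.7, FFmpeg, and Persephone installed system-wide rather than only in a virtual environment) can be checked with a short script. What follows is only a minimal sketch and is not part of Persephone-ELAN: the function name check_requirements and the report keys are made up for illustration, and the script assumes that the Persephone package is importable under the module name persephone.

```python
import importlib.util
import shutil
import sys


def check_requirements():
    """Collect facts about this Python environment that bear on the
    requirements mentioned in the talk. Hypothetical helper, not part
    of Persephone-ELAN itself."""
    # A virtual environment changes sys.prefix (venv) or sets
    # sys.real_prefix (older virtualenv releases).
    in_virtualenv = hasattr(sys, "real_prefix") or (
        sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    )
    return {
        # The talk states macOS or Linux only.
        "platform_supported": sys.platform == "darwin"
        or sys.platform.startswith("linux"),
        # The talk names Python 3.6 and 3.7 specifically.
        "python_version_named_in_talk": sys.version_info[:2] in ((3, 6), (3, 7)),
        # Full path to the interpreter running this script.
        "python_path": sys.executable,
        # Full path to ffmpeg if it is on the PATH, otherwise None.
        "ffmpeg_path": shutil.which("ffmpeg"),
        # Assumption: the package is importable as "persephone".
        "persephone_importable": importlib.util.find_spec("persephone") is not None,
        "in_virtualenv": in_virtualenv,
    }


if __name__ == "__main__":
    for name, value in check_requirements().items():
        print(f"{name}: {value}")
```

Run it with the same python3 that ELAN will be pointed at. If persephone_importable is False, or in_virtualenv is True, the system-wide install described above is probably not in place yet.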
So over the next few slides, I'd like to take us briefly through the steps that we need to download, configure, install, and actually start using Persephone-ELAN with your own materials.

Persephone-ELAN is currently hosted on GitHub. There's a small repository there with the current version of the source code, and I've included the link at the bottom of the screen. Where you want to go to find the current release is under releases, close to the top of the page, where it says three releases. The current version as of today is version 0.1.2. All you'll need in order to run Persephone-ELAN is the top source code link, the one to the zip file. If you download that zip file to your computer, you can expand it (on a Mac, that would just be double-clicking), and you'll see it's just a single folder. That folder contains all the files that you'll need to run Persephone-ELAN.

Now, before we can actually move those files inside of ELAN, that is, install the software, we do need to make sure to edit one configuration file first. This is where we tell Persephone-ELAN where it can find the version of Python that it needs to use, and where it can find the version of ffmpeg that it's going to use to clip and edit media files. The file we need to edit specifically is one called persephone-elan.sh. On most Mac computers, you can just right-click or command-click on that file, go to Open With, and use the text editor of your choice. Here I've opened it using TextEdit.

In this file, there are two lines that you'll likely need to edit. You can see them highlighted in blue here. The first is the location of the version of Python that Persephone-ELAN should use. In this case we're using Python version 3.6, so this is the full path on your computer that takes you to Python 3. That's the first line. The second line is the directory, or the folder, where your copy of ffmpeg is located. Now, if you're not sure where either of those programs is located on your
computer, again, if you're a Mac user, you can open up the Terminal application and type in both of the commands that are in green: which python3, which should give you the full path to that version of Python, assuming it's installed on your system, and which ffmpeg, which, again, if ffmpeg is properly installed, should give you the full path. You can copy and paste those paths directly in between the double quotes on the appropriate lines.

Once you've made those two edits and saved that file, Persephone-ELAN is ready to be installed. Again, under macOS, this involves moving the Persephone-ELAN folder inside of ELAN's application bundle. So in your Applications folder, we're going to find ELAN, right-click on it, open up its package contents, and then move the Persephone-ELAN folder inside the application itself.

Here's a short video to show what we mean. Again, I've opened my Applications folder. I go and find my copy of ELAN, I right-click on it, and I choose Show Package Contents. This opens up what's inside the ELAN application. Then I drill down to Contents, then Java, then extensions. It's this extensions folder that holds all the ELAN plugins, and I can just drag and drop my Persephone-ELAN folder into there. That's all we need to do to install Persephone-ELAN. In that way, it's just like any other plugin, any other recognizer, that's been developed for ELAN so far.

Once it's installed, we can open ELAN and actually apply an existing Persephone phoneme recognition model to a tier of our liking in one of our transcripts. To do that, we do need to tell Persephone-ELAN a little bit about our Persephone model. Specifically, we need to let it know the folder where the Persephone model, or experiment, is; how that model was configured, that is, which feature type we used for phonetic features and what labels we used to provide the text; and where the original training data for that model are. Persephone-ELAN feeds that information back into Persephone behind the scenes to reboot that model, essentially, and then apply it to the new, unseen snippets of audio that we're getting from our ELAN transcript.

I'm hoping that over time we can make changes to the Persephone source code to save these settings inside the models themselves, so that the only thing we need to provide to Persephone-ELAN is the path to our pre-trained model. But for now, this is information that we have to enter manually into the Persephone-ELAN interface. For myself, once I've trained up a model in Persephone, I usually just keep a small text file that has all of this information in it. In the example we'll see in a second, I've entered this information in the appropriate fields already, but again, these are things that you should be able to recover from your model training process fairly easily.

So again, here's a short video showing what this looks like. Here we have a transcript, and this is again the Tsuut'ina language, with Elder Bruce Starlight. You can see we have a number of empty annotations, or textless annotations, on a main tier. What we want to do is provide that tier to Persephone's phoneme recognizer. So in the Recognizers tab, we select the Persephone phoneme recognizer, and then we provide the settings I was just describing. In this case, we trained our model using fbank for phonetic features. The text that was provided to this model, all those text snippets, had this file extension. Again, this will look a little bit different for your particular model. We've built in support for Tsuut'ina's orthography here, but in your case you'd most likely choose none, and what you'll get then are the actual phoneme strings that come out of Persephone, with no conversions happening behind the scenes. We want to provide this BRS tier. That's the tier that contains all of the empty annotations that Persephone is going to try to recognize. Lastly, we want to provide a reference to the
directory where the original training data is, in this case for the Tsuut'ina model that we're using here, and, lastly, to the model itself, to the source experiment directory. There's a final field here as well for output recognized text. This is essentially just a junk file that we can't get away without producing. ELAN makes us do this, so don't worry too much about it.

Once we hit Start, Persephone-ELAN will start picking out all the individual clips from that tier, reload the Persephone model that we provided, and then actually ask Persephone to start transcribing each of the clips on that tier, so each annotation is being fed to it. When it's ready, it'll load the corresponding tier and we can listen to the results. You can see here that it's recognizing not only segments, but also the tones that are marked with diacritics.

So, as we mentioned before, there are a number of settings that you need to have for your model. Once you've entered that information, it's as simple as selecting the tier you want to apply this to and pressing Start. Now, you'll notice that the Persephone output here appears on its own tier. This is, again, a limitation of ELAN's current plugin infrastructure. If you need to copy the contents of that text from one tier to another, though, that's something you can do through the Annotation menu in ELAN.

So, concluding, then: Persephone-ELAN is definitely still in its early days. There are a number of important features that I'm still working on at the moment and that I'm hoping to improve upon, some of which will potentially require changes to the Persephone code base. But even in its current state, I'm hoping that Persephone-ELAN and tools like it might help make Persephone more accessible to other documentary linguists and other users of ELAN. If you have any feedback, comments, or possible bug reports related to the software, they're certainly welcome. Feel free to submit any of them via GitHub or, if you prefer, by email to the address given here.
Thank you.
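As a supplement to the walkthrough above: the talk mentions that the tool uses ffmpeg to cut one clip per annotation before handing the audio to Persephone. The sketch below shows how such a command could be assembled from an annotation's start and end times. It is an illustration under stated assumptions, not the tool's actual code: the function clip_command, its parameters, and the file names are hypothetical, and the only ffmpeg options used are the standard -y (overwrite output), -ss (start offset), -t (duration), and -i (input file).

```python
import os


def clip_command(ffmpeg_dir, media_path, start_ms, end_ms, out_path):
    """Build an ffmpeg argument list that cuts one annotation's audio
    out of a longer recording. Hypothetical helper for illustration;
    the real tool may build its commands differently."""
    if end_ms <= start_ms:
        raise ValueError("annotation must end after it starts")
    # The configuration file described in the talk stores the *directory*
    # that holds ffmpeg, so join it with the program name.
    ffmpeg = os.path.join(ffmpeg_dir, "ffmpeg")
    start_s = start_ms / 1000.0
    duration_s = (end_ms - start_ms) / 1000.0
    return [
        ffmpeg,
        "-y",                       # overwrite the clip if it already exists
        "-ss", f"{start_s:.3f}",    # seek to the annotation start (seconds)
        "-t", f"{duration_s:.3f}",  # keep only the annotation's duration
        "-i", media_path,           # source recording
        out_path,                   # destination clip
    ]


if __name__ == "__main__":
    # Example: an annotation running from 1.5 s to 4.0 s.
    cmd = clip_command("/usr/local/bin", "recording.wav", 1500, 4000, "clip.wav")
    print(" ".join(cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems with file names that contain spaces.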