The first talk is from Professor Nathan Hill of Trinity College Dublin. His title is "Relative Chronology of Some Bola Sound Changes". Thank you very much for having me. I am honored to be giving this talk at Beida. I was there in December of 2019 and was given a very nice welcome, and I hope I'll be able to join you again in Beijing sometime not too far in the future. The first thing I'm going to discuss is how I have done this work, because I'm using a new tool that we call CAPER. It stands for Computer-Assisted Proto-language Reconstruction, and it was initially built by Xun Gong in 2020 when he was working with me in London as part of an ERC project, where we were specifically looking at the reconstruction of Proto-Burmish. Then he got a job in Vienna and, one thing and another, wasn't able to continue working on it, which left the software in a kind of suspended animation. Then in 2022 a very brilliant high school student named Seth Knight rewrote the code base for the program, so now it's back up and running. On Zenodo, the European Union's research repository, Xun Gong has uploaded a video, I think about two hours long, that explains CAPER and what it does. The code base written by Seth Knight is also on Zenodo, and it's on GitHub as an active place for development. I invite anyone who wants to use this software, or rewrite it, or whatever; it's totally open. So I'm going to give you a demonstration, if you like, of some things I discovered using this new method of Xun Gong's. First I'll explain how it works, or how it's supposed to work, because I don't want it to be a black box for you.
This is a diagram where you have your various daughter-language lexicons, which should be machine readable. And this is an important point of his system: they don't have to be already perfectly aligned for semantics and whatnot. The idea is that most computer-based work in historical linguistics has worked with a very small word list, and we don't want to do that; we want to be proper historical linguists and use as large a lexicon as possible. So the data goes into the reconstruction assistant, the linguist interacts with it, rerunning the software and adjusting various things, and then, at least in theory, it outputs an etymological dictionary. Now I'll explain a little of how it works. We have some attested forms here in Achang, in Bola, and in Maru; Maru is called Langsu in Chinese, so in the rest of this presentation I will call it Langsu. And we have some hypotheses about sound changes. The software runs those changes backwards on the attested forms to propose reconstructions. If we run the changes backwards on the Achang form, we get one set of possible reconstructions; if we run them backwards on the Maru form, we get another. But if these three words are cognate, as we have hypothesized they are, the only reconstruction that will predict all of the attested forms is the one in the middle. So, just to restate that: we run all the sound changes backwards on all of the attested forms, and then we take the intersection of the possible reconstructions. Those are then, if you like, the proposed reconstructions used by the software. And one thing that I think is really brilliant about the system is that the reconstructions are never stored anywhere; they are always created on the fly by applying the sound changes to the attested forms.
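The intersection idea can be sketched in a few lines of Python. This is only a toy illustration of the principle, not CAPER's actual implementation: the rule sets and word forms below are invented placeholders, and undoing a change with a naive string replacement is a simplification of what a real finite-state transducer does.

```python
# Toy sketch of CAPER's core idea: sound changes are applied *backwards*
# to attested forms, and the intersection of the candidate pre-images is
# the proposed reconstruction. Rules and forms are invented placeholders.

def preimages(form, rules):
    """Return every proto-form the ordered rules could map to `form`.

    `rules` is a list of (old, new) substitutions applied proto -> daughter;
    we undo them last to first. Each daughter segment may or may not
    continue an `old` segment, so both options are kept.
    """
    candidates = {form}
    for old, new in reversed(rules):
        step = set()
        for cand in candidates:
            step.add(cand)                      # segment was already `new`
            if new in cand:
                step.add(cand.replace(new, old))  # or it continues `old`
        candidates = step
    return candidates

# hypothetical daughter-language rule sets (proto -> daughter)
rules_a = [("ay", "e")]   # language A merged *ay with *e
rules_b = [("e", "i")]    # language B raised *e to i

# intersecting the pre-images of the two attested forms
recon = preimages("be", rules_a) & preimages("bi", rules_b)
```

Here language A's attested "be" could continue either *be or *bay, and language B's "bi" could continue *bi or *be; only *be predicts both forms, so the intersection singles it out, just as the talk describes for the middle reconstruction.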
Okay, so now I'll show you what it's like to use this system. This is called the cognate assignment board. Here are the various proto-forms, and under each proto-form you see proposed proto-lexemes with the attested forms underneath them. This is a slide, so I can't change anything, but if I were really using the software I would be able to move these around dynamically, to say: actually, I think this attested form belongs with this proto-form. The thing that's really nice about this is that it has already brought together for me all the data that could be reconstructed to the same proto-form. In this way it's very easy to find, for instance, morphemes inside of compounds that are part of the same inherited lexeme. And here is the other major interface, which we call the transducer editor and debugger. This is where I write my sound changes, here on the left. The correspondence patterns are automatically calculated based on those sound changes, applied to the cognate judgments you just saw. You can see here it says the current finite-state transducer is the old one, and there is a button to switch finite-state transducers, so you can compare an old one and a new one, in order to know, for instance, whether a change I propose helps or hurts. One thing that's very nice is that if new correspondences become exceptionless, they turn green and you get a smiley face, whereas if my proposal makes things that used to work no longer work, they turn red and you get a frowny face.
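The "exceptionless correspondence" check behind the green smiley can be sketched as follows. This is my own minimal reading of the idea, with invented, pre-aligned segment data, not CAPER's real data model: tally the segment correspondences from the cognate judgments, and flag a segment whose correspondence set has exactly one member.

```python
from collections import defaultdict

# Toy illustration of the transducer editor's feedback: correspondence
# patterns are tallied from cognate judgments, and a pattern counts as
# exceptionless (the green smiley) when a segment in language 1 always
# answers the same segment in language 2. All data here is invented.

def correspondence_patterns(cognate_pairs):
    """cognate_pairs: list of (segments_lang1, segments_lang2),
    pre-aligned so that positions match."""
    patterns = defaultdict(int)
    for seg1, seg2 in cognate_pairs:
        for a, b in zip(seg1, seg2):
            patterns[(a, b)] += 1
    return dict(patterns)

def exceptionless(patterns, segment):
    """True if `segment` in language 1 maps to exactly one segment
    in language 2 across all cognate sets."""
    targets = {b for (a, b) in patterns if a == segment}
    return len(targets) == 1

pairs = [(["k", "a"], ["k", "a"]),
         (["k", "u"], ["k", "au"]),
         (["g", "u"], ["k", "au"])]
pats = correspondence_patterns(pairs)
```

Rerunning this after editing a rule set, and diffing which segments gained or lost exceptionless status, would give exactly the smiley/frowny comparison the talk describes.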
So you get very active feedback from the machine about whether your proposed sound changes are good or not, but I'll just emphasize: the linguist is making all of the cognate judgments and proposing all of the historical phonology. The computer is not actually doing any thinking; it's just keeping track of the internal consistency of my own work. So now I'll move ahead and apply it to Bola. I'll say that, subjectively speaking, I find the system works almost too well: you find sound changes one after another after another, and just writing them down and keeping track of what you've discovered, what the examples are and what the exceptions are, becomes very burdensome. The discovery phase has become so fast that writing up the findings becomes the annoying part. So we're going to look at a few interacting sound changes. None of them is particularly glamorous, but they show that this approach works to propose sound changes in the history of Bola, with an emphasis on feeding and bleeding relationships and relative chronology. First, two notational conventions, which come directly out of the system. One is a symbol meaning "does not turn into", which you'll see in the data that I give. The other is a dagger in front of what looks like an attested form. The dagger means that, without the sound change that I'm currently exploring, there is no way to generate the actually attested Bola form: that form cannot be generated from any Proto-Burmish reconstruction, is what this way of writing tells you.
Instead, the form marked with the dagger is what we would have expected by applying all of the sound changes other than the one under discussion. For instance, in this case the Burmese form and the Langsu form, which mean "dry", point to a certain Proto-Burmish form, and that Proto-Burmish form would turn into the daggered Bola form according to our current sound-change proposals. But that is in fact not the real form, which is why it's marked with the dagger. The other symbol is the arrow going the other way with the "not" through it. What does that mean? It means that without the sound change in question one could still arrive at the attested form, but the proto-form you would need to get the attested form is not the one you need to get the cognates in the other languages. In this case, for instance, we have this form in Burmese, this one in Langsu, and this one in Bola, and they all go back to the same proto-form; but without the particular sound change that I'm investigating right now, the Bola form would have to go back instead to a different proto-vowel. I think these two conventions are very helpful in terms of increased explicitness: we find out exactly what a particular proposal is contributing to the overall system. Both notations show what it is, concretely, that goes wrong if we don't include a particular sound change. Okay, so now I go on to some sound changes. Here is a kind of vowel fronting after a medial glide. In general, inherited *u becomes au in Bola, but it remains u after an inherited *j. I think that rather than a sound change like "u changes to au except after j", which is a little bit finicky, we should instead assume that the *j changed the quality of the vowel from *u to *y, and that this new vowel *y was not targeted by the change to au. So if you buy that explanation, we need this change.
And without this change, the relevant words in Bola would have to reconstruct to this other vowel rather than to *u. But we know that that's not what happened, because of the final -k in Langsu. I'll show you the evidence now. I've put the word "Bola" in bold, just because these charts of evidence have a lot of material on them, so that your eye will be drawn immediately to the Bola form. You see here that Langsu has a final -k, so we know that the proto-vowel must be *u, but the Bola vowel has not changed to au, as *u usually does. That's why I proposed this change. Am I doing okay with time? Okay. Next is the change to au that I've already referred to. This has to come after the change we just proposed, because the change we just proposed bleeds this change. And there's lots of evidence: "cry", for instance, at the bottom, a good Tibeto-Burman word; it's not u in Bola but rather au. Then, since we've created this vowel *y in order to get the right outcomes from the *u-to-au change, we now have to get rid of it and change *y back to u. We have to do that, or again these attested Bola forms could not be cognate with these Langsu forms. A very small change is that final -k becomes a glottal stop. It's hardly worth mentioning, and probably quite late; in any case it has to happen after the *u-to-au change, because the final -k was part of the conditioning environment. Okay, and then we have this next change. I'll admit I'm struggling a bit; I'm not so great at reading IPA. In the following cognates the important factor is that Langsu does not point to a final -k. Remember, in Langsu *u changes to -uk, the very famous change in Langsu first discussed by Robbins Burling; in these cases Langsu doesn't have -uk, it has au.
So it must go back to a different vowel in the proto-language, and I propose that that vowel also becomes u. This change is mostly independent in chronology from the preceding ones, except that it can only happen after *u has done all its stuff, including changing into au, because otherwise this change would have fed those changes. And the following examples could have been fed by that change, which is why a reconstruction is possible for Bola without this proposal. Here is where the notation the system gives you is quite nice. Here it's saying: I can't get the attested Bola form; instead, the attested Bola form should have been this one, unless we include the sound change. And here it's saying: I could get this attested Bola form, but it would have to come from a different proto-form than the one the other languages are pointing to. I think that kind of subtle distinction is a nice output of using the CAPER program. So we need this sound change in order to get these attested forms; we don't need it in order to get those attested forms, but we do need it in order for those attested forms to be cognate with, for instance, the Old Burmese and Langsu forms that you see on the screen in front of you. Okay, and then lastly: the advantage of all precise work in historical phonology is that we can identify loanwords. Because the outcome in Langsu here is irregular, showing u rather than au, or if you like u rather than uk, which are the outcomes in Langsu of the two proto-vowels, there is in fact no way for us to derive these words. And you see they're actually extremely close between the different languages.
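The relative chronology laid out above, fronting of *u after *j, then *u breaking to au, then the new *y reverting to u, a distinct proto-vowel becoming u only afterwards, and final *-k weakening to a glottal stop, can be sketched as an ordered list of rewrite rules. The segments here are schematic stand-ins for the talk's examples, not the actual Bola data, and real conditioning environments are richer than these toy patterns.

```python
import re

# A toy relative chronology in the spirit of the talk. Rule 1 bleeds
# rule 2 (post-j *u escapes breaking), and rule 4 is counter-fed: if it
# applied before rule 2, the new u would wrongly feed the breaking.

RULES = [
    ("ju", "jy"),   # 1. *u fronted to *y after *j
    ("u", "au"),    # 2. *u > au elsewhere
    ("y", "u"),     # 3. the new *y changes back to u
    ("o", "u"),     # 4. a distinct proto-vowel > u, ordered after 2
    ("k$", "ʔ"),    # 5. final *-k > glottal stop, quite late
]

def derive(proto):
    """Apply the rules in their proposed chronological order."""
    form = proto
    for pattern, repl in RULES:
        form = re.sub(pattern, repl, form)
    return form

def derive_misordered(proto):
    """Same rules, but with rule 4 before rule 2, to show the feeding
    problem the proposed chronology avoids."""
    order = [RULES[0], RULES[3], RULES[1], RULES[2], RULES[4]]
    form = proto
    for pattern, repl in order:
        form = re.sub(pattern, repl, form)
    return form
```

With the proposed order, a schematic *juk keeps its u (via the *y detour) while *kuk breaks to kauʔ, and *ko yields ku; with rule 4 misordered, *ko would wrongly surface as kau, which is exactly the feeding relationship the chronology is meant to rule out.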
So, I think the fact that none of our historical phonology can get us these forms, while the forms are phonetically, synchronically, extremely close across the languages, suggests that they're loanwords. Now, I don't necessarily know what language they're from. Did Langsu borrow from Bola, or Bola from Langsu, or maybe both borrowed from some other language? But I would point out, just from my knowledge of Burmese, which is not very good, that one of these words means "happy" in Burmese and the other means "matter" or "fair" in Burmese. So my guess is that these are loanwords from Burmese into both of these languages. So, just to sum up what we've done: I introduced this software package designed by Xun Gong and then rewritten by Seth Knight, and I've shown you how I've been using it to explore the history of Bola, and in particular the relative chronology of sound changes in Bola, these feeding and bleeding relationships. This is only one small part of Bola historical phonology that I've been able to explore using this method, but I think it shows some nice interactions between sound changes, mostly to do with the vowels and similar phenomena, once we've worked out the relative chronology. And we have used it also to identify some loanwords, probably from Burmese into these languages, which can now be separated out and not considered cognate anymore. And that's my whole presentation. Yeah. Yeah, so let me just say, if you're getting started in the first place, what it needs is some lexical lists that are in IPA where the semantics have been aligned somehow. So for example, in the STEDT project at Berkeley they already digitized the Sun Hongkai data and the big Huang Bufan book.
So the easiest thing to do, because those are already in IPA and already semantically aligned, is to use any data in there; it's already ready to go, if you like. If you want to use data that's not in there, then you need to prepare it yourself, and what that means is it needs to be in IPA, and somehow the semantics need to be aligned. The way we would recommend doing that is with a thing the Max Planck Institute has, called Concepticon. This is a kind of standardized way of referencing semantics, and they provide some software tools that help you do it semi-automatically; those tools also cover Chinese. So basically, if you start from a list that is just two columns, one IPA and one the meaning in Chinese, then you could run a Python script over the Chinese meanings and it would assign each of them to a reference in Concepticon. Then you would have to check that, and once you had done that, it's ready to put into our system. Now, you asked: do you have to have any sound-change proposals in the beginning? No, is the answer, but you do have to have cognate judgments. So if you're starting with, let's say, two languages no one has ever worked on, just pretend, how do you get those initial cognate judgments? There's also a piece of software that Mattis List has made, called LingPy, and it will do that for you. Basically it looks at phonetic similarity, and at some correspondences we are used to, like k corresponding with tʃ, and things like that. So it's rough; it only gets about 80% right, I would say. But it's very helpful to save you the time of doing the very obvious things. So if you were really starting from the beginning, you would have those.
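LingPy's actual cognate detection (its LexStat method) is far more sophisticated than anything shown here, but the rough first pass it provides can be sketched in the same spirit: flag word pairs with the same aligned meaning whose IPA strings are similar enough to deserve a human look. The word lists below are invented placeholders, and string similarity is only a crude stand-in for proper sound-class comparison.

```python
from difflib import SequenceMatcher

# Toy first-pass cognate search: same gloss plus rough phonetic
# similarity. Glosses are assumed to be already mapped to a shared
# reference such as Concepticon. All data here is invented.

def similar(a, b, threshold=0.5):
    """Crude phonetic similarity via string matching ratio."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

def candidate_cognates(list1, list2, threshold=0.5):
    """Each list holds (ipa, gloss) rows; returns candidate pairs
    for the linguist to confirm or reject."""
    hits = []
    for ipa1, gloss1 in list1:
        for ipa2, gloss2 in list2:
            if gloss1 == gloss2 and similar(ipa1, ipa2, threshold):
                hits.append((ipa1, ipa2, gloss1))
    return hits

lang1 = [("kauk", "dry"), ("mi", "fire")]
lang2 = [("kuk", "dry"), ("pa", "fire")]
hits = candidate_cognates(lang1, lang2)
```

As the talk says of LingPy, such a pass is only a time-saver for the obvious cases; every candidate it proposes still has to be checked by the linguist.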
You'd have a spreadsheet for each language, you would align the semantics, and then you would run LingPy over it all, and then you would put that into CAPER. Now, if you'd like to work on the Burmish languages, we have already digitized everything we can get our hands on, which in particular means most of the works by Dai Qingxia. He and the people he works with have published short grammars and long grammars of, I don't know, five or six different languages, and each one has a vocabulary at the back, some a thousand words, some four thousand words. We've digitized all of that already; it's up on Zenodo and you can download it. So basically, if you treat it like a computer game, you have two screens, this one and this one, and you can start in either one. Now, if you have that data and those cognate judgments, you don't have to have any sound-change proposals, and you will still get this screen, because it is showing you the correspondence patterns, and those patterns exist whether or not you have made any proposals. It's just a fact that, in this case, Langsu k corresponds to Bola k in these examples; the system just states that for you. And now you might, for instance, propose, because you know about the work in the literature, that *g becomes k initially in both of these languages. So then I write that proposal over here. I have to write it as a finite-state transducer, which takes a tiny bit of care, but actually the syntax of finite-state transducers is extremely close to how we already write sound changes in the Sound Pattern of English notational model that we all learn anyhow. So you write the sound changes there, and then you press this button that says "calculate", and then you will see.
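The point about SPE-style notation being close to transducer syntax can be illustrated with a tiny compiler from rules of the shape "A -> B / C _ D" into substitution functions. This is my own simplification for illustration, not CAPER's actual rule syntax: contexts here are literal strings, with no feature bundles or word boundaries.

```python
import re

# Minimal illustration: compile an SPE-style rewrite rule
# "A -> B / C _ D" (context optional) into an apply() function,
# using lookbehind/lookahead so the context is not consumed.

def compile_rule(rule):
    change, left, right = rule, "", ""
    if "/" in rule:
        change, context = rule.split("/")
        left, right = context.split("_")
    source, target = (part.strip() for part in change.split("->"))
    pattern = re.escape(source)
    if left.strip():
        pattern = "(?<=" + re.escape(left.strip()) + ")" + pattern
    if right.strip():
        pattern += "(?=" + re.escape(right.strip()) + ")"
    regex = re.compile(pattern)
    return lambda form: regex.sub(target, form)

# schematic rules from the Bola discussion, written SPE-style
fronting = compile_rule("u -> y / j _")   # *u fronted after *j
breaking = compile_rule("u -> au")        # *u > au elsewhere
```

The familiar notation carries over almost directly, which is the talk's point; a real finite-state transducer adds rigor (composability, invertibility for the backwards runs) rather than a new way of writing rules.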
So let me just, if you can look really closely at the screen: what you have here is the attested form, and then you have two reconstructed forms. The first is from the old finite-state transducer and the second is from the new finite-state transducer; in this case they're the same, because we're not working on this question. If, say, the old one was somehow bad and didn't predict this form, there would be an X here, and if our new sound change fixes it, then you would get a check, and it would also turn green. So here is where you write your sound changes, and then you run your transducers over the data to get more and more checks and fewer and fewer X's; that's the idea. But actually it's not quite as easy as that, because almost always, when you propose a sound change, it gets you some things right, but it also breaks some things that had been working. And the more refined your sound changes get, the easier it is to break everything, so you have to be on your toes. So let's say I work on the transducers a while, and now I have a better historical phonology. This is the moment where I can find new cognates. Think of it this way: think of German Zaun, which means "fence", and English town. If you're just thinking in the abstract, using very strict semantics, you would never come up with this pair. But once you've figured out the historical phonology, it becomes obvious. And this is where the process that Xun Gong calls "refishing" happens: we take our new sound-change proposals, we go and grab the whole lexicon that we have, and we reconstruct the proto-forms of all the lexical items. And that's when we get back to here.
So let's say this was German Zaun and this was English town: it would find that they have the same reconstruction, and then it would put them next to each other on this board. And that's where the computer is really being helpful, because it says: well, now Zaun has the same proto-form as town; do you want to put them together? It would have them in different columns, and then I decide: okay, Zaun, town; maybe a town is a thing with a fence around it, so I can propose the connection. But maybe one of the words means "chalk" and one of the words means "cheese"; then I might say it's a coincidence, and I won't combine them. And that's the way the computer is being helpful: I improve the historical phonology, then it looks for cognates, and then I get to decide whether I like those cognates or not. Then I have more data, and maybe I can improve my historical phonology more; if I improve the historical phonology more, it looks for more cognates, and that's a sort of virtuous circle.
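The "refishing" loop just described can be sketched by combining the backwards reconstruction with a grouping step: reconstruct every item in every daughter lexicon under the current rules, and group forms whose candidate proto-forms coincide. The rule sets and the Zaun/town forms below are schematic placeholders (real Proto-Germanic reconstruction is of course more involved), and the naive string undo is the same simplification as before.

```python
from collections import defaultdict

# "Refishing" sketched: after the rules improve, reconstruct the whole
# lexicon and surface forms that share a candidate proto-form. Those
# groups are the new cognate proposals the linguist accepts or rejects.

def preimages(form, rules):
    """Candidate proto-forms for `form` under ordered (old, new) rules."""
    candidates = {form}
    for old, new in reversed(rules):
        step = set()
        for cand in candidates:
            step.add(cand)
            if new in cand:
                step.add(cand.replace(new, old))
        candidates = step
    return candidates

def refish(lexicons):
    """lexicons: {language: (rules, [forms])}. Returns proto-form ->
    list of (language, form) pairs, kept only when more than one
    language supports that candidate reconstruction."""
    board = defaultdict(list)
    for language, (rules, forms) in lexicons.items():
        for form in forms:
            for proto in preimages(form, rules):
                board[proto].append((language, form))
    return {proto: hits for proto, hits in board.items()
            if len({lang for lang, _ in hits}) > 1}

# schematic Zaun/town example: German affricated *t and broke *u:,
# English only broke *u:
lexicons = {
    "german": ([("t", "ts"), ("u:", "au")], ["tsaun"]),
    "english": ([("u:", "au")], ["taun"]),
}
matches = refish(lexicons)
```

The output board pairs tsaun with taun under a shared candidate proto-form, and the human decision, fence and town together, or chalk and cheese apart, remains outside the code, exactly as the talk insists.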