What I'm going to talk about today is probably familiar to many of you, if you've seen any of the work I've done: it's mostly these questions of rhyme and prosody in early prose. Although, of course, most of the information we have from the early period comes from poetry, so we're going to do a bunch of poetry, too. And then how do we scale up? How do we do all of this at essentially as large a scale as possible? So we're going to talk about that. The other thing that I think some people have found interesting in my work, anyway, is all the imperfect rhymes, right? So we're not just going to talk about perfect rhyming, but the heyun and the tongyun. What are these? And I use Wang Li's definition for these, with apologies to Chao Miao-gao, right? But what does it mean to have imperfect rhyming? And what can we learn from our sources when the rhyming is imperfect? So let's go ahead.
As some of you may know as well, I've also been working quite a bit in intertextuality and early Chinese philology these days, doing lots of network stuff. This talk has no networks in it. Sorry about that. If I'd planned better, perhaps I could have included some networks. But Chris Foster, if you want to talk about alignments of bibliographies, we're doing that as well. We're doing what's called text reuse across very large corpora, but it works very nicely for bibliographic citations as well.
More importantly, what are we going to talk about? Well, we're going to talk about min. We're going to talk about the rhyming, right? Because I find it a very illuminating case study for thinking about the finals in poetry. And of course, what were the sounds of these words and how did they change over time? Palatalization here is the main question. And then we'll talk a little bit about algorithmic approaches to detecting and especially visualizing what I call phonorhetorical patterns.
We're going to see one text from the early Western Zhou, and then we're going to see a text which I date to the Western Han. We could argue about this if you want. I'm not sure. And that's the Guoyu. So the real key to the second part is then scale. Can we scale up our elucidation of these patterns to be able to use them for any text of any period? And of course, this scales beyond Chinese. If you can do this for Chinese, that's great. Can we also do it for languages that use phonetic symbols, letters and such, in the way that they express sound? And I think we can. So that gets very exciting. And then for the last bit, we're going to do a little bit of what's often called text-to-speech. It's a hot new field in tech. And you'll find that what I present here today is already well out of date, but hopefully it will give some hope about where we're going with all this.
All right, so let's jump in. A lot of what I'm going to talk about comes out of my dictionary that I built some years ago. If you want to use the Digital Etymological Dictionary of Old Chinese, you can just go to edoc.uchicago.edu and click login and it'll let you in. You don't have to register, but I do that so that people hopefully don't just take the whole thing. Anyway, the other thing I'll say is that this talk, I'll go back to the first slide, is here. If you'd like to grab a copy of it, because some of it is very hard to read, especially over Zoom across thousands of miles, you may want to just download the PDF and view it from there.
All right, so let's go. Min is one of my favorite words in Old and Middle Chinese, because what we know about it comes from, of course, the Qieyun and the Guangyun, and it's in this fun category, the Zhen bu. So this category has two different finals in particular, and then also other finals. So it's not quite a catch-all, but there are various different sounds that are all represented within the Zhen bu.
And that means that we have to really work to figure out what the sounds, what the finals of these words potentially were. And for the min of renmin, almost everybody, pretty much everybody, reconstructing this graph back from our rhyme dictionaries has given it either an -in or an -ən final, right? So it's either the schwa n or this i n. Sometimes with a glide: especially in Middle Chinese, I think there's a really good set of evidence for there being a glide, mjin, something like that, within it. That's what Pulleyblank did. Zhengzhang Shangfang and Starostin were convinced it's min, which is interesting. And then, for the first time in 2011, Baxter and Sagart decided that the final consonant in this word was not -n but -ng. And so how did they figure this out, right? How do we know, and what's the evidence for this?
And I just want to present this as a case study, because from what I'm seeing, and I got to look through the other presentations, I think that when you do the network graphs of rhymes and words that rhyme together, you're finding a lot of liminal cases as well. You have cases which are fairly solid, and then you have words which kind of fit in between various clusters in the network. And those are the most interesting to me, anyway. Trying to think about how we tease out exactly what the sounds were of the words in those liminal spaces, where it's not clear if this was maybe Yang Buzhe or Gong Buzhe, can be really exciting. So this is just a case study to look at one example.
The best evidence for min and the final, and here I agree with Baxter and Sagart, comes from the Shuzin. And it comes from this word, what we think of as dian, because in most modern orthography, the right side is Tiki Enzhi, right? But if you actually start looking into it, this is probably an orthographic mistake or variant, depending on how you think of it.
Duan Yucai noticed this, and he says that in the Shuowen it's always written this way, with the dian on the right, and it says zhen is the phonetic, right? But if you go to the Han Stone Classics and look at these steles, they write dian with a ling on the right. And he says that's the correct original version of the graph. And I think Duan Yucai is correct. And what it leads to is we have various potential sources for Ng, and then it changes in some cases to Ng, some cases to Ng, and sometimes it stays Ng, and this is from Baxter and Sagart. It's newer work. So if you go to the Mawangdui manuscripts, the Laozi, this graph does occur twice, and you can see fairly clearly it's a ling on the right in this one, and this is actually a ming, so something here on the right. So that's pretty conclusive evidence that the phonophoric, the shengfu, originally was ling and not Tiki Enzhi.
Okay, so we can do that. And we have pretty solid evidence now for that graph. What about the other graphs in the sequence? And this again is where the network theory comes in. I think most of the folks here, as you've been involved in this, are running into cases like this, where you have characters that fall in places where they could be rhymes, and the sound is close, but is it close enough to actually make an educated guess about what the final actually was? And that's the real question here. In my opinion, these are not actually rhymes, the ones in the red boxes here. And that's because these are hekou. It seems that this palatalization that we see with ming is mostly a kaikou phenomenon, kaikou as well for routine, but shing and shen, being hekou, don't seem to fall in the same category, and therefore the finals here, being -in in the Baxter-Sagart reconstruction, are probably correct.
And then we get to the other example. And the other example is this Wan liu from the Xiaoya in the Shijing. And we run into this word, usually pronounced chen, first tone.
So the question then becomes: if we have so much good evidence for pronouncing this graph as something like jing, what is the rhyme here in this stanza? This is the entire third stanza. And what does it tell us about the word tian? Is it possible that tian had a final -ng, that this is also one of these cases of palatalization? I don't think so. And the reason I don't think so is because when you go to the Qieyun and the Guangyun, it ends up in the xianbu. And if we look over here, that's the shen, which is also in the xianbu. I think we have pretty good evidence that the final for tian, if we follow the system, is probably -in. But zhong has a very similar profile to min. And if zhong here is indeed following the same change over time that we see with min, we can propose that the final here was actually -ng. And therefore, this was pronounced something like zing, string. And this then would be the rhyme. And then you have a cross rhyme, essentially, with tian. Putting tian at the end of your first phrase is always pretty effective. I would think it's a nice word to drop in there. And so the question then becomes, is this an ABB rhyme or AAB? It's still uncertain. We need more evidence. But this is the kind of work, at least down on the ground level, that I think you can do with all of the graphs you find where you're running into this liminal space, this place where they don't seem to fall 100% into one rhyme group or another. And that gets very exciting. So I expect great discoveries from the network analyses.
All right. Let's move on to the second part. And I'm going to run through this pretty quickly. Mostly what I want to do in this section is talk about prose, because we take much of our evidence for rhymes from poetry, and that's right. But there's an incredible amount of evidence in prose as well, going all the way back to the Western Zhou bronze inscriptions.
So if you look at the bronze inscriptions, and I've got two here on the page, and I'm sorry you'll not be able to read the characters or anything, this is mostly to give a bird's-eye view of the text and to try to think of the prosody of the text as a whole. Where are the sections where they're using one device? Where are the sections where they're using a different device? And then can we train a system that will help us figure out what those sections are, and how they might have sounded at the time, at scale, for any text from any time period in Chinese history? And I think we can, with some work.
So you can see here in the Da Yu ding, which is the text we're going to look at, you have a section of cross rhyming. This is heyun according to Wang Li's definition. We have some perfect rhyming, but you have really a lot of heyun cross rhyming. And that's true here even in the bell inscriptions, which tend to have the most perfect rhyming in the bronzes: there's still a lot of what we consider heyun cross rhyming. And in this case, that means that the final consonant is identical, but the medial vowel is different for the most part. And we see it as well in works like the Kang gao. We see it in the Shangshu. We see it in a bunch of places, as everybody probably knows at this point.
So if you look at the actual text, and here we are, this is the bronze, probably cast in 981 BCE. That's the best guess. And you look at these, and what you see, of course, is four, five, and then seven. And then I believe this is also seven and then four. So the metrics don't line up the way that poetry would. Most poetry tends to have a fairly regular metric structure. Even into the fu and the pianwen of the Han dynasty, you'll get couplets, and the couplets normally have the same metric length in terms of the number of graphs in them. But in prose, it varies widely. And we have to be very careful about that, because can we adduce rhymes when the metrics are different?
And I think we can. I think there's plenty of good evidence now that these are clearly rhyming texts, even though the metrics are not perfect. And you'll also notice the variation here in the medial vowel, as I said, where you have wang and then ling, bang, fang and ming, right, ming read with the -ng final. So what we see coming out of the Western Zhou is wide variation, as far as we can tell, in the medial vowel within a sequence that seems to be at least loosely rhyming, slant rhyming, cross rhyming, however you word it.
Then we see right next to this two sections that seem to rhyme perfectly. And here this is even four, four tetrasyllabic rhyming, just like in the Shijing. Or the next line, where you have five and then five, also seems to be a perfect rhyme. So these couplets show up all over the place in early Chinese. Other sections rhyme here too: again, we have four, four ending in un, and then un, and then the glides that Chris was mentioning. We do sometimes see rhyming with glides all the way back in the Western Zhou bronze inscriptions. And then all kinds of cross rhyming where the metrics are all over the place. And then you start to have to wonder, is this really still rhyming? Are they using this for its sound value? And I think the answer is probably yes, it's a prosodic construction, even though it's not the type of construction, not the type of formulation, you would normally expect in poetry. And then a last section here, including the dedication, which kind of rhymes and kind of doesn't. It's a hard call here to say whether or not this rhymes.
The other thing that we see in these texts, and I think this is actually really important, is rhyming with final -k quite a bit. So final -ng and final -k are the most common. But again, there is wide variation in the medial vowel and variation in the metrics. So you have to be ready for this when you're using prose, at least prose like this.
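A minimal sketch of the distinction used in this section: two finals count as a perfect rhyme when vowel and final consonant both match, and as a cross rhyme (heyun, in the working definition given in the talk) when only the final consonant matches. The `Final` type, the function names, and the toy finals are all assumptions made for illustration; they are not readings from the inscription and not the speaker's own code.

```python
from typing import List, NamedTuple


class Final(NamedTuple):
    """Hypothetical representation of a syllable final."""
    nucleus: str  # main vowel
    coda: str     # final consonant, "" for an open syllable


def classify_rhyme(a: Final, b: Final) -> str:
    """Label a pair of finals as 'perfect', 'cross', or 'none'."""
    if a == b:
        return "perfect"
    # Same final consonant, different vowel: treated here as a cross rhyme.
    # Open syllables with different vowels are treated as non-rhyming
    # (a simplifying assumption of this sketch).
    if a.coda and a.coda == b.coda:
        return "cross"
    return "none"


def classify_sequence(finals: List[Final]) -> List[str]:
    """Classify each adjacent pair of line-final syllables."""
    return [classify_rhyme(x, y) for x, y in zip(finals, finals[1:])]


# Toy data only: five line-final syllables with made-up finals.
toy = [Final("a", "ŋ"), Final("i", "ŋ"), Final("a", "ŋ"),
       Final("a", "ŋ"), Final("ə", "k")]
labels = classify_sequence(toy)
print(labels)
```

Comparing only adjacent line endings is the simplest choice; prose with irregular line lengths would likely need a wider window, and a real system would have to decide which reconstruction supplies the finals.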
And then the last part that I think is really important, especially if you're going to do this at scale, is to look at these and recognize that there are plenty of sections that have no detectable rhyming whatsoever. Or if there is rhyming, it's very, very subtle. Here in the gift list, you could make an argument for the Ock here creating a couplet. But it falls right in the middle of the gift list. Here's another Ock that comes. So is this potentially a rhyme? Sure. And then the finals that run through here, almost everything else ends in Ock. Does the gift list actually rhyme in this case? Maybe. The rest of the text, as we can see over here, does have a huge amount of cross rhyming and even perfect rhyming. So maybe even the gift list is rhyming, but it's a very hard call. And what we're going to do when we turn to algorithms is we're going to ask the machine to make these judgments for us, and then we're going to check them. So this is where it all gets actually quite complicated.
All right. So that's the Western Zhou. If we move to the transmitted texts, we see a slightly different story. And the story then in the Zuozhuan and especially the Guoyu is a lot more perfect rhyming than we see in the Western Zhou bronze inscriptions. There's still some cross rhyming, but it's clearly heading more and more toward the use of perfect rhyme. And again, this is in prose, not just in poetry. So the text that we'll rip through real quickly here in the Guoyu is the Yue yu xia chapter, so the Discourses of Yue, second part. And what we see is lots of use of the -ang final. And here then potentially a five, four, four, kind of an AXA construction, or a BXB construction here in the second half. It's hard to tell, right, which is why it's in light blue instead of in darker colors. But for most of this text, and this is very true of most of the speeches you run into in the Yue yu xia chapter, you see four-four rhyming in extremely metrically regular constructions, mostly tetrasyllabic.
And that, of course, harkens right back to the poetry of the Shijing. They were, I would say, clearly using this as a rhetorical device. Since the poetry had such gravitas, they're echoing what they find in the Shijing in their prose as well. At least this is how it reads to me. We have other sections where you have an -ang, kind of an XA, XA construction, followed by three lines that seem to rhyme in -et. And that's the dark green here, perfect rhymes. And then final -a would be basically CCXC in this last section. So three very different rhymes, but all perfect. So there are perhaps three different sections to this passage, which is just one way to think of it. And you can split them by rhyme. And that's one of the beautiful things I think you can do with a lot of early prose: split sections of your manuscript or your text by how they rhyme, when they do rhyme or cross rhyme.
This last section is probably my favorite from this text, because it's basically talking about military history. Ernest, if you haven't seen this, you'd probably like this. They're basically giving a little bit of military instruction in, as far as I can tell, basically perfectly rhyming tetrasyllabic prose, right here in the Guoyu. So, you know, you want to build up the left as this mu, as this male part, right. And the female then is the right flank. But then, of course, it goes back into our cosmological didactic instruction: obey the ways of heaven, and surround and revolve, but don't concentrate your forces in this case, right. So relatively perfect rhyming, even in a somewhat militaristic context. Pretty fun. And then, like we see in the bronze inscriptions, there are long sections where there is no discernible rhyme at all. And, as I say, you have to be ready for that. You have to expect that, actually, in many cases.
So when you scale back, when you look at this from a bird's-eye view, you can start to use color coding to look at the prosody of entire texts arrayed against each other.
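The scheme labels used in this passage (AXA, CCXC, with X marking a line that rhymes with nothing) can be produced mechanically once each line ending has a final assigned to it. The sketch below is an illustration under stated assumptions, not the speaker's tooling: the function name `rhyme_scheme` is hypothetical, finals are plain strings, only perfect matches share a letter, and letters restart at A for whatever passage is passed in. The resulting letters are also the natural keys for a color map.

```python
from collections import Counter
from typing import List

# "X" is reserved for unrhymed lines, so it is left out of the letter pool.
# This sketch supports at most 25 distinct rhymes per passage.
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWYZ"


def rhyme_scheme(line_finals: List[str]) -> str:
    """Return one letter per line: shared letters mark perfect rhymes."""
    counts = Counter(line_finals)
    assigned = {}  # final -> letter, in order of first appearance
    scheme = []
    for final in line_finals:
        if counts[final] < 2:
            scheme.append("X")  # occurs once in the passage: unrhymed
            continue
        if final not in assigned:
            assigned[final] = LETTERS[len(assigned)]
        scheme.append(assigned[final])
    return "".join(scheme)


# Toy finals only, not readings from the Guoyu.
print(rhyme_scheme(["aŋ", "ə", "aŋ"]))
print(rhyme_scheme(["aŋ", "ə", "aŋ", "et", "et", "et"]))
```

Because a change of letter marks a change of rhyme, the same output can be used to propose section breaks, which would then need checking by hand as the talk describes.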
So here we have the Da Yu ding, here we have the Liancizhong, and then over here we have sections of the Guoyu. And just by the colors you can tell how different they are. I'm not sure that my color scheme is the best, I'll be honest. So one of the things I was thinking about, looking at the networks, is whether, instead of using these more random colors I chose, we could set up a gradient or a set of gradients around various rhyme groups or groups of rhyme groups, where each cluster has its own color or color scheme, and then the gradients show where one color bleeds into the other. If you can do that all algorithmically, you could plot all of this in a very nice series of gradient color schemes that would show the prosody, that would show the rhyming and the cross rhyming, all mathematically. So something to think about.
All right, a few issues that need to be brought up as we do this. So we have a lot of gradients. In this case, I did the work by hand, which means I get to adjudicate what the correct sound would have been. But that's a hard call, in OC especially: what was the correct reconstruction, right? And as we saw with words like min, if you have competing reconstruction systems, which one do you choose? We think maybe neural networks will help here. That's the hashtag AI for OC. Other, more complex renderings of the ways that graphs had multiple pronunciations are probably the best way to go. The other thing is that some lines will naturally fit multiple of what I call phonorhetorical patterns. These are phonological patterns. And sometimes the patterns are not end-rhyme-based. As we saw with the glides in the early part of the Da Yu ding, sometimes you have inner rhyming within a line. And you want to make sure that the system is also picking those up, probably. But it can be too much. If you highlight every graph with a different color, does that really help you understand what the sound system involved in the text was?
That's a really hard question. And then the last one: we have evidence, especially in the sounds of repetition used as a rhetorical device, of alliteration in some places. How do we also show those types of prosodic devices and constructions? And I do want to give a quick shout-out to Gian Rominger and Nick Budak, who through a project at Princeton are heading in this direction, trying to work toward a large-scale system where we could use natural language processing and algorithmic approaches to understanding the prosody of Old Chinese.
Along those same lines, for those working on Han dynasty texts, I would just urge you to consider all of the ones here on the screen. So the Guoyu and the Zhanguo ce: we could argue about what those actually date to. They probably were originally written a little earlier than the Han. But I think the versions we have probably date, at least in part, to the Western Han. We have the pianwen from the Han dynasty. We have the early historiographies, so the Shiji and the Hanshu, and then of course the poetry. The fu are wonderful if you look at their prosodic constructions, and the yuefu are interesting. And then, as we saw with Chris, a wealth of manuscripts, right? Manuscripts are a little more difficult to work with in some cases, but I think it will pay off. And especially if we can do this algorithmically, scale is not really an issue. So we could do all of the manuscripts found from the Western and Eastern Han, Nanbeichao, Tang dynasty, whatever you want. All of Dunhuang, right? So that's the real benefit of doing a lot of this computationally.
All right. I only have a few more minutes, so we're going to do this kind of quickly. With the rise of the Alexa devices and other things like Google Home, text-to-speech has become a thing. And when you're doing text-to-speech, most of the time, we're doing it in a modern context.
But there's no reason we couldn't do text-to-speech for any language if we can produce a reasonable IPA for it. And here, let's think about Tang dynasty Middle Chinese. If I'd thought about it better, I could have produced a Han dynasty text in the system. It is algorithmically based, so you can actually use any text for this. But here, what we have is Du Fu, right? And it's probably illegible. Again, you probably want to download the PDF to see all the different sounds here. But you can have it read in modern Chinese. Well, we can all read it in modern Chinese. That's not a problem. The MacinTalk system from many years ago allows you to render phonemes in a speech synthesizer. And what that means is you can produce IPA, you can convert it to this MacinTalk version of the phonetics, and then the machine will read it as phonetics. The main problem here, in my opinion, is that the speech synthesizer is a very generic one. It's based on English and it's already about 10 years old. And it very much sounds like WarGames, if any of you remember that movie. But let's see what we hear from the Victoria voice reading the Middle Chinese phonology for this Du Fu poem. Ready?
South in Tong, New York, New Sin, Xium.
Yay! See what you like to play again? It's by no means perfect. But we're getting better and better at speech synthesis. And I think there's no reason that we couldn't actually build a system for Middle Chinese at this point. I don't know that we could do one for Old Chinese, since the actual phonetics are still relatively in doubt. But given the wealth of information we have for Middle Chinese, I do think an algorithmically based speech synthesizer is very much a possibility, and one that doesn't sound like WarGames. We'll do one last one. This is the famous Wang Wei poem here. And let's see if we can have this then read in Middle Chinese by our computer.
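The conversion step described here, from an IPA transcription to the phoneme codes a synthesizer accepts, can be sketched as a longest-match table lookup. Everything specific in the block is an assumption: `TOY_TABLE` and its codes are placeholders and not the real MacinTalk phoneme inventory, the function name is hypothetical, and tones are ignored. What it does show is the property that matters for rhyme: identical IPA segments always map to identical codes, so two syllables with the same final are sent to the synthesizer as the same sequence.

```python
from typing import Dict, List

# Placeholder mapping from IPA segments to synthesizer phoneme codes.
# These codes are invented for illustration only.
TOY_TABLE: Dict[str, str] = {
    "tɕ": "CH",  # two-character IPA segment, must win over plain "t"
    "t": "t",
    "m": "m",
    "n": "n",
    "ŋ": "NG",
    "s": "s",
    "i": "IY",
    "u": "UW",
    "a": "AA",
}


def ipa_to_codes(ipa: str, table: Dict[str, str]) -> List[str]:
    """Convert an IPA string to phoneme codes by greedy longest match."""
    keys = sorted(table, key=len, reverse=True)  # try longest segments first
    codes = []
    i = 0
    while i < len(ipa):
        for key in keys:
            if ipa.startswith(key, i):
                codes.append(table[key])
                i += len(key)
                break
        else:
            # Fail loudly instead of silently dropping an unknown sound.
            raise ValueError(f"no mapping for {ipa[i]!r} at position {i}")
    return codes


print(ipa_to_codes("miŋ", TOY_TABLE))
print(ipa_to_codes("tɕiŋ", TOY_TABLE))
```

Raising an error on an unmapped symbol is a deliberate choice in this sketch: a gap in the table would otherwise produce a reading that sounds plausible but is missing a segment.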
So, despite the poor quality of the speech synthesizer, hopefully you can hear that it did at least get all of the rhymes relatively perfect. And you can see in the rendering that it's basically able to handle a version of phonemes. And therefore, if you do it correctly and you do have the same phonemes, it will hopefully speak them identically. And that's what you want from the machines. So that's really it. My 30 minutes are up. We can go ahead and talk about any of this stuff or adjacent things that we're doing. And thank you once again.
But I'll just pick out one thing that you mentioned, which is alliteration. I don't know, like 15 years ago, very early on in my engagement with Chinese historical phonology, I taught a class where I just wrote out several of the Shijing poems in the Baxter-Sagart system. And I was seeing alliteration as a poetic device. But it's the sort of thing you don't want to tell anyone, because the least secure part of this reconstruction is the initials. So, I mean, that's kind of the whole question, even though it's not a question. How do you think we can make progress on the study of alliteration as a poetic or rhetorical device? Or will it ever be possible to use the existence of alliteration as a poetic device to actually inform reconstruction?
Ooh, will it ever be possible? Yes. Yes, I hope so, assuming human history continues. Yes, eventually it will be possible. One of the things we're missing is data, right? So we have rhyming poetry from the OC period, we have, as I showed, the bronze inscriptions, which do rhyme, you know, won't go away. It works pretty well. So what exactly can we tell about the initials, is your question? And that's such a hard question. Karlgren spent much of his life actually trying to systematize graphs based on their components. And you could tell he tried to systematize the initials because the finals become relatively clear, right?
In general, we kind of know what rhymes and cross rhymes and so on. That's great. Would I ever stand up in my lifetime in front of people and say, yes, I know what the initials of these words were in Old Chinese? Probably not. I think you made the right choice there, because there's just too little evidence. We're working backward from mostly the Qieyun and the Guangyun here, right? What evidence do we have for alliteration as a prosodic device in the pre-Qin period? I don't think we have a lot. There were poetic traditions that used alliteration as one of their main devices. I don't think Old Chinese is one of them. So I think you hit it right on the head. We don't know. We have conjectures, and I think that the conjectures are fine. More data will help our conjectures get better and better. But when we think about representing uncertainties, that's by far the biggest uncertainty in OC.
So as a research program, do what we're doing, look at loan words, look at Sino-foreign things and whatnot, and then maybe as the initials become more clear, their use for poetic and rhetorical devices will also become more clear. But probably Chinese as a poetic tradition just never made... It's not like Old English. So we'll never be able to say, because there's no alliteration here, I must reconstruct t-.
Yeah, something like that.
I think you were talking on the 20th slide about how the character had the phonetic zhen, like T-t-n for the zhen. But the original one was, I mean, the ling. But the interesting thing about this is zhen also had M as an ending, and then a lot of these M endings turned into NG as well. I would have to get all my computer programs and books and look up a bazillion things to actually make a real comment on this, but it seems like there's a relationship going on, possibly, such that even the later graph was still tied in some way to the changing sound.
Yeah, it seems this way.
And one of the real keys, I think, to this is graphs that have multiple pronunciations, or, as we see, you know, sometimes they diverge crazily, right? And so it's hard to know how to make sense of a lot of that, but all of the sonorants, M and N, tend to have flexibility across vowel structures, let's put it that way, right? Whether there are glides there or not, the medial vowel E is probably the easiest one to do. Doing it for the schwa is much harder, but probably doable as well. So did these words have slightly variable finals, maybe even depending on who was speaking and how they were deciding to pronounce the word?
Okay, well, thank you so much for getting up very early, actually, to watch some presentations and to join us today, and for your talk. Yeah, it was really, it was really fun. And I can tell that everyone here also feels, you know, very happy and very grateful. So yeah, thank you very much.
Yeah, it's very kind. Thank you.