In my last video, I talked about how one of the best ways to tell where syllable boundaries are is by taking advantage of the fact that syllables never span multiple words. In response, a lot of you brought up situations where syllables do appear to span multiple words, but all of your examples seem to me to be great demonstrations of the fact that what does and doesn't count as a word is really fuzzy. In fact, when you ask questions like "where are their boundaries?" and "are they even a universal feature of all languages?", the answers are even more complicated and contentious for words than they are for syllables.
Now, if I'm going to be talking about what does and doesn't count as a word, I'm going to have to bring up the most famous example of people just completely getting it wrong: the myth that the Inuit have hundreds or thousands or whatever different words for snow. This idea brings up the really interesting fact that different languages do, in reality, have different numbers of words for different things, and that examining these differences might actually teach us about our different cultures. But that's not what I want to talk about today, because the idea that the Eskimo-Aleut languages have radically more words for snow than us is false. In reality, they actually have roughly as many words for snow as English-speaking professional skiers do, which is to say, like, 5-ish. More than most of us, but definitely not hundreds or thousands.
So, how did this myth start? Well, it comes from a misunderstanding of how the grammar of these languages works. These languages have rules that allow you to easily combine smaller words into larger words, and therefore tend to use a few big words instead of many smaller ones. These languages are called polysynthetic, and sometimes have individual words that can occupy the roles of entire sentences in English. I'll use Spanish as an example.
Spanish isn't polysynthetic, not even close, but it does tend to contain more information per word than English, so I think it's a good place to start. Many of you might know that Spanish verbs take different forms depending on who's doing the thing. For instance, the Spanish word correr means "to run", but "I run" is yo corro, "you run" is tú corres, "we run" is nosotros corremos, and so on. We do this in English too, but not nearly as much. In the present tense, we only have two forms for the verb run: runs, as in "he runs", for when there's a single third person doing the thing, and run for pretty much every other present-tense situation.
Now, in Spanish, because you can tell from the suffix a lot of the time exactly who's doing the thing, the subject pronoun is often just dropped. Like, yo corro is actually kind of redundant, because you can tell from the -o ending that it's me doing the running, so Spanish speakers will frequently just say corro.
And if you can just incorporate the subject of a sentence into the verb, there's no reason you can't do the exact same thing for the object. For instance, Arabic has different forms of the verb "to like" depending on who's doing the liking, just like in Spanish. But in the same way, it has different forms of the verb depending on what's being liked. This way, in Arabic, you have single words that all in one encompass the subject, predicate, and object of a sentence.
Now, most languages have rules that change the verbs around depending on the subject and/or object of a sentence, but polysynthetic languages like the Eskimo-Aleut languages take things a step further. They have rules for combining adjectives into the nouns they describe, and for combining adverbs into the verbs they describe, making them all one word. So if you want to talk about soft snow, you wouldn't take the word for snow and the word for soft and just put them next to each other. Instead, you would combine them into a single word, softsnow.
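To make the Spanish part concrete, here's a rough sketch in Python of how a regular -er verb like correr packs the subject into its ending, and why that makes the pronoun redundant. This is a toy, not a real conjugator: the names `ER_ENDINGS`, `conjugate_er`, and `recover_subject` are made up for illustration, the table only covers regular present-tense -er verbs, and the vosotros, third-person singular, and third-person plural rows go beyond the three forms mentioned above.

```python
# Present-tense endings for regular Spanish -er verbs (toy table, not a full grammar).
ER_ENDINGS = {
    ("1", "sg"): "o",     # yo
    ("2", "sg"): "es",    # tú
    ("3", "sg"): "e",     # él / ella
    ("1", "pl"): "emos",  # nosotros
    ("2", "pl"): "éis",   # vosotros
    ("3", "pl"): "en",    # ellos / ellas
}

def conjugate_er(infinitive, person, number):
    """Build the present-tense form: stem + ending for the given subject."""
    if not infinitive.endswith("er"):
        raise ValueError("this sketch only handles regular -er verbs")
    stem = infinitive[:-2]
    return stem + ER_ENDINGS[(person, number)]

def recover_subject(infinitive, form):
    """Read person and number back off the verb form alone, with no pronoun."""
    stem = infinitive[:-2]
    for (person, number), ending in ER_ENDINGS.items():
        if form == stem + ending:
            return person, number
    return None  # not a form this toy table knows about

print(conjugate_er("correr", "1", "sg"))   # corro
print(conjugate_er("correr", "2", "sg"))   # corres
print(conjugate_er("correr", "1", "pl"))   # corremos
print(recover_subject("correr", "corro"))  # ('1', 'sg')
```

Since `recover_subject` can get "first person singular" back from corro by itself, adding yo in front doesn't tell you anything new, which is the whole reason it can be dropped.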
But here we get into a pretty tricky problem. What exactly is the difference between soft snow and softsnow? I think it's obvious from what I've described that different languages have very different relationships with words, so how do we even know where words begin and end if different languages treat them so differently? An easy way out would be to just look at the writing and see where the spaces are, but for one thing, not all languages are written down, and even some that are don't write spaces between words. So unless all of these languages just don't have words, we're gonna need some other criteria.
Another common definition I hear for words is that a word is the smallest unit of language that can be said on its own and people will have some idea of what you're talking about. This is often contrasted with the idea of a morpheme, or the smallest unit of language that has meaning. For instance, the English word hats is composed of two smaller morphemes: the morpheme hat, which refers to these things, and the morpheme -s, which tells us that there's more than one of them. Now, if I say "s" all on its own, you won't think to yourself, "oh yes, this is the thing that indicates pluralness", not in the same way that I can say hats, and you'll think "this is the noise that refers to these things". So hats is a word, and -s isn't.
But this definition gets a bit problematic when you look at grammar words. Like, I can say "the baseball", and you'll think of a baseball. But if I just say "the", what is the thing that you think of? Maybe you think of some abstract concept of the-ness, but in that case there are a lot of non-word morphemes that I can say where you'll have a similar reaction. I can say anti-, or pre-, or -ness, and there's a good chance I'll be able to communicate some vague idea of oppositeness, or beforeness, or ness-ness.
This gets into the issue of what it even means for something to mean anything, which is a weird philosophical topic that is difficult to discuss and even harder to think about, and I don't really want to get into it all right now. So are there any other criteria we can use to establish wordiness? Well, there are two big ones.
The first is that words tend to be able to move around relative to each other. So one of the reasons we know that the is a word, and not a prefix, is because I can say "the red baseball", or "the tenderly cared-for baseball", and it works just as well, even though the word the is several words away from the word that it's describing. With suffixes and prefixes, you can't do that.
The second criterion is that within a word, different morphemes will often affect how others are pronounced, way more than they do between words. You see this in English with the suffix that makes things plural. In the word hats, it's pronounced as an s sound, but in the word chairs, it's pronounced more as a z sound. What makes the difference is that the morpheme hat ends in an unvoiced sound while the morpheme chair ends in a voiced sound, and the suffix changes its voicing to match the sound that comes before it. This fact that the suffix changes its pronunciation depending on its surroundings indicates that it's probably a suffix and not a word.
This is also one of the big reasons we think that the Inuit terms for different types of snow are single words instead of nouns followed by adjectives. When you take a morpheme for snow and a morpheme that describes the snow and put them together, all kinds of stuff will happen at the border between the two. The sounds at the border will change to be more similar, or they might merge with each other, or they might just get dropped entirely.
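Going back to the English plural for a second, the rule about the suffix matching the voicing of the sound before it can be sketched as a tiny Python function. Everything here is a simplification for illustration: the function name `plural_sound` and the sound labels are made up, the labels are rough ASCII stand-ins rather than real phonetic notation, and the sound inventory is nowhere near complete. The extra "iz" case after hissing sounds (as in buses) isn't something covered above, but it's part of the same standard pattern, so it's included to keep the sketch from giving wrong answers.

```python
# Rough ASCII labels for the final sound of a stem (an assumption of this sketch,
# not real phonetic notation, and not a complete list of English sounds).
VOICELESS = {"p", "t", "k", "f", "th"}
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}

def plural_sound(final_sound):
    """Pick how the plural suffix is pronounced after a given final sound."""
    if final_sound in SIBILANTS:
        return "iz"  # hissing sounds get an extra vowel, as in "buses"
    if final_sound in VOICELESS:
        return "s"   # the suffix stays unvoiced to match, as in "hats"
    return "z"       # everything else is voiced, so the suffix is too, as in "chairs"

print(plural_sound("t"))  # hat + plural   -> s
print(plural_sound("r"))  # chair + plural -> z
print(plural_sound("s"))  # bus + plural   -> iz
```

The point is that the suffix has no fixed pronunciation of its own. It takes whatever shape the sound before it demands, which is exactly the kind of behavior you expect from a piece of a word rather than from a separate word.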
All of these sound changes happen according to some very complicated rules that I spent way too long reading about before I just gave up, so there's a link to a paper about it in the description if you want to know more.
Anyway, these three criteria are usually good enough for telling where words begin and end, but there are cases where it still gets really fuzzy. In particular, there is a notorious group of morphemes called clitics. The go-to example is the English 's suffix, not the one we talked about earlier that makes things plural, but the one that indicates possession. You know, like how Bob becomes Bob's in the phrase "Bob's house", or street becomes street's in "the street's name". The weird thing about this morpheme is that it's pronounced like an s after unvoiced sounds, and like a z after voiced sounds, just like the plural -s suffix. But I can say "the cat's toy" just as easily as I can say "the cat that lives in the garage's toy". In this way, it looks a lot like the word the, in that it's describing a word that's several words away. But it's still being pronounced with a z sound, because it comes right after the voiced sound at the end of garage, not like an s like when it comes after the word cat, suggesting that it's just one morpheme inside the word garage's.
Things like these that are pronounced like they're not their own word, but move around as if they are, are called clitics, and they're a big reason that what makes a word a word is still an ongoing debate today.
Alright, so that makes two videos in a row titled "What Even Is a Blank", so I might as well make it a trilogy. Next time, I'll be skipping up the ladder of linguistics a little bit, and asking the question: what even is a language? See you then!