In reading, the preliminary analysis of the input signal involves using information from written language. To understand this and other issues, we will first look at the writing systems that we know, in particular the two main branches, logographic and phonographic writing systems, and we'll then concentrate on variation in the written signal.
Well, you all know that not all of the 7,000 attested languages have a writing system. In fact, only 15% of these languages have one, and not all of these writing systems encode information in the same way. Early systems emerged from symbols that represented animals and other images. Modern writing systems can be subdivided into two central branches. On the one hand, we have logographic writing systems. Here, for example, you see typical Chinese characters. This is a modern logographic symbol, and here we have an old Sumerian symbol that was used a long time ago. The other main branch is called phonographic writing systems, and there are several kinds. For example, here we have Egyptian hieroglyphs, and these are the alphabets that we know, typical alphabetic writing systems. And this here is a so-called syllabic writing system, which we will look at later on. So these are our main branches.
Let's first of all look at logographic writing systems. Logographic writing systems make use of symbols that represent words or concepts; the Greek word logos means 'word' or 'idea'. The shape of the symbols in logographic writing systems is often closely related to the meaning of the respective concept. There are three variants of logographic writing systems. First of all, we have the pictographic writing systems. In the earliest logographic writing systems, the relationship between symbol and object is clearly visible. Such variants of logographic writing systems are referred to as pictographic.
One of the earliest examples of such a pictographic writing system is that of simple Sumerian pictographs. This symbol here, for example, stands for mountain, a symbol that is closely related to its concept.
The next type is called ideographic. In ideographic writing systems, the logographic symbols are also related to the concepts, but this time in terms of ideograms. Unlike pictographs, the shape of an ideographic symbol is not a direct representation but an associative representation of a concept. Ideographs are often mixed with pictographs. For example, here in our Egyptian hieroglyphic system, we have the ideograph for horse as contrasted with the ideograph for donkey.
The final variant of logographic writing systems uses true logograms, that is, abstract logographic symbols whose meaning is no longer identifiable from their shape. Let's first write down logographic. As examples, we could take symbols such as these, the percent sign for instance, where the logogram is no longer related to the meaning. The typical examples are, of course, Chinese characters; here we have the symbol for human.
Now, logographic writing systems have to be contrasted with phonographic writing systems. Here we have the three main types of phonographic writing systems. First of all, we have the so-called segmental system. This is the oldest way of showing the pronunciation of words, where only selected phonetic aspects are represented. For example, here the symbol of a bird represents a sound combination in the system of Egyptian hieroglyphs. Vowels were not represented in writing; they had to be added during the reading process by reference to syntax and semantics. Another variant is the syllabic variant. In a syllabic phonographic writing system, the pronunciation is represented by entire syllables.
Syllabic writing systems are very old. One of the oldest syllabic writing systems is the one shown here. It is called Linear B, from Crete.
The most advanced and historically most recent technique makes use of a limited inventory of symbols, where ideally one symbol stands for one sound. Such systems are common among the Indo-European languages and many African and Asian languages, and they have become known as alphabetic writing systems. However, there are several alphabets around. For example, the first one here is of course the Roman or Latin alphabet. The next one is the Cyrillic alphabet. Then we have the Greek alphabet. Further examples would be the Georgian alphabet and the alphabet used in Hindi. And last but not least, here represented in red because it's the most important alphabet in linguistics, the International Phonetic Alphabet as devised by the International Phonetic Association. Well, so much for the writing systems in general.
Now, like speech, the written signal, whether logographic or phonographic, is highly variable. Characters can be written using several writing techniques, handwriting or print, and the shapes of individual characters can vary considerably. Look at this example here. The Roman character A can be represented in several ways. Nevertheless, we identify the character despite these differences with relatively high precision. Among the possible aspects of variation, we have about 3,000 very common typefaces that are used on computers and typewriters. Let's look at at least two variants that can be distinguished here, namely so-called serif versus sans serif characters. Let's look at an example. This is a typical serif character, where all types have small strokes at the very bottom and, as you see here, at the top: these little serifs. And of course, you can select between several types: Times Roman, or here we have Garamond, Courier, Bookman and many more.
Now, let's contrast them with types that have no serifs, the sans serif typefaces. Here you are. The most common sans serif type is Arial. Verdana is a well-known one, which is used on the web. And here are further ones, which are less common, Haettenschweiler and Britannic. These are just two examples of variation in written language.
But there are further problems. Here is a short printed passage that involves some typical complications of character identification. Let's look at this result, which could be the result of a scan. This capital letter A exhibits a very well-known phenomenon, namely the so-called broken loop. The loop would be here, and this loop is broken: there's a gap. Let me make it a bit larger. So, this would be the loop, and here is the broken loop. That's one example. Another typical phenomenon of printed language is that characters may be too close together to be identified as separate characters, the phenomenon of touching characters. Then a well-known phenomenon is what people call noise. It's of course not really noise; it's some sort of ink that is spread around characters and complicates their recognition. A very interesting phenomenon can be found here, where the entire baseline is not available, again a complication for character identification and a typical example of variation in the written signal. These are just four examples of problems. You see, here is another broken loop, a loop problem in the R, and so on and so forth. How we can cope with this variation in the written signal will be explained in the e-lecture "Cues in the Written Signal".
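To make three of these complications a little more concrete, here is a minimal sketch in Python. It is not taken from the lecture: the tiny bitmaps and the names `count_regions`, `count_blobs` and `count_holes` are hypothetical and only serve as an illustration. The sketch assumes a scanned character is a grid in which `#` is ink and `.` is paper, and it simply counts connected ink regions and enclosed loops. A broken loop loses its enclosed area, touching characters merge into one ink region, and a speck of stray ink adds a region.

```python
from collections import deque

def count_regions(grid, target, diagonal):
    """Count connected regions of cells equal to `target` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        # Ink strokes may also connect diagonally (8-connectivity).
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or (r, c) in seen:
                continue
            regions += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == target
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return regions

def count_blobs(grid):
    """Number of separate ink regions (candidate characters or specks)."""
    return count_regions(grid, "#", diagonal=True)

def count_holes(grid):
    """Number of paper regions fully enclosed by ink (closed loops)."""
    width = len(grid[0]) + 2
    # Pad with paper so that all outside paper forms a single region.
    padded = ["." * width] + ["." + row + "." for row in grid] + ["." * width]
    return count_regions(padded, ".", diagonal=False) - 1

# Hypothetical toy bitmaps, not real scans.
CLOSED_LOOP = ["###", "#.#", "###"]      # intact loop
BROKEN_LOOP = ["###", "#..", "###"]      # gap on the right side
SEPARATE    = ["#.#", "#.#", "#.#"]      # two strokes with space between
TOUCHING    = ["#.#", "###", "#.#"]      # the same strokes, bridged by ink
NOISY       = ["###..", "#.#.#", "###.."]  # loop plus a stray ink speck

print(count_holes(CLOSED_LOOP), count_holes(BROKEN_LOOP))  # 1 0
print(count_blobs(SEPARATE), count_blobs(TOUCHING))        # 2 1
print(count_blobs(NOISY))                                  # 2
```

The point of the sketch is only that simple shape properties such as "has one loop" or "is one connected stroke" stop being reliable once the signal varies in these ways, which is why character identification needs further cues.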