Okay, nice to be with you. I will present our work, also on dialects, so following up on the previous talk. I will talk about the Syrian dialect. I come from Birzeit University, in Palestine, near Ramallah. Our group is mainly concerned with developing natural language resources. You may not know this, but we develop a lot of resources: we have a lexicographic database of 150 lexicons, Arabic wordnets, and an Arabic ontology, and we also have a set of corpora that we developed for dialects and other things, so maybe this has been covered already. If you visit our website, this is the QR code, then you can see all of our resources.
This paper talks about the Syrian dialects, not only one dialect; we did a set of Syrian dialects. You know there are different kinds of Arabic, and this is where this work comes in. First we did Curras, the Palestinian dialect, then we did the Lebanese dialect, and then we did Lisan. Lisan is a set of four dialects, Yemeni, Iraqi, Libyan, and Sudanese, which we did in collaboration with the UN and with the American University of Beirut. And now we come to present to you the Syrian dialect. Actually yesterday, no, the day before yesterday, I presented Lisan, and we got the best paper award at another conference.
So what do we mean when we say we did a dialect? What we mean is this: first we collect a corpus, and for every word in this corpus we first segment the word. We say these are the prefixes, this is the stem, these are the suffixes, and so on.
As you see here, this is the word haik. This is the prefix and the POS of the prefix, and there can be more than one prefix; then the stem; then the suffixes and their POS; then we have the POS, and then the lemma, and the gloss in English. Maybe some of you are very familiar with this kind of annotation. But I see that when some people say they annotated a corpus, they mean they took a sentence and put a class on it. We are not talking about that kind of annotation; we are talking about morphological annotation.
If you visit this site, you will actually see 1.3 million tokens annotated, as I told you. The Palestinian is 56,000 tokens, the Lebanese 10,000, the Syrian 60,000, and there we have about 1.25 million, so in total we have about 1.35 million tokens. This is really massive, and people who are familiar with this work can imagine how much time every word takes, and the accuracy also.
These are the Syrian dialects we did and their numbers. Dimashqi is the Shami; we have actually 17 Shami here. I cannot read it on the laptop, wait a second, and I cannot use this mouse either. Anyway, these are examples.
So how did we collect the data? It was social media, and also poetry and zajal collections. It was very hard to reach 60,000, but because we had done more dialects, it became easier for me; now we know how to do it. Even some Palestinians were helping with the annotation, because they know how to really do it, and if they don't know something, they ask somebody from Syria. Actually Amal did most of it. And by the way, I'm also very proud that I did one fourth of this myself. I enjoyed that annotation. Sometimes when you have free time and you want to enjoy yourself, you want to learn something outside the research, so you go and annotate data. So I enjoyed it.
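The annotation layers listed in the talk (prefixes with their POS, stem, suffixes with their POS, token POS, lemma, English gloss) could be held in a record like the one below. This is only a minimal sketch: the class names, field names, and tag labels are hypothetical and are not the corpus's actual schema, and the example word is a generic transliterated Arabic form ("and their house") chosen for illustration, not an entry from the corpus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Affix:
    form: str  # surface form of one prefix or suffix
    pos: str   # POS tag of that affix (placeholder tag names)


@dataclass
class TokenAnnotation:
    token: str                                           # the word as it appears in the text
    prefixes: List[Affix] = field(default_factory=list)  # zero or more prefixes
    stem: str = ""
    suffixes: List[Affix] = field(default_factory=list)  # zero or more suffixes
    pos: str = ""                                        # POS of the whole token
    lemma: str = ""
    gloss: str = ""                                      # English gloss

    def segmentation(self) -> str:
        """Join prefixes, stem, and suffixes with '+' for display."""
        parts = [p.form for p in self.prefixes]
        parts.append(self.stem)
        parts.extend(s.form for s in self.suffixes)
        return "+".join(parts)


# Illustrative example only (not taken from the corpus).
example = TokenAnnotation(
    token="wbaythum",
    prefixes=[Affix("w", "CONJ")],
    stem="bayt",
    suffixes=[Affix("hum", "PRON")],
    pos="NOUN",
    lemma="bayt",
    gloss="house",
)

print(example.segmentation())
```

Keeping prefixes and suffixes as lists reflects the point made in the talk that a word can carry more than one prefix.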
I will not talk much about the guidelines because it's clear. These are the statistics about the inter-annotator agreement, I believe. And I think that's all I wanted to say, so that I can have more questions. In total the Levantine corpus is 125,000 tokens, all together. There are a lot of lessons we learned, and we compared between the dialects. I will use the two minutes left just for questions, it's better. Yeah, thank you.
Questions?
I have a question for you. Many times when people collect data for a dialect, there's this bias or tendency, or maybe gravitational pull, toward what I would call al-ajib al-gharib, meaning the strange and wild in the dialects. As in, when you think Halabi or Mardini or Latakia, you're thinking what is the thing that is the most like that, as opposed to what people actually speak, because there's a difference: there are the aspects that are marked and the aspects that are not marked. I'm not making a judgment call about what is good or bad necessarily, but I'm curious, what did you do? Were there decisions made about how to select the data, or was data excluded because it was too hard to tell that it's from that place, or some other decision? How did you decide what to select?
Yeah, actually it's Amal who did it. She is Syrian. I don't distinguish between Halabi and not Halabi. Well, Halabi maybe, but some dialects in Syria I don't really distinguish. So Amal was careful in judging these things, nothing more.
But there was no filtering step of saying, this came from, let's say, a Halabi website, but it did not look Halabi enough?
So, dictionary, there is, but poetry, that is local.
Got it, like what they did with Curras in fact, where you also could get something to say.
Yeah, hi, how did you decide how many local dialects there are?
So for example, with two cities, the difference between their dialects can be just a handful of words instead of, you know, these things.
I understand you. Actually we faced this problem between Palestinian and Lebanese even more. It is not easy to decide. The way it goes is that sometimes you collect this book or this post from this local area, and you say it's this. In Lebanese, we thought that when they say ultilhin, it's feminine. Actually half of Palestine says ultilhin. From Nazareth, al-Nasira, to the north, it's all ultilhin. So it's like Lebanese. Even Qaryat al-Ghajar, if you know it: I sent a recording to my friend, who is Lebanese, and he couldn't distinguish whether this was a Lebanese from the south or a Palestinian. So it's a continuum. There's really no way to decide.
Thank you. And this links nicely to Amr's talk, in fact, on multi-dialect labeling, healthy reference.
I'll tell you, I actually forgot to say there's a tool that we developed: if you want to digitize or annotate a dialect, it's so smart, it's so handy. I mean, I spent many years developing this tool, so it's good; if you want, we can share it. So