are similar, they will also predict similar values. So if you have vectors that are maybe not all that reliable, because we have seen relatively few occurrences of those countries in the text, then what the model does is push them closer to more informative vectors. For example, the Seychelles here, island nations that are relatively sparsely attested in the corpus. So there is even something like a cultural clustering effect here.

And then there were three really bad predictions. Vatican City is here, Greenland is here, and Canada is here. With Vatican City and Greenland you might believe that they occur infrequently, or not in very informative contexts. To be honest, I have no real idea why Canada is completely off the map.

Are there other Canadas, like a city name, Cañada? No, but in South America. In Spanish, Canada would be something like Cañada, maybe.

Right, it's possible that there was some Spanish-speaking confounder that makes it a place, makes it show up here in Venezuela.

And could something like that happen with Greenland? I'm not in this field at all, but could there be Greenland in some translation, something like Tierra Verde? Is your data English only, by the way?

It's English only, so there's no translation possible. But of course it could contaminate it. Well, these are so many billions of words, so there's always a chance that stuff creeps in despite the automatic language filtering, although language detection is essentially 98, 99 percent accurate.
Could there be something more mathematical, like how the longitude or latitude is specified? An AM/PM kind of thing, where you say six and it's only from the context that you know it's PM, and it's all mixed up.

Well, the funny thing is that for Canada, the east-west coordinate (I never remember which of longitude and latitude is which in English) is relatively good; it's not so clear what happened to the north-south coordinate. This means Canada, and actually Vatican City and Greenland as well, seem to share distributional behavior with more tropical countries.

Could it have to do with the continent, as an independent variable? It's still in America, and the others are at the right place.

Yeah. We weren't really able to figure out what went wrong with Canada here, but there we are.

Now, what determines prediction difficulty at the end of the day? Well, there is a technical issue for categorical attributes: we cannot predict unseen values, which is of course very unfortunate. We did change the encoding to address this, but in the interest of time I'll just leave it at that. Then, as always in language processing, there is data sparsity. I think Vatican City is simply seen so rarely that the model doesn't really know how to deal with it. And this brings me to an interesting question: the difference between GDP per capita and absolute GDP.
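The unseen-value problem mentioned above can be sketched in a few lines. This is a hypothetical illustration, not the study's actual encoding: the point is only that a standard classifier's output vocabulary is fixed to the values observed in training, so no input vector can ever map to a value outside it.

```python
# Hypothetical sketch: a classifier over a categorical attribute can only
# output values it has seen during training.

def train_label_set(train_values):
    """The output vocabulary of a standard classifier is fixed at training time."""
    return set(train_values)

labels = train_label_set(["euro", "dollar", "euro", "krone"])

# A value absent from training (say, a rarely attested currency) is
# unreachable, no matter how informative the country's context vector is.
assert "dram" not in labels
assert "euro" in labels
```

Changing the encoding, as mentioned, is one way around this, for instance by predicting into a shared representation space instead of a closed label set.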
So, because I believe that if you're trying to learn these attributes from this very coarse-grained occurrence data, then the important question to ask for an attribute is: does the distributional context actually provide enough clues to make that prediction? As I said, the common denominator here is that countries that are similar with respect to an attribute should have similar distributional profiles; they should occur with the same context words. If you look at countries with the same GDP per capita, I think this is a pretty reasonable assumption. You have countries like Luxembourg and Denmark at the top, and they will occur with words like, I don't know, "modern" and "rich" and things like that. And then you have the really poor countries in Africa, and they will occur with words like "famine" and "poverty" and "relief". Now, with nominal GDP it's completely different, because Luxembourg, for example, although it is a very rich country, has so few people that its absolute GDP is still relatively low, so we probably have Luxembourg with a value similar to, I don't know, Bulgaria maybe, or Turkey, or something like this. And that, of course, is a prediction that's much, much harder to make from text.

I think part of it is that nominal GDP just makes little sense to predict from text like this, whereas GDP per capita does.

No, but I think from a prediction point of view, the interesting thing is that you have a deterministic relationship between GDP per capita, nominal GDP and population, right? So if you can predict two of the three, you should also be able to derive the third.
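The deterministic relationship just mentioned is simply the arithmetic identity nominal GDP = GDP per capita × population, so any two of the three quantities determine the third. A minimal sketch, with invented round numbers rather than real statistics:

```python
# Nominal GDP = GDP per capita * population: knowing two quantities gives the third.

def nominal_gdp(per_capita, population):
    return per_capita * population

def per_capita_gdp(nominal, population):
    return nominal / population

# Round illustrative numbers in a Luxembourg-like regime (not real data):
pc, pop = 100_000.0, 600_000
nominal = nominal_gdp(pc, pop)

# Recovering the per-capita figure from the other two is exact:
assert per_capita_gdp(nominal, pop) == pc
```

This is what makes the attribute interesting from a modeling perspective: a predictor that gets GDP per capita and population right should, in principle, get nominal GDP for free.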
But again, of course, even with a deterministic relationship, it's not so easy to predict.

But say a country's GDP per capita is modest while its population is huge, like China, for example: you get a huge nominal GDP.

Yeah, you can have a huge nominal GDP. Well, I'm always a little cautious to just declare an attribute something I don't want to predict, because as somebody who does modeling, I'm always providing prediction services to somebody else. And if someone comes and says "I want nominal GDP", then of course I have to have a story about whether I can or cannot predict it.

Do you do some sort of weighting? You're going to have the United States occurring much more often, and now I'm coming back to the Canada thing: countries in the Western Hemisphere that are not the U.S. are all to the south. My point is that with the U.S. you have these problems too, with English being, you know, 60 percent of the web, or less now, but something like that, where it just skews everything because you have a skewed dataset. Can you do re-sampling in your data to even these out, so that the Maldives is mentioned as many times as the U.S.?

Well, we didn't look at that specifically in this study, but my general experience from distributional semantics is that if you sub-sample to even out the number of occurrences, you just lose data. You take all the data that's there, and as long as your vectors are normalized, so that the absolute length of the vector does not matter, what you essentially have is one vector that you're more sure about because you have seen more occurrences, and one that you have less confidence in because you have seen it fewer times.
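The normalization point can be made concrete with a small sketch, assuming simple co-occurrence count vectors (the numbers are invented): after L2 normalization, a frequent country and a rare country with the same distributional profile end up with identical vectors, so the raw occurrence count no longer matters.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Same profile of context-word counts, very different corpus frequency:
frequent = [300.0, 100.0, 600.0]   # a U.S.-scale row, invented counts
rare     = [3.0, 1.0, 6.0]         # a Maldives-scale row, same shape

norm_f, norm_r = l2_normalize(frequent), l2_normalize(rare)

# After normalization, the absolute number of occurrences is gone:
assert all(abs(a - b) < 1e-12 for a, b in zip(norm_f, norm_r))
```

What survives, as said above, is only a difference in reliability: the frequent vector is a better estimate of the underlying distribution, not a longer one.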
But specifically for the purposes of the prediction, that shouldn't really matter.

I have a problem in my own project, or I had one. Take currency: I should just predict Euro, right? Because there are dozens of Euro countries, so what's the chance that I'm wrong? Say my candidate for Armenia is the dram. It's just too risky; I'd better say Euro than choose the dram. What I mean is that the system learns that, and it just starts putting Euro for everything, because it achieves a higher score that way. Are you making the same analogy with everything in the Western Hemisphere being south of the U.S.?

Yeah, I guess; I didn't express that very well.

So if I just put a country down there somewhere, I'm more likely to be right? It's like the task I work on, where I have only three categories, good, medium and bad: unless you do something to stop it, the system is just going to say medium, medium, medium, without really thinking, because that is actually safer. 60 percent of the data are medium, so I get 60 percent correct; if I do anything else, the score goes down.

Yeah, yeah. But those are two different questions, right? There's the question of the distribution of your output classes, and how you evaluate against it in the end; and when you talk about the number of occurrences of the U.S. versus the Maldives, we're talking about the input features.

True, true.

So, if I come back to the output side: if you make numeric predictions, I don't think you get those majority effects quite as easily, because what you're optimizing is not a zero-one loss against the majority class, but something like the distance from the ground-truth value.
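The good/medium/bad example can be sketched directly; the 60/20/20 split is taken from the discussion, the two helper functions are hypothetical. The first shows why a degenerate majority-class predictor scores well on a categorical target; the second shows that under a numeric loss a constant guess still pays for its distance from every ground-truth value.

```python
from collections import Counter

def majority_baseline_accuracy(gold_labels):
    """Accuracy of always predicting the most frequent class."""
    _, count = Counter(gold_labels).most_common(1)[0]
    return count / len(gold_labels)

def mae(pred, gold):
    """Mean absolute error, the kind of loss a numeric predictor optimizes."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)

# 60% of the gold labels are "medium", so saying "medium" every time scores 0.6:
gold = ["medium"] * 6 + ["good"] * 2 + ["bad"] * 2
assert majority_baseline_accuracy(gold) == 0.6

# With a numeric target there is no free majority class: a constant guess
# is penalized in proportion to how far it is from each true value.
values = [1.0, 2.0, 2.0, 5.0, 9.0]
assert mae([2.0] * len(values), values) > 0
```

This is the asymmetry claimed above: zero-one loss rewards the safe degenerate strategy, a distance-based loss does not.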
With the binary attributes, this is totally possible. And as you saw from the majority-class baseline results, "false" is almost always correct: if you have these categorical attributes and you binarize them (we also have one-to-many relations), then for a given country almost all values will be false.

You also said you're using a neural network; doesn't that take care of this?

Not per se.
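The binarization point can be sketched as follows, with an invented one-to-many attribute (neighbouring countries) over an invented value vocabulary: after multi-hot encoding, most entries for any single country are False, so an all-False baseline already scores well.

```python
def binarize(values, vocabulary):
    """Multi-hot encoding of a one-to-many attribute over all seen values."""
    return [v in values for v in vocabulary]

# Invented vocabulary and attribute values, for illustration only:
vocab = ["france", "germany", "spain", "italy", "austria", "poland"]
neighbours = {"france", "germany"}          # one country's (partial) value set

vec = binarize(neighbours, vocab)

# Most entries are False, so predicting all-False is already mostly right:
all_false_accuracy = sum(1 for b in vec if not b) / len(vec)
assert all_false_accuracy > 0.5
```

With realistic vocabularies of hundreds of values per attribute, this skew is far more extreme, which is why the majority-class baseline looks so strong on binarized attributes.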