Hi, I'm Shivansh Subramanian, and I'm presenting our paper XOutlineGen: Cross-Lingual Outline Generation for Encyclopedic Text in Low Resource Languages. The main motivation behind this paper, and behind cross-lingual generation in general, is the lack of existing internet content in low-resource languages. This graph is built from Wikipedia articles in low-resource languages such as Bengali, Hindi, Malayalam, Marathi, Oriya, Punjabi and Tamil, with English as the high-resource language. For each Wikipedia page, we look at all the references listed at the end of the page and count how many of them are not in the page's own language, or in any other low-resource language, but in English. We observe that across most domains and most languages, around 60 to 70 percent of articles in these low-resource languages have all of their references in English. This highlights the disparity in the amount of information content available across languages, and it motivates the need for cross-lingual generation. In this paper we build on top of the existing XWikiGen problem: given an article title, an outline in a particular language, and a set of reference URLs for that Wikipedia article, perform cross-lingual multi-document summarization to regenerate the article. We want to further smoothen the process of automatic Wikipedia article generation by now also generating the outline itself. So XOutlineGen is an extension of XWikiGen in which we generate the outline given only the domain, the article title and the set of reference URLs.
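As a rough sketch of the task interface (the class and field names here are my own illustration, not from the paper):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OutlineGenInstance:
    """One XOutlineGen instance: the inputs are the article title in the
    target language, its domain, and the reference URLs; the target
    output is the article's outline (its ordered section titles)."""
    article_title: str                                  # e.g. "Amitabh Bachchan" (in Hindi)
    domain: str                                         # one of the dataset's domains (hypothetical value below)
    reference_urls: List[str] = field(default_factory=list)
    outline: List[str] = field(default_factory=list)    # prediction target

inst = OutlineGenInstance("Amitabh Bachchan", "celebrities",
                          ["https://example.org/ref1"])
```

The outline is empty at input time; the model's job is to fill it from the title, domain, and references alone.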
Before we go on to the problem definition, a quick word about the datasets. XWikiRef was the dataset introduced in XWikiGen; it has around 70,000 articles across 8 languages, 7 of which are low-resource Indian languages, and 5 domains. XWikiRef 2.0, our extension, has 92,000 articles and adds Telugu and Kannada as additional languages and animals, cities and companies as three additional domains. To give an idea of what the problem statement looks like: the input is the article title in the target language together with the citations. Suppose the input is "Amitabh Bachchan" in Hindi, along with all the citations present in that particular Wikipedia article; the goal is to generate the final outline for that article title. The first step is to summarize the citations, because there is no guarantee on either the number of citations or their language, and we cannot feed 100 documents spanning 1,000 sentences into a single model and expect it to summarize them properly or generate the required outline. So we use unsupervised extractive summarization, unsupervised because this intermediate stage is born out of necessity and is not supported by any dataset, so we cannot supervise it with the current resources. We pick the top-k relevant sentences, which act as our extractive summary, and pass them to an abstractive summarization model, trained with rewards that help reinforce and nudge the model in the correct direction, which finally gives us the generated outline. For the extractive stage we use HipoRank. HipoRank is a hierarchical, neural, unsupervised extractive summarizer; it builds a hierarchical graph with sentence-level nodes and section-level nodes.
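The two-stage pipeline described above can be sketched as follows; `score_sentence` and `abstractive_model` are placeholder stubs standing in for HipoRank's graph-based scoring and the mT5/mBART generator, so this is an assumed shape of the pipeline, not the paper's implementation:

```python
def score_sentence(sentence, title):
    # Placeholder relevance score: word overlap with the title.
    # In the paper this role is played by HipoRank's importance scores.
    return len(set(sentence.lower().split()) & set(title.lower().split()))

def abstractive_model(title, sentences):
    # Placeholder for the mT5/mBART seq2seq generator that maps the
    # extracted sentences to an outline (a list of section titles).
    return [f"Section about {title}"]

def generate_outline(title, reference_docs, k=50):
    # Stage 1: unsupervised extractive summarization over all citation
    # documents -- keep only the top-k most relevant sentences.
    sentences = [s for doc in reference_docs for s in doc]
    top_k = sorted(sentences, key=lambda s: score_sentence(s, title),
                   reverse=True)[:k]
    # Stage 2: abstractive generation of the outline from the top-k sentences.
    return abstractive_model(title, top_k)

outline = generate_outline(
    "Amitabh Bachchan",
    [["Amitabh Bachchan is an actor.", "He was born in 1942."]],
    k=2)
```

The point of the two stages is the one made above: the extractive stage bounds the input length so the abstractive model never sees the full, unbounded citation set.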
The sentence-level node representations are computed with a standard encoder, and each section-level representation is the mean pooling of that section's sentence-level representations. The connections between these nodes are based on two things: a similarity computation and asymmetric edge weighting. The similarity computation is cosine similarity. Before describing the edge weighting, one important point is that there are two different types of edges. Intra-sectional edges connect each sentence to every other sentence within the same section; inter-sectional edges connect each sentence to the section-level nodes of the other sections, so that only one node per section is involved, for computational efficiency. Asymmetric edge weighting means that the weight of the edge from sentence node 1 to node 2 is not equal to the weight of the edge from node 2 to node 1. This comes from a boundary function built on the heuristic that sentences close to the boundaries of a section are deemed more important. Based on these factors, the importance of each sentence is computed as a weighted linear combination of its intra-sectional and inter-sectional scores, and we greedily select the top-k most important sentences from this importance calculation. Next comes the abstractive summarizer, set up as reinforcement learning: the input is the top-k extracted sentences and the article title, and this input is given to an agent. The agent's job is to figure out the best action to take. Our model selects an action A, which is the output of the model, and passes it to the environment. The environment's job is, given an action and a state, to figure out the reward.
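A minimal sketch of this HipoRank-style importance computation, assuming sentence embeddings are already given; the exact boundary function and edge weighting here are simplified stand-ins for the ones in HipoRank, not the paper's exact formulas:

```python
import math

def cos(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-8)

def mean(vectors):
    # Mean pooling: section-level node representation.
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def boundary_weight(i, n):
    # Heuristic: sentences near a section boundary (its start or end)
    # are deemed more important; weight is 1 at the boundaries.
    if n <= 1:
        return 1.0
    return 1.0 - min(i, n - 1 - i) / (n - 1)

def sentence_importance(sections, lam=0.5):
    """sections: list of sections, each a list of sentence embeddings.
    Returns one importance score per sentence, in document order."""
    sec_embs = [mean(sec) for sec in sections]
    scores = []
    for s, sec in enumerate(sections):
        n = len(sec)
        for i in range(n):
            # Intra-sectional score: similarity to the other sentences of the
            # same section; the boundary weight makes the edges asymmetric.
            intra = sum(cos(sec[i], sec[j]) * boundary_weight(i, n)
                        for j in range(n) if j != i)
            # Inter-sectional score: similarity to the other sections' nodes.
            inter = sum(cos(sec[i], sec_embs[t])
                        for t in range(len(sections)) if t != s)
            scores.append(lam * intra + (1 - lam) * inter)
    return scores

# Toy example: two sections with 2-dimensional embeddings.
secs = [[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]]]
imp = sentence_importance(secs)
```

Selecting the extractive summary is then just taking the indices of the k largest scores.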
We have two different kinds of rewards: section title compatibility and entity correctness. Section title compatibility is a linear classifier we trained that tells you how compatible the generated section title is with the section content, in this case the citation content. Entity correctness is another reward we defined to reduce hallucinations: it ensures that any entity present in a section title is actually present in the section content as well. The environment then gives a reward back to the agent, and our main generative model is updated based on this reward. We use cross-lingual sequence-to-sequence models, either mT5 or mBART, as our generative models, and we experiment with both because of their cross-lingual capabilities. These pretrained models are then trained with the rewards in an RL setting to get the best possible results. For results, with mBART across all domains and all languages we get an average ROUGE F1 score of 0.43; with mT5 we get an average ROUGE F1 score of 0.48 across all domains and languages. To reiterate our contributions: first, we contribute the XWikiRef 2.0 dataset, a cross-lingual multi-document summarization dataset with more languages, more domains and more articles in general than XWikiRef. Second, we propose and motivate the task of cross-lingual outline generation from Wikipedia citations, train our models on it in an RL setting, and report the scores accordingly. For any doubts, I am the corresponding author and my details are mentioned on this slide. So yeah, that's all, and thank you.
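An entity-correctness style reward can be sketched as below; this is my own minimal formulation of the idea (fraction of title entities grounded in the content), not necessarily the paper's exact definition:

```python
def entity_correctness(title_entities, content_entities):
    """Reward in [0, 1]: 1.0 means every entity mentioned in the generated
    section title also appears in the section/citation content, i.e. no
    hallucinated entities."""
    if not title_entities:
        return 1.0  # no entities in the title, so nothing can be hallucinated
    grounded = sum(1 for e in title_entities if e in content_entities)
    return grounded / len(title_entities)

# A grounded title gets full reward; a hallucinated entity gets none.
r_good = entity_correctness({"Amitabh Bachchan"},
                            {"Amitabh Bachchan", "Bollywood"})
r_bad = entity_correctness({"Shah Rukh Khan"},
                           {"Amitabh Bachchan"})
```

In the RL loop described above, a scalar like this (combined with the section title compatibility score) is what the environment hands back to the agent to update the generator.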