Hello, I'm Brent Halpern, and I am the scientific director of the A.I. Horizons Network, and this is our weekly seminar for A.I. Horizons. This week we have Abhijit Mishra from IBM Research in India, who will be presenting a version of his AAAI paper on unsupervised controllable text formalization. Abhijit is in the Bangalore, India lab, working in AI Tech. Prior to that, he was a PhD scholar at the Department of Computer Science and Engineering at IIT Bombay. So without further ado, Abhijit.

Thanks Brent, and hello everyone. This is Abhijit, and as said in the introduction, I'm going to be presenting unsupervised controllable text formalization. Before I start, I'd like to acknowledge my co-authors, Parag Jain, Amar Azad and Karthik, who are also part of IBM Research India. You can find more information at my URL, abhijitmishra.github.io, and you can download the data sets, code and other resources related to this work from the given link. Right, so let's begin. Most of us are, I think, familiar with the famous robot TARS from the movie Interstellar. For the uninitiated, it's a robot which behaves somewhat like humans, is configurable to some extent, and assists humans in the extraordinary interstellar mission depicted in the movie. One thing that distinguishes this robot from traditional chatbots or robots is that it is configurable: as you see in this example, we can control the way we want this robot to behave. For example, if the humor is set to 75%, then the robot, not only in its words but also in its actions, tries to be humorous, whereas if you reduce it, the humor quotient reduces and the content of humor in the conversation also reduces. So a configurable mind is actually a key goal of strong AI, or artificial general intelligence, the objective of which is to produce systems that behave in a similar manner to human beings. In a way, if we produce systems that are controllable by users at runtime and are able to respond to our queries or inputs in a controllable manner, we are getting a step closer to the principles of artificial general intelligence, or strong AI. So this is the scientific motivation behind our work, which falls under the umbrella of natural language generation. NLG, or natural language generation, is a branch of computational linguistics that deals with the generation of natural language text from unstructured or structured data in textual or non-textual forms. If we have to categorize it, there are two categories. One is text-to-text NLG, under which there are tasks like machine translation, summarization, document paraphrasing, text simplification and text style transfer, as opposed to data-to-text, which deals with summary generation from data. This obviously has implications in real-world scenarios: for example, you may want to summarize patient information in a clinical context where the lab reports are in tabular form and you want to generate summaries, et cetera. And of course, it also encompasses other forms of generation problems like persuasive text generation and story generation from events, where events are a sequence of data given in tuple format or any other format. So this is an overview of natural language generation. One of the sub-tasks in natural language generation is text style transfer, which is the key task that we are focusing on in this particular work.
However, as a team that targets NLG problems, we have gotten our hands dirty with both text-to-text and data-to-text problems at IBM Research. So text style transfer, what is it? It is transferring textual content from one stylistic form to another without disturbing the semantics. So to some extent, you want to preserve the semantics given in the input; there should not be a lot of topical or semantic drift from the input. If we have to view style transfer in a three-dimensional manner, on one axis we have the linguistic artifacts, that is lexical, semantic, syntactic and pragmatic. For example, you could transfer style at the lexical, semantic or syntactic level, or you could transfer style at a perceptual level, which deals with tone, formalness, sentiment, emotion and complexity. And you can also vary style by varying the domain; for example, the tone in the finance domain could be different from healthcare, so the domain forms another axis in this view. To give some examples, let's consider this sentence: "The movie is terrible." You could transform it into this sentence: "This is messy, uncouth, incomprehensible, vicious and absurd", which is a lexical-level transformation. At the perceptual level, it varied sentiment intensity, made the sentence more formal and gave the sentence a higher level of complexity. This is as opposed to another transformation of the same sentence: "A somewhat truly constructed and hence quite an unwatchable movie it was." This deals with syntactic transformations, as well as semantic ones in some sense, and it is to some extent formal but not very formal. As opposed to yet another transformation: "You sit through these kinds of movies because the theater has air conditioning." This deals with pragmatic artifacts; for example, air conditioning being good is not a criterion for a movie being good. So this deals with world knowledge not present in the input text, and it also varies sentiment intensity, but it is to some extent informal. So as we can see, style transfer can happen at various levels, and it can touch upon various aspects of text generation. For controllable text transformation, the system view is like this: you have an input text, you would like to transform it into another version, and you should have power over controlling the system. For example, you could have control specifications in terms of wording, sentiment, word count, formalness, politeness, et cetera. You should be able to provide these as input at runtime, and based on these inputs, along with the text, the system should produce a transformed version of the text. So this is the overall view. To build such systems, traditionally what has happened in text generation is that people have used a large amount of parallel or paired data, where the input-output data points are available. You take them and train machine translation systems or sequence-to-sequence generation systems, which rely on this kind of supervision. However, in controllable text transformation, it is quite hard to build data sets which encompass all possible scenarios. For example, if you had to handle sentiment and syntactic-level transformations for the healthcare domain, you would have to build a separate data set, as opposed to some other form of combination.
It is possible that for each combination, for each use case, building supervised or labeled data sets for supervised learning is not a desirable thing. And when you are trying to make the system controllable, you are also dealing with a scenario where you have to get control parameters as input at runtime, so any labeled data would also require the control information to be available, which makes the data set creation part even more difficult. And if you had to create it by force, it would be very sparse, because you really need all possible variations of the output for all possible control values. So it is very unsustainable. This has motivated us to work on unsupervised controllable text style transfer, and the key use case that we tackle here is unsupervised text formalization. In style transfer, very recently, because of this huge explosion of deep learning and reinforcement learning, a lot of systems have been proposed. Key systems include unsupervised machine translation systems and style transfer using non-parallel text; that was a classic NeurIPS paper published a couple of years ago. Then we have the sequence-to-better-sequence idea, where you are transforming a sequence into another sequence by virtue of some external signals which come from NLP systems. This is one of the closest systems that we have as a baseline in our experiments. Then there is work on controllable text generation and paraphrase generation, but all of them are quite recent, and they are to some extent based on neural paradigms. Coming to text formalization: formal English is often used in serious contexts. That is the definition, and we see it often in official documents, books, news reports, et cetera, as opposed to informal English, which is used in everyday conversation. Formal text is often carefully edited, and it is often longer and more complicated. So, to summarize, the readability grade, or the educational grade required to understand formal text, is typically higher. This is a key observation that we utilize in our system, so I just wanted to remind the listeners of it. Existing work in formal text generation comprises text generation using heuristic-based approaches and NLP systems; these are relatively older systems. Very recently, polite conversation generation engines have been proposed using sequence-to-sequence variants, and formal versus informal text classifiers have also been proposed. So controllable text formalization, why is it important? As I said, controllable text generation is a key problem which has both scientific and industrial or practical merit. And controllable text formalization in particular is very relevant in NLG applications, for example formal conversation generation, automatic email response composition, and summary generation in regulatory compliance domains, which are difficult domains, by the way. We can also use this module of controllable text formalization in computer-assisted generation systems, similar to computer-assisted translation systems, which are a huge business, especially in Europe, where translations are often produced with the help of such systems. So apart from automated systems, human-assisted systems are likely to leverage such modules. So these are really real-world problems, not just a fascinating idea. So, the existing literature, what is it missing?
Typically, what we have seen is that most of these works do not have the ability to accept control parameters. And they are not very aware of the different artifacts that are related to controllable text formalization. For example, fluency: one has to ensure that the fluency of the output text is maintained, that it is semantically related to the input, and that it is more formal. Existing systems are typically not trained with such objectives. Right, so our AAAI paper was on controllable natural language transformation, and we focused on text formalization, where the degree of formalization control is given as input at runtime. So the system would pretty much look like this: if you had a phrase "very big building" and you had to transform it into more formal versions, you could control the way you would want to transform the input phrase. Some key features of our approach are these. Our work employs an unsupervised training scheme and thereby handles the infeasibility of annotating data for each input-output-control instance. It preserves the language semantics; that is one of the key goals. For learning in an unsupervised setting, it takes the help of NLP modules which just do the job of scoring and validating the output. It also facilitates controlling the degree of the intended attribute desired at the output. And we can show that with a little bit of tweaking, it can be upgraded to include multiple control inputs as well. So here is the central idea. We begin training our system using sentences from unlabeled corpora, and let's say we have an initial model which is going to randomly produce a transformed version of the input text. We have the first phase, which is called exploration, which is responsible for generating more and more training data along with the control values. As a result, we have a few sampled paraphrases of the inputs from the unlabeled corpora, along with the control values. Once we sample such instances, we use them as training data to retrain our initial model, which is called exploitation. At the end of exploitation, the model has, to some extent, the knowledge of how to formalize text in a controllable manner. If we keep doing this in an iterative manner, at the end of the day we converge and have a model which is good at producing controlled variations of the input text. Once we have the model trained, during testing we just give the input sentence and the control value, and we obtain the transformed sentence as output. So I will delve deeper into the model now. The model is not a very unusual model; it is based on the classic encode-attend-decode paradigm typically used in neural machine translation. We just modified one portion of the system, the decoder, and added one extra input to the decoder, the control parameter. Obviously, we want the control parameter to come as an input from the user. So the decoder is empowered to take the control parameter as well, which is not a very complicated step, by the way; with a little bit of tweaking, we can add as many additional inputs to the decoder as we like. The encoder and decoder modules are comprised of embedding layers and stacked layers of recurrent units.
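To make the setup concrete, here is a minimal sketch of such an encoder and control-aware decoder, assuming PyTorch and GRU units as mentioned in the talk. This is not the authors' released code; attention and the encoder-to-decoder state bridge are omitted for brevity. The only non-standard piece is that the decoder concatenates an embedding of the control value to the word embedding at every time step.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hid_dim=250):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Two-layer bidirectional GRU encoder, as described in the talk.
        self.rnn = nn.GRU(emb_dim, hid_dim, num_layers=2,
                          bidirectional=True, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        outputs, hidden = self.rnn(self.embed(src))
        return outputs, hidden

class ControlDecoder(nn.Module):
    """Decoder that consumes the control value as an extra input."""
    def __init__(self, vocab_size, n_controls=3, emb_dim=300, hid_dim=500):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Control values 0 (default, used during pre-training) through n_controls.
        self.ctrl_embed = nn.Embedding(n_controls + 1, emb_dim)
        self.rnn = nn.GRU(emb_dim * 2, hid_dim, num_layers=2, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, prev_tokens, control, hidden=None):
        # prev_tokens: (batch, tgt_len) token ids, control: (batch,) integers
        tok = self.embed(prev_tokens)
        # Broadcast the control embedding over every decoding step and
        # concatenate it with the word embedding.
        ctl = self.ctrl_embed(control).unsqueeze(1).expand(-1, tok.size(1), -1)
        dec_out, hidden = self.rnn(torch.cat([tok, ctl], dim=-1), hidden)
        return self.out(dec_out), hidden
```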
This is also a standard setup, as we often see in sequence-to-sequence or neural machine translation systems. The training phase begins with a pre-training step, where, given sentences from unlabeled corpora, the encode-attend-decode module is trained to perform auto-encoding; that is, it takes the sentences and learns to reconstruct them. This is done to have a better initialization than a random initialization of the model. And, by the way, our decoder always expects an input that corresponds to the control parameter, so during pre-training we keep the control parameter at a default value. The system undergoes pre-training for a certain number of iterations, and once we see that the loss is minimized, we stop the pre-training. The second phase of training is exploration, where the system does not undergo any training. Instead, we take sentences sampled from the unlabeled corpus and feed them to the encoder-decoder framework, and we do not take the decoder output as such, because it is going to be almost the same as the input. What we do instead is sample different variations of output from the decoder, typically by sampling from the distribution that the decoder produces. Apart from that, we have another sampler module, which takes the sampled sentences and produces paraphrases of them. Then, once we have sampled enough sentences, we score them for different language aspects: for example, the readability, fluency and relatedness measures are scored for each sampled sentence. Based on that, we select a sentence which maximizes all three of these scores, and we have a scheme to do that. So the sampler is essentially what I just described: it samples K sentences with the objective of maximizing the cumulative language score. Let us say the language score is given by G(X, Y); the sampler is then a function sample_K applied to the output of the decoder, guided by G, and from the K samples it produces we select the sentence which maximizes the score. And just to remind you, in our architecture we use a very simplistic sampler which takes the sentence and produces only lexical variants of it. However, as the state of the art progresses, one can definitely use more complex sampling strategies; for example, there has been some work on Gibbs-sampling-based text sampling, and there are, of course, variational auto-encoder based paraphrasers which can be used for sampling as well. Once we sample sentences, we need to be sure that the sentences we are selecting are actually beneficial for our training purpose. So sentences which maximize the G(X, Y) score are picked for this purpose. G(X, Y) is given as a weighted sum of three different measures, R_S, R_F and R_D. R_S is the semantic similarity between the input and the output, R_F is the fluency or grammaticality of the output, and R_D is the readability grade of the output. The readability grade, fluency and semantic similarity measures are computed using off-the-shelf tools. For example, one can use skip-thought based models to compute semantic similarity, or WordNet-based similarities, or any other traditional natural language processing based similarity measure. For fluency, typically we use language models.
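As an illustration of the scoring and selection step just described, here is a small sketch, not taken from the paper's code; the three scorers are placeholders, and in practice one might plug in an embedding-based or skip-thought similarity, a language-model fluency score and a Flesch-Kincaid readability score, all normalized to a comparable range. The beta weights shown are hypothetical.

```python
from typing import Callable, List, Tuple

def g_score(x: str, y: str,
            sem: Callable[[str, str], float],   # R_S: semantic relatedness scorer
            flu: Callable[[str], float],        # R_F: fluency / grammaticality scorer
            read: Callable[[str], float],       # R_D: readability scorer
            betas: Tuple[float, float, float] = (0.4, 0.4, 0.2)) -> float:
    """G(X, Y) = beta_S * R_S(x, y) + beta_F * R_F(y) + beta_D * R_D(y)."""
    b_s, b_f, b_d = betas
    return b_s * sem(x, y) + b_f * flu(y) + b_d * read(y)

def select_best(x: str, candidates: List[str], sem, flu, read) -> str:
    """Pick the sampled paraphrase that maximizes the cumulative language score."""
    return max(candidates, key=lambda y: g_score(x, y, sem, flu, read))
```

A design note: because the learning signal only consumes these scores, adding another criterion, say a rule-based grammar checker as discussed later in the Q&A, would just mean adding one more term and weight to g_score.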
For the fluency score, one can use n-gram language models or the latest neural language models. For readability, there are many measures; most of them are heavily lexicalized, but they are pretty good at measuring readability grades. One such popular measure is the Flesch-Kincaid readability grade, and this is what we consider in our setup. Once we have these scores ready, we take a weighted sum. The beta parameters are decided during development by trial and error, and we will disclose more details regarding the betas in the experimental setup section. So, why do we consider readability for text formalization? Formal text tends to require more language expertise to understand. The higher the formalness, the higher the readability grade requirement; that is the observation for our data set. For example, a formal sentence such as "As the price of $5 was reasonable, I decided to make the purchase without further thought" has a very high readability grade as per three different readability metrics, as opposed to the informal text "It was like five bucks, so I was like, okay, let's buy it." So, in a way, readability is related to text formalness, and we do have such measurement systems, so we thought of leveraging them. Once we have sampled our sentences and selected the best possible sentence by virtue of the three different metrics, we determine the control values for the sentences. For example, if a sampled sentence has a higher readability grade, the extent to which its readability grade exceeds that of the input sentence decides the control value. Here is an equation which gives a bucketing of control values based on the output: the control value is 1 if the ratio of the readability of the sampled sentence to that of the input is less than some threshold value, which is decided by trial and error; if it is between two threshold values, then the control value becomes 2; and likewise for 3. So at the end of the day, we have a data set generated by the exploration phase, which has x, the input sentence, y, the sampled output, and c, the control value. With this data set, we now train our model. One can think, okay, we have a model with an encoder and a control-enabled decoder, so we can straight away train it using the traditional reconstruction or cross-entropy loss. However, we realized that cross-entropy loss alone is not sufficient to ensure that the control value, especially c, is taken care of by the model and has a role to play in the learning process. So we augment this traditional encoder-decoder framework with another classifier, which is trained with the following objective: given two sentences, the original input and the sampled output, along with the input control, predict whether the produced sampled output actually conforms to the input control or not. This classifier is trained separately before starting the exploration phase, and once it is trained, it is used as-is while training our encoder-decoder framework. The encoder-decoder framework then has two losses. One is the reconstruction loss, which measures how good you are at generating target sentences that are related to the input and also related to the sampled sentences.
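Stepping back to the control-value bucketing described a moment ago, here is a small illustrative sketch; the threshold values mirror the 1.05 and 1.1 mentioned later in the experimental setup, but treat them, and the readability function behind the inputs, as assumptions rather than the exact published recipe.

```python
def control_value(readability_in: float, readability_out: float,
                  t1: float = 1.05, t2: float = 1.1) -> int:
    """Bucket a sampled sentence into a control value in {1, 2, 3} based on the
    ratio of its readability grade to that of the input sentence."""
    ratio = readability_out / max(readability_in, 1e-8)
    if ratio < t1:
        return 1   # roughly as formal/readable as the input
    elif ratio < t2:
        return 2   # mildly more formal
    else:
        return 3   # highly formal
```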
Along with the reconstruction loss, we also check whether the control value computed from the generated output matches the one given as input. What is the relation between the input control value and the control value computed from the output sentence? If there is a disparity, it incurs more loss. That is why the network is bound to learn to adhere to the input control specifications as well. So, why the control predictor? I just explained it: we need it because cross-entropy loss alone is not adequate to decide whether the control value given by the user as input is taken care of while the learning process goes on. Right, so with this idea, we experimented with our system on a data set of around 14,500 unlabeled simple sentences. This is quite small; however, we do obtain promising results with this small amount of data. The sentences are taken from the Enron email corpus, the Corpus of Late Modern English Prose, non-spam emails from a spam data set, and SS for Kids. We split the data set into 80, 12 and 8 percent train, validation and test splits, and the vocabulary size is close to 100,000. The average normalized Flesch-Kincaid readability for the data set is 0.54; this is the normalized score, which ranges from 0 to 1. And obviously, the data set is available at this link. In our core model, we have a bidirectional GRU-based encoder with two layers and two layers of unidirectional GRUs for the decoder. The embedding dimension is set to 300, the encoder hidden dimension is 250 and the decoder's is 500. This is all decided empirically; the various parameters are decided based on trial and error. Just to remind you, because there is scope for deciding these parameters during development, the system can be easily adapted to different objectives. For example, if you had to use the system for the healthcare domain as opposed to the compliance domain, the beta values can be changed accordingly. Similarly, for the control thresholds, we arrived at values of 1.05 and 1.1. And we conduct 20 cycles of exploration-exploitation with a sampling size of k equal to 100. For testing, we have three control values, as I showed in the equation. The value one corresponds to retaining the input as it is, the control value two corresponds to introducing mild formalness into the sentence, and control value three makes the sentence highly formal. For evaluation, we have three baselines. The first baseline is the sequence-to-better-sequence system by Mueller. In this approach, we have a traditional auto-encoding setup, but during auto-encoding the system also learns to take into account some scores produced by external systems; for our setup, we consider Flesch-Kincaid readability scores for this. Baseline two is a non-iterative version of our own system, where we don't do exploration-exploitation in an iterative manner; the system just goes through one round of sampling many times, gathers the training data, and does one-shot training on that data. This is considered because it would be a much cheaper and faster system, since it doesn't have to go through an iterative process of training and exploration. Baseline three is where we don't have the control predictor and just use the cross-entropy loss. Our main task is to generate output for the input text for different control values.
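For reference, the setup values mentioned above collected in one place; the numbers are as stated in the talk, while the field names are my own shorthand.

```python
CONFIG = {
    "unlabeled_sentences": 14_500,
    "splits": {"train": 0.80, "valid": 0.12, "test": 0.08},
    "vocab_size": 100_000,               # approximate
    "embedding_dim": 300,
    "encoder_hidden": 250,               # bidirectional GRU, 2 layers
    "decoder_hidden": 500,               # unidirectional GRU, 2 layers
    "exploration_exploitation_cycles": 20,
    "sample_size_k": 100,
    "control_thresholds": (1.05, 1.1),   # buckets the readability ratio into {1, 2, 3}
    "control_values": {1: "retain input", 2: "mildly formal", 3: "highly formal"},
}
```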
Besides the main task, we also consider an auxiliary task of reverse simplification, which is a flipped version of text simplification. For this, we consider the popular text simplification data set. The idea is that the system has to learn to make simpler sentences more complicated, not necessarily more formal, but at least this is the closest task for which a data set exists. So we consider this task. For the main task, the results are here. The key observations are that we obtain a better readability grade when we use the whole system as opposed to when we don't use the control predictor or when we train the system in a one-shot manner. The proposed system does way better than Mueller's system. Models which are not iteratively trained or which do not have the control predictor typically end up converging to auto-encoding and are typically agnostic of the control values; that's a key takeaway. Because of this iterative scheme of learning, the system is really able to learn how to comply with the control values provided by the user at runtime. And baseline one, the Mueller system, is not doing very well. We suspect that it requires much more data for training, which we don't have, because it has to learn an inherent sentence distribution, which it tweaks with the help of the scorers, and this is not possible with a small amount of data; perhaps that is why it has not done very well. The graph on the right-hand side shows the agreement between the desired input control and the control measured on the output text. We see that for the formalness degree high, there is a considerable amount of agreement in all three variants of our system; the complete ensemble system agrees a lot, with around 70% accuracy. However, the mid range is actually quite confusing: sometimes the system produces a sentence similar to the input, sometimes the variation is quite small, and that is why the scorers sometimes fail to determine the control properly and the agreement is not very high. For the auxiliary task, we also have good results. The sequence-to-sequence skyline that we use for the auxiliary task trains on the text simplification data set itself, so it is bound to have better BLEU scores and relatedness measures as well. But then, to our surprise, its relatedness measure was not really that great; the readability was high, which was expected from the sequence-to-sequence system. It is also very exciting to note that when the formalness mid and high control values are given to our system, it also produces sentences with high readability, and they also match the original labels present in the data set. So for this task, where some labeled data was available, we also tried to show the merit of our system. We have also done a human evaluation. We considered human judgments of the outputs for 30 random instances from the test data. The task was to rank, based on readability, the sentences produced for the different control values. Sentences were randomly given to the humans, and they had to rank them based on whether one sentence was more readable, or would require more expertise to understand, than another. Based on that, we see around 80% agreement between the human-rated ranks and the ranking based on the control values computed by our scorers for the output text.
So it shows that our system really is capable of producing sentences based on the control inputs. These are some of the examples. The key takeaway is that, for sure, when the control value becomes high, the sentences tend to have more unusual words inserted into them, as opposed to when the control value is lower. However, it is very important to note that because we used a sampler that is heavily lexicalized, the sentences are mostly lexical variants of the input. The moment we start using other, more sophisticated sampling systems, for example VAE-based sampling for getting syntactic variations, the system will perhaps produce more intriguing forms of variation. So I'd like to conclude my presentation with this. We propose a novel unsupervised framework for controllable text transformation, which is obviously a key requirement in many industrial settings. Our system relies on off-the-shelf NLP tools for fluency, adequacy and readability measurement; these are the key learning signals that come from external sources. But the core learning happens in an unsupervised manner, where data is automatically generated and augmented, and based on that data the system undergoes training. We tested the framework on text formalization, but the way the framework has been designed, it is easily adaptable to other controllable generation tasks. For example, one can tackle controllable simplification in a similar manner, or sentiment transfer, as long as you have metrics to decide how simple a sentence is or what kind of sentiment content it has. In the future, we would definitely like to pursue those threads. We would also explore better sampling strategies; as I said, the system is now restricted to generating only lexical variants, and more complicated forms of syntactic and semantic variants produced by better samplers are something we would like to try out in the future. So that ends the presentation. Here are some links. Thank you very much for listening.

Thank you. Thank you very much. It was a nice presentation. If anybody has questions, you can unmute yourself; there's a little red microphone at the bottom of the screen as you hover over it. And a reminder to the IBMers: this is an open talk, so please don't ask confidential questions. Do we have any questions?

Hi, this is Sanjana from the IBM Almaden Research Center. Hi. So my question was, is there a facility to do domain-specific tuning, or are you already doing any domain-specific tuning? At the sampling stage? No, our intention was to produce a domain-agnostic version. However, see, the crux of the framework is this encoder-decoder module, which learns on unlabeled data. So as long as you have domain-specific unlabeled data, which you can crawl from websites or gather from documents, it's fine. The second requirement of our network is that we should have these scorers; for example, we use readability grade as a scoring module. So if you want some domain-specific transfer and you have such scorers available, let's say you build a classifier or a regressor or any rule-based metric that suits the requirements of your domain, then I don't see why it can't be used in the system. So it pretty much depends on whether you have this external scorer available, and some unlabeled data available as well.
So whenever you do formalization, let's say on compliance or healthcare sentences, you're saying that the formalization will be specific to the domain at the end because of the control predictor. Right. Okay. And what about the human evaluators you used? Are they specific subject matter experts? No, these are normal linguists who have a good understanding of English on the whole. But we've never done a very domain-specific experiment, so we never thought of employing any subject matter experts for this purpose. Okay. Okay, thank you. Okay.

Other questions? So I'll have one while people are gathering their thoughts. You talked early on, when you were distinguishing the various kinds of text generation, about starting from non-textual sources, I'm assuming diagrams and database tables or paper tables. Have you explored the non-text sources yet? Yes. So that is, I think, what we have been focusing on more; a major portion of 2018 was spent on data-to-text, where we were focusing on how to summarize tables and knowledge graphs. So we do have some modular systems available for table-to-text generation, both translation and summarization. Translation means you have to translate every row, column and entity in the table and produce a paragraph out of the table; summarization, on the other hand, only focuses on some key interesting portions of the table. For both, we have tried to build systems, and we have published them.

It would be interesting to see at some point if there's an overlap here with the accessibility community, because they've had to do screen readers both for text and data. There's fascinating work, but an awful lot of it is just what works, not necessarily what's the most formal. And it seems that once you get into both text explanation and data summarization, there could be a real immediate benefit for people who can't see the screens. Right. We never thought about the accessibility part. Thanks for pointing it out; we are noting it down. Our major use case has been the industry document processing and content creation sector. However, this is a very interesting direction; accessibility is something we never thought about. Yeah. Okay.

Any other questions? Please unmute yourself. Hi. Good talk. It seems the system can substitute words, but changing the grammar is a little tricky. For example, the reading grade can simply be increased by choosing a more complex synonym, and the semantic relatedness should also stay similar, but the fluency is the tricky part. So do you have anything more to add on how fluency is controlled? Right. So as I said, sorry, did you have anything to add to that? No, that's all. Right. So, as I said, we have tried to make this scoring part as open as possible. You could typically add as many terms as you want to this scoring function G(X, Y). So if you want the grammaticality to be tackled in a more efficient manner, let's say you have built a rule-based grammar checker: if certain phrases occur, if there is a subject-verb disagreement, if there is a noun number issue, or if there is any other kind of grammaticality issue, then you penalize the generation more. You could definitely have such modules in place; you could develop them using a heuristic-based approach. Then all we have to do is add another term to this, or maybe replace the current grammaticality or fluency checking function with that function.
And it would still work, because the learning part is only indirectly dependent on this. So it need not be differentiable, and it need not have all those constraints that we typically have in our loss computation. So, as long as you had access to something like the Microsoft grammar checker or some open grammar checker through which you can score the grammaticality, it would definitely deliver better results. But as you said, currently the grammar portion, the fluency part, is only tackled by a language model, which typically just gives more weight when local phrases look good. So basically, if you are using an n-gram language model, then as long as the four-grams are fine, it's fine; it never has a holistic view of the sentence. And if you consider a neural language model, it still loses context because of its constraints: if the sentence grows longer, it cannot really remember the initial portions, and there are issues around that as well. But then, if you look at G(X, Y) again, it's an interplay. If the readability scores are dragging the sentence generation towards more formal but less grammatical versions, then the grammaticality or fluency computation part will drag it back. That's the reason we have this composite scoring function, and we do assign weights. In our experiments, we did realize that if we give readability a higher weight, the sentences are super formal, as in they have a lot of complicated words, but they are very, very ungrammatical. So we had to give more weight to the grammar part, because it's okay to have a slightly less formal version, but the grammar should not be compromised. Yeah. Yeah.

Yeah, thank you. Yeah, because, for example, in a text "her highness" can be replaced by the more formal "her elevation", but the fluency of "her highness" is higher than that of "her elevation". Yeah, I understand. Great. Yeah. Thanks for asking. Yeah.

Any other questions? I guess not. Okay. Thank you again for a very nice presentation. Thanks to everybody, and to those I see on the West Coast who are up early. The seminar next week will also be an early one: it's Monday, June 24, 10 a.m. Eastern time, and it will be by Amrita Saha from IBM Research, on complex program induction for querying knowledge bases in the absence of gold programs. As always, seminars will be posted to our YouTube channel a few days after they've been given. So, again, thank you very much. Thanks a lot. It was an honor for me to present. Thanks a lot.