There's something that underlies how things function in the world. I think it starts with being curious. That's where it starts. We have particle physicists. Designers. Engineers. Economists. Chemists. Biologists. Ethnomusicologists. I am at IBM Research because I love the people I work with. I admire the people I work with; they're incredibly good, and because they're so good, I can be good. And that's the beauty of it.

Welcome to What's Next from IBM Research. Today's session is part of a series of seminars in which we spend some time with our scientists, learning about the exciting work that they're doing. I'm Shaheen Parks, and today we'll be talking with Dr. Payel Das. Dr. Das is a principal research scientist as well as a manager in our AI group. She's also an IBM Master Inventor. Dr. Das holds a PhD from Rice University, and she has co-authored more than 40 publications in the field of AI. Dr. Das will be talking with us today on the topic of controlling generative AI. Before we can really understand what it even means to control generative AI, we need a basic understanding of what generative AI is, and Dr. Das will give us that overview. But in a nutshell, generative AI is when we leverage artificial intelligence to create new content from existing data. That data might take the form of images, audio files, video, text, or even molecular structures; it's really a wide variety of inputs that we can use to create that new content. So that brings us to the topic of control. When we think about why we might want to control generative AI, essentially we're looking to ensure that the results we get are reliable, that they're trustworthy, and ultimately that they're useful for the purpose for which we were creating them in the first place. So today Dr. Das will give us an overview of this topic and also walk us through a few really interesting examples that showcase the role that control can play with generative AI.
After that, we'll have some time for Q&A and discussion, so if you have questions or thoughts, please feel free to drop them in the chat and we'll incorporate them into our conversation. With that, I'll hand it over to Dr. Das.

Thanks, Shaheen. It is my pleasure to talk in this seminar series and show you some of our recent work on controlling generative AI to ensure reliable creation. I'll start with a quote from the famous scientist Richard Feynman: "What I cannot create, I do not understand." Along that line, artificial intelligence has come a long way and provided state-of-the-art performance in predictive tasks across many different domains and applications. At the same time, in recent years we have seen the emergence of generative AI, that is, artificial intelligence techniques that are not just good at prediction but are able to create new content. So it's becoming a really powerful technology, and as with every powerful technology, it is really up to us how we make the best use of it. As generative AI emerges as a new technique, it's important and critical that we use it in a responsible, reliable, and trustworthy manner. For that, it's important to impose control on it so we achieve this objective. Before diving into how we control generative AI, let's have a really high-level overview of what generative AI is. As opposed to predictive or discriminative models, which learn the conditional probability distribution, that is, p(y|x), where y is the label and x is the observation, generative models aim to learn the joint probability distribution, that is, p(x, y). By applying Bayes' rule, this joint distribution can be used to generate a new observation x given a label y, and it can also be used for predictive tasks.
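To make the distinction concrete, here is a tiny numerical sketch, using a made-up joint distribution table rather than a trained model, of how learning p(x, y) supports both prediction and generation via Bayes' rule:

```python
import numpy as np

# Hypothetical joint distribution p(x, y) over 3 observation bins and 2 labels.
# A generative model learns this whole table; a discriminative model learns only p(y|x).
p_xy = np.array([[0.20, 0.05],   # x = 0
                 [0.10, 0.25],   # x = 1
                 [0.05, 0.35]])  # x = 2

# Prediction: recover the conditional p(y|x) via Bayes' rule, p(y|x) = p(x, y) / p(x).
p_x = p_xy.sum(axis=1, keepdims=True)
p_y_given_x = p_xy / p_x

# Generation: sample a new observation x for a desired label y, using p(x|y) = p(x, y) / p(y).
p_y = p_xy.sum(axis=0)
p_x_given_y = p_xy / p_y
rng = np.random.default_rng(0)
new_x = rng.choice(3, p=p_x_given_y[:, 1])  # generate an x conditioned on y = 1
```

Each row of `p_y_given_x` and each column of `p_x_given_y` is a proper probability distribution, which is exactly what lets one learned joint model serve both roles.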
So in a nutshell, discriminative models aim to learn the decision boundary between different classes of samples, whereas generative AI learns to model the distribution of the training data, and then we use that model to generate new content. How do we do this in a bit more detail? There is more than one way to architect a generative AI model. The recent powerful generative AI architectures mostly rely on deep neural nets, and here I list some of the popular choices. The first one is the generative adversarial network, or GAN, which formulates the generation of realistic content as a game between a generative neural net and a discriminative neural net. Then comes the variational autoencoder, which aims at maximizing a lower bound on the log likelihood of the data. The third popular option is autoregressive models, which learn the conditional distribution of every token given the previous ones; at inference time, that distribution is used to generate the next token given all the previous ones. In this slide, I show you some impressive recent artifacts created by different generative AI architectures in many different domains. These domains include music, images, image captions, as well as human-like writing. I want to point you to this specific example, which comes out of StyleGAN: given a face image of a real human, StyleGAN converts it to a different, animal-style image of the same face. Or take GPT-2, a powerful recent autoregressive language model, which can produce human-like writing when we give the system a prompt. With that, I have given you a high-level summary of the different options for generative AI architectures, and also some really impressive examples of what generative AI is able to achieve. Now we are getting into the core topic of this talk: why and how we control generative AI.
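The autoregressive option described above can be sketched with a toy bigram model: this is not GPT-2, just an illustration of "learn the next-token distribution, then sample from it repeatedly" on made-up data.

```python
import numpy as np

# A minimal autoregressive sketch (a toy bigram model, not a real language model):
# the model stores p(next token | previous token) and generation samples from it.
corpus = "abababcababab"
vocab = sorted(set(corpus))
idx = {c: i for i, c in enumerate(vocab)}

# Training: estimate the conditional next-token distribution from bigram counts.
counts = np.ones((len(vocab), len(vocab)))  # add-one smoothing
for prev, nxt in zip(corpus, corpus[1:]):
    counts[idx[prev], idx[nxt]] += 1
p_next = counts / counts.sum(axis=1, keepdims=True)

# Inference: generate new content one token at a time, conditioned on the last one.
rng = np.random.default_rng(0)
token = idx["a"]
generated = ["a"]
for _ in range(10):
    token = rng.choice(len(vocab), p=p_next[token])
    generated.append(vocab[token])
print("".join(generated))
```

Real autoregressive models like GPT condition on the whole prefix via attention rather than only the last token, but the sampling loop has the same shape.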
We need to control generative AI for many different objectives, and today I'll walk you through three different illustrations. The first one is where we control the generative AI architecture for creativity, and for that we propose a brain-inspired mechanism of auditing the neuron activations of a generative AI model. Then I'll talk about controlling the fidelity of synthesized data by imposing particular structure, for example, the temporal dependency that is inherent to the training data. And in the last example we talk about how we control for consistency with user-desired attributes, and for that we propose a new conditional sampling approach that allows us to generate new output consistent with a user-defined or desired attribute set. So let's start with the first example. The powerful generative approaches that exist today, for example, variational autoencoders and GANs, are limited in terms of creative potential because they are designed in a way that aims to mimic the training data, and that minimizes their creative potential. However, creativity is considered to be the next moonshot in AI, and if we can make a machine creative, it is possible that this will push the boundary of human creativity further. So how do we infuse creativity into a generative AI model? One path is to get inspiration from the cognitive processes associated with human creativity. Along that line, recent neuroscience experiments show a distinctive association between task-positive and task-negative neurons in the creative brain, which happens while a human is working on a creative task. Inspired and excited by this seminal finding, we asked: what happens if we modify the activations within a pre-trained generator at generation time and promote correlation between task-positive and task-negative neurons? Does that really induce creativity or not?
We call this approach, which we tested and validated, the neuro-inspired creative decoder. This work was published last year at the IJCAI conference. The advantage of the proposed approach is that it's fully unsupervised, it is model-agnostic, and it can be used off the shelf. Going through this illustration: if we take a pre-trained generative model, in this case an autoencoder, we look into a layer of the decoder at decoding time, and then we change the neuron activations such that the model now gives us access to novel samples that were not present in the training data. So let's see how we do that at the algorithmic level. Here I show you the algorithm. At a high level, our goal is to capture the task-negative behavior that is found in the human brain during a creative task. For that purpose, we identify the neurons that typically have low activation across all of the training data. Then we pick one such neuron and select all the neurons that are strongly correlated with that inactive, or "off," neuron. And then we turn all of those strongly correlated neurons on, that is, we activate them at decoding time. Here are some of the results we are able to achieve when we do creative decoding of an ArtGAN model that was originally trained on WikiArt data. On the left, I show you the original training data, or images generated by the regular ArtGAN model. On the right, I show you what happens after we apply creative decoding to the trained ArtGAN model. As you can see, the modified images still preserve the content of the original images; however, we now see novel variations in the images generated via creative decoding. But are these novel variations truly considered creative from a human perspective? That's the important question here. To answer it, we ran an experiment.
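The three algorithmic steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: `acts` stands in for one decoder layer's activations recorded over a training set, and the thresholds are illustrative.

```python
import numpy as np

# Synthetic stand-in for recorded decoder-layer activations (samples x neurons).
rng = np.random.default_rng(0)
acts = rng.gamma(shape=2.0, scale=1.0, size=(500, 32))
acts[:, [3, 11, 20]] *= 0.05                                   # a few typically "off" neurons
acts[:, 11] = acts[:, 3] + rng.normal(scale=0.01, size=500)    # neuron 11 tracks neuron 3

# Step 1: find neurons with consistently low activation across the training data.
mean_act = acts.mean(axis=0)
off_neurons = np.where(mean_act < 0.2 * mean_act.mean())[0]

# Step 2: pick one "off" neuron and find the neurons strongly correlated with it.
target = off_neurons[0]
corr = np.corrcoef(acts, rowvar=False)[target]
partners = np.where((np.abs(corr) > 0.5) & (np.arange(32) != target))[0]

# Step 3: at decoding time, switch the selected group ON to perturb generation.
def creative_decode_layer(activation):
    activation = activation.copy()
    activation[[target, *partners]] = acts.max()  # force the group to fire
    return activation
```

Applying `creative_decode_layer` to a layer's activations before the remaining decoder layers run is what pushes the decoder off the training distribution while keeping the rest of the pre-trained model intact.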
We do creative decoding on generative models trained on three different datasets: MNIST, a dataset of digits; Fashion-MNIST, a dataset of fashion objects; and a combined dataset of MNIST and Fashion-MNIST. We perform creative decoding along with three other baseline decoding methods: noisy, random, and regular. Then we show these images to human annotators, without telling them which image comes from which decoding method, and ask them to rate the creativity or novelty present in the images. Interestingly, as you can see in these figures, we found that the creative decoding method produced images that were consistently considered more creative by human annotators, compared with the images generated by the three baseline methods: noisy, random, and regular. Moreover, we found that the images generated by the creative decoding method were, on average, at a higher reconstruction distance from the training data than those from the other baseline methods. Taken together, these results suggest that creative decoding of a pre-trained generative AI model, done in this specific neuro-inspired way, really gives us access to creative content that is consistent with the human perception of creativity. The method, again, is application- and domain-agnostic, and these results encourage us to keep exploring the creative potential that we can trigger by auditing the neurons of a pre-trained generative model. With that, I'll move on to our next example, where the objective is different. Now we want to control for fidelity in the synthesized data, as opposed to inducing more creativity, and for the specific application I'll be talking about, it is important that we impose on the generative model certain structure that is present in the training data.
The application we are considering here is generating financial transaction data, which is basically tabular time-series data. We are interested in such synthesized data because financial transaction data, for example credit card transaction data, is really important when we look into applications of AI in finance and many other industries, and it is really important to have access to high-quality, high-fidelity synthetic data that can be used for many downstream applications without privacy concerns. How do we do that in practice? In this case, we resort to powerful recent large language models such as BERT or GPT; these models rely on a mechanism called attention. In this slide, I show you how we use a GPT language model to generate financial transaction data. At the bottom of this chart, I show you a glimpse of what our input data looks like. Here we are generating transaction data for a specific user, and every row in this figure shows a single transaction. Each transaction has different fields, such as year, merchant state, or usage. During training, we provide the GPT transformer with N such transactions and ask it to generate the (N+1)-th one. As you can see, each transaction can be considered a sentence, and each of the fields can be considered a token in that sentence. Using that formalism, we train GPT on the financial transaction data and ask it to generate new financial transactions. This chart shows the fidelity of the synthetic data. What we are showing here is the chi-square distance between the generated distribution and the training distribution, broken down by the different features that were present in the original training data.
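The per-feature fidelity check described above can be sketched like this. It is a toy illustration on randomly generated data, with an illustrative binned "amount" field rather than real transactions, and the chi-square distance here is one common symmetric variant.

```python
import numpy as np

# Compare the histogram of a field in synthetic transactions against the
# training data using a (symmetric) chi-square distance; near 0 means faithful.
def chi_square_distance(p, q, eps=1e-12):
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

rng = np.random.default_rng(0)
train_amounts = rng.integers(0, 5, size=1000)  # e.g. a binned "amount" field
synth_amounts = rng.integers(0, 5, size=1000)  # drawn from the same distribution
bins = np.arange(6)
p, _ = np.histogram(train_amounts, bins=bins)
q, _ = np.histogram(synth_amounts, bins=bins)
score = chi_square_distance(p, q)
```

Computing this score separately for each field (amount, merchant name, zip, and so on) is what produces the per-feature fidelity chart the talk describes.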
Interestingly, for a specific user, when you look at the data synthesized by our TabFormer model, which was presented at ICASSP earlier this year, we find that there is conservation in features like amount, card, and usage, while we see more variation in merchant name and zip. This is exciting and really realistic, because for a specific user we expect to see such conservation in the amount being spent or the card being used, whereas the merchant name and zip can show more variance. With that, I come to the last and final example we are going to showcase today. Here the task is that a user provides a set of attributes and asks the generative AI model to generate new, valid, and meaningful samples that are consistent with the provided set of criteria. For that, we propose a new conditional sampling approach that lets the model generate samples consistent with the criteria provided by the user, while at the same time maintaining the novelty and validity of the generated samples. Now, where is this problem important? It is important in many different fields and domains, and I'll talk about a specific scenario where the task is to come up with an optimal and novel design; in terms of optimality, it has to satisfy the objective that the user has set. The problem could be finding a new molecule, a new design for a physical system like a race vehicle, or coming up with a novel natural language text that satisfies certain attributes, like consistency with a certain style and a certain sentiment. Typically, the search space is huge, and we don't have a lot of labeled samples to train the model with.
To overcome the challenges associated with searching effectively in a large space and learning from a few labeled data points, because labeling is expensive and cumbersome, we take a step back and train a generative model on unlabeled data. Unlabeled data is typically much more abundant and cheaper to obtain than labeled data. As an example, here I'm showing that we first train a generative autoencoder model comprising two different neural nets, an encoder and a decoder. The goal is that the model should learn to accurately reconstruct the input data: the encoder encodes the original input into a low-dimensional, compact latent-space representation, and the decoder learns to reconstruct the original input from the latent encoding. We also have a regularization loss that ensures this latent space is smooth and continuous and carries the meaningful information about the input data. So we start by training a generative autoencoder model on abundant unlabeled data. Next, we leverage the attribute-labeled samples to map the attributes to the latent space. For that, we first encode the training data with the learned encoder and then fit an explicit density model to this latent space. We then use the labeled data to learn a predictor model for each attribute on top of the latent encodings. Finally, we draw samples using a rejection sampling scheme that we have proposed, which is guided by the attribute predictors. What this does is allow us to find the region of interest in the modeled latent space where the chance of getting latent encodings consistent with the attributes the user has provided as an objective is really high.
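The guided rejection sampling step described above can be sketched as follows. This is a toy sketch, not the paper's exact implementation: the latent density model and the attribute predictor are stand-in functions on a two-dimensional latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

def latent_density_sample(n):
    # Stand-in for the explicit density model fitted to the latent space.
    return rng.normal(size=(n, 2))

def attribute_predictor(z):
    # Stand-in classifier returning p(attribute | z); here the attribute is
    # likely when the first latent coordinate is positive.
    return 1.0 / (1.0 + np.exp(-4.0 * z[:, 0]))

def sample_with_attribute(n_wanted, threshold=0.9):
    # Rejection sampling: draw latent codes from the density model and keep
    # only those the attribute predictor scores above the threshold.
    accepted = []
    while len(accepted) < n_wanted:
        z = latent_density_sample(256)
        keep = z[attribute_predictor(z) > threshold]
        accepted.extend(keep)
    return np.array(accepted[:n_wanted])

z_good = sample_with_attribute(100)
# These accepted latent codes would then be passed through the trained decoder
# to produce new samples, e.g. candidate peptide sequences.
```

With several attribute predictors, the kept region is the intersection of their high-score regions, which is exactly the "region of interest" the talk describes.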
We sample from that region of interest in the latent space and then pass the samples through the original decoder, or generator, to generate new samples that are consistent with the user-specified attributes. We call this new approach Conditional Latent (attribute) Space Sampling, or CLaSS, and this work was published in the journal Nature Biomedical Engineering earlier this year. We subjected this controlled generative AI to an important task: discovering novel and safe antibiotics. Designing novel and safe antibiotics is really important, as it will help us fight antimicrobial resistance, which is one of the top 10 global health threats. It's really difficult to come up with new antibiotics, because again the search space is huge and effective antibiotics lie in a small region of that big space; that's why the discovery pipeline for new antibiotics is extremely dry. So we applied our controlled generative AI technique to learn a generative autoencoder model on known peptides, and then we controlled the sampling so that the model provides us with novel antimicrobial peptide sequences that are likely to have the desired properties of a good antibiotic. We sent 20 such peptide candidates designed by our AI model to the wet lab, because we need to really get them validated. Those got synthesized and then tested, and we are happy to share that two of those novel AI-designed antimicrobials indeed have many desired properties, such as high broad-spectrum potential, low toxicity, and low resistance onset. What is more interesting is that it only took us 48 days, and we had a 10% success rate when we leveraged controlled generative AI, as opposed to a few years and less than a 1% success rate typically achieved with existing methods.
Taken together, this study serves as a proof of concept that using controlled generative AI in an effective and efficient manner allows us to overcome many challenges, like low availability of labeled data, and to generate promising candidates, in this case next-generation antibiotics. All of this can be done if we know how to effectively and efficiently control these powerful generative AI techniques. And again, this method is really domain- and application-agnostic and can be applied to a broad range of modalities as well as different applications. With that, I would like to conclude by saying that I hope I have been able to give you an overview of what generative modeling is and what it can achieve, and why it is important to impose control on such powerful techniques so that they serve humanitarian good while we minimize the risk of potential adverse events. What I showed you today is only a glimpse of our overall agenda around generative AI at IBM Research, which goes all the way from foundational theory, new and more efficient formalisms for generative AI architectures and algorithms, through applications across different problem areas like trust, discovery, and business, and finally all the way to grand demonstrations with applications that directly benefit humanity. That's what we call generative AI for good. I thank you all for your kind attention and for attending this talk, and I'll be happy to take any questions at this point. Here is my contact info; feel free to reach out to me or anyone at IBM if you have a question about this talk. Thank you.

Thanks, Payel. That was super interesting, and the examples really brought it to life. We do have a few questions, and to start out, I wanted to ask you a little bit about truthfulness. Right from the beginning, we talked about the importance of using control to help ensure the truthfulness of our results.
In the examples that you shared, it seems like in the first example you measured the truthfulness with your human annotators, to try to understand whether these are truly creative. And in your last example, it seems the truthfulness is essentially measured by the fact that these molecules were successful in the lab. So my question to you is: is there a way to generalize a measure of truthfulness for controlled generative models? Or is it best practice that it will have to be fit to each use case?

I think that's a really good question, Shaheen, and it's a tricky one. It depends on what our final goal is. When it comes to the creativity that we showcased in the first study, it is important that we get it assessed by human annotators, so that it's consistent with the human perception of creativity. For the last one, if we are talking about designing new antibiotics, it needs to be able to kill bacteria, and there is no workaround other than doing that expensive and extensive evaluation in the wet lab. So it really depends on the task at hand: is it enough that the truthfulness of the synthesized data is just its consistency with the training data, or creativity, or some aspect of human perception, or its efficacy in the lab or in a real-life scenario? It really varies case by case. But at a minimum, we have machine learning proxies, surrogates, and machine-learning-based metrics to evaluate such truthfulness to start with. That's where we always get started.

That makes a lot of sense. So it sounds like right from the beginning, you need to be thinking both about how you want to address the problem and about how you're going to check whether your solution worked.

Yes. As another example, we also extended our controlled generative AI to generating inhibitors for the SARS-CoV-2 virus.
In that case, we had to take it to the lab and really check whether it was able to inhibit the virus in a live assay, and we found that our generative AI approach is able to generate inhibitor molecules that really can stop the virus from replicating. So again, it depends case by case, but we always do some checks at different levels before sending candidates for more expensive and extensive evaluation, whether we are bringing human annotators into the loop or sending candidates for expert validation in the lab.

That makes a lot of sense. And it feels like stopping viruses is something that is top of mind for all of us.

Yeah, exactly.

I have a much more tactical question for you from the audience. Suppose you're building a new language model from a dataset that contains multiple dialects. Are there any strategies to work with that situation?

It depends on the goal. We can come up with a language model that we train on multiple dialects, but the important question is: what is the task? Are we trying to generate new samples that are consistent with each of those dialects? Or is it fine that the model learns something like a universal representation of all possible dialects and forgets the specific style of each dialect? So yes, in principle it is possible, but how we fine-tune the model depends on the specific application or task we are going to consider.

That makes a lot of sense, and I think that's consistent with what we were saying before: right from the beginning, you need a clear idea of what your desired outcome is, to be able to control appropriately and to measure whether you got there. So I have a couple of other questions; I'm just looking at the time.
When we think about generative AI in the news, it's often paired with considerations about deepfakes; there's a lot of negative press around generative models. Can you comment on that and on its relationship with the work that you're doing?

We are working on using generative AI to do good and, as I mentioned, on minimizing the potential risks associated with it. It's a really powerful technology, and when used in the right manner and with the right intent, it really can do wonders, as I hope I have shown the audience today through a few examples. At the end of the day, it's a powerful technology, and it's really up to us, the users, the developers, and society, how we make the best use of it and how we guard against misusing it. It's not the technology; it's us who are at the receiving end and also at the developing end, and we have to use it in a responsible manner, ensuring that it's only used to benefit society and not otherwise.

Absolutely, absolutely. And that's true of so many novel technologies. We're coming to the end of our scheduled time, so to close us out: what's next? What do you see coming up in this field of research? Do you anticipate generation becoming more powerful, or maybe moving to different domains or use cases? How do you see both generative AI and the role that controls will play coming up on the horizon?

This is a really open-ended question, and I can't come up with a prescriptive answer here. But a few directions are starting to look promising, interesting, and emerging. So far, in a lot of scenarios, generative AI has depended on the availability of a high volume of data; it's mostly driven by learning from data. But there are other sources of knowledge.
For example, how do we leverage existing rules or domain knowledge when we are formalizing a generative AI model? That could make it more data-efficient as well as more compute-efficient. So that, I think, is one direction, where we don't just learn to construct a generative AI model, with or without control, from data alone. The other direction I think is going to be important is introducing more angles of trust, like explainability, steerability, or controllability. How do we explain the generative AI model itself, or the control we impose on an existing architecture or model? So I think some of these directions, learning to explain a generative AI model and learning from less data, are upcoming, and some of that work is already happening in the field; I think these are what will take us to the next horizon of generative AI. We are also going to see more and more applications, not just in isolation, where AI alone plays the role, but together with domain knowledge, with the simulators that exist in a specific domain, or with a human expert in the loop. So these are some of the directions that will be considered in the next horizon of generative AI, and I'm excited to see how that comes along.

Well, these are exciting times. Thanks so much for spending this time with us today. I know I've learned a lot, and I hope our audience has as well. And thank you all for joining us; we hope to see you for our next What's Next session.

Thank you all. Thank you very much.