Well, thank you very much for introducing me and thank you for having me today. I wish I could have attended the rest of your workshop, but it was a little bit too early for me today. So, my name is Werner Geyer, and as you can probably guess from my name and from my accent, I'm originally from Germany. But it's been a long time. I did my PhD in Mannheim in 1999, actually. Back then my work was mainly focused on distributed systems and algorithms for consistency control, but already during my PhD I discovered that I'm really interested in leveraging computers to help people collaborate more effectively. So back then I met with psychologists at the university, and we did some user studies during my PhD, but this was more like bonus material for the PhD because it was more an algorithmic piece of work. I shifted more and more into the CSCW and HCI domains over the years, and I joined IBM Research in 2000. As Daniel said, I'm a principal research staff member and a manager in Cambridge, Massachusetts. I lead the interaction team, and as of recently I'm also leading our global strategy on human-centered AI here at IBM Research. My team is very interdisciplinary: we have a lot of HCI and machine learning researchers on staff, designers, psychologists, software engineers. We do a lot of system building and trying out things with users, so very applied research at the intersection of user experience and artificial intelligence. For today, given time is limited, I'm only going to talk about a few selected research projects (there are many, many more), all in the context of human-centered AI. I'm trying to frame this work today around automation versus collaboration. And let me just kick this off with a quote from Tom Malone, who should be very familiar to you guys.
As you all know, AI research and development has exploded over the past decade with the advent of neural networks, and as a consequence AI systems are everywhere in our personal and work lives. But at the same time, with the rise of AI, the fear of these systems replacing humans or taking over the world went through the roof as well, as if this was a battle between humans and AI. The term "human in the loop" is often used to acknowledge that humans are still needed and still exist. But in my opinion, it is also not a very helpful term, as it diminishes the role that humans play: it suggests that the loop consists mostly of AI players or agents, and that is, I think, far from reality. It's certainly true that AI systems are now able to accomplish tasks that could only be done by humans previously, and that the role of humans will change. However, one could take a more positive, realistic, and constructive perspective on this and focus more on the opportunity at hand for individuals, for society, and for our research domain. And as Tom suggests here in this quote, think about how humans and AI can work hand in hand to accomplish a task together. So the fundamental question is: how do we design and build responsible AI systems that enhance and extend human capabilities, for the good of our products here at IBM, our clients, and of course society at large? If you take a look on this next slide at the titles of some of our recent publications, for example at CHI, ICAPS, IUI, NeurIPS, and CSCW, you will notice that automation through AI and human-AI collaboration are really important themes for us, because they show up all over the place in these titles. Ultimately, for us, this feeds into a much broader agenda on understanding how to effectively integrate AI into the workflow of enterprise users, because that is, for our team at least, the target user group.
And again, keep in mind that when we talk about automation here, we really mean infusing AI into systems to help humans complete tasks. With this point of view, I'd like to take a look at the various levels of automation and frame the research presented in this talk in the context of these levels of automation versus opportunities where we can have true human-AI collaboration. I've borrowed a five-level spectrum of automation from our recent CHI submission here, ranging from no automation on the left side of the slide, through human-directed and system-directed automation, to full automation on the right side. This is of course a continuous spectrum, and other approaches and dimensions for labeling the degrees of automation have been proposed. For example, you may be familiar with Eric Horvitz's work on mixed-initiative user interfaces, which can also provide a valuable guide from the perspective of how much initiative humans take versus the AI system. In addition, I've also painted the y-axis here to reflect the degree of collaboration between the human and the AI system. Naturally, if we have no automation, there will be no opportunities for humans to collaborate with an AI system; likewise, if we have full automation and the AI runs completely autonomously, there will be no opportunities for human-AI collaboration either. So there are interesting opportunities in between those two points to design and build AI systems that collaborate with humans at various degrees of interaction. We make the assumption here that the true value of human-AI collaboration unfolds somewhere in the center. The curve in reality probably looks slightly different, but we make the assumption here that it's somewhere in the center.
That's where you find true human-AI collaboration, where AI systems become teammates or equal partners to the human. That means both sides show an equal amount of initiative, and the two of them can accomplish more together than either could do separately. Of course, if you look at this, you could make the argument that for an AI system to be an equal partner, the AI system needs to be able to mimic or have human-like intelligence or characteristics. So we're really talking about artificial general intelligence here, which entails that an AI system would have human-like capabilities such as learning, general problem solving, reasoning, planning, goal-orientedness, creativity, critical thinking, and most of all natural language communication, because if you collaborate with someone, communication is an essential element of that. We have made a lot of progress in some of these areas, but today, realistically speaking, most of the AI systems we design and build fall under what we commonly call artificial narrow intelligence; that is, they are narrowly focused on a single task, they're domain specific, and they solve very specialized problems. While we often use the term human-AI collaboration today, and you have seen this in the titles of the papers from my team, the AI systems being designed and built today are, realistically speaking, more human-directed, and they're mostly augmenting human capabilities. So, just highlighting a few items here: most of the human-centered AI research and development we do on my team is focused on systems that solve specialized problems and help human workers be more effective.
However, we have also made some forays into research areas that foster our understanding of this notion of true human-AI collaboration, in particular in the areas of communication between humans and AI and its implications for collaboration, as well as computational creativity. In the category of specialized problem solving, we've worked on making data scientists more effective by automating some of their tasks, and we've also started looking into applying AI to process automation. We've also looked into aspects of communication that will be relevant to consider when designing effective human-AI collaboration systems, such as social perception, mental models, cognitive bias, and explainability; I understand that was a big topic of the workshop today. Some of our recent work focuses on computational creativity, and in particular how deep generative models may change how humans will collaborate with AI when it comes to the creation of physical or virtual artifacts. We've started doing research into domains like drug discovery and software engineering. I won't be able to cover all of this today, so I'll be focusing on these two areas, the top and the bottom. So, let's start with how we have used AI to augment the work of data scientists. You all probably know very well that data scientists today are in very high demand, and that demand is only going to increase: according to the US Bureau of Labor Statistics, there's going to be a 28% increase through 2026. At the same time, and this is from a recent InfoWorld article, there is this 80/20 dilemma that data scientists experience: they're highly qualified subject matter experts, but they spend 80% of their time finding, cleaning, and organizing data, and trying out different approaches and models. So a large part of their work is also repetitive.
You know, models are only as good as the data, so this is an important activity; however, given how scarce data scientists are and the fact that they only spend 20% of their time on analysis, it seems like an excellent opportunity to augment data science capabilities with AI. To support this kind of automation, there are already a number of products and systems available in the market. We have our own product called AutoAI; there's also Google's AutoML, H2O, DataRobot, auto-sklearn. These are just a few examples, and on the right-hand side you can see two screenshots of IBM's product. Instead of describing these screenshots to you, I was thinking about showing you a quick video. Let me just get out of the presentation mode here and move this over here. You should see this now, hopefully. Let me play this back and crank up the volume a little bit. [Video:] "Thanks, Pearl. Another technique for scaling is intelligent automation of AI and data prep. For example, with IBM's AutoAI, users can simply specify a data source and a column to predict, and then AutoAI will automatically generate machine learning pipelines, efficiently search the space of top-performing algorithms and feature transformers, perform hyperparameter optimization, and create a leaderboard of the best-performing models. As seen in the previous demo, users can inspect the different pipelines, parameters, and error metrics, and even save a Python notebook to further inspect or build upon. So AutoAI helps users to scale by enabling them to rapidly experiment with new data, easily generate and train top-performing ML models, lower the burden of keeping up with the latest ML techniques, and enable a much broader range of skill levels to leverage AI. And we're continually bringing new..." Okay, so let me stop here. This was just to give you a glimpse of what this product does. Let me go back to my deck.
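To make the pipeline-search idea in the video more concrete, here is a minimal sketch of what an AutoML-style search does under the hood. This is not AutoAI's implementation; the candidate estimators, hyperparameter grid, and scoring below are my own toy assumptions, using scikit-learn.

```python
# Toy sketch of AutoML-style pipeline search: evaluate several
# algorithm + hyperparameter candidates with cross-validation and
# rank them on a leaderboard, as AutoAI does at a much larger scale.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # the "data source" + "column to predict"

candidates = {
    "logreg_C=1.0": LogisticRegression(C=1.0, max_iter=1000),
    "logreg_C=0.1": LogisticRegression(C=0.1, max_iter=1000),
    "tree_depth=3": DecisionTreeClassifier(max_depth=3, random_state=0),
    "tree_depth=5": DecisionTreeClassifier(max_depth=5, random_state=0),
}

# Score each candidate and sort best-first into a leaderboard.
leaderboard = sorted(
    ((cross_val_score(est, X, y, cv=5).mean(), name)
     for name, est in candidates.items()),
    reverse=True,
)
for score, name in leaderboard:
    print(f"{score:.3f}  {name}")
```

A real system would additionally search feature transformers and run hyperparameter optimization rather than a fixed grid, but the leaderboard-of-pipelines output is the same idea.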
Okay, so this is actually something we've been working on for two to three years, very closely with our product teams, and of course during that time, at some point, we wanted to know how well this actually works. So we ran various experiments and studies with AutoAI in various spaces. This one here is an experiment with 30 data scientists where we wanted to understand the productivity gains of automation in this space. Here we were comparing the use of notebooks versus AutoAI to solve a machine learning modeling problem, and the outcome, as you can see on the slide here, was pretty compelling. In the AutoAI condition, data scientists produced eight versus three models, they were able to generate models three times faster than in the notebook condition, and they achieved higher accuracy and made significantly fewer mistakes. So it seems like it's all working wonderfully, and it's great. But what was really interesting about the study is that the manual group with notebooks trusted their models significantly more than the group that used the AutoAI system, even though, as I just mentioned, the manual models were not as good as the automatically generated models. The system used in the study also had a code generation feature, I think Lisa mentioned this in the video earlier, that allowed data scientists to inspect and manipulate the auto-generated machine learning code. Most of the subjects in the AutoAI condition actually made use of that feature, because they didn't trust the machine-generated models and wanted to see the code. Not only did they report to us that they appreciated the availability of that feature, but we also measured trust before and after the experiment, and we found that in our AutoAI group trust significantly increased after the experiment, due to the availability of this transparency feature. So this study shows that establishing trust in AI-automated systems is critical for adoption.
It's not always trivial to achieve, and there are of course many more ways in which you can achieve trust and transparency; we just discovered this one in the study because the feature happened to be available and people started using it. When looking at the entire AI development lifecycle of preparing data, engineering features, and building models, what you just saw in the video covers really important steps. And while our core AutoAI research so far has focused on those, there are many other steps that are critical for successfully implementing AI systems in an organization, for example requirements gathering, data acquisition, model deployment, runtime monitoring, model refinement, and decision making and optimization. As a matter of fact, end to end, data science is also a very collaborative process involving many players in different roles: expert data scientists, domain experts, AI operations people, and so on. So there are lots of opportunities also for other technologies, not only AI, to help improve data science. We administered a survey a couple of years ago to 239 people in these roles along the lifecycle, to better understand their attitudes towards automation and the challenges they see for automation. And, very important for us, we wanted to identify novel opportunities where we can inject AI to augment data science capabilities. So here's one result from the survey. This chart on the right-hand side compares the current level of automation, that's the top bars here, with the preferred level of automation, that's the bars underneath. This goes across the 10 stages of the AI lifecycle, where L0 means no automation and L4 means full automation; this actually refers back to the levels that I used earlier in my presentation.
There are two important takeaways from this chart. Across the board, on average, you can see that our participants feel like there is still not enough automation today, and the good news is that they want more; that means we have more work to do here. You can also see in this chart, though, that participants overwhelmingly do not want full automation across the stages, but rather an L2 level, which is right in the middle of our automation chart where we see human-AI collaboration, a balanced mix between human-directed and system-directed automation. The leading opportunities here are in data pre-processing, feature engineering, model building, model deployment, monitoring, and refinement. You can also see there are areas where people do not feel very comfortable with automating more, and that's around requirements gathering, model verification, and decision optimization, because people feel these are stages where human involvement is really needed, and they believe that maybe AI is also not quite there yet. As I said a couple of slides ago, trust is very important for AutoAI. As part of this study we also asked our subjects about different types of explainability that could help them establish trust, like what, how, why, what-if, and confidence. The what question refers to what is being automated; the how question to how decisions are being taken by the system; what-if scenarios ask whether the system would make different decisions if we changed some of the parameters; and confidence refers to the confidence of the AI system. Overwhelmingly, and you can see a huge difference here, they rated most of these types of explainability as very or extremely important across all the stages of the lifecycle. But you can also see that leading the pack here are explanations that involve expressing how confident the AI is in the solution proposed or the decision being made.
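Since confidence explanations led the pack, here is a minimal sketch of what surfacing per-prediction confidence can look like. This is my own illustration, not the product's implementation; the 0.8 review threshold is an arbitrary assumption.

```python
# Minimal "confidence" explanation: alongside each prediction,
# report how sure the model is, so a user can decide whether
# to trust it or double-check it.
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)   # per-class probabilities
confidence = proba.max(axis=1)      # confidence = top class probability
predictions = proba.argmax(axis=1)

# Flag low-confidence predictions for human review.
needs_review = confidence < 0.8
print(f"{needs_review.sum()} of {len(X_te)} predictions flagged for review")
```

The same pattern generalizes: any model that exposes calibrated (or even roughly calibrated) probabilities can drive this kind of explanation.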
In another study, with 21 data scientists, where we focused only on trust, they ranked a list of transparency features that they considered really important, and the list on the right-hand side shows you the top features from that list. You can see that aside from the number one feature, which is of course viewing evaluation metrics or performance metrics of the algorithm, there are a lot of visualizations and visual representations in the list that play a very important role in increasing transparency and improving trust. So naturally, we have also played with these kinds of data visualization approaches in the context of AutoAI. For example, shown on this slide here, we enhanced the visualization technique of parallel coordinates by allowing our users to drill in and explore elements further. We call this technique conditional parallel coordinates, and it is presented on top of the leaderboard. This was probably too fast in the video, but when you saw the video of the product a little bit earlier, they actually flashed a smaller version of this on the screen, because it is available in the product today. This visualization allows data scientists to inspect and interact with the various stages of the pipeline. All these turquoise lines here represent data pipelines and models, and we can inspect them in detail while at the same time not losing the overall picture. We did a small study with this, and it showed that this visualization significantly increased understandability of the process. All right, I'm flipping too fast here. We're also augmenting other stages of the development lifecycle with AI capabilities, for example during data acquisition, where the availability of labeled data is often a major impediment to successful enterprise AI systems.
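As a quick aside on the parallel-coordinates view mentioned above: a plain (non-conditional) version of that leaderboard visualization can be sketched in a few lines. The leaderboard numbers below are made up for illustration; the conditional variant in the product adds drill-down interaction on top of a view like this.

```python
# Sketch of a parallel-coordinates view of a pipeline leaderboard:
# each line is one candidate pipeline, each vertical axis one metric.
import matplotlib
matplotlib.use("Agg")  # headless rendering, no display needed
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Hypothetical leaderboard of four candidate pipelines.
leaderboard = pd.DataFrame({
    "pipeline": ["P1", "P2", "P3", "P4"],
    "accuracy": [0.91, 0.88, 0.86, 0.84],
    "f1": [0.90, 0.87, 0.88, 0.80],
    "train_time_s": [0.42, 0.31, 0.55, 0.12],
})

ax = parallel_coordinates(leaderboard, class_column="pipeline")
plt.savefig("leaderboard.png")
```

One line per pipeline makes it easy to spot trade-offs, for example a slightly less accurate pipeline that trains much faster.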
Many use cases there require subject matter experts for labeling, for example when building models for intent classification in conversational systems. Here on the right-hand side you see a case in which subject matter experts had to label utterances, selecting from multiple labels on the right. As you know, SMEs are expensive and often time constrained, so this is an interesting area again where AI assistance could potentially make a difference. We tested this hypothesis with 50 users in a between-subjects experiment in which we administered assistance through a weak versus a strong AI, where the weak AI was trained on fewer samples and the strong AI on more samples. So this was sort of a label recommender system, if you will. AI alone, without human involvement, achieved 44% or 59% accuracy, depending on whether it was weak or strong. However, what this slide here shows is that when combining the human with AI assistance, we can achieve up to 79% accuracy, which is a great example of how the combination of human plus AI is better than either side alone. In addition, this also led to a significant reduction in time spent labeling. You've probably seen this across the slides I've shown you so far: for many of these, I have put references on the bottom of the slides, so if you're interested in digging a little bit deeper into those studies, the references are there on the bottom of each slide. So let me move on to the second half of my talk. As AI systems become more powerful, they will be able to mimic or even possess traits that define human intelligence, as I mentioned earlier on my slides, and here I want to focus a little bit more on creativity. Computational creativity enabled by AI is an area we started looking into because this domain has tremendous potential for our future and offers new opportunities for designing true human-AI collaboration systems.
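The weak-versus-strong label recommender described above can be sketched minimally as the same classifier trained on fewer or more labeled examples, suggesting top labels for the SME to confirm or override. The dataset, model choice, and top-2 cutoff here are my own assumptions for illustration, not the study's setup.

```python
# Sketch of AI label assistance: a "weak" model trained on few
# examples vs. a "strong" model trained on more, each recommending
# labels that a subject matter expert can accept or correct.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
X_train, y_train = X[:1000], y[:1000]
X_pool, y_pool = X[1000:], y[1000:]   # unlabeled pool (labels held out)

weak = LogisticRegression(max_iter=2000).fit(X_train[:50], y_train[:50])
strong = LogisticRegression(max_iter=2000).fit(X_train, y_train)

def recommend(model, x, k=2):
    """Return the model's top-k label suggestions for one item."""
    proba = model.predict_proba(x.reshape(1, -1))[0]
    return list(np.argsort(proba)[::-1][:k])

# The SME sees these suggestions and makes the final call.
suggestions = recommend(strong, X_pool[0])
print("suggested labels:", suggestions)
```

The human-plus-AI gain in the study comes from exactly this division of labor: the model narrows the choice, the expert decides.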
In this context, this kind of collaboration is also often referred to as human-AI co-creativity or co-creation, which means humans and AI systems working together to create physical or virtual artifacts, to create content; so this is not about decision making. The field of computational creativity has been around for a while. What's new here is that a powerful new class of deep learning models, which are generative, has gained tremendous traction in the past few years and has led to breakthroughs in the process of creation. While you are certainly very familiar with Ian Goodfellow's work on GANs and how they can be used to generate photo-realistic images of fake people, these algorithms are increasingly being used in industrial settings with very impressive results. Let me just give you a few examples. Airbus designed a new compartment divider for the A320 that used less material, increased maximum load capacity, and by doing so reduced CO2 emissions. Airbnb developed a sketch-based user interface tool that enables designers to test UX designs faster by translating sketches into code. In our lab here at IBM Research, some researchers demonstrated a path for the discovery of antimicrobial peptides; those are drugs that can help against antibiotic resistance, and this reduced drug discovery time from two to four years to 48 days, with a 10 times higher success rate. Or, as another example, the band YACHT created the album Chain Tripping with generative AI, and that album was nominated for a Grammy, which demonstrated the inspirational and aesthetic properties of these algorithms. So we are at a point today where we have reached what Douglas Eck called "fluency" in a recent keynote at one of our workshops, and what that means is that we can build really good generative models.
However, while we have fluency, the production process of these models is still very hard work, trial and error, and overall very brittle: it involves manually creating, testing, and iterating over multiple models, fine-tuning, running batch scripts, and combining results. These algorithms typically lack an understanding of the intent of the creative producers, and most of them are feed-forward, which makes it very hard to allow for control and for steering the output in a desired direction based on human input. They often also produce way too many candidates, which makes it difficult to choose from, so there's an information overload challenge too. From the perspective of a subject matter expert, we need to address motivation to use these systems, because SMEs might also feel a loss of ownership in situations where all of a sudden the AI creates the solution or parts of the solution. So the interesting question for us here is: how can we make these systems accessible to subject matter experts or domain experts, to non-data-scientists? How do we design an effective cooperative user experience in which the role of the human shifts towards specification during goal setting, higher-level creativity, and intuition, while the AI provides inspiration, lower-level creativity, and detail work, and can generate variations at an improved quality and speed? This all sounds a little bit abstract, so let me give you an example to further illustrate the idea. This is not really a business setting, but it's pretty cool work that we have done anyway. This is a painting tool we developed together with MIT; if you want to play with it, I left you two URLs on the bottom, one is a video and one is a pointer to the actual website where this tool is running. It's a pretty cool painting tool because in this tool you don't paint by manipulating pixels but rather with neurons, so it's a semantic painting tool, if you will.
You manipulate neurons by manipulating the latent space, and in the UI here you can select objects such as trees, doors, bricks, grass, clouds, etc., and you can decide where you want to put them. On the right-hand side, for example, I've added two trees to this cathedral, one to the left and one to the right of the building. And if I added a door to this building, the model would know exactly the right proportions and merge it nicely into the existing image, because the model has learned the properties of those objects. One thing to note here is that if I try to add a door to the sky, the model would not allow me to do that; you can try this if you go to the website. It has learned that doors typically cannot be found in skies, and this is a good example of how this kind of interaction with a generative model is not only human-directed but also has aspects of system direction, because the algorithm has made a decision that it cannot put a door into the sky. There are all sorts of implications for how we design effective interaction with these models; the tool, for example, doesn't tell you anything when it refuses to paint the door into the sky. But this is really what I'm talking about, in very simplified ways. Let me give you two more examples. We've recently started experimenting with co-creative generative systems in business domains. Here we developed an interactive user experience that allows computational chemists to explore the latent space of a generative model for peptides with antimicrobial properties; I mentioned this two slides earlier. This works by interpolating between two known good peptides, where each peptide in the screenshot on the right is represented as a sequence of characters.
We're showing various properties a chemist can examine on the right side, as well as the 3D structure on the bottom, and they can use their own intuition and knowledge to further narrow down the selection. Some of the peptides that came out of this work were actually synthesized and tested, and we found some working, brand-new antimicrobial peptides. Another domain we only started looking into recently is applying generative models to the various tasks software engineers perform, such as coding, documentation, or testing. In this case we are dealing with language models, which, as you know, have become very powerful in recent years. In this experiment we explored user experiences for assisting software engineers with code translation from Java to Python, and the envisioned use case is application modernization of legacy applications; as you can imagine, taking old code and moving it into a modern architecture is a time-consuming, expensive, and very tedious task, so this is very interesting from a business perspective. This uses a neural machine translation model from 2020; I have a reference on the bottom here. We conducted a design probe with 11 participants through semi-structured interviews to understand their attitudes towards automated code generation and to get feedback on the user experiences we designed. Just to give you an example of what was shown to them: user experience B shows the TransCoder translation plus confidence highlighting where the model was less confident in the translation; user experience C shows the TransCoder translation, confidence, and alternate translations. Aside from the detailed feedback on the user experiences, there are a few interesting findings I'd like to highlight. Very similar to the AutoAI experiments that I talked about earlier, trust came up as an issue here.
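The latent-space exploration described for the peptide work boils down to interpolating between the latent codes of two known good designs. Here is a minimal numpy sketch of that step; the latent vectors below are random stand-ins, whereas the real system gets them from a trained generative model's encoder and decodes each candidate back into a peptide sequence.

```python
# Sketch of latent-space interpolation between two "known good"
# designs. Real systems encode designs into latent vectors with a
# trained generative model; here the vectors are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
z_a = rng.normal(size=16)   # latent code of known good design A
z_b = rng.normal(size=16)   # latent code of known good design B

# Candidates along the straight line between the two latent codes.
steps = np.linspace(0.0, 1.0, num=7)
candidates = [(1 - t) * z_a + t * z_b for t in steps]

# A chemist would now decode each candidate and inspect its
# predicted properties to narrow down the selection.
print(f"{len(candidates)} candidates generated")
```

The endpoints reproduce the two known designs, and the intermediate points are the novel candidates the expert then filters with domain knowledge.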
However, in contrast to the AutoAI work that we did, our participants here were not really interested in understanding how the models function under the hood; they argued that trust should be established in a similar fashion as if the AI were a team member, by verifying the work through code reviews. And similar to a human peer to whom they would give feedback during code review, our participants also raised the expectation that an AI system needs to learn from the feedback they provide, so it would become better at its job in the future; it's almost like teaching a junior software engineer. Well, the algorithms we used here were not capable of doing that. Third, we also observed a relatively high tolerance towards mistakes made by the model: as long as the generated code saved them time, they expressed willingness to review and improve upon it. And lastly, beyond code translation, our participants also came up with other areas where they envisioned AI could help them generate code, for example in creating higher-level structural system code so they could get a head start on building a system from the ground up. That is really interesting since it illustrates how the role of the AI changes more into a creative partner, rather than being a mere translator. These are areas we are currently exploring in future work. As you can imagine, there are endless possibilities in this space, and it's really interesting; our work here has just been scratching the surface of this exciting domain, and there are tons of opportunities to explore the notion of human-AI collaboration in it. If you're interested, I'd like to point you to two workshops. On the first topic, Lydia Chilton from Columbia, Mary Lou Maher from UNC Charlotte, Justin Weisz, and I have been organizing the second workshop on human-AI co-creation at IUI this year, and that's where I met Daniel.
We're currently working on the third edition of this workshop, which we are submitting very soon, hopefully; if you're interested, the proceedings of the first and second workshops are available on the website. I'd also like to point you to another workshop, on machine learning for creativity and design, that takes place annually at NeurIPS. I'm not sure it is going to happen again this year, but this is a very interesting workshop too, one that focuses a little bit more on the artistic and design side of generative models. So, let me quickly wrap up; I think I'm slowly running out of time. I've given you a few examples today from our research portfolio that demonstrated how AI can be effectively integrated into workflows to increase productivity in specialized domains, and I've shown you this in the data science domain. An important aspect there is tackling the challenges of trust and transparency, which is important for adoption. Our work is now shifting more and more into exploring true human-AI collaboration, and I've given you a glimpse of our exciting work around human-AI co-creation with generative models. As you know, there's a large body of research from CSCW and computer-mediated communication that examines how technology helps humans collaborate effectively, and I'm actually sometimes wondering if it would be worthwhile exploring that body of research a little bit deeper to understand whether or not any of those findings are also applicable to human-AI collaboration. So, let me quickly go back to Tom Malone's quote from the beginning of my talk and complement it with another statement. All these increasing levels of automation enabled by AI are allowing us to design more of these collaborative partnerships between humans and AI systems.
I think we must not forget that the common thread to all of these systems today is the human element: people are critical in the design, operation, and use of AI systems, and we do have a responsibility to ensure that those systems are transparent, that they promote equitable outcomes, and that they respect our privacy. And last but not least, that they effectively serve our needs through this new partnership. So, a couple more pointers. These are resources I didn't have the time to talk about today, but our team also developed AI explainability design guidelines that are available online; I left you a URL at the bottom that you can explore. There are also major players in this space; you're probably very familiar with the industry AI design guidelines by IBM, Google, and Microsoft, and if not, I've left you a few URLs here too. And last but not least, I also wanted to point you to these resources: as part of our trusted AI agenda, there are a few toolkits and resources available. For example, AI FactSheets 360 is a body of work and a methodology to support governance and transparency across the entire AI lifecycle. We have our AI Explainability 360 toolkit, which also contains open source code that can be used for explainability, the AI Fairness 360 toolkit, and then, around AI security, we have the Adversarial Robustness Toolbox, which is also accessible. All right, so thanks so much for having me at your workshop today. I hope you enjoyed this, and I'm looking forward to connecting with you on Twitter. I've also left you another URL here, many URLs; this is a brand-new web page on human-centered AI at IBM Research. I also wanted to mention that we are hiring HCI researchers at the moment, so if you're interested, reach out to me. Thank you.