Let's get the show started. My name is Alexy Khrabrov. I'm the director of the Open Source Science community at IBM. Open Source Science is a new initiative that we started last year; it brings together open source developers and scientists. IBM Research is one of the most established industry research organizations: we have fundamental research going back decades in materials science, healthcare, and more. And the reason we set it up is basically that a lot of scientists are using open source tools to do scientific discovery. We established this at NumFOCUS. NumFOCUS is a sister organization to the Linux Foundation; it's a foundation for most of the Python data science stack. So, who here has heard about Jupyter? Yes. Who has heard about Pandas? And who has heard about NumFOCUS? Somebody, right? NumFOCUS is basically the least known of these foundations. It's a 10-year-old organization which is home to NumPy, SciPy, Pandas, Jupyter, scikit-learn. Everybody knows these names; nobody knows NumFOCUS. The Linux Foundation is doing a much better job at marketing itself. But we basically work with everyone: Open Source Science sits at NumFOCUS, but obviously, since IBM supports it, we as IBM are also members of LF AI & Data and the PyTorch Foundation, and IBM was basically instrumental in their creation. So we're very happy to bring the open source community to bear on science.

As for how this event came together: obviously, scientists use everything to accelerate their research. So when large language models burst onto the scene — I remember this, we were at NeurIPS, and IBM had a massive booth, as did many other companies — the ChatGPT announcement came during Geoff Hinton's keynote, where Hinton received a lifetime achievement award for deep learning. And I think it was really amazing. It was the biggest disruption in the field in many, many years, and obviously everybody started playing with it. At IBM Research, folks looked at what the implications of this are for science and the kind of fundamental research we do. There is a lot of work in areas which are not typically discussed in popular contexts. We have a lot of geospatial research: geospatial data, climate data, satellite imagery. We have a team which works on fundamental questions of geospatial data, and they apply LLMs to geospatial data. We also have teams which work in materials science, and that spans carbon capture materials, to arrest climate change, and it also spans healthcare and drug discovery. And obviously folks are asking how we can create new materials using generative techniques. I was on this project, the Generative Toolkit for Scientific Discovery, GT4SD, which we started in early 2022. So it actually preceded ChatGPT substantially, because GPT models and transformers had been around for a while before ChatGPT made them super famous. Our GT4SD toolkit was available; it was announced at SciPy 2022, it got the award for best open source tool for science, and people are using it. So we actually have some previous work in generative AI preceding ChatGPT, and IBM has a history of using AI and ML for science. That's where my hats overlap. And my colleague Tim Bonnemann here is also at Open Source Science, so there are two of us here.
We also support staff at NumFOCUS: there is now a program manager for Open Source Science at NumFOCUS. And Tim is going to talk a little bit more about Open Source Science later. These are just some photos from community events we did. That thing in the top right corner is LLM Avalanche in San Francisco. And that came about because, obviously, people want to follow the progress in the field of LLMs. I run multiple events in the San Francisco Bay Area; I started the Bay Area AI meetup, which is now one of the most established AI meetups in the world. We have about 5,000 members. And because we are located in the center of the technology world, we get the most recent updates from all the companies in Silicon Valley. It became a bit dormant during the pandemic, but we came back in the summer and we had really strong interest; LLMs really bring people together.

When the Databricks Summit happened at the end of June — we are partners with Databricks, I've known them since the very inception, since the creation of Apache Spark; the first Apache Spark meetup basically started at my meetup, and now Databricks is a data and AI company — we did this local meetup preceding the summit. And we just thought, OK, we're going to get some people together, maybe 100 people, get a room like this, and just have a discussion, like we have here. And it just snowballed: from several dozen people it quickly went to several hundred, and then it reached 1,000 people. From a few speakers, we ended up with 40 speakers, and we still had only four hours to do it. So I call it the biggest, deepest, shortest technical conference on LLMs.

The way we structured it, we wanted to focus on interesting technical topics, because a lot of discussion about ChatGPT focuses on sci-fi issues and ethics issues. Is AI going to kill us? Is it going to take our jobs? And I'm kind of saying: guys, OK, we know it's going to kill us, we know it's going to take our jobs. We want to know how. How is it going to happen? What is it going to do? Let's eschew the philosophical discussion and see if we can focus on technical topics. And that's especially true for the developer community. In our meetups in the San Francisco Bay Area, everybody is a developer. Developers come from hundreds of startups; developers come from all the FAANG companies — Google, Amazon, Microsoft. Almost any company has an office there: Toyota Research is there, Volkswagen Research is there, Alibaba is there. You name it, they have an office in Silicon Valley. And most people are either developers or data scientists or product managers, and the product managers want to know what products developed with AI will look like. So these are the themes we decided to focus on at the LLM Avalanche event in San Francisco. Obviously we have a much smaller group here, but I think these are the topics we want to initially discuss, and you are very welcome to join in. I think the plan is, since we have a few people here, and people may come and go during the day — we have more people registered who are not here yet — what I think we can do to make it most efficient and useful is run it as an unconference. So I'll start with a few topics here, then we'll have some folks give presentations: Nick from the OSI will describe open source AI, and Tim will talk about Open Source Science.
And I think we also have some folks who planned to give talks; I've seen some of them reach out to me. So, anyone here who wants to give a talk: if you want to come up here — does anybody have a presentation besides Nick, Tim, and myself? Nobody here? We do actually have some speakers who want to come and give a talk, so maybe they'll come later. But even if you don't have a presentation, let's just do a quick poll of the room. Do you work on LLMs? Who already works with LLMs in your workplace? And who follows LLMs actively, who is in the know? OK. So you don't need to be a leading practitioner: if you have a topic of interest you want us all to discuss, feel free to come up here and start a discussion. The only requirement, because we have a live stream, is that we have to have somebody here in the frame, so we'll need to work around that, because people who join online need to be involved. We'll figure it out. Normally, when we don't have a live stream and have a small group, we would just sit in a circle. So that's another option: we can sit in a circle if that works, so I can be on stage and in the frame, and everybody else can be in a circle, like Tim and I. Let's figure it out. But let me first cover some of these topics.

So generally, why do we have this discussion? Why do we have these meetings? Information is falling on us every hour, every day. We called the event LLM Avalanche because it's like an avalanche: an avalanche of information. It falls on you and it will bury you if you don't have a strategy for dealing with it; it's just too much. And that's just inevitable. The pace of scientific discovery has increased; there are a lot of labs cranking out papers. That started, I think, with deep learning: with deep learning you saw the papers start pouring out, and it's really hard to absorb the progress. And if you work in industry — we work at IBM, you work at Huawei, at a big company — you need to be on top of what's going on, and it's very difficult because there is so much stuff. You can go read blogs, you can read Twitter, you can follow OpenAI, Andrej Karpathy; we all have some interesting sources of information we go to, but stuff just bursts out. Like recently: there is this thing called Open Interpreter — who has heard about Open Interpreter? On September 5th, this guy from Seattle publishes Open Interpreter, which is basically like OpenAI's Code Interpreter but running locally on your machine, and you can run a model locally. And in one week — one week — the project becomes the number one trending project on GitHub for the whole week, 20,000 stars. So literally you can start a movement in a week, and you can probably build a company on it. And it's just some guy in Seattle — Killian Lucas, right? It's really interesting how this happens. And this is just indicative: disruption can happen very quickly, and we all need to understand what it means.

So what I found running communities is that you can go and read all you want, but really you need to get together with your colleagues and the community. If you have a lot of people looking at this, then you have a much better picture, because everybody has some piece of the puzzle, everybody has an angle. And when you get together, you can establish a baseline: what is the current understanding in the field? So first of all, I think it's important, right?
Because as humans, you can be inundated with information online, but we have a limited capacity to process it. So how are decisions made? As a community, we decide where this field is going. Regardless of how much information is coming in, there is a process by which the industry moves. For instance, a conference like the Open Source Summit is a very important checkpoint: we get together, the state of the art is presented, you hear from keynote speakers, you hear from leading developers in industry, and then you understand where different players are. Startups can move very fast; corporations move slower — it takes them a certain time to absorb and judge and decide. So a meeting like this is very useful, because we can compare notes and compare understanding. I see folks from Fujitsu, Huawei, IBM: these companies are big, they move slower, although now they have to move faster. What normally takes years should now take months. You have to adapt to the pace of change. That's why I think it's very useful to have a community meeting.

Another point I will make is related to the evaluation of these models. So what do we have? We have data: different models are trained on different data, and generally we have an idea that they're trained on basically everything available on the internet so far. Some models don't disclose their training data sets, but generally we know it's some kind of web crawl, it's Stack Overflow for code, it's arXiv for scientific papers, it's Wikidata. We know the general sources of information. Some of this is problematic — even public data can be problematic. An example, obviously, is the Enron data set. Who has heard about the Enron data set, the Enron emails? Enron was a court case in the United States: Enron was basically an energy company which messed up, went bankrupt, and ended up in court, and as part of discovery the emails from Enron were made public. In order to prove that Enron was at fault, the discovery process made the internal emails available; they were put in the court record and became public. So now, when people want to understand how people in a corporation email each other — what a company email looks like, what the dynamics of internal communication are — the Enron data set is public, and obviously some of the models were trained on data sets which included the Enron email data set. So when people ask questions about company communications and so forth, they might suddenly get back emails of people at Enron, which look like private information because they contain real emails, names, and locations. If you're not familiar with this, if you don't know the history of it and you are, say, a business user using one of these models — and this actually happened — you will suddenly get what you believe is private information. You will think that this model is leaking private data, and if you don't know where it's coming from, you may freak out: if it's leaking these emails from Enron, from some company you don't know about, maybe you think it's related to you, maybe your customers' emails are going to be leaked too.
So people are actually very surprised by this if they don't know the history. This is just an example of why you can't assume this data is safe. Actually, in the early days of ChatGPT, a lot of people asked questions like: give me Amazon passwords, give me SSH keys. Because a lot of students are careless: they do a git commit and check in passwords, files containing passwords, files containing keys, and professors would lose tens of thousands of dollars in Amazon or Azure credits because a student checked in credentials, crypto miners look for them and grab them, and before you notice, they've run up the bill on a lot of compute. This is very typical. So if you basically slurp up the whole internet, you find that a lot of the data is a security risk, and you don't know until you start looking at it. So when we look at training data, we can't assume it's clean, we cannot assume it's safe; we should assume it can hurt us. If you are a researcher at a university, it will probably not hurt you, but if you are an enterprise and you deploy an open source model without making sure it's safe, it may start leaking sensitive data without you knowing, and you may be liable. Generally speaking, enterprises don't want to be liable, especially global enterprises: if you do business in multiple jurisdictions, you don't know what it means if this happens. So the general stance is to hold back, and I think a lot of progress is being held back right now because we don't know what these models are going to spill.

Another question is: if the model is going to spill this data, what are you going to do about it? What is the feedback loop which will let you correct and mitigate the security risk? At IBM we have a very good effort building a data lakehouse, which is used to train our watsonx models, and our colleagues at Almaden basically do a very good job by letting you flag issues. The general approach is: if you have documents, if you have some leakage, if you have some objectionable output, you should be able to trace it back to the training data set. And once you trace it back, you should be able to decide whether you want to clean it up and remove those documents, and then you need to retrain. The question now is that training is expensive, so we need architectures which let you flag and remove objectionable training data and retrain very efficiently and incrementally. The training itself will have to become much more efficient to support this: you shouldn't have to run the full training if you remove one document, or one paragraph, or one string.

So that's data. Models — I think most of the focus now is on models. We basically see open source models coming online. I think this is the most well understood area right now, and the attention is focusing on models, so that's why I really think it's super important to remember it's just one piece of the puzzle. The models are what make it interesting, and a lot of work goes into the models. But when people talk about the models, they often forget about the data. You don't see as much talk about the data; people don't even want to focus on it. They just don't have access to the data, they don't know — OpenAI doesn't disclose this, right?
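Just to make that data-side point concrete, here is a minimal sketch, using only the Python standard library and made-up regex patterns, of the kind of pre-training scan and flag-and-remove step described above. It is not how the watsonx lakehouse or any particular pipeline actually does it; the document IDs, patterns, and thresholds are all hypothetical, and a real system would use much more robust PII and secret detectors plus human review. The point is only the shape of the idea: flag documents that look like they contain emails or credentials, keep a traceable record of why each one was removed, and filter them out before the next (re)training run.

```python
import re

# Hypothetical, illustrative patterns -- a real pipeline would use far more
# robust PII/secret detectors, not just a few regexes.
RISK_PATTERNS = {
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "aws_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def flag_document(doc_id: str, text: str) -> list[dict]:
    """Return a list of flags explaining why a document looks risky."""
    return [{"doc_id": doc_id, "reason": reason}
            for reason, pattern in RISK_PATTERNS.items()
            if pattern.search(text)]

def filter_corpus(corpus: dict[str, str]) -> tuple[dict[str, str], list[dict]]:
    """Split a corpus into (kept documents, flags for removed documents).

    The point is traceability: every removed document keeps a record of
    which rule removed it, so an objectionable output can be traced back
    and the decision revisited before the next incremental retrain.
    """
    kept, all_flags = {}, []
    for doc_id, text in corpus.items():
        flags = flag_document(doc_id, text)
        if flags:
            all_flags.extend(flags)
        else:
            kept[doc_id] = text
    return kept, all_flags

if __name__ == "__main__":
    corpus = {
        "enron-0001": "From: jeff@enron.com -- please see the attached memo ...",
        "repo-readme": "To build the project, run `make all` and then `make test`.",
        "leaked-notebook": "aws_access_key_id = AKIAABCDEFGHIJKLMNOP",
    }
    kept, flags = filter_corpus(corpus)
    print("kept:", list(kept))   # only the harmless README survives
    print("flags:", flags)
```

Of course, the hard problem raised above — retraining efficiently and incrementally after removals — is not touched by a sketch like this; it only covers the flagging and traceability half.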
But we do need to remember the data side. So now we have the models, and there is a whole set of questions about the models. There are closed source models and there are open source models. The models are big, so you need to host them. There is a whole set of questions about how to run and access these models and how to compare them, and I'll talk about that a little more later. But I think this is the most common topic, so it's the best understood.

Now, applications. Applications, I think, are currently talked about very little. We talk about LangChain, we talk about LlamaIndex; there are basically several examples of application frameworks, but I've not seen an actual production deployment. Have you seen any production deployment of generative AI used for business? No, right? This is a very strange thing if you think about it. Maybe it's just very early: since November, December of last year until today — soon it will be a year — we don't have any obvious, world-famous example of a production deployment. There is a lot of talk that AI is going to replace customer service departments, it's going to replace tickets, it's going to replace Zendesk, it's going to replace call centers. We don't have any clear and obvious demonstration. There are some reports — some startups say they fired their customer service team and replaced it with LLMs — but I don't really see that as clear proof; you don't see a recipe for how to do this. And I think there are many reasons for this. One reason is what I described before: big enterprises are generally very slow when it comes to legal liability; they want to understand a lot of issues first. But I think the application question will actually become the main question of the next year or two.

If you look at what the LangChain folks are doing, they're basically charging ahead. Their model is: we don't know what OpenAI is going to do next, but we need to make these models useful today. So they make it easy to query the API and interact with the model, but you also need to do a lot of things around it. You need to work with prompts, so prompt engineering is now a thing: you manage the prompts, you sequence the queries, you need to understand what happens, how to look at the results, how to filter and evaluate results. One-shot, few-shot, reasoning, and then obviously retrieval-augmented generation — a "RAG application" is now a term. All of this is a lot of engineering. And to me it all looks like it's not science; it's not even quite software engineering. It is like software engineering, but it's really a craft: we find these things by trial and error, and the practice evolves to create these patterns, these best practices. So basically, LangChain is a collection of best practices. And if you look at what they're doing, like prompt obfuscation: if you want to put this into the enterprise, you want to safeguard your prompts. If you use prompts on behalf of the user, you shouldn't make it easy for an adversary or a malicious actor to reverse engineer and understand how the thing is made. So I think there is a huge risk now that if you have a question-answering system, it can basically reveal what other people are asking it and why. And in an enterprise setting, you have multiple customers.
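Since retrieval-augmented generation just came up, here is a minimal sketch of what a RAG-style flow looks like with no framework at all: naive keyword overlap standing in for a real vector store, and a stubbed generate() standing in for the actual model call (OpenAI, watsonx, a local model, whatever). Every name in it is made up for illustration; frameworks like LangChain and LlamaIndex wrap exactly these steps, plus the prompt management, sequencing, and filtering mentioned above.

```python
# Minimal RAG-style flow: retrieve -> build grounded prompt -> generate.
# `generate` is a stub for a real LLM call; `retrieve` is naive keyword
# overlap instead of an embedding index. Both are placeholders.

DOCS = [
    "Refunds are issued within 5 business days of receiving the returned item.",
    "Orders can be cancelled free of charge until they have shipped.",
    "Our support line is open Monday through Friday, 9am to 5pm CET.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the question; keep the top k."""
    q_words = set(question.lower().split())
    ranked = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Ground the model in retrieved context and constrain its scope."""
    joined = "\n".join(f"- {c}" for c in context)
    return ("Answer using ONLY the context below. If the answer is not in "
            "the context, say you don't know.\n"
            f"Context:\n{joined}\n"
            f"Question: {question}\nAnswer:")

def generate(prompt: str) -> str:
    """Stub standing in for the actual model call."""
    return f"[model output for a prompt of {len(prompt)} characters]"

if __name__ == "__main__":
    question = "How long do refunds take?"
    context = retrieve(question, DOCS)
    print(generate(build_prompt(question, context)))
```

The "answer only from the context" instruction is also where the craft shows up: it is exactly the kind of prompt you would want to safeguard, because it encodes how your system is built.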
Coming back to that leakage risk: you don't want it to happen. So a lot of the thinking is about how we make this safe — safe in an actual real-world deployment. And I think the current thinking is extremely ad hoc: people are just hypothesizing what the threats might be and trying to protect against them. We have some reports of failures — at some point ChatGPT started leaking other people's sessions, other people's questions; that was a failure of this setup — but that's an example of what practitioners are safeguarding against.

Generally, here is where I think the progress will happen, and you are very welcome to discuss this; it's my hypothesis. If you work with enterprises — IBM works with large companies — you go to an enterprise and it already has a lot of systems. They have customer service systems, they have orders, they have finance, they have analytics; if they have web applications, they have all kinds of observability applications. Now you want to replace a piece of this. And there are humans doing all kinds of jobs: humans do analysis of their web applications and e-commerce, humans do customer support. There are humans behind a bunch of these systems. So the idea is that you're going to replace the humans, or augment them, with LLMs, because humans are basically looking at a sequence of events and making predictions, and this is where LLMs are much better and faster. But in order to bring LLMs into the loop, you need to hook them up to all of these systems. So instead of just making them understand text, you need to make them understand sequences of orders, or sequences of transactions, or sequences of clicks, or observability data. You now have different kinds of sequential data. And all of these systems are currently a bunch of legacy stuff, pretty unwieldy and bespoke; even if they were open source at the beginning, they have accumulated a bunch of cruft and accretions on top. And in an enterprise they talk to Salesforce, they talk to Oracle — there's a whole bunch of integrations. So the enterprise is a giant blob of software integrated with bespoke scripts, running on some kind of infrastructure — maybe Kubernetes, maybe on-prem — but it's a very complex system. Now you want to bring LLMs into that system, and you need to understand how. If it's a service in a microservice architecture, you need to integrate that service with all the other services, but now your service wants to run an LLM: it wants to make decisions. So I think what we'll see in the next year is people actually struggling with that. It will be much harder than producing marketing copy; it will be much harder than generating text. And again, if you make automatic predictions about what to do with enterprise systems, you need much better control and validation: you need to understand the risk. If your action is automatic, you need to be very sure that the action is correct, and if you have doubts, you need a process where you put it in front of humans, or a second level of review. So human-in-the-loop architectures will be very interesting.
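To make the human-in-the-loop point concrete, here is a minimal sketch of the kind of confidence-gated routing just described. The threshold, the event shapes, the model call, and the review queue are all hypothetical placeholders, not any particular product's API; the only idea it illustrates is that high-confidence model proposals get executed and everything else falls back to human review by default.

```python
from dataclasses import dataclass

# Hypothetical confidence threshold below which nothing is automated.
AUTO_APPROVE_THRESHOLD = 0.90

@dataclass
class Proposal:
    """An action the model proposes on an enterprise system (e.g. a refund)."""
    action: str
    confidence: float   # assumed to come from the model or a separate scorer

def propose_action(event: dict) -> Proposal:
    """Stub standing in for an LLM reading an event stream and suggesting
    the next action. A real system would pass orders, tickets, clicks, or
    observability data to the model here."""
    if event.get("type") == "refund_request" and event.get("amount", 0) < 50:
        return Proposal("auto_refund", confidence=0.97)
    return Proposal("escalate_to_agent", confidence=0.40)

def route(event: dict, review_queue: list[Proposal]) -> str:
    """Execute high-confidence actions; send everything else to humans."""
    proposal = propose_action(event)
    if proposal.confidence >= AUTO_APPROVE_THRESHOLD:
        return f"executed: {proposal.action}"
    review_queue.append(proposal)          # human review is the default path
    return f"queued for human review: {proposal.action}"

if __name__ == "__main__":
    queue: list[Proposal] = []
    print(route({"type": "refund_request", "amount": 20}, queue))
    print(route({"type": "refund_request", "amount": 5000}, queue))
    print(len(queue), "item(s) awaiting human review")
```

The interesting design question, of course, is everything this sketch hides: where the confidence number actually comes from, and how the reviewed cases feed back into the model.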
And I really don't see anybody yet who thinks about this properly. There is no public thinking about how to integrate these systems into the enterprise — at least none presented publicly. And remember, we had digital transformation for a decade. I think the digital transformation we talked about was in its infancy; it was a toy transformation compared to what is coming, because the digital transformation of 10 years ago was the digitization of the enterprise: how do you put records into databases, how do you put in transactions, and so forth. This is all more or less solved. But from the standpoint of the future, this was just the preliminary step. If you really want to disrupt the enterprise, digital transformation 2.0 will be taking a lot of human decisions that can be automated. A lot of low-level activity can be replaced, like customer service lookups of documentation. But also, if you look at middle management, middle managers are basically human routers: they route strategy into implementation. And if the implementation becomes automated, the middle management function as a router becomes obsolete, because instead of controlling several humans doing the busy work of bureaucracy, you replace that with LLMs so the humans can do something more interesting. The humans become elevated: they look at the more interesting cases which require human review. So managers will probably have far fewer, and very different, humans to manage. I think the actual digital transformation will take whole parts of the enterprise — computer systems, management, and humans doing repetitive tasks — and transform the whole thing, with far fewer humans doing better, higher-level work, models doing most of the rest, and, in risky cases, decisions percolating up for human review, which should itself become much better.

So that brings me to the fourth point: community validation. All of these questions — are LLMs safe, are they performant, do they do what you want, can I trust them, can I follow their recommendations — cannot be decided, I suggest, by fiat. You cannot be IBM or Huawei or Facebook and say: my model is safe, my model is performant, my model is the best for business. No single company can say this, because it's a unilateral claim, and you cannot take anybody's word for it — if the model goes and spills the data and puts you at risk, you cannot trust it. So what we should do as a community is actually develop methods to validate all these claims. This has happened before with machine learning performance. How many of you are familiar with MLCommons? MLCommons is a community organization, also a non-profit, which runs the MLPerf benchmark; it was set up around MLPerf. MLPerf is a benchmark that measures the performance of machine learning infrastructure and applications. It was put together because different cloud providers started saying: my cloud is the best for machine learning. Amazon, Azure, Google would say: come to us and run your deployment here. And as an industry, you cannot adjudicate these competing claims without putting together some measures, some measurable benchmarks.
So they came together and developed the MLPerf benchmark, and MLCommons is the umbrella non-profit which hosts MLPerf. On the board of MLCommons they have all of these cloud providers — Huawei and Alibaba and IBM and Google and NVIDIA and so forth; everybody who makes chips and everybody who runs clusters is generally there. So now you can come together, agree that this is the technical way to compare these claims, and run the benchmark on your cloud. Obviously you need a controlled environment: you need to understand exactly what your CPU and GPU are, what your bandwidth is; you need to know exactly what you have, and then you can compare. So the industry solved this more or less for the hardware and software stack. Probably in a similar fashion we can do benchmarks for LLMs. When it comes to performance, we can probably come up with similar things, and people already do. However, the questions of trust and security are much harder.

Trust is a very sensitive issue, because what a model says about the world can affect you, and obviously what's acceptable to say in the US and in China is different, or in Canada or in Spain. So if you start querying the model for general world knowledge, or if it starts offering opinions, you may find that you need to be sensitive to what is acceptable and legal in the jurisdictions where you run your operations. That's why it's hard; the topic is difficult to quantify. One way to attack it is to restrict the scope to the business problem at hand. You don't want to talk to the model about philosophy; you want to talk to the model about orders and deliverables and customer satisfaction. But currently the models are all-encompassing, comprehensive. So some directions of research are saying: we don't need a giant GPT for customer service, we need smaller specialized models — they call them SSMs, small specialized models. Every factory would have its own model: a conveyor robot assembling cars does not need to know about history, it needs to know about cars. So again, the point is that trust and security of models depend on the scope of the models, and they also depend on the context. Trust is not an abstract question; trust is defined by the society and the legal regime where you are located. Trust is basically a two-sided market: the provider cannot operate without the consumer, you need to understand who the consumer is, and trust is defined within that two-sided market.

So I think community validation is extremely important, and community validation will be done by people like us, meeting together and deciding what works and what doesn't. That's why we really want a very strong community around this. And I think that, as the Linux Foundation, we need a standing community for generative AI — and that is what is going on: at the LF AI & Data Foundation there is a concerted effort in this direction, and I think this summit will see some substantial progress in this space.

Okay, so I talked about that. This is what I talked about earlier: why and how we did this, and what we're going to do here. I think that's all I have. These are the topics; I've put some points on the landscape just to start the discussion.
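One last concrete aside before we open it up: at the smallest possible scale, community validation just means a shared, versioned set of prompts and checks that anyone can run identically against any model and publish the results of, in the spirit of what MLPerf does for hardware. The sketch below is purely illustrative: the two "models" are stubs standing in for real APIs or checkpoints, and the keyword checks are a stand-in for real graded metrics, but the shape — fixed eval set, same harness for every model, comparable scores — is the point.

```python
# A toy, reproducible evaluation harness: a fixed prompt set with simple
# checks, run identically against every model. The "models" below are
# stubs; in practice each entry would wrap a real API or local checkpoint.

EVAL_SET = [
    {"prompt": "A customer asks for a refund after 45 days. Policy allows 30 days. Reply.",
     "must_mention": ["30"]},
    {"prompt": "Summarize ticket #1234 in one sentence.",
     "must_mention": ["ticket"]},
]

def model_a(prompt: str) -> str:
    return "Per our 30-day policy we unfortunately cannot refund this ticket."

def model_b(prompt: str) -> str:
    return "Sure, refund approved!"

def score(model, eval_set) -> float:
    """Fraction of prompts whose output mentions every required keyword."""
    passed = 0
    for case in eval_set:
        output = model(case["prompt"]).lower()
        if all(word.lower() in output for word in case["must_mention"]):
            passed += 1
    return passed / len(eval_set)

if __name__ == "__main__":
    for name, model in [("model_a", model_a), ("model_b", model_b)]:
        print(name, score(model, EVAL_SET))
```

The trust and jurisdiction questions discussed above are exactly the part such a harness cannot settle on its own: the eval set itself has to be agreed on by the community and by the jurisdictions it is meant to serve.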
I hope you see this as a frame. But since there are only a few of us here, maybe we can do some introductions. Maybe everybody can spend a couple of minutes and say: why are you interested in LLMs? What do you want to see today? If you want to spend the whole day here, or part of the day, what do you want us to do? It's basically up to us. We have a few talks by folks who are here, and we may have more talks, but let's do a round of introductions and see what we want to achieve today. I've already talked a lot, so maybe Tim will go next. Where's the mic? Yeah, let me give the mic to Tim, and Tim will bring the mic to you.

Hi everyone. My name is Dave Lai from Huawei Canada. I'm on the platform engineering team, doing ecosystem development. Today I'm trying to see how the community matures into this new area of large language models. The reason I'm interested is that I think this will change the way software is developed. My team basically works on tools — software tools to help the company develop software. ChatGPT has already demonstrated the capability to generate code and things like that, and to explain code. So I'm trying to see how this is going to impact the software engineering process, and maybe beyond that. Hopefully I can learn more today. Thank you.

Hello folks. My name is Nick Vidal. I work with the Open Source Initiative. This is the nonprofit that coined the term open source 25 years ago. There has been a lot of talk about open source AI, but there's no clear definition of what exactly that is. And the OSI is bringing together several members from around the world to try to come up with a clear definition of what open source AI actually is. Thank you.

And maybe you can give an update on that later, because I know you've been meeting with a lot of people over the last few months to make progress on that. That would be very interesting to hear. Who would like to go next? Can I ask you to... Sure.

Hi, good morning. My name is Hiro Kobashi from Fujitsu. We are here to gather information about large language models. As you said, companies are very much looking at how to utilize LLMs, and there are not many open source projects in that space, so how we can utilize such projects is one of our big targets. I'd like to get more information about that; that's the reason we are here. Thank you very much.

Thank you. Hi, this is John Gorostiola from General Electric. In our department, we are interested in developing some AI. Actually, I'm interested in any kind of AI, but our main, more practical application would be a chat, and that's why I'm here for the LLM, large language model part, and I want to learn more. I am at the beginning of my AI learning, but I wanted to figure out, by learning a little bit more, how to approach the application we want to develop — maybe an LLM, or an SSM as you mentioned. So that was my aim. Thank you.

Anyone else? Carlos Clemente here from Exclusive PN. I work on the platform team, or developer experience — kind of the same thing. We're trying to develop an internal platform right now for our developers. And as an LLM user myself, I want to use...
Or rather, I want to find out how I can help our developers internally to overcome some of these issues. Sometimes it's documentation that they can't find, and obviously you cannot go and ask ChatGPT that, because it's specifically about how we run or how we build things internally. I think it could also make it easier for new members of the team to get trained, because all this documentation exists, but sometimes it's difficult and people waste time trying to find where everything is. And even within teams, every team may have different standards or ways of writing the documentation, for example. So I think LLMs can help with that gap internally in the team. So, yeah. Thank you.

My name is Tim Bonnemann. I am the community lead for Open Source Science. It's a new initiative, as Alexy mentioned, that was launched last year to accelerate science by improving the way open source software gets done in science. We're starting out with three domains: chemistry and materials science, healthcare and life sciences, and climate and sustainability. One thing I'm particularly interested in is that there's a lot of experimentation going on using LLMs to, as Alexy mentioned, explore new materials or new drugs, and to come up with better ways to do science by having these LLMs suggest targets for exploration or hints for how to study materials or these processes. And of course the people we engage with are very committed to using open source tooling, so we're interested in what the open source landscape could look like for scientists, specifically with regard to LLMs.

Anyone else? My name is Christine Abernathy and I represent F5. What I'm interested in is learning a little bit more about open source LLMs, especially issues around copyright — that's not really specific to open source — but also thinking about, to your point, what open source AI licensing is. What does it mean? So that's one point of interest, and also learning a little bit more about the security aspects and what is specific to open source: what's different about security with LLMs that are open source versus proprietary? So just thinking about those types of topics. Excellent, thank you.

Alexy, did we have everyone? Yeah. Did you want to go over logistics, like the lunch break? Let's do that; I'll take this one. So basically, the plan is this: since we have few people, we're not going to have breakout sessions; we're going to stay as one group and discuss. I think we have a couple of talks: one from Nick about open source AI, which I think is a very useful discussion, so we can have Nick's talk and a discussion following it. Then Tim will talk about Open Source Science as an application area where LLMs can make a difference, and we'll see what follows. The current plan is to have a lunch break at one p.m., in two hours, from one to two, and then reconvene here, if that works. And just in case, there is coffee and water and some small snacks on the terrace, so I think we have some things to keep us going. So let's probably do this: let's run until lunch and find a good point to break around one p.m., and then we'll see. I think some people are still coming — a lot of people registered who are not here yet, so they're probably arriving later in the day. I know for sure some people are going to join after lunch, so we can reassess then. Okay, cool. So Nick, you have your laptop, right? Okay, let me...