 All right, I think we're good. Andre, sorry. Go ahead. Yeah, I'll try. Thank you. How would I do that? Just fine. OK, thank you very much. So I apologize I have a cold today, so your experience will not be uniform throughout the talk listening to me. So I'm Andre Lapitz. I'm a computer science faculty at Boston University, and I also direct the software and application innovation lab. I'm also involved in something called the Hariri Institute. So I'll tell you about some of these. I assume that one of the reasons I was invited here, as Hugh mentioned, was to introduce BU and what's happening at BU to everyone. Now I should make the disclaimer that this is nowhere near everything that's happening at BU. This is just things that we're involved in or that I've been involved in, or that I'm peripheral to, that I'm aware of. But hopefully it'll give everyone an idea of the kinds of things that are happening at BU that are related or would benefit from open source or where open source is kind of incipient in certain communities. So there's just an overview. First I'll talk about the Hariri Institute and the software and application innovation lab. And then I'll go over a couple of examples of work going on at BU that involves open source starting to involve open source that, again, we're involved in. And I'll talk a little bit about the challenges and advantages and so on of trying to bring open source as well as bring software engineering into an academic environment in the way that we are currently doing it. So the Hariri Institute for Computing was founded about six years ago now to help researchers across Boston University who have a software engineering, computer science, data science, computational component to their research all the way from biologists and physicists all the way down to even school of theology. And we do work with school of theology and the humanities and things like that. 
And the Hariri Institute supports these kinds of efforts through seed grants, through recognition of faculty and graduate students through fellowships. But it also hosts sort of a federation of different centers, labs, and initiatives. Some of them it incubates. Some of them it sort of took on after it was formed. And many of these are around specific topic areas like artificial intelligence. Others are meant to promote things like digital education or software engineering, in the case of the software and application innovation lab, as well as other sort of areas of interest or focus. So software engineering within an academic environment is something that it seems is starting to become recognized as a possible career in a way that it wasn't before. So what I'm talking about here is the standard career tracks involved. You become a PhD student, then you become a postdoc, then you become a faculty member. And those kinds of tracks in academia are well-recognized. And there's broadly well-organized structures for people to progress through those throughout their career. Not so much for research software engineering yet. Although, if you look online, you can see that these things are starting to improve. You have in the United Kingdom, for example, you have these sort of trade groups and associations and conferences for research software engineers who actually want to build a career doing software engineering and bringing best practices from software engineering into the academic environment. And you have other examples in the United States where, for example, the Flatiron Institute, I think it's called, was founded in New York City to help researchers with sort of computational challenges. And it's a very large, well-funded group. So things are starting to happen. And we're basically trying to do that at Boston University. So about three years ago, with this goal in mind, we started the Software and Application Innovation Lab at Boston University. 
And the idea here was to augment what the Institute was doing in helping researchers and faculty members and students introduce computational and software engineering elements into their research. We grew very rapidly just because there's a lot of demand for this kind of thing. Initially, it was primarily to support the seed grants where the Hariri Institute was supporting these kinds of efforts. But it turned out there's a lot of researchers who have funding, and I'll talk about this a little bit later, who have funding from external funding sources where academics typically get the bulk of their funding. So these are national organizations like the National Institutes of Health or the National Science Foundation, as well as foundations and other sources, DARPA, IARPA, and so on. And in this case, what was very interesting is that it also allowed us to take existing research efforts that maybe weren't able to get funding from those sources, because those organizations themselves are starting to recognize that you need to actually support software engineering as an end of itself in many fields if you're going to actually improve those communities. And you'll see some examples of that. Improve the way those communities work together and the way they build tools and so on. So basically, because the Hariri Institute is kind of a hub at the university and talks to researchers all around about the research they're planning to do or the research they're currently doing, then it really puts the Institute in a good position to promote things like open source approaches. So it's true that we've observed in many cases over the past several years situations where outside contractors were being hired, and basically they were rebuilding the same thing over and over again. 
In some cases, it's what you might arguably call a waste of federal funding, but it's simply because people weren't familiar with the fact that you can actually take a piece of software and separate it into an open source component, which is reusable, and you can probably reuse it not just within one group or within one project, but across projects. And you can also take your proprietary research, or whatever it is that you're building, and isolate it so that it becomes a configuration or an instantiation or a parameter to that open source platform. So you can still have your IP, but you'd be able to build this kind of infrastructure that's reusable. And it also puts us in a good position to promote best practices: documenting your code (we've had experiences where we had to rebuild things from screenshots), using CI/CD, making sure that you use good version control, and things like that. These are things that we're able to promote and actually teach people about throughout the BU community because of the position that we're in, because we're working with so many groups and talking to so many groups. One thing that I will say, though, is that a lot of the things that I'll talk about and that we do are application-level development. There are many groups, and you'll hear from other groups at BU, who do other kinds of work, but we specifically focus on applications and things at the application layer: frameworks, libraries, web services, full-stack applications and so on. I'll mention some things that were also production quality. So we've built things that are production quality and had to be deployed. Other things are more like prototypes, or things that are facing users who happen to be researchers. So I'll go over four topic areas that I chose where it's exciting to see open source approaches being adopted. One of them is experimental design and automation in synthetic biology.
So synthetic biology: there's lots of things going on there, but one prototypical workflow that I'll focus on, and that there are projects funding work on, is this idea that I'm gonna build genetic components that essentially implement logical, programmatic kinds of algorithms, and I'm going to explore different designs to see which ones actually have the behavior that I want. An example application of this is something like: I wanna create some kind of organism that will detect contaminants in water, maybe by glowing or something like that. You can actually go online right now and buy a kit to do genetic engineering. You can do it at home. You just need to order one online. So these kinds of things exist, but here we're talking about automating this on a mass scale. Basically, you're going from a design to, hopefully, some kind of automated infrastructure: APIs all the way down to machines that are actually gonna run the experiments and synthesize biological components that have certain functionalities. What's really interesting is that, for example, DARPA is actually funding an effort called Synergistic Discovery and Design. BU is involved in a broader group that includes the Broad Institute and MIT and others, and Doug Densmore is one of the PIs here at BU who does this work. But this program is explicitly designed to say: all right, all of you labs that are doing synthetic biology and are planning to create pipelines that build these components, which are essentially biological systems that perform certain functions, you need to agree on the standards that you're gonna use. What kind of APIs are you gonna use? What kind of formats are you gonna use for describing an experiment, for describing a biological component, and so on?
And it's actually really interesting to see that they're doing things in the way that you would expect, and in the way that we see other open source communities doing things. They have consortia, they have working groups, they put together standards, and DARPA is actually funding this kind of thing explicitly and encouraging this kind of work. So, for example, something we've been involved in is a number of different tools within workflows, where we've contributed to existing standards as well as existing tools within the community. When you need to build these biological components, there's a standard language for visualizing them, and it's amazing to see synthetic biologists write it on a whiteboard so fluently, as if they're just writing in English. These sorts of images are rendered in various tools so that you can present them or examine them and things like that. So we've had contributions where we've added the ability to visualize additional kinds of components within tools like VisBOL. We've also contributed to several tools for generating designs. So if I wanna have some kind of logical circuit and I want to implement it as some kind of biological system, there's tools like Cello, which are being worked on by groups including those at BU that I mentioned, that essentially take that logical circuit and generate a bunch of biological designs that could implement that circuit. And then of course you need to test and see if they work, and that's part of what this experimental workflow is meant to support. Another standard that's growing right now is SBOL. This is a standard that could be for describing biological components, or it could be for describing experiments.
And it's an RDF based kind of XML format and it's actually expanding right now as the community decides what kinds of features should be added to it to describe the kinds of biological components that they want to build. One thing that I was really happy and impressed by was this open source protocol called AutoProtocol which is intended primarily for robots so essentially these are these pipet robots that I had a picture of here earlier on the right side. And basically the idea is that you'd have this JSON format for saying take this pipet, move this much of this material into this particular well and it's a programmatic way to describe this kind of process but it's very interesting if you go to autoprotocol.org you'll see that it just looks like an API and this community has actually done a great job of taking the kinds of techniques that we're familiar with from software engineering and adapting it to this particular domain. It's very encouraging to see that and it's very refreshing to use this kind of tool and use this kind of format when you're building pieces of software. So that's one area where it's definitely exciting to see these kinds of software engineering and open source techniques being introduced and we're very happy that we have the opportunity to contribute to these here at BU within the Software and Application Innovation Lab and for the Hariri Institute to be able to support this kind of work as well. So there's another area where we're seeing things change and kind of emerge right now called Digital and Electronic Health and the Hariri Institute as well as some other groups like the Institute for Health Science Innovation Policy as well as the Mobile and Electronic Health Affinity Research Collaborative. 
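The idea of describing a pipetting step as plain JSON data, as mentioned above, can be sketched in a few lines of Python. This is only an illustration: the field names (`op`, `from`, `to`, `volume`, `instructions`) and the well names are assumptions made up for this sketch and have not been checked against the actual Autoprotocol specification.

```python
import json
from typing import Dict, List


def transfer_step(source_well: str, dest_well: str, volume_ul: float) -> Dict[str, str]:
    """Describe one liquid transfer as plain data (hypothetical field names)."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return {
        "op": "transfer",  # hypothetical operation name
        "from": source_well,
        "to": dest_well,
        "volume": f"{volume_ul}:microliter",  # hypothetical "value:unit" encoding
    }


def build_protocol(steps: List[Dict[str, str]]) -> str:
    """Serialize an ordered list of steps into a JSON document."""
    return json.dumps({"instructions": steps}, indent=2, sort_keys=True)


# Two transfers into the same destination well (well names are made up).
protocol_json = build_protocol([
    transfer_step("reagents/A1", "assay_plate/B2", 10.0),
    transfer_step("reagents/A2", "assay_plate/B2", 2.5),
])
print(protocol_json)
```

Because the experiment is just data, it can be validated, versioned, and diffed like any other API payload before a robot ever runs it, which is the software engineering habit the talk is pointing at.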
These are some of the entities at BU that support or are encouraging some of these things to happen. But here I have an example of a series of projects that we're doing with some of the faculty members at the School of Public Health, building something called Computerized Adaptive Testing Platforms. These are platforms that allow organizations like healthcare providers, hospitals, and nonprofits to provide assessments that either could be administered by clinicians or can be self-administered by patients and users to assess their progress if they have, for example, spinal cord injury or burn injury or other conditions, or maybe just how they're progressing and what their condition is if they're healthy. And when we came into this community, many of these tools were being built using Visual Basic 5, and they were running on Windows XP desktops and things like that. It was very interesting to come into this community and try to introduce full-stack web applications, using open-source frameworks to build both the back end and the front end components, and as a result to make them cross-platform and more compatible with web-based environments. And now we've actually had great experiences taking some of these organizations on the right-hand side and helping them deploy these tools using modern techniques. So we're using Docker, giving them these Docker images, allowing them to set this up within their environment. And it's really been great to do that. One of the interesting things that we've run into here is that a lot of these tools have to be accessible, in the sense that you may have users who are hearing or sight-impaired, and they need to be able to use these websites nevertheless. And one of the things that we've run into is that there is a government-funded, I guess, framework called Assets that really isn't maintained very well.
And we've ended up actually having to use one with a not-so-great license, called Accessible Plus, for some of these tools. But we're actually thinking right now about taking Assets and maybe fixing it up ourselves here at BU, given the number of projects that could benefit from this. The idea here is that this is a front-end framework that allows you to introduce components into your web application, into your website, that make it accessible in accordance with standards like Section 508. So this is something where there's a lot of opportunity for private organizations as well as nonprofits and academic actors to get in and improve this, and maybe contribute in places where, for example, the government may not have the resources to maintain these things. I have here the Web Experience Toolkit, which is a Canadian government funded project, just for complete coverage. But these are examples of things where open source approaches are also gonna be very valuable. Obviously this is something that benefits many users and is essential for many of these applications, and this is where those techniques can provide lots of benefits. So here within BU there's another project that we're working on that showed us how important it is to use good software engineering practices. Margrit Betke is on top here. She's a faculty member within the CS department, and she's working with another faculty member in the College of Health and Rehabilitation Sciences here at BU to build a system that uses a Microsoft Kinect (it'll be other things eventually, because the Kinect is discontinued, but at the moment we're using a Kinect) to help patients adhere to exercises, for example physical therapy exercises that they have to do at home. And one thing that's interesting about projects like this is that they really are heavy software engineering projects.
There's a team of 20 people, with PhD students and software engineers and project managers, all trying to collect requirements, build mockups, put together user interfaces, put together the back end, and take components that are research components. These are algorithms that PhD students are putting together for their theses, and we're helping the PhD students get them to the point where they're production quality, so you can actually have patients using this and not being frustrated. There's a lot of project management that goes into this kind of work. There's a lot of coordination, and in this case we're also using open source approaches as well as open source frameworks. So it's a really great opportunity for demonstrating how valuable it is to use these approaches for these kinds of projects, which probably wouldn't have made it this far without using industry and community best practices for something this complicated. So that's another ongoing project that's happening right now within this digital and electronic health space. So I'll switch over to a third example area, where I'm personally very involved and where we have a lot of work. There are next generation cryptographic techniques. You've probably heard of homomorphic encryption, and there's another thing called multi-party computation. Basically, these techniques allow you to factor apart things that you used to assume have to go together. You would assume that if you want to take data from multiple organizations and do some kind of joint computation over it, analyze it, you need to actually share that data, give it to some party, and then that party will do the computation. It turns out that with techniques like multi-party computation and homomorphic encryption, you can actually separate computation from being able to read and hold the data.
I can, for example, encrypt my data and give it to some kind of service provider. They do the analysis without ever seeing the data or the results, they give me back the encrypted results, I decrypt the results, and I look at them. So these things are all possible, but these are not techniques that are currently being used in production in the real world. And one of the things that we currently have is a bunch of grants from the National Science Foundation, as well as other partners, that are explicitly for building open source libraries that will then allow others to build applications and services on top of them and introduce these kinds of features into their applications. So here we have a list of different libraries. For some of them, like Conclave, there are other talks that you'll hear during this conference, about both the libraries and the applications where they're being used. But we're essentially being funded by NSF to build libraries that are open source and that can be deployed and used within applications. Excuse me. We've actually been able to take some of these and deploy them, which I'll mention in a second. Now, there are obviously a lot of libraries being put together within this community by researchers, faculty members, and graduate students. But one approach that we took when we started building some of these was driven by the fact that we knew we needed to deploy them for applications that are basically gonna run in the browsers that end users are using on their laptops and so on. We actually had to build them from scratch using JavaScript, for example, so that they run on all the browsers and so that you can build applications that are compatible with all browsers. So this is a library called JavaScript Implementation of Federated Functionalities (JIFF) that we built to support those kinds of environments.
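The claim above, that computation can be separated from the ability to read the data, can be illustrated with textbook additive secret sharing, one of the simplest building blocks of multi-party computation. This is a minimal sketch under stated assumptions: it is not how JIFF or any of the deployed systems are implemented, it runs all "parties" in one process with no networking, and the function names, modulus, and salary figures are made up for illustration.

```python
import secrets

MODULUS = 2**61 - 1  # public modulus; all arithmetic is done mod this value


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_inputs):
    """Compute the sum of the inputs while each party sees only random shares."""
    n = len(private_inputs)
    # Each contributor splits its input and sends one share to each party.
    all_shares = [share(v, n) for v in private_inputs]
    # Party j only ever sees column j: one random-looking share per contributor.
    partial_sums = [sum(row[j] for row in all_shares) % MODULUS for j in range(n)]
    # Only the partial sums are combined, which reveals the total.
    return sum(partial_sums) % MODULUS


salaries = [52000, 61000, 58500]  # hypothetical private inputs
total = secure_sum(salaries)
print(total)
```

Any single share is a uniformly random number, so a party holding one column learns nothing about an individual input on its own; only the combined partial sums reveal the aggregate. Real deployments also have to handle networking, dropouts, and dishonest parties, which this sketch ignores.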
There's lots of demos online that you can look at, and you can see where these libraries from a few slides ago went or where they're going now. We have collaborations with the city of Boston and the Boston Women's Workforce Council, where these libraries were actually used to build applications that have been deployed in production over the past three years or so and used by hundreds of companies across Boston. They're basically loading this application in the browser and contributing data to a computation in a privacy preserving way. We have a partnership with Honda Research Institute, and it is to Honda Research Institute's credit that it actually is supporting open source development. They're funding this work, but they're funding open source libraries that they hope to benefit from as well. And we've built prototypes and demos of things like this: let's say you have Google's routing service, right? Google allows you to say I'm here, I wanna go there, and it returns to you a path for how to get to that destination. You can actually build that service in such a way that Google never sees your query. It doesn't know where you're starting and it doesn't know where you're going. Nevertheless, it's able to tell you how to get there. So you can do this in a privacy preserving way using these techniques and these kinds of libraries that we've put together. We've been fortunate to work with the Callisto Project as well. We've put together some libraries that allow them to use multi-party computation techniques within their service. Callisto is a sexual assault reporting service that runs on a number of campuses right now, and they're building a next generation version of their offering that is going to use some of these next generation cryptographic techniques. We were able to put together some open source libraries for them as well, thanks to the NSF funding.
And as I mentioned, you'll hear about Conclave, which is another project that involves the Massachusetts Open Cloud, Dataverse, and actually Red Hat, I think, is involved in this as well at this point. It's a very interesting project. You'll hear more about it in subsequent talks, but one of the libraries there is again something that we've been working on, funded by NSF, that contributes to this project and allows these kinds of privacy preserving computations to take place within cloud environments. So the last topic I'll talk about is urban data science. BU hosts something called the Initiative on Cities, which was, I think, started by the former mayor. The initiative tries to connect researchers and students at Boston University with cities, in particular the city of Boston but also other cities in the area, to address issues. And what's interesting is that over the last three to four years, and maybe even more so in the last two to three years, cities have started to embrace open data. They basically take all the data that used to be in document or paper form, or that was just going into a black hole on some server somewhere and never looked at again, and they've been turning it into these open data portals that are accessible to everyone. So right now you can go online, you can go to Analyze Boston, and you can see: here's all the bike paths in Boston. Here are all the 311 calls. Here are all the 911 calls. Here's where all the accidents happen. And so far it's a difficult challenge for cities to actually go beyond that, because they don't really have the resources to then take that data and build solutions based on it. They're just trying to get the data out there. But they are using open source frameworks like CKAN, and the Socrata API, to do that.
But what we've been doing is, for example, supporting project-based courses for students, as well as other projects that are funded by NSF, to build tools that use that data or to build tools that allow others to use that data. So we have this diagram of this ecosystem of various little components that allow someone to take a data set and work with it. If I have the bike path data set from the city of Boston, can I take that and then, for example, find the most efficient way to connect the bike paths? Where could I place additional bike paths to connect them up and give them better continuity? In order to do that, you have to retrieve the data, you have to store it somewhere, and you have to build a pipeline that maybe can retrieve updated versions of that data over time. And then what you need to do is maybe run an optimization algorithm over it. But that requires converting that data from whatever GIS format or OpenStreetMap-style format it's in to something that's gonna be compatible with NetworkX in Python or something like that. So we've built a lot of libraries and tools to support that kind of thing. And students have been able to contribute to, as well as use, those kinds of tools within courses that are offered at BU, where they use these open data sets and do projects focused on solving problems with optimization techniques and statistical analysis techniques. We've also been able to do it ourselves in a couple of cases. Recently, the Boston Public Schools had a challenge to find a better way to route school buses to reduce costs. It was a little controversial. We were careful not to get into the controversial part of it. What we did was offer to create an anonymized data set that is representative of the distribution of students in the Boston area but is not actually the addresses of all the children in the Boston area.
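The bike path question raised earlier in this part of the talk (turn path segments into a graph, then ask where to add connections) can be sketched as follows. This is a toy illustration under stated assumptions: the segment coordinates are invented, straight-line distance stands in for real routing cost, and plain Python with a small union-find is used so the sketch has no dependencies. A graph library such as NetworkX, which the talk mentions, would be a natural fit for the same steps.

```python
from itertools import combinations
from math import hypot

# Hypothetical bike path segments as pairs of (x, y) endpoints.
segments = [
    ((0, 0), (1, 0)),
    ((1, 0), (2, 0)),
    ((5, 0), (6, 0)),
    ((6, 0), (6, 1)),
    ((2, 3), (2, 4)),
]


def components(segments):
    """Group segment endpoints into connected components with union-find."""
    parent = {}

    def find(p):
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in segments:
        parent[find(a)] = find(b)
    groups = {}
    for p in list(parent):
        groups.setdefault(find(p), []).append(p)
    return list(groups.values())


def connecting_links(segments):
    """Pick shortest links between components, Kruskal-style, until all are joined."""
    comps = components(segments)
    candidates = []
    for (i, ci), (j, cj) in combinations(enumerate(comps), 2):
        # Shortest straight-line gap between any endpoint of ci and any of cj.
        dist, p, q = min(
            (hypot(p[0] - q[0], p[1] - q[1]), p, q) for p in ci for q in cj
        )
        candidates.append((dist, i, j, p, q))
    candidates.sort()
    leader = list(range(len(comps)))

    def find(i):
        while leader[i] != i:
            i = leader[i]
        return i

    links = []
    for dist, i, j, p, q in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # this link joins two still-separate groups of paths
            leader[ri] = rj
            links.append((p, q, dist))
    return links


links = connecting_links(segments)
for p, q, dist in links:
    print(f"add a path from {p} to {q} (length {dist:.2f})")
```

With real data, the interesting work is in the steps the talk lists before this one: retrieving the data set, keeping it updated, and converting it from its GIS format into nodes and edges.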
And we did that by essentially taking a collection of different open data sets and using a bunch of tools to build something that has the same distribution in terms of the number of students going to each of a number of schools. And you have to get it right. If you look at the picture on the left here, it just connects every student with a straight line to the school that they attend. In the Boston area, for example, because of the way that Boston assigns students to schools, you see there's a lot of cross traffic north to south. So it's not terribly efficient in terms of transportation, but you have to create something that's representative of the actual challenges that the routing algorithms might face if they're actually going to figure out how to transport all these students. So this was an opportunity to use some of these tools and some of these data sets, and for us to get involved in a project that hopefully at least provided a beneficial data set that others could use to do the actual routing challenge. Full disclosure: MIT won the challenge for routing, although again, you can read the newspaper articles to see what the controversy was there. So I wanted to now step back from these specific topic areas and demonstrate some of the ways in which, by doing all of these things, the Software and Application Innovation Lab at BU has been able to find opportunities to reuse and leverage our experience working in one area to inform, or give us something that we can reuse, or have an advantage in, another area. So this is what we sometimes call the spider diagram. It shows some of the projects that we've worked on and some of the tools that we use to support those projects in particular areas of research. You can group them in various ways, but this is just one way of grouping some of these projects.
So just to give an example, one thing that we've been able to do is build a backend system that we've reused throughout projects. The backend system basically does things like user management, authentication and you can manage data but we've been able to use it for these computerized adaptive testing platforms. We've been able to use it for synthetic biology databases. We've been able to use it for sort of situations where you need to build tools that are HIPAA compliant and by taking this open source platform that again we've seen people rebuild over and over again, building our own kind of version of it on top of open source frameworks, making it open source. We've actually been able to save a lot of effort on our part and we've been able to take features that benefit one community and sort of transfer it to another community and people are starting to accept it. They're starting to understand that oh it's okay that I'm going to fund an improvement to this open source project and save some of my funding for the actual work of the research rather than build the whole thing top to bottom and sort of own everything which doesn't really benefit anyone because again you're just rebuilding the same kind of backend from scratch each time. Frontend as well, we've kind of embarrassingly reused frontend frameworks and components across projects that are vastly different. And yet we've been able to sort of reuse this stuff and you know again save effort, get benefits of feature improvements across these projects in these kinds of scenarios. And then just sort of from a competitive advantage standpoint, right? So one of the reasons the Hariri Institute exists at Boston University is to make it possible for researchers to be more competitive when they apply for external funding to NSF, to NIH in places like that. And among the things that you need to have there is, thank you very much, I appreciate that. 
So one of the things that you would benefit from is you have built frameworks before, you've worked on, you've used best practices before and you're able to reuse those and say I can reuse this in a project if the NSF funds it. Or I'm able to ask a new kind of question because I ran into it while working on these kinds of projects. So I talked about a little the digital health projects before where we were able to take things that we're familiar with from software engineering and kind of bring frameworks and best practices into these communities and build all these tools with all these collaborators and sort of that's great. But many of these, for example, require HIPAA compliance. So then we can say, okay, well we have familiarity with that because we do cybersecurity research, we do this sort of next generation cryptography research. We can actually take some of these techniques and bring it into these projects. And now we can maybe use a slightly more sophisticated form of encryption to manage sort of multiple layers of users and the way that they can have access to different components. So we can bring that into the frameworks because that we have experience with that. Of course also builds our own sort of experience, the experience of our interns and so on as we do that. But then you can actually turn around and go back and say now that I'm in this community I actually am introduced to new problems that I haven't seen before because I'm familiar with all these members of this community. And now I can actually go back to NSF and say they actually would benefit from additional kinds of work in cybersecurity and cryptography that hasn't been done yet, but that we could do and then they would be able to use. And then you can go back to NSF and say we'd like to do this work, please fund it. 
It'll require some basic research, but then it'll also require us to do the software engineering to put it into production and have patients actually be able to use these features. And we've been able to do that with some partners like Hey, Charlie, a startup that tries to address recidivism and opioid addiction and help patients not return to their bad habits. And one of the things that we were able to do for them is build this kind of encrypted backend that allows them to analyze and store data without actually having to store sensitive data about patients, their histories, and so on. In other examples, we've been able to allow students within these urban data science projects to use the Massachusetts Open Cloud, for example, as well as efficient frameworks for doing big data computations in these kinds of environments. And then again, as a result of us being able to go to the City of Boston or other collaborators and say, here, we've helped you with something, they come back and they trust us enough to tell us their problems. Because usually if a researcher goes to, like, the City of Boston and says we'd like to work with you on something, they're like, who are you? I've never worked with you before. I don't really trust that if some of your students learn about my problem, they're going to come back a year later or a few months later and actually solve it. But by actually doing something first and building a relationship, you actually get to talk to people and they tell you what their real problems are. And that's how some of this multi-party computation deployment work happened: we already had relationships with the City of Boston based on previous engagements. And then that goes back into, again, allowing us to ask for funding to do research in cybersecurity, cryptography, and so on, for things like multi-party computation. Sorry, I did not plan to get sick.
All right, so anyway, that hopefully gives you an idea of the way that we've been able to leverage the fact that we are using software engineering practices and open source techniques to enrich the research community at BU and to find new opportunities, both for researchers and for introducing software engineering and open source best practices into these communities. And again, I want to pause and say there are a lot of other things going on at BU. This is just a small slice of things that we've been involved in. There are a lot of things going on at the School of Engineering. There are a lot of things going on even within the Hariri Institute that I haven't mentioned, which you'll hear about in other talks, and I hope you do attend those talks and hear about those other topic areas. But so that's us, we're the Software and Application Innovation Lab, and these are some of our sources of funding, here, from these external agencies and other partner organizations that fund the majority of our work. And thank you very much for listening to what's going on at BU. Hopefully it gives you a better idea of what's happening, and also a better idea of the opportunities that you have to engage with the academic community around some of these areas as well as many others. So thank you very much.

Thanks very much. Thanks, Andre, that's really great. I appreciate the talk. I forgot to mention one thing, which is that in this room here is the Red Hat Boston University Collaboratory Track. All of the talks in this room today will be presented by interns from BU who have been working with us over the summer. So please be sure to check those out if you find them interesting. We've got a coffee break now, so get some coffee. Enjoy it. Enjoy the stuff. We're paying for it. And thank you.