So welcome back. We are now ready to start the second part of today's event, the panel of distinguished speakers. It's my pleasure to introduce Professor James Davis over there, who is going to moderate the panel. Professor Davis has been with the Elmore Family School of Electrical and Computer Engineering for a little over two years now. He's one of our young stars, and we're very happy to have him moderate this panel.

Hi everybody, thanks for coming to the panel. I'm looking forward to an interesting discussion today. Joining me is Dr. Dongyan Xu. He is the Samuel Conte Professor of Computer Science. He also has a courtesy appointment in Electrical and Computer Engineering and is one of the directors of Purdue's CERIAS center for cybersecurity. His current research interests include computer system security and cyber-physical security, especially in the domains of autonomous vehicles, manufacturing, and supply chain networks. Dr. Xu has received multiple awards from major cybersecurity conferences for his research papers on kernel malware defenses, memory forensics, advanced persistent threat (APT) analytics, and IoT vulnerability discovery.

Next on the panel, you've all enjoyed hearing from Dr. Nancy Leveson, and I won't belabor her biography further. After her is Dr.
Ryan Newton. Ryan is an associate professor of electrical and computer engineering, as well as of computer science, here at Purdue. He's worked in the space of software developer tools since 2002 in both academic and industry roles. In industry, he's worked in engineering and research at Intel, Microsoft, and Meta, formerly known as Facebook, and he's also co-founded a startup and served as CEO there. Ryan has been a faculty member down the road at Indiana University and is now on the faculty of Purdue University. Ryan's work centers around increasing developer productivity and software debuggability using deterministic execution and deterministic parallel programming.

The last member of our panel is Dr. Eric Matson, who is a professor and University Faculty Scholar in the Department of Computer and Information Technology here at Purdue. He's the director of the Korean Software Square at Purdue and the co-founder of the M2M Lab, which researches multi-agent systems, cooperative robotics, and wireless communication, with a special emphasis on safety and security systems. Professor Matson was previously on the Board on Army Science and Technology for the National Academies of Sciences, Engineering, and Medicine. He founded an external limited liability corporation called Autonomous Range, which builds systems and logistical services for government and private organizations. And prior to joining Purdue University, Dr.
Matson was in industrial and commercial software development for 15 years as a software engineer, manager, and director at companies such as AT&T and Schneider Electric. So let's give a hand to our panelists, please.

The title of this panel is deliberately a little bit provocative. It's called "Conflict and Complementation Between Agile Methods and Systems Analysis When Developing and Maintaining Software-Intensive Systems." Here's the motivation behind this panel. You're proposing that we need a paradigm shift in designing safety-sensitive systems and security systems, and that what we need is more top-down analysis. The current approach dates, sort of, from lessons learned in the 1950s through 70s; it's just not working, and we need to do something new.

Now, in software engineering there was a similar paradigm shift circa the 90s and 2000s that said: we've tried doing top-down engineering design, and it doesn't work. We need to do bottom-up engineering and iteratively evolve our way towards a good system. Whether that system is safe or secure, I guess, I'd love to talk about with the panelists. So that's what's motivating this: these two competing claims to paradigmatic change, one arguing for more top-down and one arguing for way less top-down and a lot more bottom-up work. That's to set the stage.

So I'd love to start out with a question, sort of generally, to make sure that we all have a shared definition. We know what you mean by top-down systems analysis. I'd love it if one of the panelists, or maybe two, could comment on what they hear when I say agile methods, what the software industry means by that. Maybe Ryan, you could start. I think you're the freshest from industry.

Yeah, that's right. And I guess I wanted to bring the perspective that agile is a bit of a placeholder, and what goes on in software engineering within the software companies,
which are not first and foremost safety critical, of course, is only superficially called agile. For the most part it's an even more chaotic and bottom-up process, as it is with open source as well, which makes up a very large percentage of the systems that we rely upon today. So I'm taking agile here, for the purpose of this discussion, as the agile process itself that everybody knows about, with the daily scrum and stand-up.

Why don't you go into a little more detail? Some of our participants maybe know what you mean, but in the audience they may not.

Sure. So if you find people debating different software processes online and they're talking about agile, they're most likely talking about not only this system of iteratively improving upon something that you have ready as soon as possible, rather than a long-term planning process or an older waterfall method, terms you might hear. They're also referring to a set of practices where you'll get up in the morning and have a daily stand-up where people update their status to the other team members. You'll do two-week sprints where you select a set of tasks that are drawn from a backlog. There's a certain set of rituals; there is a process associated with it. But much of software development doesn't follow any sort of explicit process at all. And so there's kind of a spectrum of things from top-down to bottom-up, but there's also this other dimension of process-heavy versus no process or low process.

And Eric, I understand that you were in more of a defense contracting, larger system building environment than Ryan was. Can you talk a little bit about whether you experienced agile methodologies there?

We actually did both.
So I spent lots of years, especially at AT&T. At that time AT&T was very rich, and probably, from a safety standpoint, a lot of the model checking and modeling techniques came out of AT&T at that time. But AT&T was also being broken apart, and now it's more of a cell phone company than it was. At one time, I think 60 percent of all patents in the U.S. came out of AT&T and Sandia and places like that.

Now, if you look at agile versus something like waterfall, those are kind of two ends of the spectrum. For a long time waterfall was the traditional method everybody used, but it was fairly slow and plodding, and you discovered mistakes very deep in the process. And it goes back to your point about expense: it got to be very expensive to change things.

Agile was kind of coming out at that time. Agile wasn't the term, but there were iterative processes, you know, when the OMG started defining all these things and a lot of the things for the software engineering world. We actually used both. What we found out is that with the waterfall method, people had a very hard time, especially on the manufacturing floor, getting people to express what they wanted with new systems, especially in union environments, places that were highly constricted. So we didn't call them scrums at that time, because that was a term that came later, but we would go out and actually collect data and use very agile methods to generate requirements, and then plow them back into the larger methods. The good thing about that is the waterfall method, especially the way we applied it, had a lot of rigor to it and had a lot of structure to it.
Whereas with agile, as you said, I know Microsoft used agile for years and they had lots of problems with quality and things like that, just because of the way they applied it. So we applied it by using the best parts of both and interlocking the processes, and that actually worked pretty well. But neither one of them is perfect, and neither one of them will do everything you want it to, especially in the context of this.

So Nancy, Eric just talked a little bit about systems that maybe have more safety or security requirements and higher stakes in play, and how in his experience they used a bit of prototyping to explore what the operators experience and to explore what the system is capable of. Can you talk about how that kind of prototyping information can feed back into the top-down analysis and the systems engineering perspective you bring?

Well, first of all, let me differ with your first statement. (Go ahead, differ as much as you like.) The statement describing top-down development: that's not top-down development. People think they're doing top-down development. They say, "Oh, we do top-down. We break it into components, and then we work on each of the components and we put them together." That's bottom-up. That first step of breaking it into components is easy, trivial. That's not top-down. Real top-down development means you look at the entire system at various levels of abstraction, starting with a very high level of abstraction, but you're looking at the system as a whole. That's the first thing.

The second thing is, I think people are making a false dichotomy, that everything has to be agile or it has to be something else.
I mean that's that's Filly, I mean it's like saying You know we should Do everything Build every system exactly the same way and that's that's not Reasonable there's different kinds of systems and we need to tailor our method if you have 8 000 engineers and 4 000 of them are software engineers Distributed around the whole country. What are you going to have a scrum with 4 000 people on zoom? I mean that's insane It doesn't make any sense um, if you're Building a system that's Going to be thrown away It's some new app You're going to get rid of it in three months. You want it out quickly because then you get market share This is what the silicon valley does and it doesn't have to work so well as long as people start using it And then you can get them a new version and another one later And they'll be happy and they won't go to your competitor Then you use agile but I mean it's insane to talk about using agile on some giant complex system It's going to be around for 30 years. It's going to have to have to be continually changed It's like saying, um I'm not going to look at a map and plan my route I'll just randomly go around the streets and hope I find the place that i'm going to if you don't know where you're going you can't get there And um, and I believe and so Sure, there are things and you know, this whole the agile thing that changed I was saying at breakfast we um The d.o.d. 
has swallowed this whole agile business. They had this one thing where they were implementing something in Afghanistan. It was a system for scheduling refueling of airplanes in the air. They did it on a whiteboard with little yellow stickies that they'd stick up there. They did this for a couple of years, and someone said, "Well, you know, maybe we should automate this." So they automated it. They had a small company, Kessel Run in Boston, do it. It took them a very short amount of time. The thing was only going to be used while they were in Afghanistan, so there was a deadline, but they knew exactly what the requirements were. They knew the algorithm. They'd been doing it by hand for years. So yeah, it works. So then the DoD said, "Oh, well, then we should do all systems that way."

I just got an email from someone at a large defense contractor. I can't mention the name, and they wouldn't even tell me the project, so I'm sure it's secret. They said, "Well, they told us we have to use agile, and they also said we can't write requirements. They'll give us some use cases and we should just use those use cases. But we don't know how to make sure that this thing is going to be safe." I assume this is something that explodes or does something bad. And I said, "Well, you can't. I mean, I'm not going to be involved. You can't. That's stupid." So I wished him luck.

So I'm not a fan of agile unless it happens to be a certain kind of problem.
I'm also not sure I agree totally with you about people not knowing what they want. People don't know how to tell you what they want. I had some undergraduate software engineering students go into Boeing. They had a programming problem, and the students were going to implement it for them. But I taught them interviewing techniques. I taught them some sophisticated techniques for really figuring out what these people wanted to do and needed to do, and they could. But it takes more work. It takes prototyping sometimes, so that they try it and say, "Well, no, that's not what I want to do." But there are ways of doing these things. I don't think that just doing away with it, building anything and then saying, "Well, do you like it?", is a solution to that problem. Go ahead.

On that, it wasn't a matter of saying they didn't know. They absolutely do know; they know their jobs well. The biggest issue, especially since we were going into a lot of heavily union shops, was that getting them to change was damn near impossible. So you had kind of two methods. One, you have the fear of the white page. You give them a white page and say, "Tell me all the things you'd like to have in your new system," and you'd get back the page and it would be completely empty, or you'd get a list that would be exactly the same functionality they currently have, and you're not going to do anything new. So what we would often do is go in there really quickly, for like a week, develop a prototype, and say, "Okay, play with this. This is what you're going to get." And then that would drive a tremendous response, and they'd go,
"Oh, no, this is not going to work for us." And then they would give you an extremely detailed list of everything wrong with it and things like that. So we would use a lot of those kinds of techniques on the shop floor to drive requirements and drive specifications, because that worked extremely well. Sometimes, dragging it out of them, we'd get pushback from the union boss or the stewards, who would say, "No, we're not changing it at all." And it's like, "Well, look, I'm just the software engineer. I don't make those changes. But the reality is we're going to get a new system, and you either get what you want or you're going to be forced into something." Sometimes it's not easy. Yeah, they definitely know what they want, because they know their job better than anybody. The reality is that extracting that from them is sometimes a long and painful process, until you put it on them to say, "Okay, here's what you're going to get," and then they'll be very willing to give you more feedback.

Yeah, I mean, I think we just need better tools.
We don't have good requirements engineering tools. But that doesn't mean that we shouldn't do it. It's an important part of engineering, and engineers, especially computer people, want to get to the programming: "Where's some code I can write?" Doing a whole bunch of specifications and sitting in meetings isn't real fun.

So I'd love to ask your perspective on what I think of as a kind of pollution from the non-safety-critical software world. This is the world where I've spent most of my time, in tech companies, where, to take a kind of devil's advocate position, it's almost giving up on software engineering management and instead saying, "Let's have this frothy soup of all the brightest engineers just trying things, and reward people for what they accomplish, and out of this Darwinian soup you'll get some good stuff." Gmail was a 20 percent project, or what have you. In this world of software engineering, both inside the major tech companies and inside open source, where it's just passion projects that people are working on in their nights and weekends, we get all this software made, and now my car is full of all kinds of software, including the Linux kernel, which isn't developed with any sort of rigorous procedure but is just the dominant software platform. And so as this world of open source software and this mainstream programming culture develops this library of software, it makes its way into what I think of as safety-critical systems. I don't actually know if this has penetrated as much into our airplanes or our missiles as it has into our cars. But do you see it as a problem, with insufficiently vetted software sneaking across because it's convenient and it's there?

Well, I don't think it's just safety-critical systems. Clearly the safety-critical case makes it a compelling argument. But I don't want to lose money.
You know, something like 50 percent of large software projects are never finished. 50 percent. And of the ones that are finished, something like another half are never used by anyone. They were delivered because someone had to deliver them, and then nobody uses them. There's something wrong with what we're doing here, and I don't think it's going to be solved by agile. Agile at the beginning was a reaction against the Software Productivity Consortium stuff, the maturity capability model, what was it, MCC or something, whatever, because that treated humans as machines, and it was not the right way to get good software. But the other extreme, treating it all as just human creativity, is also, I think, a mistake. We've got to get somewhere in the middle.

Dongyan, did you have something to add?

Yeah, so I just want to add that regardless of the specific software or system development methodology that you adopt, it is always good to be backed by systematic system analysis in different aspects: process modeling, system modeling. And you do have to look at the entire system, as Dr.
Leveson mentioned. You have to look at the system as a whole, and safety and security should always be first-class citizens when you start the system design. You don't want to be in haste to decompose the system into components and say, "You work on this, you work on that; you swim in this swim lane and you swim in that one." Because at the end, based on what we see, the vulnerabilities, the weakest link, the Achilles heel, usually exist at the border between these system components. So I think top-down analysis and modeling is valuable in the sense that you try to capture these overarching properties and functions and different aspects, and try to keep that in mind throughout your end-to-end design and implementation process. So you can do, for example, CI/CD, scrum, all kinds of agile or traditional software development. It should always be backed by sound system analysis models and modeling methodologies. That's what I have to say.

Before we tackle that, I'd love it if you could talk a little more about some specific vulnerabilities that you've identified in your research and reading that really sit at these intersections of components.

Thank you. Yes, it's good that you mentioned this. In fact, when I was listening to Dr.
Leveson's presentation about the use of constraints to constrain the behavior of a system, that resonated tremendously with what I'm doing in the context of UAV security. We have a project that aims at looking at the controller of the UAV system from both the control model and the program perspective. A lot of the vulnerabilities that we find are not necessarily the fault of the system or control engineer. The problem is actually in the process of translating, or implementing, a very well-designed and scientifically sound control algorithm into the control program. You tend to see a lot of bugs that are introduced because of a lack of secure programming, or because of a lack of understanding of both the systems aspect and the software engineering aspect. Bugs as simple as a wrong variable name: you define a conceptual control variable in the abstract model, and when you write the control program, sometimes you make a mistake and use a different program variable. As simple as that. And sometimes it's how often you do bounds checking, how often you check for integer overflow, buffer overflow. We have seen a lot of those problems in both open-source and commodity autonomous vehicle control software frameworks.

I just wanted to add there: I'm sure the graduate students out here are looking for topics, things to work on, and I just don't think we need another compiler. There's really not been enough work in systems analysis, as you said, and that includes software systems analysis. It would be good if you could expand that to engineered systems that include lots of software. But that's where the real wins are today.
That's where the really important stuff is going to come from, I think.

Did you have something to add? Oh, a chance to defend compilers, perhaps. Go for it.

Yeah. So I guess I did want to ask about the vision of system modeling. Let's maybe narrow the scope to cyber-physical systems, so we're not talking about arbitrary software-only projects, because I would challenge you to find a software-only project at a major tech company that does any systems analysis or system modeling. I haven't heard of one. I'd like to see an example. But in the cyber-physical case, when you are performing STPA or other system modeling methodologies, do you envision this as ultimately being connected to the code (there is still code, certainly), and connected to the code in a formal and unbreakable way, where the human doesn't get to mess up the variable name and screw it up at the last mile when the model hits reality? I wanted to bring this a bit to the question of formal verification, and connecting that formal verification back to code in a mechanized way, which is something that programming languages researchers and the people who are building new compilers, like Adam Chlipala's group at MIT, are very much looking at. So I guess it's a question for both of you.

Go ahead.

So I think this is a great point. In the domain of cyber-physical systems, I think the software actually plays a special role, in the sense that the software is almost a digital twin of the physical system. It's not just yet another component of the entire system; it's actually a reflection of that system in software, in a relatively formal way. So at least in my own experience, it is always valuable to have, almost side by side, an abstract system model.
For example, the six degrees of freedom for a UAV, its corresponding controllers, and the cascade of primitive controllers that control the position, the velocity, and the acceleration. So you have that on one side, and on the other side you have your program's CFG, the control flow graph, and the data flow graph. You want to create a one-to-one mapping between the abstract controller and the specific program module, and between the abstract control variable names and the program variable names. You try to create a roadmap that gives this one-to-one mapping between the physical components, the control model, and the actual program implementation. We use that as the roadmap to unite the system engineers and the software engineers, so that every time they realize they may have a communication problem, we go back to that same page to see: "Oh, you are talking about this control algorithm, and that corresponds to this software module," or "I found this bug in this piece of code, and that could actually create a physical consequence on, for example, the hovering, takeoff, or landing mode of the vehicle's operation." So I think that mapping is very important, and it is not typically done in developing a traditional piece of software like web browsers or web servers.

Sure, but just to clarify, to make sure we understand the example: if I come along and do mutation testing and swap the x variable in the model with the y variable in the actual code, what catches that?
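As an illustration only, and not the panelists' tooling, here is a minimal sketch of how a model-to-program roadmap of the kind just described might be checked mechanically. Everything named here is a hypothetical assumption: the `ROADMAP` table, the controller name, and the variable names `tgt_x`, `pos_x`, and `kp_x` are invented for the example, and the "program" is a tiny Python function rather than real flight-control code. The check only confirms that each program variable the roadmap promises is actually referenced in the implementing function's body, which is enough to flag a swapped variable.

```python
import ast

# Hypothetical roadmap: abstract controller -> implementing function and
# the program variable that stands for each abstract control variable.
ROADMAP = {
    "x_position_controller": {
        "function": "x_pos_ctrl",
        "variables": {"x_target": "tgt_x", "x_measured": "pos_x", "k_p": "kp_x"},
    },
}


def names_in_body(source, function):
    """Return every identifier referenced inside the named function's body."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == function:
            # Parameters are ast.arg nodes, so only real uses are collected.
            return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}
    raise LookupError(f"function {function!r} not found")


def check_roadmap(source, roadmap):
    """List every roadmap variable that the program never references."""
    problems = []
    for controller, entry in roadmap.items():
        used = names_in_body(source, entry["function"])
        for abstract, concrete in entry["variables"].items():
            if concrete not in used:
                problems.append(
                    f"{controller}: {abstract} -> {concrete} is never used"
                )
    return problems


# A proportional controller that matches the roadmap (assumed example).
GOOD_SOURCE = """
def x_pos_ctrl(tgt_x, pos_x, kp_x):
    return kp_x * (tgt_x - pos_x)
"""

# The mutation from the question: the y measurement is used instead of x.
MUTATED_SOURCE = GOOD_SOURCE.replace("tgt_x - pos_x", "tgt_x - pos_y")

for label, src in (("good", GOOD_SOURCE), ("mutated", MUTATED_SOURCE)):
    print(label, check_roadmap(src, ROADMAP))
```

A check like this is purely syntactic. It would not notice a variable that is referenced but used with the wrong sign or in the wrong place, which is part of why system-level monitoring is still needed.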
If we break the correspondence between the model and the code, which part of the tooling is going to then complain and cry out?

So for CPS you always have to set up a testing framework that involves not just the program but also the entire system. It could be a real system or it could be a simulator. For autonomous vehicles, you typically involve a high-fidelity simulator. So basically you monitor not only the program execution but the entire vehicle operation, and you do a lot of instrumentation and measurement so that you can measure the health of the system at any time, in terms of control as well as program execution. So sometimes, if you realize that your state no longer tracks the goal, then you realize that maybe your control algorithm is not executing properly, even though the program keeps running. In traditional program fuzzing, you hit the jackpot every time you have a crash. In our experience, sometimes a control program doesn't crash, but it is definitely not right. So I think this is actually kind of new. At least to me, this is something that is quite interesting, and it highlights the importance of overall system modeling.

So Nancy, can you talk a little bit? You had a slide near the end about all the ways you've applied STAMP, and some of them include building tools for, I assume, systems engineers to use. Do those tools bridge the gap between the systems engineering model and the software underneath, or are you relying on human operators to do that mapping?

Yeah, you know, virtually every accident involving software that I've heard about or been involved in investigating, and there are hundreds of them, every one was a software requirements problem, not a software coding error.
We get coding errors pretty much out, and if you can spend some money and time on it, you can do pretty well on that. So what you do is you start with system requirements, and then you generate from that the requirements for the different components: the hardware, the software, the humans. But you have to start by doing an analysis of the system as a whole and the operation that is required. So I agree with everything you're saying. It's just a loss. It's a field that just did not get enough attention in computer science. I used to be one.

Eric, did I see you twitch the microphone earlier?

Okay. The one thing to add is that we've done a couple of projects here at Purdue with Air Force Research Labs on micro air vehicles. There are very few teams doing micro air vehicles, like flapping-wing robotics. That was a defense thing until hypersonics came in and took all the money away. So we continued the actual work. You have the model, you have the software, you have the actual cyber-physical implementation, which is the controlled vehicle. But what they were interested in, and what we actually did for them, was to update the model in the middle of it. I do a lot of work for the military, and the military is very interested in survivable systems. We also do a lot of MUM-T, manned-unmanned teaming, with these systems. I can't talk about some of it, but we can talk about some of it. But the really interesting thing that they want us to do, and what we're working on now, and we're getting a second level of funding from Air Force Research Labs, is not only comparing the model and the software but updating the model in the middle of the actual execution. So for example, if you have a flapping-wing robot and the wing gets clipped off, well, that's going to change
your control model. That's going to change how the bird flies, if you think of it as a bird, which it is. And you actually have to adjust your model to be able to adapt, which is very complex in the middle of a flight. But we actually showed that, you know, it's not proven, because you can't prove any of those kinds of physical things, but you can show that it is valid, that you can do it. So there are a lot of really cool things out there, especially for the grad students, if you're looking for really cool projects like this. It's not so much of a safety concern, because the thing is very small and the worst it could do is crash. You worry about the safety of the actual vehicle.

Remember, I define safety very broadly.

Yeah, yeah. The safety of the vehicle. If you're looking at something like a Predator drone, that's a little different story; you're looking at safety, especially for lethal systems. But I think there's a really cool concept there where we even go beyond that. And that's what I was interested in with some of the things you were saying: how do we actually update the model in the middle of the operation, to adjust the model and make the model better? So we're doing a lot of work with that, and we're applying that to the MUM-T systems. MUM-T is just manned-unmanned teaming, for the U.S. Army.

So this is a broad question. I'm not sure which of you wants to pick it up, but maybe more than one of you. When we talk specifically about trying to evolve complex systems that have been built, often that involves changing the software in some way. One of the goals of agile methods is to facilitate moving a thing from a working state to another working state, and one of the challenges there is measuring whether it's still working. The typical approach has been running tests on it, right?
And it'd be nice if we could instead prove somehow, based on our models, that it's working properly. A perennial problem, or at least a perennial claim, in the software engineering practitioner community is that you can't scale formal verification of your systems: there are just too many states, and it will take a million years to prove that anything really complicated is working. So can any of you comment, in your own particular sub-areas, on how true that claim is? How close are we to letting engineers, over a two-week sprint, modify an existing complicated code base and still be very confident that they didn't regress behavior in a way that will compromise safety or security?

This is some of the stuff we're doing. In reference to that, if you look at the entire system as a whole, it's kind of like calculating the chess game. A few years ago you couldn't do it, until Deep Blue and some of the others came out. The reality is, from what we've worked on and what I've seen others do, the model doesn't have to represent every single thing something does. You look at the most important aspects of it, the safety-critical aspects of it from an operational standpoint, and then approximate that, versus approximating every single possible thing this cyber-physical system could do. And that's oftentimes calculable.
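A toy illustration of that idea, offered as a sketch under stated assumptions rather than anything the panelists built: if the model keeps only a few safety-relevant aspects, the reachable states can be enumerated exhaustively and a safety property checked on every one. The vehicle abstraction below (four flight modes, two battery levels) and the property (once the battery is low, the only next modes are landing or ground) are hypothetical and chosen purely to keep the state space tiny.

```python
from collections import deque


def successors(state):
    """Next abstract states of a hypothetical vehicle model (mode, battery)."""
    mode, battery = state
    nxt = set()
    # Assumption: the battery can drop from ok to low at any time in the air.
    if battery == "ok" and mode != "ground":
        nxt.add((mode, "low"))
    if mode == "ground" and battery == "ok":
        nxt.add(("takeoff", battery))
    elif mode == "takeoff":
        # A low battery during takeoff forces a landing instead of a hover.
        nxt.add(("hover", battery) if battery == "ok" else ("landing", battery))
    elif mode == "hover":
        nxt.add(("landing", battery))
    elif mode == "landing":
        # Assumption: the battery is replaced once the vehicle is grounded.
        nxt.add(("ground", "ok"))
    return nxt


def reachable(initial):
    """Breadth-first enumeration of every state reachable from the start."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def violations(states):
    """Low-battery states that can move to a mode other than landing/ground."""
    bad = []
    for state in states:
        _, battery = state
        if battery != "low":
            continue
        for nxt_mode, _ in successors(state):
            if nxt_mode not in ("landing", "ground"):
                bad.append((state, nxt_mode))
    return bad


STATES = reachable(("ground", "ok"))
print(len(STATES), "reachable states;", len(violations(STATES)), "violations")
```

The point of the sketch is the trade-off described above: the check is exhaustive only with respect to the abstraction, so its value depends entirely on whether the chosen aspects really are the safety-critical ones.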
I mean, you can actually go out and calculate the states. You can look at those kinds of things. And that's worked quite a bit. Now the biggest issue, and it kind of goes back to what you were saying before, is that the one thing people do a really poor job of is defining why their systems are so complex. Is it because they have too many linkages to too many things, there are too many couplings? You know, in software engineering, coupling was always the dirty word; you wanted your systems decoupled. But that creates a problem in this, because if your systems are decoupled you can't see the giant system as a whole, you know, and each one is kind of its own little individual. So I think you can make approximations based upon the most critical factors of a system. And that's worked pretty well from a practical standpoint. It doesn't necessarily work from a purely theoretical standpoint.

Another topic that I thought of, that I think people should be working on: you may have read it, and if you haven't you should, it's Fred Brooks's paper "No Silver Bullet." Have you all read it? Hopefully. He says we add a lot of complexity into the systems ourselves, but there is still just some complexity that's part of the problem, that we can't get rid of. So I'd like to see people come up with how we design systems without adding unnecessary complexity, because we do a lot of it. I think at Purdue you guys have the, um, what is it, the Goldberg...

Yeah, we have a Rube Goldberg challenge.

Yeah, you know, adding complexity for complexity's sake. Unfortunately, we sometimes do that too often, accidentally and not on purpose. So I think a good topic that we've got to get to is how we create systems from the beginning, instead of leaning on assurance.
You cannot assure a hundred-million-line piece of software. There are 100 million lines of software in a car today. You cannot assure that; forget it. If you found a problem, you can't fix it without causing 20 other, more serious ones, and you can't be sure that you haven't caused 20 other, more serious ones. So we've got to figure out how we build these things in from the beginning. What I see people do is they get a set of requirements and then they immediately create an architecture. And what architecture do they use? Well, it looks like something they've always done before. We don't have any theories for how you design. What kinds of architectural principles should you use to get certain properties out of systems? All systems have multiple properties. What kinds of architectural design techniques can we create to do those? How do we analyze them? How do we look at the trade-offs, where this may be more reliable but less efficient? How do we create architectures? How do we start? There's not enough at the beginning of the design process. And unfortunately, in software, just historically, we leaned a lot on testing. I used to be in testing. You can't test them. I'm sorry, you have to, but you're not going to get much out of testing. Either we start designing systems and becoming more mature in our ability to create and engineer systems from the beginning, or we're still going to have 60 or 70 percent of the complex systems we build never used by anyone, because they can't be used, and that's a shame. That's a tremendous waste of resources.

Do you view computation, or in particular the computing horsepower provided by machines, as offering any advantage for reasoning about complexity at this kind of system requirements level? You would think at a glance that the computer's going to evaluate many more chess states.
DALL-E can generate artworks synthetically from a text prompt. You know, computers solve these amazing problems that would be beyond human capability. But when it comes to thinking of complex combinations of requirements that are going to yield a failure, like some of your examples, it seems like we should in some sense be able to use computers to reason through some of these unforeseen transitive consequences better than humans can. But I'm not hearing that there's any lever for that right now, that I can warm up a computer and have any additional help with this kind of holistic system safety reasoning.

I definitely agree with you, Ryan. I also observe that large-scale software development, especially in terms of managing the human developers, is still not very scalable, even for large-scale, real-world production systems. I talked to some of my colleagues in industry, working at names that I don't want to name, and they said that despite the complexity of the problem and the number of software engineers, hundreds of them, thousands of them, involved in the development and maintenance of that software, the core team, the people who are actually critically responsible for CI/CD and for making those critical commits, is a surprisingly small subset. I would say two digits.
I would say two digit So I feel that this is not very scalable right because now we have all these computing power You know sitting there, you know that can be waiting to be Leveraged maybe to speed up or to uh perform some of the tasks that the human developers Are routinely doing these days To you know to improve the productivity and the reliability and facilitate the communication and you know smoother CICD processes But in the traditional software development and software bug elimination process Uh computation is helping with increasing testing and with increasing formal verification You already mentioned fuzzing but also on the formal verification side So testing and verification are very different things and we are getting closer to a world where uh a human being can write additional logical specifications in their code and a large number of those can be mechanically checked So if you're using something like liquid haskell or the prusty system for rust There's there's increasingly the ability to have a in this case a smt solver Do a lot of the work for you when it comes to discharging formal verification obligations But when will that become mainstream in programming? Well, I'm a dreamer and I can I can hope it will be The majority of programmers in a couple two or three decades, but uh, but now it's certainly not the the mainstream And even if we had wonderfully verified kernels and compilers and pure software systems Then there's this whole separate question about how it relates to system specifications and cyber physical systems And that that's actually the specifications themselves are changing Because you know, you want new features you want new you are defining new performance and functional metrics, right? That you need to take into consideration Yeah, that's on the fly. I guess most of our traditional software formal verification requirements weren't really moving targets So we want to make sure there's not memory errors. 
We don't have out-of-bounds accesses. We prove certain properties about a kernel in terms of safety or isolation, but they don't change a lot.

I think it's also interesting, as Dr. Leveson mentioned, that you do need to identify some of these constraints, and sometimes you do need to mine the existing system, the code, the documents, and even the operation logs of the system, to derive the constraints that need to be enforced in the system.

So we're approaching the end of this and we'll head into a Q&A session. I'm sure there are lots of questions out there. But since many of the folks in the audience are students planning to head into industry when they finish, I'd love it if each of you could share any thoughts you have about what ethical responsibility you have, as engineers, as a professional and as a member of a professional discipline, towards pursuing safety and pursuing security holistically in the systems you're building, regardless of the engineering process, agile or waterfall or whatever. How should engineers be thinking about their duties to society as they're doing their work?

I have a chapter in my book on ethics.
It's the first chapter, and I put it first on purpose. Well, it's the second chapter, there's an introduction before that, but I wanted people not to skip it. And one of the things I make my students do is look up the code of ethics for their professional society. There is one for the ACM and for every professional organization, and nobody's ever read them. But they should, and I make them look it up so that they can see that there is one. And they always contain safety as one of the professional responsibilities. And unfortunately I see people not doing that. Some people, not a lot of people, at MIT may never speak to me again, but there was this one group who advertised this new class for undergraduates where they were going to have them use machine learning and whatever to create medical tools. And they advertised that they were going to be used at local hospitals. And I looked, and it didn't look like they had anything about safety in this, so I called them up, well, I wrote an email. I said, "Are you going to teach safety?" And they said no. I said, "Don't you think you should? You're going to use this on real people." And they said no. And I said, "Well, you know, that's really unethical." So they said, "Well, if you want to come, we'll give you a couple of hours to teach them about this." I'm supposed to come teach their class? And I said no. It's not my responsibility to teach everything. You know, people don't seem to think about the ethical responsibilities. I think that's what you may be getting at. They're not thinking enough, and we're not teaching them enough to explain to them that there are people's lives at stake. That's most of what you're doing, or their money, which is also their livelihood. All of these things, we're responsible for them.
We have to take that seriously, as what we're doing with our lives.

Well, let's take a minute and thank the panel. And we've got about 15 minutes for questions. So if you have a question for the whole panel to consider, or for a specific person, there are a couple of microphones around; shoot up your hand and we'll get you a mic.

Okay, thank you. It was a very explosive lecture and interactive session. So I have two questions here, one for Prof. Nancy and the second one for Prof. Dongyan. The question is on constraints. One thing I got from your lecture, and from this interactive session, is that the whole concept of the new paradigm is about enforcing constraints on a system. But these constraints first need to be identified. So I'm curious: first, are there newer ways to identify these constraints on a system? And secondly, what happens if we miss a constraint, or an important constraint? Thank you.

So the questions were: how do we find the constraints in the first place? Your paradigm requires that we have them, so where do we get them, and are there new tools and techniques that you could talk about? And the second part of the question was: what are the potential consequences when we miss a constraint, and how do we find it before the plane crashes?

Yeah, let me do the second one. You can only do as much as you can. I mean, the world's not perfect; engineering is never going to be perfect. I will never, ever claim that my tools find all the unknown unknowns. They find many of the unknown unknowns, but nobody can guarantee that in any kind of analysis. But we need to at least do the state of the art. We need to at least do as much as we can. That's all we can do. How do we find the hazards?
Identifying the hazards is not usually very hard. The only thing that may be tricky is getting the stakeholders to tell you what their priorities are, because it's not our responsibility to decide what's important to them or not. They have to decide what's important to them. We're just a client of theirs. But that's not usually the problem. Identifying the constraints is pretty simple, actually, it turns out. It's the rest of the stuff, figuring out whether they hold in your system, or how to build a system that always enforces them, that is the real problem.

So I do feel that this is almost like an iterative process. Sometimes in the first iteration you may not be able to identify the full set of constraints that you want to enforce for the system, but you can actually do that throughout the production or the test operation of the system. And as I also mentioned earlier, you are also able to derive some of these constraints from the system itself. So, for example, put it in a clean and safe environment and operate the system, make sure that you see all the normal operation, and at the same time collect as much data as possible. This is where machine learning comes to the rescue. You can actually mine the data to derive some of the rules, invariants, or constraints that are not even specified in the design document, for example, that these parameters are supposed to satisfy this mathematical relation. And then you can have a human expert verify that and see if your program or your system actually enforces that rule or that constraint. And you can do that in an iterative or kind of spiral process.

Hello. This is a question referring to your previous lecture, Ms. Leveson. For the plane incidents, why didn't the software check altitude in order to determine if the plane was landing or not?
So the question was, in some of the examples that you shared of the aircraft accidents...

Right, how do they know that they're landing? Yeah. To know that they've landed, they use different cues. One is weight on wheels: is there weight on the wheels? You can measure that with sensors in your wheels. They use things like wheel turning rate. So if you're landing, your wheels should be turning. You know, the landing gear is extended before you get down, but the wheels are not going to rotate until you get on the ground. So there's a whole bunch of cues. Was that the question?

Yeah. Part of the question was, why not use something like an altitude sensor? Do those not exist? Are they too expensive? Or is it a bad method? Why not use an altitude sensor?

There must be a good reason. I don't design aircraft, but there must be a good reason. They may not be accurate enough. On the Mars Polar Lander, that was the spacecraft that crashed into Mars when we tried to land it, they did try to use an altitude sensor, but they also used, I mean, these things fail, everything fails, so you have multiple things. And they had these very, very sensitive sensors on the landing legs, called Hall effect sensors. They're very, very sensitive, because as soon as you get down on the surface...
There's this thing, a thruster, that's slowing down the spacecraft, like a reverse thruster, by pushing up against it. That's how it slows it down so it doesn't crash into the surface. It turns out that they have to extend these landing legs before they get down to the ground, and there's some noise that gets generated when the legs are extended. The engineers knew about it. They didn't tell the software engineers about it. So the software engineers didn't know about it; the software thought it had landed and then turned off the reverse thruster, basically the descent engine, and it crashed. But they also had some altitude sensors. I don't really know why; there must be some reason. Do you know why they don't use them?

My educated guess is that, you know, sensor fusion is very typically used. They actually use not just one, because you don't want to critically depend on just one sensor, right? But there is this quote-unquote democracy among these different sensors.
IMU or whatever, they need to reach an agreement, right, before raising a real alarm. And sometimes, especially because of the malfunctioning of a subset of them, they cannot agree with each other, and that's really a kind of fuzzy, false-positive or false-negative moment. My guess is that this is the reason why an inaccurate reading, an FP or FN, a false positive or false negative, happens. And I think the situation we talked about belongs to that.

They have things like ground proximity sensors that they use to make sure you're not going to hit the ground. But remember, the ground isn't necessarily all flat. You know, you can have a hill and go over it, and you think that you've now landed and you haven't; now all of a sudden that hill isn't there. There are actually a fair number of airports that have big mountains right in front of them. They're not real safe to land at. They have a lot of accidents running into them.

I think I saw a question over here.

Hello. So first, thank you very much for sharing your experiences.

Move it closer to your mouth.

Thank you very much for sharing your experiences. I have a question: could you share your comments on the V-model, the V development model, where you design the requirements and your tests?

So the question is...

Waterfall with a kink in it. Well, it's just waterfall.
It was just hard to draw all those lines going back, so they made a kink in the waterfall so that it was easy to draw the lines without them crossing each other. It's artistic. Anyway, yeah, the V-model.

Can you use the microphone, please?

So maybe you could elaborate on your comments on it. I'm guessing maybe you are not a big fan of it.

So he's curious if you like the V-model or waterfall or...

You know, I don't... What I try to teach my students, in the first class of my system engineering classes, is that I want them to look at all the different criteria that might affect the processes they choose to use in system engineering. We have a long list of them. It has to do with the types of systems, what they're going to be used for, who's going to use them, and so on. There is no one model. We want to simplify it down and say there's one thing we should use for everything. Agile works for some things. I'm not even against agile. It's just that you have to match your system engineering process to the problem you're trying to solve, and I don't think we teach people that. We fall in love with one model: it all has to be waterfall, it all has to be agile, it all has to be, you know... There are more than those: spiral, there are dozens of these. It depends. And I've seen some comparative studies where they compared a more iterative model with a waterfall model, with two different teams, and the users liked the more iterative one better, because it satisfied their needs. But the results weren't as easy to maintain. They were much less reliable. You know, there are different properties. We've got to stop being naive, thinking there's one model that's going to be perfect for everything. We've got to teach people that. I don't think we do enough to match the model with the properties of the problem
we're trying to solve.

But I think, from my experience, you have a couple of things. One is matching the model with the problem you're trying to solve. The second is that people think the model is going to solve everything for them: oh, we just follow these steps and everything will just magically happen. The rigor, the discipline, all of those things: when they don't employ those, it doesn't matter what model you use. You're going to end up with a disaster. And that's why, from the old Software Engineering Institute, you know, most software engineering programs, whether they were safe or not, were 250 percent over budget. They never came in on time. And then what happens is, because you're over budget, management comes in and says, okay, kill all the testing, kill all the model checking, kill all those kinds of things, and go right to coding software. And it's literally something out of a Dilbert cartoon: just start writing code, because that's what we want to see. We don't care about all the other stuff, just write code. I could sit and tell you stories of how we changed an entire software project because my CEO got invited to the Nagano Olympics; we had to change all the dates just for that. So you have all this arbitrary nonsense that kills the discipline, kills the rigor of the project, and that was the biggest issue I ever saw. The process can be followed, but if you don't do it correctly, if you don't care, it doesn't matter; none of that stuff will solve any of your problems. It'll just make it worse, because you spend a lot of time documenting stuff you're not going to follow.

So I think we have time for one more question.

Thank you, everybody, for the insightful discussion. One question I had for everybody on the panel: so far most of the examples you've given are somewhat physical systems, like airplanes and torpedoes and so forth.
Can you give me an example from your career consisting of a more abstract system, like a software-only system? Because I believe it would be kind of different.

So are you looking for an example of a failure, or an example of applying these methods to a failure?

Yeah.

Okay. Given the time constraints, maybe one or two of you could comment on a software-only or an IT systems failure that you thought was particularly spectacular and maybe could be traced to not thinking about the whole system.

Well, first of all, I don't like the words "software failure." Software doesn't fail the way physical systems fail. Software is an abstraction. It's a pure abstraction. It has no physical reality. How does an abstraction fail? I mean, it doesn't stop working. It doesn't break. It's just an abstraction that isn't useful for the problem you want to use it on. So, you know, I got distracted by the word "failure" because it drives me crazy. Also "human failure" drives me crazy. I mean, when your heart stops, you fail. Otherwise the human is trying to do the right thing, and in a different situation it probably would be the right thing, and they didn't have the information they needed to know what the right thing was. Did they fail? They were doing the best they could with what they had. So what was the question?

So I'll repeat the question. Ryan looks ready, and then you can come back after he's...

Oh, good.

Yeah, I guess I would say that for major websites that everyone uses, like Google or Facebook, there have certainly been outages over the years, and in my personal experience they mostly have to do with fairly small, traditional software bugs. So things like an unsigned integer instead of a signed integer.
It will not go negative; instead it will wrap around. Things like bounds overflows, the kind of little bread-and-butter errors that we deal with in CS 101 programming assignments, are also the things that can cause major problems for major pieces of software at some level. So often it's not necessarily even some incredibly subtle interaction between 57 different components that brings something down. It can also be the little stuff. Which is why maybe one of the rules that we should follow is to not mess up the little stuff and to use the best available technology. For example, use Rust, don't use C++. Don't have bounds overflows, by design. Know which objects can be shared between threads and which types are signed and unsigned, and require extra checks, and have that be enforced, so you don't mess up the little stuff.

So Nancy, maybe to close this out and to tie this back together: for teams that are experiencing...

You know, this is the problem: I'm losing my hearing as I'm getting old, and microphones are the worst, because they distort sound. So I apologize to everyone. I'm trying.

So Ryan just mentioned that major companies, Google and Microsoft and Facebook, have outages that cost them on the order of millions of dollars and that can be traced, as a sort of root cause, to an integer overflow. How could we use your methods to not have an integer overflow cause millions of dollars of losses at these companies? What would you suggest that they change, in terms of constraints they can define or processes they should pick up? What do you think?

Google's using our stuff, by the way. They love it.
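
As an illustration of the unsigned-integer wraparound mentioned above, here is a minimal Python sketch. It assumes a 32-bit unsigned counter and uses `ctypes.c_uint32` to imitate the arithmetic of a language such as C or C++; the function names `remaining_unsigned` and `remaining_checked` are hypothetical and do not come from any system discussed by the panel.

```python
import ctypes

U32_MAX = 0xFFFFFFFF  # largest value a 32-bit unsigned integer can hold


def remaining_unsigned(capacity: int, used: int) -> int:
    """Imitate a 32-bit unsigned subtraction (hypothetical helper).

    ctypes.c_uint32 performs no overflow checking: it keeps only the
    low 32 bits, so a negative result wraps around to a huge value.
    """
    return ctypes.c_uint32(capacity - used).value


def remaining_checked(capacity: int, used: int) -> int:
    """Same subtraction, but refuse results that do not fit in 32 bits."""
    result = capacity - used
    if not 0 <= result <= U32_MAX:
        raise OverflowError(
            f"{capacity} - {used} does not fit in an unsigned 32-bit integer"
        )
    return result


# 3 - 5 does not go negative; it wraps around to 2**32 - 2.
wrapped = remaining_unsigned(3, 5)
```

The checked variant is one form the "extra checks" could take: the bad value is rejected where it is produced instead of flowing on as a very large number. Rust, for comparison, offers `checked_sub` on its unsigned integer types for the same purpose.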
This isn't component engineering, it's system engineering. And yeah, I'm sure there are problems where we have stack overflow and all these things. We've always had them. We have a lot of tools to get rid of them. Most of the time when we have those, people didn't use the tools, right? I mean, I don't think this is necessarily, you know, impossible to deal with nowadays. But it's not the kind of thing that's causing real problems. I mean, it causes them a problem because their customers don't like having outages, but it's not causing accidents. In the real systems, they spend so much time on them that it's not coding errors that are the problem. Even when there are coding errors, they don't seem to be causing the big problems. They looked at all the spacecraft software for years at JPL, and they found there were requirements problems and there were coding problems, but the coding problems just didn't cause the loss of the spacecraft.

I think that brings us to a great stopping point. I mean, we probably could... So let's thank the panel again. Thank you, Dr. Xu and Dr. Newton and Dr. Matson, and especially Dr. Leveson, for joining us today. I hope you all learned a lot about safety engineering and assuring holistic properties, and enjoyed this dialogue about software engineering and systems engineering methods and some of the different application domains. So thanks again for coming.
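
The idea raised earlier in the panel, of mining operation logs for constraints that a human expert then reviews, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the sample log, and the restriction to simple `a <= b` relations are all hypothetical, and real invariant-mining tools consider far richer candidate relations.

```python
from itertools import permutations


def mine_invariants(records):
    """Return every ordered pair (a, b) with record[a] <= record[b] in all records.

    These are only candidates: they held in the observed log and still
    need review by a human expert before being enforced.
    """
    if not records:
        return []
    fields = sorted(records[0])
    return [
        (a, b)
        for a, b in permutations(fields, 2)
        if all(r[a] <= r[b] for r in records)
    ]


def violations(record, invariants):
    """List the mined relations that a new record breaks."""
    return [(a, b) for a, b in invariants if not record[a] <= record[b]]


# Hypothetical log collected while the system ran in a clean, safe environment.
normal_log = [
    {"commanded_speed": 10, "measured_speed": 9, "speed_limit": 30},
    {"commanded_speed": 20, "measured_speed": 19, "speed_limit": 30},
    {"commanded_speed": 25, "measured_speed": 25, "speed_limit": 30},
]

invariants = mine_invariants(normal_log)

# A later record that exceeds the limit breaks two of the mined relations.
suspect = {"commanded_speed": 40, "measured_speed": 38, "speed_limit": 30}
broken = violations(suspect, invariants)
```

Because the relations are inferred from a finite log, a relation that happens to hold in the sample may not be a real requirement, which is why the expert review step described in the discussion matters.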