Cool, so we've started. Okay, hey, my name's Joe Winchester. Thanks so much for dialing into this talk. I've got about 25 minutes, and this is an interesting talk. I gave a talk yesterday. So I work for IBM. That's my email address there, winchesteruk.ibm.com. I work on open source software for International Business Machines. My Twitter handle is there too, and if you look at the bottom right-hand corner, it talks about what I enjoy doing outside work, you know, my family and so forth. But it says on there I'm a coffee nerd. I like coffee. I always like to find good coffee when I go to a town. It also says I'm an amateur astronomer. So this talk really marries two of my passions together. I've always enjoyed software and I've always enjoyed astronomy and space. I've given this talk a couple of times at local astronomy clubs, where I tend to go and geek out. I've not actually given it at a conference before. So bear with me, and give feedback if it's not what you're expecting or if you think improvements can be made the next time I give it. So, a little bit of background. The talks that I like to listen to the most are ones that tell a story or give me something that's didactic, something I can learn from the talk even if it's not about my area of expertise. So what I decided to do for this talk is focus on four particular space missions. They're all different, and all of them have a story to tell about computer software in space, okay? The first space mission I'm gonna cover is the Ariane 5 rocket. So, background to Ariane 5. In 1957, and there's a picture of planet Earth in the bottom left-hand corner there, Sputnik was launched. Sputnik was launched by the USSR, the Soviet Union at the time. Sputnik 1 was the first satellite in space.
And if you look at the right-hand picture on that chart, we've now got over 3,000 satellites in space around Earth, plus around 8,000 pieces of junk, which are non-operational satellites. A couple of satellites recently were actually blown up, perhaps as a precursor of space warfare or things like that, and there have been some failures in space as well. So there are about 11,000 bits of stuff orbiting the Earth, and at least 3,000 were put there by mankind. So, Ariane 5. There was a big race to get satellites into space. They're hugely important for telecommunication systems, global positioning systems, spy satellites, and telescopes. The Hubble telescope is a very famous telescope in a particular orbit. So there was this huge race to have the best vehicle to get satellites into space. Most of us are very familiar with the space shuttle, the absolutely wonderful spacecraft built by NASA. The space shuttle was operational from about 1981 to 2011 and it had 135 missions in space. ESA is the European Space Agency, and around the same time, towards the end of the last century, the European Space Agency also wanted to get its own kit into space and contribute to that vast number of satellites. And there was a series of rockets called Ariane, which is a Greek goddess, a mythical goddess, and they were numbered, so there was Ariane 3, Ariane 4 and Ariane 5. As each rocket was created, it made use of the advances in hardware that were coming at the time. Bigger rockets could get bigger payloads into space, and they could burn their engines more efficiently, so they were more fuel-efficient.
It's just the same way that a modern car now gets much better MPG and is much more efficient and aerodynamic than perhaps the same car 20 years ago or 40 years ago. So you've got that same progression between the Ariane 3, the Ariane 4 and the Ariane 5. I'm just going to read a few notes here. By comparison, with the Ariane 4 rocket it depends on how high you want to go. You might want to go into what's called a low Earth orbit, which is where you get telecommunications satellites. Further out you get geostationary orbits, where the satellite basically stays over the same line of longitude as the Earth turns. And there's another orbit about halfway between the two, where a satellite goes around the Earth twice a day, which is where GPS satellites are. With Ariane 4 you're basically going to get a payload of about 7,500 kilograms up to a low Earth orbit, or perhaps four and a half thousand up to the middle orbit, the GPS orbit, and the geostationary orbit is higher still. So that was the Ariane 4. The Ariane 5 came along and you basically got four times the payload. It has these huge boosters on the side, very much modeled on the space shuttle, where you take extra rocket boosters and give yourself a huge lift to get to escape velocity and then get into those further orbits. Ariane 5 was a decade, 10 years, of development, and it cost seven billion US dollars. This is very expensive. That's in the money of the time, so if you project that to now, it's obviously going to be more. Maiden launch, June the 4th, 1996. You've got $7 billion worth of kit sitting on the launch pad. And it was putting four satellites into space on its maiden launch. And I've added the word uninsured: uninsured satellites. So for those of you that haven't seen what happened, you can go to YouTube afterwards, there are some phenomenal videos of it. It blew up.
The rocket exploded, $7 billion worth of hardware. A decade of development just blew up. Great big firework. Nobody was hurt. It was an unmanned rocket. So afterwards they looked at it forensically and asked, what happened? Why did this happen? A little bit of background. Everybody here works in software, so many people are familiar with how numbers are stored in a computer. Most of it is binary. Binary is effectively a transistor. It's either on or off, charged or uncharged. So the number 10, for example, in binary is 1010. On an 8-bit computer, with all the ones lined up, you get a maximum of 255. 16-bit gives you a maximum of about 65,000, or 64K, as it's often shortened to. With 64-bit numbers it becomes much, much higher, and so forth. So as you go from 8-bit computing to 16-bit computing to 64-bit computing, the numbers that you can store just get larger. You can do more computation and you have more ability to store higher numbers. And that's the progression that was occurring around the same point in time. It's important to understand how numbers are stored. Now, what was unfortunate about the Ariane 5 rocket is that it blew up. It's an absolute tragedy. There are about four parts to it. It's a perfect storm of things that went wrong. The first thing is that the mission failed because, 39 seconds after launch, it self-destructed. It self-destructed because it realized, and perhaps you can see it if I go back a slide, as I can, it's wonderful software. So if we look at the pictures here, it's going up straight, and then it's veering off course, and then it blows up. It's got enough sensors on board to realize it's veering off course and blow up. It did it on purpose. It self-destructed on purpose because it thought, we're off course. This is really bad.
We might hit a populated area, abort, abort, abort, boom. And it self-destructed. It set off some explosive charges. The reason it went off course is basically that there was a 16-bit number receiving telemetry information. On Ariane 4, the piece of software that was transmitting the telemetry information was giving a number that couldn't get higher than 16 bits, okay? On Ariane 5, which was a faster rocket, the number was larger, okay? The number was to do with the velocity, and because the rocket was going faster, it got bigger than it had been on Ariane 4, and it went into a storage area that was only 16 bits. It overflowed, nothing handled the error, it all went pear-shaped and the whole thing blew up from there. Now, most rockets, like aircraft, have multiple systems. It went to a second system and said, this has gone really badly wrong, and the second system confirmed it, because the second system was fed the same incorrect data. So $7 billion worth just exploded. Now here's the awful thing. When this software was written for Ariane 3, the exception handling was removed, because the developers knew that the exception could never occur. The software was ported to Ariane 4, where luckily it didn't occur, and then when it went to Ariane 5, it most definitely did occur. Not only that, just to make matters even worse for the poor people when they were having that management appraisal as to why they blew up $7 billion worth of rocket: 39 seconds into the launch, when it blew up, the software wasn't even needed, because it was measuring the sideways velocity of a rocket on the launch pad. It's off the launch pad now. It's flying. So it didn't even need to be switched on. But they had decided, well, let's leave it running anyway until 40 seconds after launch, just in case, because if the launch sequence has to be aborted and then restarted, it's an easier reboot sequence.
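As a minimal sketch of the arithmetic described above, here is what happens when a value outgrows a 16-bit slot. This is illustrative Python, not the flight code. It assumes the slot is a signed 16-bit integer (the talk only says "16-bit"), and the function names and sample velocity figures are hypothetical.

```python
# Range of a signed 16-bit integer (assumption: the talk only says "16-bit").
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert to a signed 16-bit integer, raising if the value does not fit."""
    n = int(value)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return n


def to_int16_wrapped(value: float) -> int:
    """Show what silent two's-complement wraparound would store instead."""
    n = int(value) & 0xFFFF          # keep only the low 16 bits
    return n - 0x10000 if n >= 0x8000 else n


def to_int16_saturated(value: float) -> int:
    """Clamp to the representable range, one possible way to handle the error."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


if __name__ == "__main__":
    slow_rocket = 20_000.0   # hypothetical reading that fits in 16 bits
    fast_rocket = 40_000.0   # hypothetical reading from a faster rocket

    print(to_int16_checked(slow_rocket))
    print(to_int16_wrapped(fast_rocket))
    print(to_int16_saturated(fast_rocket))
    try:
        to_int16_checked(fast_rocket)
    except OverflowError as err:
        print("unhandled on the real rocket:", err)
```

The point of the three variants is that a range check only helps if something catches the error it raises. Saturating or wrapping are alternative policies, each with its own risks, and the right one depends on what the downstream code does with the number.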
So I was saying, what can we learn from this? Number one, understand the arithmetic. It's really important in computing to understand the arithmetic. If you Google, and I did a bit of research before this, there are many, many other examples where people not knowing how computers store numbers has caused terrible things. There's one example I've got here, which is the Patriot missile. It's a defensive missile that the US Army and other armies have, and there are examples where it has just completely misfired. There was an example of a calculation error in Microsoft Excel that people took for granted. These are both fabulous bits of hardware and software. I'm not criticizing them. I'm just saying the fundamental error is, and I see this often with developers, that if you don't understand how numbers are stored, if you don't understand how big things can get, how big arrays can get, and you get array overflow indicators, you can cause catastrophic failures. So understand how things count. Next, reusing software. If you're writing a new system and you take software that was written for an older system and bring it across, you'll find that the constraints have been removed. Things have got bigger, the world's got faster, bandwidth has got bigger. The constraints that the developers had in the older software will no longer be relevant. You will see things getting stressed and strained at new levels. As you saw with that software that was written for Ariane 3, where they took out the exception handling code: they could have put it back in for Ariane 4, could have put it back in for Ariane 5, and they simply didn't think it would go wrong. They said, well, it's proven software, we're just gonna migrate it and run it. The next one, understand your daemons. You get that a lot with software. I see a lot of performance problems where things run slowly, and it's because you've got heartbeats and keep-alives and things left running for longer than they need to be.
Dynamically scanning the disk for plugins that were only needed at activation or startup, say. And that's a little bit of the Ariane 5 analogy. It didn't need to be running the software after launch. It had already launched, the software was only required until it launched, and it blew the thing up. Catastrophic. The other thing with software: don't give up. Ariane 5 actually went on to have about 108 successful launches. So even when you've managed to explode $7 billion worth of rocket, don't give up. Okay. So that's my first one. The next story I'd like to tell you is about Saturn. Saturn's a really interesting planet. It's the second largest in our solar system. It's the sixth from the sun. And it's a very large planet, a very massive planet. The radius is about nine times the radius of Earth. It's a gas giant. It's mostly a helium and hydrogen atmosphere with an iron-nickel core. It has 62 named moons. There's going to be about 20 more. Now, one of them is called Titan. Titan is really interesting because Titan is actually bigger than Mercury, and Mercury is one of our planets. So it's a large celestial body, and it's very interesting to study. Saturn is an interesting planet because it actually emits about two and a half times more thermal radiation than it absorbs from the sun. It's actually a generator of heat. So the relationship that Titan has with Saturn is almost like a mini version of the one that planets like Earth and Mercury and Mars and Venus have with the sun, in that they receive heat from it. And Titan actually has a methane atmosphere. It's a very interesting moon because people have been pointing telescopes at Titan for a long time and you just can't see inside it. It's a cloudy methane atmosphere. So Cassini-Huygens is a really interesting probe that was created. It was launched in 1997. And in the picture I'm looking at here, you've basically got a very big ship. It's actually nuclear powered.
It's one of the first ships like that in space that NASA created, and it was basically powered by plutonium-238. It's a seven-year mission. It did about two slingshots of Venus and one of Earth to get acceleration. Each time it comes around a planet and doesn't quite hit it, it picks up velocity, it gets a little boost, and it kicks on. After seven years it's flying out there, and Cassini had a phenomenal mission exploring Saturn and its moons. Now, for when it gets to Titan, the European Space Agency have a wonderful thing piggybacking on Cassini, which is called the Huygens probe. Huygens was a very clever Dutch mathematician who actually discovered the rings of Saturn. The idea is that as it approaches Titan, it gets inside the atmosphere. These are the costs, and this is a lot of money. You don't want to get it wrong. They basically drop it. If I hit the next slide, there's a nice picture of it here. The idea is you drop this thing, it's unpowered, it falls through the atmosphere, gets under the cloudy layer, and opens a parachute. All sorts of sensors and probes come on, looking, listening, you know, measuring radiation. Finally it gets to the bottom and takes a bunch of snapshots. Now, the Huygens probe doesn't have the power to transmit back to Earth, because you're seven years' travel away, more than a light-hour from Earth. But this big giant Cassini craft is still in space, so that's basically your relay station. Huygens, as it's going down, transmits to Cassini. Cassini then sends it on. It has a two and a half meter antenna. Let me see if I can go back to the picture of it. Here we go. You can see it's got two and a half meters. It's large. I don't know what that is in feet, something like eight feet. It's very large, and that's basically what's talking back to Earth, storing the data and sending it back to Earth.
Okay, so you've got two transmissions. You've got Huygens dropping, which needs to talk to Cassini, and then Cassini needs to talk back to Earth. If either goes wrong, you're basically losing a lot of money. Now, I work with a lot of testers. I've worked with a lot of testers in my time. And I know this is a tester. This person, I'm gonna kiss him on my screen and give him a little bit of a hug. Boris Smeds, he was a test engineer. Now, good testing means that you think of things that the engineers haven't thought of. You use software in a way that nobody predicted your end users would. Don't test for the predictable. You know, look at how this thing is gonna break in unpredictable ways. So he was bothered about the fact that they had never run a test where you drop a probe into an atmosphere, it talks back to something else that's orbiting, and that then talks back to Earth. He said, we've never tested it. It's the first time it's ever been done. I want to test it. $5,000 was all it cost to create that test. Google for this if you want, there are a lot of stories here about managers not really trusting testers, and engineers being dismissive of testers, but this story is about testing software. Now, the fundamental problem. I'll give a little background. Most people here will know about Doppler shift. You know, something comes towards you, it goes dee, dee, dee, dee, and the pitch gets higher. That's because as something emitting a wave comes towards the receiver, the wave gets compressed. And when a wave gets compressed, the frequency goes up, and for sound, you know, the pitch gets higher. For light, you get red shift and blue shift. And then on the far side, as it moves away, the sound goes down. You get the stretching, the longer wave, the lower frequency.
Now, everybody is aware of the frequency change, because you can see red- and blue-shifted light if you're looking at stars and things, and you can hear a police or ambulance siren or a motorbike coming towards you and the noise being compressed. But if the speed of the wave is constant and you change the frequency, you're also changing the wavelength. This was the Achilles heel of the mission. As you're dropping your probe, the separation speed between Cassini and Huygens is about six kilometers a second. Okay, so that's quite high. You're looking at well over ten times the speed of sound. Your frequency is obviously affected by the Doppler shift, but so is the wavelength. And what Smeds realized, and tested for, and proved, comes down to this: a digital signal is basically ones and zeros, it's peaks and troughs, right? So the way you decode any digital signal is you have your peaks and troughs for your ones and zeros. You line them up against a base signal that's sort of one, one, one, one, one, one, one, one. And at each tick you just ask, am I high? Am I on, am I off? Am I on, am I off? And then you have your binary number and you've managed to decode your digital signal. This is fundamentally how digital data is decoded. The problem, if you think about it, is that it's a bit like two cogs meshed together and turning and saying, am I on or am I off? If you take a blowtorch and heat one of them up so that it expands, so the wavelength gets bigger, your mesh all goes wrong. All of your alignment is gonna go out. That was the fundamental problem they had, and they realized it about four years into the seven-year mission. They knew that the frequency was gonna change, but they didn't realize that the wavelength, the length of each bit, was gonna change. So what would you do? Four years into a mission. Easy, you just patch your software. You know, you've got three years left. You would just go, here's my fix.
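The cog analogy above can be put into rough numbers. The sketch below is an illustration only, under loud assumptions: a first-order (non-relativistic) Doppler factor, a receiver that samples at a fixed nominal bit rate with no tracking at all, and a hypothetical nominal bit rate. The talk gives only the separation speed of about six kilometers a second. None of this describes the actual Huygens receiver.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def doppler_fraction(separation_speed_m_s: float) -> float:
    """First-order fractional shift v/c that applies to every rate in the signal."""
    return separation_speed_m_s / SPEED_OF_LIGHT_M_S


def received_rate(nominal_rate: float, separation_speed_m_s: float) -> float:
    """Rate seen by the receiver; moving apart stretches the signal and lowers it."""
    return nominal_rate * (1.0 - doppler_fraction(separation_speed_m_s))


def bits_until_slip(separation_speed_m_s: float, tolerance_bits: float = 0.5) -> float:
    """Bit periods before a fixed-rate sampler drifts by `tolerance_bits`.

    Assumes the receiver never re-synchronises, which is a deliberate
    simplification to show why a tiny shift still matters.
    """
    fraction = abs(doppler_fraction(separation_speed_m_s))
    return math.inf if fraction == 0.0 else tolerance_bits / fraction


if __name__ == "__main__":
    speed = 6_000.0            # m/s, the figure quoted in the talk
    nominal_bit_rate = 8_192.0  # bits per second, hypothetical

    print(f"fractional shift: {doppler_fraction(speed):.2e}")
    print(f"received bit rate: {received_rate(nominal_bit_rate, speed):.3f} bit/s")
    print(f"half-bit slip after about {bits_until_slip(speed):,.0f} bits")
```

With these assumptions the shift is only about two parts in a hundred thousand, yet an untracked sampler would be half a bit out of step after roughly 25,000 bits. That is the cogs slowly un-meshing, and it is why changing the separation speed changes the outcome.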
This was unpatchable software. They had no chance of patching the software. So the way they had to fix this was to physically change the separation velocity. They had to release the Huygens probe weeks away from when they had planned. They had to decelerate Cassini. They had to do a double flyby to decelerate this huge ship. And then they basically had to make it so that the separation velocity was different. And even then, only 80% of the signal was getting through. It was actually very successful. We've got some nice pictures here of the Huygens probe descending onto Titan and landing. So what have we learned from this? Single point of failure. If you can avoid having a single point of failure, which they had in this case, do. They could have had another way to transmit data. Next, the inability to deliver fixes in flight. I've seen software go out that can't be patched, and I think it's appalling, right? If you buy something with hardware, right now we're used to patching our phones and the internet of things. It's really important to be able to patch software. Software will break, software will get stressed. Next, lack of testing. This was the first time ever in space that this relay method was used, and there had been a lack of testing. And the other one, and I won't talk about the management stuff here, you can Google for it, but if there are any managers on the call: listen to your testers. Trust them, listen to your testers. And don't just test the stuff where the developer says, here's the manual, I built it, and you almost want to trust that it's gonna work. Go look for the stress cases and the edge conditions where it's unlikely to work, because that's where you'll find vulnerabilities. And that's possibly also where you'll find the openings for hackers to come in and do malicious things to your software. And the third one, which I loved: the best way to solve a problem is to avoid it.
If you're flying too fast through space and the Doppler shift is against you, the laws of physics are up against you, slow yourself down. Let's do a couple of extra fly-bys. It's the most obvious change. The next one. I've just got a few minutes left, so I'm gonna hurry up. Mars is a wonderful planet. Mars is our closest, most Earth-like planet. It possibly had an atmosphere, possibly had water, possibly, who knows, had some sort of life forms on it. In the early 1990s there was a quite awful mission to Mars that got lost, and NASA went back and said, we're gonna do low-cost missions. So in 1998 there was the Mars Climate Orbiter and the Mars Polar Lander. They were both launched to go to Mars. Okay, your climate orbiter orbits Mars. It's a satellite, it's obviously looking down, measuring, but it's also talking to your polar lander, which is sitting on the pole and driving around. This is what's meant to happen. Both missions went wrong. The Mars orbiter, unfortunately, came in on a trajectory that was too low, and it basically just bounced off the atmosphere and got lost. And what's awful about this is that this is a $190 million mission. It's the most basic error.
It's measuring where it is based upon, effectively, the density of the Mars atmosphere. But there was a unit that NASA built and a unit that Lockheed Martin built, and one of them was delivering data in imperial units, pounds, and the other was working in metric units, newtons, and you've got a difference of a factor of about four and a half. It's the most fundamental thing, but it comes back to knowing how you count. If you look at a number, know what units that number is in. That was catastrophic. And then the second problem. Because the orbiter wasn't there when the polar lander arrived, the polar lander was unable to rely on telemetry from the orbiter to work out its height. As it was descending through the atmosphere of Mars, it had a parachute, and you don't want to land and then have your parachute fall on top of you. That's going to cover all of your instruments, and you're not going to be able to take wonderful photos and stuff like that. So it needed to know when to jettison the parachute, and it has these legs down beneath it that try to work out when it touches the surface. If the orbiter had been there, it would have known exactly where it was, because it would have been communicating with it. But the orbiter had disappeared about 25 days earlier. So when it came down, it had to rely on, basically, feelers, and the feelers unfortunately got it wrong, because there was actually a storm at the time that made it think it had landed before it had. So it jettisoned the parachute and then it fell the last 40 meters, and this is what it looks like. It's been photographed since. It's basically just in bits on the surface of Mars. So what can we learn from that?
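To put a number on the unit mix-up described above, here is a minimal sketch of the "factor of about four and a half". It assumes the quantity is a force-like value, so the factor is pound-force to newtons; the talk names the units only loosely. The `Force` type and its method names are hypothetical and are not taken from either team's software.

```python
from dataclasses import dataclass

# One pound-force in newtons (0.45359237 kg times standard gravity 9.80665 m/s^2).
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Force:
    """A force stored in one agreed unit, so the unit travels with the number."""

    newtons: float

    @classmethod
    def from_newtons(cls, value: float) -> "Force":
        return cls(value)

    @classmethod
    def from_pound_force(cls, value: float) -> "Force":
        return cls(value * LBF_TO_NEWTON)


if __name__ == "__main__":
    reported = 100.0  # hypothetical bare number passed between two systems

    intended = Force.from_pound_force(reported)  # what the sender meant
    misread = Force.from_newtons(reported)       # what the receiver assumed

    print(f"sender meant   {intended.newtons:.1f} N")
    print(f"receiver used  {misread.newtons:.1f} N")
    print(f"error factor   {intended.newtons / misread.newtons:.2f}")
```

The design choice being illustrated is that a bare float carries no unit, so every interface has to agree by convention. Wrapping the value forces the caller to say which unit it is handing over at the point of construction.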
Understand your units. Read your specifications. Don't rely on a single source of failure. Do design playbacks, peer review, talk to a brick wall. Do risk assessments, do worst-case planning. Anticipate, test, simulate. And my little punchline at the end: if ever anybody tells you that they won't test their software, or that they want to put more features into their software rather than test what they've already got, as far as I'm concerned, in space no one can hear your excuses. Make sure what you've got works and is proven, and think about the edge conditions. Most of us, and I'm certainly lucky in this, can patch the software we write if it breaks. We don't lose millions and billions of dollars in space. But that's what we should do. Okay, cool. I am just going to wrap up there. I've got a couple of other slides. Yes, they are available for download. I don't know whether there's a Slack channel, or else just Google for it. And yeah, thanks everybody very much for being part of what was my first attempt, at a conference anyway, to talk about my passions, software and space. Enjoy the rest of your day. Stay safe, everyone. Bye.