For this example, we're going to be working with this set of data. Our instruction fetch stage takes 200 picoseconds to complete, instruction decode takes 125 picoseconds, execute takes 250 picoseconds, memory takes 225 picoseconds, and writeback takes 150 picoseconds.

The first thing we probably want to do is find out how long our clock cycles are. To get the clock cycle time, we just look at all five stages and see which one is the largest. In this case, that's the execute stage, so our clock cycle time is 250 picoseconds. Then to get the latency for one instruction, we take that clock cycle time, 250 picoseconds, and multiply by five, since the instruction passes through all five stages. That gives us 1250 picoseconds as the latency for one instruction.

Now let's change the numbers up a little bit: instruction fetch is 150 picoseconds, instruction decode is 75 picoseconds, and so on. This time, the clock cycle time is set by the memory stage, because its 225 picoseconds is larger than any of the other stages. So the clock cycle time would be 225 picoseconds, and the latency is 225 picoseconds times five, or 1125 picoseconds.

If we wanted to improve one of these processors, we'd focus on that most expensive stage: the memory stage in the second example, the execute stage in the first. The most expensive stage is the one slowing the processor down overall, because all of the other stages have some amount of slack in them, some space at the end of the clock cycle that isn't being taken advantage of. In the first example, even the memory stage leaves 25 picoseconds of slack at the end of the 250-picosecond clock cycle, so improving the execute stage would let us shorten the cycle.
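The arithmetic above can be sketched in a few lines; this is a minimal illustration, not part of the original lecture, and the stage names and dictionary layout are my own choices:

```python
# Stage latencies in picoseconds for the first example.
stages = {"IF": 200, "ID": 125, "EX": 250, "MEM": 225, "WB": 150}

def cycle_time(stage_times):
    # The clock cycle must be long enough for the slowest stage.
    return max(stage_times.values())

def instruction_latency(stage_times):
    # One instruction occupies each of the stages for one full cycle.
    return cycle_time(stage_times) * len(stage_times)

print(cycle_time(stages))          # 250 ps
print(instruction_latency(stages)) # 1250 ps
```

The second example works the same way: only fetch (150 ps), decode (75 ps), and memory (225 ps) are listed, but since memory is the largest, the clock cycle is 225 ps and the latency is 225 × 5 = 1125 ps.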
Then if we sped up the execute stage, the memory stage would become the bottleneck, and our clock cycle time would be set by that stage instead. In the second example, if we sped up the memory stage, our execute stage would be the next slowest, so that one would become the bottleneck instead.
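This bottleneck shift can be demonstrated with the first example's numbers; the improved execute time of 150 ps below is a hypothetical value I chose for illustration, not one given in the lecture:

```python
# Stage latencies in picoseconds for the first example.
stages = {"IF": 200, "ID": 125, "EX": 250, "MEM": 225, "WB": 150}

def bottleneck(stage_times):
    # The stage with the largest latency dictates the clock cycle.
    return max(stage_times, key=stage_times.get)

print(bottleneck(stages))  # EX

# Slack: unused time each stage leaves at the end of a clock cycle.
clock = max(stages.values())
slack = {name: clock - t for name, t in stages.items()}
print(slack["MEM"])  # 25 ps of slack even for the second-slowest stage

# Speed up the execute stage (hypothetical improvement to 150 ps):
# memory now sets the clock, and the cycle drops to 225 ps.
stages["EX"] = 150
print(bottleneck(stages), max(stages.values()))  # MEM 225
```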