Okay, I was right on time, huh? Let's get started. So today we'll move to MIPS assembly and MIPS programming. I just wanted to give you an overview of where we are in the lectures before I let Dr. Ranganathan lecture on MIPS assembly and MIPS programming. We've actually gone pretty fast; we've covered a lot. Are you guys learning a lot? Yes, everything good? It's exciting? Ooh, okay, good, excellent. I like that. Keep that attitude, because there will be more to come. Some of the videos are online. Do you like the videos online? Is it good to have them? Okay, you go back and study. Cool. So going forward, what we're going to do is this. In the next two weeks, Anjan, Dr. Ranganathan, will lecture, together with Dar-Yuan tomorrow. We'll cover two things. One is MIPS assembly and MIPS programming, which will be really important for the labs that you're going to do. By the way, before I start, do you guys know what MIPS stands for? MIPS is an acronym, M-I-P-S. Anybody? In computer architecture it actually stands for two things. Can anybody guess? No? One thing it stands for is millions of instructions per second, which is a performance metric, and people used this metric to evaluate computers. Today it's not a great performance metric, because today most computers actually execute billions of instructions per second, so you'd need to add another factor of a thousand to it. We'll talk about performance metrics maybe later in this course, and definitely if you take the computer architecture course I'm teaching next semester. But that's not this MIPS. This MIPS is something different. This is the second meaning: it's an instruction set architecture, an ISA, of which there are many.
Basically, as we discussed earlier, it's the interface between the software and the hardware, right? It's what the software designer assumes the hardware will provide, and what the hardware promises to provide when you write an assembly program. An add instruction, how is it specified, for example. That's the instruction set architecture. Anjan will talk about the assembly and the programming aspects of it, and you know that you're working toward designing a simple MIPS processor. So the next two lectures will be about that. But the acronym itself is actually a weird acronym, which will become clearer when we talk about things later in the lectures. It stands for Microprocessor without Interlocking Pipeline Stages. Does that mean anything? Microprocessor hopefully means something to you. Without interlocking pipeline stages, well, it's a bunch of words. You don't know about pipelining yet. We'll talk about pipelining; that's actually what's coming down the road. Pipelining is a technique employed in microprocessors today to increase performance. The idea is that instead of executing only one instruction at a time, you pipeline the execution of instructions. It's an assembly line of instructions. You bring an instruction into the processor, then you move it to the next stage, and then you bring in another instruction while the first one is being processed in the next stage. You keep moving every instruction to the next stage, and you process many instructions at a time. That's a pipeline. So when people designed this MIPS architecture, they had this pipelined microarchitecture in mind, and they wanted to make it high performance. But what about "without interlocking pipeline stages"? You now kind of have an idea of a pipeline stage, right? It's an assembly line, basically. You do different things to different instructions at different stages in the pipeline.
So now you understand "microprocessor" and "pipeline stages." What about "without interlocking pipeline stages"? That becomes a little bit tricky; you need to know how a pipeline operates. Interlocking means you need to check the dependencies between instructions. Say you do an add, then you do a multiply, and the multiply is dependent on the add. Somehow you need to ensure that the multiply waits for the result of the add to be available before it executes, right? There are multiple ways of ensuring that that data dependency between the add and the multiply is obeyed. Make sense? One analogy we will give when we talk about pipelining is that you cannot dry your clothes before washing them. You could, I guess, but it doesn't make sense, right? You don't want to dry dirty clothes; your program is useless in that sense, if your program is washing clothes. So if you're pipelining these loads of clothes through the washing machine and then the dryer, you've got to make sure that a load is finished in the washing machine before you put it in the dryer. Otherwise you will have a kind of useless program, or you won't achieve your goal of washing the clothes. Interlocking ensures that that dependency is obeyed, and there are many ways of doing that. One way is doing it in hardware: the hardware has a mechanism that checks, oh, can I execute this instruction right now? That's called a hardware interlock. And when people designed the MIPS architecture, they said: we don't want to make the hardware complicated. We want to make the hardware as simple as possible and still get high performance out of it. So we want to make the software complicated, if you will. We want to design a compiler.
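To make the interlocking idea concrete, here's a small sketch in MIPS-style assembly (register names are chosen arbitrarily for illustration): the multiply reads the register the add writes, so it must not execute before the add's result is ready.

```asm
add $t0, $t1, $t2   # $t0 = $t1 + $t2
mul $t3, $t0, $t4   # $t3 = $t0 * $t4 -- reads $t0, so it depends on the add
```

(On the earliest MIPS you would write this multiply as `mult`/`mflo`; `mul` with three register operands is the later MIPS32 form.)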
We want to design an ISA that's kind of minimal, and the compiler re-orders the instructions such that you don't need this interlocking in the hardware. The hardware will not need to check these dependencies. Now, how is this done? You don't need to know at this moment; I just wanted to give you the acronym. But that's the idea behind the acronym, and that's the philosophy in these two lectures as well: simple hardware, and software takes on the hard job of making everything fast. Why is simple hardware good? Well, first of all, it's easy to design. Second, you can hopefully make it faster: higher frequency. This is the idea of reduced instruction set computers: simple instructions, as simple as possible. This is not the simplest you can go, by the way; it's one of the simpler architectures. Just to give you a little bit of history, this engendered a lot of research in compilers in the early 1980s. MIPS is one example; SPARC is another. Actually, the earliest example is from John Cocke of IBM; he's a Turing Award winner. What he envisioned was hardware that is as simple as possible, where you can control everything from the software. The ISA is at a very low level, and the compiler has all the control to optimize the instructions on top of the hardware. So the compiler is very complicated and the hardware is very simple, but that puts a lot of burden on the compiler to ensure that instructions are ordered such that the hardware doesn't become complex. That led to a lot of research in compilers, which we will not really cover in this course, but hopefully when Anjan goes through this, you have an idea of what these instructions mean at the higher level, right? How do you get high performance out of it? It's always good to think about that. Okay. So, to make a long story short, we'll cover the MIPS architecture in the first and second lectures of this week.
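As a hedged illustration of this philosophy (the lecture doesn't go into the mechanism, so take this as one concrete historical example): early MIPS had a load delay slot, meaning the instruction right after a load could not safely use the loaded value. Instead of a hardware interlock, the compiler re-orders an independent instruction into that slot; registers here are arbitrary.

```asm
# Before scheduling: the add needs $t0 one cycle too early on early MIPS
lw  $t0, 0($s0)     # load a word from memory into $t0
add $t2, $t0, $t1   # uses $t0 immediately -- would need a hardware interlock
sub $t5, $t3, $t4   # independent of the load

# After compiler scheduling: the independent sub fills the slot
lw  $t0, 0($s0)
sub $t5, $t3, $t4   # does useful work while the load completes
add $t2, $t0, $t1   # $t0 is ready by now; no hardware check needed
```

This is exactly the trade the acronym describes: the hardware stays simple because the compiler guarantees the ordering.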
And then next week, again, Anjan will start with the single-cycle architecture. Now we'll move into microarchitecture, the implementation of an architecture, starting with single-cycle architectures, which are not really employed today, but you've got to learn them to build up to more complicated microarchitectures going forward. So those two lectures will be about single-cycle architectures. Don't worry if you don't know what single-cycle means; the idea is basically that every instruction is executed in a single cycle. Okay. And then later on, Anjan and Serjan will take over and we'll talk about more complicated microarchitectures. We will go into multi-cycle microarchitectures, pipelining, which we just discussed briefly, and more complicated schemes like out-of-order execution and superscalar execution. Basically: how do you design processors that detect these dependencies and execute instructions in an order that's different from the one specified by the program? Which is essentially what almost all, actually I should say all, high-performance microarchitectures do today, including the ARM processors, which are supposed to be very energy efficient; they do very efficient out-of-order execution of instructions today. Then we will look into some alternative execution models, certainly superscalar execution: executing multiple instructions per cycle instead of a single instruction per cycle. How do you do that? And then we will look into the basics of GPUs a little bit, if we have time. GPUs are very different architectures; they execute a single instruction on multiple data. So we will talk about that, hopefully. Then we will move into the memory system. That's another exciting part, as you've hopefully seen earlier in the lectures; it's one of the major bottlenecks today. We'll talk about things like main memory.
How is it designed? What are the problems with it? Memory controllers: how do you control main memory? They are essentially becoming processors on their own today, going forward. And then we'll talk about how to bridge the gap between processor speeds and memory speeds. Because today, executing an instruction is quite fast; it takes like a cycle, for example, for an integer operation. But getting data out of memory takes something like 500 to 1,000 cycles. So how do you design a processor in the presence of this discrepancy between the latencies? Just to do an operation, do you wait a thousand cycles? That's one of the major bottlenecks, especially with applications that are very data intensive. So we'll talk about things like caching: instead of always going to memory to get the data, you have a hardware structure close to the processor that keeps frequently or recently used data nearby, in the anticipation that you're going to access it again. We'll talk about some of the design issues in caches, especially hardware design issues, and about the implications for software also. So that's the plan for the remaining lectures. Any questions before I leave it to Dr. Ranganathan? Does this sound good? You're all excited about this? Okay, excellent. Then I'll leave it to Anjan, and hopefully you'll learn a lot. Thanks, Onur. He pretty much did what I wanted to do in the first 15 minutes of the lecture. First of all, good afternoon. I failed to keep my promise of last week that you would not see me again; you will see me for three more lectures. So let's get started straight away. Till now, what we have seen is the principles of digital design. We started somewhere at the bottom, where we saw some basic electrical engineering. Then we moved up to see how you can implement basic logic gates using transistors.
Then we moved even further up and saw how to design combinational logic and sequential logic. Today, we are going to skip one level, the microarchitecture, which is the single-cycle architecture we will talk about next week, and go straight to the architecture. What is an architecture? Architecture is basically what the programmer sees and what the hardware promises to provide. It is a set of instructions that a programmer can write such that the system actually does whatever it is supposed to do. Next week, we will see microarchitecture, which is the implementation of this high-level architecture; you will see, for example, how a single-cycle architecture is implemented. For people who are interested, this lecture follows chapter 6 of the book. So, let's say that you are all going to China. What do you do first? Or say you go to my part of India, the south of India. What do you do? Not many people speak English. You try to learn some basic language, basic constructs, so that you can communicate with the people, right? No? This is pretty much what you have to do if you want to start commanding a computer. You have to learn the language that the computer can understand, and this is basically what we call a set of instructions. So what's an instruction? It's a word. For example, you can say that I want to add two numbers, and add is an instruction. You want to subtract two numbers; sub is an instruction. Every architecture has an instruction set: all the different instructions that can be executed on that particular architecture. In this lecture, we will see a lot of assembly language. Assembly language is basically a human-readable format of instructions.
So it's simple English. It's similar to a high-level language, but not really. You write the instructions, and this assembly language eventually gets executed by the computer. But computers don't understand simple English; they only understand zeros and ones. These zeros and ones are called machine language. We will see in a couple of lectures how assembly language instructions get converted into the machine language that finally gets executed on the computer itself. What we will see in the next couple of lectures is the MIPS architecture. There are several architectures: x86, which is extremely famous and implemented in the majority of desktop computers today; you have PowerPC; you have SPARC. Which architecture to teach or to learn has always been up for debate. Each architecture has its own strangeness; some of them implement oddities that only the developers of that particular architecture will understand. So there are pros and cons to any architecture. What we will use in this course is MIPS, and why? Because it's a small, clean architecture with a small number of oddities, and it's also very good for beginners who want to understand how computer systems work. It was initially developed by John Hennessy and colleagues at Stanford, and it has been used in a number of commercial products; millions of MIPS processors have been sold. One thing I would like to tell you is that once you have learned one architecture, say the MIPS architecture today, it is not a significant effort to understand how x86 or PowerPC or SPARC actually works. So let's get into it. Onur was actually talking about the philosophy of MIPS. Basically, MIPS has been designed based on these four philosophies, and we will go into each of them in detail. First: simplicity favors regularity.
We will see exactly what that means on the next slide. Second: make the common case fast. Instructions that are most commonly executed, let's get them executed extremely fast. Third: smaller is faster. You have a small number of registers and a small number of instructions in MIPS, so that you can execute instructions fast. And fourth: good design demands good compromises, which we will understand towards the end of this lecture. Any questions so far? All good? So let's start off with one of the basic instructions that a computer performs: addition. It's one of the most commonly executed instructions. On the left is the high-level code, for example C, and on the right is the MIPS assembly equivalent. In the high-level code, if you want to add b and c and store the result in the variable a, in MIPS assembly you write: add a, b, c. Here add, the operation that you want to perform, is called the mnemonic; there are several mnemonics in the vocabulary. And a, b, c are called operands: you perform the add operation on these variables. The way you read this particular MIPS assembly is: add b and c and store the value in a. So b and c are called the source operands, and a is the destination operand. Note that the first operand you mention in the instruction is the destination operand. It's very easy to misread this instruction as a plus b equals c, with a and b as the source operands and c as the destination operand. The destination operand always comes first, so keep this in mind. And subtraction is very simple again: if you want to do a equals b minus c, you simply use the mnemonic sub, and a, b, c works pretty much exactly like the add instruction.
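In the slide notation (a, b, and c are still symbolic variables here, not yet registers), the two instructions look like this:

```asm
add a, b, c   # a = b + c : add is the mnemonic; b, c are sources
sub a, b, c   # a = b - c : the destination operand a always comes first
```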
You have b and c as the source operands, which means that you compute b minus c, and then you store the value of b minus c in a; so a is the destination operand. This brings me to the first design philosophy: simplicity favors regularity. What this means is that there is a consistent instruction format. Whether it is add or subtract or any other instruction that we will see, or at least the majority of instructions that we will see, it will have pretty much the same format: a mnemonic, followed by the same number of operands, here two sources and one destination. This makes it much easier for us to implement; you will appreciate much more why such a regular, consistent instruction format helps in hardware design when we talk about the single-cycle architecture. Now, suppose you want to execute more complex code, for example a = b + c - d. Typically, what we do in MIPS is make use of temporary variables. You use a temporary variable t: first t = b + c, and then you follow this up with a sub instruction, a = t - d, to realize b + c - d. So if you want to implement a slightly more complex operation, you have to write multiple MIPS instructions. Why? Can't we just have one instruction that does all of this? Well, there are some architectures that actually do that. For example, Intel x86 has a specific instruction to do a string move. What is a string move? It's copying a bunch of characters from one portion of memory to another, and the x86 architecture has a specific instruction to do exactly this. If you want to implement a string move in MIPS assembly, you'll have to write a whole sequence of instructions. But this leads to design principle two, which is to always make the common case fast.
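Sticking to the same symbolic notation, a = b + c - d becomes two instructions chained through a temporary t:

```asm
add t, b, c   # t = b + c
sub a, t, d   # a = t - d, i.e. a = b + c - d
```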
What this means is that if you want to implement some complex logic, you write a whole sequence of MIPS instructions, but addition or subtraction, which is the more common case, can be executed extremely fast. Why? Because MIPS supports only a small set of instructions, and the fewer instructions you support, the simpler the hardware you need to implement them, and the faster you can be. I'll come to what exactly I mean by this in the next couple of slides. So basically there are two types of architectures: RISC architectures and CISC architectures. MIPS is a classic example of the reduced instruction set computers, which basically means that only a small, simple subset of instructions is supported, whereas in a complex instruction set computer you have instructions like string move being supported. So in MIPS you will have, say, 32 or 64 instructions in total that are supported, like add, subtract, jump, branch, and so on, whereas in a complex instruction set computer you might have several hundred instructions. Now, in order to implement these instructions in hardware, you have to encode each instruction in bits. To give you some perspective: we saw state machines last week, and if you want to encode four states, you need at least two bits, right? Likewise, if you want to represent 64 instructions, then you need six bits to encode each instruction. The more instructions you have, the more bits you need, and hence your architecture gets a bit more complicated.
We will see clearly how this actually reflects on the architecture when we get to the single-cycle architecture; for now I'm just talking in basic, simple terms. What we are going to focus on today, and in the rest of the lectures, is the reduced instruction set computer, because of the simplicity it provides; it's much cleaner to understand. So, consider the example that we saw: add a, b, c. You understand this: you read from b and c, add them, and store the value in a. However, a computer does not understand this. What it needs is a location for each of the variables a, b, and c. In MIPS, the location can be a register, a memory location, or a constant. What are the differences? Let's start with registers. Main memory as such is extremely slow; you will learn about this when we talk about memory systems. Registers are a small set of storage locations that can be accessed by the CPU extremely fast. MIPS has 32 registers, and each register can store 32-bit data. MIPS is called a 32-bit architecture because it works on 32-bit data: all 32 registers can store 32-bit data, and the instructions operate on that 32-bit data. In this course we will focus on the 32-bit MIPS architecture; we will not look at 64-bit, even though it exists. I guess the 32-bit architecture is a good starting point for understanding the basics of computer architecture. So why do we have only a small number of registers? This leads nicely to the third design principle: smaller is faster.
A good analogy: retrieving some data from a couple of books on your study table is going to be much faster than retrieving it from thousands of books in your room. This is pretty much the principle that the MIPS philosophy tries to follow by including only a small number of registers that are extremely fast to access. So you have only 32; most modern processors actually have many more registers under the hood, but that's a different story. This is the entire MIPS register set, and we will be using all of these registers during the course of the lectures. We will start off with the saved registers and the temporary registers, but you will see that we use all of them as we go forward in assembly language and MIPS programming. Some things to remember: there are some registers, such as $0, that are very specific in their function. $0 always has the value zero, that's it; if you want the value zero, you refer to $0. Every register name starts with the dollar symbol. And, as I mentioned, certain registers are used for specific purposes. For example, the saved registers and the temporary registers, which we will be using quite a lot in this lecture, are used to hold variables either across function calls or as intermediate values during a computation. We will see how to use pretty much all the registers in the later slides; for now, we will use only the temporary registers and the saved registers. So let's go back to the add instruction, a = b + c. What we wrote in MIPS assembly was add a, b, c. However, computers don't understand a, b, c.
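Since $0 (also written $zero) always reads as zero, you can already do useful things with just add and the register set; a small sketch, with the register choices being arbitrary:

```asm
add $s0, $zero, $zero   # $s0 = 0 + 0 = 0 : clears a register
add $t0, $s1, $zero     # $t0 = $s1 + 0 : copies $s1 into $t0
                        # (assemblers usually offer 'move $t0, $s1' as a
                        #  pseudoinstruction that expands to exactly this)
```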
The computer has to retrieve these values from some physical location, which means the MIPS assembly becomes: add $s0, $s1, $s2. That is, add the values stored in registers $s1 and $s2 and store the result in register $s0. The source and the destination operands are now in registers. Simple? So a register is one place from which you can retrieve data. Another place is memory. As you know, there are only 32 registers, so only a small amount of data is going to fit in them; if you want to use much more data, then you have to look into memory locations. But you also have to remember that retrieving data from main memory is very slow, so it's always good to keep frequently used variables in registers so that you can access them fast. Typically, an assembly program uses a combination of both: operands stored in registers as well as in main memory. I can actually show you, using a simulator, how these work. You will be using the simulator in your labs as well, to write more complicated programs related to a small image-processing task. You will be using arrays, you will simulate your program using the simulator, and then you will implement it on the CPU that you will have built using Verilog. You started using Verilog in your labs, right? Where are you? Writing combinational ALUs; you started designing adders. Okay. So as you progress in the labs, you will write adders, this will be followed by an FSM, you will implement sequential logic, and then you will see how you can combine all of these to implement an ALU.
Then you will write an assembly program using the simulator and execute this assembly program on the ALU that you have designed. By this, you will have the full experience: starting from designing a simple logic circuit, to writing an assembly program for the ALU that you have designed, to interfacing LEDs with this ALU, and finally writing a full-fledged assembly program where you will actually play Snake. You will write your own assembly program that plays Snake on a computer that you have designed in hardware, and see it execute on an FPGA. So you go through the entire process in the laboratories. And you will use a lot of instructions that read both from registers and from memory; you will understand it better when you start implementing it. Now, there are two ways you can address memory: word-addressable memory and byte-addressable memory. As I said, each memory word holds 32-bit data. What is a word-addressable memory? It's a memory in which each word has one unique address; an address refers to one word. So, for example, the first address, 0, refers to four bytes of data, and word address 1 refers to the next four bytes of data. You always address the data in memory in units of words: word address 1 has the next 32 bits of data, word address 2 the next 32 bits, and so on. This is not what we are going to use in MIPS, but let me give an example here first: how do you read a word-addressable memory? It's much easier to understand how to read memory using an assembly instruction.
If you want to read a particular memory location, you use the load word instruction, lw. What this example means is that you are going to read memory word 1 into register $s3; register $0 contains the value 0. And you compute this as follows. What is inside the parentheses is called the base address; here the base is $0, which holds 0. What is outside the parentheses is called the offset. To compute the effective memory address, you add the base address and the offset. Here it's 0 plus 1, so the effective address is 1. And what data is at location 1? It's F2 F1 AC 07. So this value is going to get transferred from the memory location into register $s3. Got it? No questions? Yeah. You can actually replace this with any register, and any values as well; you can put any number there. Sure, I'll come to that. Sorry? Well, it's a good question; it's more a matter of convention, typically. This example shows that you can use a register inside the parentheses: you can also have $s1 there, for example, which means the value stored in $s1. So you can store the memory address that you want to read in, say, $s1, and then the base address will be the value stored in $s1; you don't need to put a constant there. You store the memory address in a register, use that as the base address, add the offset, and that gives you the effective address. Must the address be aligned? Yes, it must be. For word-addressable memory this is okay, but the address should be aligned; I didn't want to mention that yet. Okay. So we looked at reading from a memory location.
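The load just described, as it would appear in the program (word-addressable memory is assumed here, as on the slide, and the data value is the one from the example):

```asm
lw $s3, 1($0)    # effective address = 0 (in $0) + 1 (offset) = word 1
                 # word 1 holds 0xF2F1AC07, which is copied into $s3
lw $s3, 1($s1)   # same idea with a register base: address = $s1 + 1
```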
Now, if you want to write into a memory location, what we have is store word, sw. The store word here writes the value present in the temporary register $t4 into memory word 7. Simple? Yeah. Well, good question. Do you have an answer for that? Exactly, there's a reason why. Great. So again, a simple example of how the effective address is computed: you add the offset to the base address, and then you store the value present in $t4 at that effective address. And as I mentioned, you can use any register for the base; you don't have to say $0. You can say $s1 or $s2, and the value in $s1 or $s2 is going to be your base address, to which you add the offset. You will use a lot of this when you want to read and write arrays and so on. So the next type of memory is byte-addressable. Until now, we were looking at addresses where each increment in the address moved by one word, that is, 32 bits of data. But MIPS uses byte-addressable memory, and this is what you're going to use in the course. Here, each byte has an address. So you start with the byte holding 78 at address 0, then 1, 2, 3, 4, 5, and so on. You're no longer going to increment addresses in units of 32 bits of data; every increment in the address moves to the next byte. Let's see an example, the same example: you are now going to load the word of data at memory address 4 into register $s3. You use the load word assembly instruction, lw $s3, and then you give the offset and the base address. Notice that to load the data that was previously at address 1 in the word-addressable memory, here you use 4, because you are using a byte-addressable memory.
Previously we saw a word-addressable memory. Remember, MIPS uses a byte-addressable memory, so you will be using increments of 4 if you want to read words. If you want to read word 1, word 1 is located at address 4, because it starts at the fifth byte. So you say 4 as the offset and $0 as the base address. The effective address that you compute is 4, which is actually holding the value 0xF2F1AC07. And this is what you are going to use in MIPS. Okay? Throughout your lab programs you are going to use a byte-addressable memory architecture, so you will be dealing with such offsets. Any questions? Yeah. I'll come to that, I'll come to that sometime. Yeah, a register as the offset? Yes, it is... MIPS allows you... yes, I think. Go ahead. The offset is an immediate, and that's simple for the hardware. If you have register plus register, that doesn't exist in MIPS actually, unless you change the ISA. At least the simulator allows you, yeah. Okay, so clearly there is some confusion there; we'll come back to it. So, for reading byte-addressable memory, you have the offset 4 for reading word 1. Next, you want to write into this memory, and you follow pretty much the same pattern. Oh. So let's stop here and then we continue in the next hour. Let's continue. I'll quickly show you a demo. This is pretty much the simulator that you're going to use in your labs, and it simulates the MIPS architecture. The values of the registers are shown on the right side: you can see the values starting from the zero register, then the temporary registers, the saved registers, the stack pointer, the program counter, which you will soon see what it is used for. So you have all these registers on the right side of the simulator, and it's a very user-friendly simulator.
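The byte-addressable load can be sketched the same way. A hypothetical helper again; the byte values are the lecture's example word 0xF2F1AC07 laid out from byte address 4, and assembling the bytes most-significant-first is just one choice for the illustration (the next part of the lecture is exactly about this byte-ordering question).

```python
def load_word_byte_addressable(memory_bytes, regs, base_reg, offset):
    """Sketch of lw on a byte-addressable memory: word n starts at address 4*n."""
    addr = regs[base_reg] + offset        # effective byte address
    word = 0
    for i in range(4):                    # assemble 4 consecutive bytes
        word = (word << 8) | memory_bytes[addr + i]
    return word

# Word 1 of the example occupies byte addresses 4, 5, 6, 7.
memory_bytes = {4: 0xF2, 5: 0xF1, 6: 0xAC, 7: 0x07}
regs = {"$0": 0}

# lw $s3, 4($0)  ->  word 1 lives at byte address 4
value = load_word_byte_addressable(memory_bytes, regs, "$0", 4)
# value == 0xF2F1AC07
```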
It will actually tell you what the format is. So let's say we start with $s1, $s2, and $s3. You first compile, execute, and now you see the main memory: these are the memory locations, and these are all the registers. What we are trying to do here is add the values of $s2 and $s3 and put the result in $s1. If you want to verify, you simply change whatever you want here, so let's say 30, and let's say 3. Okay, so now I will execute this. You see, you can actually walk through your program and see what happens at every instruction. You can also manipulate values in the memory, and you see how the MIPS architecture actually works inside. You will use this quite a lot. Of course, it's extremely funny to have one add and show you the power of the simulator, but you will be implementing much more complicated programs to actually validate what your program does. So this is what you will be using. Let's go back to the presentation. Good. So, of course, Onur was right: you cannot use a register as an offset, so you will not be able to use it. Let's keep moving forward. We saw how to read and write data from byte-addressable memory, and what is important now is the concept of big-endian and little-endian. We talked about it, right? A couple of weeks ago. This is a more illustrative example. You guys are making me laugh. Okay. So, actually, if you want to look at the history, the original reason why an architecture is called big-endian or little-endian comes from Gulliver's Travels. How many of you have read Gulliver's Travels? Cool. In Gulliver's Travels, the Big-Endians were the people who always broke their eggs at the bigger end, and the Little-Endians were the people who always broke the egg at the smaller end.
So this is the actual origin of big-endian and little-endian for the whole computer architecture community. So what is the difference? If you want to quickly know what big-endian is: in big-endian, address zero refers to the most significant byte. You know that you have a 32-bit word, with a most significant end and a least significant end. In big-endian, the most significant byte is addressed from zero. And in little-endian, it's the other way around: address zero refers to the least significant byte. Why does that even matter? This doesn't matter if you have a word-addressable memory; it's with a byte-addressable memory that the bytes of a word get addresses 0, 1, 2, 3. If you are going to read from memory address 1 in a word-addressable memory, you will read exactly the same word either way; you will not have any problems. But if you are going to go down the path of byte-addressable memory, then the zeroth byte in big-endian holds the value stored at one end of the word, and in little-endian, if you read address zero out, you will be reading the value stored at the other end. So when you want to start loading bytes, this is going to be extremely important. Okay? So let's again look into it with an example. Consider that the temporary register $t0 holds 0x23456789, and now you are going to store $t0 into memory address zero. In big-endian you will have the value 23 stored at address 0, and so on. But imagine, so now it's a store word: as I said, if you are going to deal with operations at the word level, there's not going to be a difference between big-endian and little-endian, because in big-endian you will store exactly 23, 45, 67 and 89 at bytes 0, 1, 2, 3, and in little-endian the same bytes in the reverse order, and a word load gives you back the same word.
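The store of 0x23456789 can be sketched under each byte order (the helper is hypothetical; the value and address are the lecture's example). At the word level both layouts round-trip to the same word; what differs is which byte sits at which address, which is exactly what matters once you load individual bytes.

```python
def store_word_bytes(value, base_addr, endian):
    """Return {byte_address: byte} for a 32-bit store; endian is 'big' or 'little'."""
    data = value.to_bytes(4, byteorder=endian)
    return {base_addr + i: b for i, b in enumerate(data)}

big = store_word_bytes(0x23456789, 0, "big")
little = store_word_bytes(0x23456789, 0, "little")
# big    == {0: 0x23, 1: 0x45, 2: 0x67, 3: 0x89}
# little == {0: 0x89, 1: 0x67, 2: 0x45, 3: 0x23}
# A load byte from address 1 therefore sees 0x45 in big-endian
# but 0x67 in little-endian.
```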
However, if you are going to start loading bytes, which you can do in MIPS using the mnemonic LB, load byte: if I load the byte at address 1 into $s0, what's going to happen in little-endian is that $s0 is going to have the value 67, and in big-endian $s0 is going to read out the value 45. Got it? Exactly. So now let's move on to design principle four. We saw instructions so far that have a mnemonic followed by three or two register operands. In MIPS you have three different types of instructions, the R-type, the I-type and the J-type, as we will see, and all three types follow a particular format. If you know these three types of instructions, you can pretty much understand any assembly language written for the MIPS architecture. What is also important here, for example, is that add and sub use three register operands, while load word and store word have only two register operands and one constant. And note that you read store word the other way around: when you load, the memory side is the source operand, and you read the value from there into the destination register; for a store it's the other way around, you always store what is in the source operand, $t0, into the memory destination. This is one difference that's important; it's easy to make mistakes. And don't ask me why they implemented it so; it just is so. So, we've talked about reading memory and reading from registers, but what you also frequently use are constants, or immediates. Why do we call them immediates? Because they're not stored anywhere; they're immediately available in the instruction itself, that's it. So when you want to implement something like a = a + 4, what you typically do in assembly is use the mnemonic add immediate: addi takes two registers followed by the constant. It's exactly like the add instruction, just that instead of a third register operand you have a constant here: you add the value of $s0 to this constant and store the result back in $s0. And actually, I should not have shown this slide: do you think you need a subtract immediate instruction? People are not fast... then you didn't read the slide fully. Why? You don't need a subtract immediate instruction because you can implement subtraction by adding a negative immediate. Any questions so far? Good. So we saw some basics of the assembly language: add, subtract, a couple of immediate instructions like add immediate, and you also learned how to read data from the memory and how to write data into the memory. But all these things are still in a human-readable format, and for the computer, which understands only 0s and 1s, you need to convert these assembly instructions... yes, we have flags here, you will understand it as we go on... you need to convert these assembly instructions into machine language, that is, 0s and 1s, numbers. In order to do that, MIPS has three different instruction formats: the R-type, the I-type and the J-type. So let's start with the R-type. Whenever you see an R-type instruction, this is how the machine language is encoded. The total length of the instruction is 32 bits, and the instruction is split into several fields: you have the opcode field, which is 6 bits, and the function field, which is also 6 bits. For an R-type instruction, the op, or the opcode as we call it, is always 0. Then you have the source register fields, rs and rt, and then the
destination register. Remember that when you write assembly, the destination register comes before the source registers, but when you encode the instruction in machine language, the source registers come before the destination register. So you have rs and rt, each 5 bits, followed by rd, and then you have the shift amount, which you will use when we see shift operations. This is how an R-type instruction is typically encoded, and the function field itself indicates whether you want to perform an add or a subtract and so on. So, an example, pretty much the one we saw: add $s0, $s1, $s2. The function value that represents add is 32, so you have 32 in the function field, which is represented with 6 bits; as I told you, the opcode is 0 for an R-type. Then you have $s1, which corresponds to register number 17, and $s2, which corresponds to 18. Each register also has its own address, so these are kind of register addresses: you say 17 and 18, which indicates to the computer that you are talking about registers $s1 and $s2, and $s0 is 16. Now when the computer reads this code, it knows from the function field that you are executing an add instruction, and it knows the registers where it can find the operands: it has to take the value in register 17, add it to the value in register 18, and store the result in register 16, which corresponds to $s0. Similarly, you can convert the assembly code for subtract into machine language. You simply write this in binary or hexadecimal, and this is what you will load into your computer. So in the labs, when you write your assembly language program, the simulator will let you generate what you call the byte code, or the loadable binary file, and that binary file will have just these numbers. These numbers typically get loaded into your processor, which you are going to design, placed into the memory, and then as the CPU starts executing, it will always start at a certain memory location and execute the instructions one after the other. So if you have an add followed by a subtract and you store them in the portion of memory where the CPU starts executing, the computer will start at that memory address, read the data that is there, interpret that data as what it should be doing, and after it finishes, go to the next memory location, and the next, thereby executing your entire program. As I mentioned, note the order of the registers: rs and rt come before the destination register in the machine code, but it's the inverse in the assembly code. It's important to note if you want to avoid silly mistakes. So the next instruction type is the I-type, the immediate type, and it follows pretty much the same format. You have a 6-bit opcode; what you don't have is the shift amount and the function field, and the opcode itself tells you what operation you want to perform. So add immediate has a specific opcode value, and so on. The immediate value that you give is put into a 16-bit field, and then you have the register operands just like before. Let's look at an example. You have add immediate, which has the opcode 8, and you have the registers $s0 and $s1: basically, you add 5 to the value of $s1 and store the result in $s0. This gets converted into machine code: the addi opcode is 8, the addresses of rs and rt, that is $s1 and $s0, are 17 and 16, and the immediate value is 5. This directly becomes the machine code which you feed into your system. You can also look up load word and store word; pretty much all these
are I-type examples. Any questions? I know this part of the lecture is a bit boring, but it's important, because it comprises at least 7 to 8 points in your exam. So now, wake up. See, now you have questions. Because the whole instruction format can only be 32 bits and you need to split it into portions, there is also a restriction from the hardware: you cannot give an immediate value that needs more than 16 bits, that is, more than 2^16 values. You cannot have an immediate that big; it's a restriction that comes from the 32-bit instruction length itself. And it's not always a destination register there. For example, let me pick this one: it's not like the R-type instructions, where you always had a destination register. When you look at load word, this register is the destination in some sense, yes, but for store word it's not the destination, it's the source. So you cannot differentiate at the level of the I-type format; you can only look at the assembly and figure out whether it's a source or a destination. So you don't have an rd here; the fields are simply registers. You don't even call them source, you just call them register operands, that's it. Exactly: store word has the opcode 43 and load word has 35, and based on the opcode you decide what it is. I mean, you still should not call it a destination register, it's just a register operand, that's it. Yeah, you don't do that, or you will get overflows and all these kinds of problems. Yes. Good. So, the J-type: very simple, maybe the simplest, right? You have an opcode and then you have an address: a 6-bit opcode, which indicates which jump instruction you want to execute. It basically jumps to a certain address in the sequence of instructions that you are going to execute. As I said, when you start executing a program, you start at a particular memory address and then you execute every instruction in sequence. When you come across a jump instruction, you give it an address, and the sequence will jump to this particular address. And it's a 26-bit address; the reason is, yeah, exactly. Also, some specific values are already given to registers. So you have a 26-bit address and a 6-bit opcode, and this is the J-type instruction; you will see a couple of them soon. A quick review of the different instruction formats: you have R-type, I-type, J-type. R-type is add, subtract and so on, where you have three register operands, and the opcode is always zero. For I-type you have the opcode, and it's an immediate type, so you have a 16-bit immediate followed by the two register operands. J-type is for jump instructions; some of the address bits are implied, so you cannot specify all the addresses here. You have some restrictions there; you're not covering the full 32-bit address space of the computer. We can take it offline, I don't want to go much deeper. So this is pretty much what I covered: when you generate this machine language code, you have this bunch of values that you can store into memory, and then as soon as you start executing, or you power on your CPU, it always starts at a particular memory address; MIPS always starts at one fixed address. Yeah, you will come to it. So you always start at a specific memory address when you power on your computer, and then it starts executing all the instructions it finds at each of these memory locations in sequence. The power of this, basically, is that you don't have to implement new hardware if you want to change your program; you just reprogram. I mean, it's not something great for people of this generation, but a couple of decades
ago, this was fantastic. This was really fantastic: you don't have to rebuild circuitry to run a new program. You can simply write something in assembly and convert it into machine code, say a multiplication operation, and then if you don't want the computer to perform the multiplication but a division operation instead, all you need to do is change something in the code, nothing in the hardware. Fantastic, right? This is where we started, and this is why we say the stored program concept really kicked in a couple of decades ago. So how does a computer know which address to fetch an instruction from? You have a specific register for it. This is not included in the 32 registers that I was talking about; it's a separate register, which holds the address of the current instruction that is being executed. It holds the memory location from where the current instruction is getting executed, and MIPS always starts at that fixed address. Whenever you power up your MIPS computer, the PC will always be reset to this address, and it will start executing whatever is there: it will pick up the instruction, decode the machine language, execute the particular instruction, store the values if necessary, and so on. You cannot write to this register directly; it's an automatic increment that happens based on the instruction. Any questions till now? So, a quick example again. You have load word, add, add immediate, sub, and the corresponding machine code. So what is load word: R-type, I-type, J-type? Sure... there's no such... no, no, it's very simple; it's not like the more advanced architectures where you have protected memory and things like that. So, load word? I-type. Add? R-type. Add immediate: not rocket science, immediate type, I-type. And sub: R-type. All of these get converted into machine code, and this is how your memory is going to look: you have memory, and you have exactly all these instructions stored in it. The convention is that the lower addresses are at the bottom, so it gets stacked up; it's a drawing convention that we use. The PC starts here, and once it sees this machine code, it will execute the corresponding load word; after it finishes, it goes to the next instruction, and the next, and so on. And the cool part is that if you want to do something totally different, you just store a different program there. Fantastic. So again, interpreting machine code: if you are given machine code, how do you interpret it? Straightforward, actually: if the opcode is 0, you know what type of instruction it is, and so on; you can deduce the fields from there. I'm not going to go too much into detail, I see that it's kind of repetitive. So what did we actually learn? We learned how to read memory, we learned the different types of MIPS instructions, we learned a couple of example assembly instructions, and we saw what big-endian and little-endian are. So what I can do is start MIPS programming with branches, or you guys can say no, I want to go home. I will start the next one, because tomorrow is Friday and you can go home earlier. Yeah, exactly. Okay. So till now we looked at very simple instructions, but now we go into branching. Branching is one of those very commonly used programming primitives, right? You have conditional branches and unconditional branches; there are these two types of branching instructions. Conditional branches are like an if statement: you have branch if equal, and it's an I-type; both the
conditional branch instructions are I-type. So you have branch if equal, BEQ, and branch if not equal, BNE. Then for unconditional branches you don't have any kind of check: you jump irrespective of anything. You give a jump instruction and you go to a specific address, or, with jump register, to the address held in a particular register. JAL is a special type; you will see it when we get to functions and procedures. The jump and JAL, these are the only J-type instructions in the whole instruction set. Okay, so this basically means that I have to open the blackboard. You have a MIPS assembly program; let's just walk through it quickly. Unfortunately this is again not recorded, but it's okay. You have the first instruction, addi $s0, $0, 4; what is the value of $s0 after this? It's easier for me to have confidence in you, and then I don't have to look at my slides. Next, $s1 gets an SLL. What does it do? Left shifting, exactly. What it does is take the value in $s1 and shift it left two times. So what's the value in $s1? Let me represent it as just 4 bits, of course it's really going to be 32 bits: 0001. Now we shift two times, so you end up with 0100, which in decimal is 4. So at the end of the SLL you will have $s1 equal to 4. Now, branch on equal: beq $s0, $s1, target. What this basically means is, if $s0 equals $s1, jump to target. So, is $s0 equal to $s1? Yes, so you simply jump to target and continue executing there; that's it. The two instructions in between, the add immediate and the sub, are simply not executed at all here. The word target itself is called a label, and you can use pretty much any word as a label except the keywords like addi, add or SLL; you follow it with a colon, and that's the location where your branch is going to go. Simple. You mean you write target above? Sure, but then you will end up with a loop, because your instructions are executed in sequence; you will see an example kind of like that. So, pretty much what we did on the blackboard: the branch is taken, and the add immediate and the subtract are not executed at all. Now an example with branch if not equal: it's pretty much the same example, $s0 has 4, $s1 is 1, and now you shift it, so you have 4, and you have branch not equal. This means that if $s0 is not equal to the value stored in $s1, then jump to target. But here you are not going to branch, because $s0 is equal to $s1, and so you execute the add immediate and the subtract. And what happens after the subtract? Exactly what we were talking about, but in a kind of different setting: the instruction which is after the label is also going to get executed. So $s1 is not going to be just 1 at the end of the execution; don't expect your assembly program to stop at the sub, it will continue until the last line. So your $s1 value is going to be 5, okay? Yeah, if the branch is taken, does only the target get executed? No, the skipped part doesn't get executed if the branch was taken: in the previous example you don't execute the add immediate and the sub, because the branch is taken, and since that is the end of the program, there is no jump or anything to make it go back. Great. So, an example with unconditional branching: you have the same example, but here you don't have any condition to check. Whenever the program comes to the jump to some target, the program counter is updated so that the next instruction comes from the memory location which contains the target instruction. So you will simply jump to the add instruction after target, and the remaining three instructions are not going to get executed at all. What is this useful for? You will come across quite a lot of programs, and I actually will have a couple of examples, where you simply want to, as you will see, it's
easier that way. So, yes: there's always a set of memory locations allocated for instructions and a set of memory locations allocated for data, and you will never go beyond that, I mean, if your operating system is trusted and works fine. Otherwise you end up with memory-access-denied faults; you see all these kinds of faults when you write a C program with a pointer to some random address. Have you done C programming already in some courses? When you use pointers you will end up with so many of these errors, you will really feel it, from segmentation faults to all these errors saying that you cannot jump to that particular address, because your operating system does the protection for you. If not, it's chaos: people would start writing into the operating system's code memory, and you would see the blue screen of death quite often. Great. So, this is the jump register instruction, JR. What you do here is basically load a particular value into $s0: after the add immediate, $s0 holds 0x2010, and when you reach the jr, the system reads the value in $s0 and jumps to that address. The value is 0x2010, so it's going to simply execute the load word there and skip the add immediate and the sub. Right? It's as simple as that. So when are all these instructions useful, the question the gentleman asked? Let's start with a basic if statement: if i equals j, then f equals g plus h, and let's assume these registers contain each of these variables. Now what's happening in the assembly code? If you relate it to the high-level code, you will notice that to check for equality here, you actually check for not equal. So i is stored in $s3 and j is stored in $s4, and what you do is: if $s3 is not equal to $s4, you jump to L1, skipping the body. In the high-level code you check whether it's equal and then execute the body; here, as soon as you figure out that i is not equal to j, you simply go to L1, and this is pretty much how you implement it. So if you want to implement an equality if condition, simply start with a branch-not-equal condition in assembly, and the rest of it is history. The add $s0, $s1, $s2 is pretty straightforward, and then you have a sub; notice that when you do not take the branch and execute the add instruction, it is eventually followed by the sub instruction being executed. You're not going to skip the subtract instruction here. Similarly, if you want an if and an else, which means that if the branch is taken you do not execute the subtract instruction, this is where jump comes into the picture. Unlike the previous example, where even if you enter the if body you still execute the instruction after the label, here if you enter the if body you don't want to execute the else body. This can be implemented as follows: for equality you check for not equal, you do the addition, and then you simply jump to the end of the program, past the else body you don't want to execute. This is pretty much why the jump is useful. Got it? While loops: if you want to implement a while loop, let's see how we do it. You want to store pow and x in two registers; these are the registers we choose, we load them with the values that we want, and you also need a temporary register to hold the value of pow that you want to compare with. So you have a temporary register where you store 128. The while loop has a not-equal-to condition in the high-level code, so you use an equal, BEQ, in the assembly: pow is in $s0, 128 is in $t0, and you say, if they are equal, you simply jump to done, which basically means that if it's equal you finish the program here. Exactly. And if it is not equal, you continue with the loop body. If you want to multiply a value by 2, all you need to do is shift it left by 1; this is known to you, right? If you shift a binary value by 1 to the left, you multiply it by 2, and if you shift it by 2, you multiply it by 4. So you implement the multiplication with a shift, you implement the update of x using the add immediate, and then you use the jump to go back. So it's kind of a loop, and another reason why the jump instruction is useful. Similarly you can also do for loops, but I will leave that for tomorrow. Let's meet tomorrow.
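The while loop just described can be sketched in Python, with each line annotated by the MIPS instruction it stands for. The starting values of pow and x are assumptions for illustration (the slide loads them, but the transcript doesn't state them): with pow starting at 1, the loop doubles it by shifting until it reaches 128.

```python
pow_val = 1      # $s0: initial value of pow (assumed for this sketch)
x = 0            # $s1: initial value of x (assumed for this sketch)
limit = 128      # $t0: the comparison value, 128

# while (pow != 128):
while pow_val != limit:      # beq $s0, $t0, done  (if equal, exit the loop)
    pow_val = pow_val << 1   # sll: shift left by 1, i.e. multiply by 2
    x = x + 1                # addi: update x
                             # j while  (jump back to the test)
# done:
# pow_val == 128 and x == 7 here, since 1 shifted left 7 times is 128
```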