Hello, I'm Ragu. Just like Adnan, I'm also doing an FPGA talk. But unlike Adnan, I've been on and off at Hackware meetups since 2012. I got interested in FPGAs partly because of Hackware, because Ibrahim and friends always talk about FPGAs. And coming from a microprocessor background, the flexibility was very cool.
In the term that I finished at the end of 2020, we had this module called Computation Structures. I studied at SUTD, but this module is inherited from MIT. The MIT module is amazing. You should go check out the website, computationstructures.org. Basically, they teach you starting from building a one-bit full adder, which most of us wouldn't do, even if we do electronics. You know how to do it, but you would never actually spend two hours soldering and testing one. From that, we moved on to more complex things.
So here I have a video, which is from a movie. "What are you working on?" "It's a computer terminal that hooks up to the TV for the display." When I watched this as a kid, I was like, one day I have to do that. And I was lucky enough to be able to do that during this module.
So we built this RISC processor, a very small processor, which is called Friendly Un-Pipeline Beta. It's a load-store 16-bit machine, but with 24-bit instruction encoding. It's Harvard, and that's why there are two different sizes. Instructions are larger than everything else. It can run at up to 50 megahertz, although I'm only clocking it at three kilohertz for my actual game, because that is sufficient.
Surprisingly, after going through this course, I realized the internals of a CPU are very simple. You only need a handful of components: an ALU, a register file, a control unit, and a program counter, at least for ARM-style RISC machines. And before we built it, we actually simulated this thing in Logisim, which is an open-source digital logic simulator.
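For readers who have not built one, the one-bit full adder mentioned above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the function names `full_adder` and `ripple_add` are made up here, and the 16-bit width is chosen simply to match the CPU's word size.

```python
from typing import Tuple


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """One-bit full adder using only XOR, AND and OR, as in a gate-level build."""
    partial = a ^ b                              # sum of the two input bits
    total = partial ^ carry_in                   # add the incoming carry
    carry_out = (a & b) | (partial & carry_in)   # carry generated or propagated
    return total, carry_out


def ripple_add(x: int, y: int, width: int = 16) -> int:
    """Chain `width` full adders, least significant bit first (ripple carry)."""
    carry = 0
    result = 0
    for bit in range(width):
        s, carry = full_adder((x >> bit) & 1, (y >> bit) & 1, carry)
        result |= s << bit
    # The final carry is dropped, so the sum wraps modulo 2**width.
    return result


if __name__ == "__main__":
    print(full_adder(1, 1, 0))
    print(ripple_add(1234, 4321))
```

Chaining the one-bit cell this way is the simplest route from a single adder to a word-wide one; a hardware ALU may well use a faster carry scheme.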
If you look at the clock in the top corner of the screen recording, you'll realize this was fast game, done in three hours. But after those three hours, da, da, da, da, loading the ROM, it actually kind of worked, minus some small bugs, which was epic. I didn't realize that this software could actually simulate this many logic gates. Then we just translated that directly to hardware through a few steps.
So the ALU has a few basic operations. There's your plus, minus, times, divide, your checking and Boolean unit, your shifter unit, and some others like XOR. Then we have eight registers. But actually, the last one is hardwired to zero, so it doesn't count. The rest are all general purpose, although we dedicate some of them in the assembler to things like the base pointer, the link pointer, and so on. And the program counter is 16 bits wide.
And this is where the module's intensity and short time frame show. We have interrupts and an illegal-operation handler, so the stuff for a proper operating system is implemented. But we just hardwired everything to reset, because, you know the saying, if it doesn't work, have you tried turning it off and on again? We just baked it into the hardware fabric, so it resets itself if anything happens that it's not familiar with.
The control unit is just a big ROM, like back in the old days. It takes the opcode and sends out the signals to the various parts. Glue logic is just muxes and some miscellaneous stuff.
So there are only two types of instructions. There's the three-register form, which is opcode, register, register, register, and there's opcode, register, register, and a 12-bit immediate. That gives you a valid range of roughly negative 2,000 to positive 2,000 (a signed 12-bit value covers -2048 to 2047). That is enough for most simple games and stuff. These are the 32 valid operations.
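The two instruction formats can be made concrete with a small decoder. This is a sketch under assumptions, not the project's actual encoding: the talk gives only the 24-bit width, eight registers (3-bit fields) and the 12-bit immediate, which leaves 6 bits for the opcode. The field order used here (opcode, Rc, Ra, then Rb or the literal) is borrowed from the MIT Beta layout and is assumed.

```python
from typing import NamedTuple


class Decoded(NamedTuple):
    opcode: int   # bits 23..18 (assumed position)
    rc: int       # bits 17..15, destination register (assumed position)
    ra: int       # bits 14..12, first source register (assumed position)
    rb: int       # bits 11..9, second source, three-register form only
    literal: int  # bits 11..0, sign-extended, immediate form only


def sign_extend(value: int, bits: int = 12) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode(word: int) -> Decoded:
    """Split a 24-bit instruction word into its fields.

    Both rb and literal are extracted; the opcode decides which one is used.
    """
    return Decoded(
        opcode=(word >> 18) & 0x3F,
        rc=(word >> 15) & 0x7,
        ra=(word >> 12) & 0x7,
        rb=(word >> 9) & 0x7,
        literal=sign_extend(word & 0xFFF),
    )


if __name__ == "__main__":
    print(sign_extend(0x7FF), sign_extend(0x800))
    print(decode((0x30 << 18) | (1 << 15) | (7 << 12) | 0xFFF))
```

The sign extension is where the "positive to negative 2000 something" range comes from: the largest literal is 2047 and the smallest is -2048.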
Blue is the three-register instructions, where it's going to be register C equals register A, operation, register B. The green operations are the same, except it's register C, register A, and the immediate. You can see most of it is arithmetic operations, and then there's some control flow, which is the bare minimum for a Turing-complete system.
So that's just the CPU over here, but most of the work actually went into the memory unit. It's not an MMU, it's just a mapping unit. It helps the CPU read and write to the peripherals, which actually handle most of the difficult things. The VGA frame buffer generator is the thing that's doing most of the high-speed stuff, but it abstracts that away. It just reads which character to write in text mode and renders it in color using its own ROM. The joysticks are mapped straight into memory, like the good old days, or like on an Arduino now. Your ROM is mapped at 800, even though it's not really like that. It's two different memory controllers. And this is how the joystick and the buttons are mapped. They're literally wired, unsafely, straight into the memory unit. So if you send more than 3.3 volts, not a good idea.
Generating the VGA was surprisingly easy. I just looked up the timing information, and, thank God, my FPGA has a clock that's close to the 25 megahertz. And unlike modern standards, VGA is very tolerant. Even if you don't hit 25, 24 is fine. Your monitor will actually correct for it. And this was me testing with one of the other FPGAs before the term, just drawing from BRAM straight to the screen.
So how text mode works: there are 300 words, one per character cell, to form a 20 by 15 display. In each word, the bottom eight bits are the ASCII code.
The next three bits are the background color in three-bit color, so eight colors, then the foreground color, and the top two bits are reserved for blinking or whatever, which I haven't done yet. That gives you a simple text interface.
So this is how the construction looks. It's just an FPGA, and it's very empty. There's some cable management for the joystick and the other buttons, there are five wires plus ground for the VGA, and then there's just power. It's implemented on an Artix-7. The school kind of forces you to use this Artix-7 development board, which comes with this language called Lucid, which is not ideal, but it's fine. It's fine.
Before we wrote it, I actually wrote a simulator, because I wanted to practice writing it and catch all my bugs before implementing it in silicon and forever not being able to change it. So I wrote this Pygame implementation of the VGA, which was terrible, because it just pins one CPU at 94 degrees and it's still slow. So it heats up and it's slow: lose, lose. Then I was walking home one day and I thought, JavaScript actually is type sensitive when you want it to be, and it's not when you don't want it to be, which is damn convenient. And it also runs everywhere. So, boom, I ported it to JavaScript with a few hours of debugging. Most of the time was wasted when I realized the largest integer representation is 52 bits for the standard numbers in JavaScript. So you have to use BigInt, which is, okay, a minor inconvenience.
So this is actually the live, let me move this window. Yeah, this is the thing running. It's just a local image. I have a simple program written here. If I load it, reset it and run, it will just run and actually simulate the VGA interface.
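The text-mode word layout described earlier (low eight bits ASCII, three bits of background color, three bits of foreground color, two reserved bits) can be sketched as a pair of pack and unpack helpers, which is roughly what a simulated frame buffer has to do for each of the 300 cells. The helper names and the row-major cell ordering are assumptions for illustration; the talk does not say how cells are ordered in memory.

```python
from typing import Tuple

COLS, ROWS = 20, 15  # 20 x 15 characters = 300 words


def pack_cell(char: str, foreground: int, background: int) -> int:
    """Build one 16-bit text-mode word; the top two (reserved) bits stay zero."""
    return (ord(char) & 0xFF) | ((background & 0x7) << 8) | ((foreground & 0x7) << 11)


def unpack_cell(word: int) -> Tuple[str, int, int]:
    """Return (character, foreground, background) for one text-mode word."""
    return chr(word & 0xFF), (word >> 11) & 0x7, (word >> 8) & 0x7


def cell_index(row: int, col: int) -> int:
    """Word offset of a cell, assuming row-major order (an assumption)."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("cell outside the 20 x 15 display")
    return row * COLS + col


if __name__ == "__main__":
    word = pack_cell("A", foreground=7, background=1)
    print(hex(word), unpack_cell(word), cell_index(14, 19))
```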
The joysticks are implemented here also: if I use the arrow keys, it will run. You can make it very slow and then see the individual instructions and all the registers, and I have a simple disassembler with some tracing.
So how we actually write the program is, I piggybacked on the MIT course, because they use this for their course. They have this assembler, which is just a macro assembler. So I just rewrote the macros for my CPU, and then you can write your program here. For example, I can copy our actual game program, which is extremely raw, just a lot of opcodes, no time to make a compiler. I paste it here and hit assemble. It doesn't actually give it to me in the format I wanted. So instead of rewriting it or finding the source code, which is quite hard to find, I just stepped through it in the debugger, found the line where it's exposed, and dumped the memory. So it was, like, fast game. Then I paste it here, load it, reset, run, change it to high speed, and the game will run.
And this is how we developed our game, because the FPGA was only finished a couple of days before submission. But it was surprisingly similar to this simulator. So when I got that working, I just copy-pasted this into the ROM inside the FPGA and, boom, it just worked first shot. So you can play the actual game here, and that's what the simulator was for. It was to speed up development time.
So the CPU looks like this, somewhat similar. It's a standard load-store architecture. The reg file goes into the ALU. Everything must pass through the ALU before it goes back to write to the memory or to the reg file. Then your program counter is over here, and your control logic is just the ROM.
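The datapath just described (register file into ALU, result written back) can be sketched at instruction level in Python. This is a hedged illustration rather than the project's code: the mnemonics in `ALU_OPS` and the names `RegisterFile` and `execute` are hypothetical, and only a handful of operations are shown. The one detail taken directly from the talk is that the last of the eight registers is hardwired to zero.

```python
MASK16 = 0xFFFF  # the CPU is 16 bits wide
ZERO_REG = 7     # last of the eight registers, hardwired to zero


class RegisterFile:
    """Eight 16-bit registers; register 7 always reads 0 and ignores writes."""

    def __init__(self) -> None:
        self._regs = [0] * 8

    def read(self, index: int) -> int:
        return 0 if index == ZERO_REG else self._regs[index]

    def write(self, index: int, value: int) -> None:
        if index != ZERO_REG:
            self._regs[index] = value & MASK16


# Hypothetical mnemonics; the real opcode table has 32 operations.
ALU_OPS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "AND": lambda a, b: a & b,
    "XOR": lambda a, b: a ^ b,
    "SHL": lambda a, b: a << (b & 0xF),
}


def execute(regs: RegisterFile, op: str, rc: int, ra: int, operand_b: int) -> int:
    """Rc <- Ra op B, where B is either a register value or a literal."""
    result = ALU_OPS[op](regs.read(ra), operand_b) & MASK16
    regs.write(rc, result)
    return result


if __name__ == "__main__":
    regs = RegisterFile()
    execute(regs, "ADD", rc=0, ra=ZERO_REG, operand_b=5)    # load 5 via the zero register
    execute(regs, "SUB", rc=1, ra=0, operand_b=regs.read(0))
    print(regs.read(0), regs.read(1))
```

A hardwired zero register gives "load a constant" and "move" for free: adding a literal to register 7 loads it, with no dedicated instruction needed.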
This is the opcode matrix again, but the MIT version, which is slightly different from ours. We have a few more instructions, but yeah, simple stuff. Then this was the first time it booted, after I fixed some bugs, and, same as in the movie, it printed out the ASCII table, which was extremely satisfying. But it's not just the ASCII table. I added heart symbols and some extra stuff. So that's the benefit you get when you control the silicon, I guess. And this is the final casing after laser cutting.
So now I'll show you the live demo before we end. Here I have the Artix-7 FPGA in this box. I'm going to close it up and give it power. It's connected to this monitor over here. This is the joystick and some of the buttons. So it just reset, and now you can see me playing. It's playing a game called nonograms, but it can run anything, because it's just an ISA. The game was part of the requirement for the course: you need to make a fun game. So we made this puzzle game, which is hard to do if you don't have a CPU, so it demonstrates the capabilities of a CPU versus a state machine. You can Google what a nonogram is. It's like an image-decoding puzzle, and there are a few levels. You can go to the next level, and it loads the next level and renders it. And this is running at three kilohertz. It's not even at max speed yet.
If you look inside, you can see the bad boy running at high speed. I can slow it down and then you can see the individual pixels, but okay, let me do it quickly before we run out of time. So I use the tweezers and slow the clock down. Now you can see it running individual instructions. I have one of the DIP switches controlling the clock divider. So hopefully it should be wiping the screen now.
Let me speed it up a bit and slow it back down. Okay, I missed it again. I tried to slow it down, but it becomes too slow. Okay, reset. Okay, I managed to slow it down. So now you see it literally running each instruction in real time. So that's the entire project.
And I actually started this project not with a Xilinx FPGA. I started with this FPGA from China, which uses an Anlogic chip. It's very cheap. You should totally buy one if you can. If not, the Zynq seems like a better deal, because it's Xilinx. But this one has a lot of features, and it's really cheap. That's how I learned FPGAs. The synthesis time for this is much better than Vivado, though. This is 30 seconds to a minute, flashed. Vivado takes longer. So yeah, that's the thing. And then we just had fun the day before submission. That's the end of my presentation.
Thank you, Ragu. Next up, we originally had Michael's talk, but I think James will be going next because Michael is not here. I think he might have gotten the wrong time zone, so that's why he didn't show up.
Wait, do we have time for questions?
I think we have a short while for questions. Maybe we can ask Ragu. You can take two questions.
Okay, so my question would be: the simulation that you showed, what was the level of detail on that? Was it more like an emulator of your instruction set, or was it really simulating the CPU cycle by cycle?
Okay, so I actually built two simulators. The first one, which I built in Logisim, is gate level. So the one I showed you in this slide over here is gate level, but the gate-level one has its limitations when it comes to drawing to the VGA. So this was just used as a proof of concept, to prove to our professors that we would not be wasting the whole term trying to do something that doesn't work.
Once we knew that the CPU at least worked, I built a JavaScript simulation, which is not gate level. It is instruction level. I can actually show you the source code. It's just a big bunch of switch-case statements that emulate the ISA. And the VGA frame buffer is just some canvas drawing stuff.
Right, but then that means that the control unit, the microcode, I guess you could say, which you had in ROM, you had no good way of testing, right? Because one of them...
Oh, no, no. For the microcode, yes. For the gate-level simulation, you use the microcode, so that was actually accurate.
Oh, I see.
Yeah, so in the video, I'll just go straight to the exact point. At this point, this is me loading the microcode into the ROM. Logisim has these ROM elements, so this was my microcode ROM. For each opcode, it outputs the signals. So this just used the exact microcode. Yeah.
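The idea of a control unit that is "just a ROM", where each opcode indexes a row of control signals, can be sketched as a lookup table. Everything specific below is assumed for illustration: the signal names, the two opcode values and the `Control` type are hypothetical, not the project's microcode. The fallback to reset for unknown opcodes mirrors the talk's description of illegal operations being hardwired to reset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    """One row of the control ROM: the signals sent to the datapath."""
    alu_fn: str            # which ALU operation to select
    use_literal: bool      # mux select: B operand from the literal, not a register
    write_reg: bool        # whether the result is written back to Rc
    reset: bool = False    # force a CPU reset


RESET = Control(alu_fn="NOP", use_literal=False, write_reg=False, reset=True)

# Hypothetical opcode numbers, for illustration only.
CONTROL_ROM = {
    0x20: Control(alu_fn="ADD", use_literal=False, write_reg=True),  # Rc <- Ra + Rb
    0x30: Control(alu_fn="ADD", use_literal=True, write_reg=True),   # Rc <- Ra + literal
}


def control_signals(opcode: int) -> Control:
    """Look up the signals for a 6-bit opcode; unknown opcodes reset the CPU."""
    return CONTROL_ROM.get(opcode & 0x3F, RESET)


if __name__ == "__main__":
    print(control_signals(0x30))
    print(control_signals(0x00))
```

Keeping the control logic as data is what lets the same table be loaded into a Logisim ROM element and into the FPGA, so the gate-level simulation exercises the exact microcode.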