Hi, nice to meet all of you. I'm Victor from the Carousel team. I want to give special thanks to the Google Cloud community for inviting us here. And, like what John said earlier, we're very honored to be able to share our story, or rather, my mistakes, with all of you here today. I see a few familiar faces from my office around here, so don't laugh at me when you go back tomorrow. But I think down the line, as more of you use Google Cloud, there will be opportunities for you to come up here and share. And I'll be really happy to be one of the members in the audience hearing your story, because everyone will go through a different route. That's why we're in startups. We are all going through the same kind of difficulties, but I like to hear more stories. Today I'm sharing mine, and next time, I want to be hearing yours. So what I'm going to tell you today is about a lot of our early journey on the cloud, and it may or may not be on Google Cloud. It's not a sales speech. I'm telling you more about the mistakes I made in the early days and some of the things that we're doing now. So hopefully, you can take back a couple of lessons and apply them depending on which stage your startup is at right now. So a brief introduction about myself. I'm the first employee to join Carousel. So apart from the three founders, I'm the first person to join them. And since then, I've taken up part-time roles and full-time roles, more on the engineering side, and gradually moved to management. And now I'm the interim head of engineering, so I have to look after the whole engineering team at Carousel. The things I'm sharing today used to look like great decisions. Now, in retrospect, they are mistakes. And I can share them all I want because at that point, I was the only back-end engineer. Everything I share is my own mistake, so no one will get offended by this. So I'll cover a couple of things first.
I'll give a very quick history and timeline of us on the cloud before I dive into the lessons and the stuff that we've learned over time. So back in 2012, 2013 (2012 was when Carousel first started out as a startup), everything was hosted on a single instance on the cloud. So when I joined them, I went in and asked, where are the servers? There were no servers. There was a server. It was on the cloud, but everything was inside. And the best part was that I found out that development and hosting were all done on the same machine. So this is pretty much the whole "shift your laptop to the cloud, and that's running production" sort of mentality right there. And that was a lot of fun: going in, trying to find out why your app has trouble meeting the needs of your users, only to find out about this. And before you can even get started coding anything, you need to fix a lot of things. So I pretty much spent the whole 2013 summer moving things out bit by bit. And since then, we started to use the cloud proper, as I will call it. So we were on cheaper hosting providers. We did not have any funding at the time. Options were limited. And so in the name of cost, in the name of saving money, we moved to cheaper cloud providers. But regardless of where we were hosted at the time, in terms of mindset and mentality, we were using the cloud like last-generation VPS providers. They were just machines and we were just deploying there. We had no concept of the very fact that we were on the cloud now. To us, it was the same thing. And our architecture at that time was very simple. If you look at this, this is pretty much how things were in that era. There was no concept of containers. You didn't hear about anything like that. So it was very simple. You had a load balancing layer. You had your application servers. You had your databases. That was us at that period of time, back in 2013, 2014.
And in 2015, 2016, when we started scaling and growing, that's when we really hit a lot of issues. You wanted to grow. You wanted to scale. But you were experiencing all the issues that you had on the cloud. And if you were on what we call the lower-cost providers, things like noisy neighbors and network issues shouldn't be strangers to you. I didn't even know that there was such a term as noisy neighbors until I went to research why, at unknown periods of time, a server that was performing fine would just become slow. And depending on which provider we were with, sometimes we would get hit by what we call random reboots. You would only know about them after the fact, because they would email you to say they had to reboot the instance 10 minutes after they had rebooted it. So you'd look at your metrics and be like, what? And then after that, they'd tell you, I'm sorry, we messed up. At one point, when we were growing rapidly, we had something like 10 reboots in three days, which is way too much excitement for anyone. I don't wish this on anyone else. The best part was that at that time, I was the only one managing the servers. So even when I went on overseas trips or holidays, I could never sleep in peace. At that time, I had nightmares about my servers crashing. So I think back in 2016, 2017, that was when we did a benchmark on Google Cloud. We were really blown away by the performance-to-price ratio. And then we started making plans. We went to talk to our partners at GCP. We planned for migration. And the whole thing took more than two months, actually, from the first conversations. But in terms of planning and execution, that took two months, from the start of June all the way to the end of July. And at that time, we tried our best to do a one-to-one migration. That means we were not re-architecting our app at that point. We just moved our VPS concept over and would deal with that later.
I'll share why we had to do this as I go on. So this is us, as a very brief history and timeline. Now, we use a lot more GCP services. We use Compute Engine, App Engine, and the rest you can read yourself. Right now, our entire data infrastructure is possible because we are using Google services. And we try to move all our legacy VMs to containers whenever possible. That's the direction that we see ourselves going. So now I'll talk about the lessons that we've learned. Three main ones, very simple, nothing mind-blowing. I'm not selling you any magical solutions here. These are very simple things that anyone can take away. The first one is, if you are designing and building something now, design and build it with the cloud in mind. Don't be like us. When I first came out of school, I designed it like VPS services because that was the mindset I had. The cost to change a cloud provider or change your architecture increases dramatically as you grow. For us, the worst thing was that we were hit by all these issues when we were growing rapidly. We are talking about a huge percentage of growth week on week, month on month, quarter on quarter, year on year. And when your architecture and your services can't keep up with you, you're going to be in a lot of trouble. For us, it took two months. If you are an early-stage startup with your first funding and you re-architect now, that could take you days. Days is much better than months or even quarters down the line, because speed is of the essence for startups. On adopting new technologies: this is the ThoughtWorks Technology Radar. And you see that Kubernetes is right there, right? Adopt. There are a couple more leading-edge technologies over there. For some reason, GCP is under Trial, but I think it should be under Adopt. These are things that you should make sure you are aware of.
Don't be building your tech on the last generation of tools. And in terms of architecture and design, keep the modern ones in mind. I can blame myself because I was a fresh graduate. I came out of school. That was all I knew. But for the rest of you, I think this is a great time. If you're building something from scratch, start from the modern tooling. There's no need to be leading-edge. You don't have to try out the new thing that came out on Hacker News last week. But there are things like Kubernetes that have been around for a long time, with an active open-source community. Use them. Docker is another example. For an early-stage startup, figuring out things like this and getting them to production takes you one to two days at most. For us, trying to re-architect a scaling app, one where you can't just stop and say, I'm going to take one day of downtime to migrate, was very expensive. Our engineers had to do that planning and migration. And every time they had blockers, I just thought to myself, wow, shit, I'm the one who got them into this mess. And you don't want to be in that position when you're growing. Second lesson: you want to make sure that your hosting platform can scale with you. The only thing that should be scaling when you're building a startup is your own startup. Don't choose partners that have to scale along with you. This was a very, very expensive lesson for us. So we built Carousel on top of our hosting providers. They didn't have any of what we call quality-of-life improvements. All they offered you were instances, like how you would use VPS servers. And that was all. No load balancing, no storage solutions, no nothing. All you had were instances, and you had to use them to scale. I lost count of the number of times we were looking at other solutions and we were like, to be able to do something, we'd have to ship and maintain this entire service by ourselves.
And at that point, when it was mainly just me alone as a back-end engineer, trying to take on more things and build more services didn't seem like the best choice. And the hosting provider was also a startup, and they were scaling with us. Scaling with us means they had their own scaling challenges. I don't know what was more exciting than this: I have a service downtime, so let me go to the console and reboot it, but their console is down because they have bugs, or they have hosting issues of their own. I don't know what went through my mind at that point. If I had a partner like this, shouldn't I have just switched? In retrospect, yes. But I don't know what I was thinking at that time. I was too busy fighting fires every day to have time to plan ahead. So if you look at the pros and cons of why we were on this hosting provider, very simple billing was what they advertised, and it was very easy for us to do our financials. But for everything else, we had nothing. In terms of instance sizes, forget about customization. This is all we have, use them. If you asked for something else, what they'd tell you is, we will try to accommodate, but it will take us a few days or weeks to give you a dedicated instance that could do something like that. And of course, the last one, random reboots, very nice of them. If you asked them why, they'd just tell you, we had hardware issues, we had to reboot the instance, that's it. Good luck. And when I say we needed to reinvent the wheel: load balancing is a solved problem if you are using most of the hosting providers these days. They had none of that. So we had to build our own layers. We had to build our own routing solutions and optimize them for ourselves. Zero helpful services, same thing. Most of the things you take for granted these days, none. They just give you servers. Build everything by yourself. I know there are some people who love that, being able to do everything by yourself.
But when you're scaling, when you're growing, having to do all that by yourself is probably not the best thing ever. Cost is and always will be the biggest consideration for startups, whether you are funded or you are bootstrapping and don't have much money in hand. I think what John shared earlier was that if you're a startup less than five years old, you can get funding, you can get credits and stuff. But at that time, we didn't have all these options. So we had to trade our own time to get something that was cheaper. Time and cost will always be a trade-off that you have to manage. But there will come a point in time where you need to think about the savings down the line. If back in, say, 2013, 2014, I had taken some time out to plan and re-architect, that would have meant compounded savings down the line. And it's very hard when other people in the company tell you, we need to get this thing done, we need to build this and ship this, and you tell them, no, we can't do it because the servers are not going to be able to keep up. That's a position you never want to put yourself in. And the last one, and I hope I'm not going too fast, is that you should plan for how you scale. What sort of scalability do you need for your architecture, for your design? This is up to you. This is up to how you want to plan and build it. Do you need a data pipeline? Are you using a machine learning framework? Do you need to build things that are scalable? For the longest time, we didn't have our own internal data pipeline. We relied on external services. And when we had to build one, that was a very tricky thing. Do we want to build all our own data layers again, the data framework, the data tooling, or do we want to use services and save ourselves the time?
Thinking about how each part of this will scale is very important and very critical. And when we were so busy fighting fires every day with every downtime and reboot, we just didn't have time to plan for this. I believe there are two categories of things you should think about as a startup. The things you want to spend time on are a very limited set. These are the things you want to get really good at: your business, your growth playbook, maybe your scaling strategy. It's not going to be a lot of things, right? Because you want to make sure that you focus on a couple of things and do them well. And there's going to be a large pool of things that you don't want to do. For us at Carousel, in the early days, I think we spent too much time on things that we shouldn't have spent our time on. As a result, when growth came, we didn't manage to hit everything, and a lot of things had to go in order for us to scale our architecture up to meet our growth needs. So what were some of the things I was spending my time on? When it was peak hour, sometimes I would say, okay, let's spin up a couple more servers, just to be sure. In this day and age, if anyone tells you that they're doing this, you'll be laughing at them. But at that time, we were so busy that we didn't think about how ridiculous this sounded. We didn't plan how to scale. That meant that every day, whenever incidents happened, we'd be too busy fighting them. And once you're done fighting them at night, you just say, okay, I'm too tired for this, I'm going to sleep. And then the next day, the same thing happens over and over again. When you try to do all these things by yourself, like maybe you made some decision that I want to manage this myself, I want to manage that myself, and you only have a limited set of people and time, you're not going to do them well.
Try juggling five balls versus two: which one is easier? For me, when I look back, it's like, shit, man, this is where I spent all my time. If I had just said, guys, give me one week, let me sort things out, things would have been a lot better. So that's a lot of negative stuff, so let me talk about something more positive: today. Today, we are on GCP, which should be no surprise to anyone coming here today. We use quite a bit of stuff, as I mentioned earlier. Most of our applications right now are running on Docker and Kubernetes, save for a few things, a few random scripts. I even have stuff running on App Engine that I don't think anyone knows about. I just snuck my scripts in there and kept them running. Our data team uses a number of Google tools a lot: BigQuery, Cloud Dataflow, Cloud ML Engine. And yes, we still have our own stuff that we built on top of Compute Engine. But the majority of the things that we are using are things that we don't have to worry about. So that was two months of migration that happened in 2016. And at last count, we had over 500 Compute Engine instances running. And that's excluding containers. I don't count them, nobody counts them. A lot of our data and machine learning processing happens on GCP. So when we have new features that power our new personalized home feed, things that you might like, all of these run on there. And this whole platform powers the work of all our engineers at Carousel today. We've moved over, we don't look back, and we don't regret it. Some tips, if you're in the same situation as us, in case you look at some of my points and they actually resonate with you. I think the first thing is to take a break and stop fighting fires. There's never a good time to do a migration. Always tell yourself, if today is a bad time, tomorrow will be a worse time. Just do it now.
And don't be like me. I don't want to hear someone sharing the same story as me down the line, like, oh yeah, I didn't do it then. But that's what people like to hear, the shit that people share about themselves. And start using managed services. If you can, if it's something that your technology team can accept, do it, use it, and save yourself the trouble down the line. I think the key thing here is to figure out what are the things you want to spend your time on. If you are more focused on data and you want your data pipeline to be the best, then spend your time on that. Worry less about the rest of your application hosting and all that. We tried to do everything by ourselves, and we couldn't do it well. And you have to look at the size of your team, how much funding you have, and what kind of growth level you're at. So maybe just take away this: don't be like Victor from Carousel, don't try to do everything by yourself. It was a very bad position to be in, and I'm fortunate that, after almost five years of working with the team, they have not fired me yet, and I'm still here, able to share this story with all of you. Okay, that's actually it for today's sharing. I don't know if we have time. Do we take questions?
Yes. Does anyone have any questions for Victor?
Hi, my name is Arringam. I work with GoldenGridder Consulting, which is a consulting firm in Singapore. Can you talk about how you built your deployment pipeline? Is it all on Google Cloud, or do you use any other services for your deployment pipeline?
We're building it on Google Cloud.
Okay.
Yeah. Tooling-wise, what most engineers see is Jenkins when they're trying to deploy. Depending on what interface they use, some of them want to deploy over Slack and other stuff, but everything is on Google Cloud.
Any other questions? Okay? No questions.
Thank you, Victor. Thank you for your time.