 to ensemble these models' outputs before your output responds to the user. So what I'm trying to say here is that deploying a model to production is never as simple as putting a model in a microservice, in a cloud, and expecting that what it produces is good enough for whatever application you're building. Right? It'll probably look something like this in the end. It gets complicated pretty fast. Case in point: what Peter and Giao showcased to you earlier was Lasso. Oh, sorry, not Lasso, I mean Jaeger. I'm too excited about my own thing, but yeah. So if you looked at what they had, right? They had multiple models and some sort of experimentation service. They had feature getting, they had post-processing, they had ensembling, right? And if you think about building something like this... sorry, okay, I'll talk about it later. So it's very difficult to think about a model in production as just a model. You have to think about it as a predictive unit. This is a term that was coined by the developers behind Seldon, right? They're building something similar, but it's way more featureful. What I'm trying to say is that when you want predictions from something, that something is usually not a model but a system. A system of different entities doing different things, right? Imagine you built this entire system within a monolithic application. All that complexity would become something that's incredibly hard to handle even for software engineers, much less data scientists. And you want data scientists to not have to grapple with code as much as possible. I mean, you want them to code, right? They need to code to make their models, but you don't want them to be building large services to house their models. So the problem with something like this is that a lot of code ends up being boilerplate, right?
You end up using the same code across your applications, and then it becomes very hard to make changes, because a change in one of them has to be propagated across all the other applications. It's also very difficult to orchestrate all these flows within a single application, because it's not immediately visible what each part is feeding to another, right? And it also means that you're locked into a single language for the project, which in machine learning can be quite a handicap. For instance, let's say you train an XGBoost model. Nowadays you can serve it with anything, but back when this was built, you could only serve it with something like Java or Python, while the rest of the application we wanted to write in Go. So it was difficult, right? What's the next slide? So logically, what you want to do is break it up. You want these entities to live within their own services and interact with each other somehow. But once you've broken it up, how are you going to define the flow? The logic for how this output flows into that input, and how the request flows to all three models: where does that live? Do you put it inside each of the applications? The problem with that is that it becomes very difficult to make changes. You add one model, and you have to change this service, and you have to change that service. It becomes extremely difficult to maintain this system, and that's not what we want. So enter Lasso, which is something we built as a solution to this problem. It's a microservice orchestrator, which is an extremely big word and very fancy sounding, but I hope it explains what it does somewhat. It's supposed to orchestrate flows between different microservices. In this case, those would be the services that comprise a predictive unit.
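To make the idea concrete, here is a minimal Python sketch of keeping the flow in one orchestrating place instead of inside each service. It is not Lasso's code: the function names (`get_features`, `model_a`, `model_b`, `ensemble`, `orchestrate`), the request fields and the scoring formulas are all hypothetical, and plain functions stand in for what would be separate microservices reached over the network.

```python
from typing import Callable, List


# Hypothetical stand-ins for separate microservices. In a real deployment each
# would be a call to its own service; plain functions keep the sketch runnable.
def get_features(request: dict) -> dict:
    return {
        "distance_km": request["distance_km"],
        "is_peak": request["hour"] in (8, 18),
    }


def model_a(features: dict) -> float:
    return 2.0 * features["distance_km"]


def model_b(features: dict) -> float:
    surcharge = 3.0 if features["is_peak"] else 0.0
    return 1.5 * features["distance_km"] + surcharge


def ensemble(scores: List[float]) -> float:
    # Simple average of the model outputs.
    return sum(scores) / len(scores)


def orchestrate(request: dict, models: List[Callable[[dict], float]]) -> dict:
    """The flow lives here; none of the services knows about the others."""
    features = get_features(request)
    scores = [model(features) for model in models]
    return {"prediction": ensemble(scores)}


result = orchestrate({"distance_km": 4.0, "hour": 8}, [model_a, model_b])
print(result)
```

Adding or removing a model in this sketch only changes the list passed to `orchestrate`; the feature getter, the models and the ensembler stay untouched, which is the maintenance property the talk is after.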
So by orchestrating these microservices, as you can see, Lasso has access to each of the services within the predictive unit. It's able to execute workflows, similar to what your Airflow DAG would look like, right? You define some sort of directed acyclic graph, your request flows through this graph, and you get some sort of response. So Lasso is comprised of workflows. Workflows are, as I described to you, a directed acyclic graph of tasks. There's a variety of tasks that Lasso is supposed to execute, and all of these tasks have access to an in-memory JSON store. When a request comes in, Lasso will orchestrate the tasks as they are required... okay, let me show you later. Lasso will execute these tasks in turn, either synchronously or asynchronously, to produce some sort of response. So it's going to look something like this, right? You'll have some set of options. You're able to set which endpoint you want it to be at. You're able to set a global timeout, which has proved to be very useful for our data scientists. And you're able to define a DAG of tasks, as I showed in the previous diagram. So what is a task? A task is basically something that does something if the conditions are met, right? The way we've designed it is that you don't draw a graph. You define a task that only executes when certain triggers are fired. It's able to depend on the content of the request, on other tasks' statuses (for instance, whether an upstream task has succeeded or failed), and also on other tasks' outputs. We have a variety of tasks that are currently built into Lasso: tasks that can echo values, tasks that can make HTTP calls, and tasks that execute some sort of lambda function to transform the request or the outputs of other tasks. So it looks something like this, right?
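As a rough illustration of tasks that run only when their triggers fire (this is not Lasso's actual configuration format or implementation), here is a minimal Python sketch. The `Task` class, the `execute` function, the status strings and the field names are all assumptions made for the example, and it only covers the synchronous case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    run: Callable[[dict], object]                      # receives the shared store
    condition: Callable[[dict, Dict[str, str]], bool]  # (store, statuses) -> should run?
    output_path: str                                   # key in the store for the result


def execute(tasks: List[Task], request: dict) -> dict:
    """Run every task whose condition holds until no more tasks can fire."""
    store: dict = {"request": request}  # shared in-memory store
    statuses: Dict[str, str] = {}
    pending = list(tasks)
    progressed = True
    while pending and progressed:
        progressed = False
        for task in list(pending):
            if not task.condition(store, statuses):
                continue
            try:
                store[task.output_path] = task.run(store)
                statuses[task.name] = "succeeded"
            except Exception:
                statuses[task.name] = "failed"
            pending.remove(task)
            progressed = True
    for task in pending:  # conditions never became true
        statuses[task.name] = "skipped"
    store["statuses"] = statuses
    return store


# The order of the list does not matter; the conditions define the graph.
tasks = [
    Task("fallback", lambda s: 0,
         lambda s, st: st.get("features") == "failed", "model_score"),
    Task("model", lambda s: s["features"]["x"] * 2,
         lambda s, st: st.get("features") == "succeeded", "model_score"),
    Task("features", lambda s: {"x": s["request"]["x"]},
         lambda s, st: True, "features"),
]

print(execute(tasks, {"x": 3}))
```

Because each task states its own trigger, the graph is implied rather than drawn: the model runs after the feature task succeeds, and the fallback echo runs only if the feature task fails.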
So you have the task type, the configuration of the task, additional inputs to the task, the output path that you want to put the result in, and then the conditions to run, which are extremely important. And if you want this task to be the exit, there's an exit flag. So how does this whole thing solve the problem, right? Thanks to Lasso, microservices can have distinct roles. You don't have to have a microservice do a bunch of things, and do them in a mediocre way. You can write a service that processes data in a language that processes data well, for instance Python. You don't have to write it in Go, which is like ripping your hair out, right? And then there's also encapsulation. The microservices don't necessarily have to know about each other; only Lasso knows about them. So they basically don't have to care about the world around them. All they care about is whatever they get and whatever they have to return. Similar services can also be reused for other predictive units, so instances of a service can be reused as, say, feature getters. And of course, there's no dependency on a single framework or language, as I explained in the last point. You can have a predictive unit that's comprised of microservices in Go, and Python, and Java, whatever your developers are comfortable with, or whatever is best suited for that task. It also means that it's extremely easy to edit and iterate on workflows, because you don't have to write additional code, and the configuration of the workflow sits outside of the definition of the services. Oh yeah, and boilerplate can be moved into Lasso. For instance, we're planning to move feature getting into Lasso, so that you don't have to write a service for that anymore. It will be a task. Lasso also brings some other nice things to the table. It's fast and it's scalable.
It's extremely lightweight. It sits at the heart of Jaeger, so I know at least that it's able to handle the load that the Indonesia marketplace puts on it. So that's pretty impressive. Also, because it's written in Go, it can be compiled into a single executable that's very, very easy to deploy. We have a Helm chart for it, so all you have to do is helm install, provide it with a ConfigMap that has the workflow specification, and it works. That's it. And if you've seen from the previous slides, we also support Go templates, which are similar to your Jinja templating, right? That allows you to have logic inside your workflows. So you're able to do things like input and output validation, and you're able to transform requests and responses and so on. And the nice thing about the workflow sitting outside of the service definition is that you're able to version the workflow separately from the services. There are some caveats though; it's not all sunshine and roses in Lasso land. There are trade-offs between latency and type safety. I mentioned the JSON store earlier, right? The thing is that JSON can be very expensive to parse. It takes very long. So Lasso opts for a lazy JSON parser that doesn't really care about how well-formed your JSON is. In exchange, you're able to parse the JSON extremely quickly. So to handle validation, you have to write it in a lambda instead. It can also be very intensive because of the JSON store. Go templates are also not intuitive, right? If you've used Jinja templating, it's the same. [inaudible] And also, it's currently HTTP only. It's not cool, right? It needs to handle gRPC. So I think that's it. I have a final slide with my GitHub handle, because I didn't know what to put in the final slide. But yeah, I hope you enjoyed what I've been saying. Does anyone have any questions?
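The trade-off mentioned in the talk, where the parser does not validate and validation is written in a lambda instead, could look roughly like the Python sketch below. This is an assumption-laden illustration and not Lasso's lambda API: `validate_request`, the store layout and the fields `user_id` and `distance_km` are hypothetical.

```python
import json
from typing import List


def validate_request(store: dict) -> List[str]:
    """Hypothetical validation lambda: returns error messages, empty if valid."""
    errors: List[str] = []
    body = store.get("request", {})
    if not isinstance(body.get("user_id"), str):
        errors.append("user_id must be a string")
    distance = body.get("distance_km")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(distance, bool) or not isinstance(distance, (int, float)):
        errors.append("distance_km must be a number")
    elif distance < 0:
        errors.append("distance_km must be non-negative")
    return errors


raw = '{"user_id": "u1", "distance_km": 4.2}'
store = {"request": json.loads(raw)}
print(validate_request(store))
```

A downstream task could then make its condition depend on this task's output being an empty list, so that malformed requests never reach the models.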
[inaudible] Do you guys have any questions? [inaudible] We're not going to run off immediately. Peter and Giao are going to be there, Julien's going to be there. They're already charismatic speakers. But I will say this: this is the first time I'm hearing the official version of why Jaeger is named Jaeger. I always assumed that the team liked to drink, and that's why they called it Jaeger. So, okay. Yeah, let's go with that, Pacific Rim. Okay, cool.
I am actually going to pivot a little bit and talk about, thank you so much, burning out in data science. And why is this an important topic? Because what's the most important thing when you are trying to launch a data science product? What was that? It's people. Who said that? Thank you. And the reason why I say people is because, honestly, the talent pool here in Singapore and Southeast Asia, and in a large part of the world, is very thin. If you don't support people, they're going to burn out. They burn out, you don't get data science. Cool? So, a little bit more about me. My name is Jairet Tan. It's pronounced Jairate without the T. Here is my contact info, if you want to get in touch with me later. My current favorite animal is the penguin, because they're just so cute. Look at that one. Even real-life penguins are very cute. So why do I care about burning out as a data scientist? Well, my task here at Gojek is simple. I support 46 or so data scientists, and I'm here to help them in their careers and to make sure that all of them feel supported by the organization. My background is all over the place. I used to work as an economist at MAS. If you've heard of it, it's right across the street. I'm a bond breaker. I have strong opinions about the scholarship system. Please come talk to me. I worked at Fliprred, which is an app that has since gone on to greener pastures.
I was the first data scientist there, and I worked on churn models, data, growth hacking. At Facebook, I started out as a data scientist on Groups. Then I transitioned into a software engineering role on A/B testing. Then I quit and started my own consulting business, and published a paper in the Lancet on Bayesian meta-analysis. And now I'm here supporting my teammates, and I have to say that these people are on point. You know, I've been around the world. That's a song from the '80s, in case you didn't know. No? You don't know it? And I have to say that the talent here at Gojek is stellar, and I've really been impressed by all of my teammates. So I'm living proof of the fact that you can have many careers and you can reinvent yourself [inaudible]. So what happened to me at Facebook? I burned out. In 2014, my best friend died, and I didn't take any time off because I was young. Back then I was in my 20s. I still am. And I was just powering through, right? And then in 2015, Facebook grew hugely. That's when headcount at Facebook doubled. By January 2016, I was physically exhausted, and in March 2016 I actually took two months off work. In November of that year, I left Facebook. Soon after, I trained as a yoga instructor, focusing on mindfulness, and then started my own consulting business. Now, what is burnout and how do you recognize it? There are three points here. It's exhaustion, cynicism and detachment, and feeling ineffective and unaccomplished. Exhaustion is pretty obvious. I put a picture of Yoda here because I think Yoda's cute too. You can read the symptoms by yourself, but it's important to know that exhaustion can refer to both physical and emotional exhaustion. So if you feel like you're irritable and that you're short with your colleagues, that's an example of exhaustion.
Cynicism and detachment refers to a lack of enjoyment when you engage with your work. You start to feel pessimism. You start to isolate yourself. I did this a lot at Facebook too, at the end of my career there. And detachment refers to things like, I go in late, or I don't really want to engage with my colleagues. The last point is feeling ineffective, and note that feeling ineffective can also be a cause of being ineffective. So it becomes a vicious circle. So congratulations, if you exhibit 15% of these symptoms you may be suffering from burnout. And that's not a good thing, because burnout equals attrition. I mean, that's from an organizational perspective, but for you burnout means burnout, and that's bad too. Okay. So why do data scientists burn out? I think that there's something unique about burnout as a data scientist, and I'm going to give you a couple of hypotheses; precisely six. One, this field is full of media hype. There is a myth that data science can do everything, but in reality it can do some things really well, like it can play Go, but it cannot really stand in for product or design. I think we as data scientists often get involved in conversations where PMs or directors ask us, so should we launch or not? And then you're like... um... Yeah. Cool. Hypothesis two: it's a lack of clarity about what we do. This has to do with media hype, but it also has to do with the fact that we're a relatively young field and we do so much. We do machine learning. We do statistical inference. I trained as a statistician; that's where I think I shine. We do software engineering, we do data visualization, and sometimes we're also expected to tell stories. This is not an exhaustive list, yeah? Not all of us are going to be good at all parts of this list. Hypothesis three.
Data is where product and engineering disagree, because engineers tend to be realists and product managers tend to be idealists, and data science often gets caught in the middle. How many of you have fixed a logging bug? No? Oh my gosh, really? You all are living the life. I have fixed logging bugs so often, where the PMs define something as one thing and the engineers just increment the wrong part of the app, and all the metrics look crazy, all the results look crazy, and then I'm the one that has to chase it down. That's an example of where product and engineering disagree. I'll leave some time for questions later. It's also important to know that the data science lifecycle is very, very different from the product lifecycle or the engineering lifecycle. In engineering, you tend to go from prototype to implementation to maintenance and improvement, but for us, there is a front-loaded analysis portion, right? Because we're not going to execute unless we have a fairly good idea that it's going to produce good results. And how many of you have been involved in daily sprint stand-ups where you're expected to just say, I'm still researching, same as yesterday? Yeah, right? Very exhausting, right? The other thing to note is that we're often orphaned by the organization. If you work at a startup, the data science person is usually not the second or third hire. It's usually like the 20th hire, and then when you get there, the engineers kind of look at you like you're a bit crazy. And when you report into engineering, you're reviewed and ranked against engineering standards, which tend to ignore the front-loaded analysis portion. And when you report into product, you're reviewed and ranked against product standards. And who knows what those are? I'm sorry. I love my product managers and everything, but sometimes I don't really see how [inaudible].
So who advocates for data scientists? [inaudible] Oh, excuse me. The last thing is the relative lack of mentors and managers. And I think, really, when I was at Facebook, this was the number one reason. Because data science was such a young field then. I was 12, you know? And I'm now 22. And my managers were just 14, you know? I'm exaggerating. My manager was actually younger than me at that point. Bless his soul, he tried his best. The issue here is that a lot of us data scientists have an obsession with being technical, which somehow has come to mean "I know how to use deep learning." But we don't realize that in order to leverage ourselves, we need to leverage ourselves in terms of talent rather than technology, right? Because when do you really encounter a problem where deep learning, or whatever... I come from statistics, so right now everything is Bayesian, right? When does that Bayesian approach buy you the 20% win? Not really. So given that these are my hypotheses, how do I think that we can prevent burnout? The number one thing is to set expectations, right? When I work with my team here at Gojek, I continually help them to remind stakeholders that data science cannot answer every question. In particular, don't get data science to answer really hard business questions that are at the intersection of product, engineering, design and UX research. We are a voice at this table. Yes, I agree, we can provide a lot of valuable insights. We can provide a lot of valuable strategies on how to grow, but we cannot solve everything. The other thing to note is that we have to be very cognizant of where we are in the data science lifecycle, which means that when we are in the research or analysis phase, we have to really be aggressive at telling other people to, you know, give us some space to breathe, to say it nicely. I was about to use an expletive.
And each phase requires different skills, a different cadence of check-ins and different project management techniques. Data scientists answer very broad, open questions at first, which then converge on an implementation solution, right? And we have to be able to recognize that and advocate for ourselves. Another thing that you could probably do is to ask for organizational clarity. As I said, being a data scientist at an early startup sucks, because you're too late to be a founder but too early to be a comfortable employee. A lot of you are laughing, probably because it's true. And being a data scientist at a big company also sucks, because you're often caught between product management and engineering but you have none of the authority. And you may also have to compete with other data teams, like data engineering or product analytics. I think in the face of this, you can ask yourself what data science as an organization, and you and your teammates, can uniquely do, and focus on those things. And once you get clarity on what it is that you should do, other people should listen to you when you are the authority on that. I think this requires a lot of self-awareness and the ability to advocate for yourself. The last thing, I think it's the last thing, I'm just crossing my fingers and hoping that this is the last slide... sorry, I hope I'm not putting my foot in my mouth. It's to find mentors. You find mentors inside the organization, so you get more senior data scientists within your organization to mentor you. But you also, outside the organization, use these meetups to find counterparts that you can bounce ideas off of. And that's what I hope that you will do with DSSG tonight. But the more hidden and more valuable thing, I think, is that because we're such a young field, you may not necessarily get mentors with that much more experience than you. As I said, I'm only 15. [inaudible] Wrong. I'm only 22, yeah.
And this field is only, what, six years old? Like, who else is here to help me? So one of the ways that you can overcome that is to become a mentor yourself. Peer mentorship, I think, is where you start to learn, and that's actually how I got started in this business: by helping my teammates and shepherding them through tough times. In doing so, I built trust, and they also mentored me when I needed them. So this is not an exhaustive list, by the way, of why burnout happens and how to prevent it. But I'm hoping that it will spark a good conversation. And if you're interested in being a manager or a team leader at Gojek, please come email me. This is my name, jiai at Gojek.com. And we have one last video, but I feel like the sound's a little off and we're... oh, it's all right now. Well, we can watch this video, but before that, do you have any questions for either me or Zhiting?
Could you give an example of when you were asked to answer a stupid, sorry, a business question that really wasn't well posed?
I'm not covered by Facebook's NDA anymore, right? So I worked on Groups, right? And one of the last products I worked on was the Groups app. The standalone Groups app. How many of you have heard of it? Exactly! Right? Because basically every week, they'd be like, oh, did we hit our metrics? And then they'd turn around and be like, so should we launch the app? And I'm like, I don't know, it depends on what you think the business value of this is, and whether you think that having a standalone Groups app is part of your entire strategy of having individual products be broken out into their own separate apps. And then they'd be like, should we launch the app? You know, you get into those weird conversations which are very much on loop, and you have to start recognizing this, right?
When you are asked to behave like a product manager, to make product decisions, because your product manager is not bold, you should find a way to say, hey, I've provided you with the best information that I can. We need to have a multifaceted approach to decision making at this company, because you can't simply just look at me and say, data has all the answers. We often do, I mean, we're geniuses, but we often don't. Any other questions before we play this video? Yes.
Sorry, I had a question about Lasso.
Yes. Dylan. Yes.
Thank you. So regarding preprocessing, very often it's a part that is actually duplicated. You have preprocessing when you're training a model the first time, and then you have preprocessing for live requests. And I'm not entirely sure whether you were using the same code or the same components, basically, in Feast or in live preprocessing.
Oh yeah, excellent question. That's where Feast comes in. Please like and star us on GitHub. Okay, so the way Feast is able to help you in this part of the machine learning workflow is that we do our preprocessing outside of the app. It's done inside the stream. So in our case, we do it in [inaudible]. If it's real-time data, it passes through the stream and it's transformed. And ideally what comes out of it is a pure feature. It's whatever you need for your model. So whatever goes through that stream would ideally be something that could be shared across multiple predictive units. So does that answer your question?
Absolutely, thank you.
Okay, thank you. We're going to play this last video and then hand it back to the organizers of DSSG. How do I play this video? There's a button here. Oh, okay. Sorry.
[Video] Let me guess. Do you love making mobile apps? Bitcoin predictors and Flappy Bird games. That's cute. Me? I'm a super app. Yeah, that's right. Super. Name's Gojek. Baby app killer. Am I super? Hmm. School's in.
Ready to learn some stranger things? Here we go. Welcome to Indonesia. [inaudible] 18,000-plus islands. 22 billion minutes wasted in traffic. [inaudible] Gojek. Super app. Ooh, nice. Payments, bills, rewards, shopping, business. You get the point. I do 100 million orders every month. Wait, what? No. Yes, you heard me. 18-plus products for 261 million people. Who are you gonna call? Me. I do it all faster than you can say "Beam me up, Scotty." Still with me? Good. Think you're a linguist? Say "makasimasu" to one of the largest JRuby, Java and Go clusters in Asia. One in four Indonesians have me in their pocket. Even your grandma. Every day, my riders cover 16.5 million kilometers. That's more than 21 round trips to the moon. Does your app go where no man's gone before? I didn't think so. Oh, sorry, Neil. Most importantly, 2.5 million people rely on me for their income every day. I help build buildings, run businesses, and move an entire nation. I know. Where do I get the energy? Wait, I'm now going to other countries too. Vietnam, Singapore, Thailand, and the Philippines. You ready? Now, what were you saying? Right, you're looking for a job. Say something. Super?
Of course, thank you to Gojek for sponsoring the venue and food, and for having speakers share with us. How many of you here have actually looked at the Feast GitHub? Who is the last committer? Want to find out more about Feast? Check it out, or talk to her. Gojek also has an excellent Medium blog. Go check it out; they talk about tech, they talk about life, they talk about culture, everything about that. I'm sure the speakers will be hanging around as well, so just do whatever you want. If you're tired and want to head home, that's fine; it has been a very packed meetup with a lot of content. So that's it. Thank you, everyone.