Today I'll be talking about what happens once you build complicated products, complicated data products or anything else: how do you communicate that entire thing to your users? How do you build documentation as a product in itself?

Okay, a brief about Gojek. Gojek is an online platform for ride-hailing. In 2010 we started as a phone-based service so anyone could hail a ride. But over the last couple of years we have evolved into a full platform which hosts more than 18 products, from ride-hailing to financial products to food delivery, across different markets. So the business has grown, and the complexity of the technology has grown very, very rapidly.

A little bit about me. I started my career as a designer, turned into a developer, spent some time as a data journalist, went into data visualization, and these days I'm doing a lot of data engineering.

So, the agenda. First I'll give you a little landscape of data engineering at Gojek: what we do and what kind of products we build. Then I'll talk about Chronicle, which is our technical documentation and which we treat as a pure product in itself. Then I'll talk about why exactly you need an internal product website. Your customers are internal customers, and you still need a product website. Why do you need a micro-website? Then we'll talk about the entire process we went through, going from having nothing to having a proper product website. And at the end we'll cover the impact of this whole exercise.

So let's start. This is what the landscape of data engineering looks like at Gojek. We have different OMS apps which produce data from different streams. That goes to a fronting stream, a fronting architecture, and from there to the main streams, which are a lot of Kafka clusters.
Then we have consumers, people who consume this data. On one side we have aggregation, where all this raw data gets aggregated, and there are teams which consume that data as well. Then we have the parts where data visualization happens, where we take this aggregated data, visualize it and give it to users. Then there is warehousing, infrastructure orchestration, a monitoring component, a load testing component and auto-healing components. So that is what the data engineering landscape looks like.

Across this whole landscape we have more than 18 or 19 internal teams working on different things. Data engineering is cross-cutting, and all these teams use the products we build for them. We are a small team, I think somewhere around 10 people, and we build products for all these teams.

In the approach we follow, the first thing we definitely focus on is scale. The scale at which we grow is not linear, because the business itself has grown from one product to 18 in the last three or four years. And so has the scale of the data, the complexity, and the number of users within the organization.

The second is automation. As the product grows, we also grow into international markets, and when you grow into international markets you need to set up the whole infrastructure for a different country. And when you set up the whole infrastructure for a different country, you have to communicate the same thing to all these different teams as well. So as a team of engineers, if you want to focus on different products, on scaling them, and on scaling to international markets, you want to make sure the entire process is automated and no manual intervention is required.
So we focus heavily on that: whatever products we build, whatever the process of setting up the infrastructure is, the entire thing is automated.

The third is product mindset. Right now there is no data engineering product which we offer to end customers outside Gojek. Data engineering is an internal team which caters to different teams within Gojek. But we still operate with a proper product mindset: we consider ourselves a complete B2B product company within Gojek, and we operate in the same fashion. That covers things like how you provide access, how you provide documentation, and how you communicate which products you are building. We treat ourselves like a pure B2B company and we follow the entire cycle: building products, collecting requirements, sales, going out to talk to people, showcasing our products. The whole process you see at any enterprise company happens within the company. So these 10 engineers are not just developers. They focus on the full product cycle.

So let's talk about the problems we were facing at the point where we did not have this technical documentation as a product. The biggest one was that, since we were operating as a B2B and as a completely separate team, communicating what exactly we were working on became hard. Most of the people on the floor did not understand what exactly the data engineering team was working on, what was next in the lineup, or what product we were going to release next.
Then, once you deploy infrastructure and products for these different teams, a lot of support requests start to come in, and they start to consume your core development time. Given the size of the team, if you spend a lot of time making sure the support requests get catered to, you are not progressing fast on the features track. And a lot of support requests were coming in at that time.

And the last thing: when we are building as an internal product team, it is very important that we communicate that whatever problems you are solving, whatever you need in terms of data and consumption, our product is the solution you are looking for. You need to communicate that very clearly, and without a proper channel to do that, this was getting really hard.

And I think this was one of the worst problems. Data engineering offers internal products to different teams, somewhere around 15 to 20 small or big products. But each team uses only certain products: BI uses certain products, developers use separate products, data producers use separate products, data consumers use separate products. For each of these channels, data engineering became equivalent to one particular product. Let's say Firehose is one particular product for consumption. For certain people, data engineering is all about "oh, that's the team that built Firehose". Or there is another product, Dagger, which mostly caters to BI folks.
So for them it's "oh, the data engineering team, they're the Dagger team". The entire team became the equivalent of one particular product team, and that was the problem we wanted to solve. Because if that happens, these teams are not looking at the other products we are offering, and they are not able to get the full capability out of what they could use. I think this was the biggest problem.

This is what the end product looks like. I think the fonts are not available. But this is what the product website looks like. We call it Chronicle, and it is a microsite we have built.

If you look at different products outside, for example HashiCorp's products, and you look at their documentation: these are large-scale open source projects which they offer to the entire ecosystem, and they very clearly build very good documentation on how you go about using it, why you use it, and everything. That makes sense for them, because the product is global, the ecosystem is global, people don't know anything about it, and they need to communicate that understanding. But how exactly does that fit into an internal organization?

We actually followed kind of the same cycle on that front. And this is how the end product looks: we talk about what the data engineering team is doing, what different scenarios we cater to, what particular products we offer, and the whole cycle. Then we talk about it from the product point of view, and from the developer's point of view as well: what features developers can use.
There are complete knowledge resources, and everything is there: videos, content, documentation, even code samples, starting from purely why to use it, to how to use it, to everything.

Okay, let's move on to process. The whole cycle for us was, I think, two weeks. Before that we had an internal wiki, which was more like GitLab Pages or things like that. That's where we started: something was there, in small pieces.

Then we started the process, and the first step is how you go about research and planning. This was a product for us, and we were treating it as a complete product. It's not just documentation, it's not just a README file. It was a full product which would communicate what exactly we are doing.

In planning, as I mentioned, there are different categories of teams. There are BI folks, there is leadership, there are people in the market section, and there are people in pure development who are the consumers of this data. So first we worked on designing personas around them: who are these different kinds of people, and what kinds of products are they using? We started by mapping different products to different personas. We heavily use narratives to do that within the team, which I'll talk about in a while.

Then we went through this whole process: taking the different sets of users and the different products we offer, and mapping which users are using which products. And then we sat down for the exercise of asking: if, ideally, we want them to use these other products as well, what does that mapping look like? So we categorized everything, then went about it and talked to people.
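The mapping exercise described above can be sketched as a simple set comparison. This is only an illustration: the persona names loosely follow the talk, but which persona uses which product, and the target assignments, are hypothetical placeholders rather than Gojek's actual usage data.

```python
# Hypothetical persona-to-product usage; the assignments are placeholders
# for illustration, not real usage data.
current_usage = {
    "bi": {"Dagger"},
    "developers": {"Firehose"},
    "leadership": set(),
}

# What the team would ideally like each persona to use (also hypothetical).
target_usage = {
    "bi": {"Dagger", "Streams"},
    "developers": {"Firehose", "Streams", "Dagger"},
    "leadership": {"Dagger"},
}


def adoption_gaps(current, target):
    """Return, per persona, the products they could use but do not use yet."""
    gaps = {}
    for persona, wanted in target.items():
        missing = wanted - current.get(persona, set())
        if missing:
            gaps[persona] = sorted(missing)
    return gaps


gaps = adoption_gaps(current_usage, target_usage)
for persona, products in gaps.items():
    print(f"{persona}: talk to them about {', '.join(products)}")
```

The output of such a comparison is a list of conversations to have, which is what the research phase in the talk was about.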
So communication was the biggest part when we started the research and planning phase. In a nutshell: figuring out what exactly users want, and what the next thing is that they can use.

Next was preparing content. For two days we stopped all development. The entire team sat down, because this was an entire backlog: we did not have anything concrete at that point. So we sat down, listed every product we offer, and then went about designing the entire content for it.

What you see in the screenshot is us mapping the entire list of things. In that list there are streams. Streams are our Kafka clusters, and these Kafka clusters hold data. There is mainstream, there is log stream, there is corn stream, there is tag stream. There are different kinds of clusters which hold different kinds of data.

Now, as soon as I tell you there are streams which hold data, the next question will be: okay, what kind of data? Who is publishing, who is producing, who is consuming? What use cases are there, what case studies are there, who has solved what problems, what kind of problems can I solve? All these things come to your mind, right? These are the questions we gathered into the content design strategy, and we put together the answers for all these streams, all these aggregation products, and everything.

For example, there is a dedicated page for the mainstream. If you visit Products, all these streams are listed: mainstream, upstream. Aggregation products are listed, experience products are listed, products around warehousing, products around data infrastructure.
If you go to one of the streams, it lists the overview, applications, topics, publishing, consumption, the entire architecture, and where exactly you can monitor that particular stream. What are the case studies? What is the scale of the data? Here you get seven days of retention, and on the mirror you get three months of data retention. How many events are being published, who is consuming, how many topics, how much data is there in that particular stream?

And this is just one thing. We prepared this entire content for every piece of data we were offering. Even after doing that entire cycle, the landscape is so huge that there are still a lot of sections which are not documented, but we tried our best to make sure the entire thing is covered.

So research is done, the content is ready, and the content is actually pretty extensive. Then how do you go about designing it? How do you want to communicate it?

If you take a data engineering team as a term, you will hardly correlate design with it. If you talk about an engineering team, what comes to mind is people who are pure nerds or pure developers and who focus purely on the code. Design will be the last thing you look into. But that's not the case for us. We actually focus a lot on design, and, as you know, my own background has been in design. So for everything we build, we make sure it's well designed, not only from the development point of view but also from the presentation point of view.
So even for this particular product we wanted to make sure that it looks good, and that whoever is looking at it feels good about it. The experience has to be good, but yes, we care about the look and feel as well.

If you look at a lot of architecture diagrams, what people usually do is pick Monodraw or any other tool out there, put down boxes, and make the entire diagram. We wanted to get away from that as well and make that experience pretty good too. The landscape I showed earlier is part of that: those architecture components we designed on our own.

What we have in the design language is a complete UI kit which specifically focuses on data engineering. It has components you can use to design architecture diagrams. For example, we have a Kafka cluster piece: there is a component which represents one particular cluster, one particular stream. Or say we have Firehose as a product: there is a full component which lists Firehose as a product, and if you want to represent Firehose itself, there are smaller components you can use.

In terms of tooling, we simply built it with Sketch and exported these components as symbols in Sketch. So all these UI kits are actually Sketch symbols which the entire team uses. We are not a pure design team, but we still follow the same guidelines which a pure design-system team would follow.

That's for designing architecture diagrams and everything. And as I said, we believe a lot in narrative design. Whenever we come up with a product, what we do is directly go out and build a narrative around it.
By narrative I mean: if that particular product has to communicate with the user, how will it go about it? The user will ask certain questions, and it will give certain answers. Even for our CLI tools, even for a simple CLI tool, we build product narratives. Not just "okay, this is the CLI, this is what it has to do", but a complete product narrative around it.

And to distribute our products to different people, we also have a complete designed set of brochures, kits and print materials which we use whenever we go for showcases and everything.

So we have this entire library: the UI kit, the narrative design, and the entire design language, which we use to communicate our products, to build architecture diagrams, and to build narratives around them. And all of this is maintained and used by the team itself. These are some of the components used to build architecture diagrams.

And this is one example of a product narrative. This is one of our tools, Odin, which is an infrastructure orchestration tool. Here Odin is saying, "I'm Odin, almighty god." The user asks, "What all can you do?" And it says, "Okay, I can safely create infrastructure for you. I can create data clusters, I can create firehoses, I can create consumers, I can replicate your entire infrastructure to a different country as well." Then people can simply say, "Okay, I want to create new infrastructure, and I want to create it for this particular city." And then it can ask, "Okay, how many data clusters do you want? Do you want different streams of data, or for this new region do you just want certain streams?"

So this is the narrative we designed before even starting to design the service itself.
What is the best narrative a user can have with this particular service, to make sure the experience is good? And what we are talking about here is a backend service, a REST service with API calls. Even to design that API call, we make sure there is a narrative around it. That comes way before we go into development or coding, when we are envisioning that particular product.

Okay, in terms of developing Chronicle. A lot of this is data intensive, and this data is growing every day: there are new streams coming, new products coming. So we wanted to make sure that whatever micro-website we build can pull data from different services or metadata services and always stay updated.

And since we are developers, once we finished the core development we did not want to spend a lot of time writing HTML and CSS. We wanted to make sure that whatever bare-bones structure we set up, and whatever time we spend on it, later we can always feed custom design layouts into it without much effort.

Indexing and product categorization happen so fast that you just provide certain tags, and the entire thing falls into place on its own. So even our product manager doesn't have to write any code after we finish the bootstrap. They can just specify certain tags, write certain markdown, and it automatically gets pulled into the place on the microsite where it has to go.

Okay, if you look at this, this is the boilerplate. If someone has to add a section, all they have to define is this metadata: the title, tagline, description, which path it belongs to, which category it belongs to, which position it belongs to, what the tags are, and when it was written.
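A minimal sketch of how a metadata header like that could drive page placement. The field names (title, tagline, description, path, category, position, tags, date) come from the talk; the `---` delimited header format, the sample values, and the `parse_page` and `build_index` helpers are assumptions for illustration, not Chronicle's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical page source: a metadata header followed by markdown.
# All values here are placeholders.
SAMPLE = """---
title: Mainstream
tagline: Placeholder tagline
description: Overview, publishing and consumption for one stream
path: /products/streams/mainstream
category: streams
position: 1
tags: kafka, streams, developer
date: 2019-01-01
---
## Overview
Placeholder body text.
"""


@dataclass
class Page:
    meta: dict
    body: str


def parse_page(text: str) -> Page:
    """Split a page into its metadata header and its markdown body."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("page must start with a '---' metadata block")
    end = lines.index("---", 1)  # position of the closing delimiter
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    # Normalise the two fields the index relies on.
    meta["tags"] = [t.strip() for t in meta.get("tags", "").split(",") if t.strip()]
    meta["position"] = int(meta.get("position", 0))
    return Page(meta=meta, body="\n".join(lines[end + 1:]).strip())


def build_index(pages):
    """Group pages by category and order them by position within each group."""
    index = {}
    for page in pages:
        index.setdefault(page.meta["category"], []).append(page)
    for group in index.values():
        group.sort(key=lambda p: p.meta["position"])
    return index


page = parse_page(SAMPLE)
index = build_index([page])
print(page.meta["path"], "->", [p.meta["title"] for p in index["streams"]])
```

With something along these lines, adding a section means writing one markdown file with a header, and the navigation follows from the metadata instead of from hand-written HTML.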
As soon as you define that, it automatically knows where on the website the page has to go. Does it belong to developer documentation? Does it belong to product documentation, which is more focused on end users such as BI and people who are non-developers? Then someone just writes simple markdown around it, and it gets filled in automatically.

What this helps us with is making sure we can do it continuously. It becomes as close to the code as the comments you write in your code. You just write simple markdown here, and it goes to the microsite properly.

So after that process happened and we were ready with the entire microsite, what we wanted to do was deliver it to our users, and our users are internal teams. The process we follow here is not just for this micro-website. We follow it for any product we build. We do showcases, and we followed the same process here because for us it was no less than a product.

In showcases we invited certain teams and told them: this is the product, this is what it can do for you. You can look through all the use cases. You don't have to come to data engineering for all your needs, or to understand what exactly we offer and what the landscape of the data is. We printed brochures and did all those cycles.

We had personal interactions with people as well, talking to product managers. How do you feel about the entire experience and the micro-website? Does it help you get from point A, where you don't know anything about data engineering and what we offer, to point B, where at the end of the day you have a good understanding of how you can go about adoption?
In the Bangalore office for Gojek we are around 300 developers. And this is what the active users and weekly active users look like. Usage was pretty good, and people were looking into it extensively. I can't say that we have reached the stage where everyone is using it, where this is the single source of truth and nobody has to come to data engineering. But I think we are in a really good position compared to where we were: a lot of people are using it, and a lot of the requests which were coming earlier are not coming to us anymore.

And once you ship something and give it to your users, you feel good when they give you feedback, so these are some pieces of feedback from people. They talk about how this is one of the best pieces of documentation we have built within the org, or that people who have worked at different companies have seen. You will see a lot of good documentation, as in the example I gave of HashiCorp, or any other large-scale open source project. But it's not very often that you see this kind of effort put into building technical documentation for an internal product, for a team of 300 engineers internally. That just shows how much the team itself cares about the product, and how much value it can bring in total.

A few lessons learned. As I mentioned a little, the destination is a myth. I won't say we are in a perfect situation. I won't even say that we are close to it. I think we are still very far. It's an ongoing process, and there are a lot of sections where people still feel they don't have the entire understanding of where exactly the data is.
But I think we are in a good situation compared to where we were before doing this. With continuous progress this is the way to go, and I think with this microsite we can solve most of our problems in giving the entire landscape knowledge to people within the org.

The second lesson is about documentation itself. Even if you buy a single product from a market, it always comes with a handy guide, no matter how simple that product is. If that documentation is not there, the only thing you will do is trial and error: you will try to figure out what this particular button does, even for a washing machine, let's say. If you already use it, that's fine, but if it's a new product you need that documentation to start with. This documentation can come in the form of a video, it can come in the form of a picture, it can come as text, but it has to be there. It's part of the entire product experience.

So if you are building a product, even an internal product, even if it's a REST API, it's a product to someone. Someone is going to be the user of it. So I would say: as much as you focus on the product, focus as much on the documentation, because from my point of view documentation is an extension of your product. It's an extension of your entire product experience.

The next lesson we learned is that communication matters. Even if you build the entire product website and communicate it to people, always make sure that you have personal interaction with people to figure out what exact problems you are solving. This applies whether or not your delivery experience is already good: always make sure that the communication is there. Communication matters.
In whatever form you do it, make sure that you do it.

And the fourth: I talked about these open source projects. What they usually have is a dedicated team. You will see four developers working on it, four people who are good at writing pure technical documentation, three or four designers involved. So there is a pretty huge team focused on just making sure that the product is communicated to the users. What I want to bring out is that this is good, and maybe they need it for serving such a large ecosystem, but it's not necessary. We were a team of 10 developers doing the same thing and still maintaining it. No designer role was involved, no extra front-end developers were involved, no one else was involved. And the entire cycle took us close to two weeks, from planning to finishing the entire thing, even doing showcases. In two weeks that was done.

After that it has been a continuous exercise: just as you write comments in your code, you write certain markdown lines, and it keeps rolling on its own. So you don't have to put your mind into "okay, you need a dedicated team, it cannot be done without that". I would say even a team of two developers can deliver the same experience which a dedicated team can. Developers are good enough for this.

I think that's pretty much it from my side. If you have questions around any of this, I'll be happy to answer.

Changing pretty fast. Yeah. How do you make sure that your. Sure.

Okay, let me put it this way: this microsite I'm talking about, the Chronicle I showed, is not actually your code documentation. It is your product documentation.
Yes, if your product features are changing, then you just go and change the markdown files a little bit. It's the same as documenting anything else. If your API contract is changing, for example, you would document it somewhere. It's exactly the same as that: if your API contract is changing and, let's say, you have a Postman collection, you will go and change the requests there as well. If you're not doing that, do it. For us it's the same as maintaining that: as soon as a contract changes, we update certain markdown and whatever it's referring to. When there are new features coming, we update that. It's part of our cycle: a story goes from verification to production, it's done, and it's documented and you're even.