Welcome to my lightning talk. My name is Andre Giordini. I'm an independent cloud native consultant and trainer, and today we're going to talk about ditching data pipelines, and why treating data as assets is the best thing you can do.

I am a platform engineer, and I happen to do a lot of work with data companies and data platforms. If you've worked in data for more than a couple of weeks, I'm sure all of you know what this logo is about: Airflow. A lot of people like it; a lot of people hate it. But every time I look at Airflow and the way it works, I can't help but make this comparison, and let me know if you see the similarity here. For me, working with Airflow feels very much like working with Jenkins. Sure, they do similar things, but there is a deeper similarity in the way Airflow and Jenkins operate, and I think it's this: both of them are like a blank canvas. Every time we install a new Airflow instance, or a new Jenkins instance, it's like starting from scratch, from a blank painting, and we need to build things on top, layer after layer. Now, when you have a blank canvas, only two things can happen. Either you're an artist and you create an amazing masterpiece, or you figure out that you're never becoming a good painter after all.

The main problem I see when companies use Airflow is that Airflow focuses a lot on how data is built, on the process that brings data from its source to a usable state. Instead, I think the focus of a tool that works with data should be understanding how pieces of data relate to each other. Airflow has been around for a long time, from an era when data was mostly stored on disk or in databases. But now we live in the cloud, and our data can be pretty much anywhere.
Our data can be a BigQuery table, a file saved on S3, a Postgres database, a blob in object storage. It's not as simple anymore, and it's becoming more and more complicated to figure out how things are connected, how all our pieces of data fit together. And this is why today I want you to think about treating your data as assets rather than as workflow pipelines.

I happen to be a contributor to an open source project called Dagster. I've been working with this software for a while: it's open source, it's cloud native, and I've contributed a couple of PRs to its integrations. Dagster really gets this right, in my opinion. It treats data as units of things connected together. Here I've shown you two example pipelines where you can see clearly, on the left, that the data is pulled in by Airbyte, then we do all the processing using a tool called dbt, and finally we run a Python function to build the forecasting model. On the right, the same thing happens, but now we're using Fivetran for ingestion, dbt to transform the data, and finally a TensorFlow model to predict the orders. And look how nice it is. The lineage here is so clear; it's so easy to understand how the data connects. And all of this comes out of the box just by writing Python: you return data from one function, pass it into another, and you get all of this for free.

Another thing I really love about Dagster is that it's cloud native first. With Airflow, it was always a big hack to figure out how to run your pipelines locally; it's really complicated. Dagster instead abstracts away the things that don't matter so you can focus on the pipeline, and IO managers are a great example. Are you running your pipeline locally?
Well, then you're using your Docker daemon and your local storage. Are you running it in the cloud? Now you're using Kubernetes, and S3 holds your logs and your intermediate state. And your pipeline does not change, not even a bit.

And finally, there are the integrations. Dagster has an amazing ecosystem that integrates with plenty of different tools: the major cloud providers, dbt, Slack, Airflow, Prometheus, DuckDB, you name it. There are plenty of integrations and plenty of different ways of working with it.

So if you want to know more about Dagster, check out dagster.io. My website is andreagiordini.com. I will be around the whole week, and I will be more than happy to hear about your strategy for building data pipelines, and hopefully data assets in the future. Thank you.