Hello, okay, all right. Why don't we get started? How's everybody doing? I already asked, but yeah. Thank you for coming. I'm going to talk about how to improve the CI/CD process for your cloud-native Python applications using a few different private PyPI hosting options.

Let me introduce myself a little. My name is Jaehyun; my friends call me Jae because they can't pronounce my name correctly. I'm from Boston, so I'm having a hard time getting used to this weather as well. I'm a software engineer at Ikigai. We're building a really cool data science platform on the cloud, and we're helping users who work with their data day to day to automate that process. Please come check it out; it's pretty cool.

The motivation for why I'm here: as a startup, we did not spend a lot of time on our CI/CD process or our cloud infrastructure. When we started off, we just wanted to focus on our POC, and our investors were there, so we just wanted to make something. Naturally we didn't focus on it much, and now it's a big problem, and we had to spend a lot of time and a lot of resources to build something meaningful, something that isn't breaking. I didn't specify it here, but we also use a lot of open source tools to make our platform work. For example, Superset is a visualization tool written in Python, and Ray is a computing engine, which I love. I've always wanted to give back to the open source community, and this felt like the perfect opportunity. So that's why I'm here.

Today I want to talk about where we started, how we began this journey of setting up cloud infrastructure with a private PyPI, what kind of CI/CD challenges we faced along the way, and how we resolved them. In that CI/CD process we utilized this thing called PyPI, so I'm going to introduce some different hosting options for PyPI in your infrastructure. And lastly, I want to talk about a few example cloud architectures where I've used a private PyPI. So why don't we get started.

I'll start with how we began building our platform. In a nutshell, we wanted to build a data science platform for the cloud, and I'm a big fan of all the tools listed here. We wanted to build our platform on the AWS cloud, use Python as our language, use gRPC for our microservice ecosystem, utilize Docker to containerize our applications, and deploy and orchestrate those containers on Kubernetes. And of course, we wanted to build a CI/CD pipeline that incorporated all of these things.

So I'm going to start with how naive we were. This is what we thought was going to happen. For the CI, the continuous integration portion, we thought: we write Python code and we build the service. Now, Python doesn't really build a service like other languages do, but we utilize gRPC, so we compile the protobufs that services use to communicate with each other, and we run unit tests for each service. If the tests pass, we merge to the main branch and containerize. Pretty simple, right? Deployment is also simple. Most products have two environments, two clusters: a dev or staging environment where you push and test, and production. So, exactly that: we push to our development Kubernetes cluster and run a platform test, which is also written in Python, to see whether the services play well with each other. Then we deploy to the production environment and run the platform test one more time to see that everything is perfect.
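As a rough sketch, that naive per-service CI amounted to something like the following. The proto file, test path, and image name are illustrative, not our actual layout:

```bash
# Compile the protobufs so services can talk to each other over gRPC
mkdir -p gen
python -m grpc_tools.protoc -I protos/ \
    --python_out=gen/ --grpc_python_out=gen/ protos/api_server.proto

# Run the per-service unit tests
pytest tests/unit/

# After the merge to main, containerize the service
docker build -t api-server:latest .
```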
As our platform grew and our company grew, we had a lot of feature requests, so we built multiple services on top of this. Let me quickly explain what these are, because I'm going to use this example until the end of the talk. The API server is basically what users interact with: "hey, where's my project?", "where's the list of models I can use?", things like that. The ETL service runs the data pipelines; it performs basic ETL steps like filtering and sampling and delivers the data so the ML service can utilize it. And the ML service, of course, builds machine learning models with hyperparameters and persists them somewhere, all that goodness. What we expected was that if we have multiple services, we just replicate the entire pipeline for each one and build it, and it's going to be nice and clean, all independent, great. Obviously, that was not the case.

So I'm going to introduce some obvious CI/CD challenges. The first one was a shared code base. As a startup, we didn't plan it right, so the ML model portion of the code base was embedded in every single service, and that's not good. For example, the API server wants to get the list of models it should return to users, so it calls some part of the model code. The ETL service wants to pre-process data in a way that is specific to each model, so it has to access that code. And the ML service, of course, has to build the models, so it needs access to that model code base too. It was really tangled up, and this was a problem, because every change to the library rippled everywhere. Let's say you added one new type of machine learning model to the model portion of the code base. Now you want to expose it to users ("hey, there's a new model"), you want to run some pre-processing in the ETL pipeline when that model is requested, and you want to build that model in the ML service. Meaning you need to trigger the entire pipeline again: you need to build three images just for this one change, and when your service count scales, you don't even want to think about it.

The easy, simple, natural solution is to port that code out. Code-base-wise, you move the ML model portion into a different repository, so you manage it separately instead of embedding it in each service. Now, I'm going to use the terms "package" and "library" interchangeably in this talk, so let me make sure everybody is tracking. A library, in my understanding, is nothing but a collection of packages. And a package, if you've played with Python before: if you see a directory with an __init__.py file in it and a bunch of Python files, and you can import that directory as one logical group, that's a package, right? A library, in my understanding, has more of a purpose: you bring packages together, you give them a purpose, it's reusable, and that's a library.
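To make that concrete, here is a minimal sketch of the distinction. The layout and names are made up for illustration; they're not our actual model library:

```python
# A package is a directory with an __init__.py that you import as one unit:
#
#   ml_models/               <- the model library we split out
#       __init__.py
#       regression.py        <- individual modules grouped under the package
#       preprocessing.py
#
# Any service that depends on it just imports the package (hypothetical names):
from ml_models import regression
from ml_models.preprocessing import clean

raw_data = [...]                            # stand-in for whatever the service loaded
model = regression.train(clean(raw_data))   # illustrative call, not a real API
```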
So in that sense, we separated the model library out from each code base, and diagram-wise this is perfect; it looks independent. So are we good now? Not really, because you only ported out the code base itself; the CI/CD process is still very, very tightly coupled. For services, yes: you can write code, build it, compile the protobufs, and test it, the same as before. But when you change something in the library, you still have to go through the service portion one more time, because yes, you separated out the code base, but you still have to embed the library into the service itself so it can be used. And that's not good.

Why is it bad? First, we still need to redeliver services when a library update is required; that's what I just said. Second, it's hard to keep track of the library version within each service version. What this means is: every time you rebuild the library, it would be ideal to rebuild all the services that use it at the same time. But let's say the API server has no problem; it returns the list of models just fine. Then one of the models has a bug in its pre-processing function, so you only update the ETL service. Now you have different versions of the ML model library in three different locations, and when some bug happens, you don't know whether you should just redeploy everything. It's really hard to operate, because you physically cannot keep track of which version is where. So what I want to say is: decoupling is also required at the CI/CD level.

I want to quickly introduce what a Python package repository is; if you've used Python before, you know this. PyPI is the classic Python package repository. It holds the different packages that users have uploaded, and you can download a package easily with pip install: if you just run pip install something, you're downloading from the public PyPI. You can also specify the version, which I'm pretty sure you know: if you do pip install ml-library==0.0.1, you download exactly that version.

So, with a Python package repository, let's say we have some stable hosted instance of it somewhere. What we do for the library is a little different now. We write the library code, test it with unit tests, merge to the main branch, build the library into a portable format like a wheel, and deploy the library to the Python package repository. Simple as that. Now it's versioned, and it's accessible to anybody.

The services now look a little different too, because they utilize the Python package repository throughout the entire CI/CD pipeline. Let me walk through it. You write service code and compile the protobufs. Then, when you're testing the service, to fully test what it can do, you sometimes have dependencies on the libraries, right? That's when you pull in, you download, the specific version of the ML model library that you want to test compatibility with. You download it on the spot, and then you merge the code, which is now unrelated to the ML library. You containerize the service, and when you're deploying to the staging environment or whatever cluster, your startup code downloads that specific version of the library on startup; let me explain why that's useful later. When you're running the platform test, you don't have to download it one more time, because it's already inside the server you're running. And you do the same thing when you're pushing to the production environment and testing it out.
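Roughly, the two decoupled pipelines boil down to something like this. The repository URL and package name are placeholders, not our real setup:

```bash
# Library side: build a versioned artifact and push it to the private index
python setup.py sdist bdist_wheel
twine upload --repository-url https://pypi.internal.example.com/ dist/*

# Service side: pin an exact library version, both in CI tests and at startup
pip install --index-url https://pypi.internal.example.com/simple/ \
    "ml-library==0.0.5"
```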
So why is this useful? This is one example of how we're doing it. We set the ML library version as an environment variable for each pod or deployment in our Kubernetes cluster. Going back to that specific use case where there was a bug in one model's ETL pre-processing function: the ML service doesn't have to update anything, so it keeps using ML library version 0.0.2, but the ETL service wants the fix, so now it uses 0.0.5. And this is maintained at the startup level, the launch script level, sorry. So when you fix the library and push the change, you don't have to rebuild each container; you just specify which library version you want and restart the pod. This way, the compatibility between each service and its ML library version is specified inside the code base, which makes it much easier to keep up with if you're using GitOps or something.
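A minimal sketch of what that looks like as a deployment manifest fragment; the variable name, image, and versions are just examples:

```yaml
# The launch script runs something like:
#   pip install "ml-library==${ML_LIBRARY_VERSION}"
# before starting the server, so a library fix only needs a pod restart.
containers:
  - name: etl-service
    image: etl-service:latest
    env:
      - name: ML_LIBRARY_VERSION
        value: "0.0.5"   # the ML service can stay pinned at 0.0.2 independently
```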
So is that it then? It's a pretty simple topic, but no. All of this, decomposing the processes between libraries and services, only works if we have a Python package repository that is stable and always running. Which is why what I actually want to focus on in this talk is the Python package index, PyPI in short. I'm going to introduce a few different ways you can utilize a PyPI in your cloud architecture.

Before I start, here's how I want to measure whether a hosted PyPI solution is good for your system: I want to see if it is cloud native. This definition is from the CNCF website. What I want to focus on is that, to be a cloud native technology, it should empower organizations to build and run scalable applications in modern, dynamic environments. Dynamic: your PyPI solution should be able to run in any environment. Loosely coupled systems that are resilient: it should be available all the time. And it should allow engineers to make high-impact changes frequently, so it pretty much has to be fast as well. These are the requirements I set up for the options I'm going to introduce for managing your PyPI server. First, portability: you should be able to put your hosted PyPI anywhere within your system. Security: I added it because, well, you don't want everybody to see your code base. Resiliency: it should always be available. And speed: it should be fast. Pretty straightforward.

I'm going to talk about the public PyPI first. The public PyPI is where you download from if you just pip install something, so I'll spend very little time on it. Security: nothing. Everybody can see what you upload, and everybody can download everything. So if you want to host anything specific to your company environment, anything proprietary, this is not an option. Resiliency: yes. Portability: yes. Speed: yes, it uses a CDN. If your code base can be public, if you're working on open source, sure, use the public PyPI, but that's not why I'm here. So we're going to explore different options.

First, pypiserver. As the name says, it's a self-hosted PyPI service: you host it in your own environment, and the packages are available just for you. It's open source; go check it out. This is how pypiserver generally works. It has some disk space, and when you upload a package, it's stored as a directory. Each directory represents one package, and inside it you'll find egg files or wheel files, which are built versions of your Python code, versioned inside each directory.

In Kubernetes, if you want to make the very simple version of this, you launch your pypiserver pod, or a deployment, and hook it up to some kind of storage, a persistent volume, so that when the pod dies in some extreme event, you don't lose all the packages. That's one way, in Kubernetes. If you want to run it on AWS instead, you launch an EC2 machine, install pypiserver there or run a container that has it, and mount an EBS volume so you keep your packages in the event that the pypiserver instance just dies. So, a pretty simple diagram here.

The problem is that it does have a bit of a scaling issue if your company is big and a lot of people are working with it. Every time you do pip install, it sends an HTTP request to the pypiserver pod or EC2 instance, and every single time that happens, pypiserver scans the entire disk. So if you actually have thousands of packages, it slows down a lot; you might have to wait a few minutes just to install one package, and I don't think you want that. And that's just initializing the install, not the actual installation: once it finds the package, installing is pretty fast, but the search for the package itself takes a lot of time. What you can do is enable pypiserver's caching manually. It then remembers the location of each package whenever a package is updated or requested, and returns it immediately, so that's one way you can speed it up a lot. The pypiserver cache internally uses watchdog, a Python library that monitors file changes, and that's how it maintains the mapping of where each package is. And if you want to do more, you can enable caching at the reverse proxy level; NGINX has built-in caching logic, so you can utilize that too. If you want to see how that actually works, you can go to the reference at the bottom. So you can resolve the scaling issue with this kind of manual work.

Now I'm going to do a little demo. Let me drink some water really quick. This is just a public repo I made for this specific demo, so please check it out. In it I have a pypiserver and three directories. Setup is how to set up a basic pypiserver in your own Kubernetes cluster; I set it up already, because I hate live demos. They always break, so I did it ahead of time. Upload is how you upload your own package, and download is, well, how to download a package, pretty simple. I tried to be very descriptive about how each thing works, so let me know if you cannot follow.

Okay, let me go through setup first. To deploy the pypiserver yourself, you add the Helm repo, and you need to generate an encoded username and password with htpasswd. This creates a file that has your user information, and you embed it when you're starting up the pypiserver. So pypiserver remembers your username and password, and only lets you download something if you authenticate yourself. Pretty simple.
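The setup steps I just described look roughly like this. The credentials match the demo, but treat the chart repo URL and release name as placeholders:

```bash
# Generate the credentials file (htpasswd ships with Apache's httpd tools);
# -c creates the file, -b takes the password on the command line
htpasswd -b -c htpasswd test_user test_password

# Add the chart repo and install pypiserver into its own namespace
helm repo add pypiserver https://charts.example.com/pypiserver
kubectl create namespace pypiserver
helm install my-pypi pypiserver/pypiserver --namespace pypiserver

# If you run pypiserver directly rather than through a chart, the cache I
# mentioned comes from installing the extra (it pulls in watchdog):
pip install "pypiserver[cache]"
```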
What I did is: I have a local kind cluster, I created a pypiserver namespace, and I deployed my pypiserver Helm chart. In here it only has a replica count of one; I'll explain later why I have just one even though it's a Kubernetes setup. And this is what I put in as the credential: the username is test_user and the password is test_password. So this is already running; I deployed it beforehand, and it restarted once because I updated the Docker image. What I'm going to do is port-forward to it, so that I can access the pypiserver in the pod from my localhost. That's exactly the pod running here. If I go to localhost, you see this pypiserver. It's very small, and it doesn't have too pretty a UI. If I go to /simple, that's the default index: if you don't specify any other index and don't create anything yourself, simple is where you go. I deployed this cloud-open package before the talk, and it has two versions, 0.0.1 and 0.0.2. As I said, there's disk space with a bunch of directories in it, each directory represents one package, and inside it you find the binaries for each version of that library. And I can download it by doing this.

Let me quickly show the package. I made this really-nothing package called cloud-open. Actually, let me upload it first, sorry. If I go to upload: I'm pretty sure that if you've published something to the public PyPI, you know this, but I wanted to quickly go over it. The .pypirc file keeps track of which PyPI servers your computer remembers, and you can specify different servers in it. I have one called test; if you want another one, say development, you just add another section that looks exactly like this one: where your PyPI server is located, the password, and the username. You put this file in your home directory, just like the AWS credentials file. Then in the setup.py file you specify the name of your package, you specify the requirements, just like requirements.txt, and the version, which I'm going to set with a bash environment variable. And how I build and deploy is: I invoke that setup file and upload to the repository that I specify; obviously I'm going to use test. So, in the pypiserver directory, go to upload, and run it with Python... oh no, sorry, bash. We had version 0.0.2, I think, so I'm going to upload 0.0.3 to test. Pretty fast, because I have nothing there. If I refresh, you see the new version there.

Downloading is also very simple, except that, unlike the public PyPI, where you don't have to specify anything else, you do have to specify something called the index URL. That's where your own PyPI server is located. So I'm going to copy and paste this one more time, change the placeholders to test_user and test_password, and let me download 0.0.2, although there's absolutely no difference between the versions. Oops, let me just download cloud-open for now. This is why I hate live demos... localhost... oh, there you go. So, cloud-open: this is obviously the latest version, so let me retry and download the 0.0.2 version. Thank you for waiting. So you can do version control like this, obviously. And just to make sure everything downloaded correctly: import cloudopen, call its greet function, and it does its thing. So now your computer has a package from your own PyPI server.
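Putting the whole client side of the demo in one place, it looks roughly like this. localhost:8080 stands in for wherever your port-forward actually points:

```bash
# ~/.pypirc in your home directory tells upload tools about your servers
cat > ~/.pypirc <<'EOF'
[distutils]
index-servers = test

[test]
repository = http://localhost:8080
username = test_user
password = test_password
EOF

# Build the package and upload it to the "test" server defined above
python setup.py sdist bdist_wheel
twine upload -r test dist/*

# Download a pinned version, passing the index URL (and credentials) explicitly
pip install --index-url http://test_user:test_password@localhost:8080/simple/ \
    "cloud-open==0.0.2"
```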
So that's good; moving back to the slides. The other option, the one we're actually using at our company, is PyPI Cloud. Think of this as yet another way of hosting your private PyPI server, except you're turning the caching layer and the storage layer into hosted services. Small teams like us just love this, because we don't have to put much time into it. For the storage layer, you can use Amazon S3, GCS, or Azure Blob Storage. For the caching layer, you can use DynamoDB (we're using DynamoDB), any Redis distribution, or any SQL database; I just put Postgres on the slide because I love Postgres. And because these things are hosted services now, or can be, you can literally put a PyPI Cloud instance anywhere: in Kubernetes, or if you want to use it on Heroku, yeah, sure, go ahead.

Another demo, but I'm not going to do much this time, because it's pretty much the same. Go to the pypicloud directory and then setup. In this case we actually build a Docker image, or rather I built one, and deploy that; all that information is here, and if you go to the Dockerfile, I have it. We're also going to look at how to change the server.ini file really quick. Just like you create a password for pypiserver, you create one here as well, but instead of using htpasswd, you use PyPI Cloud's own executable, ppc-gen-password. When you run it, it asks for your password twice, and nothing shows as you type, like the macOS password prompt; I was typing my password for ten minutes before I noticed. Then, in that server.ini file, you specify which user you're going to use (I still use test_user, with the encoded version of test_password) and which storage backend you want. We're using S3 for storage and DynamoDB for caching in this demo. And if you provide your PyPI Cloud instance with enough credentials, for example, I gave an AWS access key and secret key to the pod, it auto-generates all of those resources for you, so you don't have to do anything by yourself. Pretty easy.
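For reference, the server.ini for that kind of setup looks roughly like this. I'm reconstructing the option names from memory of the pypicloud docs, and the bucket and region are placeholders, so double-check everything against the project's documentation:

```ini
[app:main]
use = egg:pypicloud

; Storage layer: packages live in S3 instead of on a local disk
pypi.storage = s3
storage.bucket = my-private-packages

; Caching layer: the package index is kept in DynamoDB
pypi.db = dynamo
db.region_name = us-east-1

; Config-file auth: the hash comes out of the ppc-gen-password tool
pypi.auth = config
user.test_user = <hash generated by ppc-gen-password>
```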
Then I build the Docker image locally, create a pypicloud namespace, and deploy with the Helm chart. So I'm going to do yet another port-forward; this is where that PyPI Cloud deployment is. If I go to localhost one more time, now it's PyPI Cloud. Looks better, I think. It shows nothing because you need to log in: test_user, test_password. And it takes you to... oh my God, oh my God... the cloud-open package, with a better UI. I only uploaded one package here. How to upload and how to download are exactly the same as with pypiserver; and pypiserver's way of uploading is exactly the same as the public PyPI's as well, so I'm not going to go through that one more time. So that's PyPI Cloud.

Now I want to quickly compare pypiserver and PyPI Cloud. Pypiserver portability: yes. You can deploy it inside Kubernetes as a pod, you can deploy it on EC2, so you can pretty much do whatever you want. Security: yes. If you put it in your own VPC, you have your security, and there's a little plus: you have a username and password. It doesn't do much, but it's still there. Resiliency: you need to do a little bit of work for this; I'm going to show you with a diagram how we achieved it in the next slide. What happens is, in Kubernetes (I'll just keep using Kubernetes as my example), if you want multiple pods for availability and resiliency, with pypiserver you need to mount them all to the same disk. And to mount multiple pods onto one disk in Kubernetes, your PVC, your persistent volume claim, has to have the ReadWriteMany access mode, and you pretty much have to have an NFS sitting there for you if you want to mount from many different pods; I'll show a small sketch of that claim at the end of this section. For us it was a little more work than we expected, but with that work, resiliency is there. Speed: yes, if you set up that caching yourself, you get speed as well. For PyPI Cloud, portability and security are exactly the same as pypiserver, so yes, they're there. The beautiful thing about PyPI Cloud is that resiliency and speed just follow with it. For storage, since you're using S3 or GCS, a hosted solution, you don't have to worry about setting up an NFS server or anything like that. And the caching layer is default here, so you automatically get speed without doing anything yourself.

Next up, I'm going to introduce three different cloud architectures where you can host your own PyPI. I've actually used all of these, between my current company and my previous one. Let's start with what I used in the past. We had multiple instances of pypiserver on EC2. And we did have multiple Kubernetes clusters; I believe most other companies do as well, for blue/green or staging/production, so I'll just treat that as the default from now on. So we had multiple pypiserver instances that talk to each cluster, and for resiliency, a given instance doesn't have to connect strictly to a given cluster. Each pypiserver EC2 instance was mounted to an EBS volume where we actually stored all the packages. Not exactly a problem, but: EBS does allow you to attach multiple EC2 machines, except I think the limit is something like 16 machines at this time, and they pretty much have to be in the same availability zone for that to work. For us it was a little too much work, and we could already see the limitation when we started, so I didn't want to use it the next time I set things up.

So the next time, at my current company, we started with this: pypiserver in pods, with each pod mounted to EFS, which is a kind of hosted NFS service that Amazon provides. The PVC, of course, has to be ReadWriteMany. We noticed some slowdown issue with EFS for some reason, back around 2016, so we stopped using it. So we installed Ceph, and it was beautiful, but it was a little too much work just for this. So even though it was working totally fine, we decided to explore other options, something simpler to manage. And that's how we ended up with PyPI Cloud. As you can see, the diagram is a lot simpler now. You deploy your PyPI Cloud pods, as a Deployment, StatefulSet, ReplicaSet, DaemonSet, whatever, into each Kubernetes cluster, and they all connect to DynamoDB. I don't have to worry about scalability; it just works. And with this setup, our infra team hasn't had to revisit it; there's been no "hey, this is not working" for our PyPI Cloud deployment for the last half a year. We really haven't worried about it, and I think that's a really big win for a startup like us. So yeah, just putting that idea out there.
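Here's the small sketch I promised of that shared-volume piece: a ReadWriteMany claim that several pypiserver pods can mount at once. The storage class name is a placeholder; with EFS you would typically get one from the aws-efs-csi-driver:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pypi-packages
  namespace: pypiserver
spec:
  # ReadWriteMany lets multiple pypiserver pods mount the same package volume
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc   # placeholder; depends on your NFS/EFS provisioner
  resources:
    requests:
      storage: 10Gi
```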
So, in conclusion: by using PyPI Cloud, we can organize the code base into libraries and services without sacrificing efficiency. You're not just separating your code base out; your CI/CD pipeline can be individualized for each service and library. Host a scalable, secure Python package repository for your cloud-native environment: I suggested two options, pypiserver and PyPI Cloud, so if your company runs on specific infrastructure, try to find what suits you best. And lastly, engineers can then focus on building interesting stuff rather than wasting time on setting up something that is always breaking. Really. But yeah, thank you, that's everything I wanted to say. Thank you for coming. If you have any other questions, I'll stay around for the next few minutes, so feel free to talk to me. Thank you.