Alright, since we have a lot of talks today, I will try to keep things a bit shorter. So, today: a serverless-first approach, or in short, Will it Lambda? My name is Nik van Engelsman. I'm a Cloud Solutions Architect at BandLab Technologies, and today I will mainly talk about our flagship product, which is BandLab. On the agenda: why BandLab went serverless, Will it Lambda, our first real-world migration, some lessons learned, our serverless live video streaming setup, and some Q&A.

But before we start: how many of you have tried API Gateway and Lambda, for instance? And who is using it in production? That's a lot less, but it's still cool to see that everyone's interested.

I will keep this part short. In the past, I worked at a Dutch AWS partner company where we did managed hosting solutions for clients. My role was basically to educate clients and migrate them to the cloud. This was way before serverless, from 2011 until somewhere in 2016. It usually involved looking at their current applications and identifying which parts needed to change in order to become highly available and fault tolerant. In turn, clients only wanted to worry about pushing code, and that's quite similar to what we're trying to achieve with serverless. But as you can imagine, it was a big deal, because we as a company needed to keep the infrastructure up to date: scalable, secured, patched, autoscaled. Back then there wasn't even Docker, so what we usually got were artifacts which we had to unzip, install dependencies for, and then roll out to the infrastructure and the autoscaling group. If you think about it, that's a lot of work, right?

If we fast-forward to 2015/2016, that's when API Gateway and Lambda were introduced as a combo, and things really started to change. Everybody could see that this brings a lot of value and can change how we develop applications.
That's also why I tried to invest more time in these solutions. I asked my company if I could do some R&D, and came up with a few example applications which really showed that it's totally possible to make a serverless backend which is highly available and supports a lot of traffic right away. And you can basically skip a few steps, like setting up VPCs and security groups, you name it. I also started to invest my time in the Serverless Framework, so I did some contributions and tried to stay close to the community there and the people contributing to the core.

Then, in 2016, I was given the opportunity to join BandLab. The BandLab platform, as it says, is a music creation platform, so we're mostly dealing with audio objects and audio in general. When I joined, there was really this startup culture: things needed to move fast, the platform was gaining traffic, and the main goal was bringing value to users each day. It wasn't really about making it high-performance or fault tolerant. So, surprise surprise, autoscaling wasn't even set up when I joined. That's kind of a big deal if you're working on a platform and you're expecting huge growth, so something definitely needed to change.

Looking back at the earlier slide, these are some of the things we can now forget about because of serverless. High availability is taken care of. We don't need to think about keeping the infrastructure up to date; it should be patched and autoscaled by AWS. And the teams wanted to focus on the application, which serverless, in my opinion, gives you more time for: working on the application and the features you want to bring to users, instead of thinking about autoscaling, setting up infrastructure, et cetera.

So, next topic: Will it Lambda?
Each time we wanted to migrate something that already existed, we needed some sort of flow to identify whether it was suitable for serverless, yes or no, or whether we should think about something else. And we came up with this. It's very simple, right? The idea is you ask the question: Will it Lambda? Of course there's more behind that than just saying yes. And if the answer is no, we must try harder, because if you can get it working, it gives you so much more value, and in my opinion far less operations work is required to keep it out there and running. That's why we say: try harder.

I guess most of you are already a bit familiar with the limits of Lambda in general. There's the concurrency: by default, if you open a new account, you get 1,000 concurrent Lambdas, which is already quite a lot for most applications. There's the memory limit, and the timeouts, of course. I won't name everything, because it's all documented. But there's more to it than just these default limits. There's the choice of runtimes; today that's already a lot better because of the new announcements from re:Invent that let you bring your own runtime, so that's awesome. Cold starts can still be an issue for certain applications, so you have to think about that. Then, where do you want to keep your state? Does your Lambda function even require state? We usually go with S3, DynamoDB, or RDS. To be honest, right now we don't do much with VPCs and Lambda, because for us the latency is still too high, and it brings extra complexity with the network interfaces, where there are still limits. You can end up in a situation where your network interfaces aren't cleaned up while traffic is still growing, and you don't want that.
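To make the flow concrete, here is a minimal sketch of that checklist as code. This is purely illustrative, not our actual tooling, and the threshold numbers are the published Lambda limits as I remember them, so treat them as assumptions:

```python
# Illustrative "Will it Lambda?" checklist. The thresholds are assumed
# Lambda limits (they change over time); the function and its signature
# are hypothetical, just encoding the decision flow from the slide.

def will_it_lambda(est_duration_s, memory_mb, needs_vpc, binaries_compiled):
    """Return ('yes' | 'try harder', reasons) for a candidate workload."""
    reasons = []
    if est_duration_s > 900:  # 15-minute hard timeout (assumed current limit)
        reasons.append("runs longer than the Lambda timeout")
    if memory_mb > 3008:  # max memory setting at the time (assumption)
        reasons.append("needs more memory than Lambda offers")
    if needs_vpc:
        reasons.append("VPC adds ENI latency and cleanup issues")
    if not binaries_compiled:
        reasons.append("native dependencies not yet compiled for the Lambda runtime")
    return ("yes" if not reasons else "try harder", reasons)
```

For example, a short audio-processing job with compiled binaries and no VPC passes straight through, while a 20-minute batch job comes back as "try harder", with the reasons telling you what to redesign.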
And then there's the other thing: if you're in a private subnet and you still want to make requests to the internet, you're paying quite an amount of money for a NAT gateway, which is not very serverless, right? Then there are third-party binaries. If you're asking "Will it Lambda?", you should also consider: am I dependent on native packages, and do I need to compile those? Again, there's something new here, which is Lambda Layers; somebody is talking about those later, so that's nice. Layers definitely solve a few things, because now the community can publish shared layers which others can depend on, and that saves you time; you don't have to compile FFmpeg yourself, for instance.

Next: our first real-world migration. What you see in this image is what we call our Mix Editor. It works on mobile and web, and the colored bars you see there represent audio files. When I joined, we already had a service for this. It was hosted in Azure and used Azure Storage as well. The idea was that we wanted to start with the audio-specific services and move them to AWS, because in Azure they were all hosted inside this monolith. We wanted to keep the monolith for the social part, so it would only concern itself with handling lots of HTTP requests, rather than also doing audio processing on the same auto-scaling groups.

So again, we asked: Will it Lambda? And there were things we needed to figure out. We needed to figure out whether we could have FFmpeg running inside Lambda. This was 2016, before Lambda Layers even existed, and nobody was really doing that much with binaries. It definitely took us some time to get it compiled and working within Lambda, especially with permissions and identifying where the binary lives within the runtime environment. There was this project that helped; it's still on GitHub, and it's actually really good.
It's called LambCI, and they have this docker-lambda image. What they did was reverse engineer what the Lambda environment looks like and reproduce it as closely as possible. So we could use this Docker image to compile FFmpeg inside, then copy the binary into our Lambda function, and it worked just fine.

We also had to deal with a live endpoint during the migration. There was already an existing API endpoint, and we didn't want to introduce a new path, because we couldn't update old mobile clients; they would break if we migrated without taking care of backwards compatibility. The way we did that: we already had the domain in Route 53, but we used CloudFront with its path patterns. By adding a new path pattern during the deployment, we could take over that specific path and redirect it to a different application, in our case an API Gateway endpoint. We also needed to worry about state, and this was rather simple: we used S3 for the storage and DynamoDB for the fast lookups.

This is roughly what it looked like back then; I tried to recreate it as well as possible from memory. The blue line is the default origin, going toward the Azure cloud. The next line goes to API Gateway, using the CloudFront path pattern I mentioned to switch traffic over to an API Gateway endpoint. Then we had this proxy Lambda function, which knew whether it was a GET, POST, or PUT request. If it was a GET request, it would first check in S3 whether the audio object was there; if it wasn't, it would fall back to an HTTP call directly to Azure, return the original Azure response to the client, and in the background send an SNS message to trigger a downloader function, which would then download the audio that was still on Azure.
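As a rough sketch, the GET path of that proxy function looked something like this. The bucket name, topic ARN, and injected clients are hypothetical stand-ins (the real function used boto3 clients, where a missing key surfaces as a NoSuchKey client error rather than the KeyError used here):

```python
import json

# Hypothetical topic the downloader function subscribes to.
MIGRATE_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:migrate-sample"


def handle_get(sample_id, s3, sns, fetch_from_azure):
    """Serve an audio object from S3, falling back to Azure and queueing migration.

    `s3` and `sns` are injected client objects; `fetch_from_azure` stands in
    for the direct HTTP call to the old Azure endpoint.
    """
    try:
        obj = s3.get_object(Bucket="audio-objects", Key=sample_id)  # bucket name is illustrative
        return {"status": 200, "body": obj["Body"], "source": "s3"}
    except KeyError:  # stand-in for botocore's NoSuchKey error
        # Serve the original Azure response to the client unchanged...
        body = fetch_from_azure(sample_id)
        # ...and, in the background, ask the downloader to migrate this sample.
        sns.publish(TopicArn=MIGRATE_TOPIC_ARN,
                    Message=json.dumps({"sample_id": sample_id}))
        return {"status": 200, "body": body, "source": "azure"}
```

The nice property is that a cache miss both serves the request and triggers the migration, so simply replaying old GET requests drains Azure over time.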
The downloader would forward the audio to a bucket, and the bucket would trigger what we call the DSP function, for digital signal processing. That function generates different renditions of the audio file to support multiple mobile clients and puts them in another bucket where we store the formatted versions. The next time the same request comes in, it's in S3, and the sample, the audio object, has effectively been migrated from Azure. Once we had that running, we had a live migration in process: we could replay all the old requests for samples which hadn't been migrated yet, and since the concurrency limit was pretty high, we could just run thousands of GET requests to kick off the whole migration flow.

Some lessons learned from this. Our dream was to only write code and ship it, but there's more to it, because you still need ops. You still need to know if your function failed. We used CloudWatch alarms in the beginning, then slowly moved to a different route, which ended up being aggregated events going to Slack, where one of our engineers could accept an event and look into the issue. So there's still some work there; it's not only about writing code.

Then, the two states of every programmer: one day you think "I'm a god", the next day "I have no idea what I'm doing". You end up in that second state because sometimes the documentation isn't up to date, or you're testing a totally new AWS service which isn't generally available yet, or it's just new and the documentation differs from reality.

And cold starts: you need to be prepared to defend that cold starts aren't a showstopper, and that you're willing to accept them, because a serverless setup in the end requires less operations. Once it's warm, you can throw whatever you want at it and it will just scale, which is very hard to achieve in the old setup, if you had to do it yourself.
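One common way to soften cold starts, and the pattern behind tools like serverless-plugin-warmup, is to schedule a ping that keeps containers warm and short-circuit that ping in the handler. A minimal sketch, with a hypothetical warm-up event shape:

```python
import time

# Module scope runs once per container, during the cold start: create
# clients, load config, and do other expensive setup here, not in the
# handler, so warm invocations skip it entirely.
BOOTED_AT = time.time()


def handler(event, context=None):
    # Short-circuit scheduled warm-up pings (the {"warmup": true} shape is
    # an assumption; e.g. a CloudWatch Events rule could send it) before
    # doing any real work.
    if event.get("warmup"):
        return {"warmed": True, "container_age_s": time.time() - BOOTED_AT}
    return {"status": 200, "body": "real work happens here"}
```

The warm-up invocations cost almost nothing because they return immediately, yet they keep initialized containers around for real traffic.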
A few migrations later, our Will it Lambda diagram looked a bit like this. Who sees the difference here? Yes: ECS. For some migrations, and for some new services in general, we figured it wasn't worth the effort to get them into Lambda. I mean, you always want to, but in some cases it would be way more complex, or just need way more development time. In those cases we use ECS, staying sort of serverless with Fargate, but there's still more management, because you need to build the Docker images, and it's a bit more overhead.

So, last topic: serverless live streaming. At the end of 2017, I think, BandLab acquired a platform called Chew.tv. It offers online video streaming, kind of like Twitch, but specifically aimed at DJs, and since we are a music collaboration platform, it was a good fit for us. But there were things we needed to find out. The infrastructure we bought was scalable, but a lot of the pieces were single EC2 instances with no failover. Our general conclusion was: video streaming requires servers, and it's too difficult to make it serverless. Until we got a message from AWS saying that AWS Elemental would become available to all customers. So we did some experiments. If you're not aware of AWS Elemental: its products support basically this setup, which is exactly what we needed. You have live video coming in; MediaLive is the ingestion service which takes the incoming video; it then sends, as a failover, two streams A and B to Elemental...
MediaPackage, yes, that's what we use. MediaPackage distributes the video to clients: it does on-the-fly conversion of the MediaLive output into different video formats. This was exactly what we needed, because it was almost what we had running on Chew. Of course, we needed to do some research into whether we could actually migrate. There were some limits: we had set this whole thing up through the UI, it took a lot of configuration to get up and running, and we couldn't autoscale it. It would mean manually creating each of these streams for every user that wanted to stream on the platform.

There were more things, actually. The state we kept in DynamoDB. We needed content moderation, because on the existing platform people were streaming soccer matches and gaming, which led to a lot of traffic, and that's not the traffic we wanted; it needed to be identified as soon as the video came in. And CloudFront: you can set up CloudFront with MediaPackage, it comes with the product by default, but to have an endpoint up and running fast, we couldn't use the feature as it came. So what we did was use Lambda@Edge to keep a single CloudFront distribution, and then, based on the request paths, dynamically switch to the right MediaPackage endpoints.

And autoscaling we replicated ourselves. Think of autoscaling for EC2 or ECS: it works around CloudWatch metrics, with a min and a max, deciding based on percentages. We tried to replicate that. For each input stream that was created, we kept track of it with CloudWatch metrics; we tracked percentages, and based on CloudWatch alarms we would trigger Step Functions, which made sure there were always a few video ingestion setups up and running.

So, a quick refresher on Step Functions, if you haven't tried it:
It's an easy way to coordinate multiple Lambda functions. It's just a state machine, so you can have different routes, or have things triggered in parallel. But the most important part is that, from start to finish, each step passes its state to the next function, so you always have the full state of the whole execution, which is very useful if you're dealing with state. And if you want things to run based on timings, or you want to add delays, that's also very nice: it supports retry logic and delay logic within Step Functions itself, so you can remove all of that logic from your Lambda functions.

I'll do a quick demo, if there's still some time, to show you the video streaming working in action. If there isn't: we're still working on actually delivering the migration for Chew.tv. The feature is already live on BandLab itself, but you have to be a beta user, so if you're on the platform and you want to try it, just send me a message and I'll see what I can do for you. That's it. Alright, questions.

Audience: For getting FFmpeg running in Lambda, you just need a static build of it, correct?

Yeah, but I still think it's a good exercise to at least try it yourself, so you get used to the limits. And with Layers it's definitely getting way easier, which is awesome. Other questions?

Audience: When you use FFmpeg, how do you prevent your Lambda function from hitting its limits?

In our case, the part you saw, for which we did the migration, is mostly about smaller audio objects, so it always stays within a fixed limit. Our Mix Editor goes to something like 10 minutes at most: each track can be 10 minutes long, with maybe a lot of small samples on a specific track, so it still always fits within our limits. But say we wanted to move to podcasts, as an example. I tried it once; you can do input and output buffers. From what I tried, with a 2-gigabyte video, it would use
input buffers from S3 directly and output buffers directly, with Lambda just being used as a proxy, so you don't hit the local storage limit, for instance. I mean, it was experimental, but it worked. Other questions?

Audience: When you deploy a Lambda function, in whichever language, there's always an initialization part outside the handler, and if you do warm-ups you can do all kinds of initialization that you don't actually pay for, which is a nice thing. But I was looking for the opposite: some sort of hook or callback for after the function has returned its response, so you can run some logic once it's done. I couldn't find anything like that.

So what you're saying is: if Lambda had a feature where, the moment you trigger the callback to whatever was calling you, you would still have some time to do some after-logic? Yeah, that would be nice, but for now the only thing you can do is run that logic right before the callback.

Audience: So your default is Lambda, and ECS is the fallback?

Yes, our goal is always to make it serverless first, and if it really doesn't work, or it takes too much time, then we default to ECS. One of those cases is what you mentioned: what if the function can't finish running FFmpeg within, say, 5 or 15 minutes? That's always something we look into. Of course, you can do things like processing in chunks, or in parallel, with one step that joins everything together at the end; that's something you could do with Step Functions. But it sort of defeats the purpose, where you just want to build your application and ship a feature. If you can already see that it makes the effort times 3 or
whatever, that's the point where we say it's probably not worth the effort. But to be fair, for us that hardly ever happens. The only issue we had was, for instance, local storage size. We have one process; so far we've only talked about the individual audio objects, but our Mix Editor supports up to 10 tracks of 10 minutes each, and to mix all of these layers into one final output, it does the mixing for each track, then mixes everything down onto one track, and maybe applies a final effect like some reverb. At one point we had this whole process working in a Lambda. It took us a lot of time and effort to handle all the state, it became very complex, and we knew we would hit the issue again as soon as we wanted to go from 5 minutes to 20 minutes. That's when we decided to just move this one to ECS.

We're also running on the Azure cloud, with Azure Functions as well. Most of the new services there are also being built in a serverless manner, but personally I don't have that much experience with the Azure cloud.

Thank you.