Okay, good afternoon everyone. Can you guys hear me okay? Yes? Okay. So the conference is coming to an end; I hope you liked it. Who in the room is interested in metering, metering what applications are doing, what services they use? Good. Okay, that's good, because this is what the talk is going to be about. I'm going to give you a brief introduction to the Abacus project. The Abacus project is part of the Cloud Foundry Incubator. It's just half an hour, so we're not going to go into a lot of detail, but I hope to give you enough pointers that you can go and explore the project later, after the presentation, and maybe contribute to it. I'll briefly go through what the project is about and what it's not, examples of how you can use it, and then the architecture: I'll give you a brief overview of the architecture and the APIs you can use to submit usage to us and retrieve usage reports. We'll talk a little bit about the team, the development process we're following in the project, and the status; we just released a new version of Abacus today, so I was kind of busy in the last few hours. Then we'll talk about how you can contribute to the project, and at the end of the presentation there's a bunch of pointers for you to do your homework after the conference.

So what is Abacus, and what are its main design points? It's basically a pipeline of services, we call them microservices these days, that will process your usage data. So what is usage data? Well, if you're using Cloud Foundry, you're running applications. These applications are running in multiple instances; they use memory, right? So maybe at some point you want to know how much memory your applications are using.
So that's part of the data that we collect. We turn that data into aggregated usage, and then we multiply the quantities that we collect by price. That gives you a cost, and that gives you a way to charge your customers, your teams, your departments, depending on your use case. In order to support this, we need all the metering and aggregation functions to be customizable. You don't meter a database like an object storage service, or an application, or an SMS service, for example. So you have to be able to define your own metrics, and how you aggregate those metrics into something that makes sense for the users.

The usage is typically submitted by the service providers. If I provide a database service, I know exactly what applications are doing with my database: how many API calls, how many databases they use, how much space they use in those databases. I have all that information; I need to submit it to Abacus so that Abacus can actually make sense of the usage. You can do that anytime. In addition to the Abacus project, I also work on Bluemix at IBM, and we see service providers submit data every minute, every 15 minutes, or every hour. We have to accommodate all these different requirements, so we've made Abacus flexible enough to not impose too many constraints on the service providers.

Service providers submit usage to us, we process that usage through our pipeline, and at the end of the pipeline you see rated usage. That's basically the cost for that usage, aggregated along different dimensions. For example, you may want to see usage at the organization level, or the application level, or the space level; maybe you're interested in usage for a specific application, or just a specific type of service: just the database, or just the caching, or just that other notification service that you have, right?
So we provide a way for you to get all these reports along all these different dimensions in a flexible way. I should note here that these usage reports are typically the input to billing. We're not doing billing in Abacus; we're not doing payments; we're not issuing an invoice or anything like that. We're providing you with the data that you can use to bill your customers. We've looked at what's going on out there on the market, some alternatives for this, and we actually couldn't find any. That was kind of the reason for building our own, initially in Bluemix: we couldn't find a comprehensive open source solution well integrated with CF, so we built Abacus for that.

Like I alluded to on the previous slide, we're not billing or charging customers. If you want to do that, you need an external billing service; that's what we have at IBM, and I believe other people who use Abacus are doing that too. They have a billing system, and they're feeding that billing system with all the usage data computed by Abacus. The other thing we tried not to do was to make all service brokers look the same. Like I said before, metering a database and metering an SMS sending service are very different: the measures, the metrics, the formulas that you'll want to use to compute the usage are going to be different. You can't really make everybody fit into one common pattern. That's the reason why we needed Abacus to be very, very flexible.

Where is Abacus used today? Well, the code originally comes from Bluemix. We initially developed the metering and rating system for Bluemix internally at IBM, and then we took a cut of that code base and thought about open sourcing it. There were multiple reasons for that.
We wanted to contribute that back to the Cloud Foundry Foundation, because there was nothing doing that at the time. Also, we wanted to make sure that the APIs that service providers can use to submit usage, and that people interested in seeing the usage can use to get reports, would be open, to help the ecosystem of service providers all get on board a common metering platform. Right now we're using Abacus in all the new instances of Bluemix, all the new versions of Bluemix that we have, in particular Bluemix Dedicated. We have an offer for customers who want to run in their own slice of the software environment, isolated from the others, and we're using Abacus there; if you use Bluemix Dedicated, you will recognize some of the APIs that we implement in the open source project. Bluemix Local too: if you want to install Bluemix in your own data center, Abacus is there in the middle of that installation. And with respect to Bluemix Public, at the moment we are actually running both engines, the old one that we had last year and Abacus, at the same time. We're feeding the usage into both and comparing the numbers, and we're going to continue doing that for a little while before we switch everything completely over to Abacus. It's like trying to change the carpet in a big ballroom where people are dancing: it's kind of difficult to change one of the engines in the middle of your cloud when you have so many customers already on board.

SAP has been very active and contributed a lot to the project right from the beginning. They have implemented, for example, all the app usage metering, all the runtime metering, so that was a really good contribution. I think the previous presentation was about Concourse; recently SAP also contributed a whole Concourse pipeline for Abacus, which is really good too.
And I'm sure that if you go to the booths, I don't know if the booths are still open now since the conference is coming to an end, but if you go to the booths and ask them about SAP HANA, they will tell you that they are using Abacus too. If you go to the GitHub project, you will see that a number of people have starred the project, asked questions, opened issues, and made contributions, pull requests to the project. So we can see that other companies are actually kicking the tires with Abacus, or using it almost like in production.

Now a quick overview of the architecture. Like I said before, it's a data processing pipeline, made of a bunch of microservices tied together using HTTP-based APIs, REST APIs. Typically, service providers or runtime providers post usage events to us; that's the left of the chart. The first thing we do is try to make sense of that usage data: is it coming from a service provider that we know about? Is it a kind of resource or service that we know about? What additional information or configuration can we collect about that usage? What metering, rating, and pricing plans should we use? We collect all that information at the entry point into the pipeline, called the usage event collector.

Once we have all that information and can make sense of the usage, we apply the metering formulas that have been defined by the people who want to use Abacus; the service providers or the runtime providers have to give us the metering formulas that they want us to apply to the usage. That's the meter service. The meter service is going to turn a bunch of measures into metrics. For example, for an application we have two obvious measures, the number of instances and the memory allocated to each instance, and we turn them into a metric: the total number of gigabytes for your application.
That's the metric you might want to use to charge your customers on your particular platform. For a database you could say: well, I have storage, I have the space, I have the number of API calls, maybe the number of databases. These are going to be the incoming measures, and you turn them into metrics using whatever formulas you want, by configuring Abacus. The output of that step is what we call metered usage: you look at this usage and it's a bunch of metrics; you're not interested in the raw measures anymore.

Then we get into a very interesting part of the pipeline, which is the step where we need to accumulate usage over time. Typically, at the end of the month you want to know how much you've consumed for the month. But maybe you also want to know how much you've consumed in the last hour, or the last minute, or the last week, or the last day, or maybe the last 30 minutes, and the last 30 minutes don't necessarily line up with the hour, right? So that's a pretty interesting part of the pipeline. There are all kinds of issues there that you have to resolve if you want to make this work. Service providers will send you usage out of sequence. They will send you usage a bit too late. Or they will send you incorrect usage and then want to send a correction. So how do you make sense of that as time passes? How do you accumulate usage into all the time windows I just talked about? That's an interesting part of the project if you're interested in contributing to it; there's quite a bit of computer science here. It's not very different from what companies like Twitter or Facebook or Google have to do with all the data that goes through their social networks: if you want to place the right ads, you need to know how many clicks went to this set of pages in the last ten minutes, for example.
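To make that accumulation step concrete, here is a minimal sketch of folding a metered quantity into calendar time windows. Everything below is illustrative (my own function and field names, not the actual Abacus code), and a real accumulator also has to handle deduplication and corrections:

```javascript
// Sketch: accumulate a metered quantity into month/day/hour windows.
// All names here are illustrative, not the actual Abacus implementation.

// Compute the calendar window keys a usage timestamp falls into (UTC)
const windowKeys = (t) => {
  const d = new Date(t);
  const month = `${d.getUTCFullYear()}-${d.getUTCMonth() + 1}`;
  const day = `${month}-${d.getUTCDate()}`;
  return {
    month: month,
    day: day,
    hour: `${day}T${d.getUTCHours()}`
  };
};

// Add a quantity into the running totals for every window it belongs to
const accumulate = (totals, t, quantity) => {
  for (const [win, key] of Object.entries(windowKeys(t))) {
    totals[win] = totals[win] || {};
    totals[win][key] = (totals[win][key] || 0) + quantity;
  }
  return totals;
};
```

Because the window keys come from the usage timestamp rather than the arrival time, late or out-of-sequence documents still land in the right windows; what this sketch does not handle is replacing a previously submitted quantity with a correction, which is part of what makes the real pipeline step hard.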
So that means you need to aggregate data into a time window, potentially a rolling window, and some of the data may come out of sequence or delayed. These are all the same kinds of issues that you see all over the Bay Area, here in the valley; many people are working on these tough problems, and these problems are pretty tough when you have a lot of data. On Bluemix, for example, we have, I think, more than 200 service providers, and they send us a lot of data. So a lot accumulates in these little databases that I have on the slide here. And the other thing is, you can afford to miss page clicks, but usage data you don't want to miss, because it's about money: you need to charge for that data at some point, so it had better work well.

Once we've gone through that tough step, we're ready to aggregate the usage along the different dimensions I mentioned before: space, org, application, collection of orgs, because typically a team is going to use different orgs; resource types, because a service provider is interested only in the usage for the service he provides, not necessarily the others.
So there are all kinds of different dimensions here that we use to aggregate the data, and we need to do this in a space-efficient way. I'm not going to talk about the volume of data that my team has to handle on Bluemix, but it is a lot of data, and we had better store it efficiently so we don't blow up all the databases. Once we've aggregated all that data, we provide a set of flexible reporting APIs that you can use to get data out of the databases: reports along all these different dimensions.

Let's dive into a bit more detail on the architecture. This slide really shows what you need to do if you're going to integrate Abacus into your platform, and it shows what's not in Abacus, what you have to do around it. First of all, you need to work with your service providers to have them implement the Abacus APIs to post usage to us. For app usage, we do all that work for you, thanks to all the SAP contributions, because this is really internal to the Cloud Foundry platform, and we are part of the Cloud Foundry Foundation, so we thought we really had to do this: we collect all the app usage from the CF app usage event databases and turn it into something that Abacus understands. You might also want to meter other things outside of Cloud Foundry: containers, VMs, other types of resources like bandwidth, for example. At IBM we have this with SoftLayer, because we also offer infrastructure in addition to the platform. So Abacus is not really tied to metering just things from Cloud Foundry; you can feed it with whatever you want. Yesterday I was actually talking with somebody who does some IoT work, and he said: oh, I get a lot of data from all my sensors, I could actually feed that into Abacus. So that's another use case, if you're interested in doing that.
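The app usage bridge mentioned above essentially translates CF app usage events into usage documents the collector understands. A rough sketch of that translation, with illustrative field names (the real schemas are in the Abacus project docs):

```javascript
// Sketch: turn a CF app usage event into an Abacus-style usage document.
// Field names here are approximations, not the authoritative schema.
const toUsageDoc = (event) => ({
  start: event.time,                      // start of the usage window (ms)
  end: event.time,                        // end of the usage window (ms)
  organization_id: event.org_guid,
  space_id: event.space_guid,
  consumer_id: event.app_guid,
  resource_id: 'linux-container',         // the runtime resource
  plan_id: 'basic',
  resource_instance_id: event.app_guid,
  measured_usage: [
    // the two obvious app measures from the talk:
    { measure: 'instances', quantity: event.instances },
    // total memory across all instances, in bytes
    { measure: 'memory', quantity: event.instances * event.memory_per_instance }
  ]
});
```

A bridge like this would poll the CF app usage events API and POST one such document per event to the Abacus collector.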
So that's another use case You know you're interested in doing that Hmm So the other thing you need to do is is provide all the configuration that's actually making abacus flexible Provide the metering plans the rating plans the pricing plans that Abacus is going to use to do something with Significant with you with your usage, right? So you do that by implementing a set of REST APIs that will call so you're not calling us with the data We call you to ask you for example. Hey, I got that usage document It's for resource ID one two three four service ID one two three four What is the metering plan for this right and you just give us back the metering plan and that that's what we're gonna Use right and that metering plan is going to say here are the metrics that I want out of the metering step And here are the math formulas that you need to that abacus needs to apply to that usage to produce these metrics On the other side of the pipeline Well, you want to be able to consume all that metered and rated usage So you'll typically plug in a billing system analytics things like this, right? So at IBM for example, we plug in billing and we also have a lot of analytics I work on these analytics to that consume all the the rated usage and Try to make sense of of it The technology It's not written in go. I Get that question a lot It's written in Node.js We've done a lot of work on node in the last in the last few years in my team So it kind of came, you know, naturally to Node.js and node also gave us a very flexible environment You know, this is really about business logic, right? You want to be able to tweak how Usages metered and rated without having to not go into too much system programming, right? And JavaScript is kind of good at that in a way, right? 
Node is also pretty good at handling high volumes and a lot of connections. When service providers send us data, sometimes they send one document at a time; sometimes they send them in batches of, like, 500; sometimes they open way too many connections. We have to handle all of that, and Node is pretty good at it. It's lightweight, and I like lightweight stuff, so Node was a good choice for me.

The development version of Abacus is really self-contained and doesn't even require a heavy database or anything like that. Typically in production we use CouchDB, and thanks to the guys from SAP who ported Abacus to MongoDB, you can also use MongoDB; I think they use MongoDB. But to play around with the project, or even to run most of the unit tests and the functional and integration tests, we don't use CouchDB; we use PouchDB. PouchDB is a small in-memory version of CouchDB, and what's really nice about it is that it doesn't require any configuration. You can play with it for a while, mess up all the usage data in your database, stop the server, and it's all fresh again; it's empty. We use JSON for all the data representation. When you want to deploy this to production, you deploy to CF: we run these Node applications as CF apps, so we push them all to CF, and again, you can connect them to either CouchDB or MongoDB for your production environment.

We don't have a lot of time, so I'm not going to do a full demo; I just want to give you a glimpse of how you can play with it, and I'm hoping that some of you will do this later. So I'm going to switch to my command prompt here; I'm kind of a command line kind of guy. Typically, if you want to get started with the project, that's what I want you to start with: you clone the project, git clone from GitHub.
Then you follow the steps in our README, which are really simple, to get started. You get into the directory where you've cloned the project and you do npm run build; on my laptop that takes about a minute and ten seconds, so that's kind of lightweight. Then, if you want to run the tests, you do npm test; that takes about a minute too. But what I want to show you here is how you can actually play with it locally, without even deploying it to CF. These are just Node applications; they run on my laptop as processes listening on port numbers on my machine. So I just do npm start; this is all in the README on the home page of the project. That's it, it's started, and if I do a ps you will recognize what I talked about in the architecture: you can see here the collector application, the meter application, the accumulator application, the aggregator. We have a bunch of other things around these applications to stub out, to simulate a real CF environment. For example, we have an auth server, a little Node application that simulates UAA, because I don't have CF running here. We have this Pouch server, the little database I was talking about, which is going to contain all the metering data. And the reporting service here. So it's pretty lightweight, easy to start, easy to run.

And then if you want to run the demo, you just do npm run demo. That demo is pretty simple; it doesn't do a lot, but it exercises the whole pipeline. What it does is submit three, I believe it's three, usage documents that represent usage for a fictitious service. Think of a fictitious object storage service which supports the concepts of lightweight API calls and heavyweight API calls, and also understands the concept of space, you know, the size of your objects.
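One of the usage documents the demo submits might look roughly like this. The values, IDs, and field names below are illustrative, not copied from the actual demo source:

```javascript
// Sketch: a usage document for the fictitious object storage service,
// reporting 1 GB of storage, 1000 light and 100 heavy API calls.
// Illustrative field names and IDs; see the project for the real documents.
const usageDoc = {
  start: 1420243200000,               // start of the usage window (ms since epoch)
  end: 1420245000000,                 // end of the usage window
  organization_id: 'test-org',
  space_id: 'test-space',
  resource_id: 'object-storage',
  plan_id: 'basic',
  resource_instance_id: 'storage-instance-1',
  measured_usage: [
    { measure: 'storage', quantity: 1073741824 },
    { measure: 'light_api_calls', quantity: 1000 },
    { measure: 'heavy_api_calls', quantity: 100 }
  ]
};
```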
So here we submitted three usage documents, and once the usage has gone through the pipeline, we give you the report with all the usage. I actually just spent 46 dollars with three usage docs, so that's a very expensive API. I'm not going to go into all the details of that JSON, but you can see here that we have quantities and costs and charges, and a summary of your usage along different dimensions. I have the aggregated usage just for storage, along the plan dimension, so I can see what plans were used to actually meter the usage. I think at the bottom here I have the aggregated usage per resource instance, somewhere; it's somewhere here. And this is the usage for a fictitious CF organization. There's documentation for the APIs you can use to submit usage and get the reports; it's all documented on the Abacus website. You have examples, you have JSON schemas, descriptions of the POST and the GET that you're supposed to do.

The other thing I wanted to show you is that this also runs, obviously, on CF. Here, just before the presentation, I did a cf apps to get the list of apps deployed in my account, in the cf-abacus space. That's the Bluemix account I use to test our builds: each time we commit something, that kicks off a Travis build, these applications are pushed to my account, and then we run integration and performance tests there too.
Okay. You will see here, if you look at this column, that I'm running multiple instances of some of these applications, to test the capacity when we run the performance tests.

Like I said before, Abacus is flexible: you provide the configuration for your metering plans and your rating plans, and you do that using APIs, so you post JSON documents that describe them. I'm not going to go into the details of these JSON documents; I'm hoping that you go look at the project after the presentation. But the process for a service provider is typically: you get on board CF, you get your security set up, you get your security token to be able to talk to the Abacus APIs, then you submit your metering plans, and then you submit your usage. This slide shows an example of a document representing usage for the same fictitious object storage service: 10 gigabytes and 10 API calls. It's pretty easy to figure out what that does. If you want to use the rating part of Abacus, you'll want to configure the pricing for the resources you're going to charge for. You can do that, also by posting to us, with pricing plans, which let you define different types of pricing and even do that on a per-country basis; for example, in Bluemix we needed that because we run in many different countries. Once the usage has gone through the pipeline, you want to get a report, and that slide shows the type of report you can get; that's basically what I was showing you in the demo.

Now let's talk about the team. Fun fact about the team: that slide shows our GitHub profiles. You can see that a number of us do stuff outside of work. There's biking going on, there's hiking going on, the guy on the right does precision shooting, and Raj looks like he just climbed to the top of the world. He's here.
So he's laughing now. We're actively looking for tennis players, soccer players, surfers. But if you want to get on the team, you can also submit pull requests; that's another way to get on the team. If you're going to submit pull requests, you don't necessarily have to pair program with us, especially if you don't know us ahead of time. We are using a distributed commit process; the team is pretty distributed around the world, in different time zones. Hristo, for example, is in Bulgaria; Hristo works for SAP, and we manage to talk for about an hour or two during the day, but obviously we have to work in a distributed way. At IBM I also do other things in addition to Abacus, so sometimes my schedule is a little hectic, and it'd be difficult to pair program with me.

If you look at the team, it's pretty balanced. We have some engineers from IBM: Max, here in the room, is the PM, and I'm Jean-Sebastien Delfino. Where is Kevin? Kevin is here. We have Hristo and Georgi, and I think we have somebody else from SAP who started to contribute recently, in the room here; she's here. We have some independent contributors too, so it's pretty balanced. And the reality is that if you're interested in the project and want to get on board, submit a bunch of pull requests, and we'll get together and vote you into the project as a committer. We're really open to contributions from anybody who wants to come on board. Obviously we're on GitHub; you can submit GitHub issues if you have new ideas or run into problems. We also have our own tracker, so you can go to the GitHub project or the tracker to see what's going on.

Recent updates.
We just released version 0.0.5 today. I thought it'd be a good idea to have a release today, so that those of you who don't necessarily want to live at the bleeding edge of the head of master can get a more stable release. I should admit that the release is actually the head of master as of now, because we just cut the release. These slides are on GitHub too; you can get to them from the home page of the project. So I'm not going to go through all the recent updates; I want you to download the release, and you will see that list in the release notes. We did some interesting things recently. Since we're using this in production at IBM, we needed more flexible metering and rating plans to accommodate some of our requirements. SAP needed Abacus to run on MongoDB, so they did a lot of good work in that space. We ran into interesting situations with service providers not behaving nicely and submitting a lot of usage out of sequence, so we had to improve our handling of that. And we got the contribution of a nice Concourse pipeline from the SAP guys, just a few days ago actually. So there's a lot going on in the project.

In the near term, we're going to do a few more interesting things. Since we need to run this in a multi data center, multi region environment, we'll need some level of queuing in the Abacus pipeline to accommodate that. Initially we were not so sure about it, because we thought we could run one Abacus pipeline per data center and then use something else to flow the data between data centers. But recently some people have asked us to actually split the pipeline in two, and have part of the pipeline in one data center and the rest of the pipeline, more of the aggregation logic, in a different one. So we're going to introduce some queuing there. And in a very interesting space, machine learning and AI are all the rage these days, and
recently we got some interesting requirements from some people at IBM. They were saying: well, what if a service provider forgets to send us usage? Or what if a service provider actually starts to behave not so well and sends incorrect usage? Could we detect this ahead of time, or as it happens? If they're not sending us usage and we don't do anything, we don't know about it. So how could we detect that? We're starting to explore that space, to see if we could predict some of these situations instead of having the pagers on the team handle them after the fact. We're also doing interesting work on handling failed events. For example, somebody misconfigured a metering plan and the usage went through the pipeline: how do you recover from that? How do you replay that usage after the fact? How do you help the team that's actually operating the pipeline?

Longer term, well, for the first item I'm actually not sure if it's going to be longer term or more near term: I know that the SAP team, in India I believe, is working on some new UI for Abacus, so I'm really excited about that. We also had discussions about offering Abacus as a service, another interesting potential contribution, I think from SAP, since most of the people who deploy stuff to the cloud are also providing services to their own customers, and they might need some metering engine for that kind of service. So that's an interesting project. Notifications is something we've been talking about for a long time. I think we're getting a bit over time, as usual; I talked too much, sorry. But it's the end of the conference, so you have a lot more time now. Anyway, I'm going to stop really soon.
How can you contribute? I'm not going to go through all of these slides, but if you are an integrator of Cloud Foundry, you may want to integrate Abacus, because that's going to give you a way to meter usage on your platform, charge your customers, make money, and be happy. If you are a service developer, you want to get on board that platform: you want to learn the APIs and be able to submit your usage, because this is how you're going to charge for your service. If you just want to have fun and write code, well, this is fun code. Metering, rating, billing doesn't sound so fun initially, but believe me, there are some really tough computer science problems here, and there's volume, big data, lots of data, so that makes for an interesting project. If you just want to write documentation, we like that too, because usually the documentation lags a little bit behind. My theory is that if the documentation is too good, we'll get more users but maybe fewer contributors. I want people to get into the code. So sometimes it's actually not so bad to not have too good documentation, because people then read the code, and if they like it, maybe they'll contribute to the project. That's what I would like to see, okay?

So that's about it. Thank you for listening, and sorry about the overtime. There are a bunch of links to resources that will give you more information about the project, but it's pretty simple: if you just Google CF Abacus, you'll get to the home page and you'll find the same thing. And finally, if you want to talk with us, we are usually on Slack or IRC or Gitter; there are some little badges at the top of the home page of the project if you want to join us and chat. Okay, thank you very much.