And I'm going to cover what data microservices are. So, now that you have an understanding of what the whole microservices thing is and how customers are actually adapting to this new culture of moving from monolith to microservices, we thought it would be good to present to you guys that it's not just about the applications. You also need to look at applications and data as one thing rather than separating them. How many of you here have at least worked a bit on databases, at least on NoSQL databases or Hadoop? Oh, not bad. So most of you are more into development, is that right? You don't really interact with database stuff as such. Okay, so we have very few people. We will try to make it much easier to understand for those of you who are not coming from a database background. But again, the concept is very interesting. Like I said, if you look at all these big internet giants like Google, Facebook, Yahoo, the way they design things is not that they only look at applications. They look at applications and data together: as applications start generating more data, you collect all this data, then you try to find insights in it and make your applications even better. So that's what the talk is all about. Again, we'll try to keep it short and sweet so at least you're not bored. So before we start, about myself: I'm Vish Fanindra. I run data engineering here at Pivotal for the South Asian market, so I handle both the pre-sales and post-sales organization here at Pivotal. I've been in this company for about 10 years. All right, so Carlos, you wanna introduce yourself? Yeah, I'm with Pivotal as well. I'm a solutions architect, so I'm more on the PCF side: I deploy PCF and show people how to deploy apps, Spring-based apps, on PCF using microservices. Okay, that's my Twitter handle there. So before we start, I just wanna take a step back and demystify a bit: why are we doing this now? What exactly are these microservices? What are data microservices? The idea here is, if you look at the whole new enterprise world, especially some of these companies that are moving to agile or adopting DevOps, they're looking at all these big monolithic applications and trying to break them down into a microservices architecture. So one thing is very clear: everyone has ideas, but the time it takes for you to make an idea into a reality, and then to be able to make money out of it, is the whole thing. You can have n number of ideas, but if you really can't make revenue out of them, then there's no point. So this is the concept of what we call concept to cash, where you have this idea, you want to materialize it and make it real, and at the end of the day keep innovating on the idea that is out there. So it's an iterative process. Historically, we all started with the waterfall methodology, and then we moved into agile practices and lean practices. Then as we got more mature, we started looking at CI/CD and ultimately continuous deployment. Who's doing CI/CD or continuous deployment? Very few.
So again, going back to all these consumer-grade companies like Amazon, like Netflix: these guys push release code to production every few minutes, several times a day, right? So it's an iterative process. And if you look into the evolution of this, it has one common characteristic: the shorter the release cycles are, the more software quality improves, because you're basically taking a bigger problem, deconstructing that problem into smaller problems, and attacking each one of those rather than looking at one big picture. Whereas on the other side, the traditional way of building applications, the waterfall model, your release cycles are at least nine months or a year. After one year, things will have changed, the market has changed, and you really have no control. It's like building bridges, right? There's this talk by Onsi from Pivotal; just recently we held the SpringOne conference in Vegas, and he talked about helping developers do what they love. It's an amazing talk. Everyone should watch it; it's up on YouTube, just search for Onsi on the site. It's a 15-20 minute talk, and I really loved it. My background is more in data, really talking about data, analyzing the data; I'm more in the analytics space, not really in the development space, but I really loved what he said. Building bridges takes years, right? From a software perspective, you can't think that long. You've got to be doing things very iteratively and very fast. So that's the idea here. So one thing is very clear. If you're going agile: generally, in a traditional monolithic architecture, you have hundreds of developers, and all these developers tend to work on one big thing, and the challenge is that they work in a dependent mode; they can't work independently. Whereas on the other side, you deconstruct it, you make it into a microservices architecture where each piece is a service, and you have smaller agile teams really focusing on and building that one thing. It's more or less like, we used to talk about SOA, and now we talk about microservices; there is some kind of convergence between SOA and microservices. But the point is, the smaller the teams are, the more focused they are, independently building things. And the beauty here is that you're not going with what we call one big giant database. In the database world, your applications no longer keep the complex data models you build, and the data from these applications, in just one location. Because the moment you deconstruct this into a microservices-based architecture, the idea is that you pick the right tool for the right job. Which means one service would probably use a NoSQL database, another service would probably use a simple SQL database like MySQL or Postgres or whatever it may be. So the idea is, with this new architecture, we're talking about polyglot persistence, where you have a lot of tools, backing services, out there that you could persist the data in, and you're not sticking with one big giant database. Now, the moment you deconstruct this into a microservices architecture, you can run it in a structured platform like Cloud Foundry. How many of you have heard of Cloud Foundry?
Today Josh covered Cloud Foundry. So again, Cloud Foundry being a structured platform, deploying these services becomes much easier. Things like updates and upgrades, scaling your applications and your persistence layers horizontally, migrating them to a new environment, redeploying and redistributing them: all of this becomes much easier with Cloud Foundry. And the beauty here is that Cloud Foundry becomes an abstraction between your infrastructure and your microservices, which means you could easily change your infrastructure. In the past, your applications were tightly coupled with your infrastructure. Any change, any version upgrade, anything that had to happen meant a lot of panic between the developers and the operations team, because everyone has their own KPIs. With this approach, you're really talking about automating all those activities. Now, from an enterprise architecture point of view, traditionally you have all these big monolithic systems, and as these systems grow, you end up introducing things like an integration bus or enterprise service bus. The integration bus or enterprise service bus really helps in moving data from one system to another. As they get bigger and bigger, your enterprise bus becomes the key to all the integration points between these systems. And the challenge then is, if you change one system, all the other systems are impacted, which means you need a team of people managing the integration bus or enterprise service bus, and also making sure that impact analysis is done for every change made on any of these systems. And if you look at the left-hand side of it, we scoped that to a microservices-based architecture, but if you look at the other systems, you have all these big giant databases generating a lot of data, and all this data then has to go into a platform like a data warehouse, where all the data gets collected, analyzed, and used for management reporting. That's where you introduce another set of tools, like ETL; if you haven't heard of ETL, it's extract, transform, and load. So you introduce another piece of software, and you have this huge problem where your data warehouse holds D minus one, which means yesterday's data, which is what management is using for decision making. And on top of that, this system has to help the business make decisions and give feedback to the applications, which is not really the right approach to start with. Because, like I said earlier, the idea is that applications generate data, you collect the data, you try to find insights in the data, and you make your applications better, which in turn generates more data. That's what all these consumer-grade companies like Google, Facebook, Yahoo, all these guys are doing, right? So with this architecture, you have a problem where you don't have data available in real time, which means your analysis is not accurate. Now, if we zoom in and really look at the data pipelines, whether it's an integration bus or an ETL tool, you have data coming from one system and you're moving it to another system. In between, you end up doing a lot of processing steps. What I mean by that is, you could probably do data filtering.
You could probably do some kind of data transformation. You could probably do multiple things before the data lands at the destination. So the challenge here is, if we deploy this in a traditional monolithic architecture and I have to change one processing step, which is what I was mentioning earlier, any change requires a lot of impact analysis, because it's not straightforward: it requires you to go and look at how this data is going to be used by other systems, which is a mammoth task to start with. Which means updates, upgrades, and scalability all become a problem. And one most important thing, I'm sure you all know this: no one cares about the day-zero thing, day zero or day one, whatever you call it. Day two is what's important. You build a cool application, you deploy it into production; that's not where the real work ends. How do I manage this? How do I really make this better? It's all about day-two operations, and with this kind of architecture, all the day-two operations, the upgrades, the scaling, the redeployments, all of this becomes a really complex task. So what we are proposing is to break this into a bunch of Spring Boot apps, where each processing step is an app with a proper contract. So you have this abstraction layer; I was talking earlier about Pivotal Cloud Foundry. You have different kinds of messaging queues or in-memory cache layers that you would use with these Spring Boot apps, all following the contracts of the microservices architecture. The data pipeline on top of the abstraction layer is all automated: once you bind the service to a specific app, it starts moving the data from one app to the other. Yeah, you wanna add something there?

Here, I don't know if you've heard about this: this is what they call Spring Cloud Stream. This is one of the Spring projects, the Spring libraries. And what makes it cool here is that those guys are completely independent, right? They don't need to know each other. They only need to know how to send the data over or how to receive the data. Because they're completely independent, you can scale this one up, you can scale that one down, and so on. And the layer here that they bind to, the abstraction layer, can be message queues like Kafka, or RabbitMQ, or Geode, or Redis, and it moves the data between them. And you can scale those layers too. So you keep writing Spring Boot apps, but now, rather than doing business logic, you are processing data. You can receive data as a stream or as a batch, but this guy here, which is a Spring Boot application, is gonna process that data for you, and it can send the data onward using these abstraction layers. It makes it very easy to program this. And if you are developing microservices, one very common way of moving data between microservices is via REST APIs, right? Here, you don't have to rely on REST APIs. If you need asynchronous communication that keeps those microservices completely independent, you can use this, and the message layer here will move the data for you. This makes you very powerful, because you, as a developer, don't have to change the way you develop software. You still write your Spring Boot application, and Spring will inject the code for you that makes the data move between the services.
So it's the same concept as before, but now it's all about the technology that makes it easy to move data around microservices. That's why we call these data microservices: because you are specifically moving data between them and processing it. And then you can chain them, like here: the data goes from this guy, for example, to that one, then to that one, so you create your data pipeline. And in your data pipeline, you can transform the data as it comes in, and then you can sink the data into a database, or you can send the data to someone else and trigger another process. You can put things together like Lego and create your data pipeline as you wish. This makes it very powerful because, for example, if this app here wants to consume some data, it doesn't have to ask that one. You can just tap this guy right at the binding and start consuming the data from there. You don't have to change this one, and this guy doesn't even have to know about it. So it's very easy and flexible to inject a process that handles that particular data for you, and it doesn't impact the flow of your data pipeline. Your data pipeline is still there processing the data, and you can inject different apps to process different things, and your data pipeline doesn't have to change. The next one, it's the same. Yes, and here, sorry, this is how you do this in Spring. If you are a Spring user, you pretty much just put annotations in your code. So there, on the top, you have the enable-binding annotation with Processor: that means this is a processor. It receives data from someone, processes the data, and sends it to someone else. Who you're receiving from, you don't know. Who you're sending to, you don't know, because that's going to be decided when you bind your Spring Boot applications together. The moment you do that, when you create your pipeline, the data is going to start to flow.
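To make that concrete, here is a minimal sketch of the processor style described on that slide, assuming the annotation-based Spring Cloud Stream API of that generation; the class name and the uppercase transform are illustrative, not from the talk:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;

// A data microservice: it only declares "I receive on an input and send
// to an output". Which broker carries the data (Kafka, RabbitMQ, Redis,
// ...) is decided by the binder at deployment time, not here.
@SpringBootApplication
@EnableBinding(Processor.class)
public class UppercaseProcessorApplication {

    // Receive a payload from whoever is upstream, transform it, and pass
    // the result downstream. Upstream and downstream stay unknown until
    // the apps are bound together into a pipeline.
    @StreamListener(Processor.INPUT)
    @SendTo(Processor.OUTPUT)
    public String transform(String payload) {
        return payload.toUpperCase();
    }

    public static void main(String[] args) {
        SpringApplication.run(UppercaseProcessorApplication.class, args);
    }
}
```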
So this basically talks through the previous slide. Again, this is about deploying these Spring Boot apps into a platform like Cloud Foundry, which not only gives you transport transparency, it also makes your infrastructure transparent. It has integrated metrics and integrated monitoring, and it comes with dedicated logging and auto-healing: you have containers, you kill one, and it will automatically start another container. So it comes with a lot of goodness, especially from a Cloud Foundry perspective, which automates your applications and binds your applications to your data services. One detail here: yes, you can deploy this in Cloud Foundry, but you can test it today on your laptop, with no Cloud Foundry. Why would you want to deploy this in Cloud Foundry? Because you want aggregated logs, you want high availability, and so on; that's what the platform can give you, so you as a developer don't have to worry about it. The platform can give you the message layer that you're gonna bind your applications to, so you don't have to worry about that either. But of course, if you want to test these things running, you can test on your laptop, and then you go to production and put it on a platform that's gonna run those things for you and keep them running for you. That's the power of Spring: the same code that you test on your laptop is the same code that's gonna go live. No change; you don't need to change anything, right? And you can run it on your laptop today or on a platform like Cloud Foundry, with no change in your code. That's the power that comes from the Spring annotations, from injecting the code for you depending on which platform you're on. So this is Spring Cloud Data Flow, the cloud data pipeline. Carlos, you want to touch on this? So, you saw a bit of Spring Cloud Stream, right? That's a library from Spring Cloud that you use to bind applications together, or, if you want to call them data microservices, to move data between applications. On top of Spring Cloud Stream, you have what we call Spring Cloud Data Flow. With Spring Cloud Stream, you saw you still have to write some code and bind it to the message layer and so on. Spring Cloud Data Flow abstracts this for you and gives you a higher-level way of creating your pipeline. You can now create your pipeline using a DSL that makes it very easy to say: I want to connect an HTTP server's output to a processor that takes the output from the HTTP server, processes it, and sends it to a database. You can mix and match and create different pipelines using a console, a command line, using the DSL provided by Spring Cloud Data Flow, which makes it very, very easy to create pipelines. I'm just gonna give you a very, very quick example of how this can work. Is Carlos audible to everyone? Can you hear him? So here is the console of Data Flow; they call it the Data Flow shell. This is how we interact with that layer that I showed you. Here you can create pipelines. You can say: I want an HTTP server that receives requests; whenever someone sends a request to me, I want to capture that request, do something with that request, transform it from CSV to JSON to XML, whatever you want to do there; and then the output of this processing I want to send to a database, or to a file, or to a log file, or to my Hadoop cluster, and so on. You can do this orchestration of how your data moves around just using this console. That's the power of Spring Cloud Data Flow: it abstracts away a lot of the code that you would have to write to bind those things together. Of course, when you bind things here, that software, that application, has to already be there, right? But once you have a collection of them, which is what Spring Cloud Data Flow gives you, you can use them to orchestrate your data pipeline.
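In the shell, a pipeline like the one he describes, HTTP in, a transformation, database out, would look roughly like this; a sketch using the out-of-the-box http, transform, and jdbc starter apps, where the port, expression, and table name are made up for the example:

```
dataflow:> stream create --name httpToDb --definition "http --port=9000 | transform --expression=payload.toUpperCase() | jdbc --tableName=events --columns=payload" --deploy
dataflow:> stream list
```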
So with Data Flow, you have a server that just manages your applications; it doesn't run your applications for you, and unlike an ESB, it isn't sitting there moving your data around. This guy here is just an orchestrator. He just connects things. He just says: A goes to B, then goes to C. That's it. How the data gets from A to B is not his job; that's the message layer's job. The ESB does that, does the security, does everything, and that's why it becomes a big monolith: because it wants to do everything. This guy here does only one thing: orchestrate how the applications connect to each other. It does the binding, but nothing more than that. It doesn't move the data around; that's done by the message layer, which is provided by someone else, and if you are in Cloud Foundry, you can use the message layers it provides, like Kafka, like RabbitMQ, and so on. So you have each piece doing what it does best, right? You need to move the data around, and you have Kafka, which moves data around very fast, can persist the data, and is durable, so you can replay and so on; nice features that it gives you. You can use that to move your data around. When you develop your application, you don't have to know about this. You just say: I want my application to receive from A and send to B. How the data is gonna move, whether you're gonna use Redis to move the data, or Kafka, or RabbitMQ, is not your job; it's decided when you deploy, when you orchestrate things. You can say: I want to use Kafka for this pipeline, but for the other pipeline I want to use RabbitMQ, and so on. So as a developer, if you are developing an application that receives data from a third party, you don't have to worry about 'ah, I'm gonna receive from a Kafka endpoint, I have to send this to a file.' You just say: I have an input, I have an output. When you bind things together using this guy here, it will connect you to the message layer that you want to use for that particular pipeline. The platform does this for you, right? Those things, the message layer, the binding, the high availability of the applications: when you run this in Cloud Foundry, the platform will provide them for you. So all these services run as part of Cloud Foundry, and the moment you push all these microservices, you're basically binding them together. At execution time, you're not writing your stuff against one particular ESB. When you write for an ESB, you write for that particular ESB, most of the time; even though they put standards together, it never quite works, because in my case you do this, in your case you do that. Here, the only thing you say is: I'm writing a Spring Boot application. Where they're gonna run, whether it's in Cloud Foundry, on your laptop, or in Kubernetes, is something you decide, you or whoever manages the platform: which platform you want to use and how you bind them. If you like RabbitMQ to move data around, you use RabbitMQ. If you like Kafka, use Kafka. If you don't like either, there are third parties; you can implement that binding and bring it in as well. So you as a developer don't have to worry about this. You as a manager, or as part of the DevOps team, define where those guys are gonna run: if you're gonna use Cloud Foundry, or, no, it'll be physical servers, I wanna put things there, I have processes, I have Puppet or Chef scripts that bring those things up, bring them down, and monitor them for me. No changes to the code are required. The same way you interface with this guy, you keep interfacing with it. You are not bound to something; only at runtime do you say: I want to run this in Cloud Foundry, I want to bind this to Kafka. Then you tell it that. That's the main difference, right? It's about the flexibility that the platform offers; I think that's where the value is.
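That per-pipeline broker choice is just configuration. A sketch, assuming the usual Spring Cloud Stream binding properties and that both a Kafka and a RabbitMQ binder are available to the app; the destination names are illustrative:

```
# application.properties for one app in a pipeline
spring.cloud.stream.bindings.input.destination=tweets
spring.cloud.stream.bindings.input.binder=kafka
spring.cloud.stream.bindings.output.destination=tagcounts
spring.cloud.stream.bindings.output.binder=rabbit
```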
So here, if I do a stream list, I see the pipelines that I have created so far. I have one very simple pipeline called foo that pretty much every second gets the time and sends it over, and the log app gets that time and writes it to the log. Very simple pipeline. Here I have another one called tagcount, another one called tweetlang, and another one called tweets. Those are my more complex pipelines, because what I'm doing now is getting data from Twitter, and on the data that I get from Twitter I'm processing, for example, aggregations by hashtag, or by language: whether people are tweeting in English, in Portuguese, in Spanish, and so on. So all those pipelines here were created by me, but if you look at what I have used, there's a time application there, a log application there, a tweet stream application; these are applications that were not developed by me. They might be developed by anyone, maybe by one of you: you develop your application as a Spring Boot application, make it available there, and I can use it and mix and match it how I like. Right? So, someone developed the tweet stream that connects to the Twitter firehose and gets the tweets from there; that's it, it only does this, get the tweets. How you're gonna process the tweets is up to you. So I'm saying here: get those tweets, count them, and count them by hashtag. Now I'm plugging in something else: I'm using that to count my tweets and aggregate by hashtag, and the guy that developed the Twitter application doesn't even see that. The application only does one thing: connect to Twitter, get the tweets, that's all. Here, you are using it, with all the applications put together, to do something that you want, some piece of data analysis: analyzing the tweets. You know about this when you deploy your application, but the one who developed the tweet stream only did one thing: connect to Twitter, pass your credentials, start on the firehose, and start feeding the tweets in. What you do with the tweets is up to you. I could say: get all the tweets and send them to a database. Get all the tweets and send them straight to Hadoop, or process them and also send them to Hadoop, because I want to process them later as well. I'm processing in real time, as the tweets come in, but I also want to process them later in batch: I want to process the hashtags for the last month, for the last year. So my data is going to Hadoop, into HDFS, it will be there in HDFS, and I'm also processing online here. So you define what you want to do. You can have multiple streams coming in, aggregate them, and you can have multiple outputs: going to a relational database, going to HDFS, a file system, and so on. That's the power you have here: you keep changing things and mixing things around. So let me create a very simple application here; I'm just going to deploy this foo application that I was talking about there. Actually, let me not set the tweets running yet; I'm deploying just the one I'm talking about. So what I'm doing now is... there's no connection there. Oh, the internet is not there? Probably not. So I'll keep talking like this and come back to it; I'll come back to this once we finish talking, since this is live and it's five o'clock. Yeah, so while Carlos is fixing that, the other thing we have to look at is what we talked about on the Cloud Foundry side: it helps you scale your applications automatically, and it allows you to deploy quickly and migrate from one platform to another. All these things are quite possible. And moving on, just to finish this up: when I talk about this, here I create a pipeline, right?
So: a JSON filter, then a transformation after that JSON filter, send it to Spark, process it in Spark, get the data back; aggregate the data and send it to Geode, do something else and send it to a relational database, right? This is my pipeline, and this pipeline is a bunch of Spring Boot applications. The JSON filter is a Spring Boot application. The transformer is a Spring Boot application. Each of those things in your pipeline is a Spring Boot application. So essentially what you create is a catalog of applications that you can connect together to create a pipeline. The Data Flow catalog holds this list of applications for you, which you can use to connect them and process your data. That's the main idea. Those apps run as Spring Boot apps, which means they can run in containers and they have high availability: if you are running this in Cloud Foundry and one of them crashes, it can be restarted, it can be deployed again, and so on. So the platform can take care of the high availability so that you don't have to, and the data keeps moving around. That's the main thing. And if for some reason one of these transformers stops processing, let's say the transformer here crashes, the data stops moving on. If you are using a message layer like Kafka, what's gonna happen? Kafka is gonna persist it. Then what can you do? Replay. You replay, the data is sent again, and now you can process it, right? And you didn't have to develop any of this: you are using the tools that are available to you, connecting them together to create a pipeline. That's the powerful thing there. If it crashes, you can go through it again: if you are using a message broker like Kafka, you can replay your messages, because it persists the messages and allows you to replay them. That's the powerful part: because the data can be persisted there, you can replay it, and you don't lose data if you have an app that's out for five seconds, ten seconds. If you are processing a financial-market feed that you don't want to lose for any reason, you can use a persistent message layer like Kafka that provides that persistence for you. If you don't care about losing data, you can use a faster message layer that can lose data but processes it fast. Then you decide how you want to connect things. You are not locked in: with an ESB, I bought that ESB, I have to use the messaging transport it gives me, and I cannot change anything because I have to use it as it is. Here you can connect things, and use the best tool for the job. If persistence, not losing data, is important to you, use a message layer that persists the data so you can replay it. If it's not important to you, you can use something where everything is in memory; if it crashes and you lose something, that's fine, you don't die because you lost it. Depending on what you want to do, you choose the tools you want to work with. And the same principle applies: the way you do this is the same, you match them up using the Data Flow shell that I showed you, and after that they're deployed as Spring Boot apps and run independently. When they talk to each other, this guy doesn't know about that one; they don't need to know about each other, only about the message layer.
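One concrete detail behind that replay story: with Spring Cloud Stream, durability typically comes from giving consumers a named group, so the broker keeps the data for the group and a restarted instance picks up where it left off. A sketch with assumed names:

```
# Named consumer groups get durable subscriptions; with the Kafka binder
# the broker retains the messages, so a crashed processor can resume or
# replay rather than lose data. Anonymous consumers don't get this.
spring.cloud.stream.bindings.input.destination=transactions
spring.cloud.stream.bindings.input.group=fraud-check
```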
You want to show them the UI? So let's see if the connection is... Is my laptop connected to the internet? Supposedly. So this is the Data Flow UI. Before I show this, let me show just one thing here, to give you some context, a high-level architecture of these things, the Data Flow that I showed you. Yeah. So this is the main architecture of Spring Cloud Data Flow. What I showed you was a shell that connects to this guy, which is also a Spring Boot application that can run locally on your machine or on a platform like Cloud Foundry. In this case, it's running in PWS. So this guy is running in PWS, and I have a shell on my machine, that's what you saw, that sends REST requests to talk to the server. By making requests, I can ask what's deployed, say I want to create a new stream, deploy it, and so on, using the shell. If you don't want to use the shell, you can use the Data Flow UI: there's a graphical interface where you can create your pipeline graphically, just drag and drop. That UI is also going to talk to the server. And if you prefer to do things manually, the hardcore way, you can send the requests yourself, because it's just a bunch of REST endpoints. This is running somewhere; it can be on a laptop, it can be in Cloud Foundry or Kubernetes, whatever you want. Then this is how you interact with this guy. And this guy here knows how, on receiving the requests, to start the applications. If you say 'I want to send time to log', meaning I have two applications, an application that generates the time and an application that writes the log, then this guy will say: hey, run these two applications inside Cloud Foundry. Inside Cloud Foundry, they're going to run as containers managed by Cloud Foundry. And this guy will tell those applications: hey, you have to talk to each other, so bind to the same message broker. It also depends on you; you can say which message broker you want to use, whether it will be Kafka or RabbitMQ and so on. By doing this, you now have two applications running inside Cloud Foundry that are talking to each other via the message broker. Right? And then this guy here can monitor those applications and find out if one crashes. Or let's say you have an update: I built a new version of my application, now I have to upgrade. Then you can do a canary upgrade as well, meaning you don't have to take the whole service down to upgrade. So this is the high-level architecture of Spring Cloud Data Flow. These are Spring Boot apps; it's nothing special, just Spring Boot apps running on the Java JVM. Very, very simple, right? The same technology that you use to write your microservices, your web applications, is the same one that's here. So far, are you guys on track with us? Okay, awesome.
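Since the shell and the UI are both just clients of the server's REST endpoints, you can also drive the server directly; a sketch, assuming a Data Flow server on its default port 9393 and an illustrative stream name:

```
# What the shell does under the hood: plain REST calls to the server
curl http://localhost:9393/apps                  # registered applications
curl http://localhost:9393/streams/definitions   # existing pipelines

# Create and deploy a "time | log" pipeline
curl -X POST http://localhost:9393/streams/definitions \
     --data-urlencode "name=ticker" \
     --data-urlencode "definition=time | log" \
     --data-urlencode "deploy=true"
```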
Okay, back. So, you wanna show the UI now? The requests were cutting out on me. Okay, let me get the other one. Okay, this is the graphical interface; that's the other way you can interact with Spring Cloud Data Flow. If you check here, this is the list of applications that I can use to create my pipelines. I have an FTP source, which means I can have an FTP client that connects to an FTP server, copies a file over, and brings it into my pipeline. The output of that FTP source can go to a JDBC sink, which means it will open a JDBC connection to a database and save the content of that file into the database. You create the pipeline as you wish, right? So, if I go here to the streams that I have, these streams show you what you have: the same information you have in the command line you have here, and from here you can deploy, you can destroy, remove a stream. It's the same information, but in a graphical way. And here, let's say that I develop a new application. Say I create a source, meaning an application that can consume data from Salesforce: that application connects to Salesforce using the Salesforce API and brings the data into my pipeline, so now I can create pipelines that use data from Salesforce. If I register that application with Spring Cloud Data Flow, what does that mean? Now someone who uses Data Flow can architect pipelines that take data from Salesforce, data from an FTP server, data from Hadoop, and process the data together to give you something meaningful. I can register my application like this. So, there are three types of applications in Spring Cloud Data Flow; well, there are also tasks, which I'll talk about later, but here we'll talk about streams, moving data from one point to another. You can have a source, which is the initial point of data inside the pipeline: the guy that brings the data into the pipeline. FTP can be a source, an HTTP server can be a source, a file can be a source, and so on; these are the names of the applications. A processor is something that receives data on one side and sends data out the other side. What do you use a processor for? To transform data: receive XML, give back JSON. Or to aggregate data: you receive data, then you aggregate by name, you aggregate by city, or, as with the tweets, you aggregate by hashtag, or by language, and so on. And then you have a sink. A sink receives data and ends the pipeline there, because now the data gets persisted or goes out of the pipeline: it can go to a database, to a log, to an HDFS system, and so on. These are the three types of applications that you can develop, and with them you can create your data stream pipelines; that's why we call them data microservices. Finally, there's the task. What is a task for? Batch. Let's say you want to do a batch job: every so many tweets that I receive, do an aggregation using all my tweets from the last year. It's a batch job that's going to take longer; it goes to wherever the data is, once the data is there, and runs. That's what they call a task: a batch job. And you can combine a batch job with your stream, and by doing this you can have triggers, which is powerful. Let's say you are processing logs and you receive a log entry that has a failure, an error in it. That error can trigger a batch job that sends an email, or processes something else, or opens a ticket, something outside the pipeline. You can combine those things together. That's the powerful thing there. And that's how you register your applications using the web interface; you can do the same thing from the command line as well.
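For comparison with the processor shown earlier, here is a minimal sketch of a sink in the same annotation style; the logging body stands in for what would really be a JDBC insert or an HDFS write, and the class name is illustrative:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Sink;

// A sink declares only an input channel: data arrives and leaves the
// pipeline here, e.g. into a database, a log, or HDFS.
@SpringBootApplication
@EnableBinding(Sink.class)
public class LoggingSinkApplication {

    @StreamListener(Sink.INPUT)
    public void handle(String payload) {
        // Stand-in for persisting the payload somewhere durable.
        System.out.println("Received: " + payload);
    }

    public static void main(String[] args) {
        SpringApplication.run(LoggingSinkApplication.class, args);
    }
}
```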
Okay, so you want me to just wrap it up? Okay, keep going. I don't know if I need this one here. So, probably three things that I want to summarize. First, converging applications and data in the cloud: the applications and the data can't be separated. Second, building really data-driven apps: the smarter the apps are, the more data gets generated, and the better you can make the smart apps again. And then, being able to use open source. I guess most of you, since you're using Spring, are aware of the open source community and all of that. Pivotal is really a big advocate of open source. Whatever products we carry or offer to customers are all open source, and what we provide is the support and the value-added services on top of the products. We truly believe this matters for all enterprises, because if you look at some of these consumer-grade companies I was referring to, Google, Facebook, they're able to move at that pace because they're all using open source; there's a huge community backing it. The moment you go with commercial products, you're locking yourself in. Your innovation is totally dependent on how fast that vendor, the software product vendor, is innovating; the speed is all up to the vendor. But with open source, you have a community. Agile is something that we also believe in: shorter innovation cycles reduce the TCO, and the time to market improves. And lastly, it's all about being cloud-ready. You have to look at all your applications running independent of your infrastructure. Whether you move your application to Amazon or to a private cloud, whatever the case, it has to be cloud-ready, and you should really focus on solving more business problems rather than working on things that are already available in the market. So that's pretty much it. I have a few more slides, but I think this is really good enough. If you have questions, I'll take the questions; meanwhile Carlos is gonna try to run the demo here. The demo is actually quite cool: it will count all the tweets by hashtag and start showing a graph in real time of what the top hashtags are. So, just to show you some things here: I'm on my laptop here, so let me just get this up. There's a question there, yeah. Yeah, so Spring Cloud Data Flow is the next version of Spring XD; Spring XD is going to be end-of-life next year. Can you migrate? Yes, you can. What we did, going from Spring XD to Spring Cloud Data Flow, is remove the runtime. The runtime now is a cloud platform, like Cloud Foundry or Kubernetes and so on. So the runtime business of availability, of binding message brokers and so on, you leave to the platform, and now you focus only on the apps and on orchestrating them. So if you check here, what I'm doing is starting the Spring Cloud Data Flow server, the admin, on my machine; this is a Spring Boot app. So now I can bind my shell to this guy here and run my pipelines locally on my machine. The experience is the same as on the cloud, wherever you are: same experience, always a Spring Boot application. Now, what I'm gonna do here is bind this to my local machine. So now I'm binding to my local machine. If you check here, you see that it received the request.
So now, if I do a stream list here, I have nothing here. Let's say that I want to create a very, very simple application; very simple, see, I have something in the pipeline here. Let's say I want to create this tweets thing, right? Now I'm gonna run it locally on my machine. So I'm gonna run this. One thing that you have to do is, when you start Spring Cloud Data Flow, you have to register your applications, right? You need a catalog of applications to run. When you start Data Flow, Data Flow doesn't know anything about your apps, so you have to tell Data Flow: this is my catalog of applications, these are the applications I'll be able to use. So what I'm doing here is an import; I'm telling Data Flow these are the apps that I want to import, so I can now use them in my pipeline. And another thing that you can check here is that I have "rabbit" there. What does that mean? Those applications can be connected via RabbitMQ. If you want the ones for Kafka, you get the ones for Kafka, and so on. The point here is that after this I'm gonna have a catalog of apps that I can use to create my pipeline. If I do this here, now you see that all those apps are registered. So now I have sources and sinks: JDBC, TCP, and so on, right? So one thing now, if you check, is that what I did was say I'm using RabbitMQ, so I have to start RabbitMQ. Sorry? Yeah, yeah, I was just saying that now I have to start RabbitMQ, because what I said was that I want to use RabbitMQ. Hmm, let me start Kafka instead, because for some reason my broker is not... actually, let me bind this to the other one; let's try to run this on the cloud, because my RabbitMQ is not starting up for some reason and we don't have much time to figure it out. So I'm gonna bind back; what I did now is bind this back to the Data Flow in the cloud, at this address here. And now if I do a stream list, I have what I had before, right? So let's say I deploy foo; it's very basic, just time to log, and after that I'm gonna deploy the tweets. So foo is deployed. Now if I check here... it's failing for some reason. Let's wait a bit. The demo gods are not with us, so let's keep answering questions. Any other questions? We'll give it another two, three minutes; if not, we'll do it some other time. Any other questions? Here. So what we've learned so far, is that something like Kafka, like Kinesis on AWS? Yeah, think of it like an abstraction layer, like an OS on top of your cloud. How does it compare with something else, like CloudForms? CloudForms? We have CloudForms. Okay, Red Hat has this thing called OpenShift, which is comparable to Cloud Foundry. Have you heard of CloudForms before? CloudForms, yeah, but I never used it, so I don't know much about it. So what does it do, what's there? You create orchestrations, you orchestrate apps as well, like Docker Compose, something like this? Okay, so the applications run in containers, so you have orchestration of containers. How does Cloud Foundry work, is it a platform as a service? So we follow the platform-as-a-service model. It's comparable in that sense: it's a platform as a service, and basically there are vendors like OpenShift, who run on Kubernetes and Docker, whereas you have other vendors like IBM Bluemix; IBM has their own Cloud Foundry. So, I'm not so sure about CloudForms.
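For reference, the registration step from earlier in the demo looks roughly like this in the shell; the descriptor URI and the Maven coordinates are placeholders, and the salesforce line shows how a custom source like the one imagined earlier might be registered:

```
dataflow:> app import --uri http://<host>/stream-applications-rabbit-maven.properties
dataflow:> app list
dataflow:> app register --name salesforce --type source --uri maven://com.example:salesforce-source:1.0.0
```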
Hold on, hold on, hold on. Is it done? Yeah, so now I'm gonna start the stream deploy. We are running everything on Cloud Foundry, so it probably takes a few seconds. Okay, so it deployed both; the first two streams have been deployed, and now I've deployed the final one, the tweets one, the one that connects and receives the tweets. It's deployed, so now if I go here; I'm running this locally, so to avoid any... hmm, it's still not showing here. So these are the streams. I deployed all three streams, and now we should wait a bit for the data to come in; once it's ready, it's gonna show here in the dashboard so I can see the visualization. If I want to check what's going on, I can come here to my logs. This is Kafka; I'm using Kafka locally on my machine, this is the message broker that's moving the messages around, and here you see that something is crashing, so I can check the logs. For some reason, the tweets cannot be processed, so the data is not flowing to the other parts of the pipeline; I don't know what's going on. Okay, thanks a lot for coming; we'll probably be conducting these sessions quite often, so next time we'll be able to show the same thing. Wait, I know, I know what it is. It's ready, it's fixed. There it is, so now the data is gonna show. The data goes to there. So this is the demo that I wanted to show; Redis was down, that's what it was. This comes from the cloud, this comes from the tweets, and the size means how many tweets you have with that particular word, right? So these are the hashtags, and I have another one here that's also in real time; this one is for language, so you can see English, Spanish, Korean, and so on. All of this changes in real time. Those aggregations are done by the streams that I showed you. So tagcount receives the tweets; you see that tweet stream? I get the tweets from there, and I'm putting a counter on it, a counter aggregated by the hashtag text. So it's counting: every time a hashtag shows up it goes one, two, three, and it keeps counting that inside Redis; Redis is the database we're using to do these analytics. The second one is by language, so I'm using the field named lang, meaning language. If the language is English, it goes in the bucket for English; if it's Spanish, in the bucket for Spanish, and so on. And finally, this is the guy that gets the tweets; it doesn't do anything else, it gets the tweets, and those two others process the tweets. So I have three streams: one just to collect, and another two in order to analyze. And based on these analyses, if I go to the interface, I can see them running here. So I put three pipelines together in order to get this analysis, and I didn't have to write one line of code, right? Because the tweet application that consumes the tweets was already written. For the aggregation, I just used the DSL from Spring Cloud Data Flow itself: use the field-value-counter, an application that counts things, based on the hashtag text; use the field-value-counter based on the language. So I used the DSL itself to do this very basic aggregation of data, and I didn't have to write any code for this. And now I can see this here.
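Reconstructed from what the demo shows, the three streams would look something like this in the shell: one stream ingests the firehose, and the other two tap its output into field-value counters. This assumes the stock twitterstream source and field-value-counter sink; credentials and counter names are placeholders:

```
dataflow:> stream create tweets --definition "twitterstream --consumerKey=<key> --consumerSecret=<secret> | log" --deploy
dataflow:> stream create tagcount --definition ":tweets.twitterstream > field-value-counter --fieldName=entities.hashtags.text --name=hashtags" --deploy
dataflow:> stream create tweetlang --definition ":tweets.twitterstream > field-value-counter --fieldName=lang --name=language" --deploy
```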
So the point is that you can use Spring Cloud Data Flow to process the data that you have. The data doesn't have to come from the cloud or from the internet; it can come from a file, it can come from a database. Anything that you want to process as a stream you can do here, and if you want to process it in batch as well, you can trigger the batch from here. Questions? Thanks for staying. Thank you. So, thanks.