Many companies rely on batch processing of static data for everyday decision making. Organizations that proactively unlock insights from streaming data have quickly found themselves at the leading edge. But analyzing and acting on massive amounts of real-time data isn't easy. Integrating disparate data sources and formats, while ensuring low latency and high performance, can pose an enormous challenge, requiring specialized expertise and high costs for development, deployment, and continued support. With Azure Stream Analytics, an easy-to-use serverless real-time analytics engine, you can effortlessly run analytics on your streaming data from the cloud to the edge. From retail to manufacturing, healthcare to financial services, Azure Stream Analytics is designed to deliver real-time results that accelerate your business. Go from zero to production in minutes. Stream Analytics' simple SQL language is easily extensible with C# and JavaScript. Built-in machine learning capabilities enable advanced scenarios like anomaly detection. Azure's enterprise-grade reliability, with an unmatched, financially backed SLA, lets you confidently run mission-critical workloads with sub-second latencies. Build an end-to-end serverless solution in minutes by connecting Azure Stream Analytics to Event Hubs or IoT Hub. Directly output results to Power BI for real-time dashboarding, Cosmos DB or SQL Database for persisting big data, Azure Functions or Service Bus for real-time alerting and actions, and much more. Azure Stream Analytics is the first serverless stream processing service that lets you run real-time analytics from cloud to edge, right where the data is generated, independent of network connectivity. No matter how fast your data grows, Azure Stream Analytics helps you effortlessly unlock continuous real-time insights to gain a competitive edge. See where your data takes you next.

Well, hello, everyone. Welcome back to Azure FunBytes. My name is Jay Gordon, and we are once again back here where we get together — we try to get together every week — and talk about the services, the products, and the people that make up an excellent Azure experience. I am so glad you're here. Some big news today I've got to share: today is going to be the last live show that I'm doing with Azure FunBytes. I've really enjoyed doing it. This is episode number 70. It's been a lot of fun. I've met a lot of really great people, and I've learned a ton about all these different products that work with Azure, all the services that Azure is made up of. Before we get into what we're going to be talking about today with my guest, I just wanted to say thank you very much to you all. I've had a great time with you. I'll talk a little more about what I'm planning next, but it's time for me to step aside from Azure FunBytes and try some stuff that's new. Sound good? Cool. Well, thank you very much for listening to my little intro there. We've got a great subject today: we're going to talk about Azure Stream Analytics and how it fits into your data solution. To help me do that, I am going to bring in my guest, and it's none other than Florian. Hi Florian. How are you today? Hey. Good, and you? Well, it's super, super nice to have you as part of the show. I really am thankful. I always love all my guests, and you're the last live one. This is such an honor. I didn't know about that, and I'm like, yeah, that's fitting.
You need to go out with a big firework — and yeah, I'm going to be all fireworks today. Well, thank you very much for that. Before we start talking about today's subject, I wanted to remind everybody of a few resources for getting to know a little more about it. First of all, if you head over to Learn TV, you'll see on the right rail there are a ton of docs you can check out on today's subject, and you can leave comments if you'd like to. We'd love to hear your comments, your questions, anything you're curious about. You can use the Learn TV portal, or use your native chat — so if you're watching on YouTube or Twitch or something like that and you want to send a message, please do. We want to find out what you want to find out. And of course, you can head over to Microsoft Learn, and what do you get from there? Free Azure education. You can learn all about the different features that make up Azure, about different Microsoft products, and about different languages, like .NET and JavaScript, all through this portal. And it's gamified — let's see what my score is now. I don't see my score up here, but you get points for doing all these modules, and you get to learn a lot.

So that's the intro done, Florian. I think it's time for you and me to talk a little more about you. I always like to set some context about who's part of the show and how they got here. So I'm going to ask you that question, Florian: how did you get here? That's a very good question — thanks for asking it, and that's a part of the show that I really enjoy. Right now, I'm a product manager on the Azure Stream Analytics team; that's the product group at Microsoft, the Azure product group. I've been on the team for 10 months now. Before that, I was a cloud solution architect in the field for five years, working with customers to make sure they understood our products and were successful using all the good products we have in Azure, specialized on the data platform. I'm located in Canada — I have a French accent, but it's actually from France, not from Canada. I was in British Columbia, working with lots of very awesome and excellent customers around here, telling the good story and making sure they were successful with the products. But I had been doing that for five years and I was looking for the next challenge — pretty much like you are doing now. I had the opportunity to join the Azure Stream Analytics team as a product manager, I took it, and I don't regret it. I'm still in my first year, and everything is still new, but I love it. I have an awesome team and I really believe in the product. And how I learned that job was actually doing community work: I published a unit testing framework for the product, I was blogging about the product, I was tweeting — and that goes back to before joining Microsoft, five years ago, when I was actually an MVP. Oh yeah. For people that don't know, it's a program for when you're outside of Microsoft — I lost the status when I joined.
When you're outside of Microsoft, if you contribute to the community — public speaking, blogging, building repos and tools, that kind of open-source activity — you can get recognized by the company and be awarded that title. And what it gives you is this: you sign an NDA, and then you get to hear about the good stuff from the product groups beforehand. You can also join, once a year — in the previous life, before the pandemic, we would be flown out to Seattle, to the campus in Redmond — the MVP Summit: a one-week event in person on campus where you get to meet all the awesome engineers and PMs at Microsoft, discuss the products and what should come next, and try to drive the roadmap toward the features you want to see in the product. And our global summit looks like it's coming up on the 21st of March, where our MVPs will get together and learn about different things that are going on with Windows, with Azure, with Visual Studio — all these really important products, and the different languages Microsoft contributes to. It's all really great stuff that people tend to find really delightful, because MVPs are people who have a major commitment to learning about these technologies and then sharing what they learn. I really like the term "learning in public," and a lot of MVPs like to learn in public: they write blog posts, they tweet, like you said before, or they speak at live events. That community is super strong, and it really helps spread knowledge to people directly. And to be honest with you, MVPs are just like you and me — regular people who have spent their time outside of that Microsoft world looking in and saying: these are the things I really feel can help other people learn more about these technologies. So it's a great thing — and you can see for yourself how becoming an MVP led you to Microsoft.

Yeah. I knew about that as an MVP, but now, being part of the product group, I cannot tell you how much the feedback we get from MVPs is appreciated, because it's unfiltered. There's no bias. When you talk to a customer, there's always a negotiation, a business relationship between entities; with MVPs there isn't that. So if the product's not working as it should, they're going to hold you accountable, and that's the best kind of feedback you can get from the community. And there's room to join that community. If you're interested, the MVP page that you brought up is, I think, the best place to get started. MVPs are recognized for their technical excellence, but there's also room for people doing great learning in public, as you mentioned. If you're new to a technology and you're learning about it and sharing the pain points — how you work around them, how you explore, how you take your approach — that's also something that's very, very valuable and recognized in the MVP community. There's a new breed of MVPs tackling that level-100, level-200, exploratory content: how do I learn about Stream Analytics, for example? How do I get into it? How do I get set up? That's recognized too. You don't need 15 years of experience, or to be the best at that specific piece of tech.
You can just be very invested in making sure people know about the technology and learn properly about it. Yeah, I think that's ultimately what they really look for: people who have a passion for these subjects. And if you show that you're both passionate and you care about people growing on their own by using the information you share, then you could very well be awarded that MVP title and expand your reach. Simply put, it's a great little pin to put on your lapel and say to the world: hey, look what I accomplished.

That being said, we are going to talk today about Azure Stream Analytics and how it helps you work with your data to get greater insights, about different use cases, things like that. I think people are going to get a ton out of today's session. I wanted to start with this quick little infographic that gives the three big steps you're going to take when using the service: you're going to ingest your data; you can use Stream Analytics to analyze that data; and then you're going to deliver it — maybe to another service like Azure Synapse, or store it in Azure SQL, or use Event Hubs, Service Bus, or Azure Functions to have that data presented elsewhere. I thought that infographic was really useful just to set the context. But I'll ask you first, Florian, to explain to us what exactly Azure Stream Analytics is.

Yeah. And you know what? I have that exact same infographic in my deck that will come up later, so thank you for putting it up there. One of the simplest ways to talk about Stream Analytics is this: it's a compute engine. It defines logic on top of a stream. So what is a stream? That's the first question. Any time you want to do something in real time, you're going to be generating lots of data — data flowing through your data pipeline in real time. And the technologies we've employed up to this date were focused on batch. Batch means a process that runs at the end of the day, or the end of the week, or the end of the month, that reads and ingests the data and then produces some insights. But as the capabilities of the technology and the hardware get better, that batching approach gets less and less relevant. More and more, you think: if I'm generating data in real time — as the real world is happening in real time, if you will — it makes sense to be able to analyze it in real time. The point of Stream Analytics is to do exactly that: define a SQL query on your real-time data coming in and get something out of it — insights. And really, the mission we have with that product is to be the easiest, the most approachable product to do just that.
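To make that concrete, here is a minimal sketch of the shape of a Stream Analytics query — the input, output, and field names are hypothetical, not from the show:

```sql
-- Illustrative only: the query is just SQL over named streaming inputs and outputs.
SELECT
    DeviceId,
    Temperature
INTO [my-output]       -- an output alias defined on the job (e.g. a SQL table)
FROM [my-input]        -- a streaming input alias (e.g. an Event Hub)
WHERE Temperature > 75
```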
Sounds really interesting. One of the things I'm always curious about is the tooling that gets involved when using services like this. So my next question to you is: what kind of tooling can be used? You mean in terms of developer experience? Yeah — say, Visual Studio, Visual Studio Code. Yeah, and that's going to be kind of the point of the time we're spending together. As with any Azure service, we offer the experience through the portal, and that's the easiest way to get started with the product: just go to portal.azure.com, provision a Stream Analytics job, and start writing your query. But when things get a little more serious and you need source control, testing, and a CI/CD pipeline set up, then you need to switch to a local developer experience. And that's where we have an awesome, awesome extension for Visual Studio Code — and that's what I'm on the show for today. Very cool. So yeah, if you want to check it out, you can go to the Visual Studio Marketplace and take a look there, or you can browse the Extensions view and install it right through Visual Studio Code. So I know you've got some stuff lined up for us — use cases, things like that — and when you're ready, I'm going to bring up your screen and give you some space to talk to our audience. Yep, you can go ahead. Sounds good.

Yeah, I wanted to level set a little bit before we jump in, because I want us to spend most of our time actually building a job — that's what I want us to do today, Jay. But first: why Stream Analytics? And for that to make sense, let's get started with what stream processing is — the domain that Stream Analytics is really strong at. This is about real time, as we just mentioned. Business processes — the vast majority of them — happen in real time. Think about it: there are only a few business processes that only make sense in batch. That's going to be, I don't know, payroll, for example: payroll happens at the end of the month, so you need to wait for the end of the month to calculate payroll and send out all the pay slips and everything. But apart from those things that have to do with accounting, outside of those very specific business processes, by default they all usually run in real time. And the more we go into automation, the more software eats the world — pieces of software everywhere, all interconnected — the more the workflows in those business processes have software applications talking to one another. They don't need to wait for a batch system to move data between them; they can talk to each other in real time. And then there's the point I put here about how, from a theoretical point of view, managing events is always better than managing state: you can always go from events to a state, but going from a state back to events is complicated. I put a couple of references here — I don't want to go deeper than that, but if you're interested, I highly recommend you take a look at the talk from Martin Kleppmann about turning the database inside out. It's awesome; it makes that point very clear.

Okay, so that leads us into stream processing. What is it? It's about processing streams of events. What are streams? Streams are continuous flows of data. The big-data kind of definition is to say it's both high volume and low latency. Low latency means I want to know what's going on fast — and by fast we mean under a second: milliseconds, seconds, minutes max. That's how fast I want to hear about what's going on.
And one important aspect of how we qualify that: both the data and the insights I get from those data points will expire. Look at that picture — think of it as a manufacturing production line. There are people busy building stuff, and there are lots of devices capturing telemetry and readings about what's going on on that factory floor. For example, one of them could be the temperature of a specific component while I'm doing some welding on it. That's going to generate, by the millisecond or by the second, a reading that tells me the temperature. I don't need to learn three days later that the temperature went over the threshold and things were about to burn — I need to know it now, so I can act on it. The insight I get from it is time sensitive, and that's why I need stream processing to make sense of it. The other aspect is that the data itself will expire: in three months I won't care about that specific data point — at least not that much. I really care about it in those few seconds; after that it's too late, the board has burned, and I've lost something.

So it sounds like IoT is one of the really big, useful use cases — being able to have data from, say, IoT Hub shipped into Stream Analytics to make some decisions on what to do with it. Am I right? Oh, definitely. I'd say 100% of IoT projects will use some kind of stream processing; there's no way around it. You need to derive insights from those data points. Will they use Stream Analytics? That's another question — there are alternatives, and we'll see why you might use one — but they will use some sort of stream processing for sure.

To your point: looking at applications now, thinking about data coming from a stream, we see three big patterns. One: I'm getting data from a stream, I need to apply some rules to derive events, and I send them to another application. That's stream to app — that's rules engines, that's derived events. Or, if you have an event-driven architecture with microservices talking to each other via an event log, and you need, for example, a reporting service that reads from the event log and does some kind of alerting or notifications, that's an application of stream processing. Two: if you need to offload your stream of data to a data store — database, data lake, data mart, data warehouse, whatever the next thing will be — then you need streaming ETL. ETL as in, if you remember from our warehousing days: extract, transform, load — even though now it's more like extract, load, and transform. If you need that streaming ETL going from a stream to storage, that's an application of stream processing. And the third one is stream to the human eye, and that's the most obvious one: I need real-time reports, dashboards, notifications. For example, we all order things from online retailers, we enable shipment tracking, and we receive text notifications when the package moves forward in the supply chain. That's an application of stream processing too. I'd like to add something I've mentioned a few times in previous conversations about data that flows.
I like to talk about rideshare apps, because rideshare apps combine all that kind of information — like time to arrival, which is derived essentially from GPS data. Every single GPS data point needs to be stored somewhere, it needs to get analyzed, and then it needs to be processed into something an application can use. Say I want to take a car later to get to the basketball game from here: the app finds out where the car is, and it makes an estimation, based on the distance and the speed and all that, of when the car is going to arrive. I think that's one of the best examples for showing how live stream processing really benefits your end user. Yeah, exactly — and there's another one that's tightly related: geofencing. Let's say you manage a fleet of trucks, and those trucks should not leave specific areas — you have a rental fleet, and the trucks should stay in the city where they're assigned. You track the geo-position of each truck, each device, compare it against a store of areas where they should or should not go, and send an alert if something goes wrong. Now imagine also gaming — online gaming — taking scores, statistics, and information and processing it into leaderboards and things like that. Gaming, and also online gambling, which is now a tremendous thing here in the US — I'd imagine being able to combine people's locations with their spending habits and so on is really, really big for those apps too. Yeah, definitely. We have a very good customer that does exactly that: the leaderboards for their online mobile games. Sweet.

So that's the applications. Now, switching gears a little: let me give you a mental model for thinking about stream processing. There are two lenses I want us to use so we know which product to pick in the vast portfolio of products in Azure. The first one is: what is compute and what is storage? In a stream processing pipeline, we do things as asynchronously as possible — meaning we don't want synchronous API calls between components, or coupling between components. Load can spike very fast, and coupling can bring down your whole pipeline. The point is that those pipelines need to be resilient, because they're running in real time — we don't want any interruption of service. So we add queues as much as we can: when your devices send their telemetry, when your apps send their events and logs, we land them first in a queue, an event broker. Then the stream processor subscribes to that queue and reads at its own rhythm. That way we can scale them independently and absorb spikes of traffic in the event broker. From the stream processor, of course, you can go out to files, to databases, to apps and to people. But most stream processing architectures will have multi-layered pipelines: compute, storage, compute, storage, one layer after the other. And we'll see how that maps to the Azure services — Event Hub, Stream Analytics, Event Hub, Functions, et cetera. That's the expected pattern in stream processing. That's the first lens. The second lens I wanted to mention is stateless versus stateful.
When you do stream processing, either you look at one specific event at a time — that's on the left, that's stateless. Take the geofencing scenario we mentioned: you just look at that specific data point, those geo-coordinates, compare it to where that truck should be, and send an alert, whether it's okay or not. That's stateless: you don't need to maintain a state in memory of the whole stream to make that calculation. The other one is stateful. Let's say you're looking at a highway with a toll system. You have data generated at the entry point when cars get on the highway, and then when they get off, you get another event. To calculate the fare — or in a public transportation system — you need the difference between the entry point and the exit point. So you need to keep all the entry events in memory so you can join them with the exit events and calculate the fare. That's stateful computing, and it's a little more complicated. I wanted to give you that mental model because it's going to help us clarify which product to use when.

So now that we have the theoretical aspect, let's dive into Stream Analytics. You showed that infographic before — it's best thought of as the streaming ETL approach: you have your sources of data, you do your little computation, and then you land the results wherever they go. That's the simplest way to think about Stream Analytics. But let's talk specifically about what we deliver. As I mentioned, the mission is to democratize streaming — make it as easy as possible. Look at this SQL query. Say I'm looking at coolers on a factory floor: I have produce that goes into coolers in each of my shops, and I need to know when the temperature rises above 75 degrees for at least one minute, because that means my produce won't be good anymore. Look at the SQL query you'd write for that specific business requirement: for somebody who knows SQL — and most analysts know SQL — this is very easy to think about. The only subtlety is the "for at least one minute" part. There's a temporal element that's quite new, but when you think about it, it makes a lot of sense.

Now, when to use Stream Analytics versus the other products in Azure? The usual suspects when you talk about stream processing are Event Hubs, IoT Hub, Event Grid, and Service Bus. Those are storage-side products. So it's easy to see how Stream Analytics and Event Hubs actually work together: Event Hubs will be my storage, then I need a piece of compute — that's Stream Analytics — and then a bit of storage again, which can be Event Hubs too. Thinking about Function Apps and Logic Apps: those are stateless compute environments. And by stateless — I know Functions has Durable Functions; I think you've talked about them on the show, and they are stateful. But they don't maintain the same kind of state, and that's where it gets a little confusing: that's state to support workflows, not the state we need for stream processing. If you need to do temporal aggregation, if you need to do joins across streams, you need to maintain the whole stream of data in memory.
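Coming back to the cooler example from a moment ago, here is a hedged sketch of what that kind of temporal query could look like in Stream Analytics SQL — the input, output, and field names are illustrative, not from the show. Using MIN over a sliding one-minute window is one way to express "above 75 degrees for at least one minute": if the minimum over the trailing minute is above the threshold, every reading in that window was.

```sql
-- Illustrative names; one way to express "above 75°F for at least one minute".
SELECT
    CoolerId,
    MIN(Temperature) AS MinTempLastMinute
INTO [alerts-output]
FROM [coolers-input] TIMESTAMP BY ReadingTime
GROUP BY CoolerId, SlidingWindow(minute, 1)
HAVING MIN(Temperature) > 75    -- every event in the window exceeded the threshold
```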
And it's not one specific data point that has to be kept — it's the whole stream. That's the difference between Function Apps and Stream Analytics. And actually, we work very well hand in hand: Function Apps and Logic Apps are good complements for scenarios that are not supported in Stream Analytics. It's not a competition at all — it's a better-together story. Great — so you use services like Logic Apps to actually deliver what you want. And Power Platform, I'm sure, can plug into this one way or another too — that could be another place to use that data. Definitely. Think about the text message you'd receive in a geofencing scenario: Stream Analytics calculates whether you should receive a notification or not, then outputs to Event Hubs or Service Bus, and then you need an actual compute service to send that text or that email. That's Function Apps and Logic Apps, because we don't do sending emails — that's what Logic Apps gives you. Very cool.

Data Factory is batch, so we have pretty much the same positioning: they do batch, we do streaming. And Data Explorer — I like that product, it's a very good one. Data Explorer is both storage and compute: it provides low-latency ingestion and hot storage, so it can give you a very high-performance query experience. When you think about it, if everything goes fast enough — and by fast I mean fast enough for your business need — then it looks and smells like stream processing. And if you need that storage component, maybe you don't need Stream Analytics there. That's fair. The two points I want to bring up here: Data Explorer does time series, and we do stream processing. I'm going to make this very simple and high level — I'm sure people will pick that statement apart. Time series is about having an absolute timeline. Time is defined by a clock; it moves forward, and you take your events and place them on that timeline. And when you want to aggregate, what you actually do is time bucketing: you look at things by the second, then by five seconds, then by the minute, then by the hour. What Stream Analytics does is stream processing: the timelines are relative to the stream of data coming in. It's not the wall clock that makes time move forward — it's the events coming in, telling us "now it's five seconds later, now it's ten seconds later." That's what makes the clock move. Most of the time you can safely ignore that distinction. But in certain scenarios, you'll be in Stream Analytics thinking, "it's so hard to reason about time this way" — and that usually means you need a time series product. And the other way around: you're in ADX trying to do joins across different timelines, and it's hard — that usually means you need to look at a stream processing product.
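To make that "event-driven clock" concrete, here is a hedged sketch of a tumbling-window aggregation in Stream Analytics SQL — the names are illustrative. TIMESTAMP BY tells the engine to advance time using a field in the events themselves, rather than the wall clock:

```sql
-- Illustrative names; time advances with the events' own timestamps.
SELECT
    DeviceId,
    AVG(Temperature)   AS AvgTemp,
    System.Timestamp() AS WindowEnd   -- the end of each one-minute window
INTO [aggregates-output]
FROM [readings-input] TIMESTAMP BY ReadingTime
GROUP BY DeviceId, TumblingWindow(minute, 1)
```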
And one last thing about Data Explorer: we recently released a Data Explorer output, so you can take processed data from Stream Analytics and output it into ADX. And this is the awesome story, because we are now the ETL and they are the database. The same way Data Factory works very well together with something like Azure SQL or Synapse SQL — that's the batch story of ETL plus database — we now have the same thing in the streaming world, with Azure Stream Analytics as the ETL and Data Explorer as the database that does both storage and ad hoc queries. That's awesome. And I'm sure you'll hear more about that later, because it's brand new and we're building the story here. But I like that one; it's a good one.

So, talking about the developer experience: we mentioned the Azure portal — that's the all-in-one experience; everything is in the Azure portal, including a query editor. And it's still required for a couple of settings, managed identities being one of them. The other one I wanted to talk about today is the local experience: how you can use VS Code to do everything without even needing an Azure subscription. And I think that's the best part. Stream processing is hard to get started with, so what better story than to say: you can get started without even having a stream of data. Let's say you just want to write a query — well, you can do that against files on your local system. That's what we do. Cool. There's the documentation, everybody, if you want to check it out: how to create an Azure Stream Analytics job in VS Code. We've got some documentation, and if you need more of it later, head over to Learn TV — it's all up on the right rail. So take a look. Why don't you show us a little more?

Yeah, let's do it. Let's get started with a very simple scenario. We're looking at devices in our plant — like the picture I showed you earlier. We have devices taking readings in our manufacturing plants, generating a reading that gives us a temperature and a speed in RPM — so there must be some rotation involved. We send that to our Event Hub, our queue, and then we look at how to read it in Stream Analytics. And, as the streaming ETL scenario goes, we want to land it in a SQL database — really the most basic scenario we can get. So let's see how to build that using VS Code. The screen is yours. Thank you. Yeah. So here I am, and I'm just going to create a new folder for my job. And in that folder, I want to start Visual Studio Code. Here it is — of course it opened on the other window, so let me just grab it and bring it up here. Sorry about that. No problem. My Explorer pane in VS Code is on the right, because I like being a little quirky — usually it's on the left; it's just a setting in VS Code, don't get too worried about it. What you need here, in terms of extensions, is the Azure Stream Analytics extension. You mentioned before how you can find it on the marketplace website, but here it is directly in VS Code. Mine, of course, is installed. And if you scroll down in the details, there are little GIFs — or "jifs," I don't want to start the war here — that were added to show you how to get started. So let's just do that. The way you work with VS Code is through that screen that comes up when you press F1 — I never remember what it's called.
But anyway, you press F1 and the command palette shows up — that's how we work in VS Code. Once you've installed the extension, all the Stream Analytics commands are prefixed with "ASA," so you can find them easily. Here we want to create a new project. The extension is being activated at the bottom — you can see that; that's why we're waiting a little bit. It usually takes just a couple of seconds. What we're actually doing here is refreshing the runtime: you're going to be able to run Stream Analytics jobs locally, so we download the latest cloud runtime, keeping you in sync with how the job is deployed in the service. Let's call the project AzureFunBytesASA — that's my project name. It's asking me for a folder; we're already there. And if you look at the Explorer, that created a couple of files and folders. Of course, I had already opened that folder in VS Code, so now I'm seeing it twice — don't worry about it, I'm just going to work from here.

Okay, so what I have in my folder are files generated by the extension — that's the Stream Analytics job. A Stream Analytics job is composed of multiple files, but let's not think about all of them; we're just getting started with the local developer experience. The job config I don't care about. The project file is managed automatically. What's really important to me is my .asaql file — my SQL file. That's where I'm going to write my query. Not your MySQL, but your SQL. Yes, exactly — a little database humor. And we considered calling it .sql, but then, in terms of file extension, it was going to compete with the SQL extension, and that was going to be a mess. So we have our own extension for our query language, but it is SQL.

And of course, what we need is an input — we need data to look at. So there's an Inputs folder, and in the Inputs folder there's a placeholder, input.json. At the top there's that little UI element in VS Code — again, I never remember its name — that lets you create an input. That's not the only way to do it: you can right-click on the folder, or press F1 and run "ASA: add input." Multiple ways, but that's how I do it. I'm going to add a local input — as we said, the stream is not yet built. The team that manages the devices hasn't put them out yet; the Event Hub is not created. So I'm just going to create a local input, so I can start writing my query even though I don't have a stream. This one is going to be "readings." That just created a little config file that says: okay, that's "readings" — what kind of data are you going to give me? Here it's going to be JSON. And of course it asks me for a local file path, so I need a new file here. I like to call them — that's my own naming convention — sample_readings.json. That's the data I want to read in my query. And you know what, I'm going to be lazy and just grab it from the PowerPoint. Sure — just to give you a time check, we've got about 15 minutes, maybe a little more. Ah, awesome. Yeah, okay.
So of course, PowerPoint put some weird characters in there — we'll remove them and make it valid JSON. Okay, here we go, now we have my sample data. So I can select it from my input definition: let's use that file. Now, when I go back into my query, I can say SELECT * FROM readings, and then run it locally using local input. That's going to start my Stream Analytics job — locally, without needing an Azure subscription or anything. Here you go: you're doing stream processing, in five minutes, on a file. I think that's pretty awesome. It takes a couple of seconds for the job to start. We'll see — we have an input and we have an output; that's the simplest job I could do. And I can look at my query results. Here they are: I have a device ID, a reading timestamp, a temperature, and a speed in RPM. That's awesome — my hello world of stream processing is there.

But the first thing we need to remember is that we are targeting a SQL database, and this is a JSON file. From building stream processing systems, I know that stream processing is a NoSQL environment: you can receive CSV, JSON, Avro files, and the data types and schemas are loose on purpose — you can even change schemas within a single Event Hub and have multiple schemas cohabiting together. But we're going into the SQL world, and SQL enforces schema on write. So if we want to make this job resilient, we cannot just do SELECT * — what if I'm missing a field, or I get a text value in a numeric column? The role of my streaming ETL here is to make sure my data conforms to the expectations of the table I'm writing to. And the table I'm writing to is that DeviceReadings table. I'm in Management Studio, and I'm going to script out the schema of that table as a CREATE TABLE statement, so the database tells me what the expectations are. That's where I'm writing to, okay? There's an identity column — that one I don't need to care about — and then there's the schema of the table I'm going to be writing to.
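The recording doesn't show the table definition itself, but based on the columns named in the walkthrough, the scripted-out T-SQL would look something like this — types and names inferred, not exact:

```sql
-- T-SQL, illustrative only: a target table like the one scripted in Management Studio.
CREATE TABLE dbo.DeviceReadings (
    Id               INT IDENTITY(1,1) PRIMARY KEY,  -- the identity column the job ignores
    DeviceId         NVARCHAR(100) NULL,
    DeviceVersion    NVARCHAR(10)  NULL,
    ReadingTimestamp DATETIME2     NULL,
    TemperatureC     FLOAT         NULL,
    SpeedRPM         BIGINT        NULL
);
```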
So let's think about how to make that happen here in three minutes. Those are the columns I need. The first thing I'm going to do is remove the star and explicitly name the fields I need: pressure and speed. Well — I see I don't have a pressure PSI at that point; for that specific device there's no pressure PSI, so I'm just going to leave that one out. Next, speed should be an integer, so we try to cast it as an integer. The type system here is Stream Analytics' own: we don't have a specific int type — what we have is bigint. We could talk a lot about the types, but here bigint encompasses ints, so let's just use it. And we keep the name SpeedRPM. What happens here is that I'm trying to cast my column to an integer: if it is an integer, everything's fine; if it's not, I get a NULL — which is much better than passing through the original value, which could be some text that would be rejected by the database when I try to insert it. We need to do that for pretty much all of these fields. Temperature is actually a decimal — the way you do decimals in Stream Analytics is float (and I know we're working on that). TemperatureC, okay. My reading timestamp is actually a datetime, so I'm going to try to cast it as datetime. There's no device version — that's a new field; it doesn't exist in the payload. I went back to my stakeholders, and they told me to hard-code the version to just "V1," because these are my new devices. And the device ID should be a string — the way we do strings in Stream Analytics is nvarchar(max) — aliased as DeviceId. And now I have a query that is compliant with the schema of the table I need to write to. That's awesome. Now I can test it: run locally and, again, check that it works properly on the file. And I can also say: you know what, let's really test it — let's add a second record where I don't have a device ID, and a third record where my speed RPM is actually text, and see what happens. Sure.

The key point here is that I'm doing a streaming ETL job, so I need to think about those types and leverage the capabilities of Stream Analytics to deal with all the errors in types and values that could come from my input. Because I cannot trust the publishers: they're not in my domain — they're devices all over my manufacturing plants. Now I can see what happened: the device ID was missing, so it's NULL; and the speed RPM, if you look at the bottom here, is now NULL too. So we're validating our data as we work with it. Exactly. And what I can do from here is say: okay, I don't want to just lose those records, or import them like that. This is going to be my data-cleaning step: with DataCleaning as my first step, I clean my data; then I select star from DataCleaning and send that into ReadingsSQL — but with a filter: where my device ID is not null. I only want records that have a device ID, because that's how I do my measurements in the end. If a record doesn't have a device ID, it's out. But I don't want to lose those records either — they should go somewhere too, because I want to be able to audit what's going on and figure out why those events don't have a device ID. Well, I can use the opposite predicate and say: these are the readings that, this time, I'm not going to output to SQL — because SQL expects a strict schema — I'm going to output them to a storage account, to a blob. Very cool. We've got about eight minutes left, maybe a little less. Yeah, that's good. But let's forget about that one for now and focus on SQL.

What I want to do now is see how I can run that live. So I'm going to create two things, starting with the live input. I go back into my inputs and add a live input — that's going to be an Event Hub, and it's named "readings." It has the same name, okay? That's how I can swap between live and local using the same FROM clauses: by reusing the same name. I'm going to grab my Event Hub that's already set up — I put my name everywhere, so that should be this one. Consumer group: be mindful about consumer groups — each of your applications should have a different consumer group when reading from an Event Hub. That's the only thing I'll say about it here, but if you have issues, that's usually where they lie. And serialization: JSON. I'm all good. Okay, so that's good.
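Putting those steps together: the recording doesn't show the final query text, so here is a reconstructed sketch of roughly what it might look like at this point. Names follow the walkthrough where given (readings, DataCleaning, ReadingsSQL); the input field names and the blob output alias are hypothetical, and TRY_CAST is the Stream Analytics function that returns NULL when a cast fails:

```sql
-- Reconstructed sketch, not the exact demo query.
WITH DataCleaning AS (
    SELECT
        TRY_CAST(deviceId AS nvarchar(max))    AS DeviceId,
        'V1'                                   AS DeviceVersion,   -- hard-coded per the stakeholders
        TRY_CAST(readingTimestamp AS datetime) AS ReadingTimestamp,
        TRY_CAST(temperature AS float)         AS TemperatureC,
        TRY_CAST(speedRpm AS bigint)           AS SpeedRPM         -- NULL if the value is text
    FROM readings
)

-- Clean records go to the SQL database...
SELECT * INTO ReadingsSQL FROM DataCleaning WHERE DeviceId IS NOT NULL

-- ...and rejected ones go to a storage account for auditing.
SELECT * INTO ReadingsBlob FROM DataCleaning WHERE DeviceId IS NULL
```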
And you know what? I need to start my device simulator. I can show you, actually — I have a little script that, very simply, generates some values and sends that payload into my Event Hub. I'm using PowerShell to do it, very straightforward; we should have an article up pretty soon about that. I run it in Windows Terminal, and now it's sending events — those exact same events; you can see them arriving in that Event Hub. So now, directly from VS Code, I can preview the hub: that connects to the live Event Hub and takes a peek at what's inside. Yeah, it looks like that. And if I had that already there and still wanted to do local development, I could do "save as": saving here gives me an extract of my live traffic so I can keep developing locally — another nice little thing we have. Going back here, I can now run locally, but from the live input. Let's see how it goes: this time the same query is applied not to the file but to the live Event Hub that has data.

And while it's loading, I can think about my output. I need an output to my SQL database, so I'll add an output of type SQL Database, named ReadingsSQL. I select my database here — that's the one. My user — that's me. My password — not out loud; I'll refrain from doing that. And my table — I'm not sure now, so I'm going to grab it again from Management Studio: DeviceReadings, that's the one I'm going to write to. Now it's set up. So now I can run my job with a live input and a live output, and I'll go there directly. What's happening now is that my local job will run from the Event Hub, apply the query, and try to output to the SQL database. First, let's see if we've set everything up properly — the test connections pass fine. The job is starting. And if I look here and do a SELECT *, right now I have nothing. So hopefully, once the job starts outputting, I should see the query being applied. It's starting to read, so rows should start showing up. And this is usually where everything breaks and we don't see the records and I end up — oh no, here it is. Let me make that bigger — I don't think I can. So here I can see, live, my data being output from my Event Hub, via my local Stream Analytics query, to SQL — in a new shape that is resilient, because I've mapped the schema myself. I know it's going to be inserted properly.

If I'm happy with that, what I can do now — and that's the last thing I wanted to talk about — is submit to Azure from here. I can submit, and that deploys the job, with all its settings, as a Stream Analytics job created for you in the Azure portal. Then you just start it, let it run 24/7, and forget about it. And now you have your streaming ETL running, built with a 100% local developer experience. That was it.
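As an aside, that live check in Management Studio is just plain T-SQL against the target table — something along these lines, using the hypothetical table from earlier:

```sql
-- T-SQL, run in Management Studio: watch rows land while the job outputs.
SELECT TOP (10) *
FROM dbo.DeviceReadings
ORDER BY Id DESC;   -- newest rows first, via the identity column
```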
Very cool. We're getting really, really close to being out of time, and I've seen and learned a ton more about Azure Stream Analytics. Is there anything else you want to show in the last minute or so that we've got? Just to mention other things that are there: there's unit testing. We have unit testing. I've shown you how I updated my local file, adding records that were incorrect, to see how the query behaves. Well, you can actually have a unit-testing solution where you build your test cases: which file should be the input, which file should be the output — the expected behavior. So you can do test-driven development for stream processing using VS Code and Stream Analytics, and I just love that. It's really easy to do. And looking at this specific scenario, if I had more hours of your time, what I'd usually do next is add — as easily as I've just done — a second output to Power BI and put that in a dashboard. But also: when you add a second device type with a second schema, how do you build a query that deals with that? If you look here, not only is the schema different — so my SQL will need to deal with both — but look at the readings: they're inside an array, with a key that doesn't tell me which one is temperature, which is speed, what it is. That interpretation lives in a metadata catalog inside SQL DB. So what I'd need to do is load those readings, take them apart, do a left join in Stream Analytics against that reference table, and do all the processing so I can land both payload schemas in a single table. That's what we do with Stream Analytics, and it works great.

Sounds great. So hey, Florian, we've reached just about the end — the last show for me — and I want to make sure you let everybody know where they can find you on the internet if they want to talk more about Azure Stream Analytics or any other big-data processing and streaming topics. Yeah — there's my Twitter handle, right there. And lastname.ca — because I'm in Canada — that's my blog. I'd say: we're monitoring Stack Overflow — not this week, I'm late on that — the azure-stream-analytics tag, we are monitoring that. And if you need to get in touch with us directly, there's askasa@microsoft.com; that's an email that pretty much blasts to the whole team. Those are the places where you can find us. Great, awesome. Well, it's time to say goodbye to everybody. I want to say thank you very much, Florian — I had a really good time spending this hour with you. Be sure to check the notes if you want any more documentation. Until next time, wave goodbye, everybody. We'll see you. Thanks for joining us. Take care.