Good afternoon. Welcome back from lunch. We hope you're having a great time at Node+JS Interactive so far. OK, good, good, good. Thank you for joining us. We will show you some cool things that you can do with spreadsheets and Node.js. You might know me from my talks about V8 and JavaScript compilers, but I switched teams. I work in the Google Cloud org now, so I want to talk about Cloud stuff. So this will be very different from any other talks you might have seen from me. We'll give you sort of a basic introduction to some of the Google Cloud Platform products, specifically serverless functions. And we'll tie it all together using spreadsheets for prototyping. Conveniently, Sheets is also part of Google Cloud, so all the APIs work together really nicely. My name is Franziska Hinkelmann. I'm at Google, based in New York City, and I'm a member of the Node.js Technical Steering Committee. And I'm co-presenting with Leah. Hi, I'm Leah Cole. I am based out of Sunnyvale, California. And I typically use Node to write samples for data analytics and ML products in Cloud. And I am by no means a Node expert. In fact, Franzi's my go-to Node expert, especially when I want advice on best practices for our samples. And I'm also a super-duper G Suite enthusiast. I definitely used G Suite before Google, but once I started using it every day for my work, I found that I started using it outside of work, too, to plan vacations and stuff. All right, so let's get started with "Work Less and Do More: Sheets for Node Developers." Why are we talking to you about spreadsheets? Well, the great thing about spreadsheets is that they allow you to collaborate with anybody, not just with developers. Spreadsheets have this universal language, you might say, that pretty much anybody can pick up. 
So anybody that you want to work with, regardless of their background, can edit the data, read the data, filter it, sort it, play around with it, and draw their own conclusions, which can sometimes be really useful. I'm sorry, my puppy is crying. All right, so sheets are good if you want to collaborate with non-engineers. And not only can pretty much anybody work with sheets — sheets make it really easy to visualize data, so you can analyze your data by looking at the visuals you created. Obviously, you can create all kinds of visuals: pie charts, bar charts, line graphs, whatever. And yes, as engineers, we could write code, we could use D3.js and generate really pretty graphs. But it usually takes a lot longer than just using a sheet to visualize some data. And when I need to iterate quickly, sheets allow me to focus on the trends and correlations instead of the code. So, lots of advantages to sheets. And because of all this, sheets are really useful when you want to prototype something with data and metrics. So probably not for a full-fledged, serious production product, but if you're just prototyping and you want to iterate quickly, sheets give you a lot of advantages. I don't know about you though, but I do not enjoy entering and manipulating data in spreadsheets. But that's because it's not something that I do all the time. I do think, though, that sheets are an awesome tool for the output, and especially for beautiful visualizations like this one. And so that's why we're going to use them today. But because the manual parts aren't fun, we're going to eliminate them and instead automate them. And because we're at Node+JS Interactive and we love Node, we're going to use that. I mean, you heard: Franzi's a Node core contributor and on the Technical Steering Committee, and she talks about how awesome Node is a lot. And it's a good influence. 
Before I was working with her, I definitely gravitated towards Python and Java because that was what I was more comfortable with. But especially now that async/await makes things more readable, I definitely gravitate more towards Node for that automating. All right, so we're going to start by writing this Node application to work with Sheets. We're going to install the googleapis client library from the npm registry, like so. It's a full-fledged client for talking to Sheets that we're going to use. But before we dive in, I do want to give a quick shout-out to the G Suite APIs as a whole. We've been using this term G Suite a lot — it's our family of productivity tools. So it's not just Sheets, which we're focusing on today, but also Docs, Slides, Gmail, and Calendar, among other things. And you can call all of those APIs for free, which is awesome, because free stuff is great. You don't have to have an enterprise account. And as Franzi said earlier, G Suite is part of Google Cloud, which also encompasses Google Cloud Platform. So if you've ever played around with Google Cloud Platform and used those APIs, using the G Suite APIs is pretty much the same. It's the same account, same family. They are generated the same way. You turn those APIs on the same way, which is great. And if you are at this conference and maybe aren't a Node person, or maybe you're new to Node, fear not: we do have these client libraries in other languages. But because we're at Node+JS Interactive, we are going to focus on Node today. So let's take a look at Hello, World in Sheets. This could just be a script that's running locally on my machine. We'll start by requiring that package that we just got from the npm registry and initializing the Sheets client. And above that, we see this string that has our scopes — that's the level of access that we want to grant this script. 
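As a rough sketch of what this setup and the Hello, World write look like (the spreadsheet ID is a placeholder, and we only build the request object here so its shape is visible — the actual `googleapis` client calls are shown in comments):

```javascript
// Sketch of the Sheets "Hello, World" request. With the googleapis
// package from npm, the client setup would be roughly:
//   const { google } = require('googleapis');
//   const auth = await google.auth.getClient({ scopes: SCOPES });
//   const sheets = google.sheets({ version: 'v4', auth });

// The scope string: only spreadsheet access, nothing broader.
const SCOPES = ['https://www.googleapis.com/auth/spreadsheets'];

function helloWorldRequest(spreadsheetId) {
  return {
    spreadsheetId,                    // the ID from the sheet's URL
    range: 'A1',                      // A1 notation: the top-left cell
    valueInputOption: 'USER_ENTERED', // parse input as if typed in the UI
    requestBody: { values: [['Hello, World']] },
  };
}

// The actual call would then be:
//   await sheets.spreadsheets.values.update(helloWorldRequest(id));
console.log(helloWorldRequest('your-spreadsheet-id').range); // 'A1'
```

The real script in the repo wraps this in an async function; the shape of the request is the part worth remembering.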
We know that files live in Google Drive with other file types like Docs and Slides. But specifically, we only want to write to spreadsheets today. I don't know about you, but I get really nervous when any kind of application asks for permissions that I think it shouldn't have, like wanting to get into my Drive folders and see everything in my emails and my location history and my mother's name. We don't want that. Best practice is definitely to restrict to only what we need. So here, we're going to make our Hello, World call. And we're going to say which cell or cells we want to put things into in Sheets. In Sheets, we use what's called A1 notation, and that's because A1 is the cell in the top-left corner of the sheet. And that is where we're going to be writing our Hello, World today. We're also using async/await, because we love that. But if you did want to use a previous version of Node, you could do that with this script too. And don't worry about trying to memorize it — we have all this code on GitHub, and we'll show the link again at the end. All right. So here's what it looks like when we run the code. Super cool. Hello, World in A1, just like where we said it was going to be. And I'm not sure if you remember from the last slide, there was this variable called spreadsheetId. That spreadsheet ID is in the red box at the top — it's the same ID that you'd find in the browser with any spreadsheet you'd have open. So that's what tells us to write to this Google Sheet in particular. All right. But let's be real: most of the time, you need to do more than just write Hello, World to one single cell. And the best way to make multiple modifications to one sheet is to batch your requests together. This method, the batchUpdate request, is very common across all of the G Suite APIs. So why should you group your requests? There are two main reasons. The first is that every call is going to have a little bit of network latency. 
So if you group them together, you can save yourself from that. And the second is that while the APIs are free, there is a quota. So if you batch your requests together, they will count as one towards your quota instead of many. So you definitely want to optimize for that. Each of these mysterious tasks could be a different kind of request. It could be writing data like we did with Hello, World, or it could be adding a chart, or formatting a cell, or deleting a sheet altogether, or filtering your data. You name it — if it's in the Google Sheets UI, you can do it in the API, which is pretty great. We say it's like a big hammer for all kinds of things you can do — or insert your favorite tool, I don't know. So we have this basic toolkit for working with the Sheets API, and we're doing it in Node. So let's talk about a real-world problem. I don't sit in the same office as Franzi. We're across the country from each other. But one thing we do have in common is our love of open source. And we also both really love the Insights tab on GitHub. It tells you so much about your project. And I know Franzi does a lot of looking at open issues and PRs. So we're looking at the nodejs/node repo — Node core, the Node runtime — right now. And if you were to go there, you could see all the changes going in for Node 14. And Franzi is really concerned with the health of the Node runtime overall and is a very metrics-oriented person. But the thing about the Insights tab is it really only tells you info about GitHub. They're just numbers. What does 901 issues mean without additional context? Do we know if that's healthy or not for Node? It's because a lot of other factors affect productivity. If a product team were to come in and change a ton of requirements, that's going to change your velocity — whoever's been there knows. Or if you were to have a ton of technical debt, that can slow you down. 
But on the other hand, if you have really good test coverage and continuous integration, that can make things better. Or what if people are hangry and that makes them not as productive? You get it. Franzi wants a holistic picture of what goes into making the Node repo healthy and what helps folks be more productive. So we're going to focus on the question that I mentioned there that's hard to define. We're going to be a little bit silly. We want to experiment to find out if food satisfaction yields better GitHub metrics. So this is a hard problem, not just from a technical perspective, but because it's a real-world problem too. We have our contributors at their machines. We have GitHub. We have food. And we want to write a prototype to automate gathering information about all of those, and that's why we're going to use sheets for the prototyping, to manipulate the data. And of course, we're going to do it in Node, so that we can use Node to monitor the Node repo. So let's start by breaking down that problem. My intuition is that when we're well fed, we write better code. But we care about metrics, so let's start with the data. Node core is a very active repo. And we would like to know: how many open pull requests do we have, how many issues, how many PRs are we merging, and how many issues are we closing? Luckily, the GitHub API provides all this data for us. And using Octokit, the GitHub client for JavaScript, it's very easy to get this data. It's just a few straightforward API calls, and we know how many open issues we have without looking at the website. So I'm collecting this data by doing a few calls to the GitHub API. And then, just like in the Hello World example that Leah showed us, I use a Sheets API call to write this data into a spreadsheet. Now, obviously, when I'm getting the data from GitHub, I could just copy it and then paste it into a spreadsheet — that would work instead of talking to the API. But it's manual work. 
I'd have to do that every time I want to update the data, and it's error prone. So I'd rather add another two lines to my script that already pulls the data, and then push it right into the spreadsheet by using the Sheets API. All right, so now for the food part. We did a survey where we figured out how happy the Node contributors in our office are with their food on a given day. Clearly, this is very scientific. We're using a Google Form for it. If you've used Forms before, the results are automatically stored in a spreadsheet. If you use anything else to gather your data, just do another Sheets API call to write it to the spreadsheet, just like we did with the GitHub data. All right, so now we have both the data for food and for GitHub. And we want to know: are there any correlations? What's a good way to find correlations? Well, if we can visualize the data, that would help us. So here's the chart. We'll look at the results in detail later. For now, we're interested in: how did we generate this chart? Of course, you could click in the UI to generate a chart, but we don't want to use the browser. We don't want to open Sheets. We want to script all of this so that we don't make any mistakes and we don't do any manual labor. And yes — there is, you guessed it, an API call to generate a chart. So what does the code look like? We create an addChart request, and we have to define the chart and its specifications. So here, we're defining the chart type — we're picking a line chart — and then we have to specify the domain, the x-axis of your chart. In our case, that's the different dates that we're interested in. And then we also have to specify the series, the different data points we have from GitHub and from the food survey. And once we've defined the chart, we call this batchUpdate method that can do pretty much anything for us, and we send the addChart request. So we started with these data sources. 
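A rough sketch of what that addChart request could look like. The field names follow the Sheets v4 API, but the sheet ID, row ranges, and number of series here are illustrative placeholders, not the exact ones from our script:

```javascript
// Build an addChart request for spreadsheets.batchUpdate. Domain =
// x-axis (the date column), series = the GitHub metrics and the
// food-survey column. Ranges below are illustrative.

function addLineChartRequest({ sheetId, title }) {
  const gridRange = (col) => ({
    sheetId,
    startRowIndex: 0, endRowIndex: 31,       // ~a month of rows
    startColumnIndex: col, endColumnIndex: col + 1,
  });
  return {
    addChart: {
      chart: {
        spec: {
          title,
          basicChart: {
            chartType: 'LINE',
            // Domain: the column of dates (column 0)
            domains: [{ domain: { sourceRange: { sources: [gridRange(0)] } } }],
            // Series: e.g. open PRs, open issues, food happiness
            series: [1, 2, 3].map((col) => ({
              series: { sourceRange: { sources: [gridRange(col)] } },
              targetAxis: 'LEFT_AXIS',
            })),
          },
        },
        position: { newSheet: true }, // put the chart on its own sheet
      },
    },
  };
}

// Sent along with any other modifications in a single batchUpdate:
//   await sheets.spreadsheets.batchUpdate({
//     spreadsheetId,
//     requestBody: { requests: [addLineChartRequest({ sheetId: 0, title: 'Food vs. GitHub' })] },
//   });
```

Because it all goes through batchUpdate, the same call could also carry formatting or data-write requests, and it counts once against the quota.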
We have some data from the GitHub API, we have some data from a form, and then we do a write API call to the Sheets API to put this data into the sheet, and we do another API call to generate a chart for us. And if we do this, we never have to open a sheet in a browser. All our scripts run locally, so we can decide what Node version we want, how to run it, and all that. It's just a little script that we run that talks to the Sheets API. But Leah is going to take this one step further now. Thanks, Franzi. All right, so at this point, we're gathering data, we're getting insights, but it is only running locally on one machine. As Franzi said, we're across the country from each other. This is not going to be super great for collaborating if it is only on one machine locally. So we should probably be hosting it as a service. Ideally this would be a full-fledged application, but then we'd need a client and a server — and I mean, if we really want to be abiding by good cloud principles, there would be high availability and fault tolerance, which would mean multiple servers, and load balancing, and a cache, and that's a lot of pieces. We said this was a prototype — let's not. We want to focus on our code, not our infrastructure. So instead, we decided to go with a serverless solution, which in Google Cloud Platform is called Cloud Functions. You just upload those narrowly focused bits of code to GCP, and the cloud provider figures out where to put them and how to scale them on your behalf. You don't have to know anything about the machines they're running on, and you only pay for the time your code is executing. You don't have to manage your servers, security's taken care of, and you can use the languages, tools, and frameworks you're familiar with. So if it's Node, you can do it in Express; for Python, there's Flask; there are also Go and Java runtimes. And they're triggered by various events. 
So there are two types of Cloud Functions. There are background functions that are triggered by events — like if you upload something to a Cloud Storage bucket or send a Pub/Sub message — and there are HTTP functions, which are what we're going to focus on today. Those are great for processing data or calling external services. I know in the real world, folks use them to talk to things like Stripe and Twilio. So let's take a look at the Hello World for HTTP functions. When I say function, do I mean a function? Yeah — it's only two lines of code, and we have a fully functioning back end, which is super cool. If you've used Express before, those request and response objects should look pretty familiar; that's where they're from. And with all of this, no additional configuration, you end up with an HTTPS endpoint at a fully qualified domain with dynamically generated SSL/TLS certificates — and that's what it would look like in the browser. Okay, so now that we're familiar with Cloud Functions, we're going to wrap up the code we have in our local script and deploy it there. So we have our serverless back end. It'll read the data from the sources and produce the chart while running in the data center, so Franzi and I can actually be useful to each other sitting in our own offices. All right, so what we're going to do is take the code we already have and wrap it up in an exported function called githubchart that calls our main logic. And then, when the processing is done, we'll send that 200 success just like we did in the Hello World, and we'll also send the ID of the spreadsheet that we've been working with. Okay, so how do we actually do this deployment? 
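Before the deployment step, here's roughly what that wrapped-up function could look like. `updateChartData` is a stand-in name for the main logic from the local script (the GitHub pulls and Sheets writes), and the returned ID is a placeholder:

```javascript
// Sketch of wrapping the local script as an HTTP Cloud Function.
// In index.js this would be `exports.githubchart`; `updateChartData`
// stands in for the main logic and its name is an assumption.

async function updateChartData() {
  // ...call the GitHub API, write to the sheet, refresh the chart...
  return 'your-spreadsheet-id'; // placeholder spreadsheet ID
}

const githubchart = async (req, res) => {
  try {
    const spreadsheetId = await updateChartData();
    // 200 success plus the spreadsheet ID, like in the Hello World
    res.status(200).send(`Updated spreadsheet ${spreadsheetId}`);
  } catch (err) {
    res.status(500).send(err.message);
  }
};
```

Deploying it is then roughly the gcloud command described next, something like `gcloud functions deploy githubchart --trigger-http`.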
We did it using the gcloud CLI tool, which is what you can use to manage all of your GCP tasks. For Cloud Functions, you just pass the name of the function that you're going to deploy, and we also specified that it's an HTTP trigger. It uploads your code, and it picks up the package.json — which is great — but not the node_modules, because that would be a ton of stuff. So this is great for if your internet connection's not super-duper; those dependencies are then downloaded in the Cloud and used from there. You can also edit the package.json and the function, if you wanted to, in this UI, the Cloud Console, which is nice for quick fixes. I love a command line, but I do also love a useful UI, especially for a quick fix. All right, so at this point, our code's in the Cloud, which is sweet, but it still needs to be triggered manually. If you're doing something where you want to be pulling in data regularly, you don't want it to be triggered manually. That stinks — that means you have to remember to do it. So instead of needing a human, we chose to use Cloud Scheduler to run our code. It's basically a big fancy cron in the Cloud. You set up the cron job in the Cloud Console — that's the same UI that you can use for Cloud Functions — and you use the same syntax as if you were using normal cron. Our job runs every morning at nine, and we specify the HTTP URL to trigger that Cloud Function. And so at this point, we have the data collection job in the back end, and Franzi's going to talk to you about adding a client. All right, so we have the code in the Cloud, but we have to remember that weird URL for the HTTP endpoint, and it would be just a little more convenient if we had a UI client for it. So we did that. We added a client that shows you the chart, and if you click on "share analysis," the data is updated once more, and then the new chart is sent out to all your teammates. So we have the data updating at nine a.m. 
every morning anyway, but we don't want to spam everybody daily, so we're only sending out emails if somebody consciously clicks the share analysis button in the front end — that's why we have the front end. And since we are sending out emails, we're using the Gmail API for this task. Just like in the serverless Node.js code, we have client libraries for front-end JavaScript as well, and again, those make it much easier to talk to the Google REST APIs than if you were to script directly against the REST interface. So we're calling gmail.users.messages.send, and we pass in as a resource the raw email, which can be HTML or plain text. So now we have the front end, or the cron job, that triggers our Cloud Function; the Cloud Function pulls in the new data and also generates a new chart; and then, with the Gmail API, the front end sends out the new chart to the team members. All right, pretty easy so far, right? That's not the whole story though. What's the big thing that we left out? We left out auth. We never mentioned that, and we're sending emails pretending to be somebody, and we're reading and writing spreadsheets — and clearly I don't want people to be able to read my spreadsheets just because they figured out how to use the Sheets API. That's not a good security model. So let's look at how authorization works in our app. In the front end, we actually have a little authorize, or login, button, and when people click that, the Google Sign-In flow is triggered. And Google Sign-In is a secure authentication system that allows your users to log in with the Google account they already have, and then they can authorize your app — in this case, to send emails. They can authorize your app to use Google services on behalf of these end users, and all of this is happening in a secure manner. Google uses the OAuth 2.0 protocol, and the application needs user consent before it can access the data. 
Just like you saw, the end user has to consent to sending out emails on their behalf. And the key point of OAuth is that the application does not have the user's credentials. The users logged in, but they didn't log in to your app — they logged in to the Google auth service, and then they only gave consent. So that's a big plus, because you don't have to worry about storing these credentials securely; you don't get them. OAuth relies on consent, not on credentials. Let's look at how this protocol works in detail. So more specifically, the authorization flow is called three-legged OAuth. It's three-legged because there are three entities: there's your application, there are the Google servers, and there are the end users. So when your application wants to make a request to the Google server, the user is prompted to log in and to consent. And if they do so successfully, then your application receives a token. And after validating the token, the application can use this token to make API calls to the Google API. Luckily, the client libraries handle this somewhat complicated workflow for you. One thing that Leah showed us earlier was that you have to specify the scope. So just like in the back end, in the front end we have to specify the scope before initializing the Google client. Here, the scope is gmail.send, and that's why the sign-in flow said: do you consent to this application sending emails on your behalf? Now, say the user agrees to that and your app gets a token back. If you take this valid token and try to read the user's spreadsheets — even though it's a valid token, it's not valid for the spreadsheet scope, so you'll get a 403. The token is only valid for what the user consented to, and that's specified by the scope that you pass in. All right, so in the front end, we have an end user clicking the login button. We use three-legged OAuth with end-user consent. 
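To make the email-sending part concrete: the Gmail API's users.messages.send takes the entire message as one base64url-encoded RFC 2822 string in its `raw` field. A sketch of building that, with placeholder addresses and subject:

```javascript
// Build the `raw` payload for gmail.users.messages.send: a full RFC
// 2822 message, base64url-encoded. Addresses here are placeholders.

function buildRawEmail({ to, from, subject, html }) {
  const message = [
    `To: ${to}`,
    `From: ${from}`,
    `Subject: ${subject}`,
    'Content-Type: text/html; charset=utf-8',
    '',                // blank line separates headers from body
    html,
  ].join('\r\n');
  // base64url: standard base64 with +/ swapped for -_ and padding dropped
  return Buffer.from(message)
    .toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// In the browser, with the gapi client initialized for the gmail.send
// scope, the call would be roughly:
//   gapi.client.gmail.users.messages.send({
//     userId: 'me',
//     resource: { raw: buildRawEmail({ /* ... */ }) },
//   });
```

`userId: 'me'` means "the signed-in user" — which is exactly the point of the consent flow: the token, not a stored credential, authorizes the send.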
How do we handle OAuth in the back end, where we just have a Cloud Function talking to the Sheets API, but no end users? It's a serverless Cloud Function — no front end, no nothing. Well, since end users are not directly involved, we use server-to-server authentication, and in this case, the spreadsheet belongs to the application rather than to an end user. Heads-up though: whenever you work on a production app, not just a little prototype like we did, you always want to use three-legged OAuth for anything involving G Suite. So when you work with Docs or Sheets or Slides or files in Drive, you always want to use three-legged OAuth and end-user consent. The reason is, if this is a production app and you have millions of users, then all of a sudden this application owns millions of documents and spreadsheets, and that's way over the quota. So your default is: whenever you access docs and files like that, use three-legged OAuth. In our little prototype though, we don't worry about that. We just have one team looking at this. It's not a million files, it's one spreadsheet, so we can just do server-to-server OAuth. This is also referred to as two-legged OAuth. Can anybody guess why? Well, there are only two entities involved, obviously. There's no end user — there's your application, and there's the Google service. And for server-to-server authentication, you make your authorized API calls using a JSON Web Token, a JWT, instead of an access token. And again, if you use the client libraries, they handle all of this for you, so you don't have to worry about creating or signing a JWT. So comparing three-legged and two-legged OAuth: three-legged is on behalf of end users, it requires user consent, and it's the best practice when you work with a G Suite API. Two-legged is for server-to-server authentication; it's on behalf of the application, and the resources — the documents — have to belong to the application. Thank you, Franzi, for that deep dive into OAuth. 
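For the curious, here's a sketch of what's under the hood of that two-legged flow: the claim set that gets signed with the service account's private key. The library does all of this for you; the function below only illustrates the JWT's contents, and the email and scope values are placeholders:

```javascript
// Two-legged OAuth: the server signs a JWT with the service account's
// private key; no end user involved. With google-auth-library it's
// roughly:
//   const { JWT } = require('google-auth-library');
//   const client = new JWT({ email, key, scopes: SCOPES });
// For illustration, this is the claim set that gets signed:

function serviceAccountClaims(clientEmail, scopes, nowSeconds) {
  return {
    iss: clientEmail,                           // the service account
    scope: scopes.join(' '),                    // space-delimited scopes
    aud: 'https://oauth2.googleapis.com/token', // Google's token endpoint
    iat: nowSeconds,                            // issued-at
    exp: nowSeconds + 3600,                     // at most one hour later
  };
}
```

Note that scopes still show up here: even the application-owned credential is restricted to what it declares, just like the end-user token was.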
So like Franzi said, this is handled for you by the client libraries, which is awesome. And if you want to take a look at our quickstarts and our documentation, they will walk you through the authentication path in just a few steps. And just remember: we want to use OAuth because we want tokens, not passwords. We don't want those stored in plain text, ever. But let's get back to the whole point of building this app. We want to know if food satisfaction affects GitHub repo health and productivity. So we made a lot of charts showing food satisfaction against GitHub metrics. On the left side, we have open issues and PRs, and on the right side, we have the happiness scale — the original survey ran from one to five, where five was the most happy. So let's actually look at a smaller, easier-to-read chart. This is closed issues and merged PRs against food happiness. Food happiness is the thick red line that goes across with the hump in the middle; blue is the closed issues, green is the merged PRs. If you're not sure which is which, the most important thing to note is that they all have that hump around the same time, which makes us think that food satisfaction does affect GitHub metrics, which is cool. But know that you could also use these techniques for other internal systems or application monitoring that's maybe not as silly as this. We do want to show you that we have the proof that good food yields good code. We do want to add one last thing onto our system though. Another workflow we see a lot is that folks have to create reports of data regularly, either for their managers or for a broader team meeting, and they often have to do it at a repeated time interval — weekly, monthly, quarterly — or sales folks have to make customized reports for their clients. If you don't have to do it, someone you work with somewhere probably does. 
So we're going to pretend that we want to prove our good-food trend continues over time, and we're going to share this with folks all across the company in our report, and we're going to script it, because doing it manually sounds awful. I don't know if you've ever had to update one of these reports, but doing it manually works: you can start by copying an old one and moving in the new data. But I know I'm prone to copy-pasta — copy-paste — errors, and it's very tedious. So instead we're going to create a template with placeholders for the things that we know we're going to replace, like the current date, and we're going to completely automate it. So we'll first have the Google Drive API create a copy of this template every time we need it, and we'll use the Slides API to tie it all together with the spreadsheet. So we'll initialize the client to make modifications, and this should look a lot like the other G Suite APIs we've been using. And if you remember, in that template I showed a couple of slides ago, we had the placeholder for the date in curly brackets — that's what we're highlighting here, that's what we want to replace, and we want to use a replaceText call to replace it with today's date. And in case you weren't sure, this is going to be a batch update: we could also be making other replacements at the same time. Like, if we wanted to insert the number of pull requests or issues that were closed at the same time, we could do that in this call. The good news is we only have a few more slides. All right. So now we're going to figure out how to add our charts. What's nice about the APIs is they're just as tightly coupled as the products themselves are. So Slides has an API call that can insert a Google Sheets chart just by its chart ID. 
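Sketched together, the Slides batchUpdate that fills in the template and pulls in the Sheets chart might look like this. The `{{date}}` placeholder text, the IDs, and the element size are illustrative, not the exact values from our template:

```javascript
// Build the requests for slides.presentations.batchUpdate: one
// replaceAllText for the date placeholder, one createSheetsChart to
// pull the chart in from the spreadsheet. Values are illustrative.

function reportRequests({ spreadsheetId, chartId, pageId, today }) {
  return [
    {
      replaceAllText: {
        containsText: { text: '{{date}}', matchCase: true },
        replaceText: today,
      },
    },
    {
      createSheetsChart: {
        spreadsheetId,
        chartId,               // the chart's ID in the source sheet
        linkingMode: 'LINKED', // chart stays linked to the sheet's data
        elementProperties: {
          pageObjectId: pageId,
          size: {
            width: { magnitude: 4000000, unit: 'EMU' },
            height: { magnitude: 4000000, unit: 'EMU' },
          },
        },
      },
    },
  ];
}

// Both changes go out in one call:
//   await slides.presentations.batchUpdate({
//     presentationId,
//     requestBody: { requests: reportRequests({ /* ... */ }) },
//   });
```

More replaceAllText entries — say, for the PR and issue counts — would just be more items in the same array.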
So you can generate these reports with up-to-date data and no human intervention, and save yourself or your coworkers tons of time — you could become the new best friend of whoever has to generate the reports by teaching them this. So, just a quick review. We designed a basic reporting system using Node that calls a Google Cloud Function using those G Suite APIs; it fetches data from external sources and shares that back to the user, either through Gmail or through an automated report in Slides. And at its core, we're using Sheets for the data collection and manipulation. So, we wanted to know if food affected GitHub productivity. Maybe you wanted to know that too, or maybe you want to take this solution and make it fit your needs. If so, this is where the code lives on GitHub, and we would love to inspire you to either try this or try something similar. Be creative: take a non-technical problem and try to solve it with the tools you use every day, or a tool that you want to try out so you can get better at it. We also want to do a quick shout-out to our friend and colleague Anu Srivastava, who helped write a lot of this code to make this awesome. Thank you to the conference for having us, and to you for coming. We'll answer your questions probably in the hallway or at the Google booth, because I think we're running low on time. So thank you.