My name is Jason DeLancy. I am a developer evangelist with GE Digital. It's a small Silicon Valley startup. And so, from Thomas Edison to Jeffrey Immelt. I don't know if you're familiar with him, but he is the CEO and chairman of General Electric. Last year he was quoted as saying that we hire between four and five thousand college grads every year, and whether they join in finance or IT or marketing, they're going to code. This quote speaks to me for a couple of reasons. One, technically speaking, I'm in marketing. My background is software development, but as a developer evangelist I am in more of a marketing type of role. Also, as an undergrad, I went to a school where everybody had to take intro to programming. English majors took intro to programming. I think this is very relevant for where GE is and where GE is going, because not everyone comes from a traditional software engineering background. A lot of the time I'm interacting with a materials science engineer, an electrical engineer, a mechanical engineer, or someone in IT or data science who hasn't been developing web applications for the last 20 years. Now that they're writing code, they don't have a good grounding in security flaws, exploits, and things like that, because they haven't had to deal with them. So knowing that people with this sort of background are getting involved is important.
Today I'm going to walk through a particular project that we've been working on at GE Digital. This is the rhetorical questions part of the presentation. First, vision: why was ACS created? ACS is what we call our Access Control Service. I don't know if anyone else is still paying student loans, but that's a very different ACS. I'm also going to talk a little more about the vision, why it's important, and how it's being used at GE.
Then I'm going to go into the history. Now that we understand what the problem is, what is UAA, other than an acronym? What is OAuth? What is XACML? And what do they have to do with what I'm talking about? Then, to understand the project, some introductory concepts: what are subjects, what are resources, and what is a policy when you're dealing with access control? I'll dive into some of those, and then get a little more into the usage of the tool. It's a service, so how do you create that service for yourself? How do you create policies? How do you control access to an application? We really are looking at the application layer, RESTful services, and how to control access, especially in the IoT space. Then I'll start looking a little more at the source and try to explain how it all fits together, how it was built, and where you can find the code if you're interested or want to find out more about the project.
Some things that I won't be talking about. I'm not going to dive too deeply into authentication or encryption. It's not necessarily my background, so SSL, TLS, SSO: I may mention some of these things, but I'm not going to talk about them too much. Identity federation. Mr. Robot is not part of my talk, but if you want to start talking about Season 2, maybe. Intrusion detection, multi-tenancy, blockchain and how that's being used, security for devices and industrial control systems: these are important concepts, and I'm not going to go too deep into them, although they are all part of the bigger picture.
So, making sure that we're level set on some of the terms: authentication and authorization. We have probably heard these many times, but authentication is: are you who you say you are? I am a GE employee, or I'm an attendee of this conference. I have a badge. I can prove who I am to a certain extent. And are you allowed to do that?
That's the authorization concern that I'm going to be focusing on a little bit more, which is: I am a GE employee. We've already established my identity. Now allow me to enter the building. When we look at a lot of applications, you might see very simple logic. If the user is an employee, show them the employee entrance; else show them to the visitor lobby. If you have a web application and you're an administrator, you're allowed to go in and modify devices and configure their properties. If you're not an administrator, you just get a regular dashboard. That would be a very simplistic access model. But in industrial Internet of Things cases, it gets a little bit more complex.
So there's a commercial with ON. What does GE do? We're involved with a lot of different businesses. Wind turbines. When we're talking about the Internet of Things, the thing that we're talking about is not necessarily a thermostat, but a big piece of heavy equipment: a jet engine, or manufacturing plants and how to automate the production of individual things. MRI machines, so healthcare. There are lots of healthcare use cases. Transportation: a locomotive is a rolling data center with a lot of sensors and information about how it's operating. Smart cities: every streetlight in an entire city can be equipped with sensors gathering data, and there have been pilot projects about this. Power generation: hydroelectric plants, nuclear power, coal power. There are GE businesses that focus on those areas and develop software to help manage those applications. And light bulbs.
So another level set: complicated versus complex. Complicated might be a system with a high level of difficulty: store the user credentials in a token using the Base64 encoding scheme, transferred over a TLS connection. With this being the Embedded Linux Conference, there's a lot of very complicated and interesting technology to explore and understand.
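The simple access model described above (employee versus visitor, administrator versus regular user) can be sketched in a few lines of Python. This is only an illustration of the hard-coded if/else style of check; the `User` class and both function names are hypothetical and not part of any GE code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical user record; identity is assumed to be established already."""
    name: str
    is_employee: bool = False
    is_admin: bool = False


def entrance_for(user: User) -> str:
    # Authorization as a single if/else on one flag.
    return "employee entrance" if user.is_employee else "visitor lobby"


def landing_page_for(user: User) -> str:
    # Administrators may configure devices; everyone else gets the dashboard.
    return "device configuration" if user.is_admin else "regular dashboard"


if __name__ == "__main__":
    visitor = User("guest")
    print(entrance_for(visitor), "/", landing_page_for(visitor))
```

Checks like these are easy to write, but every application that needs them ends up carrying its own copy.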
But on the other side, there's complex: a system that has many components, and how you manage that complexity. So: allow GE field engineers, or a subcontractor who has electrical engineers, read access to the data coming out of that wind turbine, using this piece of software that was developed by an integration service provider, at a particular customer's location, during weekdays, except for their one location in Springfield, where that contractor is doing a pilot program, so therefore he has write access, but only for the next six months. It's a little bit more complex. Everything there individually is pretty simple, but taken all together, you have a lot of complexity to manage.
Last week was Valentine's Day, and relationships are complex. Again, there's this notion that you have a partner application and a customer facility. At that customer facility, you have another company's product and you have a GE product. That other company's product is actually using a different GE business's part, whereas the GE product has a GE part, and another manufacturer has produced parts that go into that. There was an interesting article that came out over the weekend, I think in Business Insider, about airplanes manufactured by Boeing. It was sort of a response to the idea that a Boeing is made in America, and it shows where all the different parts and components come from. If you flew here from somewhere other than Portland, you probably flew on a plane, and when you think about that airline and all the different companies that are involved with the manufacture, the servicing, the operations, all the software, and all the regulations that go into it, there are a lot of people who have hands in this access control problem.
So if that is the problem we're trying to solve, how did we come up with this project? Well, first we started looking at Cloud Foundry as the basis for our platform. Just to get a temperature reading, how many of you are familiar with Cloud Foundry?
You can give me applause or you can raise your hand, whichever you prefer. So a few people have maybe heard about Cloud Foundry. One component of that architecture is UAA, the User Account and Authentication service. What that service provides is authentication and identity management for the platform as a service. This is a layer on top of your infrastructure, managing lots of data centers, whether it's on-premise equipment, AWS, Azure, Google Cloud, and so on. In addition, it is an OAuth 2 authorization server. It's an implementation of the OAuth 2 standard, it supports SAML SSO authentication and SCIM-based identity management, and it is open source. That is very important for GE in particular. It is a basis of one of our key platforms, the one I'm involved with, called Predix, and being open source allows us to look at and inspect what's going on.
Now, while we use UAA and we like OAuth in particular, there are a few issues that we ran into when dealing with those complex relationships I was talking about before. For one, the scope-based privileges are very coarse-grained. If we go back to that pseudocode example and we're just making one very simple check, you're an admin or you're not an admin, that's not so hard. That's very coarse-grained, and OAuth can support that. But when we have all these nested cases, or data striping at a very specific level, where you have access to this asset but not that asset within one service, it starts getting a little bit more complicated. The other thing with OAuth is that the scope is tightly coupled to the access token. Here's my ticket for TriMet that got me here. If I drop it and someone else picks it up, they're able to make use of it, just like an access token. This works okay with OAuth because it's over TLS, which we understand, but we're talking about devices with very different communication protocols, so it can get a little bit more complicated.
The other issue is that during the lifetime of that access token, until it expires, whatever privileges have been granted to the token are there. So if we have time-based or hour-based rules, or changes being made to the access policy in real time, you have to force the user to log out and log back in for some of those changes to take effect. You cannot efficiently make access control decisions per request. Another issue is that when you think about identity management at scale in the cloud, there are a lot of different companies that have federated identity. If you have to reach out to an identity server at one location, you may not be fortunate enough to have it co-located with whatever applications you're running that need access to that data. And there are additional constraints on the token size. If you have a very complex relationship like I was discussing, with lots of different attributes that might identify what a person is, what a resource is, or what a device is, you exceed that token size. So that, again, meant it was not a complete solution for our problem.
So what about XACML? This is another standard, the eXtensible Access Control Markup Language. On the surface, it sounds like exactly what we were looking for. But there are a few things with it. One, the policy specifications are done in XML, and we're working with a lot of RESTful services in JSON and many different programming languages, so there are some complications with that. It can be difficult to understand. As for the implementations that were out there, and I don't know if this is still true, different vendors provided them, but they could be expensive to license and use. And for the conditions, when you're defining your policy, you're writing source code within XML. Again, from a developer experience standpoint, it was a little bit complicated and a little difficult to understand. It wasn't quite what we were looking for.
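To make the token-lifetime issue mentioned above concrete, here is a minimal sketch of privileges being frozen into a token at issue time. It is a toy model, not OAuth or UAA code: the `AccessToken` class, the scope names, and the `grants` table are all assumptions made for illustration, and the clock is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessToken:
    subject: str
    scopes: frozenset   # privileges copied into the token when it is issued
    expires_at: float   # seconds, on whatever clock the caller uses


def issue_token(subject, granted_scopes, now, lifetime_seconds=3600.0):
    # The scopes are snapshotted here; later changes to the grants are not seen.
    return AccessToken(subject, frozenset(granted_scopes), now + lifetime_seconds)


def token_allows(token, scope, now):
    # The check looks only at the token, never at the current grants.
    return now < token.expires_at and scope in token.scopes


# Hypothetical grant table held by the identity provider.
grants = {"jason": {"assets.read", "assets.write"}}
token = issue_token("jason", grants["jason"], now=0.0)

# One minute later an administrator revokes write access.
grants["jason"].discard("assets.write")

stale_write = token_allows(token, "assets.write", now=60.0)      # token still says yes
after_expiry = token_allows(token, "assets.write", now=3600.0)   # expired, so no
```

Until the token expires or the user logs in again, the revoked privilege is still honoured, which is the gap a per-request access control decision is meant to close.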
And so that brings us to the concept. What is the concept of ACS? What is this all about? What problem are we trying to solve? We start with a few high-level system requirements. One is being a consistent and reusable solution that's decoupled from the application. If we go back to my "if you're an administrator" example, or my "if you're an employee" example: if you have many different services, many different applications, and many different facilities, and that implementation is everywhere, even if it's just that simple if-else statement, then when you need to make a change, you want that logic decoupled from the application. Having a consistent way to define those policies, so that you can go to an authority that says this is the access control policy for this set of devices, or for this set of applications, or for this set of users, can be very beneficial. Having a shareable and distributed privilege store is pretty much the same point.
One reason this becomes important is the trend toward microservices: small standalone services that serve a single function. It's the pendulum swinging away from big, monolithic applications toward a lot of services that each solve one particular problem. So let's decouple the access control problem from them, which also gives us a certain amount of language independence. All those brand new developers, people who don't have a computer science background, are now writing services and implementing security procedures for devices or applications. Some of them are in Python, some in Go, some in Java. So having some independence by having a RESTful service layer is beneficial. And we can do that by having a service contract, so that there's an expectation of what access control should be able to do and how to talk to it.
So, the ACS feature set. We're talking about attribute management.
There are a number of features within that: hierarchical attributes, scoped attributes, and attribute connectors. Where do you get those attributes from in the first place? There is also policy management. How do you evaluate whether a policy should allow someone access to a system or not? And there is having multiple policy sets at the same time, since some domain experts may understand some policies but not others, and being able to fold those together.
One of the core components of ACS is this notion of a subject. That's your entity representing a user, or potentially a device. It has a configurable set of attributes. The things you might think about there are an identity, maybe a URI that's a reference to an OAuth identity. Again, OAuth is solving the authentication problem but not necessarily the authorization problem for some of these applications. Maybe I know an organization, a department, a role, or any number of discoverable attributes for a particular subject or device. Then you have the resource. A device can be on this side of the equation as well, when we're talking about machine to machine. A resource represents the configurable set of attributes of interest for the thing that you're operating on. We tend to call them assets, but it could be a service, a RESTful endpoint that's representing that asset. Are you allowed to operate on that resource? And if you are allowed to operate on it, what exactly are you able to do? These actions map pretty cleanly to RESTful verbs: GET to read something, and the corresponding verbs to write something, to create something, to remove something. But there is also support for other operations: patch, subscribe, messaging. And again, this is because it is an access control system for RESTful endpoints, where those RESTful services are the common vocabulary. And then the policies themselves.
You have a policy evaluation engine that maintains rules to compare the attributes of the subject and the resource to determine whether a user is permitted. So you have to define conditions and then the decision: do I allow or deny access? To make this very concrete, think about users of an application. The user in this case is the subject, and we can say they have a role and they have a location; those are attributes of the subject. This is like a requirement for an application that somebody might be developing. The asset in this case is some physical thing. Maybe it's a wind turbine. Maybe it's in a commercial application or in a manufacturing plant. And that too has some attributes. It has a manufacturer, it has a location, and maybe there are exceptions: not Oregon, or maybe just not any other state. And that all goes into the application policy. In this case, I just mentioned APM. It is one of the GE software products, for asset performance management. It's something that's used for monitoring lots of assets.
So what does this request and response look like over HTTP? Your app says: I have user 123, and he wants to GET asset 456. So there's a subject, an action, and a resource that come in as a request. ACS will discover what attributes it can about that subject and that resource, evaluate them against a policy for that resource, and then send back a response. In this response, we see that a decision has been made: that subject has been permitted to use that application. Along the way, we found out a few things about the subject and a few things about the resource, so the application can make use of those attributes. They are basically cached for its usage. However, the application didn't have to know much itself about how to find out those things.
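The request and response exchange described above can be sketched as a small in-memory evaluator. This is an illustration of the flow (subject, action, resource in; decision plus discovered attributes out), not the ACS implementation. The field names `subjectIdentifier`, `resourceIdentifier`, `effect`, `subjectAttributes`, and `resourceAttributes`, the attribute values, and the same-site policy are assumptions chosen for the example.

```python
# Hypothetical attribute stores standing in for whatever ACS would discover.
SUBJECT_ATTRIBUTES = {
    "user123": {"role": "operator", "site": "california"},
}
RESOURCE_ATTRIBUTES = {
    "/asset/456": {"manufacturer": "GE", "site": "california"},
    "/asset/789": {"manufacturer": "GE", "site": "oregon"},
}


def same_site_read_policy(action, subject, resource):
    # Example policy: reads are permitted only when subject and asset share a site.
    return (
        action == "GET"
        and "site" in subject
        and subject["site"] == resource.get("site")
    )


def evaluate(request):
    """Look up attributes for both sides, apply the policy, return a decision."""
    subject = SUBJECT_ATTRIBUTES.get(request["subjectIdentifier"], {})
    resource = RESOURCE_ATTRIBUTES.get(request["resourceIdentifier"], {})
    permitted = same_site_read_policy(request["action"], subject, resource)
    return {
        "effect": "PERMIT" if permitted else "DENY",
        "subjectAttributes": subject,
        "resourceAttributes": resource,
    }


response = evaluate(
    {"subjectIdentifier": "user123", "action": "GET", "resourceIdentifier": "/asset/456"}
)
```

The calling application only supplies identifiers and an action; the attributes come back with the decision, so the application does not need to know where they were found.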
One thing that does come up with this is, again, when you start getting into some of those complex relationships. Let's say the user is Jason Delancey. That's me; I'm an evangelist with GE Digital. So I have a role, and maybe that role has an attribute that grants access to a report, say for asset performance. That means when the set of attributes is returned for me as a subject, it'll include the asset performance report. Because I'm part of the group GE Digital, I'm also allowed to use this application. I'm also part of an organization, and that grants me access to the application; because I'm in that organization, I inherit that attribute and it is passed through. I'm also part of a tenant, giving me access to certain sets of services. That all seems pretty straightforward. Then there's scoped inheritance: taking that same picture but adding in a constraint, such as "but only if the site is California". When this is evaluated, it takes away access to that asset performance report when I'm at a different location.
So those are the basic concepts of the project. What does the usage look like? How would you use this if you're trying to solve similar access control problems in applications? Just for simplicity, I run in a Docker environment with Cloud Foundry. The way ACS is currently set up, it is partnered with UAA for the identity management piece. It doesn't have identity management or authentication included in it. So I just have a Docker container I use to pull in some of those dependencies. And because I didn't want to tempt the demo gods, I am just logging in to the Predix cloud to get access to this. You can pull it down and run all of these things because it is open source, but getting Cloud Foundry up and running and getting UAA configured on a device can be a little bit of work. There's some documentation about how to do that, not necessarily through ACS itself, but ACS is designed to run on that platform.
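The hierarchical and scoped inheritance described above can be sketched as a small graph walk. This is a toy model under stated assumptions, not how ACS stores or resolves attributes: the `Node` class, the attribute names, and the representation of a scope as a dictionary that must match the resource's attributes are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A subject, role, group, or organization with its own attributes."""
    name: str
    attributes: set = field(default_factory=set)   # {(key, value), ...}
    parents: list = field(default_factory=list)    # [(parent Node, scope or None), ...]


def effective_attributes(node, resource_attributes, _seen=None):
    """Collect a node's attributes plus those inherited from its parents.

    A scoped parent link is followed only when every key in the scope
    matches the attributes of the resource being accessed.
    """
    seen = set() if _seen is None else _seen
    if node.name in seen:          # guard against cycles in the hierarchy
        return set()
    seen.add(node.name)
    result = set(node.attributes)
    for parent, scope in node.parents:
        in_scope = scope is None or all(
            resource_attributes.get(key) == value for key, value in scope.items()
        )
        if in_scope:
            result |= effective_attributes(parent, resource_attributes, seen)
    return result


evangelist = Node("evangelist", {("report", "asset-performance")})
ge_digital = Node("ge-digital", {("application", "apm")})
jason = Node(
    "jason",
    {("role", "evangelist")},
    parents=[(evangelist, {"site": "california"}), (ge_digital, None)],
)

in_california = effective_attributes(jason, {"site": "california"})
in_oregon = effective_attributes(jason, {"site": "oregon"})
```

With the scope on the link to the role, the report attribute is inherited for a California resource and dropped for one in Oregon, while the unscoped group attribute is inherited in both cases.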
So I'm just using our cloud instance, which allows me to create a service and get some details about it. In this case, what I was interested in was: what is the URL? How do I get to ACS? Do I have permission to get to ACS? And what is the scope of that running instance? Here, this is my example where I was running it up in the cloud. If you're running it locally, spoiler: it is a Spring Boot application. So if you pull the source code down from GitHub, you can just build and run it to bring up a localhost instance. But you do need that UAA piece. If you were to inspect your identity management, either your local instance of UAA running under BOSH Lite or something out in the cloud, you're going to get a token, a bearer token that has a set of scopes. When I was talking about access control before, these are examples of those coarse-grained scopes. Does a particular user have access to read and write policies? Does a user or a client have access to read and write attributes? This is really about whether whatever application or service is being written is even allowed to talk to ACS at all, because you don't want just anyone to be able to go in and modify those access control policies.
The API: there is some documentation online in the GitHub project about it, but basically it is just resource and subject when defining your attributes. I'm doing these with just some straight REST calls, again not having a strong opinion on whether you're developing your application in Python or Go or Java, just to have a common language here. Basically, the headers include that access token from OAuth saying, yes, I'm allowed to talk to ACS, and I just want to PUT, or create, a new role, evangelist. ACS is a framework for handling these policies and these attribute definitions. It doesn't have a strong opinion about how you go about it.
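A REST call of the kind described above, a PUT with the OAuth bearer token in the headers, might be assembled as follows. This sketch only builds the request object and does not send it. The base URL, the `/v1/subject/` path, the JSON field names, and the issuer value are assumptions for illustration; the project's own documentation is the authority on the real API.

```python
import json
from urllib.parse import quote
from urllib.request import Request

ACS_BASE_URL = "https://acs.example.com"   # hypothetical ACS instance URL


def build_put_subject_request(subject_id, attributes, access_token):
    """Build (but do not send) a PUT that creates or updates a subject."""
    body = {"subjectIdentifier": subject_id, "attributes": attributes}
    return Request(
        url=f"{ACS_BASE_URL}/v1/subject/{quote(subject_id, safe='')}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Bearer token obtained from the identity provider (UAA in the talk).
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


request = build_put_subject_request(
    "evangelist",
    [{"issuer": "https://issuer.example.com", "name": "role", "value": "evangelist"}],
    access_token="PLACEHOLDER_TOKEN",
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here because it needs a running instance and a real token.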
While role-based access control makes a lot of sense for devices and assets and resources, a lot of that becomes a decision point for someone about how they want to manage their information. Some of the connectors I'll talk about make it a little bit easier. So one of these requests might look like having a subject identifier, evangelist, and some attributes, each of which is basically just a key-value pair. It's a list, but in this case it is just defining a role as evangelist. And the issuer is some trusted authority for where this data can come from. Now that I have that role defined, I can also set up an inheritance relationship. I'm creating a user for myself and specifying some additional attributes, like a location. So now I know my location and I know what my role is for this particular subject. On the other side, you can do the same thing for resources. If I'm creating some location in California, it's just a key-value pair. And then we can start defining our assets. Again, this is the low level; how you scale this up is a separate problem. But when you're looking at a particular asset, let's say asset 12, it has a location, and then maybe I create asset 13, where I have not specified a location. Maybe that's Oregon, maybe it's somewhere else.
This is where we start getting into the policy specifications. There are, like I said, a lot of similarities to XACML in terms of the concepts, but this is the JSON specification for how to define a particular policy set. I've given it a name, just the default that evangelists can access assets. That just seems like a good thing to have. We can see that the action there is GET: it's whether or not I'm able to read. Then the resource: there's a name, asset, and a URI template. This is how you handle scaling up the number of attributes and assets you have to define: there's basically just a pattern match on the URI template.
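As a rough sketch of the policy set and URI-template match described above: the dictionary below is modelled on the talk's description (a name, a GET action, a resource with a URI template, a subject attribute, a condition, and an effect), but its exact field names and the condition string are assumptions, and the template matcher is a simplified stand-in rather than the matcher ACS uses.

```python
import json
import re

# Hypothetical policy set, shaped after the description in the talk.
POLICY_SET = {
    "name": "default",
    "policies": [
        {
            "name": "Evangelists can access assets",
            "target": {
                "action": "GET",
                "resource": {"name": "asset", "uriTemplate": "/asset/{asset_id}"},
                "subject": {
                    "name": "evangelist",
                    "attributes": [{"issuer": "https://issuer.example.com", "name": "role"}],
                },
            },
            "conditions": [{"name": "is evangelist", "condition": "<condition source>"}],
            "effect": "PERMIT",
        }
    ],
}


def template_to_regex(template):
    """Turn '/asset/{asset_id}' into a regex with one named group per variable."""
    parts = re.split(r"(\{[^{}]+\})", template)
    pattern = "".join(
        f"(?P<{part[1:-1]}>[^/]+)" if part.startswith("{") else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{pattern}$")


def policy_applies(policy, action, resource_uri):
    """True when the action and the resource URI both match the policy target."""
    target = policy["target"]
    matcher = template_to_regex(target["resource"]["uriTemplate"])
    return action == target["action"] and matcher.match(resource_uri) is not None


policy_json = json.dumps(POLICY_SET, indent=2)   # what would be sent as JSON
```

One template covers asset 12, asset 13, and any other asset under the same path, which is what keeps the number of policy definitions small as assets are added.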
And then we look at the subject as well. In this case, I have a role. And then the condition. What does the condition mean? There can be a number of conditions that would all be evaluated. In this case, we're saying, let's make sure that whoever's trying to access this is an evangelist. The condition is specified as basically a lambda function in Groovy: match a single instance of the attribute from this issuer, look up that role for that subject, and compare it to evangelist. If they match, then this condition is met and they are granted access, so the policy has the effect of permitting access to the system.
So I hinted there that Groovy is the thing powering those conditions. It's very domain-specific in that there are subject attributes, resource attributes, split, equals, size. Effectively, using the abstract syntax tree, only particular methods have been approved, so you can't go in and write a condition that creates a brand new class. You can't import other libraries or write methods. So there are some limitations to what you can do there, but for most conditionals (and, or, not), the type of things that we see in conditions for some of the industrial applications, this is sufficient in terms of a feature set. What that then gets down to is that you do a POST to say, let's do an evaluation of this policy: can this subject GET this particular asset? And in return you get permitted or denied, as appropriate. Cool.
So if we track this to the source, how does this all work? I mentioned it is a Spring Boot application. The application is out on GitHub. It's github.com. If you want to look at the source code, there are a lot of JSON template examples and everything there. In a lot of my examples here, I went down to the REST call layer, the actual protocol exchange, because I didn't want to talk about a particular client library, but ACS does actually have a Spring Security extension as well.
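The idea of approving only particular operations by inspecting the abstract syntax tree can be illustrated in Python with the standard `ast` module. This is an analogue of the approach described above, written in a different language; it is not the Groovy mechanism ACS uses, and the helper names `subject_attr`, `resource_attr`, and `size` are hypothetical.

```python
import ast

# Only these syntax nodes may appear in a condition.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not,
    ast.Compare, ast.Eq, ast.NotEq, ast.In, ast.NotIn,
    ast.Call, ast.Name, ast.Load, ast.Constant,
)
# Only these helper functions may be called.
ALLOWED_FUNCTIONS = {"subject_attr", "resource_attr", "size"}


def compile_condition(source):
    """Parse a condition and reject anything outside the approved subset."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"{type(node).__name__} is not allowed in a condition")
        if isinstance(node, ast.Call) and not isinstance(node.func, ast.Name):
            raise ValueError("only direct calls to approved functions are allowed")
        if isinstance(node, ast.Name) and node.id not in ALLOWED_FUNCTIONS:
            raise ValueError(f"name {node.id!r} is not approved")
    return compile(tree, "<condition>", "eval")


def evaluate_condition(source, subject, resource):
    """Evaluate an approved condition against subject and resource attributes."""
    environment = {
        "__builtins__": {},                 # no built-ins reachable by name
        "subject_attr": subject.get,
        "resource_attr": resource.get,
        "size": len,
    }
    return bool(eval(compile_condition(source), environment))


CONDITION = "subject_attr('role') == 'evangelist' and subject_attr('site') == resource_attr('site')"
permitted = evaluate_condition(
    CONDITION,
    subject={"role": "evangelist", "site": "california"},
    resource={"site": "california"},
)
```

Attribute access, imports, lambdas, and calls to anything outside the approved set are rejected before evaluation, while `and`, `or`, `not`, and comparisons remain available.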
So if you are working in Java and you want to use ACS, you can download the GitHub repository, wire it up to Cloud Foundry, and write applications using the security extensions, which handle a lot of those requests for creating resources, creating subjects, and so on. Java and Groovy are a big part of this implementation. Because of that relationship, that hierarchy, as you might expect, there is a graph database. The current implementation is based on TitanDB. It uses Apache Cassandra for storage and TinkerPop, with Gremlin as the graph traversal language, to figure out what the policy evaluation would actually be. If you were standing this up, these are some of the components that can be pulled down and configured to run ACS on top. So basically we are talking about a service that you can run on top of these databases. I think Titan would allow you to configure other data stores if you were to go in that direction.
Caching becomes very important, so Redis is being used to cache a lot of these attributes. I hinted at these connectors. It's again the idea that some of these attributes might come from your identity management system. They might come from a device manufacturer. There could be other data stores that have a lot of these attributes that can be discovered. So a lot of effort in the source code goes into caching policies and how you decide when to fetch new attributes without having too big a hit on performance, because if you're doing this access control with each request, there is some performance penalty.
The platform itself, as I've mentioned, is built on Cloud Foundry with UAA, and Jenkins for builds. If you go to the project, you can find out a little more about some of those things. And that leads to a little more of this: what is this open source story?
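The caching trade-off mentioned above, deciding when to fetch attributes again without paying for a lookup on every request, can be sketched with a small time-to-live cache. This is an in-memory stand-in for illustration only: the talk says Redis is used, and nothing here reflects the actual caching policy in the source. The class name, the `fetch` callback, and the injectable clock are assumptions.

```python
import time


class AttributeCache:
    """Cache attributes per identifier and refetch after a time-to-live."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable: identifier -> attributes
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so behaviour can be tested
        self._entries = {}           # identifier -> (expires_at, attributes)

    def get(self, identifier):
        now = self._clock()
        entry = self._entries.get(identifier)
        if entry is not None and now < entry[0]:
            return entry[1]          # still fresh: no call to the connector
        attributes = self._fetch(identifier)
        self._entries[identifier] = (now + self._ttl, attributes)
        return attributes


# Hypothetical connector that counts how often the backing store is hit.
calls = []

def fetch_from_connector(identifier):
    calls.append(identifier)
    return {"site": "california"}


fake_now = [0.0]
cache = AttributeCache(fetch_from_connector, ttl_seconds=30.0, clock=lambda: fake_now[0])

cache.get("/asset/12")      # miss: goes to the connector
cache.get("/asset/12")      # hit: served from the cache
fake_now[0] = 31.0
cache.get("/asset/12")      # expired: fetched again
```

A shorter time-to-live picks up attribute changes sooner at the cost of more connector calls, which is the performance trade-off described in the talk.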
ACS is potentially interesting because it has been open sourced under an Apache 2 license. That's interesting from a GE perspective, that the source code is just made available. Cloud Foundry is another area where we make a lot of contributions. As I said, I'm not a cybersecurity professional, but we do have those folks who are the breakers. I really like this notion: I happen to be on the maker side of things, but there is another personality type, the breakers, who are just really good at figuring out how to exploit a system or make it work in ways it's not supposed to. We have a lot of breakers who go in and look at UAA because it is open source. They can review the security behind how authorization and authentication work within UAA for some of those coarse-grained security policies and identity management. There have been a number of times where we have discovered vulnerabilities and provided patches back, and it's a story that hasn't been well told in terms of what some of these contributions are. Cloud Foundry, because it's a platform as a service, was originally built around HTTP requests and WebSocket requests, and didn't really fit IoT as well as it could have. So we actually had developers who worked on the Gorouter to make it work properly for other protocols, MQTT and so on. Polymer, I don't know if anyone's familiar with Polymer, is a UI or web component library. That's another example of the open source projects that some of the development engineers are working on, as well as involvement with a lot of the consortia: Industrial Internet, OpenFog, and so on. I do want to highlight some of the committers to ACS. So again, github.com/predix/acs. You can go through and see who some of the committers are, look at the code, and look at a lot of the JSON examples, and that can really give you a flavor for how this goes.
Really it comes around to not needing to reinvent the wheel. We have a lot of GE businesses that are using ACS for managing their access. Instead of going off and implementing their own authorization frameworks, they're able to adopt ACS. If those businesses find things, it's almost that inner sourcing model. But being open in the first place has helped set some of the ideas around governance and commits. And so that is ACS. If there are questions, I can take some now, or I will be around for a little while and can give you a little more insight into ACS.
Yes? How does this connect out to the user management frameworks like SCIM?
So that's usually done by the UAA portion. That's the dependency. UAA is talking to other federated identity management systems or other identity providers. Maybe that's not what you meant, though. Yeah, so in my example, the issuer would probably be UAA. As you are putting data points, the subjects, into ACS, you can link back to the UAA record for that person, which in turn is maybe speaking to some other identity provider.
And what about the resources and permissions?
Yeah. So in many cases, we have another service. We call it the asset service. That is the typical pattern that we have, at least, for recording those entries. Part of the ACS project is this notion of connectors and being able to talk to some of these other repositories of resources. I'm only familiar with the asset service because that's what we tend to use. But I think that's one of the strengths, potentially, of it being open source. If there are other resource repositories that should have a connector to be able to discover some of those attributes, certainly that would be a welcome conversation.
Yeah. Do you preach to that as they are chosen from any framework of the masses? There has certainly been a lot of fans.
Yeah. So the Predix UI that we're using for a lot of applications is based on Polymer.
And so I think that's it again. It's another one of these areas where we haven't focused enough, maybe, on that open source story. There are lots of other companies that put out their open source report cards and really focus on what those contributions are. But our IT organization, and lots of organizations across GE that I'm not even aware of, on weekends are working on projects and doing lots of really good work. So Polymer is another example where we are actually, for some of our applications, building the UI on the framework and then trying to work back with the Polymer team. Sorry, I do tend to talk fast, hopefully.
Oh, yeah. This is a great question. I would put it definitely on the early stages side of things. It is being used a lot within GE applications. A lot of our businesses are making use of it. That's why I say it's a little more on that inner sourcing side of things. But because it's been released under an Apache 2 license, I think there is a goal of getting it out there and evangelizing it to an extent, to say: does everyone need to reinvent the same wheel? It's not that it's necessarily a super complicated problem, but dealing with some of that complexity and learning from some of our use cases can help contribute to that project and make it something that other people can use. And like I said, under the hood, I'm wearing a Cloud Foundry shirt here. There's a relationship with the Cloud Foundry platform as well, where it does seem that a lot of people have these problems. If you have that very simple case of, oh, you're an admin or not, you probably don't need it. It might be a little bit overkill in that case. But if you do have much more complex relationships, lots of assets, lots of devices, and you're trying to deal with all those things, then absolutely. Any other questions? All right.