All right. Hello. Hi. Thank you so much for having me here. I'm super excited. My name is Shirley Wu, and today I want to talk to you about building data visualizations for product.

So before we begin: why talk about it? About two years ago, when I first started at Illumio, we looked for guidance and mentorship, to try and find people who might be going through similar challenges as us, and we couldn't find any. We could have just been living in a hole under a rock, but in retrospect that makes a lot of sense, because product is hard to talk about. A lot of the time it's proprietary, so it might be hard to even share a screenshot, let alone some of the challenges somebody might have faced. But since I have the benefit of being able to talk about it, I would like to share with you some of the experiences we've had, the challenges we've faced, and the lessons we've learned in the last two years. I really hope this might offer some guidance or help if you're about to go through something similar, or are going through something similar.

So I work at a company called Illumio, and we create enterprise security software. My favorite analogy for what we do is this: if the traditional data center is a hotel, and the firewall is that big deadbolt on the front door, then anybody malicious who gets in through the front door has free-for-all access to all of your rooms. What Illumio does instead is provide a lock for each of those hotel room doors, as well as calculate who should be allowed in and out of each of those rooms. I work on the part of the product called Illumination, which visualizes the data center traffic. This is particularly important for us because a lot of our customers have huge deployments with large numbers of rooms. What that means for them is that they very often don't even know who might be going in and out of their rooms, and oftentimes they also need to be able to verify that the access they've given to go in and out of those rooms is actually accurate.

While thinking about what I wanted to cover in this talk, I settled on some technical aspects: building the actual visualization, building the framework around that visualization that manages the underlying data, and scaling the visualization and application. I also want to cover some of the more human aspects of interfacing with our customers and our team.

Before we begin, I want to give some key terms for Illumination. The squares that you see are the workloads, or the rooms in my analogy. The lines are the traffic, or the people going in and out of those rooms. And the workloads are grouped by their labels, which we call applications, environments, and locations.

This is the very, very first mock that our designer gave us when we first started out, and it's really quite beautiful. And this is our very first implementation. Many of you might be familiar with the D3 force layout, and might recognize that a lot of it is being used here. In particular, it's being used to calculate the positions of the workloads in each of the groups. Once we have the positions of those workloads, we calculate the dimensions of each group based on them. And once we have the dimensions of each of those groups, we use the force layout again to calculate the positions of those groups, alongside the floating workloads around them.

For those that are unfamiliar, the force layout is hugely computationally intensive: something like O(n log n), over thousands and thousands of iterations. We were doing this five, six, ten times on each render. And just to keep life interesting, we decided that we needed collision detection at each of those levels. So we were truly abusing the force. And I'm really excited about this slide.
I hope Jim's in this room somewhere; I've been bragging to him about it. So yeah, I hope we made Jim proud. His talk was some of our inspiration. But you can probably imagine what happened because of this. This is what we got for abusing the force. We're really sorry.

So, how did we fix it? We circled back and talked to our CTO, and we figured out that the most intuitive way of laying out our workloads was actually by observing the ports and protocols running on the traffic between those workloads because, surprisingly, that reflected the mental model our DevOps guys had of their orgs. So we went and completely stripped the force layout out of all of our code, and we implemented a layout algorithm designed specifically for our own needs.

This is the before and after: the before on your left, and the after on your right. This was absolutely awesome, because the refactor gave us a layout algorithm that was much, much more performant: it was O(n) over one iteration, as opposed to thousands and thousands of iterations for the force layout. It was much more orderly, because we made it so. And finally, because the algorithm was deterministic, it was able to maintain the same layout on each refresh, and that was really integral to our user experience.

So we learned some really important lessons. Mainly, when building visualizations within a product, we oftentimes don't have an exact dataset, and we very rarely know the exact shape of all of our customers' potential datasets. But we often do have the advantage of a closed feedback loop with our customers. In this case it was just us, because we dogfood our product. We were able to figure out what our users expected to see and what was familiar to them, and we were then able to customize and optimize our visualization based on that user feedback and expectation. Not only was that a lot more performant for us; as it turns out, it also meant that it was a lot easier for our customers to adopt from a mental model perspective.

At the very beginning, I mentioned that one of the things we do is not only visualize our customers' application traffic, but also provide ways to control access for that application traffic. We build a lot of workflows and features around that. One of the main workflows we have is this one; this is one of our earliest versions of the workflow for adding a rule. What this is showing is that the red lines are traffic that is not allowed for our customers, and the green lines are traffic that is allowed. This workflow encourages our users to take action: to write the rule that turns those lines green and allows that traffic. That means that as soon as the user clicks save, they expect all of the corresponding red lines within the visualization to turn green right away, without a refresh. Which means that each user action could potentially be changing the visualization in multiple places. And we were like, oh my god.

I really, really love D3 for the fact that it gives us fine-grained control over intelligently updating the DOM, and we really tried to keep with that when we first started implementing Illumination. We tried to control how much of the data we recalculated, and we cherry-picked how much and how often we touched the DOM on each of those user interactions.
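To make that kind of bookkeeping concrete, here is a minimal sketch of the split a keyed data join performs when the data changes: which items are new, which persist, and which are gone. It is plain TypeScript for illustration, not D3's implementation and not Illumio's code; the `TrafficLink` shape and the `linkKey` and `joinByKey` helpers are hypothetical names.

```typescript
// Hypothetical shape for one traffic line between two workloads.
interface TrafficLink {
  source: string;
  target: string;
  allowed: boolean; // true = green line, false = red line
}

// Stable identity for a link, so it can be matched across renders.
const linkKey = (l: TrafficLink): string => `${l.source}->${l.target}`;

interface JoinResult<T> {
  enter: T[];  // in next but not in prev: elements to create
  update: T[]; // in both: elements to patch in place
  exit: T[];   // in prev but not in next: elements to remove
}

// Split two snapshots of the data into enter / update / exit sets by key.
function joinByKey<T>(prev: T[], next: T[], key: (d: T) => string): JoinResult<T> {
  const prevKeys = new Set(prev.map(key));
  const nextKeys = new Set(next.map(key));
  return {
    enter: next.filter((d) => !prevKeys.has(key(d))),
    update: next.filter((d) => prevKeys.has(key(d))),
    exit: prev.filter((d) => !nextKeys.has(key(d))),
  };
}

// Example: one rule was added (web->db turns green), one link appeared,
// and one link disappeared.
const prevLinks: TrafficLink[] = [
  { source: "web", target: "db", allowed: false },
  { source: "web", target: "cache", allowed: true },
];
const nextLinks: TrafficLink[] = [
  { source: "web", target: "db", allowed: true },
  { source: "api", target: "db", allowed: false },
];
const joined = joinByKey(prevLinks, nextLinks, linkKey);
```

With hand-managed updates, every interaction that changes the data has to feed a split like this with the right before and after snapshots, for every kind of element on screen.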
And that was great when we first started out, because we had a pretty simple feature set. But as we added more and more features, we started having more and more user actions. This table that you can see faintly in the background is actually only half of our user actions at a certain point in time, about a year ago. With all of the user actions, their corresponding data recalculations, and their corresponding DOM re-renderings, it became too much for us to keep track of what needed to be entered, what needed to be updated, and what needed to be exited with D3. We really just couldn't keep track of it all anymore, which meant that the more the user interacted with Illumination, the more incorrect our visualization became. And that's really not good.

So we fixed it. First, we moved most of our rendering responsibilities to React, as opposed to pure D3. We ended up using a mix of both, but this allowed us to abstract away a lot of the DOM management. And we decided to recalculate everything, all data calculations, on each user action. You might say, wow, this is really counterintuitive, that must have been heavily unperformant. But we found that because of the modern browser, because we had moved to React and Flux, and especially because of React's virtual DOM diffing, this was actually surprisingly okay.

This is what Illumination looked like, in that same exact workflow of adding a rule, after our refactor. It was great for us because we now recalculate all of the data on each user action.
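As a rough illustration of the recalculate-everything idea, the sketch below treats the rendered lines as a pure function of the application state: an action produces a new state, and every line's color is derived again from scratch. This is a Flux-style pattern written in plain TypeScript, not the Flux library's API and not Illumio's code; `Edge`, `AppState`, `reduceState`, and `deriveLines` are hypothetical names, and it assumes a rule allows traffic in one direction between a named source and target.

```typescript
// Hypothetical shapes; the talk does not describe the real data model.
interface Edge { source: string; target: string; }
interface AppState { traffic: Edge[]; rules: Edge[]; }
interface RenderedLine extends Edge { color: "red" | "green"; }

type Action =
  | { type: "ADD_RULE"; rule: Edge }
  | { type: "REMOVE_RULE"; rule: Edge };

const sameEdge = (a: Edge, b: Edge): boolean =>
  a.source === b.source && a.target === b.target;

// One-directional flow: an action yields a new state; the old one is not mutated.
function reduceState(state: AppState, action: Action): AppState {
  if (action.type === "ADD_RULE") {
    return { traffic: state.traffic, rules: state.rules.concat([action.rule]) };
  }
  return {
    traffic: state.traffic,
    rules: state.rules.filter((r) => !sameEdge(r, action.rule)),
  };
}

// Recompute every line from the current state; nothing is patched incrementally.
function deriveLines(state: AppState): RenderedLine[] {
  return state.traffic.map((t): RenderedLine => ({
    source: t.source,
    target: t.target,
    color: state.rules.some((r) => sameEdge(r, t)) ? "green" : "red",
  }));
}

// Example: saving a rule for web->db turns that line green on the next derive.
let appState: AppState = {
  traffic: [{ source: "web", target: "db" }, { source: "web", target: "cache" }],
  rules: [],
};
const linesBefore = deriveLines(appState);
appState = reduceState(appState, { type: "ADD_RULE", rule: { source: "web", target: "db" } });
const linesAfter = deriveLines(appState);
```

The trade-off is deliberate: each action costs a full recomputation, but there is no per-action list of what to enter, update, or exit, and a diffing renderer such as React can work out the minimal DOM changes from the derived output.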
We no longer needed to keep track of anything. And because we use Flux's one-directional data flow, everything was really easy to reason about from an architecture and code perspective. And because we were no longer using all of our brainpower to keep track of all of our updates, we could instead concentrate on delivering features. So we could do much more than we could before, and everything just kind of worked.

So we learned here that managing the underlying data, especially data that is constantly changing with each user action, is one of the biggest, hardest challenges of building data visualizations for a product. We made the mistake of optimizing prematurely, and from there we learned to instead start with a very stupidly simple approach to our architecture. Then, once it has had the time to marinate and soak in the code base, we can start figuring out exactly where the optimizations are needed.

Thus far, everything I have shown you has been for relatively reasonable numbers. But a lot of our customers don't have reasonable numbers; they have huge numbers. So at one point in time, our customers' deployments started to look like this. That's kind of fun, but not really great for user experience.

The first thing we did was figure out that probably the most straightforward thing to do is to just reduce the amount of data being rendered on the screen. We did this by aggregating the workloads by their respective groups, and then by their location labels, and then we filtered away anything that wasn't relevant as the user was drilling down. This is actually our dogfood environment, and this worked really well for us.
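The aggregation step can be sketched along these lines: collapse workload-to-workload traffic into traffic between aggregates, first by group and then by location. This is a minimal sketch under assumptions, not Illumio's implementation; the `Workload` fields mirror the application, environment, and location labels mentioned earlier, and `groupKey`, `locationKey`, and `aggregateLinks` are hypothetical names.

```typescript
// Hypothetical shapes based on the labels described in the talk.
interface Workload { id: string; app: string; env: string; loc: string; }
interface WorkloadLink { source: string; target: string; } // workload ids
interface AggregatedLink { source: string; target: string; count: number; }

const groupKey = (w: Workload): string => `${w.app}|${w.env}|${w.loc}`;
const locationKey = (w: Workload): string => w.loc;

// Collapse workload-to-workload links into links between aggregates.
function aggregateLinks(
  workloads: Workload[],
  links: WorkloadLink[],
  keyOf: (w: Workload) => string,
): AggregatedLink[] {
  const keyById = new Map<string, string>();
  workloads.forEach((w) => keyById.set(w.id, keyOf(w)));

  const merged = new Map<string, AggregatedLink>();
  links.forEach((l) => {
    const source = keyById.get(l.source);
    const target = keyById.get(l.target);
    // Skip unknown workloads and traffic that stays inside one aggregate.
    if (source === undefined || target === undefined || source === target) return;
    const id = `${source}->${target}`;
    const existing = merged.get(id);
    if (existing) existing.count += 1;
    else merged.set(id, { source, target, count: 1 });
  });
  return Array.from(merged.values());
}

const sampleWorkloads: Workload[] = [
  { id: "w1", app: "shop", env: "prod", loc: "us" },
  { id: "w2", app: "shop", env: "prod", loc: "us" },
  { id: "w3", app: "billing", env: "prod", loc: "us" },
  { id: "w4", app: "billing", env: "prod", loc: "eu" },
];
const sampleLinks: WorkloadLink[] = [
  { source: "w1", target: "w3" },
  { source: "w2", target: "w3" },
  { source: "w1", target: "w2" },
  { source: "w3", target: "w4" },
];

// Four workload-level lines become two group-level lines, or one location-level line.
const byGroup = aggregateLinks(sampleWorkloads, sampleLinks, groupKey);
const byLocation = aggregateLinks(sampleWorkloads, sampleLinks, locationKey);
```

Drilling down then amounts to choosing a finer key function for the part of the data the user has opened and filtering out everything else.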
This isn't that big of a number, but this is what it looked like after the refactor. One, this was much, much more visually appealing. Two, it significantly lightened the cognitive load on our users, because they no longer had to dig through those spaghetti lines. And three, it was a lot better for the browser.

But what we started to notice, as our customers' numbers got bigger and bigger, was that even though the pretty picture would draw, every user action (a drag, a pan, a zoom, a click) started to lag. And that makes a lot of sense, because we were recalculating everything on every single user action.

So this is my absolute favorite part. We went in and we got crazy with the Google performance tools. We figured out three main things, among a bunch of other ones.

The first thing we did was figure out the biggest current roadblock, what was slowing Illumination down. We found out, surprisingly (hindsight is 20/20), that because our data stores were using arrays, the lookups on those arrays were slowing us down significantly at larger numbers. So we just replaced the arrays with objects in our data stores.

The second thing we went in to figure out was the next most expensive operation, and we learned that it was actually calculating whether the lines should be red or green. So we figured out where that could be moved to, which was each time the data was loaded from the back end, after some user action of adding a rule or changing a rule. Then we did similar things with all of the other places we found to be least performant, and we stopped recalculating those on each user action as well.

Finally, we took a look at which data structures were the most memory intensive, and we figured out how much of the data calculation we could move to the back end. We moved as much as we could to the back end, and then we only loaded just enough data to render what's needed on the screen.

This meant that everything became much smoother and much snappier, both we and the product were happier, and we could support much larger numbers than before.

So the important lesson we learned here is that we very rarely know the exact shape of the data our customers are going to have, which is probably why optimizing prematurely failed so miserably for us. What we should do instead is figure out what is slowing us down only when it's slowing us down, and base the optimizations on the results we've gathered from the profiling tools we've run.

We're still working on scaling. It's an ongoing process, and we've been learning a lot. We just keep swimming. I had to have that; it's an aquarium. I just hope you guys would really like it.
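The first two optimizations described above can be sketched roughly as follows: replace array scans with lookups on an object keyed by id, and compute each line's allowed status once when data arrives from the back end instead of on every interaction. This is an illustrative sketch only; the store shapes and the `findInArray`, `indexById`, and `onDataLoaded` names are hypothetical and not taken from the product.

```typescript
// Hypothetical store shapes for illustration.
interface StoredWorkload { id: string; name: string; }
interface StoredRule { source: string; target: string; }
interface StoredLink { source: string; target: string; allowed?: boolean; }

// Before: an array store, where each lookup may scan the whole list (O(n)).
function findInArray(store: StoredWorkload[], id: string): StoredWorkload | undefined {
  return store.find((w) => w.id === id);
}

// After: an object keyed by id, where a lookup is a single property access.
type WorkloadIndex = { [id: string]: StoredWorkload };

function indexById(list: StoredWorkload[]): WorkloadIndex {
  const index: WorkloadIndex = {};
  list.forEach((w) => { index[w.id] = w; });
  return index;
}

// Run once when data arrives from the back end, not on every drag, pan, or zoom.
function onDataLoaded(links: StoredLink[], rules: StoredRule[]): StoredLink[] {
  const allowedEdges: { [edge: string]: boolean } = {};
  rules.forEach((r) => { allowedEdges[`${r.source}->${r.target}`] = true; });
  return links.map((l): StoredLink => ({
    source: l.source,
    target: l.target,
    allowed: allowedEdges[`${l.source}->${l.target}`] === true,
  }));
}

const workloadList: StoredWorkload[] = [
  { id: "w1", name: "web" },
  { id: "w2", name: "db" },
];
const workloadIndex = indexById(workloadList);
const loadedLinks = onDataLoaded(
  [{ source: "w1", target: "w2" }, { source: "w2", target: "w1" }],
  [{ source: "w1", target: "w2" }],
);
```

Interactions such as pan and zoom can then read the cached `allowed` flag and the keyed index directly, so their cost no longer grows with the size of the stores.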
I've talked about the technical challenges that we've gone through, but I think it's also really important to talk about the human aspects as well.

We've been very fortunate in that we've been able to have very tight feedback loops with our customers for the last two years, and we noticed two primary things. When we first started, we were largely selling our product, so none of our customers were really using Illumination that much. That meant most, if not all, of our features were put in based on the needs of the demo and the feedback that we got on those demos, which meant that we were building for customer wants as opposed to their needs. In the last two years, as we've gotten more and more customers installed and using Illumination, we've started to see patterns emerge in how they're using it. That means we've been able to home in on their workflows and start simplifying the feature set of Illumination.

That's really important to us, because in product, with customers, there are always, always edge cases. We'll build something in Illumination for a dataset that we're pretty sure our customers' data should be shaped like, hopefully, and it works for most of our customers, yay. And then a customer will come along with data shaped like nothing we ever expected, and then suddenly, boom, we have support tickets and fire drills.

At one point in time, when we first started working on scaling our product, we made the assumption that we should optimize for the number of workloads. That seemed pretty reasonable. Then, over a period of time, we started getting all these support tickets filed on us about Illumination freezing at really, really small numbers of workloads: 10, 15, 20, numbers that we were supposed to be able to support reasonably easily. When we started digging in, we realized it was because the data wasn't just going to grow and scale with the number of workloads. There were other places of complexity also. For example, some of the customers that froze Illumination had huge, huge numbers of ports and protocols running between some of their workloads, or we would be tracking large numbers of IP addresses for their IP lists.

So as we've scaled, and as what we can handle with Illumination has gotten bigger, it's become increasingly important to really understand the needs of our customers and cut out any unnecessary features, or else we get edge cases and everything blowing up over and over.

So the lesson we learned here was that when we build for product, there are obviously different customers: some that have small datasets and some that have huge datasets, and we need to be able to accommodate both of them in Illumination. We've learned that we need to take advantage of that tight feedback loop, figure out what's important in their use cases, and work towards a visualization and an application that is flexible enough to handle all of those situations. And the most important thing for being able to do that is to always make sure to get and use real data, as often and as early as possible. If we can't get real data from our customers, then we simulate their data as closely as possible, because only then can we make informed decisions about their needs and their use cases.

When we first started with Illumination, we had two front-end engineers working on it full-time. We are now a cross-functional team of front-end, back-end, QA, design, product, and project managers, most of us working full-time on this project, and that's really cool. But despite all of the emphasis and importance that's been placed on visualization in our product, it's sometimes still surprisingly difficult to get immediate buy-in on new visualization ideas. I think that's for two primary reasons.

One: no matter how brilliant our team is, and they are, sometimes the visualizations that we propose are absolutely esoteric. We might think that parallel coordinates and Sankey diagrams make so much sense to us, but when we pitch them to teammates who have never seen them before, it's hard for them to understand what the visualization is, let alone what it's supposed to do, how our data will look within it, or what the use case or customer need for it is.

Second: Illumination is quite novel. I've been told oftentimes that the security industry, because the UI isn't that sexy, has never really seen something like this before. Some of my favorite stories are of our customers seeing their networks visualized for the first time and going, "That's our dev, and that's our prod, and there's a line between them that shouldn't be there. Hold on, let me talk to our DevOps guys." That interaction is really, really cool. But it also means that sometimes our team doesn't know where we should be going, because our customers don't even really know what they want or need from us.

I think that's actually quite exciting, and the way we've gotten around it is by encouraging prototyping when we don't know where we should be going. We made prototyping really easy for ourselves: all we have to do is drop a hidden page into our application, and it has access to all of our resources, like our data calculation functions and our rendering functions. That means that when we have an idea for a visualization, it's relatively inexpensive to just go and prototype for a few days, come back to our team, and say, "Hey, that idea I had? This is what it looks like with our data." Now our team can use that prototype to make an informed decision about whether or not it's important or valuable enough to go into our product. Sometimes it is, and it makes it into the product, and sometimes it isn't. But that's also okay, because then we've taken the time to explore what's out there, and the options we could potentially use in a future use case, at a time when we didn't know where we needed to go.

So in the past 20 minutes or so I've
kind of just dumped all these stories and experiences onto you, so to recap. First, customize and optimize the visualization based on user expectation and feedback. Second, don't pre-optimize, but rather start with a stupidly simple approach when it comes to the framework and architecture around the visualization. Third, only optimize when we hit a performance roadblock, and then optimize based on the data we gather. Always, always get and use real data, as often and as early as possible. And when we don't know what to do, always prototype and always explore.

Over the past few years I've learned so, so much from all of these experiences. If I can boil it down to the most important points for me, it's been to be flexible, both in the product, because we never know what kind of data our customers can throw at us, and in the process, so that we can explore when necessary, but not all the time. And perhaps the most important to me is: be willing to throw things out and refactor, and don't get attached to any one prototype or iteration. The team that can do this is the most important part, because I fully believe that it is our team, our beliefs, and the culture that we've built that have taken Illumination to where it is today.

At the very beginning, I mentioned that there weren't many resources we could find in terms of guidance on how to build visualizations within a product. So I've compiled a survey, and if you've had these experiences of building for a product, please help by filling it out, because what I've talked about is only one perspective. I'm really hoping to compile a multitude of perspectives and turn them into a write-up of mistakes made and lessons learned. In the meanwhile, please join our Slack channel called data vision product, because we're trying to foster a kind of community support group, because I'm pretty sure that I'm not alone in this.

So thank you very much. That's all that I can think to say.