 Welcome to, from Excel to Rails, a path to enlightened internal software. When I originally gave the title of this talk to the people running the conference, it was called the path to enlightened internal software, but they told me it was two characters too long. So that's why it's now a path, but I feel that's probably more accurately reflects my ambitions. So, my name is Nick Revel, I'm a director of engineering at StitchFix, and StitchFix is an online styling service for men's and women's clothing. We are, people fill out a style profile with us, we use a combination of data science and human styling to send them clothing, five items of clothing in a box. They try on everything they like at home, keep what they like, return what they don't like. And so the reason we're here is because we build an awful lot of Rails software. Customer-facing software is all Ruby on Rails, but we also build all our internal software. So everything that runs all our five warehouses, everything that's used by our stylists, everything that's used by all the buyers, that buy all the clothes that we sell. We're also one of the sponsors here, and so you can come in, hopefully come and see us in the exhibition hall. We have some really amazing socks that I'm sure you'll all enjoy, and cold brew coffee. And we did have t-shirts and we ran out, but we should have more t-shirts for tomorrow, so please come and get t-shirts from us. And talk to us about jobs. So I'm here to talk to you about internal software. This, as I'm sure you can tell, is a candid photograph of a colleague of mine in our office. Don't worry about the watermarks on the picture there. He's called Chad or Trent or something, I don't know. And he's very excited about the internal software we've created for him, which is really the goal for us when we write internal software. I've been writing internal software, that is not consumer-facing software, for most of my career. When I first started writing it, being a developer, I was actually working with a technology that probably almost no one has heard of, called Lassu, and it had a FileMaker Pro database, back-end, which was kind of terrifying. But then we upgraded to PHP and MySQL, which was great. And then finally, I moved on to Rails. But I spent most of my career writing what you might call expert use software, and most of that has actually been internal software. When I first started as a developer, I was reading a lot of blogs and a lot of books and trying to find out a lot about the industry. And one thing I read was Joel on software. This is Joel Spolsky, who now is most famous, I guess, for being the founder of Trello and Stack Overflow. But back then, he was mostly known as a blogger and of this blog called Joel on Software, that became this book. And I was really into reading this, and there's still a lot of good stuff that you can read in there. This picture of Joel Spolsky here, if you actually do a Google image search for Joel Spolsky, you'll see lots of pictures of him looking like the very pleasant human being he actually is. But I chose this one where he looks like a homicidal Danakroyd because it suits the purposes of my narrative. So I was reading one, it's one article, and I was writing this internal software, and I was really enjoying my job. I've got a lot of value out of it. And then I read this article by Spolsky about the five worlds of software, and he identified these five different types of software that people can end up writing. And he identified them as Shrinkwrap, which in those days actually was Shrinkwrap, like it was stuff that would come on a CD. Now, something you would buy at an app store. Embedded software, games, throwaway, and then the last one he identified was internal software. And he really dumped all over internal software, and I was really upset by this. It's like someone you really admire. It's like being friends with a cool kid at school, and then they tell you that deaf leopard are lame, and you're like, come on, that's not fair. So this has stuck with me for 11 years. I actually remember this when I was writing this presentation, and so I am gonna use this presentation as a chance to completely repradiate everything he said about internal software, based on one sentence. So the first thing he said was, in internal software, usability is a lower priority. This is definitely false. The better, the more usable the software is that you write, oh, there you go, I'm getting confused about my ear. Excuse me for that. Usability is a lower priority. If you write more usable software, your employees, the people who work on the software are more happy, and therefore less likely to leave the company. Now it's not gonna be the only thing that makes them stay, but it's obviously gonna contribute to how they feel about their job they do. It also means they can be more efficient, which means you need to hire less people, because a limited number of people need to use the software. Well, that's true, clearly. No internal software is gonna have the same number of users as a successful consumer facing pieces of software, but the difference really is that any internal software might be used by people all day, every day, as a huge part of their job, and that's certainly true with a lot of the software rewrites, so you need to make sure that it works well for them. They don't have any choice in the matter. That's false, they can go and work somewhere else. We write software for our warehouse employees who pack the boxes and receive all the goods that we have, and those warehouses, we have five of them around the country, they're all located next to other warehouses where people do similar jobs, and if they don't like what they're doing with us, they can go and work for Amazon or someone else, and people actually say that the software we write for them is one of the reasons why they like working for us. And finally, they will just have to deal with it, and unfortunately that is also not true because they have other software on their computers, they have Google Spreadsheets or Excel, and if they feel like their software isn't doing a good job for them, they'll create their own software to solve their own problems, and that causes a lot of problems for the company at large further down the road. So Excel is a kind of remarkable piece of software. It's used by most businesses, lots of small businesses, including startups, run on Excel. In fact, Stitchvix, when it was founded, our founder Katrina Lake basically created the whole business using a combination of Excel and Google Spreadsheets, and she used it to run the business at a very small scale but then work out what the processes were for running that business, and so by the time it came to write real software, she had a great idea of what she needed, and we ended up building the right thing the first time, not perfect of course, but pretty much the right thing first time. Also big businesses also use Excel. I was kind of stunned to realize when I joined Stitchvix, there are a lot of people there who come from other retail companies that places like Macy's and Nordstrom still use Excel largely to run their merchandising business, and these are companies, I think Macy's has a revenue of like 10, 17 billion a year or something, and they're using Excel to plan their whole business, which is kind of crazy. So why is it so popular? Well, a big reason is Joel Spolsky worked on version six of Excel, which he will tell you repeatedly if you read anything he writes, but very important. Also no expertise is required really, like you need to learn how to use Excel but you don't need to be a software developer to use Excel, and you have then a lot of ownership and control over the processes that you are trying to kind of model with your Excel spreadsheet. It's super flexible, and it's actually great for prototyping things, and I think that the example I gave you about our founder, that Excel was great for just trying things out and then being low overhead and changing parts of the system she was creating. And kind of most important, if you watch people use Excel, a lot of people who are like Excel power users, one of the most important features is it has both vertical and horizontal scrolling, which is kind of mind blowing, how much they use that and how important that is to people. And it's actually a feature you need to, when you're building software to replace Excel, you have to work out, how are we gonna replace what they're doing with that horizontal scrolling? But so if it's so great, then why do we build internal software at all? There's a bunch of reasons. There's no version control, people share copies of important spreadsheets over email in different ways, and they end up with multiple copies of the same spreadsheet and you don't know which is the right version, which is the real source of truth. Within a spreadsheet, people copy and paste data all over the place because if you wanna refer to the same piece of data in multiple places, it's really difficult to do that in Excel without just copying and pasting and then you don't know which is the right data. There aren't really validations. You can kind of fudge it a bit in Excel, but really to have validations as we know as Rails developers is very difficult. So even the data that initially goes in, even if it's in one place, you don't know if it's the correct data and it's not scalable. Excel on the Mac has a row limit of a million rows, which is for any kind of serious or successful business is not a lot of data. And also if you want multiple people to use the same spreadsheet, you have to ask people, hey, can I go in the spreadsheet now or are you using it? And it's kind of a nightmare. There's one other bad thing about it, which is it makes nonsense like this look professional. So although you might say the same about keynotes also that I'm using now. So internal software solves all those problems for Excel, you have a single source of truth with a relational database. You can just have one record for everything you care about and have it just have it referred to in different parts of your system. You could obviously have validations both at the database level and at the app level to make sure that the data going into your system makes sense. And it's totally scalable. You know, a million rows for a Postgres database is nothing, but even if you do reach the limits of something like Postgres, there are lots of well-known established solutions to that problem of scale. And the final thing why internal software is better than Excel I think is that it allows users, the people who are sort of running the business to actually focus on their jobs, which is making decisions and not just building spreadsheets or being like a kind of manual macro monkey, which a lot of people do as part of their jobs. Now obviously you can build internal software with lots of different frameworks, lots of different technologies. I think there's a couple of reasons why Rails is particularly good for it. Ruby is great for writing readable and maintainable code, which is obviously good for all software. But on top of that with Rails, if you follow the conventions of Rails, it means you can write a lot of good quality software very quickly, which especially in the early days of a business and a startup, it's really important. You have a lot of things to cover. You wanna make sure you get rid of a lot of those Excel spreadsheets. And so Rails really gives you the ability to do that. And then you end up with code that is maintainable and again, if you follow the conventions, it's easy to onboard new developers because they know where to look at least for all the key pieces of code that they wanna know about. So I wanna talk to you just now to give a case study of how we went transferred from Excel to a Rails app as StitchFix. And I think one of the things I found fascinating about doing this kind of process, and we do it all the time, there's always new processes to replace, is that you can kind of create, that you can kind of treat the Excel spreadsheets that people come up with and the processes that surround them as like a prototype of what you're ultimately trying to build. So you've actually, a business partner as you're building software for have already done a lot of the work for you. There's obviously lots of things you can do with Excel, but one of the things that we encounter a lot is people using Excel to store effects of a large number of rows, tabular data that they wanna do, searching, grouping, and aggregation. And this is the case for buy sheets. So buy sheets, just to give a brief overview, our buying teams will go out and plan on a monthly basis what they're gonna buy, let's say like three or four months ahead, and they will have a line on this spreadsheet for every single thing they're going to buy. They may have placed it already with a vendor, they may not have placed it already, but they would list it out, this is probably from six months ago, they would list it out, every single thing they're going to buy. Now, and so you've got things like a purchase order number, whether it's a new or a re-buy, who the vendor is, when you're expecting to ship that kind of thing, lots of very understandable data. Now this spreadsheet doesn't look too bad, like you think this is fine, we can manage with this. But of course, the real spreadsheet doesn't look like that, it actually looks like this. And that's not even all the columns. But of course, actually it doesn't even look like this, it looks like this. Because it's actually 10 times longer than the thing I showed you. And it doesn't even look like that, it looks like this, because there are 10 teams, all of whom have their own spreadsheet, trying to do the same thing all the time. And this is just for one month. So every month, a new set of these gets generated, and it's constantly being worked on by multiple people. So obviously there's a tons of problems with this. The sort of senior management in the company want to be able to see an aggregation of this data over the whole business. Well, you try and aggregate over all those 10 spreadsheets, that's pretty difficult, and there's a lot of manual work. There's a lot of repeated data in this spreadsheet, and so there's tons of copying and pasting going on, which means every time you copy and paste, that's a chance to get it wrong, and to have incorrect data. We also have a system that we've already built for creating purchase orders, that takes exactly this data, and it puts it into our system, and then we send it off to our vendor, and it flows through to receiving, and all the rest of the parts of our process, but they have to copy this data into our system to then generate a purchase order, again, a chance to make a mistake. And also the buying teams often want to compare, let's say, July of this year with July of last year. Well, comparing spreadsheets over time, the separate spreadsheets is already difficult, but the process evolves over time, and so the spreadsheet from a year ago won't even look like this, actually, it'll look like something else. So just getting the columns to align is a nightmare. And on top of that, there's actually lots and lots of tabs as part of this spreadsheet. For example, this is reference data, these are the allowed values in certain columns. Now, that's great to know that, but this isn't really enforced, so it's just reference, it's not actually a valid, it's no constraint on anything you enter. This tab shows targets as a planning team as part of merchandising that want to give the buying teams guidance about how much they need to buy in short sleeve or long sleeve, different colors, different sizes, and the buying teams have to hit these targets. And so they have these summary tables which show what they've bought in long sleeve, short sleeve, and so on, adding up from that main spreadsheet and comparing against targets. Only that summary sheet actually looks like this, and there's one of those for every single spreadsheet, and they're different because there are different targets to hit for denim, for jeans, versus buying tops. So it gets very, very complicated. So what can we do about turning this into some internal software? Well, the first thing is to look at, as I said, this is like a prototype, is to look at what they have built and what are they trying to achieve. The first thing you can actually see about this thing, and if you remember the version with the loads and loads of columns, is that this is like a pre-denormalized view of data. So everything is flattened out, everything is on one row, the lowest possible row, and it makes it easy for the team to do filtering and searching and that kind of thing. But what we need to do is actually take this and work out what the normalized version of the data is. Because one of the big problems with this is they're trying to use one view of the data, which is what they're stuck with Excel, to do many, many different things. So first of all, we need to pull this apart and create a normalized view of the data. And you can do this just kind of by talking to your business partners, but also by looking at the data itself. So for example here, you see this purchase order number is repeated on every row. And then here, we've got the date repeated in the same way. And obviously you need to confirm this with the people who enter the data, but you can kind of work out, well maybe this is a purchase order level piece of information. If you look at these three pieces of data though, on the same row, is they differ. So that's not a purchase order level piece of information. That is at some other level, what we're gonna call the byline, which represents the lowest possible level here. And then at a higher level, you have the vendor name being repeated over lots and lots of purchase orders. So that's obviously different from the purchase order. So just taking this very small example, what you can end up with is a structure like this. You have one vendor, which has multiple purchase orders, and each purchase order has multiple bylines. And so all these correspond to tables in your database and active record models. And that's how you can start constructing what your Rails app is actually gonna look like. In reality, of course, it's much more complicated than this, but it works, you can still build this out from the spreadsheet in the same way. There's a bunch of other stuff that we can get Rails to do for us and problems we can solve if we look back at this spreadsheet. In fact, one thing we like to do is go and watch our users using the spreadsheet and seeing what they're doing and identifying the times when we are most sad seeing them do what they're doing. Because sometimes they do like a, they build a pivot table and it takes like 17 different keystrokes, or they're just copying and pasting the same data again and again and again and you think, I need to solve this, otherwise I'll have a heart attack. But actually often that's really where you can add a lot of value in a short amount of time. So what can Rails do for us? You can ensure the uniqueness of a purchase order number. We know looking at this, but also talking to the buyers that purchase order numbers are always unique. Now imagine trying to enforce that across 10 different spreadsheets. That's like someone's whole job is just to go through spreadsheets to check no purchase order numbers been reused. Well, we can do that easily using Postgres and Rails validations. You can constrain values. So in this column, the value can either be rebuy, refresh, or new. It's obviously very easy, both in Postgres and Rails, to enforce those values and only those values go into that field. Everything, we can have a single source of truth where we have one vendor record and if you need to change the address of that vendor, you change it in one place and it populates throughout the rest of your system. I mean, all this stuff, obviously, is very obvious stuff that you know about Rails, but this is stuff that it's just incredibly difficult to do in Excel. You can derive values. Now, obviously, Excel has formulas and some of those are pretty powerful. But again, if you've got 10 spreadsheets and you need to change the definition, say, of this month field here, the ship month, which is derived from the ship dates, you have to change that in 10 different places and then also the formula that was in the spreadsheet from a year ago is no longer valid. And one, data types, sorry, I'm missing what's popping up. Excel kind of has a notion of data types in terms of how it formats things, but not really and it's certainly not when it comes to like Boolean values here. And here in these sheets, it's just a yes, no, a Y or an N. It doesn't mean anything to the sheet itself, but you can define a Boolean field and actually derive business value and logic from that field. And then finally, one of the more powerful things actually, especially when you're building of, you know, you're not just thinking on the database, but you're thinking about how you actually display this information is that you can combine visual information with semantic meaning. And what I mean by that is these rows have different colors. Now, the colors mean something to the buyers and they told me once what those colors mean, I have forgotten. And the only way I can find out is to go and ask them again. And they also have to, whereas if you do this in Rails, you can actually say on your view, you could say, let's say green means the vendor has been sent, the vendor has been sent this purchase order. You can actually encode that in your business logic and then just say for any row where this is true, show it as green. And so that business logic is encoded in the actual visual display. And it means it's consistent over every time you display that information and also any developer looking at it can understand why it's code green. So, some actual code, it's very exciting. What we actually need to do once we've created this normalized database is we need to be able to give our buyers the ability to search easily across this normalized database. And it's actually quite difficult to do that in Excel, especially if you have a very complex relational database and you also want to be able to search and aggregate over derived values. Let's say just that month value we had there, it wasn't a actual stored value, it was derived, it's really difficult to do a SQL query that will allow you to do that search in a simple and efficient way. So what we now need to do, and as I mentioned earlier, we go back to the original format as we denormalize all that data. In this case, we put it into Elasticsearch. Just a little example of here, we have a denormalizer class. We basically have a bunch of methods that are derived values or they're related information and we then add them all to a hash, we just flatten everything out and we load it into Elasticsearch. And that allows people then to search over those enormous number of rows. So this is what we end up with, this is our buy sheet. This is the tool we actually built for them. So you can see there's a lot of similarities with the Excel version. We've got columns with the data in, we've managed to add photos, it's obviously a nice thing to do, difficult to do in Excel, easy to do in Rails. But you know, it's essentially a tabular format. Of course, this isn't the whole view of the buy sheet. This is the whole view of the buy sheet. This is one team, this is the blouses team for one month. And you might look at this and think, well, why do you not have pagination? I know we have a code challenge that we send to people and if they don't paginate a long thing, we're like, come on, you should be paginating. This is bad. But in fact, it's interesting to watch with expert use systems, some of the UI rules that you are kind of fairly hard and fast don't apply to expert use systems. So in this case, the people using the system know what data is in there pretty well and they just use command F and search through in their browser to find the data they want and it's actually much more efficient than as there is a search tool that we built for them but they end up using, they get a huge chunk of data and then they just search through it or they scroll through and find the image that they want. But of course, this isn't really just one buy sheet. The whole system is just one buy sheet and our system allows us to sort of slice and dice it however we want. So if you do want to see like, instead of just one month or three month view, you can see that it's just a much longer list. There's a bunch of stuff that Rails allows you to do to solve these problems we're seeing in our Excel spreadsheet. So for example, you can actually have actions that are triggered from the view. So you can create a purchase order, you can duplicate the line, sort of stuff that you can't do in Excel. You can also combine your visual information with semantic meaning. So we have a flag here that says canceled. Now that actually means something to the system but we can visually display it so it's useful for the buyers or we can display a flag that means this is exclusive to Stitch Fix. Again, these colors will be used by the buyers inconsistently and the meaning would easily be lost. I mentioned before about only having one view and trying to do lots of things with one view. Well, that's not a problem with this system because once you have the data in, you can display it in as many different ways as you like. So for example, this is the same information but we've just blown out the picture. This is useful for a particular meeting the buyers have every month where they want to show their managers, this is what we're thinking of buying. Exactly the same information, very trivial to do in a Rails app. You can group things differently. It's very easy and not stuck with like the one kind of best dimension that you would use in an Excel spreadsheet. You can just change the grouping. So by default, our system group things by week but they've chosen to group it by color and then once you've done that, it's very easy to then just have a visual representation of the data that you're outputting. This is the different colors, the sort of how much of each color we're buying in a particular month and I'm sure it's obvious to all of you is it's to me this is for a winter month. One of the many interesting things I've learned working at a fashion company that I did not know before. People are sad in winter, they want gray clothes apparently and then we come to the knotty problem of horizontal scrolling. Now we obviously weren't gonna put horizontal scrolling in our system because that would be terrible and everyone would just feel weird. So, but we realize of course with Excel you need horizontal scrolling because you only have one layer. You have to just keep pushing things further and further out. So what we've done is we added a bunch of rollovers. Again, very simple, very well understood technology. So if I can just, oh, damn it. That's the button I meant to use. Here that's like these indicate different rollovers that show you different data. That's kind of useful but obviously with the Excel spreadsheet if you wanna look at one particular column all the way down the rollovers isn't very useful because you have to like mouse, mouse, mouse over. So we also allow people to pin certain rollovers across the whole set of columns. So if they wanna look at what is in the pricing information in the purple there they can just pin it and then you can use that to scroll down. So we have layers of columns instead of just going horizontal. We have that big summary tab in the Excel spreadsheet. Well what we did here instead was we had a summary tab up here on the same page, no clicking all over the place and when you request that it just loads through the magic of Ajax and it means you can just keep everything in the same place and people are having to jump all over the place all the time. So what we've ended up with is with this new app compared to Excel is we have a much greater consistency of process, a much greater consistency of data and we get scalability. As the company grows as it is growing a lot right now we can handle that now for as big as the company's gonna get because we have this system. There is one thing though, people are always gonna wanna have access to Excel and to be able to use Excel because there are things they're gonna wanna do that you haven't programmed into the system and the system is never gonna be as flexible as Excel. So the thing you always need to do in any system like this is you need to give them a download button so they can export to CSV and create another hideous Excel spreadsheet. Thank you very much listening. Yeah, so the question was what does our testing look like because we have all these columns to test. I mean, there's nothing particularly unusual about it. Testing with Elasticsearch is a little bit difficult to get set up, a little bit awkward because you have to create records in your database with the factory and then push them into Elasticsearch and then search and then work out what results you're getting back. But that really leads to slower tests rather than difficult tests to write. There's nothing particularly unusual about our testing. Yeah, that's a great question. So the question is, there are certain things in Excel that are easier than in Rails, for example, editing lots of rows of data at once. Well, I think if that, that's a question of asking the people doing it, like, how important is this to you? And if it is important, you can actually build that into your app. So we have a feature where you can select a large number of rows and then you can do multi-row editing at once on a particular piece of data. But I think, yeah, it's a question of how important it is to them, really. So the question is, did you find it made their lives easier? Yes, absolutely. Yes, I mean, I wasn't gonna say anything different was I, but no, I, yeah, they've been actually been able to grow a lot more since we gave them this tool without having to add a lot more people. And I think it goes back to also thinking where the person, the human beings actually add the value. Like what we're doing is taking away the things they were doing that they weren't really adding a lot of value. They were just doing like monkey work and now they can think about things. Things that the system are terrible at, like, is this skirt nicer than this skirt? Like that's their job, really, is to choose those sorts of things or to deal with a vendor problem, like a human relationship problem. And so they have a lot more time to spend on those things because they're not just like doing, you know, pivot tables all the time. Yeah, so the question was, was it hard to get people to adopt the system? So in one way, it's not hard because it's their job and, you know, the thing about internal software is there's no conversion problem or rather you solve it by giving people money and then they start using the software because that's part of their job. But there is a resistance from, a fear. It's more of a fear rather than, like a strong resistance because they just don't like it. And it was difficult and we would go through several iterations of, hey, we've got this version to use it and they'd be like, yeah, it's great, but it doesn't do this thing. And so we have to add another feature. So you build the whole thing and then you end up having like, you know, a few weeks or a month where you're like making sure, like for example, the stuff about multi-line editing, like if that's really important, we need to build that, otherwise they're not gonna feel like they're getting any benefit because the benefit to them, they have to take a while to learn the system so they don't feel the benefit until a little bit later on, it's not like an immediate thing. Yeah, so it does take some sort of massaging of like, you know, the fears, I think. The question is any favorite gems or libraries? I don't know really. I mean, I guess the Elasticsearch model one we use very heavily. We tend to create a lot of internal gems and stuff to manage a lot of these things. So I probably, there's a lot of like low level ones that you would use on any Rails app that we use, but there's nothing particular that stands out, I don't think. Yeah, I know, I'd ask, if my team would be able to answer that question a little bit better than me, I suspect. Yeah, so the question is, did you ever expend with any admin frameworks? When I first joined, actually we did have active admin, I think. Yeah, those things are not great because they give people too much power. Or if you don't want to have too much power, you spend a normal amount of time re-engineering everything to like limit them to doing certain things. And they're also just crud, they're just crud things. The system we built, it's not really a crud system, it does a lot more, and it's part of a process of like moving inventory through our system and stuff. So active admin, I think, has very, very limited. It's good if you just want to be able to have access to your data very early days, you know, and not allow people to go, just do, you go PSQL, you know, and actually just start putting in SQL strings. It's better than that. But no, we didn't really use that stuff. All right, thank you very much.