Welcome to Personalization for the Perplexed. Wow, I feel like I just did an NPR host voice. That was really weird. So, a quick show of hands before we get started. Who here has done personalization on a website? Who here has been told they need to do personalization on a website? Who here feels like the person who told them to do personalization on the website knew what they meant when they said you need to do personalization on the website? Okay, yeah, that sounds about right.
Hi, I'm Jeff Eaton. I work for a strategy, design, and development company called Lullabot. We do big content websites, primarily with Drupal. Most of the clients we work with are large publishing, education, or support orgs. The common pattern we tend to find is that they create and publish content online as a core part of their mission, not just as a marketing adjunct to the real stuff that they do. We do work with some marketing orgs, but that's the meat of what we do. So a lot of what I'm going to be talking about is shaped by that perspective. If you feel like I'm completely full of it, or you want to argue, feel free to take it to Twitter, where I have always loved being heckled.
So, before we start, with my little wiggling meta slide, I'm going to step back and give an idea of what I'm talking about when I talk about content personalization. This is a rough flowchart of the presentation you are about to witness. You are here. In a standard content publishing scenario, this might be where you want people to go through the site hierarchy or something like that. There may be goals that I have in giving this presentation. There's stuff you want out of it. But ultimately, because this is a fixed, linear time frame, not some weird quantum universe where everything branches out in a million directions, we have to choose one fixed set of stuff to go through for this talk.
Wouldn't it be awesome? Wouldn't it be fantastic if we could just have an infinite number of presentations all ready to go, and the conversation we're about to have together was just one of many paths? For each person in the room, I could branch off in a different direction and give something perfectly tailored to what they care about. The world would be wonderful and we'd be swimming in bags of money. Does anybody remember DuckTales, the cartoon? Yeah, that's Scrooge McDuck in his giant vault, surfing on money. And that's sort of the dream of content personalization: giant, huge bags of money, because you're able to customize everything you publish perfectly to the person who's reading it, and deliver just what they want and just what will incentivize them to write a giant check to your organization.
And that brings us to the gist of this personalization talk. Karen McGrane has written a lot on structured content, IA, and UX, and more recently about content personalization. One of the quotes from her that I love is that most content personalization talks are fan fiction for CMOs. They are basically about the fantasy a marketing executive has of how wonderful the world would be if we could perfectly tailor everything to everyone and then just ski down the resulting pile of money.
The problem is that personalization is really tangly and complicated, not just from a technical perspective but from an organizational perspective, because a lot of work goes into it. Not only are you generating a web page or creating a particular piece of content, you're managing all of the different permutations and all of the different bits that go into it, and keeping track of not only that content but also the scenarios in which you want to tailor things. It's complicated.
And a lot of the conversations we have focus not on the work, not on understanding the decisions you're making, but on the features of the products that will eventually get selected to generate the web page, or swap out a different CTA, or track the metrics, or whatever. So this is a talk that is not going to mention any particular products. It is not going to talk about technical implementation details at all. It's just a step back, and an attempt to frame what we have to figure out as an organization in order to do content personalization effectively.
The basic framework that I tend to use when talking to clients about this stuff has three parts. Signals are the individual little bits of data that we can capture and collect, the things that we know about what's going on when someone is interacting with our content. Scenarios are the stories that we come up with about what's going on based on those individual signals. Say somebody is physically located near an airport. I'm going to imagine that they're a traveler, so we're going to make content that's for travelers and deliver it to them. Now, the signal is actually relatively simple: they're near an airport. We're telling a story to ourselves about what that means. And then reactions, the third part: assuming that we're talking to a person who is a traveler, what do we want to do differently with our content in order to reach them? Those three building blocks are what we're going to be looking at.
The first one is signals. Again, that's basically what you know about the current situation in which someone is consuming your content. I tend to look at this in a couple of different ways. One is pure contextual stuff: if someone hits your website, or they're using an app that you feed content into, or something like that, it's the stuff that you know based on the devices they're using.
That could be their current geographical location, if they're on a cell phone or something like that. The time of day in their current time zone. If they're hitting a web application, you know what specific URL they requested, and sometimes what URL they came from in order to get there. So you can start piecing together certain stories purely based on context. But these things are very assumption-heavy because they're sort of free-floating. You don't really know too much. You just extrapolate: oh, it's 10 a.m., they're in Phoenix, they're looking at our support page. What can we guess about that? Still, some of these contextual signals are the raw building blocks that you can assemble into bigger assumptions later.
The other category of stuff that we tend to look at is behavior. What are they doing? This isn't assumptions about what we think they will do. It's what actions they are actually taking. It could be that we know the particular path they've taken through the website over the past five minutes. In the case of Amazon or whatever, it could be that we know the different products they've already looked at on this visit to the website, so we can start feeding that back into what the next round of products in the rotator will be for them. It's based on the behavior of the person in their current interaction with us.
The next step is user profiles, and larger archives of information we gather across visits and across interactions with them. The easiest way to do this, technically speaking, is the Netflix approach: tell us who you are when you log in. This is my wife, from my Netflix account. I say, I'm Jeff. You know what I've done previously. Since I'm telling you who I am, you can base what you present to me on what I've done before, what I've said I'm interested in, stuff like that.
Probably the easiest way to find out what people want is to explicitly ask them and then give it to them, and profiles are a convenient way of doing this. But depending on the situation you're in, a profile might be something you keep without explicitly setting up an account for them to log in to the website. It might be just an identifier that you use to say: we know this person has visited us before. We may not know their name, we may not know their email address, but we know the kinds of things they keep coming back for, so we'll keep track of that. It's that idea of an ongoing archive of what someone looks for and what they request.
And then there's a whole ecosystem of third-party data providers that can match you up with huge swaths of signal data about the people who are visiting your website or consuming your content. So if you want to, say, target content to people who are between 30 and 40, married, have two children, and live in Phoenix, and all you have is an email address, there are services that can get you there. But it is not necessarily a simple prospect, because there's a lot of data privacy stuff that goes along with that. If you do venture into the world of using big third-party archives of personal information about people, tread carefully. Contact your lawyer. Make sure that you're doing it in a way that won't get you, say, a multi-million dollar lawsuit from someone in Europe, because that is entirely possible.
Moving on. Assume that you have some pile of data about what's going on: who you're talking to, what kind of things they're doing, where they are, what time it is, what computer they're using, stuff like that. Those are the raw signals.
The next step is translating those into meaningful context: the scenarios that you imagine someone is in, the stories that you're going to use when talking to your marketing team about, okay, when would we want to say something different? Those are the scenarios. I like to think of it like this: here's all of the raw signal data. There's color, there's shape, there's flavor, there's size, there's carrots, and, wow, I'm blanking on a lot of vegetable names right now staring at that screen. The idea is that this is your raw signal data, and scenarios are when you start plucking out the specific information that is going to matter to you for making decisions.
It could be specific signals that you know are going to be meaningful. Say you're a business with a street address and you want to tell people whether you're open right now or not: between 10 a.m. and 12 a.m., we want specific information to be visible on the site. That would be an example of plucking out one very specific signal and keying off of it.
There are lots of other ways to slice and dice that too. There's building out personas: what kind of person do we think this is, based on the signals we have about them, so we can show them different stuff? There's anticipated tasks: based on what they've done, who they are, and what we know about them, we imagine that this is what they'll be looking for next, so we can deliver that to them. There's behavioral context: this person is currently away from home, or we think they're studying something right now rather than just browsing for entertainment, or we think they're hangry. There was a fascinating personalization example a while back where a company discovered that a lot of the sales on their website came from people stuck in bumper-to-bumper traffic, so they ended up deciding that people who are stuck in traffic and hangry are a scenario they would deliver custom content for.
Not necessarily for everybody, but it works. Demographic categories are also popular to talk about, but historically we've found that they are a lot less effective, because demographic categories tend to be a giant sink of personal biases and assumptions more than useful, actionable data. We'll get to that in a little bit. But again: is this a Gen Xer? Are they a parent? Do we think that this person lives in a large city or in a tiny suburb? Those are things that often get talked about in the context of personalization, and sometimes they can be useful.
One of the examples that we like to call on is Angie's List. If anybody's ever used it, it's basically a giant directory of reviews of local services like plumbers, doctors, stuff like that. We worked with them when they were tailoring and personalizing their initial sign-up page. They get millions of visits on it, because their sign-up page is basically a geotargeted list of plumbers in your city or something like that. What they found, though, was that depending on the service somebody was searching for and where they were, every attempt to tailor the CTA language had a cost: if they were able to increase sign-ups for one group of people they cared about, it always seemed to make sign-up click-through worse for other groups. They were in weird trade-off zones, and they didn't know where they could get good signal data to really start splitting things off and personalizing this page in more detail.
What we worked with them on was developing a set of scenarios rather than demographic profiles. For different types of products, we worked with them to determine whether those were buy scenarios, browse-around scenarios, or "I am currently panicking, just get me a service provider so my house doesn't blow up, because I think I smell gas" scenarios.
Those levels of scenarios were what they ended up personalizing for, and they associated them not with weird external signal data but with each one of the provider categories. So they weren't relying on exotic, "we've secretly peered into the lives of these visitors to determine what they care about" kind of signal data. They just asked: what page have they landed on? What were they searching for? They searched for LASIK eye surgery? Well, we'll give them a list of doctors. They're searching for kitchen cabinets? Let's show them lots of photo galleries, because that's more of a default browsing scenario. And if they're searching for gas leak, let's just give them an 800 number real fast and get them hooked up with somebody. That provided a useful building block for future personalization without risking false-positive matches, the kind that would put bad, less useful, or less meaningful information in front of somebody because of bad assumptions based on light signal data.
So the third part, which is what we were already starting to edge into with that example, is reactions. Given a giant pile of data, and given the stories we're telling ourselves about what that data means, like "this person is a student who's trying to learn more" or "this is a busy parent who's looking for a product to help them," the reactions are basically: given those stories, what are we going to do differently in our content based on those assumptions?
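To make the signal, scenario, reaction chain concrete, here is a minimal sketch of the Angie's List approach as described above. It is an illustration only, not their actual implementation: the category strings, the `Scenario` names, the block names, and both lookup tables are hypothetical, and the only signal used is the provider category the visitor searched for or landed on.

```python
from enum import Enum


class Scenario(Enum):
    """The three stories from the talk (names are illustrative)."""
    BUY = "buy"        # visitor is ready to pick a provider
    BROWSE = "browse"  # visitor is looking around for ideas
    URGENT = "urgent"  # visitor needs help right now


# Hypothetical editorial mapping: each provider category is tagged with a
# scenario up front, so no external data about the visitor is needed.
CATEGORY_SCENARIOS = {
    "lasik eye surgery": Scenario.BUY,
    "kitchen cabinets": Scenario.BROWSE,
    "gas leak": Scenario.URGENT,
}

# Hypothetical reactions: which block leads the page for each scenario.
LEAD_BLOCK = {
    Scenario.BUY: "provider_list",
    Scenario.BROWSE: "photo_gallery",
    Scenario.URGENT: "phone_number",
}

# Assumption: untagged categories fall back to browsing, since the talk
# calls browsing "more of a default" scenario.
DEFAULT_SCENARIO = Scenario.BROWSE


def scenario_for(category: str) -> Scenario:
    """Signal -> scenario: look up the story attached to the category."""
    return CATEGORY_SCENARIOS.get(category.strip().lower(), DEFAULT_SCENARIO)


def lead_block_for(category: str) -> str:
    """Scenario -> reaction: decide what the visitor sees first."""
    return LEAD_BLOCK[scenario_for(category)]
```

Calling `lead_block_for("gas leak")` returns `"phone_number"`, while a category nobody has tagged falls back to the browsing layout. Keeping the mapping in an editorial lookup table, rather than inferring it from visitor data, is what keeps a bad guess about the person from putting the wrong content in front of them.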
The easiest reaction, one that everybody's probably familiar with, is content recommendation. Every news website has some sort of "what to read next" rotator going on, and those things can be tailored to personal information about you, or to behavioral stuff about what you've read before. Things like "what are similar articles to the one you're currently reading" may not require a lot of signal data; oftentimes you're just basing it on what they're reading right now. But that's still a way of providing a personalized path through the totality of your pool of content.
Incentivization is another common one. I probably won't dwell on this too much, but it's the idea of identifying key scenarios where you want to give people a discount or show a different call to action to try to induce them to sign up, or buy, or call for more details, or something like that.
Prioritization is another big one. That's, again, the Angie's List example we were talking about. It didn't change what was on a page based on personalization. It just changed the first thing people would see. We reordered the page based on what we thought people would find most useful.
And then dynamic assembly is taking small chunks of content that you have throughout your CMS or your content model and building out a custom page, or even just portions of the current page, tailored to the scenario you think you're in. This is probably where a lot of the energy in personalization projects goes, because the idea of building out a custom landing page or a custom product information page based on the user scenario you think you've identified is the Scrooge McDuck dream: being able to tailor your message perfectly.
One of the projects we worked on a while back was an HR intranet for a Fortune 50 company that had 50,000-plus pages to document all of their HR, insurance, the whole kit and caboodle. And it was awesome. They wanted to reduce it to 50 pages instead of 50,000, purely
based on heavy personalization. Everyone was on their intranet, and they had full employee information, so they could generate something perfectly tailored from all of their HR information, their insurance, the whole deal. Perfect. But what they ended up finding was that although the technology to do that existed and they could build it out, it was still a huge editorial lift to write all of those components and accurately assemble them. They also ran into a really troubling scenario. Call center support people had to be able to know exactly what page someone was looking at if they called in and asked for help understanding it. That meant call center people needed to know everything from what insurance someone had signed up for, to what their benefits were, to what the gender of their spouse was, because of spousal insurance coverage. And that collided with an interesting problem too. (I'm almost done.) They needed to be able to get all that information, and they did business in countries where someone could actually be arrested and executed for having a homosexual marriage. At what point was their goal of making a fabulous and wonderful insurance coverage experience potentially setting up a scenario in which they could harm the very employees they wanted to help? There was no easy answer to that.
The final piece of this puzzle is goals and metrics. That often gets reduced to very simple analytics stuff, but before you actually embark on doing personalization, you have to think through what you want to change based on the personalization work that you're doing. Then ask: are we actually measuring this right now? Someone will tell you, yes, we absolutely are. And then you need to find that person and ask: no, really, are we measuring this? It could be that we want to increase sales. It could be that we want to reduce the number of people who call into our call center with questions. But one of the biggest challenges is something that's called, if anybody's
familiar with cognitive biases and stuff like that, the availability heuristic. As human beings, we're hardwired to assume that information we have at our disposal is highly relevant, and that information that's off somewhere, that we don't necessarily know or have access to, is less relevant than what we have in hand. That's why you get people measuring things like bounce rate on sites where it really doesn't matter how fast someone leaves if the goal was accomplished. The availability heuristic bites us really badly sometimes. I think Jared Spool has summed it up as: if changing something on your website boosted sales by 10% but doubled your bounce rate, would you do it? A reasonable person should say yes, but oftentimes we feel really twitchy about saying yes, because we're trained to think that those easy-to-measure analytics numbers have to keep going up, even if they don't necessarily have any direct correlation to the thing we really care about, like, say, more sales or more customer satisfaction.
We actually recently worked with the state of Georgia on a project where everything they could do to increase the value of the website to residents of the state actually hurt the easily measurable metrics, because people wouldn't come back to the website and wouldn't stay as long. All of the easy-to-measure numbers were saying your website's doing worse, but that was the good outcome: getting people to public services faster and more easily. So one of the things we settled on, although it wasn't perfect, was a decrease in the number of call center calls for topics they were moving over to the new web system. That was something they could look at as an indicator that the website was doing its job better for those topics. And it took time to figure out that that was something to look for, because it doesn't appear in Google Analytics.
So that's the idea: the signals, the raw data that you look at; the
scenarios that you construct out of that raw data and that you want to act on; the reactions you have to those scenarios, what you're actually changing about what you deliver; and then the goals and metrics you use as part of a feedback loop to determine whether you're looking at the right signals, whether the stories you're telling yourself are actually true, and whether the reactions you have to those scenarios are actually worthwhile and accomplishing what you care about.
I'll do this real quick. Some of the places where it goes wrong. First, if you don't have good structured content and good metadata, it's impossible to actually drive meaningful content assembly from this data. You're just sort of throwing giant blobs of text at it, so investing in that stuff early helps. Unreliable signals: making assumptions based on signals that you won't necessarily be able to count on, or that may give you bad information, like assuming that someone who is geographically next to an airport is a traveler. That could just mean that they're staying at a hotel near the airport, or that they have an apartment near O'Hare. There are lots of things that you have to assume, but putting too much weight on an assumption can be dangerous. Not having a plan, or looking at the wrong metrics: also bad.
And making too many just-so assumptions: creating scenarios that are so finely tailored that they are basically that fan-fiction-for-your-CMO kind of stuff. "I'm imagining that the CEO of a company is coming to our website on a Thursday, really looking for this particular piece of information, and we can deliver it up." Building around those imaginary just-so scenarios is almost always deeply frustrating, because you rarely have enough detail to perfectly match them, and nine times out of ten they're a very, very narrow slice of the actual traffic you're getting.
Reproducibility: that was the problem the HR intranet had, figuring out how to duplicate what a given user was seeing so
that we could judge whether or not it's happening correctly. Oftentimes very complex personalization systems can produce one-off content, and you're not necessarily sure what scenarios drove the production of it, so making sure that you plan for reproducibility matters too. Editorial overload: how much work are editors going to have to put into managing all of these different variations? And creepy messaging, that's also a big one. If you do know a lot about a person, it's very easy to generate messaging that just creeps them out. That's bad.
And finally, very, very quick, there are some ways that it can go really, really, really bad. One is bias amplification: the scenarios, the stories that we tell ourselves based on the data, just serve as ways to reinforce our biases about our customers or about the kinds of people who need our information. Another is illegal discrimination. If you're, say, selling job ads or something like that, it's very easy to get into scenarios where you give people tools to do things that are literally illegal and bad for your customers and for society. It's also very easy to build tools that can be co-opted by bad actors. If you're building, let's say, content publishing tools in which what appears on your front page is driven by people's interests and by what's most popular, that becomes something bad actors can game in order to control what is on your front page for their own purposes.
And literally genocide. No shit, it is actually possible to do genocide if you do your job well enough. About two years ago, Facebook was working on algorithm changes to prioritize personal interactions in your news feed instead of just news stories and stuff like that. What they did not anticipate was that they had created a system that was easy for people to game by creating thousands of personal accounts and using them to publish propaganda about Muslim minorities in Myanmar. They gave the government tools to start a propaganda campaign and a campaign of genocide
that resulted in three quarters of a million people becoming refugees and 25,000 people being killed. That was not part of the game plan for Facebook when it was juicing the algorithm about whether personal interactions or news stories appeared in the news feed, but it was the result. We're building systems that identify, monitor, and reward people for behaviors that we care about, and increasingly we're offloading the big decisions to algorithms that we have trained. It's very easy to build bad feedback mechanisms if we're not careful. Now, 99% of what we're doing is not going to result in that. Most of us are basically trying to swap out a different CTA block on our landing page, and that's very unlikely to result in genocide. But the better we get at it, the more we have to consider what these ripple effects are.
Yeah, I'm out of time, so I've got to blow through some of this. Some of the things we can do to help minimize harm: making sure we integrate feedback and perspectives from marginalized voices, who are usually the first to be harmed by bad systems that we build. Making sure that somebody is doing red team exercises, looking at the stuff that we build and thinking, how could I abuse this, how could I do something bad with this, and trying to fix those problems early. Studying the systems and the scenarios in which you exist, not just the tool you're trying to build. And owning the choices that you're making: if you see something that you think could go wrong or be abused, tell the people you work with, tell leadership, and say, I think we could be doing something damaging here.
Some great books to read are Technically Wrong by Sara Wachter-Boettcher, Algorithms of Oppression, and Living in Information: Responsible Design for Digital Places. Those are really great ways to help broaden your horizons about the kinds of ripple effects that can come out of this.
I will now finish. I appreciate you sitting here for this as I talk very, very quickly. Have fun, knock yourselves out, and remember to think through your signals, what your scenarios are, and what kind of responses you care about before somebody pitches you a product that may or may not be able to accommodate what you need. Thank you.