Okay. So how many of you would call yourselves search engine experts? Okay, that's good. I'm talking to the right crowd. In this talk, I'm not going to give you any silver bullets at all because I basically don't have any. Search engine optimization is black magic, and I'm going to talk a lot about my mistakes, probably a little bit more frankly than I should. I do promise that you'll have a lot of fun, and that I'll help you see the mistakes that I made, so maybe you won't make them yourself. But first, let's talk a little bit about search engine 101. So when you type something into Google, when you're wondering what's going to come out on top, there's a simple formula and that's importance times relevance. And importance is all about something called PageRank. And PageRank is not named after the web page, it's named after a guy named Larry Page, who built an algorithm that Google now uses. And it basically means it's going to take the number of links from other important sites, and it's going to pass a share of that PageRank to you. And pages with higher PageRank are going to, all other things being equal, appear higher. You'll hear this talked about as PR. If you've got the Google Toolbar, there's a little green bar in the middle of the toolbar, and if it's all white, that means you don't have any PageRank at all. But that's the importance, and we're not going to talk much about that in this talk, because this is primarily a public relations thing, right? If you get a couple of links from the New York Times, on the front page of the New York Times, then you're probably good, or you can get like 150,000 PageRank-one links. What we're really going to focus on is relevance, and that's actually what's on the page, and what that's going to do to your search engine optimization. So this is really the framework side, the other side is really the public relations side, and there's a little bit of crossover, but that's basically how it goes.
How many of you were in my talk last year? Okay, great, so this is a lemming, and we talked a lot about the idea that some frameworks develop these fanatical followers. You know, we just got a dog, or another example is there's a guy up on our street that will walk his three dogs and a goat, and all the neighborhood kids would feed the goat anything, right? "Dad, this goat will eat anything," and then I'll ask, yes, but should that goat be eating anything? And I mean, you've known Java developers like that, that would try anything, you know, from XML configuration to JSPs to EJBs, and we're much better than that, right? Well, no, we've got our fanatical followers too. In fact, any commercially successful framework has to have a core of leaders, and has to have that broad base of fanatical followers. That's what makes the ecosystem work. That's how we can get jobs that pay, that's how we can have a critical mass to have a conference, and some of those things that we follow make a lot of sense, and some of them don't. I heard that Dave talked a lot about that this morning. But some of these things are going to have a profound impact on places where, you know, the framework doesn't express a strong opinion. Let me give you an example, and everybody's gonna know it when they see it. Yeah, this is actual scaffolding, and it's actually being used from what I understand. And if you've seen my talks before, you know that I think that scaffolding is of critical importance in the Rails community, because that's often the first example of Rails application code that anybody ever sees. And, you know, other frameworks have their examples too. Java had what? Anybody know? Yeah, Pet Store, right? We've got our scaffolding. And it's going to have a pretty dramatic effect on your search engine optimization. In fact, in the first couple of years of Rails, this is what your URLs looked like, right? show/9923, that's your dog's name to the search engine.
And now it looks much better, you know, with the RESTful routes and everything. But if you think about it, this ID that's on the end of all of our URLs, it's baked into the default migrations, it's baked into the default routes, and even the redirects, which means that every link in the system that uses those routes by default has this behavior. And I know that a lot of Rails people, myself included, and you know, as recently as a couple of years ago, believed that if you build this kick-ass application, then smart people will just work hand over fist to find your application and use it. But that's not the typical user. This is the typical user. This is either my mom the driver or some random picture from the internet, you know, you can decide which one. And my mom is an interesting driver because she's had pretty bad neck problems, so she's got this massive mirror in the middle of the car, so that she can see what's going on behind her. She's got a mirror that's almost as big, oh gosh, it's like driving with a billboard beside you, out the other side, so that she can tell what's going on behind her. But if you look to the right, there's no mirror at all. That's because she can't turn her neck to the right. She has to turn her whole body, so she tries to devise a route from place to place where she can only make left-hand turns. The thing is that my mom doesn't see herself like this, she sees herself like this. There's this time when she got her little Corolla bumped by a beer truck, and she got mad, and she just tore off after this beer truck, after he left the scene of the accident. So he's whipping through Memphis, she's whipping through Memphis right on his tail, and she kinda corners him at the 7-Eleven where he's dropping off his beer. He probably just didn't know she was back there, and then she got in his face, and this big burly guy was so flabbergasted that he pulled out his driver's license, registration, and insurance.
So she doesn't see herself as someone with limitations, but she doesn't always see all the signs, and she doesn't always make the prudent choice. So to bring this back to an SEO talk, you need lots of obvious signs. So when I'm building a page, when I wanna crank up relevance, I use the tools that are most important to me. At the top, the URL is probably the most important thing you could do. That should have the keyword that you're linking to, and then the page title should have it. You should have a single H1 tag with the words that you're trying to crank up the relevance for, and then there are a couple other things. But this is the way that you ought to view the Googlebot. You know, it's my mom the driver, not some super intelligent thing that's going to divine the purpose of your site. You shouldn't view your users as people that are going to go out of their way to find your blog and read about what you're posting, and then go find your site. You need lots of just brain-dead, obvious signs. Usually the more the better, but if you're trying to game the search engine, you're gonna get yourself in trouble. But going back to the default Rails implementation, we've all got this map.resources, and for us we've got a plant application. It's a gardening application. And then, somewhere in the views, there's an H2 tag that links to the plant. I've got the title, which is good, right? That shows the relevance, but I completely missed the boat in terms of the URL that I'm pointing to. So another thing that I could do here is I can drop in a title tag and that can increase my relevance. But what I really need is a smarter URL, and there are a couple of ways to solve this problem that are interesting to me. One way is the idea that a lot of blogs use, especially in the Rails space. They use something called a slug or a permalink, which is nothing more than a smart title that's only created once.
The first time that you create the object, maybe it'll grab it from the title of the blog and it'll drop that slug in there. But the problem with this kind of approach is that you have to let go of some of the conventions of Rails, and every time that you let go of the conventions, it takes you a little bit closer to hell, doesn't it? There's also a problem with this approach in that I might misspell the name of my dog Spot, right? Instead of that P-O, I might accidentally type L-U, right? "Here, Slut! Here, Slut!" And I might not notice that until I see that, gosh, there's this PageRank-seven page of my dog Slut. And so when it's time to change, when it's time to improve the SEO, or when I notice a mistake, I can't have a commercial site with a dog named Slut, so I need to correct that permalink. What happens to my PageRank? Well, I'll lose it all. So here, I don't really have enough information to maintain all of those incoming links. So one of the things that I can do is I can use the fuller path and actually have some redundancy. And in this case, I think the redundancy is okay. So I have the ID and then I tack on a title on the end of that. And so I actually do the lookup based on the ID and the title is for Google. And that's not a bad solution and I've actually seen this applied a couple of times. The solution that I like is to actually change the way that the object itself is parameterized. Because a Rails URL, by default, is what you see on the top. It's the site name plus the pluralized form of the controller; for my application, it's plants. And then there's the object and then a to_param call to that object. That to_param call, by default, returns an ID. Well, Obie Fernandez posted a really cool blog post on one way to handle this problem where you override that to_param and then you have the ID, which is an integer, and then a hyphen, and then whatever you wanna put over there.
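To make the to_param idea concrete, here is a minimal sketch in plain Ruby. It is not the code from the talk or from the blog post it mentions: the Struct stands in for an Active Record model, and the names SluggedPlant and common_name are assumptions chosen for illustration. It relies on the fact that String#to_i reads the leading digits and stops at the first non-digit.

```ruby
# Plain-Ruby stand-in for an Active Record model (hypothetical name);
# only id and common_name matter for this sketch.
SluggedPlant = Struct.new(:id, :common_name) do
  # Rails calls to_param when it builds a URL for a record.
  # The default returns just the id; this version appends a readable slug.
  def to_param
    slug = common_name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
    slug.empty? ? id.to_s : "#{id}-#{slug}"
  end
end

plant = SluggedPlant.new(42, "Giant Sequoia")
puts "/plants/#{plant.to_param}"

# String#to_i keeps the leading integer, so the id can be recovered
# from the parameter no matter what the slug says.
puts plant.to_param.to_i
```

Because the lookup only ever uses the integer prefix, the text after the hyphen can change without breaking old links.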
In this case, I'll put the common name of my plant. In other places, I might use the common name or the botanical name. And the cool thing about this is, what happens when you call to_i on this thing? Well, you're just gonna get the ID, right? And that happens, by default, on the finders because we're coming from a string anyway. So this is going to work, by default, with just about everything you do except for find_by_id, right? If I'm doing something like find_by_id, I have to tack the to_i on the end. So this gets you most of the way there from a URL perspective. Now let's switch gears to the page title, not because it's a hard problem to solve, but because I think that this is an interesting problem in that we can become too dogmatic and we can overthink this. So what's interesting is that a page title really looks and smells like a view type problem, but a lot of us would prefer to solve this problem in the controller. And the reason is that there's this rule that's been kind of hammered into our heads that you set instance variables in the controller for the consumption of the view. To me, that doesn't smell quite right, so I kind of break that rule. So in the layout itself, which is really the problem, right? You wanna set this data in the metadata which appears in the layout and not the typical body of the view. So in the layout, I just grab the instance variable and if nothing's set there, then I just tack on Dig the Dirt, which is the name of the site. In the view, I just call a helper method called page_title and I can pass into that whatever I want to. And then the helper itself, that's where I set an instance variable. Again, if you're thinking about the standard Rails conventions, this smells kind of wrong, but if you look at the usage pattern in the view, well, it looks pretty cool.
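The page title arrangement just described can be sketched without Rails. This is an illustration under assumptions, not the speaker's code: ViewContext is a hypothetical stand-in for the object that Rails views, helpers, and layouts share, and treating the site name as a fallback (rather than a suffix) is one reading of the description.

```ruby
# Hypothetical stand-in for a Rails view context, where the helper,
# the view, and the layout all see the same instance variables.
class ViewContext
  SITE_NAME = "Dig the Dirt"

  # Helper called from the view; it stashes the title for the layout.
  def page_title(title)
    @page_title = title
  end

  # What the layout would place inside <title>: the title set by the
  # view if there is one, otherwise the site name.
  def layout_title
    @page_title || SITE_NAME
  end
end

page = ViewContext.new
page.page_title("Giant Sequoia")           # what the view does
puts "<title>#{page.layout_title}</title>"  # what the layout renders

# A view that never calls the helper falls back to the site name.
puts "<title>#{ViewContext.new.layout_title}</title>"
```

The view stays a one-liner, and the layout never needs to know which view rendered.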
So the last thing I wanna talk about here is the idea that images are a very rich source for SEO because you can really crank up your keyword density this way. And so this is a good example of an early application where I really wasn't paying attention to SEO at all. I have the picture model ID and then I basically replaced the inbound user file name, which would have been richer for SEO by the way, right? But in order to keep converting that and to keep it normalized, I replaced it with user underscore and then whichever thumbnail size gets passed, and then the Rails image tag picks that up as the alt text by default. What I should be doing is using everything that I can to load that image tag and point it to something important. So the alt attribute should actually be the person's name, and actually it'll probably be better for me to preserve the user's file name, and I can get some keyword richness that I hadn't really anticipated before. For the src attribute, if I'm using a smart file name, that's gonna work in my favor. I can pick up a title attribute and I can make that go away with CSS if I want to, but that's also going to add to the keyword density. And if I want to, I can multiply the effects by adding a link to that, and I get the src attribute and the alt attribute, or the title attribute on the link as well. So images can be an incredibly effective kicker for your SEO. And this next one is probably a mistake that I made way more often than I should have. I would use Ajax pretty liberally in our user interface and I would completely lop off huge pieces of my application that Google could no longer find. So there are two problems with coding a style of Ajax that doesn't degrade gracefully. The first one is reachability. If Google can't see it, well, you've got a picture of a tree falling in a forest with nobody hearing it. You could have a beautiful application with nobody to use it. But the other problem is a little bit more subtle.
If I use a lot of Ajax, it's not always easy for a user to build a link, and that link is how you're gonna get your PageRank built. So you have to do things explicitly to let users build links to specific content on your site. So there are a couple of solutions that I wanna talk about before we move on to faceted nav. First thing is a sitemap. With heavy Ajax applications, a sitemap really becomes a must. And really it's just an XML file called sitemap.xml. There's a very standard template for it; Google has it and all the major search engines do. And essentially all you're doing is building an XML template. You dump in the URLs, priorities for those URLs, and timestamps of when they last changed. And as Rails developers, most of us have that information captured in the updated_at attribute anyway. So at least Google can see everything, even though this approach won't pass the PageRank, and that's pretty important, right? So if you want people to be able to link to you, look to the Google Maps API. If you look all the way into the upper right, there's this button called link. And if you click it, this provides a non-Ajax way to reach this precise page. And that's pretty cool. So this is a feature that's great for search engine optimization and great for users at the same time, and that's pretty much ideal. And gosh, this is what the lake looks like. What, that'd be about a year ago now, wouldn't it? That's Lake Travis and the Oasis is like right over here and that's home. Okay, so the last thing that you can think about is building two versions of the content. One for your users and one for Google. Now, at first blush, this seems like a very radical thing to do, but this is precisely what a lot of us are doing as we've gone to the jQuery API or the more modern implementations of the Prototype API. We start with a hard link and then we code the regular implementation and then we layer on the Ajax so that this thing degrades gracefully.
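For the sitemap mentioned a moment ago, a rough sketch of the generation step might look like the following. The element names (urlset, url, loc, lastmod, priority) and the namespace come from the public Sitemaps protocol; everything else is assumed for illustration, including the SitemapPage struct, the example.com URL, and the priority value. In a Rails application the timestamp would presumably come from each record's updated_at.

```ruby
require "time" # for Time#iso8601

# Hypothetical record: a URL, its last modification time, and a priority.
SitemapPage = Struct.new(:url, :updated_at, :priority)

XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Escape the characters that would break the XML inside <loc>.
def xml_escape(text)
  text.gsub(/[&<>]/, XML_ESCAPES)
end

# Build the sitemap.xml body: one <url> entry per page.
def sitemap_xml(pages)
  entries = pages.map do |page|
    <<~XML
      <url>
        <loc>#{xml_escape(page.url)}</loc>
        <lastmod>#{page.updated_at.utc.iso8601}</lastmod>
        <priority>#{format('%.1f', page.priority)}</priority>
      </url>
    XML
  end
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    #{entries.join.chomp}
    </urlset>
  XML
end

pages = [SitemapPage.new("http://example.com/plants/42-giant-sequoia", Time.utc(2009, 5, 1), 0.8)]
puts sitemap_xml(pages)
```

As the talk notes, this makes the pages visible to the crawler but does not by itself pass any PageRank to them.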
That's two versions of the same content. Now, what we did for Dig the Dirt, since we really didn't know how Google was going to handle our faceted nav and we really wanted to be able to tweak it and see where Google would take it, we had a link at the bottom of the page that said JavaScript-free version. And really what I was saying was, Google, click me please, right? In our faceted navigation, Google could kind of wind through that and we could see what Google was doing with it, and so I left my Ajax implementation as something that was not accessible. And we'll talk about how this played out. Okay, so for this first section, what you wanna remember is when you think relevance, think smart URLs, page titles, and a single H1, and make smart use of your pictures and links, and be smart with your Ajax. Okay, are you guys just dying to know what this bullet's gonna be about? Racist navigation? So, has anybody heard the story, by the way? Anybody guess what it's gonna be about? Okay, that's good. That's good, it's kind of an obscure one. But faceted navigation, so how many of you have seen faceted navigation? Probably a lot of you, huh? Ah, pretty new idea, good. So faceted navigation is basically about letting your user query for data in multiple dimensions. If you've bought a Nike shoe, then you've used faceted navigation. So it would ask you which shoe size you want, you could click on what sport you're playing, and you can do this in any order. So, I'm gonna shift gears a little bit and show you what we're doing on Dig the Dirt. Oh, it's there. I am going to turn off my mirroring and I'm gonna blow my own mind. Okay. And so this is my faceted navigation here. The idea is that there are a whole lot of plant databases out there but it's hard for people to find exactly what they're looking for. I might wanna say, and I'd appreciate it if you don't go click this because you will melt my server. I have one tiny little slice, right?
You can melt my server after this part of the talk. So I'm gonna click on the hardiness zone, and we're in 8B here in Austin. And maybe since, you know, the deer are pretty thirsty. Oh, somebody's not following that. Okay, so since we're in Austin and the deer are pretty thirsty, maybe I want something that's deer resistant and maybe I wanna look for things that are blooming red and yellow. But the idea is, you know, as I click through this, these counts are changing. And so my search results are being narrowed as I go. We found that our facet menu was complex enough that if too much was open at one time, it confused the user. So when one menu opens, we just close the other ones automatically. And, you know, so maybe I want something full sun. And I can also remove different elements of my search at any time. They all appear up here. So you know, maybe I'm not so interested in the deer resistant stuff because as we know in Austin, deer resistance is really a myth, right? But anyway, the idea is that the user has multiple facets. They can drill down to any facet in any order and you limit the choices as they go. Normally what you'd see with a simpler facet search is that in this list, things would start to disappear. But in our particular implementation, it doesn't make as much sense. We do have some of the things down here disappear. Like if I click this and then I go to, you know, plant characteristics, certain things wouldn't show up because there aren't any plants in those categories anymore, okay? So that's faceted navigation and now you get to find out why it's racist. Okay, there's my new book. Okay, so I'm not seeing what you're seeing. So I've got to look over my shoulder, grab this guy, pull him down and now we can restart. Sometimes my technology gets me. Okay, so there's a company called Pumpkin Labs that was asked to build a faceted navigation site for great search engine performance for a furniture store, a small local store.
They basically had a good amount of money to dedicate to marketing and they decided to spend it in this way. So they wanted a system that would let you pick all the different dimensions of furniture. Like you might be able to pick your color, you might be able to pick your size, the type of furniture, the typical user. And the idea is that as Google crawls this and starts to attach keywords to things, you can update your URL, you can update your H1 and your page title, and you can come up with some pretty interesting combinations. Some of which you might think about in advance, like a leather sofa or a teen dresser that's red. Some of those you might never conceive, like a black baby bed and a white baby bed. So these started to show up on Google, and everybody started calling the CEO of Pumpkin Labs and saying, you have built me a racist website. You know, when that was never the intention at all. But the point of this part of the talk is that this kind of stuff with a simple facet grouping is incredible for Google because you start to get these interesting combinations that you didn't think of when you initially developed the site. Now, so this is what our implementation looks like. The hardest part about building this system is that some of the implementation is backed by Active Record and some of the implementation is backed by the session. Now, we didn't wanna go all the way out to Active Record every time somebody expanded an individual piece of the facet, that would just be too expensive, and we couldn't afford to put everything in the session because it basically wouldn't fit. So, we settled on an Active Record model for the individual plant. So anyway, one of the implementations that's very common with faceted navigation is to use an indexed search like Solr or Ferret, or the hot one now is Sphinx, right? And the Rails plug-in, Thinking Sphinx.
But I knew that I wanted my plants to be cross-linked from a variety of different places, so the tagging model was very important to me. So I built a model where every attribute on the plant was represented as a tag. So something might have a tag of full sun with the context of sun exposure. So now when it's time to write an article that's about full sun plants, I can go ahead and grab a chunk of plants, a paginated list of plants, for all that rich cross-linking that all the search engines are gonna love. Okay, so that's my model. It's a plant with a whole lot of tags. The plants are very skinny. The tags are very skinny. But when I'm building that faceted navigation, when I'm counting things, I can do so very quickly. And the admin, so rather than hard-code the list of facets on the left-hand side, I let my admins build it. And this looks exactly like you'd expect it to. On the left-hand side, there's the name of this thing. Like maybe it's soil, which has soil pH and soil composition and things like that underneath it. So it's really a grouping, right? And then there's soil pH, which is a multi-value field that has values of acidic, neutral and alkaline. And then there's edible, which is just a boolean which has values of yes or no. And so there's an Active Record model with all the different kinds of facets. And I do this with single-table inheritance and it seems to work pretty well. Now the facet implementation is a little bit trickier. Where does this kind of code live? Well, the logical place for user interface code to live is either in the view or in the helper. But you don't wanna drop this much code in the helper. So what I actually did is I built an object model for this that's actually in the helper directory, that's consumed by a very thin wrapper in the helper. And that method basically says render the facet tree.
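The tag-per-attribute model and the changing counts can be illustrated with a small in-memory sketch. This is not the Dig the Dirt implementation, which the talk says is backed by Active Record and the session. FacetTag, TaggedPlant, narrow, and facet_counts are hypothetical names, and the sample plants are made-up data.

```ruby
# Hypothetical in-memory stand-ins for the skinny plant and tag records.
FacetTag    = Struct.new(:context, :name)
TaggedPlant = Struct.new(:name, :tags)

# Narrowing step: keep only the plants that carry every selected tag.
def narrow(plants, selected)
  plants.select { |plant| (selected - plant.tags).empty? }
end

# Counting step: for each context, how many of the remaining plants
# carry each tag value. These are the numbers shown beside each facet.
def facet_counts(plants)
  counts = Hash.new { |hash, context| hash[context] = Hash.new(0) }
  plants.each do |plant|
    plant.tags.each { |tag| counts[tag.context][tag.name] += 1 }
  end
  counts
end

full_sun = FacetTag.new("sun exposure", "full sun")
shade    = FacetTag.new("sun exposure", "shade")
acidic   = FacetTag.new("soil pH", "acidic")
neutral  = FacetTag.new("soil pH", "neutral")

# Made-up sample data, for illustration only.
plants = [
  TaggedPlant.new("Plant A", [full_sun, acidic]),
  TaggedPlant.new("Plant B", [full_sun, neutral]),
  TaggedPlant.new("Plant C", [shade, acidic])
]

remaining = narrow(plants, [full_sun])
puts remaining.map(&:name).inspect
puts facet_counts(remaining).inspect
```

A facet value whose count drops to zero after narrowing simply never appears in the counts, which matches the behavior where some choices disappear as the user drills down.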
And so basically this implementation is backed by the search, which is an Active Record model, and the session, which has things like which particular node is expanded and things like that. As for the behavior, there's really not a lot of it. It's mostly just rendering and things that support the rendering process. You know, there's expand and contract, which changes the state of them a little bit. And on the Google version, changing the state basically just changes what's in the URL parameters. And in the other implementation, I actually changed what's in the session. Of course, I wanted Google to be able to capture the state. So I let Google carry that state with it in the URL parameters. Okay, now the downside of this, you've probably seen it already. No matter what I did to that facet tree, Google was absolutely fascinated with it. And it would get into that facet tree and never get out. There were too many combinations. It was too rich. And you know, that makes sense, right? Because you could have one version that's full sun and then you tack on blooms in the early summer, and then you flip those parameters. That's two different flows that take you to the same place. But Google has to try all of them before it can tell. Right? So sadly, after all that work, we turned it off. Now, we think that the faceted navigation still is part of the secret sauce of the business because you have to be able to search this plant database, which right now has only got 4,000 plants or so. By the end of the year, we'll have eight to ten thousand. But even then, that's a lot of different plants and there are a lot of different circumstances, and our database is adaptive. So you might have the experience that African daisies in your yard, well, being from Canada and stuff, they don't grow very well unless they're in full sun.
And you might have the experience that, being from Mexico, African daisies can't really take the sun, the full sun will cook them, so you might have to have partial shade. So as our database adapts, our facets will play nice with that kind of model. So what we do is we actually expose those individual searches for things that we think that we might need to optimize for search engines. For real estate, you might have a saved search that's lakefront Austin homes. Basically, you can grab a faceted search of that and drop the URL in a place for hot tags or whatever. We're gonna start experimenting with user-saved searches because we think that our users wanna be able to save those searches, and we think that if they do, we're targeting with SEO specifically the things that our users wanna do. And so we think that that'll help the PageRank precisely in the places that we want it to. And also, as we grab those, that will add juice to the plants that show up in those searches too. Okay, so my mom says disable the faceted nav for complex sites because it's racist. So the last part of the talk is about smashing mirrors, and this is essentially about controlling duplication. Duplication is horrible for websites because it essentially dilutes the juice. If you've got four pages that are essentially the same thing, you're essentially saying you want four search results further down, way further down, rather than one big fat good search result that's on the top. That's actually the opposite of what you want. You want one page with all the juice poured into that single page. So for the last, I don't know, couple of months, I've been really trying to smash the mirrors. So I'll go into what search engine tools you guys use. Anybody use Analytics? Anybody use Webmaster Tools? Just a couple. Now this is something you have to go get. This is really the developer end of search engines. For HTML suggestions, I've been watching this first purple link for a long time.
And in this case, all 479 of those are from an earlier crawl and are problems that I've already fixed. And then once it comes around again, it'll find other mistakes that I've made and there'll be more, right? But I find that one of the most valuable things that I can do is control the duplication. And that first focuses Google on the pages that I want it to be crawling. And second, it should give me better juice for the long term. Now duplication can show up in surprising places. One of the ones I've been thinking about is a comment or a rating. So on this particular page, this description stuff right here is the stuff that I really want to optimize for. I want basically a rich chunk of text that repeats some of the text that's important to me. And in this case, it's a Sequoia tree. And that's why it's repeated twice in the H2s. But what's gonna happen if I let Google crawl this and this list of comments gets longer is that, due to pagination, I'll start duplicating this. Does that make sense? If there are 17 pages of comments, this plant page is replicated 17 times. And that would be absolutely fatal to the SEO for this particular application. So what I'm really watching is the ratings. Now there are a couple of solutions, none of which are perfect, but some of which are pretty promising. The one that I like the best, we're not doing this yet, but the one that I like the best is actually rendering all of your comments and using JavaScript to show and hide the various chunks of them. Now we had a search engine expert tell us that this is not going to get us busted for cloaking. Cloaking is basically showing Google one thing and rendering something else to users. This is essentially helping Google with one fetch. And it's great for SEO because I have one copy of the thing and I also get all the richness of all those comments fed in as well. The next thing that I can do is I can actually use Ajax.
And if I wanted to, I could sitemap the individual comments and have a separate interface to the comments, but maybe the content of the comments is not that important to your search engine optimization. You know, it depends on the types of comments that you're getting. If you're getting really simple comments, it might not matter very much. But if your Ajax is not accessible and it doesn't degrade gracefully, this can actually be a good technique to keep Google out of where you don't want it. One of the things that you can do is you could say, okay, we're not getting much juice down here at this level anyway. I'm going to bite that bullet but I really don't want the duplication. So I want to make sure that Google knows that these are different pages and that my users know that these are different pages. And since the one without a page number is going to be shorter, the keyword density will be higher and that result should show up first. The solution that we use is a robots meta tag with follow, which means go ahead and crawl these pages and pass the juice through to the individual pages. So Google is still crawling our sidebars and passing the juice to the places that we want it. But don't include these extra pages in the index. This is what that looks like. So in my layout, I have my header, and if there's a page parameter that's passed in and it's greater than one, go ahead and add a robots meta tag with noindex and follow. Okay, the last couple of things I want to talk about. The to_param solution that I showed you will lead to duplication if you change the spelling, right? So my example was, I misspelled the name of my dog. But here's a real world example. I am a hardcore dyslexic. I've written 12 books but my wife had to edit everything pretty heavily. I actually misspelled impatiens. I spelled it "impatience", right? I'm also not a hardcore gardener. So that's kind of a bad combination on this site.
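The robots meta tag rule for paginated pages, described a little earlier, comes down to a single conditional. A minimal sketch follows, assuming the request parameters arrive as a hash of strings; the method name robots_meta_tag is hypothetical. The noindex and follow values are standard robots meta directives.

```ruby
# Returns the robots meta tag for page two and beyond, or nil for the
# first page (or when no page parameter is present).
def robots_meta_tag(params)
  page = params["page"].to_i # nil.to_i and "".to_i are both 0
  return nil unless page > 1

  # noindex: keep this page out of the index.
  # follow: still crawl its links and pass their value along.
  '<meta name="robots" content="noindex, follow">'
end

puts robots_meta_tag("page" => "3").inspect
puts robots_meta_tag("page" => "1").inspect
puts robots_meta_tag({}).inspect
```

In a layout, the returned string would go inside the head element and a nil result would render nothing.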
So I had a duplication that I found because when I changed the name, Google kept coming back to the first one even though it's not the definitive version anymore, and the new second one existed and Google found that by the links on the site. And you solve that with a simple redirect. All you do is you say, okay, unless the ID in the params matches the plant's to_param, then you're gonna redirect back to the plant and the problem is solved. And so normally I one-line this but for presentation purposes I broke it out like that. Now I know that we're short on time and I wanna save time for questions, but I wanna leave you with some final thoughts. If you're ever driving in Memphis, watch out for this lady. Okay, any questions or comments? Yeah, what's that? I don't know the answer to that. So if you're trying to game the search engine, I would be very careful. If you're doing it for usability, you know, I'd probably get another opinion on that. I'd definitely ask a search engine expert. Other questions, comments? Okay, thank you.
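The redirect for renamed plants, described above, can be sketched in plain Ruby as follows. This is an illustration under assumptions rather than the code shown in the talk: RenamedPlant and canonical_redirect are hypothetical names, and the method returns the target path instead of calling a controller's redirect.

```ruby
# Hypothetical stand-in for the plant model with the id-plus-slug to_param.
RenamedPlant = Struct.new(:id, :common_name) do
  def to_param
    "#{id}-#{common_name.downcase.gsub(/[^a-z0-9]+/, '-')}"
  end
end

# Returns the canonical path to redirect to, or nil when the requested
# parameter already matches the plant's current to_param.
def canonical_redirect(plant, requested_param)
  return nil if requested_param == plant.to_param

  "/plants/#{plant.to_param}"
end

plant = RenamedPlant.new(12, "Impatiens") # the spelling after the fix

# The old, misspelled URL still finds record 12 through to_i,
# and is then sent on to the corrected address.
old_param = "12-impatience"
puts old_param.to_i
puts canonical_redirect(plant, old_param).inspect
puts canonical_redirect(plant, "12-impatiens").inspect
```

In a Rails controller the non-nil case would presumably become a redirect to the plant. Making it a permanent (301) redirect is an assumption here, not something stated in the talk.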