So while we're waiting, just real quick: how many of you are actively using Solr or very well-versed in Solr? A couple people? Okay. How many people are actively using Solr? Let's start there. Okay. And how many people are well-versed in Solr? Interesting. Okay. Just curious what the audience's reaction is. Okay, so cool. So if you're using Solr but not well-versed in it, what exactly does that mean? I mean, are you using it with Drupal, or are you just using it for other content, or are you guys awesome? I like that. That works very well. Okay. So a lot of what this presentation is going to cover is not going to be about setting up Solr. It's not going to be about using some service that maybe Acquia offers. It's going to be more along the lines of addressing a very specific problem that we had at Acquia, and I think it's one that a lot of different companies have as well. And that's finding everything. So at Acquia, we basically have a large number of repositories. We have a lot of different types of systems. We use multiple different languages. It's not just Drupal. I've been programming at Acquia since March of last year, and I haven't written any PHP for the most part. It's all been Ruby. So this particular presentation is going to walk you through kind of how to set up... Well, it's going to walk you through a hackathon project that we had. And basically that hackathon project was a 24-hour window that we had to address a very specific problem. So some of what you're going to see is going to be a little bit rough. It's what we were able to do in a 24-hour time period. But what it does is it basically takes a lot of different languages, ties them together, implements some things on GitHub, and shows you how to pull that into a very integrated search interface. So there's probably going to be more code than slides. And I definitely encourage people to just jump in and ask questions. I think this is going to be very informal.
It's a relatively small room, so we should be all right. And I'm going to give it probably about three more minutes before I really get started. So do you guys have anything specifically that you're looking for from this before I dive into it too far? I mean, what are your aspirations? Sure, different systems. So in the Ruby world, we basically have a tool to write documentation. And I think one of the challenges of larger organizations is keeping documentation relevant and up to date, particularly as it pertains to code. So at Acquia, what we do is we keep a lot of things in markdown files in our code base. We utilize inline code documentation, so Doxygen standards in PHP, in Drupal. In Ruby, we also do the same kind of thing, but they have something called RDoc, which will go through and basically generate a very nice HTML-based layout. And I'll show you that in a little bit. So what we're doing in this presentation is basically combining all of those systems, all of those different documentation methods, and then exposing them in a way that newer developers particularly can find very easily. So if a developer is looking for a specific function, even if they don't know what system it actually pertains to, they have the ability to find it. So I guess I'll just go ahead and start and we'll see what happens next. So quick introduction, a little bit about me. My name is Kevin Bridges. I'm a senior software engineer at Acquia. I work on the cloud systems team, so I do a lot of scalability. I work particularly in a system called VPC, which is Amazon's new Virtual Private Cloud. I do a lot of security work, a lot of really fun, crazy things. I'm an avid technologist. To me, Drupal, and I've been a Drupal developer for a very long time, but Drupal is essentially a component of larger and larger systems. So as we work on bigger and bigger websites... the first really large website I did was when I think Drupal 5 came out. We launched Popular Science, so PopSci.com.
And at the time, it was the first really big website that was using Drupal. And since then, that's like a drop in the bucket. I mean, we're doing so many things; the systems get larger, they get more complex, we're involving everything from the network stack all the way through to the front end, you know, mobile devices. It just goes on and on. So to me, Drupal is one component of larger systems, and I think that supporting it properly is integral to anything. We always make jokes about the Drupal blinders. Everybody tends to think that while Drupal is a fantastic platform and you can do amazing things with it, it's really not the solution to everything. So that's a little bit of what I believe in. In the Drupal space, I'm known as cyberswat, user ID 27802. And on Twitter, I'm also cyberswat. So if you need to find me, those are good places to get in touch. I have a set of links, including an upload of this presentation on SlideShare, and all of the modules that I'm going to talk about I give you complete links to. All of the code is in the sandboxes; I give you links to those as well. So you have the ability to download this later. So I talked about this a little bit earlier. Large organizations have lots of data. It can be anything. It doesn't have to be content or nodes or entities in Drupal. It can span wikis. It can span hosted data services like GitHub. So GitHub has the ability to publish wikis and blogs and markdown files and all kinds of neat little things. Documents and text files can be included. Solr is a very, very powerful system in that you have the ability to go through and index just about anything, depending on which modules you use in the architecture. It can span multiple languages and formats. And the problem that we are specifically trying to address in this presentation, and in this hackathon I'm about to talk about, is how to combine all of the sources into a single interface that is easy to use while maintaining context.
So maintaining context is an interesting little point there, because if you have a markdown file in GitHub that's in a specific repo, for a developer that's looking for that, we might be able to surface relevant content as far as the search results are concerned, but to be able to truly understand it, you need to take them there. A Ruby developer is going to be looking for different things than a PHP developer is going to be looking for. So RDoc, RDoc in particular, has a very nice additional search interface. It has some nice JavaScript put into place so you can click through the functions. They each have a very unique feel. And it's important to deliver the content in context for the developers that are actually looking for it. So this was our Engineering Week hackathon. Basically, Acquia was generous enough to fly all of the engineers from all over the world out to the offices in Burlington. And we had a week of fun, basically. There was some paintball involved. There was a lot of beer drinking involved. A lot of really good conversations. But the culmination of the week was a hackathon project that we did. And this is exactly what I'm going to show you. This is essentially 24 hours' worth of work. We had from 9 a.m. to 9 a.m. the next day. We actually finished a bit early. And the amount that we got accomplished in a very short time frame is a testament to the power of Drupal and what Drupal really allows you to do if you think outside of the box. I think it's a little intimidating at first to envision that you're going to be pulling in Ruby, you're going to be executing Ruby, you're going to be working inside of Drupal, and you're going to pull all of that into a Drupal interface and make it work in a very short amount of time. So what we did was pretty cool. So some of the goals that we had before we started were pretty straightforward. We wanted to build a Drupal 7 site. We wanted to use Drupal 7 for this.
It was the easiest way for us to tie into existing services that we had. So like I said, I'm not going to touch on setting up Solr. Just assume for the context of this presentation that you already have a Solr server working, and then things will get a lot easier. We wanted to integrate with LDAP over SSL for secure access. Acquia uses LDAP servers. It's a larger enterprise, and most larger enterprises use something similar to LDAP. We didn't want to burden newer developers with having to create an account on yet another system, so LDAP integration was very important. We wanted to serve both generated API docs, like RDoc, and then index generated docs and GitHub docs for searching. So basically what that means is we have to tie into some different gems. We downloaded some gems from the GitHub team, put those on the server. We downloaded RDoc, got that running. And then we wanted to enable everything through an effective faceted search. So I'm going to walk you through how to use the Facet API to actually generate some custom facets, which will be pretty interesting. So a little bit about the team. I'm a firm believer in giving people credit where credit is due. This is the number of people that we had to pull off what I'm about to show you. One, two, three, four, five... so six people. The bottom two, Amin Nistana and Chris Rudder, are basically on operations teams. While we were going through and doing the LDAP integration, we had a little bit of a problem with the security certificates that we were integrating with, and they were able to help us out there. So for the most part, the LDAP module worked completely and totally out of the box. We didn't have to do any customizations whatsoever, and that just kind of worked. So I'm pretty much going to skip over LDAP unless somebody really wants me to go there. Richard Buford, aka synaptic, is an awesome gentleman from the UK. I worked on examiner.com with him.
And he basically came from NowPublic, if that company means anything to anybody. He helped us with the theme. I'm really not going to touch too much on how the theming of this happened, but he got it started by using Bootstrap, so Twitter's Bootstrap project. And it's a very straightforward, very simple theme. I think you'll appreciate what you're going to see. Peter Jackson is a newer developer at Acquia. He's on the cloud systems engineering team with me as well. He, ironically, is not really a Drupal developer. So when he sat down on this project, he didn't know Drupal. He knows how to maintain our infrastructure, and he knows how to work inside of the APIs that we have, but he really wasn't a Drupal developer. So basically, what you had was two Drupal developers that kind of knew what was going on, and that's Peter Wolanin, who actually maintains a lot of the Solr modules that you'll find on d.o, and myself. And then you had one developer that was relatively new to Drupal, you had one themer, and then a couple of operations guys that helped us for a few minutes. So you can accomplish a lot in a really short time frame, and I think that's part of the power. So basically, we have six contributed modules. The way this is going to work is I'll just go through and give you all the boring stuff up front, tell you essentially what we used, what we didn't use, and then we'll get into the code samples. So most of this presentation is going to be very akin to a code review, so you'll be able to see exactly what we did. I hope that's comfortable for everybody. If it's not, you might fall asleep about halfway through this, and I can't promise what will happen to you if you do that. We used the Acquia Connector module. The Acquia Connector module for us was imperative because we wanted to utilize the back-end Solr servers that Acquia offers. You can definitely achieve this without using any Acquia servers. It's very easy.
But for us, this was just a very simple thing to do. We used the Apache Solr module. This is a contrib module; you can download it on d.o. It integrates Drupal with the Apache Solr search platform. Apache Solr Attachments allows searching within file attachments from Solr. So it's kind of the whole LEGO concept. You can plug and play exactly what you need to get the most bang for your buck, for lack of a better way of phrasing that. We used the Apache Solr Multi-site Search module, so the ability to search across multiple sites. We used the Facet API, which I mentioned earlier. The Facet API is actually pretty nice. I was really impressed by the simplicity that was involved in getting this to work, and I believe the Facet API went a long way to making that happen. Some of the other modules, the Apache Solr modules, those are well-baked modules. You're not going to have a lot of problems once you start using them on a Drupal site; things are just going to work. And it's very nice. So we also used the LDAP module. I spoke about that a little bit as well. The custom module that we wrote, and what I'm going to spend most of the presentation on, we call the API Doc Search module. This allows us to do everything that I've been talking about: it integrates with Solr, integrates with Ruby, and basically provides a couple of stream wrappers to be able to work well with GitHub. And I'll be going over that. That's also available in a sandbox on d.o, so I'll give you the link there as well. So let's just dive in and see where it takes us. Stream wrappers: how many people have used stream wrappers or written custom stream wrappers before? One, two, four people. OK, cool. How many people know what they are? Maybe a couple more? OK, so stream wrappers basically give us the ability to do some interesting things inside of Drupal. Boom. And this is kind of what a stream wrapper looks like.
On a very basic level, it tells us where in the file system the particular file type is that we're looking for. It gives us a couple of methods to be able to reference the file externally. So in the case of GitHub, we have the ability to say, well, the local file that we're going to index is actually rendered on the file system outside of the web root, but when you click on that search result, we want you to go to the markdown file on github.com in the context of the repo, in the context of the branch, exactly where you need to be to be able to really ascertain what's going on. They're very simple, very straightforward, and this is basically what it looks like. So I'll get back to this in just a sec. So we had two basic types of content wrappers. We have the generated content, and then we have the GitHub-style content. So the generated content is stuff like our Cloud API. It's a regular HTML file. It's stuff like the markdown files from GitHub: you basically run those through a parser, render them into something presentable, and call it good. We can look at those in the context of the website that I'm about to show you. So we don't want them going externally. We may or may not want that content to be publicly available; it needs to be protected by LDAP. That's basically what this one's doing. Allow the files to be viewable from the search results in the context of the Drupal site. I'll give you an example of that in just a little bit. It allows us to store raw HTML for display from search results. You'll see that in the presentation. GitHub: store GitHub content for pre-processing and indexing. So basically what we did is we created a subfolder of our main checkout. And our main checkout kind of looks a little bit like this. That's cut off for you, I assume, since I can't see it. OK, so this is our main checkout. You'll note that we have a generated folder here. We have a GitHub folder here.
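To make the idea concrete, here is a rough sketch, in Python rather than the original PHP stream wrapper classes, of the two mappings just described: a scheme URI resolves to a file on disk (outside the web root) for indexing, and GitHub-sourced content resolves to an external URL for the search-result link. The base path, organization name, and branch are illustrative assumptions, not values from the talk.

```python
# Conceptual sketch of the two stream wrappers (generated:// and github://).
# DOC_BASE, the "acquia" org, and the branch name are assumed for illustration.
import os

DOC_BASE = "/var/checkout"  # one level above the web root (assumed path)

def local_path(uri):
    """Resolve generated:// and github:// URIs to files on disk."""
    scheme, _, target = uri.partition("://")
    return os.path.join(DOC_BASE, scheme, target)

def external_url(uri, branch="master"):
    """For github:// content, link back to the file on github.com."""
    scheme, _, target = uri.partition("://")
    if scheme != "github":
        return None  # generated content is served locally, behind LDAP
    repo, _, path = target.partition("/")
    return f"https://github.com/acquia/{repo}/blob/{branch}/{path}"

print(local_path("generated://fields/index.html"))
# /var/checkout/generated/fields/index.html
print(external_url("github://fields/docroot/README.md"))
```

The key design point is the asymmetry: both wrappers read from the same local checkout when indexing, but only the GitHub wrapper sends the user off-site when a result is clicked.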
Each of these GitHub folders corresponds to an actual repo on GitHub. The generated content corresponds to something similar. So you'll note that we have a cloud API and a cloud API, a fields and a fields down here. For our fields documentation, you basically do the git checkout, git clone, whatever; go into the root directory; run rdoc with -o to pipe the output to a different location. And then basically this is the result of RDoc. So what we did is we stored this outside of the web root so that the public can't typically get to it, and then we used the stream wrappers to allow access that's controlled. So if it's a generated document and you want to look specifically at that, you have the ability to do so. I'm going to get the slides out of the way as quick as possible. Actually, before I get the slides out of the way, let me just give you some concept of what it is that I'm talking about and what we actually made. So this is the main login page. It's very simple, very straightforward. Like I said, we used Bootstrap to get a good start on it. We didn't want to complicate the layout. We didn't want it to look like Drupal. There's definitely a lot of room for improvement there as far as Drupal's layout is concerned. We're using LDAP to basically log in. Once you have the login, you have some of the custom facets that I'm going to show you how to create. So we're basically looking at two different content types now: the generated content type and then the GitHub content type. Each one of these content types, if you remember the folders that I showed you, so CloudAPI, CloudApp, Mirror, Fields, Gardens, that type of stuff, kind of all shows up in here. Depending on whether it's a generated or GitHub-style content type, it shows up a little bit differently. And I can walk you through how that's created as well. So we'll just dive in and start clicking some buttons.
On GitHub... actually, let me do this in another tab. It's terrible having to rely on the internet around here. So while that's loading, does anybody have any questions so far? They're not really content types; they're content types in the sense that Solr considers them to be content types. So essentially what we're doing is we're creating Solr documents, and inside that Solr document we're setting a type that tells it what that actually is. So this is going to be really slow for me, so we'll cheat. This is basically our Acquia repos. We've got 20 public repos, 96 private repos. One of the things that we've done that's pretty interesting is we've gone through and standardized on the format, or sub-format, of these folders. So inside of each one of these, at the top level, we have a doc root. And doc root, let me break that up into two words: it's a doc folder that we use as the root for all of our documentation, so it's not the same as a web doc root. Inside of there, we store everything that's relevant to the particular project that we're working on. We've found that this gives the most relevant data to the developers the fastest. So it's much easier than maintaining a wiki. It's much easier than having to hire a technical writer to go through and document what we're actually doing. And it works out pretty well. So that's basically what GitHub is. The Cloud API, we'll try again once the internet cooperates, and while that's thinking about it, we'll cheat over here. So the Acquia Cloud API is just an HTML document that we render. This is the public-facing Cloud API that most people have access to. It's a RESTful interface. And as you can see, it's in a totally different format than what we have on GitHub, or than what any of the markdown actually is. It's unfortunate that we're not going to have internet, but we'll do the best that we can.
And then the Hosting API, I believe I have a link to that as well, kind of looks a little bit like this. So this is the RDoc-generated content. And this has some neat features in it that may or may not work, because I'm not actually online. But it has a nice little JavaScript interface to do some additional searching within the API. And then it goes through and allows you to do some neat things, like actually click on the method names. And when you're online, it goes through and does a nice little Ajax-y deal where it opens up and shows you some additional data. So let's see what happens if I do a search. And this was working long enough for me to get here earlier. So while that's thinking about it, were there any other questions? Pretty straightforward so far. We're going to walk through Jenkins in just a few minutes. So basically, we're using Jenkins to do a lot of the processing, all of the processing, to generate the content. And then we're basically checking that into a GitHub repo, "all the things", that runs this particular website. Once that generated content is checked in, cron functions on the site pull it down into the specific directories that we have. So just above the web root, we have that checkout that looks just like this, that has all these different things. And there is a doc root that is your standard Drupal installation. But above that, Jenkins is checking into the generated folder, it's checking into the GitHub folder, and doing a couple of other things. So I did a quick search. I searched for something called VPC, which is Amazon's Virtual Private Cloud. You can see down here that this is a repo on GitHub. If I scroll down a little bit more, I see that this is actually our generated documentation. This site seems to be performing OK. So let's open that one up, and then we'll open this one up and hope for the best. I'll let those keep working. Yeah, the internet's starting to work.
So you'll see that that actually takes me to the document as it exists on GitHub. Pretty straightforward, pretty easy to do. And then the developer can actually read this in the context of what they're most interested in. This one I might be able to click around a little bit more. Yeah, so now you can start to see the JavaScript working. The key to this is that we wanted to allow the functionality of the different languages and the different systems to really do what they're supposed to do. We didn't want Drupal to interject any kind of artificial rules or anything like that; we wanted it to be very natural. So if I go down, I can click on different things like that and actually see, in the context of the document, the code that they need to do their jobs, basically. The Jenkins information we'll get to in just a second. Jenkins runs a cron that gathers all of the data we want indexed and pushes it into the main Git repository as rendered content for the site. Once content is in Git, it is pulled onto the server for our stream wrappers to work. I kind of already talked about all this. So let's walk through it a little bit. The first thing that it does is it checks out the repo that is running the main site, loops over each of the Git repositories we are interested in indexing, scans their standard documentation types and locations for changes, and commits them to "all the things". We'll get to how that's managed in a little bit. And then it runs RDoc to generate Ruby docs and commits the documentation if anything's changed. Now what that actually looks like is kind of cool. How many people have worked with Jenkins before? A few more. That's good. Jenkins is a fantastic tool, particularly in the context of Drupal. Take, for example, cron management. You're working on a larger site that tends to have an increase in cron jobs that are necessary to make it function properly.
Jenkins will allow you, if you can use Drush to represent the commands that you're running, and I can walk you through a little bit of Drush to do that... it gives you the ability to overcome some of the shortcomings in Drupal's cron system, and it allows you to get more active feedback. So at a minimum, if you're not using Jenkins and you think you have no reason to, maybe look at it as a good solution for cron management to help you identify when your sites are actually breaking or not executing crons properly. And this is kind of what the interface looks like. This is the configuration screen for the job that we're running. Very straightforward. One of the things in Jenkins that you always want to do is limit the amount of data that it stores. Otherwise, you'll end up filling up your hard drives pretty quick. Pretty straightforward. Every 45 minutes, we're going to run this job. And this is kind of what the job looks like. I don't know if everybody can see that or not. But let's put it over here. Maybe this will make it a little easier to read. OK, so this is a basic bash script. There's not too much fancy about it. Let's see if we can give you a little bit more real estate. It starts off at the beginning, goes through and looks... these dollar-sign variables are environment variables in Jenkins, so very straightforward. Inside of the Jenkins workspace for this particular job, we have a subfolder called "all the things". Inside of "all the things", we basically want to make sure that it's a git checkout. So when this runs, naturally, this isn't here the first time. So all that we do is clean up anything that might be there that's not relevant if we don't have git information, and then clone our repo. Once it's cloned, we basically move into the directory, reset it to its original state as far as the upstream repository is concerned, and then we loop through each of the GitHub repos that we have in place.
And again, all of this information is referenced in the slides, and you'll be able to download all of this if you need to go through it. But what we're doing here is looping through our fields repo, our gardens repo... we have something called search governor. We do the cloud API, gardens mobile, cloud app, and a couple of other things. And it's pretty straightforward. For each repository, we're going to go through and make that top-level folder that I showed you here, called GitHub. We're then going to go through and do the same thing: if it doesn't already exist, we're going to clone it. We're going to go through and make sure that everything's up to date, everything's reset to its original information. And then we're basically going to copy all of the relevant documentation out of that repo into the specific folder. So if you go in here, you won't actually see the entire checkout for cloud API, but you will see the relevant pieces of information that we want to pass on to Solr. So pretty straightforward. Here we get a little fancier. This is where we're actually starting to interact with Ruby a little bit. This is bash, so we can just call the command directly. So we move into the AQ gem, which is the primary heart of our hosting infrastructure, and it's primarily written in Ruby. We run the rdoc command to generate it. And same thing, we move it into that generated folder, so generated/fields. And you'll see a lot of HTML versus the actual code that's there. This HTML that's copied over is identical to the HTML that's generated from the rdoc command. So then basically we add everything; git's pretty intelligent, so we don't have to worry too much about what's changed or hasn't changed. And then we basically push it upstream. So then at that point, the website runs a cron job, pulls it down, and we suddenly have all of our information for indexing. So does that kind of help you understand how we're getting it there? Let's see.
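The per-repo loop in that Jenkins bash script can be sketched as follows. This is a Python approximation that just builds the shell commands the job would run, not the actual script from the talk; the workspace variable, remote paths, and the "docroot" documentation-folder convention are assumptions based on what was described.

```python
# Sketch of the Jenkins job's per-repo sync loop. Repo names follow the
# talk; the git remote, workspace path, and exact commands are assumed.
REPOS = ["fields", "gardens", "search-governor", "cloud-api",
         "gardens-mobile", "cloud-app"]

def sync_commands(repo, workspace="$WORKSPACE/allthethings"):
    """Return the shell commands to refresh one repo's docs."""
    target = f"{workspace}/github/{repo}"
    return [
        # Clone if missing (|| true tolerates an existing checkout).
        f"git clone git@github.com:acquia/{repo}.git {repo} || true",
        # Reset to the upstream state so stale local edits never leak in.
        f"git -C {repo} fetch origin && git -C {repo} reset --hard origin/master",
        f"mkdir -p {target}",
        # Copy only the documentation folder, not the whole checkout.
        f"cp -R {repo}/docroot/. {target}/",
    ]

for repo in REPOS:
    for cmd in sync_commands(repo):
        print(cmd)
```

After this loop, a single `git add`/`commit`/`push` on the "all the things" checkout publishes whatever changed, exactly as described above: git itself works out which files are new.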
So before we can index content in Solr, we need to identify what should be indexed. Once identified, the file is tracked in MySQL so that it can be processed efficiently. So basically what we do in order to maintain that part is we loop through each of the files, we do a SHA hash of them to be able to get a signature of the content of that file, and then we store that into a MySQL database with a status code. That allows us to make sure that we're only indexing things that have changed. So there's a combination of techniques to achieve that in Solr. Number one is setting a last-indexed document position, and I'll show you the function for that in just a few minutes. And then the other is just making sure that we're only putting relevant information in the database, so if it's already been indexed, we're really not interested in it. So cron is used to pull down changes Jenkins may have pushed. Each of the stream wrapper file directories is scanned for valid content, and we'll get into the code, because code is awesome. So this is basically the function that goes through and does the scanning. We start off with an empty array, and the comments say it all. We're going to walk through, manage the directories, and generate a hash for each file. We're going to insert the record if it doesn't exist. If the hash is different than the stored hash, update the hash, change the timestamp and the status in case it was deleted. And then we're going to do a stat on each file and delete those which no longer exist. The way this looks in code: the API search "get files from disk" function is basically this one. If you look, we're basically tying into our stream wrappers here, so our stream wrappers are generated and GitHub. So this, if you were to type it out, would be like generated://. And that tells Drupal at this level that it moves up one directory outside of the web root, goes down into the generated folder, and then pulls the content there.
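The hash-and-status tracking just described can be sketched in a few lines. This is a Python sketch using SQLite in place of the MySQL table the module used; the table name, columns, and status values are assumptions, but the logic is the one from the talk: insert new files, flag changed hashes for re-indexing, and leave unchanged files alone.

```python
# Sketch of the change-tracking step: a file is queued for (re)indexing
# only when its content hash changes. Schema and status codes are assumed.
import hashlib
import sqlite3

STATUS_PENDING, STATUS_INDEXED = 0, 1

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE doc_status
              (uri TEXT PRIMARY KEY, hash TEXT, status INTEGER)""")

def scan(uri, content):
    """Record the file; mark it pending only if its hash changed."""
    digest = hashlib.sha256(content.encode()).hexdigest()
    row = db.execute("SELECT hash FROM doc_status WHERE uri = ?",
                     (uri,)).fetchone()
    if row is None:
        db.execute("INSERT INTO doc_status VALUES (?, ?, ?)",
                   (uri, digest, STATUS_PENDING))
    elif row[0] != digest:
        db.execute("UPDATE doc_status SET hash = ?, status = ? WHERE uri = ?",
                   (digest, STATUS_PENDING, uri))
    # Unchanged files keep their current status and are skipped.

scan("github://fields/README.md", "v1")
db.execute("UPDATE doc_status SET status = 1")  # pretend Solr indexed it
scan("github://fields/README.md", "v1")  # unchanged: stays indexed
scan("github://fields/README.md", "v2")  # changed: pending again
```

The indexing pass then simply selects rows with the pending status, which is what keeps repeated cron runs cheap.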
If it's the GitHub one, it goes up out of the web root, comes down into the GitHub folder, and then pulls that relevant content from there. The ones that we're interested in, and we already filtered this a little bit when we did the cron job, but we're ridiculously anal when it comes to stuff like this... so we're filtering again: HTML, the MD markdown files for GitHub, and just raw text. So that gives us an array of files that we basically loop through. And then we go through, start creating a hash of it, select the entity status. So this is where we're interacting with the database to see what already exists. If it's not there, we're going to go through and do a little bit of work on it. The API doc search markup function: let's see what that does. So this basically goes through and says, if the file is a markdown file, we want to use the Ruby gem to render it. If it's not a markdown file, we're just going to basically return the content of it. We don't need to store the generated content that comes out of the Ruby gem. So basically what this does is a shell exec, and it actually returns the output directly: ruby, the Ruby file, and the target. So our Ruby file is this particular one, which is about the simplest thing you can ever do with a Ruby file. So that's our Ruby magic right there. Ruby is a very lightweight language. It's very powerful, very slow. And it has a lot of security issues sometimes. Probably not something you want up front on a performance website, but on the back end it's pretty awesome. So we require RubyGems, which allows us to tie into the gems framework. GitHub publishes a markup gem that we pull down, and we're just basically telling the file to use it. It's important, whenever you're working on a distributed system, or you're pushing weird things like this out to an environment that may not be the same as what you're working on, that you declare your environment properly. So that's a neat little trick there.
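The markup dispatch just described (shell out to Ruby for markdown, pass everything else through untouched) looks roughly like this. This is a Python sketch, not the module's PHP; the `render.rb` script name is an assumption, and the Ruby call is stubbed here so the sketch runs without Ruby or the GitHub markup gem installed.

```python
# Sketch of the markup step: markdown is rendered by shelling out to a
# small Ruby wrapper around GitHub's markup gem; HTML/text pass through.
import subprocess

def render_markup(path, content, runner=None):
    """Return indexable HTML for a file staged from GitHub."""
    if not path.endswith(".md"):
        return content  # HTML and plain text are already indexable
    # Default runner shells out to Ruby, mirroring the shell_exec call
    # described in the talk. render.rb is a hypothetical script name.
    runner = runner or (lambda p: subprocess.check_output(
        ["ruby", "render.rb", p]).decode())
    return runner(path)

# Stub the Ruby call so the example is self-contained.
fake_ruby = lambda p: "<h1>Rendered</h1>"
print(render_markup("README.md", "# Rendered", runner=fake_ruby))
print(render_markup("notes.txt", "plain text"))
```

Note the same property the talk calls out: the rendered markdown is returned directly from the subprocess and used for indexing, so nothing generated needs to be written back to disk.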
And then all that we're doing is puts, which is basically the echo or print_r of the Ruby world. We're calling the GitHub markup gem, and we're telling it to render the argument that we passed in, which is just basically a file. So that's pretty straightforward. And what that does is it gives us the content that we're then interested in. We go through, add all of it into the database, and then call it done. This doesn't actually process anything for Solr at all. So that's basically our staging mechanism. So now that everything's scanned, we need to get it into Solr, and this is probably why most people are here. For each of the scanned documents, we need to build a Solr document to be used in search results. So Solr has a class that allows you to define what a document actually is, and all that we're going to be doing is filling in the attributes of that class with the information that we're the most interested in. Evaluate the content and render it using the GitHub markup gem if necessary; I kind of already covered that. Evaluate the content for HTML tags to assist with surfacing content in searches. So that basically means look at the content that's coming through, determine if it has a title on it. If it has a title, then we probably want to use that. If it doesn't have a title, skip over to the standard markup looking for H1s, H2s, items like that. Send the completed document to Solr for indexing and update our scanned document status to indicate that it has been indexed. So let's see what that looks like. So basically the first step of this is to go through and get all the files that we need to index. This does a raw query of where we're storing that hash data that I talked about earlier. It looks for changed files, or files that have the appropriate status to indicate that they should be re-indexed. We're going to build documents out of those rows. So basically, for every row returned, we're going to loop through it and create a document out of it.
This one pulls in our Apache Solr index file, which is a part of the core Apache Solr module. So this allows us to have access to the classes that we need to populate. So this is kind of where the magic happens. We're just generating arrays of documents. And this is the bulk of the magic, pure and simple. The core to getting data into Solr properly is just building out these documents. So we're starting off with a new Apache Solr document. We're gonna go through and process the file, render any remaining markdown that we need to. I'll show you what that looks like — I think I already touched on this. That's the file I showed you earlier. So if it still needs to be rendered, it will; if it doesn't, then it won't. And then we talk to Solr through the standard Solr modules to have it evaluate the tags that are in place for searching. And this helps because with Solr you have an administrative interface to be able to say this tag should have more weight than that tag, or this bit of content should be more important than that bit of content. This is the function that handles this for us. It's kind of black magic as far as most people need to be concerned. And then we build out the basic document structure. So it's ID, URL, site, site hash. These are specific to Solr. So these are just kind of things that you have to do every single time. And then we go through and fill in our entity ID. Our entities are kind of those content types that you first saw. So the entity is going to be field, it's going to be AQ, it's going to be the different things that are showing up in the facets. Then we have our bundle, our bundle name, and then a little bit of path information. So the path information for us is going to be things like what you saw at the bottom of those blocks when I did the search. Is it on a GitHub repo? Is it a generated piece of content that we should show you? From there, we're going to extract a well-formed label.
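As a hedged illustration of that document structure, here is a minimal stand-in for the document class plus a hypothetical build function. In the real module the class comes from the core Apache Solr module (pulled in via its index include file) rather than being defined by hand, and the exact field names may differ.

```php
<?php
// Stand-in for the Apache Solr module's document class, just enough to
// show the shape described above: arbitrary fields set as properties or
// appended with addField().
class ApacheSolrDocumentSketch {
  protected $fields = array();
  public function __set($name, $value) { $this->fields[$name] = $value; }
  public function __get($name) {
    return isset($this->fields[$name]) ? $this->fields[$name] : NULL;
  }
  public function addField($name, $value) { $this->fields[$name] = $value; }
}

// Hypothetical build function: the Solr-required fields first (ID, URL,
// site, hash), then our own entity/bundle/path information.
function apidocs_build_document(array $row) {
  $document = new ApacheSolrDocumentSketch();
  $document->id = 'apidocs/' . $row['id'];
  $document->url = $row['uri'];
  $document->site = 'http://example.com';        // site base URL (assumed)
  $document->hash = $row['hash'];
  $document->entity_id = $row['id'];
  $document->entity_type = $row['entity_type'];  // what shows up in facets
  $document->bundle = $row['bundle'];
  $document->path = $row['path'];                // e.g. the GitHub repo path
  return $document;
}
```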
It's pretty straightforward. Go through and pull the title out of it. And that's what I was talking about with this. So we go through and evaluate the content. And at this point, we're basically just dealing with HTML files. So we're doing a couple of real quick greps to see if there's a title in place. If there's not a title in place, we go through and start looking for just standard markup to make the search results relevant. And where this shows up is what you're going to see basically popping up here. So pretty straightforward. Nothing too fancy or magic yet. You have the ability, when structuring documents in Solr, to go through and add custom attributes to them. So some of the attributes that we were the most interested in were the API source and the repo path. So in the search results, you actually see the repo path come up from here. And the API source is either generated or GitHub. So we wanted those pieces of information available to Solr. And the way that you actually do that is they're just appended onto this standard file document object. It's very straightforward, very simple. And then we return that. So in essence, that's how we're generating the document for Solr. Any questions before I dive on? I know that was probably a lot at once. Yes, that's an internal tracking mechanism for Solr. And this is a standard function. This entire block up here — if you were to go through and appropriate this code to do your own thing, you'd literally leave that identical. I mean, it's just copy, paste, done with it. Those are the things that Solr needs to be able to reference its document structure. And we're running a little short on time, I believe. So I will go ahead and skip over some of this. I touched on this a little bit, and I'll get into more detail here because this is kind of the coolest part. The Facet API is used to create custom facets.
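That title-then-headings fallback could look something like the sketch below. It's a simplification (the real code may grep for tags differently), but it shows the order of preference: `<title>` first, then H1, then H2.

```php
<?php
// Pull a well-formed label out of an HTML fragment: prefer the <title>,
// fall back to the first <h1>, then the first <h2>. Returns NULL when no
// usable markup is found.
function apidocs_extract_label($html) {
  foreach (array('title', 'h1', 'h2') as $tag) {
    if (preg_match('@<' . $tag . '[^>]*>(.*?)</' . $tag . '>@is', $html, $m)) {
      // Strip any nested markup so the label is plain text.
      return trim(strip_tags($m[1]));
    }
  }
  return NULL;
}
```

The custom attributes then ride along on the document with something like `$document->addField('ss_api_source', 'github')` — `ss_` being the Apache Solr module's convention for a single string value, though the exact field names here are an assumption.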
We wanted a facet to allow filtering by API source and content type. During generation of the Solr document, populate the ss_api_source attribute. That's what I just showed you in that document structure. The Facet API provides a block for each content type. This corresponds with the entity type and attribute in our Solr document. So what that looks like is you've got your API source, your content type, and stuff like that. If I go back to the beginning, you'll see it in a little bit clearer context. But basically what we're doing is generating these blocks completely and totally using the Facet API. I believe this is how it's done. So it's very simple. We just need two functions to be able to communicate with the Facet API. The first one is hook_facetapi_facet_info — I love Drupal naming conventions — but all that this does is tell the system the metadata about the block that we wanna create. And in this case, we're creating the API source block, which is this block down here. We're telling it that we should use generated and GitHub. Generated and GitHub come from this mapping function. So what you're doing whenever you're defining this metadata is you're giving it a callback that will go through and show basically what you want to show up on the block. So we have the generated RDoc that's showing up right here. And then we have the GitHub-style stuff here. Then Solr will go through and add in the counts of what it's found inside of its database that corresponds with what you're defining here. So that's really straightforward, really simple. I mean, there's not a lot to it, which is part of the cool part of this. I am thinking that the best way to help really drive these lessons home, or to understand what's going on here, is just gonna be to dive in, download this code, and take a look at it. It's kind of fun. Drush integration is very important with any project. So, the Drush integration. I'm a command line guy.
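A hedged sketch of those two functions follows. The facet name, labels, and callback name are assumptions based on the description, the hook signature follows the Facet API module's conventions as I understand them, and in a real module the display strings would be wrapped in t().

```php
<?php
// Facet definition: tells Facet API about the "API source" facet and which
// indexed field it reads from. (Keys approximate the Facet API conventions.)
function apidocs_facetapi_facet_info($searcher_info) {
  $facets = array();
  $facets['ss_api_source'] = array(
    'label' => 'API source',
    'description' => 'Filter by where the documentation came from.',
    'field' => 'ss_api_source',
    'map callback' => 'apidocs_map_api_source',
  );
  return $facets;
}

// Map callback: turns the raw indexed values into the labels shown on the
// block. Solr supplies the counts next to each one.
function apidocs_map_api_source(array $values) {
  $map = array(
    'generated' => 'Generated (RDoc)',
    'github' => 'GitHub',
  );
  $labels = array();
  foreach ($values as $value) {
    $labels[$value] = isset($map[$value]) ? $map[$value] : $value;
  }
  return $labels;
}
```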
So whenever you're developing a more complex system or doing more complex tasks, I find that it's imperative that you first do it in Drush and then expose it to the front end. So I don't know if you've ever written any Drush commands or anything like that, but this is essentially what we did. We wanted to have a couple of functional areas. We wanted to be able to clean up documents that are no longer relevant, so we added an API docs clean function. We added an API docs index function that'll allow you to go through and just generate the index automatically. You don't have to rely on cron. You don't have to rely on Drupal to do it. Just log into the system, and this is a good way to develop as you're going forward, so you get more active feedback immediately. We have the ability to scan the documentation. Everything that our front end is doing, we represent in Drush, and it makes it very nice. So yeah, that's the nuts and bolts of what we have here. I mean, I can go through the code a little bit more, or we can talk about questions or any specific concerns you guys might have that might help you in your day-to-day work. But yeah, cool. So these are the main components of the module. You can download these all. The API search index .inc file, which I walked through, manages the Solr indexing. There's really not a lot to it. You just generate the necessary document that you need, you tie into the existing hooks that are made available, and Solr handles all of the rest of it. The search .install file just does basic database stuff. So our schema has the entity type, the entity ID, the bundle, the status of that — whether it's been indexed or not, which gets updated after it's been indexed — the timestamp of the change date, so we take that into consideration, the hash that I talked about earlier, the URI of it, and then the MIME type. And the MIME type is used and passed to Solr as well. So that's pretty straightforward.
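Those Drush commands might be declared roughly like this — the command names and callback names are guesses based on the three functional areas just described, and the callback implementations themselves (which would do the actual cleaning, indexing, and scanning) are omitted:

```php
<?php
// Sketch of the Drush integration: one command per functional area, so
// everything the front end can do is also available from the command line.
function apidocs_drush_command() {
  $items = array();
  $items['api-docs-clean'] = array(
    'description' => 'Remove documents that are no longer relevant.',
    'callback' => 'drush_apidocs_clean',
  );
  $items['api-docs-index'] = array(
    'description' => 'Build the Solr index without waiting for cron.',
    'callback' => 'drush_apidocs_index',
  );
  $items['api-docs-scan'] = array(
    'description' => 'Scan the documentation sources into the staging table.',
    'callback' => 'drush_apidocs_scan',
  );
  return $items;
}
```

In a Drupal 7 module this would live in a `.drush.inc` file so Drush can discover it; running the indexer by hand like this gives the immediate feedback loop the talk recommends.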
So these are the developers that worked on it. We've got a very awesome blog post that's written about this. And obviously, that's not the link I'm trying to click on, so I'll just blame Apple for all the good things in the world. So you can go to that blog and read a little bit more about it. It outlines some of what I talked about in more detail. And then we have the actual contrib modules that we used. So there's the Acquia Connector module, the Apache Solr Attachments, the Apache Solr Multisite Search. Those are just basically enable and walk away from. You don't really have to worry about them too much. The Facet API was the one that was the most useful to us while doing this. CTools was necessary because the Facet API uses it as a dependency. The LDAP integration was just basically plug and play. I mean, it was very straightforward. And then the module that I just showed you, and all of the code that I just showed you, is available in Peter's sandbox project. So if you go to that URL, you can download this code. You can go through it line by line. I'll be here for the rest of the conference, so if you have any questions or anything that you'd like to accomplish with Solr, feel free to find me and I'll sit down with you and help you solve your problem. So it should be pretty straightforward. And the mandatory "Acquia is hiring in Australia." In case you didn't know it, Acquia is looking for anybody that is interested. We are actively hiring very skilled professionals. We are expanding into the Australian market, basically working with AWS as they expand into Sydney, and we are interested in people in the local area. And finally, last but not least, the Drupal Association basically needs your feedback on every single session that you've seen here. I know that it's kind of annoying to focus on giving feedback at the end of the session.
But if you go to the Sydney website, find session 348, and fill out the session form, it gives the association the ability to target content and make it better for the DrupalCons at every single show. So if you haven't done it, or you haven't been asked to do it for any of the sessions that you've seen so far, please do so. And that's it. So, yeah. The multisite search is just enabled. I don't actually know why we're having it in there, but I did a module list and that one came up. The attachments, same basic thing. I mean, the core thing to walk away from this presentation with is that that document that you create is where everything is at. The way that you go through and generate that document can be done in a variety of different ways. And it's as simple as saying, you know, this attribute field equals this callback. That callback can go through and it can talk to Ruby, it can talk to other web services, it can do anything that it needs to do. It doesn't have to be just Drupal. And that's what I'm really trying to drive home here. Yes. Yeah, the Facet API is amazing. I mean, it's incredibly powerful and you can mold it to just about anything you can think of. Yep, and this is completely what you're looking at. I mean, this is a Drupal site that doesn't have a single piece of content on it. So, yeah. And the website's being slow. But yeah, that's all the content that we're using on Drupal. So, it speaks a lot to the power of where Drupal's going as a platform and what I was talking about earlier with the Drupal blinders. Drupal's a small component of larger systems, and there are so many projects out there that do so many neat things. Ruby's awesome, the Ruby community is incredible. There are so many gems, so many things that you can plug in and just get content out of, or achieve so many tasks with. So, the other thing that I'd like you to walk away with from this presentation is to think outside the box.
You know, don't necessarily convince yourself that, well, Drupal's gonna solve all of my problems and do everything that I need it to. There are other ways of integrating with other technologies that make a lot of sense and are very easy to do. That shell_exec command tends to raise security concerns for a lot of people, but if used properly, it can be very powerful. Thank you.