All right, we'll at least introduce ourselves in this session. So my name is Adam Shepherd. I am a complete Drupal noob, for those of you wondering who the heck is this person presenting next to Stéphane Corlosquet, who I'm sure you all know. This session is about linked data in Drupal. So if you know a lot about linked data and are already using it with Drupal, this might not be a great session for you. But I would say to that point that if you love fun and are interested in reminiscing about the days when data was dumb, I encourage you to stick around, because we will have a good time. And if you want to go to other sessions because you know this stuff already, that's totally cool. I'll just mention that we'll be having a BoF tomorrow at 11:45 on linked data in Drupal, so if you want to totally geek out on linked data in Drupal and all the idiosyncrasies there, come and join us.

All right, so I really want to immerse you in the experience of being an oceanographer and what it feels like to search for data. The data needs of these oceanographers are really vast, and the data is very difficult to find. So I want to show you the questions they ask of our software, just how bad our software is at handling those questions, and then ways that linked data might respond to those and improve that experience for the oceanographer. So, is everyone ready? Yeah, okay, I got a fist bump, so someone's alive. Here we go.

Okay, so to immerse yourself in the experience of an oceanographer, what we really need to do is go all the way back to 1984. I know webchick took us back to the 90s yesterday, but we're gonna go back even further, to the days when hair was super crimped, you had your Vuarnet sunglasses, the sweater over the shoulders, everything super preppy, and you really wanted to drive a Corvette. So I want you to imagine yourself as a teenager at this time. You're 16 years old and you have your license to drive. It's Friday night, and you and your friends are gonna get together at their house on the other side of town. But before all this happens, it's up to you, because you have a license to drive, to make sure this night goes off really well, right? You have to get food and entertainment for the night.

Okay, so you call the local pizza joint from your town, because you're like, hey, I know these guys, they're good. So you order your pizza and you tell them to deliver it to this address. And as you're talking to this guy, you're like, I don't know, man, this kid kind of sounds like a dweeb. He says, so what's your address? And you say, well, 1428 Elm Street. And the kid says, okay, we'll be there in 30 minutes or less, or it's free. Because back in the day in '84, that was really the way they sold pizza delivery to you: if the pizza doesn't get there in 30 minutes, we'll give it to you for free, right? So you just think this kid's a total dweeb and you're like, whatever, these guys are good, they're reliable.

Now you need to go find entertainment, right? So you hop in your car, you put your sunglasses on because the sun's going down, and in the 80s, you only wear your sunglasses at night. So you pop those bad boys on and you head out to secure the night's entertainment, which is a video rental.
So as you're heading out to the car, your friend comes out of the house, and he's like, hey, hey, hey, do you know what you're gonna rent? And you're like, no. He says, hey, get that one with that righteous dude and that bodacious girl, or, I mean, the bodacious guy and that righteous girl, and get the one with the light beams flying all over the place. You know what I'm talking about? The one with the force, right? And meanwhile, you're sitting there scratching your head. You're like, I have no idea, because you spend all your time thinking about the oceans and science and all that weird stuff. So, not to look like a total dweeb to your friends, you're like, okay, man, okay, I'll bring it back. And with that rich description of video cover art, you head off to the video store.

You enter this place, and remember, there's no Google, there's no computer search for finding where these videos are. So you have to walk up and down these aisles and figure out where this video is. And you're just feeling terrible, right? Because you're like, what is this force movie? I'm so dumb. I have no idea what my friends are talking about. And you're walking up and down these aisles and it's taking forever, and you're like, oh my God, the pizza's gonna be there in like 30 minutes, I gotta go. Then you just happen to stumble upon this one video and you're like, yes, I found it. It's Megaforce. And you're thinking, yes, I've got the righteous girl, I've got the bodacious dude, light beams flying everywhere, and it's got "force" in the title, right? Like, yes, I'm gonna come through for my buddies this time. We're gonna have a good time.

But little do you know that this is the beginning of where things start to get really unfortunate for you. Because back at the house, they're still hungry and the pizza hasn't shown up. And little do they know that you have no idea what Star Wars is. So they're just completely bumming, and they're in for a surprise when you get back with this Megaforce video, right? With the pizza situation, you didn't realize that there might be two Elm Streets in the same area. Because you're going to their house in this other town, you didn't even think to mention that you were going to another town. And if this were to happen in Austin, Texas, it just so happens that there are two Elm Streets about 15 minutes apart where this could actually happen. You don't know this yet because you're on your way back, but the pizza got delivered to the wrong house. And because it took a little longer than 30 minutes, that pizza was free. And the people that lived at 1428 on that other Elm Street totally didn't mention that they hadn't ordered that free pizza. They totally gobbled it up.

But there are no worries, right? I mean, at least you have Megaforce. So you get back to the house and they realize that you don't have the right movie, that you totally don't know what Star Wars is, even though it's in its prime, man. Come on. Your friends are totally dogging you out, right? But fine, we'll just watch this Megaforce movie and have a good time anyway. Except at this house, they have this other thing called a Betamax player. So you're taking this videocassette out and you're trying to put it in this machine, and it totally doesn't fit. Like, no way.
So 1984 was the time when there was this other videocassette format that really competed with VHS, and it was just a lot of confusion. It was bad, man, I can't even tell you. So this Friday night is completely trashed, and you walk away feeling like a loser.

So what happened here? And what the heck does this have to do with oceanography, right? In the search for this pizza, what happened was that you gave an ambiguous query and got no results back, which is often the case for oceanographers when they're searching for data, especially when they don't know exactly what they're looking for. With Megaforce, you found something, you got results, but it really wasn't what was expected. And finally, with the Betamax player, even though you had some data, this Megaforce tape, you couldn't use it because you didn't have the right tools to access it.

So let's fast forward to the present. You're this hardened, weathered oceanographer. Your experiences and a life of asking these questions have completely set you up for this type of career. I mean, the constant disappointment of not finding what you're looking for is like primal. That's prime oceanography right there, okay? But you have hope, right? The funding agencies are starting to fund these sort of grand challenges of science, like, wow, tell me what impacts the marine ecosystem on a grand scale: biology, chemistry, geology, physics, what's going on? And this gets you really excited. So you pitch one of these grand ideas and you happen to get funding. But you slowly start to realize, wow, I'm gonna need to use computers to figure out how to get data to answer these questions, because you're only an expert in one particular area. And to answer these big questions, like climate change and marine ecosystem ecology, you need other experts, who maybe collected data even 10 years ago, to augment and answer these questions.

So this right here is the data that you, the oceanographer, desperately need, and you have no idea what it is. You don't even realize that you need it yet, but this zooplankton abundance here is gonna help you answer the questions for your funding. This zooplankton right here was collected by some kind of net off the coast of Alaska, and that's pretty much all we know, because we're not experts on this stuff.

To answer this question, to give researchers access to this data, the national funding agencies back in 2006 realized that, for all the oceanographic data collection they had funded before, they weren't realizing the full potential of that data, because it was just a file and it was really hard to find. And as new research themes emerged, those data could be reused. So in 2006, the National Science Foundation funded a project called BCO-DMO. And I know BCO-DMO kind of sounds like some weird variation of pig Latin, but BCO-DMO stands for the Biological and Chemical Oceanography Data Management Office. What BCO-DMO, which is who I work for, was set up to do was really help the researchers through the entire data life cycle: through collection and analysis, to getting their data published online so it's accessible and discoverable, so that other researchers can reuse it, and various other things. So remember, you're an oceanographer. You're looking for zooplankton abundance. You have no idea where it is or how to find it. And here's where we step in.
So what BCO-DMO does is collect all this information surrounding that data file to make it easier for you to find it. And those things are really the people who collected it, the organizations they worked for, even the names of their projects, the cruises and the locations where those cruises went. That also means the names of the ships, or the platforms; oceanographers call these things platforms. Platforms are typically ships, but they could be submarines or moorings or buoys. We also collect information about the instruments that are used and deployed off these platforms and the measurements that those things collect. And then, of course, the dollar bills. You always gotta know who spent the money on this data, right? So all this information that we collect about data files is gathered up and stored in a database, and as of last year, we're running that database and presenting that information online through Drupal. All this information gives what we call context to those data files.

So BCO-DMO had these aha moments throughout the years, and the first aha moment was: data needs context. Let's take, for instance, this image right here and imagine it was a data file. What do you think this data file would be talking about? What got collected? What got measured here? Is it a candlestick, or is it two faces facing each other? What was this researcher really looking into? Was he researching antiques, or was it just an awkward invasion of personal space? We have no idea. So context, as we understand it, influences the understanding of a subject by surrounding it with information. And I'm not telling you anything new; this isn't some big aha moment for you. You guys are Drupal developers, you get it, right? Because when we build websites, we put all this information around the main idea of that web page or that content, like related links and photos and videos and media and all this stuff.

One of the most important contexts for an oceanographer is the geospatial context. Where was this data collected, right? And let me find it on a map somehow. BCO-DMO really understood this and knew that we needed to present this to the oceanographers in a way they'd understand. So we wanted to build an interface to do this, and build it around the cruise, right? Because cruises have these really nice tracks that you can follow along the map, whereas data might be collected at these little points all over the place, which makes it really hard to digest on a map. So this was our first iteration. You can see that you can use this map down here at the bottom, zoom in, look at these cruise tracks, click on them, and get more data. But if you don't want to use the map, you've got these facets up at the top, and they're using words like program and project and deployment. And that might mean something to an oceanographer, like which project got funded or which deployment or cruise was used to collect this data. But if you're looking for your zooplankton abundance, how do you find that here? So naturally, like all other folks that put up a search interface, you throw up a keyword search, a free-text search. And so you're the oceanographer, you come to this site and you type in, well, I want to search for plankton off the coast of Alaska.
And these are the results you get, and they don't look much different from before, because the machine just does a horrible job of processing this free text until we give it some idea of what those words mean. So you're this researcher and you still haven't found what you're looking for. I mean, come on, this zooplankton abundance is out there somewhere. Come on, help me find it.

So the next aha moment that BCO-DMO had was that we need data to be cooperative, right? To address these grand challenges, where data might be all over the place, we need interoperable systems to aid in the exchange and discovery of this data. There are a couple of places out there where there's data, and we can access those through web services and APIs, and that's great, APIs are great, but the problem here is that the machines can't consume the data even though the humans can. Even if these places have APIs, APIs only work because we programmers and developers can look at the documentation, consume it, and then write programs around it. And we realized that these data centers were gonna be popping up all over the place, and it just didn't scale to implement custom APIs for each individual one. Furthermore, sometimes the web services behind these APIs can be kind of vague as to what they're going to deliver you.

So just to give you a couple of examples, right? Most of you have seen LinkedIn and Facebook, and they use words like "connected" and "friends" to describe relationships between things. So you as an oceanographer, right? You've got a LinkedIn account and you're connected to another oceanographer, but you're also connected to your next-door neighbor, because last week they sent you an email that said "connect with me" and you felt guilty, and there's all this backstory to why you said yes. So now you're connected to these two different people in two different ways, but you're still just "connected," right? So what does our site do with that data? Can we infer that your next-door neighbor worked on your oceanographic research with this other connected person? Or with Facebook: you're friends with your coworkers and you're friends with your family, but you're also friends with your high school acquaintances. Maybe there's something here that's missing in the way we describe our data.

So let me give you another example. Here's a picture of the moon, and in the middle of this picture is a red dot where Apollo 11 landed. Apollo 11 is the mission that sent Neil Armstrong and Buzz Aldrin to the moon, and Neil Armstrong stepped out and took his first steps right there at that red dot. And just to the left of that is this crater called Copernicus. There's software and websites out there that try to mash up data, and one of those goes out on the web and looks for lat/long coordinates. And it's really cool, you can map Copernicus, this lunar crater. But as I did this, I was looking at it and I was like, wow, that map just looks kind of weird, right? As you zoom out, you realize that Copernicus got plotted in Chad, Africa, which means that Neil Armstrong walked on the moon for the first time in Southern Sudan, which is crazy. Like, did he get hit by an antelope or something? I don't know. But the point here is that either the data didn't have the right context to say, okay, this lat/long is really about the moon and not the Earth, or maybe the software just didn't have the wherewithal to figure that out.
Or maybe this is what the software intended, to show the relationship between Earth coordinates and lunar coordinates. We have no idea. But okay, that's a geeky example. Sorry, we work for science. Let's give one that's a little more digestible.

So let's say there's this site out there for grass-fed beef supplies, right? You can buy grass-fed beef products right off this website. The site owner's had this site up for a while, and he's not really making money off it. It's not really generating an income for him, and he's getting kind of worried that he might have to go down to McDonald's to eat, and that's really freaking him out. So he says to his site engineers, hey man, can you guys build some type of system where we can generate some revenue off this site? Maybe you can connect it up to products based on what users search for in the search box, right? We'll deliver them content, but we'll also deliver them related products that they can buy. And the engineers are like, yeah, I love a challenge. They're getting geeked up just about the challenge, right? So they go off and implement this thing.

And Healthy Joe comes to the website because he's heard some story that coastal beef is way tastier than midland beef. No offense, Texas, your beef is great. But anyway, Healthy Joe's like, you know, is this really true? Is coastal beef really tastier than midland beef? So he comes to this website and types in "coastal beef." That algorithm goes off and finds this article about coastal beef, but it also goes out to the internet and is like, okay, what products can I find about coastal beef? And if you Google "coastal beef," the first two hits are about hip-hop and the deaths of Tupac and Biggie. So now that algorithm has gone off and found, okay, Biggie had this album called Ready to Die. Great, it's a product, let's sell it. So this coastal beef site is now selling you a product that says, are you ready to die? Maybe this isn't the message we want to send on this website, right? So Healthy Joe completely freaks out. He doesn't know what's going on or what this whole coastal beef thing is. So he totally leaves, he doesn't even buy any beef, and now you're really out of pocket.

So what we're realizing here is that for reliable data exchange to happen between systems, we really need semantic interoperability. And semantic interoperability sounds like some phrase that an undergrad just throws into a paper to sound smart, right? But semantic interoperability in this case means that the sender knows that Ready to Die is a musical work, and that the receiver of that data should interpret it as a musical work. So you can define semantic interoperability as a means of shared understanding between two systems throughout content exchange. And I want to make this point to all my web developers and content producers: if your content is worthy of a website, then it deserves to be understood not only by the humans, which we do a really good job at, but also by the mediators and the machines. So I think that linked data really solves this problem well, and I'll show you how it solved it for BCO-DMO in terms of this geospatial mapping interface.
So I'll throw up a little screencast and talk you through it, and then we'll dissect what's really going on here so we can understand what this linked data thing is all about. So let me move this over here, and am I any close to making this full screen? So I'm gonna have to hunch over this desk. This is crazy.

Okay, so what's going on here? This is the next iteration of the geospatial mapping interface, with linked data. So you're this oceanographer, right? And you're coming to this, and all you know is that you want plankton data, and you want it in the northeast Pacific. And you know that there was a ship, a platform, called the R/V Wecoma that does work in the northeast Pacific. So you go off and find that, and it adds that little R/V Wecoma to your search criteria. And you know that you're looking for... what are we on? Parameters, instruments, help me out here, people, I can't see. Is it biota? Okay, so you know that you want abundance data, so you add that to your search. And this is just compounding the facets and drilling right down as you add more of them. Then you add your instruments; you're looking through these categories, looking for a specific instrument or even just a category that relates to the instrument. Like, I don't even know what plankton nets are called, but I know I want a plankton net, so let me grab that.

And finally, out of this list of people that did work in the northeast Pacific, you recognize Ken Coyle's name. So you're like, ah, Ken Coyle, I know he does good work. Looks like he did some plankton net work, so let me click on him. And this really drills down into what he did, and you can click on what looks like a dataset here, even though it has a completely illegible name. And you can go off and look at the metadata in our Drupal site and say, okay, there's Ken Coyle, he was the co-PI. So if you went through traditional search mechanisms, you might not even find that Ken Coyle was associated with this data. So you go back to the interface and you're like, yep, this looks good. Let me just plot it and see what kind of data there is. Now with abundance data there are different ways to classify the organisms, and you can do that by stage. These stages are classified like, show me how many adult males or adult females. So you find some species that you know, and you say, map the abundance data and then show me all the adult males, right? So you're waiting, waiting, and up pop these little dots, and you can click on them and see the exact results, the abundance values. And great, this is exactly what I want. I didn't have to look through a million cruises on this wonky interface, and I can download the data right there. Okay, so let's cut the video there.

All right, so what's the application stack here? This is MapServer, some open-source software, sitting on top of OpenSearch, which is just an open standard for describing search engines. That's sitting on top of linked data, and Drupal is supplying this linked data. And at this point, if I were you, I'd be thinking, oh, what's the big deal? That's just faceted search. There's nothing amazing about what we just saw, right? What linked data was doing there was giving us some facets from third-party data. This is data that BCO-DMO didn't collect, didn't curate, didn't do anything with, but found through linked data.
And those two facets were the instruments by type and the parameters by type. Those two facets came from this community called SeaDataNet over in Europe. This is a conglomerate of 44 different data centers and organizations across Europe that got together and formed this repository of oceanographic terms, and they're really considered the de facto authoritative source for describing instruments and parameters. And one of their partners, the British Oceanographic Data Centre, BODC, decided to expose it as linked data. And we think it was just assumed that maybe other partners among those 44 would want to use it in a common format, but they had no idea that one of their consumers would be across the pond, 3,000 miles away, at BCO-DMO.

So what does linked data do for us? In addressing the three ahas that BCO-DMO had, linked data provides context through RDF vocabularies. It provides cooperation through common frameworks, which means that my data and their data can talk and exchange ideas over a common format and framework. There's this idea of cooperation without coordination, which means that I didn't need to call up BODC and say, hey, I wanna use your data. I didn't have to call SeaDataNet. I didn't have to register for an application key or an OAuth secret key or whatever all that other stuff is for APIs. I could just use their data without having to do anything but call on it. And finally, semantic interoperability through HTTP URIs for naming things.

So I've just described all these funky things. What the heck does "HTTP URIs for naming things" mean? What is cooperation, what are RDF vocabularies? It kind of feels like we just drove off a cliff, right? But I don't wanna explain too much, because the good news is that on Drupal these tools make it so easy to implement these things that all this funky junk, the RDF vocabularies and semantic interoperability through URIs, gets abstracted away. Linked data and Drupal really play nicely.

So on this URI thing: Drupal content already has globally unique URIs, right? You've got node/NID, taxonomy/term/TID, and user/UID. Drupal also has mechanisms for serializing that content in many different formats. You've got RestWS: Drupal already serves your node as HTML without RestWS, but with it you can also serialize it as JSON and XML. And finally, the Drupal community has these really awesome folks like scor who contribute modules that do amazing work. So with these three, you can really publish quality linked data. So I wanna introduce scor and bring him up to talk about RDF.

Thank you, Adam. So I wanna tell you a bit more about the modules that Adam used to build BCO-DMO. It starts with Drupal 7 core as a basis. Like Adam said, entities all have globally unique URIs out of the box. And then what we do for generating RDF out of the data that lives in Drupal is that we map your content types to RDF classes and we map your fields to RDF properties, and we generate RDF data from there. What happens by doing so is we give more context to your data: instead of being ambiguous data, it becomes much more contextualized, and it can be reused and understood by other peers.
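As a rough illustration of that mapping idea, here is a minimal sketch using Drupal 7 core's RDF hooks. The module name, the "cruise" content type, the field names, and the "odo" namespace URI are hypothetical, not the actual BCO-DMO configuration:

```php
<?php

/**
 * Implements hook_rdf_namespaces().
 *
 * Registers a hypothetical "odo" prefix for an ocean data ontology,
 * alongside the prefixes Drupal core already provides (dc, foaf, xsd, ...).
 */
function example_rdf_namespaces() {
  return array(
    'odo' => 'http://example.org/ocean-data-ontology#',
  );
}

/**
 * Implements hook_rdf_mapping().
 *
 * Maps a hypothetical "cruise" content type to an RDF class and a couple of
 * its fields to RDF properties, as described in the talk.
 */
function example_rdf_mapping() {
  return array(
    array(
      'type' => 'node',
      'bundle' => 'cruise',
      'mapping' => array(
        // The content type becomes an RDF class.
        'rdftype' => array('odo:Cruise'),
        // Fields become RDF properties.
        'title' => array(
          'predicates' => array('dc:title'),
        ),
        'created' => array(
          'predicates' => array('dc:date'),
          'datatype' => 'xsd:dateTime',
          'callback' => 'date_iso8601',
        ),
        'field_chief_scientist' => array(
          'predicates' => array('odo:hasChiefScientist'),
          // 'rel' marks the value as a resource (URI) rather than a literal.
          'type' => 'rel',
        ),
      ),
    ),
  );
}
```

With the core RDF module enabled, mappings like this are emitted as RDFa in the node's HTML output.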
Another module that Adam is using is RDFx, in contrib. The main point of this module is that it provides a way to serialize RDF in other formats. Drupal core alone only supports one format, which is RDFa in HTML. The RDFx module allows you to serialize RDF in other formats like JSON (with JSON-LD), XML, and Turtle, and that's also combined with the RestWS module. And we have a picture of several bags of chips here just to illustrate the fact that serializations are just different flavors of the same data. At the end of the day, no matter what format you use to serialize your data, the meaning remains the same; the data model remains the same.

We also have an RDF UI module. This module is useful for creating the mappings I was telling you about, to the RDF classes and the RDF properties. You can reference pretty much any kind of RDF vocabulary out there: you import a namespace and then you can choose the mappings for all your content types and all of your fields.

I just wanna reference a video of a talk that Tim Berners-Lee did a few years back where he was showing a bag of potato chips. His point was that if you look at a bag of potato chips, the front is usually for us humans. It's got a delicious picture of chips; they wanna make it appealing for us to eat. And then on the back, you have a bunch of different data targeting different kinds of consumers. There's the rectangle of nutrition facts. Then there's the barcode, which is used by the store. There might be a list of ingredients, again for us humans. And then you might also find a kind of obscure number somewhere at the bottom, and it doesn't really make sense to you, but it makes sense to the manufacturer, and it was probably printed on the bag at the factory. So this is to illustrate the fact that you might have different kinds of data, and you might not understand all of it, but it might make sense to other people and other organizations. RDF supports that; it's fine. You don't have to understand all of the data that you find in your RDF. And that's why we have different vocabularies describing different kinds of data.

So one of the things that happens when you turn on RDF or RDFx is that you get default RDF vocabularies out of the box for the most common content fields. You can add others as well. But I wanted to put up on the screen what BCO-DMO is using for vocabularies, just to give you a hand, or maybe a good direction on where to start. First off, we created our own vocabulary called the Ocean Data Ontology, just to describe all the intricate relationships between cruises and people and the projects and the funding and all that stuff. But really, we ended up using a lot of vocabularies that already existed. For generic metadata, we use Dublin Core; this is like the created date of your nodes, or the updated date, or the title of your node. We use DCAT, which is a dataset catalog vocabulary; that helps us describe what the datasets are, like where you can download those files. We use FOAF, which stands for Friend of a Friend; even though I made fun of "friends" earlier, this is a pretty good vocabulary, and we use it to describe people and organizations. We also use another vocabulary called VoID, the Vocabulary of Interlinked Datasets. Wicked nerdy, I don't know how they came up with that title. But VoID describes how your content links to others' content.
So this would describe how BCO-DMO links to the SeaDataNet vocabulary, and it gives machines a way to figure out how you talk about those relationships. Another vocabulary that we use is GeoSPARQL, and this talks about geometric features. So when we talk about cruise tracks, or where an instrument was towed and the path it followed, GeoSPARQL describes that content. And then finally PROV-O, which is really about provenance: revisions and activity. This user updated this node, or this person deleted that, or this node was generated based on this other node. And I just wanted to put up a link here, in case you're interested, showing that you can create your own RDF vocabularies, and this is a good place to start.

So if you're gonna power up your sites, the first thing you wanna do is enable RDF and RDFx. You don't have to do anything with those things; just enable them to gain immediate semantic interoperability on your site. The next power-up you can do on your site is really about how you make this data queryable. In that stack of technology we showed earlier, what's sitting in between OpenSearch and the linked data is the SPARQL language, which lets OpenSearch figure out how to talk to the linked data. And so I wanna bring up scor to talk about his next module.

So this is another module. It's kind of the cornerstone of BCO-DMO. We've seen how we can publish data as RDF in different formats, but what about querying this data? If you don't know what you're looking for, you want an interface to query it. So I wrote this module, RDF Indexer, and its role is simply to index your data in a triple store; it uses the Search API to keep track of all the entities that need to be indexed. On this diagram, you can see how it works. You have your Drupal site on the left, a regular stack for a Drupal site, and in the same way that you could have a Solr instance on the top right, the RDF Indexer module works the same way. There's an RDF store that serves as an index in the bottom right corner, and the site will ship data: every time there's an update to an entity, it ships that new version of the entity to that index, to this RDF store. And this RDF store could be any backend. On this diagram, it's an ARC2 backend. Adam will talk about another backend in a moment, but that doesn't matter; it's extensible. It uses Search API, like I said. So when you register your server or your index, you can specify how to access it. By default, it's ARC2, so that's a local store. If it's a remote store, as it would be most of the time, you might have to fill in the IP address and whatever HTTP credentials are needed to be allowed to index and send data. And depending again on your backend, you will most likely have a SPARQL endpoint. This is just the ARC2 SPARQL endpoint that you get out of the box, but each backend has its own interface for its SPARQL endpoint.
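To make the "queryable" part concrete, here is a minimal sketch of how a Drupal 7 site might send a SPARQL query to such an endpoint over HTTP. The endpoint URL, the graph layout, and the properties used are assumptions for illustration, not the actual BCO-DMO setup:

```php
<?php

/**
 * Fetch the titles of datasets attributed to a given person URI from a
 * SPARQL endpoint (hypothetical endpoint and vocabulary choices).
 */
function example_datasets_for_person($person_uri) {
  $endpoint = 'http://triplestore.example.org/sparql';
  $sparql = "
    PREFIX dcat: <http://www.w3.org/ns/dcat#>
    PREFIX dct:  <http://purl.org/dc/terms/>
    SELECT ?dataset ?title WHERE {
      ?dataset a dcat:Dataset ;
               dct:title ?title ;
               dct:contributor <$person_uri> .
    }
    LIMIT 25";

  // Standard SPARQL protocol: the query goes in the "query" parameter and we
  // ask for JSON results via the Accept header.
  $url = $endpoint . '?query=' . rawurlencode($sparql);
  $response = drupal_http_request($url, array(
    'headers' => array('Accept' => 'application/sparql-results+json'),
  ));
  if ($response->code != 200 || empty($response->data)) {
    return array();
  }

  // Walk the SPARQL JSON results and collect dataset URI => title pairs.
  $titles = array();
  $json = drupal_json_decode($response->data);
  foreach ($json['results']['bindings'] as $row) {
    $titles[$row['dataset']['value']] = $row['title']['value'];
  }
  return $titles;
}
```

A front end like the mapping interface in the screencast can fire queries like this directly at the endpoint; Drupal itself only has to keep the index up to date.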
So when scor released this RDF Indexer module, my eyes immediately lit up, right? BCO-DMO was doing all this work before we got to Drupal. One of the ways we made our data queryable was writing PHP scripts that talked to a MySQL database and created the RDF, and then that had to wait for a batch process to take all that RDF and dump it into the store so it could be queried. What a pain in the neck. So when this module came along, it was like a light went on: oh man, we can leverage this to import, on the fly, the RDF data that RDFx is already generating for us. We don't have to do all that anymore; we can throw away those old PHP scripts and have the Search API just automatically update this index. And because we were using an external store called Virtuoso, we were able to very quickly write a patch to index, update, and delete out of Virtuoso. I've thrown up where you can find the patch, at the drupal.org link, and here's the BCO-DMO SPARQL endpoint if you wanna play and just look at the data that's in there. So power-up number two is to enable RDF Indexer to make your data queryable, and just use the defaults like Stéphane said. You've got the ARC2 store there, and that'll create a SPARQL endpoint for the data on your Drupal site.

We've been talking about how BCO-DMO links to SeaDataNet terms, and I just wanna show you real quickly how that looks. Here's a quick snapshot of one of our parameter pages; this is the abundance parameter at BCO-DMO. You can see a little link there that says external identifier, and you can click it. That's really the only connection between what we think abundance is and what SeaDataNet says abundance is. It's really just a content type field that we populate with a link. And if you click that link to SeaDataNet, you get this RDF serialization; they just happen to serialize as XML. You can see that this is that biota, abundance, biodiversity category that we clicked, which was somewhere along the hierarchy of all their terms. And you can see there are all these narrower terms, more specific parameters. So somehow our abundance parameter is linked to, and can relate to, all these other more specific parameters about biodiversity and biota and abundance, and what that means, just through a link in a field in Drupal. And just to show you what our RDF looks like in the same format, you can see that this link to SeaDataNet is right there in this RDF file, and this is what creates semantic interoperability between BCO-DMO and the SeaDataNet terms.

The last power-up I wanna talk about is that you can generate value for your own content by linking to other datasets, and there's a ton out there. Here are three examples: Freebase, Wikidata, and DBpedia. One of the advantages we've been finding at BCO-DMO is that by linking to the SeaDataNet terms, when BODC starts to add multilingual support, like here's the Spanish version of this term, here's the French version of this term, that provides multilingual support at our site for free. We can now accept the French term, query it against the SeaDataNet dataset, figure out what it means in English, and return that back to the user. So there's a ton of linked data out there, a ton of SPARQL endpoints. Whatever your use case is, whatever you're trying to do on the web, whether that's expanding knowledge or generating revenue, there's a ton of data out there and you can leverage it.
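As one concrete example of leveraging an external endpoint, the multilingual lookup described above boils down to a small SPARQL query against a SKOS vocabulary. A sketch of what that query could look like is below; the French label, the endpoint, and the exact graph layout are assumptions, and the query would be sent the same way as in the earlier example:

```php
<?php

// A sketch of the multilingual trick: given a label typed in French, ask a
// SKOS vocabulary (such as the one BODC publishes) for the matching concept
// and its English preferred label. "abondance" is just an illustrative input.
$sparql = <<<'SPARQL'
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label_en WHERE {
  ?concept skos:prefLabel ?label_fr ;
           skos:prefLabel ?label_en .
  FILTER (lang(?label_fr) = "fr" && str(?label_fr) = "abondance")
  FILTER (lang(?label_en) = "en")
}
SPARQL;
```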
So I wanna close by sharing a quick excerpt from the innovation report that was published about a month ago by the New York Times. They're talking about linked data and structured data, and they're sort of remorseful about not doing it sooner. They say there's a substantial cost to waiting, and they cite an example about their recipe database. They say, we've floundered for 15 years trying to build a useful recipe database, and it didn't work for all that time because it wasn't properly tagged by ingredients and cooking time. And it's just so weird to hear that; you'd think, well, why wouldn't you? I guess the point here is that we have no idea what the next new facet for searching recipe data is gonna be. Maybe it's, oh my gosh, show me all the recipes from this part of the region. We don't know, but if you have that semantic information about these recipes and you tag it in RDF, it enables you, much later down the road, to respond to these situations when they arise. The report goes on to say, we can do it now, but only after spending a huge sum to retroactively structure this data. And they sum this section up by saying, our lack of structured data helps explain why we're unable to automate sales of photos on the New York Times website and why we continually struggle to attain higher rankings on the search engines. And they finish by saying, we need to reclaim our industry-leading position, but right now our needs are far more basic; we must expand the structured data we create.

So I wanna close this talk with a quote from Mark Twain, who said, "I like a good story, well told." I know you guys have great content, and I hope you tell it well with structured data, because it's all about the context and the content. Thank you very much.

I just wanna throw up a couple of references that really helped me out. This Linked Data book by David Wood, published by Manning, is a fascinating resource that talks about other use cases for linked data, specifically the BBC, who, with a small development team, really leveraged linked data to push out content that they didn't own at all by using Wikipedia data. Learning SPARQL is a really great resource for figuring out how to query this data; it's very similar to SQL, and it's really easy to jump into. This book is great. And then finally, The Definitive Guide to Drupal 7, chapter 28, which Stéphane contributed. And lastly, I just wanna mention there's a BoF on linked data. So if this topic interests you, or there's something you have questions about that we can't address at the end of this talk, we encourage you to come out to the BoF and totally geek out with us and hit us with questions. So we'll open it up to questions now, and you can always hit us up later on Twitter or drupal.org. So yeah, have at it.

Hi, thanks. Could you talk a little bit about the manual process involved in adding a dataset and getting it into RDF format?

Sure. So the actual data files that BCO-DMO gets might just get emailed to us by the oceanographer, and that just gets put on a server somewhere. What NSF is funding at BCO-DMO is really the man-hours for these data managers to go out and collect all this metadata: okay, who are the PIs involved? What was the project? Who funded them? This is something that, before BCO-DMO existed, would typically be the responsibility of the oceanographer, the PI, but they just don't have time for that. They just wanna do their research. They don't care if someone else can find their data, but they care when they can't find other people's data. So that's why BCO-DMO got funded, and that's why we really feel like this effort is important.
Can you talk about Solr?

Sure, I'm gonna turn that over to scor, because I have no idea.

So yeah, I mentioned Solr earlier, and it was really as a comparison. BCO-DMO is not using Solr as far as I know. It was just to say that the way Solr usually works is the same way the RDF Indexer module works. Drupal will send data to a Solr server to be indexed. Solr is a search engine that is very fast, for full-text search. It also has facets, so you can have a very nice, very fast, efficient search experience. Drupal sends the data, the entities, to Solr, and then when the time comes to answer a search query, it queries the Solr index for results. We use the same approach in RDF Indexer: we send our entities, translated to RDF, to the triple store, but then the search part doesn't necessarily happen between Drupal and the RDF store. Typically, there's a SPARQL interface, and oftentimes it's gonna be a different workflow. In the case of BCO-DMO, it's a different interface: the user interface that Adam demonstrated earlier, with the map and the different facets, that's not Drupal. That's built separately, and it requests the data from the SPARQL endpoint as RDF and then visualizes it inside the user interface. So I just mentioned Solr as a comparison, to say we use the same idea as Solr, where we send the data off for indexing. Does that make sense? Yeah, yeah, okay.

I have another question if nobody else has. Are you storing this external data? (In the mic, please.) Are you storing external data? You're gathering this from external sources; do you store it in Drupal or not?

Yeah, that's a great question. So there are a lot of usability issues with these SPARQL queries, especially if you're linking off to external datasets: what's the availability of that SPARQL endpoint? What if it goes down and you have an oceanographer in the middle of a query on that interface? So what we do is harvest all the external data that we need, the stuff we link to at SeaDataNet, and we store it inside our Virtuoso triple store. So we don't store it in Drupal; we keep it in a separate location in the triple store so that we can query it at will, and we just keep it up to date behind the scenes.

How often do you update that, daily?

Yeah, we do it nightly. And I was just gonna say that since we've used this data from SeaDataNet, we've established relationships with folks at BODC, and to be honest with you, I'm sure we knew them before we started this collaboration, and they probably mentioned this data to us. I wasn't at the office when that happened, but we've talked to them, and we've realized they don't update it all that often, so we felt like that was fine.

Did you ever watch Megaforce?

I've never watched Megaforce, but that slogan, Deeds Not Words, totally makes me wanna watch it. Can we please get Megaforce showing at the Alamo Drafthouse tonight? Come on, man.

Hi, I'm new to linked data. The question I have is, I've used the schema.org module, and I would like to understand the differences between that and RDF, and the use cases for it.

Okay, yeah, that's a good question.
So I would qualify schema.org as more of a lightweight solution compared to this. Remember I mentioned something about the vocabularies earlier, how we pick a vocabulary depending on what we're describing, and Adam showed the list of vocabularies that they're using. Schema.org is just another vocabulary, and it's in fact maybe the main vocabulary that people should be aware of and possibly use first, to see if it covers their needs. Schema.org was designed and sponsored by all the main search engines, like Google and Yahoo and Bing, to cover the majority of the use cases of regular sites, like e-commerce and recipes and events and hundreds of other types. But it doesn't necessarily cover the more niche topics like oceanography research. That's why you need to add more specialized vocabularies on top of schema.org, like the ones we're dealing with here. But if you're just starting out, schema.org is a fine vocabulary to start with, and there is in fact a module called schema.org that makes the mapping process a bit easier. It abstracts away the aspect of multiple vocabularies: the UI just talks about schema.org, because oftentimes that's all you need. So you can start with schema.org and then, later on, turn on the full set of RDF mappings to integrate more vocabularies if you need them.

Does RDF allow you to create your own vocabularies, then? Like a domain-specific language?

Right, RDF in general, and that's not specific to Drupal, but yes, RDF was designed to let you design other vocabularies and mix and match vocabularies the same way you mix and match data. So yes.

Yeah, I just wanna say something about that question specifically, because it really illustrates the beauty of linked data: linked data is really built from the bottom up. You can build your own RDF vocabularies and talk about the relationships between your concepts the way that you understand them and the way that your site thinks about them. You don't have to let someone else dictate to you what those things are. And you might think that's a bad thing, like, oh geez, well, how do we then interoperate if we all think differently about these things? Well, RDF provides a mechanism for you to link to those other vocabularies when they emerge and describe how your concept relates to their concept. So for instance, take the bag of potato chips and the nutrition facts: you might come out with your own nutrition RDF vocabulary, in the way that you think nutrition is important, and then the FDA comes out a couple of years later and releases its de facto nutrition RDF vocabulary. Linked data and RDF give you a way to say how your notion of calories relates to the way the FDA talks about calories, if that helps.

You had a list of RDF vocabularies. I was wondering, do you manage some of them within Drupal, or is that in a separate place?

Yeah, great question. So I don't use Drupal to manage those. The tool that we use is called Protégé, which helps you build these vocabularies and serializes what's called a Web Ontology Language file, an OWL file. But I would be happy to use Drupal to do that, because it's got all the revision tracking built right in, which is really hard to build. I could try it.

Maybe a weird question to ask, but how did you integrate Protégé into Drupal? And what's the interface to manage or reference those vocabularies? Is that also tagged in Drupal, or is that separate?
No, sorry, maybe I misspoke. We use Protégé to help us build these RDF vocabularies, and the outcome of that is an OWL file. Then we serve that from a website somewhere, which might be Drupal, might not be. We just need to make sure that it's accessible on the web, so that if someone dereferences the URI for that OWL file, they in fact get some type of RDF that describes that vocabulary.

And then I think you basically copy-paste from the OWL file URI into RDF UI, right?

Yes, classes and properties, basically. So in our BCO-DMO site, when we say, okay, this thing, this cruise, has the Ocean Data Ontology class Cruise, we could do that through RDF UI, that user interface that gives you a way to express those things. But you can also do it in code; there are some hooks you can use, which I'd be happy to show you afterward if you wanna ping me about it.

And when we talk about code, I forgot to mention that all the modules we showed today, especially RDF Indexer, and also the mappings, all of those configuration settings can be exported as Features. So you can deploy that very easily and manage the versioning. If you make changes to your data model, all of that goes through code, so it can be tracked very easily. You can use the UI on your localhost to set the mappings up, then you export that into code, it lives in code, and then it gets deployed.

Thank you. So we hope that you enjoyed this session. Please go off to the website and evaluate us; tell us how horrible we did, or how good we did, or something you thought was funny, like Megaforce or something. Thank you so much for coming. And I just wanna remind you that we do have a BoF, which is tomorrow, I understand. Yeah. So come check us out. I think we had a slide, but I don't know where... oh, there we go. So come and talk to us more. It was great to have you.