Okay, we'll make a start. So first things first, soundcheck. Those at the back, are you happy with this? Thank you. Second thing is, unlike many presenters, it might not sound like it, but English is actually my first language. I will speak too fast, and you will object. That is your role. So when things get too fast, let me know. So this presentation is about using... I have to say, this was a bit of a rushed session proposal, and it shows in the title. I really intended to tidy this up at some point and come up with something shorter and snappier, but it never happened. So here we are: using Varnish to serve content from your new Drupal site alongside your legacy platform. Having your cake and eating it too: keeping two platforms live at the same time. Again, at the back, there are seats at the front. Feel free to take them, and leave again if it's not working for you.

For most website projects, the scariest thing is the big switch. It's the day you go live with your brand new website. And we're often faced with problems, in that not everything is ready the day we go live. So we're looking for ways to alleviate those problems and make that big switch a little bit less scary. In particular, we're looking for ways to keep our old and legacy platforms running at the same time, and that's what this presentation is about. We're going to talk about how we used Varnish, how we used Fastly, some VCL. We're going to talk a bit about the strategy that got us to the decision on how we'd use those technologies.

A little bit about me. My name is Alan Burke. I'm a director at Annertech. We're celebrating our 15-year anniversary this year. Drupal sponsors, Drupal Association partners. Come and talk to us at our booth and we'll tell you about all the amazing things we do. Is that enough of a pitch? We work with some really, really nice clients and we've got lots of talented people. Okay, we'll go with that. We'll go with that.
So yeah, that's what my job is: to build websites and keep our clients happy. So let's talk about the problem we have. The situation is: we've built our brand new shiny website, we're about to go live, we've done all our tests, and everything looks great. Launch date arrives, we've sent out the invites to the party, everything's looking good. And then it turns out that the content is not ready. And this is a completely unprecedented situation, because up to now content has always been ready for every website just as we go live. But things are different this time. The content is not ready.

So what are we going to do about this? What we're going to do is look and see what content we have ready and can go live with, and what content is not ready to go live. So let's have a quick look. Yeah, some of it is ready. And the reality is we're dealing with lots of different stakeholders within the organization.

Just a brief aside: the client for this particular project is the University of Limerick in Ireland. It's a large third-level university serving Irish and international students. It's kind of a big deal. They're lovely people. Some of them are here. They're far too shy to say hello until perhaps later on. But they've got to deal with a lot of different entities within the university who provide content for the website. And indeed, they provide a platform for these entities: different events that go live, different faculties or departments, or even research institutes within the university. They've got to serve all of these different people and make sure all their content is available on the central website. But not all of these entities are equal in terms of the resourcing they can provide for a new website launch. So they weren't really in a position whereby everything was going to be ready on day one.
And the reality is that forcing them or threatening them to be ready for this wasn't a viable option, much as we would like to do that. And I forgot my timer. Let's see how we get on.

Okay, so some of the content's ready and not all, and we can't wait. We've done our job as website developers. Again, seats at the front if people want them. We've redeveloped this whole new website and we've got our jobs to do. We've got to move on to the next project and we want to go live. So we're going to work out which bits are ready, and we can put them live immediately. And then the plan over time is that as new sections of the website become ready, they will incrementally be made live.

So we knew this was coming. We knew this was going to be a problem. Perhaps we were a little bit naive as to the scale of the problem, in terms of how much would be ready and how much would not. But we knew it was coming. We knew about the different entities. So one thing we did when we built out the new website was build it in such a fashion that it would be possible to bring sections of the website live before other sections. This site is a large Group module installation, and each group will represent either a website section, or a faculty or department within the university, or perhaps events or other short-lived websites that are needed within the university itself.

So what did the old platform look like? The old platform is still interesting, because it is still live. The bulk of it was one or, I think, two large Drupal multisite installations. They were not particularly well maintained and not particularly user friendly, and they were seen as the key target that we were going to get live in phase one. But there was lots of other stuff there as well.
There was a large selection of standalone Drupal 7 websites that were very much out of date and unmaintained, and they needed to stay live as well. There was some static content: old HTML pages, legacy pages that had been there for some time. There were some WordPress sites. There were some Perl scripts and duct tape and post-it notes and bubble gum and various other things just holding the whole thing together. But it did work. It did serve the content, most of the time.

But in many ways it doesn't really matter what the old website looked like, and that's the point I wanted to make here: regardless of what platform you're coming from, you can come up with a plan to work with an approach like this. It isn't necessary that it was a Drupal 7 website, or any kind of Drupal website. It could have been anything at all, and if you're in any way able to split old from new with some kind of a rule, however complicated that might be, then this approach should work.

From our perspective, the key part was to keep those rules within the URL. There are other things you could use for rules, like proximity to the user, or age of the content, or various other things, but we were going to base our rules on the URL. So one of the key things was that we needed to map the URL paths. Sorry, the key, literally the key, was the first part of the URL. That's what we were going to use to define what counts as an old part of the website and what counts as a new part. So as we built out the new Drupal site (well, it's now Drupal 10, but anyway, the new Drupal 8 website at the time), all of the groups fell into this first-part-of-the-URL scheme that you see on the screen. So we built out new sections of the site using that as the key for all of the new groups, and groups represent sections or, like I said, departments, research institutes, events. And that worked very well.
We were able to come up with a plan: okay, such-and-such a department is ready, they're on the new platform, their content is ready, and it's been maintained, possibly in both platforms, for a short period of time before they go live.

So the first thing we needed to do was split the traffic. What I mean by this is that as requests for URLs came in to the web server, we needed a way to take some of that traffic and send it to the legacy platform, and take the rest of the traffic and send it to the new platform. And what we needed was a front-end server that would do this work. So we have our own web server, as it happens, running on Platform.sh, though that's not particularly important in terms of the requirements for this. The legacy platform was there. We knew more than we wanted to know about the legacy platform. In many ways, all we really cared about was the fact that over there, at a particular URL, lived legacy content, and if that's all I'd ever had to know, I would have been a much happier man. Unfortunately, I learned a lot more about what lay behind that legacy platform, and you're going to learn some of it too.

So we sat down and worked out what options we had to split this traffic, and it turns out they really fall into two broad categories. The first is a couple of software tools that you could spin up on some bare metal and manage yourself, writing the logic in the domain-specific language of your choice to split the traffic. So we looked at Varnish, and the option of running our own Varnish server. We looked at tools like Squid and Nginx and Pound and a few more. Pound, interestingly, didn't have a logo that I could use on this slide. So that was enough to rule it out, because I wouldn't be able to talk about it in a presentation. But again, the point I keep trying to make here is that what's important is how we did it, not the specific tools.
We're going to talk a little bit about the specific tools, but you have a number of choices here, and if some of these things are within your stack or within your expertise, then they're totally valid for this kind of a solution. But we're not huge fans of running our own servers. We have friends who do that for us.

Another option is to use your hosting provider. The only problem, or maybe not the only problem but the main problem, with using the hosting provider that you've used to build the website is that by the time the traffic hits your newly minted server, it's sometimes a little bit late. By that I mean you want to intercept the request relatively early, so that you can apply your logic and decide which URLs should be served from the legacy platform and which URLs should be served from your new platform. Now, I was promised heckling from some people in the room, but Jochen has failed to heckle at the appropriate opportunity. Of course there are other hosting providers. If only there was somebody else who could help you out with this stuff as well. So I apologize for the lack of Jochen's logo on this slide as well. freistilbox, of course, could have handled this too.

Now, the limitations from a technical perspective might be those imposed by your hosting provider. At any of these providers, if you go and say, "I want to run my own custom Varnish configuration," then depending on how much money you're willing to throw at them, they'll either laugh at you or happily take your money. And in most cases, the amount of money I was willing to throw at these people, or had in my budget to throw at these people, wasn't really enough to stop them laughing. So it didn't seem like a realistic option for me to use their infrastructure to run custom rules with some of the software we talked about there.
Now, for the sake of an example, Platform.sh has some routing rules that you can put in a routes file. It isn't quite malleable enough for the use case that we had in mind, but it might be malleable enough for yours. It's a very early way to intercept the traffic and apply some logic to it; it just wasn't quite flexible enough for the requirements that we had.

So the other option is to look at dedicated cloud-based services whose bread and butter is intercepting traffic and doing something with it. So we looked at the CDNs that are out there and what they could do for us. Generally speaking, pretty much all CDNs have a concept of origins. An origin is a place where the content lives, so when they get a request in for a URL, they know they can send the traffic to a particular origin and then send the response back out to the client. Now, ordinarily they will let you configure multiple origins, so, just like our requirements, where we had a legacy system and a new system, we were able to define those multiple origins within our CDN of choice. There are other things you can do with origins: geographic proximity could be used as a rule to prefer one origin over another, or response time, or uptime, or things like that. What was slightly unusual here was that we were going to segment the traffic purely based on what the URLs were, nothing else; not staleness or uptime or the things you would ordinarily use a CDN for.

So in the end we went with Fastly. We've had good experience with Fastly as a CDN on multiple projects. It's got an excellent UI, in that it's very easy to use, and for most projects it's not quite set-and-forget, but it doesn't take an awful lot of configuration to get it working with normal websites.
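To make the "multiple origins" idea concrete: in stock, self-hosted Varnish, the two origins would simply be two backend definitions in your VCL. This is a minimal sketch with placeholder hostnames, not the real configuration; on Fastly, by contrast, origins are normally defined through the UI or API rather than handwritten like this.

```
# A sketch of two origins in stock Varnish VCL.
# Hostnames are placeholders, not the university's real servers.

backend legacy {
    # The old platform: multisite Drupal 7, static HTML, and friends.
    .host = "legacy.example.ie";
    .port = "80";
}

backend new_drupal {
    # The new Drupal platform.
    .host = "new.example.ie";
    .port = "80";
}
```

With both backends declared, the routing logic then becomes a matter of picking one or the other per request, which is exactly the split described above.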
One of the other things that was really useful for us was that, under the hood, Fastly runs Varnish. Now, I'm sure they run lots of other things too, but their key tool for serving cached content is Varnish. And as well as that being under the hood, they also let you add custom snippets of VCL code. VCL is the Varnish Configuration Language; it's a domain-specific language that you can use to apply this kind of logic. So you're not just limited to a rules UI that lets you do specific things, which is there, but you can also add your own snippets of VCL. And that was really key for our use case, because we couldn't come up with logic that fitted neatly into a UI, on Fastly or any of the other tools either.

So I think this is the only code I have in the presentation. I have no idea how legible that is at the back, but don't worry, I'm going to walk you through it, and it's not particularly important exactly what it does. I just wanted to point out that I had absolutely no knowledge of how to write VCL, so if there's anyone here who knows more about VCL, you can give me the list of things that are wrong with it much later on, please. But basically, that's the code that implements the slide we saw earlier on: it takes the inbound request, finds out what the first part of the URL is, and then runs some logic based on that to send it to the legacy platform or to the new platform.

And one of the key things we needed... so the logic itself is not woefully complex, but it wasn't something we wanted to be updating every day, nor was it something we were in a position to hand over to the client and say, "hey, you can maintain all of this in your custom code." So one of the things we needed was a simple way to update the rules on an ongoing basis. We'll take a step back briefly, because I want to talk about the plan for after we had gone live.
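Before moving on: the routing snippet I walked through a moment ago was roughly of this shape. This is a simplified, Fastly-flavoured sketch, not the production code; the backend names and the section names in the condition are invented for illustration.

```
sub vcl_recv {
    # Pull out the first segment of the path,
    # e.g. "library" from /library/opening-hours.
    declare local var.section STRING;
    set var.section = regsub(req.url.path, "^/([^/]+).*$", "\1");

    if (var.section == "library" || var.section == "research") {
        # This section has moved: serve it from the new Drupal platform.
        set req.backend = F_new_drupal;
    } else {
        # Everything else still lives on the legacy platform.
        set req.backend = F_legacy;
    }
}
```

Note that `req.url.path` is a Fastly VCL convenience (stock Varnish only has `req.url`, query string included), and Fastly prefixes configured backends with `F_`; a self-hosted Varnish version of the same idea would differ in those details.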
So we go live. We know we've got new content and we know we've got old content; old content comes from legacy and new content comes from the new platform, and that's all very good. But we needed a plan to make sure that, on an ongoing basis, we could bring new sections of the website live. The way that worked was we used a thing called lookup tables. Lookup tables are relatively simple sets of key-value pairs, within Varnish and within Fastly, where we could just put in a list of URLs and say, "hey, this list of URLs is in one specific group." Sorry, not URLs, just sections. And the useful thing is that it's trivial, through the UI, to add and remove items from these tables.

This was very important for us, because we wanted to make it very simple to bring new sections of the website live. As an example, whenever the library site was ready on the new platform, it couldn't be a very onerous or complex task to make that section live. It needed to be something we could do really quickly and really easily, and that's where these lookup tables came in. They're really useful because when those rules change, it happens instantly. You don't have to redeploy your Fastly configuration or clear your caches (beyond whatever caching rules are in place for keeping content around for a period of time); those changes take effect straight away. So this was really useful, and it's probably the key factor in our decision to use Fastly. It fitted really well with our plan for this site.

So let's see. Our first plan was what we call "default to new". What we really wanted in this scenario was: when a URL came in, by default we would send it to the new platform, unless for some reason it matched a value and had to go to the old platform. This seemed logical to us. It seemed like this would work.
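As a sketch of how that default-to-new rule can look: in Fastly VCL, a lookup table is queried with `table.lookup`. A table declared directly in VCL, as below, is static; the dynamic equivalent on Fastly is an edge dictionary, which is managed through the UI or API but appears in VCL under the same `table.lookup` interface. The table name, keys, and backend names here are all invented for illustration.

```
# Sections that must still be served from the legacy platform.
# In production this would be an edge dictionary, editable without
# redeploying the VCL; declared inline here only for the sketch.
table legacy_sections {
    "oldfaculty": "legacy",
    "archive": "legacy",
}

sub vcl_recv {
    declare local var.section STRING;
    set var.section = regsub(req.url.path, "^/([^/]+).*$", "\1");

    # Default to new: only requests whose first path segment appears
    # in the lookup table are sent to the legacy platform.
    if (table.lookup(legacy_sections, var.section, "") == "legacy") {
        set req.backend = F_legacy;
    } else {
        set req.backend = F_new_drupal;
    }
}
```

The appeal of the dictionary-backed version is exactly what's described above: removing "oldfaculty" from the dictionary flips that section to the new platform instantly, with no configuration redeploy.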
You know, the client would sit down with us and give us a long list of "hey, here are all the old sections that we're not quite ready with. Please put them in your lovely list, and then we'll take them off that list whenever they're ready," et cetera. And that was all going very well until we hit a 3,000-line .htaccess file in the root of the legacy platform. Now, 3,000 lines isn't the end of the world, but this .htaccess file was maintained, for want of a better word, by a lot of different people who didn't really know what was happening with it, and it turned out that the safest answer to any change or addition was "we'll just put it at the bottom and it'll keep working." So there were auto-generated segments of .htaccess, and there were handwritten segments that conflicted with other segments further up in the file. It was frankly too much at the point of launch, and we had to go with plan B.

So plan B was to default to legacy. What happened at the point of launch was that when traffic came in, by default it would be sent to the legacy platform, unless it was one of the sections we knew were ready to go live. So instead we had a different list: "hey, here are all the sections of the new website that are ready to go." If the URL matched any of those, that traffic got sent to the new server, back up to the CDN, and back down to the client. And that worked relatively well. Relatively well, but we did hit a couple of gotchas.

The first gotcha was system paths: things like /user and, let's see, /user, /admin, /sites/default/files, and more, as we learned along the way. There are certain paths that are just there at the root of Drupal, that Drupal expects to be available to use. And over time, as we saw things not working (this was pre-go-live, when we started doing our testing), we found certain things not working within Drupal. There were some very obscure ones whereby everything looked fine, we were pretty much ready to go, but then we
went to start using the media UI and suddenly images weren't appearing. So again, through a process of elimination, we could see that some of the requests for certain AJAX callbacks were being sent to the legacy platform, not to the new Drupal platform. In a way it wasn't too bad: all we had to do was make an addition to our lookup table. But immediately you can see one of the potential problems: a particular path might be a system path on your new Drupal website but have been a valid content path on your legacy platform. If, for whatever reason, you have /admin as a valid URL for content on the old legacy platform, that's going to present a problem when you launch your new Drupal website. We had some stuff like that, but we found ways around it by getting a little bit more specific in the rules we had in place. It's just something to note: what might be considered a valid path in your old legacy platform could also be a valid path in your new platform, and you'll have to come up with logic, like checking cookies or something like that, to determine where to send the appropriate traffic.

Let's see, any other gotchas to talk about? The other main gotcha was items of content that lived right under the Drupal root: things like /cookies or /privacy, or even things like /cookies/privacy. Anyway, that kind of content. We tried to convince the client: could we not just leave all that content under a specific section, /news/cookies or /home/cookies or something like that? And that didn't wash. What we had to do then was maintain another list, of content that lived at the top level and, again, should be sent to the new platform. So that was fine: we hit the problem, we worked through it with the client, came up with a solution, and the solution was yet another lookup table. And that's fine.

Now, one of the advantages of this setup is that it actually became really
easy to add new content to the new site. As the client worked on a section, say /library, they could work on it at a specific URL that represented the new platform. They could add all the new content, do testing in-house, get approval from their own internal clients, and when they were ready, all they had to do was make one change to the lookup table. So that was a nice feature we gained from the default-to-legacy approach we had to go with at go-live, and we kept it. What we did was make sure that if anybody accessed the new platform directly, that access was restricted by IP, so regular people could not get in there and have a look at new content that was on a live website but not ready for public consumption.

One of the questions posed to me was: would we ever be done? At what point can we say this temporary solution is now finished with, and we can take away this custom infrastructure that was never really intended for long-term use? It was there as a crutch to get us from old site to new site. Well, the answer was that the first thing we had to do was tackle that 3,000-line .htaccess file. And in fairness, the client sat down, worked through it line by line, and came back to us with: "actually, of that 3,000-line file, here's a spreadsheet with a list of valid redirects and valid rewrites; do with them what you have to do, Annertech." It turned out that was relatively straightforward. They were all reproduced either as redirects within the new platform's setup or, more often than not, using the Redirect module in Drupal, as content within Drupal. Once that was in place, we were able to switch the logic at the front end, in Fastly, to be new-platform-by-default, and we added another table representing the traffic that should now be sent to the legacy platform. At this stage that's a much shorter list; it's down to, I don't know, something like 10 or so subsites, last time I checked. And the process for going live
now is slightly different. What it means is that as a new site goes live, we delete from that lookup table, and the traffic then just hits the new platform. So the logic you saw in that VCL snippet is actually a little bit simpler now, and whenever that table is empty, whenever the last section goes live, we can take away all of this custom VCL logic and just use Fastly as God intended: as a CDN that caches the site and delivers good performance.

Let's see. So, lessons learned. I guess the first lesson is that this is all doable, and it's not necessarily a bad idea, because the alternative of waiting for all of the content to be ready was just not feasible, and no amount of forcing or cajoling or threatening was going to have the content ready for go-live. There are multiple other ways of managing that particular problem, but this, as a way to manage it, worked out pretty well.

One of the concerns might be: well, if the new platform has a completely different design, then the user experience could be pretty jarring as people move around and hit an old section and a new section. But we could see from traffic patterns that that didn't happen a whole pile. People generally went to a specific site and stayed there for most of their visit. The other thing was that we rebuilt the design in modern CSS; compared to the legacy platform, it wasn't a radically different redesign. So as user journeys moved from legacy to new and that kind of stuff, it wasn't particularly jarring.

This was more complex, though. We did have a few more moving parts, which we managed to eliminate, and that was definitely a lesson learned: there are things we could have done to make this a little bit fancier, but in the end they just represented relatively brittle infrastructure that we didn't really want to have to maintain. So we strove for simplicity, and we got there as far as we could.

The last thing is: if you're going to tackle something like this, you're going to learn an awful lot about caching, about
HTTP headers. And if you literally follow it down to the Varnish level, you'll have to learn a little bit of VCL. But it turned out the VCL was fine, and I managed to get it working, although I'll be told by somebody that I could have done it much better in a few minutes' time, and that's great.

Let's see, any other bits? I suppose one other part where we hit a little bit of luck: we had already decided on Fastly as a tool, and then we spun up on Platform.sh, and it turned out that their subscription included a Fastly subscription. So while I showed you the lovely Fastly UI, and you might be familiar with it or can go have a look at it, that actually wasn't available to us for this project. All of the changes we had to make were just custom curl commands against their API, to upload new VCL statements and things like that, and when we add and remove things from the lookup tables, again, we use their API to do that. So it made things a little bit more awkward, but from a moving-parts perspective it was actually very good, because all of that Fastly infrastructure was provided and configured for us; all we had to do was worry about uploading our custom VCL. And I think that's about what I have to talk about. Yeah, I think that's it. Thank you very much.

So, do we have any questions that came in through the tool? Yes: long-time listener, first-time caller, this is Deirdre in Galway. She's wondering why the grass was not cut before you left. Because it will still be there when I get back. Next question. We have no questions submitted, but I have one here, and I don't know how fair this is, and I know our client is here as well, so maybe he can answer. If we didn't take this approach, and we said we're not supporting two platforms at the same time, you have to get everything finished: would we be quicker getting all the new sites moved across, or would they get the finger out and start doing
that? Because, say, this is a temporary solution that's now in place 1.5 years and counting. So I guess it's over to you, Michael: if we didn't do this, would you have said "we'll get it done," or do a bigger migration? What was the alternative to this? You'll have to come up. A round of applause for Michael; he loves the attention.

It had to be done this way, because there are 500 websites of different shapes and forms, on different platforms. They all had to be brought across, they all had different stakeholders, and most of them didn't want to do it, would be the answer. They still don't. We have to hand-hold all of them and bring them across bit by bit. At the moment I'm working on a visual arts website that has 2,500 pieces of content on it. So that's what I'm doing, day by day. We have to do it this way, basically: they have to be brought across one by one and hand-held. But it will be over, I think.

Michael, do you want to stay here? Because there's a few more questions, and you might actually be able to answer some of them as well. I'll just point out one gotcha that we didn't hit on this project but that you may hit in a different scenario. Because of the way the legacy platform was configured, user accounts were already a mess. If you were somebody who had responsibility, like Michael, to look after content across the entire legacy platform, you had to have multiple different logins across loads of different systems. So the fact that we introduced one more site with another new login didn't represent an extra burden. If you had a very normal system, whereby you had a single login on the old platform and then another on the new platform, that might present a problem you would have to deal with. But it was not one that we had to deal with.

Okay, we've got some questions here now. Philip wants to know: is the website multilingual, and could you extend this setup to a multilingual application? Yes, it is. Not as much as it possibly should be, but you know, it is.
There are elements of multilingual there, and it has to be expanded. And the short answer is yes. It might make your rules a little bit more complex, because you'll have to deal with the fact that certain languages might not be available on certain sections, so I imagine your logic would have to be a little bit more complex. But from a technical perspective, no problem at all: if you can constrain the logic to what's in the URLs, then yeah, that will work. Okay, thanks, Alan.

Let's see, this one looks like an interesting question. James wants to know: could you please go into some more detail about how you split system paths that are the same between both platforms? For example, how do you know which site an AJAX request was for? Let's see, how do we know... On the old platform, most of the AJAX content would have been at the second part of the URL. So if we take a sample URL of /library, the AJAX requests would have all started with /library, so the routing rules would have directed those AJAX requests to the legacy platform as well. One of the things we definitely did was that, at the top level, every URL we wanted to handle on the new site had to make it into a table: AJAX callback paths, paths for content, robots.txt if we wanted to serve it, things like that. For all of that, we had to explicitly say: yes, we want to serve that file or that path from the new platform.

Philip wants to know, and I think, Michael, this is maybe for you, if you want to grab the mic: how did we handle Google and other bots, and, for example, was there any effect on indexing, your site search, and analytics? No, I think all the sites had sitemaps themselves, and they just all had to be submitted individually, but we came up with a way of doing that too that took care of most of it, really.

Mario wants to know: how do we deal with registered users across two environments? We tell them to log in twice, in two separate places. We didn't have anything more elegant than that. But like I said, there were
lots of different systems where people had to log in, and this was just one more; it didn't present any extra burden, really. One of the things, actually, that was quite interesting: it's not just two platforms, because each section in the UL website, /library, /art-and-science, whatever else, was actually an individual Drupal installation. So it was actually 50 Drupal websites that we migrated down into one Drupal website. So now they can log into two systems rather than 51 systems, which was a win. That's a very good point. There was a presentation about the migration approach and some of the site-build approach last year, so I kind of stayed away from all that this year; you can have a look at that on the web somewhere.

Okay, and this is the last question that's been submitted so far; we can open up to the floor then. Florian wants to know: how do we keep the legacy link on the new platform? Well, yeah, I probably should have made this clear: to the end user, this is all entirely transparent. From the user's perspective, they have no clue, ideally, whether the content they're seeing is coming from the legacy site or from the new site, and this was a key objective. We didn't want them to know that the content they were seeing was either new or old; it doesn't matter to them, and we wanted to keep that invisible. The more observant will have spotted a slightly fresher design on some sections of the website, but from a URL perspective, or a traffic perspective, it should have been entirely transparent to them. I hope that answers the question.

Okay, that's the end of our submitted questions. Are there any questions from the floor, or does anybody want extra explanation? Were you using any of the other abilities of Fastly, like caching, actually using it as a CDN, alongside using it to split the traffic between the different sites? Yes, yes we were. Given that Fastly was part of the platform available, we were definitely going to use it for caching. I learned more than I really needed to know about
caching and lifetimes. We had one particularly entertaining problem to deal with. I'm not entirely sure how it happened, but redirects on the legacy platform were served with a relatively long cache lifetime, and despite our efforts, this cache lifetime couldn't be removed; no one really wanted to tackle the .htaccess file to remove the caching header from redirects on the old platform. And that caused a problem, because there are a lot of what we call temporary URLs, marketing URLs that are used for a particular period of time and just redirect into a particular piece of the website, perhaps with a longer URL. So ul.ie/open-days redirects to a specific page, maybe further down, for a particular department. These URLs were being cached for quite a long time in Fastly because, as we found out, the caching headers were being honoured. So in the end we used Fastly's logic: we simply removed those caching headers at the Fastly layer, so that the redirects weren't cached. There was no real need to cache them anyway.

Any more? Okay, thanks very much. I'll be around the Annertech booth, and I'll be up here if there's anybody who has any more questions and comments. Thanks very much.