Hi, my name is Graham Taylor, this is Elliot, and this is Andrew, and we're from Capgemini. We're going to talk about doing Drupal at scale, in three parts: I'll talk about building a scalable developer workflow, Andrew will talk about behaviour driven development, and Elliot will talk about migration. Capgemini is a large global consultancy, and we have Drupal teams in France, India, Belgium, Sweden, and the Netherlands. In the UK we are around 30 Drupal developers strong and we have a stand here; as I mentioned before, we've given away kazoos, so come along, give us a tune on the kazoos, we'll record you, put it on YouTube and you can win some prizes. So what do we do? Here are some of the clients that we work with. Our biggest project in the UK is the Royal Mail, which also includes the Parcelforce and Post Office websites. Royal Mail, just to give you some stats, on average is doing around 21 million commerce orders per year. Our next biggest project is probably Eurostar, which is averaging around 2 million in revenue per day. Obviously, as you can see, we do big sites. I'm going to talk a little bit about building a scalable developer workflow, so let's drill into that a little bit.
The first thing you're going to need is developers, obviously. If you're doing things at scale, it's likely you're going to have more than one, so you're going to have multiple developers. The first thing I think is really important is that you have the right people on the bus, and the right blend of people as well. So you need to decide: do you need three senior developers, one mid-level, one junior, a tester, a business analyst? It's really important that you have the right blend of people in your development team, because it's really important to get off on the right foot. I've seen projects start where no developers arrive on day one; it's just lots of people from other disciplines, so not really any technical people on the project at all. Decisions are being made, and things can get out of hand. You should always have technical input, I believe, at the start of your projects. I've also seen other projects with many junior developers; the blend isn't quite right, and, again, things don't exactly get off on the right foot, and you end up with a burden of things you need to fix while the project's in flight, which becomes more difficult. Aside from having the right people on the bus, you also need to make sure that everybody has what they need to do their job. Make sure your development team is tooled up. Make sure they are in a comfortable environment. All these things are fairly obvious. These are some of the tools that we use. It doesn't really matter what you use, as long as you know what you're going to use and as long as you have everything that you need when you start a project. We use the Atlassian stack, PhpStorm, and Xdebug; that one's really important for your developers. Get every developer set up properly with a development environment, so that they have everything they need and everything at their disposal to work properly.
The amount of times I've had to help people get set up because they weren't quite able to work as productively as they could have. So that's really important: make sure they have everything. I've talked a little bit about developers, the blend of people you need and what they need, so let's talk a little bit about workflow. The first thing is communication; it's really key. You can have the best workflow in the world, but if nobody talks to each other, things will fall apart. So make sure you decide on your communication channels. Are you going to use Skype? Are you going to use IRC? Are you going to have regular daily stand-ups? How is your project going to work? Make sure you decide on all of that up front, so you know how you're going to communicate with your developers and with your client, and how you can bridge the gap between your developers and your client as well. Don't run before you can walk. Don't promise one million things to your client on day one, and don't sell a project on Friday and expect your team to be fully up and running on the Monday. If you can, do what we try to do and have bootstrap iterations where we go in and tool up and set up. That may last four weeks, it may last eight weeks; it depends on the size of the project. I would recommend you do this: spend time up front with your development team to make sure they have the processes and tools in place to do their job properly, because it'll make your life a hell of a lot easier. If you try to do these things mid-flight, it's painful. Clarity of purpose is really important, because without knowing how to do things, nothing will start, and without knowing what's done, you'll never finish.
Without knowing what's good, you'll inevitably end up taking shortcuts in your approach just to get the job done, so the team really needs to know how to do things well, what it means to get things done, and what good looks like. So, if you're going to set up a developer workflow, the first thing you're going to need is version control. I'm probably teaching you all to suck eggs; it's pretty obvious. Most people nowadays use git, but you can use SVN if you want, or you can use something else. I'd recommend git, because pretty soon, when you start building things, and especially if you're doing things at scale, what you're going to want is lots of things happening in parallel. For example, on the Royal Mail project we have a larger development team, which at the moment is about 20 developers. That's all split up into sub-teams within the larger development team, so we have about six streams, or six sub-teams, of maybe three, four or six people; it depends on the nature of the work. They're all doing things in parallel: one team may be doing new features, one may be doing bug fixes, one may be doing enhancements, but that's all happening at the same time and all coming together in the end. I'll explain a little bit about how that all comes together, but version control is really key, so you can have branches and lots of things happening in parallel. For your git workflow, you don't need to reinvent the wheel. For example, we use the git-flow workflow; we find it is a good fit for us, but check that link, there are other workflows on there. It really depends how your project wants to work. Pick one that suits you, but I would advise not inventing a completely new workflow; pick one that's already established and documented on the web.
Standards: this links back to my clarity of purpose, that your team needs to know how to do things and what good looks like, and we already have these things available. The Drupal coding standards: your team should follow those if they're writing Drupal modules. I mentioned on the tools page that one of the tools we had there was PhpStorm; we like to use that a lot. If you check out those two articles, they'll help you get your IDE configured properly for Drupal projects. I love PhpStorm, but I'm not going to get into IDE wars in this session. If you like PhpStorm, check those out, and if you want to use it, get your developers all set up in the same way. That way people can help each other and pair-program together; it just makes those things a lot easier. And then finally, commit messages: check that link, and make sure your commit messages are descriptive. Don't assume that the reviewer of the code knows what the original problem was; in your commit message, describe the why, not just the what. Code review: I'll explain this graph a little bit. The top graph is the percentage of code committed over time that has been reviewed, and the bottom graph is the number of bugs raised per week over the same time period. This is a graph from one of our projects, and as you can see, as more code is being reviewed, we have fewer bugs. Always make sure that you do code review. It's the same for Drupal 8, right? Any patch can't get into Drupal 8 without being reviewed by at least two other people, so a core committer can't commit it until it's been RTBC'd by at least two others. For code review we use Crucible, which is part of the Atlassian stack, but if you're using things like GitHub, that's got code review tools built in, and you can use other tools too, like Phabricator, which is Facebook's code review tool. Pick one that fits best for you, and review code.
Testing. I would highly recommend, if you can, doing either test driven development or behaviour driven development, and Andrew is going to speak slightly later on about behaviour driven development and the kind of things we do at Capgemini related to that. But it's really important to know how you're going to test your code before you even write a single line. Are you going to test it manually? Are you going to write a PHPUnit test for it? Are you going to write a Behat test? And if you have tests, particularly PHPUnit or Behat tests, you can automate all those things. You can have your tests running on commit, on a nightly build, before a developer pushes to a branch, et cetera. You can slice and dice however you want once you have these things in place, which leads me on to builds. If you can, you should automate everything, and that includes automating your environments as well: the creation of your test environment, the creation of your UAT environment. These are some of the tools that we use internally. The logo on the top left is Puppet; that's for environment automation. We use that to build the Drupal site files, in combination with Drush make. We use Capistrano for deployment to cluster machines, so if you have to deploy your web application, which includes a bunch of files and a database, to 15 different machines, Capistrano can handle that quite nicely. We use Jenkins for running overnight builds, so we refresh our test environment daily, and we use Jenkins to trigger PHPUnit tests, things like that. I would highly recommend setting up a continuous integration environment at the start of your project, because it allows you to fail fast and fail early. It allows you to catch issues before you deploy to production, and it allows you to practice your deployment to production on a daily basis. And you should deploy often, including to production.
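As an illustration of the kind of automated test that could run on commit or in a nightly Jenkins build, here is a small, self-contained PHP sketch. The function and its behaviour are entirely hypothetical, not taken from the Royal Mail project; a real test would normally live in a PHPUnit test case rather than use bare assertions.

```php
<?php
// Hypothetical production code: work out a parcel price band from its
// weight in grams. Names and thresholds are invented for illustration.
function parcel_price_band(int $grams): string {
    if ($grams <= 0) {
        throw new InvalidArgumentException('Weight must be positive.');
    }
    if ($grams <= 750) {
        return 'small';
    }
    return $grams <= 2000 ? 'medium' : 'large';
}

// In a PHPUnit test case these would be $this->assertSame(...) calls;
// plain assert() keeps the sketch runnable on its own.
assert(parcel_price_band(100) === 'small');
assert(parcel_price_band(1500) === 'medium');
assert(parcel_price_band(5000) === 'large');
echo "All parcel band tests passed\n";
```

Because a test like this is fast and deterministic, a CI server can run it on every push and fail the build before a regression reaches an environment.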
You should deploy to production as often as you can. So yeah, I would highly recommend setting it up at the start of a project. I've seen projects that didn't have a build system or a continuous integration system, and halfway through it had to be implemented, because there were lots of issues, lots of bugs leaking into production, and no way to build a known state of the code. Implementing that halfway through a project is very painful, and you will inevitably end up taking shortcuts to get it in there. So spend the time up front to set this up. I mentioned some tools that we use on the previous slide, but it doesn't really matter what you use. You can use whatever you want, as long as it's reliable and repeatable. You can have everything in shell scripts if you want; it's up to you. I wouldn't recommend that, but as long as it's reliable and repeatable, you're doing the same thing every time, and it works, it doesn't matter. And have your developers work smarter, not harder. Although royalmail.com is on Drupal 6, we still try to use object-oriented patterns wherever possible. We're using some components from Symfony, so we're using the event dispatcher and various other things. We're borrowing concepts and tools from the PHP community that were invented elsewhere. If anybody listened to the Driesnote or to Alex Pott's presentation yesterday, the line was "from not invented here to proudly found elsewhere". So always try to reuse others' code if it's stable, reliable and tested. You don't have to be on Drupal 8 to use Symfony; we're already doing it on a Drupal 6 project, and we're using Guzzle on a Drupal 6 project as well. It doesn't matter; it's just PHP at the end of the day. And in terms of working smarter, not harder: automate as much as you can. Any time you have a manual step in your deployment process, something will go wrong; somebody will forget to do it on the night of the release. So yeah.
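The event dispatcher pattern mentioned above can be sketched in a few lines of plain PHP. This is a minimal, hand-rolled illustration of the decoupling idea, not Symfony's actual EventDispatcher API, which is richer (Event objects, listener priorities, subscribers); the event name and payload are invented for the example.

```php
<?php
// Minimal sketch of the event dispatcher pattern: modules register
// listeners for named events instead of calling each other directly.
class SimpleDispatcher {
    private $listeners = array();

    // Register a callable to run when $eventName is dispatched.
    public function addListener(string $eventName, callable $listener): void {
        $this->listeners[$eventName][] = $listener;
    }

    // Notify every listener registered for this event, in order.
    public function dispatch(string $eventName, array $payload = array()): void {
        foreach ($this->listeners[$eventName] ?? array() as $listener) {
            $listener($payload);
        }
    }
}

// Usage: a commerce module could dispatch 'order.placed' and let a
// fulfilment module react, without either hard-coding calls to the other.
$dispatcher = new SimpleDispatcher();
$log = array();
$dispatcher->addListener('order.placed', function ($payload) use (&$log) {
    $log[] = 'notify warehouse for order ' . $payload['id'];
});
$dispatcher->dispatch('order.placed', array('id' => 42));
echo $log[0], "\n"; // prints "notify warehouse for order 42"
```

The point of borrowing the real Symfony component rather than hand-rolling this is that you get the pattern plus priorities, stop-propagation and a tested implementation for free, even on an older Drupal codebase.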
And aside from that, trust your developers. Let them experiment on things, whether that be a component from Symfony that might help your team. Maybe don't do it right in a live project, but let them experiment, which is what we do at Capgemini. Every second Friday we have a whole day where our developers can basically do whatever they want: they can hack on AngularJS or Node.js if they want. It just allows them a bit of freedom to experiment, and some of the things that we hack on in those days end up going into live projects, so the benefit comes back to your teams all the time. And lastly, don't be afraid to fail. Failure is inevitable; it's what you learn and how you improve that matters. In relation to the experimentation, it's always good if you don't fail in a live project, so if you can, do experimental projects on the side. We have an innovation team at Capgemini that's currently working on things like AngularJS and Node.js, and some of the work that they've been doing is actually going to come into our projects. We're now looking at using AngularJS on royalmail.com, for example, as part of a new journey, things like that. So don't be afraid to try things, don't be afraid to fail; we're all here to learn. As long as you're getting value at the end of the day after that failure, that's good. So in summary, linking back to my clarity of purpose: your team must know how to do things, so I went over the git workflow, coding standards, and what tools to use. They must know the definition of done. Does that mean your code has been reviewed, it has correct documentation, it has a unit test with it, it has a Behat test with it? You need to define these things. And then finally, what good looks like: if a developer is approaching a problem that's already been solved, do we know what pattern to apply in that scenario?
And is your code tested, built to an environment automatically, and working and ready to deploy to production? So that is me. I'm going to now pass on to who's up next: Andrew, who's going to talk a little bit about Behat and behaviour driven development and some of the things we do at Capgemini on that. So I've got a slight technical difficulty here. Okay, sorry about that, a slight technical difficulty. I'm going to speak to you about behaviour driven development. Has anyone heard of BDD, or Behat, stuff like that? Yeah, a few hands going up. Good. There have been quite a few sessions about BDD and related things from various different camps, and at DrupalCon today. But I've got a Philosoraptor to ask a question. A lot of the talks centre around Behat, which is a PHP driver for doing certain elements of BDD. But the question being posed is: I'm using Behat, am I doing BDD? It's quite an interesting one. Just because you're using Behat, which is an automation framework, doesn't necessarily mean you're doing BDD. You might just be doing behaviour driven testing, which means you might just be using half of the tool, and not necessarily getting the full benefit of what you could be doing. Behaviour driven development isn't about tools. It's not about Behat or Mink or Cucumber or any of these things. The guy who came up with BDD was a chap called Dan North, and a few years ago he came up with this definition of BDD: a "second-generation, outside-in, pull-based, multiple-stakeholder, multiple-scale, high-automation, agile methodology", which I'm sure is clear to everyone. That's the usual reaction to that. We can take this apart a bit in the next few slides, but basically it's a process and a framework for looking at the requirements of your project and the behaviour of your project, rather than any particular way of testing. So the problem we have is one around language and communication.
Generally, in terms of web development, we have complex communication problems; we don't really have technical problems. It's sad to say, but it's just PHP and it's just the web. It's not searching for wormholes in distant galaxies or searching for cures for cancer; it's just PHP. But the reason why we fail a lot is that we don't necessarily communicate well, or in a standard manner, with each other and with all the stakeholders in the project. So we have things like jargon, people using terms that aren't necessarily well known to everyone; language, so you might be speaking to people whose native languages are different, and there are communication issues there; Chinese whispers, where you don't get direct communication between all the stakeholders, you get messages passed down a line and things get missed out or confused; early solutionising, by which I mean clients coming up with ideas about how you will solve the problem rather than what their requirements are; and unknown unknowns. If you think about your last project and guess how long it would take you to redo it, it would be a lot less time than it actually took, because by the end of the project you knew about all the things you didn't know about at the beginning. The consequence of this kind of language problem is that we're often locked into delivering late, or delivering the wrong product, which is even worse, before we've even committed to writing a line of code or an architecture pattern or anything like that. And that's neatly summed up in this slide that you've probably all seen before, which is one of the classic agile slides: this is how the customer explained it, but this is what they actually wanted, and these are all the various misinterpretations along the way by the different stakeholders in the project.
And this is classic, and we're still doing this, right? Everyone's still getting these things confused, and we're testing the wrong things, delivering the wrong things, not living up to expectations. So TDD was kind of a precursor to this, or a different way of approaching development, where you do inside-out testing: you unit test small bits of code, so it's a very granular approach to testing. It's often touted as a thing to do, and it is a good thing to do, but it doesn't solve our main problems of language and communication. It focuses on the how; it focuses on the implementation. So it presupposes you actually know what you want in the first place. It suffers from a refactoring problem, whereby you can change bits of code, but you don't necessarily want to change the behaviour of the whole application. It can only test small, discrete parts of the code base. And there's also the accessibility problem: the only people who can actually read and validate these tests are, in all likelihood, mid to senior developers. It's quite likely that junior developers may not have the skills or experience yet to know what good TDD tests look like, your QA testers probably can't read them, and almost certainly your clients can't validate these tests, and they're the guys who are actually paying for your deliverable. So what is BDD? It's a process, a technique for looking at requirements. It combines elements of TDD, in that you write the test first, it's repeatable and it's automated. But it also takes elements of domain driven design, which comes out of a book written by a guy called Eric Evans. I don't know if you've read it, but it's something that anyone who's building software should at least look at. He talks about describing requirements and solutions in terms of the business domain.
So for instance, if you were talking about a conference site, you would talk about sessions and schedules and visitors and things like that. You wouldn't necessarily talk about individual pages or forms, because that's not what the solution is about. The solution is about delivering access to sessions and booking BoF sessions and things like that; it's not about filling out forms. So you speak in the language of the business domain, of the thing you're actually delivering. And you start by looking at business outcomes and come up with behavioural stories. So if you want to drive traffic through a certain funnel, that's your business outcome, and then you come up with the acceptance criteria to work out the best way of testing that, and of understanding and showing people how to deliver it. And it's supported by established tools: Gherkin is a language, Behat is a testing framework, and Jira and Jenkins, which I think Graham just talked about. These are established tools that we do use. So, requirements. This is quite a nice quote on requirements; it's apocryphal, I think: "If I had asked customers what they wanted, they would have said a faster horse." You can look at requirements from many different perspectives, but no one person has the full knowledge of the entire project or the full knowledge of the entire domain. Every team member comes from a different background, has different experiences, may have different ways of solving the same problem, and can offer different insights. So as a team, basically, you need to get together with the business stakeholders, discuss your business outcomes, and then work out the best way of delivering them. Typically we do that during a sprint, in design clinics or practical training sessions, things like that.
So you have to find time to fit these very important sessions in, because it's only by directly communicating with the client and the delivery team, the sprint team, that you'll come anywhere near getting the whole question of requirements fixed. During these sessions you need to make sure that everyone is free to ask questions, everyone's free to question assumptions and break the model, and that you go down this path of deliberate discovery. So you actually set out to discover things: you admit your ignorance, and you go out to work out the best method of achieving these business outcomes. What you should get at the end of each session is a set of acceptance criteria for each of your requirements, ideally in a standard Given/When/Then form, which basically means it's repeatable and able to be automated. And because it's written in the business domain language, your testers can look at it and validate it, and your client can validate that you're actually doing the right thing, because they can read the test. They're not reading some PHP code they wouldn't know how to interpret. So this is something that we've come up with: acceptance criteria for acceptance criteria. We try to ensure that all our acceptance criteria for any particular piece of work are complete, so they fully describe everything. There's no scope for the client to come back and say, actually I meant to do this, or I forgot to tell you about that; everyone's fully aware of the entire scope of the requirements. They're clear, so all team members have a clear understanding; you're not going to get rid of jargon entirely, but as long as everyone understands what is meant by each term, that's fine. And they're testable.
So you have specific events that are repeatable: I push something into the system, I get a fixed set of things out. One of the things we do a lot of is systems integration, so we integrate with third party systems, and there you have to work a lot with things like fixtures. Basically you build a set of mock objects and things like that to mock out your services, so you're not calling the back-end services directly, because you might not get the same results out each time; you just want something you can test against, something predictable and repeatable. So this is a very simple but instructive example of how we might go about writing acceptance criteria for Capgemini's sponsorship of DrupalCon. This is our feature; you might recognise this from a Scrum and Agile user story, although in terms of actually testing it, this part is just annotation. You see it's written as the marketing director: the marketing director wants Capgemini's logo on the home page and the sponsors page, so that site visitors can see Capgemini's commitment to the Drupal community. The interesting thing here is that we're looking at it as the marketing director, not as a user, because quite often everyone writes these user stories "as a user", but the user often isn't the most relevant stakeholder. Here it's a marketing director: he's a stakeholder in the delivery, and he wants to see the logo on various different pages. So this is our actual acceptance criteria; it's just a small demonstrator. Given I'm on the home page, then I should see the Capgemini logo; when I follow "Sponsors", which is just a link on the site, then I should see the Capgemini logo. Now that's a test, and it can be run through Behat, which means it can be automated. We run it through Firefox, so we can do regression testing with it.
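Written out as a Behat feature file, the acceptance criteria just described would look something like the following. This is reconstructed from the talk's description, so the exact step wording is assumed, and a step like "I should see the Capgemini logo" implies a custom step definition rather than a built-in one.

```gherkin
Feature: Capgemini sponsorship visibility
  As the marketing director
  I want the Capgemini logo on the home page and the sponsors page
  So that site visitors can see Capgemini's commitment to the Drupal community

  Scenario: Logo appears on key pages
    Given I am on the homepage
    Then I should see the Capgemini logo
    When I follow "Sponsors"
    Then I should see the Capgemini logo
```

Note there is nothing here about divs, CSS selectors or form IDs; the scenario stays in the language of the business domain, which is what lets non-developers read and validate it.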
We know when we're done, because when this test passes, we're done and complete. But also, you see there's nothing about divs or forms or buttons or CSS selectors or anything like that. It's written in the domain of the problem; the problem here is making sure Capgemini's logo appears everywhere. So it's a specification. It also acts as documentation, so you've got this kind of built-in, organically evolving documentation as well. And it also acts as a regression test. So I'm now going to very quickly switch to a demo, hopefully, if the Wi-Fi is holding up, just to prove this isn't magic. This is actually the feature we're running; no, you can't see that. So that is the feature we're running, and hopefully, if this works, it should fire up Firefox, go to the home page and then go to the sponsors page. And it's failed, typically. Never do a live demo. Okay, but the first one passed, right? Given I'm on the home page, I see the Capgemini logo. So that's a great regression test, because obviously somewhere something has failed and we've got red, so we can go back and fix that in our next sprint. But that's just an example. The demo shows that the test can be run in an automated fashion, but it's also written in very clear and plain language. So in summary, BDD isn't about the tools. It's about communication and solving the communication issues. It's a framework and a process, but it's not about testing; the main part is about requirements and discovery of the problem domain. I've got some resources here, some reading: these are just blog posts, and these are interesting people to follow if you're interested in this sort of thing, Dan North, Liz Keogh and Eric Evans. And some things that are interesting: Behat is the testing framework, and there's a Drupal extension which enables you to do this kind of testing on Drupal sites.
It contains step definitions for Drupal testing, like testing as an anonymous user, testing whether you're logged in, that sort of thing. And Doobie is a project for testing Drupal.org itself, so Drupal does some form of BDD testing. There's also a quick start that I've put together for setting up the BDD tools and all the frameworks that you need for that. So that's BDD in a 15-minute nutshell. I'm going to hand over to my colleague, who's going to talk about migration. I think you've got a missed call on there, actually. So, my name's Elliot Ward. I'm a project lead at Capgemini, and migration is a hot topic when you're dealing with large enterprise sites, for a number of reasons. You are not going to be doing an enterprise client's very first website, so it's very likely they've got content from an existing site that they'll want to start pulling in. Also, as any of you who have been involved with Drupal upgrades know, they can be very problematic, and an alternative strategy, rather than using the Drupal upgrade path, is actually to build a new site and migrate your content into it. The best way to actually perform a migration is by using the Migrate module. There are many other ways you can try to do it, using the Feeds module for instance, but I realised that I definitely had geek credentials when I realised I had a favourite Drupal module, and it is Migrate. It's been around for a while. It's been re-architected at least once, and it's currently got a major re-architecture going on that's currently in release candidates, so some of the things that I'm going to talk about will probably change. So what is it, in a picture?
Migrate is a module that can take data from any of these different sources and then create familiar Drupal entities such as nodes, terms, menu items, comments and users. It's designed to be completely extendable: if you find a data source that isn't supported by anything out there in contrib, or by what comes with the Migrate module, you can write your own extension to deal with the content types of all the data sources that you need. Similarly, if you can't find a handler at the other end to create your Drupal elements, you can create your own there as well. As an example of that, we found there wasn't a particularly good one for migrating Mini Panels into sites, so we created our own. Just blowing up that little circle that we just had: Migrate is the piece in the middle of that puzzle, and there are a few pieces to it. There's the Migrate module itself. That comes with a Migrate UI, in the same way that Views has a Views UI, so you have a separate module handling the code for the user interface, which you can disable when you're actually on production. There's another module called Migrate Extras, and that's becoming deprecated. It was a kind of collection of all the extra pieces of content around new data sources and new destinations, different types of data within your Drupal site, but it's no longer having anything added to it; the idea now is that if you create a new content type or something, you distribute it in a feature, and you put your Migrate support into your own module that you're distributing. There are a few more extra requirements on D6. Anyone still using D6? Lots of D6 still out there. I was wondering whether I should pull this out, but I think there's enough D6 still going. So on D6 it uses the Autoload module to get the class autoloading that's included in D7, and it uses DBTNG, "database: the next generation", so when you're writing your database code, you write it in the same way that you would for Drupal 7.
One of the major reasons for using DBTNG in all places is that, A, it keeps the D6 and D7 versions of the module as close as they can be from a code point of view. Also, DBTNG is much better at handling multiple databases, which is exactly what you might be doing if you're migrating data from one place to the other. It also backports some elements to D6, just for some of the UI stuff that's in 7 but not in 6. And, unfortunately — because this is the business track — the final piece that you're going to need to actually do anything with Migrate is to implement your own migration module, which will have a dependency on Migrate. This is one of the really interesting things that is possibly changing in the latest 7.x-2.6 release candidate, because that's going to have a wizard that you'll be able to use to create your migration structures. Bit like Feeds, huh? Pardon? Bit like Feeds. So it'll be interesting to see how far you can get with that — whether you can implement everything, or whether you're still going to need to get your hands dirty with code for some of the finer details when you're mapping your data from one place to the other. So what do you need to do in that module? What's the actual work you're going to have to do when you're migrating your content from one place to the other? The bulk of the work comes from creating subclasses of some of the classes that you get from Migrate for free. The class hierarchy for Migrate has this Migration class, and you're going to want to subclass that for each type of thing that you're going to migrate. So if you've got multiple node types in your destination system, you're going to have to subclass the node migration once for each. You're going to have to create a class if you're migrating users, and likewise for terms. And there are other things that you can migrate; these are just some examples that I pulled off a previous demonstration.
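To make the dependency on your own module concrete: in Migrate 7.x-2.6 a migration class is typically declared from a small custom module via hook_migrate_api(). A minimal sketch — the module, group and class names below are made up for illustration:

```php
<?php
/**
 * Implements hook_migrate_api() in a hypothetical example_migrate module.
 * In 7.x-2.6 the 'api' key must be 2, and migrations are declared here
 * rather than registered dynamically.
 */
function example_migrate_migrate_api() {
  return array(
    'api' => 2,
    'groups' => array(
      'legacy' => array(
        'title' => t('Legacy site content'),
      ),
    ),
    'migrations' => array(
      'ExampleArticle' => array(
        'class_name' => 'ExampleArticleMigration',
        'group_name' => 'legacy',
      ),
    ),
  );
}
```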
For each of those classes, you need to put them in a migration group. A migration group doesn't actually have any impact on the data once it's in the destination system; it's more for control and reporting purposes. So you can decide you want to run all the migration classes in a certain group at a time, or just pull up a report on how many items have been migrated for a certain group. You need to create a migrate source instance. You don't need to subclass this if it's one of the already supported data sources that we saw a few slides ago — if it's any of those, there will be one available for you to use, and how you instantiate it gives the specific details of what data you want from that data source. Of course, because we've got a MySQL one, we can just handle WordPress, phpBB and Drupal-to-Drupal site migrations through that one. There is a dedicated one for WordPress now, and there's a separate project for Drupal-to-Drupal migration, but in my experience you may as well just use the MySQL source for that. You also need to create a migrate destination, and that maps to these items here. So again, we implemented a new migrate destination for the mini panels when we found that there was something that wasn't particularly well handled in the ecosystem. Then you create field mappings: you say where your data is coming from in the source and which fields on your destination objects that's going to correlate to, and you list anything you're selecting from the source that is deliberately unmapped, or anything in your destination system that you're not actually going to populate. That doesn't have any practical effect, but it does help you with some of the audit tools that Migrate provides. You also need to define how your data is mapped between the two systems.
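Pulled together, those pieces typically live in the migration class's constructor. This is a minimal sketch against the Migrate 7.x-2.x API; the 'legacy' database connection, table and column names are assumptions, and the required ID map is left out of this fragment:

```php
<?php
class ExampleArticleMigration extends Migration {

  public function __construct($arguments) {
    parent::__construct($arguments);
    $this->description = t('Import legacy stories as article nodes.');

    // Source: a query against a separately configured 'legacy' database.
    $query = Database::getConnection('default', 'legacy')
      ->select('stories', 's')
      ->fields('s', array('story_id', 'title', 'body'));
    $this->source = new MigrateSourceSQL($query);

    // Destination: article nodes on the new site.
    $this->destination = new MigrateDestinationNode('article');

    // Field mappings: destination field first, then source column.
    $this->addFieldMapping('title', 'title');
    $this->addFieldMapping('body', 'body');

    // Declare what you are deliberately not populating, so the audit
    // screens show it as intentional rather than missing.
    $this->addUnmigratedDestinations(array('sticky', 'promote'));

    // (The MigrateSQLMap that ties source and destination IDs together
    // is also required, but omitted from this sketch.)
  }
}
```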
So you need to tell the Migrate module how a particular data element that you select from your source information is going to end up in your destination data, because all we're doing is pulling this across. It will build a database table, and you need to tell it how to construct the source IDs and the destination IDs. Typically the destination ID will just be a node ID, and the source ID will completely depend on the system that you're migrating from. Because we've got this map, it means we can run a migration multiple times without migrating the same content twice unless it's changed. It means we can accurately report how far through a migration we are, and we can batch it effectively as well. So you don't need UUIDs? We will come to that — it's interesting, and that question comes up every time. I have a longer version of this talk where we go into a lot more detail; we're kind of flying through it at the moment, but there's a talk online, which will be in the references, that has demos and code samples of an example migration designed to be a really good jumping-off point for people who need to get up and running with Migrate. Once you've defined all those things, one of the reasons why the Migrate tool is great is that it gives you really good auditing tools. This is going through the UI, but there's equivalent Drush functionality as well, and this is just telling us, for some migration classes that we've implemented — or subclassed — how many rows we have to migrate in the source system, how many we've imported so far, how many are unimported, and so on. You can also see any messages if we had any failures, or anything interesting that happened when we tried to import. So that gives you an overview of all the things that you're going to migrate. You also have auditing at a finer level of detail here.
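That ID map is declared with MigrateSQLMap in the same constructor. A sketch, assuming a legacy integer key called story_id:

```php
<?php
// Tell Migrate how to identify a row on each side. The source key
// ('story_id') is an assumption about the legacy schema; the destination
// key schema comes from the destination class (a node ID here). Migrate
// builds a map table from this and keeps it up to date on every run.
$this->map = new MigrateSQLMap(
  $this->machineName,
  array(
    'story_id' => array(
      'type' => 'int',
      'unsigned' => TRUE,
      'not null' => TRUE,
    ),
  ),
  MigrateDestinationNode::getKeySchema()
);
```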
So here — I know these aren't particularly readable, apologies for that — what this is is a list of fields that we've got in our destination node type, and it's showing us whether we have mapped them. For these ones, it already knows where the data is going to come from, and down here are the ones we haven't yet mapped. So Migrate is flagging: you could have some data here, it's in your data types, why haven't you got it in your migration? It's helping you to audit how far through your implementation of those migrate classes you are. If there's anything that you actually don't want to migrate across, you can flag it in your migration classes as "do not migrate", and then it will no longer show up in red; you can list it under one of these vertical tabs you can see, to let Migrate know that you are deliberately not importing that data. So that's a very whistle-stop tour of how you would implement a migration using the Migrate classes, and I had to give you that so I can highlight some of the difficulties that you might encounter along the way. I've put down non-GUID references — or, as was raised, unique IDs — as an interesting point. Migration order can be an issue, along with circular references, stubbing, many-to-one mappings, and developing alongside moving targets. We'll go through these and look at how you might have a strategy for each. So say I've got a relationship in my data here — this is an example from a demo where a monkey has an ID mapping to its favourite tree in the source system. When you actually bring those across, the node IDs that Drupal allocates won't be the same, so the references will be pointing at all the wrong elements — maybe pointing at nothing, or at a completely different node type. The way we deal with that in this simple case with Migrate is you just have to let Migrate know that these are dependent, and then when you run your migration, always run the referenced class's migration first.
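In code, that ordering and ID translation looks roughly like this in the monkey migration's constructor (the class and field names are invented for the example):

```php
<?php
// Make sure the tree migration has run before this one.
$this->dependencies = array('ExampleTree');

// Map the legacy tree ID through the tree migration's map table, so the
// reference ends up pointing at the node ID Drupal actually allocated,
// not at the old ID from the source system.
$this->addFieldMapping('field_favorite_tree', 'tree_id')
  ->sourceMigration('ExampleTree');
```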
So here, if we bring in all the trees first and then migrate the monkeys, there will be no problems: Migrate will work out the new mapping for each one by reference to the map that we set up when we were defining our classes. Circular references are a bit more difficult to deal with. Here we've got a monkey that has a relationship with another monkey, so we can't migrate all the referenced items before the referencing ones — they're the same class, and they may even point back at themselves, so that ordering doesn't make any sense. You will have some work to do here, but Migrate helps you out with a mechanism by which you can define a stub. When Migrate comes to migrate one of these data elements, it will look and see whether the referenced element has been created yet. If it has, there's no problem, and it just uses the new reference. If it hasn't, it will actually create you a blank node, in this instance, and it will remember the mapping between the old value of that reference in the source system and the new reference. So it creates a blank node, and when it comes to migrate the real thing later, it will find that it has already stubbed it. The reason we can't use UUIDs to get around this problem is that you may not be in control of the source system. You may not actually be able to go and recode it if it's not already using UUIDs — it's probably a site that you're migrating away from; it could even be a dying system. Okay, so another big source of problems is that you are always developing your migration amongst a sea of changing things. You're changing the migration code, so as with anything you could break it there, but the source content that you're going to be migrating might also still be changing — that might be changing every day. And you might still be building the site that you're going to migrate it to.
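Stubbing is opted into on the migration class whose content may be referenced before it has been imported. A hedged sketch — the createStub() signature shown here matches recent 7.x-2.x releases (older versions passed only the migration), and the node type and title are invented:

```php
<?php
// When another migration hits a reference to a monkey that hasn't been
// migrated yet, Migrate calls createStub() so a placeholder node can
// hold the place; the stub is overwritten when the real row arrives.
protected function createStub(Migration $migration, array $source_id) {
  $node = new stdClass();
  $node->type = 'monkey';
  $node->title = t('Stub for monkey @id', array('@id' => $source_id[0]));
  node_save($node);
  // Return the new key(s) for the map table, or FALSE on failure.
  return empty($node->nid) ? FALSE : array($node->nid);
}
```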
You might also have existing content on the site that you're going to migrate into, on top of your own migration code changes. That's a lot of different things that can cause you problems, so you need a robust strategy — going back to what Graham was talking about, having these reliable development processes with this built in. The way we would typically do this, and the way we did it very successfully for a very large migration this year, is quite a continuous-integration-like process with overnight builds. Overnight, we would take the destination system, which was already a live and operational site, and pull that code and database back to our test platform. We'd apply all our new code changes immediately to that — and this could be any new code being developed, not just the migration code, but all the code being developed for that site — and then, only once we'd got all that in, would we actually test the migration. The really important thing is that you need to do it every day. You need to get your developers to run it every day so they spot any problems, and you need to get your testers testing it every day — and those testers should genuinely include the people actually writing the content on behalf of the customer, as well as your own technical testers. The other thing you need to be aware of is all those pain points, so that you can plan your migration. It's very difficult to predict a timeline for a migration task quickly. You need to find out where all your references are — all your pain points — and pull them out up front, because data migration often gets overlooked compared to functional development. People tend to want estimates quickly, because they suddenly realise they haven't done it, or haven't allocated time, because they were busy getting the "real" work done.
So I've kind of whizzed through that, and through some of the potential pain points, because we don't have much time, but there is a fuller version of this talk — the video on Vimeo, and the slides on Prezi — that has all the code and live demos of how it reacts, and all the different ways you can audit your migration. Mike Ryan is the driving force behind Migrate, so have a look at his blog; he blogs occasionally, and it's interesting to see the contributions that he's making. There's also a bunch of great extra resources on the Migrate project homepage itself. So, does anybody have any questions about what either Andrew's talked about, or what Graham's talked about, or what I've talked about? We've got about five minutes, I think. Yes — so the question was, do we have experience of updating data with Migrate? Yes, we definitely do. Migrate is designed so that you can run it multiple times on the same data: it will check whether a row has been updated and pull in the latest content only if it's changed, so you can definitely use it in that configuration. And in that release candidate that I mentioned, there have been some changes around hashing the data in the source system, so that you can more easily identify whether it has changed. When we were doing that large migration, which was tens of thousands of nodes, the database servers we were running on were so quick that we ran it — while doing other work to do with the upgrade — and it finished so quickly that we assumed it had failed and reran it. So you can definitely run it multiple times on the same data set, and it won't make further changes. You might want to consider whether you want all that code to be enabled all the time if you're using it to do continuous pulling of content; it may be that something like Feeds is a better fit for that. Yes — so the question is, what's our experience of migrating themes and blocks and other stuff?
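The hashing mentioned above is enabled per map. A sketch of what that looks like in the 7.x-2.6 release candidate, again assuming a story_id source key — the 'track_changes' option stores a hash of each source row so unchanged rows can be skipped on re-runs:

```php
<?php
$this->map = new MigrateSQLMap(
  $this->machineName,
  array('story_id' => array('type' => 'int', 'not null' => TRUE)),
  MigrateDestinationNode::getKeySchema(),
  'default',                       // connection key for the map tables
  array('track_changes' => 1)      // hash source rows to detect changes
);
```

With change tracking on, simply re-running the import picks up rows whose source data has changed, whereas the Drush `--update` flag forces a re-import of everything previously imported.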
So yes, you have to be practical in looking at what is content that you're going to migrate and what you're going to put into a feature and implement on your new site that way. So certainly things like nodes — sorry, content types — definitely put those in features. Anything that's more configuration than content as such, I would not look to deploy with Migrate, although there's absolutely no reason why you couldn't. Yes — yes, you absolutely could use it for that, and we have done projects where we've used it in that capacity. Not so much as an ongoing deployment — again, it may be that the Deploy module is better suited to that — but you certainly could do it, and it would work, because you have this repeatable migration and the identification of changed content. You could always have a job that enables Migrate, runs the migrate commands that actually transfer the data, and then disables Migrate afterwards. Okay, if there are no other questions, we'll leave it there. Thanks, on behalf of Andrew, Graham and myself, for attending. Thank you. And if you wouldn't mind rating the session on the DrupalCon site — it really helps us get feedback, to work out whether we're talking about the kind of things that people are interested in.