All right, I think it's about time to start, so I hereby welcome you to my session, "Laws, Damn Laws and Statistics: A Tale of Cookies". Some more people are coming in; everybody is more than welcome. I hope everybody is having a nice conference. I know I am, so I hope that's a positive thing. I hope you have witnessed a lot of interesting sessions, and I hope to provide one too.

Cookie story. Who am I? I'm not going to bore you too long with that, but I think there should always be a classical car in any presentation, especially in my car. I have my own Drupal shop. I run project management, I do event management, I do basically everything I can find a client crazy enough to pay me for. I've been doing Drupal since about 4.3, 4.4-ish. After Drupal 7 was released I decided simply to drop basically all the other CMSs or CMFs I was working with, which at the time were mainly Joomla, WordPress and Hippo, and to focus on Drupal entirely.

Actually, the biggest thing for me is that Drupal is not only code. It's the community. It's the fact that here, and at other events in the Netherlands, some of which I helped organize, there are hundreds of enthusiasts all sharing knowledge. Not in a cold, competitive business way, but in an absolutely competitive sense of intelligence, code quality and features. That's what Drupal is for me. It's a fantastic ecosystem, so that's why I decided to spend my time on it.

And then something happened. We had a cookie issue, or better, we had a telecommunications law issue. This presentation mainly consists of two parts. I'm going to go into a bit of the legislative side of telecommunication law, which is privacy related, and then I'm going to do the demo. That is absolutely on purpose: keep the best for last, so people don't run away.
I hope. But telecommunication law is a European issue. Privacy-related legislation is a European issue which got translated into Dutch legislation, and this presentation is based on the Dutch implementation of it. That's not only because I'm Dutch; I also think that the Dutch implementation is a remarkable achievement in stupidity, because as far as I'm concerned it focused completely on the wrong issue.

In the Dutch law, and also in the European law, a distinction of information is made which boils down to a distinction between two types of cookies. There are functionally necessary cookies, like session cookies and shopping carts, things you actually need to make your site work. And there are analytics cookies: those are, in the view of the lawmaker, optional and privacy-invasive by nature. The assumption is that analytics cookies contain identifiable information, or at least some means of identifying a profile which could be constructed to resemble a certain individual. And why is that profile interesting? Well, everybody knows that in e-commerce the best thing to know is simply who your customer is and what he has bought in the past, and therefore what more we can offer him. The assumption made under Dutch law is that analytics cookies always contain personally identifiable information until the webmaster or the owner can prove otherwise. Which, of course, nobody is going to do. So that's what I meant by a pretty much brain-dead implementation.

Then we have functional cookies, and they're legal as long as the user is informed about them. So Cookie Control, something like that. Any cookie resembling personal information, whether it's analytics or functional, should be explicitly accepted by users, and this is something very hard-wired in the Dutch legislation on this issue. Now, this is of course very interesting, because most systems utilize only one cookie to offer a lot of functionality. Actually, it's pretty bad practice to force a lot of cookies on users. So this ends up an interesting mess.
So it's a nuisance. It became pretty difficult to separate this functionality, because on an e-commerce site you also do analytics of your clients' behavior, and on a normal website you also build up profiles of interesting users, like target groups and stuff. How does that relate to personally identifiable information? And is that always a problem, or is it only a problem if a commercial interest in the private information is actually there? A lot of sites actually offer content in order to form profiles of the users visiting that content, and their business model is based on the fact that they can then sell those profiles. So is e-commerce always a direct thing or not?

Well, Dutch public television, which is one of those content providers who mainly provide knowledge, has no commercial interest whatsoever. It's a public broadcasting service. They decided: we're going to play it safe, we're going to put a cookie wall in front of our website. But that's a problem, because Dutch public broadcasting, like the BBC, or ARD or ZDF, something like that, has a public service to offer. They have to provide public information without any constraints. But there was also legislation that demanded that they provide information about the audience they were reaching with their communication, because that actually gave them a reason for existence. They had to measure that audience. Okay, so there you use cookies on the one side. All right, that became a total mess. They put up a cookie wall, and visits to those sites plummeted; everybody simply went to the commercial sites, which mostly just ignored the law.

Under the Dutch implementation of the law, website owners were, until recently, also responsible for all cookies set during a visit to their site, by any party. So they would also be responsible for data leaks at, say, LinkedIn, when they lose a couple of million passwords, which happens every couple of months.
It seems Facebook also gets hacked. Website owners themselves are responsible under Dutch law, or could be held accountable. So this really is an incentive for Dutch shop owners and website owners to start thinking: what am I actually doing through my website? What is it that I offer through third-party means? For instance, is this Facebook Like button only a sympathetic vote, or is it potentially a legal minefield? So any site owner has to have a privacy or cookie statement stating what they do with their information. So that's annoying.

And then came the penalties, of course, because you cannot have legislation without some kind of enforcement. At first, enforcement was zilch: naming and shaming, something like that. But European law actually pretty much demands that data-loss incidents are made public. For the Dutch implementation this would also mean that if you had a LinkedIn button on your website, or something like that, and LinkedIn leaked any information, you as the website owner would also have to inform the visitors who had been on your site. Like I said, that's legislation. It's a mess.

But it gets simpler, thank God. I'll get into that a bit later. It also gets severely penalized in the future in the European Union. Reding, the minister for this particular area, has suggested in the current proposal a maximum of one million euros or two percent of the yearly turnover of a company as a penalty for data-loss-related incidents, and that's quite a lot.
I mean, websites and web shops may have pretty big turnovers, but the margins on that turnover are pretty small. So a hit on that particular margin hits hard. This is actually a pretty steep penalty, I think. Well, I'm asking: how about usability penalties? Because all these cookie walls and stuff like that are also chasing users away. That's a different approach.

But the biggest problem right now in the EU is the fact that all the different countries have different implementations in this field. Actually, it became even more confusing in the Netherlands, bear with me, but now it gets a little clearer, because right now pretty big changes are being made to the interpretation of the law that I referred to earlier. Information domains are being split into two domains: first-party and third-party. Anything you use and store yourself for your own in-house purposes, you can pretty much do without that much trouble. Of course, you still have to comply: if you have a data loss, you have to publish it. But if you keep it in house, you don't have to bother your user with it too much. Which is a very good thing, I think, because nobody likes having to go through cookie walls to get to the web shop to order that fantastic new gadget. And then you have third-party cookies, or the third-party information domain, which basically is everything else.

The review phase for this law ended on the first of July, and it still has to be put into effect, but this is pretty much the way it looks like it's going right now in the EU as well. And actually I think this makes sense. I mean, anything you can control yourself, anything you can host on your own servers or at least keep in your own protected zone, is of a completely different quality than something that's aggregating around the web. Of course, you could ask how far you should go in interpreting this, because if you host your website on, say, Amazon or Azure, then how much is
that first party? I don't know. Those are questions that still have to be answered.

The bigger picture in all this is simply that the European Union has started to take an interest in the digital identity of its citizens, and has very much taken an interest in the safety of, and the control its citizens have over, those digital identities. The discussion is labeled "the right to be forgotten". I like that from a philosophical standpoint, but it is mostly about autonomous control of personal information: a user should have control over who uses their information. The regulation should apply not only to businesses based in the EU area, but mainly to businesses that do work in the EU area. So if a European citizen does business with an American company, European laws should apply. This is something that is not yet firmly in effect, because, well, the US opposes it and, let's say, China mainly ignores it. But still, it's a step in the right direction.

Users have to provide actual, tangible consent before their information can be used, and I think this is a really good thing, because I think that the average user has the awareness of a cucumber when it comes to their own private data, and simply isn't interested. That is a problem which can only be solved by educating people, and, well, people only want to be educated if they are annoyed by the fact that they have to do something, or scratch an itch or something.

The regulation, thank God, will now apply to the entire EU, so you don't get any more fantastic, magical implementations of the law like we had in the Netherlands. Because somehow I don't think the internet stops at the Dutch border.
It's just my impression. And there are some other things: for instance, it should be possible to take all of your personal information with you when you switch content or service providers. The whole idea is simply that keeping information inside the EU is a good thing, both privacy-wise and economically, because otherwise we're simply giving it away, and the EU is still the largest economic area on this particular planet. It would be nice if we did something with it.

Oh well. So what? All this legal stuff is fantastic. I hope you are happy with it; I hope you don't have a headache about all this legal stuff. But: I use Google Analytics. Actually, who here uses Google Analytics? Yeah, all right. But that's a lot who don't, actually. Last time I gave this talk it was like the whole room. Yeah, it's a great tool. And who of you is actually interested in compliance with the law on this point? Who made a point of researching that before they started using Google Analytics? One, two, three... great. Cool. Wow, that's nice. And you don't all work for government agencies or NGOs or stuff like that? You also work for commercial clients? Cool.

Because actually, I know there wasn't much of a consideration at first for me. It was a good tool, let's use it, and my clients asked for it. And then this whole debate started. Well, you could do one thing: screw that, still use Google Analytics, don't mention it to your users, stuff like that. I actually have customers who tell me not to implement stuff like Cookie Control because they don't think it's important. What do you do with that? Do you actually tell your customer, "Oh, I'm not going to build your site, because then you would be breaking the law"? Or, well...
No. I started researching other avenues. And that also became topical again a couple of days ago, since Google decided to change its search behavior again, in such a way that all search queries are now done over HTTPS, over SSL, which basically means that you don't get your keywords anymore. Well, that caused a bit of a panic in SEO land, as you can probably imagine. And of course, from a Google standpoint, well, that's business. They can do that. I don't have that much of an opinion on it. But it's annoying, because suddenly you cannot use the tools you had, because some third-party entity decided to change something. Now, that third party being Google, it's something you probably cannot work around, but all right, it's still annoying.

And what do you do if your user actually does press "no"? Most of the sites I come across, and I do a lot of audits, have cookie banners, Cookie Control, stuff like that, but they don't have Do Not Track or an opt-out or something like that. But what do you do with the user who actually doesn't want to have their information stored? Because pretty soon there may be legislation under which this is enforced as a default setting on clients. There is actually talk about laws to enable Do Not Track by default, which is interesting.

But also from a business point of view: if you use a commercial platform, whether it be Analytics or giraffe or anything at all, what is your exit strategy? What if at some point you simply want to stop using it and use something else?
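To make the Do Not Track question concrete, here is a minimal sketch in Python of how a site might decide whether to emit a tracking snippet at all. It assumes the browser sends the `DNT: 1` request header when the visitor has opted out; the function names `visitor_opted_out` and `tracking_snippet` are hypothetical and are not part of Piwik or of any Drupal module.

```python
from typing import Mapping


def visitor_opted_out(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries a Do Not Track signal (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") == "1"


def tracking_snippet(headers: Mapping[str, str], snippet: str) -> str:
    """Return the analytics snippet, or an empty string for opted-out visitors.

    Hypothetical helper: a real site would call something like this while
    rendering the page, before any tracking JavaScript is added.
    """
    return "" if visitor_opted_out(headers) else snippet


if __name__ == "__main__":
    js = "<script>/* analytics code would go here */</script>"
    print(repr(tracking_snippet({"DNT": "1"}, js)))        # opted out: empty string
    print(repr(tracking_snippet({"User-Agent": "x"}, js)))  # no signal: snippet kept
```

The design choice in this sketch is to treat a missing header as "no preference expressed" rather than as consent; whether that is sufficient under a given national law is a legal question the code cannot answer.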
Getting your data out of Google is pretty doable, but there are also tools which are basically some kind of file locker: you will never get your information out. I actually have clients who used services where leaving basically meant they had to make print screens of all the old reports. There went the data. And that is mostly the argument I have with clients these days: what if Google changes something? What if you have a question that you cannot answer with standard analytics software? And what if you want to leave that software? It's not that much focused on law; it's focused on functionality.

All right, and then Piwik comes in. So we have now entered the second part of this story. I usually hear a sigh of relief at this point. What's Piwik? Piwik is the example I'm going to focus on today. I'm going to show or talk about a couple of alternatives that I also researched, but I found that Piwik is basically the tool I can take to a client, sit down, demo it, and the whole question of Google Analytics simply goes off the table, because it's actually that good.

I don't know if it interests you that much, but once upon a time there was phpMyVisites, which is now an ancient and dead project, because it's now Piwik. Piwik is actually pretty modern. It uses modern PHP platform stuff. It's usable on shared hosting environments, which is a good thing. It integrates well with all major CMSs. I've even done implementations with SharePoint on a couple of occasions, which also works beautifully, completely au contraire to the built-in SharePoint statistics, which suck. It's pretty user-friendly. I don't know if any of you have ever tried stuff like AWStats, but it's horrible. Don't ever put that in front of a user, or give them Valium or something. Most importantly, it stores your data locally, cookies are optional, and it's a big project. It's being used by a lot of websites: according to the Piwik website, over half a million websites use Piwik today. So that's pretty nice.

Are there alternatives?
Yep, there are. It wouldn't be open source if there weren't like five alternatives. We have Open Web Analytics, and functionality-wise that's pretty much on par with Piwik. You can do a lot with that. It's a little older, and the Drupal module is not that polished, so that's why I skipped that one.

CrawlTrack. Well, CrawlTrack is a really interesting thing, because it claims not only to do statistics but also to block hack attempts. I don't know, but I don't need magic potion in my website, and I really don't like my statistics platform doing security stuff. I use mod_security for that.

Then we have AWStats, Webalizer, Analog, W3Perl. They're all Perl or C based. They're static log parsers, and they look like something you would control a space shuttle with. It's horrible.

So those are the alternatives I have researched. If any of you have experience with any others, I'd be very interested to know, because it's always good to have options. But I'm going to focus on Piwik. So what does it do, then? Well, just about anything a normal statistics program does, but, if you want, with cookies, so that you can actually track individual users, or at least computers and sessions. That's trouble nowadays, especially with the IPv4 to IPv6 migrations.
You start to see a lot more address pooling. IP-based stats software really doesn't give any meaningful information anymore. If you look at the Netherlands, for instance, you have mobile telephone providers who offer proxy services for their mobile internet connections, which basically means that something like a million people connect over four IPv4 addresses. Okay, there goes your information. So Piwik can use cookies to do that. It can do click paths, entry and exit pages, all the stuff that you can do with Google Analytics.

But there's more. You can also leave extra information in referrers. You can use annotations. You can do e-commerce implementations: it integrates very nicely with Drupal Commerce for things like abandoned cart detection. You can get automated email reports, you can do campaigns, stuff like that. It's a complete platform.

Is that all? No. It can also do something very nice which Google Analytics cannot do: it can parse server logs. You can actually use it as a background tool. And why is that cool? Well, because if your user decides to press the Do Not Track button, by law you are still allowed to parse server logs and get information out of them, so you can get a much more complete picture of the actual site user. There are also a lot of privacy-related options, like anonymizing IP addresses and purging data, and there are 20 third-party plugins you can do a lot with, so check it out.

Drupal integration: wouldn't you know, there's a module for that. And what does it do? Well, actually the same as the module for Google Analytics: it sticks some JavaScript into your theme to send the information to the Piwik server. The nice thing is that it can do that over both HTTP and SSL, so privacy-wise that's pretty well thought out. It can also offer reporting from the Drupal administration interface.
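The log-parsing and IP-anonymization ideas mentioned above can be sketched roughly as follows. This is not Piwik's own importer, just an illustration in Python: it assumes the server writes the Apache "combined" log format, and the names `parse_line` and `anonymize_ip`, as well as the choice to keep 16 bits of an IPv4 address and 48 bits of an IPv6 address, are assumptions made for the example.

```python
import ipaddress
import re
from typing import Dict, Optional

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def anonymize_ip(address: str, keep_bits_v4: int = 16, keep_bits_v6: int = 48) -> str:
    """Zero out the low bits of an address so it no longer points at one host."""
    ip = ipaddress.ip_address(address)  # raises ValueError for non-IP input
    keep = keep_bits_v4 if ip.version == 4 else keep_bits_v6
    network = ipaddress.ip_network(f"{address}/{keep}", strict=False)
    return str(network.network_address)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Parse one combined-format log line; return None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    try:
        hit["host"] = anonymize_ip(hit["host"])
    except ValueError:
        # The log may contain a resolved hostname instead of an IP address.
        hit["host"] = "unknown"
    return hit


if __name__ == "__main__":
    sample = (
        '192.168.12.34 - - [24/Sep/2013:10:15:32 +0200] '
        '"GET /node/1 HTTP/1.1" 200 5120 "http://example.com/" "Mozilla/5.0"'
    )
    print(parse_line(sample))
```

Anonymizing at parse time, before anything is stored, means the full address never reaches the statistics database, which is the property the privacy options described in the talk are after.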
I'm going to show you that reporting a bit later, because it's really nice. You can have actual live reports in your Drupal content editing interface for your content editors. And you can customize what to track from within Drupal, which is also nice. So you can simply set up a Piwik server somewhere in your farm, have multiple Drupal sites connect to it, and then from the Drupal sites configure what to track and what not to track. You do still need something, at least within the EU, to make sure you're cookie compliant, or at least law compliant, like Cookie Control. But using Piwik and the Drupal Piwik module brings you one step closer to independence from a lot of nastiness.

"How is it in the XSS sense?" I hear, because placing JavaScript on a page could be dangerous. Well, it's no more dangerous than any other analytics tool in that respect, and you are still responsible for your site's security level. But security audits of websites are a different workshop, and I'll focus on Piwik now.

What's offered out of the box for Drupal? It offers visits, trends, visit times, entry and exit pages, referrers, commerce, anything. You can combine reports. You can show them to your users. So that's pretty nice. But instead of talking about it a lot, I'm going to do a demo, and I hope that the internet will not fail me, but in case it does I did prepare a localhost thing.

Seems to be working. All right, this is demo.piwik.org, and I strongly invite anyone to go there and simply have a look around, click around. The best part about Piwik is that it's so insanely easy to customize the interface, and that this can be done by just about anyone in your organization. They can set up their own filters, date ranges, stuff like that. They can select specific filters here, not only dates. They can add widgets to their own dashboard. A user in Piwik can pretty much create his or her own experience, and that's really nice.
That is something most other traditional open source platforms really do not do. There it's like: these are the reports, stick to them.

You have visitor logs, and also very nice graphs. There's a live visitor log: you can see in real time who is doing what on which website, which is also very nice. Usually in Google there is a delay of a couple of seconds or minutes on these things. It has device detection, very nice in the age of responsive websites. It also does location mapping, and it makes a best effort on internet providers. How that is relevant, I don't know, but it could be useful. You can even detect browser settings to optimize your site for a certain audience: if you detect that your site is mainly being visited by English-speaking people, then it might not be the most interesting strategy to only put up a Czech website, or a Dutch one. Operating systems, resolutions: it resolves it all really nicely. You don't get the cruft of a dozen near-identical browser strings which differ by only one letter or something, like you get with AWStats. It simply clusters them very well.

Also interesting: visit times. Engagement is a lovely report: see how much time your users spend on a website. This is done either by the cookies or by the IP translation. You can look at exit actions, entry and exit pages, et cetera. Referrers: campaigns, websites, social, search engines and keywords. Well, here's the thing I was mentioning earlier. "Keyword not defined" is something you will see a lot in SEO tools these days, because Google simply decided not to provide keywords anymore. That is a problem, because that is information you would like to have. But the question is at what cost.

This also has an impact on the built-in search engine of Drupal, or at least the Google version of the built-in search. I don't know, do you have users who use Google to index their Drupal site and search their Drupal site? Do you have any? Okay, I come across it sometimes.
So if you have users like that, then you've lost the ability to track the internal searches as well, and that really sucks. But that actually shows how hard and how fierce the competition in this market area is getting.

Goals: very nice for both e-commerce and content steering. So take a look at Piwik and see what you can do with it. You can also have users set up their own email reports, which is a nice feature as well. I've just switched to my localhost environment because of this. You can also have widgets added to other sites, but let's look at the Drupal integration for a little bit.

It has two modules. It has a module Piwik Web Analytics, which integrates the actual JavaScript with the Drupal site, and it has a module Piwik Reports, which enables an administrator to have a look at the content of the actual Piwik reports. How does it look? Well, like this. If you go to Reports, we now have a new entry, Piwik Reports, which shows just about everything I have just shown you, but directly in the Drupal interface.

Settings, times, et cetera: the configuration for this is also pretty straightforward. Tracking, which means what? Mostly this: which Piwik instance. You have to provide the site ID, which you get out of the Piwik installation, and you have to point the module to the actual Piwik installation. That probably makes sense. You can also track multi-site or multi-subdomain implementations separately. You can track pages, you can exclude pages, anything you want. Just think of it as block configuration, or Context, something like that. You can decide for which roles the tracking code must be enabled. You can allow users to customize whether they want to be tracked or not. Links and downloads can also be tracked, which is nice if you're a content provider: you would like to see what people are doing. Internal search can also be tracked, but you have to switch it on.
By default it's off. And here we go: universal web tracking opt-out, which is actually compliant with Do Not Track. That is pretty unique among open source tracking tools. Page titles. You can add custom variables, useful for campaigns and stuff like that. And there is also what they call advanced settings, where you can add some JavaScript code of your own if there is something else you want Piwik to track. That's also possible through the interface. So you can basically configure it all: no theming or editing is needed to implement this. And it outputs a nice cacheable piece of JavaScript, so there is also no performance penalty.

These are the settings for the tracking reports. That's pretty easy. Just point it to the URL; if you don't fill in anything, it uses the URL that's used for tracking. Provide the authentication string of the user, and you can restrict the sites your users can see through this implementation. That's pretty easy, I think. It's no harder than Google Analytics. And it works. I mean, I have here "Home"; I'm very creative. I have a test page, and if I look into my Piwik and select today's date, I can see that there have actually been visits, and I can see what I have done. Well, actually, that is pretty much it.

No, not yet. What if I want to start using Piwik, but I already have a lot of information stored in other systems like Google Analytics? What to do now? There are migration tools, and that is cool. You can actually take your history from Google, put it into Piwik and synchronize it. You can also do log file imports, so you can pick up the Apache logs from past years and import them too. Really nice. And that was it.

Any questions? If you have questions, please go to the microphones, by the way, because the session is recorded, just like everything else, and that way your question ends up on the recording.

Yeah, hi. Nice talk. I'd never actually come across it. Is it something that's pretty easy to install?
I mean, would you have to get the dev guys to set it up? Because I take it you set it up on a server somewhere.

Yeah. The installation is actually: go to piwik.org and download a tarball, which is mostly not surprising. It's actually three steps: download it, install it, and configure a MySQL database for it, and then you can run it. The installation of Piwik is definitely not harder than installing Drupal; it's even simpler. So if one of your admins can install Drupal, they can definitely install Piwik. And the best part is that you can set up one Piwik server for a whole forest of websites, so you only need one.

That actually leads into my next question. How intensive is it? I mean, what sort of server do you need if you've got, say, 10 sites that are getting a lot of hits?

Yeah, but Piwik actually does asynchronous processing. The actual logging of the requests is done in a completely separate and very small script. It's like 10 lines of PHP code, which simply dumps the log and lets the interface, and the user who requests a report, do the rest. So it's definitely not very much of an overhead.

Last question. You were talking about dropouts, and, say, with a shopping cart, figuring out when people are dropping off. Would you still need to add, I presume, on-click events or some sort of specific tracking code to do that type of thing, or is it something that...

You can go to things like e-commerce and goals in the settings. Let me check... Websites, and you can simply enable e-commerce here, and that will talk to Drupal Commerce, because the Drupal Piwik module will translate the commerce part.

And if you have things like web forms, for instance multi-step ones, and you wanted to figure out if people are dropping off after a couple of steps?

You can look at that with entry and exit pages pretty well. That's here: entry and exit pages. You can see where somebody comes in, what their path is and where they go out.
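What an entry/exit-page report and a drop-off count boil down to can be sketched in a few lines of Python. This is an illustration only, not how Piwik computes its reports; it assumes each session is already available as an ordered list of page paths, and the function names and example paths are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def entry_and_exit_pages(sessions: Sequence[Sequence[str]]) -> Tuple[Counter, Counter]:
    """Count how often each page is the first or the last page of a session."""
    entries: Counter = Counter()
    exits: Counter = Counter()
    for pages in sessions:
        if not pages:
            continue  # skip empty sessions
        entries[pages[0]] += 1
        exits[pages[-1]] += 1
    return entries, exits


def funnel_reach(sessions: Sequence[Sequence[str]], steps: Sequence[str]) -> List[int]:
    """For each step of a multi-step form, count the sessions that visited it."""
    return [sum(1 for pages in sessions if step in pages) for step in steps]


if __name__ == "__main__":
    sessions = [
        ["/form/step-1", "/form/step-2", "/form/done"],
        ["/form/step-1", "/form/step-2"],
        ["/home", "/form/step-1"],
    ]
    entries, exits = entry_and_exit_pages(sessions)
    print(entries.most_common())
    print(exits.most_common())
    # Three sessions reach step 1, two reach step 2, one completes the form.
    print(funnel_reach(sessions, ["/form/step-1", "/form/step-2", "/form/done"]))
```

The step where the reach count falls most sharply is where visitors drop off, which is the question about multi-step web forms raised above.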
You can also see in engagement what the actual route is that users are taking. Here, in the visitor log as well, you will see... wait, this is my current user. And you see that it is already bundling the site requests I made a couple of seconds ago into one session. So it tries to make a session of it, so that you can analyze the sessions and see where users drop off. And this of course is also something you can automate. So yes, you can have reports to see where users stop being interested and drop out.

I'm sorry, can you speak up?

I presume that also works with repeat visits, so when people come back to the site it kind of picks up that they've been there before and all that?

Yeah. If you have cookies enabled, then it will beautifully pick up the session again.

Thank you.

All right, thanks. Anything else? Cool. Yeah?

Thanks. We're actually running Piwik, and then later we had to install Google Analytics as well, because the marketing people wanted it too, and their main argument was that Piwik lacks an integration with Google AdWords. So are you aware of any way that Piwik is capable of doing this? I mean, to integrate with AdWords in some way?

Well, it's pretty hard for just about anybody other than Google to integrate with Google AdWords really well, which is probably one of the reasons why they're so damn successful. In these cases, this is not something that's solvable by Piwik, because the actual information that is stored in the Google request is simply not being provided by Google in any usable way, because it's all HTTPS
So basically you're you're blocked but what you can do is add several variables custom variables to Your campaigns so that you can track what is The actual traffic coming from a certain Server and actually AdWords come from a different refer than regular Google search Up search actions so you can differentiate with that, but if you use AdWords Yeah, you basically have to use Google tools to do it. Well But however once you're Inside your own website Especially due to the fact that pwik can also parse logs and do stuff that you can actually do without scripting or stuff like That you actually get a lot more information a lot more factual information about The behavior of people on your website and that's also valuable. So I think it's two types of information each Has its own merits Any other questions right now No, well, thank you very much for being here and if you do have questions then look me up There is a brief announcement from the organizers of the conference Well, which will be presented by this person here Hi everyone I've just been asked to make an announcement which we're trying to feed out some information to people who are attending on the site builders track So I'll just read it out to you Drupal.org content working group wants to improve Drupal.org for site builders One thing they're working on is building landing pages Pages where you can find relevant information around site building topics They're working on finding out what has to be on those landing pages and That's where you can help So if you have 30 minutes to talk about how you figure things out with Drupal using Drupal.org Leave your name and some contact details at a form which you'll find online It's a bit.ly SB interview So that site builders interview if you'd like to take part in that Working group members will contact you and set up a time when you can have that chat. Thank you