Welcome to our almost final session of MoodleMoot Global. I'd like to introduce Jordi from Terrorist and Antonio from Kedipant, talking about cloud infrastructure for high concurrency. Thank you.

Thank you very much, and thank you for coming. First of all, we will present ourselves, and then we'll get into the topic. OK, anything better? Hi, everyone. Thank you for coming. Let's talk about high concurrency. This is the agenda we'll follow, but first of all, we'll present ourselves. I am Antonio Bertram, CTO of CIPUN, a Moodle Partner here in Barcelona. We created CIPUN 15 years ago, and we have been involved with open source projects since the beginning. We have worked with Moodle for more than 10 years.

Hi, I'm Jordi Molina, from Terrorist. We founded Terrorist in 2014 to join a hosting business that my partner had run since 2003 with the consulting and DevOps business that I had run on my own since 2017. Today we will talk about what we have found in the public cloud that is interesting to know when you are deploying Moodle there, because not everything in the public cloud is as simple as they make it seem in YouTube videos and Twitter accounts.

So, how many of you are working with the public cloud? OK, one, two, three. How many of you are working with AWS? OK, there are more hands for AWS than for public cloud; that's interesting. How many with Google Cloud? Azure? OK. Cool.

So probably you got a visit from a sales representative or a partner of one of the aforementioned public clouds saying: come with me, you will have unlimited scalability, your code will magically work on my new infrastructure, you won't need DBAs anymore, and we have an API for everything. Because, you know, when something has a problem, you put an API on it and the problem disappears. Well, that's what they tell you when they are selling the solution to you.

Once you start working on it, you discover that there are limits that sometimes are not even documented in their own documentation, the documentation they provide for you to learn how to use those APIs and services. If you decide to go the platform-as-a-service way, the platform as a service has its own limitations and requirements, and you need to adapt your code to it. Regarding databases, well, you can more or less forget about backups and updates, but you still need the DBA to tune your queries and check the parameters on the databases, because, quite curiously, it turns out that the parameters they set by default are "spectacular" in most of the public cloud providers. And yes, every service has its own API, written by separate teams, so most of the APIs are unrelated; they only share the authentication mechanism, and you need to learn how to work with each of those APIs separately.

So those are the different things you need to do. It's not as easy as they make it appear in marketing materials. You have to go the same way you used to with on-premises infrastructure, making decisions based on cost. If you have a high SLA, let's say more than 99.5%, you will need to work this way. Why? Because, I don't know if you have checked the public cloud providers' documentation, but most of them offer that SLA for individual availability zones. So if you need more than that, you will need to combine more than one availability zone, or more than one resource, and that implies that you put in a load balancer and create a pool of application servers.
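To make that last point concrete, here is a minimal sketch, in Python with boto3 and assuming AWS, of that front layer: an application load balancer spanning two availability zones, plus a target group for the pool of application servers to register into. The region, subnet, security group and VPC IDs, and the health-check path are hypothetical placeholders, not anything from the presenters' actual deployments.

```python
import boto3

elb = boto3.client("elbv2", region_name="eu-west-1")

# Application load balancer spanning two subnets in different availability zones.
lb = elb.create_load_balancer(
    Name="moodle-alb",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],  # one subnet per AZ (placeholders)
    SecurityGroups=["sg-0123456789abcdef0"],         # placeholder security group
    Scheme="internet-facing",
    Type="application",
)

# Target group that the pool of Moodle application servers will register into.
tg = elb.create_target_group(
    Name="moodle-app-pool",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",          # placeholder VPC
    HealthCheckPath="/login/index.php",      # hypothetical health-check URL
)

# Forward incoming traffic from the balancer to the pool.
elb.create_listener(
    LoadBalancerArn=lb["LoadBalancers"][0]["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{
        "Type": "forward",
        "TargetGroupArn": tg["TargetGroups"][0]["TargetGroupArn"],
    }],
)
```

An auto scaling group, sketched further below, would then register its instances into that target group.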
The Moodle data has to be placed on a networked file system, or any kind of distributed file system that performs well; we'll talk about that later. You will need to separate the cache services at some stage, such as the accelerator you use, and the database will also be a separate service.

Now I will say something here that is an unpopular opinion. It's unpopular because most of the cloud providers say: split everything and build a distributed architecture. But well, we're talking about Moodle here. It's a PHP monolith; that's not bad, it's just a monolith. If you don't need more than 99.5% high availability, you can go with a single VM and still scale a lot, because right now we think the biggest machine has something like 128 cores and 1 terabyte of RAM. You can put a lot of users on that machine before having to go the distributed way, and going that way is not an easy task. Why? Mainly, and I don't know if I talk about that in the next slide, mainly because the load balancers are not something you control. They are controlled by the cloud providers, and most of the time they have hidden capacities. So, for instance, in AWS, if you are expecting a spike in load, which is not linear, you have to warn them, saying: look, at 10 o'clock tomorrow morning I will start a training course, and 1,000 people will log in suddenly between 10:00 and 10:05. And you have to do that because, if not, the infrastructure in Amazon expects the load to increase linearly, and then the load balancer does not respond properly. I won't talk about the Azure load balancer because I think it's not even worth mentioning; you will be left crying in a corner. And in Google you have the same issue. So it's something normal: they prepared the infrastructure for linear growth, because it's impossible to predict spikes in newly deployed infrastructures, or in infrastructures that are only used once or twice a year.

Then you have to plan for a pool of application servers. You may have heard about auto scaling in the public cloud; it's one of the main features they use to sell the public cloud. Well, it has its own problems. When you work with auto scaling, you have to plan very well how you will implement it; more on that in the next slide.

Then we get to the networked Moodle data. I was lucky enough to build my first big Moodle installation in AWS before they had an NFS service offering. It may sound strange, but the networked file system offerings from public cloud providers are more focused on keeping legacy applications working while you move the contents of the shared file system to object-based storage like S3 than on being designed to perform well. What this means is that Elastic File System from AWS performs like crap on small files (a quick way to measure this is sketched below). I was lucky enough to have to look for another solution, because there was no Elastic File System before 2015, I think it was, when I built that first Moodle in AWS. We used a solution called SoftNAS; it's basically a cluster-based NFS service, and it performs brilliantly. Google and Azure have their own NFS services based on NetApp, which is a network-attached storage solution provider. It performs way better, but you have to spend, depending on whether it's Google Cloud or Azure, something like $2,050 a month to begin with. So it's not that good either. As for the other services, yes, you can separate them as well, even with a single-VM approach.
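Regarding the small-file performance point above, here is a minimal sketch, in Python, of the kind of micro-benchmark you could run to compare small-file latency on local disk versus the NFS/EFS mount holding moodledata. The mount points are hypothetical examples; this is an illustration of the measurement, not the presenters' tooling.

```python
import os
import time
import uuid

def small_file_benchmark(directory, n_files=2000, size=4096):
    """Write and read back n_files small files in the given directory,
    returning (write_seconds, read_seconds)."""
    payload = os.urandom(size)
    paths = []

    start = time.time()
    for _ in range(n_files):
        path = os.path.join(directory, f"bench-{uuid.uuid4().hex}")
        with open(path, "wb") as fh:
            fh.write(payload)
        paths.append(path)
    write_s = time.time() - start

    start = time.time()
    for path in paths:
        with open(path, "rb") as fh:
            fh.read()
    read_s = time.time() - start

    for path in paths:  # clean up the benchmark files
        os.remove(path)
    return write_s, read_s

if __name__ == "__main__":
    # Hypothetical targets: local disk vs. the network mount for moodledata.
    for target in ("/tmp", "/mnt/moodledata"):
        w, r = small_file_benchmark(target)
        print(f"{target}: wrote 2000 x 4 KB files in {w:.1f}s, read them back in {r:.1f}s")
```

On a latency-bound network file system the same loop typically takes many times longer than on local disk, which is the effect being described above.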
If you are using a single VM, let's say one with 32 cores, and things start to go badly, you can just separate the cache and the database into the services that the providers offer for Redis, for Memcached, or for the database of your choice; they work pretty well. Well, the Redis service in Azure doesn't work well, but it's expected to improve; at least that's what they told me last week.

Now, for deployment and auto scaling. If you are using a single VM, there is no change: you still deploy the code the same way you were doing it. But if you go with the auto scaling group option, you have two possible approaches. One is the golden image approach, which means you generate a silver image first, with the operating system, the PHP application server, nginx or Apache or whatever you want, and the dependencies if you use special libraries. Then you add your code to that silver image to generate a golden image. The other is the deploy-on-boot approach, where you have the silver image, the auto scaling group generates a new server, and you place the code there. Both approaches are fine, but you have to take into account that the deploy-on-boot approach will take longer, because every time you boot a server you have to perform that task. With the golden image, when a new server is created by the auto scaling group you don't have to do anything on that server, but you still have to warm up the application server.

When you plan for elasticity and auto scaling groups, and I will go fast here because otherwise my colleague won't have time to speak: plan for a minimum capacity. Making an auto scaling group with a minimum of one that grows on demand is a very bad idea; you always need at least two or three servers to maintain the capacity at rest. Plan for a maximum expected capacity, because if not, your cost will rise endlessly. Plan for health checks. Health checks are the most critical thing in an auto scaling group: if you cannot detect that an instance is performing poorly, or not working at all, you will have problems like users suddenly not being able to work, then they refresh and everything works, then they submit and it fails again. So plan for health checks.

When you define the rules for auto scaling, use this model: increase fast, decrease slowly. It's better to add two instances and remove them one by one, or add three instances and then remove them one by one once the load decreases, than to add one at a time. Why? Because you are attending to a surge in traffic. A surge is a heavy spike; you probably won't be able to handle that surge with just one machine, and machines take time to boot. So increase by at least twice what you decrease (a minimal sketch of such a scaling policy appears below). Once the new instance is created, warm it up: load the main pages on boot, maybe with a script, so the code is already in memory and users start getting better response times. Also, some of the cloud providers offer a feature in the load balancers that lets you warm up the instances slowly, so they don't redirect the full traffic immediately; they spend two minutes redirecting just part of the traffic to these new instances. And when dealing with PHP monoliths, it's better to use bigger instances with a high number of CPUs. And that's it.

Now, the cloud layer. I already spoke about load balancers and that they sometimes need to be warmed up. Costs will skyrocket. I wanted to speak a little bit more about that, but I talked too much on the earlier slides.
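As referenced above, here is a minimal sketch, in Python with boto3 and assuming AWS, of those rules: an auto scaling group with a minimum of two instances, a capped maximum, load-balancer health checks, and simple scaling policies that add two instances on the way up but remove only one on the way down. All names, ARNs, and cooldowns are hypothetical, and the CloudWatch alarms that would actually trigger the policies are omitted.

```python
import boto3

asg = boto3.client("autoscaling", region_name="eu-west-1")

# Pool of Moodle application servers: never fewer than 2, capped so costs
# cannot grow forever, and health-checked through the load balancer.
asg.create_auto_scaling_group(
    AutoScalingGroupName="moodle-app-asg",
    LaunchConfigurationName="moodle-golden-image-v42",  # hypothetical golden image
    MinSize=2,
    MaxSize=12,
    DesiredCapacity=2,
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
        "targetgroup/moodle-app-pool/abc123"  # placeholder ARN
    ],
    HealthCheckType="ELB",
    HealthCheckGracePeriod=300,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # two AZs
)

# "Increase fast, decrease slowly": add two instances when the scale-out alarm
# fires, remove only one when load drops again.
asg.put_scaling_policy(
    AutoScalingGroupName="moodle-app-asg",
    PolicyName="scale-out-fast",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=2,
    Cooldown=300,
)
asg.put_scaling_policy(
    AutoScalingGroupName="moodle-app-asg",
    PolicyName="scale-in-slow",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=-1,
    Cooldown=600,
)
```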
Just take into account that network traffic costs money, that load balancers charge you by traffic and by processed data, and plan for capacity. Some cloud providers allow you to buy infrastructure at a lower cost using reserved-capacity offers. Clean up, and watch out for your bill. Let's continue.

We continue with getting Moodle itself ready. Take into account that the image should be very well designed, with the latest version of PHP, which performs better, and with all the requirements of your server. You have to analyze what plugins and things you have in your code. Check it: why do I need this additional plugin when I can do the same with a core plugin that I can be sure works better? Probably, or not. Then, when checking additional plugins and themes, always look at who maintains them, the support and user base they have, and whether the version is up to date, and ask whether you really need all of them: these 43 plugins here, do we need them, yes or no? We don't know. We also suggest checking whether you have any broken plugin or theme in there; anything broken has to go.

We also suggest running the simple benchmark test. It's a simple benchmark, but it gives you some guidelines: if everything is green, it's OK, though you should still review it later; at least make sure nothing is red. Then, stress it. To stress test, you can use a JMeter test plan; Moodle helps you do that, it has an option for generating one. But always do it in an environment that replicates what you have in production, in pre-production or testing: same hardware, same architecture, same version, same plugins, everything. Then we recommend adding a monitoring tool to gather some data, enabling the development mode, making the test plan, and running the test. You can create it large or extra large, depending on what you want to test. Then launch the client with JMeter and analyze it: get it running, gather all the metrics, and analyze all that data. It's very important to have the same architecture you have in your production environment. I went very quickly, but we are late. Any questions for us? Thank you.

We only have time for one quick question. Go ahead. Sorry, I got here a little late. You mentioned tons of users; can you give us a more precise number of how many "tons" is, at which it's actually better to scale horizontally instead of vertically? Ten thousand concurrent users; you need more than 32 gigabytes of RAM for that. That installation of Moodle with 10,000 concurrent users at 9 a.m. was the opposite of what Tony said: it had a lot of plugins that were needed, and it performed horribly, so it needed something like ten r3.8xlarge instances to work. Since then they have improved it, and it works with smaller machines and a smaller pool of servers. At about 10,000 concurrent users, it's a good point to start.

Do you have any experience with GlusterFS or some other distributed file system, and what do you use? With GlusterFS, let's say it was a proof of concept. It was not a big environment, but we had the same issue as with Elastic File System from Amazon: small files perform very badly, so it is not the best solution for Moodle data, sorry. What we found out is that ZFS, from the old Solaris, works like a charm, and NFS exported on top of ZFS works brilliantly. SoftNAS (I'm not endorsed by SoftNAS in any way, by the way) is a solution based on ZFS and Linux; you can cluster two nodes, and we are running the big Moodle instance on it, with two terabytes of data in the Moodle data directory, and it works flawlessly.
It's a really good solution, but we have made tests with homemade ZFS-on-Linux setups and they work the same. What SoftNAS gives you is a nice management interface and a very expensive license. Great, thank you. Thank you. If you want, we are around here; feel free to reach Antonio or Jordi if you have any more questions.

Those on the stairs, there are plenty of seats here if you'd like to stay for the final session. Please do; move along the row so that if people come late there's space. The front row has very comfortable seats where you can stretch your legs, and I don't think the presenter bites. No, I don't. You can either use this one or, if you want to walk around... Dany, can we have that mic? Much better. I'm happy to say hi myself.

Hello folks, bon dia. My name is Jeff Rubinstein. I'm the Vice President of Product Strategy, and welcome to the last session of the afternoon, I think. This could not be more different from the one you just heard, which was very practical, hands-on stuff you should do right now. I want to talk for a few minutes about what you should be thinking about over the next three or five years: learning analytics. Part of my job is that I work with the IMS Global group on standards like LTI and OneRoster, and my favorite bit is learning analytics, which has been around for a few years. Everyone wanted to engage in this endeavor, but we've been stuck in the hype cycle, in that trough of disillusionment, and I think we're just starting to come out of it now and have some progress to report. So at this point it's something you should keep your eye on and begin thinking about for the next couple of years.

So what is the vision? What do we want out of learning analytics? Well, there's something in it for everyone. For students, ideally they'll get some visibility into their own work and how they can make it better, by benchmarking it against their colleagues or prior cohorts, that sort of thing. For instructors, they will get a complete view of student activity across all the systems those students use; this is going to improve the student experience and improve the instructor's ability to help that student achieve. For the institution, one of the main benefits is that you can intervene earlier, because you can see when a student may be in trouble: they're at risk of not graduating, or of failing the classes they're in. And for researchers, we can now get a much larger data set and begin doing better research on what helps students succeed, in terms of their behaviors, the content types they use, et cetera. These are the goals. We've got a couple of these coming up pretty soon, but this is where we want to be taking our progress over the next couple of years.

What's the biggest challenge today with learning analytics? Today, it's getting the data in one place. Most of you probably have an architecture that looks something like this. You've got a student information system at the bottom that knows what students you have and what courses of study they're enrolled in. That feeds roster information up to your LMS, or your VLE if you're British. And then you launch that student into some number of tools, often via LTI. That tool may be a piece of courseware designed by a publisher like Pearson or McGraw-Hill. It might be an ebook platform like VitalSource or others. It might be a collaborative environment like Adobe Connect or BigBlueButton.
It might be a tool where you play videos, like Kaltura. And the problem is that all of these systems know a piece of what Jeff did. Sure, you can log into them one by one and see what Jeff did in tool number one, then log in and see what Jeff did in tool number two, and then in tool number three, but no one is ever going to do this. Teachers will not do this; they haven't got time, and it's not going to work. So this explains the next step in the learning ecosystem, and that is the LRS, or Learning Record Store. The idea is to get all that data, from all of those tools that each know a different piece of information about Jeff, into one place, and not just into one place, but in a way that carries along some semantic information about the activity itself, what context Jeff was working in, and what kind of thing he was doing.

There are two ways in the ecosystem now that this can be done. Some of you may have used xAPI in the past; it's been around for longer. The one propagated by IMS Global is called Caliper; it came out about two years ago, and we were one of the first participants in Caliper. These formats are actually fairly similar, as you'll see in a minute. My sense is, and again don't commit yet but keep an eye on this for the next few years, that the corporate training world is going more toward xAPI and the education world is going more toward Caliper, for a couple of reasons. One is that Caliper is kind of the flip side of LTI: if you launch with LTI, you can get data back via Caliper, you have a complete loop, and you won't need any other authentication mechanism, whereas right now, to get xAPI into your data store, you're going to have to set up a shared secret or something between those two systems, which is one more thing to worry about, whereas Caliper can build on what LTI has already done. Secondarily, Caliper is creating a structure where they can pack more semantic richness into the messages it sends.

These, briefly, are the differences: Caliper is by IMS Global, xAPI is managed by ADL, the folks who brought you SCORM, and there are some existing LRSs out there that already consume Caliper and xAPI. I forgot to mention up there, actually, you may have heard that Blackboard just released a product called Blackboard Data. That's also an LRS, which they're going to fold into their version of Moodle, which they can't call Moodle anymore and now call Open LMS.

If you don't know who IMS Global is, I encourage you to take a look, because, well, it's us. IMS Global is a consortium of universities, of edtech providers like myself, and of publishers who have content they want to deliver to universities. There is now a European group at IMS, and they met, I think, a month or two ago here in Barcelona. You can find out online which universities are actively participating, and I encourage you to take a look and participate if you can, because at the end of the day it's just us. We all volunteer our time to try to raise the level for everyone, to encourage more interoperability and more efficiency in how we do operations. I just put a link up there to one of the newer standards, LTI Advantage, which is the newest version of LTI that IMS is now working on.

And so what is Caliper? Caliper is an open standard for telemetry, for sending student activity data from a tool provider or an LMS to an LRS.
It has some semantic richness to it, that is to say, it has a controlled vocabulary of terms, and this is extensible. That vocabulary carries along with the information, so in the case of video, as you'll see in a minute, it's things like: Jeff started this video, Jeff paused the video, Jeff restarted the video, et cetera. So there's meaning there; it's not just raw data. And its goals are to do everything I mentioned earlier: to get a complete view of student data, to be able to intervene earlier when students are in trouble, and to do research on large data sets of student behavior and student outcomes.

The basic format, and xAPI is similar, is a triple: actor, action, object. Jeff did this to that object. And this applies to almost everything you can think of, with a different set of verbs depending on the tool you're in. So there's a controlled vocabulary, what they call a metric profile, for things like logging in, which you can get from logs, of course, but then you've got to put it somewhere else to do research on it; video views; ebook reading; assessments. And you can get all this stuff sent to a central repository and then begin to work on that data.

To give you an example: Kaltura, as a video provider, emits on what's known as the media metric profile. It's just a little package saying Jeff did this, so Jeff started this video, with a time and date, in a wrapper, and it's a message that Kaltura will send to the endpoint you specify, and that endpoint is your LRS. The same is true for other technologies in the market. There is an ebook profile. I don't know if you have a provider here in Europe yet that is as well implanted as VitalSource, but VitalSource is a big provider of ebooks in the US, and they can send data like: Jeff read page 5 of this book, or Jeff spent 5 minutes on page 6. So you can see how this data gets useful, because if Jeff read all of his ebooks and watched no videos, he's probably OK. If he read no ebooks and watched all his videos, he's probably OK too. But if I watched no videos and read no ebooks, I'm probably in deep trouble. And that only emerges when you get this data in one place.

I will skip this for now because we're getting low on time, but I wanted to show you that this is not just smoke and mirrors; this is actually happening. This is a visualization product built on top of an LRS, which is in testing right now at the University of Michigan, one of the larger research universities in the US. They built this visualization themselves, and it sits inside their LMS, but the data it points to is data in the LRS. What it does is tell a student, and this is actually student facing (is that a good idea? we don't know yet), what a student who is excelling in this class, who is doing a good job, did: watched this many videos, read this many ebooks, did this in the LMS. Here's what I did. Maybe I should be more like that student who is watching many more videos, reading more of the ebook, and taking more of the assessments. And that is facing the student him or herself. This is now live; other schools are using Tableau or QlikView or some other visualization tool to visualize this data, but once the data is here, this is now possible.
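To show what such a message might look like on the wire, here is a minimal sketch, in Python, of a Caliper-style envelope carrying one "Jeff started this video" MediaEvent, posted to an LRS endpoint. The endpoint, token, and identifiers are hypothetical, and the field names follow the Caliper 1.1 envelope only approximately; treat it as an illustration of the actor-action-object idea rather than a validated payload.

```python
import datetime
import json
import urllib.request

# Hypothetical LRS endpoint and bearer token; the tool (e.g. the video
# platform) would normally be the sender, this just illustrates the shape.
LRS_ENDPOINT = "https://lrs.example.edu/caliper/v1p1/events"
API_TOKEN = "replace-me"

now = datetime.datetime.utcnow().isoformat() + "Z"

# Envelope with one MediaEvent: actor (Jeff), action (Started), object (a video).
envelope = {
    "sensor": "https://video.example.edu/sensors/1",
    "sendTime": now,
    "dataVersion": "http://purl.imsglobal.org/ctx/caliper/v1p1",
    "data": [{
        "@context": "http://purl.imsglobal.org/ctx/caliper/v1p1",
        "type": "MediaEvent",
        "actor": {"id": "https://example.edu/users/jeff", "type": "Person"},
        "action": "Started",
        "object": {"id": "https://video.example.edu/videos/554433",
                   "type": "VideoObject"},
        "eventTime": now,
    }],
}

request = urllib.request.Request(
    LRS_ENDPOINT,
    data=json.dumps(envelope).encode("utf-8"),
    headers={"Content-Type": "application/json",
             "Authorization": f"Bearer {API_TOKEN}"},
    method="POST",
)
with urllib.request.urlopen(request) as response:
    print(response.status)  # an LRS would typically answer with a 2xx status
```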
Now, don't get me wrong: once you get this data in one place, all the actual work is still ahead of you, analyzing it for what it means for student success. But at least it's possible now, and it was much more difficult before; it could take person-weeks of massaging data sets and doing ETLs, just impossible at scale. But this is really here, it's here now, and we're going to see much more in the next few months and years. Any questions? Thank you for your kind attention, and enjoy the rest of the conference. Any question?

I had never heard about Caliper; I've known about xAPI and learning record stores for some years now, but the actual object model sounds quite similar. What is the difference between Caliper and xAPI? So, they're very similar standards in the information they send; there are some architectural differences. xAPI doesn't have the advantage of LTI to set up its authentication mechanism, so xAPI needs more work to deal with the system-to-system negotiation and authentication. It was also designed more around SCORM, and it has the ability to embed SCORM packages and things like that, so it's unclear whether it's going to be as responsive to education-specific needs as Caliper will be, because Caliper is meant to be the flip side of LTI, and it's meant to be extensible in a way that any education tool provider can pretty quickly get a new metric profile approved and put into the standard. So as soon as somebody comes up with a plagiarism tool, or other things you may see here on the floor, there is a very quick mechanism to get that, with rich data about what it means, into your LRS. So if you're invested in xAPI, there's not a particular reason to move unless we see that the industry goes mainly toward Caliper. For us, by the way, at Kaltura, we do both; we serve both markets, and we're happy doing either one. My guess is that more of the education providers you'll see in the future will go toward Caliper, but it's an evolving thing, so keep your eye on it.

This might be a bit difficult because I am not into xAPI or Caliper, but a lot of video is now on YouTube, so how would I go about this? You are saying that with Caliper I would have to set up some LTI towards YouTube, and then that would work with Caliper to bring back data? YouTube is very commonly used, because there's nothing else to use that's that easy, and video is a heavy content type, it's expensive. I don't know what's going to happen with YouTube over time, but if I were in your shoes I wouldn't rely on it for the long term, because at some point they have to make money off it; I don't know what's going to happen in the end. But anyway, the real problem with YouTube is that it doesn't give you any data back. YouTube is not instrumented to collect that kind of user data and send it back; all they want to do is show ads. So you have a couple of options: you can migrate those videos into something that's more of a commercial product, and we can do that; we can also actually stream the YouTube video through our player, which is instrumented, and if you stream YouTube videos through our player, you get all the data back. Right, but suppose one could convince, say, Google to add this: the user could add a sort of LRS token to their profile, so that once you view a video, the data would be sent directly to the LRS. Right, in principle it would work. I mean, you would need authentication into YouTube to begin with, so you would have to go from your LMS into YouTube in a way that carried with it an authentication token and a user ID, which is what LTI does, and then you would need YouTube
to be instrumented in a way that it would then gather the data, send it in the Caliper format to your LRS endpoint, and you would need authentication there as well, which again LTI would do for you with Caliper. That's one advantage Caliper has over xAPI, because xAPI doesn't have that authentication on the front end that it could then pass to the LRS on the back end. Thank you. Thank you so much. Thank you very much, everyone, for attending this session. The next one is the Moodle Roadmap Brainstorm down in the Global room, so we all need to clear the room here. I was told I had to clear the room; I'm not exactly sure why. Thank you.