Feedback on the design: I personally don't have a lot. It all looks fine, and the overall scaling makes sense. I don't love Apache, but I don't hate it either; it works, so that's fine. If PHP 7 is supported, absolutely use it, it's a lot faster. MariaDB Galera: as you said, we also support Oracle and we support Postgres, but a MariaDB Galera cluster seems to be the most popular setup nowadays, so I think that's a good choice. SSDs make sense.

Have you said anything about caching here? Redis? No? Then we would absolutely recommend that you use Redis for caching. In the reference architecture it is only added in the biggest tier, if I saw that correctly, but you can absolutely use it in your setup too; it's really recommended. "Yeah, we actually had it in the planning phase, but I forgot to put it in the diagram." It's used for general NextCloud caching and also for session management, and it's just a lot faster.

One thing I wanted to say about the load balancer, just so it's clear: you mentioned at the beginning that SSL termination should happen in the load balancer. It can, and it's a common setup, but there are two ways to do it: terminate SSL at the load balancer, or terminate it at the web servers. There are pros and cons to both. Terminating SSL at the web servers is probably cheaper to scale, because at some point you need a bigger load balancer, especially if it's a hardware appliance; that is a single instance, and for scaling, a single instance is always bad, whereas web servers scale simply by adding another node. So terminating at the web servers may be cheaper and easier to scale. The drawback is that the traffic then passes through the load balancer encrypted. The load balancer can't see the data, so it can't do things like sticky connections, where a given client's requests always go to the same node, or other, more intelligent ways of handling the traffic, because everything is still encrypted at that stage.

"I thought SSL offloading just meant you talk plain HTTP to the web servers instead of HTTPS. That's what I thought the SSL part meant." No, I mean you have to terminate the SSL connection either at the load balancer or at the web server. "But you said it goes through the load balancer." If you terminate it at a web server, then it goes through the load balancer encrypted, and the load balancer can't read cookies, for example, or do other things. So there are pros and cons to both approaches; I just wanted to mention it as something to keep in mind. Sticky connections, where a request always goes to the same node, have a real benefit: they enable local caching. You can, for example, keep session management local to that machine, and (I'm sure Frederick has some feedback here) everything you can do locally is better, because it scales linearly. If you instead have to maintain a Redis cluster and every request has to cross the network to that cluster and back, you add that round trip everywhere. I'm not saying it's bad; it's a recommended setup and it's fine. Just keep in mind: the more you can do locally, the better it scales.
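For reference, both the caching and the session-management recommendation are small configuration changes. A minimal config.php sketch for the Redis part, assuming a dedicated Redis host (hostname and port are placeholders):

```php
<?php
// config/config.php (excerpt) — minimal Redis caching sketch.
// 'redis.example.com' is a placeholder for your actual Redis box.
$CONFIG = array (
  'memcache.local'       => '\OC\Memcache\APCu',   // per-node cache, no network hop
  'memcache.distributed' => '\OC\Memcache\Redis',  // shared cache across web nodes
  'memcache.locking'     => '\OC\Memcache\Redis',  // transactional file locking
  'redis' => array (
    'host' => 'redis.example.com',
    'port' => 6379,
  ),
);
```

For the session-management side, the phpredis extension can point PHP's session handler at the same Redis instance (session.save_handler = redis and session.save_path = "tcp://redis.example.com:6379" in php.ini).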
I don't want to go into the MySQL versus MariaDB discussion. Both work fine; that's exactly why I don't want to go into the topic. In the NextCloud context there is no difference that I'm aware of. And I won't get into the fact that part of the reason we're here at all is that Frederick Steyer pushed this a lot due to licensing and all this stuff. [crosstalk, partly inaudible] Okay, let's move on. My statement is still valid: it doesn't matter for NextCloud, both work.

So, Oracle absolutely works. There are big installations that use it and are happy with it. When we made these recommendations, the goal was to come up with something that fits most scenarios, and in my experience only a relatively small percentage of people are fine with buying Oracle licenses. That's why I didn't put Oracle in as a recommended database. But if you're happy with it and you want to use it, it's absolutely fine; it works and it scales well. I wouldn't call it the standard solution; it's just a bit more expensive, more complex, and more difficult to use. But it works. "Sorry, but my customers' systems and requirements make me use it." It works, and that's the main reason we support it: there are people who already have Oracle setups they paid thousands of euros for. "So how do you support it? Or do you say, okay, it's a niche?" We don't exclude them; it's fully supported and everything works. I'm just saying that as a developer I don't like programming against it, but that's my problem, not yours. If you're paying for the licenses anyway, I don't particularly mind.

About these recommendations shown earlier, and what we wanted to achieve with them: at the beginning, as ownCloud, we had lots and lots of different things you could use; the software supports all kinds of crazy setups. And then we noticed that people actually deploy crazy setups. So we wanted to make a distinction between what merely works and a few templates or blueprints, because people come to us and say, "I want to run this, but I have no idea what I should do," and then we can say: do it like that. This doesn't mean it's the only way to do it; it's just the recommendation. There are no bad intentions here; Oracle is probably just missing and we will fix it. It's maybe an artifact of the former project, where the policy was that Oracle was only for enterprises, but the code was actually in the community edition and no one was told how to use it. A weird situation. [inaudible] Anyway: in NextCloud it's fully supported, you can absolutely use it, and we'll fix the documentation. It's just that Oracle, in my opinion, and I don't know if you agree, is more for people who know what they're doing. It's a more grown-up database, in a good way and in a bad way: very powerful if you know what you're doing and have the money and the knowledge, then it's great. If not, maybe use something simpler. That's my opinion, but it's fully supported.
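To make the "both work" point concrete: the database backend is a single setting in config.php, so the choice isn't baked into the architecture. A sketch with placeholder host and credentials:

```php
<?php
// config/config.php (excerpt) — 'mysql' covers MySQL/MariaDB; 'pgsql' and
// 'oci' (Oracle) are the other supported backends. Values are placeholders.
$CONFIG = array (
  'dbtype'     => 'mysql',          // or 'pgsql', 'oci'
  'dbhost'     => 'db.example.com',
  'dbname'     => 'nextcloud',
  'dbuser'     => 'nextcloud',
  'dbpassword' => 'secret',
);
```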
Okay, next question. I think this boils down to whether you can replicate an Active Directory via LDAP, and I have no idea. Maybe we should ask Arthur later, he's the expert here. As far as I understand, Active Directory is only a different schema on top of LDAP, so theoretically it should work, but I have no idea.

"What would you recommend for a storage system across two sites?" First: what is a site in your context? "A data center. So two sites means two data centers." Okay, then this doesn't work at all, and we have this wonderful slide to explain why. NextCloud works like this: a WebDAV or HTTP request comes in, goes to a load balancer, and is forwarded to a web server, which is also where the NextCloud code runs. Say the request is "upload this file, please." The NextCloud code starts to run and does all kinds of things. It checks the credentials against an LDAP server. It checks whether the path is there and whether the permissions on it are right. If this is a shared file, it may actually have to go into a different storage. When the upload is done, it maybe sends out a notification mail to the person who shared the file, and maybe puts an entry into the activity feed; before all of that, it maybe checks whether the upload is possible in the first place under the file firewall rules, and so on. So there is a lot happening inside NextCloud here, and it results in several database queries and several requests to the storage system, plus a lot of traffic to Redis. When everything is done, the request ends and goes back: yes, file upload successful. That's how a simple file upload via WebDAV works.

The point is that those SQL queries include a lot of reads, writes, and updates, and the same is true on the storage side: you maybe create a subdirectory, put the file into the subdirectory, check that the parent directory is there, check the mtime, and so on. If the database or the storage has synchronous replication, for example a Galera cluster, or some other storage with synchronous replication, that's all fine. But as soon as you have asynchronous replication, it falls apart. You can run into situations where you put something into the database and a millisecond later a SELECT suddenly doesn't find it, because the query ran on a slave where replication hadn't happened yet. The same goes for storage: you upload a file, the file is transferred asynchronously to a different data center, and a millisecond later you want to check, say, the inode of the file, and it isn't there, because you read from a different node where replication hadn't happened yet. It just falls apart. And asynchronous replication is mandatory once you talk about different hosting centers, so this simply doesn't work.
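A hypothetical sketch of that read-after-write race (illustrative PHP only, not NextCloud internals; the hosts, credentials, and the simplified oc_filecache columns are assumptions):

```php
<?php
// Illustrative only: why asynchronous replication breaks the request flow.
$master  = new PDO('mysql:host=db-master;dbname=nextcloud', 'nc', 'secret');
$replica = new PDO('mysql:host=db-replica;dbname=nextcloud', 'nc', 'secret');

// The upload path INSERTs file metadata on the master...
$master->prepare('INSERT INTO oc_filecache (path) VALUES (?)')
       ->execute(['files/report.pdf']);

// ...and a millisecond later a follow-up query reads it back, but lands on a
// replica that has not applied the change yet.
$stmt = $replica->prepare('SELECT fileid FROM oc_filecache WHERE path = ?');
$stmt->execute(['files/report.pdf']);

if ($stmt->fetch() === false) {
    // The row is "missing": the application now acts on inconsistent state.
}
```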
The only way to have a NextCloud (or ownCloud) instance distributed over different sites is to do it with our Federation concept: you have an instance in one hosting center with its own storage, its own database, its own application servers and load balancer, and the same in a different hosting center, and they are connected via Federation. Then one user can share something here and it pops up for a user on the different site. This works. But you can't just wire the storage together underneath. The most extreme and most foolish example was a customer who had two hosting centers and an rsync cron job that simply synchronized the files between the two. You see the problem, right? You upload a file and suddenly it isn't there anymore, because you're suddenly looking at the other side. It just doesn't work.

Federation sits at a higher level of the stack. This is, for example, how it works at Sciebo with its 25,000 users: they're actually running, I don't remember, 10 or 20 different universities, and the installation is distributed over three hosting centers. What they did is take the 20 universities, split them into three groups, and assign each group to a hosting center: in this hosting center universities A, B, C, and D run; in the other hosting centers, the other universities. Each university is its own single instance with its own entry point. But if a user from one university wants to share something with a different university, they can just type in the user of that other university in the normal web interface, in the normal sharing dialog; with the latest version we're developing it can even do auto-completion. You find the user at the other university, you click share, and those two instances talk to each other. And every time I access a file shared with me by someone from a different university, the request is basically proxied from one NextCloud server to the other: it fetches the file, brings it back, and serves it to the user.

I completely understand the question, and I've been discussing this for five, six years. I'm just not aware of any architecture, and that's the same for NextCloud, ownCloud, Pydio, Seafile, everyone, where you can deploy one instance over three data centers, basically switch them on and off at random, cut the connection between them, and the user has no downtime. I'm not aware of such an architecture. "Wouldn't this be possible with GFS? With replication, your files are there, and your files are kept together with your database." Okay, I don't want to go super deep into the technology, but just a bit: what would be needed is a replication system that replicates transactions, where a transaction is a combination of file system changes and database changes. You have to bundle the file system changes and the database changes together and replicate them atomically. And I'm not aware of anything that does that. I looked into Hadoop for a while, because they actually have something in that direction, but I'm not aware of anything like that. Replicating files is fine, that's not a problem. Replicating databases is also fine. But you have to replicate them together. Let's talk about it later; it's tricky. "But still, with active-passive..." No, even with active-passive it's still possible that an SQL query is replicated and the file change is not, and then you have a broken instance on the other side. Yeah, let's see.
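As an aside on the mechanics: on the wire, a federated share is an ordinary OCS share call with shareType 6 and a Federated Cloud ID as the recipient. A sketch, where the hosts, users, and credentials are all placeholders:

```php
<?php
// Sketch: create a federated share from cloud.uni-a.example to a user on
// cloud.uni-b.example via the OCS Share API. All names are placeholders.
$ch = curl_init('https://cloud.uni-a.example/ocs/v2.php/apps/files_sharing/api/v1/shares');
curl_setopt_array($ch, array(
    CURLOPT_USERPWD        => 'alice:app-password',
    CURLOPT_HTTPHEADER     => array('OCS-APIRequest: true'),
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => http_build_query(array(
        'path'      => '/paper.pdf',
        'shareType' => 6,                         // 6 = remote (federated) share
        'shareWith' => 'bob@cloud.uni-b.example', // Federated Cloud ID
    )),
    CURLOPT_RETURNTRANSFER => true,
));
echo curl_exec($ch); // OCS status response
curl_close($ch);
```

The remote instance is then notified and the share shows up on the other side for the recipient to accept; file access afterwards is proxied between the two servers as described above.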
Oh wow, there are lots of questions; we should move on, sorry. "What would you recommend for a storage system?" We answered that. Okay. "Maybe Docker?" I don't know. I mean, one strategy, another change we want to make with NextCloud, because we've fought with Linux packages for a long, long time and it's really hard: we definitely want to distribute NextCloud more in container form, maybe Docker. But of course the challenge then is, again, that people want to have different setups; not every container is the same. If you look into the GitHub organization, we already have two official Docker containers, so the mess is already starting.

[An audience member describes their deployment; partly inaudible:] "We've done it with Docker and it's really, really easy now to operate. We build a specific container for our instance, with the configuration baked in, provisioned by Ansible, Chef, Salt, whatever you want. If we run it like that, is it supported?" Yes. That's like asking whether it's supported on HP hardware or not: we don't care what you run it on. If you take our code and put it into a Docker container, then it's supported; it's just your container. And you won't hear from us something like "okay, you have a Docker setup, reproduce it on a real system first and then we'll support you." I'm also not aware of any bug that could only be triggered by running in a container. Apache or nginx inside Docker: it shouldn't matter, it shouldn't matter to NextCloud at all.

Okay, so SWITCH has this interesting setup, and I've seen it in other instances too: some people configure the load balancer with a rule based on the user agent, so that all the sync client requests go to these boxes and the other requests go to other boxes. You can do that; I'm not really sure why it makes sense. "I think it makes sense because, from the user's perspective, it doesn't matter if the sync client is a bit slower. In the web interface, when I click a button, I expect a response immediately; if my sync client takes two more minutes to synchronize a change, that's fine." Maybe, yeah. Okay.

"The documentation recommends these distributions. Do you support the others, and why not?" As discussed earlier, most customers want a fully supported stack. They want support for the hardware, for the operating system, for the database, for the storage, and for the application layer, which would be NextCloud. And they're only happy if they have that full chain of support for everything.
And because of that, you usually don't want to use something like openSUSE, which you can't buy commercial support for. From our perspective it's totally fine; a lot of our developers use openSUSE. It's just that if you ask us what you should use to have a fully supported platform, a fully supported service for everything, then you probably want to go with RHEL, SLES, or Ubuntu Server.

"We'd love to see an object store. ownCloud doesn't offer a solution that actually works for an installation of our size, and there's no migration path from NFS to an object store yet." Oh, that's a long topic. Okay. First of all, to be a little bit nitpicky here: you can of course use an object store, and even SWITCH is using an object store, but with an NFS gateway in front. What you probably mean is using Swift or S3 directly to talk to the object store instead of going through NFS first. And this is supported by NextCloud, same as in ownCloud. But there are a few things to say, and this is a bit more my personal opinion; actually it's a great topic for a bit of discussion.

People tend to think that things are faster just because there's an object store underneath. But consider what has to happen. If you want to handle files, which is what NextCloud does, we present a POSIX-style file system to the user: in the web interface you have directories, you go into directories, you upload files, you have mtimes, you have MIME types, a full directory tree. The same tree is accessible via WebDAV, and it's also what we sync to the desktop. Somewhere in the back end, of course, you have data blobs. They could be sectors on a hard disk, or objects in an object store: keys and values, an ID and data, an ID and data. And somewhere between those two things you need a mapping, from blocks to a tree structure. That mapping is usually called a file system; that's what a file system does. It has to happen somewhere, and it can happen in different layers of the stack. You can use an NFS gateway on top of Ceph; then the NFS gateway does exactly that. Or you can use S3 or Swift directly from NextCloud, and then NextCloud does it. In that scenario NextCloud implements the file system: we have a full file system implemented in PHP inside NextCloud, and all the metadata, all the inodes, the whole directory structure, "this is a folder, there's a subfolder, there's a file in the folder, it has this size and this mtime," is stored in our database, while the data is mapped to objects that come from the object store. So both are possible. Personally, even though I of course like NextCloud a lot, I'm not sure NextCloud is the right place to implement a file system; I would implement it somewhere else. But it's supported and it works. What people shouldn't expect is that just because we throw in the term "object store" and do some Swift communication, it's somehow faster than an NFS server. The same work has to happen, just in a different place in the stack. That's my opinion.
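Concretely, letting NextCloud do the mapping itself means configuring the object store as primary storage in config.php. A sketch with placeholder bucket, host, and credentials (the Swift variant works the same way with a different class):

```php
<?php
// config/config.php (excerpt) — S3-compatible object store as primary storage.
// With this, the directory tree, sizes, and mtimes live in the database, and
// only the data blobs live in the bucket. All values are placeholders.
$CONFIG = array (
  'objectstore' => array (
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => array (
      'bucket'         => 'nextcloud-data',
      'hostname'       => 's3.example.com',
      'key'            => 'ACCESS_KEY',
      'secret'         => 'SECRET_KEY',
      'use_ssl'        => true,
      'use_path_style' => true, // usually needed for non-AWS S3 implementations
    ),
  ),
);
```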
And it does mean a lot of database load, which leads to the question of how you scale the database; I can't give a good answer to that. We're actually talking about it. It was on the latest roadmap and somehow dropped out, but we want to provide a script to migrate between object store and NFS, back and forth, so that people can at least try it and have some flexibility. I personally wouldn't expect any wonders just because you talk Swift directly to a Ceph store. Because here's the thing with object stores: the theory is, of course, that they scale perfectly. You have keys and values, some mapping of keys to nodes, and if you need more storage and more performance, no problem: just double the number of nodes, change your mapping, and you have more performance; it scales linearly. But that's only the case if you have completely random access across all the nodes at the same time, which in reality isn't true for a file system, because you access, for example, the root node a lot and the sub-sub-sub-node not so much. So even with an object store you can end up in situations where you have hot spots of access; you might have one Ceph node that is totally overloaded while the others are not. So I don't know; I'm just a bit critical of this "object stores do wonders" theory. But I talk too much. Other opinions here? Fredrik, you have a lot of opinions about that. Sure, you can do that; that's another thing we discussed and tried in the past: just get rid of the database and store everything, as a JSON blob or whatever, in one object in the object store. Okay, let's move on.

Hardware load balancers: nobody seems to use them; I think we already covered that. And that one is covered too. I think the current large recommended architecture that we have here only scales to a certain limit. If you want more users than that, then I would recommend the thing that Sciebo does: break it up into sub-instances, and then you can scale every instance on its own, also across different hosting centers. And for that you don't really have a limit. At that point there's no limit anymore, because Federation is basically server-to-server communication and it scales linearly. Sure, it's double the users, double the shares, double the network load, but beyond that there's no limit.

"What should we use for multi-site deployments?" We covered that. And we want to implement this multi-bucket feature. I'm not really an expert here, but there are some scenarios, especially with the Amazon S3 implementation I think, where there's a practical maximum bucket size, and in that case it's really bad to have everything in one bucket. Because of that we will support multiple buckets in the future; a sketch of the idea follows below.
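Coming back to that multi-bucket idea: a purely hypothetical sketch of what a user-to-bucket mapping could look like (the bucket naming, the count, and the function are made up for illustration, not the eventual NextCloud mapper). The point is only that a stable hash spreads users evenly over buckets, so no single bucket grows past the backend's comfortable size:

```php
<?php
// Hypothetical illustration of a user-to-bucket mapping.
function bucketForUser(string $userId, int $numBuckets = 64): string
{
    // A stable hash keeps a given user's objects in the same bucket forever.
    $index = abs(crc32($userId)) % $numBuckets;
    return sprintf('nextcloud-%02d', $index);
}

echo bucketForUser('alice'); // e.g. "nextcloud-17", identical on every call
```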
On scalability: yes, there's absolutely a limit. At the moment a lot of people seem to use the Galera cluster architecture, and I haven't really looked into it too much, but I assume that after a certain number of nodes, maybe even more than four already, I don't know, it doesn't really scale anymore. Because, as I said before, this is synchronous replication, which means every single change, every single insert, has to be distributed to all the nodes before the transaction can be completed, and with more nodes that stops scaling.

So scaling the database becomes super expensive or impossible at some point, and because of that, after a certain size you have to break instances up into smaller ones. If you go over this 100,000-user limit that we have here, I mean, if you have more than 100,000 users in one instance, I would recommend breaking it up into two instances with two different databases. Sure, scaling from my perspective means perfect scaling, and perfect scaling from my perspective means linear scaling, which means you only need double the hardware for double the users, not more than double. That's the ideal situation; I can't think of a situation where you'd need less than double. And it makes sense: NextCloud runs in lots and lots of small isolated processes, every single request going to a NextCloud server is one process, and because of that it scales very well over lots and lots of CPUs and memory.

"What should we use beyond the petabyte? Do you support EOS?" Sorry, I didn't get it; let's talk about it during the break. I mean, there are solutions like dCache and EOS which are great, and they're great if you're really big and if you really know what you're doing. A dCache or EOS installation is nothing you can just quickly do; it has a lot of complexity to it. It's great technology, but it's big guns.

This one is fine as long as it's synchronous; if it's standard asynchronous MySQL/MariaDB replication, then it will fail. That one is fine, that's like Galera. Master-slave is also fine as long as it's synchronous, like one node of the Galera cluster. That one I don't know, I can't answer; that's a Galera cluster issue.

"Where should Redis run? It's drawn as a separate box; what about your Docker deployment, Redis in Docker?" As I said before, we don't really care how you deploy the software; you can run it in Docker or not, it's okay. You have two choices: you can run a Redis cluster for all your nodes, and then of course you have some network latency, because everything has to talk to this central cluster and back and forth; or you can run it locally, but then you need sticky connections, where one user always goes through the same instance.
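To make those two options concrete, both are config.php choices. A sketch, assuming a NextCloud version with Redis Cluster support; the seed hosts are placeholders:

```php
<?php
// Option A: per-node local caching — requires sticky connections at the load
// balancer so a user's session always lands on the same web node:
//   'memcache.local' => '\OC\Memcache\APCu',
//
// Option B: one shared Redis Cluster for all web nodes. Every cache access
// pays a network round trip, but no sticky connections are needed.
$CONFIG = array (
  'memcache.distributed' => '\OC\Memcache\Redis',
  'memcache.locking'     => '\OC\Memcache\Redis',
  'redis.cluster' => array (
    'seeds' => array (
      'redis1.example.com:6379',
      'redis2.example.com:6379',
      'redis3.example.com:6379',
    ),
  ),
);
```

Thanks a lot for the questions.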