Okay, so the Tomcat container, this is where it fits in the architecture. The three main pieces for running DHIS2 are your database, which we looked at last week; your DHIS2 instance itself, which is basically what's running inside a Tomcat container; and then the reverse proxy, which sits in front of all of that. We're not really going to look at the reverse proxy till tomorrow. The Tomcat container is probably the most complicated of the three main containers, and it's also arguably the one that needs to be treated the most carefully. We'll talk a little bit about that as we go on. In terms of principle: if you have a DHIS2 instance running in this environment, then we create a new container for it, we install Tomcat into that container, and the instance runs on its own in there, with nothing else running, just the Tomcat. In the past, with a previous incarnation of the dhis2 tools, we had a number of DHIS2 instances running in the same memory and CPU slice. We're not doing that anymore. If you want a new instance, you make a new Tomcat, and it runs on its own in there. That keeps it simple in a way: each Tomcat is set up exactly the same as the others. There's a little bit of protection for the containers: they all run with a firewall inside the container, which basically only allows connections from the proxy server and from the monitor. Then there's a convention. It doesn't have to be like this, but this is the way we've done it; it's maybe a little bit inflexible, but it's clean and clear. When you make a DHIS2 instance, let's say we made one called hmis, then you'll have a database called hmis, it'll be accessed with a database user called hmis, and you'll access the web front end with a context of hmis. Once you've picked a name for your DHIS2 instance, we use the same name throughout those four places.
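The naming convention can be sketched like this; hmis is just the example name from the talk, and the server hostname is a placeholder:

```
instance / container name : hmis
postgres database         : hmis
postgres user             : hmis
web context (URL path)    : https://<your-server>/hmis
```

Pick the name once and it flows through all four places.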
I guess that does imply there are some restrictions on which characters you can use in a name. I've not really been too careful with that; you just need to not get too exotic. By default, we're running version 9 of Tomcat. It has a few interesting quirks, but by and large it seems to work very well. It's running on OpenJDK version 8, and this is something that will change at some point, because JDK 8 is now pretty old. The default JDK you get with Ubuntu 20.04 is, I think, version 11. We understand that the latest versions of DHIS2 will run on 11, but I need to get that in black and white from the devs first before we move from version 8. It's going to be a little bit tricky to move, because if I make version 11 the default, then whether it works or not will depend on which version of the DHIS2 war file you've installed. The package we're using for the actual runtime is the JRE; I'm going to show you this in a bit more detail later anyway. Morten was asking me earlier this morning why we don't use the JDK rather than just the JRE. I guess the JRE is smaller, a more minimal runtime. There may be a couple of good reasons to install the full JDK, particularly for debugging purposes. Perhaps we'll revise that. Tomcat 9 itself runs under systemd; that's the mechanism the operating system uses for starting, stopping and running the Tomcat executable. That has a couple of implications for things like file permissions and logging, which we'll look at going forwards. The DHIS2 home directory: if any of you have set up DHIS2 in a more manual way, one of the things you'll know you needed to set is the DHIS2_HOME environment variable. If you're looking for where it is with this setup, it defaults to /opt/dhis2, so that's where you'll find your dhis.conf file and things like that.
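On Ubuntu, the runtime being described corresponds to the standard packages below; this is just to illustrate the JRE-versus-JDK choice, the tooling installs these for you:

```
apt-get install -y tomcat9 openjdk-8-jre-headless
# the full JDK would add debugging tools such as jstack and jmap:
# apt-get install -y openjdk-8-jdk-headless
```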
Again, we'll have a look at the files. In fact, we don't have to set the environment variable at all, because /opt/dhis2 is actually the default: if you don't set DHIS2_HOME, it's assumed to be /opt/dhis2. Perhaps before we go on to customizing, let's make a container, just so we move this out of the way. The way you create a container is dhis2 create instance; I'll call it bob. Before we do that, let me just have a look at my IP addresses. I can see the next free one is 192.168.0.13, and we'll tell it to use the postgres container for its database. If you just run dhis2 create instance bob, it will try to guess a reasonable IP address and default the database container for you, but I prefer to be explicit. Okay, this takes about a minute or two to create. We went through this process before; I probably should have shown you. You can see it's created the database: in fact, it's created a database user called bob, it's created a database called bob, and it's now creating a container called bob. It's going to install some stuff into that container: basically Tomcat 9 and maybe two or three other packages, but not very much at all. Then it does a few little tweaks, which I'll show you. There's the OpenJDK 8 JRE. It's done. We could run anything in there, but usually what we do now is use it for running a DHIS2 war file. Let's pop a war file in there so we can talk a bit about how you do that. One moment, I'll explain why I do it like this. I'll copy the link address for the 2.35 release, which we're all waiting for, and I can use the deploy tool to deploy a war file from a link. What's the link? That's the link. Deploy it where? Deploy it into bob. Okay, so when we download the war file, it's actually going to unzip it and check it, to make sure that the download wasn't corrupted.
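Roughly, the demo ran something like the following. The syntax here is approximate, transcribed from speech; check the help output of your version of the dhis2 tools rather than copying these verbatim, and the war link is a placeholder:

```
# create a container named bob, with an explicit IP and database container:
dhis2 create instance bob 192.168.0.13 postgres

# deploy a DHIS2 war file into it from a download link:
dhis2 deploy <link-to-war> bob
```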
Sometimes we see people having errors from deploying corrupt war files. It's not very common, but it does happen from time to time. As I was saying, when we created this thing: let's go and have a look at the database quickly. You know where your database container is now. It should have created a database called bob. There it is, owned by a user called bob. The other thing it should have done is on the reverse proxy; we'll talk more about your reverse proxy tomorrow. It should have created a file in a folder called upstream, with the name of the container. That file basically says: if somebody comes to my proxy with /bob, push it through to my Tomcat at the back there. Those are the three things that happen when you do dhis2 create instance. Okay, let's get to this slide. So, summarizing: after you've installed an instance like that and you've deployed DHIS2 onto it, there are a few things you're typically going to want to tweak. JAVA_OPTS, the parameters for your JVM: the place to adjust those is /etc/default/tomcat9. That's the file Ubuntu gives us by default when it installs Tomcat. Probably the most important thing you need to set in there is your heap size, and that depends quite a lot on the number of users you're expecting and the kind of instance you're running. Often you have to pick a heap size and then adjust it a bit as you go along. There are a few other suggestions in that file, which I'll show you in a second, which you could also tweak; I've tried to make a few helpful comments. Let's have a quick look in that file then. In there you can see, commented out, your heap settings. If you don't set anything and leave it like this, which is what you get by default, then your JVM has to decide.
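The upstream file being described would contain something along these lines. This is a generic nginx-style sketch with the illustrative names and address from the demo, not the exact file the tools generate:

```
# e.g. an upstream file named after the container (path and contents illustrative)
location /bob {
    proxy_pass http://192.168.0.13:8080/bob;
}
```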
The algorithm for deciding has changed a little over the years, but as I understand it, what it currently does is look at however much RAM is available on the system and pick a quarter of that, which is fine if you're running one instance. If you're running six or seven instances, then you obviously cannot give each one a quarter of the RAM; you have to tone it down a bit. So in production, certainly, it's better to set this explicitly. You need to make a few calculations. In my case, I can't remember offhand, but I think this machine has 16 gigabytes of RAM, of which I gave eight gigabytes to Postgres, so probably a sensible heap to run with would be something like four gig. That'll be a bit low for a production instance. On a server, it's a good idea to set the maximum heap size, that's the first setting, and to set the minimum heap size the same as the maximum. The reason for that, really, is that it gives the garbage collector an easier job. The garbage collector typically does two things. First, it cleans up the memory: if your Java application has created objects on the heap, then every now and again the garbage collector scans the heap and gets rid of the unused ones. Secondly, after deallocating those objects off the heap, it tries to shrink the heap down towards what you specified as the minimum heap size. If you set the minimum size the same as the maximum size, then it's not going to bother trying to shrink the heap. That makes garbage collection a little simpler and a little faster, at the cost of having that memory permanently reserved. There are a couple of other settings I've put in here. This one, apparently, is important; you can go and read that Stack Overflow post yourself.
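The calculation being described can be sketched in a few lines of shell. The numbers (16 GB of RAM, 8 GB already given to Postgres, 2 GB kept back for the operating system) are illustrative, in the spirit of the example in the talk:

```shell
total_ram_gb=16      # RAM in the machine
postgres_gb=8        # already allocated to the Postgres container
os_reserve_gb=2      # headroom for the OS and other services
instances=1          # number of Tomcat containers sharing what is left

# split the remaining RAM across the Tomcat containers
heap_gb=$(( (total_ram_gb - postgres_gb - os_reserve_gb) / instances ))

# min and max set to the same value, as recommended above
echo "-Xms${heap_gb}g -Xmx${heap_gb}g"
```

With six or seven instances on the same box, the same arithmetic shows why each heap has to shrink accordingly.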
A lot of the Java cryptography functions depend on a source of randomness from somewhere. Traditionally, that comes from a UNIX pseudo-device called /dev/random. The problem with randomness is that it's actually a limited resource. The entropy you get out of /dev/random comes from things like the CPU clock and, on a laptop, mouse movements and things like that; it uses all of those to create randomness. If you have a lot of containers running, you can actually run out of random numbers and have to wait for more entropy. That's particularly a problem with virtual machines and containers, because they don't have those hardware entropy sources, the jitter on the network interface and things like that. It has happened; we've seen it on occasion: your Tomcat can simply stop. It won't go anywhere, because it's blocked trying to read random numbers out of /dev/random. Setting the source of randomness to /dev/urandom rather than /dev/random basically ensures that that operation will never block. As I say, it's quite an important thing to do on virtual machines; it's typically not necessary on physical hardware. The garbage collector: you probably shouldn't go fiddling down here unless you know what you're doing. The most modern garbage collector, the one with supposedly all the best features, is called G1, and that's what this line does: it specifies -XX:+UseG1GC, the G1 garbage collector. If you were using Java 11 or Java 9, then G1 would be the default; because we're using Java 8, we have to actually tell it to use G1. There are some circumstances where the older parallel GC might actually perform better than G1; I've certainly seen those cases. There have been a few buggy parts of DHIS2 where we're allocating much too much heap memory, so the garbage collector is very, very busy.
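The non-blocking randomness fix is the well-known `java.security.egd` workaround; in /etc/default/tomcat9 it is a single extra JVM flag (the surrounding JAVA_OPTS line is illustrative). Note the odd-looking `/dev/./urandom` spelling, which is the conventional form of this workaround:

```
# /etc/default/tomcat9 - never block waiting for entropy
JAVA_OPTS="${JAVA_OPTS} -Djava.security.egd=file:/dev/./urandom"
```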
And sometimes with G1, the process of cleaning up that garbage uses much more CPU than the older garbage collector. So there's the possibility to change your garbage collector algorithm: you just comment this one out and uncomment that one. But in most cases, I would suggest you just leave it alone. There's another couple of options here to fiddle with your garbage collector. This is a setting where you're basically telling your JVM that it's allowed to stop everything for one and a half seconds while it cleans up garbage. That's again a useful setting if your Tomcat is doing quite a lot of garbage collection, and I would probably just leave it the way it is; it's going to work for you in most cases. The very last line has to do with installing a profiler like Glowroot: we need to specify the agent on the command line. We're actually going to do that later this morning. So yes, the main thing in this file after you've installed is your heap size. What you set depends on how many containers you have, how much memory you have available, and how much you've already allocated to Postgres; you've got to do those calculations to see what you've got left. I used to run Tomcat with very, very large heaps. There are some cases, with very busy servers handling many, many parallel connections, where you might need very big heaps. Typically you wouldn't have heaps much over 32 gig, though I think we do have a couple running with 48 gig. So that's the approach: start with a reasonably sized heap, and if you find you have problems, then you might have to increase it. Okay, so tweaks, as I've mentioned: after you've installed, you probably want to go in there and set your heap size. Among the other files, /etc/tomcat9/server.xml is one worth looking at. The main thing you might think of looking at in there is the size of the thread pool. Let's have a quick look in that file, with vi. You don't have to use vi; you can use nano.
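Switching collectors amounts to swapping which of these JAVA_OPTS lines is commented in. A sketch, with the 1.5-second figure matching the pause target mentioned above:

```
# G1, the collector used by default in these containers on Java 8:
JAVA_OPTS="${JAVA_OPTS} -XX:+UseG1GC -XX:MaxGCPauseMillis=1500"

# or the older parallel collector, which can use less CPU on a very busy heap:
# JAVA_OPTS="${JAVA_OPTS} -XX:+UseParallelGC"
```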
I just like vi. Okay, this file is a little bit customized, actually quite a lot customized from the default server.xml that you get. A few things in here worth noting, I guess. port="-1": that disables Tomcat's control port. We've enabled tomcat-users; this is basically for the Tomcat manager application, the username and password for that. This one here is the Tomcat thread pool: you can see it's set a maximum of 100 threads and a minimum of 10. You can adjust that in both directions. Often people will create a staging instance or a test instance, and that doesn't need to have 100 threads; it'll work quite happily with 10. So you could reduce the size of the thread pool, and that'll save a few resources. If you're having to deal with a lot of concurrency and you're finding a bottleneck, it's possible to increase the number of threads from here. Typically, you've got two places where you can unthrottle your concurrency: one is the database connection pool, and the other is this Tomcat thread pool. There may be some instances where, if you find all your Tomcat threads are busy and connections are not being made, increasing the maximum threads can help. You've got to be careful, because it could also make things worse: if you're already using all your CPU or all your memory and you're just adding more threads, you're not attacking the source of the problem. Either way, this is what you get: a default thread pool of 100, and that may not suit every particular environment. There's nothing else in here you'd generally need to change, except this one: I've commented out the Tomcat access log. Basically, because you're going to get access logging on your proxy, whether it's Apache2 or nginx, you don't also have to log those same accesses on Tomcat.
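In standard Tomcat server.xml terms, a thread pool like the one described looks something like this. It's a generic sketch using stock Tomcat elements and attributes, not necessarily the exact markup in these containers:

```xml
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
          maxThreads="100" minSpareThreads="10"/>

<Connector executor="tomcatThreadPool" port="8080"
           protocol="HTTP/1.1" connectionTimeout="20000"/>
```

Dropping `maxThreads` to something like 10 for a test instance, or raising it for a busy one, is the adjustment being discussed.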
Sometimes you want to, though: if you're debugging, if you find for some reason you can't reach your Tomcat and you want to see whether the connections are actually being made, it can be a good idea to uncomment these lines here, from this comment down to this comment. That'll enable access logging on the Tomcat itself. Typically, I would only do that while trying to track down a problem; having satisfied myself that the requests are actually getting through to Tomcat, I turn the logging off again. I don't need to log all those requests twice. This is an interesting recent addition, a bit of security really. By default, when you get an error page from Tomcat, it shows you quite a bit of information about the Tomcat server, which is typically not necessary. It's probably easier if I show you the effect of that. Here's my instance here. If I try to go to a URL that doesn't exist on HMIS, I get a 404 Not Found; that's not quite the error page I wanted, and I'm struggling to get one, but we've got an instance here called test2, and it's giving me a 404 Not Found. Note that this page is not telling me that it's being served by Tomcat 9, and it's not telling me that it's running on Ubuntu; it's simply telling me that I've got a 404. That's because of those couple of lines in server.xml which tell it not to produce those bits of information. I've mentioned before that dhis.conf is in /opt/dhis2. What I've done with that file is take the configuration reference from the documentation and put it in there as a starter, rather than you having a blank file to start with. You should always still go and read the documentation, because the reference file in there may be out of date with new versions of DHIS2. It's always better to read the reference, but I thought it's still helpful that you have it. Let's go to it.
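The "couple of lines in server.xml" that suppress the server details correspond, in stock Tomcat, to the ErrorReportValve attributes; a sketch:

```xml
<Valve className="org.apache.catalina.valves.ErrorReportValve"
       showReport="false" showServerInfo="false"/>
```

With those set to false, an error page shows only the status code, not the Tomcat version or stack trace.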
Instead of giving you a blank file here, you've got a bit of a starter file. The database connection stuff is all filled in automatically for you; you don't need to worry about it. When the instance was created, it created the database bob, the user bob, and a random password. The password's not really important; you don't ever really need to use it yourself, only Tomcat needs to know it. A few things you can peek at in this file. The connection pool: this is the number of connections that DHIS2 makes to Postgres. Morten, make a note: the documented default here is actually incorrect. The default is actually 80; if you don't set the size of the connection pool, you get a connection pool of 80. 80 is pretty arbitrary and isn't necessarily going to fit all situations. Again, if you've got a small test instance, you can take the connection pool size down to 10. If you've got a very busy instance with a lot of concurrency, you might find that pool size is too small, and you might increase it above 100. You've got to be careful: if you keep increasing the size of the connection pool and you have a number of different instances, each one has its own pool, so on the Postgres server itself you might need, at some stage, to increase the maximum number of connections. Okay, a couple of other settings in here worth noting. Yeah, this is an important setting, really; I should actually uncomment it by default. This is the server-side cache for your analytics. If you don't have this set, then every time you make an analytics query, it makes a backend SQL query to the database. If you set a cache here, then some of those queries to the backend database get cached, which relieves quite a bit of load on the server. There are a few other things in here I typically change. This is another important one, particularly with Tracker-based systems, systems where you're dealing with individual data: the session timeout.
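In dhis.conf, the two settings discussed here look like this. The key names follow the DHIS2 configuration reference, but verify them against the reference for your version; the values are just the examples from the talk:

```
# maximum size of the DHIS2 -> Postgres connection pool (effective default: 80)
connection.pool.max_size = 80

# server-side analytics cache, expiration given in seconds (value illustrative)
analytics.cache.expiration = 3600
```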
That's the time after which, if you log into DHIS2, go off and have your lunch, and come back to start working again, you'll find whether you've been logged out or not: the time you remain logged in until you actually touch something, click on something, or whatever it might be. The default session timeout on DHIS2 is, I think, actually ridiculously high: 3600 seconds. That's an hour, 60 seconds times 60 minutes, which is a long session timeout. Particularly for a system with fairly sensitive data, I think 10 minutes is much more reasonable. That means if you don't touch your computer for 10 minutes, your session gets timed out and you'll have to log in again. Those are a few of the bits you might want to adjust inside dhis.conf. There are numerous other options in there; I think there are even some options which are not properly documented on the reference page, which we need to update. But those are probably the main ones you'd need to set on most systems. Now, a little word about security. I made the point here that the Tomcat container is perhaps the weakest link in your whole setup. Your operating system, your reverse proxy, your database: the default security settings of all of those are probably quite good. Inside your Tomcat, though, you're running this sprawling, massive DHIS2 application. I forget what the current size is; it's something like 270 megabytes zipped. It's a very large application, some of the code in it is very old, going back 10 years or more, and it includes many, many libraries by default. At any moment in time, you might find that a vulnerability gets introduced into the web application. We work really hard to reduce the chance of that happening, but it can happen. It has happened before, as a result of using a library in which a vulnerability was discovered.
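The session timeout is likewise set in dhis.conf, in seconds (key name per the DHIS2 configuration reference; check it against your version):

```
# 10 minutes instead of the 3600-second default
system.session.timeout = 600
```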
It wasn't actually the fault of the DHIS2 coders, but it meant that Tomcat containers running DHIS2, and other applications around the world, were all exposed to that Struts exploit. The problem is that if a vulnerability is exploited in your Tomcat container, what typically happens is that your container becomes exposed to the user that's running Tomcat. Exposed in one of two ways: either that user is able to read or write files on the system, or, worse, that user is able to execute programs on the system. One of the worst-case scenarios is if you're running your Tomcat as the root user. That was actually quite commonly done, and I still find it from time to time, people running Tomcat as root. The problem with that is that if there is a vulnerability, it means the attacker effectively has root access to your container. If that happens, your only option really is to delete the container and start again. But anyway, what we've tried to do is reduce the potential damage, so that if a vulnerability is exploited and the tomcat user does effectively get access to your container, the best we can do at that point is to limit what the user can do. The user is limited quite a bit. For one thing, Tomcat itself is running inside a container. That means access is restricted to whatever can be reached inside that container; it can't access other containers, things like that. There are only certain files and directories it's able to see. In the webapps directory itself, it's not able to modify the web application; it's only able to run it. That's because of the way it's deployed: we don't just throw a war file into the webapps directory and allow Tomcat to unzip it. Basically, we unzip the war file ourselves and make it owned by root. Even if you have a rogue Tomcat, it's not going to be able to modify the application that's running. I spent quite a bit of time looking at the CIS security benchmark.
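The read-only deployment idea can be sketched as follows. The paths are the Ubuntu tomcat9 defaults and the commands illustrate the principle, not the exact tooling; they'd be run as root:

```
# unpack the war ourselves instead of letting Tomcat explode it
mkdir -p /var/lib/tomcat9/webapps/bob
unzip -q dhis.war -d /var/lib/tomcat9/webapps/bob

# owned by root and write-protected: the tomcat user can read and run
# the application but can never modify it
chown -R root:root /var/lib/tomcat9/webapps/bob
chmod -R a-w /var/lib/tomcat9/webapps/bob
```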
You can google that: CIS security benchmarks. I've gone through the benchmark for Tomcat. We don't implement all of it, but we implement quite a lot of it. A few other things I've spoken about already: we allow firewall connections only from the proxy, and I've already given you a bit of a tour through the config files for Tomcat, so I don't think we need to run through that again. I think what I'll do is pause at this point.